PREDICTIVE MODELING FOR THE EARLY IDENTIFICATION OF CATTLE AT RISK FOR TRANSITION DISEASES

By

Lauren Wisnieski

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Comparative Medicine and Integrative Biology - Doctor of Philosophy

2019

ABSTRACT

The transition period is defined as the time that spans approximately 3 weeks before to 3 weeks post-partum. During the transition period, dairy cattle experience tremendous physiological changes that occur to prepare for milk production. Cows that transition poorly undergo excessive metabolic stress, which increases the risk for disease. Metabolic stress is a physiological state composed of 3 components: inflammation, oxidative stress, and nutrient metabolism. Cows undergoing metabolic stress are typically identified by measuring biomarkers, monitoring feed intake, and reviewing health records. The biomarkers that are typically used to monitor metabolic stress are non-esterified fatty acids (NEFA) and beta-hydroxybutyrate (BHB), which are nutrient metabolism biomarkers. Although NEFA and BHB are effective for identifying cows undergoing metabolic stress, they are measured too close to calving to make proactive changes in management to reduce disease incidence in the current calving cohort. Therefore, our objective was to build predictive models for transition cow diseases using all 3 components of metabolic stress measured at dry-off, which would allow more time for proactive interventions. To address this objective, we designed a prospective cohort study carried out on 5 Michigan herds (N = 277 cows). We randomly selected cows to form 18 calving cohorts, which were defined as cows at the same stage of lactation that were expected to calve at approximately the same time.
We followed the cows until 30 days post-partum to monitor for disease occurrence. Disease was defined as an aggregate outcome where a cow was defined as positive if she was diagnosed with 1) one or more clinical transition disease outcomes (mastitis, metritis, retained placenta, ketosis, lameness, pneumonia, milk fever, displaced abomasum) and/or 2) one or more adverse health events (i.e., abortion, death of calf or cow).

Our first specific aim was to build individual cow-level models, which are presented in Chapter 2. We built separate model sets for each component of metabolic stress, and a model set that included all 3 components. We hypothesized that the combined model would be better able to predict transition cow diseases. Our hypothesis was supported: the area under the curve calculated from receiver operator characteristic curve analyses was significantly higher for the combined model compared to the separate model sets (P < 0.05).

Our second specific aim was to build cohort-level models, which we addressed in Chapters 3 and 4. In Chapter 3, we tested ways to aggregate individual-level biomarker and covariate data to the cohort level. We tested 3 different methods, which we referred to as the central, dispersion, and count methods, and we tested the efficacy of combining all 3 methods for prediction. We hypothesized that these methods would be effective at aggregating data to the cohort level. We found that the central method, which aggregated data either by calculating the mean (for continuous variables) or the median (for categorical variables) of values within a cohort, was the only method that produced viable models. In Chapter 4, we tested whether individual-level predicted probabilities generated from the models in Chapter 2 can be aggregated to predict disease incidence at the cohort level. We tested 3 different methods to aggregate predicted probabilities, which we referred to as the p-central, p-dispersion, and p-count methods, and we tested the efficacy of combining all 3 methods. We hypothesized that the models would produce valid disease estimates.
We found that the p-dispersion method, which involved calculating the standard deviation of predicted probabilities within a cohort, was the only method that produced viable models.

ACKNOWLEDGEMENTS

First, I would like to thank my committee members, Drs. Bo Norby, Lorraine Sordillo, Steven J. Pierce, and Tyler Becker, for their support during this project. Without their knowledge and advice, this dissertation would not have been possible. In addition, I would like to thank Jeff Gandy and Jenn Brown for all their hard work collecting and analyzing samples for this project. I would also like to thank my boyfriend, (plant) Dr. Matt Kolp, for being in my life and being a constant source of optimism and support. [...] made throughout graduate school. Graduate school would not be nearly as fun without them.

This material is based upon work supported by the United States Department of Agriculture and the National Institute of Food and Agriculture grant numbers 2014-38420-21803 and 2014-68004-21972.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
KEY TO ABBREVIATIONS
CHAPTER 1: PROSPECTS FOR PREDICTIVE MODELING OF TRANSITION COW DISEASES
  Abstract
  1. Introduction
  2. Predictive vs explanatory modeling
    2.1 Explanatory modeling
    2.2 Predictive modeling
  3. Candidate variables
    3.1 Biomarkers
      3.1.1 Oxidative stress biomarkers
      3.1.2 Inflammatory biomarkers
    3.2 Health record
    3.3 Housing and nutritional management
  4. Study design considerations
    4.1 [...]
    4.2 Primiparous and multiparous models
    4.3 Critical timepoints
  5. Statistical analysis
    5.1 Predictive models
      5.1.1 Logistic regression
      5.1.2 Discriminant function analysis
      5.1.3 Factor analysis and structural equation modeling
      5.1.4 Classification and regression trees (CART)
      5.1.5 Count models (Poisson and negative binomial)
      5.1.6 Generalized linear models
      5.1.7 Summary
    5.2 Additional techniques for predictive models
      5.2.1 Model averaging
      5.2.2 Shrinkage
      5.2.3 Model evaluation and validation
  6. Conclusion
CHAPTER 2: PREDICTIVE MODELS FOR EARLY LACTATION DISEASE IN TRANSITION DAIRY CATTLE AT DRY-OFF
  Abstract
  1. Introduction
  2. Material and Methods
    2.1 Animals
    2.2 Biomarkers
    2.3 Data collection
    2.4 Statistical analyses
      2.4.1 Step 1: Best subsets AIC selection
      2.4.2 Step 2: Test all two-way interactions
      2.4.3 Step 3: Test random effects
      2.4.4 Step 4: Reduce pool of candidate models
      2.4.5 Step 5: Model diagnostics
      2.4.6 Step 6: Model averaging
      2.4.7 Step 7: Internal validation
  3. Results
    3.1 Inflammation models
    3.2 Nutrient metabolism models
    3.3 Oxidative stress models
    3.4 Combined model (nutrient metabolism, inflammation, and oxidative stress)
    3.5 Area under the curve comparison
  4. Discussion
  5. Conclusions
CHAPTER 3: COHORT-LEVEL DISEASE PREDICTION USING AGGREGATE BIOMARKER DATA IN TRANSITION DAIRY CATTLE
  Abstract
  1. Introduction
  2. Material and Methods
    2.1 Animals
    2.2 Data collection
    2.3 Statistical analyses
      2.3.1 Step 1: Best subsets AIC selection
      2.3.2 Step 2: Goodness-of-fit and dispersion tests
      2.3.3 Step 3: Shrinkage
      2.3.4 Step 4: Scoring rules
      2.3.5 Step 5: Model averaging
      2.3.6 Step 6: Internal validation of observed versus predicted counts
  3. Results
    3.1 Central method
    3.2 Dispersion method
    3.3 Count method
    3.4 Combined model (central, dispersion, and count methods)
  4. Discussion
CHAPTER 4: COHORT LEVEL DISEASE PREDICTION BY EXTRAPOLATION OF INDIVIDUAL-LEVEL PREDICTIONS IN TRANSITION DAIRY CATTLE
  Abstract
  1. Introduction
  2. Material and Methods
    2.1 Animals
    2.2 Data collection
    2.3 Statistical analysis
      2.3.1 Step 1: Best subsets AIC selection
      2.3.2 Step 2: Goodness-of-fit and dispersion tests
      2.3.3 Step 3: Shrinkage
      2.3.4 Step 4: Scoring rules
      2.3.5 Step 5: Model averaging
      2.3.6 Step 6: Internal validation of observed versus predicted counts
  3. Results
    3.1 P-Central method
    3.2 P-Dispersion method
    3.3 P-Count method
    3.4 P-Combined model (p-central, p-dispersion, and p-count methods)
  4. Discussion
  5. Conclusions
CHAPTER 5: SUMMARY AND CONCLUSIONS
  Chapter 2
  Chapter 3
  Chapter 4
  Conclusions
APPENDICES
  APPENDIX A: Supplemental Tables
  APPENDIX B: Supplemental Figures
  APPENDIX C: STROBE-VET Checklist
  APPENDIX D: Stata code
  APPENDIX E: R code
REFERENCES

LIST OF TABLES

Table 2.1 Sample characteristics (N = 277)
Table 2.2 Incidence of transition cow disease and adverse outcomes (N = 277)
Table 2.3 Biomarker summary statistics in study sample (N = 277)
Table 2.4 Inflammation candidate models and AUC values
Table 2.5 Calculation of AIC weights for model averaging
Table 2.6 Internal validation results: comparison of apparent predictive validity (AUC) with bootstrapped estimates of predictive ability
Table 2.7 Summary of nutrient metabolism candidate models and AUC values
Table 2.8 Summary of oxidative stress candidate models and AUC values
Table 2.9 Summary of combined (inflammation, nutrient metabolism, and oxidative stress) candidate models and AUC values
Table 3.1 Adverse health event incidence in the 18 cohorts in the study sample (N = 277 cows)
Table 3.2 Cohort means/medians of predictor variables for the central aggregation method in 18 cohorts (N = 277 cows)
Table 3.3 Summary of variables present in the central aggregation method models
Table 3.4 The standard deviations/MADs about the median for the dispersion aggregation method in 18 cohorts (N = 277 cows)
Table 3.5 Number of cows above each threshold within a cohort for the count aggregation method in 18 cohorts (N = 277 cows)
Table 3.6 Summary of combined model candidate set
Table 4.1 Adverse health event incidence in the 18 cohorts in the study sample (N = 277 cows)
Table 4.2 Descriptive statistics of aggregate predicted probabilities from previously published individual cow-level logistic regression models (Wisnieski et al., 2019a) by cohort (N = 18)
Table 4.3 Candidate model set for the p-dispersion aggregation method
Table 4.4 AIC weights for the candidate models from the p-dispersion aggregation method
Table 4.5 Internal validation of predicted versus observed counts calculated from the p-dispersion aggregation method model set
Table S2.1 Effect coding for categorical variables
Table S2.2 Multivariable logistic regression candidate model set for inflammation component of metabolic stress
Table S2.3 Multivariable logistic regression candidate model set for nutrient metabolism component of metabolic stress
Table S2.4 Multivariable logistic regression candidate model set for oxidative stress component of metabolic stress
Table S2.5 Multivariable logistic regression candidate model set for combined model (inflammation, oxidative stress, and nutrient metabolism)
Table S3.1 Cut-points for biomarkers and best combined sensitivity and specificity
Table S3.2 Candidate model set for central method
Table S3.3 AIC weights of candidate models using the central method for model averaging
Table S3.4 Internal validation of predicted versus observed counts calculated from the central method model set
Table S3.5 Variables eligible for combined model set
Table S3.6 Candidate model set for combined method
Table S3.7 AIC weights of candidate models using the combined method for model averaging
Table S3.8 Internal validation of predicted versus observed counts calculated from the combined method model set
Table S5.1 Data dictionary
Table S5.2 The STROBE-Vet statement checklist

LIST OF FIGURES

Figure 3.1 Residuals (observed - predicted) versus observed counts for the averaged predictions from the central model set
Figure 3.2 Residuals (observed - predicted) versus observed counts for the averaged predictions from the combined model set
Figure 4.1 Residuals (observed - predicted) versus observed counts for the averaged p-dispersion method model set
Figure S1 Data flow diagram for data cleaning and merging
Figure S2 Data flow diagram for univariable and bivariable analyses for Chapter 2
Figure S3 Data flow diagram for multivariable analyses for Chapter 2
Figure S4 Data flow diagram for analyses for Chapter 3
Figure S5 Data flow diagram for analyses for Chapter 4

KEY TO ABBREVIATIONS

APP    Acute phase proteins
AOP    Antioxidant potential
AUC    Area under the curve
AIC    Akaike information criteria
BIC    Bayesian information criteria
BHB    Beta-hydroxybutyrate
BCS    Body condition score
CART   Classification and regression trees
CFA    Confirmatory factor analysis
DSS    Dawid-Sebastiani scores
DMI    Dry matter intake
GLMM   Generalized linear mixed model
GLiM   Generalized linear model
GLM    General linear model
IL-1   Interleukin-1
IL-6   Interleukin-6
ICC    Intraclass correlation coefficient
MAD    Median absolute deviation
NEB    Negative energy balance
NEFA   Non-esterified fatty acids
OR     Odds ratio
ROS    Reactive oxygen species
ROC    Receiver operator characteristic
SAA    Serum amyloid A
TNF    Tumor necrosis factor
WBC    White blood cells

CHAPTER 1: PROSPECTS FOR PREDICTIVE MODELING OF TRANSITION COW DISEASES

Wisnieski, L.ᵃ, Norby, B.ᵃ, Pierce, S.J.ᵇ, Becker, T.ᶜ, and Sordillo, L.M.ᵈ

ᵃ Department of Large Animal Clinical Sciences, Michigan State University, 736 Wilson Rd, East Lansing, MI, USA, 48824. wisnies5@msu.edu
ᵇ Center for Statistical Training and Consulting, Michigan State University, 293 Farm Lane, East Lansing, MI, USA, 48824. Steve.Pierce@cstat.msu.edu
ᶜ Department of Food Science and Human Nutrition, Michigan State University, 469 Wilson Rd, East Lansing, MI, USA, 48824. beckert4@msu.edu
ᵈ Corresponding author. Department of Large Animal Clinical Sciences, Michigan State University, 784 Wilson Rd, East Lansing, MI, USA, 48824. sordillo@msu.edu, 517-432-8821

Abstract

Transition cow diseases can negatively impact animal welfare and reduce dairy herd profitability. Transition cow disease incidence has remained relatively stable over time despite monitoring and management efforts aimed at reducing the risk of developing diseases. Dairy cattle disease risk is monitored by assessing multiple factors, including certain biomarker test results, health records, feed intake, body condition score, and milk production. However, these factors, which are used to make herd management decisions, are often reviewed separately without considering the correlation between them. In addition, the biomarkers that are currently used for monitoring may not be representative of the complex physiological changes that occur during the transition period. Predictive modeling, which uses data to predict future or current outcomes, is a method that can be used to combine the most predictive variables and their interactions efficiently. The use of an effective predictive model for transition cow disease with relevant predictors will result in better-targeted interventions, and therefore lower disease incidence. This review will discuss predictive modeling methods and candidate variables in the context of transition cow diseases.
The next step is to investigate novel biomarkers and statistical methods that are best suited for the prediction of transition cow diseases.

Keywords: transition dairy cattle, disease prediction, predictive modeling, statistical methods

1. Introduction

Transition cow diseases are a set of disease conditions (e.g., mastitis, metritis, ketosis, displaced abomasum, and retained placenta) that occur during the transition period. The transition period for dairy cows is generally defined as approximately three weeks before to three weeks after calving (Drackley, 1999). During this time, disease susceptibility is greatest due to physiological changes that occur as the cow prepares for calving and transitions into milk production (Sordillo and Raphael, 2013). The enhanced energy demands during this time are accompanied by a diminished dry matter intake, resulting in a negative energy balance (NEB) (Hutjens and Aalseth, 2005). Negative energy balance results in the mobilization of energy stores from adipose tissue (Ingvartsen, 2006). These are normal physiological processes for cows during the transition period; however, failure to adequately maintain homeostasis may result in metabolic stress.

Physiological processes that contribute to metabolic stress include abnormal nutrient utilization, oxidative stress, and unregulated inflammation (Sordillo and Mavangira, 2014). Biomarkers representing altered nutrient metabolism typically measure lipid mobilization, which is defined as the imbalance between lipogenesis and lipolysis within the adipose tissue (Contreras and Sordillo, 2011). Oxidative stress is defined as an imbalance caused by increased [...] intermediates or repair damage caused by them (Betteridge, 2000). Lastly, inflammation typically occurs as a result of injury or destruction of tissues. However, all dairy cows experience some degree of inflammation, especially postpartum, during the transition period (Bradford et al., 2015).
Inflammation manifests as a series of responses including the production of inflammatory mediators (Sordillo and Mavangira, 2014). Although described separately, these three physiological processes (i.e., nutrient utilization, oxidative stress, and inflammation) are interrelated, often hard to differentiate, and have the potential to form destructive feedback loops. When homeostasis is interrupted, metabolic stress occurs and is associated with poor uterine health and risk of transition cow diseases (Esposito et al., 2014; Sordillo and Mavangira, 2014). Metabolic stress results in greater economic costs to the producer due to increased culling rates, treatment costs, impaired reproduction, and reduced milk yield (Melendez et al., 2003; Petrovski et al., 2006; LeBlanc, 2008). Despite advances in research on transition cow disease etiology, prevention, and treatment, transition cow disease incidence overall has remained stable with few exceptions (LeBlanc et al., 2006). For instance, the percentage of cows reported with clinical mastitis has steadily increased from 13.4% in 1995 to 24.8% in 2013 (NAHMS, 1996; NAHMS, 2018).

Biomarkers that indicate the energy state of the animal are frequently used for health monitoring. These biomarkers include NEFA (non-esterified fatty acids), BHB (beta-hydroxybutyrate), calcium, and body condition score (BCS). Threshold values for NEFA, BHB, and calcium have been established at the individual cow level to indicate high disease risk. To make whole-herd decisions, usually a proportion of the herd is sampled. Then, if the percentage of the sampled cows above a certain threshold of NEFA or BHB is high enough, dietary and environmental interventions are implemented. At the individual cow level, the sensitivity and specificity for [...] mmol/L measured in the first and second week after calving for detecting subclinical ketosis range from 87% to 93% and 93% to 100%, respectively (LeBlanc, 2010).
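As an illustration of how the sensitivity and specificity of a biomarker threshold are obtained, the following minimal sketch classifies each cow as test-positive when her biomarker value meets or exceeds a cut-point and compares the result with her true disease status. The BHB values, the ketosis statuses, the 1.2 mmol/L cut-point, and the function name are all hypothetical and chosen only for illustration; they are not taken from the studies cited above.

```python
def sensitivity_specificity(values, diseased, threshold):
    """Return (sensitivity, specificity) for the rule: test-positive if value >= threshold."""
    tp = fp = tn = fn = 0
    for value, is_diseased in zip(values, diseased):
        test_positive = value >= threshold
        if test_positive and is_diseased:
            tp += 1  # diseased cow correctly flagged
        elif test_positive and not is_diseased:
            fp += 1  # healthy cow incorrectly flagged
        elif not test_positive and is_diseased:
            fn += 1  # diseased cow missed
        else:
            tn += 1  # healthy cow correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical post-partum BHB concentrations (mmol/L) and ketosis status.
bhb = [0.6, 0.8, 1.0, 1.1, 1.3, 1.5, 1.6, 2.0, 2.4, 0.9]
ketosis = [False, False, False, True, True, True, False, True, True, False]

# Hypothetical cut-point, assumed only for this example.
se, sp = sensitivity_specificity(bhb, ketosis, threshold=1.2)
print(f"sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

Raising the cut-point in this sketch trades sensitivity for specificity, which is why a threshold reported in the literature is always tied to the outcome and sampling window it was derived for.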
However, BHB has only been demonstrated as a valuable biomarker post-parturition, when it is too late for proactive interventions to reduce disease incidence in that particular calving cohort. On the other hand, NEFA thresholds have been utilized for both pre-partum and post-partum cows, with threshold levels ranging from approximately 0.30 to 0.40 mmol/L (Oetzel, 2004; Ospina et al., 2013). However, Ospina et al. (2010) reported lower sensitivity and specificity of NEFA measured pre-partum (threshold [...]) [...] the following diseases: displaced abomasum, clinical ketosis, metritis, or retained placenta (Ospina et al., 2010). Pre-partum sensitivity and specificity were 48% (95% CI: 42-54) and 69% (95% CI: 67-72), while post-partum sensitivity and specificity were 75% (95% CI: 66-82) and 61% (95% CI: 58-64) (Ospina et al., 2010). Calcium concentrations can be used to identify hypocalcemia, which occurs when calcium levels are below the normal reference range of 8.5 to 10 mg/dL (Goff, 2008a). Serum calcium levels below 9.4 mg/dL measured one week pre-partum had very low sensitivity for displaced abomasum at 35%, while specificity was 71% (Chapinal et al., 2011). Herd-level sensitivity is negatively affected by low sensitivity at the individual level, so it is imperative to have high test sensitivity, especially pre-partum when there is still time to intervene (Ospina et al., 2013).

A potential method to increase sensitivity in detecting cattle at risk of transition diseases is to use multiple biomarkers that can capture the physiological complexity of metabolic stress. For instance, metabolic syndrome in humans, an analogous condition to metabolic stress in cows, is diagnosed by measuring a cluster of cardiometabolic biomarkers (IDF, 2006; Sordillo and Mavangira, 2014).
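The dependence of herd-level sensitivity on individual-level test sensitivity, noted above, can be sketched with a simple binomial calculation. The sketch assumes that a herd is flagged when at least k of n sampled cows test positive, that cows are sampled independently from a large herd, and that the test has fixed sensitivity and specificity; the sample size, alarm count, prevalence, and test characteristics below are hypothetical values chosen for illustration and do not come from the cited studies.

```python
from math import comb


def apparent_prevalence(true_prev, se, sp):
    """Probability that a randomly sampled cow tests positive."""
    return true_prev * se + (1 - true_prev) * (1 - sp)


def prob_herd_flagged(n, k, true_prev, se, sp):
    """P(at least k of n sampled cows test positive), assuming a binomial model."""
    ap = apparent_prevalence(true_prev, se, sp)
    return sum(comb(n, i) * ap**i * (1 - ap) ** (n - i) for i in range(k, n + 1))


# Hypothetical scenario: 12 cows sampled, herd flagged if 3 or more are positive,
# 30% of cows truly at risk, individual-level specificity fixed at 0.90.
n_sampled, alarm_count, prevalence, specificity = 12, 3, 0.30, 0.90

low = prob_herd_flagged(n_sampled, alarm_count, prevalence, se=0.35, sp=specificity)
high = prob_herd_flagged(n_sampled, alarm_count, prevalence, se=0.85, sp=specificity)
print(f"herd-level sensitivity with individual Se = 0.35: {low:.2f}")
print(f"herd-level sensitivity with individual Se = 0.85: {high:.2f}")
```

Under these assumptions, the probability of flagging a truly affected herd rises with individual-level sensitivity while everything else is held constant, which is the relationship the text describes. A finite-herd calculation would use the hypergeometric distribution instead.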
The biomarkers used for metabolic syndrome measure the different processes involved, including insulin resistance, visceral adiposity, dyslipidemia, glucose intolerance, hypertension, a proinflammatory state, and a prothrombotic state (Kaur, 2014). A patient does not need to present with all criteria to be diagnosed with metabolic syndrome (IDF, 2006). Metabolic stress in dairy cattle is very similar to metabolic syndrome in humans, in that both conditions involve interconnected physiological processes, where some or all processes are dysregulated. For metabolic stress in dairy cattle, the three major physiological processes involved are altered nutrient metabolism, oxidative stress, and inflammation. Biomarkers covering these three pillars of metabolic stress need to be implemented in monitoring so that dairy cattle undergoing metabolic stress may be better identified.

Even when multiple biomarkers and other monitoring tools (e.g. milk production, feed intake) are assessed together in multivariable analyses, the models are typically explanatory and not aimed at prediction and application in the field. It is important to understand the differences between predictive and explanatory methods before beginning analysis because variable selection, model evaluation procedures, and model validation vary greatly between the two methods. Explanatory modeling is used to test causal explanations between the predictors and the outcome, while predictive modeling is used if the goal is to predict future or current events (Sainani, 2014). Since the aim is to identify cattle that will later develop disease as a result of metabolic stress, the focus should be on predictive modeling. Often, the methods of explanatory and predictive modeling are conflated, and researchers assume that strong explanatory models have high predictive power, which is not always the case (Shmueli, 2010).
This paper will review predictive modeling methods in the context of transition cow diseases. Herein, we discuss what predictive modeling is and review suggestions for variable selection, study design considerations, and statistical analysis tools.

2. Predictive vs explanatory modeling

2.1 Explanatory modeling

Many studies assessing risk factors for transition diseases use explanatory methods, which are used to investigate a possible causal pathway between the variables (LeBlanc et al., 2005; Huzzey et al., 2007; Dubuc et al., 2010; Seifi et al., 2011; Sainani, 2014; Vanholder et al., 2015). For example, LeBlanc et al. (2004) explored impaired immune function as a potential causal factor in transition cow diseases. The study reported the association of biomarkers for immune function, including vitamin E, retinol, and beta-carotene, with clinical mastitis or retained placenta (LeBlanc et al., 2004). The methods were explanatory in that the aim was to explain a mechanism behind the development of disease. In explanatory modeling, potential variables are chosen based on their hypothesized role in the causal pathway. It is important to consider the primary predictors of interest, as well as a priori confounders and variables that are noted to be confounders in the dataset (Dohoo et al., 2003). The use of automated selection procedures, such as stepwise and best subset selection, in explanatory models is criticized because these procedures optimize model fit with no regard to the role of each variable (Sainani, 2014). Selection of variables is generally driven by p-values and the size of the coefficients (Sainani, 2014). Explanatory models are evaluated based on explanatory power and the strength of the relationship between explanatory variables and the outcome indicated by the model, and then validated based on model fit and internal validity (Shmueli, 2010).
2.2 Predictive modeling

To the best of our knowledge, only one study modeling disease risk in transition dairy cattle used predictive modeling methods (Vergara et al., 2014). As opposed to explanatory modeling, predictive modeling is focused on predicting future or current events, rather than on causal explanations. In predictive modeling, variables are chosen based on association, data quality, and availability of the predictors before the event of interest (Shmueli, 2010). The main goal of variable selection is to include the predictors that most improve the predictive value of the model (Sainani, 2014). Automated variable selection procedures are allowed, but it is suggested that the Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC) be used instead of the statistical significance and size of the coefficients of the variables (Shtatland et al., 2001; Shmueli, 2010; Sainani, 2014). This is because AIC and BIC focus on the strength of the evidence for a model rather than on hypothesis testing (Mazerolle, 2006) and balance model performance against model complexity (Burnham and Anderson, 2004). Predictive models are evaluated based on information criteria, discrimination (e.g. receiver operating characteristic [ROC] curve analysis), calibration, and goodness-of-fit (Sainani, 2014), and validated based on internal and external validity (Sainani, 2014; Shmueli, 2010).

In the end, explanatory and predictive models can produce greatly different results. In addition to their different goals and variable selection strategies, explanatory and predictive models also differ regarding the bias-variance trade-off. Explanatory models aim to minimize bias by obtaining the most accurate representation of the underlying theory, while predictive modeling aims to minimize the combination of bias and estimation variance, sometimes sacrificing theoretical accuracy for improved empirical precision (Shmueli, 2010).
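The information criteria mentioned above can be computed directly from a model's maximized log-likelihood, the number of estimated parameters, and (for BIC) the number of observations, using the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L). The sketch below is illustrative only: the model names, log-likelihoods, parameter counts, and sample size are hypothetical assumptions, not results from any fitted model.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidate models: name -> (maximized log-likelihood, parameters)
candidates = {
    "nutrient_only": (-150.0, 3),
    "combined": (-140.0, 7),
}
n_obs = 277  # hypothetical number of cows

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print("preferred by AIC:", best_by_aic)
print("preferred by BIC:", best_by_bic)
```

With these made-up numbers the two criteria disagree: the better fit of the larger model outweighs the AIC penalty, whereas the heavier BIC penalty on the four extra parameters favors the smaller model. This illustrates how both criteria balance fit against complexity, with BIC penalizing complexity more strongly at this sample size.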
It is likely that the variables, explanatory power, and predictive ability differ between explanatory and predictive models (Shmueli, 2010). In an example from human medicine, Singal et al. (2013) compared two models for prediction of hepatocellular carcinoma. One model was a conventional regression model built with explanatory modeling methods and the other was a predictive random forest model built using machine learning techniques. The authors found that both the variables and the predictive ability differed between the two methods, with the random forest model having the higher predictive ability (Singal et al., 2013). There is a need in transition cow research to consider predictive modeling techniques to determine which variables are most predictive, rather than explanatory.

3. Candidate variables

Candidate variables that have either demonstrated the ability to predict or are associated with transition cow diseases in prior studies should be considered during study design. Practical considerations, including ease of measurement and analysis, reliability, and affordability, should be incorporated in variable choice. Although not comprehensive, this section will review why some select variables merit attention for modeling transition cow diseases.

3.1 Biomarkers

There are generally three major classes of biomarkers: biomarkers of exposure, effect, and susceptibility. Biomarkers of exposure are generally used to assess environmental exposure to a substance, biomarkers of effect measure the success of a treatment, and biomarkers of susceptibility are used to identify who is at risk of a certain disease (Grandjean, 1995). For predictive disease modeling, biomarkers of susceptibility are of most utility as they can identify cows that have an elevated risk of disease. Specifically, the most useful biomarkers are those that indicate the underlying physiological processes that cause transition cow disease.
As mentioned earlier, metabolic stress brought on by the physiological transition from dry-off to early lactation is a key contributor to disease development. Effective and predictive biomarkers for metabolic stress are needed to identify cattle that are not transitioning smoothly. There are three components of metabolic stress that should be considered when choosing potential biomarkers: nutrient utilization, oxidative stress, and inflammation (Sordillo and Mavangira, 2014). As mentioned earlier, lipid mobilization markers are typically used to identify aberrant nutrient mobilization. Although the lipid mobilization biomarkers, including NEFA, BHB, and calcium, have immense value in the identification of cattle undergoing metabolic stress, these biomarkers only capture some of the physiological changes that occur. In the analogous human condition, metabolic syndrome, a patient typically needs to meet only three or more cardiometabolic criteria to be diagnosed. Although the underlying physiological processes are interconnected, humans with metabolic syndrome may not present with the same risk factors (IDF, 2006). Other groups have recognized that this is also true of metabolic stress in dairy cattle, in that multiple biomarkers are needed to capture the variability in how metabolic stress presents (Ingvartsen et al., 2003; Celi, 2010; Abuelo et al., 2013; Sordillo and Mavangira, 2014). There is a need to develop biomarkers that represent the other physiological processes involved in metabolic stress, namely oxidative stress and inflammation, to better identify cows at risk. Although not exhaustive, specific oxidative stress and inflammation biomarkers that deserve consideration will be discussed in the following sections.

3.1.1 Oxidative stress biomarkers

Assessing oxidative stress requires the measurement of both pro- and anti-oxidative status. Abuelo et al.
(2013) showed that the use of an Oxidative Stress Index (OSi) that measured both pro- and anti-oxidants was superior to using pro- and anti-oxidants separately. Pro-oxidants, such as reactive oxygen species (ROS), are inherently unstable molecules due to unpaired electrons (Betteridge, 2000). Although produced in response to normal metabolic activities, they can cause substantial damage in tissues when produced in excess (Sordillo and Aitken, 2009). The action of ROS can be counteracted by antioxidants, which are electron donors. A variety of pro- and anti-oxidant biomarkers have been described, with differing capabilities. For example, the measurement of Trolox equivalent antioxidant capacity is an extremely fast and simple measurement using spectrometry (Celi, 2010). However, this method has varying results with sample dilution and varying specificity (Celi, 2010).

Many vitamins also have anti-oxidant capabilities and can serve as biomarkers for transition diseases. Specifically, vitamins E and A and their derivatives are excellent candidates. Vitamin E and vitamin A are important cellular antioxidants and reduce the risk of clinical mastitis when supplemented in feed (LeBlanc et al., 2004; Spears and Weiss, 2008; Sordillo and Aitken, 2009; Yang and Li, 2015). Not only do vitamin E and vitamin A have anti-oxidant roles, but they also decrease inflammatory markers and support the immune system when supplemented in feed (Baldi, 2005; McDowell et al., 2007); hence, they can also be considered inflammatory biomarkers. Vitamin E and vitamin A have been studied in dairy cattle for more than 20 years, which speaks to the strength of evidence for their involvement in dairy cattle health (Weiss, 1998; Baldi, 2005).

3.1.2 Inflammatory biomarkers

Acute phase proteins and white blood cells are key players in systemic inflammation.
Tissue injury or infections stimulate inflammatory responses that increase acute phase proteins (APP), including haptoglobin, serum amyloid A (SAA), and albumin (Yang et al., 2006; Ceciliani et al., 2012; Pepys, 2012). White blood cells travel to the area of infection or injury and remove foreign cells, dead cells, or microorganisms. They dissipate once the stimulus disappears (Malarkey and McMorrow, 2011). The use of assays for APP in dairy cattle has grown in popularity for investigating infectious and inflammatory diseases, thereby resulting in increased availability of assays that measure APP concentrations (Ceciliani et al., 2012). In addition, epidemiological evidence has demonstrated relationships between APP and disease in dairy cattle. For example, a group in Canada found that haptoglobin and SAA were associated with multiple disease outcomes in transition dairy cattle, including retained placenta, mastitis, and lameness (Dervishi et al., 2015; Zhang et al., 2015; Dervishi et al., 2016a). Another group found that high albumin levels pre-calving significantly increased the risk of a disease event (Saun, 2004). Acute phase proteins can recruit white blood cells to the area of infection. For instance, SAA recruits monocytes to inflamed tissues (Badolato et al., 1994). In many studies, leukocytes, including neutrophils, lymphocytes, and macrophages, are associated with uterine diseases (Kulberg et al., 2002; Hammon et al., 2006; Goff, 2008b; Galvão et al., 2010). Milk somatic cell counts, which are composed mostly of leukocytes, are used to diagnose subclinical mastitis in dairy cattle (Barkema et al., 1998). The strong evidence for APP and leukocytes in relation to transition cow disease outcomes lends support for using these biomarkers in disease prediction.
Vitamin D is an optimal candidate biomarker for transition diseases due to the strong connection of vitamin D with inflammatory conditions in humans, including obesity and metabolic syndrome (Autier et al., 2014). Vitamin D is involved in both acute and chronic inflammation, and has potent immunomodulatory properties (Gombart, 2012). Vitamin D regulates the production of cytokines and inhibits the production of proinflammatory cells (Yin and Agrawal, 2014). In addition to its role in disease prevention, vitamin D is involved in the homeostasis of calcium, which is an important lipid mobilization biomarker involved in metabolic stress (Horst et al., 1994). In transition dairy cattle, a vitamin D deficiency may lead to hypocalcemia, or milk fever (DeGaris and Lean, 2008).

3.2 Health record

Farm health records include information such as lactation number, dry matter intake (DMI), disease history, milk yield, and milk quality. Most of these data are already routinely collected on farms, which makes them ideal candidate variables to consider for predictive modeling. Increasing lactation number, or parity, is associated with greater incidence of disease, including milk fever and mastitis (DeGaris and Lean, 2008; Abebe et al., 2016). There are also many behavioral differences between heifers (cattle in their 1st lactation) and cows (cattle in their 2nd or greater lactation), such as time spent feeding and frequency of visits to the feed bunk, which may affect DMI (Neave et al., 2017). Lower pre-partum DMI is associated with risk of metritis and milk fever (Goff and Horst, 1997; Huzzey et al., 2007). Disease history can offer insight into the immune system. For instance, cows infected with bovine leukemia virus (BLV) have dramatic shifts in lymphocyte function, which could cause a higher risk for subclinical and clinical mastitis (Sordillo and Erskine, 2010). Milk yield, both from the current lactation and previous lactations, is also associated with disease risk.
Cows with higher milk yield in the previous lactation have increased risk of retained placenta, mastitis, milk fever, and metritis (Grohn et al., 1995; Fleischer et al., 2001), whereas higher milk yield in the current lactation is associated with a higher risk of milk fever (Fleischer et al., 2001).

3.3 Housing and nutritional management

Similar to information from the health record, information about housing and nutritional management is readily available. A comfortable environment is critical for cow health. Comfortable stalls and walkways, soft mats, or deep bedding lower the risk of lameness (Cook and Nordlund, 2009; de Vries et al., 2015). Proper shade and cooling systems prevent heat stress, thereby improving DMI (Collier et al., 2006). In addition, cow health improves with longer stall length and greater freedom to move (e.g. free stalls instead of tie stalls) (Cook, 2003; deVries et al., 2004; Zurbrigg et al., 2005; Simensen et al., 2010). Grazing dairies have lower mortality and reduced risk of transition diseases (Bruun et al., 2002; Haskell et al., 2006; Burow et al., 2011). Pen changes and stocking density should also be considered during variable selection. Each pen change results in the establishment of a new rank order. Housing multiparous cows separately from heifers, avoiding overstocking, and careful management of maternity and sick pens can reduce social turmoil and disease incidence (Nordlund et al., 2006).

The nutritional composition of feed can also be incorporated in disease prediction. Proper nutritional management can reduce the severity and length of negative energy balance (NEB) and heat stress during early lactation (Santos et al., 2012; Das et al., 2016). Supplements in feed (e.g., zinc, copper, vitamin E, selenium, and beta-carotene) can reduce the incidence of many transition cow diseases, including clinical mastitis, metritis, and retained placenta (Spears and Weiss, 2008; Heinrichs et al., 2009).
Information from health records and on housing and nutritional management provides convenient variables that have consistent associations with disease. It is considered good practice to include a large pool of candidate variables when building predictive models, so any information that can be obtained from health records, housing, and nutritional management should be incorporated in model selection.

4. Study design considerations

When designing a study, it is important to consider the characteristics of the target population. Noteworthy considerations for dairy cattle include 1) the clustering of the cows within farms, 2) the differences between primiparous and multiparous cows, and 3) the presence of critical timepoints in the lactation cycle.

4.1 Hierarchical

A cluster refers to a group of observations that share some common features. For instance, clusters may be formed from individuals that share a common environment or from repeated measures collected from the same individual (Dohoo et al., 2003). For dairy cattle, potential cluster variables may include factors such as farm or pen. Careful calculation of sample size is important when using clustered data. For instance, when using farm as a cluster variable, the estimated number of farms and number of animals needed will depend on the variation in the covariates between farms compared with the variation within a farm. This can be quantified using the intraclass correlation coefficient (ICC), and an adjustment to the sample size calculation can be made if needed. Clustered samples typically require a larger sample size than simple random samples. The ICC ranges from 0 to 1: a high ICC (closer to 1) indicates that the variation within a group is small relative to the variation between groups, and a low ICC (closer to 0) means that most of the variation is within the groups (Dohoo et al., 2003).
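One common way to make the ICC-based adjustment described above is the design effect, DEFF = 1 + (m - 1) x ICC, where m is the number of animals sampled per cluster. The sketch below applies it under the simplifying assumption of equal cluster sizes; the function names and all input values are hypothetical, and this is offered as an illustration rather than as the calculation used in any cited study.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def clustered_sample_size(n_simple, cluster_size, icc):
    """Inflate a simple-random-sample size and convert it to a farm count.

    n_simple is the sample size required if cows were independent.
    Returns (total cows needed, number of farms needed).
    """
    inflated = n_simple * design_effect(cluster_size, icc)
    n_total = math.ceil(round(inflated, 6))  # round() guards against float noise
    n_clusters = math.ceil(n_total / cluster_size)
    return n_total, n_clusters


# Hypothetical inputs: 200 cows needed under independence, 20 cows per farm
low_icc = clustered_sample_size(n_simple=200, cluster_size=20, icc=0.05)
high_icc = clustered_sample_size(n_simple=200, cluster_size=20, icc=0.20)
print("ICC 0.05 -> cows, farms:", low_icc)
print("ICC 0.20 -> cows, farms:", high_icc)
```

Holding the number of cows per farm fixed, a higher ICC inflates the total sample size and therefore the number of farms that must be enrolled.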
As ICC increases, the optimal number of clusters increases and the sample size per cluster decreases (van Breukelen and Candel, 2012). For example, if the ICC estimating the variance between farms compared with the variance within farms was high, a greater number of farms would need to be enrolled, with a correspondingly smaller number of cows needed per farm.

Analyses must also be adjusted to account for the hierarchical structure of data. If clustering is not considered when it is present, the standard errors of the estimates may be underestimated. That increases the risk of falsely rejecting null hypotheses (making type I errors by concluding there is a difference when there is not really one). Different methods to account for clustering include stratifying the model by cluster, adjusting the standard errors, including the cluster as a fixed effect, or including the cluster as a random effect (Dohoo et al., 2003). However, predictive models for transition diseases should aim to be generalizable to different populations (e.g. farms). To build predictive models that are generalizable to different farms, farm must be used as a clustering variable and represented as a random effect (Galbraith et al., 2010). That is only viable when a sufficient number of farms are available in the sample. There may also be more than one clustering variable in a given study. For instance, cows may be nested in cohorts, which are defined as groups of cows expected to calve around the same time. In this case, three-level models can be used to adjust for both cohort and farm; however, sample size must be sufficient at both levels (Nezlek, 2008).

4.2 Primiparous and multiparous models

There are many physiological differences between primiparous and multiparous cattle.
For example, primiparous and multiparous cattle have differing concentrations of many biomarkers that are associated with disease risk (e.g., BHB, leptin, and NEFA) throughout the lactation cycle (Wathes et al., 2007). In addition, as mentioned previously, multiparous cattle have a greater disease risk than primiparous cattle. Including parity as a predictor in models may not be sufficient to adjust for the large physiological differences between primiparous and multiparous cows. It may be advantageous either to build separate predictive models for primiparous and multiparous cattle or to allow for interactions between parity and other variables in the same model. For example, Huzzey et al. (2011) reported separate models for primiparous and multiparous cattle for the associations between cortisol and transition disease outcomes. The authors found that higher cortisol levels were protective against disease in primiparous cows but increased the risk of disease in multiparous cows (Huzzey et al., 2011). This indicates differences in the physiological control of lipid mobilization in heifers and cows (Wathes et al., 2007). Therefore, the relationships between lipid mobilization biomarkers and disease outcomes may also be different in heifers and cows.

Multiparous cows, having had previous lactations, have additional information available in their health record that may improve disease prediction, including prior milk yield, peak milk, and days in milk (DIM) (Grohn et al., 1995; Fleischer et al., 2001). If primiparous and multiparous cattle were combined in the same model, primiparous cows would have missing values for these variables and be excluded from the analysis. In order to use all information available for prediction, separate models may be best. However, it is important to note that separate models have implications for sample size.
The researcher must ensure ahead of sampling that they will have an adequate number of primiparous and multiparous cows for meaningful statistical power in both models. If sample size permits, researchers should explore separate models for primiparous and multiparous cows given their many differences.

4.3 Critical timepoints

There are critical time points during the lactation cycle that are indicative of huge physiological shifts and important management decisions. For instance, dry-off is the time point in the lactation cycle where the farm staff stop milking the cows in preparation for the next calving. At this time, cows receive vaccinations and dry cow treatment regimens that are associated with later risk of diseases (Green et al., 2007). Nutritional management of dry cows is also critical to aid calcium metabolism and to control the release of NEFA from adipose tissue (Overton and Waldron, 2004). In the period from late pregnancy to early lactation, all dairy cattle undergo physiological stresses, including insulin resistance, reduced immune function, and bacterial contamination of the uterus, that increase the risk of disease (LeBlanc, 2010). Dairy cattle researchers should aim to design studies that capture these critical time points through a longitudinal study design. This way, changes between the time points can be monitored and optimal times for intervention strategies during the lactation cycle may be elucidated.

5. Statistical analysis

There are several types and variations of statistical analyses that can be used to create predictive models, and researchers should choose the analysis that will work best for their research question and dataset. A few topics to consider when choosing a statistical model for transition disease prediction are as follows: 1) purpose of the analysis (e.g. farm or individual cow level predictions), 2) sample size availability, 3) scale of independent and dependent variables (e.g.
continuous, categorical, ordinal), and 4) whether the data can satisfy the assumptions of the analysis (McCrum-Gardner, 2008). The following section reviews statistical analysis options to consider before model building.

5.1 Predictive models

5.1.1 Logistic regression

Logistic regression models are well-suited for testing the association between a binary outcome and continuous and/or categorical predictors (Peng et al., 2002). In the case of modeling transition diseases, the binary response variable would be coded as 0 (i.e. no disease) or 1 (i.e. disease) (Long and Freese, 2006). The coefficients in the logistic regression may be converted to odds ratios (OR) so that researchers can evaluate how each covariate contributes to prediction (Hosmer et al., 2013). For prospective data, relative risks (RRs) can also be estimated within the generalized linear model framework. Relative risk estimates are more easily interpreted than ORs (Schmidt and Kohlmann, 2008). Logistic regression requires that the probabilities are a logistic function of the independent variables, the model includes all relevant variables, the observations are independent and measured without error, and the variables are not linear combinations of each other (Harrell, 2015).

In the literature, there are many examples of studies that used logistic regression to model transition disease outcomes. For example, LeBlanc et al. (2004) investigated the associations of vitamin E, retinol, and beta-carotene with retained placenta and clinical mastitis in two separate models. The authors found that pre-partum alpha-tocopherol was negatively associated with risk of retained placenta and retinol levels were negatively associated with the risk of clinical mastitis (LeBlanc et al., 2004). In another study, Huzzey et al. (2007) used logistic regression analysis to test the associations of pre-partum feeding time and DMI with metritis.
The authors found that increased feeding time and DMI were associated with a lower risk of metritis (Huzzey et al., 2007).

The use of logistic regression is widespread, probably due to its ease of interpretation and its relationship to analyses of contingency tables (Lemeshow and Hosmer, 1982). Logistic regression models can accommodate categorical predictors, which is essential in transition disease prediction where many potential predictors are categorical (e.g. parity, prior disease, co-morbidities). Another strength of logistic regression analysis for predictive modeling is that interaction terms can be used. Interaction occurs when the effect of two variables together is different from the sum of their individual effects (Dohoo et al., 2003). Including interaction terms can improve predictive performance (Kerr and Pepe, 2013). Logistic regression models are an excellent candidate for transition disease models and should be considered if the researcher has access to statistical packages. However, in the case of small sample sizes, perfect prediction or empty cells for certain covariate patterns are more likely to occur. This leads to unstable coefficients and standard errors and may lead to the exclusion of important covariates (Zorn, 2005). Exact logistic regression, a variation of logistic regression, can be used to remedy this issue. Exact logistic regression handles small sample sizes and empty cells by constructing a statistical distribution that can be completely enumerated (Hosmer et al., 2013).

5.1.2 Discriminant function analysis

Discriminant function analysis is used to determine which continuous variables discriminate between two or more groups (McLachlan, 2004). For instance, a researcher may investigate which variables discriminate between cows that get mastitis and those that do not.
Discriminant analysis is a two-step process: testing the significance of a set of discriminant functions, which is computationally identical to multivariate analysis of variance (MANOVA), followed by classification. During classification, discriminant analysis automatically determines which set of variables best discriminates between the groups (McLachlan, 2004). Discriminant analysis answers the same basic questions as logistic regression but has additional assumptions. Discriminant analysis has the same assumptions as MANOVA, including equal sample sizes between groups, a normal distribution of the data, homogeneity of variances/covariances, absence of outliers, and independent means and variances. In addition, discriminant analysis requires that the variables used to discriminate between groups are not highly correlated, which can be determined by calculating tolerance values (Klecka, 1980).

To the best of our knowledge, discriminant function analysis has not yet been used to model transition cow diseases but has been used to investigate multifactorial conditions in humans, including metabolic syndrome (Corica et al., 2008; Leite et al., 2007). A benefit of using discriminant analysis is that it can handle smaller sample sizes than logistic regression. However, as mentioned earlier, another option for small sample sizes is exact logistic regression. Exact logistic regression may be a better option than discriminant analysis in transition dairy cattle disease studies because discriminant analysis assumptions are often difficult to meet. Discriminant analysis requires that the groups (disease vs. no disease) are approximately equal in size. It is unlikely that close to half of all cows will present with clinical disease. For instance, clinical mastitis, which is the most common transition disease, only has a median incidence risk of 14.2% (Van Saun and Sniffen, 2014). This issue can be addressed through retrospective studies (e.g.
case-control) that can select equal numbers of diseased and non-diseased cows. In prospective studies (e.g. cohort), this can be addressed by collecting data on a much larger sample and then randomly subsampling from the healthy group to get balanced sample sizes. However, this option may not be cost-effective. Another consideration for discriminant analysis is that it does not handle categorical variables as well as logistic regression (Press and Wilson, 1978), so researchers may have to exclude important predictors, such as parity.

5.1.3 Factor analysis and structural equation modeling

The intent of factor analysis is to determine the number or nature of factors that account for variation and covariation among a set of variables, or indicators (Brown, 2015). In other words, the goal is dimension reduction: combining many indicator variables into a smaller number of latent factors that are each broader constructs than the individual indicators. In factor analysis, the latent factor is considered the cause of the observed variables and explains why they are correlated (Brown, 2015). In general, there are two types of factor analysis: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Exploratory factor analysis is used to explore the underlying factor structure of a set of variables, whereas CFA is based on prior knowledge or a conceptual foundation (Suhr, 2006). Typically, factor analysis is used in the social sciences for survey development to account for correlations between questions related to the same topic (Brown, 2015). This method is useful because it accounts for correlation among related items. For example, in a mental health survey, all answers to the questions related to depression will be correlated, and factor analysis can be used to derive a measure of depression as a latent construct that explains those correlations (Brown, 2015). That simplifies studying the relationship of depression to other constructs.
To the best of our knowledge, factor analysis has not been used to model transition diseases. However, in theory, factor analysis could apply to grouping transition cow metabolic stress biomarkers, because they relate to the same physiological processes and most likely are correlated. In fact, multiple studies have used CFA to model biomarkers for metabolic syndrome in humans (Pladevall et al., 2006; Li and Ford, 2007; Boronat et al., 2009; Gurka et al., 2014). Factor analysis has many advantages, including its high-power data reduction capabilities and the ability to handle measurement errors efficiently (Brown, 2015). The output of factor analysis is a factor score, which is a composite variable that relays information about an individual's placement on the factor (DiStefano et al., 2009). For the example of transition cow metabolic stress biomarkers, the factor score would correspond to the degree of metabolic stress an individual cow is undergoing. The factor scores can then be used in further analyses to predict disease risk, as in structural equation modeling.

However, factor analysis has some disadvantages. The final factor scores may be difficult to interpret because there is no single solution. The lack of a single solution arises from the fact that the parameters are not uniquely defined, and the researcher has choices in how the model is specified (DiStefano et al., 2009). This indeterminacy results in a variable number of solutions that can produce varying factor scores (Grice, 2001). There are different variations of factor analysis that also allow for binary, count, and censored variables (Brown, 2015; Van der Ark et al., 2012). However, the assumptions of factor analysis in its basic form are: a normal distribution for each observed variable, normal bivariate distributions for each pair of observed variables, as well as a normal multivariate distribution (Suhr, 2006).
Although transforming variables often solves the problem of non-normality for continuous variables, adjusting for non-linearity of the associations between biomarkers for metabolic stress and disease outcomes may be more difficult. In some situations, very low or very high levels of a biomarker are harmful, whereas intermediate levels are protective. For example, both low and high BCS at calving are associated with increased incidence of ketosis, metritis, retained placenta, and mastitis (Kim and Suh, 2003; Bewley and Schutz, 2008; Roche et al., 2009). This suggests a quadratic relationship between BCS and disease, rather than a linear relationship.

5.1.4 Classification and regression trees (CART)

Classification and Regression Trees (CART) are obtained by partitioning the data in multiple steps to form a simple prediction model. The steps are portrayed as a decision tree (Loh, 2011). Classification trees are used when the dependent variable is categorical (i.e. high disease risk or low disease risk), whereas regression trees are used when the dependent variable is continuous (e.g. level of disease risk). The independent variables can be either categorical or continuous (Ma, 2018). Steensels et al. (2016) developed a CART model to detect post-calving lameness and metritis and reported acceptable sensitivity and specificity at 69% and 87%, respectively. Classification and Regression Trees have also been used to predict difficult calvings (Zaborski et al., 2016; Fenlon et al., 2017). CART models have many advantages, including ease of use, clear rules that are easy to interpret, and the absence of assumptions that statistical models require. This could be of great benefit for practical use on dairy farms. A potential limitation of CART models is that they can only handle one variable at a time, so there is no way to account for interactions between the variables (Loh, 2011).
In addition, CART diagrams are unstable, as slight changes in the data can result in a vastly different tree (Li and Belford, 2002). Fenlon et al. (2017) reported that multinomial regression, which is similar to logistic regression, outperformed the CART diagram for prediction of difficult calvings. CART diagrams are a valid option for predictive modeling and should be considered if more advanced statistical models are unavailable or unfeasible.

5.1.5 Count models (Poisson and negative binomial)

Dairy herd managers are often concerned about health at the herd level. Health at the herd/cohort level can be assessed using incidence rates, which are defined as the number of new cases of a disease that occur during a specified time of interest in the population of interest (Dohoo et al., 2003). The cohort level, which is defined as cows at the same stage of lactation (e.g. dry off, close-up, calving), may be more useful in targeting immediate interventions. This is because it offers information on the most current group of cows undergoing the transition from pregnancy to lactation. Poisson models are particularly useful for modeling incidence, because the Poisson distribution provides more accurate estimates of count data compared with the normal distribution (Long and Freese, 2006). To calculate herd/cohort level incidence, an exposure offset can be included in a Poisson model that adjusts for the number of cows in the herd (Dohoo et al., 2003). Poisson models hold a strict assumption that the variance is equal to the mean. However, most count data do not meet this assumption because the variance exceeds the mean. This excess variability, which is called over-dispersion, causes a greater difference between the observed and predicted values than expected (Hilbe, 2014). Models that incorporate a linear dispersion correction (e.g. negative binomial) may be used in place of Poisson regression to account for this discrepancy (Hilbe, 2014).
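As a rough illustration of the equal mean and variance assumption, the sketch below computes a crude variance-to-mean ratio for cohort case counts, along with the observed incidence per cow at risk. The counts, cohort sizes, and function names are hypothetical and are not taken from any study. A formal assessment would instead examine the dispersion of a fitted model (e.g. the Pearson chi-square statistic divided by the residual degrees of freedom).

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean of the counts.

    Values near 1 are compatible with the Poisson assumption;
    values well above 1 hint at over-dispersion. This is a crude
    marginal check that ignores covariates.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; ratio is undefined")
    return variance(counts) / m


def incidence_per_cow(cases, cows_at_risk):
    """Observed cases divided by cows at risk, one value per cohort."""
    return [c / n for c, n in zip(cases, cows_at_risk)]


# Hypothetical data: disease cases and cows at risk in six cohorts.
cases = [2, 0, 5, 1, 9, 3]
cows = [15, 14, 16, 15, 15, 13]

ratio = dispersion_ratio(cases)
risks = incidence_per_cow(cases, cows)
print(f"variance/mean ratio: {ratio:.2f}")
print("incidence per cow:", [round(r, 2) for r in risks])
```

In this made-up example the ratio is well above 1, which would be a reason to consider a negative binomial model rather than a Poisson model.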
Zero-inflated and zero-truncated models are versions of Poisson/negative binomial models, which can be used in situations when there are excess zeros and when zero counts cannot be observed, respectively (Long and Freese, 2006). Poisson and negative binomial models have been used to model transition diseases in numerous studies. For example, Richert et al. (2013) developed a Poisson model to investigate risk factors for mastitis, ketosis, and pneumonia. If a researcher is interested in herd level prediction, Poisson or negative binomial models are viable options.

5.1.6 Generalized linear models

The general linear model (GLM) comprises both multiple regression and analysis of variance, which allow researchers to investigate the association between one or more independent variables and a continuous dependent variable. The generalized linear model (GLiM) is a flexible rendition of the GLM that can handle binary, count, and ordered and unordered categorical outcomes (Hardin and Hilbe, 2007). Due to this flexibility, the GLiM can adapt to both logistic and Poisson regression, which, as mentioned earlier, can be used to predict disease risk. Model building requires specifying the model in two parts: equations linking the response and explanatory variables (e.g. identity, logit, log, reciprocal), and the probability distribution of the response variable (e.g. normal, binomial, Poisson, gamma) (Hardin and Hilbe, 2007). There are many advantages to GLiMs, including the ability to handle situations where the variance of the dependent variable is dependent on the mean (Turner, 2008). In addition, the GLiM has greater statistical power for non-normal response data compared to a standard linear model (Ngo, 2016). The assumptions of the GLiM are: homoscedasticity of the error terms, uncorrelated errors for individual cases, and the conditional expected value of the errors must be equal to zero. However, errors do not need to be normally distributed (Hardin and Hilbe, 2007).
Logistic and Poisson regression are both special cases of GLiMs. They can be used in the GLiM framework as a more flexible approach to consider if researchers have difficulty meeting certain model assumptions. Many researchers have utilized GLiMs to model transition cow diseases, both using Poisson regression (Lima et al., 2012; McArt et al., 2012; McArt et al., 2013a) and logistic regression (LeBlanc et al., 2002; LeBlanc et al., 2004; LeBlanc et al., 2005; Benzaquen et al., 2007; Seifi et al., 2011). The assumption of uncorrelated errors for GLiMs requires that the observations are independent. This may not be the case if cows are clustered within cohorts or farms. As mentioned earlier, cows within a farm or cohort may be more alike compared to cows in different farms or cohorts. Data can also be correlated if multiple measurements were taken from the same cow (e.g. multiple time points). Measurements from the same cow are more likely to be alike than measurements from different cows. Researchers can further relax the assumption of uncorrelated errors by moving to generalized linear mixed models (GLMMs). The GLMM can account for this correlation through the addition of a random effect term for each level of clustering (e.g. farm, cohort, cow) (Rabe-Hesketh and Skrondal, 2008).

5.1.7 Summary

In summary, there are a variety of statistical models to consider, each with its own strengths, weaknesses, and assumptions. It is up to the researcher to determine which predictive model is best for their analysis, but there are multiple factors to consider. For instance, if the sample size is limited, exact logistic regression, CART diagrams or discriminant factor analysis are best. For data not conforming to certain assumptions (e.g. normality of response variable), a more flexible analysis using GLiM may be warranted.
Factor analysis may be a valid option for researchers having issues with multicollinearity of independent variables, but researchers should be wary of non-linear relationships between predictors and the outcome. If health at the cohort or herd level is the purpose of the analysis, count models may be a preferred choice, while many of the other models mentioned are useful for prediction at the individual cow level (e.g. logistic regression). This is not an exhaustive list of considerations, but as mentioned before, there are many factors to consider. Researchers should plan the statistical analysis while designing the study, so that the study design can support the type of statistical analysis that will work best with the resources available.

5.2 Additional techniques for predictive models

There are additional techniques specific to predictive models that have yet to be incorporated in modeling transition cow diseases. Some examples that deserve attention include model averaging and shrinkage.

5.2.1 Model averaging

Often modeling procedures, such as variable selection and CART, are sensitive to slight changes in the data (Shtatland et al., 2004). Model averaging is a technique that reduces model uncertainty by combining or averaging multiple potential models (Ghosh and Yuan, 2009). There are many ways to implement model averaging, but most methods start with performing best subsets selection to identify a set of potential models. Then, regression coefficients or predictions can be averaged over multiple potential models (Shtatland et al., 2004). A more advanced technique is Bayesian model averaging, which involves direct model selection, combined estimation, and combined prediction using Bayesian inference (Fragoso et al., 2018). Bayesian methods incorporate the use of probability for quantifying and adjusting for uncertainty in inference (Gelman et al., 2014).
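To make the idea of averaging predictions over candidate models concrete, the following minimal sketch weights each candidate model by its Akaike weight and averages the predicted probabilities cow by cow. The weighting scheme is an assumption chosen for illustration, since the text above does not prescribe one, and the AIC values, predictions, and function names are hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to 1.

    Models with lower AIC receive larger weights.
    """
    best = min(aic_values)
    rel_likelihood = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel_likelihood)
    return [r / total for r in rel_likelihood]


def average_predictions(predictions_by_model, weights):
    """Weighted average of predicted probabilities across models.

    predictions_by_model: one list of predictions per model,
    all in the same cow order.
    """
    n_cows = len(predictions_by_model[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions_by_model))
        for i in range(n_cows)
    ]


# Hypothetical candidate models from a best subsets search.
aic = [210.4, 211.0, 214.9]
predictions = [
    [0.10, 0.55, 0.80],  # model 1: predicted disease risk for 3 cows
    [0.15, 0.50, 0.70],  # model 2
    [0.30, 0.40, 0.90],  # model 3
]

weights = akaike_weights(aic)
averaged = average_predictions(predictions, weights)
print([round(w, 3) for w in weights])
print([round(p, 3) for p in averaged])
```

Averaging predictions rather than coefficients avoids having to decide how to treat a coefficient in models where the corresponding variable was not selected.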
5.2.2 Shrinkage

Typically, predictive models will not predict as well in future samples as they did in the samples used to build them (Ivanescu et al., 2016). Shrinkage is a model validation tool that can account for this discrepancy (Ivanescu et al., 2016). Shrinkage pulls the regression coefficients towards zero, which moves the predictions towards the average risk and improves predictive value (Steyerberg et al., 2001a). This also helps against overfitting, which occurs when there are few events compared to predictors. Overfitting causes the model to underestimate the risk in low-risk individuals and overestimate the risk in high-risk individuals (Pavlou et al., 2015). There are a variety of shrinkage techniques to consider, including shrinking the coefficients by a common factor (e.g. 20%) or by penalized regression (e.g. lasso regression) (Pavlou et al., 2015). Although not all researchers have the ability to incorporate model averaging or shrinkage, they should be considered as potential options to improve predictive ability.

5.2.3 Model evaluation and validation

As mentioned earlier, predictive models should be evaluated for their discriminant ability, calibration, and goodness-of-fit (Sainani, 2014). Discriminant power is the ability of the model to systematically assign higher predicted probabilities to those that have the outcome (disease) compared to those that do not (no disease) (Sainani, 2014). A common test that measures discriminant power is ROC curve analysis. From the ROC curve, the area under the curve (AUC) can be calculated, which gives the probability that a randomly selected diseased individual has a greater test value compared with a randomly selected individual with no disease (Dohoo et al., 2003). Model calibration is a measure of the accuracy of predictions, which is usually assessed through calibration plots or can be tested through the Hosmer-Lemeshow deciles of risk test (Sainani, 2014; Hosmer et al., 2013).
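The probabilistic interpretation of the AUC given above can be checked directly on a small example by comparing every diseased individual with every non-diseased individual. The sketch below is only an illustration of that definition; the predicted probabilities are made up, the function name is hypothetical, and counting ties as one half is a common convention assumed here.

```python
def auc_by_pairs(scores_diseased, scores_healthy):
    """Proportion of (diseased, healthy) pairs in which the diseased
    individual has the higher score; ties count as one half."""
    if not scores_diseased or not scores_healthy:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))


# Hypothetical predicted probabilities of disease.
diseased = [0.9, 0.8, 0.4]
healthy = [0.3, 0.5, 0.2, 0.4]

auc = auc_by_pairs(diseased, healthy)
print(auc)  # 10.5 favourable pairs out of 12 -> 0.875
```

An AUC of 0.5 corresponds to no discrimination, and an AUC of 1.0 means every diseased individual scored higher than every non-diseased individual.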
Calibration plots and the Hosmer-Lemeshow deciles of risk test can group the observations based on deciles of predicted probabilities (e.g. 0.10, 0.20, 0.30, and so on) and compare the observed versus expected number of individuals in each decile group (Hosmer et al., 2013). Goodness-of-fit tests depend on the statistical model used. For example, the Hosmer-Lemeshow deciles of risk test also serves as a goodness-of-fit test for logistic regression (Hosmer et al., 2013), and the deviance and Pearson goodness-of-fit tests can be used for Poisson models (Long and Freese, 2006). It is the responsibility of the researcher to choose appropriate goodness-of-fit tests for the analysis.

Predictive models should also be validated. External validation involves testing the model in an entirely new data set. However, sometimes this is not feasible due to cost or time (Sainani, 2014). Internal validation is a method that aims to improve the performance of the model in new subjects by partitioning the current dataset and testing how the model performs in this hypothetical set of new subjects (Steyerberg et al., 2001b; Pavlou et al., 2015). Internal validation can be done using a split-sample approach, where a proportion of the data are used to build the model and the rest of the data are used to test the model. Cross-validation is another option that is an extension of split-sample, where a randomly selected proportion of the data are used to build the model, and the other proportion is used to test the model, and vice versa. This procedure is performed multiple times and the results are then averaged (Steyerberg et al., 2001b). The internal validation method that is most efficient and resistant to bias and variability is called bootstrapping (Steyerberg et al., 2001b). The bootstrap procedure involves the random selection of a sample with replacement from the original dataset.
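Drawing samples with replacement, as just described, can be sketched as follows. This is a simplified illustration under stated assumptions: the data, the performance measure (proportion correctly classified at a 0.5 cut-off), and all function names are hypothetical, and the predictions are held fixed, whereas a full bootstrap validation would refit the model within each sample.

```python
import random


def bootstrap_samples(data, n_samples=200, seed=0):
    """Yield n_samples resamples of data, each drawn with replacement
    and each the same size as the original dataset."""
    rng = random.Random(seed)
    n = len(data)
    for _ in range(n_samples):
        yield [data[rng.randrange(n)] for _ in range(n)]


def proportion_correct(sample, cutoff=0.5):
    """Share of (predicted probability, outcome) pairs classified correctly."""
    correct = sum(1 for p, y in sample if (p >= cutoff) == (y == 1))
    return correct / len(sample)


def bootstrap_mean(data, performance, n_samples=200, seed=0):
    """Average a performance measure over the bootstrap samples."""
    values = [performance(s) for s in bootstrap_samples(data, n_samples, seed)]
    return sum(values) / len(values)


# Hypothetical (predicted probability, observed outcome) pairs.
data = [(0.9, 1), (0.2, 0), (0.7, 1), (0.4, 1), (0.1, 0), (0.6, 0)]

estimate = bootstrap_mean(data, proportion_correct)
print(round(estimate, 3))
```

Fixing the seed makes the resampling reproducible, which is convenient when results need to be reported or rechecked.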
The bootstrap procedure selects multiple samples (usually 200 or more), each the same size as the original dataset, and averages the predictive performance over all the random samples (Steyerberg et al., 2001b; Pavlou et al., 2015). If there is no opportunity for external validation in new datasets, researchers should explore internal validation options.

6. Conclusion

Transition cow disease incidence has remained steady over time despite improved monitoring and management efforts. Currently, cows are monitored for risk using a multitude of factors, including nutrient mobilization biomarker levels, health records, and milk production. These factors are often assessed separately and not systematically. The biomarkers typically measured for monitoring, NEFA and BHB, are not validated for use in the early dry period and fall short in sensitivity and specificity. A potential solution to improving sensitivity and specificity is to incorporate additional biomarkers that can capture the biological complexity of metabolic stress, including biomarkers for oxidative stress and inflammation. In addition, most researchers use explanatory modeling methods, rather than predictive modeling methods. There is a need for transition cow epidemiologists to explore and evaluate modeling aimed towards prediction, rather than explanation.

CHAPTER 2: PREDICTIVE MODELS FOR EARLY LACTATION DISEASE IN TRANSITION DAIRY CATTLE AT DRY-OFF

Wisnieski, L. a, Norby, B. a, Pierce, S.J. b, Becker, T. c, Gandy, J. a, and Sordillo, L.M.
d

a Department of Large Animal Clinical Sciences, Michigan State University, 736 Wilson Rd, East Lansing, MI, USA, 48824, wisnies5@msu.edu
b Center for Statistical Training and Consulting, Michigan State University, 293 Farm Lane, East Lansing, MI, USA, 48824, Steve.Pierce@cstat.msu.edu
c Department of Food Science and Human Nutrition, Michigan State University, 469 Wilson Rd, East Lansing, MI, USA, 48824, beckert4@msu.edu
d Corresponding author, Department of Large Animal Clinical Sciences, Michigan State University, 784 Wilson Rd, East Lansing, MI, USA, 48824, sordillo@msu.edu, 517-432-8821

Published in Preventive Veterinary Medicine, 2019, 163(1), 68-78.

Abstract

During the transition period, dairy cattle undergo tremendous metabolic and physiological changes to prepare for milk synthesis and secretion. Failure to sufficiently regulate these changes may lead to metabolic stress, which increases risk of transition diseases. Metabolic stress is defined as a physiological state consisting of 3 components: aberrant nutrient metabolism, oxidative stress, and inflammation. Current monitoring methods to detect cows experiencing metabolic stress involve measuring biomarkers for nutrient metabolism. However, these biomarkers, including non-esterified fatty acids, beta-hydroxybutyrate, and calcium, are typically measured a few weeks before to a few days after calving. This is a retroactive approach, because there is little time to integrate interventions that remediate metabolic stress in the current cohort. Our objective was to determine if biomarkers of metabolic stress measured at dry-off are predictive of transition disease risk. We designed a prospective cohort study carried out on 5 Michigan dairy farms (N = 277 cows). We followed cows from dry-off to 30 days post-calving.
Diseases and adverse outcomes were grouped in an aggregate outcome that included mastitis, metritis, retained placenta, ketosis, lameness, pneumonia, milk fever, displaced abomasum, abortion, and death of the calf or the cow. We used best subsets selection to select candidate models for four different sets of models: one set for each component of metabolic stress (nutrient metabolism, oxidative stress, and inflammation), and a combined model set that included all 3 components. We used model averaging to obtain average predicted probabilities across each model set. We hypothesized that the averaged predictions from the combined model set with all 3 components of metabolic stress would be more effective at predicting disease than each individual component model set. The area under the curve estimated using receiver operating characteristic curves for the combined model set (0.93; 95% confidence interval [CI] = 0.90-0.96) was significantly higher compared with averaged predictions from the inflammation (0.87; 95% CI = 0.83-0.91), oxidative stress (0.78; 95% CI = 0.72-0.84), and nutrient metabolism (0.73; 95% CI = 0.67-0.79) model sets (p < 0.05). Our results indicate that it may be possible to detect cattle at risk for some transition diseases as early as dry-off. This has important implications in disease prevention, as earlier identification of cows at risk of health disorders will allow for earlier implementation of intervention strategies. A limitation of the current study is that we did not perform external validation. Future validation studies are needed to confirm our findings.

Keywords: cattle, disease, prediction, lactation, modeling, dairy, biomarkers

1. Introduction

Dairy cattle face tremendous metabolic and physiological changes during the transition from late gestation to early lactation in preparation for calving and milk production.
These dramatic changes lead to an increased risk of many health conditions, including mastitis, metritis, ketosis, and displaced abomasum (LeBlanc et al., 2005; Huzzey et al., 2007; Dubuc et al., 2010; Ospina et al., 2010). During this time, increased energy demands, decreased dry matter intake (DMI), negative energy balance (NEB), insulin resistance, lipolysis, and body condition loss occur in essentially all cows (LeBlanc, 2010). Even though these processes are normal in moderation at this point in the lactation cycle, failure to cope with these changes can lead to metabolic stress and increased risk for disease. Metabolic stress can be described as three physiological processes: altered nutrient metabolism, oxidative stress, and inflammation. These three processes can function in a synergistic way to exacerbate metabolic stress (Sordillo and Raphael, 2013; Sordillo and Mavangira, 2014).

In commercial dairy herds, the most commonly used biomarkers to monitor metabolic stress in cattle are nutrient metabolic biomarkers, including non-esterified fatty acids (NEFA), beta-hydroxybutyrate (BHB), and body condition score (BCS). These biomarkers represent the balance of mobilization of excess body fat tissue as a result of NEB (Sordillo and Mavangira, 2014). As mobilization of body fat increases, plasma NEFA and BHB concentrations increase. Prepartum NEFA concentrations greater than 0.30 or 0.40 mmol/L and postpartum BHB concentrations greater than 1 ... at risk of disease (LeBlanc, 2010; Ospina et al., 2010). Body condition score, which is an assessment of the fat reserves of the cow, signifies the overall energy balance of the cow (Roche et al., 2009). Cows that have a higher BCS undergo greater degrees of lipid mobilization (Treacher et al., 1986; Sordillo and Raphael, 2013).
In addition to the biomarkers that are most commonly used, calcium concentrations less than 2.0 mmol/L are sometimes used to identify hypocalcemia, which can occur due to the dramatically increased calcium demands for milk production (Goff, 2008a). Additionally, cholesterol concentrations, although not routinely used to monitor dairy cattle, are considered markers of energy balance and represent overall lipoprotein concentrations and liver metabolism (Bobe et al., 2004; Cavestany et al., 2005). Lipoproteins transport triglycerides that accumulate in the liver as a consequence of excess fatty acid release from adipose tissue (Bobe et al., 2004). Although no specific cut-off values are established, increased cholesterol concentrations a week before parturition are associated with an increased odds of retained placenta (Quiroz-Rocha et al., 2009).

Due to the physiological complexities during the non-lactating period that may contribute to metabolic stress, current monitoring strategies that focus only on nutrient metabolism may not adequately identify cows at risk of early lactation diseases. Adding biomarkers for oxidative stress and inflammation to the list of current nutrient metabolism biomarkers may help capture the complexity of metabolic stress (Ingvartsen et al., 2003; Ingvartsen, 2006; Sordillo and Mavangira, 2014). There are several oxidative stress and inflammation biomarkers that have physiological roles in metabolic stress, suggesting they may also be useful for disease prediction. During the transition period, oxygen requirements increase, causing a greater production of oxidants (Sordillo and Aitken, 2009). Oxidative stress occurs when the balance of oxidants and anti-oxidants is disrupted. Oxidative stress can cause cell death and structural tissue damage that can increase the risk of disease.
Measuring oxidative stress typically involves direct and indirect measures of oxidants and anti-oxidants (Lykkesfeldt and Svendsen, 2007). A widely used marker, reactive oxygen species (ROS), is a measure of oxygen-centered free radicals and their metabolites (Celi, 2010). Commonly measured antioxidants include alpha tocopherol (vitamin E), vitamin A, and beta-carotene, which have the ability to counteract the action of harmful oxidants (Lykkesfeldt and Svendsen, 2007; Celi, 2010).

Causes of inflammation during the transition period include mammary gland differentiation and proliferation, uterus and placenta interaction, immunosuppression, and physical effort during calving (Trevisi et al., 2011). Acute phase proteins (APP), such as albumin, haptoglobin, and serum amyloid A (SAA), are involved in the acute and systemic response to inflammation (Ceciliani et al., 2012). The APP reduces the synthesis of liver proteins, causing greater risk for health issues around calving (Trevisi et al., 2011). White blood cells (WBC), particularly neutrophils, macrophages, and lymphocytes, are readily recruited to the site of inflammation to eliminate microbial pathogens (Sordillo and Raphael, 2013). Another potential biomarker for inflammation is vitamin D, which exerts autocrine and paracrine immunomodulatory effects and is inversely related to inflammatory cytokines (Chagas et al., 2012; Guillot et al., 2010). Vitamin D deficiency is linked to many inflammatory and metabolic conditions in humans (Wang et al., 2010; Chagas et al., 2012; Pludowski et al., 2013). Vitamin D helps regulate calcium homeostasis by interacting with the parathyroid, kidney, and gastrointestinal tract (Guillot et al., 2010), suggesting a role in the development of milk fever (hypocalcemia).
Investigation into the predictive value of these inflammatory and oxidative stress biomarkers along with other promising biomarkers is needed to improve the prediction of transition cow diseases.

The nutrient metabolism serum markers currently used for monitoring (NEFA and BHB) are typically measured a few days to a few weeks before calving (LeBlanc et al., 2005; Ospina et al., 2010). Identifying dairy cattle at risk of disease in the periparturient period leaves little time for proactive interventions. The underlying disease processes may be too advanced to reverse by the time the cows are identified ... changes, can only aid the next or future calving cohorts. Recent studies have investigated earlier biomarkers of risk for transition cow diseases (Dervishi et al., 2015, 2016a, 2016b; Zhang et al., 2015, 2016), which underscores the increased interest and significance of this topic. Earlier diagnosis of disease may reduce economic costs associated with treatment, reduced milk production, decreased fertility, death, or removal from the herd.

Most previous studies of risk factors for transition diseases focus on determining associations between risk factors and disease outcomes, which typically entails selecting variables based on their potential causal association with the outcome (LeBlanc et al., 2004, 2005; Huzzey et al., 2007; Dubuc et al., 2010; Vanholder et al., 2015). This method is described as explanatory, because the resulting models aim to explain the outcome with the selected variables. However, explanatory modeling is not aimed towards predicting outcomes (Shmueli, 2010). Compared with explanatory modeling, predictive modeling considers a larger set of candidate variables and interaction terms (Sainani, 2014). Multicollinearity and the interpretability of model coefficients in predictive models are not as important to the researcher, unlike in explanatory models (Vaughan and Berry, 2005; Shmueli, 2010).
The steps of model building differ in predictive versus explanatory modeling. For instance, variable selection in explanatory models is determined by statistical significance (P-values), whereas in predictive modeling, variable selection is driven by predictive value (e.g., information criteria, discriminant ability) (Sainani, 2014). Automated variable selection procedures, specifically best subsets selection, are preferred for predictive models (Sarkar et al., 2010). Often, resulting models from explanatory versus predictive modeling differ in variables and predictive value (Shmueli, 2010). One predictive modeling technique is model averaging, which accounts for model uncertainty by averaging coefficients or predictions over many candidate models (Symonds and Moussalli, ... many possible models (Clyde, 2009). Model averaging provides a more realistic estimate of parameter estimates and future predictions (Richards et al., 2011). To our knowledge, only one study used predictive modeling methods in studying risk factors for transition diseases (Vergara et al., 2014).

Our objective was to identify early biomarkers of transition diseases measured at dry-off using a predictive modeling approach. Biomarkers of all 3 physiological components of metabolic stress (nutrient metabolism, oxidative stress, and inflammation) were included. We built four sets of models: 1 set for each of the 3 components of metabolic stress and a combined model set that included all 3 components. We used model averaging to average predictions from candidate models within each set of models. We hypothesized that the averaged predictions from the combined model set with all metabolic stress components would have greater predictive ability than the averaged predictions from each individual metabolic stress component model set.

2.
Material and Methods

The Strengthening the Reporting of Observational Studies in Epidemiology - Veterinary Extension (STROBE-VET) statement guidelines were followed in the reporting of this study.

2.1 Animals

The Animal Use and Care Committee at Michigan State University (East Lansing) approved this study. We designed a prospective cohort study that enrolled clinically healthy cows (n = 300) at time of dry-off from five commercial dairy farms in Michigan. The outcome was defined as either (1) having one or more clinical transition diseases, including mastitis, metritis, ketosis, lameness, displaced abomasum, pneumonia, milk fever, and/or retained placenta, and/or (2) having other negative health events, including abortion and death of the cow or calf, within the study period (from dry-off to 30 days post parturition).

The sample size was calculated using OpenEpi (Dean et al., 2006). We used a sample size calculation for cohort studies from Kelsey et al. (1996) assuming a 95% two-sided significance level and 80% power. Ability to detect a significant relative risk (RR) of 1.6 was assumed, based on the most conservative results reported from Ospina et al. (2010) for NEFA levels above 0.29 mmol/L prepartum for displaced abomasum, ketosis, and metritis. The results from Ospina et al. (2010) indicated that the ratio of exposed to unexposed was 0.38. Because the herds used in our study are well managed and may have lower prevalence of metabolic stress, the ratio of exposed to unexposed used was conservative at 0.15. The percent of unexposed with the outcome was assumed to be 52% based on Ospina et al. (2010). A design effect calculation was incorporated due to the hierarchical nature of our data, assuming an intraclass correlation coefficient (ICC) of 0.13 reported from Nyman et al. (2008) based on the between-herd differences of NEFA levels.
Assuming a 20% potential loss to follow-up and greater sample size inflation due to an unknown ICC design effect adjustment for the cohort level, the sample size estimate for this study was rounded up from 181 to 300. Dairy farms were eligible for the study if they had over 1000 lactating cows, used pcDart or DairyComp 305, artificially bred cows, had all Holstein cows, and separated cows for close-up. Between 2 and 4 cohorts of ~15 cows per farm were enrolled across different seasons to account for seasonal variation. Cohorts were formed from cows that were randomly selected from the pen from the list of cows to be dried off that week. Cows were excluded from the analysis if they were lost to follow-up, had missing data, were not pregnant, or were not clinically healthy at dry-off.

2.2 Biomarkers

Biomarkers that represented all three physiological components of metabolic stress (i.e., nutrient metabolism, oxidative stress, and inflammation) were selected. The nutrient metabolism biomarkers selected were BHB, NEFA, calcium, BCS, and cholesterol. The oxidative stress biomarkers included ROS, antioxidant potential (AOP), alpha tocopherol, beta-carotene, and vitamin A. The inflammatory biomarkers selected were albumin, haptoglobin, SAA, vitamin D, and WBC counts (band cells, basophils, eosinophils, lymphocytes, monocytes, and neutrophils).

2.3 Data collection

Whole blood for plasma was collected in K2 EDTA 10.8 mg vacutainers and clotted blood for serum was collected in Serum Separator tubes (SST) vacutainers from the tail vein/artery on the day of dry-off. All samples were immediately placed on ice, transported to the laboratory, and processed. Serum and plasma samples were flash frozen in liquid nitrogen and stored at -20°C and -80°C, respectively, while pending analysis within 2 months of collection.
Haptoglobin was measured using a commercially available photometric colorimetric assay (PHASE Haptoglobin Assay kit, Tridelta Development, LLC., Boonton Twp, NJ, USA). Reactive oxygen species were measured in serum using an OxiSelect In Vitro ROS/RNS Assay Kit (Cell BioLabs, Inc., San Diego, CA, USA). The whole blood for WBC differential counts was analyzed using Wright stain at Advanced Animal Diagnostics (Morrisville, NC, USA). The Michigan State University Diagnostic Center for Population and Animal Health (DCPAH; East Lansing, MI, USA) used commercially available assays to assess albumin (Albumin, Beckman Coulter Inc., Brea, CA, USA), NEFA (HR Series NEFA-HR(2), Wako Life Sciences, Mountain View), cholesterol (Cholesterol, Beckman Coulter Inc., Brea, CA, USA), and BHB (Catachem Inc., Bridgeport, CT, USA) by colorimetric measurement. Calcium was analyzed by DCPAH via colorimetric measurement using reagent from Beckman Coulter (Brea, CA). Beta-carotene, vitamin A, and alpha tocopherol were measured chromatographically using a Waters Acquity system and Waters Empower Pro Chromatography Manager Software at DCPAH. Vitamin D was analyzed as 25-hydroxyvitamin D by methods previously reported (Holcombe et al., 2018). Antioxidant potential (AOP) was measured using the Trolox equivalents antioxidant capacity assay (Re et al., 1999). All biomarkers were measured in duplicate or triplicate except ROS and haptoglobin.

Disease incidence was monitored from dry-off until 30 days post-parturition in order to capture disease occurrence through the transition period. Diagnoses of all diseases (mastitis, metritis, ketosis, lameness, displaced abomasum, pneumonia, milk fever, and retained placenta) were confirmed by trained farm employees or the veterinarian. Body condition scoring was done by 2 trained lab employees using the established 5-point method (Kristensen, 1986).
2.4 Statistical analyses

Prior to analysis, data were examined for unusual observations (e.g., extreme outliers) and missing values using summary statistics and box plots. Bivariable and multivariable multilevel logistic regression analyses were performed using Stata 14.2 and R version 3.4.1. The primary exposures (independent variables) of interest were proposed biomarkers of metabolic stress. The outcome was an aggregate binary variable (1 = presence of one or more clinical transition diseases and/or one or more negative health events, 0 = absence of disease and negative health events).

Continuous variables that were not linearly related to the outcome on the log-odds scale were transformed to meet logistic regression assumptions, either through arithmetical transformation or categorization. Categories were based on quartiles or quintiles, depending on which option resulted in linearity with the outcome on the log-odds scale. Cut-points for continuous variables that were categorized to meet linearity assumptions are presented in Table S1. Some of the variables in the dataset had an excessive number of zero values (e.g., basophils). In this case, a separate level was created for zeros and quartiles were created based on the remaining data.

Body condition score was measured at 0.5 increments on a 5-point scale. Body condition score was treated as nominal: ideal BCS versus non-ideal BCS at dry-off. Ideal BCS (reference level) was defined as a BCS of 3.5 or 4.0, whereas a non-ideal BCS was defined as <3.5 or >4.0, as reported by Kellogg (2010). Lactation number was treated as ordinal and categories were created based on the distribution in the sample (1st, 2nd, and 3+). The minimum and maximum lactation numbers were 1st and 7th, respectively. Sampling weights were not needed in the analysis because the sample represented the underlying population distribution (Pfeffermann, 1996).
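The handling of zero-inflated variables described above (a separate level for zeros, then quartiles of the remaining values) can be illustrated with a minimal Python sketch. This is not the authors' code, which was written in Stata and R. The function name, the use of inclusive quartile interpolation, and the example basophil counts are all assumptions made for illustration.

```python
from statistics import quantiles


def categorize_zero_inflated(values):
    """Return level 0 for zeros and levels 1-4 for quartiles of the non-zero values.

    Hypothetical helper: quartile cut-points are computed from non-zero values only.
    """
    nonzero = [v for v in values if v != 0]
    # Three cut-points that split the non-zero values into four groups
    q1, q2, q3 = quantiles(nonzero, n=4, method="inclusive")
    levels = []
    for v in values:
        if v == 0:
            levels.append(0)  # separate level for zero counts
        elif v <= q1:
            levels.append(1)
        elif v <= q2:
            levels.append(2)
        elif v <= q3:
            levels.append(3)
        else:
            levels.append(4)
    return levels


# Made-up basophil counts with many zeros (illustrative only)
basophils = [0, 0, 0, 0, 10, 20, 30, 40, 50, 60, 70, 80]
levels = categorize_zero_inflated(basophils)
```

Keeping zeros as their own level avoids quartile boundaries that would otherwise collapse onto zero when more than a quarter of the observations are zero.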
Season was nominal with four levels based on month of dry-off (March, April, May = Spring; June, July, August = Summer; September, October, November = Fall; December, January, February = Winter). We used effects coding for each categorical variable with a user-generated command (… Catran, 2016). Effects coding is very similar to dummy variable coding, but uses zeros, ones, and negative ones. We used effects coding because it provides more reasonable estimates of the coefficients and standard errors when there is concern about potential multicollinearity from including interaction terms (Kraemer and Blasey, 2003; Robinson and Schumacker, 2009). We used unweighted effects coding as the default method but used weighted effects coding in the case of unequal sample sizes between levels of the categorical variable (Grotenhuis et al., 2017).

Due to the large number of predictors, we first built 3 separate sets of models for the components of metabolic stress (nutrient metabolism, oxidative stress, and inflammation). Then, we narrowed down the pool of candidate variables by selecting those variables that were in the top 3 final models for their respective component of metabolic stress (i.e., the variables in the top 3 nutrient metabolism, oxidative stress, and inflammation models). The final combined model set was built from this final variable pool that included each component of metabolic stress.

For the multivariable analyses, we implemented a predictive modeling approach, which used Akaike Information Criterion (AIC) values as the model selection criterion. Our multivariable statistical analyses included the following steps, which were repeated for the nutrient metabolism, oxidative stress, inflammation, and combined models: 1) best subsets AIC selection, 2) test all two-way interactions, 3) test random effects, 4) choose final candidate models, 5) model diagnostics, 6) model averaging, and 7) internal validation.
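To make the effects coding mentioned above concrete, the following Python sketch builds unweighted effects-coded columns for a categorical variable. It is a sketch under stated assumptions and not the command used in the study: the function name is hypothetical, the choice of Winter as the reference level is arbitrary, and weighted effects coding is not shown.

```python
def effects_code(values, reference):
    """Unweighted effects coding.

    Each non-reference level gets its own column (1 if the observation is at
    that level, 0 otherwise); observations at the reference level are coded -1
    in every column.
    """
    levels = [lv for lv in sorted(set(values)) if lv != reference]
    rows = []
    for v in values:
        if v == reference:
            rows.append([-1] * len(levels))
        else:
            rows.append([1 if v == lv else 0 for lv in levels])
    return levels, rows


# Illustrative season values; the reference level is an arbitrary choice here
seasons = ["Spring", "Summer", "Fall", "Winter", "Summer"]
columns, coded = effects_code(seasons, reference="Winter")
```

With dummy coding the Winter row would be all zeros. With effects coding it is all negative ones, so each coefficient is interpreted relative to the unweighted mean of the levels and not relative to the reference level.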
We used the Stata command roccomp to perform a two-tailed test (alpha of 0.05) to test differences between the area under the curve (AUC) values calculated from receiver operator characteristic (ROC) curve analyses from the four final averaged predictions of each model set (inflammation, nutrient metabolism, oxidative stress, and combined).

2.4.1 Step 1: Best subsets AIC selection

We used the R package glmulti to perform best subsets AIC selection to select the pool of candidate models. The best subsets approach compares all possible unique models from the list of explanatory variables provided and, if specified, pairwise interactions. Due to the large number of candidate variables, we did not include pairwise interactions in this step. The best subsets approach reports the list of models ranked by AIC. To identify potential candidate models, the AIC of each model ($\mathrm{AIC}_i$) was compared with the AIC of the best model ($\mathrm{AIC}_{\min}$):

\[ \Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min} \qquad (1) \]

The set of potential candidate models contained all models with $\Delta_i$ < 4; models with larger $\Delta_i$ values were not carried forward in the analysis.

2.4.2 Step 2: Test all two-way interactions

Within the pool of candidate models, all two-way interactions were tested using AIC values with forward selection. Interaction terms relax the assumption that each predictor has a constant effect across the levels of all other variables in the model and represent the effect of having both factors present (Dohoo et al., 2012). Each interaction term was retained if it lowered the overall AIC value of the model. We did not perform any formal tests for multicollinearity, such as checking variance inflation factors (VIFs). However, in extreme cases of multicollinearity, the coefficients and standard errors become extremely unstable and unusable. In this case, we removed the terms responsible for this high level of multicollinearity.

2.4.3 Step 3: Test random effects

Random intercepts to account for clustering at the farm and cohort levels were tested for inclusion in each candidate model using AIC values.
We used AIC values over other methods for testing random effects (e.g., likelihood ratio tests) because AIC values allow us to compare multiple models simultaneously (Wagenmakers and Farrell, 2004). Additionally, predictive models should be built using criteria based on predictive ability, rather than hypothesis testing (Sainani, 2014). Four models were estimated and compared: one model with both farm and cohort as random intercepts, a model with just farm, a model with just cohort, and a model with no random intercepts. The model with the lowest AIC value was selected.

2.4.4 Step 4: Reduce pool of candidate models

After testing and/or inclusion of two-way interactions and random intercept estimates, the AIC values of the candidate models were re-determined and models with $\Delta_i$ values < 4 were retained.

2.4.5 Step 5: Model diagnostics

We performed model diagnostics to assess discrimination, goodness-of-fit, and influential or poorly fit covariate patterns for each remaining candidate model (Hosmer et al., 2013). The ability of each model to discriminate between sick (disease and/or adverse outcome) and healthy cows was assessed using ROC curve analyses by calculating the area under the curve (AUC) in Stata. The AUC values were evaluated based on the following scale: if 0.5 < AUC < 0.7, the model had poor discrimination; if 0.7 ≤ AUC < 0.8, acceptable discrimination; if 0.8 ≤ AUC < 0.9, excellent discrimination; and if AUC ≥ 0.9, outstanding discrimination (Hosmer et al., 2013). The Hosmer-Lemeshow deciles of risk test was used to judge goodness-of-fit, and a non-significant P-value (p > 0.05) indicated the model fit the data well. In addition, diagnostic graphs for leverage and deviance residuals were visually assessed for problematic covariate patterns. Models were excluded from the candidate models if they demonstrated poor fit through the Hosmer-Lemeshow test (p ≤ 0.05) or less than acceptable discrimination (AUC < 0.7). The final candidate models remained after this step.
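As a hedged illustration of the discrimination step, the sketch below computes the AUC as the probability that a randomly chosen sick cow receives a higher predicted probability than a randomly chosen healthy cow (ties count one half), and then labels it. The labels assume the commonly cited Hosmer et al. (2013) cut-offs, which match how AUC values are described in the Results. The study itself used Stata, so the function names and the predicted probabilities here are hypothetical.

```python
def auc(scores_sick, scores_healthy):
    """Rank-based AUC: share of sick/healthy pairs ordered correctly (ties = 0.5)."""
    wins = 0.0
    for s in scores_sick:
        for h in scores_healthy:
            if s > h:
                wins += 1.0
            elif s == h:
                wins += 0.5
    return wins / (len(scores_sick) * len(scores_healthy))


def discrimination_label(value):
    """Label an AUC value using the assumed Hosmer et al. (2013) cut-offs."""
    if value >= 0.9:
        return "outstanding"
    if value >= 0.8:
        return "excellent"
    if value >= 0.7:
        return "acceptable"
    if value > 0.5:
        return "poor"
    return "none"


# Made-up predicted probabilities for three sick and four healthy cows
sick = [0.9, 0.6, 0.4]
healthy = [0.5, 0.3, 0.2, 0.1]
example_auc = auc(sick, healthy)
example_label = discrimination_label(example_auc)
```

The pairwise definition is equivalent to the area under the empirical ROC curve, which is why it is convenient for small worked examples. Statistical packages normally use faster rank-based implementations.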
2.4.6 Step 6: Model averaging

Prior to model averaging, Akaike model weights were calculated for each final candidate model, which represent the weight of evidence in favor of that model being the best model (Burnham and Anderson, 2004). Akaike weights were calculated with the $\Delta_i$ values (Equation 1) using the following formula (where N = number of candidate models):

\[ w_i = \frac{\exp(-\Delta_i / 2)}{\sum_{n=1}^{N} \exp(-\Delta_n / 2)} \]

Together, the weights ($w_i$) sum to 1: the best model contributed the highest proportion, the second-best model contributed the next highest proportion, and so on. The predicted probabilities calculated from each model were weighted with the Akaike weights and a final averaged predicted probability was calculated for each cow. Area under the curve estimates were calculated from ROC curves using the averaged predicted probabilities. Optimal thresholds for each model set were determined using roctg. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated using the optimal threshold for each model set.

2.4.7 Step 7: Internal validation

Internal validation for each candidate model and their respective averaged predictions was completed using a bootstrap approach (Sainani, 2014). Internal validation of predictive models tests the performance of the model in subjects from a population similar to the one from which the original sample originated (Steyerberg et al., 2001b). We used a bootstrap approach to simulate the selection of 500 datasets, in which cows were randomly selected with replacement from the original dataset (Pavlou et al., 2015). The sample size of each dataset equaled the sample size in the original dataset. Each dataset was tested in the original model. We then averaged the AUC values over the 500 bootstrapped samples to create one estimate. Five hundred bootstrapped samples were selected because Steyerberg et al. (2001b) reported that 100 samples was sufficient for logistic regression internal validation, but more is preferred. Bootstrapped analyses were based on the individual cow, as a two-step bootstrap (e.g.,
farm and individual) overestimates model performance when the number of clusters is low (Bouwmeester, 2012).

3. Results

Our final sample included 277 cows, 189 multiparous and 88 primiparous (Table 2.1). In total, 23 cows were excluded: 16 cows were missing data for dry-off, 4 were lost to follow-up (did not have data past dry-off and/or calving), one cow was diagnosed with lameness on the date of dry-off, one cow was missing data for SAA, and one cow did not calve. Whole blood samples were collected 47.95 ± 12.81 (mean ± SD) days before calving. The overall incidence of having at least one disease and/or adverse outcome was 33.6% (Table 2.2). The most prevalent disease conditions were ketosis and metritis at 7.9% and 7.2%, respectively. All categorical variables, including those categorized to meet linearity assumptions of logistic regression, are presented in Table S2.1. Summary statistics for the biomarkers collected are shown in Table 2.3.

Table 2.1 Sample characteristics (N = 277)

Characteristic            N
Farm
  1                       64
  2                       31
  3                       61
  4                       61
  5                       60
Lactation number
  1st                     88
  2nd                     94
  3rd or greater          95
Season of dry-off
  Fall                    93
  Spring                  30
  Summer                  131
  Winter                  33
BCS at dry-off
  2.5 or less             27
  3                       95
  3.5                     125
  4                       27
  Greater than 4.0        3
Total                     277

BCS: body condition score

Table 2.2 Incidence of transition cow disease and adverse outcomes (N = 277)

Outcome                               n       %
Disease/other outcomes (1)
  Ketosis                             22      7.94
  Metritis                            20      7.22
  Lameness                            19      6.86
  Retained placenta                   18      6.50
  Mastitis                            15      5.42
  Pneumonia                           8       2.89
  Displaced abomasum                  7       2.53
  Abortion                            4       1.44
  Death of cow                        4       1.44
  Milk fever                          3       1.08
  Death of calf                       1       0.36
Any disease and/or other outcome
  Yes                                 93      33.57
  No                                  184     66.43
Total                                 277

(1) 15 cows had 2 diseases, 4 cows had 3 diseases, 1 cow had 4 diseases, 0 cows had a disease and an adverse outcome

Table 2.3 Biomarker summary statistics in study sample (N = 277)

MS component / Biomarker                         Mean      SE       Min      25th pct  Median    75th pct  Max
IF
  Albumin (g/dL)                                 3.40      0.02     2.00     3.30      3.40      3.60      4.00
  Haptoglobin (mg/mL)                            0.21      0.02     0.00     0.07      0.12      0.17      2.09
  Serum amyloid A                                54.69     4.02     0.01     10.04     29.14     77.00     458.99
  Vitamin D (ng/mL)                              99.02     1.75     25.40    78.20     97.30     114.30    227.50
  WBC count (per …)
    Band cells                                   48.10     5.31     0.00     0.00      0.00      61.90     750.75
    Basophils                                    36.80     4.20     0.00     0.00      0.00      47.85     564.50
    Eosinophils                                  311.15    17.32    0.00     117.75    229.50    408.00    1853.30
    Lymphocytes                                  4700.32   180.73   638.75   2866.50   3757.74   5563.20   21699.00
    Monocytes                                    214.82    12.29    0.00     68.00     161.00    314.00    1272.00
    Neutrophils                                  3287.85   87.48    855.00   2204.00   3042.00   4144.80   10400.00
  Neutrophil:lymphocyte                          0.96      0.04     0.10     0.48      0.76      1.23      4.81
NM
  BHB (mg/dL)                                    5.46      0.13     1.72     4.02      5.01      6.35      14.39
  Calcium (mg/dL)                                9.69      0.03     7.70     9.40      9.70      10.00     11.00
  Cholesterol (mmol/L)                           175.83    3.77     70.00    120.00    174.00    224.00    397.00
  NEFA (mmol/L)                                  0.26      0.01     0.07     0.15      0.21      0.32      1.10
OS
  ALPHA (µg/mL)                                  4.10      0.11     0.84     2.69      3.95      5.42      9.15
  Antioxidant potential (Trolox equivalents/µL)  4.48      0.09     1.31     3.35      4.33      5.31      10.80
  Beta-carotene (…/mL)                           4.77      0.24     0.21     2.54      3.57      5.42      23.96
  ROS (DCF equivalents/…)                        51.90     2.15     2.96     28.96     42.70     64.90     318.06
  Vitamin A (ng/mL)                              291.60    4.55     112.00   237.00    287.00    346.00    528.00

WBC: white blood cell; BHB: beta-hydroxybutyrate; NEFA: non-esterified fatty acids; ROS: reactive oxygen species; ALPHA: alpha tocopherol; MS: metabolic stress; IF: inflammation; NM: nutrient metabolism; OS: oxidative stress.

3.1 Inflammation models

… $\Delta_i$ < 4 in reference to the best model. Five final models remained after testing all two-way interactions, testing random intercepts for the farm and cohort levels, and again reducing the pool of candidate models ($\Delta_i$ < 4). All 5 final models had greater than acceptable discrimination (AUC > 0.70) and passed the Hosmer-Lemeshow deciles of risk test. The P-values for the Hosmer-Lemeshow deciles of risk test for the final models (Model 1 - Model 5) were 0.88, 0.69, 0.88, 0.74, and 0.88, respectively. A summary of the final 5 models and their respective AUC values calculated using ROC curve analyses is shown in Table 2.4.
Model coefficients, standard errors, P-values, and random effect estimates are shown in Table S2.2. Lactation number, monocytes, SAA, albumin, a lactation number and monocyte interaction term, and an SAA and albumin interaction term were present in all 5 models. Haptoglobin, vitamin D, band cells, basophils, and eosinophils were not present in any models.

Table 2.4 Inflammation candidate models and AUC values (1)

Variable                             Model 1  Model 2  Model 3  Model 4  Model 5  Total
Lactation number                     Y        Y        Y        Y        Y        5
Monocytes                            Y        Y        Y        Y        Y        5
SAA                                  Y        Y        Y        Y        Y        5
Albumin                              Y        Y        Y        Y        Y        5
Season                               Y        Y        N        Y        Y        4
NL ratio                             Y        N        N        N        N        1
Lymphocyte                           N        N        N        Y        N        1
Neutrophils                          N        Y        N        N        N        1
Lactation X Monocytes (2)            Y        Y        Y        Y        Y        5
SAA X Albumin (3)                    Y        Y        Y        Y        Y        5
Lactation number X Neutrophils (4)   N        Y        N        N        N        1
AUC                                  0.86     0.86     0.88     0.86     0.85     -

(1) Full models are presented in Table S2.2
(2) Lactation number and monocyte interaction term
(3) SAA and albumin interaction term
(4) Lactation number and neutrophil interaction term
SAA: serum amyloid A; NL ratio: neutrophil:lymphocyte ratio; AUC: area under the curve

The calculation of AIC weights for model averaging for the final 5 candidate models is shown in Table 2.5. Model weights for the inflammation models ranged from 6% to 30%. Model validation results are shown in Table 2.6 for each individual model as well as the averaged model. Apparent AUC values are calculated from our original sample and the bootstrapped AUC values are calculated with 500 bootstrapped samples. The apparent and bootstrapped AUC values for the averaged model were 0.87 (95% CI = 0.83-0.91) and 0.82 (95% CI = 0.76-0.87), respectively, which indicated excellent discrimination. The optimal threshold determined from ROC curve analyses for predicted probabilities was 0.35, which corresponded to a sensitivity of 78.5% and a specificity of 78.3%. The positive predictive value (PPV) and negative predictive value (NPV) were 64.0% and 87.7%, respectively.
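The Akaike weights reported for the inflammation models can be reproduced from their AIC values with a short calculation. The Python sketch below is illustrative and is not the authors' code: the function names are hypothetical, the AIC values are those listed for the inflammation models in Table 2.5, and the per-model predicted probabilities for a single cow are made up to show the averaging step.

```python
import math


def akaike_weights(aic_values):
    """Return (delta_i, w_i) lists, where delta_i = AIC_i - AIC_min and
    w_i = exp(-delta_i / 2) / sum_n exp(-delta_n / 2)."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]
    relative = [math.exp(-d / 2) for d in deltas]
    total = sum(relative)
    return deltas, [r / total for r in relative]


def averaged_prediction(weights, predictions):
    """Akaike-weighted average of one cow's predicted probabilities."""
    return sum(w * p for w, p in zip(weights, predictions))


# AIC values of the five inflammation candidate models (Table 2.5)
inflammation_aic = [333.60, 333.75, 333.80, 335.98, 336.80]
deltas, weights = akaike_weights(inflammation_aic)
rounded_weights = [round(w, 2) for w in weights]

# Hypothetical predicted probabilities for one cow from the five models
cow_predictions = [0.40, 0.35, 0.45, 0.30, 0.50]
cow_average = averaged_prediction(weights, cow_predictions)
```

Rounded to two decimals, the weights come out as 0.30, 0.28, 0.27, 0.09, and 0.06, in line with the 6% to 30% range reported for the inflammation models.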
Table 2.5 Calculation of AIC weights for model averaging

Model (1)              AIC       Δi (AICi - AICbest)   exp(-Δi/2)   w
Inflammation
  Model 1              333.60    ref                   1.00         0.30
  Model 2              333.75    0.15                  0.93         0.28
  Model 3              333.80    0.20                  0.90         0.27
  Model 4              335.98    2.38                  0.30         0.09
  Model 5              336.80    3.20                  0.20         0.06
Nutrient metabolism
  Model 1              333.16    ref                   1.00         0.36
  Model 2              333.77    0.61                  0.74         0.27
  Model 3              334.58    1.42                  0.49         0.18
  Model 4              335.29    2.13                  0.34         0.13
  Model 5              336.64    3.48                  0.18         0.06
Oxidative stress
  Model 1              341.71    ref                   1            0.30
  Model 2              341.91    0.2                   0.90         0.27
  Model 3              343.5     1.79                  0.41         0.12
  Model 4              343.51    1.8                   0.41         0.12
  Model 5              344.58    2.87                  0.24         0.07
  Model 6              344.82    3.11                  0.21         0.06
  Model 7              344.83    3.12                  0.21         0.06
Combined
  Model 1              312.67    ref                   1            0.44
  Model 2              313.74    1.07                  0.59         0.26
  Model 3              314.04    1.37                  0.50         0.22
  Model 4              316.1     3.43                  0.18         0.08

(1) Full models are presented in Tables S2.2, S2.3, S2.4, and S2.5
AIC: Akaike information criteria

Table 2.6 Internal validation results: comparison of apparent predictive validity (AUC) with bootstrapped estimates of predictive ability

Model (1)              Apparent AUC  Lower 95% CI  Upper 95% CI  Bootstrapped AUC  Lower 95% CI  Upper 95% CI
Inflammation
  Model 1              0.8596        0.8161        0.9032        0.8076            0.7492        0.8660
  Model 2              0.8626        0.8182        0.9069        0.8218            0.7652        0.8784
  Model 3              0.8781        0.8378        0.9184        0.7977            0.7382        0.8572
  Model 4              0.8596        0.8146        0.9047        0.8153            0.7579        0.8727
  Model 5              0.8479        0.8011        0.8947        0.8102            0.7522        0.8682
  Averaged model       0.8723        0.8304        0.9142        0.8178            0.7608        0.8748
Nutrient metabolism
  Model 1              0.7228        0.6621        0.7834        0.7224            0.6561        0.7887
  Model 2              0.7259        0.6662        0.7856        0.7262            0.6601        0.7923
  Model 3              0.7233        0.6626        0.7840        0.7230            0.6567        0.7893
  Model 4              0.7270        0.6674        0.7867        0.7274            0.6614        0.7934
  Model 5              0.7347        0.6754        0.7941        0.7351            0.6697        0.8005
  Averaged model       0.7257        0.6656        0.7858        0.7258            0.6597        0.7919
Oxidative stress
  Model 1              0.7536        0.6938        0.8134        0.7071            0.6397        0.7745
  Model 2              0.7811        0.7234        0.8388        0.6828            0.6140        0.7516
  Model 3              0.7824        0.7247        0.8401        0.6930            0.6248        0.7612
  Model 4              0.7629        0.7048        0.8210        0.7147            0.6478        0.7816
  Model 5              0.7572        0.6957        0.8187        0.6568            0.5867        0.7269
  Model 6              0.7531        0.6933        0.8129        0.6564            0.5863        0.7265
  Model 7              0.7324        0.6683        0.7964        0.6445            0.5739        0.7151
  Averaged model       0.7785        0.7210        0.8360        0.7028            0.6351        0.7705
Combined
  Model 1              0.9297        0.8991        0.9603        0.9050            0.8620        0.9480
  Model 2              0.9272        0.8949        0.9595        0.9269            0.8888        0.9650
  Model 3              0.9219        0.8894        0.9544        0.9220            0.8827        0.9613
  Model 4              0.9182        0.8833        0.9530        0.9184            0.8783        0.9585
  Averaged model       0.9333        0.9031        0.9634        0.9091            0.8669        0.9513

(1) Full models are presented in Tables S2.2, S2.3, S2.4, and S2.5
AUC: area under the curve

3.2 Nutrient metabolism models

… $\Delta_i$ < 4 compared to the best model. Six models remained after testing all two-way interactions, testing random intercepts for farm and cohort, and again reducing the pool of candidate models ($\Delta_i$ < 4). All 6 candidate models for nutrient metabolism had greater than acceptable discrimination (AUC > 0.70). Five out of 6 models passed the Hosmer-Lemeshow deciles of risk test. The P-values for the Hosmer-Lemeshow deciles of risk test for Models 1 through 6 were 0.28, 0.30, 0.28, 0.69, 0.03, and 0.93, respectively. The model that failed the Hosmer-Lemeshow deciles of risk test was excluded at this point, resulting in 5 final candidate models. The AUC values calculated from the ROC curve analyses and the variables in each model are shown in Table 2.7. Model coefficients, standard errors, and P-values are shown in Table S2.3. Neither farm-level nor cohort-level random effects improved AIC values for any candidate model, so they were excluded. The most common predictors were lactation number, season, NEFA, and calcium, which were in all 5 candidate models. Cholesterol and BHB were the least common predictors, which were only present in two models.
Table 2.7 Summary of nutrient metabolism candidate models and AUC values (1)

Variable             Model 1  Model 2  Model 3  Model 4  Model 5  Total
Lactation number     Y        Y        Y        Y        Y        5
Season               Y        Y        Y        Y        Y        5
NEFA                 Y        Y        Y        Y        Y        5
Calcium              Y        Y        Y        Y        Y        5
BCS                  N        Y        N        Y        Y        3
BHB                  N        N        N        Y        N        1
Cholesterol          N        N        N        N        Y        1
Season X NEFA (2)    N        N        N        N        Y        1
AUC                  0.72     0.73     0.72     0.73     0.73     -

(1) Full models are presented in Table S2.3
(2) Season and NEFA interaction term
AUC: area under the curve; NEFA: non-esterified fatty acids; BCS: body condition score; BHB: beta-hydroxybutyrate

The AIC weights that corresponded to the final 5 models for model averaging ranged from 6% to 36% (Table 2.5). The apparent and bootstrapped AUC values for the averaged model were 0.73 (95% CI = 0.67-0.79) and 0.73 (95% CI = 0.66-0.79), respectively, which indicated acceptable predictive validity. The optimal threshold of predicted probabilities determined from the ROC analysis was 0.34, which corresponded to a sensitivity of 68.8% and a specificity of 67.4%. The PPV and NPV were 51.6% and 81.1%, respectively.

3.3 Oxidative stress models

… $\Delta_i$ < 4 in reference to the best model. Seven models remained after testing all possible two-way interaction terms, testing random intercepts for the farm and cohort levels, and again reducing the pool of candidate models ($\Delta_i$ < 4). All 7 models had greater than acceptable discrimination (AUC > 0.70) and passed the Hosmer-Lemeshow deciles of risk test. The P-values for the Hosmer-Lemeshow deciles of risk test for the 7 models (Model 1 - Model 7) were 0.78, 0.77, 0.79, 0.85, 0.66, 0.76, and 0.54, respectively. The model variables and AUC values calculated from ROC curve analyses are presented in Table 2.8. The model coefficients, standard errors, P-values, and random effects are presented in Table S2.4. The most common predictors were lactation number, AOP, and alpha tocopherol, which were present in 7, 6, and 5 models, respectively. Beta-carotene and ROS were not in any oxidative stress models.
No interaction terms were present in any of the 7 models.

Table 2.8 Summary of oxidative stress candidate models and AUC values (1)

Variable           Model 1  Model 2  Model 3  Model 4  Model 5  Model 6  Model 7  Total
Lactation number   Y        Y        Y        Y        Y        Y        Y        7
AOP                Y        Y        Y        Y        Y        N        Y        6
Alpha tocopherol   Y        Y        Y        Y        N        Y        N        5
Season             Y        N        N        Y        N        N        Y        3
Vitamin A          N        N        Y        Y        N        N        N        1
AUC                0.75     0.78     0.78     0.76     0.76     0.75     0.73     -

(1) Full models are presented in Table S2.4
AUC: area under the curve; AOP: antioxidant potential

The AIC weights for model averaging ranged from 6% to 30% for the 7 final models. The apparent and bootstrapped AUC values were 0.78 (95% CI = 0.72-0.84) and 0.70 (95% CI = 0.64-0.77), respectively, indicating acceptable discrimination. The optimal predicted probability threshold determined from ROC analyses was 0.33, which corresponded to a sensitivity of 68.8% and a specificity of 69.0%. The PPV and NPV were 53.3% and 81.5%, respectively.

3.4 Combined model (nutrient metabolism, inflammation, and oxidative stress)

All variables in the top three best models for nutrient metabolism, inflammation, and oxidative stress were included in variable selection for the combined model. This ensured that an adequate pool of variables was used that included the best predictors but did not overwhelm the best subsets procedure. This list included: lactation number, season, monocytes, SAA, albumin, neutrophil/lymphocyte ratio, neutrophils, NEFA, calcium, BCS, BHB, AOP, alpha tocopherol, … $\Delta_i$ < 4 in reference to the best model. After testing all two-way interactions and random intercepts and again reducing the pool of candidate models ($\Delta_i$ < 4), 5 models remained. Four out of 5 models passed the Hosmer-Lemeshow deciles of risk test. The P-values of the Hosmer-Lemeshow deciles of risk test for the 5 models (Model 1 - Model 5) were 0.73, 0.049, 0.08, 0.51, and 0.34. The model that failed the Hosmer-Lemeshow deciles of risk test was excluded at this point; therefore, 4 final candidate models remained. The final 4 models and their respective AUC values calculated from ROC analyses are shown in Table 2.9.
The model coefficients, standard errors, P-values, and random effect estimates are shown in Table S2.5. Lactation number, season, monocytes, SAA, albumin, AOP, alpha tocopherol, the neutrophil/lymphocyte ratio, and calcium were in all 4 models. Additionally, each model contained the following interactions: lactation number and monocytes, lactation number and SAA, and SAA and albumin. The apparent and bootstrapped AUC for the averaged model were 0.93 (95% CI = 0.90-0.96) and 0.91 (95% CI = 0.87-0.95), respectively, which indicated outstanding predictive ability. The optimal threshold of predicted probabilities determined from ROC curve analyses was 0.41, which corresponded to a sensitivity of 88.2% and a specificity of 87.0%. The PPV and NPV were 77.4% and 93.6%, respectively.

Table 2.9 Summary of combined (inflammation, nutrient metabolism, and oxidative stress) candidate models and AUC values (1)

Variable                     Model 1  Model 2  Model 3  Model 4  Total
Lactation number             Y        Y        Y        Y        4
Season                       Y        Y        Y        Y        4
Monocytes                    Y        Y        Y        Y        4
SAA                          Y        Y        Y        Y        4
Albumin                      Y        Y        Y        Y        4
AOP                          Y        Y        Y        Y        4
Alpha tocopherol             Y        Y        Y        Y        4
NL ratio                     Y        Y        Y        Y        4
Calcium                      Y        Y        N        Y        4
NEFA                         N        Y        Y        Y        3
BCS                          Y        Y        Y        N        3
BHB                          N        Y        Y        N        2
Vitamin A                    N        Y        N        Y        2
Lactation X Monocytes (2)    Y        Y        Y        Y        4
Lactation X SAA (3)          Y        Y        Y        Y        4
SAA X Albumin (4)            Y        Y        Y        Y        4
AOP X BCS (5)                Y        Y        Y        N        3
AOP X NL ratio (6)           N        N        N        Y        1
Season X BHB (7)             N        Y        N        N        1
NL ratio X BHB (8)           N        N        Y        N        1
AUC                          0.93     0.93     0.92     0.92     -

(1) Full models are presented in Table S2.5
(2) Lactation number and monocytes interaction term
(3) Lactation number and SAA interaction term
(4) SAA and albumin interaction term
(5) AOP and BCS interaction term
(6) AOP and NL ratio interaction term
(7) NEFA and BCS interaction term
(8) NL ratio and BHB interaction term
AUC: area under the curve; SAA: serum amyloid A; AOP: antioxidant potential; NL ratio: neutrophil:lymphocyte ratio; BCS: body condition score; BHB: beta-hydroxybutyrate

3.5 Area under the curve comparison

The averaged predictions from the combined model set that included all 3 components of metabolic
stress (0.93; 95% CI = 0.90-0.96) had significantly higher predictive ability (measured as AUC) compared with the averaged predictions from the nutrient metabolism (0.73; 95% CI = 0.67-0.79), oxidative stress (0.78; 95% CI = 0.72-0.84), and inflammation (0.87; 95% CI = 0.83-0.91) model sets (p < 0.01 for all three comparisons). When comparing the predictive ability between the averaged predictions of nutrient metabolism, oxidative stress, and inflammation, the averaged predictions from the inflammation model set had the highest predictive ability. The AUC calculated from the inflammation model set was significantly greater than that of the oxidative stress model set (p < 0.01) and the nutrient metabolism model set (p < 0.01). The AUC calculated from the averaged predictions of the oxidative stress model set was greater than that calculated from the nutrient metabolism model set, but the difference was not significant (p = 0.09).

4. Discussion

All four averaged models (inflammation, nutrient metabolism, oxidative stress, and the combined model) had greater than acceptable predictive ability. In fact, the combined model, which included all 3 components of metabolic stress, had outstanding predictive ability. These results indicate that it may be possible to identify cattle at risk of transition diseases as early as dry-off. The dry period (abrupt cessation of lactation) begins approximately 51 to 60 days prior to calving under standard management (Bachman and Schairer, 2003). Since most disease incidence occurs in the transition period, 3 weeks before to 3 weeks after calving (Drackley, 1999), this allows for approximately 30 to 40 days to implement interventions to prevent disease development in at-risk cattle. Interventions may include improving environmental factors, nutrition, or dry-cow treatments (Cameron et al., 1998; LeBlanc et al., 2006).
We identified multiple candidate models for each metabolic stress component (nutrient metabolism, oxidative stress, and inflammation), and for the combined model. The oxidative stress model set had the greatest number of candidate models (7), followed by the nutrient metabolism and inflammation model sets (5 each), and the combined model set (4). The nutrient metabolism averaged model had lower predictive ability compared with the inflammation and oxidative stress averaged models. This is somewhat interesting, given that nutrient metabolism biomarkers are typically used to predict risk of transition cow disease. However, these biomarkers (e.g., BHB, NEFA, and calcium) are usually measured a few weeks before to a few days after calving (LeBlanc, 2010). Biomarkers in the current study were measured at dry-off. This may indicate that biomarkers for inflammation and oxidative stress are earlier indicators of disease risk compared with nutrient metabolism biomarkers alone. Alterations in immunity and oxidative stress may precede nutrient metabolism changes. These results are supported by a series of studies from a research group in Canada, which found that inflammatory biomarkers, including SAA, tumor necrosis factor (TNF), lactate, interleukin-1 (IL-1), and interleukin-6 (IL-6), measured 8 weeks before calving were associated with a variety of transition cow outcomes. However, these studies did not find any associations between NEFA or BHB and transition cow disease outcomes at 8 weeks before calving (Dervishi et al., 2015, 2016a, 2016b; Zhang et al., 2015, 2016). Another study measured markers associated with oxidative stress (reactive oxygen and nitrogen species, AOP, and an oxidative stress index) and inflammation (haptoglobin, SAA, albumin, and WBC differential) at dry-off, but found no associations with lameness risk (Abuelo et al., 2016).
Contrary to these findings, we found that AOP, SAA, monocytes, the neutrophil:lymphocyte ratio, and albumin measured at dry-off were all strong predictive markers for transition disease and/or adverse outcomes. Lastly, we also found that alpha tocopherol is a strong predictor; to the best of our knowledge, it has not been previously measured at dry-off to study disease outcomes.

The combined model, which included biomarkers from each component of metabolic stress, had the highest predictive ability compared with each individual metabolic stress component alone. This supports our hypothesis that including biomarkers from all three components of metabolic stress will improve predictive ability. However, the improved predictive ability could be explained by the greater number of potential variables for the combined model, which leads to a higher risk of overfitting (Sainani, 2014). Nevertheless, the models in the combined analysis set did include variables from each metabolic stress component: inflammation, oxidative stress, and nutrient metabolism. This supports the importance of including all three metabolic stress components in disease prediction. This makes biological sense given that the components of metabolic stress are intertwined and can form destructive feedback loops, exacerbating metabolic stress (Sordillo and Mavangira, 2014).

A major strength of this analysis is the use of predictive modeling techniques, including the use of information criteria for model selection and model averaging. Predictive modeling is better suited for predicting outcomes compared to explanatory modeling. To the best of our knowledge, only one study used predictive modeling methods for determining factors that affect dairy cattle health (Vergara et al., 2014). In Vergara et al. (2014), predictive models for post-partum problems were constructed using a stepwise approach aimed to maximize predictive accuracy using AUC values determined from ROC curve analyses.
In contrast, most studies aimed at identifying risk factors for transition diseases use explanatory methods. For instance, Vanholder et al. (2015) implemented a backwards selection procedure using P-values as the criterion to select variables in a model aimed to identify risk factors for the development of ketosis. The results of predictive and explanatory models typically vary greatly. Vergara et al. (2014) found that the variables selected for a model aimed to predict treatment or removal of cows from the herd varied with the use of explanatory versus predictive methods. In the present study, many variables in the candidate sets did not reach traditional values of statistical significance (P < 0.05) but were retained due to their predictive value based on AIC values. In addition, all two-way interactions were tested in the models, without consideration of their biological plausibility. This can make the model coefficients difficult to interpret. However, interpretation is more important in explanatory models than in predictive models (Shmueli, 2010). Interaction terms can improve predictive validity, without the extra cost or time to collect additional biomarkers or predictors.

Another strength of the study is that we performed internal validation of the candidate models using bootstrapping methods. The predictive value of models is overestimated when the model is tested on the same subjects that were used to develop the model. Internal validation is a method that is used to more accurately determine predictive ability (Steyerberg et al., 2001b). There are different methods to internally validate models, including split-sample, cross-validation, and bootstrapping (Steyerberg et al., 2001b). Steyerberg et al. (2001b) found that the bootstrapping method for predictive logistic regression models is best for producing efficient estimates that have low bias and low variability.
An additional strength of our study is that the cows were randomly selected, which decreased the risk of selection bias. Additionally, the majority of the biomarkers were analyzed in duplicate or triplicate, and diagnoses were confirmed by trained staff and/or a veterinarian. These factors reduced the risk of information bias (measurement error). There is a possibility that diseases were misdiagnosed (e.g., different protocols, staff). However, errors in diagnosis should occur at the same rate in cows undergoing metabolic stress and those not undergoing metabolic stress. This would result in non-differential misclassification bias and would bias the results towards the null.

Despite the strengths of our study, its limitations must be addressed. One limitation is that it could be argued that some of the biomarkers were categorized incorrectly into their respective metabolic stress component (nutrient metabolism, inflammation, and oxidative stress). We recognize that many of the biomarkers of metabolic stress have roles in more than one component of metabolic stress and these processes are interrelated (Sordillo and Mavangira, 2014). For instance, we categorized calcium and cholesterol into the nutrient metabolism category, although it could be argued that they both have a role in inflammation. Intracellular calcium signaling is involved in immune cell activation (Kimura et al., 2006) and cholesterol promotes inflammatory responses (Tall and Yvan-Charvet, 2015). However, we chose to categorize calcium in nutrient metabolism due to its key role in the metabolic disease milk fever, which can result in reduced nerve and muscle function (Goff, 2008a). We chose to categorize cholesterol under nutrient metabolism because cholesterol levels are largely affected by levels of lipoproteins, which correspond directly to the degree of lipid mobilization (Bionaz et al., 2007).
In addition, we could not include variables pertaining to previous lactations that affect disease risk, such as length of dry period and previous milk yield (Grohn et al., 1995; Fleischer et al., 2001; Pinedo et al., 2011). Logistic regression analyses exclude observations that have incomplete data, so heifers would be excluded from all models that included cow characteristics. We could not have used imputation to handle missing data in this case, because imputation is used for missing data that could have been observed but was not (Sterne et al., 2009). The variables in question cannot be observed for a heifer even in principle. Future studies could consider building separate models for heifers and cows so that cow-specific variables can be incorporated.

Another limitation is that we did not perform external validation, which would test the generalizability of the results outside of our source population. Internal validation may not be sufficient, especially for relatively small datasets (Bleeker et al., 2003). External validation should be performed using new data with substantial sample sizes to ensure the models are generalizable to other populations (Steyerberg et al., 2003). External validation is also imperative because the study population had lower than average disease incidence. For example, only 5.4% of the cows developed clinical mastitis, which is drastically lower than the 2014 estimate from the National Animal Health Monitoring System (NAHMS) of 24.8% (NAHMS, 2018). After validation in similar populations, such as other large commercial Holstein herds, further generalizability could be tested (e.g. smaller herds, different breeds, high disease incidence).

Another potential limitation is that only clinical diseases were included, so the models may not be predictive of subclinical diseases, such as subclinical mastitis, ketosis, or hypocalcemia.
However, a potential way to remedy the discrepancy between identifying subclinical and clinical disease could be to lower the cut-off predicted probability to increase the sensitivity of the models. This would increase the probability that a cow that is going to become sick will be identified by the test. However, lowering the cut-off may increase the number of false positives, and thus increase spending on interventions in cows that would not have become sick. Future studies could perform cost-benefit analyses to determine optimal cut-offs once predictive models have been thoroughly tested.

Another limitation of our study is that we used an aggregate outcome; therefore, we cannot use the models to predict specific diseases. This may make it more difficult to determine which interventions to implement because it is possible that different diseases will require different interventions. For instance, interventions for lameness may include environmental changes, such as providing more comfortable bedding, whereas interventions for milk fever may involve better calcium management (Cook and Nordlund, 2009; Santos et al., 2012; de Vries et al., 2015). Future studies could focus on building predictive models for specific outcomes.

Lastly, our study did not address the costs associated with sampling and analyzing biomarkers. Most of the models included many variables, which may become costly. Future studies could perform a cost-benefit analysis to determine if the potential reduction in disease incidence offsets the initial investment in measuring biomarkers.

In addition to external validation of the candidate models presented in this study, future work should include determining what interventions are most effective at reducing disease incidence after identification of at-risk cows at dry-off. These interventions may include nutritional changes, environmental changes (e.g.
bedding, temperature control), or dry-cow treatments (Cameron et al., 1998; LeBlanc et al., 2006). Future work could also build cohort or herd level models. Although individual cow predictions have immense value, prediction at the cohort or herd level is also important (LeBlanc et al., 2006). Currently, herd alarm levels are established for NEFA and BHB. If a certain percentage of the herd is above an established risk threshold, the herd is considered at risk and interventions are implemented (Ospina et al., 2010). Herd level predictions are useful because they reduce the number of cattle tested and can determine when changes in herd management are needed (LeBlanc, 2010). Lastly, researchers could extend predictive modeling techniques to other areas of dairy cattle health, including culling rates, reproductive issues, and milk production. Predictive models can then be implemented on the farm through the use of user-friendly monitoring applications.

5. Conclusions

We developed models with good predictive ability for early lactation diseases using biomarkers measured at dry-off. We identified multiple biomarkers of nutrient metabolism, oxidative stress, and inflammation that show potential as routine cow monitoring tools. These results have significant implications for reducing disease incidence. Identifying cows at risk of early lactation diseases at dry-off may allow for earlier implementation of preventive measures.

CHAPTER 3: COHORT-LEVEL DISEASE PREDICTION USING AGGREGATE BIOMARKER DATA IN TRANSITION DAIRY CATTLE

Wisnieski, L. a, Norby, B. a, Pierce, S.J. b, Becker, T. c, Gandy, J. a, and Sordillo, L.M.
d

a Department of Large Animal Clinical Sciences, Michigan State University, 736 Wilson Rd, East Lansing, MI, USA, 48824, wisnies5@msu.edu
b Center for Statistical Training and Consulting, Michigan State University, 293 Farm Lane, East Lansing, MI, USA, 48824, Steve.Pierce@cstat.msu.edu
c Department of Food Science and Human Nutrition, Michigan State University, 469 Wilson Rd, East Lansing, MI, USA, 48824, beckert4@msu.edu
d Corresponding author, Department of Large Animal Clinical Sciences, Michigan State University, 784 Wilson Rd, East Lansing, MI, USA, 48824, sordillo@msu.edu, 517-432-8821

Abstract

During the transition from late gestation to early lactation, dairy cattle are at increased risk for disease. Herd level monitoring for disease risk involves evaluating multiple factors, including feed intake, cow density, and biomarkers of nutrient metabolism. Biomarkers that are measured include non-esterified fatty acids (NEFA) and beta-hydroxybutyrate (BHB), which are usually measured in a subset of the herd (i.e. cohort). If a certain proportion of cows in the cohort are above a specific threshold for a biomarker, the cohort is considered at high risk of disease. Few previous studies have investigated other methods to aggregate individual cow-level data to the cohort level. We designed a prospective cohort study using cows (N = 277) from 5 Michigan commercial dairy herds to determine which data aggregation methods were useful when predicting cohort incidence of adverse health events, including 1) clinical diseases (mastitis, metritis, retained placenta, ketosis, lameness, pneumonia, milk fever, displaced abomasum) and 2) abortion or death of the calf or the cow. Multiple cohorts of cows that shared the same dry-off date (2-4 cohorts per farm, 18 total) were enrolled. We tested 3 different methods to aggregate individual cow data (i.e. biomarkers and covariates) that we referred to as the central, dispersion, and count aggregation methods.
The central method consisted of calculating the average value of each variable within a cohort, and the dispersion method involved taking the standard deviation or mean absolute deviation about the median of each variable within a cohort. The count method consisted of counting the number of cows above a specific threshold for each variable within a cohort. We used best subsets selection to select a bouquet of candidate count models for each aggregation method and averaged the predictions over the model set. We built 4 sets of count models: one for each aggregation method and a combined model that included all 3 methods. We evaluated the models based on goodness-of-fit, model calibration using scoring rules, and comparison of observed versus predicted counts. The central and the combined methods produced models that had good fit and model calibration. However, the models tended to overestimate disease incidence when observed counts were low and underestimate disease incidence when observed counts were high. These results indicate that it may be possible to use aggregate measures to predict cohort disease incidence as early as dry-off. A limitation of this study is the small sample size at the cohort level (N = 18). Future work should include validating these models in additional cohorts.

Keywords: cattle, disease, prediction, modeling, dairy, biomarkers

1. Introduction

Transition cow diseases, which most commonly occur in the period that spans 3 weeks before to 3 weeks after calving, have negative economic implications and affect animal welfare, farm sustainability, and food security (Mulligan and Doherty, 2008). Mastitis, the most common transition cow disease, costs an average of $444 within the first 30 days of lactation alone (Rollin et al., 2015). Physiological changes that occur in all dairy cattle during fetal development, calving, and milk production contribute to increased risk of disease during the transition period.
Increased energy demands and a decreased dry matter intake lead to negative energy balance (NEB). Negative energy balance is a normal occurrence in dairy cattle and results in dyslipidemia, increased lipid mobilization, reduced immune function, and insulin resistance (Contreras and Sordillo, 2011; Esposito et al., 2014). However, some dairy cattle fail to adjust appropriately and undergo excessive NEB, which may lead to metabolic stress. Metabolic stress can be described as 3 physiological processes: abnormal nutrient metabolism, excessive inflammation, and oxidative stress (Sordillo and Mavangira, 2014).

Although monitoring for metabolic stress can be done at the individual cow level, herd managers are often more concerned about cohort health. Calving cohorts are defined as cows at the same stage of lactation that are expected to calve around the same time. Results from monitoring may lead producers to change management factors, such as nutrition or housing, that are implemented at the cohort level. Monitoring cohorts for metabolic stress involves consideration of a multitude of factors including biomarker concentrations, health records, feed intake, and levels of rumination and activity (Mulligan et al., 2006; LeBlanc, 2010). Biomarkers that are routinely used to monitor NEB are markers of nutrient metabolism, which is a component of metabolic stress. However, these biomarkers (non-esterified fatty acids [NEFA] and beta-hydroxybutyrate [BHB]) are typically measured 7 to 10 days before to two weeks after calving (LeBlanc, 2010). Although this method is good at predicting the level of disease in a calving cohort (Ospina et al., 2013), it is considered reactive, because there is little time to implement changes that would reduce disease incidence in the current calving cohort. Identifying calving cohorts at risk of disease earlier, perhaps as early as dry-off, may allow time for interventions that will reduce excessive NEB, and therefore lower disease incidence.
To the best of our knowledge, no studies have investigated cohort or herd-level prediction as early as dry-off. However, some recent studies identified biomarkers that are associated with disease at the individual cow level earlier than previously reported. A research group in Canada identified cow-level biomarkers associated with transition diseases at -8, -4, 0, and +4 weeks relative to calving (Dervishi et al., 2015, 2016a, 2016b; Zhang et al., 2015, 2016). Many of the above studies found that increased concentrations of inflammatory biomarkers such as serum amyloid A (SAA), tumor necrosis factor (TNF), lactate, interleukin-1 (IL-1), and interleukin-6 (IL-6) measured 8 weeks before calving were associated with increased risk of disease. However, neither NEFA nor BHB were related to disease at this time. In concurrence with these findings, we previously found that antioxidant potential (AOP) and many inflammatory biomarkers (i.e. SAA, monocytes, the neutrophil:lymphocyte ratio, and albumin) measured at dry-off are strong predictors of transition diseases at the individual cow level (Wisnieski et al., 2019a). Together these results suggest that inflammation and oxidative stress biomarkers may be earlier predictors of disease than nutrient metabolism biomarkers. This finding stresses the importance of considering biomarkers from all three components of metabolic stress in disease prediction.

Typically, a proportion of a cohort is tested and the test results are evaluated as either the mean of a biomarker, or the proportion of cows above or below a threshold (Oetzel, 2004; Seifi et al., 2004; Ospina et al., 2013). This method of aggregating data from the bottom level to describe emergent properties in the top level is referred to as compositional modeling (Kozlowski and Klein, 2000). Individual data can be summarized using measures of central tendency, dispersion, or consensus (e.g.
percent agreement) to characterize the group level (Chan, 1998). The biomarkers mentioned earlier, NEFA and BHB, are typically evaluated as the proportion of cows above a threshold of risk within a cohort. Cohorts with more than 15-20% prevalence of cows with NEFA or BHB concentrations above a specified threshold in early lactation are considered high-risk due to increased disease incidence, lower milk production, and reduced reproductive performance (McArt et al., 2013b; Ospina et al., 2013). The threshold method has also been used for body condition score (BCS) (e.g. < 25% of cohort should have > 0.5 BCS loss in early lactation) (Pryce et al., 2001; Buckley et al., 2003; Mulligan et al., 2006). A marker that is evaluated as a cohort mean is urinary pH, which can be used to predict hypocalcemia after calving (Eberhart et al., 1982; Goff, 2004; Seifi et al., 2004).

An additional measure that could potentially be used to predict cohort health is the dispersion of biomarkers across the cohort. Metrics of dispersion include standard deviation, variance, interquartile range, and mean absolute deviation (MAD) about the median. A larger variance or standard deviation indicates that the cows in a cohort have a larger range of values. This may indicate that the group is more heterogeneous and that potentially more cows are outliers, i.e., have extreme values (high and/or low). To the best of our knowledge, metrics of biomarker dispersion have not been considered as predictors of disease risk at the cohort level. However, measures of dispersion have been used as predictors in other fields, such as in the social sciences in the study of organizational characteristics (Chan, 1998).

Most studies that aimed to identify biomarkers for transition cow diseases have used explanatory modeling methods, rather than predictive modeling methods.
Explanatory modeling methods can be used to explore a hypothesized causal link between factors, while predictive modeling methods are better suited to predict current or future events, such as disease occurrence (Sainani, 2014). Currently, there are no studies that have used predictive modeling techniques to determine cohort-level disease risk in dairy cattle. However, predictive modeling techniques have been used at the individual cow level (Vergara et al., 2014; Wisnieski et al., 2019a). In Wisnieski et al. (2019a), models were not built using the traditional method of P-values and/or presence of confounding; instead, Akaike Information Criterion (AIC) values were used for model selection. Additionally, evaluation of the models was based on predictive measures (i.e. receiver operating characteristic [ROC] curve analyses) (Wisnieski et al., 2019a).

Our objective was to investigate different methods to aggregate biomarker data measured at dry-off for prediction of adverse health event incidence (mastitis, metritis, retained placenta, ketosis, lameness, pneumonia, milk fever, displaced abomasum, abortion, and death of the calf or the cow) at the cohort level. We hypothesized that aggregate biomarkers would be able to predict disease incidence at the cohort level. We tested 3 different aggregation methods that we referred to as the central, dispersion, and count methods. The central method aggregates data by calculating the mean/median value of measured biomarkers and covariates within a cohort, the dispersion method aggregates data by calculating the standard deviation or MAD for each biomarker and covariate, and the count method counts the number of cows within a cohort above a set threshold for each biomarker and covariate. We built four sets of predictive models: one set for each of the 3 aggregate measures and one set of models that combined all three aggregate measures.

2. Material and Methods

The Strengthening the Reporting of Observational Studies in Epidemiology - Veterinary Extension (STROBE - al., 2016).
2.1 Animals

The Animal Use and Care Committee at Michigan State University (East Lansing) approved this study. This prospective cohort study was conducted on 5 commercial dairy cattle herds in Michigan. Briefly, we enrolled 300 clinically healthy cows that were nested within cohorts. The cohorts were sampled across different seasons to account for seasonal variation and consisted of cows randomly selected from the pen among cows that were to be dried off that week. A total of 18 cohorts (2-4 per farm based on the date of dry-off) were enrolled. Cows were excluded if they had missing data, were lost to follow-up, or turned out not to be pregnant. More information about the sample size calculation and inclusion and exclusion criteria at the herd level was previously described in Wisnieski et al. (2019a).

2.2 Data collection

Biomarkers that represented the three processes of metabolic stress (oxidative stress, inflammation, and nutrient metabolism) were measured from serum samples collected from the tail vein/artery at dry-off. The oxidative stress biomarkers included reactive oxygen species (ROS) and anti-oxidant potential (AOP), as well as the antioxidants alpha tocopherol, beta-carotene, and vitamin A. The inflammation biomarkers included haptoglobin, SAA, and white blood cell (WBC) counts (neutrophils, basophils, eosinophils, lymphocytes, monocytes, and banded neutrophils). The nutrient metabolism biomarkers included BHB, NEFA, and BCS. Cholesterol, albumin, vitamin D, and calcium were selected as biomarkers of both nutrient metabolism and inflammation. Body condition scoring was done by two trained lab employees using the 5-point method (Kristensen, 1986). More information on how the biomarkers were collected and analyzed was previously reported (Wisnieski et al., 2019a).
A veterinarian or trained farm employee confirmed diagnoses of all clinical diseases, and the majority of biomarkers (except ROS and haptoglobin) were measured in duplicate or triplicate to reduce the potential for misclassification of samples (i.e. information bias). Adverse health-event incidence was monitored from dry-off until 30 days post-parturition. An adverse health event was defined as either 1) having one or more clinical disease outcomes (mastitis, metritis, ketosis, lameness, displaced abomasum, pneumonia, milk fever, and/or retained placenta) and/or 2) abortion, or death of calf or cow.

2.3 Statistical analysis

Stata 14.2 and R version 3.4.1 were used for all analyses. Data cleaning and management were performed in Stata, which involved identifying unusual and missing observations using descriptive statistics and box-whisker plots. Multivariable models were constructed using count models (Poisson and negative binomial regression). The primary outcome of interest was adverse health-event incidence at the cohort level and the primary exposures (independent variables) were the aggregated biomarkers of metabolic stress. Season and lactation number were covariates. We compared three different methods of aggregating individual-level data to the cohort level for the eligible biomarkers and covariates: the central, dispersion, and count aggregation methods. We built 4 sets of models: one for each method of data aggregation and a combined model that included variables from each aggregation method. We analyzed data at the cohort level because each cohort contained cows at the same stage of a lactation cycle (i.e. enrolled at dry-off) and because adjustments to remedy increased risk of disease are often made at the cohort level. Cows in the same cohort were exposed to the same factors such as environment, diet, cow density, and ambient temperatures.
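The three aggregation methods can be illustrated with a minimal Python sketch. This is not the study's code (the analyses were run in Stata and R); the function names, the example values, and the threshold of 0.40 are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, median, stdev


def central(values, ordinal=False):
    """Central method: cohort mean (continuous) or cohort median (ordinal)."""
    return median(values) if ordinal else mean(values)


def dispersion(values, ordinal=False):
    """Dispersion method: standard deviation (continuous) or mean absolute
    deviation about the median (ordinal)."""
    if ordinal:
        m = median(values)
        return sum(abs(x - m) for x in values) / len(values)
    return stdev(values)  # assumption: sample SD with n - 1 denominator


def count_above(values, threshold):
    """Count method: number of cows whose value exceeds the threshold."""
    return sum(1 for x in values if x > threshold)


# Hypothetical biomarker values for one cohort of five cows (illustrative only).
cohort = [0.20, 0.35, 0.50, 0.80, 0.15]
summary = {
    "central": central(cohort),
    "dispersion": dispersion(cohort),
    "count": count_above(cohort, threshold=0.40),  # hypothetical threshold
}
```

Each cohort then contributes one row of aggregated predictors, which is why the cohort (N = 18) rather than the cow is the unit of analysis in the count models.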
Due to the small number of farms (N = 5), we did not include a random intercept for farm (Austin, 2010). For continuous predictive variables, the cohort mean was calculated as the average value within each cohort. We did not perform any transformations to adjust for non-normality (i.e. skewness) of the biomarkers, as extreme values are informative in predicting disease risk (i.e. cows that are outliers may be at higher risk of disease). For the dispersion method, the standard deviation of the values for each continuous variable within each cohort was calculated. For the categorical variables that were ordinal (e.g. lactation number and BCS), we calculated the cohort median for the central method and the cohort MAD about the median for the dispersion method. The cohort MAD about the median for each ordinal categorical variable was calculated using the following equation:

\[ \mathrm{MAD} = \frac{1}{N} \sum_{i=1}^{N} \lvert X_i - M \rvert \]

where $X_i$ represents the observation, $M$ represents the median, and $N$ represents the number of observations (Elamir, 2012). Season, the only nominal categorical variable, was left as is, since all cows within a cohort were dried off in the same season. Season was categorized using the following scheme: Spring = March, April, May; Summer = June, July, August; Fall = September, October, November; and Winter = December, January, February. For the count method, thresholds of biomarkers were chosen using roctg in Stata (Reichenheim, 2002), which chooses a threshold based on the best combined specificity and sensitivity in ROC analyses with the outcome (adverse health event incidence); the thresholds are shown in Supplementary Table 3.1 (S3.1). Prior to this step, LOWESS (Locally Weighted Scatterplot Smoothing) curves were visually assessed to determine which biomarkers needed 2 thresholds due to having a quadratic association with adverse health events (e.g.
lower and higher values both increased risk of adverse events, but values in the middle were associated with a decreased risk of adverse events). For each continuous predictor variable, the number of cows beyond the threshold(s) was quantified in each cohort; for BCS, cows were counted relative to the optimal BCS range. Optimal BCS at dry-off was defined as a BCS of 3.5 or 4.0 as reported by Kellogg (2010). For lactation number, the number of cows in their 1st lactation (heifers) was quantified in each cohort.

Our predictive modeling approach included the following steps, which were completed for each of the 3 methods (central, dispersion, and count methods) and for the combined model which included variables from all 3 aggregation methods: 1) best subsets AIC selection, 2) goodness-of-fit and dispersion tests, 3) shrinkage, 4) scoring rules, 5) model averaging, and 6) internal validation of observed versus predicted counts. Lastly, we compared Dawid and Sebastiani score (DSS; Dawid and Sebastiani, 1999) values and the agreement between observed and predicted counts calculated from all models to determine which aggregation method(s) resulted in the best predictive ability.

2.3.1. Step 1: Best subsets AIC selection

The glmulti package in R was used for best subsets selection; glmulti evaluates models built from the set of variables at the same time and ranks the models by AIC. Each possible model was a generalized linear model with a Poisson or negative binomial distribution and a log link. The outcome was the number of cows within each cohort with one or more adverse health events. Aggregate measures of biomarkers and covariates specific for that aggregation method (central, dispersion, and count) were considered for entry into the models. Because of the large number of variables, not all variables from the separate aggregation-method model sets could be entered into glmulti for the combined model; only a subset was included in variable selection. The top 30 variables were chosen by ranking each variable by AIC in bivariable analyses with the outcome (lower AIC = higher rank).
The maximum number of variables in each model was set to 3 so that the number of observations per variable was at least 6, to reduce the risk of overfitting (Vittinghoff and McCulloch, 2007). Interaction terms were not considered for entry in the models because of this constraint. An offset term (log[count]) was included in all models to account for the number of cows in each cohort. The model with the smallest AIC was designated the best model. For each additional model, the difference in AIC from the best model was calculated using the equation:

\[ \Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min} \quad (1) \]

where $\mathrm{AIC}_{\min}$ is the AIC from the model with the smallest AIC value (model 1) and $\mathrm{AIC}_i$ represents the AIC calculated from model 2 through n candidate models, where n is the total number of candidate models. A cut-off of $\Delta_i < 4$ was used, because models with larger $\Delta_i$ have considerably less support for being the best model (Burnham and Anderson, 2004). Models with $\Delta_i < 4$ were retained as candidate models.

2.3.2. Step 2: Goodness-of-fit and dispersion tests

The deviance goodness-of-fit test was used to test model fit for each remaining candidate model. This tests the current model versus the saturated model (a model with one parameter for each observation) (Long and Freese, 2006). A P-value greater than 0.05 indicated that the model fit the data reasonably well. Candidate models that had poor fit (that failed the deviance goodness-of-fit test) were excluded. The dispersiontest function in the R package AER (Version 1.2-5, Kleiber and Zeileis, 2017) was used to test for overdispersion, which occurs when there is greater variability in the data than expected. This command tests the null hypothesis of equidispersion against the alternative hypothesis of overdispersion. If the P-value was greater than 0.05, this indicated that there was not significant overdispersion, and a Poisson model was used. If the P-value was less than 0.05, this indicated that there was significant overdispersion, and a negative binomial model was used. Negative binomial models were fit with nbreg using the mean-dispersion parameterization (NB2) (Long and Freese, 2006).
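The AIC screening in Step 1 (Equation 1) can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code: the AIC values are hypothetical, the cut-off of 4 follows our reading of the text, and the weight formula is the standard Akaike weight that the same differences feed into when predictions are later averaged (Step 5).

```python
import math


def delta_aic(aics):
    """Difference between each model's AIC and the smallest AIC (Equation 1)."""
    best = min(aics)
    return [a - best for a in aics]


def retain_candidates(aics, cutoff=4.0):
    """Indices of models whose AIC difference is below the cut-off."""
    return [i for i, d in enumerate(delta_aic(aics)) if d < cutoff]


def akaike_weights(aics):
    """Standard Akaike weights: exp(-delta/2), normalized to sum to 1."""
    rel = [math.exp(-0.5 * d) for d in delta_aic(aics)]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical AIC values for four fitted count models (illustrative only).
aics = [100.0, 101.5, 103.2, 106.0]
kept = retain_candidates(aics)                      # the fourth model is dropped
weights = akaike_weights([aics[i] for i in kept])   # weights over retained models
```

Computing the weights only over the retained models is a design choice made here so that they sum to 1 within the candidate set.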
2.3.3 Step 3: Shrinkage

The penalized package (Version 0.9-51, Goeman et al., 2018) in R was used to perform ridge regression to shrink the variable coefficients in each remaining model to help prevent overfitting. The tuning parameter lambda1 (L1) was set to 0, so that all variables were retained in the model. The optL2 function was used to determine the optimal value for the tuning parameter lambda2 (L2), which was then applied to all regression coefficients (except the intercept). Shrinkage was performed on the standardized coefficients due to the different scales of the variables; however, unstandardized coefficients were reported (Goeman et al., 2018).

2.3.4 Step 4: Scoring rules

Scoring rules, which measure the accuracy of predicted values by comparing them to the mean event rate within other cohorts that have similar test values, were used to evaluate model calibration (Fenlon et al., 2018). In the current study, we used DSS, which are suited for predictive models (Dawid and Sebastiani, 1999). The scores were calculated with the following equation:

\[ \mathrm{dss}(P, x) = \left( \frac{x - \mu_P}{\sigma_P} \right)^2 + 2 \log \sigma_P \]

where $P$ is the predictive distribution, $x$ is the observed count, $\mu_P$ is the mean predicted count, and $\sigma_P$ is the standard deviation of the predicted counts. The mean score was calculated for each model, and a lower DSS indicated better ensuing probabilistic forecasts (Czado et al., 2009). The DSS values for each model were tested using the following test statistic, which is optimal for low disease counts:

\[ Z_s = \frac{\sum_{i=1}^{n} s_i - \sum_{i=1}^{n} E_0(s_i)}{\sqrt{\sum_{i=1}^{n} \mathrm{Var}_0(s_i)}} \quad (2) \]

where $s_i$ is the DSS value for observations $i$ through $n$, and $E_0(s)$ and $\mathrm{Var}_0(s)$ are the expected values and variance, respectively, of the scores under ideal prediction. A two-sided P-value was computed using the test statistic $Z_s$. For Poisson models with predicted mean $\lambda$, the expectation and variance were calculated using the following equations (Wei and Held, 2014):

\[ E_0(\mathrm{DSS}) = 1 + \log \lambda, \qquad \mathrm{Var}_0(\mathrm{DSS}) = 2 + \frac{1}{\lambda} \]

For negative binomial models with predicted mean $\mu$, the expectation was calculated using this formula:

\[ E_0(\mathrm{DSS}) = 1 + \log\left( \mu + \frac{\mu^2}{\psi} \right) \]

where $\psi$ is the parameter for overdispersion. The variance was calculated using the following formula (Wei and Held, 2014):

\[ \mathrm{Var}_0(\mathrm{DSS}) = 2 + \frac{6}{\psi} + \frac{1}{\mu + \mu^2 / \psi} \]

where $\psi$ is the parameter for overdispersion.
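As a rough illustration of the calibration test, the following Python sketch computes the Dawid-Sebastiani score, dss = ((x - mu) / sigma)^2 + 2 log(sigma), and a z-statistic that compares the summed scores with their expectation under ideal prediction. It is a sketch under stated assumptions, not the study's implementation: the function names are hypothetical, and the null moments used (expectation 1 + log(variance); variance 2 + 1/variance, plus 6/psi for the negative binomial with variance mu + mu^2/psi) are the ones implied by Poisson and negative binomial kurtosis.

```python
import math


def dss(observed, mu, sigma2):
    """Dawid-Sebastiani score; note that 2*log(sigma) equals log(sigma^2)."""
    return (observed - mu) ** 2 / sigma2 + math.log(sigma2)


def null_moments(mu, psi=None):
    """Expectation and variance of the score under ideal prediction.
    psi=None means Poisson; otherwise negative binomial with overdispersion psi."""
    sigma2 = mu if psi is None else mu + mu ** 2 / psi
    expectation = 1.0 + math.log(sigma2)
    variance = 2.0 + 1.0 / sigma2 + (0.0 if psi is None else 6.0 / psi)
    return sigma2, expectation, variance


def calibration_z(observed, predicted_means, psi=None):
    """z-statistic: (sum of scores - sum of expectations) / sqrt(sum of variances)."""
    score_sum = exp_sum = var_sum = 0.0
    for x, mu in zip(observed, predicted_means):
        sigma2, e0, v0 = null_moments(mu, psi)
        score_sum += dss(x, mu, sigma2)
        exp_sum += e0
        var_sum += v0
    return (score_sum - exp_sum) / math.sqrt(var_sum)


def two_sided_p(z):
    """Two-sided P-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A usage example with hypothetical counts: `two_sided_p(calibration_z([2, 5, 1], [2.4, 4.1, 1.3]))` returns the P-value for three cohorts under a Poisson model; a small value would flag poor calibration.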
If the z-value calculated from Equation 2 was significant (P < 0.05), this indicated that the model had poor calibration, and the model was excluded from model averaging.

2.3.5 Step 5: Model averaging

[...] Moussalli, 2011). In the case of transition disease prediction, it is likely that there are multiple models that have similar predictive ability. By averaging predictions over each candidate model set (central, dispersion, count, and combined), we adjusted for this uncertainty and improved predictions (Hu et al., 2015). The predictions from each contributing model were weighted using Akaike model weights. Akaike model weights represent the relative performance of model i within the candidate model set. Using the \Delta_i values (Equation 1), that is, the AIC difference between model i and the best model, we calculated the Akaike weights using the following formula:

$$w_i = \frac{\exp(-\Delta_i/2)}{\sum_{j=1}^{N} \exp(-\Delta_j/2)}$$

where N is the number of candidate models. Together, the weights within a candidate model set sum to 1. The model with the lowest AIC had the highest weight, the model with the second lowest AIC had the next highest weight, and so on. The predicted counts calculated from each model within a model set were averaged using the Akaike model weights. Therefore, there were four final averaged predicted counts (from the central, dispersion, count, and combined model sets).

2.3.6 Step 6: Internal validation of observed versus predicted counts

Poisson regression, or negative binomial regression if overdispersion was present, was performed to assess the association between the observed and log-transformed averaged predicted counts for each individual model as well as for the log-transformed averaged predictions across each model set: central, dispersion, count, and combined. For each regression, 500 bootstrapped samples were selected based on recommendations from Steyerberg et al. (2001b), who reported that 100 samples were enough for internal validation but that more are recommended. For each bootstrapped sample, cohorts were randomly selected from the study population with replacement (i.e.
the same cohort could be selected more than once), until the sample size was equal to the size of the original data set (Pavlou et al., 2015). Bootstrapping mimics the random selection of subjects from a population similar but not identical to the starting population (Moons et al., 2012). Bootstrapped bias-corrected confidence intervals for the slope and the intercept were calculated.

Ideally, regression of observed versus log-transformed predicted counts should result in an intercept not significantly different from 0 and a slope not significantly different from 1 (i.e. perfect prediction) (Pineiro et al., 2008). We adopted a stricter approach for comparing the slopes to 1 and the intercepts to 0 using equivalence testing, with a margin of equivalence of \delta = 0.1. We chose \delta = 0.1 because, even for cohorts with higher disease counts (i.e. 8-10), this margin would allow for a predicted value that differed from the observed value by only a few cows (i.e. 2-3 cows), which would be relatively insignificant for determining when cohort-level interventions are needed (Wiens, 2002). This involved performing two one-sided t-tests (i.e. H0: slope \le 1 - \delta or slope \ge 1 + \delta versus H1: 1 - \delta < slope < 1 + \delta) to test if the estimates were outside of the equivalence interval [p - 0.1, p + 0.1], where p is the target value for the parameter (Mascha and Sessler, 2011). These tests performed at an alpha of 0.05 are analogous to examining whether a 90% confidence interval around the estimate falls entirely inside the equivalence interval (Mascha and Sessler, 2011). Therefore, if the entire 90% confidence interval for the slope lay within the interval [0.90, 1.10], the slope was considered equivalent to 1 and the model predictions were consistent with the observed counts. If the 90% confidence interval for the intercept lay within the interval [-0.10, 0.10], the intercept was considered equivalent to 0 and therefore the model was unbiased (Pineiro et al., 2008).
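The confidence-interval form of the equivalence test can be illustrated with a minimal Python sketch. It assumes the 90% confidence intervals have already been estimated (for example by bootstrapping); the function names are hypothetical, and the example intervals are those reported for the averaged central predictions in Section 3.1.

```python
def is_equivalent(ci_low, ci_high, target, margin=0.1):
    """Equivalence holds only if the whole 90% CI lies inside
    the equivalence interval (target - margin, target + margin)."""
    return (target - margin) < ci_low and ci_high < (target + margin)

def is_different(ci_low, ci_high, target):
    """Difference test: the target is rejected only if it lies outside the CI."""
    return not (ci_low <= target <= ci_high)

# 90% CIs for the averaged central predictions (Section 3.1)
slope_ci = (0.82, 1.62)
intercept_ci = (-1.06, 0.29)

for name, ci, target in [("slope", slope_ci, 1.0), ("intercept", intercept_ci, 0.0)]:
    print(name,
          "| different from target:", is_different(*ci, target),
          "| equivalent to target:", is_equivalent(*ci, target))
```

A wide interval can contain the target value, so the difference test does not reject it, and still extend beyond the equivalence margin, so equivalence is not demonstrated. This is the pattern reported for all models in this chapter.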
Equivalence testing is stricter than testing for differences from target parameter values because it cannot capitalize on failing to reject H0 due to small sample size. Demonstrating equivalence requires precise estimates that are close to the target values (i.e., surrounded by confidence intervals narrow enough to fit in the equivalence interval), whereas failing to reject H0 in difference tests just requires confidence intervals wide enough to encompass the target values. Small samples yield larger standard errors and thus wider confidence intervals. Regardless, the 90% confidence intervals used for equivalence testing can also be inspected to conduct the difference tests recommended by Pineiro et al. (2008), albeit at an alpha of 0.10 instead of 0.05.

For each model and model set, the proportion of explained deviance (D²) was calculated using a command from an R package (Version 1.3.2; Barbosa et al., 2016). D² is better suited to count models than R² for several reasons. One reason is that D² values range from 0 to 1 and are more easily interpretable than R² measures, which can take values that are negative or greater than 1 (Cameron and Windmeijer, 1996). D² compares each model to the null model (intercept only) and increases when predictors that have some accuracy in predicting the outcome are added (Coxe et al., 2009). A perfect model has no residual deviance and a D² value of 1 (Guisan and Zimmerman, 2000; Coxe et al., 2009).

3. Results

The final sample included 277 cows (189 multiparous and 88 primiparous cows) after excluding 16 cows that were missing data at dry-off, 4 cows that were lost to follow-up (did not have data past dry-off and/or calving), one cow that had missing SAA data, one that was diagnosed with lameness at dry-off, and one that did not calve. The final sample included 18 cohorts from 5 farms. Blood samples were collected 47.95 ± 12.81 (mean ± SD) days before calving.
Further description of the study sample and descriptive statistics at the individual cow level have been published previously (Wisnieski et al., 2019a). Disease incidence at the cohort level is presented in Table 3.1.

Table 3.1 Adverse health event incidence in the 18 cohorts in the study sample (N = 277 cows)
Cohort 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Sum
Abortion 1 1 1 1 4
Death of calf 1 1
Death of cow 1 1 1 1 4
DA 1 2 1 1 1 1 7
Ketosis 4 2 2 1 2 1 5 2 2 1 22
Lameness 1 3 9 2 3 1 19
Mastitis 1 1 2 1 2 2 2 1 1 2 15
Metritis 1 1 2 2 1 2 1 2 2 3 1 1 1 20
Milk fever 1 1 2
Pneumonia 2 1 1 1 1 1 1 8
RP 1 2 2 2 2 1 2 1 1 1 1 2 18
Cohort incidence¹ 3 1 8 6 1 3 6 13 4 5 8 3 3 3 8 7 4 6 92
Adverse outcomes² 3 3 9 7 1 3 14 16 8 7 8 4 3 4 11 7 4 8 120
Total N 14 17 18 15 15 16 15 17 14 15 16 15 16 14 17 16 13 14 277
DA: displaced abomasum; RP: retained placenta
¹ Defined as the number of cows within each cohort that had 1 or more adverse health events.
² Total number of adverse outcomes within each cohort.

3.1 Central method

The mean values for all variables for each of the 18 cohorts are shown in Table 3.2. Median values are shown for BCS and lactation number. In total, there were 20 candidate models, which passed both the deviance goodness-of-fit and overdispersion tests (P > 0.05). However, three of the models had significant DSS values, which indicated poor model calibration (Equation 2). These models were excluded at this stage, leaving 17 total models (Table 3.3). The model parameters are presented in detail in Table S3.2. Mean calcium and albumin were present in all 17 models. Lactation number was excluded from model selection because the median lactation number did not vary between cohorts. Season was not present in any models. The model with the best predictive ability, indicated by DSS values, was model 15 (DSS = 1.41), which included mean calcium, mean albumin, and mean NEFA.
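The Akaike weighting used for model averaging (Step 5) can be sketched as follows. This is an illustrative Python example only: the AIC values and predicted counts are hypothetical placeholders, not values from the candidate models, and the function names are assumptions.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into Akaike weights that sum to 1."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best model
    rel_likelihood = [math.exp(-(aic - best) / 2.0) for aic in aic_values]
    total = sum(rel_likelihood)
    return [r / total for r in rel_likelihood]

def averaged_prediction(predictions, weights):
    """Weighted average of the predicted counts from each candidate model."""
    return sum(p * w for p, w in zip(predictions, weights))

aic = [70.1, 70.9, 72.4]           # hypothetical AIC values for three models
cohort_predictions = [4.2, 5.0, 3.6]  # hypothetical predicted counts for one cohort

weights = akaike_weights(aic)
print([round(w, 3) for w in weights])
print(round(averaged_prediction(cohort_predictions, weights), 2))
```

Because the weights are normalized, the averaged prediction always lies between the smallest and largest predictions of the contributing models, and the model with the lowest AIC contributes most.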
Akaike model weights for model averaging varied from 4% to 14% and are shown in Table S3.3.

Table 3.2 Cohort means/medians of predictor variables for the central aggregation method in 18 cohorts (N = 277 cows)
Cohort 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Total
n 14 17 18 15 15 16 15 17 14 15 16 15 16 14 17 16 13 14 277
Albumin 3.5 3.5 3.4 3.4 3.2 3.2 3.6 3.5 3.4 3.4 3.4 3.4 3.5 3.5 3.6 3.5 3.3 3.2 3.4
Alpha tocopherol 5.7 5.1 4.1 3.6 4.3 4.4 5.2 5.1 3.5 3.3 5.1 5.0 2.8 2.9 4.2 5.0 2.1 1.6 4.1
AOP 4.4 5.2 6.0 3.4 4.4 4.7 5.8 5.1 3.9 4.1 4.6 5.4 2.8 3.0 4.3 6.5 2.8 3.2 4.4
Beta carotene 3.3 3.3 2.3 1.9 3.0 3.2 3.7 3.7 1.9 2.1 3.2 4.0 12.3 15.5 4.5 4.2 8.9 6.9 4.9
BHB 4.9 5.7 4.0 4.4 6.2 5.8 4.0 3.8 4.6 5.6 5.3 4.3 6.8 8.1 6.1 6.8 6.0 6.4 5.5
Calcium 9.9 9.9 9.6 9.5 9.8 9.6 9.9 9.7 9.5 9.4 9.5 9.8 10.0 9.9 9.7 9.6 9.6 9.5 9.7
Cholesterol 203.1 209.6 171.8 174.7 153.3 146.4 175.1 175.9 182.1 197.0 203.7 166.5 165.7 185.1 167.8 180.4 166.1 137.7 175.3
Haptoglobin 0.2 0.1 0.1 0.1 0.5 0.4 0.3 0.1 0.1 0.3 0.5 0.3 0.1 0.1 0.3 0.2 0.0 0.0 0.2
NEFA 0.2 0.2 0.3 0.2 0.2 0.2 0.3 0.3 0.3 0.4 0.3 0.3 0.4 0.6 0.3 0.2 0.2 0.2 0.3
ROS 27.1 40.6 69.6 98.2 26.6 41.5 28.2 32.3 82.3 66.0 50.9 42.3 56.3 43.5 46.3 44.1 71.8 74.2 52.1
SAA 26.3 28.6 46.9 75.4 98.7 79.3 81.9 52.5 41.4 145.4 72.8 28.9 45.5 32.8 18.8 50.3 28.0 29.6 54.1
Vitamin A 355.6 332.4 247.4 262.7 288.7 288.9 302.3 315.9 236.4 224.6 217.4 257.0 269.2 351.7 258.2 267.6 358.4 337.1 292.7
Vitamin D 113.2 108.9 103.6 114.7 104.0 93.3 82.6 71.6 78.5 93.8 87.0 81.6 94.2 87.7 124.3 118.2 115.6 109.9 99.0
BCS¹ 3.0 3.0 3.5 3.5 2.5 3.0 3.0 3.0 3.5 3.5 3.3 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5
Lactation¹ 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
Season² F F Sp Sp F F F F Sp Sp W Sp S S W Sp F F -
WBC
BN 107.9 81.4 6.5 10.1 108.9 102.1 78.4 81.0 5.6 28.4 79.1 9.6 21.7 9.1 49.0 31.5 6.3 38.2 47.5
B 61.8 14.5 0.0 38.9 184.0 5.9 30.3 25.0 0.0 48.2 13.9 15.3 79.8 36.4 15.4 16.7 37.2 56.8 37.8
E 114.0 170.4 306.2 364.5 303.6 314.8 203.6 351.0 451.6 357.8 227.8 338.8 539.7 208.7 371.0 157.9 314.4 515.1 312.9
L 6451.7 6470.5 5969.4 5209.0 7280.5 5365.6 5279.2 4554.4 3510.8 3538.2 4102.0 3720.9 4409.7 3665.7 4016.1 3320.1 3360.4 3841.8 4672.9
M 234.6 108.4 160.4 177.7 74.6 152.6 138.4 127.3 126.4 381.5 237.7 195.5 441.8 510.0 152.4 146.5 407.9 168.0 218.7
N 3580.9 2603.1 4221.3 4832.5 2168.8 2311.0 2245.1 2388.1 3654.2 3778.3 2976.3 2982.6 3506.1 4542.2 3215.8 3176.8 4379.8 2984.5 3314.0
ALB: albumin; ALPH: alpha tocopherol; AOP: antioxidant potential; BETA: beta-carotene; BHB: beta-hydroxybutyrate; CA: calcium; CH: cholesterol; HAP: haptoglobin; NEFA: non-esterified fatty acids; ROS: reactive oxygen species; SAA: serum amyloid A; VITA: vitamin A; VITD: vitamin D; WBC: white blood cells; BN: band neutrophils; B: basophils; E: eosinophils; L: lymphocytes; M: monocytes; N: neutrophils; BCS: body condition score; F: fall; Sp: spring; S: summer; W: winter
Units for biomarkers: [...] calcium (mg/dL), cholesterol (mmol/L), haptoglobin (mg/dL), NEFA (mmol/L), ROS (DCF/mL), [...] vitamin D (ng/mL)
¹ Presented as median value.
² Season is already a cohort-level variable, so it was left as is.

Table 3.3 Summary of variables present in the central aggregation method models
Model 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 Sum
Albumin X X X X X X X X X X X X X X X X X 17
Alpha tocopherol X 1
AOP X 1
Band cells X 1
Basophils X 1
Beta-carotene X 1
BCS X 1
BHB X X 1
Calcium X X X X X X X X X X X X X X X X X 17
Eosinophils X 1
Haptoglobin X 1
Lymphocytes X 1
Monocytes X 1
Neutrophils X 1
NEFA X 1
ROS X 1
Vitamin A X 1
X = variable present in model; models are presented in more detail in Table S3.2.

Internal validation results of observed versus log-transformed predicted counts are presented in Table S3.4. All models had intercepts that were not significantly different from zero and slopes that were not significantly different from one.
However, for all slopes and intercepts of the individual models and the averaged model, the equivalence tests also failed to reject their null hypotheses. At least one endpoint of the 90% confidence interval for each slope and each intercept fell outside the relevant equivalence interval for that parameter. Therefore, we cannot confidently say that the slopes were equivalent to 1 and the intercepts were equivalent to 0. This indicated that the models could potentially be biased and produce predicted counts not consistent with observed counts. The slope and intercept for the Poisson regression of observed versus predicted counts for the averaged predictions were 1.23 (90% CI = 0.82 to 1.62) and -0.37 (90% CI = -1.06 to 0.29), respectively. Figure 3.1 shows that the model overestimated disease incidence when the observed count was low and underestimated disease incidence when the observed count was high. The D² of the Poisson regression analysis for observed and log-transformed predicted counts for the averaged model was 0.64.

Figure 3.1 Residuals (observed - predicted) versus observed counts for the averaged predictions from the central model set. Bootstrapped estimates of the Poisson regression line for observed versus predicted counts: log(Observed Count) = -0.37 + 1.23(log(Predicted Count)). Intercept not significantly different from 0 (90% CI = -1.06 to 0.29), but not equivalent to 0. Slope not significantly different from 1 (90% CI = 0.82 to 1.64), but not equivalent to 1.

3.2 Dispersion method

Summary statistics for the aggregated variables for the dispersion method are shown in Table 3.4. Values for BCS and lactation number are presented as the MAD about the median. In total, there were 25 candidate models (\Delta_i < 4 from the best model), all of which passed the deviance goodness-of-fit and overdispersion tests (P > 0.05). However, all 25 models had significant DSS values (P < 0.05) (Equation 2) and therefore poor calibration and predictive performance.
Because there were no viable candidate models, analysis of the models from the dispersion method was stopped at this point.

Table 3.4 The standard deviations/MADs about the median for the dispersion aggregation method in 18 cohorts (N = 277 cows)
Cohort 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Total
n 14 17 18 15 15 16 15 17 14 15 16 15 16 14 17 16 13 14 277
Albumin 0.2 0.2 0.2 0.3 0.1 0.3 0.2 0.1 0.4 0.2 0.2 0.3 0.1 0.2 0.2 0.3 0.3 0.3 0.2
Alpha tocopherol 2.2 1.8 1.4 1.6 2.0 2.2 1.4 1.6 1.1 0.9 1.8 2.1 1.3 0.7 0.9 1.5 0.7 0.5 1.4
AOP 1.4 1.2 2.0 0.4 0.9 0.8 0.7 0.7 0.8 0.5 1.4 1.0 0.4 0.3 0.4 1.7 0.6 0.4 0.9
Beta carotene 1.3 1.8 1.5 1.0 1.3 1.5 1.0 0.9 0.5 0.9 1.2 0.9 3.3 4.6 1.4 1.5 2.4 1.9 1.6
BHB 0.8 1.4 1.2 1.6 2.6 1.8 1.1 0.8 1.0 2.1 1.6 1.2 3.6 3.3 2.0 1.2 2.0 1.3 1.7
Calcium 0.5 0.3 0.3 0.4 0.5 0.4 0.3 0.2 0.6 0.4 0.3 0.3 0.4 0.6 0.5 0.5 0.3 0.3 0.4
Cholesterol 85.6 87.0 65.2 67.1 51.9 57.2 61.3 60.5 39.4 54.0 62.8 65.7 59.9 67.9 54.4 63.7 38.6 40.2 60.1
Haptoglobin 0.1 0.1 0.1 0.0 0.5 0.4 0.4 0.1 0.1 0.5 0.7 0.4 0.1 0.1 0.4 0.2 0.0 0.0 0.2
NEFA 0.1 0.8 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.2 0.2 0.2 0.2 0.3 0.2 0.1 0.1 0.1 0.1
ROS 11.7 14.9 41.1 42.5 9.9 22.9 9.2 15.6 71.5 24.0 21.3 14.2 42.8 26.4 21.8 25.7 36.7 38.5 27.3
SAA 39.2 33.2 60.5 77.2 118.6 56.5 80.8 53.9 20.7 122.9 48.0 27.2 70.1 28.6 13.0 39.7 28.6 35.4 53.0
Vitamin A 80.0 98.2 44.7 60.4 71.7 91.7 45.7 50.4 42.4 53.0 51.4 56.3 44.4 50.0 44.2 29.5 59.0 76.4 58.2
Vitamin D 13.9 14.5 38.5 46.7 21.3 17.3 17.9 12.9 23.0 22.8 29.0 22.0 13.5 21.7 35.3 33.0 21.4 21.8 23.7
BCS¹ 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.5 0.5 0.5 0.3 0.0 0.3 0.0 0.0 0.0 0.0 0.0 -
Lactation¹ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Season² F F Sp Sp F F F F Sp Sp W Sp S S W S F F -
WBC
BN 132.0 80.1 20.0 22.5 119.2 79.8 78.3 178.4 21.0 37.7 141.0 17.0 34.0 26.0 83.2 57.6 15.6 40.5 47.5
B 80.7 21.3 0.0 62.6 184.9 23.5 35.8 25.5 0.0 54.2 22.1 45.8 48.3 56.5 30.5 24.8 27.0 47.3 43.9
E 150.1 192.8 248.3 261.6 215.8 210.6 115.6 312.6 398.1 254.6 215.6 219.1 372.2 199.8 405.4 123.4 200.9 296.1 255.0
L 4809.4 4520.7 2901.4 3002.7 4581.5 3435.6 2517.3 2921.7 1265.2 2738.1 2163.8 1979.0 1732.0 1705.5 2149.9 1562.5 2091.7 1155.2 2624.0
M 169.4 99.9 112.3 146.1 72.4 109.7 136.0 97.4 100.7 160.3 237.1 199.2 267.8 316.9 177.3 107.5 201.4 154.6 159.2
N 1184.1 1076.1 1189.1 1625.9 680.3 1312.6 699.4 971.7 707.8 1254.3 1131.7 1380.8 2305.8 1564.5 1344.4 871.2 1144.2 1174.3 1201.0
AOP: antioxidant potential; BHB: beta-hydroxybutyrate; NEFA: non-esterified fatty acids; ROS: reactive oxygen species; SAA: serum amyloid A; WBC: white blood cells; BN: band neutrophils; B: basophils; E: eosinophils; L: lymphocytes; M: monocytes; N: neutrophils; BCS: body condition score; F: fall; Sp: spring; S: summer; W: winter
Units for biomarkers: [...] vitamin D (ng/mL)
¹ Presented as MAD about the median.
² Season is already a cohort-level variable, so it was left as is.

3.3 Count method

The thresholds chosen for each continuous biomarker are shown in Table S3.1. One cut-point was used for all biomarkers except alpha-tocopherol, for which the range determined from ROC analysis was 2.34 to 5.24 µg/mL. The summary statistics showing the count of cows above each threshold (or outside of the optimal range) within each cohort are shown in Table 3.5. The number of cows in their first lactation (heifers) is shown for lactation number. [...] All models passed the deviance goodness-of-fit test and the test for overdispersion (P > 0.05). However, all 21 models had significant DSS values (P < 0.05), indicating poor calibration and predictive ability. Therefore, the analysis for the count method was stopped at this point due to lack of viable models.
Table 3.5 Number of cows above each threshold within a cohort for the count aggregation method in 18 cohorts (N = 277 cows)
Cohort 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Total
n 14 17 18 15 15 16 15 17 14 15 16 15 16 14 17 16 13 14 277
Albumin 8 12 12 7 1 2 12 14 11 9 9 11 12 9 13 11 5 5 163
Alpha tocopherol 5 9 8 7 5 5 8 10 10 13 5 9 8 12 14 11 5 1 145
AOP 7 16 16 1 8 10 14 13 3 3 11 12 0 0 6 15 0 0 135
Beta-carotene 6 7 3 0 7 7 7 7 0 1 4 10 16 14 12 10 13 14 138
BHB 6 12 3 6 8 8 3 1 3 9 8 3 9 12 13 14 10 10 138
Calcium 9 14 8 7 9 9 11 10 5 4 5 10 14 7 12 10 6 3 153
Cholesterol 8 11 10 7 6 4 8 8 8 8 10 6 7 8 9 10 5 3 136
Haptoglobin 13 3 8 0 15 12 11 8 6 6 15 12 8 4 16 10 0 0 147
NEFA 5 4 9 4 4 3 7 15 8 12 12 10 16 13 9 6 8 4 149
ROS 1 5 14 15 1 8 1 3 12 14 8 7 8 5 8 5 11 12 138
SAA 3 6 7 11 9 11 9 7 10 12 12 4 5 6 3 10 3 5 133
Vitamin A 12 9 4 5 8 9 10 12 2 3 1 3 16 14 6 5 12 9 140
Vitamin D 13 14 9 10 9 6 3 0 3 7 7 3 7 5 13 12 11 8 140
BCS¹ 12 15 7 3 14 13 8 9 4 4 8 2 3 3 7 6 4 3 125
Heifers 4 6 6 5 5 5 5 6 4 5 4 4 6 3 5 5 5 5 88
Season² F F Sp Sp F F F F Sp Sp W Sp S S W Sp F F -
WBC
BN 10 13 2 3 9 13 13 11 1 8 6 4 7 2 8 7 2 8 127
B 8 6 0 6 14 1 7 10 0 10 5 2 14 7 4 7 10 11 122
E 2 4 10 5 8 9 5 8 10 10 5 9 11 6 11 4 7 13 137
L 10 15 15 8 11 10 10 8 3 4 6 5 8 5 7 5 3 7 140
M 8 6 8 8 2 5 6 4 4 14 8 6 13 13 6 9 13 6 139
N 11 5 14 13 1 2 3 4 11 11 6 4 7 12 10 10 12 6 142
AOP: antioxidant potential; BHB: beta-hydroxybutyrate; NEFA: non-esterified fatty acids; ROS: reactive oxygen species; SAA: serum amyloid A; WBC: white blood cells; BN: band cells; B: basophils; E: eosinophils; L: lymphocytes; M: monocytes; N: neutrophils; BCS: body condition score; F: fall; Sp: spring; S: summer; W: winter
Units for biomarkers: albumin (g/dL), alpha tocopherol [...] (mmol/L), haptoglobin (mg/dL), NEFA (mmol/L), [...] vitamin D (ng/mL)
¹ Presented as the number of cows outside of the optimal BCS range (<3.5 or >4.0) within each cohort.
² Season is already a cohort-level variable, so it was left as is.

3.4 Combined method (central, dispersion, and count methods)

The variables that were eligible for entry in the combined models are presented in Table S3.[...] All 101 models passed the deviance goodness-of-fit test and the test for overdispersion (P > 0.05). Two models were removed due to L2 shrinkage parameters that went to infinity. Out of the remaining 99 models, 89 had DSS Z-scores that were significant (P < 0.05). Therefore, 10 final candidate models remained; their parameters, AIC values, DSS values, and shrinkage parameters are presented in Table S3.6. A summary of the models is presented in Table 3.6. The most common biomarkers present in the models were mean albumin and mean calcium, which were present in all 10 models. The AIC weights calculated for model averaging ranged from 7% to 21% and are shown in Table S3.7. Evaluation of DSS scores showed that Model 10, which included mean albumin, mean calcium, and the standard deviation of vitamin D, had the best predictive ability (DSS = 1.41) (Table S3.6).

Table 3.6 Summary of combined model candidate set
Model 1 2 3 4 5 6 7 8 9 10 Sum
Albumin (C) X 1
Albumin (Ce) X X X X X X X X X X 10
Band cells (D) X 1
Beta-carotene (Ce) X 1
BHB (Ce) X 1
BCS (D) X 1
Calcium (Ce) X X X X X X X X X X 10
Cholesterol (D) X 1
Heifer (C) X 1
Vitamin A (Ce) X 1
Vitamin D (D) X 1
C: count; Ce: mean; D: dispersion; BHB: beta-hydroxybutyrate; BCS: body condition score
X = variable present in model; models are presented in more detail in Table S3.6.

The D² calculated from the Poisson regression analysis of observed counts regressed on the log-transformed averaged predicted counts from the combined model set was 0.64. Internal validation results indicated that, for all 10 models and for the averaged model, the intercepts were not significantly different from 0 and the slopes were not significantly different from 1 (Table S3.8).
However, all slopes and intercepts also failed equivalence testing; therefore, we cannot conclude that the slopes were equivalent to 1 and the intercepts were equivalent to 0. This suggests that the models may be biased and produce predicted counts that differ from observed counts. The slope and intercept for the Poisson regression of observed versus log-transformed predicted counts for the averaged model were 1.24 (90% CI = 0.83 to 1.65) and -0.42 (90% CI = -1.08 to 0.26), respectively. Visual assessment of residuals (observed - predicted counts) versus observed counts indicated a tendency of the model to overestimate disease incidence when observed counts were low and underestimate disease incidence when observed counts were high (Figure 3.2).

Figure 3.2 Residuals (observed - predicted) versus observed counts for the averaged predictions from the combined model set. Bootstrapped estimates of the Poisson regression line for observed versus predicted counts: log(Observed Count) = -0.42 + 1.24(log(Predicted Count)). Intercept not significantly different from 0 (90% CI = -1.08 to 0.26), but not equivalent to 0. Slope not significantly different from 1 (90% CI = 0.83 to 1.65), but not equivalent to 1.

4. Discussion

The central and combined methods were viable techniques to predict disease incidence at the cohort level. Both methods produced several models that had good fit and model calibration (measured by DSS values), which indicates valid ensuing predictions. However, the count and dispersion methods produced no viable models unless combined with other aggregation methods (e.g. the combined method). The central and combined methods should be considered in addition to current monitoring methods that involve counting the number of cows above a set threshold of a biomarker (typically NEFA or BHB), especially if they are truly superior at predicting disease incidence.
The results of this study show that it is possible to identify, at dry-off, cohorts of cattle at risk of early lactation diseases. Most diseases occur during the transition period, which is the period from 3 weeks before to 3 weeks after calving (Drackley, 1999). Identifying cohorts at risk of disease at the beginning of the dry period would allow approximately 30-40 days to implement interventions before the transition period begins. Interventions may include altering the environment (e.g. housing, cooling fans, bedding), improving nutrition, or changing dry-cow treatments (Cameron et al., 1998; LeBlanc et al., 2006).

We identified multiple candidate models using the central and combined methods. The central method model set had 17 candidate models and the combined model set had 10 models. All models in each model set and their respective averaged predictions had acceptable DSS values, which indicates good future probabilistic predictions. All models had intercepts that were not significantly different from 0 and slopes that were not significantly different from 1. However, none of the slopes and intercepts from our models were demonstrably equivalent to the target values for those parameters, because at least one endpoint of each confidence interval fell outside the corresponding equivalence interval (i.e., [0.90, 1.10] for slopes; [-0.10, 0.10] for intercepts). This suggests that the models could potentially be biased and produce predicted counts that do not align with observed counts if the true parameter values fall outside those ranges. Visual assessment of the observed versus predicted counts indicated a tendency of the models to overestimate disease risk in cohorts with lower observed counts and underestimate disease risk in cohorts with higher observed counts. The averaged models from the central and combined model sets had the same D² value (0.64).
In addition, the best models (indicated by DSS) for the central and combined model sets had equal DSS values (1.41) and similar D² values (0.62 and 0.63, respectively). These results indicate that combining different aggregation methods did not improve predictive ability. However, the combined method, using multiple aggregate measures in one model, could be more practically appealing than using separate measures because the models can include different aggregate measures for the same variable (e.g. mean albumin and count of cows above threshold). This would decrease the cost of testing by reducing the number of biomarkers that would need to be tested. Another option for aggregating biomarker measures would be to use the coefficient of variation, which is calculated as the ratio of the standard deviation to the mean. However, Sorensen (2002) reported that the coefficient of variation has many weaknesses in compositional modeling, including the fact that it confounds the mean and standard deviation, which may have separate effects on the outcome.

Albumin and calcium were overall the most common biomarkers in the candidate models. Calcium, although typically used to monitor disease risk at the individual cow level around calving, was shown to be associated with risk of displaced abomasum at the cohort level when measured a week after calving in a previous study (Chapinal et al., 2012). Our study is, to the best of our knowledge, the first to investigate the predictive ability of calcium for disease at the group level as early as dry-off. The predictive ability of calcium measured at dry-off is supported by our previous findings at the individual cow level (Wisnieski et al., 2019a). In the central and combined models, calcium had a negative coefficient, which indicates that higher cohort calcium levels at dry-off were associated with a lower incidence of disease around calving.
Higher calcium concentrations at dry-off may indicate that the cow will be better able to replenish calcium stores in preparation for the next lactation and avoid severe hypocalcemia (Horst et al., 1994). Hypocalcemia reduces feed intake, causes increased lipid mobilization, and increases susceptibility to disease (Goff, 2008a). Therefore, lower cohort calcium concentrations at dry-off may indicate that the cohort will not adjust well to the stresses of early lactation.

Prior to this study, albumin had not been used as a cohort-level predictor. However, we previously found that albumin concentrations measured at dry-off were a strong predictor of disease risk at the individual cow level (Wisnieski et al., 2019a). In the central and combined models in the present analysis, albumin had a positive coefficient, which indicates that higher cohort albumin concentrations measured at dry-off were associated with a higher incidence of disease around calving. This was not expected, given that albumin is a negative acute phase protein and decreases in response to inflammation. In addition, albumin serves as a carrier of NEFA, so higher albumin concentrations occur with lower levels of lipid mobilization after parturition (Contreras and Sordillo, 2011; Tóthová et al., 2014). However, there may be other reasons albumin at dry-off was an important predictor. Albumin serves as a carrier of calcium, and it has been argued that calcium should be calculated with an albumin correction in certain cases (e.g. hypoproteinemic conditions) if ionized calcium cannot be measured (Seifi et al., 2005). None of the final models included solely albumin and calcium, which lends strength to the argument that together they may be better predictors than separate predictors.
Although the nutrient metabolism biomarkers typically used for monitoring disease risk at the cohort level (NEFA and BHB) were present in some models, they were not in as many models as calcium and albumin. This indicates that calcium and albumin may be better and earlier indicators of transition diseases. These findings have also been corroborated at the individual cow level in a series of studies that used explanatory modeling methods. The authors did not find associations between nutrient metabolism markers (BHB and NEFA) measured 8 weeks before calving and any transition disease (Dervishi et al., 2015, 2016a, 2016b; Zhang et al., 2015, 2016). However, the authors found that biomarkers for inflammation (SAA, TNF, lactate, IL-1, and IL-6) measured 8 weeks before calving were associated with a diversity of transition disease outcomes.

A strength of our study is that we performed internal validation of the observed versus predicted counts for each model and the averaged models using bootstrapped analyses. Even though the opportunity for external validation (enrolling additional cohorts to test the models) was not available, internal model validation through bootstrapping simulates the selection of samples from a similar population. The observed versus predicted counts for the averaged models indicated that the models tended to overestimate disease incidence when observed counts were low and tended to underestimate disease incidence when observed counts were high. This could be problematic because herd managers may not apply interventions where they are truly needed or may spend money unnecessarily on interventions that are not needed. Future studies could perform cost-benefit analyses to determine the optimum action level of predicted incidence.

A limitation of our study is the low sample size at the cohort level, which limited the number of variables that we could include in each model and contributed to the large standard errors around the parameter estimates for equivalence testing.
The number of observations per variable was either 6 (when three variables were in the model) or 9 (when two variables were in the model). This is still considered relatively low, as at least 10 observations per variable is typically recommended to avoid overfitting (Lin et al., 2013; Pavlou et al., 2015). Overfitted models tend to overestimate risk in high-risk individuals and underestimate risk in low-risk individuals (Pavlou et al., 2015). However, we counteracted the risk of overfitting through penalized regression, which shrinks the predictions towards the average predicted risk. Shrinkage improves the predictive performance of the models in future individuals (Shmueli, 2010; Pavlou et al., 2015). Despite having few variables, our final models in the central and combined model sets had good fit and calibration (determined by DSS values) (Tables S3.2 and S3.6). Including fewer variables makes the models more economically appealing and practically feasible, as measurements of fewer biomarkers are needed.

Low sample size could also limit statistical power for the slope and intercept significance tests for the observed versus predicted Poisson regression results. The equivalence tests (which tested whether slopes were close to 1 and intercepts were close to 0) failed for all models. The wide confidence intervals surrounding the estimates did not fall entirely within the equivalence intervals for slopes [0.90, 1.10] and intercepts [-0.10, 0.10]. Even though we lacked the statistical power and precision to demonstrate equivalence, this study sets a precedent by arguing that equivalence testing is a more appropriate framework for internal validation of models. Testing whether or not slopes and intercepts of the relationship between observed and predicted counts differ from their target values capitalizes on the wide confidence intervals associated with small sample size and fails to demand affirmative evidence from the data.
Equivalence tests are a stricter approach. They require (a) defining a margin of equivalence, (b) justifying that the margin is a reasonable boundary between equivalent and non-equivalent values, and (c) estimates that are precise enough to show that non-equivalent values are implausible (Mascha and Sessler, 2011; Lakens, 2017). Small samples make it harder to demonstrate equivalence because they lead to imprecise estimates. Our equivalence tests thus yielded cautionary findings that show our models could be biased or produce unreasonable predictions, despite the fact that difference tests failed to reject the target values for slopes and intercepts as plausible. Additional studies should confirm our results in a larger sample of cohorts to ensure adequate power and precision.

Another potential limitation of this study is that we used an aggregate outcome (any clinical transition disease and/or adverse event). The models may be better aimed at predicting the most common diseases that occurred, because they contributed more weight. For instance, the most common diseases in our sample were ketosis and metritis. Conversely, displaced abomasum and milk fever were the least common transition diseases (Table 3.1). Adverse outcomes, including abortion and death of the cow or calf, were the rarest. The models may have lower predictive ability for the rare adverse outcomes and least common diseases. Additionally, if high-risk cohorts are successfully identified, choosing an appropriate intervention strategy may not be as straightforward when using predictive models for aggregate outcomes. For example, interventions for lameness may include environmental changes (e.g. providing deep bedding, mats where cows walk and stand, adding cooling fans), while milk fever requires nutritional interventions (e.g. adjusting the calcium management program) (Cook and Nordlund, 2009; Santos et al., 2012; de Vries et al., 2015; Das et al., 2016).
It may be unclear which type of intervention is appropriate if the models indicate risk of high disease incidence. Future studies can investigate factors that are predictive of specific diseases or groups of similar diseases. For instance, reproductive diseases, such as retained placenta and metritis, can be grouped together due to their strong association (Dubuc et al., 2011; Giuliodori et al., 2013).

An additional limitation is the potential for bias. One source of systematic error that could be present is information bias, which occurs when the measured value of a variable is different than its true value (Dohoo et al., 2003). Information bias would be most likely in our study in the ascertainment of disease status. Inherently, there are differences in disease diagnosis protocols between farms. However, bias would most likely be non-differential, because the rate of misclassification would not differ between those exposed (i.e. undergoing metabolic stress) and those unexposed (i.e. not undergoing metabolic stress) (Dohoo et al., 2003). The herd managers were blind to the biomarker results, so the results could not have influenced whether a cow was diagnosed. Another potential source of bias is selection bias at the herd level. The herds that we sampled were chosen based on the herd [...] managed and had relatively low disease incidence. Future studies could verify the models in a more diverse sample of herds with varying levels of disease incidence. That would permit drawing inferences that are generalizable to a broader population.

To the best of our knowledge, this is the first study that has reported models predictive of early lactation disease incidence at the cohort level using biomarkers measured at dry-off. Future work could include validating the models in additional cohorts and building separate models for different diseases or groups of diseases.
Our results have important practical implications. Earlier and more accurate disease prediction will result in the ability to implement earlier interventions, thereby potentially reducing disease incidence.

CHAPTER 4: COHORT-LEVEL DISEASE PREDICTION BY EXTRAPOLATION OF INDIVIDUAL-LEVEL PREDICTIONS IN TRANSITION DAIRY CATTLE

Wisnieski, L. (a), Norby, B. (a), Pierce, S.J. (b), Becker, T. (c), Gandy, J. (a), and Sordillo, L.M. (d)

(a) Department of Large Animal Clinical Sciences, Michigan State University, 736 Wilson Rd, East Lansing, MI, USA, 48824, wisnies5@msu.edu
(b) Center for Statistical Training and Consulting, Michigan State University, 293 Farm Lane, East Lansing, MI, USA, 48824, Steve.Pierce@cstat.msu.edu
(c) Department of Food Science and Human Nutrition, Michigan State University, 469 Wilson Rd, East Lansing, MI, USA, 48824, beckert4@msu.edu
(d) Corresponding author, Department of Large Animal Clinical Sciences, Michigan State University, 784 Wilson Rd, East Lansing, MI, USA, 48824, sordillo@msu.edu, 517-432-8821

Abstract

Dairy cattle experience tremendous physiological and metabolic changes during the transition from late gestation to early lactation. During this transition, cows that undergo metabolic stress are at higher risk for diseases (e.g. mastitis, metritis, and ketosis). Metabolic stress is described as a physiological state composed of 3 processes: nutrient metabolism, oxidative stress, and inflammation. Current strategies for monitoring dairy cattle herds include assessing health records and nutrient metabolism biomarker results (non-esterified fatty acids [NEFA] and beta-hydroxybutyrate [BHB]). Typically, if a certain proportion of the current calving cohort (cows that are scheduled to calve around the same time) have biomarker results above a pre-specified threshold, the cohort is considered high-risk and interventions can be implemented to help reduce disease incidence.
Although this method is effective, NEFA and BHB are typically measured around calving, which leaves little time for an intervention to mitigate the risk of disease in a cohort. Previously, we published predictive models for early lactation diseases at the individual cow level at dry-off. We also previously investigated different methods of aggregating individual-level biomarker data to predict cohort-level risk. However, it is unknown if predictive probabilities from individual-level models can be aggregated to the cohort level to predict cohort-level incidence. Therefore, our objective was to test different data aggregation methods using previously published models that represented the 3 components of metabolic stress (nutrient metabolism, oxidative stress, and inflammation). We included 277 cows from 5 Michigan dairy herds for this prospective cohort study. On each farm, 2-4 calving cohorts were formed, totaling 18 cohorts. We measured biomarker data at dry-off and followed the cows for 30 days post-parturition for cohort disease incidence, which was defined as the number of cows within each cohort: 1) having one or more clinical transition disease outcomes, and/or 2) having an adverse health event (abortion or death of calf or cow). We tested 3 different aggregation methods that we refer to as the p-central, p-dispersion, and p-count methods. For the p-central method, we calculated the average predicted probability within each cohort. For the p-dispersion method, we calculated the standard deviation of the predicted probabilities within a cohort. For the p-count method, we counted the number of cows above a specified threshold of predicted probability within each cohort. We built four sets of models: one for each aggregation method and one that included all three aggregation methods (p-combined method). We found that the p-dispersion method was the only method that produced viable predictive models.
However, these models tended to overestimate incidence in cohorts with low observed counts and underestimate risk in cohorts with high observed counts.

Keywords: cattle, disease, prediction, modeling, dairy, biomarkers

1. Introduction

Transition cow diseases (e.g. mastitis, metritis, and ketosis) negatively impact animal welfare, decrease milk production, and cause a great economic burden to dairy farmers. The transition period, which is defined as the weeks leading up to and after calving, is when most transition diseases occur (LeBlanc et al., 2005; Huzzey et al., 2007; Dubuc et al., 2010). As the cows physiologically prepare for calving and milk production, energy demands rise greatly and exceed energy intake. This causes a negative energy balance (NEB), which is a normal process in all cows (Hutjens and Aalseth, 2005). However, when NEB is excessive, cows may undergo metabolic stress, which increases the risk for transition diseases. Metabolic stress is a physiological state that is composed of three processes: altered nutrient metabolism, oxidative stress, and unregulated inflammation. Altered nutrient metabolism is defined as the disruption of normal homeostatic control of nutrient balance (Sordillo and Mavangira, 2014). Oxidative stress is an imbalance between reactive species and antioxidant defenses, which can lead to tissue damage (Halliwell, 2007). Inflammation is an important physiologic process that is the first response to injury and infection and helps to restore the animal to its pre-injury state, but when inflammation is not properly regulated, it can cause tissue and organ damage (Serhan et al., 2010). These three processes work both separately and synergistically and may lead to exacerbated metabolic stress (Sordillo and Mavangira, 2014).

Current monitoring to identify metabolic stress in dairy cattle is done by observing a variety of factors, such as energy intake, health records, and biomarker concentrations.
Currently, the biomarkers used to monitor metabolic stress in dairy cattle are nutrient metabolism biomarkers, including non-esterified fatty acids (NEFA) and beta-hydroxybutyrate (BHB). Typically, NEFA and BHB are measured in a subset of cows within a calving cohort (cows that are in the same stage of lactation and are expected to calve around the same time). If a certain percentage of cows within a cohort (i.e. 15-25%) are above a pre-specified biomarker threshold, then the cohort is considered at risk of increased disease incidence (Ospina et al., 2013). Although this method is effective at predicting high-risk cows and cohorts (Ospina et al., 2013), it is retroactive: NEFA is typically measured 7-10 days before to a few weeks after calving and BHB is typically measured post-partum, when disease occurrence is at its highest (LeBlanc et al., 2005; Huzzey et al., 2007; Dubuc et al., 2010; LeBlanc, 2010; Ospina et al., 2010). Cows that have elevated NEFA and BHB concentrations close to calving would likely not have time to respond sufficiently to interventions aimed at reducing metabolic stress, and interventions may only be implemented to aid the following calving cohort.

In our previous study, we investigated detection of high-risk cohorts for early lactation diseases and adverse health events at dry-off (Wisnieski et al., 2019b). We used compositional modeling, which involved aggregating data (i.e. potential metabolic stress biomarkers and covariates) from individual cows to the cohort level (Chan, 1998; Wisnieski et al., 2019b). Briefly, we tested 3 methods that we referred to as the central, dispersion, and count methods. For the central method, we used the mean/median of the variables as predictors for cohorts. For the dispersion method, we used the standard deviation/median absolute deviation about the median of the variables as predictors for cohorts.
For the count method, we used the number of cows above a pre-specified threshold for each variable as predictors for cohorts. We found that the central method was the only aggregation method that produced models with valid predictions. However, the models had different predictors (but with some overlap) compared with our previously published individual cow-level models (Wisnieski et al., 2019a; Wisnieski et al., 2019b). Therefore, to use the models to calculate both individual cow-level predictions and cohort-level predictions, one would need to measure a larger set of biomarkers. Although individual cow-level predictions using NEFA and BHB can be extrapolated to the cohort level, it is unknown how well predictions calculated from the individual cow-level multivariable models (i.e. models with multiple explanatory variables) can be extrapolated to the cohort level. If this method proves effective, it would save time and money by reducing the number of biomarkers measured for prediction at the cow and cohort levels.

Another limitation of the existing transition disease research that we addressed in our previous paper is that most researchers use explanatory modeling methods instead of predictive modeling methods. Predictive modeling is aimed towards the prediction of current or future events, whereas explanatory modeling aims to explain causal relationships between variables (Sainani, 2014). Explanatory modeling typically involves choosing variables based on their potential role in the causal pathway, using P-values as variable selection criteria, checking for multicollinearity, and ensuring the interpretability of the results. In contrast, predictive modeling methods include starting with a larger pool of candidate variables, using different variable selection criteria such as the Akaike Information Criterion (AIC), and not being as concerned with interpreting the coefficients and results (Shmueli, 2010; Sainani, 2014).
It is important to differentiate between the methods, because they will result in different models (Sainani, 2014; Vergara et al., 2014). For instance, Vergara et al. (2014) used predictive modeling methods to build models with an aggregate outcome that was defined as having one or more transition diseases and/or removal from the herd. The authors compared models built using explanatory and predictive methods. They found differences in variables and predictive ability between the two models (Vergara et al., 2014).

Our objective was to determine if individual cow-level disease predictions can be aggregated to the cohort level to predict cohort-level disease incidence. For this analysis, we used previously published individual-level models that represented the 3 components of metabolic stress: nutrient metabolism, oxidative stress, and inflammation models (Wisnieski et al., 2019a). We compared different methods for aggregating individual cow-level risk that we will refer to as the p-central, p-dispersion, and p-count methods. The p-central method involved averaging the individual cow predicted probabilities within a cohort. The p-dispersion method involved calculating the standard deviation of the individual cow predicted probabilities within a cohort. Finally, the p-count method involved calculating the number of cows above a threshold of predicted risk of disease within a cohort. In total, we built 4 sets of models: one for each aggregation method and one that allowed for all 3 aggregation methods (the p-combined method). We hypothesized that aggregated individual cow-level predictions can predict cohort-level disease incidence.

2. Material and Methods

The Strengthening the Reporting of Observational Studies in Epidemiology - Veterinary Extension (STROBE-VET) checklist was used in the reporting of this study ([...] 2016).

2.1 Animals

The Animal Use and Care Committee at Michigan State University (East Lansing) approved this study.
This prospective cohort study was carried out among 5 commercial dairy herds in Michigan. At dry-off, 300 clinically healthy cows were enrolled. Within each herd, 2-4 cohorts were formed by randomly selecting cows that were to be dried off that week. At least 2 cohorts were dried off in each of the four seasons to account for seasonal variation. Each cohort contained equal numbers of 1st lactation, 2nd lactation, and 3+ lactation cows. First lactation cows were selected based on having an equivalent time in days to calving as the multiparous cows. Adverse health event incidence was monitored until 30 days post-parturition. Adverse health events were defined as either: (1) having one or more clinical transition diseases (mastitis, metritis, ketosis, lameness, displaced abomasum, pneumonia, milk fever, and/or retained placenta), and/or (2) abortion or death of the cow or calf within the study period. Cows that were lost to follow-up, had missing data, or were not successfully impregnated were excluded. The original sample size calculation and inclusion criteria at the herd level are described in Wisnieski et al. (2019a).

2.2 Data collection

Whole blood for plasma was collected in K2 EDTA 10.8 mg vacutainers, and clotted blood for serum was collected in serum separator tube (SST) vacutainers from the tail vein/artery on the same day as cows were dried off. Samples were immediately placed on ice, transported, and processed in the laboratory. Serum and plasma samples were flash frozen in liquid nitrogen and stored at -20°C and -80°C, respectively. All samples were analyzed within 2 months of collection. Blood was analyzed for biomarkers representing all 3 components of metabolic stress (inflammation, nutrient metabolism, and oxidative stress) (Sordillo and Mavangira, 2014; Wisnieski et al., 2019a).
The inflammation biomarkers included: albumin, haptoglobin, serum amyloid A (SAA), vitamin D, and white blood cell (WBC) counts (band cells, basophils, eosinophils, lymphocytes, monocytes, and neutrophils). The nutrient metabolism biomarkers were: BHB, NEFA, calcium, body condition score (BCS), and cholesterol. The oxidative stress biomarkers included: reactive oxygen species (ROS), antioxidant potential (AOP), alpha-tocopherol, beta-carotene, and vitamin A. More information on biomarker assays and measurement methods is published in Wisnieski et al. (2019a). Diagnoses of clinical disease outcomes were confirmed by trained farm employees or a veterinarian. Body condition scoring was completed by 2 trained lab employees using the 5-point method with 0.5 increments (Kristensen, 1986).

2.3 Statistical analysis

Stata version 14.2 and R version 3.4.1 were used for the statistical analysis. During data cleaning, the dataset was checked for unusual and missing observations using summary statistics and descriptive plots (e.g. box plots). Multivariable models were constructed as count models (Poisson or negative binomial models). The outcome was adverse health event incidence at the cohort level. Data were analyzed at the cohort level because a cohort contains cows that will calve at approximately the same time and interventions are typically implemented in response to detecting a high-risk cohort. An offset term (log count) was used in the models to account for the number of cows within each cohort. The primary exposure variables were aggregated individual cow predictions from models published in our previous study (Wisnieski et al., 2019a). In the previous study, we built a bouquet of candidate models using logistic regression analyses for each of the 3 components of metabolic stress (nutrient metabolism, oxidative stress, and inflammation), and a combined model set that included all 3 components (Wisnieski et al., 2019a).
For the present analysis, we aggregated predicted probabilities from the 3 model sets that corresponded to each metabolic stress component. The model sets are shown in Supplementary Table 1 (S1), Table S2, and Table S3 (Wisnieski et al., 2019a). We aggregated the individual-level predictions from the model sets using the 3 aggregation methods: p-central, p-dispersion, and p-count. We built 4 different sets of models, one for each aggregation method and one that included all 3 aggregation methods. For the p-central aggregation method, mean predicted probabilities were calculated from the inflammation, nutrient metabolism, and oxidative stress model sets within each of the 18 cohorts. For the p-dispersion method, the standard deviation of the predicted probabilities from the inflammation, nutrient metabolism, and oxidative stress model sets within each cohort was calculated. For the p-count method, the number of cows above a predicted probability threshold within each cohort was counted. The threshold was determined in Stata using a procedure that chooses the threshold that maximizes sensitivity and specificity in ROC curve analyses with the outcome (Reichenheim, 2002). The p-combined model set included aggregate variables from the p-central, p-dispersion, and p-count methods.

Our modeling approach for the p-central, p-dispersion, p-count, and p-combined aggregate methods included the following steps: 1) best subsets AIC selection, 2) goodness-of-fit and dispersion tests, 3) shrinkage, 4) scoring rules, 5) model averaging, and 6) internal validation of observed versus predicted counts.

2.3.1. Step 1: Best subsets AIC selection

Best subsets AIC selection was performed using the glmulti package (Version 1.0.7; Calcagno and de Mazancourt, 2010) in R. Best subsets AIC selection runs every possible model given the candidate variables and ranks the models by AIC.
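The three aggregation methods described above reduce each cohort's individual predicted probabilities to a single number per method. The sketch below illustrates the idea; it is not the authors' code. The function name `aggregate_cohort`, the example probabilities, the use of the sample standard deviation (n - 1 denominator), and the strict "greater than" comparison are all assumptions made for illustration. The threshold 0.335 is the nutrient metabolism risk threshold reported in Table 4.2.

```python
from statistics import mean, stdev


def aggregate_cohort(probabilities, threshold):
    """Summarise one cohort's individual predicted probabilities.

    Assumptions (not stated in the source): sample SD with an n - 1
    denominator, and a strict '>' comparison against the threshold.
    """
    return {
        "p_central": mean(probabilities),      # mean predicted probability
        "p_dispersion": stdev(probabilities),  # SD of predicted probabilities
        "p_count": sum(p > threshold for p in probabilities),  # cows above threshold
    }


# Hypothetical predicted probabilities for a 5-cow cohort
cohort = [0.10, 0.20, 0.30, 0.40, 0.50]
summary = aggregate_cohort(cohort, threshold=0.335)
print(summary)
```

Each of the three values would then serve as one candidate predictor for that cohort in the count models.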
For each aggregation method (p-central, p-dispersion, and p-count), the aggregated predicted probabilities calculated from each of the 3 individual cow model sets (inflammation, nutrient metabolism, and oxidative stress) were included as candidate variables. For example, for the p-central method, mean predictions from the inflammation model set, nutrient metabolism model set, and oxidative stress model set were candidate variables. For the p-combined models, the candidate variables were the aggregated predictions from all 3 aggregation methods for all 3 model sets. The outcome was cohort-level disease incidence. A maximum of 3 variables were allowed in each model and no interaction terms were considered in order to achieve at least 6 observations per variable (Vittinghoff and McCulloch, 2007). Models were compared using the following equation:

$\Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min}$ (1),

where $\mathrm{AIC}_{\min}$ is the lowest AIC in the model set and $\mathrm{AIC}_i$ is the AIC from model 2 through n. Models with $\Delta_i < 4$ were retained, as this indicates that they have [...]. Every model with $\Delta_i < 4$ remained in the candidate model set at this point in the analysis.

2.3.2. Step 2: Goodness-of-fit and dispersion tests

For each remaining candidate model after AIC selection in Step 1, deviance goodness-of-fit and dispersion tests were performed to check proper fit and specification of the model. The deviance goodness-of-fit test compares the current model versus the saturated model (Long and Freese, 2006). If the deviance goodness-of-fit test reported a significant p-value (p < 0.05), the model had poor fit and was excluded. A command in an R package (Version 1.2-5, Kleiber and Zeileis, 2017) was used to test for overdispersion. If the dispersion test indicated that overdispersion was present (p < 0.05), a negative binomial model was used instead of a Poisson model. The negative binomial model adds an additional variance parameter that accounts for the fact that the mean is not equal to the variance. Negative binomial models were fit with nbreg using the default mean-dispersion parameterization (NB2) (Long and Freese, 2006).

2.3.3. Step 3: Shrinkage

Shrinkage, specifically ridge regression, was performed using an R package (Version 0.9-51, Goeman et al., 2018). Shrinkage helps prevent overfitting, which occurs when there is a small number of observations relative to the number of variables in the model. Overfitting causes the model to over- and under-estimate predictions, and shrinkage brings the regression parameters closer to the mean (Pavlou et al., 2015). All variables were retained in the models. The optL2 command was used to calculate the optimum shrinkage parameter (L2). All regression coefficients besides the intercept were shrunk by the L2 parameter. Shrinkage was performed on standardized coefficients due to the different scales of the variables, but unstandardized coefficients were reported (Goeman et al., 2018).

2.3.4. Step 4: Scoring rules

Dawid-Sebastiani scores (DSS) were used to evaluate model calibration and sharpness (i.e. the concentration of the predictive distributions) (Dawid and Sebastiani, 1999; Gneiting and Raftery, 2007). Scoring rules assign a numerical score that is calculated by comparing the predicted values to the event that actually materializes (Gneiting and Raftery, 2007). The scores were calculated using

$\mathrm{dss}(P, x) = \left(\dfrac{x - \mu_P}{\sigma_P}\right)^2 + 2 \log \sigma_P$,

where $P$ is the predictive distribution, $x$ is the observed count, $\mu_P$ is the mean predicted count, and $\sigma_P$ is the standard deviation of the predicted counts. The lower the DSS scores, the better the ensuing predictions (Czado et al., 2009). The DSS values for each model and for the averaged predictions within each model set were tested using the following test statistic:

$Z_s = \dfrac{\frac{1}{n}\sum_{i=1}^{n} s_i - \frac{1}{n}\sum_{i=1}^{n} E_0(s_i)}{\frac{1}{n}\sqrt{\sum_{i=1}^{n} \mathrm{Var}_0(s_i)}}$ (2)

where $s_i$ is the DSS value for observation $i$ of $n$ observations, and $E_0(s)$ and $\mathrm{Var}_0(s)$ are the expected values and variance of the scores under ideal prediction (Wei and Held, 2014). The test statistic was compared with the critical value calculated from the z-table that corresponded with a two-sided test at an alpha of 0.05 (z = 1.96).
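The scoring step can be illustrated with a short sketch. It assumes the standard form of the Dawid-Sebastiani score, dss = ((x - mu) / sigma)^2 + 2 log(sigma), and a Poisson predictive distribution with mean lambda, for which sigma^2 = lambda and the moments under ideal prediction are E0 = 1 + log(lambda) and Var0 = 2 + 1/lambda. The function names and the example counts are hypothetical, and this is not the authors' implementation.

```python
import math


def dss(observed, mean, sd):
    """Dawid-Sebastiani score for a single observation (lower is better)."""
    return ((observed - mean) / sd) ** 2 + 2 * math.log(sd)


def poisson_dss_z(observed_counts, predicted_means):
    """Mean DSS and its z-statistic under ideal Poisson prediction."""
    n = len(observed_counts)
    scores = [
        dss(x, lam, math.sqrt(lam))  # Poisson: variance equals the mean
        for x, lam in zip(observed_counts, predicted_means)
    ]
    mean_score = sum(scores) / n
    expected = sum(1 + math.log(lam) for lam in predicted_means) / n
    variance = sum(2 + 1 / lam for lam in predicted_means) / n ** 2
    return mean_score, (mean_score - expected) / math.sqrt(variance)


# Hypothetical observed and predicted cohort counts
observed = [3, 6, 4]
predicted = [4.0, 5.0, 4.0]
mean_score, z = poisson_dss_z(observed, predicted)

# Two-sided test at alpha = 0.05: |z| below 1.96 gives no evidence of miscalibration
well_calibrated = abs(z) < 1.96
print(round(mean_score, 3), round(z, 3), well_calibrated)
```

A model whose statistic falls outside the interval from -1.96 to 1.96 would be flagged as poorly calibrated under this test.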
For Poisson models, the expectation was calculated using $E_0(\mathrm{DSS}) = 1 + \log \lambda$, where $\lambda$ is the mean of the Poisson distribution, and the variance was calculated using $\mathrm{Var}_0(\mathrm{DSS}) = 2 + 1/\lambda$. For negative binomial models, the expectation was calculated using this formula: $E_0(\mathrm{DSS}) = 1 + \log(\lambda + \lambda^2/\psi)$, and the variance was calculated with the following formula (Wei and Held, 2014):

$\mathrm{Var}_0(\mathrm{DSS}) = 2 + 6/\psi + 1/(\lambda + \lambda^2/\psi)$,

where $\lambda$ is the mean of the Poisson distribution and $\psi$ accounts for overdispersion. If the z test statistic calculated from Equation 2 was significant (p < 0.05), the model had poor calibration and was removed during this step.

2.3.5. Step 5: Model averaging

We performed weighted model averaging using AIC values. Akaike weights represent the amount of evidence in favor of that model being the best model (Burnham and Anderson, 2004). Using the $\Delta_i$ values (Equation 1), the Akaike weights were calculated with the following equation:

$w_i = \dfrac{\exp(-\Delta_i/2)}{\sum_{r=1}^{N} \exp(-\Delta_r/2)}$

A model weight is calculated for model 1 (r = 1) through N candidate models (Wagenmakers and Farrell, 2004). The model weights within a candidate model set sum to 1. The model with the lowest AIC, which is considered the best model, contributes the most to the averaged predictions, followed by the model with the next lowest AIC, and so on. If all aggregation methods proved to be viable techniques to aggregate individual-level predictions to the cohort level, there would be four final averaged predicted counts (from the p-central, p-dispersion, p-count, and p-combined model sets).

2.3.6. Step 6: Internal validation of observed versus predicted counts

Poisson regression (or negative binomial regression if overdispersion was present) of observed versus log-transformed predicted counts was conducted for each individual model and each model set (p-central, p-dispersion, p-count, and p-combined). Bias-corrected 95% confidence intervals for the slope and intercept were calculated through bootstrapping (Sainani, 2014). This process involved randomly selecting 500 samples from the original data set with replacement (Steyerberg et al., 2001b).
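The resampling step can be sketched as follows. The cohort counts and sizes are taken from Table 4.1 (cows with one or more adverse health events, and total cows per cohort). Everything else is an illustrative assumption: the study bootstrapped the slope and intercept of a count regression and used bias-corrected intervals, whereas this sketch resamples cohorts and computes a plain percentile interval for a simpler stand-in statistic (pooled incidence). The function names and the seed are hypothetical.

```python
import random

# Cohort-level data from Table 4.1
cases = [3, 1, 8, 6, 1, 3, 6, 13, 4, 5, 8, 3, 3, 3, 8, 7, 4, 6]
sizes = [14, 17, 18, 15, 15, 16, 15, 17, 14, 15, 16, 15, 16, 14, 17, 16, 13, 14]


def bootstrap_cohorts(n_cohorts, n_samples=500, seed=0):
    """Draw n_samples resamples of cohort indices, each of size n_cohorts, with replacement."""
    rng = random.Random(seed)
    return [
        [rng.randrange(n_cohorts) for _ in range(n_cohorts)]
        for _ in range(n_samples)
    ]


def pooled_incidence(indices):
    """Stand-in statistic: diseased cows divided by total cows in the resample."""
    return sum(cases[i] for i in indices) / sum(sizes[i] for i in indices)


samples = bootstrap_cohorts(len(cases))
estimates = sorted(pooled_incidence(s) for s in samples)

# Approximate 95% percentile interval (not bias-corrected)
lower = estimates[int(0.025 * len(estimates))]
upper = estimates[int(0.975 * len(estimates)) - 1]
print(round(lower, 3), round(upper, 3))
```

Replacing `pooled_incidence` with a function that refits the observed versus predicted regression on each resample would bring the sketch closer to the procedure described in the text.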
Each sample contained the same number of cohorts as in the final data set (Pavlou et al., 2015). Bootstrapping imitates the random selection of cohorts from a similar population as the study population and tests the models without collecting additional data (Moons et al., 2012). Poisson regression of observed versus log-transformed predicted counts should result in an intercept not significantly different from 0 and a slope not significantly different from 1 for unbiased and accurate predictions, respectively (Pineiro et al., 2008). However, small sample size limits the power to detect a significant difference by contributing to wide confidence intervals that can more easily contain the target value (Lakens, 2017). Therefore, we used a more stringent approach, equivalence testing, which does not capitalize on low power caused by small sample sizes and requires more precise estimates that are close to the target value. Equivalence testing involves using two one-sided tests to determine if the estimate lies within an equivalence range (i.e. $H_0$: slope $\le 1 - \delta$ or slope $\ge 1 + \delta$ versus $H_1$: $1 - \delta <$ slope $< 1 + \delta$, where $\delta$ is the margin of equivalence) (Mascha and Sessler, 2011). We set the margin of equivalence to $\delta = 0.1$, so that even in cohorts with a high disease incidence (i.e. 8-10 diseased cows), predicted counts would be within 2-3 cows of the observed counts. This level of prediction error would be negligible when prescribing cohort-level interventions (Wiens, 2002). The equivalence ranges [p - 0.1 to p + 0.1], where p is the target value for the parameter, were therefore calculated as 0.9 to 1.1 for the slope and -0.1 to 0.1 for the intercept. Two one-sided tests at an alpha of 0.05 are equivalent to calculating a 90% confidence interval around the estimate. Therefore, the slope was equivalent to 1 if the entire 90% confidence interval for the slope estimate lay within the equivalence range for the slope [0.9, 1.1] (Mascha and Sessler, 2011).
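The decision rule described above (two one-sided tests at an alpha of 0.05, expressed as a 90% confidence interval check) can be written as a short sketch. The function name `is_equivalent`, the strict inequalities at the boundaries, and the example confidence intervals are assumptions for illustration; none of the intervals below are results from this study.

```python
def is_equivalent(ci_lower, ci_upper, target, margin=0.1):
    """Return True only if the whole 90% CI lies inside target +/- margin.

    This mirrors two one-sided tests at alpha = 0.05. Treating the
    boundaries as strict inequalities is an assumption of this sketch.
    """
    return (target - margin) < ci_lower and ci_upper < (target + margin)


# Hypothetical 90% confidence intervals
narrow_slope = is_equivalent(0.95, 1.05, target=1.0)       # inside [0.9, 1.1]
wide_slope = is_equivalent(0.60, 1.40, target=1.0)         # extends past both bounds
wide_intercept = is_equivalent(-0.80, 0.75, target=0.0)    # extends past [-0.1, 0.1]
print(narrow_slope, wide_slope, wide_intercept)
```

An interval that contains the target value can still fail this check, which is why equivalence testing is the stricter criterion when estimates are imprecise.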
This would indicate that the estimates produced by the model were consistent with observed counts (Pineiro et al., 2008). The intercept was equivalent to 0 if the entire 90% confidence interval for the intercept lay within the equivalence range for the intercept [-0.10, 0.10] (Mascha and Sessler, 2011), which would indicate that the model was unbiased (Pineiro et al., 2008).

For each model and the averaged predictions from each model set, the proportion of deviance explained by the model ($D^2$) was calculated using the Dsquared function in the modEvA package. $D^2$, which is also called $R^2_{\mathrm{deviance}}$, $R^2_{\mathrm{DEV,P}}$, or $R^2_{\mathrm{KL}}$, compares each model to the null model (intercept only) and increases when predictors are included that improve prediction of the outcome (Cameron and Windmeijer, 1996; Coxe et al., 2009). $D^2$ is a good tool for evaluating count models because $D^2$ stays within an interpretable range (0 to 1), increases with additional parameters, concurs with $R^2$ based on explained sum of squares, and corresponds with significance tests for the slope parameters (Cameron and Windmeijer, 1996, 1997). An optimal model will have a $D^2$ of 1 (Coxe et al., 2009).

3. Results

In total, the final sample included 277 cows from 18 cohorts after excluding 23 cows: 16 were missing data at dry-off, 4 were lost to follow-up (did not have data past dry-off and/or calving), one had missing SAA data, one was diagnosed with lameness at dry-off, and one did not calve. Blood samples were collected 47.95 ± 12.81 (mean ± SD) days pre-partum. More information about the study sample and individual-level descriptive statistics is reported in Wisnieski et al. (2019a). Disease incidence at the cohort level is presented in Table 4.1 (Wisnieski et al., 2019b). Descriptive statistics for aggregated individual-level predictions are shown in Table 4.2.
Table 4.1 Adverse health event incidence in the 18 cohorts in the study sample (N = 277 cows)
Cohort 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Sum
Abortion 1 1 1 1 4
Death of calf 1 1
Death of cow 1 1 1 1 4
DA 1 2 1 1 1 1 7
Ketosis 4 2 2 1 2 1 5 2 2 1 22
Lameness 1 3 9 2 3 1 19
Mastitis 1 1 2 1 2 2 2 1 1 2 15
Metritis 1 1 2 2 1 2 1 2 2 3 1 1 1 20
Milk fever 1 1 2
Pneumonia 2 1 1 1 1 1 1 8
RP 1 2 2 2 2 1 2 1 1 1 1 2 18
Cohort incidence¹ 3 1 8 6 1 3 6 13 4 5 8 3 3 3 8 7 4 6 92
Adverse outcomes² 3 3 9 7 1 3 14 16 8 7 8 4 3 4 11 7 4 8 120
Total N 14 17 18 15 15 16 15 17 14 15 16 15 16 14 17 16 13 14 277
DA: displaced abomasum; RP: retained placenta
¹ Defined as the number of cows within each cohort that had 1 or more adverse health events.
² Total number of adverse outcomes within each cohort.

Table 4.2 Descriptive statistics of aggregate predicted probabilities from previously published individual cow-level logistic regression models (Wisnieski et al., 2019a) by cohort (N = 18)

                   Nutrient Metabolism      Inflammation             Oxidative Stress
Cohort     n       Mean   SD     Count¹     Mean   SD     Count²     Mean   SD     Count³
1          14      0.27   0.14   4          0.21   0.19   2          0.26   0.12   2
2          17      0.24   0.19   6          0.23   0.23   5          0.23   0.08   1
3          18      0.34   0.11   7          0.37   0.32   8          0.38   0.13   13
4          15      0.38   0.13   8          0.31   0.19   6          0.31   0.12   6
5          15      0.23   0.14   4          0.14   0.12   1          0.16   0.08   0
6          16      0.22   0.16   4          0.22   0.20   4          0.22   0.13   4
7          15      0.36   0.28   7          0.45   0.31   9          0.41   0.18   9
8          17      0.48   0.20   12         0.55   0.31   12         0.51   0.18   14
9          14      0.37   0.18   7          0.39   0.31   6          0.38   0.17   9
10         15      0.35   0.14   8          0.36   0.30   6          0.34   0.20   6
11         16      0.50   0.12   16         0.55   0.24   13         0.45   0.16   12
12         15      0.32   0.12   5          0.19   0.14   2          0.27   0.15   6
13         16      0.20   0.10   3          0.22   0.17   4          0.24   0.12   2
14         14      0.20   0.10   1          0.19   0.18   3          0.19   0.11   1
15         17      0.47   0.17   13         0.39   0.30   8          0.44   0.18   10
16         16      0.36   0.18   6          0.47   0.30   10         0.40   0.19   7
17         13      0.44   0.19   9          0.40   0.30   8          0.36   0.19   8
18         14      0.28   0.11   4          0.31   0.25   6          0.40   0.14   11
Average    277     0.33   0.15   6.89       0.33   0.24   6.28       0.33   0.15   6.72
Mean: mean predicted probability; SD: standard deviation of predicted probabilities; Count: number of cows above the risk threshold
¹ Risk threshold = 0.335
² Risk threshold = 0.351
³ Risk threshold = 0.331

3.1 P-Central method

The candidate variables eligible for inclusion in the best subsets procedure for the p-central method were the mean predictions from the inflammation model, the oxidative stress model, and the nutrient metabolism model. The best subsets procedure identified 5 candidate models with Δi values < 4 (Equation 1). All models passed the deviance goodness-of-fit test and the test for overdispersion (P > 0.05). However, all 5 models had significant DSS values (P < 0.05), which indicated poor calibration and predictive ability (Equation 2). Therefore, analysis of this method was stopped at this point.

3.2 P-Dispersion method

The candidate variables for the best subsets procedure for the p-dispersion method included the standard deviation of predicted probabilities from the nutrient metabolism, inflammation, and oxidative stress individual-level models. The best subsets procedure identified candidate models with Δi values < 4 (Equation 1). All candidate models passed the deviance goodness-of-fit test and the test for overdispersion; however, only 3 models had non-significant DSS values (Equation 2; P > 0.05) (Table 4.3). Of these, the model with the lowest AIC included the averaged predictions from the inflammation model set. Akaike weights ranged from 16% to 54% (Table 4.4).
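As an illustration only, the Akaike weights reported in Table 4.4 can be reproduced with a short Python sketch. The function name akaike_weights is hypothetical and not part of the original analysis; the sketch assumes the standard definitions Δi = AICi - AICbest and wi = exp(-Δi/2) divided by the sum of exp(-Δj/2) over the candidate models.

```python
import math

def akaike_weights(aic_values):
    """Return (deltas, relative likelihoods, weights) for a list of AIC values."""
    best = min(aic_values)
    deltas = [aic - best for aic in aic_values]          # distance from the best model
    rel_lik = [math.exp(-0.5 * d) for d in deltas]       # exp(-delta / 2)
    total = sum(rel_lik)
    weights = [r / total for r in rel_lik]               # normalised so the weights sum to 1
    return deltas, rel_lik, weights

# AIC values of the three p-dispersion candidate models (Table 4.4).
aics = [76.98, 78.14, 79.43]
deltas, rel_lik, weights = akaike_weights(aics)

for name, d, r, w in zip(["Model 1", "Model 2", "Model 3"], deltas, rel_lik, weights):
    print(f"{name}: delta = {d:.2f}, exp(-delta/2) = {r:.2f}, w = {w:.2f}")
```

Rounded to two decimals, the weights are 0.54, 0.30, and 0.16, which matches the 16% to 54% range stated above.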
Table 4.3 Candidate model set for the p-dispersion aggregation method

Model     Variable              Coefficient   Model AIC   Model DSS   Shrinkage Parameter
Model 1                                       76.98       1.5         4.07
          Inflammation          5.46
          Intercept             -2.47
Model 2                                       78.14       1.42        21.96
          Inflammation          3.2
          Oxidative stress      4.03
          Intercept             -2.5
Model 3                                       79.43       1.43        19.17
          Inflammation          3.34
          Nutrient metabolism   -1.19
          Oxidative stress      4.4
          Intercept             -2.46
AIC: Akaike Information Criteria; DSS: Dawid-Sebastiani Score

Table 4.4 AIC weights for the candidate models from the p-dispersion aggregation method

Model     AIC     Δi (AIC - AICbest)   exp(-Δi/2)   w
Model 1   76.98   0.00                 1.00         0.54
Model 2   78.14   1.16                 0.56         0.30
Model 3   79.43   2.45                 0.29         0.16
AIC: Akaike Information Criteria; w: weight.

Internal validation of the observed versus log-transformed predicted counts calculated from the p-dispersion models indicated that no intercept was significantly different from 0 and no slope was significantly different from 1 (Table 4.5). However, all slopes failed to reject the null hypothesis of the equivalence tests, because at least one side of the 90% confidence interval for each slope was outside the equivalence interval [0.9, 1.1] (Mascha and Sessler, 2011). Therefore, we could not confidently say that the slopes were equal to 1 and that the predicted counts reflected observed counts. In addition, all intercepts failed to reject the null hypothesis of the equivalence tests, because at least one side of the 90% confidence interval around each intercept lay outside the equivalence range [-0.10, 0.10] (Mascha and Sessler, 2011). Therefore, we could not confidently say that the intercepts were equivalent to 0, which suggests that the models could be biased. Figure 4.1 shows that the averaged model tended to overestimate disease incidence in cohorts with low observed counts and underestimate disease incidence in cohorts with high observed counts.
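The equivalence check and the direction of the calibration error can be illustrated with a minimal Python sketch. It assumes that equivalence is declared only when the whole 90% confidence interval lies inside the equivalence range, as described above, and it uses the averaged p-dispersion estimates reported in Table 4.5 and Figure 4.1. The function names ci_within and expected_observed are hypothetical.

```python
import math

def ci_within(ci, bounds):
    """True only when the whole confidence interval lies inside the equivalence range."""
    lower, upper = ci
    low, high = bounds
    return low <= lower and upper <= high

# 90% confidence intervals of the averaged p-dispersion model (Table 4.5).
slope_equivalent = ci_within((0.72, 1.61), (0.9, 1.1))
intercept_equivalent = ci_within((-1.12, 0.41), (-0.10, 0.10))
print("slope equivalent to 1:", slope_equivalent)
print("intercept equivalent to 0:", intercept_equivalent)

def expected_observed(predicted, intercept=-0.28, slope=1.14):
    """Observed count implied by log(observed) = intercept + slope * log(predicted)."""
    return math.exp(intercept + slope * math.log(predicted))

# Predicted count at which the fitted line crosses observed = predicted:
# intercept + slope * x = x  gives  x = -intercept / (slope - 1) on the log scale.
crossover = math.exp(0.28 / 0.14)

for predicted in (3, 12):
    print(f"predicted {predicted}: implied observed {expected_observed(predicted):.1f}")
print(f"crossover at a predicted count of about {crossover:.1f}")
```

With these estimates the fitted line implies fewer observed than predicted cases below the crossover (overestimation in low-incidence cohorts) and more observed than predicted cases above it (underestimation in high-incidence cohorts).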
The D² for the observed versus log-transformed predicted counts for the averaged p-dispersion model was 0.57, which indicated that the averaged model explained 57% of the deviance (Table 4.5).

Table 4.5 Internal validation of predicted versus observed counts calculated from the p-dispersion aggregation method model set

Model      Slope (90% CI)¹       Intercept (90% CI)²      D²
Model 1    1.03 (0.64, 1.48)     -0.07 (-0.88, 0.61)      0.51
Model 2    1.20 (0.75, 1.70)     -0.42 (-1.24, 0.40)      0.57
Model 3    1.20 (0.73, 1.64)     -0.43 (-1.23, 0.38)      0.60
Averaged   1.14 (0.72, 1.61)     -0.28 (-1.12, 0.41)      0.57
CI: confidence interval
¹ All slopes not significantly different from 1 (P > 0.05), but not equivalent to 1 either.
² All intercepts not significantly different from 0 (P > 0.05), but not equivalent to 0 either.

Figure 4.1 Residuals (observed - predicted) versus observed counts for the averaged p-dispersion method model set. Bootstrapped estimates of the Poisson regression line for observed versus predicted counts: log(Observed count) = -0.28 + 1.14(log(Predicted count)). Intercept not significantly different from 0 (90% CI = -1.12, 0.41), but also not equivalent to 0. Slope not significantly different from 1 (90% CI = 0.72, 1.61), but also not equivalent to 1.

3.3 P-Count method

Candidate variables for the best subsets procedure for the p-count method included the number of cows above the specified threshold of predicted risk in the oxidative stress, inflammation, and nutrient metabolism model sets. The thresholds, determined from receiver operating characteristic (ROC) curve analyses with the outcome, were 0.33, 0.35, and 0.34 for the oxidative stress, inflammation, and nutrient metabolism models, respectively. The best subsets procedure identified 6 candidate models with Δi values < 4 in reference to the model with the lowest AIC (Equation 1). All models passed the deviance goodness-of-fit test and the test for overdispersion.
However, all 6 models had significant DSS values (Equation 2) and therefore had poor calibration and poor predictive ability. Analysis of the p-count method was stopped at this point.

3.4 P-Combined method (p-central, p-dispersion, and p-count methods)

All three aggregation methods (p-central, p-count, and p-dispersion) from the three model sets (oxidative stress, inflammation, and nutrient metabolism) were included in variable selection for the p-combined model set. The best subsets procedure identified 47 candidate models within 4 AIC units of the model with the lowest AIC (Δi values < 4) (Equation 1). All candidate models passed the deviance goodness-of-fit test and the test for overdispersion. However, all models had significant DSS values (P < 0.05; Equation 2) and so had poor calibration and low predictive ability. Therefore, analysis of the p-combined model set was stopped at this point.

4. Discussion

In this study, we tested 4 different methods to aggregate individual cow-level disease predictions to the cohort level: p-central, p-dispersion, p-count, and p-combined. We found that the p-dispersion method (taking the standard deviation of the predicted probabilities within a cohort) was the only method that produced valid models with accurate predictions and good calibration. However, the final averaged p-dispersion model tended to overestimate disease in cohorts with low observed disease counts and underestimate incidence in cohorts with high observed counts. Still, this is the first study to demonstrate that predictions generated from individual-level multivariable models can be aggregated to predict disease incidence at the cohort level. This method is useful because herd managers implement interventions at the cohort level, which translate to improved individual cow health (Mulligan et al., 2006).
Furthermore, identifying cohorts that are high-risk at dry-off provides additional time to intervene compared with current biomarker monitoring methods (i.e. measuring NEFA and BHB from 7-10 days before to a few weeks after calving) (LeBlanc, 2010). Once a high-risk cohort is identified at dry-off, interventions such as altering nutrition and housing can be implemented to reduce disease incidence in the current calving cohort (Cameron et al., 1998; LeBlanc et al., 2006).

It is noteworthy that the p-dispersion method was the only method that produced valid predictions. A high dispersion of predicted probabilities indicates that risk varies widely within a cohort, meaning that individual cow risk estimates are heterogeneous. This may indicate social stress, where, for example, cows lower in the social hierarchy may be doing worse than cows higher in the hierarchy (Nordlund et al., 2006). The p-central method may not have been valid for aggregating predicted probabilities because it does not convey extreme observations. If there are extreme observations on both the high end and the low end of predicted risk, the mean will still fall near the middle and suggest a moderate level of risk even though heterogeneity is high. Current methods for monitoring cohorts for risk usually do not consider dispersion; they rely on the proportion of cows above a specified threshold of NEFA or BHB (LeBlanc, 2010). Our results indicate that a measure of dispersion may be a better, or an additional, tool for aggregating cohort data to predict disease incidence.

In all 3 of our final p-dispersion models, aggregate predictions from the inflammation model set were present. This is consistent with our previous findings that inflammatory biomarkers measured at dry-off may be better predictors of disease risk compared with nutrient metabolism biomarkers (Wisnieski et al., 2019a). This is also consistent with a series of studies from a research group in Canada that found that biomarkers for inflammation (e.g.
SAA) were predictive of transition disease 8 weeks before calving, whereas nutrient metabolism biomarkers (i.e. NEFA and BHB) were not (Dervishi et al., 2015, 2016a, 2016b; Zhang et al., 2015, 2016). Inflammation during the early dry period could indicate a poor transition from a lactating state to a non-lactating state. During this time, cows are at higher risk of intramammary infections that may also increase risk for other diseases (Green et al., 2002; Dingwell et al., 2004). Nutrient metabolism biomarkers may be weaker predictors at dry-off because energy stores are typically in a recovery state at dry-off and stay in recovery until energy demands increase again for the ensuing calving and lactation (Strucken et al., 2015).

A strength of our statistical analysis is the use of ridge regression, which is a type of shrinkage. Shrinkage reduces the risk of overfitting, which occurs when there is a large number of variables relative to the sample size and causes models to both over- and under-estimate risk (Pavlou et al., 2015). Shrinkage pulls the predictions towards the mean and produces models that will perform better in future cohorts. Another strength is that internal validation with bootstrapped samples was used to test the validity of the models, which assesses the generalizability of the results without having to test a new set of samples (Pavlou et al., 2015). Therefore, it is likely that the results can be applied to other cohorts that have similar characteristics to our study population (i.e. Holstein, large commercial herds, located in the Midwest).

A strength of this study is the methodology used to reduce the risk of bias. The animals were randomly selected and stratified by lactation and season, so the risk of selection bias was low (Dohoo et al., 2003).
Confounding bias was not an issue in the present study, because confounding is not a pertinent problem in predictive modeling, unlike in explanatory modeling (Sainani, 2014). However, one source of systematic error could be information bias, which could have occurred due to errors in disease diagnoses or biomarker measurement. Farms may have different protocols for diagnosing diseases; however, errors in disease diagnosis should not depend on the exposure (predictions from metabolic stress models built using biomarkers). Therefore, this would most likely result in non-differential misclassification, which would bias the results towards the null (Dohoo et al., 2003).

A limitation of this study is the small sample size at the cohort level. This could limit statistical power for the equivalence tests used in the observed versus predicted count Poisson regression analyses. Future studies should validate these aggregation methods in larger samples or consider other aggregation methods (e.g. median risk, within-group agreement) (Chan, 1998). Another limitation of this study is that we did not perform a cost-benefit analysis. Measuring multiple biomarkers in multiple cows will increase the cost of monitoring. However, the potential benefit of reducing treatment costs and production losses may outweigh the costs. Future studies can determine if the models are economically feasible and determine the optimum number of cows to test within each cohort to gain the maximum benefit. Now that it has been established that predictions can be aggregated to the cohort level, future studies can attempt to build models that maximize predictive ability with the smallest number of biomarkers possible to reduce testing costs. Another limitation is that we used an aggregate outcome (one or more clinical diseases and/or other adverse health events) in our models.
The diseases included in the aggregate outcome may have largely different causes and risk factors, so deciding on an intervention to implement after detecting a high-risk cohort may be difficult. For instance, interventions for mastitis may involve dry cow treatments, while interventions for lameness may involve changing bedding or the use of lime on free stalls (Barker et al., 2009; Rabiee and Lean, 2013). Unfortunately, we were unable to build separate models due to the low disease incidence in our cohorts. However, this study can be used as a proof-of-concept for aggregating individual predicted probabilities for prediction at the cohort level. Future studies should consider building separate models for different diseases. In addition to validating aggregation methods in additional cohorts and building separate models for different diseases, future studies can determine which interventions are most effective and cost-efficient when implemented at dry-off. Future studies can also determine incidence action levels (i.e. at what level of predicted incidence herd managers should institute interventions).

5. Conclusions

Compositional modeling, or aggregating data from the individual level to the group level, is rarely, if ever, used in dairy cattle disease prediction. We found that individual cow-level predictions calculated from multivariable models could be aggregated to the cohort level by using the p-dispersion method (taking the standard deviation of the predicted probabilities within a cohort) to predict early-lactation adverse health events at dry-off. This would allow time to implement proactive interventions in cohorts that are projected to have high disease incidence.

CHAPTER 5: SUMMARY AND CONCLUSIONS

The broad goal of this dissertation was to build predictive models using biomarkers of metabolic stress for early-lactation cow diseases at the individual and cohort level at dry-off.
The rationale for building predictive models at dry-off was that the biomarkers currently used to monitor metabolic stress in dairy cattle (i.e. NEFA and BHB) are measured too late in the lactation cycle to make proactive management changes to reduce disease risk in the current calving cohort. Management changes can only be made to aid the next calving cohort. Additionally, the biomarkers typically used for monitoring are nutrient metabolism biomarkers, which represent only one of the three components of metabolic stress (Sordillo and Mavangira, 2014). Consequently, we included biomarkers of the other two components of metabolic stress (inflammation and oxidative stress) in addition to nutrient metabolism biomarkers in our predictive models.

Another issue that we addressed in this dissertation is that most previous studies that assess factors affecting disease risk in dairy cattle use explanatory modeling methods rather than predictive modeling methods (LeBlanc et al., 2004, 2005; Huzzey et al., 2007; Dubuc et al., 2010; Vanholder et al., 2015). Predictive modeling methods were developed to improve prediction of outcomes, such as transition cow diseases. In contrast, explanatory modeling is useful for investigating causal associations with diseases while carefully considering confounding variables (Shmueli, 2010; Sainani, 2014). Therefore, we implemented a predictive modeling approach for this project, which involved AIC best subsets selection, model averaging, ridge regression, and internal validation. For all analyses, we used an aggregate outcome that was defined as having 1) one or more transition diseases (mastitis, metritis, retained placenta, ketosis, lameness, pneumonia, milk fever, displaced abomasum), and/or 2) one or more adverse health events (death of calf, death of cow, or abortion).

Chapter 2 focused on building predictive models for transition diseases at the individual cow level.
For this analysis, we found that inflammation and oxidative stress biomarkers were more predictive of transition disease outcomes than nutrient metabolism biomarkers. All models had good fit and acceptable discriminant ability.

Chapters 3 and 4 focused on building cohort-level models. A cohort is defined as a group of cows that are dried off at the same time and therefore have similar expected calving dates. Herd managers are often more concerned about cohort health than individual cow health because interventions are typically made at the cohort level (e.g. nutritional changes) (Oetzel, 2004; LeBlanc, 2010). Determining cohort health is also more economically feasible than determining individual cow health because it reduces the number of cows that need to be tested. To monitor cohort health using biomarkers, NEFA and BHB are measured in a proportion of a cohort, and if a certain percentage of animals are above a pre-specified threshold, the cohort is considered at-risk (Oetzel, 2004). Although this method is effective at determining cohort-level risk (Ospina et al., 2013), few studies have explored other ways to aggregate individual cow-level data to the cohort level. We explored methods of compositional modeling, which are commonly used in the social sciences to extrapolate individual-level data to the organizational level (e.g. workplace) (Chan, 1998).

Chapter 3 focused on aggregating biomarkers/covariates using three different aggregation techniques (the central, dispersion, and count methods). The central method was the only method that produced viable predictive models. However, the models tended to underestimate disease risk when observed counts were low and overestimate disease risk when observed counts were high. Chapter 4 focused on aggregating predicted probabilities from individual cow models presented in Chapter 2 to the cohort level using 3 aggregation methods that we referred to as the p-central, p-dispersion, and p-count methods. The p-dispersion method was the only method that produced acceptable models.
However, these models tended to overestimate disease incidence when observed counts were low and underestimate disease incidence when observed counts were high. Figures S1 to S5 outline the steps of the analyses used in Chapters 2 to 4.

Chapter 2

In Chapter 2, we built individual cow-level predictive models for transition diseases. We built four sets of models: one for each component of metabolic stress (oxidative stress, inflammation, and nutrient metabolism) and one set that included all three components.

The first major finding of Chapter 2 was that the models based on biomarkers measured at dry-off produced good predictions, demonstrated by AUC values from ROC curve analyses above the acceptable level (> 0.70). The models produced sensitivity and specificity levels (> 60%) that are comparable with levels reported for the current method of using NEFA and BHB for individual cow-level prediction (Ospina et al., 2010). These results indicated that it is possible to predict risk of disease as early as dry-off, which would leave ample time for interventions, such as nutritional or housing changes (e.g. bedding, cooling fans, dry cow treatments), to reduce risk of disease in the current calving cohort (Cameron et al., 1998; LeBlanc et al., 2006).

The second major finding of Chapter 2 was that the inflammation and oxidative stress models produced better predictions than the nutrient metabolism models. This is consistent with previous findings from a Canadian research group that found that inflammatory biomarkers (including SAA, TNF, IL-1, and IL-6) were predictive of multiple transition disease outcomes 8 weeks before calving, whereas nutrient metabolism biomarkers (NEFA and BHB) were not (Dervishi et al., 2015, 2016a, 2016b; Zhang et al., 2015, 2016). However, the combined models, which included biomarkers from all three components of metabolic stress, produced the best predictions.
The averaged model from the combined model set had a sensitivity of 88.2% and a specificity of 87.0%. The fact that the combined model set, which included all 3 components of metabolic stress, had the best predictions supported our hypothesis that including all 3 components would improve predictions. Including all 3 components may be essential to capturing the complexity of metabolic stress, as the 3 components can exacerbate each other and worsen metabolic stress (Sordillo and Mavangira, 2014).

Chapter 3

In Chapter 3, we built cohort-level models using aggregated biomarkers/covariates from the individual cow level. We tested 3 data aggregation methods: the central, dispersion, and count methods. We built 4 sets of models: one for each aggregation method and a combined set that included all 3 aggregation methods. The central method involved calculating the mean (for continuous variables) or median (for categorical variables) of the biomarker/covariate results within each cohort. The dispersion method involved calculating the standard deviation (for continuous variables) or MAD from the median (for categorical variables) for each biomarker/covariate within each cohort. The count method involved counting the number of cows above a pre-specified threshold for each biomarker/covariate. The thresholds for continuous variables were chosen based on the best combined sensitivity and specificity determined from ROC curve analyses with the outcome (any transition disease and/or adverse health event).

The first major finding of Chapter 3 was that the central and combined methods were the only methods that produced valid cohort-level models. The dispersion and count models all failed the z-test for DSS values, which indicated poor model calibration and poor future predictions.
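The threshold selection used for the count method can be sketched in Python as follows. This is an illustration rather than the original analysis: it assumes that "best combined sensitivity and specificity" means the cut-point that maximises their sum, treats values at or above the cut-point as positive, and uses hypothetical biomarker values and outcomes. The function name best_threshold is also hypothetical.

```python
def best_threshold(values, outcomes):
    """Return (cut-point, sensitivity, specificity) maximising sensitivity + specificity.

    values:   biomarker measurements, one per cow
    outcomes: 1 if the cow had the outcome, 0 otherwise
    Ties are resolved in favour of the lowest cut-point.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    best = None
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, outcomes) if v >= cut and y == 1)
        true_neg = sum(1 for v, y in zip(values, outcomes) if v < cut and y == 0)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        if best is None or sensitivity + specificity > best[1] + best[2]:
            best = (cut, sensitivity, specificity)
    return best

# Hypothetical data for eight cows, for illustration only.
values = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]

cut, sensitivity, specificity = best_threshold(values, outcomes)
print(f"cut-point = {cut}, sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

The chosen cut-point would then be applied within each cohort to count the cows above it.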
The observed versus predicted count regression analyses for the final models from the central and combined model sets all had intercepts that were not significantly different from 0 and slopes that were not significantly different from 1. However, results from the equivalence tests indicated that we could not confidently say that the slopes were equal to 1 and the intercepts were equal to 0. The D² values for the averaged central model and the combined model were both 0.64, which indicated that 64% of the deviance in the observed counts was explained by predicted counts. Even though the final models from the central and combined model sets produced valid predictions, they tended to underestimate cohort disease incidence when observed counts were low and overestimate disease incidence when observed counts were high. The models should be validated in additional cohorts with varying disease incidence to determine their ability to produce accurate predictions.

The second major finding of Chapter 3 was that the most common biomarkers present across all models were calcium and albumin. Although calcium is commonly used for monitoring dairy cattle for hypocalcemia, this is typically done at the cow level rather than the herd level (Goff, 2008a). To the best of our knowledge, this is the first study to demonstrate that calcium can be used for monitoring at the cohort level. Mean cohort calcium concentrations at dry-off represent the average level of calcium stores within a cohort. Higher calcium stores at dry-off may indicate that cows are better able to handle the increasing calcium demands of the ensuing lactation (Horst et al., 1994). Albumin, to the best of our knowledge, is a potential novel biomarker for monitoring metabolic stress in dairy cattle. In our cohort-level models, higher albumin was associated with a lower risk of disease.
Albumin is a negative acute phase protein, so it responds inversely to inflammation, and it is a marker of liver production of important immune system proteins. Albumin is also a carrier of calcium, and it has been argued that calcium concentrations should be corrected for albumin concentrations if ionized calcium cannot be measured (Seifi et al., 2005). Albumin was also very common in the models for individual cow-level predictions in Chapter 2, which strengthens the evidence for its predictive value for transition diseases at dry-off.

Chapter 4

Chapter 4 aimed to aggregate individual-level predictions (from the individual-level averaged models for oxidative stress, inflammation, and nutrient metabolism in Chapter 2) for prediction of disease at the cohort level. We tested three different aggregation methods that we referred to as the p-central, p-dispersion, and p-count methods, and we also tested using all 3 aggregation methods in a combined model. The p-central method involved averaging the predicted probabilities calculated from the individual cow-level models within each cohort. The p-dispersion method involved calculating the standard deviation of the predicted probabilities from the individual cow-level models within each cohort. Lastly, the p-count method involved counting the number of cows above a set predicted probability threshold within each cohort. The threshold of predicted probability was chosen based on the best combined sensitivity and specificity in ROC curve analyses with the outcome (any transition disease and/or adverse health event).

The first major finding of Chapter 4 was that the p-dispersion method was the only method that produced viable models. The models produced using the p-central, p-count, and combined methods all failed the z-test for DSS values, which indicated poor model calibration.
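The three aggregation methods described above can be sketched for a single cohort as follows. This is a minimal illustration, not the original analysis code: the function name aggregate_predictions and the predicted probabilities are hypothetical, the sample standard deviation is an assumption (the text does not state whether the sample or population form was used), and the threshold of 0.335 is the nutrient metabolism risk threshold given in Table 4.2.

```python
from statistics import mean, stdev

def aggregate_predictions(probabilities, threshold):
    """Summarise one cohort's individual predicted probabilities in three ways."""
    return {
        # p-central: mean predicted probability within the cohort
        "p_central": mean(probabilities),
        # p-dispersion: standard deviation of predicted probabilities (sample SD assumed)
        "p_dispersion": stdev(probabilities),
        # p-count: number of cows above the predicted probability threshold
        "p_count": sum(1 for p in probabilities if p > threshold),
    }

# Hypothetical predicted probabilities for a cohort of six cows.
cohort = [0.12, 0.25, 0.31, 0.40, 0.55, 0.62]
summary = aggregate_predictions(cohort, threshold=0.335)

for name, value in summary.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Each summary would then serve as a candidate cohort-level predictor in the count models.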
The observed versus predicted analysis for the final models from the p-dispersion method had intercepts not significantly different from 0 and slopes not significantly different from 1. However, the results from the equivalence tests indicated that we could not confidently say that the slopes were equal to 1 and the intercepts were equal to 0. D² for the final averaged p-dispersion model was 0.57, which indicated that 57% of the deviance in the observed counts was explained by the predicted counts. The final models produced from the p-dispersion method tended to overestimate disease incidence when observed counts were low and underestimate disease incidence when observed counts were high. A higher dispersion of predicted probabilities within a cohort indicates that there are more cows at both higher and lower risk. This discrepancy between cows could be a result of social stress, where cows that are higher in the social hierarchy may be doing better than cows lower in the hierarchy (Nordlund et al., 2006).

The second major finding of Chapter 4 was that the predictions from the inflammation model set were the most common variable, present in all three final models. This is consistent with our findings from Chapter 2 that inflammation biomarkers were better predictors of transition cow diseases than nutrient metabolism biomarkers. This is also consistent with Chapter 3, in which albumin, a biomarker for inflammation, was one of the most common biomarkers in the models. These results confirm that inflammatory biomarkers may be more predictive of transition diseases compared with nutrient metabolism biomarkers at dry-off.

Conclusions

We successfully built predictive models for transition diseases at the individual cow level and cohort level using biomarkers measured at dry-off. Across all analyses, the inflammatory biomarkers were more predictive of disease compared with the nutrient metabolism biomarkers.
This demonstrates the importance of investigating new biomarkers for transition diseases that may be more effective at identifying individual cows and cohorts at risk earlier in the lactation cycle than current methods. We also found that compositional modeling was an effective way to aggregate individual cow-level data to the cohort level.

APPENDICES

APPENDIX A: Supplemental Tables

Table S2.1: Effect coding for categorical variables

Variable                                Category              Effect Coding                   Cut-points

Inflammation
Albumin (g/dL)                          1 (Ref)               -1: -1: -1: -1                  <3.2
                                        2                     1: 0: 0: 0                      3.2 - <3.3
                                        3                     0: 1: 0: 0                      3.3 - <3.5
                                        4                     0: 0: 1: 0                      3.5 - <3.6
                                        5                     0: 0: 0: 1                      3.6 and greater

                                        1 (Ref)               -0.21: -0.21: -0.21: -0.21      0
                                        2                     1: 0: 0: 0                      >0 - <38.05
                                        3                     0: 1: 0: 0                      38.05 - <74.53
                                        4                     0: 0: 1: 0                      74.53 - <126.53
                                        5                     0: 0: 0: 1                      126.53 and greater

                                        1 (Ref)               -0.20: -0.19: -0.20: -0.19      0
                                        2                     1: 0: 0: 0                      >0 - <37.50
                                        3                     0: 1: 0: 0                      37.50 - <50.68
                                        4                     0: 0: 1: 0                      50.68 - <94.00
                                        5                     0: 0: 0: 1                      94.00 and greater

                                        1 (Ref)               -1: -1: -1: -1                  <95.90
                                        2                     1: 0: 0: 0                      95.90 - <179.00
                                        3                     0: 1: 0: 0                      179.00 - <294.68
                                        4                     0: 0: 1: 0                      294.68 - <499.68
                                        5                     0: 0: 0: 1                      499.68 and greater

Haptoglobin (mg/mL)                     1 (Ref)               -1: -1: -1: -1                  <0.064
                                        2                     1: 0: 0: 0                      0.064 - <0.097
                                        3                     0: 1: 0: 0                      0.097 - <0.137
                                        4                     0: 0: 1: 0                      0.137 - <0.202
                                        5                     0: 0: 0: 1                      0.202 and greater

                                        1 (Ref)               -1: -1: -1: -1                  <2704.48
                                        2                     1: 0: 0: 0                      2704.48 - <3411.84
                                        3                     0: 1: 0: 0                      3411.84 - <4289.14
                                        4                     0: 0: 1: 0                      4289.14 - <5982.66
                                        5                     0: 0: 0: 1                      5982.66 and greater

                                        1 (Ref)               1.67: -1.69: -1.68: -1.69       0
                                        2                     1: 0: 0: 0                      >0 - <96.30
                                        3                     0: 1: 0: 0                      96.30 - <184.83
                                        4                     0: 0: 1: 0                      184.83 - <343.70
                                        5                     0: 0: 0: 1                      343.70 and greater

                                        1 (Ref)               -1: -1: -1                      <10.04
                                        2                     1: 0: 0                         10.04 - <29.14
                                        3                     0: 1: 0                         29.14 - <77.00
                                        4                     0: 0: 1                         77.00 and greater

Vitamin D (ng/mL)                       1 (Ref)               -1: -1: -1: -1                  <74.62
                                        2                     1: 0: 0: 0                      74.62 - <90.06
                                        3                     0: 1: 0: 0                      90.06 - <105.78
                                        4                     0: 0: 1: 0                      105.78 - <118.60
                                        5                     0: 0: 0: 1                      118.60 and greater

Nutrient Metabolism
Body Condition Score                    Non-Ideal BCS (Ref)   -1.22                           <3.5 or >4.0
                                        Ideal BCS             1                               3.5 or 4.0

Cholesterol (mmol/L)                    1 (Ref)               -1: -1: -1                      <120
                                        2                     1: 0: 0                         120 - <173
                                        3                     0: 1: 0                         173 - <224
                                        4                     0: 0: 1                         224 and greater

Oxidative Stress
Alpha                                   1 (Ref)               -1: -1: -1: -1                  <2.25
                                        2                     1: 0: 0: 0                      2.25 - <3.24
                                        3                     0: 1: 0: 0                      3.24 - <4.45
                                        4                     0: 0: 1: 0                      4.45 - <5.73
                                        5                     0: 0: 0: 1                      5.73 and greater

Antioxidant Potential (Trolox           1 (Ref)               -1: -1: -1: -1                  <3.17
                                        2                     1: 0: 0: 0                      3.17 - <4.00
                                        3                     0: 1: 0: 0                      4.00 - <4.74
                                        4                     0: 0: 1: 0                      4.74 - <5.57
                                        5                     0: 0: 0: 1                      5.57 and greater

Beta-                                   1 (Ref)               -1: -1: -1: -1                  <2.26
                                        2                     1: 0: 0: 0                      2.26 - <3.06
                                        3                     0: 1: 0: 0                      3.06 - <4.14
                                        4                     0: 0: 1: 0                      4.14 - <5.93
                                        5                     0: 0: 0: 1                      5.93 and greater

Reactive Oxygen Species (DCF            1 (Ref)               -1: -1: -1: -1                  <26.70
                                        2                     1: 0: 0: 0                      26.70 - <36.26
                                        3                     0: 1: 0: 0                      36.26 - <49.76
                                        4                     0: 0: 1: 0                      49.76 - <71.13
                                        5                     0: 0: 0: 1                      71.13 and greater

Vitamin A (ng/mL)                       1 (Ref)               -1: -1: -1                      <237
                                        2                     1: 0: 0                         237 - <286
                                        3                     0: 1: 0                         286 - <346
                                        4                     0: 0: 1                         346 and greater

Other covariates
Lactation Number                        1 (Ref)               -1: -1                          1
                                        2                     1: 0                            2
                                        3                     0: 1                            3 and greater

Season                                  1 (Ref)               -0.32: -1.30: -0.35             Spring (March, April, May)
                                        2                     1: 0: 0                         Summer (June, July, August)
                                        3                     0: 1: 0                         Fall (September, October, November)
                                        4                     0: 0: 1                         Winter (December, January, February)

SAA: serum amyloid A; BCS: body condition score; AOP: antioxidant potential; ROS: reactive oxygen species

Table S2.2 Multivariable logistic regression candidate model set for inflammation component of metabolic stress

Columns: Variable; Coefficient; SE; Lower 95% CI; Upper 95% CI; P-value

Model 1¹ (Model AIC = 333.60; Model AUC = 0.86)
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3                0.18    0.35   -0.50    0.86    0.61
  3.3 - <3.5               -0.27    0.30   -0.86    0.33    0.38
  3.5 - <3.6                0.84    0.44   -0.02    1.70    0.05
  3.6 and greater          -0.41    0.41   -1.21    0.39    0.32
Lactation Number
  1st (ref)
  2nd                      -0.35    0.26   -0.85    0.16    0.18
  3rd or greater            0.80    0.25    0.30    1.29    0.00

  0 (ref)
  >0 - <96.30               0.75    0.32    0.12    1.38    0.02
  96.30 - <184.83           0.22    0.32   -0.41    0.85    0.49
  184.83 - <343.70         -0.30    0.38   -1.05    0.44    0.43
  343.70 and greater        0.27    0.36   -0.43    0.98    0.45
Neutrophil: Lymphocyte ratio   -0.71    0.32   -1.34    0.26    0.16
Season
  Spring (ref)
  Summer                   -0.47    0.68   -1.80    0.87    0.50
  Fall                     -0.32    0.26   -0.83    0.18    0.21
  Winter                    1.31    0.52    0.30    2.32    0.01

  <10.04 (ref)
  10.04 - <29.14           -0.59    0.32   -1.22    0.03    0.06
  29.14 - <77.00           -0.13    0.34   -0.79    0.53    0.70
  77.00 and greater         0.75    0.34    0.09    1.41    0.03
Lactation Number and Monocyte (count pe
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30            -0.65    0.46   -1.56    0.26    0.16
    96.30 - <184.83         0.10    0.45   -0.77    0.98    0.82
    184.83 - <343.70        0.79    0.49   -0.18    1.75    0.11
    343.70 and greater     -0.97    0.54   -2.02    0.08    0.07
  3rd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30             0.49    0.44   -0.37    1.34    0.27
    96.30 - <184.83         0.31    0.48   -0.62    1.24    0.51
    184.83 - <343.70        0.52    0.48   -0.42    1.47    0.28
    343.70 and greater     -0.32    0.48   -1.25    0.61    0.50
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction   -0.50    0.65   -1.77    0.77    0.44
SAA (2nd quartile) and Albumin (3rd quintile) Interaction    1.37    0.52    0.35    2.38    0.01
SAA (2nd quartile) and Albumin (4th quintile) Interaction   -1.02    0.65   -2.30    0.25    0.12
SAA (2nd quartile) and Albumin (5th quintile) Interaction   -0.32    0.67   -1.64    1.01    0.64
SAA (3rd quartile) and Albumin (2nd quintile) Interaction    0.01    0.56   -1.09    1.10    0.99
SAA (3rd quartile) and Albumin (3rd quintile) Interaction    0.92    0.51   -0.08    1.91    0.07
SAA (3rd quartile) and Albumin (4th quintile) Interaction    0.35    0.76   -1.15    1.84    0.65
SAA (3rd quartile) and Albumin (5th quintile) Interaction    0.96    0.69   -0.39    2.30    0.16
SAA (4th quartile) and Albumin (2nd quintile) Interaction   -0.84    0.56   -1.94    0.25    0.13
SAA (4th quartile) and Albumin (3rd quintile) Interaction   -0.52    0.54   -1.58    0.55    0.34
SAA (4th quartile) and Albumin (4th quintile) Interaction    0.41    0.74   -1.04    1.85    0.58
SAA (4th quartile) and Albumin (5th quintile) Interaction    0.61    0.65   -0.67    1.89    0.35
Intercept | -0.36 | 0.49 | -1.32 | 0.59 | 0.46

Model 2 (note 2) | AIC = 333.75 | AUC = 0.86
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.11 | 0.35 | -0.58 | 0.79 | 0.76
  3.3 - <3.5 | -0.27 | 0.31 | -0.87 | 0.34 | 0.39
  3.5 - <3.6 | 0.86 | 0.44 | 0.003 | 1.72 | 0.05
  3.6 and greater | -0.39 | 0.41 | -1.20 | 0.42 | 0.34
Lactation Number
  1st (ref)
  2nd | 0.85 | 0.70 | -0.53 | 2.22 | 0.23
  3rd or greater | -0.66 | 0.65 | -1.95 | 0.62 | 0.31
Monocytes (count
  0 (ref)
  >0 - <96.30 | 0.71 | 0.33 | 0.06 | 1.36 | 0.03
  96.30 - <184.83 | 0.19 | 0.32 | -0.44 | 0.82 | 0.56
  184.83 - <343.70 | -0.19 | 0.39 | -0.97 | 0.58 | 0.62
  343.70 and greater | 0.12 | 0.37 | -0.59 | 0.84 | 0.74
Neutrophils | -0.0003 | 2E-04 | -0.001 | 0.00000762 | 0.06
Season
  Spring (ref)
  Summer | -0.58 | 0.72 | -1.99 | 0.82 | 0.42
  Fall | -0.32 | 0.26 | -0.83 | 0.19 | 0.22
  Winter | 1.32 | 0.53 | 0.28 | 2.36 | 0.01
Serum [Amyloid A]
  <9.64 (ref)
  9.64 - <29.13 | -0.58 | 0.32 | -1.20 | 0.04 | 0.07
  29.13 - <77.00 | -0.09 | 0.34 | -0.76 | 0.58 | 0.79
  77.00 and greater | 0.61 | 0.32 | -0.02 | 1.24 | 0.06
Lactation Number and [Monocyte] Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.77 | 0.47 | -1.67 | 0.16 | 0.11
    96.30 - <184.83 | -0.08 | 0.46 | -0.98 | 0.82 | 0.86
    184.83 - <343.70 | 1.03 | 0.52 | 0.01 | 2.05 | 0.05
    343.70 and greater | -0.83 | 0.55 | -1.90 | 0.24 | 0.13
  3rd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.47 | 0.44 | -0.40 | 1.33 | 0.29
    96.30 - <184.83 | 0.51 | 0.48 | -0.42 | 1.44 | 0.28
    184.83 - <343.70 | 0.60 | 0.49 | -0.36 | 1.56 | 0.22
    343.70 and greater | -0.74 | 0.51 | -1.73 | 0.25 | 0.14
  1st Lactation (ref)
  2nd Lactation and Neutrophil Interaction | -0.0004 | 2E-04 | -0.0008 | 0.00002 | 0.06
  3+ Lactation and Neutrophil Interaction | 0.0004 | 2E-04 | 0.00008 | 0.0008 | 0.02
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | -0.48 | 0.64 | -1.74 | 0.78 | 0.45
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 1.38 | 0.53 | 0.34 | 2.42 | 0.01
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -1.14 | 0.66 | -2.44 | 0.16 | 0.09
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -0.41 | 0.69 | -1.76 | 0.94 | 0.55
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | 0.12 | 0.57 | -1.01 | 1.24 | 0.84
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 0.98 | 0.52 | -0.05 | 2.01 | 0.06
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 0.39 | 0.76 | -1.10 | 1.88 | 0.61
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 0.97 | 0.72 | -0.43 | 2.38 | 0.17
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.91 | 0.56 | -2.00 | 0.18 | 0.10
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -0.43 | 0.56 | -1.53 | 0.66 | 0.44
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.16 | 0.73 | -1.27 | 1.58 | 0.83
SAA (4th quartile) and Albumin (5th quintile) Interaction | 0.81 | 0.67 | -0.51 | 2.12 | 0.23
Intercept | -0.11 | 0.60 | -1.27 | 1.06 | 0.86

Model 3 (note 3) | AIC = 333.80 | AUC = 0.88
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.19 | 0.35 | -0.51 | 0.88 | 0.60
  3.3 - <3.5 | -0.25 | 0.31 | -0.85 | 0.36 | 0.43
  3.5 - <3.6 | 0.65 | 0.44 | -0.21 | 1.52 | 0.14
  3.6 and greater | -0.18 | 0.42 | -1.01 | 0.65 | 0.67
Lactation Number
  1st (ref)
  2nd | -0.39 | 0.27 | -0.92 | 0.14 | 0.15
  3rd or greater | 0.81 | 0.26 | 0.30 | 1.32 | 0.00
[Monocytes]
  0 (ref)
  >0 - <96.30 | 0.82 | 0.34 | 0.15 | 1.48 | 0.02
  96.30 - <184.83 | 0.10 | 0.33 | -0.55 | 0.76 | 0.76
  184.83 - <343.70 | -0.24 | 0.40 | -1.02 | 0.54 | 0.54
  343.70 and greater | 0.24 | 0.38 | -0.50 | 0.97 | 0.53
Neutrophil: Lymphocyte ratio | -0.65 | 0.33 | -1.30 | -0.01 | 0.05
[Serum Amyloid A]
  <10.04 (ref)
  10.04 - <29.14 | -0.74 | 0.34 | -1.40 | -0.07 | 0.03
  29.14 - <77.00 | -0.05 | 0.35 | -0.75 | 0.64 | 0.88
  77.00 and greater | 0.88 | 0.37 | 0.16 | 1.60 | 0.02
Lactation Number and Monocyte Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.63 | 0.49 | -1.58 | 0.32 | 0.19
    96.30 - <184.83 | 0.14 | 0.47 | -0.78 | 1.06 | 0.76
    184.83 - <343.70 | 0.70 | 0.51 | -0.29 | 1.69 | 0.17
    343.70 and greater | -1.06 | 0.57 | -2.19 | 0.06 | 0.06
  3rd and greater Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.52 | 0.46 | -0.38 | 1.43 | 0.26
    96.30 - <184.83 | 0.32 | 0.50 | -0.65 | 1.29 | 0.52
    184.83 - <343.70 | 0.59 | 0.50 | -0.40 | 1.57 | 0.24
    343.70 and greater | -0.28 | 0.48 | -1.23 | 0.67 | 0.56
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | -0.23 | 0.65 | -1.49 | 1.04 | 0.73
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 1.21 | 0.52 | 0.20 | 2.22 | 0.02
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -1.17 | 0.67 | -2.49 | 0.15 | 0.08
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -0.36 | 0.69 | -1.71 | 1.00 | 0.61
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | -0.08 | 0.58 | -1.22 | 1.05 | 0.88
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 0.96 | 0.53 | -0.08 | 2.00 | 0.07
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 0.46 | 0.79 | -1.10 | 2.01 | 0.57
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 1.05 | 0.71 | -0.34 | 2.44 | 0.14
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -1.11 | 0.59 | -2.27 | 0.06 | 0.06
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -0.33 | 0.57 | -1.45 | 0.79 | 0.56
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.45 | 0.76 | -1.04 | 1.94 | 0.55
SAA (4th quartile) and Albumin (5th quintile) Interaction | 0.47 | 0.67 | -0.85 | 1.79 | 0.49
Intercept | -0.44 | 0.51 | -1.44 | 0.56 | 0.39

Model 4 (note 4) | AIC = 335.98 | AUC = 0.86
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.11 | 0.35 | -0.57 | 0.80 | 0.75
  3.3 - <3.5 | -0.15 | 0.31 | -0.75 | 0.46 | 0.64
  3.5 - <3.6 | 0.72 | 0.44 | -0.14 | 1.58 | 0.10
  3.6 and greater | -0.39 | 0.41 | -1.20 | 0.42 | 0.34
Lactation Number
  1st (ref)
  2nd | -0.23 | 0.27 | -0.76 | 0.29 | 0.39
  3rd or greater | 0.71 | 0.26 | 0.20 | 1.23 | 0.01
[variable name missing]
  <2704.48 (ref)
  2704.48 - <3411.84 | -0.82 | 0.36 | -1.52 | -0.12 | 0.02
  3411.84 - <4289.14 | 0.11 | 0.34 | -0.56 | 0.78 | 0.74
  4289.14 - <5982.66 | -0.18 | 0.36 | -0.89 | 0.53 | 0.62
  5982.66 and greater | 0.96 | 0.42 | 0.14 | 1.78 | 0.02
[Monocytes]
  0 (ref)
  >0 - <96.30 | 0.90 | 0.33 | 0.25 | 1.55 | 0.01
  96.30 - <184.83 | 0.39 | 0.32 | -0.25 | 1.03 | 0.23
  184.83 - <343.70 | -0.35 | 0.38 | -1.10 | 0.40 | 0.36
  343.70 and greater | 0.05 | 0.36 | -0.65 | 0.76 | 0.89
Season
  Spring (ref)
  Summer | -0.57 | 0.68 | -1.90 | 0.76 | 0.40
  Fall | -0.21 | 0.25 | -0.69 | 0.28 | 0.40
  Winter | 1.33 | 0.51 | 0.33 | 2.34 | 0.01
[Serum Amyloid A]
  <9.64 (ref)
  9.64 - <29.13 | -0.58 | 0.32 | -1.21 | 0.04 | 0.07
  29.13 - <77.00 | -0.17 | 0.34 | -0.84 | 0.49 | 0.62
  77.00 and greater | 0.64 | 0.33 | -0.01 | 1.29 | 0.05
Lactation Number and Monocyte Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.51 | 0.47 | -1.42 | 0.41 | 0.28
    96.30 - <184.83 | 0.02 | 0.45 | -0.86 | 0.90 | 0.97
    184.83 - <343.70 | 0.83 | 0.50 | -0.14 | 1.80 | 0.10
    343.70 and greater | -1.03 | 0.54 | -2.10 | 0.04 | 0.06
  3rd and greater Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.53 | 0.45 | -0.35 | 1.41 | 0.24
    96.30 - <184.83 | 0.37 | 0.47 | -0.55 | 1.29 | 0.43
    184.83 - <343.70 | 0.53 | 0.49 | -0.42 | 1.49 | 0.27
    343.70 and greater | -0.33 | 0.47 | -1.26 | 0.60 | 0.49
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | -0.50 | 0.64 | -1.74 | 0.75 | 0.44
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 1.31 | 0.51 | 0.31 | 2.32 | 0.01
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -0.95 | 0.66 | -2.24 | 0.34 | 0.15
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -0.32 | 0.67 | -1.64 | 1.00 | 0.63
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | 0.03 | 0.57 | -1.09 | 1.15 | 0.96
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 0.88 | 0.51 | -0.11 | 1.88 | 0.08
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 0.27 | 0.77 | -1.24 | 1.78 | 0.72
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 1.08 | 0.68 | -0.26 | 2.41 | 0.11
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.89 | 0.58 | -2.04 | 0.25 | 0.13
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -0.33 | 0.54 | -1.40 | 0.73 | 0.54
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.25 | 0.75 | -1.22 | 1.71 | 0.74
SAA (4th quartile) and Albumin (5th quintile) Interaction | 0.75 | 0.69 | -0.60 | 2.10 | 0.28
Intercept | -1.04 | 0.41 | -1.84 | -0.24 | 0.01

Model 5 (note 5) | AIC = 336.80 | AUC = 0.85
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.12 | 0.34 | -0.54 | 0.78 | 0.73
  3.3 - <3.5 | -0.25 | 0.30 | -0.84 | 0.34 | 0.40
  3.5 - <3.6 | 0.73 | 0.42 | -0.09 | 1.55 | 0.08
  3.6 and greater | -0.16 | 0.38 | -0.91 | 0.58 | 0.67
Lactation Number
  1st (ref)
  2nd | -0.36 | 0.25 | -0.85 | 0.14 | 0.16
  3rd or greater | 0.77 | 0.25 | 0.29 | 1.26 | 0.00
Monocytes
  0 (ref)
  >0 - <96.30 | 0.76 | 0.32 | 0.13 | 1.38 | 0.02
  96.30 - <184.83 | 0.24 | 0.31 | -0.38 | 0.85 | 0.45
  184.83 - <343.70 | -0.34 | 0.37 | -1.08 | 0.39 | 0.36
  343.70 and greater | 0.12 | 0.35 | -0.57 | 0.81 | 0.74
Season
  Spring (ref)
  Summer | -0.71 | 0.67 | -2.02 | 0.61 | 0.29
  Fall | -0.17 | 0.23 | -0.62 | 0.29 | 0.48
  Winter | 1.23 | 0.49 | 0.27 | 2.20 | 0.01
[Serum Amyloid A]
  <10.04 (ref)
  10.04 - <29.14 | -0.47 | 0.30 | -1.05 | 0.14 | 0.13
  29.14 - <77.00 | -0.11 | 0.33 | -0.76 | 0.53 | 0.73
  77.00 and greater | 0.57 | 0.31 | -0.04 | 1.18 | 0.07
Lactation Number and Monocyte Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.53 | 0.45 | -1.41 | 0.35 | 0.24
    96.30 - <184.83 | 0.12 | 0.44 | -0.74 | 0.98 | 0.78
    184.83 - <343.70 | 0.73 | 0.48 | -0.21 | 1.68 | 0.13
    343.70 and greater | -1.03 | 0.43 | -2.06 | 0.01 | 0.05
  3rd and greater Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.34 | 0.42 | -0.49 | 1.17 | 0.42
    96.30 - <184.83 | 0.38 | 0.46 | -0.53 | 1.28 | 0.41
    184.83 - <343.70 | 0.63 | 0.47 | -0.29 | 1.56 | 0.18
    343.70 and greater | -0.44 | 0.46 | -1.35 | 0.46 | 0.34
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | -0.48 | 0.62 | -1.68 | 0.73 | 0.44
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 1.20 | 0.50 | 0.23 | 2.18 | 0.02
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -1.03 | 0.63 | -2.26 | 0.21 | 0.10
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -0.30 | 0.65 | -1.57 | 0.97 | 0.64
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | -0.18 | 0.54 | -1.24 | 0.89 | 0.75
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 0.94 | 0.50 | -0.04 | 1.92 | 0.06
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 0.38 | 0.74 | -1.07 | 1.82 | 0.61
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 1.17 | 0.65 | -0.10 | 2.45 | 0.07
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.64 | 0.53 | -1.68 | 0.41 | 0.23
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -0.46 | 0.53 | -1.51 | 0.58 | 0.38
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.30 | 0.70 | -1.08 | 1.69 | 0.67
SAA (4th quartile) and Albumin (5th quintile) Interaction | 0.45 | 0.63 | -0.79 | 1.69 | 0.48
Intercept | -0.97 | 0.33 | -1.62 | -0.31 | 0.00

1 Random intercept estimate (Farm: 0.06, 95% CI = 0.09 - 3.98)
2 Random intercept estimate (Farm: 0.54, 95% CI = 0.07 - 4.01)
3 Random intercept estimates (Farm: 0.50, 95% CI = 0.03 - 7.94; Cohort: 0.64, 95% CI = 0.13 - 3.18)
4 Random intercept estimate (Farm: 0.57, 95% CI = 0.08 - 3.95)
5 Random intercept estimate (Farm: 0.32, 95% CI = 0.03 - 3.26)
SE: standard error; CI: confidence interval; AIC: Akaike information criteria; AUC: area under the curve

Table S2.3 Multivariable logistic regression candidate model set for nutrient metabolism component of metabolic stress

Variable / Category | Coefficient | SE | Lower 95% CI | Upper 95% CI | P-value

Model 1 | AIC = 333.16 | AUC = 0.72
Calcium (mg/dL) | -0.73 | 0.35 | -1.41 | -0.04 | 0.04
Lactation Number
  1st (ref)
  2nd | -0.15 | 0.20 | -0.54 | 0.25 | 0.46
  3rd or greater | 0.60 | 0.20 | 0.20 | 0.99 | 0.00
NEFA (mmol/L) | -1.08 | 0.61 | -2.28 | 0.11 | 0.08
Season
  Spring (ref)
  Summer | -1.68 | 0.71 | -3.08 | -0.29 | 0.02
  Fall | 1.16 | 0.34 | 0.49 | 1.83 | <0.01
  Winter | -0.11 | 0.63 | -1.34 | 1.12 | 0.86
NEFA (mmol/L) and Season Interaction
  Spring (ref)
  Summer | 3.43 | 2.63 | -1.72 | 8.58 | 0.19
  Fall | -2.41 | 0.68 | -3.74 | -1.07 | <0.01
  Winter | 1.17 | 1.18 | -1.15 | 3.49 | 0.32
Intercept | 6.96 | 3.39 | 0.32 | 13.60 | 0.04

Model 2 | AIC = 333.77 | AUC = 0.73
BCS
  <3.5 or >4.0 (ref)
  3.5 or 4.0 | 0.16 | 0.14 | -0.11 | 0.43 | 0.24
Calcium (mg/dL) | -0.72 | 0.35 | -1.40 | -0.03 | 0.04
Lactation Number
  1st (ref)
  2nd | -0.11 | 0.20 | -0.51 | 0.29 | 0.59
  3rd or greater | 0.58 | 0.20 | 0.18 | 0.98 | <0.01
NEFA (mmol/L) | -1.05 | 0.61 | -2.25 | 0.15 | 0.09
Season
  Spring (ref)
  Summer | -1.77 | 0.71 | -3.17 | -0.37 | 0.01
  Fall | 1.16 | 0.34 | 0.49 | 1.84 | <0.01
  Winter | -0.02 | 0.63 | -1.26 | 1.22 | 0.97
NEFA (mmol/L) and Season Interaction
  Spring (ref)
  Summer | 3.53 | 2.62 | -1.61 | 8.66 | 0.18
  Fall | -2.30 | 0.69 | -3.64 | -0.95 | <0.01
  Winter | 0.98 | 1.20 | -1.36 | 3.33 | 0.41
Intercept | 6.87 | 3.39 | 0.23 | 13.50 | 0.04

Model 3 | AIC = 334.58 | AUC = 0.72
BHB (mg/dL) | -0.06 | 0.08 | -0.21 | 0.09 | 0.45
Calcium (mg/dL) | -0.72 | 0.35 | -1.41 | -0.03 | 0.04
Lactation Number
  1st (ref)
  2nd | -0.11 | 0.21 | -0.52 | 0.30 | 0.58
  3rd or greater | 0.62 | 0.20 | 0.22 | 1.02 | <0.01
NEFA (mmol/L) | -1.01 | 0.62 | -2.22 | 0.20 | 0.10
Season
  Spring (ref)
  Summer | -1.54 | 0.74 | -2.98 | -0.10 | 0.04
  Fall | 1.12 | 0.35 | 0.44 | 1.79 | <0.01
  Winter | -0.06 | 0.63 | -1.30 | 1.17 | 0.92
NEFA (mmol/L) and Season Interaction
  Spring (ref)
  Summer | 3.41 | 2.64 | -1.76 | 8.59 | 0.20
  Fall | -2.35 | 0.68 | -3.69 | -1.01 | <0.01
  Winter | 1.11 | 1.19 | -1.23 | 3.44 | 0.35
Intercept | 7.19 | 3.42 | 0.49 | 13.88 | 0.04

Model 4 | AIC = 335.29 | AUC = 0.73
BCS
  <3.5 or >4.0 (ref)
  3.5 or 4.0 | 0.35 | 0.31 | -0.25 | 0.95 | 0.26
BHB (mg/dL) | -0.05 | 0.08 | -0.21 | 0.10 | 0.49
Calcium (mg/dL) | -0.71 | 0.35 | -1.40 | -0.03 | 0.04
Lactation Number
  1st (ref)
  2nd | -0.08 | 0.21 | -0.49 | 0.33 | 0.69
  3rd or greater | 0.60 | 0.21 | 0.20 | 1.00 | <0.01
NEFA (mmol/L) | -0.99 | 0.62 | -2.20 | 0.23 | 0.11
Season
  Spring (ref)
  Summer | -1.64 | 0.74 | -3.09 | -0.19 | 0.03
  Fall | 1.12 | 0.35 | 0.44 | 1.80 | <0.01
  Winter | 0.02 | 0.64 | -1.23 | 1.27 | 0.98
NEFA (mmol/L) and Season Interaction
  Spring (ref)
  Summer | 3.51 | 2.63 | -1.66 | 8.67 | 0.18
  Fall | -2.25 | 0.69 | -3.60 | -0.90 | <0.01
  Winter | 0.93 | 1.20 | -1.43 | 3.28 | 0.44
Intercept | 6.89 | 3.42 | 0.19 | 13.60 | 0.04

Model 5 | AIC = 336.64 | AUC = 0.73
BCS
  <3.5 or >4.0 (ref)
  3.5 or 4.0 | 0.18 | 0.14 | -0.10 | 0.45 | 0.21
Calcium (mg/dL) | -0.74 | 0.35 | -1.43 | -0.06 | 0.03
Cholesterol (mmol/L)
  <120 (ref)
  120 - <173 | -0.41 | 0.27 | -0.94 | 0.12 | 0.13
  173 - <224 | -0.05 | 0.29 | -0.62 | 0.52 | 0.87
  224 and greater | -0.06 | 0.32 | -0.70 | 0.57 | 0.85
Lactation Number
  1st (ref)
  2nd | -0.02 | 0.28 | -0.57 | 0.54 | 0.95
  3rd or greater | 0.79 | 0.27 | 0.26 | 1.32 | <0.01
NEFA (mmol/L) | -1.19 | 0.64 | -2.44 | 0.06 | 0.06
Season
  Spring (ref)
  Summer | -1.89 | 0.72 | -3.30 | -0.45 | 0.01
  Fall | 1.16 | 0.35 | 0.47 | 1.84 | <0.01
  Winter | -0.02 | 0.64 | -1.28 | 1.24 | 0.98
NEFA (mmol/L) and Season Interaction
  Spring (ref)
  Summer | 3.79 | 2.61 | -1.33 | 8.92 | 0.15
  Fall | -2.28 | 0.69 | -3.64 | -0.92 | <0.01
  Winter | 1.03 | 1.21 | -1.34 | 3.39 | 0.40
Intercept | 7.17 | 3.40 | 0.50 | 13.84 | 0.04

SE: standard error; CI: confidence interval; AIC: Akaike information criteria; AUC: area under the curve; NEFA: non-esterified fatty acids; BCS: body condition score; BHB: beta-hydroxybutyrate

Table S2.4 Multivariable logistic regression candidate model set for oxidative stress component of metabolic stress

Variable / Category | Coefficient | SE | Lower 95% CI | Upper 95% CI | P-value

Model 1 (note 1) | AIC = 341.71 | AUC = 0.75
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -0.86 | 0.36 | -1.58 | -0.17 | 0.02
  3.24 - <4.45 | -0.04 | 0.28 | -0.57 | 0.51 | 0.90
  4.45 - <5.73 | 0.03 | 0.31 | -0.62 | 0.64 | 0.91
  5.73 and greater | 0.06 | 0.35 | -0.63 | 0.74 | 0.87
Antioxidant Potential (Trolox
  <3.17 (ref)
  3.17 - <4.00 | -0.90 | 0.33 | -1.55 | -0.25 | 0.01
  4.00 - <4.74 | -0.37 | 0.31 | -0.96 | 0.23 | 0.23
  4.74 - <5.57 | 0.54 | 0.30 | -0.04 | 1.11 | 0.07
  5.57 and greater | 0.06 | 0.30 | -0.53 | 0.66 | 0.83
Lactation Number
  1st (ref)
  2nd | -0.27 | 0.24 | -0.74 | 0.21 | 0.27
  3rd or greater | 0.77 | 0.21 | 0.36 | 1.17 | <0.01
Season
  Spring (ref)
  Summer | -0.53 | 0.64 | -1.78 | 0.72 | 0.40
  Fall | -0.29 | 0.21 | -0.69 | 0.11 | 0.15
  Winter | 1.17 | 0.45 | 0.28 | 2.05 | 0.01
Intercept | -0.92 | 0.30 | -1.51 | -0.33 | <0.01

Model 2 (note 2) | AIC = 341.91 | AUC = 0.78
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -0.91 | 0.36 | -1.61 | -0.20 | 0.01
  3.24 - <4.45 | 0.12 | 0.28 | -0.44 | 0.68 | 0.67
  4.45 - <5.73 | 0.06 | 0.33 | -0.58 | 0.70 | 0.86
  5.73 and greater | -0.02 | 0.37 | -0.74 | 0.70 | 0.95
Antioxidant Potential (Trolox
  <3.17 (ref)
  3.17 - <4.00 | -0.88 | 0.34 | -1.55 | -0.20 | 0.01
  4.00 - <4.74 | -0.13 | 0.31 | -0.75 | 0.68 | 0.67
  4.74 - <5.57 | -0.53 | 0.30 | -0.58 | 0.70 | 0.86
  5.57 and greater | 0.01 | 0.34 | -0.74 | 0.70 | 0.95
Lactation Number
  1st (ref)
  2nd | -0.22 | 0.25 | -0.71 | 0.26 | 0.36
  3rd or greater | 0.75 | 0.21 | 0.34 | 1.16 | <0.01
Intercept | -0.92 | 0.29 | -1.49 | -0.36 | <0.01

Model 3 (note 3) | AIC = 343.50 | AUC = 0.78
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -0.91 | 0.36 | -1.61 | -0.20 | 0.01
  3.24 - <4.45 | 0.05 | 0.29 | -0.52 | 0.63 | 0.86
  4.45 - <5.73 | 0.01 | 0.33 | -0.63 | 0.65 | 0.98
  5.73 and greater | -0.01 | 0.37 | -0.73 | 0.72 | 0.99
Antioxidant Potential (Trolox
  <3.17 (ref)
  3.17 - <4.00 | -0.89 | 0.34 | -1.56 | -0.21 | 0.01
  4.00 - <4.74 | -0.12 | 0.31 | -0.73 | 0.49 | 0.70
  4.74 - <5.57 | 0.53 | 0.31 | -0.07 | 1.13 | 0.08
  5.57 and greater | -0.04 | 0.34 | -0.70 | 0.61 | 0.90
Lactation Number
  1st (ref)
  2nd | -0.17 | 0.25 | -0.66 | 0.33 | 0.51
  3rd or greater | 0.79 | 0.21 | 0.37 | 1.20 | <0.01
Vitamin A (ng/mL)
  <237 (ref)
  237 - <286 | 0.25 | 0.26 | -0.25 | 0.75 | 0.33
  286 - <346 | 0.31 | 0.26 | -0.19 | 0.82 | 0.22
  346 and greater | -0.53 | 0.31 | -1.13 | 0.08 | 0.09
Intercept | -0.94 | 0.28 | -1.50 | -0.39 | <0.01

Model 4 (note 4) | AIC = 343.51 | AUC = 0.76
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -0.89 | 0.36 | -1.59 | -0.19 | 0.01
  3.24 - <4.45 | -0.10 | 0.29 | -0.66 | 0.46 | 0.72
  4.45 - <5.73 | -0.01 | 0.31 | -0.62 | 0.61 | 0.99
  5.73 and greater | 0.05 | 0.35 | -0.64 | 0.75 | 0.88
Antioxidant Potential (Trolox
  <3.17 (ref)
  3.17 - <4.00 | -0.93 | 0.34 | -1.59 | -0.27 | 0.01
  4.00 - <4.74 | -0.34 | 0.31 | -0.95 | 0.26 | 0.27
  4.74 - <5.57 | 0.57 | 0.30 | -0.02 | 1.16 | 0.06
  5.57 and greater | -0.02 | 0.31 | -0.62 | 0.59 | 0.96
Lactation Number
  1st (ref)
  2nd | -0.26 | 0.25 | -0.75 | 0.23 | 0.30
  3rd or greater | 0.79 | 0.21 | 0.37 | 1.20 | <0.01
Vitamin A (ng/mL)
  <237 (ref)
  237 - <286 | 0.16 | 0.26 | -0.35 | 0.67 | 0.54
  286 - <346 | 0.42 | 0.26 | -0.08 | 0.93 | 0.10
  346 and greater | -0.27 | 0.33 | -0.92 | 0.37 | 0.41
Intercept | -0.95 | 0.31 | -1.56 | -0.34 | <0.01

Model 5 (note 5) | AIC = 344.58 | AUC = 0.76
Antioxidant Potential (Trolox equivalen
  <3.17 (ref)
  3.17 - <4.00 | -0.74 | 0.32 | -1.37 | -0.12 | 0.02
  4.00 - <4.74 | -0.13 | 0.31 | -0.72 | 0.46 | 0.67
  4.74 - <5.57 | 0.39 | 0.29 | -0.18 | 0.96 | 0.18
  5.57 and greater | -0.03 | 0.31 | -0.64 | 0.59 | 0.94
Lactation Number
  1st (ref)
  2nd | -0.18 | 0.20 | -0.57 | 0.21 | 0.36
  3rd or greater | 0.71 | 0.20 | 0.32 | 1.09 | <0.01
Intercept | -0.86 | 0.26 | -1.38 | -0.35 | <0.01

Model 6 (note 6) | AIC = 344.82 | AUC = 0.75
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -0.83 | 0.35 | -1.50 | -0.15 | 0.02
  3.24 - <4.45 | 0.09 | 0.28 | -0.45 | 0.63 | 0.74
  4.45 - <5.73 | 0.14 | 0.31 | -0.48 | 0.75 | 0.67
  5.73 and greater | 0.03 | 0.35 | -0.65 | 0.71 | 0.93
Lactation Number
  1st (ref)
  2nd | -0.18 | 0.23 | -0.64 | 0.28 | 0.44
  3rd or greater | 0.64 | 0.20 | 0.26 | 1.03 | <0.01
Intercept | -0.85 | 0.25 | -1.34 | -0.36 | <0.01

Model 7 (note 7) | AIC = 344.83 | AUC = 0.73
Antioxidant Potential (Trolox
  <3.17 (ref)
  3.17 - <4.00 | -0.73 | 0.31 | -1.33 | -0.12 | 0.02
  4.00 - <4.74 | -0.36 | 0.30 | -0.94 | 0.23 | 0.23
  4.74 - <5.57 | 0.35 | 0.28 | -0.19 | 0.89 | 0.21
  5.57 and greater | -0.02 | 0.28 | -0.56 | 0.53 | 0.95
Lactation Number
  1st (ref)
  2nd | -0.20 | 0.20 | -0.59 | 0.18 | 0.31
  3rd or greater | 0.72 | 0.20 | 0.34 | 1.10 | <0.01
Season
  Spring (ref)
  Summer | -0.83 | 0.60 | -2.00 | 0.34 | 0.17
  Fall | -0.15 | 0.18 | -0.51 | 0.21 | 0.41
  Winter | 0.92 | 0.41 | 0.11 | 1.72 | 0.03
Intercept | -0.85 | 0.25 | -1.35 | -0.36 | <0.01

1 Random intercept estimate (Farm: 0.33, 95% CI = 0.05 - 2.09)
2 Random intercept estimates (Farm: 0.15, 95% CI = 0.003 - 8.17; Cohort: 0.44, 95% CI = 0.09 - 2.1)
3 Random intercept estimates (Farm: 0.16, 95% CI = 0.003 - 7.99; Cohort: 0.35, 95% CI = 0.06 - 2.1)
4 Random intercept estimate (Farm: 0.35, 95% CI = 0.05 - 2.25)
5 Random intercept estimates (Farm: 0.13, 95% CI = 0.002 - 6.51; Cohort: 0.34, 95% CI = 0.07 - 1.76)
6 Random intercept estimates (Farm: 0.08, 95% CI = 0.0002 - 33.98; Cohort: 0.37, 95% CI = 0.07 - 2.02)
7 Random intercept estimate (Farm: 0.20, 95% CI = 0.02 - 1.72)
SE: standard error; CI: confidence interval; AIC: Akaike information criteria; AUC: area under the curve

Table S2.5 Multivariable logistic regression candidate model set for combined model (inflammation, oxidative stress, and nutrient metabolism)

Variable / Category | Coefficient | SE | Lower 95% CI | Upper 95% CI | P-value

Model 1 (note 1) | AIC = 312.67 | AUC = 0.93
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.17 | 0.44 | -0.70 | 1.03 | 0.71
  3.3 - <3.5 | -0.05 | 0.39 | -0.81 | 0.72 | 0.90
  3.5 - <3.6 | 1.54 | 0.60 | 0.36 | 2.72 | 0.01
  3.6 and greater | -0.10 | 0.57 | -1.22 | 1.02 | 0.86
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -2.14 | 0.73 | -3.58 | -0.71 | 0.00
  3.24 - <4.45 | -0.24 | 0.49 | -1.20 | 0.71 | 0.62
  4.45 - <5.73 | 0.51 | 0.51 | -0.48 | 1.51 | 0.31
  5.73 and greater | 0.39 | 0.62 | -0.83 | 1.60 | 0.53
[Antioxidant Potential]
  <3.17 (ref)
  3.17 - <4.00 | -1.21 | 0.57 | -2.32 | -0.10 | 0.03
  4.00 - <4.74 | -0.45 | 0.49 | -1.41 | 0.51 | 0.36
  4.74 - <5.57 | 1.18 | 0.47 | 0.26 | 2.11 | 0.01
  5.57 and greater | -0.66 | 0.49 | -1.62 | -0.30 | 0.18
BCS
  <3.5 or >4.0 (ref)
  3.5 or 4.0 | 0.17 | 0.25 | -0.33 | 0.66 | 0.51
Calcium (mg/dL) | -1.23 | 0.71 | -2.61 | 0.16 | 0.08
Lactation Number
  1st (ref)
  2nd | -0.83 | 0.46 | -1.73 | 0.06 | 0.07
  3rd or greater | 1.11 | 0.34 | 0.44 | 1.79 | <0.01
[Monocytes]
  0 (ref)
  >0 - <96.30 | 0.71 | 0.43 | -0.12 | 0.54 | 0.10
  96.30 - <184.83 | 0.16 | 0.41 | -0.66 | 0.97 | 0.71
  184.83 - <343.70 | -0.29 | 0.49 | -1.25 | 0.67 | 0.55
  343.70 and greater | 0.48 | 0.52 | -0.53 | 1.49 | 0.35
Neutrophil: Lymphocyte ratio | -1.16 | 0.47 | -2.07 | -0.24 | 0.01
Season
  Spring (ref)
  Summer | -1.55 | 1.13 | -3.76 | 0.66 | 0.17
  Fall | -0.06 | 0.38 | -0.80 | 0.68 | 0.87
  Winter | 2.12 | 0.75 | 0.65 | 3.59 | 0.01
Serum Amyloid [A]
  <10.04 (ref)
  10.04 - <29.14 | -0.53 | 0.45 | -1.40 | 0.35 | 0.24
  29.14 - <77.00 | 0.34 | 0.43 | -0.51 | 1.19 | 0.43
  77.00 and greater | 0.32 | 0.46 | -0.59 | 1.23 | 0.49
[label missing]
  <3.17 (ref)
  3.17 - <4.00 | -0.80 | 0.45 | -1.69 | 0.10 | 0.08
  4.00 - <4.74 | -0.34 | 0.39 | -1.10 | 0.42 | 0.38
  4.74 - <5.57 | 1.39 | 0.46 | 0.49 | 2.29 | 0.00
  5.57 and greater | 0.22 | 0.40 | -0.57 | 1.01 | 0.59
Lactation Number and Monocyte Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.86 | 0.58 | -2.00 | 0.28 | 0.14
    96.30 - <184.83 | 0.44 | 0.59 | -0.74 | 1.60 | 0.45
    184.83 - <343.70 | 2.01 | 0.67 | 0.71 | 3.32 | <0.01
    343.70 and greater | -2.52 | 0.82 | -4.12 | -0.92 | <0.01
  3rd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.67 | 0.82 | -4.12 | -0.92 | <0.01
    96.30 - <184.83 | -0.32 | 0.60 | -0.52 | 1.85 | 0.27
    184.83 - <343.70 | 0.15 | 0.61 | -1.51 | 0.87 | 0.60
    343.70 and greater | 0.73 | 0.64 | -1.10 | 1.40 | 0.82
  1st Lactation (ref)
  2nd Lactation and SAA Interaction
    <10.04 (ref)
    10.04 - <29.14 | 1.05 | 0.58 | -0.08 | 2.18 | 0.07
    29.14 - <77.00 | -2.07 | 0.65 | -3.35 | -0.78 | <0.01
    77.00 and greater | 0.48 | 0.52 | -0.54 | 1.49 | 0.36
  3rd Lactation and SAA Interaction
    <10.04 (ref)
    10.04 - <29.14 | -1.30 | 0.60 | -2.48 | -0.13 | 0.03
    29.14 - <77.00 | 1.23 | 0.66 | -0.06 | 2.51 | 0.06
    77.00 and greater | 0.83 | 0.53 | -0.21 | 1.86 | 0.12
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | 0.14 | 0.77 | -1.38 | 1.66 | 0.86
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 1.33 | 0.78 | -0.19 | 2.85 | 0.09
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -1.40 | 0.92 | -3.20 | 0.40 | 0.13
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -1.14 | 0.94 | -2.98 | 0.70 | 0.23
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | -0.60 | 0.77 | -2.10 | 0.90 | 0.43
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 2.08 | 0.71 | 0.70 | 3.47 | <0.01
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 1.35 | 1.04 | -0.69 | 3.39 | 0.19
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 1.85 | 0.99 | -0.10 | 3.80 | 0.06
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.82 | 0.70 | -2.18 | 0.54 | 0.24
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -1.28 | 0.70 | -2.65 | 0.09 | 0.07
SAA (4th quartile) and Albumin (4th quintile) Interaction | -0.01 | 0.92 | -1.81 | 1.79 | 0.99
SAA (4th quartile) and Albumin (5th quintile) Interaction | 0.16 | 0.80 | -1.40 | 1.73 | 0.84
Intercept | 11.48 | 6.89 | -2.03 | 24.99 | 0.96

Model 2 | AIC = 313.74 | AUC = 0.93
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.23 | 0.45 | -0.66 | 1.12 | 0.61
  3.3 - <3.5 | 0.05 | 0.41 | -0.76 | 0.85 | 0.91
  3.5 - <3.6 | 1.11 | 0.58 | -0.04 | 2.25 | 0.06
  3.6 and greater | -0.23 | 0.58 | -1.36 | 0.89 | 0.69
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -2.32 | 0.78 | -3.85 | -0.78 | <0.01
  3.24 - <4.45 | 0.10 | 0.49 | -0.60 | 1.07 | 0.83
  4.45 - <5.73 | 0.53 | 0.53 | -0.51 | 1.56 | 0.32
  5.73 and greater | -0.04 | 0.63 | -1.29 | 1.20 | 0.95
[Antioxidant Potential]
  <3.17 (ref)
  3.17 - <4.00 | -1.26 | 0.57 | -2.38 | -0.14 | 0.03
  4.00 - <4.74 | -0.43 | 0.46 | -1.32 | 0.46 | 0.34
  4.74 - <5.57 | 1.27 | 0.49 | 0.31 | 2.23 | 0.01
  5.57 and greater | -0.86 | 0.52 | -1.88 | 0.16 | 0.10
BHB (mg/dL) | -0.03 | 0.14 | -0.31 | 0.25 | 0.85
BCS
  <3.5 or >4.0 (ref)
  3.5 or 4.0 | 0.39 | 0.25 | -0.09 | 0.87 | 0.12
Calcium (mg/dL) | -1.66 | 0.71 | -3.05 | -0.27 | 0.02
Lactation Number
  1st (ref)
  2nd | -1.01 | 0.50 | -1.99 | -0.04 | 0.04
  3rd or greater | 1.19 | 0.37 | 0.46 | 1.92 | <0.01
Season
  Spring (ref)
  Summer | -6.26 | 2.25 | -10.67 | -1.85 | 0.01
  Fall | 2.02 | 0.85 | 0.37 | 3.68 | 0.02
  Winter | 1.63 | 1.76 | -1.83 | 5.09 | 0.36
[Monocytes]
  0 (ref)
  >0 - <96.30 | 0.58 | 0.45 | -0.31 | 1.47 | 0.21
  96.30 - <184.83 | 0.26 | 0.43 | -0.59 | 1.11 | 0.55
  184.83 - <343.70 | -0.31 | 0.50 | -1.30 | 0.67 | 0.53
  343.70 and greater | 0.61 | 0.54 | -0.44 | 1.66 | 0.25
Neutrophil: Lymphocyte ratio | -1.06 | 0.46 | -1.97 | -0.15 | 0.02
NEFA (mmol/L) | -1.57 | 0.94 | -3.41 | 0.28 | 0.10
Serum [Amyloid A]
  <10.04 (ref)
  10.04 - <29.14 | -0.53 | 0.44 | -1.39 | 0.33 | 0.23
  29.14 - <77.00 | 0.57 | 0.47 | -0.35 | 1.49 | 0.22
  77.00 and greater | -0.04 | 0.44 | -0.89 | 0.81 | 0.93
Vitamin A (ng/mL)
  <237 (ref)
  237 - <286 | -0.87 | 0.45 | -1.75 | 0.002 | 0.05
  286 - <346 | 0.88 | 0.41 | 0.07 | 1.69 | 0.03
  346 and greater | 0.90 | 0.57 | -0.22 | 2.02 | 0.11
[label missing]
  <3.17 (ref)
  3.17 - <4.00 | -0.51 | 0.45 | -1.39 | 0.37 | 0.26
  4.00 - <4.74 | -0.10 | 0.38 | -0.84 | 0.65 | 0.80
  4.74 - <5.57 | 1.31 | 0.46 | 0.41 | 2.21 | <0.01
  5.57 and greater | -0.02 | 0.43 | -0.86 | 0.82 | 0.96
Lactation Number and Monocyte Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.88 | 0.62 | -2.09 | 0.34 | 0.16
    96.30 - <184.83 | 0.32 | 0.60 | -0.86 | 1.50 | 0.59
    184.83 - <343.70 | 1.91 | 0.64 | 0.65 | 3.18 | <0.01
    343.70 and greater | -2.51 | 0.79 | -4.05 | -0.97 | <0.01
  3rd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.72 | 0.64 | -0.54 | 1.98 | 0.26
    96.30 - <184.83 | -0.27 | 0.61 | -1.47 | 0.93 | 0.66
    184.83 - <343.70 | -0.05 | 0.63 | -1.28 | 1.19 | 0.94
    343.70 and greater | 0.79 | 0.69 | -0.56 | 2.14 | 0.25
  1st Lactation (ref)
  2nd Lactation and SAA Interaction
    <10.04 (ref)
    10.04 - <29.14 | 0.83 | 0.59 | -0.32 | 1.98 | 0.16
    29.14 - <77.00 | -2.27 | 0.70 | -3.63 | -0.91 | <0.01
    77.00 and greater | 0.77 | 0.56 | -0.32 | 1.87 | 0.17
  3rd Lactation and SAA Interaction
    <10.04 (ref)
    10.04 - <29.14 | -1.52 | 0.62 | -2.72 | -0.31 | 0.01
    29.14 - <77.00 | 0.87 | 0.64 | -0.39 | 2.13 | 0.17
    77.00 and greater | 0.91 | 0.56 | -0.19 | 2.00 | 0.11
Season and BHB (mg/dL) Interaction
  Spring (ref)
  Summer | 0.35 | 0.29 | -0.22 | 0.92 | 0.24
  Fall | -0.40 | 0.16 | -0.71 | -0.08 | 0.02
  Winter | 0.14 | 0.30 | -0.44 | 0.72 | 0.64
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | 0.53 | 0.81 | -1.06 | 2.12 | 0.52
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 0.70 | 0.74 | -0.76 | 2.15 | 0.35
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -0.85 | 0.90 | -2.62 | 0.92 | 0.35
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -1.12 | 0.94 | -2.96 | 0.72 | 0.23
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | 0.47 | 0.80 | -2.03 | 1.09 | 0.56
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 2.14 | 0.74 | 0.70 | 0.59 | <0.01
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 0.79 | 1.06 | -1.28 | 2.87 | 0.45
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 2.32 | 1.03 | 0.31 | 4.34 | 0.02
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.67 | 0.73 | -2.10 | 0.76 | 0.36
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -1.09 | 0.75 | -2.57 | 0.38 | 0.15
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.54 | 0.95 | -1.32 | 2.39 | 0.57
SAA (4th quartile) and Albumin (5th quintile) Interaction | -0.22 | 0.85 | -1.88 | 1.45 | 0.80
Intercept | 16.83 | 6.93 | 3.24 | 30.41 | 0.02

Model 3 | AIC = 314.04 | AUC = 0.92
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | 0.19 | 0.43 | -0.65 | 1.02 | 0.67
  3.3 - <3.5 | 0.09 | 0.39 | -0.68 | 0.86 | 0.82
  3.5 - <3.6 | 1.32 | 0.57 | 0.20 | 2.44 | 0.02
  3.6 and greater | 0.18 | 0.52 | -0.85 | 1.20 | 0.73
Alpha
  <2.25 (ref)
  2.25 - <3.24 | -2.12 | 0.73 | -3.55 | -0.69 | <0.01
  3.24 - <4.45 | -0.01 | 0.50 | -0.98 | 0.96 | 0.98
  4.45 - <5.73 | 0.57 | 0.48 | -0.38 | 1.51 | 0.24
  5.73 and greater | 0.32 | 0.61 | -0.87 | 1.50 | 0.60
An[tioxidant Potential]
  <3.17 (ref)
  3.17 - <4.00 | -1.25 | 0.55 | -2.33 | -0.17 | 0.02
  4.00 - <4.74 | -0.43 | 0.45 | -1.31 | 0.45 | 0.33
  4.74 - <5.57 | 1.11 | 0.47 | 0.18 | 2.04 | 0.02
  5.57 and greater | -0.64 | 0.49 | -1.60 | 0.32 | 0.19
BHB (mg/dL) | -0.26 | 0.21 | -0.67 | 0.15 | 0.21
BCS
  <3.5 or >4.0 (ref)
  3.5 or 4.0 | -0.13 | 0.41 | -0.94 | 0.69 | 0.76
Calcium (mg/dL) | -1.72 | 0.67 | -3.04 | -0.40 | 0.01
Lactation Number
  1st (ref)
  2nd | -0.93 | 0.48 | -1.87 | 0.00 | 0.05
  3rd or greater | 1.17 | 0.36 | 0.47 | 1.86 | <0.01
[Monocytes]
  0 (ref)
  >0 - <96.30 | 0.57 | 0.45 | -0.30 | 1.45 | 0.20
  96.30 - <184.83 | 0.22 | 0.42 | -0.61 | 1.04 | 0.61
  184.83 - <343.70 | -0.37 | 0.50 | -1.34 | 0.60 | 0.46
  343.70 and greater | 0.48 | 0.52 | -0.53 | 1.49 | 0.35
Neutrophil: Lymphocyte ratio | -2.59 | 1.17 | -4.87 | -0.30 | 0.03
NEFA (mmol/L) | -1.42 | 0.88 | -3.14 | 0.29 | 0.10
Season
  Spring (ref)
  Summer | -3.17 | 1.11 | -5.35 | -0.99 | <0.01
  Fall | 0.38 | 0.32 | -0.24 | 1.00 | 0.23
  Winter | 1.48 | 0.58 | 0.34 | 2.62 | 0.01
[Serum Amyloid A]
  <10.04 (ref)
  10.04 - <29.14 | -0.33 | 0.40 | -1.11 | 0.45 | 0.41
  29.14 - <77.00 | 0.48 | 0.44 | -0.38 | 1.34 | 0.27
  77.00 and greater | -0.07 | 0.41 | -0.87 | 0.72 | 0.86
[label truncated] Interaction
  <3.17 (ref)
  3.17 - <4.00 | -0.66 | 0.42 | -1.49 | 0.17 | 0.12
  4.00 - <4.74 | -0.13 | 0.37 | -0.85 | 0.59 | 0.72
  4.74 - <5.57 | 1.17 | 0.44 | 0.31 | 2.02 | 0.01
  5.57 and greater | 0.18 | 0.41 | -0.61 | 0.97 | 0.66
Lactation Number and Monocyte Interaction
  1st Lactation (ref)
  2nd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | -0.77 | 0.59 | -1.90 | 0.39 | 0.19
    96.30 - <184.83 | 0.08 | 0.56 | -1.01 | 1.18 | 0.88
    184.83 - <343.70 | 2.00 | 0.64 | 0.73 | 3.25 | <0.01
    343.70 and greater | -2.32 | 0.74 | -3.78 | -0.86 | <0.01
  3rd Lactation and Monocyte Interaction
    0 (ref)
    >0 - <96.30 | 0.76 | 0.60 | -0.43 | 1.94 | 0.21
    96.30 - <184.83 | -0.09 | 0.58 | -1.24 | 1.05 | 0.87
    184.83 - <343.70 | 0.06 | 0.62 | -1.16 | 1.28 | 0.92
    343.70 and greater | 0.47 | 0.66 | -0.82 | 1.76 | 0.47
  1st Lactation (ref)
  2nd Lactation and SAA Interaction
    <10.04 (ref)
    10.04 - <29.14 | 0.93 | 0.57 | -0.19 | 2.05 | 0.10
    29.14 - <77.00 | -2.14 | 0.66 | -3.45 | -0.84 | <0.01
    77.00 and greater | 0.55 | 0.52 | -0.47 | 1.57 | 0.29
  3rd Lactation and SAA Interaction
    <10.04 (ref)
    10.04 - <29.14 | -1.36 | 0.58 | -2.51 | -0.22 | 0.02
    29.14 - <77.00 | 1.00 | 0.63 | -0.22 | 2.23 | 0.11
    77.00 and greater | 0.92 | 0.51 | -0.09 | 1.93 | 0.07
Neutrophil: Lymphocyte ratio and BHB (mg/dL) Interaction | 0.30 | 0.17 | -0.04 | 0.64 | 0.08
NEFA (mmol/L) and BCS Interaction | 0.93 | 0.73 | -0.49 | 2.36 | 0.20
[label truncated] and Albumin (g/dL) Interaction
SAA (1st quartile) and Albumin (1st quintile) Interaction (ref)
SAA (2nd quartile) and Albumin (2nd quintile) Interaction | 0.37 | 0.73 | -1.07 | 1.81 | 0.61
SAA (2nd quartile) and Albumin (3rd quintile) Interaction | 0.79 | 0.71 | -0.59 | 2.18 | 0.26
SAA (2nd quartile) and Albumin (4th quintile) Interaction | -1.24 | 0.89 | -2.98 | 0.50 | 0.16
SAA (2nd quartile) and Albumin (5th quintile) Interaction | -1.02 | 0.87 | -2.72 | 0.68 | 0.24
SAA (3rd quartile) and Albumin (2nd quintile) Interaction | -0.44 | 0.72 | -1.86 | 0.98 | 0.54
SAA (3rd quartile) and Albumin (3rd quintile) Interaction | 1.95 | 0.72 | 0.54 | 3.37 | 0.01
SAA (3rd quartile) and Albumin (4th quintile) Interaction | 1.11 | 1.00 | -0.85 | 3.07 | 0.27
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 2.09 | 0.96 | 0.21 | 3.96 | 0.03
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.79 | 0.68 | -2.12 | 0.54 | 0.25
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -0.93 | 0.70 | -2.30 | 0.45 | 0.19
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.19 | 0.90 | -1.57 | 1.94 | 0.83
SAA (4th quartile) and Albumin (5th quintile) Interaction | -0.23 | 0.76 | -1.73 | 1.26 | 0.76
Intercept | 18.45 | 6.64 | 5.43 | 31.48 | 0.01

Model 4 | AIC = 316.10 | AUC = 0.92
Albumin (g/dL)
  <3.2 (ref)
  3.2 - <3.3 | -0.05 | 0.41 | -0.87 | 0.76 | 0.90
  3.3 - <3.5 | 0.08 | 0.40 | -0.69 | 0.85 | 0.84
  3.5 - <3.6 | 1.62 | 0.57 | 0.51 | 2.74 | <0.01
  3.6 and greater | 0.00 | 0.53 | -1.03 | 1.04 | 0.99
[Alpha]
  <2.25 (ref)
  2.25 - <3.24 | -1.88 | 0.64 | -3.15 | -0.62 | <0.01
  3.24 - <4.45 | 0.11 | 0.48 | -0.83 | 1.04 | 0.82
  4.45 - <5.73 | 0.43 | 0.48 | -0.52 | 1.37 | 0.38
  5.73 and greater | -0.18 | 0.53 | -1.22 | 0.86 | 0.73
[Antioxidant Potential]
  <3.17 (ref)
  3.17 - <4.00 | -0.42 | 1.00 | -2.39 | 1.55 | 0.67
  4.00 - <4.74 | 0.51 | 0.87 | -1.19 | 2.21 | 0.56
  4.74 - <5.57 | -1.22 | 0.88 | -2.94 | 0.51 | 0.17
  5.57 and greater | -0.72 | 0.82 | -2.33 | 0.89 | 0.38
Calcium (mg/dL) | -1.77 | 0.70 | -3.15 | -0.39 | 0.01
Lactation Number
  1st (ref)
  2nd | -0.92 | 0.42 | -1.74 | -0.09 | 0.03
  3rd or greater | 1.20 | 0.36 | 0.49 | 1.91 | <0.01
[Monocytes]
  0 (ref)
  >0 - <96.30 | 0.40 | 0.43 | -0.44 | 1.25 | 0.35
  96.30 - <184.83 | 0.25 | 0.40 | -0.54 | 1.05 | 0.53
  184.83 - <343.70 | -0.42 | 0.48 | -1.36 | 0.53 | 0.39
  343.70 and greater | 0.48 | 0.48 | -0.47 | 1.42 | 0.32
NEFA (mmol/L) | -1.89 | 0.85 | -3.56 | -0.22 | 0.03
Neutrophil: Lymphocyte ratio | -0.70 | 0.48 | -1.63 | 0.24 | 0.14
Season
  Spring (ref)
  Summer | -3.68 | 1.09 | -5.83 | -1.54 | <0.01
  Fall | 0.03 | 0.32 | -0.59 | 0.66 | 0.91
  Winter | 1.95 | 0.62 | 0.74 | 3.15 | <0.01
[Serum Amyloid A]
<10.04 (ref) 10.04 - <29.14 - 0.26 0.37 - 0.99 0.47 0.48 29.14 - <77.00 0.23 0.43 - 0.60 1.07 0.59 77.00 and greater 0.14 0.40 - 0.65 0.93 0.73 Vitamin A (n g/mL) <237 (ref) 237 - <286 - 0.20 0.38 - 0.95 0.55 0.60 286 - <346 0.82 0.38 0.07 1.56 0.03 346 and greater 0.26 0.50 - 0.72 1.24 0.60 Lymphocyte Ratio Interaction <3.17 (ref) 3.17 - <4.00 - 0.87 0.94 - 2.72 0.98 0.36 4.00 - <4.74 - 1.03 0.91 - 2.81 0.76 0.26 4.74 - <5.57 2.37 1.03 0.35 4.39 0.02 5.57 and greater 0.18 0.76 - 1.32 1.68 0.81 Lactation Number and Mo Interaction 1st Lactation (ref) 2nd Lactation and Monocyte Interaction 0 (ref) >0 - <96.30 - 0.67 0.58 - 1.82 0.47 0.25 96.30 - <184.83 0.30 0.54 - 0.77 1.37 0.58 184.83 - <343.70 1.46 0.53 0.24 2.69 0.02 343.70 and greater - 2.14 0.69 - 3.49 - 0.78 <0.01 3rd Lactation and Monocyte Interaction 0 (ref) >0 - <96.30 0.50 0.59 - 0.66 1.65 0.40 96.30 - <184.83 - 0.01 0. 58 - 1.15 1.13 0.98 184.83 - <343.70 0.28 0.61 - 0.91 1.48 0.64 343.70 and greater 0.68 0.65 - 0.59 1.95 0.29 Interaction 1st Lactation (ref) 2nd Lactation and SAA Interaction <9.64 (ref) 9.64 - <29.13 0.94 0.54 - 0.11 1.99 0.08 29.13 - <77.00 - 1.93 0 .64 - 3.20 - 0.67 <0.01 77.00 and greater 0.45 0.52 - 0.56 1.47 0.38 3rd Lactation and SAA Interaction <9.64 (ref) 9.64 - <29.13 - 1.45 0.59 - 2.60 - 0.30 0.01 29.13 - <77.00 0.66 0.64 - 0.61 1.92 0.31 7 7.00 and greater 1.02 0.54 - 0.03 2.07 0.06 Interaction SAA (1st quartile) and Albumin (1st quintile) Interaction (ref) SAA (2nd quartile) and Albumin (2nd quintile) Interaction 0.49 0.77 - 1.03 2.00 0.53 SAA (2nd quart ile) and Albumin (3rd quintile) Interaction 0.28 0.66 - 1.02 1.58 0.67 SAA (2nd quartile) and Albumin (4th quintile) Interaction - 0.70 0.82 - 2.30 0.91 0.39 SAA (2nd quartile) and Albumin (5th quintile) Interaction - 0.69 0.88 - 2.42 1.03 0.43 SAA (3rd quartile) and Albumin (2nd quintile) Interaction - 0.42 0.73 - 1.84 1.00 0.56 SAA (3rd quartile) and Albumin (3rd quintile) Interaction 1.84 0.65 0.56 3.12 0.01 SAA (3rd 
quartile) and Albumin (4th quintile) Interaction | 0.50 | 1.05 | -1.55 | 2.55 | 0.63
SAA (3rd quartile) and Albumin (5th quintile) Interaction | 2.68 | 0.92 | 0.87 | 4.49 | <0.01
SAA (4th quartile) and Albumin (2nd quintile) Interaction | -0.72 | 0.65 | -1.99 | 0.55 | 0.27
SAA (4th quartile) and Albumin (3rd quintile) Interaction | -0.53 | 0.67 | -1.85 | 0.78 | 0.43
SAA (4th quartile) and Albumin (4th quintile) Interaction | 0.12 | 0.89 | -1.63 | 1.87 | 0.90
SAA (4th quartile) and Albumin (5th quintile) Interaction | -0.42 | 0.78 | -1.95 | 1.10 | 0.59
Intercept | 17.63 | 6.90 | 4.10 | 31.15 | <0.01
1 Random intercept estimate (Farm: 1.22, 95% CI = 0.16-9.51)
SE: standard error; CI: confidence interval; AIC: Akaike information criteria; ROC: receiver operator curve; SAA: serum amyloid A; NEFA: non-esterified fatty acids; BCS: body condition score; BHB: beta-hydroxybutyrate; AOP: antioxidant potential; AUC: area under the curve

Table S3.1 Cut-points for biomarkers and best combined sensitivity and specificity
Biomarker | Cut Points | Sensitivity (95% CI) | Specificity (95% CI)
Albumin (g/dL) | | 0.65 (0.54-0.74) | 0.44 (0.37-0.52)
| 2.34-5.24 | 0.48 (0.38-0.59) | 0.46 (0.38-0.53)
| | 0.56 (0.45-0.66) | 0.55 (0.47-0.62)
| | 0.46 (0.36-0.57) | 0.54 (0.47-0.62)
Basophils | | 0.40 (0.30-0.50) | 0.54 (0.46-0.61)
Beta-carotene | | 0.48 (0.38-0.59) | 0.49 (0.42-0.57)
BHB (mg/dL) | | 0.51 (0.40-0.61) | 0.51 (0.43-0.58)
Calcium (mg/dL) | | 0.43 (0.33-0.54) | 0.39 (0.32-0.46)
Cholesterol (mmol/L) | | 0.57 (0.46-0.67) | 0.55 (0.47-0.62)
| | 0.51 (0.40-0.61) | 0.51 (0.44-0.59)
Haptoglobin (mg/mL) | | 0.55 (0.43-0.64) | 0.45 (0.40-0.55)
| | 0.49 (0.39-0.60) | 0.49 (0.41-0.56)
Monocytes (count | | 0.49 (0.39-0.60) | 0.49 (0.42-0.57)
NEFA (mmol/L) | | 0.58 (0.47-0.68) | 0.48 (0.41-0.56)
| | 0.48 (0.38-0.59) | 0.47 (0.40-0.55)
| | 0.48 (0.38-0.59) | 0.49 (0.42-0.57)
SAA | | 0.57 (0.46-0.67) | 0.57 (0.49-0.64)
Vitamin A (ng/mL) | | 0.49 (0.39-0.60) | 0.49 (0.41-0.56)
Vitamin D (ng/mL) | | 0.49 (0.39-0.60) | 0.49 (0.41-0.56)
CI: confidence interval

Table S3.2 Candidate model set for central method
Variable | Coefficient | Model AIC | Model DSS | Shrinkage Parameter
Model 1
Albumin (g/dL) | 2.56 | 76.64 | 1.43 | 10.82
Calcium (mg/dL) | -1.93
Vitamin A (ng/mL) | 0.0005
Intercept | 8.69
Model 2
Albumin (g/dL) | 2.70 | 77.21 | 1.51 | 6.52
Calcium (mg/dL) | -1.97
Intercept | 8.73
Model 3
Albumin (g/dL) | 2.60 | 78.14 | 1.54 | 6.17
Calcium (mg/dL) | -2.33
| -0.01
Intercept | 12.87
Model 4
Albumin (g/dL) | 2.86 | 78.17 | 1.56 | 4.91
Calcium (mg/dL) | -2.23
| -0.0001
Intercept | 11.14
Model 5
Albumin (g/dL) | 2.91 | 78.41 | 1.55 | 4.74
Band cells | 0.002
Calcium (mg/dL) | -2.23
Intercept | 10.41
Model 6
Albumin (g/dL) | 2.65 | 78.48 | 1.50 | 8.16
Calcium (mg/dL) | -1.92
Vitamin D (ng/mL) | -0.01
Intercept | 8.91
Model 7
Albumin (g/dL) | 2.93 | 78.50 | 1.51 | 5.49
Beta-carotene | 0.02
Calcium (mg/dL) | -2.23
Intercept | 10.38
Model 8
Albumin (g/dL) | 2.35 | 78.51 | 1.51 | 9.05
BHB (mg/dL) | -0.09
Calcium (mg/dL) | -1.74
Intercept | 8.15
Model 9
Albumin (g/dL) | 2.70 | 78.67 | 1.49 | 8.26
BCS | -0.17
Calcium (mg/dL) | -2.03
Intercept | 9.95
Model 10
Albumin (g/dL) | 2.28 | 78.98 | 1.44 | 13.50
Calcium (mg/dL) | -1.72
| 0.00
Intercept | 7.86
Model 11
Albumin (g/dL) | 2.79 | 79.02 | 1.51 | 6.76
Calcium (mg/dL) | -1.91
| 0.0004
Intercept | 7.74
Model 12
Albumin (g/dL) | 2.23 | 79.04 | 1.48 | 10.05
| -0.002
Calcium (mg/dL) | -1.69
Intercept | 7.73
Model 13
Albumin (g/dL) | 2.56 | 79.11 | 1.49 | 7.34
Calcium (mg/dL) | -1.80
| -0.0001
Intercept | 7.77
Model 14
Albumin (g/dL) | 2.55 | 79.12 | 1.47 | 9.21
Calcium (mg/dL) | -1.87
Haptoglobin (mg/mL) | -0.20
Intercept | 8.33
Model 15
Albumin (g/dL) | 2.33 | 79.19 | 1.41 | 13.91
Calcium (mg/dL) | -1.73
NEFA (mmol/L) | -0.07
Intercept | 7.72
Model 16
Albumin (g/dL) | 2.54 | 79.20 | 1.51 | 7.14
| 0.03
Calcium (mg/dL) | -1.93
Intercept | 8.77
Model 17
Albumin (g/dL) | 2.48 | 79.21 | 1.48 | 8.92
| 0.02
Calcium (mg/dL) | -1.90
Intercept | 8.72
AIC: Akaike information criteria; DSS: Dawid-Sebastiani score. All DSS NS (P>0.05)

Table S3.3 AIC weights of candidate models using the central method for model averaging
Model | AIC | Δi (AICi - AICbest) | exp(-Δi/2) | w
Model 1 | 76.64 | 0 | 1.00 | 0.13
Model 2 | 76.98 | 0.34 | 0.84 | 0.11
Model 3 | 77.21 | 0.57 | 0.75 | 0.10
Model 4 | 78.14 | 1.50 | 0.47 | 0.06
Model 5 | 78.17 | 1.53 | 0.47 | 0.06
Model 6 | 78.41 | 1.77 | 0.41 | 0.05
Model 7 | 78.48 | 1.84 | 0.40 | 0.05
Model 8 | 78.50 | 1.86 | 0.39 | 0.05
Model 9 | 78.51 | 1.87 | 0.39 | 0.05
Model 10 | 78.67 | 2.03 | 0.36 | 0.05
Model 11 | 78.98 | 2.34 | 0.31 | 0.04
Model 12 | 79.02 | 2.38 | 0.30 | 0.04
Model 13 | 79.04 | 2.40 | 0.30 | 0.04
Model 14 | 79.11 | 2.47 | 0.29 | 0.04
Model 15 | 79.12 | 2.48 | 0.29 | 0.04
Model 16 | 79.19 | 2.55 | 0.28 | 0.04
Model 17 | 79.20 | 2.56 | 0.28 | 0.04
AIC: Akaike information criteria; w: weight

Table S3.4 Internal validation of predicted versus observed counts calculated from the central method model set
Model | Slope (90% CI) | Intercept (90% CI) | D²
Model 1 | 1.29 (0.88, 1.72) | -0.47 (-1.16, 0.17) | 0.64
Model 2 | 1.17 (0.78, 1.60) | -0.29 (-0.96, 0.31) | 0.62
Model 3 | 1.17 (0.77, 1.60) | -0.29 (-1.03, 0.36) | 0.65
Model 4 | 1.13 (0.73, 1.50) | -0.22 (-0.90, 0.40) | 0.63
Model 5 | 1.13 (0.76, 1.54) | -0.22 (-0.92, 0.42) | 0.63
Model 6 | 1.21 (0.71, 1.61) | -0.35 (-1.05, 0.48) | 0.64
Model 7 | 1.18 (0.84, 1.61) | -0.32 (-0.94, 0.29) | 0.62
Model 8 | 1.18 (0.75, 1.62) | -0.30 (-1.12, 0.35) | 0.62
Model 9 | 1.22 (0.79, 1.65) | -0.37 (-1.09, 0.36) | 0.62
Model 10 | 1.28 (0.88, 1.72) | -0.48 (-1.25, 0.21) | 0.61
Model 11 | 1.17 (0.76, 1.58) | -0.30 (-0.96, 0.36) | 0.62
Model 12 | 1.20 (0.80, 1.55) | -0.34 (-0.90, 0.27) | 0.60
Model 13 | 1.19 (0.83, 1.61) | -0.32 (-0.99, 0.30) | 0.63
Model 14 | 1.23 (0.85, 1.66) | -0.40 (-1.07, 0.18) | 0.62
Model 15 | 1.32 (0.90, 1.79) | -0.55 (-1.32, 0.13) | 0.62
Model 16 | 1.16 (0.79, 1.61) | -0.28 (-0.96, 0.34) | 0.61
Model 17 | 1.21 (0.81, 1.67) | -0.36 (-1.08, 0.29) | 0.61
Averaged | 1.23 (0.82, 1.64) | -0.37 (-1.06, 0.29) | 0.64
CI: confidence interval

Table S3.5 Variables eligible for combined model set
Variable | AIC
Count
Albumin | 82.72
NEFA | 84.95
Vitamin D | 86.11
BHB | 86.21
BCS | 86.86
Heifer | 86.95
Cholesterol | 86.99
Calcium | 87.15
AOP | 87.22
Dispersion
Alpha tocopherol | 86.54
Band cells | 85.49
Basophils | 82.30
BCS | 85.04
BHB | 83.02
Calcium | 85.32
Cholesterol | 87.30
Eosinophils | 86.91
Lymphocytes | 85.15
Neutrophils | 87.37
Vitamin D | 86.16
Central
Albumin | 86.03
Basophils | 83.29
BCS | 86.81
Beta-carotene | 87.03
BHB | 84.22
Calcium | 82.90
Lymphocytes | 85.00
Monocytes | 87.29
Vitamin A | 86.27
AIC: Akaike information criteria; NEFA: non-esterified fatty acids; AOP: antioxidant potential; BCS: body condition score; BHB: beta-hydroxybutyrate

Table S3.6 Candidate model set for combined method
Variable | Coefficient | Model AIC | Mean Model DSS | Shrinkage Parameter
Model 1
Albumin (g/dL) (M) | 2.56 | 76.64 | 1.43 | 10.82
Calcium (mg/dL) (M) | -1.93
Vitamin A (ng/mL) (M) | 0.001
Intercept | 8.69
Model 2
Albumin (g/dL) (M) | 2.70 | 77.21 | 1.51 | 6.52
Calcium (mg/dL) (M) | -1.97
Intercept | 8.73
Model 3
Albumin (C) | 0.05 | 78.26 | 1.49 | 15.96
Albumin (g/dL) (M) | 0.95
Calcium (mg/dL) (M) | -1.58
Intercept | 10.43
Model 4
Albumin (g/dL) (M) | 2.91 | 78.46 | 1.50 | 8.03
Calcium (mg/dL) (M) | -1.73
Cholesterol (mmol/L) (D) | -0.01
Intercept | 6.21
Model 5
Albumin (g/dL) (M) | 2.93 | 78.50 | 1.51 | 5.49
Beta-carotene | 0.02
Calcium (mg/dL) (M) | -2.23
Intercept | 10.38
Model 6
Albumin (g/dL) (M) | 2.35 | 78.51 | 1.51 | 9.05
BHB (mg/dL) (M) | -0.09
Calcium (mg/dL) (M) | -1.74
Intercept | 8.15
Model 7
Albumin (g/dL) (M) | 2.22 | 78.59 | 1.51 | 9.06
| -0.003
Calcium (mg/dL) (M) | -1.67
Intercept | 7.56
Model 8
Albumin (g/dL) (M) | 2.70 | 78.67 | 1.49 | 8.26
BCS (M) | -0.17
Calcium (mg/dL) (M) | -2.03
Intercept | 9.95
Model 9
Albumin (g/dL) (M) | 2.48 | 78.73 | 1.53 | 7.91
Calcium (mg/dL) (M) | -1.91
Heifer (C) | 0.10
Intercept | 8.42
Model 10
Albumin (g/dL) (M) | 2.35 | 78.81 | 1.41 | 13.89
Calcium (mg/dL) (M) | -1.77
Vitamin D (ng/mL) (D) | -0.001
Intercept | 8.07
M: mean; D: dispersion; C: count; AIC: Akaike information criteria; DSS: Dawid-Sebastiani score. All DSS NS (P>0.05)

Table S3.7 AIC weights of candidate models using the combined method for model averaging
Model | AIC | Δi (AICi - AICbest) | exp(-Δi/2) | w
Model 1 | 76.64 | 0.00 | 1.00 | 0.21
Model 2 | 77.21 | 0.57 | 0.75 | 0.16
Model 3 | 78.26 | 1.62 | 0.44 | 0.09
Model 4 | 78.46 | 1.82 | 0.40 | 0.08
Model 5 | 78.50 | 1.86 | 0.39 | 0.08
Model 6 | 78.51 | 1.87 | 0.39 | 0.08
Model 7 | 78.59 | 1.95 | 0.38 | 0.08
Model 8 | 78.67 | 2.03 | 0.36 | 0.08
Model 9 | 78.73 | 2.09 | 0.35 | 0.07
Model 10 | 78.81 | 2.17 | 0.34 | 0.07
AIC: Akaike information criteria; w: weight

Table S3.8 Internal validation of predicted versus observed counts calculated from the combined method model set
Model | Slope (90% CI) | Intercept (90% CI) | D²
Model 1 | 1.29 (0.88, 1.72) | -0.47 (-1.16, 0.17) | 0.64
Model 2 | 1.17 (0.78, 1.60) | -0.29 (-0.96, 0.31) | 0.62
Model 3 | 1.23 (0.78, 1.61) | -0.39 (-1.01, 0.33) | 0.63
Model 4 | 1.22 (-1.01, 0.22) | -0.38 (-1.01, 0.22) | 0.66
Model 5 | 1.18 (0.84, 1.61) | -0.32 (-0.94, 0.29) | 0.64
Model 6 | 1.18 (0.75, 1.62) | -0.30 (-1.12, 0.35) | 0.62
Model 7 | 1.17 (0.83, 1.50) | -0.29 (-0.80, 0.25) | 0.61
Model 8 | 1.22 (0.79, 1.65) | -0.37 (-1.09, 0.36) | 0.62
Model 9 | 1.15 (0.77, 1.54) | -0.25 (-0.90, 0.36) | 0.62
Model 10 | 1.34 (0.87, 1.81) | -0.57 (-1.33, 0.19) | 0.63
Averaged | 1.24 (0.83, 1.65) | -0.42 (-1.08, 0.26) | 0.64
CI: confidence interval

Table S5.1 Data dictionary
Raw
Variables
Variable | Unit | Data type | Format | Description
LstcId | | integer | | Cow ID
FarmId | | integer (1-5) | | Farm ID
Cohort | | integer (1-4) | | Cohort ID
Eartag | | integer | | Eartag ID
LactationNumber | | integer | | Current lactation number
DryEnrollDate | | | DDMMYYYY | Date of enrollment/dry off
CalvingDate | | | DDMMYYYY | Date of calving
AnyDz | | integer | | Number of diseases in current lactation
MilkFever | | integer (0 or 1) | 1 = had milk fever; 0 = did not have milk fever | Diagnosed with milk fever in current lactation
Ketosis | | integer (0 or 1) | 1 = had ketosis; 0 = did not have ketosis | Diagnosed with ketosis in current lactation
Metritis | | integer (0 or 1) | 1 = had metritis; 0 = did not have metritis | Diagnosed with metritis in current lactation
DA | | integer (0 or 1) | 1 = had displaced abomasum; 0 = did not have displaced abomasum | Diagnosed with displaced abomasum in current lactation
Pneumonia | | integer (0 or 1) | 1 = had pneumonia; 0 = did not have pneumonia | Diagnosed with pneumonia in current lactation
RP | | integer (0 or 1) | 1 = had retained placenta; 0 = did not have retained placenta | Diagnosed with retained placenta in current lactation
Comments | | string | | Additional comments from sample collectors
mastitis | | integer (0 or 1) | 1 = had mastitis; 0 = did not have mastitis | Diagnosed with mastitis in current lactation
lameness | | integer (0 or 1) | 1 = had lameness; 0 = did not have lameness | Diagnosed with lameness in current lactation
DzSumAll | | integer | | Number of diseases diagnosed in current lactation
AnyDzCalc | | integer (0 or 1) | 1 = had at least one disease and/or adverse event; 0 = had 0 diseases and adverse events | Was diagnosed with at least one disease in current lactation, or died, had an abortion, or the calf died
died | | integer (0 or 1) | 1 = died; 0 = lived | Died during study
abort | | integer (0 or 1) | 1 = aborted; 0 = did not abort | Aborted calf
calfdead | | integer (0 or 1) | 1 = died; 0 = lived | Calf died
monthdry | | integer (1-12) | January (1) through December (12) | The month they were dried off
season | | integer (1-4) | 1 = spring; 2 = summer; 3 = fall; 4 = winter | The season they were dried off
rounddrybcs | | decimal (0.5 increments) | | Rounded body condition score
Ntot | count (per | decimal | | Neutrophil count
Ltot | count (per | decimal | | Lymphocyte count
Mtot | count (per | decimal | | Macrophage count
Etot | count (per | decimal | | Eosinophil count
Btot | count (per | decimal | | Basophil count
BDtot | count (per | decimal | | Banded neutrophil cell count
albumin | g/dL | decimal | | Serum albumin
calcium | mg/dL | decimal | | Serum calcium
cholesterol | mg/dL | decimal | | Serum cholesterol
nefa | mEq/L | decimal | | Serum non-esterified fatty acid
bhb | mg/dL | decimal | | Serum betahydroxybutyrate
saaugml | ug/mL | decimal | | Serum amyloid A
haptoglobin | mg/mL | decimal | | Serum haptoglobin
ros | RFU/uL | decimal | | Serum reactive oxygen species and reactive nitrogen species
aopugul | ug/mL | decimal | | Serum antioxidant potential
vitamind | ng/mL | decimal | | Serum vitamin D
vitamina | ng/mL | decimal | | Serum vitamin A
alphatocopherol | ug/mL | decimal | | Serum alphatocopherol
betacarotene | ug/mL | decimal | | Serum betacarotene

Calculated variables
Variable | Unit | Data type | Format | Description
nCowsLstc | | integer | | Number of cows (lactation number>1) by Farm
nCowsLstc2 | | integer | | Number of cows (lactation number>1) by cohort
nTotalLstc2 | | integer | | Number of cows+heifers by cohort
nDzLstc2 | | integer | | Number of cows+heifers that were diagnosed with a disease in the current lactation by cohort
DzPrev2 | | ratio | | Current disease prevalence by cohort
LactCat | | integer (1-3) | 1 = 1st lactation, 2 = 2nd lactation, 3 = 3+ lactation | Current lactation number
Cohort2 | | integer (1-18) | | Cohort identifier
NLratio | | ratio | | Neutrophil:Lymphocyte ratio
bcscat | | integer (0 or 1) | 1 = dry bcs is 3.5 or 4.0; 0 = dry bcs not 3.5 or 4.0 | Categorical BCS
lognefasq | | decimal | | nefa transformed by log then square transformations
cholesterol42 | | integer (0 or 1) | 1 = cholesterol levels within 2nd quartile; 0 = not in 2nd quartile | Cholesterol categorized into 4 quartiles
cholesterol43 | | integer (0 or 1) | 1 = cholesterol levels within 3rd quartile; 0 = not within 3rd quartile | Cholesterol categorized into 4 quartiles
cholesterol44 | | integer (0 or 1) | 1 = cholesterol levels within 4th quartile; 0 = not within 4th quartile | Cholesterol categorized into 4 quartiles
saaugml42 | | integer (0 or 1) | 1 = SAA levels within 2nd quartile; 0 = not in 2nd quartile | Serum amyloid A categorized into 4 quartiles
saaugml43 | | integer (0 or 1) | 1 = SAA levels within 3rd quartile; 0 = not within 3rd quartile | Serum amyloid A categorized into 4 quartiles
saaugml44 | | integer (0 or 1) | 1 = SAA levels within 4th quartile; 0 = not within 4th quartile | Serum amyloid A categorized into 4 quartiles
MtotCat2 | | integer (0 or 1) | 1 = >0 - <96.3; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
MtotCat3 | | integer (0 or 1) | 1 = 96.3 - <184.83; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
MtotCat4 | | integer (0 or 1) | 1 = 184.83 - <343.7; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
MtotCat5 | | integer (0 or 1) | 1 = 343.7 and greater; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
vitamind52 | | integer (0 or 1) | 1 = vitamin D levels within 2nd quintile; 0 = not in 2nd quintile | Vitamin D categorized into 5 quintiles
vitamind53 | | integer (0 or 1) | 1 = vitamin D levels within 3rd quintile; 0 = not in 3rd quintile | Vitamin D categorized into 5 quintiles
vitamind54 | | integer (0 or 1) | 1 = vitamin D levels within 4th quintile; 0 = not in 4th quintile | Vitamin D categorized into 5 quintiles
vitamind55 | | integer (0 or 1) | 1 = vitamin D levels within 5th quintile; 0 = not in 5th quintile | Vitamin D categorized into 5 quintiles
albumin52 | | integer (0 or 1) | 1 = albumin levels within 2nd quintile; 0 = not in 2nd quintile | Albumin categorized into 5 quintiles
albumin53 | | integer (0 or 1) | 1 = albumin levels within 3rd quintile; 0 = not in 3rd quintile | Albumin categorized into 5 quintiles
albumin54 | | integer (0 or 1) | 1 = albumin levels within 4th quintile; 0 = not in 4th quintile | Albumin categorized into 5 quintiles
albumin55 | | integer (0 or 1) | 1 = albumin levels within 5th quintile; 0 = not in 5th quintile | Albumin categorized into 5 quintiles
haptoglobin52 | | integer (0 or 1) | 1 = haptoglobin levels within 2nd quintile; 0 = not in 2nd quintile | Haptoglobin categorized into 5 quintiles
haptoglobin53 | | integer (0 or 1) | 1 = haptoglobin levels within 3rd quintile; 0 = not in 3rd quintile | Haptoglobin categorized into 5 quintiles
haptoglobin54 | | integer (0 or 1) | 1 = haptoglobin levels within 4th quintile; 0 = not in 4th quintile | Haptoglobin categorized into 5 quintiles
haptoglobin55 | | integer (0 or 1) | 1 = haptoglobin levels within 5th quintile; 0 = not in 5th quintile | Haptoglobin categorized into 5 quintiles
Ltot52 | | integer (0 or 1) | 1 = lymphocyte levels within 2nd quintile; 0 = not in 2nd quintile | Lymphocyte count categorized into 5 quintiles
Ltot53 | | integer (0 or 1) | 1 = lymphocyte levels within 3rd quintile; 0 = not in 3rd quintile | Lymphocyte count categorized into 5 quintiles
Ltot54 | | integer (0 or 1) | 1 = lymphocyte levels within 4th quintile; 0 = not in 4th quintile | Lymphocyte count categorized into 5 quintiles
Ltot55 | | integer (0 or 1) | 1 = lymphocyte levels within 5th quintile; 0 = not in 5th quintile | Lymphocyte count categorized into 5 quintiles
Etot52 | | integer (0 or 1) | 1 = eosinophil levels within 2nd quintile; 0 = not in 2nd quintile | Eosinophil count categorized into 5 quintiles
Etot53 | | integer (0 or 1) | 1 = eosinophil levels within 3rd quintile; 0 = not in 3rd quintile | Eosinophil count categorized into 5 quintiles
Etot54 | | integer (0 or 1) | 1 = eosinophil levels within 4th quintile; 0 = not in 4th quintile | Eosinophil count categorized into 5 quintiles
Etot55 | | integer (0 or 1) | 1 = eosinophil levels within 5th quintile; 0 = not in 5th quintile | Eosinophil count categorized into 5 quintiles
BtotCat2 | | integer (0 or 1) | 1 = >0 - <37.5; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
BtotCat3 | | integer (0 or 1) | 1 = 37.5 - <50.68; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
BtotCat4 | | integer (0 or 1) | 1 = 50.68 - <94; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
BtotCat5 | | integer (0 or 1) | 1 = 94 and greater; 0 = not in this range | Macrophages categorized into 5 categories based on distribution
BDtotCat2 | | integer (0 or 1) | 1 = >0 - <38.05; 0 = not in this range | Banded neutrophils categorized into 5 categories based on distribution
BDtotCat3 | | integer (0 or 1) | 1 = 38.5 - <74.53; 0 = not in this range | Banded neutrophils categorized into 5 categories based on distribution
BDtotCat4 | | integer (0 or 1) | 1 = 74.53 - <126.53; 0 = not in this range | Banded neutrophils categorized into 5 categories based on distribution
BDtotCat5 | | integer (0 or 1) | 1 = 126.53 and greater; 0 = not in this range | Banded neutrophils categorized into 5 categories based on distribution
aopugul52 | | integer (0 or 1) | 1 = AOP levels within 2nd quintile; 0 = not in 2nd quintile | Antioxidant potential categorized into 5 quintiles
aopugul53 | | integer (0 or 1) | 1 = AOP levels within 3rd quintile; 0 = not in 3rd quintile | Antioxidant potential categorized into 5 quintiles
aopugul54 | | integer (0 or 1) | 1 = AOP levels within 4th quintile; 0 = not in 4th quintile | Antioxidant potential categorized into 5 quintiles
aopugul55 | | integer (0 or 1) | 1 = AOP levels within 5th quintile; 0 = not in 5th quintile | Antioxidant potential categorized into 5 quintiles
vitamina42 | | integer (0 or 1) | 1 = vitamin A levels within 2nd quartile; 0 = not in 2nd quartile | Vitamin A categorized into 4 quartiles
vitamina43 | | integer (0 or 1) | 1 = vitamin A levels within 3rd quartile; 0 = not in 3rd quartile | Vitamin A categorized into 4 quartiles
vitamina44 | | integer (0 or 1) | 1 = vitamin A levels within 4th quartile; 0 = not in 4th quartile | Vitamin A categorized into 4 quartiles
alphatocopherol52 | | integer (0 or 1) | 1 = alphatocopherol levels within 2nd quintile; 0 = not in 2nd quintile | Alphatocopherol categorized into 5 quintiles
alphatocopherol53 | | integer (0 or 1) | 1 = alphatocopherol levels within 3rd quintile; 0 = not in 3rd quintile | Alphatocopherol categorized into 5 quintiles
alphatocopherol54 | | integer (0 or 1) | 1 = alphatocopherol levels within 4th quintile; 0 = not in 4th quintile | Alphatocopherol categorized into 5 quintiles
alphatocopherol55 | | integer (0 or 1) | 1 = alphatocopherol levels within 5th quintile; 0 = not in 5th quintile | Alphatocopherol categorized into 5 quintiles
ros52 | | integer (0 or 1) | 1 = ROS levels within 2nd quintile; 0 = not in 5th quintile | Reactive oxygen species categorized into 5 quintiles
ros53 | | integer (0 or 1) | 1 = ROS levels within 3rd quintile; 0 = not in 3rd quintile | Reactive oxygen species categorized into 5 quintiles
ros54 | | integer (0 or 1) | 1 = ROS levels within 4th quintile; 0 = not in 4th quintile | Reactive oxygen species categorized into 5 quintiles
ros55 | | integer (0 or 1) | 1 = ROS levels within 5th quintile; 0 = not in 5th quintile | Reactive oxygen species categorized into 5 quintiles
betacarotene52 | | integer (0 or 1) | 1 = betacarotene levels within 2nd quintile; 0 = not in 2nd quintile | Betacarotene categorized into 5 quintiles
betacarotene53 | | integer (0 or 1) | 1 = betacarotene levels within 3rd quintile; 0 = not in 3rd quintile | Betacarotene categorized into 5 quintiles
betacartoene54 | | integer (0 or 1) | 1 = betacarotene levels within 4th quintile; 0 = not in 4th quintile | Betacarotene categorized into 5 quintiles
betacarotene55 | | integer (0 or 1) | 1 = betacarotene levels within 5th quintile; 0 = not in 5th quintile | Betacarotene categorized into 5 quintiles
meanbhb | | decimal | | Mean BHB within cohort
sdbhb | | decimal | | Standard deviation of BHB within cohort
meannefa | | decimal | | Mean NEFA within cohort
sdnefa | | decimal | | Standard deviation of NEFA within cohort
meancalcium | | decimal | | Mean calcium within cohort
sdcalcium | | decimal | | Standard deviation of calcium within cohort
meancholesterol | | decimal | | Mean cholesterol within cohort
sdcholesterol | | decimal | | Standard deviation of cholesterol within cohort
medBCS | | decimal | | Median BCS within cohort
manBCS | | decimal | | MAD about the median of BCS within cohort
meanNtot | | decimal | | Mean neutrophil count within cohort
sdNtot | | decimal | | Standard deviation of neutrophil counts within cohort
meanLtot | | decimal | | Mean lymphocyte count within cohort
sdLtot | | decimal | | Standard deviation of lymphocyte counts within cohort
meanNLratio | | decimal | | Mean neutrophil:lymphocyte ratio within cohort
sdNLratio | | decimal | | Standard deviation of the neutrophil:lymphocyte ratio within cohort
meanalbumin | | decimal | | Mean albumin within cohort
sdalbumin | | decimal | | Standard deviation of albumin within cohort
meanhaptoglobin | | decimal | | Mean haptoglobin within cohort
sdhaptoglobin | | decimal | | Standard deviation of haptoglobin within cohort
meanvitamina | | decimal | | Mean vitamin A within cohort
sdvitamina | | decimal | | Standard deviation of vitamin A within cohort
meanMtot | | decimal | | Mean macrophage count within cohort
sdMtot | | decimal | | Standard deviation of macrophage counts within cohort
meanBtot | | decimal | | Mean basophil count within cohort
sdBtot | | decimal | | Standard deviation of basophil counts within cohort
meanBDtot | | decimal | | Mean banded neutrophil count within cohort
sdBDtot | | decimal | | Standard deviation of banded neutrophil counts within cohort
meanEtot | | decimal | | Mean eosinophil count within cohort
sdEtot | | decimal | | Standard deviation of eosinophil counts within cohort
meansaa | | decimal | | Mean serum amyloid A within cohort
sdsaaugml | | decimal | | Standard deviation of serum amyloid A within cohort
meanaopugul | | decimal | | Mean antioxidant potential within cohort
sdaopugul | | decimal | | Standard deviation of antioxidant potential within cohort
meanalphatocopherol | | decimal | | Mean alpha tocopherol within cohort
sdalphatocopherol | | decimal | | Standard deviation of alpha tocopherol within cohort
meanbetacarotene | | decimal | | Mean beta-carotene within cohort
sdbetacartoene | | decimal | | Standard deviation of beta-carotene within cohort
meanros | | decimal | | Mean reactive oxygen species within cohort
sdros | | decimal | | Standard deviation of reactive oxygen species within cohort
meanvitamind | | decimal | | Mean vitamin D within cohort
sdvitamind | | decimal | | Standard deviation of vitamin D within cohort
medLact | | integer | | Median lactation number within cohort
manLact | | integer | | MAD about the median of lactation number within cohort
nNefa | | integer | mmol/L | Number of cows above the NEFA threshold within cohort
nBhb | | integer | mg/dL | Number of cows above the BHB threshold within cohort
nCalcium | | integer | mg/dL | Number of cows above the calcium threshold within cohort
nSaaugml | | integer | Cows counted if serum amyloid | Number of cows above the serum amyloid A threshold within cohort
nBetacarotene | | integer | Cows counted if beta-carotene | Number of cows above the beta carotene threshold within cohort
nHaptoglobin | | integer | Cows counted if haptoglobin | Number of cows above the haptoglobin threshold within cohort
nCholesterol | | integer | mmol/L | Number of cows above the cholesterol threshold within cohort
nNtot | | integer | Cows counted if neutrophils | Number of cows above the neutrophil threshold within cohort
nLtot | | integer | Cows counted if lymphocytes | Number of cows above the lymphocyte threshold within cohort
nEtot | | integer | Cows counted if eosinophils | Number of cows above the eosinophil threshold within cohort
nBtot | | integer | | Number of cows above the basophil threshold within cohort
nBDtot | | integer | Cows counted if banded | Number of cows above the banded neutrophil threshold within cohort
nAopugul | | integer | Cows counted if antioxidant | Number of cows above the antioxidant potential threshold within cohort
nVitamina | | integer | ng/mL | Number of cows above the vitamin A threshold within cohort
nRos | | integer | Cows counted if reactive oxygen | Number of cows above the reactive oxygen species threshold within cohort
nVitamind | | integer | ng/mL | Number of cows above the vitamin D threshold within cohort
nAlbumin | | integer | g/dL | Number of cows above the albumin threshold within cohort
nMtot | | integer | Cows counted if macrophage | Number of cows above the macrophage threshold within cohort
nAlphatocopherol | | integer | Cows counted if alpha
tocopherol outside of this range: 2.34 - 5.24 | Number of cows above the alpha tocopherol threshold within cohort
nHeifer | | integer | | Number of heifers within a cohort
nBCS | | integer | Cows counted if BCS not 3.5 or 4.0 | Number of cows outside the optimal range of BCS within cohort
NM_meanphat | | decimal | | Mean predicted probability from nutrient metabolism models within cohort
NM_sdphat | | decimal | | Standard deviation of predicted probabilities from nutrient metabolism models within cohort
NM_nPhat | | decimal | Cows counted if risk > 0.335 | Number of cows above predicted probability threshold from nutrient metabolism models within cohort
IF_meanphat | | decimal | | Mean predicted probability from inflammation models within cohort
IF_sdphat | | decimal | | Standard deviation of predicted probabilities from inflammation models within cohort
IF_nPhat | | decimal | Cows counted if risk > 0.351 | Number of cows above predicted probability threshold from inflammation models within cohort
OX_meanphat | | decimal | | Mean predicted probability from oxidative stress models within cohort
OX_sdphat | | decimal | | Standard deviation of predicted probabilities from oxidative stress models within cohort
OX_nPhat | | decimal | Cows counted if risk > 0.331 | Number of cows above predicted probability threshold from oxidative stress models within a cohort

APPENDIX B: Supplemental Figures
Figure S1 Data flow diagram for data cleaning and merging
Figure S2 Data flow diagram for univariable and bivariable analyses for Chapter 2
Figure S3 Data flow diagram for multivariable analyses for Chapter 2
Figure S4 Data flow diagram for analyses for Chapter 3
Figure S5 Data flow diagram for analyses for Chapter 4

APPENDIX C: STROBE-VET Checklist (O'Connor et al., 2016)
Table S5.2 The STROBE-Vet statement checklist
Item | STROBE-Vet recommendation
Title and Abstract 1 | (a) Indicate that the study was an observational study and, if applicable, use a common study design term (b) Indicate why the study was conducted, the design, the results, the limitations, and the relevance of the findings
Background/rationale 2 | Explain the scientific background and rationale for the investigation being reported
Objectives 3 | (a) State specific objectives, including any primary or secondary prespecified hypotheses or their absence (b) Ensure that the level of organization^a is clear for each objective and hypothesis
Study design 4 | Present key elements of study design early in the paper
Setting 5 | (a) Describe the setting, locations, and relevant dates, including periods of recruitment, exposure, follow-up, and data collection (b) If applicable, include information at each level of organization
Participants^b 6 | (a) Describe the eligibility criteria for the owners/managers and for the animals, at each relevant level of organization (b) Describe the sources and methods of selection for the owners/managers and for the animals, at each relevant level of organization (c) Describe the method of follow-up (d) For matched studies, describe matching criteria and the number of matched individuals per subject (e.g., number of controls per case)
Variables 7 | (a) Clearly define all outcomes, exposures, predictors, potential confounders, and effect modifiers. If applicable, give diagnostic criteria (b) Describe the level of organization at which each variable was measured (c) For hypothesis-driven studies, the putative causal-structure among variables should be described (a diagram is strongly encouraged)
Data sources/measurement 8* | (a) For each variable of interest, give sources of data and details of methods of assessment (measurement). If applicable, describe comparability of assessment methods among groups and over time (b) If a questionnaire was used to collect data, describe its development, validation, and administration (c) Describe whether or not individuals involved in data collection were blinded, when applicable (d) Describe any efforts to assess the accuracy of the data (including methods used for data cleaning in primary research, or methods used for validating secondary data)
Bias 9 | Describe any efforts to address potential sources of bias due to confounding, selection, or information bias
Study size 10 | (a) Describe how the study size was arrived at for each relevant level of organization (b) Describe how non-independence of measurements was incorporated into sample-size considerations, if applicable (c) If a formal sample-size calculation was used, describe the parameters, assumptions, and methods that were used, including a justification for the effect size selected
Quantitative variables 11 | Explain how quantitative variables were handled in the analyses. If applicable, describe which groupings were chosen, and why
Statistical methods 12 | (a) Describe all statistical methods for each objective, at a level of detail sufficient for a knowledgeable reader to replicate the methods. Include a description of the approaches to variable selection, control of confounding, and methods used to control for non-independence of observations (b) Describe the rationale for examining subgroups and interactions and the methods used (c) Explain how missing data were addressed (d) If applicable, describe the analytical approach to loss to follow-up, matching, complex sampling, and multiplicity of analyses (e) Describe any methods used to assess the robustness of the analyses (e.g., sensitivity analyses or quantitative bias assessment)
Participants 13* | (a) Report the numbers of owners/managers and animals at each stage of study and at each relevant level of organization - e.g., numbers eligible, included in the study, completing follow-up, and analyzed (b) Give reasons for non-participation at each stage and at each relevant level of organization (c) Consider use of a flow diagram and/or a diagram of the organizational structure
Descriptive data on exposures and potential confounders 14* | (a) Give characteristics of study participants (e.g., demographic, clinical, social) and information on exposures and potential confounders by group and level of organization, if applicable (b) Indicate number of participants with missing data for each variable of interest and at all relevant levels of organization (c) Summarize follow-up time (e.g., average and total amount), if appropriate to the study design
Outcome data 15* | (a) Report outcomes as appropriate for the study design and summarize at all relevant levels of organization (b) For proportions and rates, report the numerator and denominator (c) For continuous outcomes, report the number of observations and a measure of variability
Main results 16 | (a) Give unadjusted estimates and, if applicable, adjusted estimates and their precision (e.g., 95% confidence interval). Make clear which confounders and interactions were adjusted. Report all relevant parameters that were part of the model (b) Report category boundaries when continuous variables were categorized (c) If relevant, consider translating estimates of relative risk into absolute risk for a meaningful time period
Other analyses 17 | Report other analyses done, such as sensitivity/robustness analysis and analysis of subgroups
Key results 18 | Summarize key results with reference to study objectives
Strengths and Limitations 19 | Discuss strengths and limitations of the study, taking into account sources of potential bias or imprecision. Discuss both direction and magnitude of any potential bias
Interpretation 20 | Give a cautious overall interpretation of results considering objectives, limitations, multiplicity of analyses, results from similar studies, and other relevant evidence
Generalizability 21 | Discuss the generalizability (external validity) of the study results
Transparency 22 | (a) Funding - Give the source of funding and the role of the funders for the present study and, if applicable, for the original study on which the present article is based (b) Conflicts of interest - Describe any conflicts of interest, or lack thereof, for each author (c) Describe the authors' roles - Provision of an authors' declaration of transparency is recommended (d) Ethical approval - Include information on ethical approval for use of animal and human subjects (e) Quality standards - Describe any quality standards used in the conduct of the research
a Level of organization recognizes that observational studies in veterinary research often deal with repeated measures (within an animal or herd) or animals that are maintained in groups (such as pens and herds); thus, the observations are not statistically independent. This non-independence has profound implications for the design, analysis, and results of these studies.
b for both the animal owner/manager and for the animals themselves.
*Give such information separately for cases and controls in case-control studies and, if applicable, for exposed and unexposed groups in cohort and cross-sectional studies.

APPENDIX D: Stata code
1_IMPORTANDCLEAN.DO
/*
Title: Importing and cleaning datasets
Author: Lauren Wisnieski
Purpose: Import and clean datasets
Revision history:
- Created October 24, 2017
Steps:
1. Import raw data (.xlsx files)
2. Cleaning lab data set
3.
Cleaning current heal th dataset 4. Cleaning prior health dataset 5. C leaning BLV dataset 6. Cleanin g Wright Stain dataset */ ******************************************************************************** * 1) Importing raw data (converting .xlxs to Stata dataset) *We have 7 excel files that need importing and mergi ng. ******************************************************************************** * LabData import excel "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ LabData_24Oct2017.xlsx", /// sheet("AllResultsQuery") firstrow clear sav e "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ LabData_24Oct2017.dta", replace * Prior Health Data import excel "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ PriorHeal th_ 24Oct2017.xlsx", /// sheet("PriorCowHealth ") firstrow clear save "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ PriorHealth_24Oct2017.dta", replace * Current Health Data /*using updated Current Health Data from manual fixes wit h Jeff in January*/ import excel "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ CurrentHealth_1Nov2017.xlsx", /// sheet("Sheet1") firstrow clear save "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ CurrentHealth_1 Nov 2017.dta", replace * Wright Stain Results imp ort excel "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ WrightStain_24Oct2017.xlsx", /// sheet("WrightStainResults") firstrow clear save "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION AN ALYSIS \ Data files \ WrightStain_24Oct2017.dta", rep lace * SampleDates import excel "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ CowSampleDates_24Oct2017.xlsx", /// sheet("Sheet1") firstrow clear save "C: \ Users \ wisni \ Amazon Dr ive \ DISSERTATION ANALYSIS \ Data files \ CowSampleDat es_24Oct2017.dta", replace * BLV results import excel "C: \ Users \ wisni \ 
Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ BLV_24Oct2017.xlsx", /// sheet("BLV") firstrow clear save "C: \ Users \ wisni \ Ama zon Drive \ DISSERTATION ANALYSIS \ Data files \ BLV_24 Oct2017.dta", replace /*Leaving out QScout for now* * QScout results import excel "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Excel files \ QScout_25Oct2017.xlsx", /// sheet("Sheet1") firstrow clear save "C: \ Users \ wisni \ Amazon Drive \ DISSER TATION ANALYSIS \ Data files \ QScout_25Oct2017. dta", replace */ ******************************************************************************** * 2) Cleaning and checking lab data *********************** ************************************************* ******** use "C: \ Users \ wisni \ Amazon Drive \ DI SSERTATION ANALYSIS \ Data files \ LabData_24Oct2017.dta", clear *************************** /*Checking for duplicates*/ *************************** duplicates lis t LstcId Eartag SamplePeriod /*results: no dupli cates found*/ 172 ********************** /*Ren aming variables*/ ********************** rename AlbumingdL albumin rename CalciummgdL calcium rename CholesterolmgdL cholesterol rename NEFAmEqL nefa rename BHBmgd L bhb rename SAAugml saaugml rename HaptoGlobinmg ml haptoglobin rename ROSRNSRFUul ros rename AOPugul aopugul rename AOPBradfordCorrected aopbradford rename VitaminDngml vitamind rename VitaminAngmL vitamina rename AlphaTocopherolugmL alphatocopherol renam e BetacaroteneugmL betacarotene ************** ******** /*Labeling variables*/ ****** **************** replace SamplePeriod="1" if SamplePeriod=="Dry Off" replace SamplePeriod="2" if SamplePeriod=="Close Up" replace SamplePeriod="3" if SamplePeriod=="C + 7" replace SamplePeriod="4" if SamplePeriod=="C + 14" replace SamplePeriod="5" if Samp lePeriod=="C + 21" destring SamplePeriod, replace label define smplprd_lbl 1 "Dry Off" 2 "Close Up" 3 "C+7" 4 "C+14" 5 "C+21" , modify label values SamplePeriod smplp rd_lbl replace FarmId="1" if 
FarmId=="Maple Row" replace FarmId="2" if FarmId=="Halbe rt Dairy Farm" replace FarmId="3" if FarmId=="Green Meadow Farm" replace FarmId="4" if FarmId=="Nobis Dairy Farm" replace FarmId="5" if FarmId=="Sand Creek Dairy" label define farm_lbl 1 "MRD" 2 "HDF" 3 "GMF" 4 " Nobis" 5 "SC", modify destring FarmId, replace label values FarmId farm_lbl ************************************* /*Dropping unneeded variables this variable is in another dataset*/ ******************** ***************** drop LactationNumber ********* *********************** ***** /*Dropping other time points, only need dry off for this analysis*/ ************************************* keep if SamplePeriod==1 save "C: \ Users \ wisni \ Amazon Drive \ DISSE RTATION ANALYSIS \ Data files \ LabDataCLEAN_26Oct201 7.dta", replace ****** ************************************************************************** * 3) Cleaning and checking current health data ************************************************************* ****************** use "C: \ Users \ wisni \ Amazon Dr ive \ DISSERTATION ANALYS IS \ Data files \ CurrentHealth_1Nov2017.dta", clear ************************* /*Checking for duplicates *************************/ duplicates list LstcId FarmId Eartag /*no duplicat es*/ **************************************** ******* *Inputting zeros for cows with no disease /*(Only cows with disease had values inputted*/ *********************************************** replace MastitisCase=0 if MastitisCase==. replace MildMasti tisCase=0 if MildMastitisCase==. replace Moderate MastitisCase=0 if ModerateMastitisCase==. replace SevereMastitisCase=0 if SevereMastitisCase==. replace Ketosis=0 if Ketosis==. replace Metritis=0 if Metritis==. replace MilkFever=0 if MilkFever==. destring DA, repla ce replace DA=0 if DA==. replace Pneumo nia=0 if Pneumonia==. 173 replace Lameness=0 if Lameness==. replace Lameness2=0 if Lameness2==. replace RP=0 if RP==. replace Twins=0 if Twins==. 
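
/* Sketch only, not part of the original do-file: the zero-filling step above
   could also be written as a single loop. This assumes every listed variable
   is numeric at this point (DA has just been destrung). Running the loop after
   the statements above changes nothing, because no system-missing values
   remain in these variables. */
foreach v of varlist MastitisCase MildMastitisCase ModerateMastitisCase ///
        SevereMastitisCase Ketosis Metritis MilkFever DA Pneumonia ///
        Lameness Lameness2 RP Twins {
    * same rule as above: only system missing (.) is recoded to 0
    replace `v' = 0 if `v' == .
}
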
******************************************
/*Combining multiple mastitis and lameness occurrences into either pos/neg
These conditions can happen multiple times but we are only interested whether
they happened or not*/
******************************************
replace mastitis=1 if MildMastitisCase==1 | ModerateMastitisCase==1 | SevereMastitisCase==1 | MastitisCase==1
label variable mastitis "Mastitis"
replace lameness=1 if Lameness==1 | Lameness2==1
label variable lameness "LamenessAll"

*************************************************
/*New variables
***************
- Created new disease variable that includes all transition cow disease outcomes
    - AnyDzCalc2= all diseases without additional outcomes
    - AnyDzCalc= all disease outcomes plus additional outcomes including abort, dead calf, or death
- Creating season variable based on time of dry off
- Cohort2 variable: creates new cohort variable so that each cohort within a farm are their own unit (n=18).
- Rounded BCS so that it is on the 0.5 scale
- Created new BCS variables for herd level analysis
*/
*************************************************
*died
gen died=0
replace died=1 if LstcId==220 | LstcId==88 | LstcId==46 | LstcId==131
*abort
gen abort=0
replace abort=1 if LstcId==163 | LstcId==90 | LstcId==282 | LstcId==10
*calf born dead
gen calfdead=0
replace calfdead=1 if LstcId==80
*added dz outcomes for non-Lstc cows
replace abort=1 if FarmId==2 & Cohort==2 & Eartag==9861
replace died=1 if FarmId==2 & Cohort==2 & Eartag==9994
replace abort=1 if FarmId==1 & Cohort==1 & Eartag==20489

gen AnyDzCalc2=0
replace AnyDzCalc2=1 if mastitis==1 | Ketosis==1 | Metritis==1 | MilkFever==1 | ///
    DA==1 | Pneumonia==1 | lameness==1 | RP==1
label variable AnyDzCalc2 "Any Disease event"
tab AnyDzCalc
replace AnyDzCalc=1 if mastitis==1 | Ketosis==1 | Metritis==1 | MilkFever==1 | ///
    DA==1 | Pneumonia==1 | lameness==1 | RP==1 | died==1 | abort==1 | calfdead==1
label variable AnyDzCalc "Any Disease event + other outcome"
tab AnyDzCalc

*season 1= spring, season 2= summer, season 3= autumn, season 4= winter
gen monthdry= month(DryEnrollDate)
gen season=.
replace season=1 if monthdry==3
replace season=1 if monthdry==4
replace season=1 if monthdry==5
replace season=2 if monthdry==6
replace season=2 if monthdry==7
replace season=2 if monthdry==8
replace season=3 if monthdry==9
replace season=3 if monthdry==10
replace season=3 if monthdry==11
replace season=4 if monthdry==12
replace season=4 if monthdry==1
replace season=4 if monthdry==2

/*Creating cohort2*/
gen Cohort2=.
sort FarmId Cohort
replace Cohort2=1 if FarmId==1 & Cohort==1
replace Cohort2=2 if FarmId==1 & Cohort==2
replace Cohort2=3 if FarmId==1 & Cohort==3
replace Cohort2=4 if FarmId==1 & Cohort==4
replace Cohort2=5 if FarmId==2 & Cohort==1
replace Cohort2=6 if FarmId==2 & Cohort==2
replace Cohort2=7 if FarmId==3 & Cohort==1
replace Cohort2=8 if FarmId==3 & Cohort==2
replace Cohort2=9 if FarmId==3 & Cohort==3
replace Cohort2=10 if FarmId==3 & Cohort==4
replace Cohort2=11 if FarmId==4 & Cohort==1
replace Cohort2=12 if FarmId==4 & Cohort==2
replace Cohort2=13 if FarmId==4 & Cohort==3
replace Cohort2=14 if FarmId==4 & Cohort==4
replace Cohort2=15 if FarmId==5 & Cohort==1
replace Cohort2=16 if FarmId==5 & Cohort==2
replace Cohort2=17 if FarmId==5 & Cohort==3
replace Cohort2=18 if FarmId==5 & Cohort==4

/*Rounding bcs*/
gen rounddrybcs= round(DryBCS, .5)

/*BCS variables for herd level analysis*/
sort FarmId
egen maxbcs= max(rounddrybcs), by(FarmId)
egen minbcs= min(rounddrybcs), by(FarmId)
gen rangebcs= maxbcs - minbcs
egen meanbcs= mean(rounddrybcs), by(FarmId)
egen medianbcs= median(rounddrybcs), by(FarmId)
egen sdbcs= sd(rounddrybcs), by(FarmId)
gen varbcs= sdbcs^2
tab maxbcs
tab minbcs
tab rangebcs
tab medianbcs
tab varbcs
sort Cohort2
egen maxbcs2= max(rounddrybcs), by(Cohort2)
egen minbcs2= min(rounddrybcs), by(Cohort2)
gen rangebcs2= maxbcs2 - minbcs2
egen meanbcs2= mean(rounddrybcs), by(Cohort2)
egen medianbcs2= median(rounddrybcs), by(Cohort2)
egen sdbcs2= sd(rounddrybcs), by(Cohort2)
gen varbcs2= sdbcs2^2

/*Making BCS categorical*/
tab rounddrybcs
gen bcscat= 0
replace bcscat= 1 if rounddrybcs==3.5 | rounddrybcs==4.0

*******************************************
/*Drop unneeded variables
*******************************************/
drop DryBCS CloseUpBCS DIMBCS CalvingType BirthDate

save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\CurrentHealthCLEAN_27Oct2017.dta", replace

********************************************************************************
* 4) Cleaning and checking prior health data
********************************************************************************
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\PriorHealth_24Oct2017.dta", clear

*************************
/*Checking for duplicates
*************************/
duplicates list LstcId FarmId Eartag /*no duplicates*/

*********************
/*Renaming variable*/
*********************
rename TotalLifeMilk pTotalLifeMilk

******************************
/*Dropping unneeded variable*/
******************************
drop DateAdded

save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\PriorHealthCLEAN_24Oct2017.dta", replace

********************************************************************************
* 5) Cleaning and checking BLV data
********************************************************************************
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\BLV_24Oct2017.dta", clear

***********************************
/*Re-formatting variables**********
***********************************/
* LSTC
* Pull out first 3 characters
gen lstc1 = substr(SampleID, 1, 3)
* Remove characters ("C", "D", "F", "Dr", "Cl", "Fr")
gen lstc2=lstc1
replace lstc2 = subinstr(lstc2,"C","",.)
replace lstc2 = subinstr(lstc2,"D","",.)
replace lstc2 = subinstr(lstc2,"F","",.) /* Takes out F */
replace lstc2 = subinstr(lstc2,"l","",.) /* takes out left over l from Cl etc */
replace lstc2 = subinstr(lstc2,"r","",.)
destring lstc2, gen(LstcId)
drop lstc1 lstc2
sort LstcId

* Sample date
gen date = substr(CowIDDate, -8, 8)
gen date1=date
replace date1="12-09-14" if date==" on vial"
replace date1="11-19-14" if date=="/19/2014"
replace date1="11/24/14" if date=="/24/2014"
replace date1="11/29/14" if date=="0/9/2014"
replace date1="12/29/14" if date=="/29/2014"
gen date2 = date(date1, "MD20Y")
format %td date2
rename date2 SampleDate

* eartag number
* Pull out first 5 characters
gen eartag1 = substr(CowIDDate, 1, 5)
sort eartag1
replace eartag1="9747" if LstcId==26
replace eartag1="9800" if LstcId==27
replace eartag1="9845" if LstcId==28
replace eartag1="9926" if LstcId==29
replace eartag1="8567" if LstcId==50
replace eartag1="630" if LstcId==56
replace eartag1="9847" if LstcId==58
replace eartag1="18329" if LstcId==71
replace eartag1="286" if LstcId==159
replace eartag1="500" if LstcId==42
replace eartag1="547" if LstcId==95
replace eartag1="553" if LstcId==44
replace eartag1="798" if LstcId==187
replace eartag1="853" if LstcId==274
replace eartag1="876" if LstcId==93
replace eartag1="918" if LstcId==89
replace eartag1="8332" if LstcId==48
gen eartag2 = strrtrim(eartag1)
destring eartag2, gen(Eartag)

* Sample Period
gen sp1=SampleID
* Deleting all numbers
replace sp1 = subinstr(sp1,"0","",.)
replace sp1 = subinstr(sp1,"1","",.)
replace sp1 = subinstr(sp1,"2","",.)
replace sp1 = subinstr(sp1,"3","",.)
replace sp1 = subinstr(sp1,"4","",.)
replace sp1 = subinstr(sp1,"5","",.)
replace sp1 = subinstr(sp1,"6","",.)
replace sp1 = subinstr(sp1,"7","",.)
replace sp1 = subinstr(sp1,"8","",.)
replace sp1 = subinstr(sp1,"9","",.)
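
/* Sketch only, not part of the original do-file: assuming Stata 14 or later
   (where the Unicode regular-expression functions are available), the ten
   subinstr() calls above could be collapsed into one statement that removes
   every digit from sp1. Run after the statements above it has no further
   effect, since no digits are left in sp1. */
replace sp1 = ustrregexra(sp1, "[0-9]", "")
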
rename sp1 SamplePeriod
replace SamplePeriod="1" if SamplePeriod=="Dry Off"
replace SamplePeriod="2" if SamplePeriod=="Close Up"
replace SamplePeriod="3" if SamplePeriod=="Fresh"
destring SamplePeriod, replace
label define smplprd_lbl 1 "Dry Off" 2 "Close Up" 3 "C+7", modify
label values SamplePeriod smplprd_lbl

/********************************************
* Creating new variables
- Created BLV variables based on cut-offs used for diagnosis
********************************************/
gen blv_co1=.
replace blv_co1=0 if BLVELISAValue<1
replace blv_co1=1 if BLVELISAValue>=1
label define blv_lbl 0 "Neg" 1 "Pos"
label values blv_co1 blv_lbl
gen blv_co05=.
replace blv_co05=0 if BLVELISAValue<0.5
replace blv_co05=1 if BLVELISAValue>=0.5
label values blv_co05 blv_lbl

*************************************
/*Dropping other time points,
only need dry off for this analysis*/
*************************************
keep if SamplePeriod==1

**********************************
* Renaming variables
**********************************
rename BLVELISAValue elisa

******************************
/*Dropping unneeded variable*/
******************************
drop SampleID CowIDDate date date1 SampleDate eartag1 eartag2 BLVResult

*************************
/*Checking for duplicates
*************************/
duplicates list LstcId Eartag SamplePeriod /*no duplicates*/

save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\BLVCLEAN_24Oct2017.dta", replace

********************************************************************************
* 6) Cleaning and checking Wright Stain
********************************************************************************
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\WrightStain_24Oct2017.dta", clear

/******************************************
*Merging with SampleDate file to match samples
with the corresponding SamplePeriod
******************************************/
merge 1:1 LstcId SampleDate using ///
    "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\CowSampleDates_24Oct2017.dta"
save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\WrightSampleDates_28Oct2017.dta", replace
sort _merge

/*Fixing sample period variable*/
replace SamplePeriod="1" if SamplePeriod=="Dry Off"
replace SamplePeriod="2" if SamplePeriod=="Close Up"
replace SamplePeriod="3" if SamplePeriod=="C+7"
destring SamplePeriod, replace
label define smplprd_lbl 1 "Dry Off" 2 "Close Up" 3 "C+7", modify
label values SamplePeriod smplprd_lbl

*************************************
/*Dropping other time points,
only need dry off for this analysis*/
*************************************
keep if SamplePeriod==1 | SamplePeriod==.

/*Fixing missing sample periods*/
replace SamplePeriod=3 if LstcId==10 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==46 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==47 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==48 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==49 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==50 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==51 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==52 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==57 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==58 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==59 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==60 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==61 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==62 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==63 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==64 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==65 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==66 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==67 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==68 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==69 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==70 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==71 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==72 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==73 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==74 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==75 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==76 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==77 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==78 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==79 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==91 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==92 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==93 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==94 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==95 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==96 & SamplePeriod==.
replace SamplePeriod=2 if LstcId==101 & SamplePeriod==.
replace SamplePeriod=2 if LstcId==103 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==135 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==136 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==137 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==138 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==139 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==140 & SamplePeriod==.
replace SamplePeriod=1 if LstcId==142 & SamplePeriod==.

*********************
/*Fixing duplicates*/
*********************
keep if SamplePeriod==1 | SamplePeriod==.
sort SamplePeriod
duplicates drop LstcId if SamplePeriod==., force
duplicates list LstcId SamplePeriod
drop if N==.
duplicates list LstcId SamplePeriod
drop if LstcId==59 & N==99
duplicates list LstcId
replace SamplePeriod=1 if LstcId==80
drop if SamplePeriod==.
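
/* Sketch only, not part of the original do-file: the one-statement-per-cow
   list under "Fixing missing sample periods" above could be condensed with
   inrange() and inlist(). This assumes LstcId holds whole numbers only. If
   used, these three statements would take the place of that list; at this
   point in the file, after the final drop, they change no observations. */
replace SamplePeriod=1 if SamplePeriod==. & ///
    (inrange(LstcId,46,52) | inrange(LstcId,57,79) | inrange(LstcId,91,96) | ///
     inrange(LstcId,135,140) | LstcId==142)
replace SamplePeriod=2 if SamplePeriod==. & inlist(LstcId,101,103)
replace SamplePeriod=3 if SamplePeriod==. & LstcId==10
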
/********************** ****************** 178 *Creating new variables - calculating WBC differentials and totals from counts *****************************************/ replace Npct=Npct/100 re place Lpct=Lpct/100 replace Mpct=Mpct/100 replace Epct=Epct/100 gen Bpct = Baso2/100 gen BD pct = Band2/100 gen OTHpct = Other2/100 drop Baso2 Band2 Other2 gen Ntot = Npct*TotalmlBLD*1000 gen Ltot = Lpct*TotalmlBLD*1000 gen Mtot = Mpct*TotalmlBLD*1000 g en Etot = Epct*TotalmlBLD*1000 gen Btot = Bpct*To talmlBLD*1000 gen BDtot = BDpct*TotalmlBLD*1000 gen OTHtot = OTHpct*TotalmlBLD*1000 gen NLratio= Ntot/Ltot /****************************** *Dropping unneeded variables ******************************/ drop SampleDate procytetotal TotalmlBLD DateAdded _merge CowSampleId save "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ WrightStainCLEAN_24Oct2017.dta", replace 2 _ MERGE.DO /* Title: Merging datasets Author: Lauren Wisnieski Purpose: Mer ge excel datasets Revision history: - Created October 24, 2017 Steps: 1. Merge prior and current health 2. Merge BLV data 3. Merge Wright stain data 4. Merge Lab data */ /************************************************************************ ******* * 1. MERGING PRIOR AND CURRENT HEALTH *** ******************** ********************************************************/ use "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ PriorHealthCLEAN_24Oct2017.dta", clear sort FarmId LstcId Ea rtag merge 1:1 FarmId LstcId Eartag using "C: \ U sers \ wisni \ Amazon Dr ive \ DISSERTATION ANALYSIS \ Data files \ CurrentHealthCLEAN_27Oct2017.dta", /*Checking for merge errors*/ sort _merge LstcId FarmId /*These are missing current health. Confirmed this is correct. LstcId FarmId Eartag . 5 1 6539 . 5 1 6384 . 5 1 562 1*/ /*These are missing prior health. Confirmed this is corrrect LstcId FarmId Cohort Eartag 159 5 2 286 160 5 2 7151 161 5 2 7007 162 5 2 7180 163 5 2 6980 . 1 3 17644 . 1 3 17013 179 . 
1 3 19277 . 1 3 15044 . 1 3 13 663 . 1 3 19490 . 1 3 19721 . 1 3 19843 . 1 3 17371 . 1 3 19642*/ /*Checking for duplicates*/ duplicates list LstcId Eartag FarmId /*no duplicates*/ /*Creating pDz categorical variable*/ gen pAnyDzCalc= . replace pAnyDzCalc=1 if pAnyDz>0 re place pAnyDz Calc=0 if pAnyDz==0 replace pAnyDzCalc=. if pAnyDz==. drop _merge save "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ AllCowHealth_31Oct2017.dta", replace /****************** ************************************************* ************ * 2. MERGING PRIOR AND CURRENT HEALTH WITH BLV DATA *******************************************************************************/ use "C: \ Users \ wisni \ Amazon Drive \ DISSERTA TION ANALYSIS \ Data files \ AllCowHealth_31Oct2017.dta", clear sort L stcId Eartag merge m:m LstcId Eartag using "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ BLVCLEAN_24Oct2017.dta", /*checking for merge mistakes*/ sort _merge LstcId /*m issing BLV data. th ese are correct. LstcId FarmId Eartag 12 1 2014 5 30 2 9965 38 3 3190 46 2 7014 47 2 7790 88 3 9373 110 4 8482 131 4 6008 163 5 6980 189 3 2239 220 3 3777 256 4 8651*/ /*Checking for duplicates*/ duplicates list LstcId Eartag FarmId /*no duplicates*/ drop _merge save "C: \ Users \ wisni \ Amazon Drive \ D ISSERTATION ANALYSIS \ Data files \ AllHealthBLV_31Oct2017.dta", replace /******************************************************************************* * 3. MERGING PRIOR, CURRENT HE ALTH & BLV DATA WITH WRIGH T DATA ***************************************** **************************************/ use "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ AllHealthBLV_31Oct2017.dta", clear sort LstcId merge m:1 LstcId using "C: \ Users \ wisni \ Amazon Drive \ D ISSERTATION ANALYSIS \ Data files \ WrightStainCLEAN_ 24Oct2017.dta", /*checking for merging errors. 
none found*/ sort _merge LstcId /*checking for duplicates*/ duplicates list FarmId LstcId Eartag /*no duplicates*/ drop _merge s ave "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ AllHealth BLVWright_31Oct2017.dta", replace /******************************************************************************* * 4. MERGING PRIOR, CURRENT HEALTH, BLV DATA, & WRIGHT DATA WITH LAB DATA *************** ************************************************* ***************/ 180 use "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ AllHealthBLVWright_31Oct2017.dta", clear sort LstcId Eartag SamplePeriod merge m:1 LstcId using "C: \ Users \ wisni \ Amazon Driv e \ DISSERTATION ANALYSIS \ Data files \ LabDataCLEAN_2 6Oct2017.dta" /*checking for merging errors. none found*/ sort _merge LstcId /*checking for duplicates*/ duplica tes list FarmId LstcId Eartag drop _merge save "C: \ Users \ wisni \ Amazon Drive \ DISSERTATIO N ANALYSIS \ Data files \ FullDataset_1Nov2017.dta", replace 3 _ CHECK.DO /* Title: Checking Data Author: Lauren Wisnieski Purpose: Checking final data Revision history : - Created October 28, 2017 Steps: Step 1: Censor cows Step 2: Checking missing da ta Step 3: Create cohort level variables Step 4 : Drop unneeded variables Step 5: Descriptive Statistics Step 6: Create data dictionary */ use "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ FullDataset_1Nov2017.dta", clear ********* ************ /*1. CENSORING COWS*/ ************* ********* /* Non lstc cows with missing dz data (lost to f ollow up) *Note: Farm Id 4 recorded missing data differently */ drop if LstcId==. & CalvingDate==. & FarmId==1 drop if LstcId==. & CalvingDate==. & FarmId==2 drop if LstcId==. & CalvingDate==. & F armId==3 drop if LstcId==. & CalvingDate==. & FarmId==5 drop if LstcId==. & FarmId==4 & HealthHxCheck==. drop if LstcId==. & FarmId==5 & Eartag==6539 drop if LstcId==. & FarmId==5 & Eartag==6384 drop if LstcId==. 
& FarmId==5 & Eartag==5621 /*Those that were diseased before transition period 3 weeks befor e calving)*/ gen daysmast= MastitisDate - CalvingDate gen daysmetr= MetritisDate - CalvingDate gen daysRP= RPDate - CalvingDate gen daysDA= DADate - CalvingDate gen daysL= LamenessDate - CalvingD ate gen daysL2= LamenessDate - DryEnrollDate gen daysket= KetosisDate - CalvingDate gen daysMF= MilkFeverDate - CalvingDate gen daysBRD= BRDDate - CalvingDate tab daysmast list LstcId FarmId if daysmast== - 319 tab daysmetr tab daysRP tab daysDA tab d aysL list LstcId FarmId if daysL< - 30 tab daysL2 list LstcId FarmId if daysL2< - 30 tab daysket tab daysMF tab daysBRD /*checking Lstc cow with lameness*/ list LstcId if daysL<0 181 list LstcId if daysL2 ==0 list AnyDz if LstcId==59 list abort died ca lfdead if LstcId==59 /*cow has no comorbidities, lameness was diagnosed at dry - off, will censor cow from analysis because we only want healthy cows from the start*/ drop if LstcId==59 *off study drop i f LstcId==145 *did not calve drop if LstcId==12 *no data since dry of f drop if LstcId==254 | LstcId==110 *no data since calving date drop if LstcId==256 *missing serum amyloid A drop if LstcId==192 *************************************************** ***** /*2. Checking for missing data and inaccura cies Checked separatel y for lstc cows and non - lstc cows since missing data patterns will be different e.g. lstc cows have biomarker data, whereas non - lstc cows do not *********************************** *********************/ misstable summarize if Ls tcId==. 
misstable summarize if LstcId<1000 /*copied and pasted to word document to examine "MISSING DATA TABLE_11Nov2017.docx"*/ /*2 serum amyloid A samples missing from datasheet but in database*/ repla ce saaugml=32.56 if LstcId==71 replace saaugml=18 7.86 if LstcId==164 /*Fixing incorrect date entries These determined by manually checking file*/ replace BirthDate= date("18 - 03 - 2013", "DMY") if LstcId==128 replace CalvingDate= date("25 - 10 - 2014","DMY") i f LstcId==10 replace BirthDate= date("29 - 3 - 2013" , "DMY") if LstcId==186 replace PriorLactationNumber=2 if LstcId==47 replace PriorLactationNumber=0 if LstcId==159 replace PriorLactationNumber=0 if LstcId==160 replace PriorLactationNumber=0 if LstcId== 161 replace PriorLactationNumber=0 if LstcId==16 2 repla ce PriorLactationNumber=0 if LstcId==163 replace PriorLactationNumber=1 if LstcId==197 replace PriorLactationNumber=1 if LstcId==200 replace PriorLactationNumber=1 if LstcId==202 replace PriorLac tationNumber=2 if LstcId==203 replace PriorLacta tionNumb er=2 if LstcId==206 replace PriorLactationNumber=0 if LstcId==207 replace PriorLactationNumber=0 if LstcId==209 replace PriorLactationNumber=0 if LstcId==211 replace PriorLactationNumber=2 if Lst cId==231 replace PriorLactationNumber=3 if LstcI d==245 replace PriorLactationNumber=1 if LstcId==198 replace PriorLactationNumber=3 if LstcId==199 replace PriorLactationNumber=1 if LstcId==201 replace PriorLactationNumber=3 if LstcId==205 replace Pri orLactationNumber= 0 if LstcId==208 replace Prior LactationNumber=0 if LstcId==210 replace LactationNumber=1 if LstcId==41 replace LactationNumber=5 if LstcId==204 replace LactationNumber=4 if LstcId==235 replace LactationNumber=2 if LstcId==244 replac e LactationNumber= 3 if LstcId==253 replace Lacta tionNumber=3 if LstcId==47 tab LactationNumber gen LactCat=. replace LactCat=1 if LactationNumber==1 182 replace LactCat=2 if LactationNumber==2 replace LactCat=3 if LactationNumber>=3 replace LactCat=. 
    if LactationNumber==.
/*Checking cows with missing sample period variable*/
list LstcId if SamplePeriod==. & LstcId<1000
/*these are all dry off so do not need this variable any longer*/
/*Checking missing data patterns for prior disease*/
list LstcId LactationNumber if pAnyDz==.
sort FarmId LstcId
list FarmId LstcId if pAnyDz==. & LactationNumber>1
list FarmId LstcId if AnyDzCalc==.
*****************************************************
/*3. Creating Cohort level incidence variables
- Prior herd prevalence - can be a predictor on the herd level or cohort level
- Using cows only for this because there is large variation in whether farms included data for heifers. only using lstc cows because missing data high for nonlstc cows
- Current herd prevalence - calculated for both Lstc cows and other cows
****************************************************/
/*Prior herd/cohort prevalence*/
egen nCowsLstc= count(LstcId) if LactationNumber>1 & pAnyDzCalc<300, by(FarmId)
egen npDzLstc= total(pAnyDzCalc==1) if LstcId<300 & LactationNumber>1, by(FarmId)
gen pDzPrev= npDzLstc/nCowsLstc
tab pDzPrev
egen nCowsLstc2= count(LstcId) if LactationNumber>1 & pAnyDzCalc<300, by(Cohort2)
egen npDzLstc2= total(pAnyDzCalc==1) if LstcId<300 & LactationNumber>1, by(Cohort2)
gen pDzPrev2= npDzLstc2/nCowsLstc2
tab pDzPrev2
/*Calculate current disease incidence*/
/*Lstc cows*/
egen nTotalLstc= count(LstcId), by(FarmId)
egen nDzLstc= total(AnyDzCalc==1) if LstcId<300, by(FarmId)
gen DzPrev= nDzLstc/nTotalLstc
tab DzPrev
replace DzPrev=. if nTotalLstc==0
egen nTotalLstc2= count(LstcId), by(Cohort2)
egen nDzLstc2= total(AnyDzCalc==1) if LstcId<300, by(Cohort2)
gen DzPrev2= nDzLstc2/nTotalLstc2
tab DzPrev2
replace DzPrev2=. if nTotalLstc2==0
/*NonLstc cows*/
egen nTotalNonLstc= total(LstcId==.), by(FarmId)
egen nDzNonLstc= total(AnyDzCalc==1) if LstcId==., by(FarmId)
gen DzPrevNonLstc= nDzNonLstc/nTotalNonLstc
tab DzPrevNonLstc
replace DzPrevNonLstc=. ///
    if nTotalNonLstc==0
egen nTotalNonLstc2= total(LstcId==.), by(Cohort2)
egen nDzNonLstc2= total(AnyDzCalc==1) if LstcId==., by(Cohort2)
gen DzPrevNonLstc2= nDzNonLstc2/nTotalNonLstc2
tab DzPrevNonLstc2
replace DzPrevNonLstc2=. if nTotalNonLstc2==0
******************************
/*4. Drop unneeded variables*/
******************************
drop pLamenessDate pLameness2Date MildMastitisDate MildMastitisQuarter ///
    ModerateMastitisQuarter SevereMastitisQuarter ///
    DzSumOnce DzInflSumAll DzInflSumOnce DzInfl_woPneu DzInfl Dzmetab DzMastMetr blv_co1 ///
    blv_co05 N L M E Baso Band Other Total Npct Lpct Mpct Epct Bpct BDpct OTHpct HealthHxCheck ///
    daysmast daysmetr daysRP daysDA daysL daysL2 daysket daysMF daysBRD SamplePeriod
******************************
/*5. Descriptive Statistics*/
******************************
***********************************************************
* a) Summary Statistics
***********************************************************
*Basic summary statistics to see outliers
tabstat albumin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat calcium, statistics(count mean sem sd min p25 p50 p75 max) columns(statistics)
tabstat cholesterol, statistics(count mean sem min p25 p50 p75 max var) columns(statistics)
tabstat nefa, statistics(count mean sem min p25 p50 p75 max var) columns(statistics)
tabstat bhb, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat saaugml, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat haptoglobin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat ros, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat aopugul, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat aopbradford, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat vitamina, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat vitamind, ///
    statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat betacarotene, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat alphatocopherol, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat Ntot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat Ltot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat NLratio, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat Mtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat Etot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat Btot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat BDtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat elisa, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat rounddrybcs, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
***********************************************************
* b) Box Plots and Histograms
***********************************************************
*************
** Albumin **
*************
histogram albumin, frac
graph box albumin
graph box albumin, over(FarmId)
** Box Plot with annotation of outliers
* Note: in the fence calculations, upq, loq, and iqr resolve to upq1, loq1, and iqr1
* through Stata's variable-name abbreviation (default setting, assuming no other
* variable names begin with these prefixes)
gen dummy=1
egen mean1 = mean(albumin)
egen median1 = median(albumin)
egen upq1 = pctile(albumin), p(75)
egen loq1 = pctile(albumin), p(25)
egen iqr1 = iqr(albumin)
egen upper1 = max(min(albumin, upq + 1.5 * iqr))
egen lower1 = min(max(albumin, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter albumin dummy if !inrange(albumin, lower1, upper1), ms(Oh) mc(red) ///
    msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(albumin (xx/xx)) xtitle("")
*************
** Calcium **
*************
histogram calcium, frac
graph box calcium
graph box calcium, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(calcium)
egen median1 = median(calcium)
egen upq1 = pctile(calcium), p(75)
egen loq1 = pctile(calcium), p(25)
egen iqr1 = iqr(calcium)
egen upper1 = max(min(calcium, upq + 1.5 * iqr))
egen lower1 = min(max(calcium, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter calcium dummy if !inrange(calcium, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(calcium (xx/xx)) xtitle("")
*****************
** Cholesterol **
*****************
histogram cholesterol, frac
graph box cholesterol
graph box cholesterol, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(cholesterol)
egen median1 = median(cholesterol)
egen upq1 = pctile(cholesterol), p(75)
egen loq1 = pctile(cholesterol), p(25)
egen iqr1 = iqr(cholesterol)
egen upper1 = max(min(cholesterol, upq + 1.5 * iqr))
egen lower1 = min(max(cholesterol, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter cholesterol dummy if !inrange(cholesterol, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(cholesterol (xx/xx)) xtitle("")
*************
** NEFA **
*************
histogram nefa, frac
graph box nefa
graph box nefa, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(nefa)
egen median1 = median(nefa)
egen upq1 = pctile(nefa), p(75)
egen loq1 = pctile(nefa), p(25)
egen iqr1 = iqr(nefa)
egen upper1 = max(min(nefa, upq + 1.5 * iqr))
egen lower1 = min(max(nefa, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter nefa dummy if !inrange(nefa, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(nefa (xx/xx)) xtitle("")
*************
** BHB **
*************
histogram bhb, frac
graph box bhb
graph box bhb, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(bhb)
egen median1 = median(bhb)
egen upq1 = pctile(bhb), p(75)
egen loq1 = pctile(bhb), p(25)
egen iqr1 = iqr(bhb)
egen upper1 = max(min(bhb, upq + 1.5 * iqr))
egen lower1 = min(max(bhb, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter bhb dummy if !inrange(bhb, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(bhb (xx/xx)) xtitle("")
*************
** Saaugml **
*************
histogram saaugml, frac
graph box saaugml
graph box saaugml, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(saaugml)
egen median1 = median(saaugml)
egen upq1 = pctile(saaugml), p(75)
egen loq1 = pctile(saaugml), p(25)
egen iqr1 = iqr(saaugml)
egen upper1 = max(min(saaugml, upq + 1.5 * iqr))
egen lower1 = min(max(saaugml, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter saaugml dummy if !inrange(saaugml, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(saaugml (xx/xx)) xtitle("")
*****************
** Haptoglobin **
*****************
histogram haptoglobin, frac
graph box haptoglobin
graph box haptoglobin, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(haptoglobin)
egen median1 = median(haptoglobin)
egen upq1 = pctile(haptoglobin), p(75)
egen loq1 = ///
    pctile(haptoglobin), p(25)
egen iqr1 = iqr(haptoglobin)
egen upper1 = max(min(haptoglobin, upq + 1.5 * iqr))
egen lower1 = min(max(haptoglobin, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter haptoglobin dummy if !inrange(haptoglobin, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(haptoglobin (xx/xx)) xtitle("")
*************
** ROS **
*************
histogram ros, frac
graph box ros
graph box ros, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(ros)
egen median1 = median(ros)
egen upq1 = pctile(ros), p(75)
egen loq1 = pctile(ros), p(25)
egen iqr1 = iqr(ros)
egen upper1 = max(min(ros, upq + 1.5 * iqr))
egen lower1 = min(max(ros, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter ros dummy if !inrange(ros, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(ros (xx/xx)) xtitle("")
*************
** AOPugul **
*************
histogram aopugul, frac
graph box aopugul
graph box aopugul, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
**
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(aopugul)
egen median1 = median(aopugul)
egen upq1 = pctile(aopugul), p(75)
egen loq1 = pctile(aopugul), p(25)
egen iqr1 = iqr(aopugul)
egen upper1 = max(min(aopugul, upq + 1.5 * iqr))
egen lower1 = min(max(aopugul, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter aopugul dummy if !inrange(aopugul, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(aopugul (xx/xx)) xtitle("")
*****************
** AOPbradford **
*****************
histogram aopbradford, frac
graph box aopbradford
graph box aopbradford, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(aopbradford)
egen median1 = median(aopbradford)
egen upq1 = pctile(aopbradford), p(75)
egen loq1 = pctile(aopbradford), p(25)
egen iqr1 = iqr(aopbradford)
egen upper1 = max(min(aopbradford, upq + 1.5 * iqr))
egen lower1 = min(max(aopbradford, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter aopbradford dummy if !inrange(aopbradford, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) ///
    mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(aopbradford (xx/xx)) xtitle("")
**************
** Vitamin A**
**************
histogram vitamina, frac
graph box vitamina
graph box vitamina, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(vitamina)
egen median1 = median(vitamina)
egen upq1 = pctile(vitamina), p(75)
egen loq1 = pctile(vitamina), p(25)
egen iqr1 = iqr(vitamina)
egen upper1 = max(min(vitamina, upq + 1.5 * iqr))
egen lower1 = min(max(vitamina, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter vitamina dummy if !inrange(vitamina, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(vitamina (xx/xx)) xtitle("")
***************
** Vitamin D **
***************
histogram vitamind, frac
graph box vitamind
graph box vitamind, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(vitamind)
egen median1 = median(vitamind)
egen upq1 = pctile(vitamind), p(75)
egen loq1 = pctile(vitamind), p(25)
egen iqr1 = iqr(vitamind)
egen upper1 = max(min(vitamind, upq + 1.5 * iqr))
egen lower1 = min(max(vitamind, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 ///
    dummy, pstyle(p1) msize(*2) || ///
    scatter vitamind dummy if !inrange(vitamind, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(vitamind (xx/xx)) xtitle("")
******************
** Betacarotene **
******************
histogram betacarotene, frac
graph box betacarotene
graph box betacarotene, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(betacarotene)
egen median1 = median(betacarotene)
egen upq1 = pctile(betacarotene), p(75)
egen loq1 = pctile(betacarotene), p(25)
egen iqr1 = iqr(betacarotene)
egen upper1 = max(min(betacarotene, upq + 1.5 * iqr))
egen lower1 = min(max(betacarotene, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter betacarotene dummy if !inrange(betacarotene, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(betacarotene (xx/xx)) xtitle("")
*********************
** alphatocopherol **
*********************
histogram alphatocopherol, frac
graph box alphatocopherol
graph box alphatocopherol, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(alphatocopherol)
egen median1 = median(alphatocopherol)
egen upq1 = pctile(alphatocopherol), p(75)
egen loq1 = pctile(alphatocopherol), p(25)
egen iqr1 = iqr(alphatocopherol)
egen upper1 = max(min(alphatocopherol, upq + 1.5 * iqr))
egen lower1 = min(max(alphatocopherol, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter alphatocopherol dummy if !inrange(alphatocopherol, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(alphatocopherol (xx/xx)) xtitle("")
*************
** Ntot **
*************
histogram Ntot, frac
graph box Ntot
graph box Ntot, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(Ntot)
egen median1 = median(Ntot)
egen upq1 = pctile(Ntot), p(75)
egen loq1 = pctile(Ntot), p(25)
egen iqr1 = iqr(Ntot)
egen upper1 = max(min(Ntot, upq + 1.5 * iqr))
egen lower1 = min(max(Ntot, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter Ntot dummy if !inrange(Ntot, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(Ntot (xx/xx)) xtitle("")
*************
** Ltot **
*************
histogram Ltot, frac
graph box Ltot
graph box Ltot, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(Ltot)
egen median1 = median(Ltot)
egen upq1 = pctile(Ltot), p(75)
egen loq1 = pctile(Ltot), p(25)
egen iqr1 = iqr(Ltot)
* Whisker limits are computed from Ltot (the source listing used calcium here, an apparent copy-paste slip)
egen upper1 = max(min(Ltot, upq + 1.5 * iqr))
egen lower1 = min(max(Ltot, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter Ltot dummy if !inrange(Ltot, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(Ltot (xx/xx)) xtitle("")
*************
** NLRatio **
*************
histogram NLratio, frac
graph box NLratio
graph box NLratio, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(NLratio)
egen median1 = median(NLratio)
egen upq1 = pctile(NLratio), p(75)
egen loq1 = pctile(NLratio), p(25)
egen iqr1 = iqr(NLratio)
egen upper1 = max(min(NLratio, upq + 1.5 * iqr))
egen lower1 = min(max(NLratio, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter NLratio dummy if !inrange(NLratio, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(NLratio (xx/xx)) xtitle("")
*************
** Mtot **
*************
histogram Mtot, frac
graph box Mtot
graph box ///
    Mtot, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(Mtot)
egen median1 = median(Mtot)
egen upq1 = pctile(Mtot), p(75)
egen loq1 = pctile(Mtot), p(25)
egen iqr1 = iqr(Mtot)
egen upper1 = max(min(Mtot, upq + 1.5 * iqr))
egen lower1 = min(max(Mtot, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter Mtot dummy if !inrange(Mtot, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(Mtot (xx/xx)) xtitle("")
*************
** Etot **
*************
histogram Etot, frac
graph box Etot
graph box Etot, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(Etot)
egen median1 = median(Etot)
egen upq1 = pctile(Etot), p(75)
egen loq1 = pctile(Etot), p(25)
egen iqr1 = iqr(Etot)
egen upper1 = max(min(Etot, upq + 1.5 * iqr))
egen lower1 = min(max(Etot, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter Etot dummy if !inrange(Etot, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(Etot ///
    (xx/xx)) xtitle("")
*************
** Btot **
*************
histogram Btot, frac
graph box Btot
graph box Btot, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(Btot)
egen median1 = median(Btot)
egen upq1 = pctile(Btot), p(75)
egen loq1 = pctile(Btot), p(25)
egen iqr1 = iqr(Btot)
egen upper1 = max(min(Btot, upq + 1.5 * iqr))
egen lower1 = min(max(Btot, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter Btot dummy if !inrange(Btot, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(Btot (xx/xx)) xtitle("")
*************
** BDtot **
*************
histogram BDtot, frac
graph box BDtot
graph box BDtot, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(BDtot)
egen median1 = median(BDtot)
egen upq1 = pctile(BDtot), p(75)
egen loq1 = pctile(BDtot), p(25)
egen iqr1 = iqr(BDtot)
egen upper1 = max(min(BDtot, upq + 1.5 * iqr))
egen lower1 = min(max(BDtot, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter BDtot dummy if ! ///
    inrange(BDtot, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(BDtot (xx/xx)) xtitle("")
*************
** elisa **
*************
histogram elisa, frac
graph box elisa
graph box elisa, over(FarmId)
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
** Box Plot with annotation of outliers
gen dummy=1
egen mean1 = mean(elisa)
egen median1 = median(elisa)
egen upq1 = pctile(elisa), p(75)
egen loq1 = pctile(elisa), p(25)
egen iqr1 = iqr(elisa)
egen upper1 = max(min(elisa, upq + 1.5 * iqr))
egen lower1 = min(max(elisa, loq - 1.5 * iqr))
twoway rbar median1 upq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rbar median1 loq1 dummy, pstyle(p1) blc(gs15) bfc(gs8) barw(0.35) || ///
    rspike upq1 upper1 dummy, pstyle(p1) || ///
    rspike loq1 lower1 dummy, pstyle(p1) || ///
    rcap upper1 upper1 dummy, pstyle(p1) msize(*2) || ///
    rcap lower1 lower1 dummy, pstyle(p1) msize(*2) || ///
    scatter elisa dummy if !inrange(elisa, lower1, upper1), ms(Oh) mc(red) msize(medium) mla(LstcId) mlabs(medium) mlabc(red) legend(off) || ///
    scatter mean1 dummy, ms(Dh) mlwidth(thick) mc(blue) msize(*1) ///
    ylabel(, ang(h)) ytitle(elisa (xx/xx)) xtitle("")
drop dummy mean1 median1 upq1 loq1 iqr1 upper1 lower1
*************
** BCS **
*************
histogram rounddrybcs, frac
tab rounddrybcs
*************
** season **
*************
histogram season, frac
tab season
*************
** Lact Num **
*************
histogram LactCat, frac
tab LactCat
save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset_5June2018.dta", replace
*******************************
/* 6.
Create data dictionary */
*******************************
outfile using "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset_11Nov2017.dta" , dictionary list

4_DESCRIPTIVESTATISTICS.DO
/*
Title: Descriptive Statistics
Author: Lauren Wisnieski
Purpose: Descriptive Statistics
Revision history:
- Created December 7, 2017
Steps:
Step 1: Calculating prevalence
Step 2: Calculate days from calving
Step 3: Count cows in each farm
Step 4: Summary/descriptive statistics
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset_5June2018.dta", clear
**********************************
/*STEP ONE: CALCULATE PREVALENCE*/
**********************************
tab Ketosis if LstcId<300
tab mastitis if LstcId<300
tab Metritis if LstcId<300
tab Pneumonia if LstcId<300
tab lameness if LstcId<300
tab DA if LstcId<300
tab RP if LstcId<300
tab MilkFever if LstcId<300
tab abort if LstcId<300
tab calfdead if LstcId<300
tab died if LstcId<300
tab AnyDzCalc if LstcId<300
/*Comorbidities*/
tab DzSumAll if LstcId<300
tab DzSumAll AnyDzCalc if LstcId<300
tab DzSumAll AnyDzCalc2 if LstcId<300
*****************************************
/*STEP TWO: CALCULATE DAYS FROM CALVING*/
*****************************************
gen daysfromcalving= CalvingDate - DryEnrollDate if LstcId<300
tabstat daysfromcalving, statistics(count mean sd min p25 p50 p75 max) columns(statistics) by(AnyDzCalc)
/*3 cows missing this variable*/
list LstcId if daysfromcalving==.
list CalvingDate if LstcId==30 | LstcId==88 | LstcId==220
***************************************
/*STEP THREE: COUNT COWS IN EACH FARM*/
***************************************
tab FarmId if LstcId<300
tab FarmId AnyDzCalc
*************************************************************
/*STEP FOUR: SUMMARY/DESCRIPTIVE STATISTICS (NONBIOMARKERS)*/
*************************************************************
tab bcscat if LstcId<300
tab rounddrybcs
tab LactCat if LstcId<300
tab FarmId if LstcId<300
tab season if LstcId<300
save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset2_5June2018.dta", replace

5_TRANSFORMINGVARIABLES.DO
/*
Title: Transforming Variables
Author: Lauren Wisnieski
Purpose: Transforming variables
Revision history:
- Created December 11, 2017
- Revised March 11, 2018
- Revised March 12, 2018
- Revised June 5, 2018
Steps:
Step 1: Checking linearity in log odds for all continuous variables
Step 2: Performing transformations and re-checking linearity
Step 3: Center categorical variables to reduce risk of multicollinearity
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset2_5June2018.dta", clear
***********************************************************************
/*Step 1: Checking linearity in log odds for all continuous variables*/
***********************************************************************
twoway lowess AnyDzCalc nefa, logit /*not linear*/
twoway lowess AnyDzCalc bhb, logit /*linear*/
twoway lowess AnyDzCalc calcium, logit /*sort of linear*/
twoway lowess AnyDzCalc saaugml, logit /*sort of linear*/
twoway lowess AnyDzCalc aopugul, logit /*not linear*/
twoway lowess AnyDzCalc vitamina, logit /*sort of linear*/
twoway lowess AnyDzCalc ros, logit /*sort of linear*/
twoway lowess AnyDzCalc vitamind, logit /*not linear*/
twoway lowess AnyDzCalc alphatocopherol, logit /*not linear*/
twoway lowess AnyDzCalc betacarotene, logit /*not
linear*/ twoway lowess AnyDzCalc al bumin , logit /*not linear*/ twoway lowess AnyDzCalc haptoglobin , logit /*not linear*/ twoway lowess AnyDzCalc cholesterol , logit /*not linear*/ twoway lowess AnyDz Calc Ntot , logit /*linear*/ twoway lowess AnyD zCalc Ltot , logit /*sort of linear*/ twoway lowess AnyDzCalc NLratio , logit /*linear*/ twoway lowess AnyDzCalc Etot , logit /*not linear*/ twoway lowess AnyDzCalc Btot , logit /*not linear*/ tw oway lowess AnyDzCalc BDtot , logit /*not linear */ twoway lowess Any DzCalc Mtot , logit /*not linear*/ *********************************************************************** /*Step 2: Performing transformations and re - checking linearity *******/ *** ************************************************* ******************* /*NEFA*/ gen lognefa= log10(nefa) gen sqrtnefa= sqrt(nefa) gen lognefasq= lognefa^2 twoway lowess AnyDzCalc lognefa , logit twoway lowess AnyDzCalc lognefasq , logit twoway lowess A nyDzCalc sqrtnefa , logit /*lognefasq looks goo d*/ xtile nefa4= ne fa, nq(4) twoway lowess AnyDzCalc nefa4 , logit xtile nefa5= nefa, nq(5) twoway lowess AnyDzCalc nefa5 , logit xtile nefa6= nefa, nq(6) twoway lowess AnyDzCalc nefa6 , logit /*n efa5 looks the best*/ /*calcium*/ gen logcalciu m= log10(calcium) twoway lowess AnyDzCalc logcalcium , logit gen sqrtcalcium= sqrt(calcium) twoway lowess AnyDzCalc sqrtcalcium , logit /*sqrt looks ok*/ xtile calcium4= calcium, nq(4) twoway lowess Any DzCalc calcium4 , logit /*calcium4 linear*/ 194 /*s aaugml*/ tab saaugml histogram saaugml /*a lot of values close to zero*/ gen logsaaugml= log10(saaugml) twoway lowess AnyDzCalc logsaaugml , logit gen sqrtsaaugml= sqrt(saaugml) twoway lowess AnyDzCa lc sqrtsaaugml , logit /*sqrt looks best*/ xti le saaugml4= saaugml, nq(4) twoway lowess AnyDzCalc saaugml4 , logit /*saaugml4 looks good*/ /*aopugul*/ gen logaopugul= log10(aopugul) twoway lowess AnyDzCalc logaopugul , logit /*log looks good*/ x tile aopugul4= aopugul, nq(4) twoway lowess AnyD 
zCalc aopugul4 , logit xtile aopugul5= aopugul, nq(5) twoway lowess AnyDzCalc aopugul5 , logit /*aopugul5 looks best*/ /*vitamina*/ gen logvitamina= log10(vitamina) gen logvitaminasq= logvitamina^2 t w oway lowess AnyDzCalc logvitamina , logit twowa y lowess AnyDzCalc logvitaminasq , logit gen sqrtvitamina= sqrt(vitamina) twoway lowess AnyDzCalc sqrtvitamina , logit xtile vitamina4= vitamina, nq(4) twoway lowess AnyDzCalc vitamina4 , logit xtile vitamina5= vitamina, nq(5) twoway lowess AnyDzCa lc vitamina5 , logit xtile vitamina6= vitamina, nq(6) twoway lowess AnyDzCalc vitamina6 , logit /*vitamina4 best*/ /*ros*/ gen logros= log10(ros) twoway lowess AnyDzCalc logros , logit gen rossq= r os^2 gen roscubed= ros^3 gen sqrtros= sqrt(ros) t woway lowess AnyDzCalc sqrtros , logit twoway lowess AnyDzCalc rossq , logit twoway lowess AnyDzCalc roscubed , logit xtile ros4= ros, nq(4) twoway lowess AnyDzCalc ros4 , logit xtile ros5= ros, nq(5) twoway lowess AnyDzCalc ros5 , logit /*ros5 the best, rossq also good*/ /*vitamind*/ gen logvitamind= log10(vitamind) twoway lowess AnyDzCalc logvitamind , logit gen sqrtvitamind= sqrt(vitamind) twoway lowess AnyDzCalc sqrtvit amind , logit xtile vi tamind4= vitamind, nq(4) twoway lowess AnyDzCalc vitamind4 , logit xtile vitamind5= vitamind, nq(5) twoway lowess AnyDzCalc vitamind5 , logit /*vitamind5 looks best*/ xtile vitamind6= vitamind, nq(6) twoway lowess AnyDzCalc vitamind6 , logit /*a lphatocopherol*/ gen logalphatocopherol= log10(a lphatocopherol) twoway lowess AnyDzCalc logalphatocopherol , logit gen sqrtalphatocopherol= sqrt(alphatocopherol) twoway lowess AnyDzCalc sqrtalphatocopherol , logit xtile alphato copherol4= alphatocopher ol, nq(4) twoway lowess AnyDzCalc alphatocophero l4 , logit xtile alphatocopherol5= alphatocopherol, nq(5) twoway lowess AnyDzCalc alphatocopherol5 , logit xtile alphatocopherol6= alphatocopherol, nq(6) twoway lowess AnyDzCalc alphatocopherol6 , log it /*alphatocopherol4*/ /*betacarotene*/ 195 gen 
logbetacarotene= log10(betacarotene) twoway lowess AnyDzCalc logbetacarotene , logit gen sqrtbetacarotene= log 10(betacarotene) twoway lowess AnyDzCalc sqrtbetacarotene , logit xtile betacarotene4= betaca rotene, nq(4) twoway lowess AnyDzCalc betacarote ne4 , logit xtile betacarotene5= betacarotene, nq(5) twoway lowess AnyDzCalc betacarotene5 , logit xtile be tacarotene6= betacarotene, nq(6) twoway lowess AnyDzCalc betacarotene6 , logit /*betacaroten e4 looks best*/ /*albumin*/ gen logalbumin= lo g10(albumin) twoway lowess AnyDzCalc logalbumin , logit gen sqrtalbumin= sqrt(albumin) twoway lowess AnyDzCalc sqrtalbumin , logit xtile albumin4= albumin, nq(4) twoway lowess AnyDzCalc albumin4 , logi t xtile albumin5= albumin, nq(5) twoway lowess AnyDzCalc albumin5 , logit xtile albumin6= albumin, nq(6) twoway lowess AnyDzCalc albumin6 , logit /*albumin5 the best*/ /*haptoglobin*/ gen loghaptoglobin= log10(haptoglobin) twoway lowess AnyDzCalc loghaptoglobin , logit gen sqrthaptoglobin= sq rt(haptoglobin) twoway lowess AnyDzCalc sqrthaptoglobin , lo git /*sqrt looks best*/ xtile haptoglobin4= haptoglobin, nq(4) twoway lowess AnyDzCalc haptoglobin4 , logit xtile haptoglobin5= haptoglobin, nq(5) twoway lowess AnyDzCalc haptoglobin5 , lo git xtile haptoglobin6= haptoglobin, nq(6) twoway lowess An yDzCalc haptoglobin6 , logit /*haptoglobin5 looks best*/ /*cholesterol*/ gen logcholesterol= log10(cholesterol) twoway lowess AnyDzCalc logc holesterol , logit gen sqrtcholesterol= sqrt(ch olesterol) twoway lowess AnyDzCalc sqrtcholesterol , logit xtile cholesterol4= cholesterol, nq(4) twoway lowess AnyDzCalc cholesterol4 , logit xtile cholesterol5= cholesterol, nq(5) twoway lowess AnyDz Calc cholesterol5 , logit xtile cholesterol6= ch olesterol, nq(6) twoway lowess AnyDzCalc cholesterol6 , log it /*cholesterol4 best*/ /*Ltot*/ gen logLtot= log10(Ltot) twoway lowess AnyDzCalc logLtot , logit gen sqrtLtot= sqrt(Ltot) twoway lowess Any DzCalc sqrtLtot , logit xtile Ltot4= Ltot, nq(4 ) 
twoway lowess AnyDzCalc Ltot4, logit
xtile Ltot5 = Ltot, nq(5)
twoway lowess AnyDzCalc Ltot5, logit
xtile Ltot6 = Ltot, nq(6)
twoway lowess AnyDzCalc Ltot6, logit /*Ltot5 best*/

/*Etot*/
gen logEtot = log10(Etot)
twoway lowess AnyDzCalc logEtot, logit
gen sqrtEtot = sqrt(Etot)
twoway lowess AnyDzCalc sqrtEtot, logit /*sqrt looks best*/
xtile Etot4 = Etot, nq(4)
twoway lowess AnyDzCalc Etot4, logit
xtile Etot5 = Etot, nq(5)
twoway lowess AnyDzCalc Etot5, logit /*Etot5 looks best*/

/*Btot*/
tabstat Btot if Btot>0, statistics(count mean min p25 p50 p75 max) columns(statistics)
gen logBtot = log10(Btot)
twoway lowess AnyDzCalc logBtot, logit
gen sqrtBtot = sqrt(Btot)
twoway lowess AnyDzCalc sqrtBtot, logit /*logBtot best*/
xtile Btot4 = Btot, nq(4)
twoway lowess AnyDzCalc Btot4, logit
xtile Btot5 = Btot, nq(5)
twoway lowess AnyDzCalc Btot5, logit
xtile Btot6 = Btot, nq(6)
twoway lowess AnyDzCalc Btot6, logit
/*Btot has a lot of zeros, so created new categorical variable*/
gen BtotCat = .
replace BtotCat = 1 if Btot==0
replace BtotCat = 2 if Btot>0 & Btot<=37.5
replace BtotCat = 3 if Btot>37.5 & Btot<=50.68
replace BtotCat = 4 if Btot>50.68 & Btot<=94
replace BtotCat = 5 if Btot>94
replace BtotCat = . if Btot==.

/*BDtot*/
tabstat BDtot if BDtot>0, statistics(count mean min p25 p50 p75 max) columns(statistics)
gen logBDtot = log10(BDtot)
twoway lowess AnyDzCalc logBDtot, logit
gen sqrtBDtot = sqrt(BDtot)
twoway lowess AnyDzCalc sqrtBDtot, logit /*logBDtot best*/
xtile BDtot4 = BDtot, nq(4)
twoway lowess AnyDzCalc BDtot4, logit
xtile BDtot5 = BDtot, nq(5)
twoway lowess AnyDzCalc BDtot5, logit
xtile BDtot6 = BDtot, nq(6)
twoway lowess AnyDzCalc BDtot6, logit
/*BDtot also has a lot of zeros, so created new categorical variable*/
gen BDtotCat = .
replace BDtotCat = 1 if BDtot==0
replace BDtotCat = 2 if BDtot>0 & BDtot<=38.05
replace BDtotCat = 3 if BDtot>38.05 & BDtot<=74.53
replace BDtotCat = 4 if BDtot>74.53 & BDtot<=126.53
replace BDtotCat = 5 if BDtot>126.53
replace BDtotCat = . if BDtot==.

/*Mtot*/
tabstat Mtot if Mtot>0, statistics(count mean min p25 p50 p75 max) columns(statistics)
gen MtotCat = .
replace MtotCat = 1 if Mtot==0
replace MtotCat = 2 if Mtot>0 & Mtot<=96.3
replace MtotCat = 3 if Mtot>96.3 & Mtot<=184.83
replace MtotCat = 4 if Mtot>184.83 & Mtot<=343.7
replace MtotCat = 5 if Mtot>343.7
replace MtotCat = . if Mtot==.
gen logMtot = log10(Mtot)
twoway lowess AnyDzCalc logMtot, logit
gen sqrtMtot = sqrt(Mtot)
twoway lowess AnyDzCalc sqrtMtot, logit /*Mtot5 best*/

********************************************************************************
/*Step 3: Center all categorical variables using effect coding to reduce risk
of multicollinearity.
Note: using unweighted effects coding for variables with equal sample sizes in
each level. Weighted for those variables with unequal sample size between
different levels. First calculated the mean level and then subtracted the mean
level from each current level value*/
********************************************************************************
/*Unweighted effect coding: these variables have equal sample sizes in each level*/
/*LactCat, has three levels 1, 2 and 3*/
igenerate LactCat, coding(effect)
/*Cholesterol*/
igenerate cholesterol4, coding(effect)
/*AOPugul*/
igenerate aopugul5, coding(effect)
/*Vitamin A*/
igenerate vitamina4, coding(effect)
/*Saaugml*/
igenerate saaugml4, coding(effect)
/*alphatocopherol*/
igenerate alphatocopherol5, coding(effect)
/*Ltot*/
igenerate Ltot5, coding(effect)
/*Etot*/
igenerate Etot5, coding(effect)
/*ros*/
igenerate ros5, coding(effect)
/*betacarotene*/
igenerate betacarotene5, coding(effect)
/*albumin*/
igenerate albumin5, coding(effect)
/*Haptoglobin*/
igenerate haptoglobin5, coding(effect)
/*Vitamin D*/
igenerate vitamind5, coding(effect)

/*Weighted centered variables.
These variables have unequal sample sizes in each level, so used weighted,
rather than unweighted coding*/
/*BCScat: has two levels 0 and 1*/
igenerate bcscat if LstcId<300, coding(weightedeffect)
/*Btot*/
igenerate BtotCat, coding(weightedeffect)
/*BDtot*/
igenerate BDtotCat, coding(weightedeffect)
/*Mtot*/
igenerate MtotCat, coding(weightedeffect)
/*season*/
igenerate season if LstcId<300, coding(weightedeffect)

save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", replace

6_BIVARIABLEANALYSES.DO

/*
Title: Bivariable analyses
Author: Lauren Wisnieski
Purpose: Bivariable analyses
Revision history:
- Created December 11, 2017
- Revised March 12, 2018
Steps:
Step 1: Test random effects with AIC
Step 2: Display final models
*/

use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_11Dec2017.dta", clear

***********************************
/* Step 1: Test random effects*/
***********************************
/*Note: each loop runs 4 different models for each variable. This is for the
purpose of testing the random effects (None, Farm only, Cohort only, or both).
- We export the results to a table with AIC values so that the model with the
lowest AIC is considered the "best"
- Categorical variables are treated differently because we are using centered
variables, so the code must run multiple variables in the same model (e.g.
lact1 and lact2)*/

/*Continuous variables: Each loop runs the four models for up to two variables*/
eststo clear
foreach var of varlist bhb calcium {
    melogit AnyDzCalc `var' if LstcId<300
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || FarmId:
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || Cohort:
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || FarmId: || Cohort:
    eststo
}
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
foreach var of varlist lognefasq {
    melogit AnyDzCalc `var' if LstcId<300
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || FarmId:
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || Cohort:
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || FarmId: || Cohort:
    eststo
}
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
foreach var of varlist NLratio Ntot {
    melogit AnyDzCalc `var' if LstcId<300
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || FarmId:
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || Cohort:
    eststo
    melogit AnyDzCalc `var' if LstcId<300 || FarmId: || Cohort:
    eststo
}
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Categorical variables*/
*************************************************************************
eststo clear
global season "season2 season3 season4"
melogit AnyDzCalc $season if LstcId<300
eststo
melogit AnyDzCalc $season if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $season if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $season if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global cholesterol "cholesterol42 cholesterol43 cholesterol44"
melogit AnyDzCalc $cholesterol if LstcId<300
eststo
melogit AnyDzCalc $cholesterol if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $cholesterol if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $cholesterol if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global bcscat "bcscat2"
melogit AnyDzCalc $bcscat if LstcId<300
eststo
melogit AnyDzCalc $bcscat if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $bcscat if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $bcscat if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global LactCat "LactCat2 LactCat3"
melogit AnyDzCalc $LactCat if LstcId<300
eststo
melogit AnyDzCalc $LactCat if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $LactCat if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $LactCat if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global aopugul "aopugul52 aopugul53 aopugul54 aopugul55"
melogit AnyDzCalc $aopugul if LstcId<300
eststo
melogit AnyDzCalc $aopugul if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $aopugul if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $aopugul if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global vitamina "vitamina42 vitamina43 vitamina44"
melogit AnyDzCalc $vitamina if LstcId<300
eststo
melogit AnyDzCalc $vitamina if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $vitamina if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $vitamina if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global ros "ros52 ros53 ros54 ros55"
melogit AnyDzCalc $ros if LstcId<300
eststo
melogit AnyDzCalc $ros if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $ros if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $ros if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global alphatocopherol "alphatocopherol52 alphatocopherol53 alphatocopherol54 alphatocopherol55"
melogit AnyDzCalc $alphatocopherol if LstcId<300
eststo
melogit AnyDzCalc $alphatocopherol if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $alphatocopherol if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $alphatocopherol if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global betacarotene "betacarotene52 betacarotene53 betacarotene54 betacarotene55"
melogit AnyDzCalc $betacarotene if LstcId<300
eststo
melogit AnyDzCalc $betacarotene if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $betacarotene if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $betacarotene if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global saaugml "saaugml42 saaugml43 saaugml44"
melogit AnyDzCalc $saaugml if LstcId<300
eststo
melogit AnyDzCalc $saaugml if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $saaugml if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $saaugml if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global vitamind "vitamind52 vitamind53 vitamind54 vitamind55"
melogit AnyDzCalc $vitamind if LstcId<300
eststo
melogit AnyDzCalc $vitamind if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $vitamind if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $vitamind if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global albumin "albumin52 albumin53 albumin54 albumin55"
melogit AnyDzCalc $albumin if LstcId<300
eststo
melogit AnyDzCalc $albumin if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $albumin if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $albumin if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global haptoglobin "haptoglobin52 haptoglobin53 haptoglobin54 haptoglobin55"
melogit AnyDzCalc $haptoglobin if LstcId<300
eststo
melogit AnyDzCalc $haptoglobin if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $haptoglobin if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $haptoglobin if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global Ltot "Ltot52 Ltot53 Ltot54 Ltot55"
melogit AnyDzCalc $Ltot if LstcId<300
eststo
melogit AnyDzCalc $Ltot if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $Ltot if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $Ltot if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global Etot "Etot52 Etot53 Etot54 Etot55"
melogit AnyDzCalc $Etot if LstcId<300
eststo
melogit AnyDzCalc $Etot if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $Etot if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $Etot if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global Btot "BtotCat2 BtotCat3 BtotCat4 BtotCat5"
melogit AnyDzCalc $Btot if LstcId<300
eststo
melogit AnyDzCalc $Btot if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $Btot if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $Btot if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

eststo clear
global BDtot "BDtotCat2 BDtotCat3 BDtotCat4 BDtotCat5"
melogit AnyDzCalc $BDtot if LstcId<300
eststo
melogit AnyDzCalc $BDtot if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $BDtot if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $BDtot if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
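
/*Sketch only, not part of the original analysis: the four-model random-effects
comparison repeated above for each variable could be wrapped in a small helper
program. The program name re_compare is hypothetical. It assumes the outcome
AnyDzCalc, the grouping variables FarmId and Cohort, the LstcId<300 restriction
used in this file, and the user-written eststo/esttab commands already used
here. It is shown as one possible way to shorten the code, not as the author's
method.*/
capture program drop re_compare
program define re_compare
    syntax varlist
    eststo clear
    /*No random effects*/
    eststo: melogit AnyDzCalc `varlist' if LstcId<300
    /*Farm random intercept only*/
    eststo: melogit AnyDzCalc `varlist' if LstcId<300 || FarmId:
    /*Cohort random intercept only*/
    eststo: melogit AnyDzCalc `varlist' if LstcId<300 || Cohort:
    /*Farm and cohort random intercepts*/
    eststo: melogit AnyDzCalc `varlist' if LstcId<300 || FarmId: || Cohort:
    esttab, b(%9.0g) aic varwidth(4) ///
        mtitles("None" "Farm Only" "Cohort Only" "Both")
end
/*Hypothetical usage, one call per predictor set (left commented out so that
the original workflow is unchanged):*/
* re_compare MtotCat2 MtotCat3 MtotCat4 MtotCat5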
eststo clear
global Mtot "MtotCat2 MtotCat3 MtotCat4 MtotCat5"
melogit AnyDzCalc $Mtot if LstcId<300
eststo
melogit AnyDzCalc $Mtot if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $Mtot if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $Mtot if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

***********************************
/* Step 2: Display final models */
***********************************
/*Note: chose models for each variable above that have lowest AIC*/

/*Models with neither farm nor cohort*/
melogit AnyDzCalc $season if LstcId<300
melogit AnyDzCalc calcium if LstcId<300
melogit AnyDzCalc bcscat2 if LstcId<300
melogit AnyDzCalc $vitamina if LstcId<300
melogit AnyDzCalc $alphatocopherol if LstcId<300
melogit AnyDzCalc $vitamind if LstcId<300
melogit AnyDzCalc $albumin if LstcId<300
melogit AnyDzCalc $haptoglobin if LstcId<300
melogit AnyDzCalc $BDtot if LstcId<300

/*Models with farm only*/
melogit AnyDzCalc $betacarotene if LstcId<300 || FarmId:
melogit AnyDzCalc bhb || FarmId:
melogit AnyDzCalc lognefasq if LstcId<300 || FarmId:
melogit AnyDzCalc $cholesterol if LstcId<300 || FarmId:
melogit AnyDzCalc $LactCat if LstcId<300 || FarmId:
melogit AnyDzCalc NLratio if LstcId<300 || FarmId:
melogit AnyDzCalc Ntot if LstcId<300 || FarmId:
melogit AnyDzCalc $aopugul if LstcId<300 || FarmId:
melogit AnyDzCalc $ros if LstcId<300 || FarmId:
melogit AnyDzCalc $saaugml if LstcId<300 || FarmId:
melogit AnyDzCalc $Ltot if LstcId<300 || FarmId:
melogit AnyDzCalc $Etot if LstcId<300 || FarmId:
melogit AnyDzCalc $Btot if LstcId<300 || FarmId:
melogit AnyDzCalc $Mtot if LstcId<300 || FarmId:

M1_AIM1NUTRIENTMETABOLISM.DO

/*
Title: Nutrient Metabolism model
Author: Lauren Wisnieski
Purpose: build nutrient metabolism model
Revision history:
- Created December 13, 2017
Steps:
Step 1: Best subsets AIC selection: DONE IN R
Step 2: Test all two-way interactions
Step 3: Test random effects
Step 4: Choose final candidate models
Step 5: Model diagnostics
Step 6: Model averaging
Step 7: Internal validation
Candidate variables:
- BHB
- Calcium
- Cholesterol
- NEFA
- BCS
- parity
- season
*/

use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", clear
save full, replace

********************************************************************************
/*Step 1: Best subsets AIC selection*/
********************************************************************************
/* Note: We now perform best subsets AIC selection in R. R can handle
categorical variables and is also faster. Selected models that were within
4 AICs of best model were retained*/

********************************************************************************
/*Step 2: Testing interaction terms for each model */
********************************************************************************
/* Note: Retained interaction term if reduced AIC value of model*/
/*First, defined variables and interactions using macros*/
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global cholesterol "cholesterol42 cholesterol43 cholesterol44"
global bcscat "bcscat2"
global SeasonXNefa "c.season2#c.lognefasq c.season3#c.lognefasq c.season4#c.lognefasq"
global SeasonXCalc "c.season2#c.calcium c.season3#c.calcium c.season4#c.calcium"
global SeasonXBCS "c.season2#c.bcscat2 c.season3#c.bcscat2 c.season4#c.bcscat2"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXBCS "c.LactCat2#c.bcscat2 c.LactCat3#c.bcscat2"
global LactXCalc "c.LactCat2#c.calcium c.LactCat3#c.calcium"
global LactXNefa "c.LactCat2#c.lognefasq c.LactCat3#c.lognefasq"
global SeasonXbhb "c.season2#c.bhb c.season3#c.bhb c.season4#c.bhb"
global LactXbhb "c.LactCat2#c.bhb c.LactCat3#c.bhb"
global LactXChol "c.LactCat2#c.cholesterol42 c.LactCat2#c.cholesterol43 c.LactCat2#c.cholesterol44 c.LactCat3#c.cholesterol42 c.LactCat3#c.cholesterol43 c.LactCat3#c.cholesterol44"
global CholXSeason "c.cholesterol42#c.season2 c.cholesterol42#c.season3 c.cholesterol42#c.season4 c.cholesterol43#c.season2 c.cholesterol43#c.season3 c.cholesterol43#c.season4 c.cholesterol44#c.season2 c.cholesterol44#c.season3 c.cholesterol44#c.season4"
global CholXBCS "c.cholesterol42#c.bcscat2 c.cholesterol43#c.bcscat2 c.cholesterol44#c.bcscat2"
global CalcXChol "c.calcium#c.cholesterol42 c.calcium#c.cholesterol43 c.calcium#c.cholesterol44"
global NefaXChol "c.lognefasq#c.cholesterol42 c.lognefasq#c.cholesterol43 c.lognefasq#c.cholesterol44"
global CholXbhb "c.cholesterol42#c.bhb c.cholesterol43#c.bhb c.cholesterol44#c.bhb"

/*Now testing each 2 way interaction for each model*/
/*Model 1*/
eststo clear
global model1 "$LactCat $season $bcscat calcium lognefasq"
logit AnyDzCalc $model1
estat ic
logit AnyDzCalc $model1 $LactXSeason
estat ic
logit AnyDzCalc $model1 $LactXBCS
estat ic
logit AnyDzCalc $model1 $LactXCalc
estat ic
logit AnyDzCalc $model1 $LactXNefa
estat ic
logit AnyDzCalc $model1 $SeasonXBCS
estat ic
logit AnyDzCalc $model1 $SeasonXCalc
estat ic
logit AnyDzCalc $model1 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model1 $SeasonXNefa c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model1 $SeasonXNefa c.lognefasq#c.calcium
estat ic
esttab, aic ///
    title("Model 1")

/*Model 2*/
eststo clear
global model2 "$LactCat $season $bcscat bhb calcium lognefasq"
logit AnyDzCalc $model2 $LactXSeason
estat ic
logit AnyDzCalc $model2 $LactXBCS
estat ic
logit AnyDzCalc $model2 $LactXbhb
estat ic
logit AnyDzCalc $model2 $LactXCalc
estat ic
logit AnyDzCalc $model2 $LactXNefa
estat ic
logit AnyDzCalc $model2 $SeasonXBCS
estat ic
logit AnyDzCalc $model2 $SeasonXbhb
estat ic
logit AnyDzCalc $model2 $SeasonXCalc
estat ic
logit AnyDzCalc $model2 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model2 $SeasonXNefa c.bhb#c.bcscat2
estat ic
logit AnyDzCalc $model2 $SeasonXNefa c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model2 $SeasonXNefa c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model2 $SeasonXNefa c.bhb#c.calcium
estat ic
logit AnyDzCalc $model2 $SeasonXNefa c.calcium#c.lognefasq
estat ic
esttab, aic ///
    title("Model 2")

/*Model 3*/
eststo clear
global model3 "$LactCat bhb calcium lognefasq"
logit AnyDzCalc $model3
estat ic
logit AnyDzCalc $model3 $LactXbhb
estat ic
logit AnyDzCalc $model3 $LactXCalc
estat ic
logit AnyDzCalc $model3 $LactXNefa
estat ic
logit AnyDzCalc $model3 c.bhb#c.calcium
estat ic
logit AnyDzCalc $model3 c.bhb#c.lognefasq
estat ic
logit AnyDzCalc $model3 c.calcium#c.lognefasq
estat ic
/*no interactions*/

/*Model 4*/
eststo clear
global model4 "$LactCat $season calcium lognefasq"
logit AnyDzCalc $model4 $LactXSeason
estat ic
logit AnyDzCalc $model4 $LactXCalc
estat ic
logit AnyDzCalc $model4 $LactXNefa
estat ic
logit AnyDzCalc $model4 $SeasonXCalc
estat ic
logit AnyDzCalc $model4 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model4 c.lognefasq#c.calcium
estat ic
esttab, aic ///
    title("Model 4")

/*Model 5*/
eststo clear
global model5 "$LactCat $bcscat bhb calcium lognefasq"
logit AnyDzCalc $model5 $LactXBCS
estat ic
logit AnyDzCalc $model5 $LactXbhb
estat ic
logit AnyDzCalc $model5 $LactXCalc
estat ic
logit AnyDzCalc $model5 $LactXNefa
estat ic
logit AnyDzCalc $model5 c.bcscat2#c.bhb
estat ic
logit AnyDzCalc $model5 c.bcscat2#c.calcium
estat ic
logit AnyDzCalc $model5 c.bcscat2#c.lognefasq
estat ic
eststo
logit AnyDzCalc $model5 c.bcscat2#c.lognefasq c.bhb#c.calcium
estat ic
logit AnyDzCalc $model5 c.bcscat2#c.lognefasq c.calcium#c.lognefasq
estat ic
esttab, aic ///
    title("Model 5")

/*Model 6*/
eststo clear
global model6 "$LactCat $bcscat bhb calcium"
logit AnyDzCalc $model6
estat ic
logit AnyDzCalc $model6 $LactXBCS
estat ic
logit AnyDzCalc $model6 $LactXbhb
estat ic
logit AnyDzCalc $model6 $LactXCalc
estat ic
logit AnyDzCalc $model6 c.bcscat2#c.bhb
estat ic
logit AnyDzCalc $model6 c.bcscat2#c.calcium
estat ic
logit AnyDzCalc $model6 c.bhb#c.calcium
estat ic
/*No interactions*/

/*Model 7*/
eststo clear
global model7 "$LactCat $season bhb calcium lognefasq"
logit AnyDzCalc $model7 $LactXSeason if LstcId<300
estat ic
logit AnyDzCalc $model7 $LactXbhb
estat ic
logit AnyDzCalc $model7 $LactXCalc
estat ic
logit AnyDzCalc $model7 $LactXNefa
estat ic
logit AnyDzCalc $model7 $SeasonXbhb
estat ic
logit AnyDzCalc $model7 $SeasonXCalc
estat ic
logit AnyDzCalc $model7 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model7 $SeasonXNefa c.bhb#c.calcium
estat ic
logit AnyDzCalc $model7 $SeasonXNefa c.bhb#c.lognefasq
estat ic
logit AnyDzCalc $model7 $SeasonXNefa c.calcium#c.lognefasq
estat ic
esttab, aic ///
    title("Model 7")

/*Model 8*/
eststo clear
global model8 "$LactCat bhb calcium"
logit AnyDzCalc $model8
estat ic
logit AnyDzCalc $model8 $LactXbhb
estat ic
logit AnyDzCalc $model8 $LactXCalc if LstcId<300
estat ic
logit AnyDzCalc $model8 c.bhb#c.calcium
estat ic
/*no interaction*/

/*Model 9*/
eststo clear
global model9 "$LactCat calcium lognefasq"
logit AnyDzCalc $model9
estat ic
logit AnyDzCalc $model9 $LactXCalc
estat ic
logit AnyDzCalc $model9 $LactXNefa
estat ic
logit AnyDzCalc $model9 c.lognefasq#c.calcium
estat ic
/*no interaction*/

/*Model 10*/
eststo clear
global model10 "$LactCat $bcscat calcium lognefasq"
logit AnyDzCalc $model10
estat ic
logit AnyDzCalc $model10 $LactXBCS
estat ic
logit AnyDzCalc $model10 $LactXCalc
estat ic
logit AnyDzCalc $model10 $LactXNefa
estat ic
/*no interaction*/

/*Model 11*/
eststo clear
global model11 "$LactCat $season $cholesterol $bcscat calcium lognefasq"
logit AnyDzCalc $model11
estat ic
logit AnyDzCalc $model11 $LactXSeason
estat ic
logit AnyDzCalc $model11 $LactXChol
estat ic
logit AnyDzCalc $model11 $LactXBCS
estat ic
logit AnyDzCalc $model11 $LactXCalc
estat ic
logit AnyDzCalc $model11 $LactXNefa
estat ic
logit AnyDzCalc $model11 $CholXSeason
estat ic
logit AnyDzCalc $model11 $SeasonXBCS
estat ic
logit AnyDzCalc $model11 $SeasonXCalc
estat ic
logit AnyDzCalc $model11 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model11 $SeasonXNefa $CholXBCS
estat ic
logit AnyDzCalc $model11 $SeasonXNefa $CalcXChol
estat ic
logit AnyDzCalc $model11 $SeasonXNefa $NefaXChol
estat ic
logit AnyDzCalc $model11 $SeasonXNefa c.bcscat2#c.calcium
estat ic
logit AnyDzCalc $model11 $SeasonXNefa c.bcscat2#c.lognefasq
estat ic
esttab, aic ///
    title("Model 11")

/*Model 12*/
eststo clear
global model12 "$LactCat $bcscat calcium"
logit AnyDzCalc $model12
estat ic
logit AnyDzCalc $model12 $LactXBCS
estat ic
logit AnyDzCalc $model12 $LactXCalc
estat ic
logit AnyDzCalc $model12 c.bcscat2#c.calcium
estat ic
/*no interaction*/

/*Model 13*/
eststo clear
global model13 "$LactCat $season $bcscat lognefasq"
logit AnyDzCalc $model13 $LactXSeason
estat ic
logit AnyDzCalc $model13 $LactXBCS
estat ic
logit AnyDzCalc $model13 $LactXNefa
estat ic
logit AnyDzCalc $model13 $SeasonXBCS
estat ic
logit AnyDzCalc $model13 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model13 $SeasonXNefa c.bcscat2#c.lognefasq
estat ic
esttab, aic ///
    title("Model 13")

/*Model 14*/
eststo clear
global model14 "$LactCat $season $bcscat calcium"
logit AnyDzCalc $model14
estat ic
logit AnyDzCalc $model14 $LactXSeason
estat ic
logit AnyDzCalc $model14 $LactXBCS
estat ic
logit AnyDzCalc $model14 $LactXCalc
estat ic
logit AnyDzCalc $model14 $SeasonXBCS
estat ic
logit AnyDzCalc $model14 $SeasonXCalc
estat ic
logit AnyDzCalc $model14 c.bcscat2#c.calcium
estat ic
/*no interactions*/

/*Model 15*/
eststo clear
global model15 "$LactCat $season $bcscat bhb calcium"
logit AnyDzCalc $model15
estat ic
logit AnyDzCalc $model15 $LactXSeason
estat ic
logit AnyDzCalc $model15 $LactXBCS
estat ic
logit AnyDzCalc $model15 $LactXbhb
estat ic
logit AnyDzCalc $model15 $LactXCalc
estat ic
logit AnyDzCalc $model15 $SeasonXBCS
estat ic
logit AnyDzCalc $model15 $SeasonXbhb
estat ic
logit AnyDzCalc $model15 $SeasonXCalc
estat ic
logit AnyDzCalc $model15 c.bcscat2#c.bhb
estat ic
logit AnyDzCalc $model15 c.bcscat2#c.calcium
estat ic
logit AnyDzCalc $model15 c.bhb#c.calcium
estat ic
/*no interactions*/

/*Model 16*/
eststo clear
global model16 "$LactCat calcium"
logit AnyDzCalc $model16
estat ic
logit AnyDzCalc $model16 $LactXCalc
estat ic
/*no interactions*/

/*Model 17*/
eststo clear
global model17 "$LactCat $season $cholesterol $bcscat bhb calcium lognefasq"
logit AnyDzCalc $model17 $LactXSeason
estat ic logit AnyDzCalc $model17 $LactXChol estat ic logit AnyDzCalc $model17 $LactXBCS estat i c logit AnyDzCalc $model17 $LactXbhb estat ic logit AnyDzCalc $model17 $LactXCalc estat ic l ogit AnyDzCalc $model17 $LactXNefa estat ic logit AnyDzCalc $model17 $CholXSeason estat ic logit AnyDzCalc $model17 $SeasonXBCS estat ic logit AnyDzCalc $model17 $SeasonXbhb estat ic logit AnyDzCalc $model17 $SeasonXCalc estat ic logit AnyDzCalc $model17 $SeasonXNefa estat ic eststo logit AnyDzCalc $m odel17 $SeasonXNefa $CholXBCS estat ic logi t AnyDzCalc $model17 $SeasonXNefa $CholXbhb estat ic logit AnyDzCalc $model17 $SeasonXNe fa $CalcXChol estat ic 208 logit AnyDzCalc $model17 $SeasonXNefa $NefaXChol estat ic logit AnyDz Calc $model17 $SeasonXNefa c.bcscat#c.bhb esta t ic logit AnyDzCalc $model17 $SeasonXNefa c.bcscat#c.calcium estat ic logit AnyD zCalc $model17 $SeasonXNefa c.bcscat#c.lognefasq estat ic logit AnyDzCalc $model17 $SeasonXNefa c.bhb#c.c alcium estat ic logit AnyDzCalc $model1 7 $SeasonXNefa c.bhb#c.lognefasq estat ic logit AnyDzCalc $model17 $SeasonXNefa c.cal cium#c.lognefasq estat ic esttab, aic /// title("Model 17") /*Model 18*/ eststo cl ear global model18 "$LactCat $season $bcscat bhb lognefasq" logit AnyDzCalc $model18 $LactXSeason estat ic logit AnyDzCalc $model18 $LactXBCS estat ic logit AnyDzCalc $model18 $LactXbhb estat ic logit AnyDzCalc $model18 $LactX Nefa estat ic logit AnyDzCalc $model18 $Sea sonXBCS estat ic logit AnyDzCalc $model18 $SeasonXbhb estat ic logit An yDzCalc $model18 $SeasonXNefa estat ic eststo logit AnyDzCalc $model18 $SeasonXNefa c.bcscat2#c.bhb estat ic logit AnyDzCalc $model18 $SeasonXNefa c .bcscat2#c.lognefasq estat ic logit AnyDzCalc $model18 $SeasonXNefa c.lognefasq# c.bhb estat ic esttab, aic /// title("Model 18") /*Model 19*/ eststo clear global mode l19 "$LactCat $season $cholesterol calcium lognef asq" logit AnyDzCalc $model19 $LactXSeason estat ic logit AnyDzCalc $model19 $CholXLact estat ic logit AnyDzCalc $model19 
    $LactXCalc
estat ic
logit AnyDzCalc $model19 $LactXNefa
estat ic
logit AnyDzCalc $model19 $SeasonXChol
estat ic
logit AnyDzCalc $model19 $SeasonXCalc
estat ic
logit AnyDzCalc $model19 $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model19 $SeasonXNefa $CalcXChol
estat ic
logit AnyDzCalc $model19 c.calcium#c.lognefasq
estat ic
esttab, aic ///
    title("Model 19")

********************************************************************************
/*Step 4: Test random effects*/
********************************************************************************
/*Note: testing random effects for inclusion in each candidate model using AIC
values. If variance is not estimated, excluded. Exported results to a table
to compare AICs.*/

/*Model 1*/
eststo clear
global model1 "$LactCat $season $bcscat calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model1
eststo
melogit AnyDzCalc $model1 || FarmId:
eststo
melogit AnyDzCalc $model1 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model1 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 1") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 2*/
eststo clear
global model2 "$LactCat $season bcscat bhb calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model2 if LstcId<300
eststo
melogit AnyDzCalc $model2 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model2 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model2 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 2") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 3*/
eststo clear
global model3 "$LactCat bhb calcium lognefasq"
melogit AnyDzCalc $model3 if LstcId<300
eststo
melogit AnyDzCalc $model3 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model3 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model3 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 3") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 4*/
eststo clear
global model4 "$LactCat $season calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model4 if LstcId<300
eststo
melogit AnyDzCalc $model4 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model4 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model4 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 4") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 5*/
eststo clear
global model5 "$LactCat bcscat bhb calcium lognefasq c.bcscat2#c.lognefasq"
melogit AnyDzCalc $model5 if LstcId<300
eststo
melogit AnyDzCalc $model5 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model5 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model5 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 5") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 6*/
eststo clear
global model6 "$LactCat $bcscat bhb calcium"
melogit AnyDzCalc $model6 if LstcId<300
eststo
melogit AnyDzCalc $model6 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model6 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model6 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 6") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 7*/
eststo clear
global model7 "$LactCat $season bhb calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model7 if LstcId<300
eststo
melogit AnyDzCalc $model7 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model7 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model7 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 7") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 8*/
eststo clear
global model8 "$LactCat bhb calcium"
melogit AnyDzCalc $model8 if LstcId<300
eststo
melogit AnyDzCalc $model8 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model8 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model8 if LstcId<300 ///
    || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 8") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 9*/
eststo clear
global model9 "$LactCat calcium lognefasq"
melogit AnyDzCalc $model9 if LstcId<300
eststo
melogit AnyDzCalc $model9 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model9 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model9 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 9") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 10*/
eststo clear
global model10 "$LactCat $bcscat calcium lognefasq"
melogit AnyDzCalc $model10 if LstcId<300
eststo
melogit AnyDzCalc $model10 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model10 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model10 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 10") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 11*/
eststo clear
global model11 "$LactCat $season $cholesterol $bcscat calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model11
eststo
melogit AnyDzCalc $model11 || FarmId:
eststo
melogit AnyDzCalc $model11 || Cohort:
eststo
melogit AnyDzCalc $model11 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 11") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 12*/
eststo clear
global model12 "$LactCat $bcscat calcium"
melogit AnyDzCalc $model12
eststo
melogit AnyDzCalc $model12 || FarmId:
eststo
melogit AnyDzCalc $model12 || Cohort:
eststo
melogit AnyDzCalc $model12 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 12") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 13*/
eststo clear
global model13 "$LactCat $season $bcscat lognefasq"
melogit AnyDzCalc $model13
eststo
melogit AnyDzCalc $model13 || FarmId:
eststo
melogit AnyDzCalc $model13 || Cohort:
eststo
melogit AnyDzCalc $model13 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 13") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 14*/
eststo clear
global model14 "$LactCat $season $bcscat calcium"
melogit AnyDzCalc $model14
eststo
melogit AnyDzCalc $model14 || FarmId:
eststo
melogit AnyDzCalc $model14 || Cohort:
eststo
melogit AnyDzCalc $model14 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 14") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 15*/
eststo clear
global model15 "$LactCat $season $bcscat bhb calcium"
melogit AnyDzCalc $model15
eststo
melogit AnyDzCalc $model15 || FarmId:
eststo
melogit AnyDzCalc $model15 || Cohort:
eststo
melogit AnyDzCalc $model15 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 15") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 16*/
eststo clear
global model16 "$LactCat calcium"
melogit AnyDzCalc $model16
eststo
melogit AnyDzCalc $model16 || FarmId:
eststo
melogit AnyDzCalc $model16 || Cohort:
eststo
melogit AnyDzCalc $model16 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 16") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 17*/
eststo clear
global model17 "$LactCat $season $cholesterol $bcscat bhb calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model17
eststo
melogit AnyDzCalc $model17 || FarmId:
eststo
melogit AnyDzCalc $model17 || Cohort:
eststo
melogit AnyDzCalc $model17 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 17") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 18*/
eststo clear
global model18 "$LactCat $season $bcscat bhb lognefasq $SeasonXNefa"
melogit AnyDzCalc $model18
eststo
melogit AnyDzCalc $model18 || FarmId:
eststo
melogit AnyDzCalc $model18 || Cohort:
eststo
melogit AnyDzCalc $model18 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 18") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model
19*/
eststo clear
global model19 "$LactCat $season $cholesterol calcium lognefasq $SeasonXNefa"
melogit AnyDzCalc $model19
eststo
melogit AnyDzCalc $model19 || FarmId:
eststo
melogit AnyDzCalc $model19 || Cohort:
eststo
melogit AnyDzCalc $model19 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 19") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

********************************************************************************
/*Step 5: Reduce pool of candidate models*/
********************************************************************************
/*Note: in this step, we ran the above models with the random effects included
(where applicable) and exported the AIC values. Then, we ordered AICs and kept
all within 4 AICs of the "best" model*/

eststo clear
melogit AnyDzCalc $model1
eststo
melogit AnyDzCalc $model2
eststo
melogit AnyDzCalc $model3
eststo
melogit AnyDzCalc $model4
eststo
melogit AnyDzCalc $model5
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 1-5")

eststo clear
melogit AnyDzCalc $model6
eststo
melogit AnyDzCalc $model7
eststo
melogit AnyDzCalc $model8
eststo
melogit AnyDzCalc $model9 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model10
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 6-10")

eststo clear
melogit AnyDzCalc $model11
eststo
melogit AnyDzCalc $model12
eststo
melogit AnyDzCalc $model13
eststo
melogit AnyDzCalc $model14
eststo
melogit AnyDzCalc $model15
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 11-15")

eststo clear
melogit AnyDzCalc $model16 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model17
eststo
melogit AnyDzCalc $model18
eststo
melogit AnyDzCalc $model19
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 16-19")

********************************************************************************
/*Step 6: Model Diagnostics*/
********************************************************************************
/* Note: for model
diagnostics, recorded discriminant ability (by using ROC) and tested
goodness-of-fit using hosmer-lemeshow. Looked at deviance as well as leverage
graphs */

logit AnyDzCalc $model4
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LactationNumber season calcium lognefasq if LstcId==258
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat dec id dv

logit AnyDzCalc $model1
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LactationNumber season rounddrybcs calcium lognefasq if LstcId==258
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat dec id dv

logit AnyDzCalc $model7
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LactationNumber season bhb calcium lognefasq if LstcId==258 | LstcId==190
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat dec id dv

logit AnyDzCalc $model2
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat dec id dv

logit AnyDzCalc $model19
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat dec id dv

logit AnyDzCalc $model11
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, ///
    mlab(LstcId)
drop phat hat dec id dv

********************************************************************************
/*Step 7: Internal validation */
********************************************************************************
/*Calculating ROC using 500 bootstrapped samples for each candidate model*/
********************************************************************************
/*Note: For each model, I created a program that calculates AUC by hand. Added
a bsample line, so that with each run, a bootstrapped sample is used. I
simulated this calculation 500 times, and averaged the predictive ability to
get a bootstrapped estimate*/

/*Model 4*/
/*running the model and saving the coefficients*/
use full, clear
logit AnyDzCalc $model4
matrix b=e(b)
matrix list b
global b_1_4= b[1,1]
global b_2_4= b[1,2]
global b_3_4= b[1,3]
global b_4_4= b[1,4]
global b_5_4= b[1,5]
global b_6_4= b[1,6]
global b_7_4= b[1,7]
global b_8_4= b[1,8]
global b_9_4= b[1,9]
global b_10_4= b[1,10]
global b_11_4= b[1,11]

program model4, rclass
    version 14.2
    /* 1. calculate pprob */
    use full, clear
    bsample
    gen logodds = $b_11_4 + ($b_1_4 * LactCat2) + ($b_2_4 * LactCat3) + ///
        ($b_3_4 * season2) + ($b_4_4 * season3) + ($b_5_4 * season4) + ///
        ($b_6_4 * calcium) + ($b_7_4 * lognefasq) + ///
        ($b_8_4 * season2 * lognefasq) + ($b_9_4 * season3 * lognefasq) + ($b_10_4 * season4 * lognefasq)
    gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
    gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2. Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant= 0
    replace concordant =1 if prob1 > prob0
    gen discordant= 0
    replace discordant =1 if prob1 < prob0
    gen tied= 0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4.
    Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT= tiedN/17112
    return scalar model4 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model4), reps(500) seed(12345): model4
egen AUC1 = mean(AUC)
/* AUC1 .7224417 */

/*Model 1*/
use full, clear
melogit AnyDzCalc $model1
matrix b=e(b)
matrix list b
global b_1_1= b[1,1]
global b_2_1= b[1,2]
global b_3_1= b[1,3]
global b_4_1= b[1,4]
global b_5_1= b[1,5]
global b_6_1= b[1,6]
global b_7_1= b[1,7]
global b_8_1= b[1,8]
global b_9_1= b[1,9]
global b_10_1= b[1,10]
global b_11_1= b[1,11]
global b_12_1= b[1,12]

program model1, rclass
    version 14.2
    /* 1. calculate pprob */
    use full, clear
    bsample
    gen logodds = $b_12_1 + ($b_1_1 * LactCat2) + ($b_2_1 * LactCat3) + ///
        ($b_3_1 * season2) + ($b_4_1 * season3) + ($b_5_1 * season4) + ///
        ($b_6_1 * bcscat2) + ($b_7_1 * calcium) + ($b_8_1 * lognefasq) + ///
        ($b_9_1 * season2 * lognefasq) + ($b_10_1 * season3 * lognefasq) + ///
        ($b_11_1 * season4 * lognefasq)
    gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
    gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2. Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant= 0
    replace concordant =1 if prob1 > prob0
    gen discordant= 0
    replace discordant =1 if prob1 < prob0
    gen tied= 0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4.
    Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT= tiedN/17112
    return scalar model1 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model1), reps(500) seed(12345): model1
egen AUC1 = mean(AUC)
/*AUC1 .7261916 */

/*Model 7*/
use full, clear
logit AnyDzCalc $model7
matrix b=e(b)
matrix list b
global b_1_7= b[1,1]
global b_2_7= b[1,2]
global b_3_7= b[1,3]
global b_4_7= b[1,4]
global b_5_7= b[1,5]
global b_6_7= b[1,6]
global b_7_7= b[1,7]
global b_8_7= b[1,8]
global b_9_7= b[1,9]
global b_10_7= b[1,10]
global b_11_7= b[1,11]
global b_12_7= b[1,12]

program model7, rclass
    version 14.2
    /* 1. calculate pprob */
    use full, clear
    bsample
    gen logodds = $b_12_7 + ($b_1_7 * LactCat2) + ($b_2_7 * LactCat3) ///
        + ($b_3_7 * season2) + ($b_4_7 * season3) + ($b_5_7 * season4) ///
        + ($b_6_7 * bhb) + ($b_7_7 * calcium) + ($b_8_7 * lognefasq) ///
        + ($b_9_7 * season2 * lognefasq) + ($b_10_7 * season3 * lognefasq) ///
        + ($b_11_7 * season4 * lognefasq)
    gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
    gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2. Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant= 0
    replace concordant =1 if prob1 > prob0
    gen discordant= 0
    replace discordant =1 if prob1 < prob0
    gen tied= 0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4.
    Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT= tiedN/17112
    return scalar model7 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model7), reps(500) seed(12345): model7
egen AUC1 = mean(AUC)
/* AUC1 .7230101 */

/*Model 2*/
use full, clear
melogit AnyDzCalc $model2
matrix b=e(b)
matrix list b
global b_1_2= b[1,1]
global b_2_2= b[1,2]
global b_3_2= b[1,3]
global b_4_2= b[1,4]
global b_5_2= b[1,5]
global b_6_2= b[1,6]
global b_7_2= b[1,7]
global b_8_2= b[1,8]
global b_9_2= b[1,9]
global b_10_2= b[1,10]
global b_11_2= b[1,11]
global b_12_2= b[1,12]
global b_13_2= b[1,13]

program model2, rclass
    version 14.2
    /* 1. calculate pprob */
    use full, clear
    bsample
    gen logodds = $b_13_2 + ($b_1_2 * LactCat2) + ($b_2_2 * LactCat3) + ///
        ($b_3_2 * season2) + ($b_4_2 * season3) + ($b_5_2 * season4) ///
        + ($b_6_2 * bcscat) + ($b_7_2 * bhb) + ($b_8_2 * calcium) + ///
        ($b_9_2 * lognefasq) + ($b_10_2 * season2 * lognefasq) + ///
        ($b_11_2 * season3 * lognefasq) + ($b_12_2 * season4 * lognefasq)
    gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
    gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2. Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant= 0
    replace concordant =1 if prob1 > prob0
    gen discordant= 0
    replace discordant =1 if prob1 < prob0
    gen tied= 0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4.
    Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT= tiedN/17112
    return scalar model2 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model2), reps(500) seed(12345): model2
egen AUC1 = mean(AUC)
/*AUC1 .7274488 */

/*Model 11*/
use full, clear
melogit AnyDzCalc $model11
matrix b=e(b)
matrix list b
global b_1_11= b[1,1]
global b_2_11= b[1,2]
global b_3_11= b[1,3]
global b_4_11= b[1,4]
global b_5_11= b[1,5]
global b_6_11= b[1,6]
global b_7_11= b[1,7]
global b_8_11= b[1,8]
global b_9_11= b[1,9]
global b_10_11= b[1,10]
global b_11_11= b[1,11]
global b_12_11= b[1,12]
global b_13_11= b[1,13]
global b_14_11= b[1,14]
global b_15_11= b[1,15]

program model11, rclass
    version 14.2
    /* 1. calculate pprob */
    use full, clear
    bsample
    gen logodds = $b_15_11 + ($b_1_11 * LactCat2) + ($b_2_11 * LactCat3) + ///
        ($b_3_11 * season2) + ($b_4_11 * season3) + ($b_5_11 * season4) ///
        + ($b_6_11 * cholesterol42) + ($b_7_11 * cholesterol43) + ($b_8_11 * cholesterol44) + ///
        ($b_9_11 * bcscat2) + ($b_10_11 * calcium) + ($b_11_11 * lognefasq) + ///
        ($b_12_11 * season2 * lognefasq) + ($b_13_11 * season3 * lognefasq) + ($b_14_11 * season4 * lognefasq)
    gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
    gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2. Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant= 0
    replace concordant =1 if prob1 > prob0
    gen discordant= 0
    replace discordant =1 if prob1 < prob0
    gen tied= 0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4.
    Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT= tiedN/17112
    return scalar model11 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model11), reps(500) seed(12345): model11
egen AUC1 = mean(AUC)
/*AUC1 .7351109 */

********************************************************************************
/*Step 8: Model Averaging */
********************************************************************************
/*Note: We calculated AIC weights in excel. First, we average predictions from
the original models. Then, we create bootstrapped average predictions over
500 reps*/

use full, clear

/*Averaging predictions from models*/
/*Model 4*/
melogit AnyDzCalc $model4
predict phat
rename phat phat1
/*Model 1*/
melogit AnyDzCalc $model1
predict phat
rename phat phat2
/*Model 7*/
melogit AnyDzCalc $model7
predict phat
rename phat phat3
/*Model 2*/
melogit AnyDzCalc $model2
predict phat
rename phat phat4
/*Model 11*/
melogit AnyDzCalc $model11
predict phat
rename phat phat5

/*Averaged*/
gen avgphat= (0.36 * phat1) + (.27 * phat2) + (.18 * phat3) + (.13 * phat4) + ///
    (.06 * phat5)
roctab AnyDzCalc avgphat, detail graph
/* 0.7257 */
drop if LstcId==.
drop if avgphat==.
roctg AnyDzCalc avgphat, optimal smooth lowess co interval(0.01)
/*cut 0.335*/

/*NPV and PPV*/
gen positive = 0
replace positive = 1 if avgphat > 0.335
replace positive =. if avgphat==.
tab AnyDzCalc positive, column

/*Averaging predictions from 500 bootstrapped samples*/
use full, clear

program averaged, rclass
    version 14.2
    /* 1.
    calculate pprob */
    use full, clear
    bsample
    gen logodds4 = $b_11_4 + ($b_1_4 * LactCat2) + ($b_2_4 * LactCat3) + ///
        ($b_3_4 * season2) + ($b_4_4 * season3) + ($b_5_4 * season4) + ///
        ($b_6_4 * calcium) + ($b_7_4 * lognefasq) + ///
        ($b_8_4 * season2 * lognefasq) + ($b_9_4 * season3 * lognefasq) + ///
        ($b_10_4 * season4 * lognefasq)
    gen logodds1 = $b_12_1 + ($b_1_1 * LactCat2) + ($b_2_1 * LactCat3) + ///
        ($b_3_1 * season2) + ($b_4_1 * season3) + ($b_5_1 * season4) + ///
        ($b_6_1 * bcscat2) + ($b_7_1 * calcium) + ($b_8_1 * lognefasq) + ///
        ($b_9_1 * season2 * lognefasq) + ($b_10_1 * season3 * lognefasq) + ///
        ($b_11_1 * season4 * lognefasq)
    gen logodds7 = $b_12_7 + ($b_1_7 * LactCat2) + ($b_2_7 * LactCat3) ///
        + ($b_3_7 * season2) + ($b_4_7 * season3) + ($b_5_7 * season4) ///
        + ($b_6_7 * bhb) + ($b_7_7 * calcium) + ($b_8_7 * lognefasq) ///
        + ($b_9_7 * season2 * lognefasq) + ($b_10_7 * season3 * lognefasq) ///
        + ($b_11_7 * season4 * lognefasq)
    gen logodds2 = $b_13_2 + ($b_1_2 * LactCat2) + ($b_2_2 * LactCat3) + ///
        ($b_3_2 * season2) + ($b_4_2 * season3) + ($b_5_2 * season4) ///
        + ($b_6_2 * bcscat) + ($b_7_2 * bhb) + ($b_8_2 * calcium) + ///
        ($b_9_2 * lognefasq) + ($b_10_2 * season2 * lognefasq) + ///
        ($b_11_2 * season3 * lognefasq) + ($b_12_2 * season4 * lognefasq)
    gen logodds11 = $b_15_11 + ($b_1_11 * LactCat2) + ($b_2_11 * LactCat3) + ///
        ($b_3_11 * season2) + ($b_4_11 * season3) + ($b_5_11 * season4) ///
        + ($b_6_11 * cholesterol42) + ($b_7_11 * cholesterol43) + ($b_8_11 * cholesterol44) + ///
        ($b_9_11 * bcscat2) + ($b_10_11 * calcium) + ($b_11_11 * lognefasq) + ///
        ($b_12_11 * season2 * lognefasq) + ($b_13_11 * season3 * lognefasq) + ($b_14_11 * season4 * lognefasq)
    gen avglogodds= (0.36 * logodds4) + (.27 * logodds1) + (.18 * logodds7) ///
        + (.13 * logodds2) + (0.06 * logodds11)
    gen prob0 = 2.718 ^ avglogodds if AnyDzCalc==0
    gen prob1 = 2.718 ^ avglogodds if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2.
    Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant= 0
    replace concordant =1 if prob1 > prob0
    gen discordant= 0
    replace discordant =1 if prob1 < prob0
    gen tied= 0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4. Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT= tiedN/17112
    return scalar averaged = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(averaged), reps(500) seed(12345): averaged
egen AUCall = mean(AUC)
/* AUCall .7257846 */

M2_AIM1INFLAMMATION.DO

/*
Title: Inflammation model
Author: Lauren Wisnieski
Purpose: build Inflammation model
Revision history:
 - Created December 12, 2017
 - Edited June 6, 2018
Steps:
 Step 1: Best subsets AIC selection
 Step 2: Reduce pool of candidate models
 Step 3: Test all two-way interactions
 Step 4: Test random effects
 Step 5: Choose final candidate models
 Step 6: Model diagnostics
 Step 7: Model averaging
 Step 8: Internal validation
Candidate variables:
 - NL ratio
 - Ntot
 - Saaugul
 - Vitamin D
 - Albumin
 - Haptoglobin
 - Ltot
 - Etot
 - Btot
 - Bdtot
 - Mtot
 - parity
 - season
*/

use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", clear
save full, replace

********************************************************************************
/*Step 1: Best subsets AIC selection*/
********************************************************************************
/* Note: We now perform best subsets AIC selection in R. R can handle
categorical variables and is also faster.
Selected models that were within 4 AICs of best model*/

********************************************************************************
/*Step 2: Testing interaction terms for each model */
********************************************************************************
/* Note: Retained interaction term if reduced AIC value of model*/

/*Creating macros for all interaction terms and variables to speed up analysis*/
global Mtot "MtotCat2 MtotCat3 MtotCat4 MtotCat5"
global saaugml "saaugml42 saaugml43 saaugml44"
global vitamind "vitamind52 vitamind53 vitamind54 vitamind55"
global albumin "albumin52 albumin53 albumin54 albumin55"
global haptoglobin "haptoglobin52 haptoglobin53 haptoglobin54 haptoglobin55"
global Ltot "Ltot52 Ltot53 Ltot54 Ltot55"
global Etot "Etot52 Etot53 Etot54 Etot55"
global Btot "BtotCat2 BtotCat3 BtotCat4 BtotCat5"
global BDtot "BDtotCat2 BDtotCat3 BDtotCat4 BDtotCat5"
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXMtot "c.LactCat2#c.MtotCat2 c.LactCat2#c.MtotCat3 c.LactCat2#c.MtotCat4 c.LactCat2#c.MtotCat5 c.LactCat3#c.MtotCat2 c.LactCat3#c.MtotCat3 c.LactCat3#c.MtotCat4 c.LactCat3#c.MtotCat5"
global LactXsaa "c.LactCat2#c.saaugml42 c.LactCat2#c.saaugml43 c.LactCat2#c.saaugml44 c.LactCat3#c.saaugml42 c.LactCat3#c.saaugml43 c.LactCat3#c.saaugml44"
global LactXalb "c.LactCat2#c.albumin52 c.LactCat2#c.albumin53 c.LactCat2#c.albumin54 c.LactCat2#c.albumin55 c.LactCat3#c.albumin52 c.LactCat3#c.albumin53 c.LactCat3#c.albumin54 c.LactCat3#c.albumin55"
global LactXhapto "c.LactCat2#c.haptoglobin52 c.LactCat2#c.haptoglobin53 c.LactCat2#c.haptoglobin54 c.LactCat2#c.haptoglobin55 c.LactCat3#c.haptoglobin52 c.LactCat3#c.haptoglobin53 c.LactCat3#c.haptoglobin54 c.LactCat3#c.haptoglobin55"
global LactXNL "c.LactCat2#c.NLratio c.LactCat3#c.NLratio"
global SeasonXMtot "c.season2#c.MtotCat2 c.season2#c.MtotCat3 c.season2#c.MtotCat4 c.season2#c.MtotCat5 c.season3#c.MtotCat2 c.season3#c.MtotCat3 c.season3#c.MtotCat4 c.season3#c.MtotCat5 c.season4#c.MtotCat2 c.season4#c.MtotCat3 c.season4#c.MtotCat4 c.season4#c.MtotCat5"
global SeasonXsaa "c.season2#c.saaugml42 c.season2#c.saaugml43 c.season2#c.saaugml44 c.season3#c.saaugml42 c.season3#c.saaugml43 c.season3#c.saaugml44 c.season4#c.saaugml42 c.season4#c.saaugml43 c.season4#c.saaugml44"
global SeasonXalb "c.season2#c.albumin52 c.season2#c.albumin53 c.season2#c.albumin54 c.season2#c.albumin55 c.season3#c.albumin52 c.season3#c.albumin53 c.season3#c.albumin54 c.season3#c.albumin55 c.season4#c.albumin52 c.season4#c.albumin53 c.season4#c.albumin54 c.season4#c.albumin55"
global SeasonXNL "c.season2#c.NLratio c.season3#c.NLratio c.season4#c.NLratio"
global MtotXsaa "c.MtotCat2#c.saaugml42 c.MtotCat2#c.saaugml43 c.MtotCat2#c.saaugml44 c.MtotCat3#c.saaugml42 c.MtotCat3#c.saaugml43 c.MtotCat3#c.saaugml44 c.MtotCat4#c.saaugml42 c.MtotCat4#c.saaugml43 c.MtotCat4#c.saaugml44 c.MtotCat5#c.saaugml42 c.MtotCat5#c.saaugml43 c.MtotCat5#c.saaugml44"
global MtotXalb "c.MtotCat2#c.albumin52 c.MtotCat2#c.albumin53 c.MtotCat2#c.albumin54 c.MtotCat2#c.albumin55 c.MtotCat3#c.albumin52 c.MtotCat3#c.albumin53 c.MtotCat3#c.albumin54 c.MtotCat3#c.albumin55 c.MtotCat4#c.albumin52 c.MtotCat4#c.albumin53 c.MtotCat4#c.albumin54 c.MtotCat4#c.albumin55 c.MtotCat5#c.albumin52 c.MtotCat5#c.albumin53 c.MtotCat5#c.albumin54 c.MtotCat5#c.albumin55"
global MtotXNL "c.MtotCat2#c.NLratio c.MtotCat3#c.NLratio c.MtotCat4#c.NLratio c.MtotCat5#c.NLratio"
global saaXalb "c.saaugml42#c.albumin52 c.saaugml42#c.albumin53 c.saaugml42#c.albumin54 c.saaugml42#c.albumin55 c.saaugml43#c.albumin52 c.saaugml43#c.albumin53 c.saaugml43#c.albumin54 c.saaugml43#c.albumin55 c.saaugml44#c.albumin52 c.saaugml44#c.albumin53 c.saaugml44#c.albumin54 c.saaugml44#c.albumin55"
global saaXNL "c.saaugml42#c.NLratio c.saaugml43#c.NLratio c.saaugml44#c.NLratio"
global albXNL "c.albumin52#c.NLratio c.albumin53#c.NLratio c.albumin54#c.NLratio c.albumin55#c.NLratio"
global LactXhap "c.LactCat2#c.haptoglobin52 c.LactCat2#c.haptoglobin53 c.LactCat2#c.haptoglobin54 c.LactCat2#c.haptoglobin55 c.LactCat3#c.haptoglobin52 c.LactCat3#c.haptoglobin53 c.LactCat3#c.haptoglobin54 c.LactCat3#c.haptoglobin55"
global SeasonXhap "c.season2#c.haptoglobin52 c.season2#c.haptoglobin53 c.season2#c.haptoglobin54 c.season2#c.haptoglobin55 c.season3#c.haptoglobin52 c.season3#c.haptoglobin53 c.season3#c.haptoglobin54 c.season3#c.haptoglobin55 c.season4#c.haptoglobin52 c.season4#c.haptoglobin53 c.season4#c.haptoglobin54 c.season4#c.haptoglobin55"
global MtotXhap "c.MtotCat2#c.haptoglobin52 c.MtotCat2#c.haptoglobin53 c.MtotCat2#c.haptoglobin54 c.MtotCat2#c.haptoglobin55 c.MtotCat3#c.haptoglobin52 c.MtotCat3#c.haptoglobin53 c.MtotCat3#c.haptoglobin54 c.MtotCat3#c.haptoglobin55 c.MtotCat4#c.haptoglobin52 c.MtotCat4#c.haptoglobin53 c.MtotCat4#c.haptoglobin54 c.MtotCat4#c.haptoglobin55 c.MtotCat5#c.haptoglobin52 c.MtotCat5#c.haptoglobin53 c.MtotCat5#c.haptoglobin54 c.MtotCat5#c.haptoglobin55"
global saaXhap "c.saaugml42#c.haptoglobin52 c.saaugml42#c.haptoglobin53 c.saaugml42#c.haptoglobin54 c.saaugml42#c.haptoglobin55 c.saaugml43#c.haptoglobin52 c.saaugml43#c.haptoglobin53 c.saaugml43#c.haptoglobin54 c.saaugml43#c.haptoglobin55 c.saaugml44#c.haptoglobin52 c.saaugml44#c.haptoglobin53 c.saaugml44#c.haptoglobin54 c.saaugml44#c.haptoglobin55"
global hapXNL "c.haptoglobin52#c.NLratio c.haptoglobin53#c.NLratio c.haptoglobin54#c.NLratio c.haptoglobin55#c.NLratio"
global albXhap "c.albumin52#c.haptoglobin52 c.albumin52#c.haptoglobin53 c.albumin52#c.haptoglobin54 c.albumin52#c.haptoglobin55 c.albumin53#c.haptoglobin52 c.albumin53#c.haptoglobin53 c.albumin53#c.haptoglobin54 c.albumin53#c.haptoglobin55 c.albumin54#c.haptoglobin52 c.albumin54#c.haptoglobin53 c.albumin54#c.haptoglobin54 c.albumin54#c.haptoglobin55 c.albumin55#c.haptoglobin52 c.albumin55#c.haptoglobin53 c.albumin55#c.haptoglobin54 c.albumin55#c.haptoglobin55"
global LactXNtot "c.LactCat2#c.Ntot c.LactCat2#c.Ntot c.LactCat3#c.Ntot c.LactCat3#c.Ntot"
global SeasonXNtot "c.season2#c.Ntot c.season3#c.Ntot c.season4#c.Ntot"
global MtotXNtot "c.MtotCat2#c.Ntot c.MtotCat3#c.Ntot c.MtotCat4#c.Ntot c.MtotCat5#c.Ntot"
global saaXNtot "c.saaugml42#c.Ntot c.saaugml43#c.Ntot c.saaugml44#c.Ntot"
global albXNtot "c.albumin52#c.Ntot c.albumin53#c.Ntot c.albumin54#c.Ntot c.albumin55#c.Ntot"
global LactXLtot "c.LactCat2#c.Ltot52 c.LactCat2#c.Ltot53 c.LactCat2#c.Ltot54 c.LactCat2#c.Ltot55 c.LactCat3#c.Ltot52 c.LactCat3#c.Ltot53 c.LactCat3#c.Ltot54 c.LactCat3#c.Ltot55"
global SeasonXLtot "c.season2#c.Ltot52 c.season2#c.Ltot53 c.season2#c.Ltot54 c.season2#c.Ltot55 c.season3#c.Ltot52 c.season3#c.Ltot53 c.season3#c.Ltot54 c.season3#c.Ltot55 c.season4#c.Ltot52 c.season4#c.Ltot53 c.season4#c.Ltot54 c.season4#c.Ltot55"
global MtotXLtot "c.MtotCat2#c.Ltot52 c.MtotCat2#c.Ltot53 c.MtotCat2#c.Ltot54 c.MtotCat2#c.Ltot55 c.MtotCat3#c.Ltot52 c.MtotCat3#c.Ltot53 c.MtotCat3#c.Ltot54 c.MtotCat3#c.Ltot55 c.MtotCat4#c.Ltot52 c.MtotCat4#c.Ltot53 c.MtotCat4#c.Ltot54 c.MtotCat4#c.Ltot55 c.MtotCat5#c.Ltot52 c.MtotCat5#c.Ltot53 c.MtotCat5#c.Ltot54 c.MtotCat5#c.Ltot55"
global saaXLtot "c.saaugml42#c.Ltot52 c.saaugml42#c.Ltot53 c.saaugml42#c.Ltot54 c.saaugml42#c.Ltot55 c.saaugml43#c.Ltot52 c.saaugml43#c.Ltot53 c.saaugml43#c.Ltot54 c.saaugml43#c.Ltot55 c.saaugml44#c.Ltot52 c.saaugml44#c.Ltot53 c.saaugml44#c.Ltot54 c.saaugml44#c.Ltot55"
global albXLtot "c.albumin52#c.Ltot52 c.albumin52#c.Ltot53 c.albumin52#c.Ltot54 c.albumin52#c.Ltot55 c.albumin53#c.Ltot52 c.albumin53#c.Ltot53 c.albumin53#c.Ltot54 c.albumin53#c.Ltot55 c.albumin54#c.Ltot52 c.albumin54#c.Ltot53 c.albumin54#c.Ltot54 c.albumin54#c.Ltot55 c.albumin55#c.Ltot52 c.albumin55#c.Ltot53 c.albumin55#c.Ltot54 c.albumin55#c.Ltot55"

/*Model 1*/
eststo clear
global model1 "$LactCat $season $Mtot $saaugml $albumin NLratio"
logit AnyDzCalc $model1 $LactXSeason
estat ic
logit AnyDzCalc $model1 $LactXMtot
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model1 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $saaXalb $albXNL
estat ic
esttab, aic ///
    title("Model 1")

/*Model 2*/
eststo clear
global model2 "$LactCat $season $Mtot $saaugml"
logit AnyDzCalc $model2 $LactXSeason
estat ic
logit AnyDzCalc $model2 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model2 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXsaa
estat ic
esttab, aic ///
    title("Model 2")

/*Model 3*/
eststo clear
global model3 "$LactCat $season $Mtot NLratio"
logit AnyDzCalc $model3 $LactXSeason
estat ic
logit AnyDzCalc $model3 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model3 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXNL
estat ic
esttab, aic ///
    title("Model 3")

/*Model 4*/
eststo clear
global model4 "$LactCat $season $Mtot $saaugml $albumin"
logit AnyDzCalc $model4 $LactXSeason
estat ic
logit AnyDzCalc $model4 $LactXMtot
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb
estat ic
eststo
esttab, aic ///
    title("Model 4")

/*Model 5*/
eststo clear
global model5 "$LactCat $season $Mtot $saaugml $haptoglobin"
logit AnyDzCalc $model5
estat ic
logit AnyDzCalc $model5 $LactXSeason
estat ic
logit AnyDzCalc $model5 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model5 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXhap
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXhap
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXhap
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXhap
estat ic
esttab, aic ///
    title("Model 5")

/*Model 6 excluded: duplicate*/

/*Model 7*/
eststo clear
global model7 "$LactCat $season $Mtot $saaugml $albumin $haptoglobin NLratio"
logit AnyDzCalc $model7 $LactXSeason
estat ic
logit AnyDzCalc $model7 $LactXMtot
estat ic
logit AnyDzCalc $model7 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model7 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model7 $LactXMtot $LactXhap
estat ic
logit AnyDzCalc $model7 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model7 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model7 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model7 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model7 $LactXMtot $SeasonXhap
estat ic
logit AnyDzCalc $model7 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model7 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model7 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model7 $LactXMtot $MtotXhap
estat ic
logit AnyDzCalc $model7 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model7 $LactXMtot $saaXalb
estat ic
logit AnyDzCalc $model7 $LactXMtot $saaXalb $saaXhap
estat ic
logit AnyDzCalc $model7 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model7 $LactXMtot $saaXalb $albXhap
estat ic
logit AnyDzCalc $model7 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model7 $LactXMtot $saaXalb $hapXNL
estat ic
eststo
esttab, aic ///
    title("Model 7")

/*Model 8*/
eststo clear
global model8 "$LactCat $Mtot $saaugml NLratio"
logit AnyDzCalc $model8 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model8 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model8 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model8 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model8 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model8 $LactXMtot $saaXNL
estat ic
esttab, aic ///
    title("Model 8")

/*Model 9*/
eststo clear
global model9 "$LactCat $season $Mtot"
logit AnyDzCalc $model9
estat ic
logit AnyDzCalc $model9 $LactXSeason
estat ic
logit AnyDzCalc $model9 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model9 $LactXMtot $SeasonXMtot
estat ic
esttab, aic ///
    title("Model 9")

/*Model 10*/
eststo clear
global model10 "$LactCat $season $Mtot $saaugml"
logit AnyDzCalc $model10
estat ic
logit AnyDzCalc $model10 $LactXSeason
estat ic
logit AnyDzCalc $model10 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model10 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model10 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model10 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model10 $LactXMtot $MtotXsaa
estat ic
esttab, aic ///
    title("Model 10")

/*Model 11*/
eststo clear
global model11 "$LactCat $season $Mtot $saaugml Ntot"
logit AnyDzCalc $model11 $LactXSeason
estat ic
logit AnyDzCalc $model11 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model11 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model11 $LactXMtot $LactXNtot
estat ic
logit AnyDzCalc $model11 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model11 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model11 $LactXMtot $SeasonXNtot
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXNtot
estat ic
logit AnyDzCalc $model11 $LactXMtot $saaXNtot
estat ic
esttab, aic ///
    title("Model 11")

/*Model 12*/
eststo clear
global model12 "$LactCat $season $Mtot $saaugml $albumin Ntot"
logit AnyDzCalc $model12 $LactXSeason
estat ic
logit AnyDzCalc $model12 $LactXMtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $SeasonXMtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $SeasonXsaa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $SeasonXalb
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $SeasonXNtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $MtotXsaa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $MtotXalb
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $MtotXNtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $saaXalb $saaXNtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXNtot $saaXalb $albXNtot
estat ic
esttab, aic ///
    title("Model 12")

/*Model 14*/
eststo clear
global model14 "$LactCat $Mtot $saaugml $albumin NLratio"
logit AnyDzCalc $model14 $LactXMtot
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model14 $LactXMtot $saaXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXNL
estat ic
esttab, aic ///
    title("Model 14")

/*Model 15*/
eststo clear
global model15 "$LactCat $season $Mtot $albumin NLratio"
logit AnyDzCalc $model15
estat ic
logit AnyDzCalc $model15 $LactXSeason
estat ic
logit AnyDzCalc $model15 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model15 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model15 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model15 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model15 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $albXNL
estat ic
esttab, aic ///
    title("Model 15")

/*Model 16*/
eststo clear
global model16 "$LactCat $season $Mtot $Ntot"
logit AnyDzCalc $model16
estat ic
logit AnyDzCalc $model16 $LactXSeason
estat ic
logit AnyDzCalc $model16 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model16 $LactXMtot $LactXNtot
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXNtot
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXNtot
estat ic
esttab, aic ///
    title("Model 16")

/*Model 17*/
eststo clear
global model17 "$LactCat $season $Mtot $saaugml $Ntot"
logit AnyDzCalc $model17
estat ic
logit AnyDzCalc $model17 $LactXSeason
estat ic
logit AnyDzCalc $model17 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model17 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXNtot
estat ic
logit AnyDzCalc $model17 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model17 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model17 $LactXMtot $SeasonXNtot
estat ic
logit AnyDzCalc $model17 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model17 $LactXMtot $MtotXNtot
estat ic
logit AnyDzCalc $model17 $LactXMtot $saaXNtot
estat ic
esttab, aic ///
    title("Model 17")

/*Model 18*/
eststo clear
global model18 "$LactCat $season $Mtot $saaugml $haptoglobin NLratio"
logit AnyDzCalc $model18
estat ic
logit AnyDzCalc $model18 $LactXSeason
estat ic
logit AnyDzCalc $model18 $LactXMtot
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXhap
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXhap
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXhap
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $saaXhap
estat ic
logit AnyDzCalc $model18 $LactXMtot $saaXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $hapXNL
estat ic
eststo
esttab, aic ///
    title("Model 18")

/*Model 19*/
eststo clear
global model19 "$LactCat $Mtot NLratio"
logit AnyDzCalc $model19
estat ic
logit AnyDzCalc $model19 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model19 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXNL
estat ic
esttab, aic ///
    title("Model 19")

/*Model 20*/
eststo clear
global model20 "$LactCat $Mtot $saaugml"
logit AnyDzCalc $model20 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model20 $LactXsaa
estat ic
logit AnyDzCalc $model20 $MtotXsaa
estat ic
esttab, aic ///
    title("Model 20")

/*Model 22*/
eststo clear
global model22 "$LactCat $Mtot"
logit AnyDzCalc $model22
estat ic
logit AnyDzCalc $model22 $LactXMtot
estat ic
eststo
esttab, aic ///
    title("Model 22")

/*Model 23*/
eststo clear
global model23 "$LactCat $season $Mtot $saaugml NLratio"
logit AnyDzCalc $model23 $LactXSeason
estat ic
logit AnyDzCalc $model23 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model23 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model23 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $saaXNL
estat ic
esttab, aic ///
    title("Model 23")

/*Model 25*/
eststo clear
global model25 "$LactCat $season $Mtot $saaugml $albumin $Ltot"
logit AnyDzCalc $model25 $LactXSeason
estat ic
logit AnyDzCalc $model25 $LactXMtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model25 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model25 $LactXMtot $LactXLtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model25 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model25 ///
    $LactXMtot $SeasonXLtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model25 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model25 $LactXMtot $MtotXLtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model25 $LactXMtot $saaXalb $saaXLtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $saaXalb $albXLtot
estat ic
logit AnyDzCalc $model25 $LactXMtot $saaXalb $albXLtot
estat ic
esttab, aic ///
    title("Model 25")

********************************************************************************
/*Step 4: Test random effects*/
********************************************************************************
/*Note: testing random effects for inclusion in each candidate model using AIC
values. If variance is not estimated, excluded*/

/*Model 1*/
eststo clear
global model1 "$LactCat $season $Mtot $saaugml $albumin NLratio $LactXMtot $saaXalb"
melogit AnyDzCalc $model1
eststo
melogit AnyDzCalc $model1 || FarmId:
eststo
melogit AnyDzCalc $model1 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model1 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 1") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 2*/
eststo clear
global model2 "$LactCat $season $Mtot $saaugml $LactXMtot"
melogit AnyDzCalc $model2
eststo
melogit AnyDzCalc $model2 || FarmId:
eststo
melogit AnyDzCalc $model2 || Cohort:
eststo
melogit AnyDzCalc $model2 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 2") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 3*/
eststo clear
global model3 "$LactCat $season $Mtot NLratio $LactXMtot"
melogit AnyDzCalc $model3 if LstcId<300
eststo
melogit AnyDzCalc $model3 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model3 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model3 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 3") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 4*/
eststo clear
global model4 "$LactCat $season $Mtot $saaugml $albumin $LactXMtot $saaXalb"
melogit AnyDzCalc $model4
eststo
melogit AnyDzCalc $model4 || FarmId:
eststo
melogit AnyDzCalc $model4 || Cohort:
eststo
melogit AnyDzCalc $model4 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 4") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 5*/
eststo clear
global model5 "$LactCat $season $Mtot $saaugml $haptoglobin $LactXMtot"
melogit AnyDzCalc $model5
eststo
melogit AnyDzCalc $model5 || FarmId:
eststo
melogit AnyDzCalc $model5 || Cohort:
eststo
melogit AnyDzCalc $model5 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 5") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 7*/
eststo clear
global model7 "$LactCat $season $Mtot $saaugml $albumin $haptoglobin NLratio $LactXMtot $hapXNL $saaXalb"
melogit AnyDzCalc $model7 if LstcId<300
eststo
melogit AnyDzCalc $model7 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model7 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model7 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 7") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 8*/
eststo clear
global model8 "$LactCat $Mtot $saaugml NLratio $LactXMtot"
melogit AnyDzCalc $model8 if LstcId<300
eststo
melogit AnyDzCalc $model8 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model8 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model8 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 8") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 9*/
eststo clear
global model9 "$LactCat $season $Mtot $LactXMtot"
melogit AnyDzCalc $model9 if LstcId<300
eststo
melogit AnyDzCalc $model9 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model9 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model9 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 9") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 10*/
eststo clear
global model10 "$LactCat $season $Mtot $saaugml $LactXMtot"
melogit AnyDzCalc $model10 if LstcId<300
eststo
melogit AnyDzCalc $model10 if LstcId<300 || FarmId:
eststo
melogit AnyDzCalc $model10 if LstcId<300 || Cohort:
eststo
melogit AnyDzCalc $model10 if LstcId<300 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 10") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 11*/
eststo clear
global model11 "$LactCat $season $Mtot $saaugml Ntot $LactXMtot"
melogit AnyDzCalc $model11
eststo
melogit AnyDzCalc $model11 || FarmId:
eststo
melogit AnyDzCalc $model11 || Cohort:
eststo
melogit AnyDzCalc $model11 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 11") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 12*/
eststo clear
global model12 "$LactCat $season $Mtot $saaugml $albumin Ntot $LactXMtot $LactXNtot $saaXalb"
melogit AnyDzCalc $model12
eststo
melogit AnyDzCalc $model12 || FarmId:
eststo
melogit AnyDzCalc $model12 || Cohort:
eststo
melogit AnyDzCalc $model12 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 12") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 14*/
eststo clear
global model14 "$LactCat $Mtot $saaugml $albumin NLratio $LactXMtot $saaXalb"
melogit AnyDzCalc $model14
eststo
melogit AnyDzCalc $model14 || FarmId:
eststo
melogit AnyDzCalc $model14 || Cohort:
eststo
melogit AnyDzCalc $model14 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 14") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 15*/
eststo clear
global model15 "$LactCat $season $Mtot $albumin NLratio $LactXMtot"
melogit AnyDzCalc $model15
eststo
melogit AnyDzCalc $model15 || FarmId:
eststo
melogit AnyDzCalc $model15 || Cohort:
eststo
melogit AnyDzCalc $model15 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 15") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 16*/
eststo clear
global model16 "$LactCat $season $Mtot $Ntot $LactXMtot"
melogit AnyDzCalc $model16
eststo
melogit AnyDzCalc $model16 || FarmId:
eststo
melogit AnyDzCalc $model16 || Cohort:
eststo
melogit AnyDzCalc $model16 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 16") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 17*/
eststo clear
global model17 "$LactCat $season $Mtot $saaugml $Ntot $LactXMtot"
melogit AnyDzCalc $model17
eststo
melogit AnyDzCalc $model17 || FarmId:
eststo
melogit AnyDzCalc $model17 || Cohort:
eststo
melogit AnyDzCalc $model17 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 17") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 18*/
eststo clear
global model18 "$LactCat $season $Mtot $saaugml $haptoglobin NLratio $LactXMtot $hapXNL"
melogit AnyDzCalc $model18
eststo
melogit AnyDzCalc $model18 || FarmId:
eststo
melogit AnyDzCalc $model18 || Cohort:
eststo
melogit AnyDzCalc $model18 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 18") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 19*/
eststo clear
global model19 "$LactCat $Mtot NLratio $LactXMtot"
melogit AnyDzCalc $model19
eststo
melogit AnyDzCalc $model19 || FarmId:
eststo
melogit AnyDzCalc $model19 || Cohort:
eststo
melogit AnyDzCalc $model19 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 19") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 20*/
eststo clear
global model20 "$LactCat $Mtot $saaugml $LactXMtot"
melogit AnyDzCalc $model20
eststo
melogit AnyDzCalc $model20 || FarmId:
eststo
melogit AnyDzCalc $model20 || Cohort:
eststo
melogit AnyDzCalc $model20 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 20") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 22*/
eststo clear
global model22 "$LactCat $Mtot $LactXMtot"
melogit AnyDzCalc $model22
eststo
melogit AnyDzCalc $model22 || FarmId:
eststo
melogit AnyDzCalc $model22 || Cohort:
eststo
melogit AnyDzCalc $model22 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 22") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 23*/
eststo clear
global model23 "$LactCat $season $Mtot $saaugml NLratio $LactXMtot"
melogit AnyDzCalc $model23
eststo
melogit AnyDzCalc $model23 || FarmId:
eststo
melogit AnyDzCalc $model23 || Cohort:
eststo
melogit AnyDzCalc $model23 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 23") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 25*/
eststo clear
global model25 "$LactCat $season $Mtot $saaugml $albumin $Ltot $LactXMtot $saaXalb"
melogit AnyDzCalc $model25
eststo
melogit AnyDzCalc $model25 || FarmId:
eststo
melogit AnyDzCalc $model25 || Cohort:
eststo
melogit AnyDzCalc $model25 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 25") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

********************************************************************************
/*Step 5: Reduce pool of candidate models*/
********************************************************************************
/*Note: in this step, we ran the above models with the random effects included
(where applicable) and exported the AIC values.
Then, we ordered AICs and kept all within 4 AICs of the "best" model*/
eststo clear
melogit AnyDzCalc $model1 || FarmId:
eststo
melogit AnyDzCalc $model2
eststo
melogit AnyDzCalc $model3 || FarmId:
eststo
melogit AnyDzCalc $model4 || FarmId:
eststo
melogit AnyDzCalc $model5
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 1-5")

eststo clear
melogit AnyDzCalc $model7
eststo
melogit AnyDzCalc $model8 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model9
eststo
melogit AnyDzCalc $model10
eststo
melogit AnyDzCalc $model11 || FarmId:
eststo
melogit AnyDzCalc $model12 || FarmId:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 7-12")

eststo clear
melogit AnyDzCalc $model14 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model15 || FarmId:
eststo
melogit AnyDzCalc $model16 || FarmId:
eststo
melogit AnyDzCalc $model17
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 14-17")

eststo clear
melogit AnyDzCalc $model18
eststo
melogit AnyDzCalc $model19 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model20 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model22 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model23 || FarmId:
eststo
melogit AnyDzCalc $model25 || FarmId:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 18-25")

********************************************************************************
/*Step 6: Model Diagnostics*/
********************************************************************************
/* Note: for model diagnostics, recorded discriminant ability (by using ROC) and
tested goodness-of-fit using Hosmer-Lemeshow.
Looked at deviance as well as leverage graphs */
melogit AnyDzCalc $model1 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model1
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LactationNumber season Mtot saaugml albumin NLratio if LstcId==126
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop hat phat id dv

melogit AnyDzCalc $model12 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model12
predict hat, hat
predict phat
scatter hat phat, mlab(LstcId) yline(0)
list LactationNumber season albumin Mtot Ltot saaugml if LstcId==245
list LactationNumber season albumin Mtot Ltot saaugml if LstcId==247
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model14 || FarmId: || Cohort:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model14
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LstcId LactationNumber season albumin Mtot saaugml Ntot if LstcId==233 | LstcId==126
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model25 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model25
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LstcId LactationNumber season albumin Mtot saaugml if LstcId==247 | LstcId==245
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model4 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model4
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LstcId LactationNumber season albumin Mtot saaugml if LstcId==247 | LstcId==245
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

********************************************************************************
/*Step 7: Internal validation */
********************************************************************************
/*Calculating ROC using 500 bootstrapped samples for each candidate model*/
********************************************************************************
/*Note: For each model, I created a program that calculates AUC by hand. Added a
bsample line, so that with each run, a bootstrapped sample is used. I simulated
this calculation 500 times, and averaged the predictive ability to get a
bootstrapped estimate*/

/*Model 1*/
use full, clear
melogit AnyDzCalc $model1 || FarmId:
matrix b=e(b)
matrix list b
/* store coefficients 1 to 38 of e(b) as globals b_1_1 ... b_38_1 */
forvalues i = 1/38 {
    global b_`i'_1 = b[1,`i']
}
program model1, rclass
    version 14.2
    /* 1.
    calculate pprob */
    use full, clear
    bsample
    gen logodds = $b_38_1 + ($b_1_1 * LactCat2) + ($b_2_1 * LactCat3) + ///
        ($b_3_1 * season2) + ($b_4_1 * season3) + ($b_5_1 * season4) + ///
        ($b_6_1 * MtotCat2) + ($b_7_1 * MtotCat3) + ///
        ($b_8_1 * MtotCat4) + ($b_9_1 * MtotCat5) + ($b_10_1 * saaugml42) + ///
        ($b_11_1 * saaugml43) + ($b_12_1 * saaugml44) + ($b_13_1 * albumin52) + ///
        ($b_14_1 * albumin53) + ($b_15_1 * albumin54) + ($b_16_1 * albumin55) + ///
        ($b_17_1 * NLratio) + ($b_18_1 * LactCat2 * MtotCat2) + ($b_19_1 * LactCat2 * MtotCat3) + ///
        ($b_20_1 * LactCat2 * MtotCat4) + ($b_21_1 * LactCat2 * MtotCat5) + ///
        ($b_22_1 * LactCat3 * MtotCat2) + ($b_23_1 * LactCat3 * MtotCat3) + ///
        ($b_24_1 * LactCat3 * MtotCat4) + ($b_25_1 * LactCat3 * MtotCat5) + ///
        ($b_26_1 * saaugml42 * albumin52) + ($b_27_1 * saaugml42 * albumin53) + ///
        ($b_28_1 * saaugml42 * albumin54) + ($b_29_1 * saaugml42 * albumin55) + ///
        ($b_30_1 * saaugml43 * albumin52) + ($b_31_1 * saaugml43 * albumin53) + ///
        ($b_32_1 * saaugml43 * albumin54) + ($b_33_1 * saaugml43 * albumin55) + ///
        ($b_34_1 * saaugml44 * albumin52) + ($b_35_1 * saaugml44 * albumin53) + ///
        ($b_36_1 * saaugml44 * albumin54) + ($b_37_1 * saaugml44 * albumin55)
    /* exp(logodds) is the predicted odds, a monotone transform of the
       predicted probability, so the ranking of pairs is the same */
    gen prob0 = exp(logodds) if AnyDzCalc==0
    gen prob1 = exp(logodds) if AnyDzCalc==1
    keep prob0 prob1
    save all, replace
    /* 2. Cartesian product of diseased vs non diseased*/
    drop prob1
    drop if prob0==.
    tempfile ND
    save ND, replace
    use all, clear
    drop prob0
    drop if prob1==.
    tempfile D
    save D, replace
    use ND, clear
    cross using D
    /* 3. Label concordant and discordant pairs */
    gen concordant=0
    replace concordant=1 if prob1>prob0
    gen discordant=0
    replace discordant=1 if prob1<prob0
    gen tied=0
    replace tied=1 if prob1==prob0
    save pairs, replace
    /* 4.
    Calculate AUC */
    use pairs, clear
    egen concordantN = total(concordant==1)
    egen tiedN = total(tied==1)
    gen concordantPCT = concordantN/17112
    gen tiedPCT = tiedN/17112
    return scalar model1 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model1), reps(500) seed(12345): model1
egen AUC1 = mean(AUC)
/*AUC1 .8075638 */

/*Model 12*/
use full, clear
melogit AnyDzCalc $model12 || FarmId:
matrix b=e(b)
matrix list b
/* store coefficients 1 to 40 of e(b) as globals b_1_12 ... b_40_12 */
forvalues i = 1/40 {
    global b_`i'_12 = b[1,`i']
}
program model12, rclass
    version 14.2
    /* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_40_12 + ($b_1_12 * LactCat2) + ($b_2_12 * LactCat3) + ///
    ($b_3_12 * season2) + ($b_4_12 * season3) + ($b_5_12 * season4) + ///
    ($b_6_12 * MtotCat2) + ($b_7_12 * MtotCat3) + ///
    ($b_8_12 * MtotCat4) + ($b_9_12 * MtotCat5) + ($b_10_12 * saaugml42) + ///
    ($b_11_12 * saaugml43) + ($b_12_12 * saaugml44) + ($b_13_12 * albumin52) + ///
    ($b_14_12 * albumin53) + ($b_15_12 * albumin54) + ($b_16_12 * albumin55) + ///
    ($b_17_12 * Ntot) + ($b_18_12 * LactCat2 * MtotCat2) + ///
    ($b_19_12 * LactCat2 * MtotCat3) + ($b_20_12 * LactCat2 * MtotCat4) + ///
    ($b_21_12 * LactCat2 * MtotCat5) + ///
    ($b_22_12 * LactCat3 * MtotCat2) + ($b_23_12 * LactCat3 * MtotCat3) + ///
    ($b_24_12 * LactCat3 * MtotCat4) + ($b_25_12 * LactCat3 * MtotCat5) + ///
    ($b_26_12 * LactCat2 * Ntot) + ($b_27_12 * LactCat3 * Ntot) + ///
    ($b_28_12 * saaugml42 * albumin52) + ($b_29_12 * saaugml42 * albumin53) + ///
    ($b_30_12 * saaugml42 * albumin54) + ($b_31_12 * saaugml42 * albumin55) + ///
    ($b_32_12 * saaugml43 * albumin52) + ($b_33_12 * saaugml43 * albumin53) + ///
    ($b_34_12 * saaugml43 * albumin54) + ($b_35_12 * saaugml43 * albumin55) + ///
    ($b_36_12 * saaugml44 * albumin52) + ($b_37_12 * saaugml44 * albumin53) + ///
    ($b_38_12 * saaugml44 * albumin54) + ($b_39_12 * saaugml44 * albumin55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4.
Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model12 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model12), reps(500) seed(12345): model12
egen AUC1 = mean(AUC) /*AUC1 .8217935 */
/*Model 14*/
use full, clear
melogit AnyDzCalc $model14 || FarmId: || Cohort:
matrix b=e(b)
matrix list b
global b_1_14= b[1,1]
global b_2_14= b[1,2]
global b_3_14= b[1,3]
global b_4_14= b[1,4]
global b_5_14= b[1,5]
global b_6_14= b[1,6]
global b_7_14= b[1,7]
global b_8_14= b[1,8]
global b_9_14= b[1,9]
global b_10_14= b[1,10]
global b_11_14= b[1,11]
global b_12_14= b[1,12]
global b_13_14= b[1,13]
global b_14_14= b[1,14]
global b_15_14= b[1,15]
global b_16_14= b[1,16]
global b_17_14= b[1,17]
global b_18_14= b[1,18]
global b_19_14= b[1,19]
global b_20_14= b[1,20]
global b_21_14= b[1,21]
global b_22_14= b[1,22]
global b_23_14= b[1,23]
global b_24_14= b[1,24]
global b_25_14= b[1,25]
global b_26_14= b[1,26]
global b_27_14= b[1,27]
global b_28_14= b[1,28]
global b_29_14= b[1,29]
global b_30_14= b[1,30]
global b_31_14= b[1,31]
global b_32_14= b[1,32]
global b_33_14= b[1,33]
global b_34_14= b[1,34]
global b_35_14= b[1,35]
program model14, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_35_14 + ($b_1_14 * LactCat2) + ($b_2_14 * LactCat3) + ///
    ($b_3_14 * MtotCat2) + ($b_4_14 * MtotCat3) + ///
    ($b_5_14 * MtotCat4) + ($b_6_14 * MtotCat5) + ($b_7_14 * saaugml42) + ///
    ($b_8_14 * saaugml43) + ($b_9_14 * saaugml44) + ($b_10_14 * albumin52) + ///
    ($b_11_14 * albumin53) + ($b_12_14 * albumin54) + ($b_13_14 * albumin55) + ///
    ($b_14_14 * NLratio) + ($b_15_14 * LactCat2 * MtotCat2) + ///
    ($b_16_14 * LactCat2 * MtotCat3) + ($b_17_14 * LactCat2 * MtotCat4) + ///
    ($b_18_14 * LactCat2 * MtotCat5) + ///
    ($b_19_14 * LactCat3 * MtotCat2) + ($b_20_14 * LactCat3 * MtotCat3) + ///
    ($b_21_14 * LactCat3 * MtotCat4) + ($b_22_14 * LactCat3 * MtotCat5) + ///
    ($b_23_14 * saaugml42 * albumin52) + ($b_24_14 * saaugml42 * albumin53) + ///
    ($b_25_14 * saaugml42 * albumin54) + ($b_26_14 * saaugml42 * albumin55) + ///
    ($b_27_14 * saaugml43 * albumin52) + ($b_28_14 * saaugml43 * albumin53) + ///
    ($b_29_14 * saaugml43 * albumin54) + ($b_30_14 * saaugml43 * albumin55) + ///
    ($b_31_14 * saaugml44 * albumin52) + ($b_32_14 * saaugml44 * albumin53) + ///
    ($b_33_14 * saaugml44 * albumin54) + ($b_34_14 * saaugml44 * albumin55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4.
Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model14 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model14), reps(500) seed(12345): model14
egen AUC1 = mean(AUC) /*AUC1 .7977021 */
/*Model 25*/
use full, clear
melogit AnyDzCalc $model25 || FarmId:
matrix b=e(b)
matrix list b
global b_1_25= b[1,1]
global b_2_25= b[1,2]
global b_3_25= b[1,3]
global b_4_25= b[1,4]
global b_5_25= b[1,5]
global b_6_25= b[1,6]
global b_7_25= b[1,7]
global b_8_25= b[1,8]
global b_9_25= b[1,9]
global b_10_25= b[1,10]
global b_11_25= b[1,11]
global b_12_25= b[1,12]
global b_13_25= b[1,13]
global b_14_25= b[1,14]
global b_15_25= b[1,15]
global b_16_25= b[1,16]
global b_17_25= b[1,17]
global b_18_25= b[1,18]
global b_19_25= b[1,19]
global b_20_25= b[1,20]
global b_21_25= b[1,21]
global b_22_25= b[1,22]
global b_23_25= b[1,23]
global b_24_25= b[1,24]
global b_25_25= b[1,25]
global b_26_25= b[1,26]
global b_27_25= b[1,27]
global b_28_25= b[1,28]
global b_29_25= b[1,29]
global b_30_25= b[1,30]
global b_31_25= b[1,31]
global b_32_25= b[1,32]
global b_33_25= b[1,33]
global b_34_25= b[1,34]
global b_35_25= b[1,35]
global b_36_25= b[1,36]
global b_37_25= b[1,37]
global b_38_25= b[1,38]
global b_39_25= b[1,39]
global b_40_25= b[1,40]
global b_41_25= b[1,41]
program model25, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_41_25 + ($b_1_25 * LactCat2) + ($b_2_25 * LactCat3) + ///
    ($b_3_25 * season2) + ($b_4_25 * season3) + ($b_5_25 * season4) + ///
    ($b_6_25 * MtotCat2) + ($b_7_25 * MtotCat3) + ///
    ($b_8_25 * MtotCat4) + ($b_9_25 * MtotCat5) + ($b_10_25 * saaugml42) + ///
    ($b_11_25 * saaugml43) + ($b_12_25 * saaugml44) + ($b_13_25 * albumin52) + ///
    ($b_14_25 * albumin53) + ($b_15_25 * albumin54) + ($b_16_25 * albumin55) + ///
    ($b_17_25 * Ltot52) + ///
    ($b_18_25 * Ltot53) + ($b_19_25 * Ltot54) + ///
    ($b_20_25 * Ltot55) + ///
    ($b_21_25 * LactCat2 * MtotCat2) + ($b_22_25 * LactCat2 * MtotCat3) + ///
    ($b_23_25 * LactCat2 * MtotCat4) + ($b_24_25 * LactCat2 * MtotCat5) + ///
    ($b_25_25 * LactCat3 * MtotCat2) + ($b_26_25 * LactCat3 * MtotCat3) + ///
    ($b_27_25 * LactCat3 * MtotCat4) + ($b_28_25 * LactCat3 * MtotCat5) + ///
    ($b_29_25 * saaugml42 * albumin52) + ($b_30_25 * saaugml42 * albumin53) + ///
    ($b_31_25 * saaugml42 * albumin54) + ($b_32_25 * saaugml42 * albumin55) + ///
    ($b_33_25 * saaugml43 * albumin52) + ($b_34_25 * saaugml43 * albumin53) + ///
    ($b_35_25 * saaugml43 * albumin54) + ($b_36_25 * saaugml43 * albumin55) + ///
    ($b_37_25 * saaugml44 * albumin52) + ($b_38_25 * saaugml44 * albumin53) + ///
    ($b_39_25 * saaugml44 * albumin54) + ($b_40_25 * saaugml44 * albumin55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4.
Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model25 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model25), reps(500) seed(12345): model25
egen AUC1 = mean(AUC) /*AUC1 .8152578 */
/*Model 4*/
use full, clear
melogit AnyDzCalc $model4 || FarmId:
matrix b=e(b)
matrix list b
global b_1_4= b[1,1]
global b_2_4= b[1,2]
global b_3_4= b[1,3]
global b_4_4= b[1,4]
global b_5_4= b[1,5]
global b_6_4= b[1,6]
global b_7_4= b[1,7]
global b_8_4= b[1,8]
global b_9_4= b[1,9]
global b_10_4= b[1,10]
global b_11_4= b[1,11]
global b_12_4= b[1,12]
global b_13_4= b[1,13]
global b_14_4= b[1,14]
global b_15_4= b[1,15]
global b_16_4= b[1,16]
global b_17_4= b[1,17]
global b_18_4= b[1,18]
global b_19_4= b[1,19]
global b_20_4= b[1,20]
global b_21_4= b[1,21]
global b_22_4= b[1,22]
global b_23_4= b[1,23]
global b_24_4= b[1,24]
global b_25_4= b[1,25]
global b_26_4= b[1,26]
global b_27_4= b[1,27]
global b_28_4= b[1,28]
global b_29_4= b[1,29]
global b_30_4= b[1,30]
global b_31_4= b[1,31]
global b_32_4= b[1,32]
global b_33_4= b[1,33]
global b_34_4= b[1,34]
global b_35_4= b[1,35]
global b_36_4= b[1,36]
global b_37_4= b[1,37]
program model4, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_37_4 + ($b_1_4 * LactCat2) + ($b_2_4 * LactCat3) + ///
    ($b_3_4 * season2) + ($b_4_4 * season3) + ($b_5_4 * season4) + ///
    ($b_6_4 * MtotCat2) + ($b_7_4 * MtotCat3) + ///
    ($b_8_4 * MtotCat4) + ($b_9_4 * MtotCat5) + ($b_10_4 * saaugml42) + ///
    ($b_11_4 * saaugml43) + ($b_12_4 * saaugml44) + ($b_13_4 * albumin52) + ///
    ($b_14_4 * albumin53) + ($b_15_4 * albumin54) + ($b_16_4 * albumin55) + ///
    ($b_17_4 * LactCat2 * MtotCat2) + ($b_18_4 * LactCat2 * MtotCat3) + ///
    ($b_19_4 * LactCat2 * MtotCat4) + ($b_20_4 * LactCat2 * MtotCat5) + ///
    ($b_21_4 * LactCat3 * MtotCat2) + ($b_22_4 * LactCat3 * MtotCat3) + ///
    ($b_23_4 * LactCat3 * MtotCat4) + ($b_24_4 * LactCat3 * MtotCat5) + ///
    ($b_25_4 * saaugml42 * albumin52) + ($b_26_4 * saaugml42 * albumin53) + ///
    ($b_27_4 * saaugml42 * albumin54) + ($b_28_4 * saaugml42 * albumin55) + ///
    ($b_29_4 * saaugml43 * albumin52) + ($b_30_4 * saaugml43 * albumin53) + ///
    ($b_31_4 * saaugml43 * albumin54) + ($b_32_4 * saaugml43 * albumin55) + ///
    ($b_33_4 * saaugml44 * albumin52) + ($b_34_4 * saaugml44 * albumin53) + ///
    ($b_35_4 * saaugml44 * albumin54) + ($b_36_4 * saaugml44 * albumin55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4.
Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model4 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model4), reps(500) seed(12345): model4
egen AUC1 = mean(AUC) /*AUC1 .8102025 */
********************************************************************************
/*Step 8: Model Averaging */
********************************************************************************
/*Note: We calculated AIC weights in Excel. First, we average predictions from
the original models. Then, we create bootstrapped average predictions over
500 reps*/
use full, clear
/*Averaging predictions from 5 models*/
/*Model 1*/
melogit AnyDzCalc $model1 || FarmId:
predict phat
rename phat phat1
/*Model 12*/
melogit AnyDzCalc $model12 || FarmId:
predict phat
rename phat phat2
/*Model 14*/
melogit AnyDzCalc $model14 || FarmId: || Cohort:
predict phat
rename phat phat3
/*Model 25*/
melogit AnyDzCalc $model25 || FarmId:
predict phat
rename phat phat4
/*Model 4*/
melogit AnyDzCalc $model4 || FarmId:
predict phat
rename phat phat5
/*Averaged*/
gen avgphat = (0.30 * phat1) + (.28 * phat2) + (.27 * phat3) + (.09 * phat4) + ///
    (0.06 * phat5)
roctab AnyDzCalc avgphat, detail
roctg AnyDzCalc avgphat , optimal smooth lowess co interval(0.01) /*cut 0.351*/
/*NPV and PPV*/
gen positive = 0
replace positive = 1 if avgphat > 0.35
replace positive =. if avgphat==.
tab AnyDzCalc positive, column
/*Averaging predictions from 500 bootstrapped samples*/
use full, clear
program averaged, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds1 = $b_38_1 + ($b_1_1 * LactCat2) + ($b_2_1 * LactCat3) + ///
    ($b_3_1 * season2) + ($b_4_1 * season3) + ($b_5_1 * season4) + ///
    ($b_6_1 * MtotCat2) + ($b_7_1 * MtotCat3) + ///
    ($b_8_1 * MtotCat4) + ($b_9_1 * MtotCat5) + ($b_10_1 * saaugml42) + ///
    ($b_11_1 * saaugml43) + ($b_12_1 * saaugml44) + ($b_13_1 * albumin52) + ///
    ($b_14_1 * albumin53) + ($b_15_1 * albumin54) + ($b_16_1 * albumin55) + ///
    ($b_17_1 * NLratio) + ($b_18_1 * LactCat2 * MtotCat2) + ($b_19_1 * LactCat2 * MtotCat3) + ///
    ($b_20_1 * LactCat2 * MtotCat4) + ($b_21_1 * LactCat2 * MtotCat5) + ///
    ($b_22_1 * LactCat3 * MtotCat2) + ($b_23_1 * LactCat3 * MtotCat3) + ///
    ($b_24_1 * LactCat3 * MtotCat4) + ($b_25_1 * LactCat3 * MtotCat5) + ///
    ($b_26_1 * saaugml42 * albumin52) + ($b_27_1 * saaugml42 * albumin53) + ///
    ($b_28_1 * saaugml42 * albumin54) + ($b_29_1 * saaugml42 * albumin55) + ///
    ($b_30_1 * saaugml43 * albumin52) + ($b_31_1 * saaugml43 * albumin53) + ///
    ($b_32_1 * saaugml43 * albumin54) + ($b_33_1 * saaugml43 * albumin55) + ///
    ($b_34_1 * saaugml44 * albumin52) + ($b_35_1 * saaugml44 * albumin53) + ///
    ($b_36_1 * saaugml44 * albumin54) + ($b_37_1 * saaugml44 * albumin55)
gen logodds12 = $b_40_12 + ($b_1_12 * LactCat2) + ($b_2_12 * LactCat3) + ///
    ($b_3_12 * season2) + ($b_4_12 * season3) + ($b_5_12 * season4) + ///
    ($b_6_12 * MtotCat2) + ($b_7_12 * MtotCat3) + ///
    ($b_8_12 * MtotCat4) + ($b_9_12 * MtotCat5) + ($b_10_12 * saaugml42) + ///
    ($b_11_12 * saaugml43) + ($b_12_12 * saaugml44) + ($b_13_12 * albumin52) + ///
    ($b_14_12 * albumin53) + ($b_15_12 * albumin54) + ($b_16_12 * albumin55) + ///
    ($b_17_12 * Ntot) + ($b_18_12 * LactCat2 * MtotCat2) + ///
    ($b_19_12 * LactCat2 * MtotCat3) + ($b_20_12 * LactCat2 * MtotCat4) + ///
    ($b_21_12 * LactCat2 * MtotCat5) + ///
    ($b_22_12 * LactCat3 * MtotCat2) + ($b_23_12 * LactCat3 * MtotCat3) + ///
    ($b_24_12 * LactCat3 * MtotCat4) + ($b_25_12 * LactCat3 * MtotCat5) + ///
    ($b_26_12 * LactCat2 * ///
    Ntot) + ($b_27_12 * LactCat3 * Ntot) + ///
    ($b_28_12 * saaugml42 * albumin52) + ($b_29_12 * saaugml42 * albumin53) + ///
    ($b_30_12 * saaugml42 * albumin54) + ($b_31_12 * saaugml42 * albumin55) + ///
    ($b_32_12 * saaugml43 * albumin52) + ($b_33_12 * saaugml43 * albumin53) + ///
    ($b_34_12 * saaugml43 * albumin54) + ($b_35_12 * saaugml43 * albumin55) + ///
    ($b_36_12 * saaugml44 * albumin52) + ($b_37_12 * saaugml44 * albumin53) + ///
    ($b_38_12 * saaugml44 * albumin54) + ($b_39_12 * saaugml44 * albumin55)
gen logodds14 = $b_35_14 + ($b_1_14 * LactCat2) + ($b_2_14 * LactCat3) + ///
    ($b_3_14 * MtotCat2) + ($b_4_14 * MtotCat3) + ///
    ($b_5_14 * MtotCat4) + ($b_6_14 * MtotCat5) + ($b_7_14 * saaugml42) + ///
    ($b_8_14 * saaugml43) + ($b_9_14 * saaugml44) + ($b_10_14 * albumin52) + ///
    ($b_11_14 * albumin53) + ($b_12_14 * albumin54) + ($b_13_14 * albumin55) + ///
    ($b_14_14 * NLratio) + ($b_15_14 * LactCat2 * MtotCat2) + ///
    ($b_16_14 * LactCat2 * MtotCat3) + ($b_17_14 * LactCat2 * MtotCat4) + ///
    ($b_18_14 * LactCat2 * MtotCat5) + ///
    ($b_19_14 * LactCat3 * MtotCat2) + ($b_20_14 * LactCat3 * MtotCat3) + ///
    ($b_21_14 * LactCat3 * MtotCat4) + ($b_22_14 * LactCat3 * MtotCat5) + ///
    ($b_23_14 * saaugml42 * albumin52) + ($b_24_14 * saaugml42 * albumin53) + ///
    ($b_25_14 * saaugml42 * albumin54) + ($b_26_14 * saaugml42 * albumin55) + ///
    ($b_27_14 * saaugml43 * albumin52) + ($b_28_14 * saaugml43 * albumin53) + ///
    ($b_29_14 * saaugml43 * albumin54) + ($b_30_14 * saaugml43 * albumin55) + ///
    ($b_31_14 * saaugml44 * albumin52) + ($b_32_14 * saaugml44 * albumin53) + ///
    ($b_33_14 * saaugml44 * albumin54) + ($b_34_14 * saaugml44 * albumin55)
gen logodds25 = $b_41_25 + ($b_1_25 * LactCat2) + ($b_2_25 * LactCat3) + ///
    ($b_3_25 * season2) + ($b_4_25 * season3) + ($b_5_25 * season4) + ///
    ($b_6_25 * MtotCat2) + ($b_7_25 * MtotCat3) + ///
    ($b_8_25 * MtotCat4) + ($b_9_25 * MtotCat5) + ($b_10_25 * saaugml42) + ///
    ($b_11_25 * saaugml43) + ($b_12_25 * saaugml44) + ($b_13_25 ///
    * albumin52) + ///
    ($b_14_25 * albumin53) + ($b_15_25 * albumin54) + ($b_16_25 * albumin55) + ///
    ($b_17_25 * Ltot52) + ///
    ($b_18_25 * Ltot53) + ($b_19_25 * Ltot54) + ///
    ($b_20_25 * Ltot55) + ///
    ($b_21_25 * LactCat2 * MtotCat2) + ($b_22_25 * LactCat2 * MtotCat3) + ///
    ($b_23_25 * LactCat2 * MtotCat4) + ($b_24_25 * LactCat2 * MtotCat5) + ///
    ($b_25_25 * LactCat3 * MtotCat2) + ($b_26_25 * LactCat3 * MtotCat3) + ///
    ($b_27_25 * LactCat3 * MtotCat4) + ($b_28_25 * LactCat3 * MtotCat5) + ///
    ($b_29_25 * saaugml42 * albumin52) + ($b_30_25 * saaugml42 * albumin53) + ///
    ($b_31_25 * saaugml42 * albumin54) + ($b_32_25 * saaugml42 * albumin55) + ///
    ($b_33_25 * saaugml43 * albumin52) + ($b_34_25 * saaugml43 * albumin53) + ///
    ($b_35_25 * saaugml43 * albumin54) + ($b_36_25 * saaugml43 * albumin55) + ///
    ($b_37_25 * saaugml44 * albumin52) + ($b_38_25 * saaugml44 * albumin53) + ///
    ($b_39_25 * saaugml44 * albumin54) + ($b_40_25 * saaugml44 * albumin55)
gen logodds4 = $b_37_4 + ($b_1_4 * LactCat2) + ($b_2_4 * LactCat3) + ///
    ($b_3_4 * season2) + ($b_4_4 * season3) + ($b_5_4 * season4) + ///
    ($b_6_4 * MtotCat2) + ($b_7_4 * MtotCat3) + ///
    ($b_8_4 * MtotCat4) + ($b_9_4 * MtotCat5) + ($b_10_4 * saaugml42) + ///
    ($b_11_4 * saaugml43) + ($b_12_4 * saaugml44) + ($b_13_4 * albumin52) + ///
    ($b_14_4 * albumin53) + ($b_15_4 * albumin54) + ($b_16_4 * albumin55) + ///
    ($b_17_4 * LactCat2 * MtotCat2) + ($b_18_4 * LactCat2 * MtotCat3) + ///
    ($b_19_4 * LactCat2 * MtotCat4) + ($b_20_4 * LactCat2 * MtotCat5) + ///
    ($b_21_4 * LactCat3 * MtotCat2) + ($b_22_4 * LactCat3 * MtotCat3) + ///
    ($b_23_4 * LactCat3 * MtotCat4) + ($b_24_4 * LactCat3 * MtotCat5) + ///
    ($b_25_4 * saaugml42 * albumin52) + ($b_26_4 * saaugml42 * albumin53) + ///
    ($b_27_4 * saaugml42 * albumin54) + ($b_28_4 * saaugml42 * albumin55) + ///
    ($b_29_4 * saaugml43 * albumin52) + ($b_30_4 * saaugml43 * albumin53) + ///
    ($b_31_4 * saaugml43 * albumin54) + ($b_32_4 * saaugml43 * albumin55) + ///
    ($b_33_4 * saaugml44 * ///
    albumin52) + ($b_34_4 * saaugml44 * albumin53) + ///
    ($b_35_4 * saaugml44 * albumin54) + ($b_36_4 * saaugml44 * albumin55)
gen avglogodds = (0.30 * logodds1) + (.28 * logodds12) + (.27 * logodds14) ///
    + (.09 * logodds25) + (0.06 * logodds4)
gen prob0 = 2.718 ^ avglogodds if AnyDzCalc==0
gen prob1 = 2.718 ^ avglogodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar averaged = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(averaged), reps(500) seed(12345): averaged
egen AUCall = mean(AUC)
/* AUCall .8177805 */

M4_AIM1OXIDATIVESTRESS.DO

/*
Title: Oxidative stress model
Author: Lauren Wisnieski
Purpose: build oxidative stress model
Revision history:
- Created December 12, 2017
- Revised June 8, 2018
Steps:
Step 1: Best subsets AIC selection: DONE IN R
Step 2: Test all two-way interactions
Step 4: Test random effects
Step 5: Choose final candidate models
Step 6: Model diagnostics
Step 7: Model averaging
Step 8: Internal validation
Candidate variables:
- Aopugul
- Vitamin A
- ROS
- Alphatocopherol
- Betacarotene
- parity
- season
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", clear
save full, replace
********************************************************************************
/*Step 1: Best subsets AIC selection*/
***************************
*****************************************************
/* Note: We now perform best subsets AIC selection in R. R can handle
categorical variables and is also faster. Selected models that were within
4 AICs of the best model. */
********************************************************************************
/*Step 2: Testing interaction terms for each model */
********************************************************************************
/* Note: Retained an interaction term if it reduced the AIC value of the model */
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global aopugul "aopugul52 aopugul53 aopugul54 aopugul55"
global alphatocopherol "alphatocopherol52 alphatocopherol53 alphatocopherol54 alphatocopherol55"
global vitamina "vitamina42 vitamina43 vitamina44"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXAOP "c.LactCat2#c.aopugul52 c.LactCat2#c.aopugul53 c.LactCat2#c.aopugul54 c.LactCat2#c.aopugul55 c.LactCat3#c.aopugul52 c.LactCat3#c.aopugul53 c.LactCat3#c.aopugul54 c.LactCat3#c.aopugul55"
global LactXalpha "c.LactCat2#c.alphatocopherol52 c.LactCat2#c.alphatocopherol53 c.LactCat2#c.alphatocopherol54 c.LactCat2#c.alphatocopherol55 c.LactCat3#c.alphatocopherol52 c.LactCat3#c.alphatocopherol53 c.LactCat3#c.alphatocopherol54 c.LactCat3#c.alphatocopherol55"
global SeasonXAOP "c.season2#c.aopugul52 c.season2#c.aopugul53 c.season2#c.aopugul54 c.season2#c.aopugul55 c.season3#c.aopugul52 c.season3#c.aopugul53 c.season3#c.aopugul54 c.season3#c.aopugul55 c.season4#c.aopugul52 c.season4#c.aopugul53 c.season4#c.aopugul54 c.season4#c.aopugul55"
global SeasonXalpha "c.season2#c.alphatocopherol52 c.season2#c.alphatocopherol53 c.season2#c.alphatocopherol54 c.season2#c.alphatocopherol55 c.season3#c.alphatocopherol52 c.season3#c.alphatocopherol53 c.season3#c.alphatocopherol54 c.season3#c.alphatocopherol55 c.season4#c.alphatocopherol52 c.season4#c.alphatocopherol53 c.season4#c.alphatocopherol54 c.season4#c.alphatocopherol55"
global AOPXalpha "c.aopugul52#c.alphatocopherol52 c.aopugul52#c.alphatocopherol53 c.aopugul52#c.alphatocopherol54 c.aopugul52#c.alphatocopherol55 c.aopugul53#c.alphatocopherol52 c.aopugul53#c.alphatocopherol53 c.aopugul53#c.alphatocopherol54 c.aopugul53#c.alphatocopherol55 c.aopugul54#c.alphatocopherol52 c.aopugul54#c.alphatocopherol53 c.aopugul54#c.alphatocopherol54 c.aopugul54#c.alphatocopherol55 c.aopugul55#c.alphatocopherol52 c.aopugul55#c.alphatocopherol53 c.aopugul55#c.alphatocopherol54 c.aopugul55#c.alphatocopherol55"
global LactXvitamina "c.LactCat2#c.vitamina42 c.LactCat2#c.vitamina43 c.LactCat2#c.vitamina44 c.LactCat3#c.vitamina42 c.LactCat3#c.vitamina43 c.LactCat3#c.vitamina44"
global SeasonXvitamina "c.season2#c.vitamina42 c.season2#c.vitamina43 c.season2#c.vitamina44 c.season3#c.vitamina42 c.season3#c.vitamina43 c.season3#c.vitamina44 c.season4#c.vitamina42 c.season4#c.vitamina43 c.season4#c.vitamina44"
global AOPXvitamina "c.aopugul52#c.vitamina42 c.aopugul52#c.vitamina43 c.aopugul52#c.vitamina44 c.aopugul53#c.vitamina42 c.aopugul53#c.vitamina43 c.aopugul53#c.vitamina44 c.aopugul54#c.vitamina42 c.aopugul54#c.vitamina43 c.aopugul54#c.vitamina44 c.aopugul55#c.vitamina42 c.aopugul55#c.vitamina43 c.aopugul55#c.vitamina44"
global alphaXvitamina "c.alphatocopherol52#c.vitamina42 c.alphatocopherol52#c.vitamina43 c.alphatocopherol52#c.vitamina44 c.alphatocopherol53#c.vitamina42 c.alphatocopherol53#c.vitamina43 c.alphatocopherol53#c.vitamina44 c.alphatocopherol54#c.vitamina42 c.alphatocopherol54#c.vitamina43 c.alphatocopherol54#c.vitamina44 c.alphatocopherol55#c.vitamina42 c.alphatocopherol55#c.vitamina43 c.alphatocopherol55#c.vitamina44"
/*Model 1*/
eststo clear
global model1 "$LactCat $season $aopugul $alphatocopherol"
logit AnyDzCalc $model1
estat ic
eststo
logit AnyDzCalc $model1 $LactXSeason
estat ic
logit AnyDzCalc $model1 $LactXAOP
estat ic
logit AnyDzCalc $model1 $LactXalpha
estat ic
logit AnyDzCalc $model1 $SeasonXAOP
estat ic
logit AnyDzCalc $model1 $SeasonXalpha
estat ic
logit AnyDzCalc $model1 $AOPXalpha
estat ic
esttab, aic ///
    title("Model 1")
/*Model 2*/
eststo clear
global model2 "$LactCat $season $aopugul"
logit AnyDzCalc $model2
eststo
logit AnyDzCalc $model2 $LactXSeason
estat ic
logit AnyDzCalc $model2 $LactXAOP
estat ic
logit AnyDzCalc $model2 $SeasonXAOP
estat ic
esttab, aic ///
    title("Model 2")
/*Model 3*/
eststo clear
global model3 "$LactCat $season $aopugul $vitamina $alphatocopherol"
logit AnyDzCalc $model3
estat ic
eststo
logit AnyDzCalc $model3 $LactXSeason
estat ic
logit AnyDzCalc $model3 $LactXAOP
estat ic
logit AnyDzCalc $model3 $LactXvitamina
estat ic
logit AnyDzCalc $model3 $LactXalpha
estat ic
logit AnyDzCalc $model3 $SeasonXAOP
estat ic
logit AnyDzCalc $model3 $SeasonXvitamina
estat ic
logit AnyDzCalc $model3 $SeasonXalpha
estat ic
logit AnyDzCalc $model3 $AOPXvitamina
estat ic
logit AnyDzCalc $model3 $AOPXalpha
estat ic
logit AnyDzCalc $model3 $alphaXvitamina
estat ic
esttab, aic ///
    title("Model 3")
/*Model 4*/
eststo clear
global model4 "$LactCat $aopugul $vitamina $alphatocopherol"
logit AnyDzCalc $model4
eststo
logit AnyDzCalc $model4 $LactXAOP
estat ic
logit AnyDzCalc $model4 $LactXvitamina
estat ic
logit AnyDzCalc $model4 $LactXalpha
estat ic
logit AnyDzCalc $model4 $AOPXvitamina
estat ic
logit AnyDzCalc $model4 $AOPXalpha
estat ic
logit AnyDzCalc $model4 $alphaXvitamina
estat ic
esttab, aic ///
    title("Model 4")
/*Model 5*/
eststo clear
global model5 "$LactCat $aopugul $alphatocopherol"
logit AnyDzCalc $model5
eststo
logit AnyDzCalc $model5 $LactXAOP
estat ic
logit AnyDzCalc $model5 $LactXalpha
estat ic
logit AnyDzCalc $model5 $AOPXalpha
estat ic
esttab, aic ///
    title("Model 5")
/*Model 6*/
eststo clear
global model6 "$LactCat $alphatocopherol"
logit AnyDzCalc $model6
estat ic
eststo
logit AnyDzCalc $model6 $LactXalpha
estat ic
esttab, aic ///
    title("Model 6")
/*Model 7*/
eststo clear
global model7 "$LactCat $vitamina $alphatocopherol"
logit AnyDzCalc $model7
eststo
logit AnyDzCalc $model7 $LactXvitamina
estat ic
logit AnyDzCalc $model7 $LactXalpha
estat ic
logit AnyDzCalc $model7 $alphaXvitamina
estat ic
esttab, aic ///
    title("Model 7")
/*Model 8*/
eststo clear
global model8 "$LactCat $season $alphatocopherol"
logit AnyDzCalc $model8
estat ic
eststo
logit AnyDzCalc $model8 $LactXSeason
estat ic
logit AnyDzCalc $model8 $SeasonXalpha
estat ic
esttab, aic ///
    title("Model 8")
/*Model 9*/
eststo clear
global model9 "$LactCat $aopugul"
logit AnyDzCalc $model9
estat ic
eststo
logit AnyDzCalc $model9 $LactXAOP
estat ic
esttab, aic ///
    title("Model 9")
/*Model 10*/
eststo clear
global model10 "$LactCat $season $aopugul $vitamina"
logit AnyDzCalc $model10
eststo
logit AnyDzCalc $model10 $LactXSeason
estat ic
logit AnyDzCalc $model10 $LactXAOP
estat ic
logit AnyDzCalc $model10 $LactXvitamina
estat ic
logit AnyDzCalc $model10 $SeasonXAOP
estat ic
logit AnyDzCalc $model10 $SeasonXvitamina
estat ic
logit AnyDzCalc $model10 $SeasonXAOP
estat ic
logit AnyDzCalc $model10 $SeasonXvitamina
estat ic
logit AnyDzCalc $model10 $AOPXvitamina
estat ic
esttab, aic ///
    title("Model 10")
/*Model 11*/
eststo clear
global model11 "$LactCat $vitamina"
logit AnyDzCalc $model11
estat ic
eststo
logit AnyDzCalc $model11 $LactXvitamina
estat ic
esttab, ///
    aic ///
    title("Model 11")
********************************************************************************
/*Step 4: Test random effects*/
********************************************************************************
/*Note: testing random effects for inclusion in each candidate model using AIC
values. If variance is not estimated, excluded*/
/*Model 1*/
eststo clear
melogit AnyDzCalc $model1
eststo
melogit AnyDzCalc $model1 || FarmId:
eststo
melogit AnyDzCalc $model1 || Cohort:
eststo
melogit AnyDzCalc $model1 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 1") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 2*/
eststo clear
melogit AnyDzCalc $model2
eststo
melogit AnyDzCalc $model2 || FarmId:
eststo
melogit AnyDzCalc $model2 || Cohort:
eststo
melogit AnyDzCalc $model2 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 2") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 3*/
eststo clear
melogit AnyDzCalc $model3
eststo
melogit AnyDzCalc $model3 || FarmId:
eststo
melogit AnyDzCalc $model3 || Cohort:
eststo
melogit AnyDzCalc $model3 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 3") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 4*/
eststo clear
melogit AnyDzCalc $model4
eststo
melogit AnyDzCalc $model4 || FarmId:
eststo
melogit AnyDzCalc $model4 || Cohort:
eststo
melogit AnyDzCalc $model4 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 4") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 5*/
eststo clear
melogit AnyDzCalc $model5
eststo
melogit AnyDzCalc $model5 || FarmId:
eststo
melogit AnyDzCalc $model5 || Cohort:
eststo
melogit AnyDzCalc $model5 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 5") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 6*/
eststo clear
melogit AnyDzCalc $model6
eststo
melogit AnyDzCalc $model6 || FarmId:
eststo
melogit AnyDzCalc $model6 || Cohort:
eststo
melogit AnyDzCalc $model6 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 6") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 7*/
eststo clear
melogit AnyDzCalc $model7
eststo
melogit AnyDzCalc $model7 || FarmId:
eststo
melogit AnyDzCalc $model7 || Cohort:
eststo
melogit AnyDzCalc $model7 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 7") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 8*/
eststo clear
melogit AnyDzCalc $model8
eststo
melogit AnyDzCalc $model8 || FarmId:
eststo
melogit AnyDzCalc $model8 || Cohort:
eststo
melogit AnyDzCalc $model8 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 8") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 9*/
eststo clear
melogit AnyDzCalc $model9
eststo
melogit AnyDzCalc $model9 || FarmId:
eststo
melogit AnyDzCalc $model9 || Cohort:
eststo
melogit AnyDzCalc $model9 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 9") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 10*/
eststo clear
melogit AnyDzCalc $model10
eststo
melogit AnyDzCalc $model10 || FarmId:
eststo
melogit AnyDzCalc $model10 || Cohort:
eststo
melogit AnyDzCalc $model10 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 10") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
/*Model 11*/
eststo clear
melogit AnyDzCalc $model11
eststo
melogit AnyDzCalc $model11 || FarmId:
eststo
melogit AnyDzCalc $model11 || Cohort:
eststo
melogit AnyDzCalc $model11 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 11") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
********************************************************************************
/*Step 5: Reduce pool of candidate models*/
********************************************************************************
/*Note: in this step, we ran the
above models with the random effects included (where applicable) and exported
the AIC values. Then, we ordered AICs and kept all within 4 AICs of the "best"
model*/
eststo clear
melogit AnyDzCalc $model1 || FarmId:
eststo
melogit AnyDzCalc $model2 || FarmId:
eststo
melogit AnyDzCalc $model3 || FarmId:
eststo
melogit AnyDzCalc $model4 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model5 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 1-5")

eststo clear
melogit AnyDzCalc $model6 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model7 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model8 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model9 || FarmId: || Cohort:
eststo
melogit AnyDzCalc $model10 || FarmId:
eststo
melogit AnyDzCalc $model11 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Models 6-11")

********************************************************************************
/*Step 6: Model Diagnostics*/
********************************************************************************
/* Note: for model diagnostics, recorded discriminant ability (by using ROC)
and tested goodness-of-fit using Hosmer-Lemeshow.
Looked at deviance as well as leverage graphs */

melogit AnyDzCalc $model1 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model1
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
list LstcId if hat > 0.4
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop hat phat id dv

melogit AnyDzCalc $model5 || FarmId: || Cohort:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model5
predict hat, hat
predict phat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model4 || FarmId: || Cohort:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model4
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model3 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model3
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model9 || FarmId: || Cohort:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model9
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model6 || FarmId: || Cohort:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model6
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

melogit AnyDzCalc $model2 || FarmId:
predict phat
roctab AnyDzCalc phat
egen dec=cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model2
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

********************************************************************************
/*Step 7: Internal validation */
********************************************************************************
/*Calculating ROC using 500 bootstrapped samples for each candidate model*/
********************************************************************************
/*Note: For each model, I created a program that calculates AUC by hand. Added
a bsample line, so that with each run, a bootstrapped sample is used. I
simulated this calculation 500 times, and averaged the predictive ability to
get a bootstrapped estimate*/

/*Model 1*/
use full, clear
melogit AnyDzCalc $model1 || FarmId:
matrix b=e(b)
matrix list b
global b_1_1= b[1,1]
global b_2_1= b[1,2]
global b_3_1= b[1,3]
global b_4_1= b[1,4]
global b_5_1= b[1,5]
global b_6_1= b[1,6]
global b_7_1= b[1,7]
global b_8_1= b[1,8]
global b_9_1= b[1,9]
global b_10_1= b[1,10]
global b_11_1= b[1,11]
global b_12_1= b[1,12]
global b_13_1= b[1,13]
global b_14_1= b[1,14]
program model1, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_14_1 + ($b_1_1 * LactCat2) + ($b_2_1 * LactCat3) + ///
    ($b_3_1 * season2) + ($b_4_1 * season3) + ($b_5_1 * season4) + ///
    ($b_6_1 * aopugul52) + ($b_7_1 * aopugul53) + ///
    ($b_8_1 * aopugul54) + ($b_9_1 * aopugul55) + ///
    ($b_10_1 * alphatocopherol52) + ($b_11_1 * alphatocopherol53) + ///
    ($b_12_1 * alphatocopherol54) + ($b_13_1 * alphatocopherol55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
/* 17112 = 93 x 184, presumably the diseased x non-diseased pair count in the
full data (N = 277); the pair count of a given bootstrap sample can differ */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model1 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model1), reps(500) seed(12345): model1
egen AUC1 = mean(AUC)
/*AUC1 .7070861 */

/*Model 5*/
use full, clear
melogit AnyDzCalc $model5 || FarmId: || Cohort:
matrix b=e(b)
matrix list b
global b_1_5= b[1,1]
global b_2_5= b[1,2]
global b_3_5= b[1,3]
global b_4_5= b[1,4]
global b_5_5= b[1,5]
global b_6_5= b[1,6]
global b_7_5= b[1,7]
global b_8_5= b[1,8]
global b_9_5= b[1,9]
global b_10_5= b[1,10]
global b_11_5= b[1,11]
program model5, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_11_5 + ($b_1_5 * LactCat2) + ($b_2_5 * LactCat3) + ///
    ($b_3_5 * aopugul52) + ($b_4_5 * aopugul53) + ($b_5_5 * aopugul54) + ///
    ($b_6_5 * aopugul55) + ($b_7_5 * alphatocopherol52) + ///
    ($b_8_5 * alphatocopherol53) + ($b_9_5 * alphatocopherol54) + ///
    ($b_10_5 * alphatocopherol55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model5 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model5), reps(500) seed(12345): model5
egen AUC1 = mean(AUC)
/*AUC1 .6827729 */

/*Model 4*/
use full, clear
melogit AnyDzCalc $model4 || FarmId: || Cohort:
matrix b=e(b)
matrix list b
global b_1_4= b[1,1]
global b_2_4= b[1,2]
global b_3_4= b[1,3]
global b_4_4= b[1,4]
global b_5_4= b[1,5]
global b_6_4= b[1,6]
global b_7_4= b[1,7]
global b_8_4= b[1,8]
global b_9_4= b[1,9]
global b_10_4= b[1,10]
global b_11_4= b[1,11]
global b_12_4= b[1,12]
global b_13_4= b[1,13]
global b_14_4= b[1,14]
program model4, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_14_4 + ($b_1_4 * LactCat2) + ($b_2_4 * LactCat3) + ///
    ($b_3_4 * aopugul52) + ($b_4_4 * aopugul53) + ($b_5_4 * aopugul54) + ///
    ($b_6_4 * aopugul55) + ($b_7_4 * vitamina42) + ///
    ($b_8_4 * vitamina43) + ($b_9_4 * vitamina44) + ///
    ($b_10_4 * alphatocopherol52) + ($b_11_4 * alphatocopherol53) + ///
    ($b_12_4 * alphatocopherol54) + ($b_13_4 * alphatocopherol55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model4 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model4), reps(500) seed(12345): model4
egen AUC1 = mean(AUC)
/* AUC1 .69295930. */

/*Model 3*/
use full, clear
melogit AnyDzCalc $model3 || FarmId:
matrix b=e(b)
matrix list b
global b_1_3= b[1,1]
global b_2_3= b[1,2]
global b_3_3= b[1,3]
global b_4_3= b[1,4]
global b_5_3= b[1,5]
global b_6_3= b[1,6]
global b_7_3= b[1,7]
global b_8_3= b[1,8]
global b_9_3= b[1,9]
global b_10_3= b[1,10]
global b_11_3= b[1,11]
global b_12_3= b[1,12]
global b_13_3= b[1,13]
global b_14_3= b[1,14]
global b_15_3= b[1,15]
global b_16_3= b[1,16]
global b_17_3= b[1,17]
program model3, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_17_3 + ($b_1_3 * LactCat2) + ($b_2_3 * LactCat3) + ///
    ($b_3_3 * season2) + ($b_4_3 * season3) + ($b_5_3 * season4) + ///
    ($b_6_3 * aopugul52) + ($b_7_3 * aopugul53) + ($b_8_3 * aopugul54) + ///
    ($b_9_3 * aopugul55) + ($b_10_3 * vitamina42) + ($b_11_3 * vitamina43) + ///
    ($b_12_3 * vitamina44) + ($b_13_3 * alphatocopherol52) + ///
    ($b_14_3 * alphatocopherol53) + ($b_15_3 * alphatocopherol54) + ///
    ($b_16_3 * alphatocopherol55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model3 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model3), reps(500) seed(12345): model3
egen AUC1 = mean(AUC)
/*AUC1 .7147408 */

/*Model 9*/
use full, clear
melogit AnyDzCalc $model9 || FarmId: || Cohort:
matrix b=e(b)
matrix list b
global b_1_9= b[1,1]
global b_2_9= b[1,2]
global b_3_9= b[1,3]
global b_4_9= b[1,4]
global b_5_9= b[1,5]
global b_6_9= b[1,6]
global b_7_9= b[1,7]
program model9, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_7_9 + ($b_1_9 * LactCat2) + ($b_2_9 * LactCat3) + ///
    ($b_3_9 * aopugul52) + ($b_4_9 * aopugul53) + ($b_5_9 * aopugul54) + ///
    ($b_6_9 * aopugul55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model9 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model9), reps(500) seed(12345): model9
egen AUC1 = mean(AUC)
/*AUC1 .6568046 */

/*Model 6*/
use full, clear
melogit AnyDzCalc $model6 || FarmId: || Cohort:
matrix b=e(b)
matrix list b
global b_1_6= b[1,1]
global b_2_6= b[1,2]
global b_3_6= b[1,3]
global b_4_6= b[1,4]
global b_5_6= b[1,5]
global b_6_6= b[1,6]
global b_7_6= b[1,7]
program model6, rclass
version 14.2
/* 1. calculate pprob */
use full, clear
bsample
gen logodds = $b_7_6 + ($b_1_6 * LactCat2) + ($b_2_6 * LactCat3) + ///
    ($b_3_6 * alphatocopherol52) + ($b_4_6 * alphatocopherol53) + ///
    ($b_5_6 * alphatocopherol54) + ($b_6_6 * alphatocopherol55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3.
Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model6 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model6), reps(500) seed(12345): model6
egen AUC1 = mean(AUC)
/*AUC1 .6563719 */

/*Model 2*/
use full, clear
melogit AnyDzCalc $model2 || FarmId:
matrix b=e(b)
matrix list b
global b_1_2= b[1,1]
global b_2_2= b[1,2]
global b_3_2= b[1,3]
global b_4_2= b[1,4]
global b_5_2= b[1,5]
global b_6_2= b[1,6]
global b_7_2= b[1,7]
global b_8_2= b[1,8]
global b_9_2= b[1,9]
global b_10_2= b[1,10]
program model2, rclass
version 14.2
/* 1. calculate pprob */
use full, clear
bsample
gen logodds = $b_10_2 + ($b_1_2 * LactCat2) + ($b_2_2 * LactCat3) + ///
    ($b_3_2 * season2) + ($b_4_2 * season3) + ($b_5_2 * season4) + ///
    ($b_6_2 * aopugul52) + ($b_7_2 * aopugul53) + ///
    ($b_8_2 * aopugul54) + ($b_9_2 * aopugul55)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4.
Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model2 = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(model2), reps(500) seed(12345): model2
egen AUC1 = mean(AUC)
/*AUC1 .6445203 */

********************************************************************************
/*Step 8: Model Averaging */
********************************************************************************
/*Note: We calculated AIC weights in Excel. First, we average predictions from
the original models. Then, we create bootstrapped average predictions over 500
reps*/
use full, clear

/*Averaging predictions from the 7 candidate models*/
/*Model 1*/
melogit AnyDzCalc $model1 || FarmId:
predict phat
rename phat phat1
/*Model 5*/
melogit AnyDzCalc $model5 || FarmId: || Cohort:
predict phat
rename phat phat2
/*Model 4*/
melogit AnyDzCalc $model4 || FarmId: || Cohort:
predict phat
rename phat phat3
/*Model 3*/
melogit AnyDzCalc $model3 || FarmId:
predict phat
rename phat phat4
/*Model 9*/
melogit AnyDzCalc $model9 || FarmId: || Cohort:
predict phat
rename phat phat5
/*Model 6*/
melogit AnyDzCalc $model6 || FarmId: || Cohort:
predict phat
rename phat phat6
/*Model 2*/
melogit AnyDzCalc $model2 || FarmId:
predict phat
rename phat phat7

/*Averaged*/
gen avgphat = (0.30 * phat1) + (.27 * phat2) + (.12 * phat3) + (.12 * phat4) + ///
    (.07 * phat5) + (0.06 * phat6) + (0.06 * phat7)
roctab AnyDzCalc avgphat, detail
/* 0.7785 */
roctg AnyDzCalc avgphat, optimal smooth lowess co interval(0.01)
/*cut 0.334*/

/*NPV and PPV*/
gen positive = 0
replace positive = 1 if avgphat > 0.334
replace positive = . if avgphat==.
tab AnyDzCalc positive, column

/*Averaging predictions from 500 bootstrapped samples*/
use full, clear
program averaged, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds1 = $b_14_1 + ($b_1_1 * LactCat2) + ($b_2_1 * LactCat3) + ///
    ($b_3_1 * season2) + ($b_4_1 * season3) + ($b_5_1 * season4) + ///
    ($b_6_1 * aopugul52) + ($b_7_1 * aopugul53) + ///
    ($b_8_1 * aopugul54) + ($b_9_1 * aopugul55) + ///
    ($b_10_1 * alphatocopherol52) + ($b_11_1 * alphatocopherol53) + ///
    ($b_12_1 * alphatocopherol54) + ($b_13_1 * alphatocopherol55)
gen logodds5 = $b_11_5 + ($b_1_5 * LactCat2) + ($b_2_5 * LactCat3) + ///
    ($b_3_5 * aopugul52) + ($b_4_5 * aopugul53) + ($b_5_5 * aopugul54) + ///
    ($b_6_5 * aopugul55) + ($b_7_5 * alphatocopherol52) + ///
    ($b_8_5 * alphatocopherol53) + ($b_9_5 * alphatocopherol54) + ///
    ($b_10_5 * alphatocopherol55)
gen logodds4 = $b_14_4 + ($b_1_4 * LactCat2) + ($b_2_4 * LactCat3) + ///
    ($b_3_4 * aopugul52) + ($b_4_4 * aopugul53) + ($b_5_4 * aopugul54) + ///
    ($b_6_4 * aopugul55) + ($b_7_4 * vitamina42) + ///
    ($b_8_4 * vitamina43) + ($b_9_4 * vitamina44) + ///
    ($b_10_4 * alphatocopherol52) + ($b_11_4 * alphatocopherol53) + ///
    ($b_12_4 * alphatocopherol54) + ($b_13_4 * alphatocopherol55)
gen logodds3 = $b_17_3 + ($b_1_3 * LactCat2) + ($b_2_3 * LactCat3) + ///
    ($b_3_3 * season2) + ($b_4_3 * season3) + ($b_5_3 * season4) + ///
    ($b_6_3 * aopugul52) + ($b_7_3 * aopugul53) + ($b_8_3 * aopugul54) + ///
    ($b_9_3 * aopugul55) + ($b_10_3 * vitamina42) + ($b_11_3 * vitamina43) + ///
    ($b_12_3 * vitamina44) + ($b_13_3 * alphatocopherol52) + ///
    ($b_14_3 * alphatocopherol53) + ($b_15_3 * alphatocopherol54) + ///
    ($b_16_3 * alphatocopherol55)
gen logodds9 = $b_7_9 + ($b_1_9 * LactCat2) + ($b_2_9 * LactCat3) + ///
    ($b_3_9 * aopugul52) + ($b_4_9 * aopugul53) + ($b_5_9 * aopugul54) + ///
    ($b_6_9 * aopugul55)
gen logodds6 = $b_7_6 + ($b_1_6 * LactCat2) + ($b_2_6 * LactCat3) + ///
    ($b_3_6 * alphatocopherol52) + ($b_4_6 * alphatocopherol53) + ///
    ($b_5_6 * alphatocopherol54) + ($b_6_6 * alphatocopherol55)
gen logodds2 = $b_10_2 + ($b_1_2 * LactCat2) + ($b_2_2 * LactCat3) + ///
    ($b_3_2 * season2) + ($b_4_2 * season3) + ($b_5_2 * season4) + ///
    ($b_6_2 * aopugul52) + ($b_7_2 * aopugul53) + ///
    ($b_8_2 * aopugul54) + ($b_9_2 * aopugul55)
gen avglogodds = (0.30 * logodds1) + (.27 * logodds5) + (.12 * logodds4) ///
    + (.12 * logodds3) + (0.07 * logodds9) + (0.06 * logodds6) + ///
    (0.06 * logodds2)
gen prob0 = 2.718 ^ avglogodds if AnyDzCalc==0
gen prob1 = 2.718 ^ avglogodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar averaged = concordantPCT + (0.5 * tiedPCT)
end
use full, clear
simulate AUC=r(averaged), reps(500) seed(12345): averaged
egen AUCall = mean(AUC)
/* AUCall .702832 */
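
Note on the AIC weights used in Step 8. The do-file above states that the AIC weights were calculated in Excel and then typed in as fixed multipliers. The following is a minimal sketch of how such weights could be computed in Stata instead. It assumes the standard Akaike-weight formula (relative likelihood exp(-delta/2), normalized to sum to 1), which the source does not confirm was the formula used. The AIC values in the input block are hypothetical placeholders, not results from this study, and the variable names (model, aic, delta, rel_lik, weight) are illustrative only.

```stata
* Sketch: Akaike weights from a set of AIC values (assumed formula).
* The AIC values below are HYPOTHETICAL placeholders, not study results.
clear
input byte model double aic
1 300.0
5 300.2
4 301.8
end

* Delta AIC relative to the best (lowest-AIC) model
summarize aic, meanonly
generate double delta = aic - r(min)

* Relative likelihood of each model, then normalize so the weights sum to 1
generate double rel_lik = exp(-0.5 * delta)
summarize rel_lik, meanonly
generate double weight = rel_lik / r(sum)

list model aic delta weight, noobs
```

Under these assumptions, the listed weights would take the place of the hard-coded multipliers (0.30, .27, .12, ...) in the avgphat and avglogodds lines.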
M4_AIM1METABOLICSTRESS.DO

/*
Title: Combined model
Author: Lauren Wisnieski
Purpose: build combined model (with ox stress, IF, and LM)
Revision history:
- Created December 12, 2017
- revised june 10, 2018
Steps:
Step 1: Best subsets AIC selection: DONE IN R
Step 2: Test all two-way interactions
Step 4: Test random effects
Step 5: Choose final candidate models
Step 6: Model diagnostics
Step 7: Model averaging
Step 8: Internal validation
Candidate variables:
- Lactation number
- Season
- Calcium
- NEFA
- BCS
- BHB
- Mtot
- Saa
- Albumin
- NL ratio
- Ntot
- AOP
- Alphatocopherol
- Vitamin A
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", clear
save full, replace

********************************************************************************
/*Step 1: Best subsets AIC selection*/
********************************************************************************
/* Note: We now perform best subsets AIC selection in R. R can handle
categorical variables and is also faster.
Selected models that were within 4 AICs of best model*/
********************************************************************************
/*Step 2: Testing interaction terms for each model */
********************************************************************************
/* Note: Retained an interaction term if it reduced the AIC value of the model*/
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global aopugul "aopugul52 aopugul53 aopugul54 aopugul55"
global alphatocopherol "alphatocopherol52 alphatocopherol53 alphatocopherol54 alphatocopherol55"
global vitamina "vitamina42 vitamina43 vitamina44"
global bcscat "bcscat2"
global Mtot "MtotCat2 MtotCat3 MtotCat4 MtotCat5"
global saaugml "saaugml42 saaugml43 saaugml44"
global albumin "albumin52 albumin53 albumin54 albumin55"
global Ltot "Ltot52 Ltot53 Ltot54 Ltot55"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXAOP "c.LactCat2#c.aopugul52 c.LactCat2#c.aopugul53 c.LactCat2#c.aopugul54 c.LactCat2#c.aopugul55 c.LactCat3#c.aopugul52 c.LactCat3#c.aopugul53 c.LactCat3#c.aopugul54 c.LactCat3#c.aopugul55"
global LactXalpha "c.LactCat2#c.alphatocopherol52 c.LactCat2#c.alphatocopherol53 c.LactCat2#c.alphatocopherol54 c.LactCat2#c.alphatocopherol55 c.LactCat3#c.alphatocopherol52 c.LactCat3#c.alphatocopherol53 c.LactCat3#c.alphatocopherol54 c.LactCat3#c.alphatocopherol55"
global SeasonXAOP "c.season2#c.aopugul52 c.season2#c.aopugul53 c.season2#c.aopugul54 c.season2#c.aopugul55 c.season3#c.aopugul52 c.season3#c.aopugul53 c.season3#c.aopugul54 c.season3#c.aopugul55 c.season4#c.aopugul52 c.season4#c.aopugul53 c.season4#c.aopugul54 c.season4#c.aopugul55"
global SeasonXalpha "c.season2#c.alphatocopherol52 c.season2#c.alphatocopherol53 c.season2#c.alphatocopherol54 c.season2#c.alphatocopherol55 c.season3#c.alphatocopherol52 c.season3#c.alphatocopherol53 c.season3#c.alphatocopherol54 ///
c.season3#c.alphatocopherol55 c.season4#c.alphatocopherol52 c.season4#c.alphatocopherol53 c.season4#c.alphatocopherol54 c.season4#c.alphatocopherol55"
global AOPXalpha "c.aopugul52#c.alphatocopherol52 c.aopugul52#c.alphatocopherol53 c.aopugul52#c.alphatocopherol54 c.aopugul52#c.alphatocopherol55 c.aopugul53#c.alphatocopherol52 c.aopugul53#c.alphatocopherol53 c.aopugul53#c.alphatocopherol54 c.aopugul53#c.alphatocopherol55 c.aopugul54#c.alphatocopherol52 c.aopugul54#c.alphatocopherol53 c.aopugul54#c.alphatocopherol54 c.aopugul54#c.alphatocopherol55 c.aopugul55#c.alphatocopherol52 c.aopugul55#c.alphatocopherol53 c.aopugul55#c.alphatocopherol54 c.aopugul55#c.alphatocopherol55"
global LactXvitamina "c.LactCat2#c.vitamina42 c.LactCat2#c.vitamina43 c.LactCat2#c.vitamina44 c.LactCat3#c.vitamina42 c.LactCat3#c.vitamina43 c.LactCat3#c.vitamina44"
global SeasonXvitamina "c.season2#c.vitamina42 c.season2#c.vitamina43 c.season2#c.vitamina44 c.season3#c.vitamina42 c.season3#c.vitamina43 c.season3#c.vitamina44 c.season4#c.vitamina42 c.season4#c.vitamina43 c.season4#c.vitamina44"
global AOPXvitamina "c.aopugul52#c.vitamina42 c.aopugul52#c.vitamina43 c.aopugul52#c.vitamina44 c.aopugul53#c.vitamina42 c.aopugul53#c.vitamina43 c.aopugul53#c.vitamina44 c.aopugul54#c.vitamina42 c.aopugul54#c.vitamina43 c.aopugul54#c.vitamina44 c.aopugul55#c.vitamina42 c.aopugul55#c.vitamina43 c.aopugul55#c.vitamina44"
global alphaXvitamina "c.alphatocopherol52#c.vitamina42 c.alphatocopherol52#c.vitamina43 c.alphatocopherol52#c.vitamina44 c.alphatocopherol53#c.vitamina42 c.alphatocopherol53#c.vitamina43 c.alphatocopherol53#c.vitamina44 c.alphatocopherol54#c.vitamina42 c.alphatocopherol54#c.vitamina43 c.alphatocopherol54#c.vitamina44 c.alphatocopherol55#c.vitamina42 c.alphatocopherol55#c.vitamina43 c.alphatocopherol55#c.vitamina44"
global SeasonXNefa "c.season2#c.lognefasq c.season3#c.lognefasq c.season4#c.lognefasq"
global SeasonXCalc "c.season2#c.calcium c.season3#c.calcium c.season4#c.calcium"
global SeasonXBCS "c.season2#c.bcscat2 c.season3#c.bcscat2 c.season4#c.bcscat2"
global LactXBCS "c.LactCat2#c.bcscat2 c.LactCat3#c.bcscat2"
global LactXCalc "c.LactCat2#c.calcium c.LactCat3#c.calcium"
global LactXNefa "c.LactCat2#c.lognefasq c.LactCat3#c.lognefasq"
global SeasonXbhb "c.season2#c.bhb c.season3#c.bhb c.season4#c.bhb"
global LactXbhb "c.LactCat2#c.bhb c.LactCat3#c.bhb"
global LactXMtot "c.LactCat2#c.MtotCat2 c.LactCat2#c.MtotCat3 c.LactCat2#c.MtotCat4 c.LactCat2#c.MtotCat5 c.LactCat3#c.MtotCat2 c.LactCat3#c.MtotCat3 c.LactCat3#c.MtotCat4 c.LactCat3#c.MtotCat5"
global LactXsaa "c.LactCat2#c.saaugml42 c.LactCat2#c.saaugml43 c.LactCat2#c.saaugml44 c.LactCat3#c.saaugml42 c.LactCat3#c.saaugml43 c.LactCat3#c.saaugml44"
global LactXalb "c.LactCat2#c.albumin52 c.LactCat2#c.albumin53 c.LactCat2#c.albumin54 c.LactCat2#c.albumin55 c.LactCat3#c.albumin52 c.LactCat3#c.albumin53 c.LactCat3#c.albumin54 c.LactCat3#c.albumin55"
global LactXNL "c.LactCat2#c.NLratio c.LactCat3#c.NLratio"
global SeasonXMtot "c.season2#c.MtotCat2 c.season2#c.MtotCat3 c.season2#c.MtotCat4 c.season2#c.MtotCat5 c.season3#c.MtotCat2 c.season3#c.MtotCat3 c.season3#c.MtotCat4 c.season3#c.MtotCat5 c.season4#c.MtotCat2 c.season4#c.MtotCat3 c.season4#c.MtotCat4 ///
c.season4#c.MtotCat5"
global SeasonXsaa "c.season2#c.saaugml42 c.season2#c.saaugml43 c.season2#c.saaugml44 c.season3#c.saaugml42 c.season3#c.saaugml43 c.season3#c.saaugml44 c.season4#c.saaugml42 c.season4#c.saaugml43 c.season4#c.saaugml44"
global SeasonXalb "c.season2#c.albumin52 c.season2#c.albumin53 c.season2#c.albumin54 c.season2#c.albumin55 c.season3#c.albumin52 c.season3#c.albumin53 c.season3#c.albumin54 c.season3#c.albumin55 c.season4#c.albumin52 c.season4#c.albumin53 c.season4#c.albumin54 c.season4#c.albumin55"
global SeasonXNL "c.season2#c.NLratio c.season3#c.NLratio c.season4#c.NLratio"
global MtotXsaa "c.MtotCat2#c.saaugml42 c.MtotCat2#c.saaugml43 c.MtotCat2#c.saaugml44 c.MtotCat3#c.saaugml42 c.MtotCat3#c.saaugml43 c.MtotCat3#c.saaugml44 c.MtotCat4#c.saaugml42 c.MtotCat4#c.saaugml43 c.MtotCat4#c.saaugml44 c.MtotCat5#c.saaugml42 c.MtotCat5#c.saaugml43 c.MtotCat5#c.saaugml44"
global MtotXalb "c.MtotCat2#c.albumin52 c.MtotCat2#c.albumin53 c.MtotCat2#c.albumin54 c.MtotCat2#c.albumin55 c.MtotCat3#c.albumin52 c.MtotCat3#c.albumin53 c.MtotCat3#c.albumin54 c.MtotCat3#c.albumin55 c.MtotCat4#c.albumin52 c.MtotCat4#c.albumin53 c.MtotCat4#c.albumin54 c.MtotCat4#c.albumin55 c.MtotCat5#c.albumin52 c.MtotCat5#c.albumin53 c.MtotCat5#c.albumin54 c.MtotCat5#c.albumin55"
global MtotXNL "c.MtotCat2#c.NLratio c.MtotCat3#c.NLratio c.MtotCat4#c.NLratio c.MtotCat5#c.NLratio"
global saaXalb "c.saaugml42#c.albumin52 c.saaugml42#c.albumin53 c.saaugml42#c.albumin54 c.saaugml42#c.albumin55 c.saaugml43#c.albumin52 c.saaugml43#c.albumin53 c.saaugml43#c.albumin54 c.saaugml43#c.albumin55 c.saaugml44#c.albumin52 c.saaugml44#c.albumin53 c.saaugml44#c.albumin54 c.saaugml44#c.albumin55"
global saaXNL "c.saaugml42#c.NLratio c.saaugml43#c.NLratio c.saaugml44#c.NLratio"
global albXNL "c.albumin52#c.NLratio c.albumin53#c.NLratio c.albumin54#c.NLratio c.albumin55#c.NLratio"
global LactXNtot "c.LactCat2#c.Ntot c.LactCat3#c.Ntot"
global SeasonXNtot "c.season2#c.Ntot c.season3#c.Ntot c.season4#c.Ntot"
global MtotXNtot "c.MtotCat2#c.Ntot c.MtotCat3#c.Ntot c.MtotCat4#c.Ntot c.MtotCat5#c.Ntot"
global saaXNtot "c.saaugml42#c.Ntot c.saaugml43#c.Ntot c.saaugml44#c.Ntot"
global albXNtot "c.albumin52#c.Ntot c.albumin53#c.Ntot c.albumin54#c.Ntot c.albumin55#c.Ntot"
global LactXLtot "c.LactCat2#c.Ltot52 c.LactCat2#c.Ltot53 c.LactCat2#c.Ltot54 c.LactCat2#c.Ltot55 c.LactCat3#c.Ltot52 c.LactCat3#c.Ltot53 c.LactCat3#c.Ltot54 c.LactCat3#c.Ltot55"
global SeasonXLtot "c.season2#c.Ltot52 c.season2#c.Ltot53 c.season2#c.Ltot54 c.season2#c.Ltot55 c.season3#c.Ltot52 c.season3#c.Ltot53 c.season3#c.Ltot54 c.season3#c.Ltot55 c.season4#c.Ltot52 c.season4#c.Ltot53 c.season4#c.Ltot54 c.season4#c.Ltot55"
global MtotXLtot "c.MtotCat2#c.Ltot52 c.MtotCat2#c.Ltot53 c.MtotCat2#c.Ltot54 c.MtotCat2#c.Ltot55 c.MtotCat3#c.Ltot52 c.MtotCat3#c.Ltot53 c.MtotCat3#c.Ltot54 c.MtotCat3#c.Ltot55 c.MtotCat4#c.Ltot52 c.MtotCat4#c.Ltot53 c.MtotCat4#c.Ltot54 c.MtotCat4#c.Ltot55 c.MtotCat5#c.Ltot52 c.MtotCat5#c.Ltot53 c.MtotCat5#c.Ltot54 c.MtotCat5#c.Ltot55"
global saaXLtot "c.saaugml42#c.Ltot52 c.saaugml42#c.Ltot53 c.saaugml42#c.Ltot54 c.saaugml42#c.Ltot55 c.saaugml43#c.Ltot52 c.saaugml43#c.Ltot53 c.saaugml43#c.Ltot54 c.saaugml43#c.Ltot55 c.saaugml44#c.Ltot52 c.saaugml44#c.Ltot53 c.saaugml44#c.Ltot54 c.saaugml44#c.Ltot55"
global albXLtot "c.albumin52#c.Ltot52 c.albumin52#c.Ltot53 c.albumin52#c.Ltot54 c.albumin52#c.Ltot55 c.albumin53#c.Ltot52 c.albumin53#c.Ltot53 c.albumin53#c.Ltot54 c.albumin53#c.Ltot55 c.albumin54#c.Ltot52 c.albumin54#c.Ltot53 c.albumin54#c.Ltot54 c.albumin54#c.Ltot55 c.albumin55#c.Ltot52 c.albumin55#c.Ltot53 c.albumin55#c.Ltot54 c.albumin55#c.Ltot55"
global MtotXAOP "c.MtotCat2#c.aopugul52 c.MtotCat2#c.aopugul53 c.MtotCat2#c.aopugul54 c.MtotCat2#c.aopugul55 c.MtotCat3#c.aopugul52 c.MtotCat3#c.aopugul53 c.MtotCat3#c.aopugul54 c.MtotCat3#c.aopugul55 c.MtotCat4#c.aopugul52 c.MtotCat4#c.aopugul53 ///
c.MtotCat4#c.aopugul54 c.MtotCat4#c.aopugul55 c.MtotCat5#c.aopugul52 c.MtotCat5#c.aopugul53 c.MtotCat5#c.aopugul54 c.MtotCat5#c.aopugul55"
global MtotXalpha "c.MtotCat2#c.alphatocopherol52 c.MtotCat2#c.alphatocopherol53 c.MtotCat2#c.alphatocopherol54 c.MtotCat2#c.alphatocopherol55 c.MtotCat3#c.alphatocopherol52 c.MtotCat3#c.alphatocopherol53 c.MtotCat3#c.alphatocopherol54 c.MtotCat3#c.alphatocopherol55 c.MtotCat4#c.alphatocopherol52 c.MtotCat4#c.alphatocopherol53 c.MtotCat4#c.alphatocopherol54 c.MtotCat4#c.alphatocopherol55 c.MtotCat5#c.alphatocopherol52 c.MtotCat5#c.alphatocopherol53 c.MtotCat5#c.alphatocopherol54 c.MtotCat5#c.alphatocopherol55"
global MtotXvitamina "c.MtotCat2#c.vitamina42 c.MtotCat2#c.vitamina43 c.MtotCat2#c.vitamina44 c.MtotCat3#c.vitamina42 c.MtotCat3#c.vitamina43 c.MtotCat3#c.vitamina44 c.MtotCat4#c.vitamina42 c.MtotCat4#c.vitamina43 c.MtotCat4#c.vitamina44 c.MtotCat5#c.vitamina42 c.MtotCat5#c.vitamina43 c.MtotCat5#c.vitamina44"
global MtotXCalcium "c.MtotCat2#c.calcium c.MtotCat3#c.calcium c.MtotCat4#c.calcium c.MtotCat5#c.calcium"
global MtotXNefa "c.MtotCat2#c.lognefasq c.MtotCat3#c.lognefasq c.MtotCat4#c.lognefasq c.MtotCat5#c.lognefasq"
global MtotXNL "c.MtotCat2#c.NLratio c.MtotCat3#c.NLratio c.MtotCat4#c.NLratio c.MtotCat5#c.NLratio"
global MtotXBCS "c.MtotCat2#c.bcscat2 c.MtotCat3#c.bcscat2 c.MtotCat4#c.bcscat2 c.MtotCat5#c.bcscat2"
global saaXAOP "c.saaugml42#c.aopugul52 c.saaugml42#c.aopugul53 c.saaugml42#c.aopugul54 c.saaugml42#c.aopugul55 c.saaugml43#c.aopugul52 c.saaugml43#c.aopugul53 c.saaugml43#c.aopugul54 c.saaugml43#c.aopugul55 c.saaugml44#c.aopugul52 c.saaugml44#c.aopugul53 c.saaugml44#c.aopugul54 c.saaugml44#c.aopugul55"
global saaXalpha "c.saaugml42#c.alphatocopherol52 c.saaugml42#c.alphatocopherol53 c.saaugml42#c.alphatocopherol54 c.saaugml42#c.alphatocopherol55 c.saaugml43#c.alphatocopherol52 c.saaugml43#c.alphatocopherol53 c.saaugml43#c.alphatocopherol54 c.saaugml43#c.alphatocopherol55 ///
c.saaugml44#c.alphatocopherol52 c.saaugml44#c.alphatocopherol53 c.saaugml44#c.alphatocopherol54 c.saaugml44#c.alphatocopherol55" global saaXvitamina "c.saaugml42#c.vitamina42 c.saaugml42#c.vita mina43 c.saaugml42#c. vitamina44 c.saaugml43#c.vit amina42 c.saaugml43#c.vitamina43 c.saaugml43#c.vitamina44 c.saaugml44#c.vitamina42 c.saaugml44#c.vitamina43 c.saaugml44#c.vitamina44" global saaXCalcium "c.saaugml42#c.calcium c.saaugml43#c.calcium c.saaug ml44#c.calcium" glo bal saaXNefa "c.saaugml42#c. lognefasq c.saaugml43#c.lognefasq c.saaugml44#c.lognefasq" global saaXNL "c.saaugml42#c.NLratio c.saaugml43#c.NLratio c.saaugml44#c.NLratio" global saaXBCS "c.saaugml42#c.bcscat2 c.saaugml43#c.bcscat2 c. saaugml44#c.bcscat2" global albXAOP "c.albumin5 2#c.aopugul52 c.albumin52#c.aopugul53 c.albumin52#c.aopugul54 c.albumin52#c.aopugul55 c.albumin53#c.aopugul52 c.albumin53#c.aopugul53 c.albumin53#c.aopugul54 c.albumin53#c.aopugul55 c.albumin54#c.aopugul52 c .albumin54#c.aopugul5 3 c.albumin54#c.aopugul54 c. 
albumin54#c.aopugul55 c.albumin55#c.aopugul52 c.albumin55#c.aopugul53 c.albumin55#c.aopugul54 c.albumin55#c.aopugul55" global albXalpha "c.albumin52#c.alphatocopherol52 c.albumin52#c.alphatocopherol53 c.al bumin52#c.alphatocoph erol54 c.albumin52#c.alphato copherol55 c.albumin53#c.alphatocopherol52 c.albumin53#c.alphatocopherol53 c.albumin53#c.alphatocopherol54 c.albumin53#c.alphatocopherol55 c.albumin54#c.alphatocopherol52 c.albumin54#c.alphatocopherol53 c.al bumin54#c.alphatocoph erol54 c.albumin54#c.alphato copherol55 c.albumin55#c.alphatocopherol52 c.albumin55#c.alphatocopherol53 c.albumin55#c.alphatocopherol54 c.albumin55#c.alphatocopherol55" global albXvitamina "c.albumin52#c.vitamina42 c.albumin52#c.vitami na43 c.albumin52#c.vi tamina44 c.albumin53#c.vitam ina42 c.albumin53#c.vitamina43 c.albumin53#c.vitamina44 c.albumin54#c.vitamina42 c.albumin54#c.vitamina43 c.albumin54#c.vitamina44 c.albumin55#c.vitamina42 c.albumin55#c.vitamina43 c.albumin55#c.vitamina44" global albXCalcium " c.albumin52#c.calcium c.albu min53#c.calcium c.albumin54#c.calcium c.albumin55#c.calcium" global albXNefa "c.albumin52#c.lognefasq c.albumin53#c.lognefasq c.albumin54#c.lognefasq c.albumin55#c.lognefasq" global albXNL "c.albumin52#c .NLratio c.albumin53#c.NLratio c.albumin54#c.NLra tio c.albumin55#c.NLratio" global albXBCS "c.albumin52#c.bcscat2 c.albumin53#c.bcscat2 c.albumin54#c.bcscat2 c.albumin55#c.bcscat2" global AOPXCalcium "c.aopugul52#c.calcium c .aopugul53#c.calcium c.aopug ul54#c.calcium c.aopugul55#c.calcium" global AO PXNefa "c.aopugul52#c.lognefasq c.aopugul53#c.lognefasq c.aopugul54#c.lognefasq c.aopugul55#c.lognefasq" global AOPXNL "c.aopugul52#c.NLratio c.aopugul53#c.NLratio c.aopugul54#c .NLratio c.aopugul55#c.NLrat io" global AOPXBCS "c.aopugul52#c.bcscat2 c.aop ugul53#c.bcscat2 c.aopugul54#c.bcscat2 c.aopugul55#c.bcscat2" global alphaXCalcium "c.alphatocopherol52#c.calcium c.alphatocopherol53#c.calcium c.alphatocopherol54#c.calcium c.a lphatocopherol55#c.calcium" 
global alphaXNefa "c.alphatocopherol52#c.lognefasq c.alphatocopherol53#c.lognefasq c.alphatocopherol54#c.lognefasq c.alphatocopherol55#c.lognefasq"
global alphaXNL "c.alphatocopherol52#c.NLratio c.alphatocopherol53#c.NLratio c.alphatocopherol54#c.NLratio c.alphatocopherol55#c.NLratio"
global alphaXBCS "c.alphatocopherol52#c.bcscat2 c.alphatocopherol53#c.bcscat2 c.alphatocopherol54#c.bcscat2 c.alphatocopherol55#c.bcscat2"
global vitaminaXCalcium "c.vitamina42#c.calcium c.vitamina43#c.calcium c.vitamina44#c.calcium"
global vitaminaXNefa "c.vitamina42#c.lognefasq c.vitamina43#c.lognefasq c.vitamina44#c.lognefasq"
global vitaminaXNL "c.vitamina42#c.NLratio c.vitamina43#c.NLratio c.vitamina44#c.NLratio"
global vitaminaXBCS "c.vitamina42#c.bcscat2 c.vitamina43#c.bcscat2 c.vitamina44#c.bcscat2"
global AOPXLtot "c.aopugul52#c.Ltot52 c.aopugul52#c.Ltot53 c.aopugul52#c.Ltot54 c.aopugul52#c.Ltot55 c.aopugul53#c.Ltot52 c.aopugul53#c.Ltot53 c.aopugul53#c.Ltot54 c.aopugul53#c.Ltot55 c.aopugul54#c.Ltot52 c.aopugul54#c.Ltot53 c.aopugul54#c.Ltot54 c.aopugul54#c.Ltot55 c.aopugul55#c.Ltot52 c.aopugul55#c.Ltot53 c.aopugul55#c.Ltot54 c.aopugul55#c.Ltot55"
global alphaXLtot "c.alphatocopherol52#c.Ltot52 c.alphatocopherol52#c.Ltot53 c.alphatocopherol52#c.Ltot54 c.alphatocopherol52#c.Ltot55 c.alphatocopherol53#c.Ltot52 c.alphatocopherol53#c.Ltot53 c.alphatocopherol53#c.Ltot54 c.alphatocopherol53#c.Ltot55 c.alphatocopherol54#c.Ltot52 c.alphatocopherol54#c.Ltot53 c.alphatocopherol54#c.Ltot54 c.alphatocopherol54#c.Ltot55 c.alphatocopherol55#c.Ltot52 c.alphatocopherol55#c.Ltot53 c.alphatocopherol55#c.Ltot54 c.alphatocopherol55#c.Ltot55"
global vitaminaXLtot "c.vitamina42#c.Ltot52 c.vitamina42#c.Ltot53 c.vitamina42#c.Ltot54 c.vitamina42#c.Ltot55 c.vitamina43#c.Ltot52 c.vitamina43#c.Ltot53 c.vitamina43#c.Ltot54 c.vitamina43#c.Ltot55 c.vitamina44#c.Ltot52 c.vitamina44#c.Ltot53 c.vitamina44#c.Ltot54 c.vitamina44#c.Ltot55"
global calciumXLtot "c.calcium#c.Ltot52 c.calcium#c.Ltot53 c.calcium#c.Ltot54 c.calcium#c.Ltot55"
global NefaXLtot "c.lognefasq#c.Ltot52 c.lognefasq#c.Ltot53 c.lognefasq#c.Ltot54 c.lognefasq#c.Ltot55"
global BCSXLtot "c.bcscat2#c.Ltot52 c.bcscat2#c.Ltot53 c.bcscat2#c.Ltot54 c.bcscat2#c.Ltot55"
global MtotXbhb "c.MtotCat2#c.bhb c.MtotCat3#c.bhb c.MtotCat4#c.bhb c.MtotCat5#c.bhb"
global saaXbhb "c.saaugml42#c.bhb c.saaugml43#c.bhb c.saaugml44#c.bhb"
global albXbhb "c.albumin52#c.bhb c.albumin53#c.bhb c.albumin54#c.bhb c.albumin55#c.bhb"
global AOPXbhb "c.aopugul52#c.bhb c.aopugul53#c.bhb c.aopugul54#c.bhb c.aopugul55#c.bhb"
global alphaXbhb "c.alphatocopherol52#c.bhb c.alphatocopherol53#c.bhb c.alphatocopherol54#c.bhb c.alphatocopherol55#c.bhb"
global vitaminaXbhb "c.vitamina42#c.bhb c.vitamina43#c.bhb c.vitamina44#c.bhb"

/*Model 1*/
eststo clear
global model1 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model1 $LactXSeason
estat ic
logit AnyDzCalc $model1 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model1 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model1 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model1 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model1 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model1 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model1 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 1")

/*Model 2*/
eststo clear
global model2 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat"
logit AnyDzCalc $model2 $LactXSeason
estat ic
logit AnyDzCalc $model2 $LactXMtot
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $AOPXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $alphaXvitamina
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $alphaXCalcium
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $alphaXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $alphaXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $alphaXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $vitaminaXCalcium
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $vitaminaXNefa
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $vitaminaXNL
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL $vitaminaXBCS
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model2 $LactXMtot $AOPXNL c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model2 $LactXMtot $AOPXNL c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 2")

/*Model 3*/
eststo clear
global model3 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model3
estat ic
logit AnyDzCalc $model3 $LactXSeason
estat ic
logit AnyDzCalc $model3 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model3 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model3 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model3 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model3 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model3 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model3 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model3 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model3 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model3 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model3 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model3 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS $alphaXNefa
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model3 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 3")

/*Model 4*/
eststo clear
global model4 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model4 $LactXSeason
estat ic
logit AnyDzCalc $model4 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model4 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model4 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model4 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model4 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model4 $LactXMtot $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio
estat ic
eststo
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model4 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 4")

/*Model 5*/
eststo clear
global model5 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model5 $LactXSeason
estat ic
logit AnyDzCalc $model5 $LactXMtot
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXalpha
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXalpha
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXalpha
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL
estat ic
eststo
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $AOPXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $alphaXvitamina
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $alphaXCalcium
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $alphaXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $alphaXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $alphaXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $vitaminaXCalcium
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $vitaminaXNefa
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $vitaminaXNL
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL $vitaminaXBCS
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model5 $LactXMtot $saaXalb $AOPXNL c.NLratio#c.bcscat2
estat ic
esttab, aic /// title("Mo del 5") /*Model 6*/ eststo clear global model6 "$LactCat $season $Mtot $albu min $aopugul $vitamina calcium lognefasq NLratio bcscat2" logit AnyDzCalc $model6 estat ic logit AnyDzCalc $model6 $LactXSeason estat ic logit AnyDzCalc $model6 $LactXMtot estat ic eststo logit AnyDzCalc $model6 $LactXMtot $LactXalb estat ic logit AnyDzCalc $model6 $LactXMtot $ LactXAOP 274 estat ic logit AnyDzCalc $model6 $LactXMtot $LactXvitamina estat ic logit AnyDzCalc $ model6 $LactXMtot $LactXCalc estat ic logit AnyDzCalc $model6 $LactXMtot $LactXNefa estat ic logit AnyDzCalc $model6 $LactXMtot $Lac tXNL estat ic logit AnyDzCalc $model6 $LactXMtot $LactXBCS estat ic logit AnyDzCalc $model6 $ LactXMtot $SeasonXMtot estat ic logit AnyDzCalc $model6 $LactXMtot $SeasonXalb estat ic logit AnyDzCalc $model6 $LactXMtot $SeasonXAOP estat ic logit AnyDzCalc $model6 $LactXMtot $SeasonXvitamina estat ic logit AnyDzCalc $model6 $LactXMtot $SeasonXCalc estat ic logit AnyDzCalc $model6 $LactXMtot $SeasonXNefa estat i c logit AnyDzCalc $model6 $LactXMtot $Seaso nXNL estat ic logit AnyDzCalc $model6 $LactXMtot $SeasonXBCS estat ic logit A nyDzCalc $model6 $LactXMtot $MtotXalb estat ic logit AnyDzCalc $model6 $LactXMtot $MtotXAOP estat ic logit AnyDzCalc $model6 $LactXMtot $MtotXvitamin a estat ic logit AnyDzCalc $model6 $LactXMtot $MtotXCalcium estat ic logit AnyD zCalc $model6 $LactXMtot $MtotXNefa estat ic logit AnyDzCalc $model6 $LactXMtot $MtotXNL estat ic lo git AnyDzCalc $model6 $LactXMtot $MtotXBCS esta t ic logit AnyDzCalc $model6 $LactXMtot $albXAOP estat ic logit AnyDzCalc $model6 $LactXMtot $albXvitamina estat ic logit AnyDzCalc $model6 $LactXMtot $albXCalcium estat ic logit AnyD zCalc $model6 $LactXMtot $albXNefa estat ic logit AnyDzCalc $model6 $LactXMtot $albXNL estat ic logit AnyDzCalc $model6 $LactXMtot $albXBCS estat ic logit AnyDzCalc $model6 $LactXMtot $AOPXvitamina estat ic logit AnyDzCalc $model6 $La ctXMtot $AOPXCalcium estat ic logit AnyDzC alc 
$model6 $LactXMtot $AOPXNefa estat ic logit AnyDzCalc $model6 $LactXMtot $ AOPXNL estat ic eststo logit AnyDzCalc $model6 $LactXMtot $AOPXNL $AOPXBCS estat ic logit AnyDzCalc $model6 $Lac tXMtot $AOPXNL $vitaminaXCalcium estat ic log it AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa estat ic eststo logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa $vitaminaXNL estat ic logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa $vitaminaXBCS estat ic logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa c.calcium#c.lognefasq estat ic logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa c.calcium#c.NLratio estat ic logit AnyDzCalc $model6 $L actXMtot $AOPXNL $vitaminaXNefa c.calcium#c.bcsca t2 estat ic logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa c.lognefasq#c.NLratio estat ic 275 logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa c.lognefasq#c.bcscat2 estat ic est sto logit AnyDzCalc $model6 $LactXMtot $AOPXNL $vitaminaXNefa c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2 e stat ic eststo esttab, aic /// title("Model 6") /*Model 7*/ eststo clear global model7 "$LactCat $season $Mtot $saaugml $albumi n $aopugul $alphatocopherol calcium lognefasq NLr atio bcscat2" logit AnyDzCalc $model7 $LactXSeason esta t ic logit AnyDzCalc $model7 $LactXMtot estat ic eststo logit AnyDzCalc $model7 $LactXMtot $LactXsaa estat ic eststo logit AnyDzCalc $model7 $LactXMtot $LactXsaa $LactXalb estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $LactXAOP estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $LactXalpha estat ic logit AnyDzCalc $model7 $LactXMtot $LactX saa $LactXCalc estat ic logit AnyDzCalc $m odel7 $LactXMtot $LactXsaa $LactXNefa es tat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $LactXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $LactXBCS estat ic logit A nyDzCalc $model7 $LactXMtot $LactXsaa $SeasonXMto t estat ic logit AnyDzCalc $model 7 $LactXMtot $LactXsaa $SeasonXsaa estat ic logit AnyDzCalc $model7 $LactXMtot 
$LactXsaa $SeasonXalb estat ic logit AnyDzCalc $model7 $LactXMtot $L actXsaa $SeasonXAOP estat ic logit AnyDzCa lc $model7 $LactXMtot $LactXsaa $SeasonXalp ha estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $SeasonXCalc estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $SeasonXNefa esta t ic logit AnyDzCalc $model7 $LactXMtot $L actXsaa $SeasonXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $SeasonXBCS estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $MtotXsaa estat ic logit AnyDzCalc $model 7 $LactXMtot $LactXsaa $MtotXalb estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $MtotXAOP estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $MtotXalpha estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $MtotXCalciu m estat ic logit AnyDzCalc $model7 $Lac tXMtot $LactXsaa $MtotXNefa estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $MtotXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $MtotXBCS estat ic logit AnyDzCalc $ model7 $LactXMtot $LactXsaa $saaXalb estat ic eststo logit Any DzCalc $model7 $LactXMtot $LactXsaa $saaXalb $saaXAOP estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $saaXalpha estat ic logit AnyDzCalc $model7 $L actXMtot $LactXsaa $saaXalb $saaXCalcium estat ic 276 logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $saaXNefa estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $saaXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $saaXBCS estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $albXAOP estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $albXalpha estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $ saaXalb $albXCalcium estat ic logit AnyDzC alc $model7 $LactXM tot $LactXsaa $saaXalb $albXNefa estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $albXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $al bXBCS estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsa a $saaXalb $AOPXalpha estat ic logit 
AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXCalcium estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXNefa estat ic logit AnyDzCalc $model7 $LactX Mtot $LactXsaa $saa Xalb $AOPXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS estat ic eststo logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alp haXCalcium estat ic logit AnyDzCalc $mo del7 $LactXMtot $La ctXsaa $saaXalb $AOPXBCS $alphaXNefa estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXNL estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $ saaXalb $AOPXBCS $alphaXBCS estat ic lo git AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.lognefasq estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.NLratio estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $ AOPX BCS c.calcium#c.bcscat2 estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.NLratio estat ic logit AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBC S c.lognefasq#c.bcscat2 estat ic eststo logi t AnyDzCalc $model7 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2 estat ic esttab, aic /// title("Model 7") /*Model 8*/ eststo clear global model8 "$LactCat $season $Mtot $albumin $aopugul calc ium lognefasq NLratio" logit AnyDzCalc $model8 estat ic logit AnyDzCalc $model8 $LactXSeason estat ic logit AnyDzCalc $model8 $LactXMtot estat ic eststo logit AnyDzCalc $model8 $Lact XMtot $LactXalb estat ic logit AnyDzCalc $mod el8 $LactXMtot $LactXAOP estat ic logit AnyDzCalc $model8 $LactXMtot $LactXCalc estat ic logit AnyDzCalc $model8 $LactXMtot $LactXNefa estat ic logit AnyDzCalc $model8 $LactXMtot $LactXNL estat ic logit AnyDzCalc $model8 $LactX Mtot $SeasonXMtot estat ic logit AnyDzCalc $model8 $LactXMtot $SeasonXalb estat ic logit AnyDzCalc $model8 $LactXMtot $SeasonXAOP 277 estat ic logit AnyDzCalc $model8 $LactXMtot $SeasonXCalc estat 
ic logit AnyDzCalc $model8 $LactXMtot $ Seas onXNefa estat ic logit AnyDzCalc $model8 $LactXMtot $SeasonXNL estat ic logit AnyDzCalc $model8 $LactXMtot $MtotXalb estat ic logit AnyDzCalc $model8 $LactXMtot $MtotXAOP estat ic log it AnyDzCalc $model8 $LactXMtot $MtotXCalcium e stat ic logit AnyDzCalc $model8 $LactXMtot $MtotXNefa estat ic logit AnyDzCalc $model8 $LactXMtot $MtotXNL estat ic logit AnyDzCalc $model8 $LactXMtot $albXAOP estat ic logit AnyDzCalc $model8 $LactXMtot $albXCalcium estat ic l ogit AnyDzCalc $model8 $LactXMtot $albXNefa estat ic logit AnyDzCalc $model8 $LactXMtot $albXNL estat ic logit AnyDzCalc $model8 $LactXMtot $AOPXCalcium estat ic logit AnyDzCalc $model8 $L actXMtot $AOPXNefa estat ic logit AnyDzCalc $model8 $LactXMtot $AOPXNL estat ic logit AnyDzCalc $model8 $LactXMtot c.calcium#c.lognefasq estat ic logit AnyDzCalc $model8 $LactXMtot c.calcium#c.NLratio estat ic logit AnyDzCalc $model 8 $LactXMtot c.lognefasq#c.NLrat io estat ic eststo esttab, aic /// title("Model 8") /*Model 9 had both NL ratio and Ntot, removed Ntot, same as model 1 so removed*/ /*Model 10*/ eststo clear global model10 "$LactCat $season $Mtot $aopugul calcium lognefasq NLratio bcscat2" lo git AnyDzCalc $model10 estat ic logit AnyDzCalc $model10 $LactXSeason estat ic logit AnyDzCalc $model10 $LactXMtot estat ic eststo logit AnyDzCalc $model10 $LactXMtot $LactXAOP estat ic logit AnyDzC alc $model10 $LactXMtot $LactXCa lc estat ic logit AnyDzCalc $model10 $LactXMtot $LactXNefa estat ic logit AnyDzCalc $model10 $LactXMtot $LactXNL estat ic logit AnyDzCalc $model10 $LactXMtot $LactXBCS estat ic logit An yDzCalc $model10 $LactXMtot $SeasonXMtot estat ic logit AnyDzCalc $model10 $LactXMtot $SeasonXAOP estat ic logit AnyDzCalc $model10 $LactXMtot $SeasonXCalc estat ic logit AnyDzCalc $model10 $LactXMtot $SeasonXNefa estat ic logit AnyDz Calc $model10 $LactXMtot $SeasonXNL estat ic logit AnyDzCalc $model10 $LactXMtot $SeasonXBCS estat ic logit AnyDzCalc $model10 $LactXMtot 
$MtotXAOP estat ic logit AnyDzCalc $model10 $LactXMtot $MtotXCalcium estat ic 278 logit AnyDzCalc $mode l10 $LactXMtot $MtotXNefa estat ic log it AnyDzCalc $model10 $LactXMtot $MtotXNL estat ic logit AnyDzCalc $model10 $LactXMtot $MtotXBCS estat ic logit AnyDzCalc $model10 $LactXMtot $AOPXCalcium estat ic logit AnyDzCalc $model10 $L actXM tot $AOPXNefa estat ic logit AnyDzCalc $ model10 $LactXMtot $AOPXNL estat ic logit AnyDzCalc $model10 $LactXMtot $AOPXBCS estat ic eststo logit AnyDzCalc $model10 $LactXMtot $AOPXBCS c.calcium#c.lognefasq estat ic logit AnyDzCalc $m odel10 $LactXMtot $AOPXBCS c.calcium#c.NLratio estat ic logit AnyDzCalc $model10 $LactXMtot $AOPXBCS c.calcium#c.bcscat2 estat ic logit AnyDzCalc $model10 $LactXMtot $AOPXBCS c.lognefasq#c.NLratio estat ic logit AnyDzCalc $mod el10 $LactXM tot $AOPXBCS c.lognefasq#c.bcscat2 estat ic eststo logit AnyDzCalc $model10 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2 /// c.NLratio#c.bcscat2 estat ic esttab, aic /// title("Model 10") /*Model 11*/ eststo clear global mode l11 "$LactCat $season $Mtot $albumin $aopugul cal cium lognefasq NLratio bcscat2 bhb" logit AnyDzCalc $model11 estat ic logit AnyDzCalc $model11 $LactXSeason estat ic logit AnyDzCalc $model11 $LactXMtot estat ic eststo logit AnyDz Calc $model11 $LactXMtot $LactXalb estat ic logit AnyDzCalc $model11 $LactXMtot $LactXAOP estat ic logit AnyDzCalc $model11 $LactXMtot $LactXCalc estat ic logit AnyDzCalc $model11 $LactXMtot $LactXNefa estat ic logit AnyDzCalc $model11 $LactXMtot $LactXNL estat ic logi t AnyDzCalc $model11 $LactXMtot $LactXBCS estat ic logit AnyDzCalc $model11 $LactXMtot $LactXbhb estat ic logit AnyDzCalc $model11 $LactXMtot $SeasonXMtot estat ic logit AnyDzCalc $model 11 $LactXMtot $SeasonXalb estat ic logit A nyDzCalc $model11 $LactXMtot $SeasonXAOP estat ic logit AnyDzCalc $model11 $LactXMtot $SeasonXCalc estat ic logit AnyDzCalc $model11 $LactXMtot $SeasonXNefa estat ic logi t AnyDzCalc $mod el11 $LactXMtot $SeasonXNL estat ic logit 
    AnyDzCalc $model11 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model11 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model11 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model11 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model11 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model11 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model11 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model11 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model11 $LactXMtot $albXbhb
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.bhb
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2 ///
    c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bhb
estat ic
logit AnyDzCalc $model11 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model11 ///
    $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2 ///
    c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 11")

/*Model 12*/
eststo clear
global model12 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium lognefasq NLratio"
logit AnyDzCalc $model12 $LactXSeason
estat ic
logit AnyDzCalc $model12 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model12 $LactXMtot $LactXsaa
estat ic
eststo
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $LactXalb
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $LactXAOP
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $LactXalpha
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $LactXCalc
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $LactXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $LactXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXMtot
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXsaa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXalb
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXAOP
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXalpha
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXCalc
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $SeasonXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXsaa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXalb
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXAOP
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXalpha
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXCalcium
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $MtotXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb
estat ic
eststo
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model12 ///
    $LactXMtot $LactXsaa $saaXalb $saaXalpha
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $albXalpha
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $albXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $AOPXalpha
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $alphaXCalcium
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $alphaXNefa
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb $alphaXNL
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model12 $LactXMtot $LactXsaa $saaXalb c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 12")

/*Model 13*/
eststo clear
global model13 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio"
logit AnyDzCalc $model13
estat ic
logit AnyDzCalc $model13 $LactXSeason
estat ic
logit AnyDzCalc $model13 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model13 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model13 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model13 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model13 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model13 $LactXMtot ///
    $LactXNefa
estat ic
logit AnyDzCalc $model13 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $SeasonXNL
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $MtotXsaa
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $MtotXalb
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $MtotXAOP
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $MtotXCalcium
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $MtotXNefa
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $MtotXNL
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $saaXalb
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $saaXAOP
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $saaXCalcium
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $saaXNefa
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $saaXNL
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $albXAOP
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $albXCalcium
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $albXNefa
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $albXNL
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $AOPXCalcium
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $AOPXNefa
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa $AOPXNL
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model13 $LactXMtot $SeasonXNefa c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 13")

/*Model 14*/
eststo clear
global model14 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio"
logit AnyDzCalc $model14
estat ic
logit AnyDzCalc $model14 $LactXSeason
estat ic
logit AnyDzCalc $model14 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model14 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model14 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model14 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model14 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model14 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model14 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model14 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model14 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model14 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $alphaXvitamina
estat ic
logit AnyDzCalc $model14 $LactXMtot $alphaXCalcium
estat ic
logit AnyDzCalc $model14 $LactXMtot $alphaXNefa
estat ic
logit AnyDzCalc $model14 $LactXMtot $alphaXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $vitaminaXCalcium
estat ic
logit AnyDzCalc $model14 $LactXMtot $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model14 $LactXMtot $vitaminaXNefa $vitaminaXNL
estat ic
logit AnyDzCalc $model14 $LactXMtot $vitaminaXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model14 $LactXMtot $vitaminaXNefa c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model14 $LactXMtot $vitaminaXNefa c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 14")

/*Model 15*/
eststo clear
global model15 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio"
logit AnyDzCalc $model15 $LactXSeason
estat ic
logit AnyDzCalc $model15 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model15 $LactXMtot $LactXsaa
estat ic
eststo
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXalb
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXAOP
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXalpha
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXCalc
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $LactXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXMtot
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXsaa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXalb
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXAOP
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa ///
    $SeasonXalpha
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXCalc
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $SeasonXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXsaa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXalb
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXAOP
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXalpha
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXCalcium
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $MtotXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb
estat ic
eststo
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $saaXalpha
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $saaXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $albXalpha
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $albXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $albXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXalpha
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL
estat ic
eststo
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $alphaXvitamina
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $alphaXCalcium
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $alphaXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $alphaXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $vitaminaXCalcium
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $vitaminaXNefa
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL $vitaminaXNL
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model15 $LactXMtot $LactXsaa $saaXalb $AOPXNL c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 15")

/*Model 16*/
eststo clear
global model16 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq NLratio"
logit AnyDzCalc $model16
estat ic
logit AnyDzCalc $model16 $LactXSeason
estat ic
logit AnyDzCalc $model16 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model16 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model16 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model16 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model16 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model16 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model16 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model16 $LactXMtot ///
    $SeasonXCalc
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model16 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model16 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model16 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model16 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model16 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model16 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model16 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNL
estat ic
eststo
logit AnyDzCalc $model16 $LactXMtot $AOPXNL $alphaXCalcium
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNL $alphaXNefa
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNL $alphaXNL
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNL c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNL c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model16 $LactXMtot $AOPXNL c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 16")

/*Model 17*/
eststo clear
global model17 "$LactCat $season $Mtot $aopugul $alphatocopherol $vitamina lognefasq calcium NLratio bcscat2"
logit AnyDzCalc $model17
estat ic
logit AnyDzCalc $model17 $LactXSeason
estat ic
logit AnyDzCalc $model17 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model17 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS
estat ic
eststo
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXMtot
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXAOP
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXalpha
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXvitamina
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXNefa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXCalc
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXNL
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $SeasonXBCS
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXAOP
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXalpha
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXvitamina
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXNefa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXCalcium
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXNL
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $MtotXBCS
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXalpha
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXvitamina
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXNefa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXCalcium
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXNL
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $alphaXvitamina
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $alphaXNefa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS ///
    $vitaminaXNefa
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $vitaminaXCalcium
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $vitaminaXNL
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS $vitaminaXBCS
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.calcium
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.bcscat2 ///
    c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.bcscat2 ///
    c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model17 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 17")

/*Model 18*/
eststo clear
global model18 "$LactCat $season $Mtot $aopugul $vitamina calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model18
estat ic
logit AnyDzCalc $model18 $LactXSeason
estat ic
logit AnyDzCalc $model18 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model18 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model18 $LactXMtot ///
    $MtotXvitamina
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXCalcium
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa $vitaminaXNL
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa $vitaminaXBCS
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model18 $LactXMtot $AOPXBCS $vitaminaXNefa c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 18")

/*Model 19*/
eststo clear
global model19 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq bcscat2"
logit AnyDzCalc $model19
estat ic
logit AnyDzCalc $model19 $LactXSeason
estat ic
logit AnyDzCalc $model19 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model19 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model19 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model19 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model19 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model19 ///
    $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model19 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model19 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model19 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model19 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model19 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model19 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model19 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model19 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model19 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model19 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model19 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model19 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model19 $LactXMtot $AOPXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXvitamina
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXCalcium
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa
estat ic
eststo
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa $alphaXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa $vitaminaXCalcium
estat ic
logit AnyDzCalc $model19 ///
    $LactXMtot $alphaXNefa $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa $vitaminaXNefa ///
    $vitaminaXBCS
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa $vitaminaXNefa ///
    c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa $vitaminaXNefa ///
    c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model19 $LactXMtot $alphaXNefa $vitaminaXNefa ///
    c.lognefasq#c.bcscat2
estat ic
eststo
esttab, aic ///
    title("Model 19")

/*Model 20*/
eststo clear
global model20 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model20
estat ic
logit AnyDzCalc $model20 $LactXSeason
estat ic
logit AnyDzCalc $model20 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model20 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa
estat ic
eststo
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $SeasonXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $SeasonXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXalb
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXAOP
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXalpha
estat ic
logit ///
    AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXvitamina
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXCalcium
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXNefa
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $MtotXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXAOP
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXalpha
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXvitamina
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXCalcium
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXNefa
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $albXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $AOPXalpha
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $AOPXvitamina
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $AOPXCalcium
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $AOPXNefa
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $AOPXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $AOPXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $alphaXvitamina
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $alphaXCalcium
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $alphaXNefa
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $alphaXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $alphaXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $vitaminaXCalcium
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $vitaminaXNefa
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $vitaminaXNL
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa $vitaminaXBCS
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc ///
    $model20 $LactXMtot $SeasonXNefa c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model20 $LactXMtot $SeasonXNefa c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 20")

/*Model 21*/
eststo clear
global model21 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model21
estat ic
logit AnyDzCalc $model21 $LactXSeason
estat ic
logit AnyDzCalc $model21 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model21 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model21 ///
    $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $albXbhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL
estat ic
eststo
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $alphaXvitamina
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $alphaXCalcium
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $alphaXNefa
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $alphaXNL
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $alphaXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $alphaXbhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXCalcium
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXNL
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXBCS
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.calcium#c.bhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.NLratio#c.bhb
estat ic
logit AnyDzCalc $model21 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 21")

/*Model 22 - duplicate of 6 once Ntot deleted.
(NL ratio and Ntot both in model*/

/*Model 23*/
eststo clear
global model23 "$LactCat $season $Mtot $albumin $aopugul calcium NLratio bcscat2"
logit AnyDzCalc $model23
estat ic
logit AnyDzCalc $model23 $LactXSeason
estat ic
logit AnyDzCalc $model23 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model23 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model23 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model23 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model23 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model23 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model23 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model23 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model23 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model23 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model23 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model23 $LactXMtot $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model23 $LactXMtot $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model23 $LactXMtot $AOPXBCS c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 23")

/*Model 24 is the same as model 4, so excluded*/
/*Model 25 is the same as model 3, so excluded*/

/*Model 26*/
eststo clear
global model26 "$LactCat $season $Mtot $saaugml $albumin $aopugul $vitamina lognefasq calcium NLratio bcscat2"
logit AnyDzCalc $model26
estat ic
logit AnyDzCalc $model26 $LactXSeason
estat ic
logit AnyDzCalc $model26 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model26 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model26 $LactXMtot $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $saaXvitamina
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $albXvitamina
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXvitamina
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS $vitaminaXNefa
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS $vitaminaXCalcium
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS $vitaminaXNL
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS $vitaminaXBCS
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.calcium
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio
estat ic
eststo
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.lognefasq#c.bcscat2 ///
    c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.lognefasq#c.bcscat2 ///
    c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model26 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 26")

/*Model 27*/
eststo clear
global model27 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model27
estat ic
logit AnyDzCalc $model27 $LactXSeason
estat ic
logit AnyDzCalc $model27 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model27 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model27 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model27 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model27 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model27 $LactXMtot $albXbhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb $alphaXCalcium
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb $alphaXNefa
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb $alphaXNL
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb $alphaXBCS
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb $alphaXbhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.calcium#c.bhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.NLratio#c.bhb
estat ic
eststo
logit AnyDzCalc $model27 $LactXMtot $AOPXBCS $AOPXbhb c.NLratio#c.bhb c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 27")

/*Model 28*/
eststo clear
global model28 "$LactCat $season $Mtot $aopugul calcium lognefasq NLratio"
logit AnyDzCalc $model28
estat ic
logit AnyDzCalc $model28 $LactXSeason
estat ic
logit AnyDzCalc $model28 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model28 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model28 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model28 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model28 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model28 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model28 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model28 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model28 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model28 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model28 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model28 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model28 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model28 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model28 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model28 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model28 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model28 $LactXMtot c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model28 $LactXMtot c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model28 $LactXMtot c.lognefasq#c.NLratio
estat ic
eststo
esttab, aic ///
    title("Model 28")

/*Model 29*/
eststo clear
global model29 "$LactCat $season $Mtot $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model29
estat ic
logit AnyDzCalc $model29 $LactXSeason
estat ic
logit AnyDzCalc $model29 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model29 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model29 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model29 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model29 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model29 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model29 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model29 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model29 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model29 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model29 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model29 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model29 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model29 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS $alphaXNefa
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model29 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 29")

/*Model 30*/
eststo clear
global model30 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model30
estat ic
logit AnyDzCalc $model30 $LactXSeason
estat ic
logit AnyDzCalc $model30 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model30 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model30 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model30 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model30 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model30 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model30 $LactXMtot $saaXAOP
estat ic
logit AnyDzCalc $model30 $LactXMtot $saaXCalcium
estat ic
logit AnyDzCalc $model30 $LactXMtot $saaXNefa
estat ic
logit AnyDzCalc $model30 $LactXMtot $saaXNL
estat ic
logit AnyDzCalc $model30 $LactXMtot $saaXBCS
estat ic
logit AnyDzCalc $model30 $LactXMtot $saaXbhb
estat ic
logit AnyDzCalc $model30 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model30 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model30 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model30 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model30 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model30 $LactXMtot $albXbhb
estat ic
logit AnyDzCalc $model30 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model30 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model30 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model30 $LactXMtot $AOPXBCS
estat ic
logit AnyDzCalc $model30 $LactXMtot $AOPXbhb
estat ic
logit AnyDzCalc $model30 $LactXMtot c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model30 $LactXMtot c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model30 $LactXMtot c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model30 $LactXMtot c.calcium#c.bhb
estat ic
logit AnyDzCalc $model30 $LactXMtot c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model30 $LactXMtot c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model30 $LactXMtot c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model30 $LactXMtot c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model30 $LactXMtot c.NLratio#c.bhb
estat ic
logit AnyDzCalc $model30 $LactXMtot c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 30")

/*Model 31*/
eststo clear
global model31 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq $bcscat"
logit AnyDzCalc $model31 $LactXSeason
estat ic
logit AnyDzCalc $model31 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model31 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model31 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model31 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model31 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model31 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model31 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model31 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model31 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model31 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model31 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model31 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model31 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model31 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model31 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model31 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model31 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model31 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model31 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model31 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model31 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model31 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model31 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model31 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model31 $LactXMtot $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model31 $LactXMtot $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model31 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
esttab, aic ///
    title("Model 31")

/*Model 32*/
eststo clear
global model32 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq NLratio bhb"
logit AnyDzCalc $model32
estat ic
logit AnyDzCalc $model32 $LactXSeason
estat ic
logit AnyDzCalc $model32 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model32 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model32 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model32 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model32 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model32 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model32 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model32 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model32 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model32 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model32 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model32 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model32 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model32 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model32 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model32 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model32 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model32 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model32 $LactXMtot $albXbhb
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb c.calcium#c.bhb
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb c.lognefasq#c.NLratio
estat ic
eststo
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb c.lognefasq#c.NLratio ///
    c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model32 $LactXMtot $AOPXbhb c.lognefasq#c.NLratio ///
    c.NLratio#c.bhb
estat ic
esttab, aic ///
    title("Model 32")

/*Model 33*/
eststo clear
global model33 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq NLratio"
logit AnyDzCalc $model33
estat ic
logit AnyDzCalc $model33 $LactXSeason
estat ic
logit AnyDzCalc $model33 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model33 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model33 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model33 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model33 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model33 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model33 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model33 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model33 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model33 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model33 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model33 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model33 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model33 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model33 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model33 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model33 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model33 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model33 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXNL
estat ic
eststo
logit AnyDzCalc $model33 $LactXMtot $AOPXNL $vitaminaXCalcium
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXNL $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model33 $LactXMtot $AOPXNL $vitaminaXNefa ///
    $vitaminaXNL
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXNL $vitaminaXNefa ///
    c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXNL $vitaminaXNefa ///
    c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model33 $LactXMtot $AOPXNL $vitaminaXNefa ///
    c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 33")

/*Model 34 same as model 5 - removed*/
/*Model 35 same as model 8 - removed*/

/*Model 36 */
eststo clear
global model36 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model36
estat ic
logit AnyDzCalc $model36 $LactXSeason
estat ic
logit AnyDzCalc $model36 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model36 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model36 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model36 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model36 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model36 $LactXMtot $albXbhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL
estat ic
eststo
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXBCS
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXCalcium
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXNL
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXBCS
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb
estat ic
eststo
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.calcium#c.bhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.NLratio#c.bhb
estat ic
logit AnyDzCalc $model36 $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa ///
    $vitaminaXbhb c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 36")

/*Model 37*/
eststo clear
global model37 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model37
estat ic
logit AnyDzCalc $model37 $LactXSeason
estat ic
logit AnyDzCalc $model37 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model37 $LactXMtot $LactXsaa
estat ic
eststo
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXalb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXAOP
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXalpha
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXCalc
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $LactXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXMtot
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXsaa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXalb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXAOP
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXalpha
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXCalc
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb
estat ic
eststo
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXsaa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXalb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXAOP
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXalpha
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXCalcium
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $MtotXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb
estat ic
eststo
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXalpha
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $saaXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXalpha
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $albXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXalpha
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $AOPXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $alphaXvitamina
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $alphaXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $alphaXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $vitaminaXCalcium
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $vitaminaXNefa
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $vitaminaXNL
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $vitaminaXBCS
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS $vitaminaXbhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.calcium#c.bhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.lognefasq#c.bcscat2
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.NLratio#c.bhb
estat ic
logit AnyDzCalc $model37 $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 37")

/*Model 38*/
eststo clear
global model38 "$LactCat $season $Mtot $aopugul calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model38
estat ic
logit AnyDzCalc $model38 $LactXSeason
estat ic
logit AnyDzCalc $model38 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model38 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model38 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model38 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model38 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model38 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model38 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model38 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb
estat ic
eststo
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXCalcium
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXNefa
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXNL
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.calcium#c.bhb
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.lognefasq#c.bcscat2 c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.lognefasq#c.bcscat2 c.NLratio#c.bhb
estat ic
logit AnyDzCalc $model38 $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb ///
    c.lognefasq#c.bcscat2 c.bcscat2#c.bhb
estat ic
esttab, aic ///
    title("Model 38")

/*Model 39 same as model 7 once removed Ntot.. NL and Ntot both in model*/

/*Model 40*/
eststo clear
global model40 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium NLratio bcscat2"
logit AnyDzCalc $model40
estat ic
logit AnyDzCalc $model40 $LactXSeason
estat ic
logit AnyDzCalc $model40 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model40 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model40 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model40 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model40 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model40 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model40 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model40 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model40 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model40 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model40 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model40 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model40 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model40 $LactXMtot $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model40 $LactXMtot $saaXalb $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model40 $LactXMtot $saaXalb $AOPXBCS c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 40")

/*Model 41*/
eststo clear
global model41 "$LactCat $season $Mtot $aopugul lognefasq NLratio bcscat2"
logit AnyDzCalc $model41
estat ic
logit AnyDzCalc $model41 $LactXSeason
estat ic
logit AnyDzCalc $model41 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model41 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model41 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model41 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model41 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model41 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model41 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model41 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model41 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model41 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model41 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model41 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model41 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model41 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model41 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model41 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model41 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model41 $LactXMtot $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model41 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model41 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 41")

/*Model 42*/
eststo clear
global model42 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 bhb"
logit AnyDzCalc $model42
estat ic
logit AnyDzCalc $model42 $LactXSeason
estat ic
logit AnyDzCalc $model42 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model42 $LactXMtot $LactXsaa
estat ic
eststo
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXalb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXAOP
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXalpha
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXCalc
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXBCS
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $LactXbhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXMtot
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXsaa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXalb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXAOP
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXalpha
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXCalc
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXBCS
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $SeasonXbhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXsaa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXalb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXAOP
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXalpha
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXCalcium
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXBCS
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $MtotXbhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb
estat ic
eststo
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXalpha
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $saaXbhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXalpha
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $albXbhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXalpha
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXNefa
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXbhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.bhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.bcscat2 ///
    c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model42 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bhb
eststo
estat ic
esttab, aic ///
    title("Model 42")

/*Model 43 same as model 10 - duplicate*/
/*Model 44 same as model 11 - duplicate*/

/*Model 45*/
eststo clear
global model45 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium NLratio bcscat2"
logit AnyDzCalc $model45
estat ic
logit AnyDzCalc $model45 $LactXSeason
estat ic
logit AnyDzCalc $model45 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model45 $LactXMtot $LactXsaa
estat ic
eststo
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $LactXalb
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $LactXAOP
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $LactXalpha
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $LactXCalc
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $LactXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $LactXBCS
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXMtot
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXsaa
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXalb
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXAOP
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXalpha
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXCalc
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $SeasonXBCS
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXsaa
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXalb
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXAOP
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXalpha
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXCalcium
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $MtotXBCS
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb
estat ic
eststo
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $saaXalpha
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $albXalpha
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $albXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXalpha
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model45 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 45")

/*Model 46*/
eststo clear
global model46 "$LactCat $season $Mtot $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio"
logit AnyDzCalc $model46
estat ic
logit AnyDzCalc $model46 $LactXSeason
estat ic
logit AnyDzCalc $model46 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model46 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model46 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model46 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model46 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model46 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model46 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model46 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model46 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model46 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model46 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model46 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model46 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model46 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model46 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model46 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model46 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model46 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model46 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model46 $LactXMtot $alphaXvitamina
estat ic
logit AnyDzCalc $model46 $LactXMtot $alphaXCalcium
estat ic
logit AnyDzCalc $model46 $LactXMtot $alphaXNefa
estat ic
logit AnyDzCalc $model46 $LactXMtot $alphaXNL
estat ic
logit AnyDzCalc $model46 $LactXMtot $vitaminaXCalcium
estat ic
logit AnyDzCalc $model46 $LactXMtot $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model46 $LactXMtot $vitaminaXNefa $vitaminaXNL
estat ic
logit AnyDzCalc $model46 $LactXMtot $vitaminaXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model46 $LactXMtot $vitaminaXNefa c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model46 $LactXMtot $vitaminaXNefa c.lognefasq#c.NLratio
estat ic
esttab, aic ///
    title("Model 46")

/*Model 47*/
eststo clear
global model47 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq bcscat2"
logit AnyDzCalc $model47
estat ic
logit AnyDzCalc $model47 $LactXSeason
estat ic
logit AnyDzCalc $model47 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model47 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model47 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model47 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model47 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model47 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model47 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model47 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model47 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model47 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model47 $LactXMtot $MtotXvitamina
estat ic
logit AnyDzCalc $model47 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model47 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model47 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model47 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model47 $LactXMtot $albXvitamina
estat ic
logit AnyDzCalc $model47 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model47 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model47 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model47 $LactXMtot $AOPXvitamina
estat ic
logit AnyDzCalc $model47 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model47 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model47 $LactXMtot $AOPXBCS
estat ic
logit AnyDzCalc $model47 $LactXMtot $vitaminaXCalcium
estat ic
logit AnyDzCalc $model47 $LactXMtot $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model47 $LactXMtot $vitaminaXNefa $vitaminaXBCS
estat ic
logit AnyDzCalc $model47 $LactXMtot $vitaminaXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model47 $LactXMtot $vitaminaXNefa c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model47 $LactXMtot $vitaminaXNefa c.lognefasq#c.bcscat2
estat ic
eststo
esttab, aic ///
    title("Model 47")

/*Model 48*/
eststo clear
global model48 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq bcscat2"
logit AnyDzCalc $model48
estat ic
logit AnyDzCalc $model48 $LactXSeason
estat ic
logit AnyDzCalc $model48 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model48 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model48 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model48 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model48 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model48 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model48 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model48 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model48 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model48 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model48 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model48 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model48 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model48 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model48 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model48 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model48 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model48 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model48 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model48 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model48 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model48 $LactXMtot $AOPXNefa
estat ic
logit AnyDzCalc $model48 $LactXMtot $AOPXBCS
estat ic
logit AnyDzCalc $model48 $LactXMtot $alphaXCalcium
estat ic
logit AnyDzCalc $model48 $LactXMtot $alphaXNefa
estat ic
logit AnyDzCalc $model48 $LactXMtot $alphaXBCS
estat ic
logit AnyDzCalc $model48 $LactXMtot c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model48 $LactXMtot c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model48 $LactXMtot c.lognefasq#c.bcscat2
estat ic
eststo
esttab, aic ///
    title("Model 48")

/*Model 49*/
eststo clear
global model49 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium NLratio bcscat2"
logit AnyDzCalc $model49
estat ic
logit AnyDzCalc $model49 $LactXSeason
estat ic
logit AnyDzCalc $model49 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model49 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model49 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model49 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model49 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model49 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model49 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model49 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model49 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model49 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model49 $LactXMtot $MtotXalpha
estat ic
logit AnyDzCalc $model49 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model49 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model49 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model49 $LactXMtot $albXAOP
estat ic
logit AnyDzCalc $model49 $LactXMtot $albXalpha
estat ic
logit AnyDzCalc $model49 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model49 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model49 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXalpha
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXCalcium
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXNL
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS
estat ic
eststo
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS $alphaXCalcium
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS $alphaXNL
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS $alphaXBCS
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model49 $LactXMtot $AOPXBCS c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 49")

/*Model 50*/
eststo clear
global model50 "$LactCat $season $Mtot $albumin calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model50
estat ic
logit AnyDzCalc $model50 $LactXSeason
estat ic
logit AnyDzCalc $model50 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model50 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model50 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model50 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model50 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model50 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model50 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model50 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model50 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model50 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model50 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model50 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model50 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model50 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model50 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model50 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model50 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model50 $LactXMtot $albXCalcium
estat ic
logit AnyDzCalc $model50 $LactXMtot $albXNefa
estat ic
logit AnyDzCalc $model50 $LactXMtot $albXNL
estat ic
logit AnyDzCalc $model50 $LactXMtot $albXBCS
estat ic
logit AnyDzCalc $model50 $LactXMtot c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model50 $LactXMtot c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model50 $LactXMtot c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model50 $LactXMtot c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model50 $LactXMtot c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model50 $LactXMtot c.lognefasq#c.bcscat2 ///
    c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 50")

/*51 has both Ntot and NLratio*/

/*Model 52*/
eststo clear
global model52 "$LactCat $season $Mtot $saaugml $albumin calcium lognefasq NLratio bcscat2"
logit AnyDzCalc $model52
estat ic
logit AnyDzCalc $model52 $LactXSeason
estat ic
logit AnyDzCalc $model52 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model52 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model52 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model52 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model52 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model52 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model52 $LactXMtot $LactXBCS
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model52 $LactXMtot $SeasonXBCS
estat ic
logit AnyDzCalc $model52 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model52 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model52 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model52 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model52 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model52 $LactXMtot $MtotXBCS
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model52 $LactXMtot $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $saaXBCS
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb $albXBCS
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb c.calcium#c.bcscat2
estat ic
logit AnyDzCalc $model52 $LactXMtot $saaXalb c.lognefasq#c.NLratio
estat ic
eststo
logit AnyDzCalc $model52 $LactXMtot $saaXalb c.lognefasq#c.NLratio ///
    c.lognefasq#c.bcscat2
estat ic
eststo
logit AnyDzCalc $model52 $LactXMtot $saaXalb c.lognefasq#c.NLratio ///
    c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2
estat ic
esttab, aic ///
    title("Model 52")

/*Model 53*/
eststo clear
global model53 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio bhb"
logit AnyDzCalc $model53
estat ic
logit AnyDzCalc $model53 $LactXSeason
estat ic
logit AnyDzCalc $model53 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model53 $LactXMtot $LactXsaa
estat ic
logit AnyDzCalc $model53 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model53 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model53 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model53 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model53 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model53 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXsaa
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model53 $LactXMtot $SeasonXbhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXsaa
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXalb
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXAOP
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXCalcium
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXNefa
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXNL
estat ic
logit AnyDzCalc $model53 $LactXMtot $MtotXbhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb
estat ic
eststo
logit AnyDzCalc $model53 $LactXMtot $saaXalb $saaXAOP
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $saaXCalcium
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $saaXNefa
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $saaXNL
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $saaXbhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $albXAOP
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $albXCalcium
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $albXNefa
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $albXNL
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $albXbhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXCalcium
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXNefa
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXNL
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb
estat ic
eststo
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb c.calcium#c.bhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb c.lognefasq#c.NLratio
estat ic
eststo
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb c.lognefasq#c.NLratio ///
    c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model53 $LactXMtot $saaXalb $AOPXbhb c.lognefasq#c.NLratio ///
    c.NLratio#c.bhb
estat ic
esttab, aic ///
    title("Model 53")

/*Model 54*/
eststo clear
global model54 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bhb"
logit AnyDzCalc $model54
estat ic
logit AnyDzCalc $model54 $LactXSeason
estat ic
logit AnyDzCalc $model54 $LactXMtot
estat ic
eststo
logit AnyDzCalc $model54 $LactXMtot $LactXalb
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXAOP
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXalpha
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXvitamina
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXCalc
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXNefa
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $LactXbhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXMtot
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXalb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXAOP
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXalpha
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXvitamina
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXCalc
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXNefa
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb
estat ic
eststo
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXalb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXAOP
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXalpha
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXvitamina
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXCalcium
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXNefa
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $MtotXbhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXAOP
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXalpha
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXvitamina
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXCalcium
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXNefa
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $albXbhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $AOPXalpha
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $AOPXvitamina
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $AOPXCalcium
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $AOPXNefa
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $AOPXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $AOPXbhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $alphaXvitamina
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $alphaXCalcium
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $alphaXNefa
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $alphaXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $alphaXbhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXCalcium
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa
estat ic
eststo
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa $vitaminaXNL
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa $vitaminaXbhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa c.calcium#c.lognefasq
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa c.calcium#c.NLratio
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa c.calcium#c.bhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa c.lognefasq#c.NLratio
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa c.lognefasq#c.bhb
estat ic
logit AnyDzCalc $model54 $LactXMtot $SeasonXbhb $vitaminaXNefa c.NLratio#c.bhb
estat ic
esttab, aic ///
    title("Model 54")

********************************************************************************
/*Step 4: Test random effects*/
********************************************************************************
/*Note: testing random effects for inclusion in each candidate model using AIC
values. If variance is not estimated, excluded*/

/*Model 1*/
eststo clear
global model1 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model1
eststo
melogit AnyDzCalc $model1 || FarmId:
eststo
melogit AnyDzCalc $model1 || Cohort:
eststo
melogit AnyDzCalc $model1 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 1") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 2*/
eststo clear
global model2 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXNL c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model2
eststo
melogit AnyDzCalc $model2 || FarmId:
eststo
melogit AnyDzCalc $model2 || Cohort:
eststo
melogit AnyDzCalc $model2 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 2") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 3*/
eststo clear
global model3 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model3
eststo
melogit AnyDzCalc $model3 || FarmId:
eststo
melogit AnyDzCalc $model3 || Cohort:
eststo
melogit AnyDzCalc $model3 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 3") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 4*/
eststo clear
global model4 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio bcscat2 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio"
melogit AnyDzCalc $model4
eststo
melogit AnyDzCalc $model4 || FarmId:
eststo
melogit AnyDzCalc $model4 || Cohort:
eststo
melogit AnyDzCalc $model4 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 4") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 5*/
eststo clear
global model5 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 $LactXMtot $saaXalb $AOPXNL"
melogit AnyDzCalc $model5
eststo
melogit AnyDzCalc $model5 || FarmId:
eststo
melogit AnyDzCalc $model5 || Cohort:
eststo
melogit AnyDzCalc $model5 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 5") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 6*/
eststo clear
global model6 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXNL $vitaminaXNefa c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2"
melogit AnyDzCalc $model6
eststo
melogit AnyDzCalc $model6 || FarmId:
eststo
melogit AnyDzCalc $model6 || Cohort:
eststo
melogit AnyDzCalc $model6 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 6") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 7*/
eststo clear
global model7 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model7
eststo
melogit AnyDzCalc $model7 || FarmId:
eststo
melogit AnyDzCalc $model7 || Cohort:
eststo
melogit AnyDzCalc $model7 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 7") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 8*/
eststo clear
global model8 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq NLratio $LactXMtot c.lognefasq#c.NLratio"
melogit AnyDzCalc $model8
eststo
melogit AnyDzCalc $model8 || FarmId:
eststo
melogit AnyDzCalc $model8 || Cohort:
eststo
melogit AnyDzCalc $model8 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 8") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 9 excluded*/

/*Model 10*/
eststo clear
global model10 "$LactCat $season $Mtot $aopugul calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model10
eststo
melogit AnyDzCalc $model10 || FarmId:
eststo
melogit AnyDzCalc $model10 || Cohort:
eststo
melogit AnyDzCalc $model10 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 10") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 11*/
eststo clear
global model11 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq NLratio bcscat2 bhb $LactXMtot $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model11
eststo
melogit AnyDzCalc $model11 || FarmId:
eststo
melogit AnyDzCalc $model11 || Cohort:
eststo
melogit AnyDzCalc $model11 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 11") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 12*/
eststo clear
global model12 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium lognefasq NLratio $LactXMtot $LactXsaa $saaXalb"
melogit AnyDzCalc $model12
eststo
melogit AnyDzCalc $model12 || FarmId:
eststo
melogit AnyDzCalc $model12 || Cohort:
eststo
melogit AnyDzCalc $model12 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 12") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 13*/
eststo clear
global model13 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio $LactXMtot $SeasonXNefa"
melogit AnyDzCalc $model13
eststo
melogit AnyDzCalc $model13 || FarmId:
eststo
melogit AnyDzCalc $model13 || Cohort:
eststo
melogit AnyDzCalc $model13 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 13") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 14*/
eststo clear
global model14 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio $LactXMtot $vitaminaXNefa"
melogit AnyDzCalc $model14
eststo
melogit AnyDzCalc $model14 || FarmId:
eststo
melogit AnyDzCalc $model14 || Cohort:
eststo
melogit AnyDzCalc $model14 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 14") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 15*/
eststo clear
global model15 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio $LactXMtot $LactXsaa $saaXalb $AOPXNL"
melogit AnyDzCalc $model15
eststo
melogit AnyDzCalc $model15 || FarmId:
eststo
melogit AnyDzCalc $model15 || Cohort:
eststo
melogit AnyDzCalc $model15 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 15") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 16*/
eststo clear
global model16 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq NLratio $LactXMtot $AOPXNL"
melogit AnyDzCalc $model16
eststo
melogit AnyDzCalc $model16 || FarmId:
eststo
melogit AnyDzCalc $model16 || Cohort:
eststo
melogit AnyDzCalc $model16 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 16") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 17*/
eststo clear
global model17 "$LactCat $season $Mtot $aopugul $alphatocopherol $vitamina lognefasq calcium NLratio bcscat2 $LactXMtot $LactXBCS $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model17
eststo
melogit AnyDzCalc $model17 || FarmId:
eststo
melogit AnyDzCalc $model17 || Cohort:
eststo
melogit AnyDzCalc $model17 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 17") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 18*/
eststo clear
global model18 "$LactCat $season $Mtot $aopugul $vitamina calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXBCS $vitaminaXNefa c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model18
eststo
melogit AnyDzCalc $model18 || FarmId:
eststo
melogit AnyDzCalc $model18 || Cohort:
eststo
melogit AnyDzCalc $model18 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 18") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 19*/
eststo clear
global model19 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq bcscat2 $LactXMtot $alphaXNefa $vitaminaXNefa c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model19
eststo
melogit AnyDzCalc $model19 || FarmId:
eststo
melogit AnyDzCalc $model19 || Cohort:
eststo
melogit AnyDzCalc $model19 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 19") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 20*/
eststo clear
global model20 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 $LactXMtot $SeasonXNefa"
melogit AnyDzCalc $model20
eststo
melogit AnyDzCalc $model20 || FarmId:
eststo
melogit AnyDzCalc $model20 || Cohort:
eststo
melogit AnyDzCalc $model20 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 20") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 21*/
eststo clear
global model21 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 bhb $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa"
melogit AnyDzCalc $model21
eststo
melogit AnyDzCalc $model21 || FarmId:
eststo
melogit AnyDzCalc $model21 || Cohort:
eststo
melogit AnyDzCalc $model21 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 21") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 22 excluded - duplicate*/

/*Model 23*/
eststo clear
global model23 "$LactCat $season $Mtot $albumin $aopugul calcium NLratio bcscat2 $LactXMtot $AOPXBCS"
melogit AnyDzCalc $model23
eststo
melogit AnyDzCalc $model23 || FarmId:
eststo
melogit AnyDzCalc $model23 || Cohort:
eststo
melogit AnyDzCalc $model23 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 23") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 24 and model 25 dups - deleted*/

/*Model 26*/
eststo clear
global model26 "$LactCat $season $Mtot $saaugml $albumin $aopugul $vitamina lognefasq calcium NLratio bcscat2 $LactXMtot $saaXalb $AOPXBCS c.lognefasq#c.NLratio c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model26
eststo
melogit AnyDzCalc $model26 || FarmId:
eststo
melogit AnyDzCalc $model26 || Cohort:
eststo
melogit AnyDzCalc $model26 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 26") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 27*/
eststo clear
global model27 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 bhb $LactXMtot $AOPXBCS $AOPXbhb c.NLratio#c.bhb"
melogit AnyDzCalc $model27
eststo
melogit AnyDzCalc $model27 || FarmId:
eststo
melogit AnyDzCalc $model27 || Cohort:
eststo
melogit AnyDzCalc $model27 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 27") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 28*/
eststo clear
global model28 "$LactCat $season $Mtot $aopugul calcium lognefasq NLratio $LactXMtot c.lognefasq#c.NLratio"
melogit AnyDzCalc $model28
eststo
melogit AnyDzCalc $model28 || FarmId:
eststo
melogit AnyDzCalc $model28 || Cohort:
eststo
melogit AnyDzCalc $model28 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 28") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 29*/
eststo clear
global model29 "$LactCat $season $Mtot $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model29
eststo
melogit AnyDzCalc $model29 || FarmId:
eststo
melogit AnyDzCalc $model29 || Cohort:
eststo
melogit AnyDzCalc $model29 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 29") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 30*/
eststo clear
global model30 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio bcscat2 bhb $LactXMtot $saaXalb"
melogit AnyDzCalc $model30
eststo
melogit AnyDzCalc $model30 || FarmId:
eststo
melogit AnyDzCalc $model30 || Cohort:
eststo
melogit AnyDzCalc $model30 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 30") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 31*/
eststo clear
global model31 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq $bcscat $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model31
eststo
melogit AnyDzCalc $model31 || FarmId:
eststo
melogit AnyDzCalc $model31 || Cohort:
eststo
melogit AnyDzCalc $model31 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 31") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 32*/
eststo clear
global model32 "$LactCat $season $Mtot $albumin $aopugul calcium lognefasq NLratio bhb $LactXMtot $AOPXbhb c.lognefasq#c.NLratio"
melogit AnyDzCalc $model32
eststo
melogit AnyDzCalc $model32 || FarmId:
eststo
melogit AnyDzCalc $model32 || Cohort:
eststo
melogit AnyDzCalc $model32 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 32") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 33*/
eststo clear
global model33 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq NLratio $LactXMtot $AOPXNL $vitaminaXNefa"
melogit AnyDzCalc $model33
eststo
melogit AnyDzCalc $model33 || FarmId:
eststo
melogit AnyDzCalc $model33 || Cohort:
eststo
melogit AnyDzCalc $model33 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 33") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 34 and 35 duplicates - removed*/

/*Model 36*/
eststo clear
global model36 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq NLratio bcscat2 bhb $LactXMtot $AOPXNL $AOPXbhb $vitaminaXNefa $vitaminaXbhb"
melogit AnyDzCalc $model36
eststo
melogit AnyDzCalc $model36 || FarmId:
eststo
melogit AnyDzCalc $model36 || Cohort:
eststo
melogit AnyDzCalc $model36 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 36") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 37*/
eststo clear
global model37 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bcscat2 bhb $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOPXBCS"
melogit AnyDzCalc $model37
eststo
melogit AnyDzCalc $model37 || FarmId:
eststo
melogit AnyDzCalc $model37 || Cohort:
eststo
melogit AnyDzCalc $model37 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 37") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 38*/
eststo clear
global model38 "$LactCat $season $Mtot $aopugul calcium lognefasq NLratio bcscat2 bhb $LactXMtot $MtotXbhb $AOPXBCS $AOPXbhb c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model38
eststo
melogit AnyDzCalc $model38 || FarmId:
eststo
melogit AnyDzCalc $model38 || Cohort:
eststo
melogit AnyDzCalc $model38 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 38") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")
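
/*Sketch (not part of the original analysis): the four fits repeated for each
candidate model in Step 4 (no random effect, farm only, cohort only, both)
could be wrapped in one helper program. The program name compare_re and its
single argument are hypothetical. It assumes that the global macro named in
the argument (for example model38) has already been defined and that the
estout package (eststo, esttab) is installed. Defining the program fits
nothing until it is called, and a fit that errors out will stop the program.*/
capture program drop compare_re
program define compare_re
    version 14.2
    args modelname
    /*Expand the global whose name was passed in, e.g. $model38*/
    local rhs "${`modelname'}"
    eststo clear
    melogit AnyDzCalc `rhs'
    eststo
    melogit AnyDzCalc `rhs' || FarmId:
    eststo
    melogit AnyDzCalc `rhs' || Cohort:
    eststo
    melogit AnyDzCalc `rhs' || FarmId: || Cohort:
    eststo
    esttab, b(%9.0g) aic varwidth(4) ///
        title("`modelname'") ///
        mtitles("None" "Farm Only" "Cohort Only" "Both")
end
/*Example call (hypothetical): compare_re model38*/
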
/*Model 39 same as model 7 once removed Ntot.. NL and Ntot both in model*/

/*Model 40*/
eststo clear
global model40 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium NLratio bcscat2 $LactXMtot $saaXalb $AOPXBCS"
melogit AnyDzCalc $model40
eststo
melogit AnyDzCalc $model40 || FarmId:
eststo
melogit AnyDzCalc $model40 || Cohort:
eststo
melogit AnyDzCalc $model40 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 40") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 41*/
eststo clear
global model41 "$LactCat $season $Mtot $aopugul lognefasq NLratio bcscat2 $LactXMtot $AOPXBCS c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model41
eststo
melogit AnyDzCalc $model41 || FarmId:
eststo
melogit AnyDzCalc $model41 || Cohort:
eststo
melogit AnyDzCalc $model41 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 41") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 42*/
eststo clear
global model42 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 bhb $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq#c.bcscat2 c.NLratio#c.bhb"
melogit AnyDzCalc $model42
eststo
melogit AnyDzCalc $model42 || FarmId:
eststo
melogit AnyDzCalc $model42 || Cohort:
eststo
melogit AnyDzCalc $model42 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 42") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 43 same as model 10 - duplicate*/
/*Model 44 same as model 11 - duplicate*/

/*Model 45*/
eststo clear
global model45 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol calcium NLratio bcscat2 $LactXMtot $LactXsaa $saaXalb $AOPXBCS"
melogit AnyDzCalc $model45
eststo
melogit AnyDzCalc $model45 || FarmId:
eststo
melogit AnyDzCalc $model45 || Cohort:
eststo
melogit AnyDzCalc $model45 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 45") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 46*/
eststo clear
global model46 "$LactCat $season $Mtot $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio $LactXMtot $vitaminaXNefa"
melogit AnyDzCalc $model46
eststo
melogit AnyDzCalc $model46 || FarmId:
eststo
melogit AnyDzCalc $model46 || Cohort:
eststo
melogit AnyDzCalc $model46 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 46") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 47*/
eststo clear
global model47 "$LactCat $season $Mtot $albumin $aopugul $vitamina calcium lognefasq bcscat2 $LactXMtot $vitaminaXNefa c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model47
eststo
melogit AnyDzCalc $model47 || FarmId:
eststo
melogit AnyDzCalc $model47 || Cohort:
eststo
melogit AnyDzCalc $model47 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 47") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 48*/
eststo clear
global model48 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium lognefasq bcscat2 $LactXMtot c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model48
eststo
melogit AnyDzCalc $model48 || FarmId:
eststo
melogit AnyDzCalc $model48 || Cohort:
eststo
melogit AnyDzCalc $model48 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 48") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 49*/
eststo clear
global model49 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol calcium NLratio bcscat2 $LactXMtot $AOPXBCS"
melogit AnyDzCalc $model49
eststo
melogit AnyDzCalc $model49 || FarmId:
eststo
melogit AnyDzCalc $model49 || Cohort:
eststo
melogit AnyDzCalc $model49 || FarmId: || Cohort:
eststo
esttab, b(%9.0g) aic varwidth(4) ///
    title("Model 49") ///
    mtitles("None" "Farm Only" "Cohort Only" "Both")

/*Model 50*/
eststo clear
global model50 "$LactCat $season $Mtot $albumin calcium lognefasq NLratio bcscat2 $LactXMtot c.lognefasq#c.bcscat2"
melogit AnyDzCalc $model50
eststo
melogit AnyDzCalc $model50 || FarmId: eststo melogit AnyDzCalc $model50 || Cohort: eststo melogit AnyDzCalc $model50 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 50") /// mtitles("None" "Farm Only" "Co hort Only" "Both") /*Model 51 has both Nto t and nl ratio*/ /*Mod el 52*/ eststo clear global model52 "$LactCat $season $Mtot $saaugml $albumin calcium lognefasq NLratio bcscat2 $LactXMtot $saaXalb c.lognefasq#c.NLratio c.lognefasq#c.bcscat 2" melogit AnyDzCalc $model52 eststo melogi t AnyDzCalc $model52 || FarmId : eststo melogit AnyDzCalc $model52 || Cohort: eststo melogit AnyDzCalc $model52 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 5 2") /// mtitles("None" "Farm Only" "Cohort On ly" "Both") /*Model 53*/ eststo clear global model53 "$LactCat $season $Mtot $saaugml $albumin $aopugul calcium lognefasq NLratio bhb $LactXMtot $saaXalb $AOPXbhb c.lognefasq#c.NLratio" melogit AnyDz Calc $model53 eststo melogit AnyDzCalc $model5 3 || FarmId: eststo melogit AnyDzCalc $model53 || Cohort: eststo melogit AnyDzCalc $model53 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 53") /// mtitles ("None" "Farm Only" "Cohort Only" "Both") /*Model 54* / eststo clear global model54 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $vitamina calcium lognefasq NLratio bhb $LactXMtot $SeasonXbhb $vitaminaXNefa" melogit AnyDzCalc $m odel54 eststo melogit AnyDzCalc $model54 || Fa rmId: eststo melogit AnyDzCalc $model54 || Cohort: eststo melogit AnyDzCalc $model54 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 54") /// mtitles("None" "Farm Only" "Cohort Only" "Both") /*Model 55*/ eststo clear global model55 "$LactCat $season $Mtot $saaugml $albumin $aopugul $vitamina calcium lognefasq NLratio $LactXMtot $saaXNefa $AOPXNL" melogit AnyDzCalc $model55 eststo melogit AnyDzCa lc $model55 || FarmId: eststo melogit AnyDzCal c $model55 || Cohort: eststo melogit 
AnyDzCalc $model55 || FarmId: || Cohort: eststo 332 esttab, b(%9.0g) aic varwidth(4) /// title("Model 55") /// mtitles("None" "Farm Only" "Cohort Only" "Both ") /*Model 57*/ eststo clea r globa l model57 "$LactCat $season $Mtot $saaugml $albumin $alphatocopherol calcium lognefasq NLratio bcscat2 $LactXMtot $LactXsaa $saaXalb c.lognefasq#c.bcscat2" melogit AnyDzCalc $model57 eststo melogit AnyD zCalc $model57 || FarmId: eststo melogit AnyDz Calc $model57 || Cohort: eststo melogit AnyDzCalc $model57 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 57") /// mtitles("None" "Farm Only" "Cohort Only" "B oth") /*Model 61*/ eststo c lear global model61 "$LactCat $season $Mtot $saaugml $albumin calcium lognefasq NLratio bcscat2 $LactXMtot $saaXalb c.lognefasq#c.NLratio c.lognefasq#c.bcscat2 c.NLratio#c.bcscat2" melogit AnyDzCalc $model61 eststo melogit AnyDzCalc $model61 || FarmI d: eststo melogit AnyDzCalc $model61 || Cohort: eststo melogit AnyDzCalc $model61 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 61") /// mtitles("None" "Farm Only" "Co hort Only" "Both") /*Model 64*/ eststo clear global model64 "$LactCat $season $Mtot $albumin $aopugul $alphatocopherol $Ltot $vitamina calcium lognefasq bcscat2 $LactXMtot $LactXNefa $AOPXBCS $BCSXLtot" melogit AnyDzCalc $model64 eststo me logit AnyDzCalc $model64 || FarmId: eststo mel ogit AnyDzCalc $model64 || Cohort: eststo melogit AnyDzCalc $model64 || FarmId: || Cohort: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Model 64") /// mtitles("None" "Farm Only" "Cohor t Only" "Both") ************* ******************************************************************* /*Step 5: Reduce pool of candidate models*/ ******************************************************************************** /*Note: in this step, we ran the above models with the rando m effects included (where applicable) and exported the AIC values. 
Then, we ordered AICs and kept all within 4 AICs of the "best" model*/ eststo clear melogit AnyDzCalc $model1 eststo melogit AnyDzCalc $model2 eststo melogit AnyD zCalc $model3 eststo melogit AnyDzCalc $model4 eststo melogit AnyDzCalc $model5 eststo esttab, b(%9.0g) aic varwidth(4) /// 333 title("Models 1 - 5") eststo clear melogit AnyDzCalc $model6 eststo melogit AnyDzCal c $model7 eststo melogit AnyDzCalc $model8 ests to melogit AnyDzCalc $model10 eststo melogit AnyDzCalc $model11 eststo esttab, b(%9.0g) aic varwidth(4) /// title("Models 6 - 8, 10 - 11") eststo clear melogit AnyDzCalc $mo del12 eststo mel ogit AnyDzCalc $model13 eststo melogit AnyDzCal c $model14 eststo melogit AnyDzCalc $model15 eststo melogit AnyDzCalc $model16 eststo esttab, b(%9.0g) aic varwidth(4) /// title("Models 12 - 16") eststo clear melogit AnyDzCalc $model17 eststo melogit AnyDzCalc $model18 eststo melogit AnyDz Calc $model19 eststo melogit AnyDzCalc $model20 eststo melogit AnyDzCalc $model21 eststo esttab, b(%9.0g) aic varwidth(4) /// title("Models 17 - 21") eststo clear me logit AnyDzCalc $model23 e ststo melogit AnyDzCalc $model26 eststo melogit AnyDzCalc $model27 eststo melogit AnyDzCalc $model28 eststo melogit AnyDzCalc $model29 eststo esttab, b(%9.0g) aic varwidth(4) /// title("Models 23, 26 - 29") eststo clear melogit AnyDzCa lc $model30 eststo melogit AnyDzCalc $model31 e ststo melogit AnyDzCalc $model32 eststo melogit AnyDzCalc $model33 eststo melogit AnyDzCalc $model36 eststo melogit AnyDzCalc $model37 eststo 334 esttab, b(%9.0g) aic varwidth(4) /// title("Models 30 - 33, 36, 37") eststo clear melogit AnyD zCalc $model38 eststo melogit AnyDzCalc $model40 || FarmId: eststo melogit AnyDzCalc $model41 eststo melogit AnyDzCalc $model42 eststo melogit AnyDzCalc $model45 | | FarmId: eststo esttab, b(%9.0g) aic varwidth(4) /// title("Models 38, 40 - 42, 45") eststo clear melogit AnyDzCalc $model46 eststo melogit AnyDzCalc $model47 eststo melogit AnyDzCalc $model48 eststo melogit 
AnyDzCalc $model49 eststo melogit AnyDzCalc $model50 eststo e sttab, b(%9.0g) aic varwidth(4) /// title("Mo dels 46 - 50") eststo clear melogit AnyDzCalc $model52 eststo melogit AnyDzCalc $model53 eststo melogit AnyDzCalc $model54 eststo esttab , b(%9.0g) aic varwidth(4) /// title("Mod els 52 - 54") *********************** ********************************************************* /*Step 6: Model Diagnostics*/ ************************************************************************ ******** /* Note: for model diagnostics, rec orded discriminant ability (by using ROC) and t ested goodness - of - fit using hosmer - lemeshow. Looked at deviance as well as leverage graphs */ melogit AnyDzCalc $model45 || FarmId: predict phat roctab AnyDzCalc phat egen dec=cut(phat),at (0(0 .1)1) hl AnyDzCalc phat, q(dec) /*diagnostic plots*/ drop phat dec melogit AnyDzCalc $model45 || FarmId: predict phat predict hat scatter hat phat, mlab(LstcId) yline(0) gen id=_n predi ct dv, dev scatter dv id, mlab(LstcId) drop h at phat id dv melogit AnyDzCalc $model7 pr edict phat roctab AnyDzCalc phat egen dec=cut(phat),at (0(0.1)1) hl AnyDzCalc phat, q(dec) 335 /*diagnostic plots*/ drop phat dec logit AnyDzCalc $ model7 predict hat, hat predict phat scatter hat phat, mlab(LstcId) yline(0) gen id=_n p redict dv, dev scatter dv id, mlab(LstcId) drop phat hat id dv melogit AnyDzCalc $model37 predict phat roctab AnyDzCalc phat egen dec=cut(phat),at (0(0.1)1) hl AnyDzCalc phat, q(dec) /* diagnostic plots*/ drop phat dec logit AnyDz Calc $model37 predict phat predict hat, hat scatter hat phat, mlab(LstcId) yline(0) gen id=_n predict dv, d ev scatter dv id, mlab(LstcId) drop phat hat id dv melogit AnyDzCalc $model4 2 predict phat roctab AnyDzCalc phat egen dec=cut(phat),at (0(0.1)1) hl AnyDzCalc phat, q(dec) /*diagnostic plots*/ drop phat dec logit AnyDzCalc $model 42 predict phat predict hat, hat scatter hat phat, mlab(LstcId) yline(0) gen id= _n predict dv, dev scatter dv id, mlab(LstcId ) drop phat 
hat id dv

melogit AnyDzCalc $model15
predict phat
roctab AnyDzCalc phat
egen dec = cut(phat), at(0(0.1)1)
hl AnyDzCalc phat, q(dec)
/*diagnostic plots*/
drop phat dec
logit AnyDzCalc $model15
predict phat
predict hat, hat
scatter hat phat, mlab(LstcId) yline(0)
gen id=_n
predict dv, dev
scatter dv id, mlab(LstcId)
drop phat hat id dv

********************************************************************************
/*Step 7: Internal validation */
********************************************************************************
/*Calculating ROC using 500 bootstrapped samples for each candidate model*/
********************************************************************************
/*Note: For each model, I created a program that calculates AUC by hand. Added a
bsample line, so that with each run, a bootstrapped sample is used. I simulated
this calculation 500 times, and averaged the predictive ability to get a
bootstrapped estimate*/

/*Model 45*/
use full, clear
melogit AnyDzCalc $model45 || FarmId:
matrix b=e(b)
matrix list b
/* store coefficients 1-58 of e(b) as globals b_1_45 ... b_58_45 */
forvalues i = 1/58 {
    global b_`i'_45 = b[1,`i']
}

program model45, rclass
version 14.2
/* 1. calculate pprob */
use full, clear
bsample
gen logodds = $b_58_45 + ($b_1_45 * LactCat2) + ($b_2_45 * LactCat3) + ///
    ($b_3_45 * season2) + ($b_4_45 * season3) + ($b_5_45 * season4) + ///
    ($b_6_45 * MtotCat2) + ($b_7_45 * MtotCat3) + ($b_8_45 * MtotCat4) + ///
    ($b_9_45 * MtotCat5) + ($b_10_45 * saaugml42) + ($b_11_45 * saaugml43) + ///
    ($b_12_45 * saaugml44) + ($b_13_45 * albumin52) + ($b_14_45 * albumin53) + ///
    ($b_15_45 * albumin54) + ($b_16_45 * albumin55) + ($b_17_45 * aopugul52) + ///
    ($b_18_45 * aopugul53) + ($b_19_45 * aopugul54) + ($b_20_45 * aopugul55) + ///
    ($b_21_45 * alphatocopherol52) + ($b_22_45 * alphatocopherol53) + ///
    ($b_23_45 * alphatocopherol54) + ($b_24_45 * alphatocopherol55) + ///
    ($b_25_45 * calcium) + ($b_26_45 * NLratio) + ($b_27_45 * bcscat2) + ///
    ($b_28_45 * LactCat2 * MtotCat2) + ($b_29_45 * LactCat2 * MtotCat3) + ///
    ($b_30_45 * LactCat2 * MtotCat4) + ($b_31_45 * LactCat2 * MtotCat5) + ///
    ($b_32_45 * LactCat3 * MtotCat2) + ($b_33_45 * LactCat3 * MtotCat3) + ///
    ($b_34_45 * LactCat3 * MtotCat4) + ($b_35_45 * LactCat3 * MtotCat5) + ///
    ($b_36_45 * LactCat2 * saaugml42) + ($b_37_45 * LactCat2 * saaugml43) + ///
    ($b_38_45 * LactCat2 * saaugml44) + ($b_39_45 * LactCat3 * saaugml42) + ///
    ($b_40_45 * LactCat3 * saaugml43) + ($b_41_45 * LactCat3 * saaugml44) + ///
    ($b_42_45 * saaugml42 * albumin52) + ($b_43_45 * saaugml42 * albumin53) + ///
    ($b_44_45 * saaugml42 * albumin54) + ($b_45_45 * saaugml42 * albumin55) + ///
    ($b_46_45 * saaugml43 * albumin52) ///
+ ($b_47_45 * saaugml43 * albumin53) + ///
    ($b_48_45 * saaugml43 * albumin54) + ($b_49_45 * saaugml43 * albumin55) + ///
    ($b_50_45 * saaugml44 * albumin52) + ($b_51_45 * saaugml44 * albumin53) + ///
    ($b_52_45 * saaugml44 * albumin54) + ($b_53_45 * saaugml44 * albumin55) + ///
    ($b_54_45 * aopugul52 * bcscat2) + ($b_55_45 * aopugul53 * bcscat2) + ///
    ($b_56_45 * aopugul54 * bcscat2) + ($b_57_45 * aopugul55 * bcscat2)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model45 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model45), reps(500) seed(12345): model45
egen AUC1 = mean(AUC)
/*AUC1 .9049639 */

/*Model 37*/
use full, clear
melogit AnyDzCalc $model37
matrix b=e(b)
matrix list b
/* store coefficients 1-66 of e(b) as globals b_1_37 ... b_66_37 */
forvalues i = 1/66 {
    global b_`i'_37 = b[1,`i']
}

program model37, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_66_37 + ($b_1_37 * LactCat2) + ($b_2_37 * LactCat3) + ///
    ($b_3_37 * season2) + ($b_4_37 * season3) + ($b_5_37 * season4) + ///
    ($b_6_37 * MtotCat2) + ($b_7_37 * MtotCat3) + ($b_8_37 * MtotCat4) + ///
    ($b_9_37 * MtotCat5) + ($b_10_37 * saaugml42) + ($b_11_37 * saaugml43) + ///
    ($b_12_37 * saaugml44) + ($b_13_37 * albumin52) + ($b_14_37 * albumin53) + ///
    ($b_15_37 * albumin54) + ($b_16_37 * albumin55) + ($b_17_37 * aopugul52) + ///
    ($b_18_37 * aopugul53) + ($b_19_37 * aopugul54) + ($b_20_37 * aopugul55) + ///
    ($b_21_37 * alphatocopherol52) + ($b_22_37 * alphatocopherol53) + ///
    ($b_23_37 * alphatocopherol54) + ($b_24_37 * alphatocopherol55) + ///
    ($b_25_37 * vitamina42) + ($b_26_37 * vitamina43) + ($b_27_37 * vitamina44) + ///
    ($b_28_37 * calcium) + ($b_29_37 * lognefasq) + ($b_30_37 * NLratio) + ///
    ($b_31_37 * bcscat2) + ($b_32_37 * bhb) + ($b_33_37 * LactCat2 * MtotCat2) + ///
    ($b_34_37 * LactCat2 * MtotCat3) + ///
    ($b_35_37 * LactCat2 * MtotCat4) + ($b_36_37 * LactCat2 * MtotCat5) + ///
    ($b_37_37 * LactCat3 * MtotCat2) + ($b_38_37 * LactCat3 * MtotCat3) + ///
    ($b_39_37 * LactCat3 * MtotCat4) + ($b_40_37 * LactCat3 * MtotCat5) + ///
    ($b_41_37 * LactCat2 * saaugml42) + ($b_42_37 * LactCat2 * saaugml43) + ///
    ($b_43_37 * LactCat2 * saaugml44) + ($b_44_37 * LactCat3 * saaugml42) + ///
    ($b_45_37 * LactCat3 * saaugml43) + ($b_46_37 * LactCat3 * saaugml44) + ///
    ($b_47_37 * season2 * bhb) + ($b_48_37 * season3 * bhb) + ///
    ($b_49_37 * season4 * bhb) + ($b_50_37 * saaugml42 * albumin52) + ///
    ($b_51_37 * saaugml42 * albumin53) + ///
    ($b_52_37 * saaugml42 * albumin54) + ($b_53_37 * saaugml42 * albumin55) + ///
    ($b_54_37 * saaugml43 * albumin52) + ($b_55_37 * saaugml43 * albumin53) + ///
    ($b_56_37 * saaugml43 * albumin54) + ($b_57_37 * saaugml43 * albumin55) + ///
    ($b_58_37 * saaugml44 * albumin52) + ($b_59_37 * saaugml44 * albumin53) + ///
    ($b_60_37 * saaugml44 * albumin54) + ($b_61_37 * saaugml44 * albumin55) + ///
    ($b_62_37 * aopugul52 * bcscat2) + ($b_63_37 * aopugul53 * bcscat2) + ///
    ($b_64_37 * aopugul54 * bcscat2) + ($b_65_37 * aopugul55 * bcscat2)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model37 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model37), reps(500) seed(12345): model37
egen AUC1 = mean(AUC)
/*AUC1 .9268732*/

/*Model 42*/
use full, clear
melogit AnyDzCalc $model42
matrix b=e(b)
matrix list b
/* store coefficients 1-62 of e(b) as globals b_1_42 ... b_62_42 */
forvalues i = 1/62 {
    global b_`i'_42 = b[1,`i']
}

program model42, rclass
version 14.2
/* 1. calculate pprob */
use full, clear
bsample
gen logodds = $b_62_42 + ($b_1_42 * LactCat2) + ($b_2_42 * LactCat3) + ///
    ($b_3_42 * season2) + ($b_4_42 * season3) + ($b_5_42 * season4) + ///
    ($b_6_42 * MtotCat2) + ($b_7_42 * MtotCat3) + ($b_8_42 * MtotCat4) + ///
    ($b_9_42 * MtotCat5) + ($b_10_42 * saaugml42) + ($b_11_42 * saaugml43) + ///
    ($b_12_42 * saaugml44) + ($b_13_42 * albumin52) + ($b_14_42 * albumin53) + ///
    ($b_15_42 * albumin54) + ($b_16_42 * albumin55) + ($b_17_42 * aopugul52) + ///
    ($b_18_42 * aopugul53) + ($b_19_42 * aopugul54) + ($b_20_42 * aopugul55) + ///
    ($b_21_42 * alphatocopherol52) + ($b_22_42 * alphatocopherol53) + ///
    ($b_23_42 * alphatocopherol54) + ($b_24_42 * alphatocopherol55) + ///
    ($b_25_42 * calcium) + ($b_26_42 * lognefasq) + ($b_27_42 * NLratio) + ///
    ($b_28_42 * bcscat2) + ($b_29_42 * bhb) + ///
    ($b_30_42 * LactCat2 * MtotCat2) + ($b_31_42 * LactCat2 * MtotCat3) + ///
    ($b_32_42 * LactCat2 * MtotCat4) + ($b_33_42 * LactCat2 * MtotCat5) + ///
    ($b_34_42 * LactCat3 * MtotCat2) + ($b_35_42 * LactCat3 * MtotCat3) + ///
    ($b_36_42 * LactCat3 * MtotCat4) + ($b_37_42 * LactCat3 * MtotCat5) + ///
    ($b_38_42 * LactCat2 * saaugml42) + ($b_39_42 * LactCat2 * saaugml43) + ///
    ($b_40_42 * LactCat2 * saaugml44) + ($b_41_42 * LactCat3 * saaugml42) + ///
    ($b_42_42 * LactCat3 * saaugml43) + ($b_43_42 * LactCat3 * saaugml44) + ///
    ($b_44_42 * saaugml42 * albumin52) + ($b_45_42 * saaugml42 * albumin53) + ///
    ($b_46_42 * saaugml42 * albumin54) + ($b_47_42 * saaugml42 * albumin55) + ///
    ($b_48_42 * saaugml43 * albumin52) + ($b_49_42 * saaugml43 * albumin53) + ///
    ($b_50_42 * saaugml43 * albumin54) + ($b_51_42 * saaugml43 * albumin55) + ///
    ($b_52_42 * saaugml44 * albumin52) + ($b_53_42 * saaugml44 * albumin53) + ///
    ($b_54_42 * saaugml44 * albumin54) + ($b_55_42 * saaugml44 * albumin55) + ///
    ($b_56_42 * aopugul52 * bcscat2) + ($b_57_42 * aopugul53 * bcscat2) + ///
    ($b_58_42 * aopugul54 * bcscat2) + ($b_59_42 * aopugul55 * bcscat2) + ///
    ($b_60_42 * lognefasq * bcscat2) + ($b_61_42 * NLratio * bhb)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4.
Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model42 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model42), reps(500) seed(12345): model42
egen AUC1 = mean(AUC)
/*AUC1 .9220278 */

/*Model 15*/
use full, clear
melogit AnyDzCalc $model15
matrix b=e(b)
matrix list b
/* store coefficients 1-61 of e(b) as globals b_1_15 ... b_61_15 */
forvalues i = 1/61 {
    global b_`i'_15 = b[1,`i']
}

program model15, rclass
version 14.2
/* 1.
calculate pprob */
use full, clear
bsample
gen logodds = $b_61_15 + ($b_1_15 * LactCat2) + ($b_2_15 * LactCat3) + ///
    ($b_3_15 * season2) + ($b_4_15 * season3) + ($b_5_15 * season4) + ///
    ($b_6_15 * MtotCat2) + ($b_7_15 * MtotCat3) + ($b_8_15 * MtotCat4) + ///
    ($b_9_15 * MtotCat5) + ($b_10_15 * saaugml42) + ($b_11_15 * saaugml43) + ///
    ($b_12_15 * saaugml44) + ($b_13_15 * albumin52) + ($b_14_15 * albumin53) + ///
    ($b_15_15 * albumin54) + ($b_16_15 * albumin55) + ($b_17_15 * aopugul52) + ///
    ($b_18_15 * aopugul53) + ($b_19_15 * aopugul54) + ($b_20_15 * aopugul55) + ///
    ($b_21_15 * alphatocopherol52) + ($b_22_15 * alphatocopherol53) + ///
    ($b_23_15 * alphatocopherol54) + ($b_24_15 * alphatocopherol55) + ///
    ($b_25_15 * vitamina42) + ($b_26_15 * vitamina43) + ///
    ($b_27_15 * vitamina44) + ($b_28_15 * calcium) + ///
    ($b_29_15 * lognefasq) + ($b_30_15 * NLratio) + ///
    ($b_31_15 * LactCat2 * MtotCat2) + ($b_32_15 * LactCat2 * MtotCat3) + ///
    ($b_33_15 * LactCat2 * MtotCat4) + ($b_34_15 * LactCat2 * MtotCat5) + ///
    ($b_35_15 * LactCat3 * MtotCat2) + ($b_36_15 * LactCat3 * MtotCat3) + ///
    ($b_37_15 * LactCat3 * MtotCat4) + ($b_38_15 * LactCat3 * MtotCat5) + ///
    ($b_39_15 * LactCat2 * saaugml42) + ($b_40_15 * LactCat2 * saaugml43) + ///
    ($b_41_15 * LactCat2 * saaugml44) + ($b_42_15 * LactCat3 * saaugml42) + ///
    ($b_43_15 * LactCat3 * saaugml43) + ($b_44_15 * LactCat3 * saaugml44) + ///
    ($b_45_15 * saaugml42 * albumin52) + ($b_46_15 * saaugml42 * albumin53) + ///
    ($b_47_15 * saaugml42 * albumin54) + ($b_48_15 * saaugml42 * albumin55) + ///
    ($b_49_15 * saaugml43 * albumin52) + ($b_50_15 * saaugml43 * albumin53) + ///
    ($b_51_15 * saaugml43 * albumin54) + ($b_52_15 * saaugml43 * albumin55) + ///
    ($b_53_15 * saaugml44 * albumin52) + ($b_54_15 * saaugml44 * albumin53) + ///
    ($b_55_15 * saaugml44 * albumin54) + ($b_56_15 * saaugml44 * albumin55) + ///
    ($b_57_15 * aopugul52 * NLratio) + ($b_58_15 * aopugul53 * NLratio) + ///
    ($b_59_15 * aopugul54 * NLratio) + ($b_60_15 * aopugul55 * NLratio)
gen prob0 = 2.718 ^ logodds if AnyDzCalc==0
gen prob1 = 2.718 ^ logodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar model15 = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(model15), reps(500) seed(12345): model15
egen AUC1 = mean(AUC)
/*AUC1 .9183967 */

********************************************************************************
/*Step 8: Model Averaging */
********************************************************************************
/*Note: We calculated AIC weights in Excel. First, we average predictions from
the original models. Then, we create bootstrapped average predictions over 500
reps */
use full, clear

/*Averaging predictions from 4 models*/
melogit AnyDzCalc $model45 || FarmId:
predict phat
rename phat phat1
melogit AnyDzCalc $model37
predict phat
rename phat phat2
melogit AnyDzCalc $model42
predict phat
rename phat phat3
melogit AnyDzCalc $model15
predict phat
rename phat phat4

/*Averaged*/
gen avgphat = (0.44 * phat1) + (.26 * phat2) + (.22 * phat3) + (.08 * phat4)
roctab AnyDzCalc avgphat, detail
/* 0.9333 */
roctg AnyDzCalc avgphat, optimal smooth lowess co interval(0.01)
/*cut 0.41*/

/*NPV and PPV*/
gen positive = 0
replace positive = 1 if avgphat > 0.41
replace positive = . if avgphat==.
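/* Illustrative sketch, not part of the original analysis: PPV and NPV at the
   0.41 cut point, computed as scalars rather than read from a
   cross-tabulation. Assumes AnyDzCalc is coded 0/1 and that positive was
   created from avgphat as above. The local names tp, fp, tn, and fn are
   hypothetical. */
quietly count if positive == 1 & AnyDzCalc == 1
local tp = r(N)
quietly count if positive == 1 & AnyDzCalc == 0
local fp = r(N)
quietly count if positive == 0 & AnyDzCalc == 0
local tn = r(N)
quietly count if positive == 0 & AnyDzCalc == 1
local fn = r(N)
/* PPV: share of test-positive cows that were diseased;
   NPV: share of test-negative cows that were not diseased */
display "PPV = " `tp' / (`tp' + `fp')
display "NPV = " `tn' / (`tn' + `fn')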
tab AnyDzCalc positive, column

/*Averaging predictions from 500 bootstrapped samples*/
use full, clear
program averaged, rclass
version 14.2
/* 1. calculate pprob */
use full, clear
bsample
gen logodds45 = $b_58_45 + ($b_1_45 * LactCat2) + ($b_2_45 * LactCat3) + ///
    ($b_3_45 * season2) + ($b_4_45 * season3) + ($b_5_45 * season4) + ///
    ($b_6_45 * MtotCat2) + ($b_7_45 * MtotCat3) + ($b_8_45 * MtotCat4) + ///
    ($b_9_45 * MtotCat5) + ($b_10_45 * saaugml42) + ($b_11_45 * saaugml43) + ///
    ($b_12_45 * saaugml44) + ($b_13_45 * albumin52) + ($b_14_45 * albumin53) + ///
    ($b_15_45 * albumin54) + ($b_16_45 * albumin55) + ($b_17_45 * aopugul52) + ///
    ($b_18_45 * aopugul53) + ($b_19_45 * aopugul54) + ($b_20_45 * aopugul55) + ///
    ($b_21_45 * alphatocopherol52) + ($b_22_45 * alphatocopherol53) + ///
    ($b_23_45 * alphatocopherol54) + ($b_24_45 * alphatocopherol55) + ///
    ($b_25_45 * calcium) + ($b_26_45 * NLratio) + ($b_27_45 * bcscat2) + ///
    ($b_28_45 * LactCat2 * MtotCat2) + ($b_29_45 * LactCat2 * MtotCat3) + ///
    ($b_30_45 * LactCat2 * MtotCat4) + ($b_31_45 * LactCat2 * MtotCat5) + ///
    ($b_32_45 * LactCat3 * MtotCat2) + ($b_33_45 * LactCat3 * MtotCat3) + ///
    ($b_34_45 * LactCat3 * MtotCat4) + ($b_35_45 * LactCat3 * MtotCat5) + ///
    ($b_36_45 * LactCat2 * saaugml42) + ($b_37_45 * LactCat2 * saaugml43) + ///
    ($b_38_45 * LactCat2 * saaugml44) + ($b_39_45 * LactCat3 * saaugml42) + ///
    ($b_40_45 * LactCat3 * saaugml43) + ($b_41_45 * LactCat3 * saaugml44) + ///
    ($b_42_45 * saaugml42 * albumin52) + ($b_43_45 * saaugml42 * albumin53) + ///
    ($b_44_45 * saaugml42 * albumin54) + ($b_45_45 * saaugml42 * albumin55) + ///
    ($b_46_45 * saaugml43 * albumin52) + ($b_47_45 * saaugml43 * albumin53) + ///
    ($b_48_45 * saaugml43 * albumin54) + ($b_49_45 * saaugml43 * albumin55) + ///
    ($b_50_45 * saaugml44 * albumin52) + ($b_51_45 * saaugml44 * albumin53) + ///
    ($b_52_45 * saaugml44 * albumin54) + ($b_53_45 * saaugml44 * albumin55) + ///
    ($b_54_45 * aopugul52 * bcscat2) + ($b_55_45 * aopugul53 * bcscat2) + ///
    ($b_56_45 * aopugul54 * bcscat2) + ($b_57_45 * aopugul55 * bcscat2)
gen logodds37 = $b_66_37 + ($b_1_37 * LactCat2) + ($b_2_37 * LactCat3) + ///
    ($b_3_37 * season2) + ($b_4_37 * season3) + ($b_5_37 * season4) + ///
    ($b_6_37 * MtotCat2) + ($b_7_37 * MtotCat3) + ($b_8_37 * MtotCat4) + ///
    ($b_9_37 * MtotCat5) + ($b_10_37 * saaugml42) + ($b_11_37 * saaugml43) + ///
    ($b_12_37 * saaugml44) + ($b_13_37 * albumin52) + ($b_14_37 * albumin53) + ///
    ($b_15_37 * albumin54) + ($b_16_37 * albumin55) + ($b_17_37 * aopugul52) + ///
    ($b_18_37 * aopugul53) + ($b_19_37 * aopugul54) + ($b_20_37 * aopugul55) + ///
    ($b_21_37 * alphatocopherol52) + ($b_22_37 * alphatocopherol53) + ///
    ($b_23_37 * alphatocopherol54) + ($b_24_37 * alphatocopherol55) + ///
    ($b_25_37 * vitamina42) + ($b_26_37 * vitamina43) + ($b_27_37 * vitamina44) + ///
    ($b_28_37 * calcium) + ($b_29_37 * lognefasq) + ($b_30_37 * NLratio) + ///
    ($b_31_37 * bcscat2) + ($b_32_37 * bhb) + ($b_33_37 * LactCat2 * MtotCat2) + ///
    ($b_34_37 * LactCat2 * MtotCat3) + ///
    ($b_35_37 * LactCat2 * MtotCat4) + ($b_36_37 * LactCat2 * MtotCat5) + ///
    ($b_37_37 * LactCat3 * MtotCat2) + ($b_38_37 * LactCat3 * MtotCat3) + ///
    ($b_39_37 * LactCat3 * MtotCat4) + ($b_40_37 * LactCat3 * MtotCat5) + ///
    ($b_41_37 * LactCat2 * saaugml42) + ($b_42_37 * LactCat2 * saaugml43) + ///
    ($b_43_37 * LactCat2 * saaugml44) + ($b_44_37 * LactCat3 * saaugml42) + ///
    ($b_45_37 * LactCat3 * saaugml43) + ($b_46_37 * LactCat3 * saaugml44) + ///
    ($b_47_37 * season2 * bhb) + ($b_48_37 * season3 * bhb) + ///
    ($b_49_37 * season4 * bhb) + ($b_50_37 * saaugml42 * albumin52) + ///
    ($b_51_37 * saaugml42 * albumin53) + ///
    ($b_52_37 * saaugml42 * albumin54) + ($b_53_37 * saaugml42 * albumin55) + ///
    ($b_54_37 * saaugml43 * albumin52) + ($b_55_37 * saaugml43 * albumin53) + ///
    ($b_56_37 * saaugml43 * albumin54) + ($b_57_37 * saaugml43 * albumin55) + ///
    ($b_58_37 * saaugml44 * albumin52) + ($b_59_37 * saaugml44 * albumin53) + ///
    ($b_60_37 * saaugml44 * albumin54) + ($b_61_37 * saaugml44 * albumin55) + ///
    ($b_62_37 * aopugul52 * bcscat2) + ($b_63_37 * aopugul53 * bcscat2) + ///
    ($b_64_37 * aopugul54 * bcscat2) + ($b_65_37 * aopugul55 * bcscat2)
gen logodds42 = $b_62_42 + ($b_1_42 * LactCat2) + ($b_2_42 * LactCat3) + ///
    ($b_3_42 * season2) + ($b_4_42 * season3) + ($b_5_42 * season4) + ///
    ($b_6_42 * MtotCat2) + ($b_7_42 * MtotCat3) + ($b_8_42 * MtotCat4) + ///
    ($b_9_42 * MtotCat5) + ($b_10_42 * saaugml42) + ($b_11_42 * saaugml43) + ///
    ($b_12_42 * saaugml44) + ($b_13_42 * albumin52) + ($b_14_42 * albumin53) + ///
    ($b_15_42 * albumin54) + ($b_16_42 * albumin55) + ($b_17_42 * aopugul52) + ///
    ($b_18_42 * aopugul53) + ($b_19_42 * aopugul54) + ($b_20_42 * aopugul55) + ///
    ($b_21_42 * alphatocopherol52) + ($b_22_42 * alphatocopherol53) + ///
    ($b_23_42 * alphatocopherol54) + ($b_24_42 * alphatocopherol55) + ///
    ($b_25_42 * calcium) + ($b_26_42 * lognefasq) + ($b_27_42 * NLratio) + ///
    ($b_28_42 * bcscat2) + ($b_29_42 * bhb) + ///
    ($b_30_42 * LactCat2 * MtotCat2) + ($b_31_42 * LactCat2 * MtotCat3) + ///
    ($b_32_42 * LactCat2 * MtotCat4) + ($b_33_42 * LactCat2 * MtotCat5) + ///
    ($b_34_42 * LactCat3 * MtotCat2) + ($b_35_42 * LactCat3 * MtotCat3) + ///
    ($b_36_42 * LactCat3 * MtotCat4) + ($b_37_42 * LactCat3 * MtotCat5) + ///
    ($b_38_42 * LactCat2 * saaugml42) + ($b_39_42 * LactCat2 * saaugml43) + ///
    ($b_40_42 * LactCat2 * saaugml44) + ($b_41_42 * LactCat3 * saaugml42) + ///
    ($b_42_42 * LactCat3 * saaugml43) + ($b_43_42 * LactCat3 * saaugml44) + ///
    ($b_44_42 * saaugml42 * albumin52) + ($b_45_42 * saaugml42 * albumin53) + ///
    ($b_46_42 * saaugml42 * albumin54) + ($b_47_42 * saaugml42 * albumin55) + ///
    ($b_48_42 * saaugml43 * albumin52) + ($b_49_42 * saaugml43 * albumin53) + ///
    ($b_50_42 * saaugml43 * albumin54) + ($b_51_42 * saaugml43 * albumin55) + ///
    ($b_52_42 * saaugml44 * albumin52) + ($b_53_42 * saaugml44 * albumin53) + ///
    ($b_54_42 * saaugml44 * albumin54) + ($b_55_42 * saaugml44 * albumin55) + ///
    ($b_56_42 * aopugul52 * bcscat2) + ($b_57_42 * aopugul53 * bcscat2) + ///
    ($b_58_42 * aopugul54 * bcscat2) + ($b_59_42 * aopugul55 * bcscat2) + ///
    ($b_60_42 * lognefasq * bcscat2) + ($b_61_42 * NLratio * bhb)
gen logodds15 = $b_61_15 + ($b_1_15 * LactCat2) + ($b_2_15 * LactCat3) + ///
    ($b_3_15 * season2) + ($b_4_15 * season3) + ($b_5_15 * season4) + ///
    ($b_6_15 * MtotCat2) + ($b_7_15 * MtotCat3) + ($b_8_15 * MtotCat4) + ///
    ($b_9_15 * MtotCat5) + ($b_10_15 * saaugml42) + ($b_11_15 * saaugml43) + ///
    ($b_12_15 * saaugml44) + ($b_13_15 * albumin52) + ($b_14_15 * albumin53) + ///
    ($b_15_15 * albumin54) + ($b_16_15 * albumin55) + ($b_17_15 * aopugul52) + ///
    ($b_18_15 * aopugul53) + ($b_19_15 * aopugul54) + ($b_20_15 * aopugul55) + ///
    ($b_21_15 * alphatocopherol52) + ($b_22_15 * alphatocopherol53) + ///
    ($b_23_15 * alphatocopherol54) + ($b_24_15 * alphatocopherol55) + ///
    ($b_25_15 * vitamina42) + ($b_26_15 * vitamina43) + ///
    ($b_27_15 * vitamina44) + ($b_28_15 * calcium) + ///
    ($b_29_15 * lognefasq) + ($b_30_15 * NLratio) + ///
    ($b_31_15 * LactCat2 * MtotCat2) + ($b_32_15 * LactCat2 * MtotCat3) + ///
    ($b_33_15 * LactCat2 * MtotCat4) + ($b_34_15 * LactCat2 * MtotCat5) + ///
    ($b_35_15 * LactCat3 * MtotCat2) + ($b_36_15 * LactCat3 * MtotCat3) + ///
    ($b_37_15 * LactCat3 * MtotCat4) + ($b_38_15 * LactCat3 * MtotCat5) + ///
    ($b_39_15 * LactCat2 * saaugml42) + ($b_40_15 * LactCat2 * saaugml43) + ///
    ($b_41_15 * LactCat2 * saaugml44) + ($b_42_15 * LactCat3 * saaugml42) + ///
    ($b_43_15 * LactCat3 * saaugml43) + ($b_44_15 * LactCat3 * saaugml44) + ///
    ($b_45_15 * saaugml42 * albumin52) + ($b_46_15 * saaugml42 * albumin53) + ///
    ($b_47_15 * saaugml42 * albumin54) + ($b_48_15 * saaugml42 * albumin55) + ///
    ($b_49_15 * saaugml43 * albumin52) + ($b_50_15 * saaugml43 * albumin53) + ///
    ($b_51_15 * saaugml43 * albumin54) + ($b_52_15 * saaugml43 * albumin55) + ///
    ($b_53_15 * saaugml44 * albumin52) + ($b_54_15 * saaugml44 * albumin53) + ///
    ($b_55_15 * saaugml44 * albumin54) + ($b_56_15 * saaugml44 * albumin55) + ///
    ($b_57_15 * aopugul52 * NLratio) + ($b_58_15 * aopugul53 * NLratio) + ///
    ($b_59_15 * aopugul54 * NLratio) + ($b_60_15 * aopugul55 * NLratio)
gen avglogodds = (0.44 * logodds45) + (.26 * logodds37) + (.22 * logodds42) ///
    + (0.08 * logodds15)
gen prob0 = 2.718 ^ avglogodds if AnyDzCalc==0
gen prob1 = 2.718 ^ avglogodds if AnyDzCalc==1
keep prob0 prob1
save all, replace
/* 2. Cartesian product of diseased vs non diseased*/
drop prob1
drop if prob0==.
tempfile ND
save ND, replace
use all, clear
drop prob0
drop if prob1==.
tempfile D
save D, replace
use ND, clear
cross using D
/* 3. Label concordant and discordant pairs */
gen concordant = 0
replace concordant = 1 if prob1 > prob0
gen discordant = 0
replace discordant = 1 if prob1 < prob0
gen tied = 0
replace tied = 1 if prob1==prob0
save pairs, replace
/* 4. Calculate AUC */
use pairs, clear
egen concordantN = total(concordant==1)
egen tiedN = total(tied==1)
gen concordantPCT = concordantN/17112
gen tiedPCT = tiedN/17112
return scalar averaged = concordantPCT + (0.5 * tiedPCT)
end

use full, clear
simulate AUC=r(averaged), reps(500) seed(12345): averaged
egen AUCall = mean(AUC)
/* AUCall .9090693 */

7_ROCCOMP.DO

/*
Title: ROC Comparison
Author: Lauren Wisnieski
Purpose: ROC Comparison between models
Revision history:
- Created April 25, 2018
- Revised June 15, 2018

The purpose of this do file is to compare ROC values from the final averaged
OX, IF, LM, and combined models

Steps:
1. Calculate average phat from each average model. First need to define macros
   for each variable then rerun each model and save results.
2. Average phat with AIC weights.
3.
After, using ROCCOMP to statistically compare the average phats between different models */ use "C: \ Users \ wisni \ Amazon Drive \ DISSERTATION ANALYSIS \ Data files \ F inalDataset3_5June2018.dta", clear save full, re pla ce ****************************** /* Lipid mobilization model */ ****************************** eststo clear global model4 "$LactCat $season calcium lognefasq $SeasonXNefa" global model1 "$LactCat $s eason $bcscat calcium lognefasq $SeasonXNefa" g lob al model7 "$LactCat $season bhb calcium lognefasq $SeasonXNefa" global model2 "$LactCat $season bcscat bhb calcium lognefasq $SeasonXNefa" global model11 "$LactCat $season $cholesterol $bcscat calcium lognefasq $SeasonXNefa" global season "season2 se ason3 season4" global LactCat "LactCat2 LactCat3" global cholesterol "cholesterol42 cholesterol43 cholesterol44" global bcscat "bcscat2" global SeasonXNefa "c.season2#c.lognefasq c.season3#c.logne fasq c.season4#c.lognefasq" /*Averaging pred ictions from 4 models*/ /*Model 4*/ melogit AnyDzCalc $model4 predict phat rename phat phat1 /*Model 1*/ melogit AnyDzCalc $model1 predict phat rename phat phat2 /*Model 7*/ melogit AnyDzCalc $model7 predict phat r ename phat phat3 /*Model 2*/ melogit AnyDzCalc $model2 predict phat rename phat phat4 /*Model 11*/ melogit AnyDzCalc $model11 predict phat rename phat phat5 /*Averaged*/ gen avgph atLM= (0.36* phat1) + (.27 * phat2) + (.18 * phat 3) + (.13 * phat4) + /// (.06 * phat5) roctab AnyDzCalc avgphatLM drop phat1 phat2 phat3 phat4 phat5 /* 0.7257 */ ****************************** /* Inflammation model */ ************************ ****** global model1 "$La ctCat $season $Mtot $saaugml $albumin NLratio $LactXMtot $saaXalb" global model12 "$LactCat $season $Mtot $saaugml $albumin Ntot $LactXMtot $LactXNtot $saaXalb" global model14 "$LactCat $Mtot $saaugml $albumin NLratio $LactX Mtot $saaXalb" global model 25 "$LactCat $season $Mtot $saaugml $albumin $Ltot $LactXMtot $saaXalb" global model4 "$LactCat $season $Mtot 
$saaugml $albumin $LactXMtot $saaXalb" 348 global Mtot "MtotCat2 MtotCat3 MtotCat4 MtotCat5" global saaugml "saaugm l42 saaugml43 saaugml44" gl obal albumin "albumi n52 albumin53 albumin54 albumin55" global Ltot "Ltot52 Ltot53 Ltot54 Ltot55" global season "season2 season3 season4" global LactCat "LactCat2 LactCat3" global LactXMtot "c.LactCat2#c.MtotCat2 c.LactC at2#c.MtotCat3 c.LactCat2#c.M totCat4 c.LactCat2#c .MtotCat5 c.LactCat3#c.MtotCat2 c.LactCat3#c.MtotCat3 c.LactCat3#c.MtotCat4 c.LactCat3#c.MtotCat5" global LactXNtot "c.LactCat2#c.Ntot c.LactCat2#c.Ntot c.LactCat3#c.Ntot c.LactCat3#c.Ntot" global saaXal b "c.saaugml42#c.albumin52 c.saaugml42#c.albumin5 3 c.saaugml42#c.albumin54 c.saaugml42#c.albumin55 c.saaugml43#c.albumin52 c.saaugml43#c.albumin53 c.saaugml43#c.albumin54 c.saaugml43#c.albumin55 c.saaugml44#c.albumin52 c.saaugml44#c.albumin53 c.saaugml44#c .a lbumin54 c.saaugml44#c.albumin55" /*Model 1*/ melogit AnyDzCalc $model1 || FarmId: predict phat rename phat phat1 /*Model 12*/ melogit AnyDzCalc $model12 || FarmId: predict phat rename phat phat2 /*Model 14*/ melogit A ny DzCalc $model14 || FarmId: || Cohort: pre dict phat rename phat phat3 /*Model 25*/ melogit AnyDzCalc $model25 || FarmId: predict phat rename phat phat4 /*Model 4*/ melogit AnyDzCalc $model4 || FarmId: predict phat rename phat pha t5 /*Averaged*/ gen avgphatIF= (0.30* ph at1) + (.28* phat2) + (.27 * phat3) + (.09 * phat4) + /// (0.06 * phat5) drop phat1 phat2 phat3 phat4 phat5 roctab AnyDzCalc avgphatIF ****************************** /* Oxidative Stress model */ ****************************** global model1 "$LactCat $season $aopugul $alphatocopherol" global model5 "$LactCat $aopugul $alphatocopherol" global model4 "$LactCat $aopugul $vitamina $alphatocopherol" global model3 "$LactCat $season $aopugul $vi tam ina $alphatocopherol" global model9 "$LactCat $a opugul" global model6 "$LactCat $alphatocopherol" global model2 "$LactCat $season $aopugul" global season "season2 season3 
season4" global LactCat "LactCat2 LactCat3" global aopugul "aopugul52 aop ugu l53 aopugul54 aopugul55" global alphatocopherol "alphatocopherol52 alphatocopherol53 alphatocopherol54 alphatocopherol55" global vitamina "vitamina42 vitamina43 vitamina44" /*Model 1*/ melogit AnyDzCalc $model1 || FarmId: predict phat ren ame phat phat1 /*Model 5*/ melogit AnyDzCal c $model5 || FarmId: || Cohort: 349 predict phat rename phat phat2 /*Model 4*/ melogit AnyDzCalc $model4 || FarmId: || Cohort: predict phat rename phat phat3 /*Model 3*/ melogi t AnyDzCal c $model3 || FarmId: predict phat rename ph at phat4 /*Model 9*/ melogit AnyDzCalc $model9 || FarmId: || Cohort: predict phat rename phat phat5 /*Model 6*/ melogit AnyDzCalc $model6 || FarmId: || Cohort: predict phat rename pha t phat6 /*Model 2*/ melogit AnyDzCalc $mod el2 || FarmId: predict phat rename phat phat7 /*Averaged*/ gen avgphatOX= (0.30 * phat1) + (.27 * phat2) + (.1 2 * phat3) + (.12 * phat4) + /// (.07 * phat5) + (0.06 * phat6) + (0.06 * p hat7) roctab AnyDzCalc avgphatOX drop phat1 ph at2 phat3 phat4 phat5 phat6 phat7 /* 0.7785 */ ****************************** /* Combined model */ ************************ ****** global model45 "$LactCat $season $Mtot $saaugml $albumin $aopugul $al phatocopherol calcium NLratio bcscat2 $LactXMtot $LactXsaa $saaXalb $AOPXBCS" global model37 "$LactCat $season $Mtot $saaugml $albumin $aopugul $alphatocopherol $vitamina calciu m lognefasq NLratio bcscat2 bhb $LactXMtot $LactXsaa $SeasonXbhb $saaXalb $AOP XBCS" global model42 "$LactCat $season $Mtot $sa augml $albumin $aopugul $alphatocopherol calcium lognefasq NLratio bcscat2 bhb $LactXMtot $LactXsaa $saaXalb $AOPXBCS c.lognefasq #c.bcscat2 c.NLratio#c.bhb" global model15 "$LactCat $season $Mtot $saaugml $ albumin $aopugul $alphatocopherol $vitamina calci um lognefasq NLratio $LactXMtot $LactXsaa $saaXalb $AOPXNL" global LactXMtot "c.LactCat2#c.MtotCat2 c.LactCat2#c.MtotCat3 c.LactCat2#c.MtotCat4 c.LactCat2#c.MtotCat5 
c.LactCat3#c.MtotCat2 c.LactCat3#c.MtotCat3 c.LactCat3#c.MtotCat4 c.LactCat3#c.MtotCat5"
global LactXsaa "c.LactCat2#c.saaugml42 c.LactCat2#c.saaugml43 c.LactCat2#c.saaugml44 c.LactCat3#c.saaugml42 c.LactCat3#c.saaugml43 c.LactCat3#c.saaugml44"
global saaXalb "c.saaugml42#c.albumin52 c.saaugml42#c.albumin53 c.saaugml42#c.albumin54 c.saaugml42#c.albumin55 c.saaugml43#c.albumin52 c.saaugml43#c.albumin53 c.saaugml43#c.albumin54 c.saaugml43#c.albumin55 c.saaugml44#c.albumin52 c.saaugml44#c.albumin53 c.saaugml44#c.albumin54 c.saaugml44#c.albumin55"
global AOPXBCS "c.aopugul52#c.bcscat2 c.aopugul53#c.bcscat2 c.aopugul54#c.bcscat2 c.aopugul55#c.bcscat2"
global SeasonXbhb "c.season2#c.bhb c.season3#c.bhb c.season4#c.bhb"
global AOPXNL "c.aopugul52#c.NLratio c.aopugul53#c.NLratio c.aopugul54#c.NLratio c.aopugul55#c.NLratio"

melogit AnyDzCalc $model45 || FarmId:
predict phat
rename phat phat1
melogit AnyDzCalc $model37
predict phat
rename phat phat2
melogit AnyDzCalc $model42
predict phat
rename phat phat3
melogit AnyDzCalc $model15
predict phat
rename phat phat4
/*Averaged*/
gen avgphatCOMB= (0.44 * phat1) + (.26 * phat2) + (.22 * phat3) + (.08 * phat4)
roctab AnyDzCalc avgphatCOMB
/* 0.9333 */

************************
/*Comparing ROC values*/
************************
roccomp AnyDzCalc avgphatLM avgphatOX, graph summary
roccomp AnyDzCalc avgphatLM avgphatIF, graph summary
roccomp AnyDzCalc avgphatOX avgphatIF, graph summary
roccomp AnyDzCalc avgphatLM avgphatCOMB, graph summary
roccomp AnyDzCalc avgphatOX avgphatCOMB, graph summary
roccomp AnyDzCalc avgphatIF avgphatCOMB, graph summary

8_AIM2-1COHORTVARIABLES.DO

/*
Title: Create new variables for cohort analysis
Author: Lauren Wisnieski
Purpose: Create variables for the cohort level models
Revision history:
- Created May 7, 2018
- Updated June 5, 2018 after deleting cow with missing saaugml
- Updated Dec 28, 2018 to change ROC method to 'roctg'

The purpose of this do file is to create new variables
for the cohort level analysis. E.g. mean values, dichotomous variables using thresholds
Steps:
1. Calculate descriptives by cohort
2. Calculate average values of biomarkers by cohort and variance of biomarkers by cohort, median and deviation from median for categorical vars
3. Lowess to see relationship of biomarker with outcome.
4. Perform ROC curves to analyze best combined sens and spec.
   - If U shaped, pick two thresholds.
5. Calculate number of cows above/below each threshold
6. Reduce data set to one line per cohort
7. Summary statistics by cohort
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", clear

/*Step 1: Descriptive statistics by cohort*/
********************************************************************************
/*Calculating how many cohorts and cows per farm*/
tab FarmId Cohort2 if LstcId<300
/*Calculating disease incidence by cohort*/
tab Ketosis Cohort2 if LstcId<300 & Ketosis==1
tab mastitis Cohort2 if LstcId<300 & mastitis==1
tab Metritis Cohort2 if LstcId<300 & Metritis==1
tab Pneumonia Cohort2 if LstcId<300 & Pneumonia==1
tab lameness Cohort2 if LstcId<300 & lameness==1
tab DA Cohort2 if LstcId<300 & DA==1
tab RP Cohort2 if LstcId<300 & RP==1
tab MilkFever Cohort2 if LstcId<300 & MilkFever==1
tab abort Cohort2 if LstcId<300 & abort==1
tab calfdead Cohort2 if LstcId<300 & calfdead==1
tab died Cohort2 if LstcId<300 & died==1
tab AnyDzCalc Cohort2 if LstcId<300

/*Step 2: Calculate average values and variance of biomarkers by cohort, median and deviation from median for categorical variables*/
********************************************************************************
/*NM biomarkers*/
egen meanbhb= mean(bhb), by(Cohort2)
egen sdbhb= sd(bhb), by(Cohort2)
gen varbhb= sdbhb^2
egen meannefa= mean(nefa), by(Cohort2)
egen sdnefa= sd(nefa), by(Cohort2)
gen varnefa= sdnefa^2
egen meancalcium= mean(calcium), by(Cohort2)
egen sdcalcium= sd(calcium), by(Cohort2)
gen
varcalcium= sdcalcium^2
egen meancholesterol= mean(cholesterol), by(Cohort2)
egen sdcholesterol= sd(cholesterol), by(Cohort2)
gen varcholesterol= sdcholesterol^2
/*Median BCS and deviation from median*/
egen medBCS= median(rounddrybcs) if LstcId<300, by(Cohort2)
egen manBCS= mad(rounddrybcs) if LstcId<300, by(Cohort2)
/*IF biomarkers*/
egen meanNtot= mean(Ntot), by(Cohort2)
egen sdNtot= sd(Ntot), by(Cohort2)
gen varNtot= sdNtot^2
egen meanLtot= mean(Ltot), by(Cohort2)
egen sdLtot= sd(Ltot), by(Cohort2)
gen varLtot= sdLtot^2
egen meanNLratio= mean(NLratio), by(Cohort2)
egen sdNLratio= sd(NLratio), by(Cohort2)
gen varNLratio= sdNLratio^2
egen meanalbumin= mean(albumin), by(Cohort2)
egen sdalbumin= sd(albumin), by(Cohort2)
gen varalbumin= sdalbumin^2
egen meanhaptoglobin= mean(haptoglobin), by(Cohort2)
egen sdhaptoglobin= sd(haptoglobin), by(Cohort2)
gen varhaptoglobin= sdhaptoglobin^2
egen meanvitamind= mean(vitamind), by(Cohort2)
egen sdvitamind= sd(vitamind), by(Cohort2)
gen varvitamind= sdvitamind^2
egen meanMtot= mean(Mtot), by(Cohort2)
egen sdMtot= sd(Mtot), by(Cohort2)
gen varMtot= sdMtot^2
egen meanBtot= mean(Btot), by(Cohort2)
egen sdBtot= sd(Btot), by(Cohort2)
gen varBtot= sdBtot^2
egen meanBDtot= mean(BDtot), by(Cohort2)
egen sdBDtot= sd(BDtot), by(Cohort2)
gen varBDtot= sdBDtot^2
egen meanEtot= mean(Etot), by(Cohort2)
egen sdEtot= sd(Etot), by(Cohort2)
gen varEtot= sdEtot^2
egen meansaa= mean(saaugml), by(Cohort2)
egen sdsaaugml= sd(saaugml), by(Cohort2)
gen varsaaugml= sdsaaugml^2
/*OX biomarkers*/
egen meanaopugul= mean(aopugul), by(Cohort2)
egen sdaopugul= sd(aopugul), by(Cohort2)
gen varaopugul= sdaopugul^2
egen meanalphatocopherol= mean(alphatocopherol), by(Cohort2)
egen sdalphatocopherol= sd(alphatocopherol), by(Cohort2)
gen varalphatocopherol= sdalphatocopherol^2
egen meanbetacarotene= mean(betacarotene), by(Cohort2)
egen sdbetacarotene= sd(betacarotene), by(Cohort2)
gen varbetacarotene= sdbetacarotene^2
egen meanros= mean(ros),
by(Cohort2)
egen sdros= sd(ros), by(Cohort2)
gen varros= sdros^2
egen meanvitamina= mean(vitamina), by(Cohort2)
egen sdvitamina= sd(vitamina), by(Cohort2)
gen varvitamina= sdvitamina^2
/*Lactation number*/
egen medLact= median(LactationNumber) if LstcId<300, by(Cohort2)
egen manLact= mad(LactationNumber) if LstcId<300, by(Cohort2)

/*Step 3: Lowess to determine relationships with disease (may need two cutoffs)*/
********************************************************************************
twoway lowess AnyDzCalc nefa , logit
/*not linear, but lower values do seem to correlate with less risk*/
twoway lowess AnyDzCalc bhb , logit
/*linear*/
twoway lowess AnyDzCalc calcium , logit
/*sort of linear*/
twoway lowess AnyDzCalc saaugml , logit
/*sort of linear*/
twoway lowess AnyDzCalc aopugul , logit
/*not linear - but looks like higher levels are associated with disease*/
twoway lowess AnyDzCalc vitamina , logit
/*sort of linear - higher associated with lower disease*/
twoway lowess AnyDzCalc ros , logit
/*sort of linear - lower risk with lower levels*/
twoway lowess AnyDzCalc vitamind , logit
twoway lowess AnyDzCalc vitamind if vitamind >50 & vitamind <200 , logit
/*not linear - but outliers skew it.
lower risk with lower levels */
histogram alphatocopherol
twoway lowess AnyDzCalc alphatocopherol , logit
twoway lowess AnyDzCalc alphatocopherol if alphatocopherol >2 & alphatocopherol <7 , logit
/*not linear WILL NEED TWO CUT OFFS */
twoway lowess AnyDzCalc betacarotene , logit
/*not linear - but higher values seem to be associated with higher disease risk*/
histogram albumin
twoway lowess AnyDzCalc albumin , logit
twoway lowess AnyDzCalc albumin if albumin >3 & albumin < 3.8 , logit
/*not linear - higher albumin associated with higher disease risk */
histogram haptoglobin
twoway lowess AnyDzCalc haptoglobin , logit
/*not linear - lower values seem to be associated with higher risk of disease*/
histogram cholesterol
twoway lowess AnyDzCalc cholesterol , logit
/*not linear - higher values --> higher risk*/
twoway lowess AnyDzCalc Ntot , logit
/*linear*/
twoway lowess AnyDzCalc Ltot , logit
/*sort of linear - higher values at higher risk */
twoway lowess AnyDzCalc NLratio , logit
/*linear*/
twoway lowess AnyDzCalc Etot , logit
/*not linear - higher values at lower risk*/
twoway lowess AnyDzCalc Btot , logit
/*not linear - higher values at lower risk*/
histogram BDtot
twoway lowess AnyDzCalc BDtot , logit
/*not linear - maybe lower values at higher risk? */
histogram Mtot
twoway lowess AnyDzCalc Mtot , logit
twoway lowess AnyDzCalc Mtot if Mtot<500 , logit
/*not linear - higher levels associated with higher disease risk */

/*Step 4: ROC curves to determine best spec and sens for variables that need only one cutoff*/
********************************************************************************
/*NEFA*/
roctg AnyDzCalc nefa , optimal smooth lowess co interval(0.01)
/*cut 0.21*/
gen nefacut= .
replace nefacut = 1 if nefa>=0.21
replace nefacut= 0 if nefa <0.21
replace nefacut = . if nefa==.
logit AnyDzCalc nefacut
tab AnyDzCalc nefacut
/*sensitivity*/
cii proportion 93 54
/*specificity*/
cii proportion 184 89

/*BHB */
roctg AnyDzCalc bhb, optimal smooth lowess co interval(0.1)
/*cut 5.02*/
gen bhbcut= .
replace bhbcut = 1 if bhb>= 5.02
replace bhbcut= 0 if bhb < 5.02
replace bhbcut = . if bhb==.
logit AnyDzCalc bhbcut
tab AnyDzCalc bhbcut
/*sensitivity*/
cii proportion 93 47
/*specificity*/
cii proportion 184 93

/*Calcium*/
roctg AnyDzCalc calcium, optimal smooth lowess co interval(0.1)
/*cut 9.7*/
gen calciumcut= .
replace calciumcut = 1 if calcium>=9.7
replace calciumcut= 0 if calcium <9.7
replace calciumcut = . if calcium==.
logit AnyDzCalc calciumcut
tab AnyDzCalc calciumcut
/*sensitivity*/
cii proportion 93 40
/*specificity*/
cii proportion 184 71

/*SAAugml*/
roctg AnyDzCalc saaugml, optimal smooth lowess co interval(2)
/*cut 32.01*/
gen saaugmlcut= .
replace saaugmlcut = 1 if saaugml>=32.01
replace saaugmlcut= 0 if saaugml <32.01
replace saaugmlcut = . if saaugml==.
logit AnyDzCalc saaugmlcut
tab AnyDzCalc saaugmlcut
/*sensitivity*/
cii proportion 93 53
/*specificity*/
cii proportion 184 104

/*betacarotene*/
roctg AnyDzCalc betacarotene, optimal smooth lowess co interval(.1)
/*cut 3.61*/
gen betacarotenecut= .
replace betacarotenecut = 1 if betacarotene>= 3.61
replace betacarotenecut= 0 if betacarotene <3.61
replace betacarotenecut = . if betacarotene==.
tab AnyDzCalc betacarotenecut
/*sensitivity*/
cii proportion 93 45
/*specificity*/
cii proportion 184 91

/*haptoglobin*/
roctg AnyDzCalc haptoglobin, optimal smooth lowess co interval(.01)
/*cut 0.111*/
gen haptoglobincut= .
replace haptoglobincut = 1 if haptoglobin>=0.11
replace haptoglobincut= 0 if haptoglobin <0.11
replace haptoglobincut = . if haptoglobin==.
logit AnyDzCalc haptoglobincut
tab haptoglobincut AnyDzCalc
/*sensitivity*/
cii proportion 93 50
/*specificity*/
cii proportion 184 87

/*cholesterol*/
roctg AnyDzCalc cholesterol, optimal smooth lowess co interval(2)
/*cut 178*/
gen cholesterolcut= .
replace cholesterolcut = 1 if cholesterol>=178
replace cholesterolcut= 0 if cholesterol <178
replace cholesterolcut = . if cholesterol==.
logit AnyDzCalc cholesterolcut
tab AnyDzCalc cholesterolcut
/*sensitivity*/
cii proportion 93 53
/*specificity*/
cii proportion 184 101

/*Neutrophils*/
roctg AnyDzCalc Ntot, optimal smooth lowess co interval(50)
/*cut 3005*/
gen Ntotcut= .
replace Ntotcut = 1 if Ntot>=3005
replace Ntotcut= 0 if Ntot <3005
replace Ntotcut = . if Ntot==.
logit AnyDzCalc Ntotcut
tab AnyDzCalc Ntotcut
/*sensitivity*/
cii proportion 93 45
/*specificity*/
cii proportion 184 87

/*Lymphocytes*/
roctg AnyDzCalc Ltot, optimal smooth lowess co interval(100)
/*cut 3738.75*/
gen Ltotcut= .
replace Ltotcut = 1 if Ltot>=3738.75
replace Ltotcut= 0 if Ltot <3738.75
replace Ltotcut = . if Ltot==.
logit AnyDzCalc Ltotcut
tab AnyDzCalc Ltotcut
/*sensitivity*/
cii proportion 93 46
/*specificity*/
cii proportion 184 90

/*Etot*/
roctg AnyDzCalc Etot, optimal smooth lowess co interval(10)
/*cut 230*/
gen Etotcut= .
replace Etotcut = 1 if Etot >=230
replace Etotcut= 0 if Etot <230
replace Etotcut = . if Etot==.
logit AnyDzCalc Etotcut
tab AnyDzCalc Etotcut
/*sensitivity*/
cii proportion 93 47
/*specificity*/
cii proportion 184 94

/*Btot*/
roctg AnyDzCalc Btot, optimal smooth lowess co interval(10)
/*cut 0.01 */
gen Btotcut= .
replace Btotcut = 1 if Btot>=0.01
replace Btotcut= 0 if Btot <0.01
replace Btotcut = . if Btot==.
logit AnyDzCalc Btotcut
tab AnyDzCalc Btotcut
/*sensitivity*/
cii proportion 93 37
/*specificity*/
cii proportion 184 99

/*Band cells*/
roctg AnyDzCalc BDtot, optimal smooth lowess co interval(4)
/*cut 0.01*/
gen BDtotcut= .
replace BDtotcut = 1 if BDtot>=0.01
replace BDtotcut= 0 if BDtot <0.01
replace BDtotcut = . if BDtot==.
logit AnyDzCalc BDtotcut
tab AnyDzCalc BDtotcut
/*sensitivity*/
cii proportion 93 43
/*specificity*/
cii proportion 184 100

/*AOP*/
roctg AnyDzCalc aopugul, optimal smooth lowess co interval(.1)
/*cut 4.41*/
gen aopugulcut = .
replace aopugulcut = 1 if aopugul>= 4.41
replace aopugulcut= 0 if aopugul < 4.41
replace aopugulcut = . if aopugul==.
logit AnyDzCalc aopugulcut
tab AnyDzCalc aopugulcut
/*sensitivity*/
cii proportion 93 52
/*specificity*/
cii proportion 184 101

/*Vitamin A*/
roctg AnyDzCalc vitamina, optimal smooth lowess co interval(2)
/*cut 286*/
gen vitaminacut= .
replace vitaminacut = 1 if vitamina>=286
replace vitaminacut= 0 if vitamina <286
replace vitaminacut = . if vitamina==.
logit AnyDzCalc vitaminacut
tab AnyDzCalc vitaminacut
/*sensitivity*/
cii proportion 93 46
/*specificity*/
cii proportion 184 90

/*ROS*/
roctg AnyDzCalc ros, optimal smooth lowess co interval(2)
/*cut 43.0*/
gen roscut= .
replace roscut = 1 if ros>=43.0
replace roscut= 0 if ros <43.0
replace roscut = . if ros==.
logit AnyDzCalc roscut
tab AnyDzCalc roscut
/*sensitivity*/
cii proportion 93 45
/*specificity*/
cii proportion 184 91

/*Vitamin D*/
roctg AnyDzCalc vitamind, optimal smooth lowess co interval(1)
/*cut 96.4*/
gen vitamindcut= .
replace vitamindcut = 1 if vitamind>=96.4
replace vitamindcut= 0 if vitamind <96.4
replace vitamindcut = . if vitamind==.
logit AnyDzCalc vitamindcut
tab AnyDzCalc vitamindcut
/*sensitivity*/
cii proportion 93 46
/*specificity*/
cii proportion 184 90

/*albumin*/
roctg AnyDzCalc albumin, optimal smooth lowess co interval(.01)
/*cut 3.4*/
gen albumincut= .
replace albumincut = 1 if albumin>=3.4
replace albumincut= 0 if albumin <3.4
replace albumincut = . if albumin==.
logit AnyDzCalc albumincut
tab AnyDzCalc albumincut
/*sensitivity*/
cii proportion 93 60
/*specificity*/
cii proportion 184 81

/*Mtot*/
roctg AnyDzCalc Mtot, optimal smooth lowess co interval(7)
/*cut 161*/
gen Mtotcut= .
replace Mtotcut = 1 if Mtot>=161
replace Mtotcut= 0 if Mtot <161
replace Mtotcut = . if Mtot==.
logit AnyDzCalc Mtotcut
tab AnyDzCalc Mtotcut
/*sensitivity*/
cii proportion 93 46
/*specificity*/
cii proportion 184 91

/*alphatocopherol*/
roctg AnyDzCalc alphatocopherol if alphatocopherol<=3.5, optimal smooth lowess co interval(.1)
roctg AnyDzCalc alphatocopherol if alphatocopherol>3.5, optimal smooth lowess co interval(.1)
gen alphatocopherolcut= 0
replace alphatocopherolcut = 1 if alphatocopherol>= 2.34 & alphatocopherol <=5.24
replace alphatocopherolcut = . if alphatocopherol==.
tab AnyDzCalc alphatocopherolcut
/*sensitivity*/
cii proportion 93 45
/*specificity*/
cii proportion 184 84

/*BCS cut off*/
gen bcscut = 0
replace bcscut = 1 if rounddrybcs < 3.5 | rounddrybcs > 4.0
replace bcscut = . if rounddrybcs ==.
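
/* Editorial sketch, not part of the original do file. In Step 4 the counts
   passed to cii are typed by hand from each tab output. Assuming AnyDzCalc is
   coded 0/1 and each *cut variable is coded 1 = test positive, the same
   intervals could be produced by a small helper. The program name cutperf is
   hypothetical. For biomarkers where the opposite category marks risk, the
   positive direction would need to be reversed before using it. */
capture program drop cutperf
program define cutperf
    args outcome cutvar
    /* diseased cows with a non-missing test result, and those testing positive */
    quietly count if `outcome' == 1 & !missing(`cutvar')
    local ndz = r(N)
    quietly count if `outcome' == 1 & `cutvar' == 1
    local tp = r(N)
    /* non-diseased cows with a non-missing test result, and those testing negative */
    quietly count if `outcome' == 0 & !missing(`cutvar')
    local nnd = r(N)
    quietly count if `outcome' == 0 & `cutvar' == 0
    local tn = r(N)
    display "Sensitivity, `cutvar':"
    cii proportion `ndz' `tp'
    display "Specificity, `cutvar':"
    cii proportion `nnd' `tn'
end
/* Example call (defining the program does not alter the data):
   cutperf AnyDzCalc nefacut */
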
/*Part 5: Count cows above/below each threshold by cohort*/
********************************************************************************
egen nNefa= total(nefacut==1) if LstcId<300, by(Cohort2)
egen nBhb= total(bhbcut==1) if LstcId<300, by(Cohort2)
egen nCalcium= total(calciumcut==1) if LstcId<300, by(Cohort2)
egen nSaaugml= total(saaugmlcut==1) if LstcId<300, by(Cohort2)
egen nBetacarotene= total(betacarotenecut==1) if LstcId<300, by(Cohort2)
egen nHaptoglobin= total(haptoglobincut==1) if LstcId<300, by(Cohort2)
egen nCholesterol= total(cholesterolcut==1) if LstcId<300, by(Cohort2)
egen nNtot= total(Ntotcut==1) if LstcId<300, by(Cohort2)
egen nLtot = total(Ltotcut==1) if LstcId<300, by(Cohort2)
egen nEtot= total(Etotcut==1) if LstcId<300, by(Cohort2)
egen nBtot= total(Btotcut==1) if LstcId<300, by(Cohort2)
egen nBDtot= total(BDtotcut==1) if LstcId<300, by(Cohort2)
egen nAopugul= total(aopugulcut==1) if LstcId<300, by(Cohort2)
egen nVitamina= total(vitaminacut==1) if LstcId<300, by(Cohort2)
egen nRos= total(roscut==1) if LstcId<300, by(Cohort2)
egen nVitamind= total(vitamindcut==1) if LstcId<300, by(Cohort2)
egen nAlbumin= total(albumincut==1) if LstcId<300, by(Cohort2)
egen nMtot= total(Mtotcut==1) if LstcId<300, by(Cohort2)
egen nAlphatocopherol= total(alphatocopherolcut==1) if LstcId<300, by(Cohort2)
egen nHeifer = total(LactationNumber==1) if LstcId<300, by(Cohort2)
egen nBCS = total(bcscut==1) if LstcId<300, by(Cohort2)

/*Part 6: reduce dataset to one line per cohort*/
********************************************************************************
drop if LstcId==.
sort Cohort2
duplicates drop Cohort2, force

/*Part 7: Summary statistics by cohort*/
********************************************************************************
tabstat meanalbumin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat nAlbumin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdalbumin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanalbumin Cohort2
list sdalbumin Cohort2
list nAlbumin Cohort2
tabstat meanhaptoglobin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdhaptoglobin, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanhaptoglobin Cohort2
list sdhaptoglobin Cohort2
list nHaptoglobin Cohort2
tabstat meansaa, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdsaa, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meansaa Cohort2
list sdsaa Cohort2
list nSaa Cohort2
tabstat meanvitamind, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdvitamind, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanvitamind Cohort2
list sdvitamind Cohort2
list nVitamind Cohort2
tabstat meanBtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdBtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanBtot Cohort2
list sdBtot Cohort2
list nBtot Cohort2
tabstat meanBDtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdBDtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanBDtot Cohort2
list sdBDtot Cohort2
list nBDtot Cohort2
tabstat meanEtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdEtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanEtot Cohort2
list sdEtot Cohort2
list nEtot Cohort2
tabstat meanLtot, statistics(count mean sem min p25
p50 p75 max) columns(statistics)
tabstat sdLtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanLtot Cohort2
list sdLtot Cohort2
list nLtot Cohort2
tabstat meanMtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdMtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanMtot Cohort2
list sdMtot Cohort2
list nMtot Cohort2
tabstat meanNtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdNtot, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanNtot Cohort2
list sdNtot Cohort2
list nNtot Cohort2
tabstat meanbhb, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdbhb, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanbhb Cohort2
list sdbhb Cohort2
list nBhb Cohort2
tabstat meancalcium, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdcalcium, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meancalcium Cohort2
list sdcalcium Cohort2
list nCalcium Cohort2
tabstat meancholesterol, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdcholesterol, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meancholesterol Cohort2
list sdcholesterol Cohort2
list nCholesterol Cohort2
tabstat meannefa, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdnefa, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meannefa Cohort2
list sdnefa Cohort2
list nNefa Cohort2
tabstat meanalphatocopherol, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdalphatocopherol, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanalphatocopherol Cohort2
list sdalphatocopherol Cohort2
list nAlphatocopherol Cohort
tabstat meanaopugul , statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdaopugul , statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanaopugul Cohort2
list sdaopugul Cohort2
list nAopugul Cohort2
tabstat meanbetacarotene, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdbetacarotene, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanbetacarotene Cohort2
list sdbetacarotene Cohort2
list nBetacarotene Cohort2
tabstat meanros, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdros, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanros Cohort2
list sdros Cohort2
list nRos Cohort2
tabstat meanvitamina, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat sdvitamina, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list meanvitamina Cohort2
list sdvitamina Cohort2
list nVitamina Cohort2
tabstat medBCS, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
tabstat manBCS, statistics(count mean sem min p25 p50 p75 max) columns(statistics)
list medBCS Cohort2
list manBCS Cohort2
list nBCS Cohort2
list medLact Cohort2
list manLact Cohort2
list nHeifer Cohort2
list season Cohort2
list nTotalLstc2 Cohort2
save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\CohortData_28December2018.dta", replace
cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet using CohortData_28December2018.csv , comma replace

M1_AIM2-1MEAN.DO

/*
Title: Cohort analysis 1 - Means
Author: Lauren Wisnieski
Purpose: First cohort analysis - means
Revision history:
- May 9 2018
- Revised July 25 with new modelling strategy (shrinkage, lower EPV)
- Revised 12/29 to reflect major changes with internal validation.
- Revised 1/1/19 to fix changes with offset term in best subsets selection.
- Revised 2/15/19 to fix how equivalence tests were done for slope and intercept

The purpose of this do file is to calculate the DSS scores from the Poisson models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1. Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohortdata_28December2018.dta", clear
save cohort, replace

/*STEP 1: Calculate DSS scores*/
********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort , clear
gen x = 8.6904772777 + (-1.9279903753 * meancalcium) + ///
    (2.5550525586 * meanalbumin) + (0.0004897288 * meanvitamina) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.4333 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 9.744288409 + (-2.226877635 * meancalcium) + (3.572774601 * meanalbumin) + ///
    (-0.008446589 * meancholesterol) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.587672 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.728243 + (-1.967848 * meancalcium) + (2.700862 * meanalbumin) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.505287 */

/*Model 4*/
/* Calculate predicted
incidence */
use cohort, clear
gen x = 12.871226961 + (-2.333722678 * meancalcium) + ///
    (2.604991520 * meanalbumin) + (-0.005245968 * meanros) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.539677 */

/*Model 5*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 11.1420276101 + (-2.2289603919 * meancalcium) + ///
    (2.8552492509 * meanalbumin) + (-0.0001264773 * meanNtot) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.559403 */

/* Model 6 */
/* Calculate predicted incidence */
use cohort, clear
gen x = 9.813731090 + (-1.953729671 * meancalcium) + (2.381305940 * meanalbumin) ///
    + (-0.002322456 * meansaa) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.437701 */

/*Model 7*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 10.410488942 + (-2.226549955 * meancalcium) + ///
    (0.002347004 * meanBDtot) + (2.908330440 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.553634 */

/*Model 8*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.912358147 + (-1.916827201 * meancalcium) + ///
    (-0.004963873 * meanvitamind) + (2.646005039 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.500501 */

/*Model 9*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 10.38244732 + (-2.23037539 * meancalcium) + (0.02047481 * meanbetacarotene) ///
    + (2.93251583 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.512653 */

/*Model 10*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.15406993 + (-1.73721781 * meancalcium) + (-0.08821252 * meanbhb) ///
    + (2.35374979 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.51355 */

/*Model 11*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 9.9522953 + (-2.0341413 * meancalcium) + (-0.1718569 * medBCS) + ///
    (2.6973726 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.485993 */

/*Model 12*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 7.8604926466 + (-1.7194758959 * meancalcium) + (-0.0004646192 * meanMtot) + ///
    (2.2811244464 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES
= zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.437055 */

/*Model 13*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 7.7400293442 + (-1.9108122164 * meancalcium) + ///
    (0.0003974242 * meanEtot) + (2.7923362498 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.50576 */

/*Model 14*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 7.727408219 + (-1.688887322 * meancalcium) + ///
    (-0.002295168 * meanBtot) + (2.225398403 * meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.479208 */

/*Model 15*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 7.772770 + (-1.799114 * meancalcium) + ///
    (2.568391 * meanalbumin) + (-.00004888527 * meanLtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.494905 */

/*Model 16*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.3287012 + (-1.8686264 * meancalcium) + (2.5498694 * meanalbumin) ///
    + (-0.2041196 * meanhaptoglobin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.466657 */

/*Model 17*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 7.72011533 + (-1.72971781 * meancalcium) + ///
    (2.32865946 * meanalbumin) + (-0.07351493 * meannefa) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.410272 */

/*Model 18*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.76578983 + (-1.92810506 * meancalcium) + (2.54220211 * meanalbumin) ///
    + (0.02632872 * meanaopugul) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.508616 */

/*Model 19*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.72043070 + (-1.89794788 * meancalcium) + (2.48382482 * meanalbumin) ///
    + (0.01801005 * meanalphatocopherol) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.477314 */

/*Model 20*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.8480885920 + (-0.2003827105 * meanbhb) + ///
    (-0.0001876797 * meanLtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.251115 */

/*models 2, 6, and 20 have significant DSS values so are removed at this stage*/

/*STEP 2: INTERNAL
VALIDATION OF PREDICTED VERSUS OBSERVED COUNTS*/ *************************** ************************************************* ****** /*Calculating slope and intercept for predicted versus observed counts using 500 bootstrapped samples for each candidate model Setting confidence level to 90 to reflect two one - sided equiv alence tests for intercept and slope*/ ********* *************** ************************************************************ set level 90 /*Model 1*/ /* Calculate predicted incidence */ use cohort, clear gen x1 = 8.6904772777 + ( - 1.927990 3753 * meancalcium) + /// (2.5550525586 * mea nalbumin) + ( 0 .0004897288 * meanvitamina) + /// ln(nTotalLstc2) gen pcount1= 2.71828 ^ x1 gen lnpcount1= ln(pcount1) /*Bootstrap estimates*/ bootstrap _b, nodots nowarn reps(500) seed( 1921): poisson nDzLstc2 lnpcount1 estat boot strap, all /*Model 2 (previously model 3)*/ /* Calculate predicted incidence */ gen x2 = 8.728243 + ( - 1.967848 * meancalcium) + (2.700862 * meanalbumin) + /// ln(nTotalLstc2) gen pcount2 = 2.71828 ^ x2 gen lnpcount2= ln(pcount2 ) 366 /*Bootstrap estimates*/ bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount2 estat bootstrap, all /*Model 3 (Previously model 4)*/ /* Calculate predicted incidence */ gen x3 = 12.871226961 + ( - 2.33372267 8 * meancalcium) + /// (2.604991520 * meanalbumin) + ( - 0.005245968 * meanros) + ln(nTotalLstc2) gen pcount3= 2.71828 ^ x3 gen lnpcount3= ln(pcount3) /*Bootstrap estimates*/ bootstrap _b, nodot s nowarn reps(500) seed(1921): poisson nDzLs tc2 lnpcount3 estat bootstrap, all /*Model 4 (Previously model 5)*/ /* Calculate predicted incidence */ gen x4 = 11.1420276101 + ( - 2.2289603919 * meancalcium) + /// (2.8552492509 * meana lbumin) + ( - 0.0001264773 * meanNtot) /// + ln (nTotalLstc2) gen pcount4= 2.71828 ^ x4 gen lnpcount4= ln(pcount4) /*Bootstrap estimates*/ bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount4 estat bootstrap, a ll /*Model 5 (Previously 
model 7)*/
/* Calculate predicted incidence */
gen x5 = 10.410488942 + (-2.226549955 * meancalcium) + ///
    (0.002347004 * meanBDtot) + (2.908330440 * meanalbumin) + ln(nTotalLstc2)
gen pcount5 = 2.71828 ^ x5
gen lnpcount5 = ln(pcount5)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount5
estat bootstrap, all

/*Model 6 (Previously model 8)*/
/* Calculate predicted incidence */
gen x6 = 8.912358147 + (-1.916827201 * meancalcium) + ///
    (-0.004963873 * meanvitamind) + (2.646005039 * meanalbumin) + ln(nTotalLstc2)
gen pcount6 = 2.71828 ^ x6
gen lnpcount6 = ln(pcount6)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount6
estat bootstrap, all

/*Model 7 (previously model 9)*/
/* Calculate predicted incidence */
gen x7 = 10.38244732 + (-2.23037539 * meancalcium) + (0.02047481 * meanbetacarotene) ///
    + (2.93251583 * meanalbumin) + ln(nTotalLstc2)
gen pcount7 = 2.71828 ^ x7
gen lnpcount7 = ln(pcount7)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount7
estat bootstrap, all

/*Model 8 (previously model 10)*/
/* Calculate predicted incidence */
gen x8 = 8.15406993 + (-1.73721781 * meancalcium) + (-0.08821252 * meanbhb) ///
    + (2.35374979 * meanalbumin) + ln(nTotalLstc2)
gen pcount8 = 2.71828 ^ x8
gen lnpcount8 = ln(pcount8)
/* Bootstrap estimates */
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount8
estat bootstrap, all

/*Model 9 (Previously model 11)*/
/* Calculate predicted incidence */
gen x9 = 9.9522953 + (-2.0341413 * meancalcium) + (-0.1718569 * medBCS) + ///
    (2.6973726 * meanalbumin) + ln(nTotalLstc2)
gen pcount9 = 2.71828 ^ x9
gen lnpcount9 = ln(pcount9)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount9
estat bootstrap, all

/*Model 10 (previously model 12)*/
/* Calculate predicted incidence */
gen x10 = 7.8604926466 + (-1.7194758959 * meancalcium) + (-0.0004646192 * meanMtot) + ///
    (2.2811244464 * meanalbumin) + ln(nTotalLstc2)
gen pcount10 = 2.71828 ^ x10
gen lnpcount10 = ln(pcount10)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount10
estat bootstrap, all

/*Model 11 (previously model 13)*/
/* Calculate predicted incidence */
gen x11 = 7.7400293442 + (-1.9108122164 * meancalcium) + ///
    (0.0003974242 * meanEtot) + (2.7923362498 * meanalbumin) + ln(nTotalLstc2)
gen pcount11 = 2.71828 ^ x11
gen lnpcount11 = ln(pcount11)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount11
estat bootstrap, all

/*Model 12 (previously model 14)*/
/* Calculate predicted incidence */
gen x12 = 7.727408219 + (-1.688887322 * meancalcium) + ///
    (-0.002295168 * meanBtot) + (2.225398403 * meanalbumin) + ln(nTotalLstc2)
gen pcount12 = 2.71828 ^ x12
gen lnpcount12 = ln(pcount12)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount12
estat bootstrap, all

/*Model 13 (previously model 15)*/
/* Calculate predicted incidence */
gen x13 = 7.772770 + (-1.799114 * meancalcium) + ///
    (2.568391 * meanalbumin) + (-.00004888527 * meanLtot) + ln(nTotalLstc2)
gen pcount13 = 2.71828 ^ x13
gen lnpcount13 = ln(pcount13)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount13
estat bootstrap, all

/*Model 14 (Previously model 16)*/
/* Calculate predicted incidence */
gen x14 = 8.3287012 + (-1.8686264 * meancalcium) + (2.5498694 * meanalbumin) ///
    + (-0.2041196 * meanhaptoglobin) + ln(nTotalLstc2)
gen pcount14 = 2.71828 ^ x14
gen lnpcount14 = ln(pcount14)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount14
estat bootstrap, all

/*Model 15 (previously model 17)*/
/* Calculate predicted incidence */
gen x15 = 7.72011533 + (-1.72971781 * meancalcium) + ///
    (2.32865946 * meanalbumin) + (-0.07351493 * meannefa) + ln(nTotalLstc2)
gen pcount15 = 2.71828 ^ x15
gen lnpcount15 = ln(pcount15)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount15
estat bootstrap, all

/*Model 16 (previously model 18)*/
/* Calculate predicted incidence */
gen x16 = 8.76578983 + (-1.92810506 * meancalcium) + (2.54220211 * meanalbumin) ///
    + (0.02632872 * meanaopugul) + ln(nTotalLstc2)
gen pcount16 = 2.71828 ^ x16
gen lnpcount16 = ln(pcount16)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount16
estat bootstrap, all

/*Model 17 (previously model 19)*/
/* Calculate predicted incidence */
gen x17 = 8.72043070 + (-1.89794788 * meancalcium) + (2.48382482 * meanalbumin) ///
    + (0.01801005 * meanalphatocopherol) + ln(nTotalLstc2)
gen pcount17 = 2.71828 ^ x17
gen lnpcount17 = ln(pcount17)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount17
estat bootstrap, all

cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet lnpcount1 lnpcount2 lnpcount3 lnpcount4 lnpcount5 lnpcount6 ///
    lnpcount7 lnpcount8 lnpcount9 lnpcount10 lnpcount11 lnpcount12 ///
    lnpcount13 lnpcount14 lnpcount15 lnpcount16 lnpcount17 ///
    nDzLstc2 using mean2.1_othermodels.csv, comma replace

********************************************************************************
/*STEP 5: MODEL AVERAGING*/
********************************************************************************
/*Note: We calculated AIC weights in Excel. First, we average predictions from
the original models. Then, we create bootstrapped average predictions over 500
reps*/
use cohort, clear

/*Model 1*/
gen x1 = 8.6904772777 + (-1.9279903753 * meancalcium) + ///
    (2.5550525586 * meanalbumin) + (0.0004897288 * meanvitamina) + ///
    ln(nTotalLstc2)
gen pcounts1 = 2.71828 ^ x1

/*Model 2*/
gen x2 = 8.728243 + (-1.967848 * meancalcium) + (2.700862 * meanalbumin) + ///
    ln(nTotalLstc2)
gen pcounts2 = 2.71828 ^ x2

/*Model 3*/
gen x3 = 12.871226961 + (-2.333722678 * meancalcium) + ///
    (2.604991520 * meanalbumin) + (-0.005245968 * meanros) + ln(nTotalLstc2)
gen pcounts3 = 2.71828 ^ x3

/*Model 4*/
gen x4 = 11.1420276101 + (-2.2289603919 * meancalcium) + ///
    (2.8552492509 * meanalbumin) + (-0.0001264773 * meanNtot) ///
    + ln(nTotalLstc2)
gen pcounts4 = 2.71828 ^ x4

/*Model 5*/
gen x5 = 10.410488942 + (-2.226549955 * meancalcium) + ///
    (0.002347004 * meanBDtot) + (2.908330440 * meanalbumin) + ln(nTotalLstc2)
gen pcounts5 = 2.71828 ^ x5

/*Model 6*/
gen x6 = 8.912358147 + (-1.916827201 * meancalcium) + ///
    (-0.004963873 * meanvitamind) + (2.646005039 * meanalbumin) + ln(nTotalLstc2)
gen pcounts6 = 2.71828 ^ x6

/*Model 7*/
gen x7 = 10.38244732 + (-2.23037539 * meancalcium) + (0.02047481 * meanbetacarotene) ///
    + (2.93251583 * meanalbumin) + ln(nTotalLstc2)
gen pcounts7 = 2.71828 ^ x7

/*Model 8*/
gen x8 = 8.15406993 + (-1.73721781 * meancalcium) + (-0.08821252 * meanbhb) ///
    + (2.35374979 * meanalbumin) + ln(nTotalLstc2)
gen pcounts8 = 2.71828 ^ x8

/*Model 9*/
gen x9 = 9.9522953 + (-2.0341413 * meancalcium) + (-0.1718569 * medBCS) + ///
    (2.6973726 * meanalbumin) + ln(nTotalLstc2)
gen pcounts9 = 2.71828 ^ x9

/*Model 10*/
gen x10 = 7.8604926466 + (-1.7194758959 * meancalcium) + (-0.0004646192 * meanMtot) + ///
    (2.2811244464 * meanalbumin) + ln(nTotalLstc2)
gen pcounts10 = 2.71828 ^ x10

/*Model 11*/
gen x11 = 7.7400293442 + (-1.9108122164 * meancalcium) + ///
    (0.0003974242 * meanEtot) + (2.7923362498 * meanalbumin) + ln(nTotalLstc2)
gen pcounts11 = 2.71828 ^ x11

/*Model 12*/
gen x12 = 7.727408219 + (-1.688887322 * meancalcium) + ///
    (-0.002295168 * meanBtot) + (2.225398403 * meanalbumin) + ln(nTotalLstc2)
gen pcounts12 = 2.71828 ^ x12

/*Model 13*/
gen x13 = 7.772770 + (-1.799114 * meancalcium) + ///
    (2.568391 * meanalbumin) + (-.00004888527 * meanLtot) + ln(nTotalLstc2)
gen pcounts13 = 2.71828 ^ x13

/*Model 14*/
gen x14 = 8.3287012 + (-1.8686264 * meancalcium) + (2.5498694 * meanalbumin) ///
    + (-0.2041196 * meanhaptoglobin) + ln(nTotalLstc2)
gen pcounts14 = 2.71828 ^ x14

/*Model 15*/
gen x15 = 7.72011533 + (-1.72971781 * meancalcium) + ///
    (2.32865946 * meanalbumin) + (-0.07351493 * meannefa) + ln(nTotalLstc2)
gen pcounts15 = 2.71828 ^ x15

/*Model 16*/
gen x16 = 8.76578983 + (-1.92810506 * meancalcium) + (2.54220211 * meanalbumin) ///
    + (0.02632872 * meanaopugul) + ln(nTotalLstc2)
gen pcounts16 = 2.71828 ^ x16

/*Model 17*/
gen x17 = 8.72043070 + (-1.89794788 * meancalcium) + (2.48382482 * meanalbumin) ///
    + (0.01801005 * meanalphatocopherol) + ln(nTotalLstc2)
gen pcounts17 = 2.71828 ^ x17

/*Averaging predicted incidence*/
gen avgpcount = (pcounts1 * 0.13) + (pcounts2 * 0.11) + (pcounts3 * 0.10) + (pcounts4 * 0.06) + ///
    (pcounts5 * 0.06) + (pcounts6 * 0.05) + (pcounts7 * 0.05) + (pcounts8 * 0.05) + ///
    (pcounts9 * 0.05) + (pcounts10 * 0.05) + (pcounts11 * 0.04) + (pcounts12 * 0.04) + ///
    (pcounts13 * 0.04) + (pcounts14 * 0.04) + (pcounts15 * 0.04) + (pcounts16 * 0.04) + ///
    (pcounts17 * 0.04)
/*Calculating NSES*/
egen zp = std(avgpcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(avgpcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/*Bootstrap estimates*/
set level 90
gen lnavgpcount = ln(avgpcount)
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnavgpcount
estat bootstrap, all
cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet lnavgpcount avgpcount nDzLstc2 using mean2.1.csv, comma replace
/* meanDS 1.477477 */

M2_AIM2-1SD.DO

/*
Title: Cohort analysis 2 - Standard deviation
Author: Lauren Wisnieski
Purpose: Second cohort analysis - standard dev and MAD
Revision history:
- Created 7/30/2018
- Revised 12/30/2018 to reflect new methods (bootstrapped Poisson reg estimates)
- Revised 1/2/2019 to reflect changes with offset term
The purpose of this do file is to calculate the DSS scores from the poisson
models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1.
Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohortdata_28December2018.dta", clear
save cohort, replace

/*STEP 1: Calculate DSS scores*/
********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 1.91727 + (-0.52097 * sdbhb) + (-0.00041 * sdLtot) + ///
    (-4.78991 * sdalbumin) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.745541 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.409764 + (-0.006763 * sdBtot) + (0.004583 * sdBDtot) + ///
    (-0.013405 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.711137 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.80928801 + (-0.25998975 * sdbhb) + (-1.45092055 * sdcalcium) + ///
    (-0.01665032 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.629535 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.040650853 + (-0.261424564 * sdbhb) + (0.003071062 * sdBDtot) ///
    + (-0.016652646 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
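/* Sketch only, not part of the original analysis: the predicted counts in
   this file are generated as 2.71828 ^ x, which rounds e to six digits.
   Assuming x and pcount exist as generated just above, exp(x) returns the
   same quantity without that rounding, and reldif() shows how far the two
   versions differ. The variable names pcount_exp and pcount_reldif are
   hypothetical and are dropped again so the original block is unaffected. */
gen double pcount_exp = exp(x)
gen double pcount_reldif = reldif(pcount, pcount_exp)
summarize pcount_reldif
drop pcount_exp pcount_reldif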
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.652231 */

/*Model 5*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.358341986 + (-0.000223399 * sdLtot) + (0.018193555 * sdvitamind) ///
    + (0.005672090 * sdBDtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.480699 */

/*Model 6*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.97898 + (-0.38706 * sdbhb) + (-2.13531 * sdalbumin) + ///
    (-0.01697 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.684832 */

/*Model 7*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.53795 + (-0.00832 * sdBtot) + (0.00608 * sdBDtot) + ///
    (-0.47455 * sdalphatocopherol) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.655339 */

/*Model 8*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.740663616 + (0.022282587 * sdvitamind) + (-0.007757856 * sdBtot) ///
    + (0.005645353 * sdBDtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.685827 */

/*Model 9*/
/* Calculate
predicted incidence */
use cohort, clear
gen x = -0.1990292833 + (-0.2809175821 * sdbhb) + (-0.0003088143 * sdLtot) ///
    + (0.0048621329 * sdBDtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.630541 */

/*Model 10*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.16829831 + (-0.29822704 * sdbhb) + (-0.01391463 * sdvitamina) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.480018 */

/*Model 11*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.376243 + (-1.607578 * sdcalcium) + (0.003169 * sdBDtot) + ///
    (-0.019386 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.687996 */

/*Model 12*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.7177773964 + (-0.0001999466 * sdLtot) + (-0.0058831912 * sdBtot) ///
    + (0.0051618713 * sdBDtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.568546 */

/*Model 13*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.46564687 + (-0.33590529 * sdbhb) + (-0.24999120 * sdalphatocopherol) ///
    + (-0.01182507 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.474807 */

/*Model 14*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.63356864 + (-1.95361286 * sdcalcium) + (-0.01758709 * sdvitamina) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.521073 */

/*Model 15*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.50050390 + (-0.35496883 * sdbhb) + (-0.20393442 * sdaopugul) ///
    + (-0.01490426 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.46582 */

/*Model 16*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.44252454 + (0.02246984 * sdvitamind) + (0.00648077 * sdBDtot) ///
    + (-0.46188399 * sdalphatocopherol) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.495243 */

/*Model 17*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 1.4015 + (-0.5288 * sdbhb) + (-3.2796 * sdalbumin) + ///
    (-0.6185 * sdalphatocopherol) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.630759 */

/*Model 18*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.04655551 + (-0.33567323 * sdbhb) + (0.00437113 * sdBDtot) ///
    + (-0.57711589 * sdalphatocopherol) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.555205 */

/*Model 19*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.1949726812 + (-0.2851066290 * sdbhb) + ///
    (0.0009575476 * sdEtot) + (-0.0121848554 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.438903 */

/*Model 20*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.171811399 + (0.005453509 * sdBDtot) + (-0.356925087 * sdalphatocopherol) ///
    + (-0.014376028 * sdvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.591028 */

/*Model 21*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.396090 + (0.004307 * sdBDtot) + (-0.017961 * sdvitamina) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.572596 */

/*Model 22*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.0503063281 + (-0.3947528787 * sdbhb) + ///
    (0.0002820326 * sdNtot) + (-0.0150344119 * sdvitamina) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.510428 */

/*Model 23*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 0.36781253 + (-1.48181995 * sdcalcium) + ///
    (-0.00420472 * sdBtot) + (-0.01307043 * sdvitamina) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.519188 */

/*Model 24*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 1.0574819883 + (-0.0003387508 * sdLtot) + ///
    (-3.4423401378 * sdalbumin) + (-0.3111770516 * sdbetacarotene) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.451714 */

/*Model 25*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 1.00934579 + (-2.09161827 * sdnefa) + ///
    (-1.94230192 * sdcalcium) + (-0.01970961 * sdvitamina) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.563014 */

/*Note: all these models have significant DSS scores, so no model averaging performed*/

M3_AIM2-1THRESHOLD.DO

/*
Title: Cohort analysis 3 thresholds (cut-offs)
Author: Lauren Wisnieski
Purpose: 3rd analysis - thresholds
Revision history:
- Created 7/31/2018
- Edited 12/28/2018 with new method of determining thresholds of biomarkers (roctg)
- Edited 1/2/2018 to reflect changes with offset term
The purpose of this do file is to calculate the DSS scores from the poisson
models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1. Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\CohortData_28December2018.dta", clear
save cohort, replace

/*STEP 1: Calculate DSS scores*/
********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.39909244 + (-0.07655451 * nCalcium) + ///
    (0.09973137 * nAlbumin) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.398103 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -2.06059325 + (-0.09415151 * nCalcium) + ///
    (0.09476375 * nAlbumin) + (0.17280753 * nHeifer) + ///
    ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and
Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.535001 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.50860480 + (-0.09707006 * nCalcium) + ///
    (0.10839548 * nAlbumin) + (0.02824817 * nBDtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.456015 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.34751923 + (-0.08058837 * nCalcium) + (0.12007883 * nAlbumin) ///
    + (-0.02590578 * nAlphatocopherol) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.445798 */

/*Model 5*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.15995148 + (-0.06996664 * nCalcium) + (0.08697452 * nAlbumin) ///
    + (-0.02192193 * nMtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.356383 */

/*Model 6*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.44658294 + (-0.07078849 * nCalcium) + (0.01045463 * nAlbumin) ///
    + (0.09289844 * nBtot) + ln(nTotalLstc2)
gen pcount = 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.434862 */

/*Model 7*/
/* Calculate
predicted incidence */ use cohort, clear gen x = - 1.29820379 + ( - 0.07585598*nCalcium) + (0.10749530*nAlbumin) /// + ( - 0.02324399*nCholesterol) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcoun t) gen NSES = zp ^2 egen meanNSES = m ean(NS ES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.384046 */ /*Model 8*/ /* Calculate predicted incidence */ use co hort, clear gen x = - 1.36764455 + ( - 0.08050526*nCalcium) + (0.08931423*nAlbumin) /// + (0.01284816*nAopugul) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = s td(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.431841 */ /*Mode l 9*/ 380 /* Calculate predicted incid ence */ use cohort, clear gen x = - 1.28404908 + ( - 0.08652144*nCalcium) + (0.10730802*nAlbumin) /// + ( - 0.01298276*nNtot) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 e gen meanNS ES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.431913 */ /*Mod el 10*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.409012533 + ( - 0.076732530*nCalcium) + (0.096610556*nAlbumin) /// + (0.005841818*nVitamina) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 eg en meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.35489 */ /*Model 11*/ /* Calculate predicted incide nce */ use cohort, clear gen x = - 1.414425886 + ( - 0.078204481*nCalcium) + (0.098927901*nAlbumin) /// + (0.005158437*nBetacarotene) + ln(nTotalLstc2) gen pcount= 
2.71828 ^ x /*Calculatin g NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.3686 72 */ /*Mo del 12*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.321976433 + ( - 0.079256409*nCalcium) + (0.098506454*nAlbumin) /// + ( - 0.005267687*nRos) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Ca lculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 381 /* meanDS 1.380783 */ /*Model 13*/ /* Calcu late predicted incidence */ use cohort, clear gen x = - 1.399928412 + ( - 0.075035401*nCalcium) + (0.096447276*nAlbumin) /// + (0.003161197*nBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x / *Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanD S 1.37376 */ /*Model 14*/ /* Calc ulate predicted incidence */ use cohort, clear gen x = - 1.409448509 + ( - 0.075990666*nCalcium) + (0.005872675*nAlbumin) /// + (0.095430604*nHaptoglobin) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calcula ting NSES*/ egen zp = std(pco unt) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.7 39955 */ /*Model 15*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.27738627 + ( - 0.06803623*nCalcium) + (0.08846181*nAlbumin) /// + ( - 0.01127850*nBhb) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pco unt) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * 
(log10(sdpcounts))) egen meanDS = mea n(DS) sort Cohort2 /* meanDS 1.38331 */ /*Model 16*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.22985400 + ( - 0.01342588*nVitamind) + ( - 0.05529939*nCalcium) /// + (0.07454343*nAlbumin) + ln(nTotalLstc2) 382 gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(p count) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 / * meanDS 1.279602 */ /*Model 17*/ /* Calculate predicted incidence */ use cohort, clea r gen x = - 1.479946714 + ( - 0.066212690*nCalcium) + (0.091631697* nAlbumin) /// + (0.009873171*nSaaugml) + ln(nTotalLstc2) gen pcount= 2.7 1828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(N SES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohor t2 /* meanDS 1.357666 */ /*Model 18*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.3899918706 + ( - 0.0730915816*nCalcium) + (0.0955083627*nAlbumin) /// + (0.0005011734*nEtot) + ln(nTotalLstc2) gen pco unt= 2.71828 ^ x /*Calculating NSES*/ e gen zp = std(pcount) gen NSES = zp ^2 egen mea nNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) so rt Cohort2 /* meanDS 1.370174 */ /* Model 19*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.392094898 + ( - 0.075265319*nCalcium) + ( - 0.001044102*nLtot) /// + (0.098747779*nAlbumin) + ln(nTotalLstc2) gen p count= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.390591 383 */ /*Model 20*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.42065770 + ( - 
0.06439484*nCalcium) + (0.01254392*nNefa) + ///
    (0.08025291*nAlbumin) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.340254 */

/*Model 21*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.65237323 + (0.05859601*nAlbumin) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.201319 */

/*All these models have significant z values, so model averaging not performed*/

M4_AIM2-1COMBINEDCOHORT.DO

/*
Title: Cohort analysis 4: combined model
Author: Lauren Wisnieski
Purpose: Last analysis - combined model that includes cohort means, proportions, and SDs.
Revision history:
- Created 8/1/2018
- Revised 12/30/2018 (changed best subsets selection to include pool of candidate variables within 4 AICs in bivariable analyses with outcome)
- Revised 1/2/2019 (fixed the wrong specification of the exposure term)
- Revised 2/15/19 to fix how equivalence tests were done for slope and intercept

The purpose of this do file is to calculate the DSS scores from the Poisson models selected in R, perform bootstrap validation, and perform model averaging.

Steps:
Step 1: Sort variables by AIC
Step 2: Calculate DSS scores
Step 3: Internal validation
Step 4: Model averaging
*/

use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\CohortData_28December2018.dta", clear
save cohort, replace

/*STEP 1: Sort variables by AIC*/
**********************************************************************************
poisson nDzLstc2 meanbhb, exposure(nTotalLstc2)
estat ic /*84.22*/
poisson nDzLstc2 meannefa, exposure(nTotalLstc2)
estat ic /*88.12*/
poisson nDzLstc2 meancalcium, exposure(nTotalLstc2)
estat ic /*82.90*/
poisson nDzLstc2 meancholesterol, exposure(nTotalLstc2)
estat ic /*88.11*/
poisson nDzLstc2 medBCS, exposure(nTotalLstc2)
estat ic /*86.81*/
poisson nDzLstc2 meanNtot, exposure(nTotalLstc2)
estat ic /*88.12*/
poisson nDzLstc2 meanLtot, exposure(nTotalLstc2)
estat ic /*85.00*/
poisson nDzLstc2 meanalbumin, exposure(nTotalLstc2)
estat ic /*86.03*/
poisson nDzLstc2 meanhaptoglobin, exposure(nTotalLstc2)
estat ic /*87.83*/
poisson nDzLstc2 meanvitamind, exposure(nTotalLstc2)
estat ic /*87.53*/
poisson nDzLstc2 meanMtot, exposure(nTotalLstc2)
estat ic /*87.29*/
poisson nDzLstc2 meanBtot, exposure(nTotalLstc2)
estat ic /*83.29*/
poisson nDzLstc2 meanBDtot, exposure(nTotalLstc2)
estat ic /*87.94*/
poisson nDzLstc2 meanEtot, exposure(nTotalLstc2)
estat ic /*87.90*/
poisson nDzLstc2 meansaa, exposure(nTotalLstc2)
estat ic /*88.12*/
poisson nDzLstc2 meanaopugul, exposure(nTotalLstc2)
estat ic /*86.78*/
poisson nDzLstc2 meanalphatocopherol, exposure(nTotalLstc2)
estat ic /*87.91*/
poisson nDzLstc2 meanbetacarotene, exposure(nTotalLstc2)
estat ic /*87.03*/
poisson nDzLstc2 meanros, exposure(nTotalLstc2)
estat ic /*87.72*/
poisson nDzLstc2 meanvitamina, exposure(nTotalLstc2)
estat ic /*86.27*/
poisson nDzLstc2 i.season, exposure(nTotalLstc2)
estat ic /*88.00*/
poisson nDzLstc2 sdbhb, exposure(nTotalLstc2)
estat ic /*83.02*/
poisson nDzLstc2 sdnefa, exposure(nTotalLstc2)
estat ic /*88.12*/
poisson nDzLstc2 sdcalcium, exposure(nTotalLstc2)
estat ic /*85.32*/
poisson nDzLstc2 sdcholesterol, exposure(nTotalLstc2)
estat ic /*87.30*/
poisson nDzLstc2 manBCS, exposure(nTotalLstc2)
estat ic /*85.04*/
poisson nDzLstc2 sdNtot, exposure(nTotalLstc2)
estat ic /*87.37*/
poisson nDzLstc2 sdLtot, exposure(nTotalLstc2)
estat ic /*85.15*/
poisson nDzLstc2 sdalbumin, exposure(nTotalLstc2)
estat ic /*87.99*/
poisson nDzLstc2 sdhaptoglobin, exposure(nTotalLstc2)
estat ic /*88.12*/
poisson nDzLstc2 sdvitamind, exposure(nTotalLstc2)
estat ic /*86.16*/
poisson nDzLstc2 sdMtot, exposure(nTotalLstc2)
estat ic /*87.82*/
poisson nDzLstc2 sdBtot, exposure(nTotalLstc2)
estat ic /*82.30*/
poisson nDzLstc2 sdBDtot, exposure(nTotalLstc2)
estat ic /*85.49*/
poisson nDzLstc2 sdEtot, exposure(nTotalLstc2)
estat ic /*86.91*/
poisson nDzLstc2 sdsaaugml, exposure(nTotalLstc2)
estat ic /*87.76*/
poisson nDzLstc2 sdaopugul, exposure(nTotalLstc2)
estat ic /*87.96*/
poisson nDzLstc2 sdalphatocopherol, exposure(nTotalLstc2)
estat ic /*86.54*/
poisson nDzLstc2 nBhb, exposure(nTotalLstc2)
estat ic /*86.21*/
poisson nDzLstc2 nNefa, exposure(nTotalLstc2)
estat ic /*84.95*/
poisson nDzLstc2 nCalcium, exposure(nTotalLstc2)
estat ic /*87.15*/
poisson nDzLstc2 nCholesterol, exposure(nTotalLstc2)
estat ic /*86.99*/
poisson nDzLstc2 nHeifer, exposure(nTotalLstc2)
estat ic /*86.95*/
poisson nDzLstc2 nNtot, exposure(nTotalLstc2)
estat ic /*87.47*/
poisson nDzLstc2 nLtot, exposure(nTotalLstc2)
estat ic /*87.45*/
poisson nDzLstc2 nAlbumin, exposure(nTotalLstc2)
estat ic /*82.72*/
poisson nDzLstc2 nHaptoglobin, exposure(nTotalLstc2)
estat ic /*88.12*/
poisson nDzLstc2 nVitamind, exposure(nTotalLstc2)
estat ic /*86.11*/
poisson nDzLstc2 nMtot, exposure(nTotalLstc2)
estat ic /*87.95*/
poisson nDzLstc2 nBtot, exposure(nTotalLstc2)
estat ic /*88.00*/
poisson nDzLstc2 nBDtot, exposure(nTotalLstc2)
estat ic /*88.09*/
poisson nDzLstc2 nEtot, exposure(nTotalLstc2)
estat ic /*87.91*/
poisson nDzLstc2 nSaaugml, ///
    exposure(nTotalLstc2)
estat ic /*87.80*/
poisson nDzLstc2 nAopugul, exposure(nTotalLstc2)
estat ic /*87.22*/
poisson nDzLstc2 nAlphatocopherol, exposure(nTotalLstc2)
estat ic /*87.39*/
poisson nDzLstc2 nBetacarotene, exposure(nTotalLstc2)
estat ic /*87.63*/
poisson nDzLstc2 nRos, exposure(nTotalLstc2)
estat ic /*87.64*/
poisson nDzLstc2 nVitamina, exposure(nTotalLstc2)
estat ic /*87.38*/
poisson nDzLstc2 nBCS, exposure(nTotalLstc2)
estat ic /*86.86*/

/*STEP 2: Calculate DSS scores*/
**********************************************************************************

/*Model 1*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -7.678615 + (-0.192068*meanbhb) + (0.009679*sdBDtot) ///
    + (2.093141*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.753054 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.657394 + (-0.008927*sdBtot) + (.008038*sdBDtot) + ///
    (-0.097712*nBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.73821 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -7.686938782 + (-0.288884208*sdbhb) + (0.009059366*sdBDtot) ///
    + (1.938748730*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.704201 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 16.462616 + (0.082790*nAlbumin) + (-1.921913*meancalcium) ///
    + (0.003288*sdBDtot) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.758754 */

/*Model 5*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -8.60642 + (0.01020*sdBDtot) + (-0.04529*nBhb) + ///
    (2.15352*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.729848 */

/*Model 6*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -0.723169 + (-0.007950*meanBtot) + (0.008271*sdBDtot) + ///
    (-0.102901*nBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.716293 */

/*Model 7*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 11.492929178 + (-2.287064223*meancalcium) + (0.003390599*sdBDtot) ///
    + (2.723247871*meanalbumin) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.699115 */

/*Model 8*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 13.99371744 + (0.08537553*nAlbumin) + (-1.64468470*meancalcium) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.569345 */

/*Model 9*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -8.0032864162 + (0.0097999487*sdBDtot) + (2.0520128050*medBCS) ///
    + (-0.0004792564*sdNtot) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.692789 */

/*Model 10*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 8.6904772777 + (-1.9279903753*meancalcium) + (2.5550525586*meanalbumin) ///
    + (0.0004897288*meanvitamina) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.4333 */

/*Model 11*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -7.86221392 + (0.01002708*sdBDtot) + (1.83231393*medBCS) + ///
    ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.649612 */

/*Model 12*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -1.68120 + (-0.44055*sdbhb) + (0.08626*nNefa) + ///
    (0.02215*sdvitamind) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.730149 */

/*Model 13*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 7.7435555 + (-1.4854745*meancalcium) + (-0.3541039*sdcalcium) ///
    + (2.6317462*meanalbumin) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 4.278593 */

/*Model 14*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 2.69246933 + (-0.94061640*meancalcium) + (0.00879549*sdBDtot) ///
    + (1.41826331*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.670402 */
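
/* Editorial sketch, not part of the original do-file.
   Every model block in Step 2 repeats the same scoring arithmetic on pcount.
   The helper below, hypothetically named dss_mean, wraps that arithmetic in a
   small r-class program so that each model needs only its own "gen x" line.
   It reproduces the listing's calculation as written (squared standardized
   predicted count plus 2*log10 of the SD of the predicted counts) and does
   not change the scoring rule. Because the SD is held in a local macro
   instead of a float variable, trailing digits may differ slightly from the
   values recorded in the comments. */
capture program drop dss_mean
program define dss_mean, rclass
    syntax varname(numeric)
    tempvar zp ds
    /* standardize the predicted counts (mean 0, SD 1) */
    quietly egen `zp' = std(`varlist')
    /* SD of the predicted counts, same for every observation */
    quietly summarize `varlist'
    local sdp = r(sd)
    /* per-observation score, then its mean */
    quietly generate `ds' = `zp'^2 + (2 * (log10(`sdp')))
    quietly summarize `ds', meanonly
    local m = r(mean)
    display as text "meanDS = " as result %9.6f `m'
    return scalar meanDS = `m'
end

/* Example use, repeating the Model 14 coefficients from the listing */
use cohort, clear
gen x = 2.69246933 + (-0.94061640*meancalcium) + (0.00879549*sdBDtot) ///
    + (1.41826331*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
dss_mean pcount
/* the result is also available afterwards as r(meanDS) */
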
/*Model 15*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -7.987332330 + (0.009533922*sdBDtot) + (1.965593571*medBCS) ///
    + (-0.001332825*meanMtot) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.704543 */

/*Model 16*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -5.887880450 + (-0.005540687*sdBtot) + (0.008338497*sdBDtot) ///
    + (1.335830338*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.661367 */

/*Model 17*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -3.057945568 + (-0.436543512*sdcalcium) + (0.008862894*sdBDtot) ///
    + (1.689084794*medBCS) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 5.124495 */

/*Model 18*/
/* Calculate predicted incidence */
use cohort, clear
gen x = -9.11090 + (0.01018*sdBDtot) + (1.95542*medBCS) + (0.16672*nHeifer) ///
    + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.729981 */

/*Model 19*/
/* Calculate predicted incidence */
use cohort, clear
gen x = 12.24039407 + (0.07698336*nAlbumin) + (-1.14944356*meancalcium) ///
    + (-0.30351703*sdcalcium) + ln(nTotalLstc2)
gen pcount = 2.71828^x
/*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort 2 /* meanDS 3.915638 */ /*Model 20*/ /* Calculate pr edicted incidence */ use cohort, clear gen x =13.3360895754 +( - 1.5833423633*meancalcium) + (0.0792046664*nAlbumin) /// +(0.0004105014*meanvitamina) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES */ egen zp = std(pcoun t) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(D S) sort Cohort2 /* meanDS 1.494927 */ /*Model 21*/ /* Calculate predicted incidence */ use cohort, clear gen x =8.728250 +( - 1.967850*meancalcium) +(2.700865*meanalbumin) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pc ount) g en NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.505286 */ /*Model 22*/ /* Calculate predicted incidence */ use cohort, clear gen x =8.68273080+( - 2.02232668*meancalcium)+(2.94168259*meanalbumin) /// +( - 0.03345594*nVitamind) + ln(nTotalLstc2) gen pcount= 2.71 828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort 2 /* meanDS 1.588294 */ 393 /*Model 23*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 0.377892 +( - 0.289063*sdbhb)+(0.007183*sdBDtot) + /// ( - 0.111453*nBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* me an DS 
1.664329 */ /*Model 24*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 7.913441+(0.009727*sdBDtot)+( - 0.030100*nVitamind) + /// (1.922158*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calcula ti ng NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.6 94184 */ /*Model 25*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 7.733626448+(0.009467678*sdBDtot)+(1.856926566*medBCS) /// +( - 0.036760692*meanbetacarotene) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /* Calculating NSES*/ egen zp = std(pcount) g en NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* mea nDS 1.669424 */ /*Model 26*/ /* Cal culate predicted incidence */ use cohort, clear gen x = - 0.29211+( - 0.50162*sdbhb)+(0.06686*nNefa)+( - 0.41040*sdalphatocopherol) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculat ing NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid an d Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) 394 egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.66 7781 */ /*Model 27*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 8.100973004+(0.009237479*sdBDtot)+(0.106886609*meanaopugul) /// +(1.775529553*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calcu lating NSES*/ egen zp = std(pcount) gen NS ES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1 .664645 */ /*Model 28*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 6.322731164+(0.008631404*sdBDtot)+( - 0.004340620*meanBtot) /// +(1.439345069*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x 
/*Calcula ting NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.6 52682 */ /*Model 29*/ /* Calculate pr edicted incidence */ use c ohort, clear gen x =7.97603535+( - 2.30730334*meancalcium)+(4.04684570*meanalbumin) /// + ( - 0.07201712*nCholesterol) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x / *Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen mea nNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.548171 */ /*Model 30*/ /* Calcu late predicted incidence */ use cohort, clear gen x =0.5018308188+( - 0.1956880755*meanbhb)+( - 0.0003540744*sdLtot) /// +(0.0050029719*sdBDtot) + ln(nTotalLstc2) 395 gen pcount= 2.71828 ^ x /* Calculating NSES*/ egen zp = std(pcount) g en NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.672554 */ /*Model 31*/ /* Calcul ate predicted incidence */ use cohort, clear gen x = - 1.572226263+(0.020550769*sdvitamind)+( - 0.102604806*nBCS) /// +(0.009524318*sdBDtot) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Cal culating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1 .663082 */ /*Model 32*/ /* Calculate pred icted incidence */ use cohort, clear gen x = - 8.123687586+(0.018957018*nAopugul)+(1.881269399*medBCS) /// + (0.009286897*sdBDtot) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calcul ating NSES*/ egen zp = std(pcount) gen NSE S = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = 
mean(DS) sort Cohort2 /* meanDS 1. 675699 */ /*Model 33*/ /* Calculat e p redicted incidence */ use cohort, clear gen x = - 7.048755+(0.010251*sdBDtot)+(1.735966*medBCS)+( - 0.008525*sdcholesterol) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES* / egen zp = std(pcount) gen NSES = zp ^ 2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.680523 */ /*Model 34*/ 396 /* Calculate predicted incid ence */ use cohort, clear gen x =14.62588514+(0.10256079*nAlbumin)+( - 1.69545318*meancalcium) /// +( - 0.03955223*nCholesterol) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES */ egen zp = std(pcount) ge n NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.576181 */ /*Model 35*/ /* Calculate predicted inci dence */ use cohort, clear gen x =9.87702+( - 1.12935*meancalcium)+(0.00785*sdBDtot) /// +( - 0.08862*nBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std( pcount) gen NSES = zp ^2 egen meanNSES = me an(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.656695 */ /*Model 36*/ /* Calculate predicted incide nce */ use cohor t, clear gen x =19.0430+(0.1002*nAlbumin)+( - 2.0550*meancalcium)+( - 0.3716*medBCS) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean( NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.675719 */ /*Model 37*/ /* Calculate pr edicted incidence */ use cohort, clear gen x =9.14157+( - 1.06120*meancalcium)+( - 0.34780*sdbhb)+(0.06743*nNefa) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x 
/*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen m eanNSES = mean(NSES) /* Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 397 1.67910 */ /*Model 38*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 6.990468066+(0 .009538788*sdBDtot)+(0.011537057*sdvitamind) /// +(1.494552179*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen mea nNSES = mean(NSES) /* Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.669127 */ /*Model 39*/ /* Calculate predicted incidence */ use cohort, clear gen x =8.45488925+( - 1.93541 556*meancalcium)+(2.74649162*meanalbumin) /// + ( - 0.02665554*nBhb) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNS ES = mean(NSES) /*Dawid and Sebastini*/ e gen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.552971 */ /*Model 40*/ /* Calculate predicted incidence */ us e cohort, clear gen x = - 1.609586627+(0.008352 597*sdBDtot)+(0.175067717*meanaopugul) /// +( - 0.127640023*nBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.668541 */ /*Model 41*/ /* Calculate predicted incidence */ use coh ort, clear g en x =8.23248778+( - 1.68531294*mea ncalcium)+(0.03088412*nNefa) + /// (1.96792406*meanalbumin) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = me an(NSES) /*D awid and Sebastini*/ egen sdp counts = sd(pcount) 398 gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) 
sort Cohort2 /* meanDS 1.488914 */ /*Model 42*/ /* Calculate predicted incidence */ use coho rt, clear gen x =11.62734988+(0.0707056*nAlbu min)+( - 1.3632888*meancalcium) /// + ( - 0.1316809*sdbhb) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(N SES) /*D awid and Sebastini*/ egen sdpcoun ts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.548179 */ /*Model 43*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 9.233871908+(0.008489225*sdBDto t)+(0.135744414*sdalphatocopherol) /// + (2.091870960*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = m ean(NSES ) /*Dawid and Sebastini*/ egen sd pcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.181928 */ /*Model 44*/ /* Calculate predicted incidence */ use cohort , clear gen x = - 5.277160364+(0.009341234*sdBD tot)+(1.156726607*medBCS) /// +( - 0.043419482*nBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.574472 */ /*Model 45*/ /* Calculate predicted incidence */ use cohort, cl ear gen x =15.07981704+(0.08695581*nAlbumin)+ ( - 1.76522663*meancalcium) /// + (0.01424496*meanbetacarotene) + ln(nTotalLstc2) 399 gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen m eanNSES = m ean(NSES) /*Dawid and Sebastini*/ egen sd pcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.552879 */ /*Model 46*/ /* Calculate predicted incidence */ use coh ort, clear gen x =13.02260279+(0.07683542*nAl bumin)+( - 1.52278054*meancalcium) /// + ( - 
0.01635264*nVitamind) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen s dpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.555449 */ /*Model 47*/ /* Calculate predicte d incidence */ use c ohort, clear gen x =7.2877030+( - 1.6190283*mea ncalcium)+( - 0.1710019*sdbhb) + /// (2.2150279*meanalbumin) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mea n(NSES) /*Dawid and Sebastini*/ egen sdpc ounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.488026 */ /*Model 48*/ /* Calculate predict ed incidence */ use coh ort, clear gen x = - 6.67166039+(0.02899234*nAl bumin)+(0.00810086*sdBDtot) + /// (1.43039427*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES ) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.621193 */ 400 /*Model 49*/ /* Calculate predicted incidence */ use cohort, c lear gen x =10.42705235+(0.05374988*nAlbumin) +( - 1.58046116*meancalcium) /// +(0.95431849*meanalbumin) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean( NSES) /*Dawid and Sebastini*/ egen sdpcou nts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.48896 */ /*Model 50*/ /* Calculate predicted incidence */ use cohort, cl ear gen x =12.18502218+(0.06391539*nAlbumin)+ ( - 1.45416794*meancalcium) /// +(0.02046031*nNefa) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = 
NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.518181 */ /*Model 51*/ /* Ca lculate predicted incidence */ use cohort, clea r gen x =13.09564016+(0.07924347*nAlbumin)+( - 1.54473142*meancalcium) /// +( - 0.03880035*manBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd (pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.512377 */ /*Model 52 */ /* Calculate predicted incidence */ use cohort, clear gen x = - 6.895679250+(0.009290467*sdBDtot)+(1. 611239415*medBCS) /// +( - 0.021043417*nCalcium) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = st d(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(p count) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 401 /* meanDS 1.593033 */ /*Model 53*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1.540411+(0.108521*nAlbumin)+(0.003319*sd BDtot)+ /// ( - 0.098263*nCalcium) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.662843 */ /*Model 54*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 5.2057416244+(0.0084175137*sdBDtot)+( - 0.000127079 4*sdLtot) /// +(1.1634090560*medBCS) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid a nd Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.580828 */ /* Model 55*/ /* Calculate predicted incidence */ use cohort, clear gen x = - 1. 
3583470851+( - 0.0002234054*sdLtot)+(0.0056722375*s dBDtot) /// +(0.0181939631*sdvitamind) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid a nd Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.480721 */ /*Model 56*/ /* Calculate predicted incidence */ use cohort, clear gen x =12 .84976009+(0.07472125*nAlbumin)+( - 1.48453016*mean calcium) /// +( - 0.05625727*meanbhb) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*C alculating NSES*/ egen zp = std(pcount) 402 gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) g en DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.564207 */ /*Model 57*/ /* Calculate predicted incidence */ use cohort, clear gen x= - 1.040742771+(0.007278026*sdBDtot)+( - 0.119685668* nBCS) /// +(0.031478805*nAopugul) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Se bastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.607873 */ /*Model 58*/ /* Calculate predicted incidence */ use cohort, clear gen x= - 0.5 3795+( - 0.00832*sdBtot)+(0.00608*sdBDtot)+( - 0.4745 5*sdalphatocopherol) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Cal culating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpcounts))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1. 
655339 */ /*Model 59*/ /* Calculate predicted incidence */ use cohort, clear gen x= - 0.428479+(0.0082 06*sdBDtot)+( - 0.131816*nBCS)+( - 0.001655*meanMtot) /// + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdp counts))) egen meanDS = mean(DS) sort Cohort2 /* meanD S 1.650116 */ 403 /*Model 60*/ /* Calculate predicted incidence */ use cohort, clear gen x=12.52871028+(0.07348281*nAlbumin )+( - 1.51124243*meancalcium) /// +(0.05900132* nHeifer) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(pcount) gen DS = NSES + (2 * (log10(sdpco unts))) egen meanDS = mean(DS) sort Cohort2 /* meanD S 1.533004 */ /*Model 61*/ /* Calculate predicted incidence */ use cohort, clear gen x=13.49318371+(0.08036592*nAlbumin)+( - 1.57945493*meancalcium) /// +( - 0.01100347*nB hb) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Ca lculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd( pcount) gen DS = NSES + (2 * (log10(sdpcounts ))) egen meanDS = mean(DS) sort Cohort2 /* meanDS 1.574676 */ /*Model 62*/ /* Calculate predicted incidence */ use cohort, clear gen x= - 1.740666532+( - 0.007757901*sdBtot)+(0.0 05645378*sdBDtot) /// +(0.022282688*sdvitamin d) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x /*Calculating NSES*/ egen zp = std(pcount) gen NSES = zp ^2 egen meanNSES = mean(NSES) /*Dawid and Sebastini*/ egen sdpcounts = sd(p count) gen DS = NSES + (2 * (log10(sdpcounts) )) egen meanDS = mean(DS) sort Cohort2 /* m eanDS 1.68583 */ /*Model 63*/ /* Calculate predicted incidence */ use cohort, clear gen x=11.39618+(.07070675*nAlbumin)+( - 1.347034* meancalcium) /// + ( - .00005259072*sdLtot) + ln(nTotalLstc2) gen pcount= 2.71828 ^ x 
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.466361 */

/*Model 64*/
/* Calculate predicted incidence */
use cohort, clear
gen x=12.4268122404+(0.0743895737*nAlbumin)+(-1.4636117026*meancalcium) ///
    +(-0.0003492958*meanMtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.511989 */

/*Model 65*/
/* Calculate predicted incidence */
use cohort, clear
gen x=6.214543648+(-1.732357063*meancalcium)+(2.911558391*meanalbumin) ///
    + (-0.008020267*sdcholesterol) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.495038 */

/*Model 66*/
/* Calculate predicted incidence */
use cohort, clear
gen x=10.994447369+(0.073085773*nAlbumin)+(-1.300260771*meancalcium) ///
    +(-0.003427014*sdcholesterol) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.453055 */

/*Model 67*/
/* Calculate predicted incidence */
use cohort, clear
gen x=13.73422787+(0.07665216*nAlbumin)+(-1.63012309*meancalcium) ///
    + (0.04439145*sdalphatocopherol) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.396684 */

/*Model 68*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -0.1990193390+(0.0048621919*sdBDtot)+(-0.2809197095*sdbhb) ///
    +(-0.0003088189*sdLtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.630551 */

/*Model 69*/
/* Calculate predicted incidence */
use cohort, clear
gen x=13.494876681+(0.076569318*nAlbumin)+(-1.589303618*meancalcium) ///
    +(0.006260292*nAopugul) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.551969 */

/*Model 70*/
/* Calculate predicted incidence */
use cohort, clear
gen x=12.445558993+(0.076570221*nAlbumin)+(-1.462606673*meancalcium) ///
    +(-0.000103687*sdNtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.51594 */

/*Model 71*/
/* Calculate predicted incidence */
use cohort, clear
gen x=13.62667+(.08249165*nAlbumin)+(-1.601464*meancalcium) + ///
    (-.00008003649*sdEtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.543303 */

/*Model 72*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -6.3940334892+(0.0087820200*sdBDtot)+(-0.0009620967*meanvitamina) ///
    +(1.5090877481*medBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.610771 */

/*Model 73*/
/* Calculate predicted incidence */
use cohort, clear
gen x=10.38245359+(-2.23037667*meancalcium)+(2.93251755*meanalbumin) ///
    +(0.02047484*meanbetacarotene)+ ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.512653 */

/*Model 74*/
/* Calculate predicted incidence */
use cohort, clear
gen x=13.02153+(.08138922*nAlbumin)+(-1.525097*meancalcium) ///
    +(-.00003162195*meanLtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.544725 */

/*Model 75*/
/* Calculate predicted incidence */
use cohort, clear
gen x=13.0782147077+(0.0794903686*nAlbumin)+(-1.5438838855*meancalcium) ///
    +(0.0001532551*nBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.522676 */

/*Model 76*/
/* Calculate predicted incidence */
use cohort, clear
gen x=12.07161563+(0.08591040*nAlbumin)+(-1.43513183*meancalcium) ///
    +(-0.01270657*nCalcium) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.542659 */

/*Model 77*/
/* Calculate predicted incidence */
use cohort, clear
gen x=8.15406993+(-1.73721781*meancalcium)+(-0.08821252*meanbhb) ///
    +(2.35374979*meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.51355 */

/*Model 78*/
/* Calculate predicted incidence */
use cohort, clear
gen x=13.04025748+(0.07720475*nAlbumin)+(-1.54745057*meancalcium) ///
    +(0.02090093*meanaopugul) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.536132 */

/*Model 79*/
/* Calculate predicted incidence */
use cohort, clear
gen x=11.460464868+(0.073007755*nAlbumin)+(-1.376808508*meancalcium) ///
    +(0.002777992*sdvitamind) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.480875 */

/*Model 80*/
/* Calculate predicted incidence */
use cohort, clear
gen x=12.111673997+(0.071920079*nAlbumin)+(-1.432238219*meancalcium) ///
    + (-0.001218867*meanBtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.512049 */

/*Model 81*/
/* Calculate predicted incidence */
use cohort, clear
gen x=11.693360304+(0.068308625*nAlbumin)+(-0.001646258*sdBtot) ///
    +(-1.382914664*meancalcium) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.506704 */

/*Model 82*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -0.582656+ (0.008397*sdBDtot)+(-0.126673*nBCS)+ ///
    (-0.052640*meanbetacarotene) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.633746 */

/*Model 83*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -7.530630084+ (0.210433470*manBCS)+(0.009364786*sdBDtot) ///
    + (1.738325811*medBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.637951 */

/*Model 84*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -0.5096459+(-0.4394938*sdbhb)+(0.0604288*nNefa)+ ///
    (-0.0001587*sdLtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.627716 */

/*Model 85*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -1.02685+(-0.41322*sdbhb)+(0.06887*nNefa) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.629907 */

/*Model 86*/
/* Calculate predicted incidence */
use cohort, clear
gen x=7.563099410+(-0.003369577*sdBtot)+(-1.665350996*meancalcium) ///
    +(2.222500512*meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.508409 */

/*Model 87*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -8.01750+(0.01041*sdBDtot)+(1.89969*medBCS)+ ///
    (-0.01236*nCholesterol) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.655272 */

/*Model 88*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -7.569145137+(1.757128306*medBCS)+(-0.002246103*nNefa) + ///
    (0.009721934*sdBDtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.613803 */

/*Model 89*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -7.458876605+(0.009557977*sdBDtot)+(1.700857286*medBCS) + ///
    (0.000257351*sdEtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.62689 */

/*Model 90*/
/* Calculate predicted incidence */
use cohort, clear
gen x=7.249116+(-1.707842*meancalcium)+(-.00008975671*sdLtot) + ///
    (2.464833*meanalbumin) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.471851 */

/*Model 91*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -7.836806138+(0.238387955*meanalbumin)+(0.008953769*sdBDtot) ///
    +(1.603358029*medBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.585532 */

/*Model 92*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -5.964156+(-.00004842549*meanLtot)+(.008387064*sdBDtot) + ///
    (1.362984*medBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.547998 */

/*Model 93*/
/* Calculate predicted incidence */
use cohort, clear
gen x=9.9523877+(-2.0341568*meancalcium)+(2.6973942*meanalbumin) ///
    +(-0.1718619*medBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.485998 */

/*Model 94*/
/* Calculate predicted incidence */
use cohort, clear
gen x= 8.41738410+(-1.90582234*meancalcium)+(2.47619821*meanalbumin) ///
    + (0.09618206*nHeifer) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.531426 */

/*Model 95*/
/* Calculate predicted incidence */
use cohort, clear
gen x= 10.405356205+(-1.195250438*meancalcium)+(-0.006335354*sdBtot) ///
    +(0.004312372*sdBDtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.621916 */

/*Model 96*/
/* Calculate predicted incidence */
use cohort, clear
gen x=8.072504672+(-1.772722458*meancalcium)+(2.352273218*meanalbumin) ///
    +(-0.001467857*sdvitamind) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.405009 */

/*Model 97*/
/* Calculate predicted incidence */
use cohort, clear
gen x=9.554800832+(-1.083622435*meancalcium)+(-0.000203840*sdLtot) ///
    +(0.005068445*sdBDtot) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.525962 */

/*Model 98*/
/* Calculate predicted incidence */
use cohort, clear
gen x= -0.948339+(0.008494*sdBDtot)+(-0.109427*nBCS) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.559837 */

/*STEP 3: INTERNAL VALIDATION*/
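/* Editorial sketch, not part of the original do-file. It illustrates, on
   simulated data, the calibration check that Step 3 applies to each candidate
   model: observed counts are regressed on the log of the predicted counts with
   Poisson regression, and the coefficients are bootstrapped. An intercept near
   0 and a slope near 1 would indicate close agreement between predicted and
   observed counts. The variable names demo_lnpred and demo_obs are
   hypothetical and do not exist in the study data; 18 observations are used
   only to mirror the number of calving cohorts. */
preserve
clear
set seed 1921
set obs 18
gen demo_lnpred = rnormal(1, 0.5)            // simulated log predicted counts
gen demo_obs = rpoisson(exp(demo_lnpred))    // simulated observed counts
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson demo_obs demo_lnpred
estat bootstrap, all
restore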
********************************************************************************
/*Calculating slope and intercept for predicted versus observed counts using
500 bootstrapped samples for each candidate model*/
********************************************************************************
set level 90
use cohort, clear

/*Model 1 (previously model 10)*/
/* Calculate predicted incidence */
gen x1 = 8.6904772777+ (-1.9279903753*meancalcium) + (2.5550525586*meanalbumin) ///
    + (0.0004897288*meanvitamina) + ln(nTotalLstc2)
gen pcount1= 2.71828 ^ x1
gen lnpcount1= ln(pcount1)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount1
estat bootstrap, all

/*Model 2 (previously model 21)*/
/* Calculate predicted incidence */
gen x2 =8.728250 +(-1.967850*meancalcium) +(2.700865*meanalbumin) ///
    + ln(nTotalLstc2)
gen pcount2= 2.71828 ^ x2
gen lnpcount2= ln(pcount2)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount2
estat bootstrap, all

/*Model 3 (previously model 49)*/
/* Calculate predicted incidence */
gen x3 =10.42705235+(0.05374988*nAlbumin)+(-1.58046116*meancalcium) ///
    +(0.95431849*meanalbumin) + ln(nTotalLstc2)
gen pcount3= 2.71828 ^ x3
gen lnpcount3= ln(pcount3)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount3
estat bootstrap, all

/*Model 4 (previously model 65)*/
/* Calculate predicted incidence */
gen x4=6.214543648+(-1.732357063*meancalcium)+(2.911558391*meanalbumin) ///
    +(-0.008020267*sdcholesterol) + ln(nTotalLstc2)
gen pcount4= 2.71828 ^ x4
gen lnpcount4= ln(pcount4)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount4
estat bootstrap, all

/*Model 5 (previously model 73)*/
/* Calculate predicted incidence */
gen x5=10.38245359+(-2.23037667*meancalcium)+(2.93251755*meanalbumin) ///
    +(0.02047484*meanbetacarotene)+ ln(nTotalLstc2)
gen pcount5= 2.71828 ^ x5
gen lnpcount5= ln(pcount5)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount5
estat bootstrap, all

/*Model 6 (previously model 77)*/
/* Calculate predicted incidence */
gen x6=8.15406993+(-1.73721781*meancalcium)+(-0.08821252*meanbhb) ///
    +(2.35374979*meanalbumin) + ln(nTotalLstc2)
gen pcount6= 2.71828 ^ x6
gen lnpcount6= ln(pcount6)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount6
estat bootstrap, all

/*Model 7 (previously model 86)*/
/* Calculate predicted incidence */
gen x7=7.563099410+(-0.003369577*sdBtot)+(-1.665350996*meancalcium) ///
    +(2.222500512*meanalbumin) + ln(nTotalLstc2)
gen pcount7= 2.71828 ^ x7
gen lnpcount7= ln(pcount7)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount7
estat bootstrap, all

/*Model 8 (previously model 93)*/
/* Calculate predicted incidence */
gen x8=9.9523877+(-2.0341568*meancalcium)+(2.6973942*meanalbumin) ///
    +(-0.1718619*medBCS) + ln(nTotalLstc2)
gen pcount8= 2.71828 ^ x8
gen lnpcount8= ln(pcount8)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount8
estat bootstrap, all

/*Model 9 (previously model 94)*/
/* Calculate predicted incidence */
gen x9= 8.41738410+(-1.90582234*meancalcium)+(2.47619821*meanalbumin) ///
    + (0.09618206*nHeifer) + ln(nTotalLstc2)
gen pcount9= 2.71828 ^ x9
gen lnpcount9= ln(pcount9)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount9
estat bootstrap, all

/*Model 10 (previously model 96)*/
/* Calculate predicted incidence */
gen x10=8.072504672+(-1.772722458*meancalcium)+(2.352273218*meanalbumin) ///
    +(-0.001467857*sdvitamind) + ln(nTotalLstc2)
gen pcount10= 2.71828 ^ x10
gen lnpcount10= ln(pcount10)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount10
estat bootstrap, all

cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet lnpcount1 lnpcount2 lnpcount3 lnpcount4 lnpcount5 lnpcount6 lnpcount7 ///
    lnpcount8 lnpcount9 lnpcount10 nDzLstc2 using combined2.1_othermodels.csv, comma replace

********************************************************************************
/*STEP 4: MODEL AVERAGING*/
********************************************************************************
/*Note: We calculated AIC weights in excel. First, we average predictions from the original models.
Then, we create bootstrapped average predictions over 500 reps*/
use cohort, clear

/*Model 1 (previously model 10)*/
gen x1 = 8.6904772777+ (-1.9279903753*meancalcium) + (2.5550525586*meanalbumin) ///
    + (0.0004897288*meanvitamina) + ln(nTotalLstc2)
gen pcounts1 = 2.71828 ^ x1

/*Model 2 (previously model 21)*/
gen x2 =8.728250 +(-1.967850*meancalcium) +(2.700865*meanalbumin) ///
    + ln(nTotalLstc2)
gen pcounts2 = 2.71828 ^ x2

/*Model 3 (previously model 49)*/
gen x3 =10.42705235+(0.05374988*nAlbumin)+(-1.58046116*meancalcium) ///
    +(0.95431849*meanalbumin) + ln(nTotalLstc2)
gen pcounts3 = 2.71828 ^ x3

/*Model 4 (previously model 65)*/
gen x4=6.214543648+(-1.732357063*meancalcium)+(2.911558391*meanalbumin) ///
    +(-0.008020267*sdcholesterol) + ln(nTotalLstc2)
gen pcounts4= 2.71828 ^ x4

/*Model 5 (previously model 73)*/
gen x5=10.38245359+(-2.23037667*meancalcium)+(2.93251755*meanalbumin) ///
    +(0.02047484*meanbetacarotene)+ ln(nTotalLstc2)
gen pcounts5= 2.71828 ^ x5

/*Model 6 (previously model 77)*/
gen x6=8.15406993+(-1.73721781*meancalcium)+(-0.08821252*meanbhb) ///
    +(2.35374979*meanalbumin) + ln(nTotalLstc2)
gen pcounts6= 2.71828 ^ x6

/*Model 7 (previously model 86)*/
gen x7=7.563099410+(-0.003369577*sdBtot)+(-1.665350996*meancalcium) ///
    +(2.222500512*meanalbumin) + ln(nTotalLstc2)
gen pcounts7= 2.71828 ^ x7

/*Model 8 (previously model 93)*/
gen x8=9.9523877+(-2.0341568*meancalcium)+(2.6973942*meanalbumin) ///
    +(-0.1718619*medBCS) + ln(nTotalLstc2)
gen pcounts8= 2.71828 ^ x8

/*Model 9 (previously model 94)*/
gen x9= 8.41738410 +(-1.90582234*meancalcium)+(2.47619821*meanalbumin) ///
    + (0.09618206*nHeifer) + ln(nTotalLstc2)
gen pcounts9= 2.71828 ^ x9

/*Model 10 (previously model 96)*/
gen x10=8.072504672+(-1.772722458*meancalcium)+(2.352273218*meanalbumin) ///
    +(-0.001467857*sdvitamind) + ln(nTotalLstc2)
gen pcounts10= 2.71828 ^ x10

/*Averaging predicted incidence*/
gen avgpcount = (pcounts1 * 0.21) + (pcounts2 * 0.16) + (pcounts3 * 0.09) + (pcounts4 * 0.08) + ///
    (pcounts5 * 0.08) + (pcounts6 * 0.08) + (pcounts7 * 0.08) + (pcounts8 * 0.08) + ///
    (pcounts9 * 0.07) + (pcounts10 * 0.07)

/*Calculating NSES*/
egen zp = std(avgpcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(avgpcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2

/*Bootstrap estimates*/
set level 90
gen lnavgpcount = ln(avgpcount)
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnavgpcount
estat bootstrap, all

cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet lnavgpcount avgpcount nDzLstc2 using combined2.1.csv, comma replace
/* meanDS 1.473948 */

9_AIM2-2COHORT VARIABLES.DO

/*
Title: Create new variables for cohort analysis 2
Author: Lauren Wisnieski
Purpose: Create variables for the cohort level analysis number 2
(aggregating predicted probabilities from individual level models)
Revision history:
- Created Sept 27, 2018
- Revised October 4, 2018 (changed analysis to include models from all 3 constructs so they can be compared)
- Revised December 14, 2018 (changed cut-offs for proportion models so that they optimize sens and spec)
- Revised January 1, 2019 to use the averaged predictions from all 3 constructs

The purpose of this do file is to create new variables for the cohort level
analysis. E.g. mean pred prob, SD of pred prob, and number above threshold

Steps:
1. Re-run individual level models: best model from aim 1 in these categories
   - Nutrient metabolism
   - Inflammation
   - Oxidative stress
   - Combined
   and calculate summary measures:
   - mean
   - standard deviation
   - threshold (number above 50% risk by cohort)
2. Reduce dataset to one line per cow
3.
   Summary statistics by cohort
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\FinalDataset3_5June2018.dta", clear
save cohort2, replace

*********************************************************************************
/*Step 1: Re-run individual level models and calculate summary measures
*********************************************************************************/

/*NUTRIENT METABOLISM*/
***********************
/*Prepping macros to re-run models*/
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global cholesterol "cholesterol42 cholesterol43 cholesterol44"
global bcscat "bcscat2"
global SeasonXNefa "c.season2#c.lognefasq c.season3#c.lognefasq c.season4#c.lognefasq"
global SeasonXCalc "c.season2#c.calcium c.season3#c.calcium c.season4#c.calcium"
global SeasonXBCS "c.season2#c.bcscat2 c.season3#c.bcscat2 c.season4#c.bcscat2"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXBCS "c.LactCat2#c.bcscat2 c.LactCat3#c.bcscat2"
global LactXCalc "c.LactCat2#c.calcium c.LactCat3#c.calcium"
global LactXNefa "c.LactCat2#c.lognefasq c.LactCat3#c.lognefasq"
global SeasonXbhb "c.season2#c.bhb c.season3#c.bhb c.season4#c.bhb"
global LactXbhb "c.LactCat2#c.bhb c.LactCat3#c.bhb"
global LactXChol "c.LactCat2#c.cholesterol42 c.LactCat2#c.cholesterol43 c.LactCat2#c.cholesterol44 c.LactCat3#c.cholesterol42 c.LactCat3#c.cholesterol43 c.LactCat3#c.cholesterol44"
global CholXSeason "c.cholesterol42#c.season2 c.cholesterol42#c.season3 c.cholesterol42#c.season4 c.cholesterol43#c.season2 c.cholesterol43#c.season3 c.cholesterol43#c.season4 c.cholesterol44#c.season2 c.cholesterol44#c.season3 c.cholesterol44#c.season4"
global CholXBCS "c.cholesterol42#c.bcscat2 c.cholesterol43#c.bcscat2 c.cholesterol44#c.bcscat2"
global CalcXChol "c.calcium#c.cholesterol42 c.calcium#c.cholesterol43 c.calcium#c.cholesterol44"
global NefaXChol "c.lognefasq#c.cholesterol42 c.lognefasq#c.cholesterol43 c.lognefasq#c.cholesterol44"
global CholXbhb "c.cholesterol42#c.bhb c.cholesterol43#c.bhb c.cholesterol44#c.bhb"

/*Nutrient metabolism models*/
global model1 "$LactCat $season $bcscat calcium lognefasq $SeasonXNefa"
global model2 "$LactCat $season bcscat bhb calcium lognefasq $SeasonXNefa"
global model3 "$LactCat bhb calcium lognefasq"
global model4 "$LactCat $season calcium lognefasq $SeasonXNefa"
global model5 "$LactCat bcscat bhb calcium lognefasq c.bcscat2#c.lognefasq"
global model6 "$LactCat $bcscat bhb calcium"
global model7 "$LactCat $season bhb calcium lognefasq $SeasonXNefa"
global model8 "$LactCat bhb calcium"
global model9 "$LactCat calcium lognefasq"
global model10 "$LactCat $bcscat calcium lognefasq"
global model11 "$LactCat $season $cholesterol $bcscat calcium lognefasq $SeasonXNefa"
global model12 "$LactCat $bcscat calcium"
global model13 "$LactCat $season $bcscat lognefasq"
global model14 "$LactCat $season $bcscat calcium"
global model15 "$LactCat $season $bcscat bhb calcium"
global model16 "$LactCat calcium"
global model17 "$LactCat $season $cholesterol $bcscat bhb calcium lognefasq $SeasonXNefa"
global model18 "$LactCat $season $bcscat bhb lognefasq $SeasonXNefa"
global model19 "$LactCat $season $cholesterol calcium lognefasq $SeasonXNefa"

/*Calculating phat*/
/*Model 4*/
melogit AnyDzCalc $model4
predict phat
rename phat phat1
/*Model 1*/
melogit AnyDzCalc $model1
predict phat
rename phat phat2
/*Model 7*/
melogit AnyDzCalc $model7
predict phat
rename phat phat3
/*Model 2*/
melogit AnyDzCalc $model2
predict phat
rename phat phat4
/*Model 11*/
melogit AnyDzCalc $model11
predict phat
rename phat phat5

/*Averaged*/
gen avgphat= (0.36* phat1) + (.27 * phat2) + (.18 * phat3) + (.13 * phat4) + ///
    (.06 * phat5)
roctg AnyDzCalc avgphat , optimal smooth lowess co interval(0.01)
/*cut 0.335*/
egen NM_meanphat= mean(avgphat) , by(Cohort2)
egen NM_sdphat= sd(avgphat), by(Cohort2)
gen NM_phatcut= 0
replace NM_phatcut=1 if avgphat>0.335
replace NM_phatcut=. if avgphat==.
tab AnyDzCalc NM_phatcut, column
sort LstcId
egen NM_nPhat= total(NM_phatcut==1) if LstcId<300, by(Cohort2)
drop avgphat
drop phat1 phat2 phat3 phat4 phat5

/*INFLAMMATION*/
****************
/*Prepping macros to re-run models*/
global Mtot "MtotCat2 MtotCat3 MtotCat4 MtotCat5"
global saaugml "saaugml42 saaugml43 saaugml44"
global vitamind "vitamind52 vitamind53 vitamind54 vitamind55"
global albumin "albumin52 albumin53 albumin54 albumin55"
global haptoglobin "haptoglobin52 haptoglobin53 haptoglobin54 haptoglobin55"
global Ltot "Ltot52 Ltot53 Ltot54 Ltot55"
global Etot "Etot52 Etot53 Etot54 Etot55"
global Btot "BtotCat2 BtotCat3 BtotCat4 BtotCat5"
global BDtot "BDtotCat2 BDtotCat3 BDtotCat4 BDtotCat5"
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXMtot "c.LactCat2#c.MtotCat2 c.LactCat2#c.MtotCat3 c.LactCat2#c.MtotCat4 c.LactCat2#c.MtotCat5 c.LactCat3#c.MtotCat2 c.LactCat3#c.MtotCat3 c.LactCat3#c.MtotCat4 c.LactCat3#c.MtotCat5"
global LactXsaa "c.LactCat2#c.saaugml42 c.LactCat2#c.saaugml43 c.LactCat2#c.saaugml44 c.LactCat3#c.saaugml42 c.LactCat3#c.saaugml43 c.LactCat3#c.saaugml44"
global LactXalb "c.LactCat2#c.albumin52 c.LactCat2#c.albumin53 c.LactCat2#c.albumin54 c.LactCat2#c.albumin55 c.LactCat3#c.albumin52 c.LactCat3#c.albumin53 c.LactCat3#c.albumin54 c.LactCat3#c.albumin55"
global LactXhapto "c.LactCat2#c.haptoglobin52 c.LactCat2#c.haptoglobin53 c.LactCat2#c.haptoglobin54 c.LactCat2#c.haptoglobin55 c.LactCat3#c.haptoglobin52 c.LactCat3#c.haptoglobin53 c.LactCat3#c.haptoglobin54 c.LactCat3#c.haptoglobin55"
global LactXNL "c.LactCat2#c.NLratio c.LactCat3#c.NLratio"
global SeasonXMtot "c.season2#c.MtotCat2 c.season2#c.MtotCat3 c.season2#c.MtotCat4 c.season2#c.MtotCat5 c.season3#c.MtotCat2 c.season3#c.MtotCat3 c.season3#c.MtotCat4 c.season3#c.MtotCat5 c.season4#c.MtotCat2 c.season4#c.MtotCat3 c.season4#c.MtotCat4 c.season4#c.MtotCat5"
global SeasonXsaa "c.season2#c.saaugml42 c.season2#c.saaugml43 c.season2#c.saaugml44 c.season3#c.saaugml42 c.season3#c.saaugml43 c.season3#c.saaugml44 c.season4#c.saaugml42 c.season4#c.saaugml43 c.season4#c.saaugml44"
global SeasonXalb "c.season2#c.albumin52 c.season2#c.albumin53 c.season2#c.albumin54 c.season2#c.albumin55 c.season3#c.albumin52 c.season3#c.albumin53 c.season3#c.albumin54 c.season3#c.albumin55 c.season4#c.albumin52 c.season4#c.albumin53 c.season4#c.albumin54 c.season4#c.albumin55"
global SeasonXNL "c.season2#c.NLratio c.season3#c.NLratio c.season4#c.NLratio"
global MtotXsaa "c.MtotCat2#c.saaugml42 c.MtotCat2#c.saaugml43 c.MtotCat2#c.saaugml44 c.MtotCat3#c.saaugml42 c.MtotCat3#c.saaugml43 c.MtotCat3#c.saaugml44 c.MtotCat4#c.saaugml42 c.MtotCat4#c.saaugml43 c.MtotCat4#c.saaugml44 c.MtotCat5#c.saaugml42 c.MtotCat5#c.saaugml43 c.MtotCat5#c.saaugml44"
global MtotXalb "c.MtotCat2#c.albumin52 c.MtotCat2#c.albumin53 c.MtotCat2#c.albumin54 c.MtotCat2#c.albumin55 c.MtotCat3#c.albumin52 c.MtotCat3#c.albumin53 c.MtotCat3#c.albumin54 c.MtotCat3#c.albumin55 c.MtotCat4#c.albumin52 c.MtotCat4#c.albumin53 c.MtotCat4#c.albumin54 c.MtotCat4#c.albumin55 c.MtotCat5#c.albumin52 c.MtotCat5#c.albumin53 c.MtotCat5#c.albumin54 c.MtotCat5#c.albumin55"
global MtotXNL "c.MtotCat2#c.NLratio c.MtotCat3#c.NLratio c.MtotCat4#c.NLratio c.MtotCat5#c.NLratio"
global saaXalb "c.saaugml42#c.albumin52 c.saaugml42#c.albumin53 c.saaugml42#c.albumin54 c.saaugml42#c.albumin55 c.saaugml43#c.albumin52 c.saaugml43#c.albumin53 c.saaugml43#c.albumin54 c.saaugml43#c.albumin55 c.saaugml44#c.albumin52 c.saaugml44#c.albumin53 c.saaugml44#c.albumin54 c.saaugml44#c.albumin55"
global saaXNL "c.saaugml42#c.NLratio c.saaugml43#c.NLratio c.saaugml44#c.NLratio"
global albXNL "c.albumin52#c.NLratio c.albumin53#c.NLratio c.albumin54#c.NLratio c.albumin55#c.NLratio"
global LactXhap "c.LactCat2#c.haptoglobin52 c.LactCat2#c.haptoglobin53 c.LactCat2#c.haptoglobin54 c.LactCat2#c.haptoglobin55 c.LactCat3#c.haptoglobin52 c.LactCat3#c.haptoglobin53 c.LactCat3#c.haptoglobin54 c.LactCat3#c.haptoglobin55"
global SeasonXhap "c.season2#c.haptoglobin52 c.season2#c.haptoglobin53 c.season2#c.haptoglobin54 c.season2#c.haptoglobin55 c.season3#c.haptoglobin52 c.season3#c.haptoglobin53 c.season3#c.haptoglobin54 c.season3#c.haptoglobin55 c.season4#c.haptoglobin52 c.season4#c.haptoglobin53 c.season4#c.haptoglobin54 c.season4#c.haptoglobin55"
global MtotXhap "c.MtotCat2#c.haptoglobin52 c.MtotCat2#c.haptoglobin53 c.MtotCat2#c.haptoglobin54 c.MtotCat2#c.haptoglobin55 c.MtotCat3#c.haptoglobin52 c.MtotCat3#c.haptoglobin53 c.MtotCat3#c.haptoglobin54 c.MtotCat3#c.haptoglobin55 c.MtotCat4#c.haptoglobin52 c.MtotCat4#c.haptoglobin53 c.MtotCat4#c.haptoglobin54 c.MtotCat4#c.haptoglobin55 c.MtotCat5#c.haptoglobin52 c.MtotCat5#c.haptoglobin53 c.MtotCat5#c.haptoglobin54 c.MtotCat5#c.haptoglobin55"
global saaXhap "c.saaugml42#c.haptoglobin52 c.saaugml42#c.haptoglobin53 c.saaugml42#c.haptoglobin54 c.saaugml42#c.haptoglobin55 c.saaugml43#c.haptoglobin52 c.saaugml43#c.haptoglobin53 c.saaugml43#c.haptoglobin54 c.saaugml43#c.haptoglobin55 c.saaugml44#c.haptoglobin52 c.saaugml44#c.haptoglobin53 c.saaugml44#c.haptoglobin54 c.saaugml44#c.haptoglobin55"
global hapXNL "c.haptoglobin52#c.NLratio c.haptoglobin53#c.NLratio c.haptoglobin54#c.NLratio c.haptoglobin55#c.NLratio"
global albXhap "c.albumin52#c.haptoglobin52 c.albumin52#c.haptoglobin53 c.albumin52#c.haptoglobin54 c.albumin52#c.haptoglobin55 c.albumin53#c.haptoglobin52 c.albumin53#c.haptoglobin53 c.albumin53#c.haptoglobin54 c.albumin53#c.haptoglobin55 c.albumin54#c.haptoglobin52 c.albumin54#c.haptoglobin53 c.albumin54#c.haptoglobin54 c.albumin54#c.haptoglobin55 c.albumin55#c.haptoglobin52 c.albumin55#c.haptoglobin53 c.albumin55#c.haptoglobin54 c.albumin55#c.haptoglobin55"
global LactXNtot "c.LactCat2#c.Ntot c.LactCat2#c.Ntot c.LactCat3#c.Ntot c.LactCat3#c.Ntot"
global SeasonXNtot "c.season2#c.Ntot c.season3#c.Ntot c.season4#c.Ntot"
global MtotXNtot "c.MtotCat2#c.Ntot c.MtotCat3#c.Ntot c.MtotCat4#c.Ntot c.MtotCat5#c.Ntot"
global saaXNtot "c.saaugml42#c.Ntot c.saaugml43#c.Ntot c.saaugml44#c.Ntot"
global albXNtot "c.albumin52#c.Ntot c.albumin53#c.Ntot c.albumin54#c.Ntot c.albumin55#c.Ntot"
global LactXLtot "c.LactCat2#c.Ltot52 c.LactCat2#c.Ltot53 c.LactCat2#c.Ltot54 c.LactCat2#c.Ltot55 c.LactCat3#c.Ltot52 c.LactCat3#c.Ltot53 c.LactCat3#c.Ltot54 c.LactCat3#c.Ltot55"
global SeasonXLtot "c.season2#c.Ltot52 c.season2#c.Ltot53 c.season2#c.Ltot54 c.season2#c.Ltot55 c.season3#c.Ltot52 c.season3#c.Ltot53 c.season3#c.Ltot54 c.season3#c.Ltot55 c.season4#c.Ltot52 c.season4#c.Ltot53 c.season4#c.Ltot54 c.season4#c.Ltot55"
global MtotXLtot "c.MtotCat2#c.Ltot52 c.MtotCat2#c.Ltot53 c.MtotCat2#c.Ltot54 c.MtotCat2#c.Ltot55 c.MtotCat3#c.Ltot52 c.MtotCat3#c.Ltot53 c.MtotCat3#c.Ltot54 c.MtotCat3#c.Ltot55 c.MtotCat4#c.Ltot52 c.MtotCat4#c.Ltot53 c.MtotCat4#c.Ltot54 c.MtotCat4#c.Ltot55 c.MtotCat5#c.Ltot52 c.MtotCat5#c.Ltot53 c.MtotCat5#c.Ltot54 c.MtotCat5#c.Ltot55"
global saaXLtot "c.saaugml42#c.Ltot52 c.saaugml42#c.Ltot53 c.saaugml42#c.Ltot54 c.saaugml42#c.Ltot55 c.saaugml43#c.Ltot52 c.saaugml43#c.Ltot53 c.saaugml43#c.Ltot54 c.saaugml43#c.Ltot55 c.saaugml44#c.Ltot52 c.saaugml44#c.Ltot53 c.saaugml44#c.Ltot54 c.saaugml44#c.Ltot55"
global albXLtot "c.albumin52#c.Ltot52 c.albumin52#c.Ltot53 c.albumin52#c.Ltot54 c.albumin52#c.Ltot55 c.albumin53#c.Ltot52 c.albumin53#c.Ltot53 c.albumin53#c.Ltot54 c.albumin53#c.Ltot55 c.albumin54#c.Ltot52 c.albumin54#c.Ltot53 c.albumin54#c.Ltot54 c.albumin54#c.
Ltot55 c.albumin55#c.Ltot52 c.albumin55#c.Ltot53 c.albumin55#c.Ltot54 c.albumin55#c.Ltot55" /*Inflammation models*/ global model1 "$LactCa t $season $Mtot $saaugml $albumin NLratio $LactXMtot $saaXalb" global model12 "$LactCat $season $Mtot $saaugml $albumin Ntot $LactXMtot $LactXNtot $sa aXalb" global model14 "$LactCat $Mtot $saaugml $albumin NLratio $LactXMtot $saaXalb" global mod el25 "$LactCat $season $Mtot $saaugml $albumin $Ltot $LactXMtot $saaXalb" global model4 "$LactCat $se ason $Mtot $saaugml $albumin $LactXMtot $saaXalb" /*Calculating phat*/ /*Model 1*/ melogit AnyDzCalc $model1 || FarmId: predict phat rename phat phat1 /*Model 12*/ melogit AnyDzCalc $model12 || FarmId: predic t phat rename phat phat2 /*Model 14* / melogit AnyDzCalc $model14 || FarmId: || Cohort: predict phat 420 rename phat phat3 /*Model 25*/ melogit AnyDzCalc $model25 || FarmId: predict phat rename phat phat4 /*Mo del 4*/ melogit AnyDzCalc $model4 || FarmId: predict phat rename phat phat5 /*Averaged*/ gen avgphat= (0.30* phat1) + (.28* phat2) + (.27 * phat3) + (.09 * phat4) + /// (0.06 * phat5) roctab AnyDzCalc avgphat, detail /*Summary measures*/ roctg AnyDzCalc a vgphat , optimal smooth lowess co interval(0.01) /*cut 0.351*/ egen IF_meanphat = mean(avgphat) , by(Cohort2) egen IF_sdphat= sd(avgphat), by(Cohort2) gen IF_phatcut= 0 replace IF_pha tcut=1 if avgphat>0.351 replace IF_phatcut=. if avgphat==. 
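/* Illustrative sanity checks (a sketch, not part of the original analysis).
   Assumptions: the five coefficients used for avgphat above are model-averaging
   weights that are meant to sum to 1, and IF_phatcut should be missing exactly
   when avgphat is missing. The scalar name demo_avgphat and the five example
   probabilities are hypothetical and serve only to show the arithmetic. */
assert abs((0.30 + .28 + .27 + .09 + 0.06) - 1) < 1e-6
assert missing(IF_phatcut) == missing(avgphat)
/* Worked example: one hypothetical cow with phat1-phat5 = 0.40, 0.35, 0.30, 0.50, 0.20 */
scalar demo_avgphat = (0.30*0.40) + (.28*0.35) + (.27*0.30) + (.09*0.50) + (0.06*0.20)
display "hypothetical avgphat = " scalar(demo_avgphat)
display "flagged at the 0.351 cut? " (scalar(demo_avgphat) > 0.351)
scalar drop demo_avgphat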
tab AnyDzCalc IF_phatcut, column
sort LstcId
egen IF_nPhat= total(IF_phatcut==1) if LstcId<300, by(Cohort2)
drop avgphat
drop phat1 phat2 phat3 phat4 phat5

/*OXIDATIVE STRESS*/
********************
/*Prepping macros to re-run models*/
global season "season2 season3 season4"
global LactCat "LactCat2 LactCat3"
global aopugul "aopugul52 aopugul53 aopugul54 aopugul55"
global alphatocopherol "alphatocopherol52 alphatocopherol53 alphatocopherol54 alphatocopherol55"
global vitamina "vitamina42 vitamina43 vitamina44"
global LactXSeason "c.LactCat2#c.season2 c.LactCat2#c.season3 c.LactCat2#c.season4 c.LactCat3#c.season2 c.LactCat3#c.season3 c.LactCat3#c.season4"
global LactXAOP "c.LactCat2#c.aopugul52 c.LactCat2#c.aopugul53 c.LactCat2#c.aopugul54 c.LactCat2#c.aopugul55 c.LactCat3#c.aopugul52 c.LactCat3#c.aopugul53 c.LactCat3#c.aopugul54 c.LactCat3#c.aopugul55"
global LactXalpha "c.LactCat2#c.alphatocopherol52 c.LactCat2#c.alphatocopherol53 c.LactCat2#c.alphatocopherol54 c.LactCat2#c.alphatocopherol55 c.LactCat3#c.alphatocopherol52 c.LactCat3#c.alphatocopherol53 c.LactCat3#c.alphatocopherol54 c.LactCat3#c.alphatocopherol55"
global SeasonXAOP "c.season2#c.aopugul52 c.season2#c.aopugul53 c.season2#c.aopugul54 c.season2#c.aopugul55 c.season3#c.aopugul52 c.season3#c.aopugul53 c.season3#c.aopugul54 c.season3#c.aopugul55 c.season4#c.aopugul52 c.season4#c.aopugul53 c.season4#c.aopugul54 c.season4#c.aopugul55"
global SeasonXalpha "c.season2#c.alphatocopherol52 c.season2#c.alphatocopherol53 c.season2#c.alphatocopherol54 c.season2#c.alphatocopherol55 c.season3#c.alphatocopherol52 c.season3#c.alphatocopherol53 c.season3#c.alphatocopherol54 c.season3#c.alphatocopherol55 c.season4#c.alphatocopherol52 c.season4#c.alphatocopherol53 c.season4#c.alphatocopherol54 c.season4#c.alphatocopherol55"
global AOPXalpha "c.aopugul52#c.alphatocopherol52 c.aopugul52#c.alphatocopherol53 c.aopugul52#c.alphatocopherol54 c.aopugul52#c.alphatocopherol55 c.aopugul53#c.alphatocopherol52 c.aopugul53#c.alphatocopherol53 c.aopugul53#c.alphatocopherol54 c.aopugul53#c.alphatocopherol55 c.aopugul54#c.alphatocopherol52 c.aopugul54#c.alphatocopherol53 c.aopugul54#c.alphatocopherol54 c.aopugul54#c.alphatocopherol55 c.aopugul55#c.alphatocopherol52 c.aopugul55#c.alphatocopherol53 c.aopugul55#c.alphatocopherol54 c.aopugul55#c.alphatocopherol55"
global LactXvitamina "c.LactCat2#c.vitamina42 c.LactCat2#c.vitamina43 c.LactCat2#c.vitamina44 c.LactCat3#c.vitamina42 c.LactCat3#c.vitamina43 c.LactCat3#c.vitamina44"
global SeasonXvitamina "c.season2#c.vitamina42 c.season2#c.vitamina43 c.season2#c.vitamina44 c.season3#c.vitamina42 c.season3#c.vitamina43 c.season3#c.vitamina44 c.season4#c.vitamina42 c.season4#c.vitamina43 c.season4#c.vitamina44"
global AOPXvitamina "c.aopugul52#c.vitamina42 c.aopugul52#c.vitamina43 c.aopugul52#c.vitamina44 c.aopugul53#c.vitamina42 c.aopugul53#c.vitamina43 c.aopugul53#c.vitamina44 c.aopugul54#c.vitamina42 c.aopugul54#c.vitamina43 c.aopugul54#c.vitamina44 c.aopugul55#c.vitamina42 c.aopugul55#c.vitamina43 c.aopugul55#c.vitamina44"
global alphaXvitamina "c.alphatocopherol52#c.vitamina42 c.alphatocopherol52#c.vitamina43 c.alphatocopherol52#c.vitamina44 c.alphatocopherol53#c.vitamina42 c.alphatocopherol53#c.vitamina43 c.alphatocopherol53#c.vitamina44 c.alphatocopherol54#c.vitamina42 c.alphatocopherol54#c.vitamina43 c.alphatocopherol54#c.vitamina44 c.alphatocopherol55#c.vitamina42 c.alphatocopherol55#c.vitamina43 c.alphatocopherol55#c.vitamina44"

/*OX models*/
global model1 "$LactCat $season $aopugul $alphatocopherol"
global model5 "$LactCat $aopugul $alphatocopherol"
global model4 "$LactCat $aopugul $vitamina $alphatocopherol"
global model3 "$LactCat $season $aopugul $vitamina $alphatocopherol"
global model9 "$LactCat $aopugul"
global model2 "$LactCat $season $aopugul"

/*Averaging predictions from 7 models*/
/*Model 1*/
melogit AnyDzCalc $model1 || FarmId:
predict phat
rename phat phat1
/*Model 5*/
melogit AnyDzCalc $model5 || FarmId: || Cohort:
predict phat
rename phat phat2
/*Model 4*/
melogit AnyDzCalc $model4 || FarmId: || Cohort:
predict phat
rename phat phat3
/*Model 3*/
melogit AnyDzCalc $model3 || FarmId:
predict phat
rename phat phat4
/*Model 9*/
melogit AnyDzCalc $model9 || FarmId: || Cohort:
predict phat
rename phat phat5
/*Model 6 (model6 is not defined among the OX macros above)*/
melogit AnyDzCalc $model6 || FarmId: || Cohort:
predict phat
rename phat phat6
/*Model 2*/
melogit AnyDzCalc $model2 || FarmId:
predict phat
rename phat phat7

/*Averaged*/
gen avgphat= (0.30 * phat1) + (.27 * phat2) + (.12 * phat3) + (.12 * phat4) + ///
    (.07 * phat5) + (0.06 * phat6) + (0.06 * phat7)
roctab AnyDzCalc avgphat, detail
/* 0.7785 */

/*Summary measures*/
roctg AnyDzCalc avgphat , optimal smooth lowess co interval(0.01)
/*cut 0.331*/
egen OX_meanphat= mean(avgphat) , by(Cohort2)
egen OX_sdphat= sd(avgphat), by(Cohort2)
gen OX_phatcut= 0
replace OX_phatcut=1 if avgphat>0.331
replace OX_phatcut=. if avgphat==.
tab AnyDzCalc OX_phatcut, column
sort LstcId
egen OX_nPhat= total(OX_phatcut==1) if LstcId<300, by(Cohort2)
drop avgphat
drop phat1 phat2 phat3 phat4 phat5 phat6 phat7

/*Part 3: reduce dataset to one line per cohort*/
*******************************************************************************
drop if LstcId ==.
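/* Illustrative check (a sketch, not part of the original analysis).
   Assumption: because the next step keeps a single arbitrary row per cohort
   (duplicates drop with the force option), the cohort-level summaries should
   be identical on every row of a cohort. Only the IF_ and OX_ means and SDs
   created above are checked; the *_nPhat counts are defined only for
   LstcId<300 and are left out of this sketch. */
foreach v of varlist IF_meanphat IF_sdphat OX_meanphat OX_sdphat {
    bysort Cohort2 (`v'): assert `v'[1] == `v'[_N]
}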
sort Cohort2
duplicates drop Cohort2, force
save cohort2, replace

/*Part 4: Summary statistics by cohort*/
*******************************************************************************
use cohort2, clear
list NM_meanphat Cohort2
list IF_meanphat Cohort2
list OX_meanphat Cohort2
list NM_sdphat Cohort2
list IF_sdphat Cohort2
list OX_sdphat Cohort2
list NM_nPhat Cohort2
list IF_nPhat Cohort2
list OX_nPhat Cohort2
save "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohort2Data_11Jan2019.dta", replace
cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet using Cohort2Data_11Jan2019.csv , comma replace

M1_AIM2-2MEAN.DO

/*
Title: Cohort analysis Aim 2.2: Means
Author: Lauren Wisnieski
Purpose: Running mean analysis for Aim 2.2
Revision history:
- Created Oct 7, 2018
- Edited Jan 10, 2019: new methods based on edits from Aim 2.1.
The purpose of this do file is to calculate the DSS scores from the poisson models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1.
Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohort2Data_11Jan2019.dta", clear
save cohort2, replace

/*STEP 1: Calculate DSS scores*/
**********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.852 + (4.950*OX_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.830891 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.8317028 + (-0.1497891*NM_meanphat) + ///
    (5.0433391*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.824689 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.84588511 +(4.90102040*OX_meanphat)+(0.03082209*IF_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.828511 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.340996 +(3.470558*IF_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.752524 */

/*Model 5*/
/*
Calculate predicted incidence */
use cohort2, clear
gen x = -2.8153032 +(4.8638909*OX_meanphat)+(-0.2290109*NM_meanphat) ///
    + (0.2092780*IF_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.825164 */

/* None of these models passed the DSS test so analysis was stopped*/

M2_AIM2-2SD.DO

/*
Title: Cohort analysis 2 - Standard deviation
Author: Lauren Wisnieski
Purpose: Second cohort analysis - standard dev and MAD
Revision history:
- Created 10/7/2018
- Edited 1/10/2019
- Edited 2/18/2019 to change specification of observed versus (log(predicted)) analysis
The purpose of this do file is to calculate the DSS scores from the poisson models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1. Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohort2Data_11Jan2019.dta", clear
save cohort2, replace

/*STEP 1: Calculate DSS scores*/
**********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.473365 +(5.458310*IF_sdphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.497725 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.500246+(3.195651*IF_sdphat)+(4.030766 *OX_sdphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.418798 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.3056533+(5.4221930*IF_sdphat)+(-0.9959409*NM_sdphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.462138 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.340991 +(8.339305*OX_sdphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.312692 */

/*Model 5*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.455857+(4.398269*OX_sdphat)+(-1.191626*NM_sdphat) + ///
    (3.533677*IF_sdphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.434549 */

/*Model 6*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.23162691+(7.70323842*OX_sdphat)+(-0.07078798*NM_sdphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.247354 */

/*STEP 2: INTERNAL VALIDATION OF PREDICTED VERSUS OBSERVED COUNTS*/
**********************************************************************************
/*Calculating slope and intercept for predicted versus
observed counts using 500 bootstrapped samples for each candidate model*/
**********************************************************************************
set level 90

/*Model 1*/
/* Calculate predicted incidence */
use cohort2, clear
gen x1 = -2.473365 +(5.458310*IF_sdphat) + ln(nTotalLstc2)
gen pcount1= 2.71828 ^ x1
gen lnpcount1= ln(pcount1)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount1
estat bootstrap, all

/*Model 2*/
/* Calculate predicted incidence */
gen x2 = -2.500246+(3.195651*IF_sdphat)+(4.030766 *OX_sdphat) + ///
    ln(nTotalLstc2)
gen pcount2= 2.71828 ^ x2
gen lnpcount2= ln(pcount2)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount2
estat bootstrap, all

/*Model 3*/
/* Calculate predicted incidence */
gen x3 = -2.455857+(4.398269*OX_sdphat)+(-1.191626*NM_sdphat) + ///
    (3.533677*IF_sdphat) + ln(nTotalLstc2)
gen pcount3= 2.71828 ^ x3
gen lnpcount3= ln(pcount3)
/*Bootstrap estimates*/
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnpcount3
estat bootstrap, all

cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet lnpcount1 lnpcount2 lnpcount3 nDzLstc2 using sd2-2_othermodels.csv, comma replace

********************************************************************************
/*STEP 3: MODEL AVERAGING*/
********************************************************************************
/*Note: We calculated AIC weights in Excel. First, we average predictions from the original models. Then, we create bootstrapped average predictions over 500 reps*/
use cohort2, clear
/*Model 1*/
gen x1 = -2.473365 +(5.458310*IF_sdphat) + ln(nTotalLstc2)
gen pcounts1 = 2.71828 ^ x1
/*Model 2*/
gen x2 = -2.500246+(3.195651*IF_sdphat)+(4.030766 *OX_sdphat) + ///
    ln(nTotalLstc2)
gen pcounts2 = 2.71828 ^ x2
/*Model 3*/
gen x3 = -2.455857 +(4.398269*OX_sdphat)+(-1.191626*NM_sdphat) + ///
    (3.533677*IF_sdphat) + ln(nTotalLstc2)
gen pcounts3 = 2.71828 ^ x3

/*Averaging predicted incidence*/
gen avgpcount = (pcounts1 * 0.54) + (pcounts2 * 0.30) + (pcounts3 * 0.16)
/*Calculating NSES*/
egen zp = std(avgpcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(avgpcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)

/*Bootstrap estimates*/
set level 90
gen lnavgpcount = ln(avgpcount)
bootstrap _b, nodots nowarn reps(500) seed(1921): poisson nDzLstc2 lnavgpcount
estat bootstrap, all
cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
outsheet avgpcount lnavgpcount nDzLstc2 using sd2-2.csv, comma replace
/* meanDS 1.453511 */

M3_AIM2-2THRESHOLD.DO

/*
Title: Cohort analysis 3 thresholds (cut-offs)
Author: Lauren Wisnieski
Purpose: 3rd analysis - thresholds
Revision history:
- Created October 7, 2018
- Edited January 10, 2019
The purpose of this do file is to calculate the DSS scores from the poisson models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1.
Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohort2Data_11Jan2019.dta", clear
save cohort2, replace

/*STEP 1: Calculate DSS scores*/
**********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.9207053 + (0.1063769 *OX_nPhat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.797553 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.97553031+(0.05756583*IF_nPhat)+(0.06186857*OX_nPhat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.775105 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.85116602+(0.08791668*IF_nPhat)+(0.01923768*NM_nPhat) ///
    +ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.650381 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.8790656+(0.1122798*IF_nPhat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.679911 */

/*Model 5*/
/* Calculate
predicted incidence */
use cohort2, clear
gen x = -1.9815245012+(-0.0004473095*NM_nPhat)+(0.0627217921*OX_nPhat) ///
    +(0.0578709917*IF_nPhat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.780518 */

/*Model 6*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.85116602 +(0.01923768*NM_nPhat)+(0.08791668*IF_nPhat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.650381 */

/*None of these models passed the DSS test, so analysis was stopped */

M4_AIM2-2COMBINEDCOHORT.DO

/*
Title: Cohort analysis 4: combined model
Author: Lauren Wisnieski
Purpose: Last analysis - combined model that includes cohort means, proportions, and SDs.
Revision history:
- Created 8/1/2018
- Revised 1/14/2019
The purpose of this do file is to calculate the DSS scores from the poisson models selected in R, perform bootstrap validation, and model averaging
Steps:
Step 1.
Calculate DSS scores
Step 2: Internal validation
Step 3: Model averaging
*/
cd "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\"
use "C:\Users\wisni\Amazon Drive\DISSERTATION ANALYSIS\Data files\Cohort2Data_11Jan2019.dta", clear
save cohort2, replace

/*STEP 1: Calculate DSS scores*/
**********************************************************************************
/*Model 1*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.852 + (4.950*OX_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.830891 */

/*Model 2*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.700814 +(-1.366901*NM_sdphat)+(5.123677*OX_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.825412 */

/*Model 3*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.63054205+(0.02922754*OX_nPhat)+(3.68939505*OX_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.821394 */

/*Model 4*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.89224763+(-0.01143759*NM_nPhat)+(5.30726132*OX_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.828062 */

/*Model 5*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.8317028+(-0.1497891*NM_meanphat)+(5.0433391*OX_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.824689 */

/*Model 6*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.818829 +(0.008064*IF_nPhat)+(4.694738*OX_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.832164 */

/*Model 7*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.8136116+(-0.3894284*OX_sdphat)+(5.0078190*OX_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82832 */

/*Model 8*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.81387083+(-0.09385246*IF_sdphat)+(4.91351604*OX_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.81843 */

/*Model 9*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.84588533 +(0.03082059*IF_meanphat)+(4.90102251*OX_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.828511 */

/*Model 10*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.9207053+(0.1063769*OX_nPhat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.797553 */

/*Model 11*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.19292822+(0.06170802*OX_nPhat) +(1.72991547*IF_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.79437 */

/*Model 12*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.97553031+(0.06186857*OX_nPhat)+(0.05756583*IF_nPhat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.775105 */

/*Model 13*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.340996 +(3.470558*IF_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.752524 */

/*Model 14*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.34867918+(0.09034685*OX_nPhat)+(3.67809655*OX_sdphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.755436 */

/*Model 15*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.23080710+(0.07558337*OX_nPhat)+(1.55946052*NM_meanphat) + ///
    ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.773813 */

/*Model 16*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.710531172+(5.337453148*OX_meanphat)+(-0.008643339*NM_nPhat) ///
    + (-1.359981448*NM_sdphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.812524 */

/*Model 17*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.46875543+(-1.08566582*NM_sdphat)+(0.03529301*IF_nPhat) ///
    +(3.66025994*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.789173 */

/*Model 18*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.7280182+(-1.5439073*NM_sdphat)+(0.6416666*IF_sdphat)+ ///
    (4.8212135*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort
/* meanDS 1.813426 */

/*Model 19*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.541944+(-1.358931*NM_sdphat)+(1.098287*IF_meanphat) ///
    +(3.572221*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.792189 */

/*Model 20*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.7471760+(-1.5535462*NM_sdphat)+(0.7377391*OX_sdphat) ///
    +(5.0204870*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.825473 */

/*Model 21*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.69243260+(-1.31979462*NM_sdphat)+(0.09540742*NM_meanphat) ///
    + (4.98724974*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.817826 */

/*Model 22*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.5991424+(0.0196100*OX_nPhat)+(-0.9538241*NM_sdphat) ///
    +(4.2289364*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.822164 */

/*Model 23*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.93268008+(0.02142914*NM_nPhat)+(0.08750988*OX_nPhat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.766073 */

/*Model 24*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.059893+(1.107662*NM_sdphat)+(0.102459*OX_nPhat)+ ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.778107 */

/*Model 25*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.12969975+(1.34731288*IF_sdphat)+(0.08966902*OX_nPhat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.771477 */

/*Model 26*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.693861557+(-0.007812509*NM_nPhat)+(0.025628273*OX_nPhat) ///
    +(4.112374990*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.824158 */

/*Model 27*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.82442+(-0.01694*NM_nPhat)+(0.02116*IF_nPhat)+(4.81269*OX_meanphat) ///
    + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.831611 */

/*Model 28*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -1.8790656+(0.1122798*IF_nPhat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.679911 */

/*Model 29*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.94879851+(-0.02402178*NM_nPhat)+(1.12810885*NM_meanphat) ///
    + (4.61805213*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.804568 */

/*Model 30*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.53815096+(0.01970270*IF_nPhat)+(0.03065491*OX_nPhat) ///
    +(3.00543520*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.826477 */

/*Model 31*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.83111050+(-0.01005255*NM_nPhat)+(-0.31961945*IF_sdphat) ///
    +(5.33883646*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.819311 */

/*Model 32*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.60661608+(0.03147812*OX_nPhat)+(0.04362134*IF_sdphat) ///
    + (3.54340038*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.81601 */

/*Model 33*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.59455398+(0.03100909*OX_nPhat)+(0.51440698*IF_meanphat) ///
    +(3.02612520*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82595 */

/*Model 34*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.63199232+(0.03001526*OX_nPhat)+(0.15148573*NM_meanphat) ///
    +(3.52505661*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.821115 */

/*Model 35*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.63284711+(0.03638734*OX_nPhat)+(0.95581871*OX_sdphat) ///
    +(3.13584049*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.807015 */

/*Model 36*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.846978343+(-0.009948533*NM_nPhat)+(-0.367524245*OX_sdphat) ///
    +(5.306399476*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82468 */

/*Model 37*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.8714051+(-0.0125913*NM_nPhat)+(0.2991981*IF_meanphat) ///
    + (4.9710527*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.825353 */

/*Model 38*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.73233277+(0.02217123*IF_nPhat)+(-0.17435177*IF_meanphat) ///
    +(4.35310402*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.818745 */

/*Model 39*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.78162+(0.01233*IF_nPhat)+(-0.38648*NM_meanphat) ///
    +(4.89079*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.832779 */

/*Model 40*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.8073791+(-0.1978345*IF_sdphat)+(-0.1339275*NM_meanphat) ///
    +(5.1016875*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82366 */

/*Model 41*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.785468+(0.007838*IF_nPhat)+(-0.483175*OX_sdphat) + ///
    (4.810368*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.83543 */

/*Model 42*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.70162967+(0.02431173*IF_nPhat)+(0.19291849*IF_sdphat) + ///
    (3.91298801*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.805775 */

/*Model 43*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.80869456+(-0.23436760*OX_sdphat)+(-0.02459518*NM_meanphat) ///
    +(4.95520312*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82083 */

/*Model 44*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.80210016+(0.10589921*IF_sdphat)+(0.01738465*OX_sdphat) ///
    +(4.73518262*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.802541 */

/*Model 45*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.8153039+(-0.2290131*NM_meanphat)+(0.2092739*IF_meanphat) ///
    +(4.8638990*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.825164 */

/*Model 46*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.7979596+(-0.4711631*OX_sdphat)+(0.1607806*IF_meanphat) ///
    +(4.8353512*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82915 */

/*Model 47*/
/* Calculate predicted incidence */
use cohort2, clear
gen x = -2.7926773+(-0.1619405*IF_sdphat)+(0.2857293*IF_meanphat) ///
    +(4.6121392*OX_meanphat) + ln(nTotalLstc2)
gen pcount= 2.71828 ^ x
/*Calculating NSES*/
egen zp = std(pcount)
gen NSES = zp ^2
egen meanNSES = mean(NSES)
/*Dawid and Sebastiani*/
egen sdpcounts = sd(pcount)
gen DS = NSES + (2 * (log10(sdpcounts)))
egen meanDS = mean(DS)
sort Cohort2
/* meanDS 1.82102 */

/*All models have significant DSS values, so analysis stopped at this point*/

APPENDIX E: R code

BESTSUBSETS.R

#Title: Best subsets selection Aim 1
#Author: Lauren Wisnieski
#Purpose: Run AIC Best subsets selection for Aim 1 (individual level models)
#Revision history: Created June 5, 2018

data <- read.csv("Data_5June2018.csv")

#Loading packages
library(rJava)
library(glmulti)

#Make categorical variables categorical (vs integer, etc)
data$LactCat <- as.factor(data$LactCat)
data$season <- as.factor(data$season)
data$BtotCat <- as.factor(data$BtotCat)
data$BDtotCat <- as.factor(data$BDtotCat)
data$MtotCat <- as.factor(data$MtotCat)
data$saaugml4 <- as.factor(data$saaugml4)
data$vitamind5 <- as.factor(data$vitamind5)
data$albumin5 <- as.factor(data$albumin5)
data$haptoglobin5 <- as.factor(data$haptoglobin5)
data$Ltot5 <- as.factor(data$Ltot5)
data$Etot5 <- as.factor(data$Etot5)
data$cholesterol4 <- as.factor(data$cholesterol4)
data$bcscat <- as.factor(data$bcscat)
data$aopugul5 <- as.factor(data$aopugul5)
data$vitamina4 <- as.factor(data$vitamina4)
data$ros5 <- as.factor(data$ros5)
data$betacarotene5 <- as.factor(data$betacarotene5)
data$alphatocopherol5 <- as.factor(data$alphatocopherol5)

#Best subsets for lipid mobilization (nutrient metabolism)
logisticLM.out <- glmulti(AnyDzCalc ~ LactCat + season + bhb + calcium +
    lognefasq + cholesterol4 + bcscat,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(logisticLM.out, icvalues)

logisticLM.out <- glmulti(AnyDzCalc ~ LactCat + season + bhb + calcium +
    lognefasq + cholesterol4 + bcscat,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize= 19, # sets number of models to 19
  family = binomial) # binomial family for logistic regression

# Reporting out candidate models
logisticLM.out

#Best subsets selection for Inflammation model
logisticIF.out <- glmulti(AnyDzCalc ~ LactCat + season + MtotCat + NLratio +
    Ntot + saaugml4 + vitamind5 + albumin5 + haptoglobin5 + Ltot5 + Etot5 +
    BtotCat + BDtotCat,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # AIC as criteria
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(logisticIF.out, icvalues)

logisticIF.out <- glmulti(AnyDzCalc ~ LactCat + season + MtotCat + NLratio +
    Ntot + saaugml4 + vitamind5 + albumin5 + haptoglobin5 + Ltot5 + Etot5 +
    BtotCat + BDtotCat,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # AIC as criteria
  plotty = T, report = T, # plots and interim reports
  confsetsize= 25, # sets number of models to 25
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

# Reporting out candidate models
logisticIF.out

#Best subsets selection for Oxidative stress
logisticOX.out <- glmulti(AnyDzCalc ~ LactCat + season + aopugul5 + vitamina4 +
    alphatocopherol5 + ros5 + betacarotene5,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # AIC as criteria
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(logisticOX.out, icvalues)

logisticOX.out <- glmulti(AnyDzCalc ~ LactCat + season + aopugul5 + vitamina4 +
    alphatocopherol5 + ros5 + betacarotene5,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # AIC as criteria
  confsetsize= 11, # sets number of models to 11
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

# Reporting out candidate models
logisticOX.out

#Best subsets selection for all 3
logisticALL.out <- glmulti(AnyDzCalc ~ LactCat + season + calcium + lognefasq +
    MtotCat + saaugml4 + albumin5 + NLratio + aopugul5 + alphatocopherol5 +
    bcscat2 + bhb + Ntot + vitamina4,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # AIC as criteria
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(logisticALL.out, icvalues)

logisticALL.out <- glmulti(AnyDzCalc ~ LactCat + season + calcium + lognefasq +
    MtotCat + saaugml4 + albumin5 + NLratio + aopugul5 + alphatocopherol5 +
    bcscat2 + bhb + Ntot + vitamina4,
  data = data,
  level = 1, # No interactions considered, too many variables
  method = "h", # exhaustive
  crit = "aic", # AIC as criteria
  confsetsize= 11, # sets number of models to 54
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  family = binomial) # binomial family for logistic regression

# Reporting out candidate models
logisticALL.out

BESTSUBSETS_AIM2-1_MEAN.R

#Title: Best subsets selection Aim 2-1: Mean/central method
#Author: Lauren Wisnieski
#Purpose: Run AIC Best subsets selection for Aim 2-1 central aggregation method
#Revision history: Created January 1, 2019

data <- read.csv("Cohortdata_23July2018.csv")
summary(data)

#Loading packages
library(rJava)
library(glmulti)
library(recoder)

summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
nDzLstc2 = data$nDzLstc2
Dz = as.numeric(nDzLstc2)

mean.out <- glmulti(Dz ~ meanbhb + meannefa + meancalcium + meancholesterol +
    medBCS + meanNtot + meanLtot + meanalbumin + meanhaptoglobin +
    meanvitamind + meanMtot + meanBtot + meanBDtot + meanEtot + meansaa +
    meanaopugul + meanalphatocopherol + meanbetacarotene + meanros +
    meanvitamina + season,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered (main effects only)
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 100, #setting number of models
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(mean.out, icvalues)

mean.out <- glmulti(Dz ~ meanbhb + meannefa + meancalcium + meancholesterol +
    medBCS + meanNtot + meanLtot + meanalbumin + meanhaptoglobin +
    meanvitamind + meanMtot + meanBtot + meanBDtot + meanEtot + meansaa +
    meanaopugul + meanalphatocopherol + meanbetacarotene + meanros +
    meanvitamina + season,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered (main effects only)
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 22, #setting number of models to 20
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

# Reporting out candidate models
mean.out

BESTSUBSETSAIM2-1_SD.R

#Title: Best subsets selection Aim 2-1: Dispersion method
#Author: Lauren Wisnieski
#Purpose: Run AIC Best subsets selection for Aim 2-1 dispersion aggregation method
#Revision history: Created January 2, 2019

data <- read.csv("Cohortdata_28December2018.csv")
summary(data)

#Loading packages
library(rJava)
library(glmulti)
library(recoder)

summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
nDzLstc2 = data$nDzLstc2
Dz = as.numeric(nDzLstc2)
data$season <- as.factor(data$season)

sd.out <- glmulti(Dz ~ sdbhb + sdnefa + sdcalcium + sdcholesterol + manBCS +
    sdNtot + sdLtot + sdalbumin + sdhaptoglobin + sdvitamind + sdMtot +
    sdBtot + sdBDtot + sdEtot + sdsaaugml + sdaopugul + sdalphatocopherol +
    sdbetacarotene + sdros + sdvitamina + season,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered (main effects only)
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 100, #setting number of models
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(sd.out, icvalues)

sd.out <- glmulti(Dz ~ sdbhb + sdnefa + sdcalcium + sdcholesterol + manBCS +
    sdNtot + sdLtot + sdalbumin + sdhaptoglobin + sdvitamind + sdMtot +
    sdBtot + sdBDtot + sdEtot + sdsaaugml + sdaopugul + sdalphatocopherol +
    sdbetacarotene + sdros + sdvitamina + season,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered (main effects only)
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 26, #setting number of models to 18
  maxsize = 3,
  #setting max number of variables
  family = poisson) # poisson family

# Reporting out candidate models
sd.out

BESTSUBSETSAIM2-1_COUNT.R

#Title: Best subsets selection Aim 2-1: Count method
#Author: Lauren Wisnieski
#Purpose: Run AIC Best subsets selection for Aim 2-1 count aggregation method
#Revision history: Created January 2, 2019

data <- read.csv("Cohortdata_28December2018.csv")
summary(data)

#Loading packages
library(rJava)
library(glmulti)
library(recoder)

summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
nDzLstc2 = data$nDzLstc2
Dz = as.numeric(nDzLstc2)
data$season <- as.factor(data$season)

prop.out <- glmulti(Dz ~ nBhb + nNefa + nCalcium + nCholesterol + nHeifer +
    nNtot + nLtot + nAlbumin + nHaptoglobin + nVitamind + nMtot + nBtot +
    nBDtot + nEtot + nSaaugml + nAopugul + nAlphatocopherol + nBetacarotene +
    nRos + nVitamina + season + nBCS,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 100, #setting number of models
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(prop.out, icvalues)

prop.out <- glmulti(Dz ~ nBhb + nNefa + nCalcium + nCholesterol + nHeifer +
    nNtot + nLtot + nAlbumin + nHaptoglobin + nVitamind + nMtot + nBtot +
    nBDtot + nEtot + nSaaugml + nAopugul + nAlphatocopherol + nBetacarotene +
    nRos + nVitamina + season + nBCS,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 25, #setting number of models to 24
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

# Reporting out candidate models
prop.out

BESTSUBSETSAIM2-1_COMBINED.R

#Title: Best subsets selection Aim 2-1: combined method
#Author: Lauren Wisnieski
#Purpose: Run AIC Best subsets selection for Aim 2-1 combined aggregation method
#Revision history: Created December 30, 2018, Revised January 6, 2019

setwd("C:/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files")
data <- read.csv("Cohortdata_28December2018.csv")
summary(data)

#Loading packages
library(rJava)
library(glmulti)
library(recoder)

summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
nDzLstc2 = data$nDzLstc2
Dz = as.numeric(nDzLstc2)
data$season <- as.factor(data$season)

comb.out <- glmulti(Dz ~ sdBtot + nAlbumin + meancalcium + sdbhb + meanBtot +
    meanbhb + nNefa + meanLtot + manBCS + sdLtot + sdcalcium + sdBDtot +
    meanalbumin + nVitamind + sdvitamind + nBhb + meanvitamina +
    sdalphatocopherol + meanaopugul + medBCS + nBCS + sdEtot + nHeifer +
    nCholesterol + meanbetacarotene + nCalcium + nAopugul + meanMtot +
    sdcholesterol + sdNtot,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 200, #setting number of models
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(comb.out, icvalues)

comb.out <- glmulti(Dz ~ sdBtot + nAlbumin + meancalcium + sdbhb + meanBtot +
    meanbhb + nNefa + meanLtot + manBCS + sdLtot + sdcalcium + sdBDtot +
    meanalbumin + nVitamind + sdvitamind + nBhb + meanvitamina +
    sdalphatocopherol + meanaopugul + medBCS + nBCS + sdEtot + nHeifer +
    nCholesterol + meanbetacarotene + nCalcium +
    nAopugul + meanMtot + sdcholesterol + sdNtot,
  offset= log(Lstc),
  data = data,
  level = 1, # interactions not considered
  method = "h", # exhaustive
  crit = "aic", # use AIC for selection
  intercept = T, # include intercept
  plotty = T, report = T, # plots and interim reports
  fitfunction = "glm", # glm function
  confsetsize = 19, #setting number of models to 101
  maxsize = 3, #setting max number of variables
  family = poisson) # poisson family

# Reporting out candidate models
comb.out

MODELDIAGNOSTICSAIM2-1_MEAN.R

#Title: Model diagnostics for Aim 2-1: mean method
#Author: Lauren Wisnieski
#Purpose: Run model diagnostics for all Aim 2-1: mean aggregation method models
#Revision history: Created January 1, 2019

setwd("C:/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files")
data <- read.csv("Cohortdata_28December2018.csv")

#Model Diagnostics
#Loading packages
library(rJava)
library(recoder)
library(penalized)
library(AER)

#Model 1
glm2 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
glm2
dispersiontest(glm2) #p value = 1.0
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE) #[1] 0.89

#Model 2
glm3 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meancholesterol + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm3) #p value = 0.9996
pchisq(glm3$deviance, df=glm3$df.residual, lower.tail=FALSE) #[1] 0.87

#Model 3
glm4 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm4) #p value = 0.9586
pchisq(glm4$deviance, df=glm4$df.residual, lower.tail=FALSE) #[1] 0.7794443

#Model 4
glm6 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanros + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm6) #p value = 0.9979
pchisq(glm6$deviance, df=glm6$df.residual, lower.tail=FALSE) #[1] 0.7947994

#Model 5
glm6 <- glm(nDzLstc2 ~ meancalcium + meanalbumin +
    meanNtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm6) #p value = 0.9979
pchisq(glm6$deviance, df=glm6$df.residual, lower.tail=FALSE) #[1] 0.79

#Model 6
glm6 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meansaa + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm6) #p value = 0.98
pchisq(glm6$deviance, df=glm6$df.residual, lower.tail=FALSE) #[1] 0.79

#Model 7
glm7 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanBDtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm7) #p value = 0.9945
pchisq(glm7$deviance, df=glm7$df.residual, lower.tail=FALSE) #[1] 0.7752764

#Model 8
glm8 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanvitamind + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm8) #p value = 0.996
pchisq(glm8$deviance, df=glm8$df.residual, lower.tail=FALSE) #[1] 0.77

#Model 9
glm9 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanbetacarotene + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm9) #p value = 0.97
pchisq(glm9$deviance, df=glm9$df.residual, lower.tail=FALSE) #[1] 0.77

#Model 10
glm10 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanbhb + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm10) #p value = 0.995
pchisq(glm10$deviance, df=glm10$df.residual, lower.tail=FALSE) #[1] 0.77

#Model 11
glm11 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + medBCS + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm11) #p value = 0.993
pchisq(glm11$deviance, df=glm11$df.residual, lower.tail=FALSE) #[1] 0.76

#Model 12
glm12 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanMtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm12) #p value = 0.9783
pchisq(glm12$deviance, df=glm12$df.residual, lower.tail=FALSE) #[1] 0.7332396

#Model 13
glm13 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanEtot +
    offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm13) #p value = 0.975
pchisq(glm13$deviance, df=glm13$df.residual, lower.tail=FALSE) #[1] 0.73

#Model 14
glm14 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanBtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm14) #p value = 0.9583
pchisq(glm14$deviance, df=glm14$df.residual, lower.tail=FALSE) #[1] 0.7286367

#Model 15
glm15 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanLtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm15) #p value = 0.96
pchisq(glm15$deviance, df=glm15$df.residual, lower.tail=FALSE) #[1] 0.72

#Model 16
glm16 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanhaptoglobin + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm16) #p value = 0.97
pchisq(glm16$deviance, df=glm16$df.residual, lower.tail=FALSE) #[1] 0.72

#Model 17
glm17 <- glm(nDzLstc2 ~ meancalcium + meannefa + meanalbumin + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm17) #p value = 0.9599
pchisq(glm17$deviance, df=glm17$df.residual, lower.tail=FALSE) #[1] 0.7178091

#Model 18
glm18 <- glm(nDzLstc2 ~ meancalcium + meanaopugul + meanalbumin + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm18) #p value = 0.96
pchisq(glm18$deviance, df=glm18$df.residual, lower.tail=FALSE) #[1] 0.72

#Model 19
glm19 <- glm(nDzLstc2 ~ meancalcium + meanalphatocopherol + meanalbumin + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm19) #p value = 0.96
pchisq(glm19$deviance, df=glm19$df.residual, lower.tail=FALSE) #[1] 0.72

#Model 20
glm20 <- glm(nDzLstc2 ~ meanbhb + meanLtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm20) #p value = 0.87
pchisq(glm20$deviance, df=glm20$df.residual, lower.tail=FALSE) #[1] 0.54

MODELDIAGNOSTICSAIM2-1_SD.R

#Title: Model diagnostics for Aim 2-1: dispersion method
#Author: Lauren Wisnieski
#Purpose: Run model diagnostics for all Aim 2-1: dispersion aggregation method models
#Revision history: Created January 2, 2019

data <- read.csv("Cohortdata_28December2018.csv")

#Model Diagnostics
#Loading packages
library(rJava)
library(recoder)
library(penalized)
library(AER)

#Model 1
glm2 <- glm(nDzLstc2 ~ sdbhb + sdLtot + sdalbumin + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm2) #p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE) #[1] 0.9337767

#Model 2
glm3 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm3) #p value = 0.9988
pchisq(glm3$deviance, df=glm3$df.residual, lower.tail=FALSE) #[1] 0.8360108

#Model 3
glm4 <- glm(nDzLstc2 ~ sdbhb + sdcalcium + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm4) #p value = 0.9995
pchisq(glm4$deviance, df=glm4$df.residual, lower.tail=FALSE) #[1] 0.815

#Model 4
glm5 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm5) #p value = 0.999
pchisq(glm5$deviance, df=glm5$df.residual, lower.tail=FALSE) #[1] 0.79

#Model 5
glm6 <- glm(nDzLstc2 ~ sdLtot + sdvitamind + sdBDtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm6) #p value = 0.997
pchisq(glm6$deviance, df=glm6$df.residual, lower.tail=FALSE) #[1] 0.78

#Model 6
glm1 <- glm(nDzLstc2 ~ sdbhb + sdalbumin + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78

#Model 7
glm7 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdalphatocopherol + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm7) #p value = 0.98
pchisq(glm7$deviance, df=glm7$df.residual, lower.tail=FALSE) #[1] 0.78

#Model 8
glm8 <- glm(nDzLstc2 ~ sdvitamind + sdBtot + sdBDtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm8) #p value = 0.996
pchisq(glm8$deviance, df=glm8$df.residual, lower.tail=FALSE) #[1] 0.77

#Model 9
glm9 <- glm(nDzLstc2 ~ sdLtot + sdbhb + sdBDtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm9) #p value = 0.9992
pchisq(glm9$deviance, df=glm9$df.residual, lower.tail=FALSE) #[1] 0.77

#Model 10
glm10 <- glm(nDzLstc2 ~ sdbhb + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm10) #p value = 0.96
pchisq(glm10$deviance, df=glm10$df.residual, lower.tail=FALSE) #[1] 0.68

#Model 11
glm11 <- glm(nDzLstc2 ~ sdcalcium + sdBDtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm11) #p value = 0.99
pchisq(glm11$deviance, df=glm11$df.residual, lower.tail=FALSE) #[1] 0.76

#Model 12
glm12 <- glm(nDzLstc2 ~ sdLtot + sdBtot + sdBDtot + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm12) #p value = 0.996
pchisq(glm12$deviance, df=glm12$df.residual, lower.tail=FALSE) #[1] 0.74

#Model 13
glm13 <- glm(nDzLstc2 ~ sdbhb + sdalphatocopherol + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm13) #p value = 0.98
pchisq(glm13$deviance, df=glm13$df.residual, lower.tail=FALSE) #[1] 0.73

#Model 14
glm14 <- glm(nDzLstc2 ~ sdcalcium + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm14) #p value = 0.95
pchisq(glm14$deviance, df=glm14$df.residual, lower.tail=FALSE) #[1] 0.64

#Model 15
glm15 <- glm(nDzLstc2 ~ sdbhb + sdaopugul + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm15) #p value = 0.99
pchisq(glm15$deviance, df=glm15$df.residual, lower.tail=FALSE) #[1] 0.72

#Model 16
glm16 <- glm(nDzLstc2 ~ sdvitamind + sdBDtot + sdalphatocopherol + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm16) #p value = 0.98
pchisq(glm16$deviance, df=glm16$df.residual, lower.tail=FALSE) #[1] 0.72

#Model 17
glm17 <- glm(nDzLstc2 ~ sdbhb + sdalbumin + sdalphatocopherol + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm17) #p value = 0.99
pchisq(glm17$deviance, df=glm17$df.residual, lower.tail=FALSE) #[1] 0.56

#Model 18
glm18 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + sdalphatocopherol + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm18) #p value = 0.99
pchisq(glm18$deviance, df=glm18$df.residual, lower.tail=FALSE) #[1] 0.71

#Model 19
glm19 <- glm(nDzLstc2 ~ sdbhb + sdEtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm19) #p value = 0.98
pchisq(glm19$deviance, df=glm19$df.residual, lower.tail=FALSE) #[1] 0.71

#Model 20
glm20 <- glm(nDzLstc2 ~ sdBDtot + sdalphatocopherol + sdvitamina + offset(log(Lstc)),
    family = poisson(link=log), data = data)
dispersiontest(glm20) #p value = 0.99
pchisq(glm20$deviance, df=glm20$df.residual, lower.tail=FALSE) #[1] 0.71
#Model 21
glm21 <- glm(nDzLstc2 ~ sdBDtot + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm21) #p value = 0.92
pchisq(glm21$deviance, df=glm21$df.residual, lower.tail=FALSE) #[1] 0.62
#Model 22
glm22 <- glm(nDzLstc2 ~ sdbhb + sdNtot + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm22) #p value = 0.99
pchisq(glm22$deviance, df=glm22$df.residual, lower.tail=FALSE) #[1] 0.70
#Model 23
glm23 <- glm(nDzLstc2 ~ sdcalcium + sdBtot + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm23) #p value = 0.99
pchisq(glm23$deviance, df=glm23$df.residual, lower.tail=FALSE) #[1] 0.70
#Model 24
glm24 <- glm(nDzLstc2 ~ sdLtot + sdalbumin + sdbetacarotene + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm24) #p value = 0.99
pchisq(glm24$deviance, df=glm24$df.residual, lower.tail=FALSE) #[1] 0.69
#Model 25
glm25 <- glm(nDzLstc2 ~ sdnefa + sdcalcium + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm25) #p value = 0.99
pchisq(glm25$deviance, df=glm25$df.residual, lower.tail=FALSE) #[1] 0.69

MODELDIAGNOSTICSAIM2-1_COUNT.R
#Title: Model diagnostics for Aim 2-1: count method
#Author: Lauren Wisnieski
#Purpose: Run model diagnostics for all Aim 2-1: count aggregation method models
#Revision history: Created January 2, 2019
#Model Diagnostics
data <- read.csv("Cohortdata_28December2018.csv")
#Loading packages
library(rJava)
library(recoder)
library(penalized)
library(AER)
#Model 1
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.64
#Model 2
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nHeifer +
offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.71
#Model 3
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.668
#Model 4
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nAlphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.62
#Model 5
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nMtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.61
#Model 6
glm1 <- glm(nDzLstc2 ~ nCalcium + nBtot + nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.60
#Model 7
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nCholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.60
#Model 8
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nAopugul + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.60
#Model 9
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nNtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.59
#Model 10
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nVitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.59
#Model 11
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBetacarotene + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.97
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.58
#Model 12
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nRos + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.57
#Model 13
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.97
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.57
#Model 14
glm1 <- glm(nDzLstc2 ~ nCalcium + nHaptoglobin + nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.57
#Model 15
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBhb + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.97
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.57
#Model 16
glm1 <- glm(nDzLstc2 ~ nVitamind + nCalcium + nAlbumin +
offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.97
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.56
#Model 17
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nSaaugml + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.56
#Model 18
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nEtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.56
#Model 19
glm1 <- glm(nDzLstc2 ~ nCalcium + nLtot + nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.56
#Model 20
glm1 <- glm(nDzLstc2 ~ nCalcium + nNefa + nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.56
#Model 21
glm1 <- glm(nDzLstc2 ~ nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.58
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.564

MODELDIAGNOSTICSAIM2-1_COMBINED.R
#Title: Model diagnostics for Aim 2-1: combined method
#Author: Lauren Wisnieski
#Purpose: Run model diagnostics for all Aim 2-1: combined aggregation method models
#Revision history: Created January 3, 2019
setwd("C:/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files")
data <- read.csv("Cohortdata_28December2018.csv")
#Model Diagnostics
#Loading packages
library(rJava)
library(recoder)
library(penalized)
library(AER)
#Model 1
glm1 <- glm(nDzLstc2 ~ meanbhb + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance,
df=glm1$df.residual, lower.tail=FALSE) #[1] 0.96
#Model 2
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.94
#Model 3
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.94
#Model 4
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.93
#Model 5
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBhb + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.92
#Model 6
glm1 <- glm(nDzLstc2 ~ meanBtot + sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.92
#Model 7
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBDtot + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.90
#Model 8
glm1 <- glm(nDzLstc2 ~ meancalcium + nAlbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.82
#Model 9
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + sdNtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.89
#Model 10
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.89
#Model 11
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.82
#Model 12
glm1 <- glm(nDzLstc2 ~ sdbhb + nNEFA + sdvitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.82
#Model 13
glm1 <- glm(nDzLstc2 ~ meancalcium + sdcalcium + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9996
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.88
#Model 14
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9991
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.87
#Model 15
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + meanMtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.87
#Model 16
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.87
#Model 17
glm1 <- glm(nDzLstc2 ~ sdcalcium + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.87
#Model 18
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nHeifer + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.87
#Model 19
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdcalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.87
#Model 20
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.86
#Model 21
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.96
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 22
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nVitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.85
#Model 23
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.85
#Model 24
glm1 <- glm(nDzLstc2 ~ sdBDtot + nVitamind + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.85
#Model 25
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + meanbetacarotene + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.84
#Model 26
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + sdalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.84
#Model 27
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanaopugul + medBCS + offset(log(Lstc)),
family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.84
#Model 28
glm1 <- glm(nDzLstc2 ~ meanBtot + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.84
#Model 29
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nCholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.83
#Model 30
glm1 <- glm(nDzLstc2 ~ meanbhb + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.83
#Model 31
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdvitamind + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9994
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.82
#Model 32
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nAopugul + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 1
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.82
#Model 33
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + sdcholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.81
#Model 34
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nCholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9993
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.81
#Model 35
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.81
#Model 36
glm1 <- glm(nDzLstc2 ~ meancalcium + nAlbumin + medBCs + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.81
#Model 37
glm1 <- glm(nDzLstc2 ~ meancalcium + sdbhb + nNefa + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9995
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.81
#Model 38
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdvitamind + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.81
#Model 39
glm1 <- glm(nDzLstc2 ~ meanBtot + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.996
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 40
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nBhb + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9993
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 41
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanaopugul + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 42
glm1 <- glm(nDzLstc2 ~ meancalcium + nNefa + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 43
glm1 <- glm(nDzLstc2 ~ sdbhb + nAlbumin + meancalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 44
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdalphatocopherol + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 45
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9992
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.80
#Model 46
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanbetacarotene + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.79
#Model 47
glm1 <- glm(nDzLstc2 ~ meancalcium + nAlbumin + nVitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.79
#Model 48
glm1 <- glm(nDzLstc2 ~ meancalcium + sdbhb + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.79
#Model 49
glm1 <- glm(nDzLstc2 ~ nAlbumin + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.79
#Model 50
glm1 <- glm(nDzLstc2 ~ meancalcium + nAlbumin + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.79
#Model 51
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nNefa + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9993
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 52
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + manBCS +
offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 53
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nCalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 54
glm1 <- glm(nDzLstc2 ~ nAlbumin + nCalcium + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 55
glm1 <- glm(nDzLstc2 ~ sdLtot + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 56
glm1 <- glm(nDzLstc2 ~ sdLtot + sdBDtot + sdvitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 57
glm1 <- glm(nDzLstc2 ~ meancalcium + nAlbumin + meanbhb + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 58
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + nAopugul + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.995
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 59
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 60
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + meanMtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 61
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nHeifer + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.78
#Model 62
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nBhb + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 63
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdvitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.996
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 64
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdLtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 65
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanMtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 66
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + sdcholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 67
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdcholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 68
glm1 <- glm(nDzLstc2 ~ sdalphatocopherol + nAlbumin + meancalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 69
glm1 <- glm(nDzLstc2 ~ sdbhb + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9992
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 70
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nAopugul + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 71
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdNtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 72
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdEtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 73
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanvitamina + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 74
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanbetacarotene + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.97
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 75
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanLtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 76
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 77
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nCalcium +
offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 78
glm1 <- glm(nDzLstc2 ~ meancalcium + meanbhb + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.995
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 79
glm1 <- glm(nDzLstc2 ~ meanbhb + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 80
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanaopugul + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 81
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdvitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 82
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanBtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 83
glm1 <- glm(nDzLstc2 ~ sdBtot + nAlbumin + meancalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 84
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + meanbetacarotene + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 85
glm1 <- glm(nDzLstc2 ~ manBCS + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.77
#Model 86
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + sdLtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 87
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.98
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.70
#Model 88
glm1 <- glm(nDzLstc2 ~ sdBtot + meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.97
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 89
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nCholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.998
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 90
glm1 <- glm(nDzLstc2 ~ nNefa + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 91
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdEtot + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.999
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 92
glm1 <- glm(nDzLstc2 ~ sdLtot + meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.95
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 93
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9992
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 94
glm1 <- glm(nDzLstc2 ~ meanLtot + sdBDtot + medBCs + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.9992
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 95
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + medBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.76
#Model 96
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nHeifer + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.75
#Model 97
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + meancalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.995
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.75
#Model 98
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + sdvitamind + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.99
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.75
#Model 99
glm1 <- glm(nDzLstc2 ~ sdLtot + sdBDtot + meancalcium + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.997
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.75
#Model 100
glm1 <- glm(nDzLstc2 ~ sdBtot + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.996
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.74
#Model 101
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm1) #p value = 0.91
pchisq(glm1$deviance, df=glm1$df.residual, lower.tail=FALSE) #[1] 0.66

SHRINKAGEAIM2-1MEAN.R
#Title: Shrinkage for Aim 2-1: mean method
#Author: Lauren Wisnieski
#Purpose: Run shrinkage for all Aim 2-1: mean aggregation method models
#Revision history: Created January 1, 2019
setwd("C:/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files")
data <- read.csv("Cohortdata_28December2018.csv")
#Shrinkage
#Loading packages
library(rJava)
library(glmulti)
library(recoder)
library(penalized)
library(AER)
#setting up variables
summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
summary(data$Lstc)
nDzLstc2 = data$nDzLstc2
meancalcium = data$meancalcium
meanalbumin = data$meanalbumin
meancholesterol = data$meancholesterol
meanNtot = data$meanNtot
meanLtot = data$meanLtot
meanBDtot = data$meanBDtot
meanBtot = data$meanBtot
meanvitamind = data$meanvitamind
meannefa = data$meannefa
meanros = data$meanros
meanaopugul = data$meanaopugul
meanMtot = data$meanMtot
meansaa = data$meansaa
meanbhb = data$meanbhb
medBCS = data$medBCS
meanalphatocopherol = data$meanalphatocopherol
meanbetacarotene = data$meanbetacarotene
meanhaptoglobin = data$meanhaptoglobin
meanEtot = data$meanEtot
meanvitamina = data$vitamina
season = data$season
season2 = data$season2
season3 = data$season3
season4 = data$season4
data$season <- as.factor(data$season)
#running model 1
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
print(pred <- glm1$fitted)
print(summary(glm1))
#finding optimum tuning
#parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.82159
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 10.82159, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 2
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meancholesterol + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meancholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.504303
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meancholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 4.5043038, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meancholesterol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 3
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.520924
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.520924, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 4
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanros + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanros, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.168624
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanros, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.168624, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanros, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 5
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanNtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.906699
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 4.906699, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1,
    nDzLstc2 ~ meancalcium + meanalbumin + meanNtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 6
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meansaa + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meansaa, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 12.07264
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meansaa, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 12.07264, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meansaa, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 7
glm1 <- glm(nDzLstc2 ~ meancalcium + meanBDtot + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 7
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanBDtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.739466
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanBDtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 4.739466, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanBDtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 8
glm1 <- glm(nDzLstc2 ~ meancalcium + meanvitamind + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 8
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanvitamind + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.164469
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanvitamind + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.164469, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanvitamind + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 9
glm1 <- glm(nDzLstc2 ~ meancalcium + meanbetacarotene + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 9
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanbetacarotene + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.489907
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanbetacarotene + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.489907, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanbetacarotene + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 10
glm1 <- glm(nDzLstc2 ~ meancalcium + meanbhb + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 10
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE,
    trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.050423
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.050423, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 11
glm1 <- glm(nDzLstc2 ~ meancalcium + medBCS + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 11
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + medBCS + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.262127
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + medBCS + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.262127, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + medBCS + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 12
glm1 <- glm(nDzLstc2 ~ meancalcium + meanMtot + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 12
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanMtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.50431
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanMtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 13.50431, standardize = TRUE, model = c("poisson"),
    trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanMtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 13
glm1 <- glm(nDzLstc2 ~ meancalcium + meanEtot + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 13
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanEtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.756526
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanEtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.756526, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanEtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 14
glm1 <- glm(nDzLstc2 ~ meancalcium + meanBtot + meanalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 14
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanBtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.04912
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanBtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 10.04912, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanBtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 15
glm1 <- glm(nDzLstc2 ~ meancalcium +
    meanalbumin + meanLtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 15
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.343901
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.343901, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanLtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 16
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanhaptoglobin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 16
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanhaptoglobin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.208455
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanhaptoglobin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.208455, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanhaptoglobin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 17
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meannefa + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 17
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium
    + meanalbumin + meannefa, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.9132
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meannefa, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 13.9132, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meannefa, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 18
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanaopugul + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 18
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanaopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.13528
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanaopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.13528, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanaopugul, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 19
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 19
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.918986
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.918986, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanalphatocopherol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 20
glm1 <- glm(nDzLstc2 ~ meanbhb + meanLtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 20
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanbhb + meanLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 14.21892
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meanbhb + meanLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 14.21892, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meanbhb + meanLtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

SHRINKAGEAIM2-1_SD.R

#Title: Shrinkage for Aim 2-1: dispersion method
#Author: Lauren Wisnieski
#Purpose: Run shrinkage for all Aim 2-1: dispersion aggregation method models
#Revision history: Created January 2, 2019
#Shrinkage
data <- read.csv("Cohortdata_28December2018.csv")
#Loading packages
library(rJava)
library(glmulti)
library(recoder)
library(penalized)
library(AER)
#setting up variables
summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
summary(data$Lstc)
nDzLstc2 = data$nDzLstc2
sdbhb = data$sdbhb
sdBtot = data$sdBtot
sdNtot = data$sdNtot
sdBDtot = data$sdBDtot
sdEtot = data$sdEtot
sdvitamina = data$sdvitamina
sdbetacarotene = data$sdbetacarotene
sdalphatocopherol = data$sdalphatocopherol
sdvitamind = data$sdvitamind
sdnefa = data$sdnefa
sdcalcium = data$sdcalcium
sdaopugul = data$sdaopugul
sdLtot = data$sdLtot
sdalbumin = data$sdalbumin
season = data$season
season2 = data$season2
season3 = data$season3
season4 = data$season4
data$season <- as.factor(data$season)
#running model 1
glm1 <- glm(nDzLstc2 ~ sdbhb + sdLtot + sdalbumin + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdLtot + sdalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, maxiter = 20, epsilon = 1e-10, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 2
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 3
glm1 <- glm(nDzLstc2 ~ sdbhb + sdcalcium + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.090781
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.090781, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 4
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.283249
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 2.283249, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdBDtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 5
glm1 <- glm(nDzLstc2 ~ sdLtot + sdvitamind + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdLtot + sdvitamind + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 14.85961
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdLtot + sdvitamind + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 14.85961, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdLtot + sdvitamind + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data
)
print(predictions)
#running model 6
glm1 <- glm(nDzLstc2 ~ sdbhb + sdalbumin + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdalbumin + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 7
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 7
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 8
glm1 <- glm(nDzLstc2 ~ sdvitamind + sdBtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 8
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdvitamind + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.403216
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdvitamind + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 1.403216, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdvitamind + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 9
glm1 <- glm(nDzLstc2 ~ sdbhb + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 9
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdLtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.1325674
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdLtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.1325674, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdLtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 10
glm1 <- glm(nDzLstc2 ~ sdbhb + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 10
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.826263
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.826263, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 11
glm1 <- glm(nDzLstc2 ~ sdcalcium + sdBDtot + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 11
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdcalcium + sdBDtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 12
glm1 <- glm(nDzLstc2 ~ sdLtot + sdBtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 12
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdLtot + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.806575
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdLtot + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.806575, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdLtot + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 13
glm1 <- glm(nDzLstc2 ~ sdbhb + sdalphatocopherol + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 13
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdalphatocopherol + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.59451
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdalphatocopherol + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 10.59451, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdalphatocopherol + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 14
glm1 <- glm(nDzLstc2 ~ sdcalcium + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 14
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.663417
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 3.663417, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 15
glm1 <- glm(nDzLstc2 ~ sdbhb + sdaopugul + sdvitamina + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 15
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdaopugul + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.149691
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdaopugul + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.149691, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdaopugul + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 16
glm1 <- glm(nDzLstc2 ~ sdvitamind + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link=log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 16
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdvitamind + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.505468
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdvitamind + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.505468, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdvitamind + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 17
glm1 <- glm(nDzLstc2 ~ sdbhb + sdalbumin + sdalphatocopherol + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 17
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdalbumin + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 18
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + sdalphatocopherol + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 18
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.16614
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 3.16614, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 19
glm1 <- glm(nDzLstc2 ~ sdbhb + sdEtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 19
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdEtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 18.25305
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb +
    sdEtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 18.25305, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdEtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 20
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdalphatocopherol + sdvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 20
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdalphatocopherol + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.259704
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + sdalphatocopherol + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 3.259704, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + sdalphatocopherol + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 21
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 21
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 22
glm1 <- glm(nDzLstc2 ~ sdbhb + sdNtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 22
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdNtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"),
    standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.365956
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdNtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.365956, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdNtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 23
glm1 <- glm(nDzLstc2 ~ sdcalcium + sdBtot + sdvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 23
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdcalcium + sdBtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 12.41436
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdcalcium + sdBtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 12.41436, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdcalcium + sdBtot + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 24
glm1 <- glm(nDzLstc2 ~ sdLtot + sdalbumin + sdbetacarotene + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 24
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdLtot + sdalbumin + sdbetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.935499
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdLtot + sdalbumin + sdbetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.935499, standardize = TRUE, model = c("poisson"),
    trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdLtot + sdalbumin + sdbetacarotene, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 25
glm1 <- glm(nDzLstc2 ~ sdnefa + sdcalcium + sdvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 25
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdnefa + sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.162763
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdnefa + sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.162763, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdnefa + sdcalcium + sdvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

SHRINKAGEAIM2-1_COUNT.R

#Title: Shrinkage for Aim 2-1: count method
#Author: Lauren Wisnieski
#Purpose: Run shrinkage for all Aim 2-1: count aggregation method models
#Revision history: Created January 2, 2019

#Shrinkage
data <- read.csv("Cohortdata_28December2018.csv")

#Loading packages
library(rJava)
library(glmulti)
library(recoder)
library(penalized)
library(AER)

#setting up variables
summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
summary(data$Lstc)
nDzLstc2 = data$nDzLstc2
nBhb = data$nBhb
nRos = data$nRos
nBtot = data$nBtot
nMtot = data$nMtot
nNtot = data$nNtot
nEtot = data$nEtot
nBDtot = data$nBDtot
nVitamina = data$nVitamina
nCholesterol = data$nCholesterol
nHeifer = data$nHeifer
nBetacarotene = data$nBetacarotene
nAlphatocopherol = data$nAlphatocopherol
nSaaugml = data$nSaaugml
nVitamind = data$nVitamind
nHaptoglobin = data$nHaptoglobin
nAopugul = data$nAopugul
nNefa = data$nNefa
nCalcium = data$nCalcium
nLtot = data$nLtot
nBCS = data$nBCS
nAlbumin = data$nAlbumin
season = data$season
season2 = data$season2
season3 = data$season3
season4 = data$season4
data$season <- as.factor(data$season)

#running model 1
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.493025
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.493025, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 2
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nHeifer + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nHeifer, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.741377
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nHeifer, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.741377, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nHeifer, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 3
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBDtot + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.270649
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.270649, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 4
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nAlphatocopherol + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nAlphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.306079
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nAlphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 4.306079, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nAlphatocopherol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 5
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nMtot + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nMtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.02885
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nMtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 13.02885, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nMtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 6
glm1 <- glm(nDzLstc2 ~ nCalcium + nBtot + nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nBtot + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.98708
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nBtot + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 11.98708, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nBtot + nAlbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 7
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nCholesterol + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 7
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nCholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.709319
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nCholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.709319, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <-
    predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 8
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nAopugul + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 8
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nAopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.138597
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nAopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.138597, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nAopugul, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 9
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nNtot + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 9
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.801626
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 4.801626, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nNtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 10
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nVitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 10
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nVitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.974438
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nVitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.974438, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nVitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 11
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBetacarotene + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 11
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.835122
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.835122, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nBetacarotene, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 12
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nRos + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 12
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nRos, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.395966
#Running shrinkage procedure
pen1 <-
    penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nRos, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.395966, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nRos, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 13
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBCS + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 13
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.472062
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.472062, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 14
glm1 <- glm(nDzLstc2 ~ nCalcium + nHaptoglobin + nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 14
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nHaptoglobin + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.216163
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nHaptoglobin + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.216163, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nHaptoglobin + nAlbumin,
    nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 15
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nBhb + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 15
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBhb, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.635287
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nBhb, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.635287, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nBhb, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 16
glm1 <- glm(nDzLstc2 ~ nVitamind + nCalcium + nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 16
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nVitamind + nCalcium + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 18.85139
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nVitamind + nCalcium + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 18.85139, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nVitamind + nCalcium + nAlbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 17
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nSaaugml + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 17
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nSaaugml, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.437797
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nSaaugml, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 10.43779, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)

#running model 18
glm1 <- glm(nDzLstc2 ~ nCalcium + nAlbumin + nEtot + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 18
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nEtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.500918
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nAlbumin + nEtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.500918, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nAlbumin + nEtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 19
glm1 <- glm(nDzLstc2 ~ nCalcium + nLtot + nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 19
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nLtot + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.909317
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nLtot + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.909317, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nLtot + nAlbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 20
glm1 <- glm(nDzLstc2 ~ nCalcium + nNefa + nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 20
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nCalcium + nNefa + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 12.83052
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nCalcium + nNefa + nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 12.83052, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nCalcium + nNefa + nAlbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 21
glm1 <- glm(nDzLstc2 ~ nAlbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 21
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 16.46679
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 16.46679, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

SHRINKAGEAIM2-1COMBINED.R

#Title: Shrinkage for Aim 2-1: combined method
#Author: Lauren Wisnieski
#Purpose: Run shrinkage for all Aim 2-1: combined aggregation method models
#Revision history: Created January 3, 2019

setwd("C:/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files")
data <- read.csv("Cohortdata_28December2018.csv")

#Shrinkage
#Loading packages
library(rJava)
library(glmulti)
library(recoder)
library(penalized)
library(AER)

#setting up variables
summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
summary(data$Lstc)
nDzLstc2 = data$nDzLstc2
sdBDtot = data$sdBDtot
sdalbumin = data$sdalbumin
sdBtot = data$sdBtot
sdbhb = data$sdbhb
sdLtot = data$sdLtot
sdMtot = data$sdMtot
sdvitamind = data$sdvitamind
sdnefa = data$sdnefa
sdcalcium = data$calcium
sdcholesterol = data$sdcholesterol
sdalphatocopherol = data$alphatocopherol
nCalcium = data$nCalcium
nBCS = data$nBCS
nAlbumin = data$nAlbumin
nLtot = data$nLtot
nBtot = data$nBtot
nAopugul = data$nAopugul
nHeifer = data$nHeifer
meanBtot = data$meanBtot
meanLtot = data$meanLtot
meanMtot = data$meanMtot
meanalbumin = data$meanalbumin
meanaopugul = data$meanaopugul
meanbetacarotene = data$meanbetacarotene
manBCS = data$manBCS

#running model 1
glm1 <- glm(nDzLstc2 ~ meanbhb + sdBDtot + medBCS + offset(log(Lstc)),
    family = poisson(link = log), data = data)
glm1
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanbhb + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 2
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdBtot + nBCS + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdBtot + nBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 3
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + medBCS + offset(log(Lstc)),
    family = poisson(link = log),
    data = data)
pred <- glm1$fitted.values
print(pred)
glm1
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.2115466
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.2115466, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdbhb + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 4
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdBDtot + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 5
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBhb + medBCS + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + nBhb + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 6
glm1 <- glm(nDzLstc2 ~ meanBtot + sdBDtot + nBCS + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanBtot + sdBDtot + nBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 7
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBDtot + meanalbumin + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 7
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdBDtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.448059
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + sdBDtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 1.448059, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + sdBDtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 8
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 8
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.561205
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.561205, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 9
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + sdNtot + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 9
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot
    + medBCS + sdNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.3684528
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + sdNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.3684528, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS + sdNtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 10
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 10
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.82159
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 10.82159, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 11
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + offset(log(Lstc)),
    family = poisson(link = log), data = data)
pred <- glm1$fitted.values
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 11
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.3071843
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.3071843, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 12
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + sdvitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 12
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + nNefa + sdvitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 13
glm1 <- glm(nDzLstc2 ~ meancalcium + sdcalcium + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 13
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdcalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.613414
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + sdcalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 7.613414, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + sdcalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 14
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 14
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.337392
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.337392, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 15
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + meanMtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 15
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + meanMtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.628802
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + meanMtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.628802, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS + meanMtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 16
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 16
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.179659
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 2.179659, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1,
    nDzLstc2 ~ sdBtot + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 17
glm1 <- glm(nDzLstc2 ~ sdcalcium + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 17
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdcalcium + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.121215
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdcalcium + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.121215, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdcalcium + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 18
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nHeifer + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 18
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + nHeifer, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 19
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdcalcium + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 19
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdcalcium, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.21812
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdcalcium, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 10.21812,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdcalcium, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 20
glm1 <- glm(nDzLstc2 ~ meancalcium + nAlbumin + meanvitamina + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 20
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + nAlbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.77968
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + nAlbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 11.77968, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + nAlbumin + meanvitamina, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 21
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 21
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.52088
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 6.52088, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 22
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nVitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 22
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nVitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.393372
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nVitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 4.393372, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + nVitamind, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 23
glm1 <- glm(nDzLstc2 ~ sdbhb + sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 23
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + sdBDtot + nBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 24
glm1 <- glm(nDzLstc2 ~ sdBDtot + nVitamind + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 24
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + nVitamind + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 25
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + meanbetacarotene + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 25
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS +
    meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.8888548
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.8888548, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 26
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + sdalphatocopherol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 26
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + nNefa + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 27
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanaopugul + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 27
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + meanaopugul + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.593047
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + meanaopugul + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.593047, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + meanaopugul + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 28
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanBtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 28
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + meanBtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.769778
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + meanBtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.769778, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + meanBtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 29
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nCholesterol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 29
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.308399
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 3.308399, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 30
glm1 <- glm(nDzLstc2 ~ meanbhb + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 30
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanbhb + sdLtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE,
    trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.4958992
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meanbhb + sdLtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.4958992, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meanbhb + sdLtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 31
glm1 <- glm(nDzLstc2 ~ sdvitamind + nBCS + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 31
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdvitamind + nBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.3971716
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdvitamind + nBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.3971716, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdvitamind + nBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 32
glm1 <- glm(nDzLstc2 ~ nAopugul + medBCS + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 32
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAopugul + medBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.9367798
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAopugul + medBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.9367798, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAopugul + medBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 33
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + sdcholesterol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 33
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 34
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nCholesterol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 34
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.371335
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 4.371335, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nCholesterol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 35
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 35
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdBDtot + nBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 36
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + medBCS + offset(log(Lstc)), family =
    poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 36
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 37
glm1 <- glm(nDzLstc2 ~ meancalcium + sdbhb + nNefa + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 37
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdbhb + nNefa, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 38
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdvitamind + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 38
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdvitamind + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.292217
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + sdvitamind + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.292217, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + sdvitamind + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 39 - DELETED
glm1 <- glm(nDzLstc2 ~ meanBtot + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 39 (deleted)
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanBtot + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc))
    , lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was infinity; deleted this model
#running model 39
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nBhb + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 39
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nBhb, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.155821
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nBhb, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 6.155821, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + nBhb, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 40
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanaopugul + nBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 40
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + meanaopugul + nBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.1790327
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + meanaopugul + nBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.1790327, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + meanaopugul + nBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 41
glm1 <- glm(nDzLstc2 ~ meancalcium + nNefa + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 41
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + nNefa + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.75006
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + nNefa + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 13.75006, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + nNefa + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 42
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdbhb + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 42
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdbhb, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.59812
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdbhb, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 10.59812, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdbhb, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 43
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdalphatocopherol + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 43
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdalphatocopherol + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.884414
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + sdalphatocopherol + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.884414, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + sdalphatocopherol + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 44
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 44
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + nBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.626849
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + nBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 3.626849, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS + nBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 45
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanbetacarotene + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 45
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.344746
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 6.344746, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 46
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nVitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 46
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nVitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.501784
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nVitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 8.501784, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nVitamind, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 47
glm1 <- glm(nDzLstc2 ~ meancalcium + sdbhb + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 47
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.02679
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + sdbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 11.02679, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + sdbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 48
glm1 <- glm(nDzLstc2 ~ nAlbumin + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 48
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.881502
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 2.881502, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 49
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 49
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 15.95994
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 15.95994, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 50
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nNefa + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 50
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nNefa, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.87486
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nNefa, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 13.87486, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nNefa, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 51
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + manBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 51
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + manBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.93479
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + manBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 10.93479, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + manBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 52
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nCalcium + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 52
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + nCalcium, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.249282
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + nCalcium, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 2.249282, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS + nCalcium, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 53
glm1 <- glm(nDzLstc2 ~ nAlbumin + sdBDtot + nCalcium + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 53
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + sdBDtot + nCalcium, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0
#running model 54
glm1 <- glm(nDzLstc2 ~ sdBDtot + sdLtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 54
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdLtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.097135
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + sdLtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 4.097135, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + sdLtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)
#running model 55
glm1 <- glm(nDzLstc2 ~ sdLtot + sdBDtot + sdvitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- fitted(glm1)
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 55
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdLtot + sdBDtot + sdvitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 14.85782
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdLtot + sdBDtot + sdvitamind, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    lambda2 = 14.85782, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdLtot + sdBDtot + sdvitamind, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 56
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanbhb + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 56
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanbhb, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.349923
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanbhb, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.349923, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanbhb, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 57
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + nAopugul + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 57
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + nBCS + nAopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.585974
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + nBCS + nAopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 2.585974, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + nBCS + nAopugul, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 58
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot +
    sdalphatocopherol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 58
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 59
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + meanMtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 59
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + nBCS + meanMtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 60
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nHeifer + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 60
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nHeifer, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.71645
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nHeifer, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 11.71645, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nHeifer, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 61
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nBhb + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 61
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nBhb,
    nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.411867
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nBhb, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.411867, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nBhb, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 62
glm1 <- glm(nDzLstc2 ~ sdBtot + sdBDtot + sdvitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 62
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + sdvitamind, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.402907
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBtot + sdBDtot + sdvitamind, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 1.402907, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBtot + sdBDtot + sdvitamind, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 63
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdLtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 63
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 15.52409
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 15.52409,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdLtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 64
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanMtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 64
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanMtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 12.67577
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanMtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 12.67577, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanMtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 65
glm1 <- glm(nDzLstc2 ~ nAopugul + medBCS + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 65
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAopugul + medBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.9367798
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAopugul + medBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.9367798, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAopugul + medBCS + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 65
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin +
    sdcholesterol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 65
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.031552
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.031552, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 66
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdcholesterol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 66
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 17.46998
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 17.46998, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdcholesterol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 67
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdalphatocopherol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 67
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin +
    meancalcium + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.661129
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 6.661129, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 68
glm1 <- glm(nDzLstc2 ~ sdbhb + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 68
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + sdbhb + sdLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.1318306
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + sdbhb + sdLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.1318306, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + sdbhb + sdLtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 69
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nAopugul + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 69
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nAopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.912563
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nAopugul,
    nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 8.912563, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nAopugul, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 70
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdNtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 70
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.54331
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdNtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 11.54331, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdNtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 71
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdEtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 71
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdEtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.984584
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdEtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.984584, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdEtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running
# model 72
glm1 <- glm(nDzLstc2 ~ sdBDtot + meanvitamina + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 72
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + meanvitamina + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.795779
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + meanvitamina + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 3.795779, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + meanvitamina + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 73
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + meanbetacarotene + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 73
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.489887
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 5.489887, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 74
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanLtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 74
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.193122
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.193122, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanLtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 75
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 75
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.01385
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 10.01385, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 76
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + nCalcium + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 76
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + nCalcium, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.839487
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium +
    nCalcium, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.839487, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + nCalcium, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 77
glm1 <- glm(nDzLstc2 ~ meancalcium + meanbhb + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 77
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.050423
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.050423, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanbhb + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model (model removed - lambda went to infinity)
glm1 <- glm(nDzLstc2 ~ meanbhb + sdBDtot + sdalphatocopherol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 78
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanbhb + sdBDtot + sdalphatocopherol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found goes to infinity - model removed

#running model 78
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanaopugul + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 78
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin +
    meancalcium + meanaopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.647628
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanaopugul, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.647628, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanaopugul, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 79
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + sdvitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 79
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdvitamind, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 15.36287
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + sdvitamind, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 15.36287, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + sdvitamind, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 80
glm1 <- glm(nDzLstc2 ~ nAlbumin + meancalcium + meanBtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 80
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + meancalcium + meanBtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.86458
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin +
    meancalcium + meanBtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 11.86458, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + meancalcium + meanBtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 81
glm1 <- glm(nDzLstc2 ~ nAlbumin + sdBtot + meancalcium + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 81
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ nAlbumin + sdBtot + meancalcium, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.06221
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ nAlbumin + sdBtot + meancalcium, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 13.06221, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ nAlbumin + sdBtot + meancalcium, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 82
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + meanbetacarotene + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 82
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + nBCS + meanbetacarotene, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 83
glm1 <- glm(nDzLstc2 ~ manBCS + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 83
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ manBCS + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"),
    standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.7947411
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ manBCS + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 0.7947411, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ manBCS + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 84
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + sdLtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 84
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + nNefa + sdLtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 85
glm1 <- glm(nDzLstc2 ~ sdbhb + nNefa + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 85
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdbhb + nNefa, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 86
glm1 <- glm(nDzLstc2 ~ sdBtot + meancalcium + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 86
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBtot + meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.058128
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBtot + meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.058128, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBtot + meancalcium + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 87
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + nCholesterol + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 87
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + nCholesterol, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 88
glm1 <- glm(nDzLstc2 ~ medBCS + nNefa + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 88
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ medBCS + nNefa + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.47994
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ medBCS + nNefa + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 1.47994, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ medBCS + nNefa + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 89
glm1 <- glm(nDzLstc2 ~ sdBDtot + medBCS + sdEtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 89
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + sdEtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.282419
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ sdBDtot + medBCS + sdEtot,
    nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.282419, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ sdBDtot + medBCS + sdEtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 90
glm1 <- glm(nDzLstc2 ~ meancalcium + sdLtot + meanalbumin + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 90
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdLtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.698618
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + sdLtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 9.698618, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + sdLtot + meanalbumin, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 91
glm1 <- glm(nDzLstc2 ~ meanalbumin + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 91
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanalbumin + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.781826
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meanalbumin + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 2.781826, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meanalbumin + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 92
glm1 <- glm(nDzLstc2 ~ meanLtot + sdBDtot + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 92
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meanLtot + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.310807
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meanLtot + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 4.310807, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meanLtot + sdBDtot + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 93
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + medBCS + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 93
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.261804
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + medBCS, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 8.261804, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + medBCS, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 94
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + nHeifer + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 94
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin
    + nHeifer, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.908711
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + nHeifer, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 7.908711, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + meanalbumin + nHeifer, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 95
glm1 <- glm(nDzLstc2 ~ meancalcium + sdBtot + sdBDtot + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 95
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.184096
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, lambda2 = 3.184096, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ meancalcium + sdBtot + sdBDtot, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 96
glm1 <- glm(nDzLstc2 ~ meancalcium + meanalbumin + sdvitamind + offset(log(Lstc)), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 96
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + sdvitamind, nDzLstc2 ~ offset(log(Lstc)),
    lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 13.88566
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ meancalcium + meanalbumin + sdvitamind,
    nDzLstc2~offset(log(Lstc)), lambda1=0, lambda2=13.88566, standardize=TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2~meancalcium + meanalbumin + sdvitamind, nDzLstc2~offset(log(Lstc)), data)
print(predictions)

#running model 97
glm1 <- glm(nDzLstc2 ~ meancalcium + sdLtot + sdBDtot + offset(log(Lstc)), family = poisson(link=log), data = data)
print(pred <- glm1$fitted)
print(summary(glm1))
#finding optimum tuning parameter lambda2 for model 97
optl2 <- optL2(nDzLstc2, nDzLstc2~meancalcium + sdLtot + sdBDtot, nDzLstc2~offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize=TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.566592
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2~meancalcium + sdLtot + sdBDtot, nDzLstc2~offset(log(Lstc)), lambda1=0, lambda2=9.566592, standardize=TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2~meancalcium + sdLtot + sdBDtot, nDzLstc2~offset(log(Lstc)), data)
print(predictions)

#running model 98
glm1 <- glm(nDzLstc2 ~ sdBDtot + nBCS + offset(log(Lstc)), family = poisson(link=log), data = data)
print(pred <- glm1$fitted)
print(summary(glm1))
#finding optimum tuning parameter lambda2 for model 98
optl2 <- optL2(nDzLstc2, nDzLstc2~sdBDtot + nBCS, nDzLstc2~offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize=TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

OBSERVEDVERSUSEXPECTEDAIM2-1.R

#Title: Observed versus expected analysis for Aim 2-1
#Author: Lauren Wisnieski
#Purpose: Create residual plots (observed - predicted) and run Poisson regression
#Revision history: Created January 1, 2019
#Revised 2/15/2019 to update R squared calculation
#Revised 2/16/2019 to include other models besides the averaged model for deviance calculation
library(modEvA)

#Calculating deviance explained for each mean method model
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("mean2.1_othermodels.csv")
nDzLstc2 = data$nDzLstc2
lnpcount1 = data$lnpcount1
lnpcount2 = data$lnpcount2
lnpcount3 = data$lnpcount3
lnpcount4 = data$lnpcount4
lnpcount5 = data$lnpcount5
lnpcount6 = data$lnpcount6
lnpcount7 = data$lnpcount7
lnpcount8 = data$lnpcount8
lnpcount9 = data$lnpcount9
lnpcount10 = data$lnpcount10
lnpcount11 = data$lnpcount11
lnpcount12 = data$lnpcount12
lnpcount13 = data$lnpcount13
lnpcount14 = data$lnpcount14
lnpcount15 = data$lnpcount15
lnpcount16 = data$lnpcount16
lnpcount17 = data$lnpcount17
Dsquared(obs = nDzLstc2, pred = lnpcount1, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount2, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount3, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount4, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount5, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount6, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount7, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount8, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount9, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount10, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount11, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount12, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount13, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount14, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount15, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount16, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount17, family = 'poisson')

#averaged mean model
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("mean2.1.csv")
nDzLstc2 = data$nDzLstc2
avgpcount = data$avgpcount
lnavgpcount = data$lnavgpcount
resid = data$nDzLstc2 - data$avgpcount
model <-
    glm(nDzLstc2~avgpcount, family = poisson(link=log), data = data)
summary(model)
confint(model, level=0.95)
Dsquared(obs = nDzLstc2, pred = lnavgpcount, family = 'poisson')
par(xaxs="i", yaxs="i")
plot(nDzLstc2, resid, col = "blue", abline(h=0), cex = 1.3, pch = 16, cex.main=1, ylim=c(-5,8), xlim=c(0,15), xlab = "Observed", ylab = "Residual (Observed - Predicted)")
plot(nDzLstc2, avgpcount, col = "blue", cex = 1.3, pch = 16, cex.main=1, ylim=c(0,10), xlim=c(0,15), xlab = "Observed", ylab = "Predicted")

#Calculating deviance explained for each mean method model
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("combined2.1_othermodels.csv")
nDzLstc2 = data$nDzLstc2
lnpcount1 = data$lnpcount1
lnpcount2 = data$lnpcount2
lnpcount3 = data$lnpcount3
lnpcount4 = data$lnpcount4
lnpcount5 = data$lnpcount5
lnpcount6 = data$lnpcount6
lnpcount7 = data$lnpcount7
lnpcount8 = data$lnpcount8
lnpcount9 = data$lnpcount9
lnpcount10 = data$lnpcount10
Dsquared(obs = nDzLstc2, pred = lnpcount1, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount2, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount3, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount4, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount5, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount6, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount7, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount8, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount9, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount10, family = 'poisson')

#averaged combined model
data <- read.csv("combined2.1.csv")
nDzLstc2 = data$nDzLstc2
avgpcount = data$avgpcount
lnavgpcount = data$lnavgpcount
resid = data$nDzLstc2 - data$avgpcount
model <- glm(nDzLstc2~lnavgpcount, family = poisson(link=log), data = data)
summary(model)
Dsquared(obs = nDzLstc2, pred = lnavgpcount, family = 'poisson')
confint(model, level=0.95)
par(xaxs="i", yaxs="i")
plot(nDzLstc2, resid, col = "blue", abline(h=0), cex = 1.3, pch = 16, cex.main=1, ylim=c(-5,8), xlim=c(0,15), xlab = "Observed", ylab = "Residual (Observed - Predicted)")
plot(nDzLstc2, avgpcount, col = "blue", cex = 1.3, pch = 16, cex.main=1, ylim=c(0,10), xlim=c(0,15), xlab = "Observed", ylab = "Predicted")
pt(tb1,3,lower.tail = FALSE)

BESTSUBSETSAIM2-2.R

#Title: Best subsets selection Aim 2-2
#Author: Lauren Wisnieski
#Purpose: Run AIC Best subsets selection for Aim 2-2 analyses (central, dispersion, count methods)
#Revision history: Created January 11, 2019

#Best subsets selection for Aim 2.2: cohort level analyses
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")
#Loading packages
library(rJava)
library(recoder)
library(glmulti)
library(AER)
NM_meanphat = data$NM_meanphat
IF_meanphat = data$IF_meanphat
OX_meanphat = data$OX_meanphat
NM_sdphat = data$NM_sdphat
IF_sdphat = data$IF_sdphat
OX_sdphat = data$OX_sdphat
NM_nPhat = data$NM_nPhat
IF_nPhat = data$IF_nPhat
OX_nPhat = data$OX_nPhat
summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
nDzLstc2 = data$nDzLstc2
Dz = as.numeric(nDzLstc2)

#Mean models
mean.out <- glmulti(Dz ~ NM_meanphat + IF_meanphat + OX_meanphat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 100, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(mean.out, icvalues)
mean.out <- glmulti(Dz ~ NM_meanphat + IF_meanphat + OX_meanphat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 5, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
# Reporting out candidate models
mean.out

#St Dev models
stdev.out <- glmulti(Dz ~ NM_sdphat + IF_sdphat + OX_sdphat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 100, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(stdev.out, icvalues)
stdev.out <- glmulti(Dz ~ NM_sdphat + IF_sdphat + OX_sdphat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 6, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
# Reporting out candidate models
stdev.out

#Cut-point/threshold models
cut.out <- glmulti(Dz ~ NM_nPhat + IF_nPhat + OX_nPhat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 100, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(cut.out, icvalues)
cut.out <- glmulti(Dz ~ NM_nPhat + IF_nPhat + OX_nPhat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 6, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
# Reporting out candidate models
cut.out

#Combined models
co.out <- glmulti(Dz ~ NM_nPhat + IF_nPhat + OX_nPhat + NM_sdphat + IF_sdphat + OX_sdphat + NM_meanphat + IF_meanphat + OX_meanphat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 100, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
#Using summary command to see how many models are within 4 AICs, so can simplify output
summary.glmulti(co.out, icvalues)
co.out <- glmulti(Dz ~ NM_nPhat + IF_nPhat + OX_nPhat + NM_sdphat + IF_sdphat + OX_sdphat + NM_meanphat + IF_meanphat + OX_meanphat, offset = log(Lstc), data = data,
    level = 1, # level of interactions considered (1 = main effects only)
    method = "h", # exhaustive
    crit = "aic", # use AIC for selection
    intercept = T, # include intercept
    plotty = T, report = T, # plots and interim reports
    fitfunction = "glm", # glm function
    confsetsize = 47, #setting number of models
    maxsize = 3, #setting max number of variables
    family = poisson) # poisson family
# Reporting out candidate models
co.out

MODELDIAGNOSTICSAIM2-2MEAN.R

#Title: Model diagnostics for Aim 2-2: mean (central) method
#Author: Lauren Wisnieski
#Purpose: Run diagnostics for Aim 2-2 mean (central) methods models
#Revision history: Created January 11, 2019

#Model Diagnostics for Aim 2.2 Mean models: cohort level analyses
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")
#Loading packages
library(rJava)
library(recoder)
library(AER)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_meanphat = data$NM_meanphat
IF_meanphat = data$IF_meanphat
OX_meanphat = data$OX_meanphat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9998

#Model 2
glm2 <- glm(data$nDzLstc2 ~ data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 3
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + data$IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 4
glm2 <- glm(data$nDzLstc2 ~ data$IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.99

#Model 5
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + data$IF_meanphat + data$NM_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9991

MODELDIAGNOSTICSAIM2-2SD.R

#Title: Model diagnostics for Aim 2-2: dispersion method
#Author: Lauren Wisnieski
#Purpose: Run diagnostics for Aim 2-2 dispersion method models
#Revision history: Created January 11, 2019

#Model Diagnostics for Aim 2.2 standard deviation models: cohort level analyses
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")
#Loading packages
library(rJava)
library(recoder)
library(AER)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_sdphat = data$NM_sdphat
IF_sdphat = data$IF_sdphat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm2 <- glm(data$nDzLstc2 ~ data$IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 0.93
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.72

#Model 2
glm2 <- glm(data$nDzLstc2 ~ data$IF_sdphat + data$OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 0.97
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.71

#Model 3
glm2 <- glm(data$nDzLstc2 ~ data$IF_sdphat + data$NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 0.92
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.68

#Model 4
glm2 <- glm(data$nDzLstc2 ~ data$OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 0.75
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.58

#Model 5
glm2 <- glm(data$nDzLstc2 ~ data$OX_sdphat + data$NM_sdphat + data$IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 0.96
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.70

#Model 6
glm2 <- glm(data$nDzLstc2 ~ data$OX_sdphat + data$NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 0.74
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.51

MODELDIAGNOSTICSAIM2-2THRESHOLD.R

#Title: Model diagnostics for Aim 2-2: count method models
#Author: Lauren Wisnieski
#Purpose: Run diagnostics for Aim 2-2 count method models
#Revision history: Created January 11, 2019

setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")
#Loading packages
library(rJava)
library(recoder)
library(AER)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_nPhat = data$NM_nPhat
IF_nPhat = data$IF_nPhat
OX_nPhat = data$OX_nPhat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.996

#Model 2
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.998

#Model 3
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$NM_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.994

#Model 4
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.98

#Model 5
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$OX_nPhat + data$IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.997

#Model 6
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.96

MODELDIAGNOSTICSAIM2-2COMBINED.R

#Title: Model diagnostics for Aim 2-2: combined method models
#Author: Lauren Wisnieski
#Purpose: Run diagnostics for Aim 2-2 combined method models
#Revision history: Created January 14, 2019

setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")
#Loading packages
library(rJava)
library(recoder)
library(AER)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_nPhat = data$NM_nPhat
IF_nPhat = data$IF_nPhat
OX_nPhat = data$OX_nPhat
NM_meanphat = data$NM_meanphat
IF_meanphat = data$IF_meanphat
OX_meanphat = data$OX_meanphat
NM_sdphat = data$NM_sdphat
IF_sdphat = data$IF_sdphat
OX_sdphat = data$OX_sdphat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9998

#Model 2
glm2 <- glm(data$nDzLstc2 ~ data$NM_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9998

#Model 3
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance,
    df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9997

#Model 4
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9997

#Model 5
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + data$NM_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 6
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + data$IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 7
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + data$OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 8
glm2 <- glm(data$nDzLstc2 ~ data$OX_meanphat + data$IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 9
glm2 <- glm(data$nDzLstc2 ~ data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 10
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.996

#Model 11
glm2 <- glm(data$nDzLstc2 ~ data$IF_meanphat + data$OX_nPhat + data$NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.998

#Model 12
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.998

#Model 13
glm2 <- glm(data$nDzLstc2 ~ data$IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.99

#Model 14
glm2 <- glm(data$nDzLstc2 ~ data$OX_sdphat + data$OX_nPhat + data$NM_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.996

#Model 15
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$NM_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.997

#Model 16
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$NM_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9997

#Model 17
glm2 <- glm(data$nDzLstc2 ~ data$NM_sdphat + data$IF_nPhat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 18
glm2 <- glm(data$nDzLstc2 ~ data$NM_sdphat + data$IF_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 19
glm2 <- glm(data$nDzLstc2 ~ data$NM_sdphat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance,
    df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 20
glm2 <- glm(data$nDzLstc2 ~ data$NM_sdphat + data$OX_meanphat + data$OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 21
glm2 <- glm(data$nDzLstc2 ~ data$NM_sdphat + data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9996

#Model 22
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$NM_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9995

#Model 23
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$NM_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.994

#Model 24
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.994

#Model 25
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.994

#Model 26
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$OX_nPhat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9994

#Model 27
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$IF_nPhat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9993

#Model 28
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.98

#Model 29
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 30
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$OX_nPhat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.9993

#Model 31
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$IF_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 32
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$IF_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 33
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 34
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 35
glm2 <- glm(data$nDzLstc2 ~ data$OX_nPhat + data$OX_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log),
    data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 36
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$OX_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 37
glm2 <- glm(data$nDzLstc2 ~ data$NM_nPhat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 38
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 39
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 40
glm2 <- glm(data$nDzLstc2 ~ data$IF_sdphat + data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 41
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$OX_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 42
glm2 <- glm(data$nDzLstc2 ~ data$IF_nPhat + data$IF_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 43
glm2 <- glm(data$nDzLstc2 ~
    data$OX_sdphat + data$NM_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 44
glm2 <- glm(data$nDzLstc2 ~ data$IF_sdphat + data$OX_sdphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 45
glm2 <- glm(data$nDzLstc2 ~ data$NM_meanphat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 46
glm2 <- glm(data$nDzLstc2 ~ data$OX_sdphat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

#Model 47
glm2 <- glm(data$nDzLstc2 ~ data$IF_sdphat + data$IF_meanphat + data$OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
dispersiontest(glm2)
#p value = 1
pchisq(glm2$deviance, df=glm2$df.residual, lower.tail=FALSE)
#[1] 0.999

SHRINKAGEAIM2-2MEAN.R

#Title: Shrinkage for Aim 2-2: mean method models
#Author: Lauren Wisnieski
#Purpose: Shrinkage for Aim 2-2: mean method models
#Revision history: Created January 11, 2019

#Model Diagnostics for Aim 2.2. Combined models: cohort level analyses
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")
#Loading libraries
library(penalized)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_meanphat = data$NM_meanphat
IF_meanphat = data$IF_meanphat
OX_meanphat = data$OX_meanphat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm1 <- glm(nDzLstc2 ~ OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print(pred <- glm1$fitted)
print(summary(glm1))
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_meanphat, nDzLstc2~offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize=TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#Model 2
glm1 <- glm(nDzLstc2 ~ NM_meanphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print(pred <- glm1$fitted)
print(summary(glm1))
glm1
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_meanphat + OX_meanphat, nDzLstc2~offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize=TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.5633702
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_meanphat + OX_meanphat, nDzLstc2~offset(log(Lstc)), lambda1=0, lambda2=0.5633702, standardize=TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_meanphat + OX_meanphat, nDzLstc2~offset(log(Lstc)), data)
print(predictions)

#Model 3
glm1 <- glm(nDzLstc2 ~ OX_meanphat + IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print(pred <- glm1$fitted)
print(summary(glm1))
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_meanphat + IF_meanphat, nDzLstc2~offset(log(Lstc)), lambda1 = 0, model = c("poisson"), standardize=TRUE, trace =
    TRUE)
optl2$lambda
#the optimum lambda found was 0.204056
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_meanphat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.204056,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_meanphat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 4
glm1 <- glm(nDzLstc2 ~ IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.642536
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.642536,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 5
glm1 <- glm(nDzLstc2 ~ OX_meanphat + NM_meanphat + IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_meanphat + NM_meanphat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.558544
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_meanphat + NM_meanphat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.558544,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_meanphat + NM_meanphat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

SHRINKAGEAIM2-2SD.R

#Title: Shrinkage for Aim 2-2: dispersion method models
#Author: Lauren Wisnieski
#Purpose: Shrinkage for Aim 2-2: dispersion method models
#Revision history: Created January 11, 2019
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")

#Loading libraries
library(penalized)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_sdphat = data$NM_sdphat
IF_sdphat = data$IF_sdphat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm1 <- glm(nDzLstc2 ~ IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 4.067066
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 4.067066,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 2
glm1 <- glm(nDzLstc2 ~ IF_sdphat + OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 21.95886
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 21.95886,
    standardize = TRUE, model = c("poisson"), trace
    = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 3
glm1 <- glm(nDzLstc2 ~ IF_sdphat + NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.556629
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 9.556629,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 4
glm1 <- glm(nDzLstc2 ~ OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.76894
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 8.76894,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 5
glm1 <- glm(nDzLstc2 ~ OX_sdphat + NM_sdphat + IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_sdphat + NM_sdphat + IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 19.17497
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_sdphat + NM_sdphat + IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 19.17497,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_sdphat + NM_sdphat + IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 6
glm1 <- glm(nDzLstc2 ~ OX_sdphat + NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_sdphat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 16.70786
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_sdphat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 16.70786,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_sdphat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

SHRINKAGEAIM2-2THRESHOLD.R

#Title: Shrinkage for Aim 2-2: count method models
#Author: Lauren Wisnieski
#Purpose: Shrinkage for Aim 2-2: count method models
#Revision history: Created January 11, 2019
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("Cohort2Data_11Jan2019.csv")

#Loading libraries
library(penalized)
summary(data)
nDzLstc2 = data$nDzLstc2
NM_nPhat = data$NM_nPhat
IF_nPhat = data$IF_nPhat
OX_nPhat = data$OX_nPhat
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)

#Model 1
glm1 <- glm(nDzLstc2 ~ OX_nPhat +
    offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.02967626
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.02967626,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 2
glm1 <- glm(nDzLstc2 ~ IF_nPhat + OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.11461
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 11.11461,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 3
glm1 <- glm(nDzLstc2 ~ IF_nPhat + NM_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + NM_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 20.43638
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_nPhat + NM_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 20.43638,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_nPhat + NM_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 4
glm1 <- glm(nDzLstc2 ~ IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.94649
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 10.94649,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 5
glm1 <- glm(nDzLstc2 ~ NM_nPhat + OX_nPhat + IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 9.902708
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 9.902708,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions (the formula must list the same penalized covariates as the fitted model)
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + OX_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#Model 6
glm1 <- glm(nDzLstc2 ~ NM_nPhat + IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 20.43638
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 20.43638,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

SHRINKAGEAIM2-2COMBINEDCOHORT.R

#Title: Shrinkage for Aim 2-2: combined method models
#Author: Lauren Wisnieski
#Purpose: Shrinkage for Aim 2-2: combined method models
#Revision history: Created January 14, 2019
#Shrinkage
data <- read.csv("Cohort2Data_11Jan2019.csv")
NM_meanphat = data$NM_meanphat
IF_meanphat = data$IF_meanphat
OX_meanphat = data$OX_meanphat
NM_sdphat = data$NM_sdphat
IF_sdphat = data$IF_sdphat
OX_sdphat = data$OX_sdphat
NM_nPhat = data$NM_nPhat
IF_nPhat = data$IF_nPhat
OX_nPhat = data$OX_nPhat
summary(data)
summary(data$nTotalLstc2)
TotalLstc2 = data$nTotalLstc2
Lstc = as.numeric(TotalLstc2)
nDzLstc2 = data$nDzLstc2
Dz = as.numeric(nDzLstc2)

#running model 1
glm1 <- glm(nDzLstc2 ~ OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 1
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running model 2
glm1 <- glm(nDzLstc2 ~ NM_sdphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 2
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.103988
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.103988,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running model 3
glm1 <- glm(nDzLstc2 ~ OX_nPhat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 3
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.923421
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.923421,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 4
glm1 <- glm(nDzLstc2 ~ NM_nPhat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 4
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.00826366
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    lambda2 = 0.00826366, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 5
glm1 <- glm(nDzLstc2 ~ NM_meanphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 5
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.5633702
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.5633702,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 6
glm1 <- glm(nDzLstc2 ~ IF_nPhat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 6
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running combined model 7
glm1 <- glm(nDzLstc2 ~ OX_sdphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 7
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.4851019
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.4851019,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 8
glm1 <- glm(nDzLstc2 ~ IF_sdphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 8
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.315622
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.315622,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 9
glm1 <- glm(nDzLstc2 ~ IF_meanphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 9
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.2040502
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.2040502,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_meanphat + OX_meanphat,
    nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 10
glm1 <- glm(nDzLstc2 ~ OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 10
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.02967626
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.02967626,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 11
glm1 <- glm(nDzLstc2 ~ OX_nPhat + IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 11
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 6.106406
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 6.106406,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 12
glm1 <- glm(nDzLstc2 ~ IF_nPhat + OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 12
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, model =
    c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 11.11461
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 11.11461,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 13
glm1 <- glm(nDzLstc2 ~ IF_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 13
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.642536
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.642536,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 14
glm1 <- glm(nDzLstc2 ~ OX_nPhat + OX_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 14
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.423936
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 3.423936,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~
    OX_nPhat + OX_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 15
glm1 <- glm(nDzLstc2 ~ OX_nPhat + NM_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 15
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.723397
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 5.723397,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + NM_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 16
#the offset is written as a formula term here ("+ offset(...)"); passed as a separate
#positional argument, glm() would treat it as prior weights rather than an offset
glm1 <- glm(nDzLstc2 ~ OX_meanphat + NM_nPhat + NM_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 16
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_meanphat + NM_nPhat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.985954
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_meanphat + NM_nPhat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.985954,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_meanphat + NM_nPhat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 17
glm1 <- glm(nDzLstc2 ~ NM_sdphat + IF_nPhat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 17
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_sdphat + IF_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 8.239014
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + IF_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 8.239014,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + IF_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 18
glm1 <- glm(nDzLstc2 ~ NM_sdphat + IF_sdphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 18
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_sdphat + IF_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.038171
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + IF_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 2.038171,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + IF_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 19
glm1 <- glm(nDzLstc2 ~ NM_sdphat + IF_meanphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 19
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 7.141425
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 7.141425,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 20
glm1 <- glm(nDzLstc2 ~ NM_sdphat + OX_sdphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 20
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_sdphat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.6959984
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 0.6959984,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 21
glm1 <- glm(nDzLstc2 ~ NM_sdphat + NM_meanphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 21
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.904048
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.904048,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 22
glm1 <- glm(nDzLstc2 ~ OX_nPhat + NM_sdphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 22
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.969179
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.969179,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + NM_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 23
glm1 <- glm(nDzLstc2 ~ NM_nPhat + OX_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 23
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 5.776283
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 5.776283,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 24
glm1 <- glm(nDzLstc2 ~ OX_nPhat + NM_sdphat + offset(log(Lstc)), family =
    poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 24
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.253691
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_sdphat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 2.253691,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_sdphat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 25
#the offset is written as a formula term here ("+ offset(...)"); passed as a separate
#positional argument, glm() would treat it as prior weights rather than an offset
glm1 <- glm(nDzLstc2 ~ OX_nPhat + IF_sdphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 25
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + IF_sdphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.554516
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 3.554516,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + OX_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 26
glm1 <- glm(nDzLstc2 ~ NM_nPhat + OX_nPhat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 26
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.047086
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 1.047086,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + OX_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 27
glm1 <- glm(nDzLstc2 ~ NM_nPhat + IF_nPhat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 27
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + IF_nPhat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running combined model 28
glm1 <- glm(nDzLstc2 ~ IF_nPhat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 28
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0,
    model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 10.94649
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), lambda1 = 0, lambda2 = 10.94649,
    standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_nPhat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 29
glm1 <- glm(nDzLstc2 ~ NM_nPhat + NM_meanphat + OX_meanphat + offset(log(Lstc)), family = poisson(link=log), data = data)
print=(pred=glm1$fitted)
print
summary(glm1)
#finding optimum tuning parameter lambda2 for model 29
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), lambda1
= 0 , model = c("poisson"), standardize=TRUE , trace = TRUE) optl2$lambda #the optimum lambda found was 2.24924 2 #Running shrinkage procedure pen1 < - penalized (nDzLstc2, nDzLstc2~NM_nPhat+ NM_meanphat+ OX_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1=0, lambda2= 2.249242, standardize=TRUE, model = c("pois son"), trace = TRUE) s how(pen1) coefficients(pen1) # make predictions predictions < - predict(pen1, nDzLstc2~ NM_nPhat+ NM_meanphat+ OX_meanphat , nDzLstc2~offset(log(Lstc)), data ) print(predictions) #runn ing com bined model 30 glm1 < - glm(nDzLstc2 ~ IF_nPhat+ OX_nPhat+ OX_meanphat , offset(log(Lstc)), family = poisson(link=log), data = data) print=(pred=glm1$fitted) print summary(glm1) #finding optimum tuning parameter lambda2 for m odel 30 optl2 < - optL2 (nDzLstc2, nDzL stc2~ IF_nPhat+ OX_nPhat+ OX_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1 = 0 , model = c("poisson"), standardize=TRUE , trace = TRUE) 553 optl2$lambda #the optimum lambda fou nd was 2.437073 #Running shrinkage proce dure pen1< - penalized (nDzLstc2, nDzLstc2~ IF_nPhat+ OX_nPhat+ OX_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1=0, lambda2= 2.437073, standardize=TRUE, mod el = c("poisson"), trace = TRUE) show(pen1) coefficients(pen1) # make predictions predictions < - predict(pen1, nDzLstc2~ IF _nPhat+ OX_nPhat+ OX_meanphat , nDzLstc2~offset(log(Lstc)), data ) print(predictio ns) #running combined model 31 glm1 < - g lm(nDzLstc2 ~ NM_nPhat+ IF_sdphat+ OX_meanphat , offset(log(Lstc)), family = poisson(link=log), data = data) print=(p red=glm1$fitted) print summary(glm1) #finding optimum tuning paramete r lambda2 for model 31 optl2 < - optL2 (nDzLstc2, nDzLstc2~NM_nPhat+ IF_sdphat+ OX_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1 = 0 , model = c(" poisson"), standardize=TRUE , trace = TRUE) optl2$lambda #the opt imum lambda found was 1.051878 #Running shrinkage procedure pen1< - penalized (nDzLstc2, nDzLstc2~ NM_nPhat+ IF_sdphat+ OX_meanphat, nDzLstc2~offset(log(Lstc )) , lambda1=0, 
lambda2= 1.051878, standardize=TRUE, model = c("poisson"), trace = TRUE) show(pen1) coefficients(pen1) # make predictions predictions < - predict(pen1, nDzLstc2~ NM_nPhat+ IF_sdphat+ OX_meanphat , nDzLstc2~offset(log(Lstc)), data ) print(predictions) #running combined model 32 glm1 < - glm(nDzLstc2 ~ OX_nPhat+ IF_sdphat+ OX_meanphat , offset(log(Lstc)), family = poisson(link=log), data = data) print=(pred=glm1$fitted) print summary(glm1) #finding optimu m tuning parameter lambda2 for model 32 optl2 < - optL2 (nDzLstc2, nDzLstc2~OX_nPhat+ IF_sdphat+ OX_meanphat, nDzLstc2~offset(log( Lstc)) , lambda1 = 0 , model = c("poisson"), standardize=TRUE , trace = TRUE) optl2$la mbda #the optimum lambda found was 2.852469 #Running shrinkage procedure pen1< - penalized (nDzLstc2, nDzLstc2~ OX_nPhat+ I F_sdphat+ OX_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1=0, lambda2= 2.852469, standardize= TRUE, model = c("poisson") , trace = TRUE) show(pen1) coefficients(pen1) # make predictions predictions < - predict(pen1, nDzLstc2~ OX_nPhat+ IF_sdphat+ OX_meanphat , nDzLstc2~offset(log(L stc)), data ) print(predictions) #r unning combined model 33 glm1 < - glm(nDzLstc2 ~ OX_nPhat+ IF_meanphat+ OX_meanph at , offset(log(Lstc)), family = poisson(link=log), data = data) print=(pred=glm1$fitted) print 554 summary(glm1) #finding optimum tuning parameter lambda2 for model 33 optl2 < - optL2 (nDzLstc2, nDzLstc2~OX_nPhat+ IF_meanphat+ O X_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1 = 0 , model = c("poisson"), standardize=TRUE , t race = TRUE) optl2$lambda #the optimum la mbda found was 1.775976 #Running shrinkage procedure pen1< - penalized (nDzLstc2, nDzLstc2~ OX_nPhat+ IF_meanphat+ OX_meanphat, nDzLstc2~offset(log(Lstc)) , lambda1=0, lamb da2= 1.775976, standardize=TRUE, model = c("poisson"), trace = TRUE) show(pen1) coefficients(pen1) # make predictions predictions < - predict(pen1, nDzLstc2~ OX_nPhat+ IF_meanphat+ OX_mea nphat , nDzLstc2~offset(log(Lstc)), data ) pr int(predictions) 
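
The shrinkage steps in these model blocks set lambda1 = 0 and tune only lambda2, which corresponds to a ridge (L2) penalty on a Poisson model that carries log(Lstc) as an offset. The block below is a minimal base-R sketch of that idea on simulated data; it is not the analysis code used in this dissertation. The function name ridge_poisson, the simulated variables, and the penalty scaling (lambda2 times the sum of squared slopes, with no covariate standardization) are assumptions made for illustration and may differ from the internals of the penalized package.

```r
# Illustrative sketch only: ridge-penalized Poisson rate model fitted with base R.
# ridge_poisson() is a hypothetical helper, not a function of the penalized package.
ridge_poisson <- function(y, X, exposure, lambda2) {
  X <- as.matrix(X)
  # Linear predictor with log(exposure) as offset; beta[1] is the intercept.
  linpred <- function(beta) beta[1] + drop(X %*% beta[-1]) + log(exposure)
  # Negative Poisson log-likelihood (constant term dropped) plus the L2 penalty.
  # The intercept is left unpenalized (assumed penalty form: lambda2 * sum(slopes^2)).
  objective <- function(beta) {
    eta <- linpred(beta)
    -sum(y * eta - exp(eta)) + lambda2 * sum(beta[-1]^2)
  }
  # Analytic gradient of the objective.
  gradient <- function(beta) {
    r <- exp(linpred(beta)) - y
    c(sum(r), drop(crossprod(X, r)) + 2 * lambda2 * beta[-1])
  }
  # Start from the crude overall rate and zero slopes.
  start <- c(log(sum(y) / sum(exposure)), rep(0, ncol(X)))
  fit <- optim(start, objective, gr = gradient, method = "BFGS")
  setNames(fit$par, c("(Intercept)", colnames(X)))
}

# Simulated data (hypothetical; unrelated to the study cohorts).
set.seed(1)
n <- 200
x <- rnorm(n)
exposure <- sample(5:20, n, replace = TRUE)
y <- rpois(n, lambda = exposure * exp(-1 + 0.5 * x))
X <- cbind(x = x)

b_ridge0  <- ridge_poisson(y, X, exposure, lambda2 = 0)   # no shrinkage
b_ridge50 <- ridge_poisson(y, X, exposure, lambda2 = 50)  # strong shrinkage
b_glm <- coef(glm(y ~ x, offset = log(exposure), family = poisson(link = "log")))

print(rbind(glm = b_glm, ridge0 = b_ridge0, ridge50 = b_ridge50))
```

With lambda2 = 0 the sketch should agree closely with an ordinary glm() fit, and larger values of lambda2 pull the slope toward zero while the intercept remains unpenalized. This mirrors why no shrinkage step follows the models for which the optimum lambda was 0.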

#running combined model 34
#log(Lstc) is supplied through the named offset argument of glm()
glm1 <- glm(nDzLstc2 ~ OX_nPhat + NM_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 34
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.101864
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 2.101864, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 35
glm1 <- glm(nDzLstc2 ~ OX_nPhat + OX_sdphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 35
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 3.731701
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 3.731701, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 36
glm1 <- glm(nDzLstc2 ~ NM_nPhat + OX_sdphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 36
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.5797942
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 0.5797942, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 37
glm1 <- glm(nDzLstc2 ~ NM_nPhat + IF_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 37
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_nPhat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.2931829
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_nPhat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 0.2931829, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_nPhat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 38
glm1 <- glm(nDzLstc2 ~ IF_nPhat + IF_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 38
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.979071
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_nPhat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 1.979071, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_nPhat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 39
glm1 <- glm(nDzLstc2 ~ IF_nPhat + NM_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 39
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running combined model 40
glm1 <- glm(nDzLstc2 ~ IF_sdphat + NM_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 40
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.8108459
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 0.8108459, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 41
glm1 <- glm(nDzLstc2 ~ IF_nPhat + OX_sdphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 41
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running combined model 39
glm1 <- glm(nDzLstc2 ~ IF_nPhat + NM_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 39
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_nPhat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0

#running combined model 43
glm1 <- glm(nDzLstc2 ~ OX_sdphat + NM_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 43
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.143324
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 1.143324, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_sdphat + NM_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 44
glm1 <- glm(nDzLstc2 ~ IF_sdphat + OX_sdphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 44
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 2.880438
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 2.880438, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + OX_sdphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 45
glm1 <- glm(nDzLstc2 ~ NM_meanphat + IF_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 45
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ NM_meanphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.5585221
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ NM_meanphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 0.5585221, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ NM_meanphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 46
glm1 <- glm(nDzLstc2 ~ OX_sdphat + IF_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 46
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ OX_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 0.4988454
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ OX_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 0.4988454, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ OX_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

#running combined model 47
#log(Lstc) is supplied through the named offset argument of glm()
glm1 <- glm(nDzLstc2 ~ IF_sdphat + IF_meanphat + OX_meanphat, offset = log(Lstc), family = poisson(link = log), data = data)
pred <- glm1$fitted
print(pred)
summary(glm1)
#finding optimum tuning parameter lambda2 for model 47
optl2 <- optL2(nDzLstc2, nDzLstc2 ~ IF_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
               lambda1 = 0, model = c("poisson"), standardize = TRUE, trace = TRUE)
optl2$lambda
#the optimum lambda found was 1.248906
#Running shrinkage procedure
pen1 <- penalized(nDzLstc2, nDzLstc2 ~ IF_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)),
                  lambda1 = 0, lambda2 = 1.248906, standardize = TRUE, model = c("poisson"), trace = TRUE)
show(pen1)
coefficients(pen1)
# make predictions
predictions <- predict(pen1, nDzLstc2 ~ IF_sdphat + IF_meanphat + OX_meanphat, nDzLstc2 ~ offset(log(Lstc)), data)
print(predictions)

OBSERVEDVERSUSEXPECTEDAIM2-2.R

#Title: Observed versus expected count analyses for Aim 2-2
#Author: Lauren Wisnieski
#Purpose: Observed versus expected count analyses for Aim 2-2, creates residual plot
#Revision history: Created January 18, 2019
#Edited Feb 18, 2019 to update observed versus predicted counts

#Cohort analysis: aim 2.2
library(modEvA)

#Calculating deviance explained for each dispersion method model
setwd("/Users/wisni/Amazon Drive/DISSERTATION ANALYSIS/Data files/")
data <- read.csv("sd2-2_othermodels.csv")
nDzLstc2 = data$nDzLstc2
lnpcount1 = data$lnpcount1
lnpcount2 = data$lnpcount2
lnpcount3 = data$lnpcount3
Dsquared(obs = nDzLstc2, pred = lnpcount1, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount2, family = 'poisson')
Dsquared(obs = nDzLstc2, pred = lnpcount3, family = 'poisson')

#averaged standard deviation model
data <- read.csv("sd2-2.csv")
nDzLstc2 = data$nDzLstc2
lnavgpcount = data$lnavgpcount
avgpcount = data$avgpcount
resid = data$nDzLstc2 - data$avgpcount
Dsquared(obs = nDzLstc2, pred = lnavgpcount, family = 'poisson')

#residual plot with a horizontal reference line at zero
plot(nDzLstc2, resid, col = "blue", cex = 1.3, pch = 16, cex.main = 1,
     ylim = c(-5, 8), xlim = c(0, 17), xlab = "Observed", ylab = "Residual (Observed - Predicted)")
abline(h = 0)

#observed versus predicted counts
plot(nDzLstc2, avgpcount, col = "blue", cex = 1.3, pch = 16, cex.main = 1,
     ylim = c(0, 10), xlim = c(0, 15), xlab = "Observed", ylab = "Predicted")
Modeling c ount d ata . Cambridge University Press, New York, NY . Holcombe, S.J., Wisnieski, L., Gandy, J., Norby, B., Sordillo, L.M., 2018. Reduced serum vitamin D concentrati ons in healthy early - lactation dairy cattle. J. Dairy Sci. 101, 1488 1494. https://doi.org/10.3168/jds.2017 - 13547 Horst, R.L., Goff, J.P., Rein hardt, T. A., 1994. Calcium and vitamin D metabolism in th e dairy cow. J. of Dairy Sci. 77, 1936 - 1951. Hosmer, D.W., Lemeshow, S., Sturdivant, R.X., 2013. Applied logistic regression, third ed. John Wily & Sons, Inc, Hoboken, New Jersey. 569 Hu, X., Madden, L.V., Edw ards, S., Xu, X., 2015. Combining models is more likely to give better predictions than single models. Phytopath. 105, 1174 - 1182. Hutjens, M.F., Aalseth, E.P., 2005. Caring for Transition Cows. W.D. Hoards & Sons, Fort Atkinson, Wisconsin, USA. Huzzey, J . M . , Nydam, D . V . , Grant, R . J . , Overton, T . R ., 20 11. Associations of prepartum plasma cortisol, haptoglobin, fecal cortisol metabolites, and nonesterified fatty acids with postpartum health status in Holstein dairy cows. J . Dairy Sci . 94 , 5878 - 58 89. http:// doi.org/10.3168/jds.2010 - 3391. Huzzey, J.M., Ve ira, D.M., Weary, D.M., von Keyserlingk, M.A.G., 2007. Prepartum behavior and dry matter intake identify dairy cows at risk for metritis. J. Dairy Sci. 90, 3220 3233. Ingvartsen, K.L., 2006. Feedi ng - and man agement - related diseases in the transition cow: Physiological adaptations around calving and strategies to reduce feeding - related diseases. Anim. Feed Sci. Technol. 126, 175 213. h ttps://doi.o rg/10.1016/j.anifeedsci.2005.08.003 Ingvartsen , K.L., Dewhurst, R.J., Friggens, N.C., 2003. On the relationship between lactational performance and health: Is it yield or metabolic imbalance that cause production diseases in dairy cattle? A po sition pape r. Livest. Prod. Sci. 83, 277 308. https://doi.org/10.1016/S0301 - 6226(03)00110 - 6 International Diabetes Federation (IDF) , 2006. 
The IDF consensus worldwide definition of the metabolic syndrome. IDF, Brussels, Belgium . Ivanescu, A . E . , Li, P . , George, B . , Brown, A . W . , Keith, S . W . , Raju, D . , Allison, D . B ., 2016. The i mportance of p rediction m odel v alidation and a ssessment in o besity and n utrition r esearch. Intern . J . Obes . 4 0, 887 - 894. http:// dx. doi.org/10.1038/ijo.2015.214. Kaur, J ., 2014. A Comprehensive r eview on m etabolic s yndrome. Cardiol . Res . Pract . 2014 , article ID 943162. http://dx.doi.org/10.1155/2014/943162. Kellogg, W, 201 0 . Body condition scoring with dairy cattle. https://www.uaex. edu/publications/pdf/FSA - 4008.pdf (Accessed 17 J uly 2018). Kelsey, J.L., Whittemore, A.S., Evans, A.S., Thompson, W.D., 1996. Methods in Observational Epidemiology, second edition. Oxford University Press, New York. Kerr, K . F ., Pepe, M . S ., 2011. Joint m odeling, c ovariate a djustment, and i nteraction c ontrasting n otions in r isk p rediction m odels and r isk p rediction p erformance. Epidem . 2 2, 805 - 812. http://dx.doi/10.1097/EDE.0b013e31823035fb. 570 Kim, I . H ., Suh, G . H ., 2003. Effect of the amount of body conditi on loss from the dry of near calving periods on the subsequent body condition change, occurrence of postpartum diseases, metabolic parameters and r eproductive performance in Holstein dairy cows. Theriogen . 60 , 1445 - 1456. http://dx.doi:10.1016/S0093 - 691X(03 )00135 - 3 . Kimura, K., Reinhardt, T.A., Goff, J. P., 2006. Parturition and hypocalcemia blunts calcium signals in immune cells of dairy cattle. J. Dairy Sci. 89, 2588 2595. https://doi.org/10. 3168/jds.S0022 - 0302(06)72335 - 9 Klecka, W . R ., 1 980. Discriminant a nalysis. s eries: q uantitative a pplications in the s ocial s cience. Sa ge Publications, Inc, Newbery Park, London, New Delhi. Kleiber, C., Zeileis, A., 2017. Package 'AER.' R package version 1.2 - 4, https://cran.r project.org/web/packages/ AER/AER.pdf (Accessed 26 Oct 2018). 
Kozlowski, S.W., Klein, K.J., 2000. A multilevel a pproach to theory and research in organizations: Contextual, temporal, and emergent processes. In K.J. Klein &S.W.J. Kozl owski (Eds)., Multilevel theory, research and me thods in organization: Foundations, extensions, and new directions. Jossey - Bass, San Fr ancisco, California, pp. 3 - 90. Kraemer, H.C., Blasey, C.M., 2003. Centring in regression analyses: a strategy to preve nt errors in statistical inference. Intern. J. o f Meth. in Psych. Res. 13:3, 141 - 151. Kristensen, T., 1986. Method for estimation of b ody condition of dairy cows. Pages 59 75 in Rep. no. 615. Natl. Inst. Anim. Sci. Fredriksberg, Denmark. Kulberg, S . , Sto rset, A . K . , Heringstad, B . , Larsen, H . J . S ., 2002 . Reduced l evels of t otal l eukocytes and n eutrophils in Norwegian c attle s elected for d ecreased m astitis i ncidence. J . Dairy Sci . 85 , 3470 3475. LeBlanc, S . J ., 2008. Postpartum uterine disease and dairy her d reproductive performance: A review. Vet . J . 17 6 , 102 114. http://dx. doi.org/10.1016/j.tv jl.2007.12.019 . LeBlanc, S., 2010. Monitoring metabolic health of dairy cattle in the transition period. J. Reprod. Dev. 56 Suppl, S29 - 35. https://doi.org/10.1262 /jrd.1056S29 LeBlanc, S . J . , Duffield, T . F . , Leslie, K . E . , Batement, K . G . , TenHag, J . , Walton, J . S . , and Johnson, W . H ., 2002. The e ffect of p repartum i njection of v itamin E on h ealth in t ransition d airy c ows . J . Dairy Sci . 85 , 1416 - 1426. LeBlanc, S., Her dt, T.H., Seymour, W.M., Duffield, T.F., Leslie, K.E., 2004. Peripartum Serum vitamin E, retinol, and beta - carotene in dairy cattle and their associations with disease. J. Dairy Sci. 87, 609 - 619. 571 LeBlanc, S. J., Leslie, K.E., Du ffield, T.F., 2005. Metaboli c predictors of displaced abomasum in dairy cattle. J. Dairy Sci. 88, 159 70. 
https://doi.org/10.3168/jds.S00220302(05)72674 - 6 LeBlanc, S.J., Lissemore, K.D., Kel ton, D.F., Duffield, T.F., L eslie, K.E., 2006. Major advances in disease prevention in dairy cattle. J. Dairy Sci. 89, 1267 1279. https://doi.org/10.3168/jds.S0022 - 0302(06 )72195 - 6 Leite, M . L . C . , Nicolosi, A . , Firmo, J . O . A . , Lima - Costa, M . F ., 2007. Features of metabolic syndrome in non - diabetic Italians and Brazilians: a discriminant analysis. Intern . J. Clin . Pract . 6 1, 32 - 38. http://dx.doi.org/10.1111/j.1742 - 1241.2006.01 139.x . Lemeshow, S ., Hosmer, D . W ., 198 2 . A revi ew of goodness of fit statistics for use in the development of logistic regression models. Amer . J . Epidem . 115 , 92 - 106. Lakens, D., 2017. Equivalence tests: a practical primer for t tests, correlations, and meta analyses. Soc. Psych. Personal. Sci. 8, 355 - 362. https://doi.org/10.1177/1948550617697177. Li, C ., Ford, E . S ., 2007. Is t here a s ingle u nderlying f actor for the m etabolic s yndrome in a dol escents? Diab . Care 3 0, 1556 - 1561. Li, R . H ., Belford, G . G ., 2002. Instability of d ecision t ree c lassifi cation a lgorithms. Proceedings of the 8 th ACM SIGKDD international conference on knowledge discovery and data mining, Edmonton , Alberta, Canada (2002). Lima, F . S . , Filho, M . F . , Greco, L . F . , Santos, J . E . F ., 2012. Effects of feeding rumen - protected choline on incidence of disesaes and reproduction of dairy cows. Vet . J . 193 , 140 - 145. https://doi.org/10.1016/j.tvjl.2011.09.019. Lin, I. - F., Chang, W.P., Liao, Y. - N., 2013. Shrinkage methods enhanced the accuracy of parameter estimation using Cox models with s mall number of events (in eng). J. of Clin. Epi. 66, 743 751. Loh, W . Y ., 2011. Classification and regression trees. WIREs Data Mining and Know . Disc . 1 , 14 - 23. http://dx.doi.org: 10.1002/widm.8 Long, S.J., Freese, J., 2006. Regression models for categori cal dependent variables using Stata, 2nd ed. 
Stata Press, College Station, Texas. Lykkesfeldt, J., Svendsen, O., 2007. Oxidants and antioxidants in disease: Oxidative stress in farm animals. Vet. J. 173, 502 511. https://doi.org/10.1016/j.tvjl.2006.06.005 Ma, X , 2018. Using c lassification and r egression t rees: a p ratical p rimer. I nformation Age Publishing Inc., Charlotte, NC. 572 Malarkey, L . M ., McMorr ow, M . E ., 2011. S aunders n ursing g uide to l abora tory and d iagnostic t ests, 2 nd ed . Elsevier, New York. Mascha, E.J., Sessler, D.I., 2011. Equivalence and noninferiorit y testing in regression models and repeated - measures designs. Anesth. Analges. 112, 678 - 687. http://dx.doi.org/10.1213/ANE.0b013e318206f 872. Mazerolle, M . J ., Criterion (AIC) to assess the strength of biological hypotheses . Amphib . Reptil . 27 , 169 180. McArt, J . A . A . , N ydam, D . V . , Oetzel, G . R ., 2012. Epidemiology of subclinical ketosis in early lactation dairy cattle. J . Dairy Sci . 95 , 5056 - 5066. http://dx.doi.org/ 10.3168/jds.20 12 5443. McArt, J . A . A . , Nydam, D . V . , Oetzel, G . R ., 2013 a . Dry period and parturient predicto rs of early lactation hyperketonemia in dairy ca ttle. J . Dairy Sci . 96 , 198 - 209. http://dx.doi.org/ 10.3168/jds.2012 - 5681. McArt, J.A.A., Nydam, D.V., Oetzel, G.R., Overton, T.R., Ospina, P.A., 2013 b . Elevated non esteri - hydroxybutyrate and their association with transition dairy cow performance. The Vet. J. 198, 560 570. http://dx .doi.org/10.1016/j.tvjl.2013.08.011. McCrum - Gardner, E ., 2008. Which is the correct statistical tes t to use? Brit . J . Or al Maxillo . Surg . 46 , 38 - 41. https://doi.org/10.1016/j.bjoms.2007.09.002 . McDowell, L . R . , Wilkinson, N . , Madison, R . , Felix, T ., 2007. Vitamins and minerals functioning as a ntioxidants with supp lementation considerations. Flo rida Ruminant Nutrition Symposium. Gainesville, Florida (2007). McLachlan, G ., 2004 . Discriminant a nalysis and s tatistical p attern r ecognition. 
John Wiley & Sons, Inc, Hoboken, New Jersey. Melendez, P . , Barolome J ., Archba ld, L . F . , Donovan A ., 2003. The association between lameness, ovarian cysts, and fertility in lactation dairy cows. Theriogen . 59 , 927 - 937. Moons, K.G.M., Kengne, A.P., Woodward, M., Royston, P., Vergouwe, Y., Altman, D.G., Grobbee, D.E., 2012. Risk pre diction models: Develo pment, internal validation, and assessing the incremental value of a new (bio)marker. Heart 98, 683 - 690. cow nutriti on and production dis eases of the transition cow. Anim. Repro. Sci. 96, 331 353. Mulligan, F.J., Doherty, M.L., 2008. Production diseases of the transition cow. Vet J. 176, 3 - 9. 573 National Animal Health Monitoring System (NAHMS) , 1996. Part 1: Reference of 1996 Dairy Managemen t Practices . United States Department of Agriculture Animal and Plant Health Inspection Service: Veterinary Sciences: Centers for Epidemiology and Animal Health (USDA: APHIS: VS: CEAH) , Fort Collins, CO. National Animal Health Monito ring System (NAHMS), 2018. Health and management practices on U.S. dairy operations, 2014. United States Department of Agriculture - Animal and Plant Health Inspection Service: Veterinary Sciences: Centers for Epidemiology and Animal Health (USDA: APHIS: VS: CEAH), Fort Collins, Colorado. Neave, H . W . , Lomb, J . , von Keyserlingk, M . A . G . , Behnam - Shabahang, A . , Weary D . M ., 2017. Parity differences in the behavior of transition dairy cows. J . Dairy Sci . 100 , 548 - 561. https://doi.org/10.3168/jds.2016 - 10987. Nezl ek, J . B ., 2008. An i n troduction to m ultilevel m odeling for s ocial and p ersonality p sychology. Soc . Person . Psych . Comp . 2 , 842 - 860. http://dx.doi.org/ 10.1111/j.1751 9004.2007.00059.x Ngo, T . H . D ., 2016. Gener alized l inear m odels for n on - n ormal d ata. Proceedings from SAS Global Forum , Las Vegas, NV ( 2016 ) . Nordlund, K., Cook, N., Oetzel, G., 2006. Commingling Dairy Cows: Pen Moves, Stocking Density, and Health. 
39th Proceedings American Association Bovine Pra ctitioners, St. Paul, MN. (2006). Nyman, A., Emanuelson, U., Holtenius, K., Ingvartsen, K.L., Larsen, T., Waller, K.P., 2008. Metabolites and immune variables associated with somatic cell counts of primiparous dairy cows. J. Dairy Sci. 91, 2996 3009. https://doi.org/10.3168/jds.2007 - 0969 Martin, S.W., Nielsen, L.R., Pearl, D.L., Pfeiffer, D.U., Sa nchez, J., Torrence, M.E., Vigre , H., Waldner, C., Ward, M.P., 2016. Explanation and elaboration document for the STROBE - Vet statement: Strengthening the Reporting of Observational Studies in Epidemiology - Veterinary Extension. J. Vet. Intern. Med. 30, 189 6 - 1928. O etzel, G.R., 2004. Mo nitoring and testing dairy herds for metabolic disease. Vet. Clin. of North Amer.: Food Anim. Pract. 20, 651 674. Ospina, P.A., McArt, J.A., Overton, T.R., Stokol, T., Nydam, D.V., 2013. Using nonesterified fatty acids and - hydroxybu tyrate concentrations during the transition period for herd level monitoring of increased risk of disease and decreased reproductive and milking performance (in eng). The Vet. clinics of North Amer. Food Ani. Pract. 29, 387 412. Ospina, P.A., N ydam, D.V., Stokol, T., Overton, T.R., 2010. Associations of elevated nonesterified fatty acids and beta - hydroxybutyrate concentrations with early lactation reproductive performance and milk production in transition dairy cattle in the 574 northeastern United States. J. Dairy Sci. 93, 1596 603. https://doi.org/10.3168/jds.20092852 Overton, T . R ., Waldron, M . R ., 2004. Nutritional m anagement of t ransition d airy c ows: s trategies to o ptimize m etabolic h ealth. J . Dairy Sci . 87 , E105 - E119. Pavlou, M., Ambler, G., Seaman, S.R., Guttmann, O., Elliott, P., King, M., Omar, R.Z., 2015. How to develop a more accurate risk prediction model when there are few events. BMJ. 351, h3868. https://doi.org/10.1136/bmj.h3868 Peng, C . J . , Lee, K . L . , Ingersoll, G . M ., 2002. 
An i ntroduction to l ogistic r egression a nalysis and r eporting. J . Educ . Res . 96 , 3 - 14. Pepys, M . B ., 2012. Acute p hase p roteins in the a cute p hase r esponse. Spri ng - Verlag London Limited , London, England. Petrovski, K . R . , Trajcev, M . , Buneski, G ., 2006. A review of the factors affecting the costs of bovine mastitis. J . South African Vet . Assoc . 77 , 52 - 60. Pfeffermann, D., 1996. The use of sampling weights for su rvey data analysis. Stat. Meth. in Med. Res. 5, 239 - 261. Pinedo, P., Risco, C., Melendez, P., 2011. A retrospective study on the association between different lengths of the dry period and subclinical mastitis, milk yield, reproduct ive performance, and cu lling in Chilean dairy cows. J. Dairy Sci. 94, 106 - 115. https://doi.org/10.3168/jds.2010 - 3141 Pineiro, G., Perelman, S., Guerschman, J.P., Paruelo, J.M., 2008. How to evaluate mo dels: observed vs. pred icted or predicted vs. observed? Ecolog. Model. 216, 316 - 322. Pladevall, M . , Singal, B . , Williams, L . K . , Brotons, C . , Guyer, H . , Sadurni, J . , Falces, C . , Serrano Rios, M . , Gabriel, R . , Shaw, J . E . , Zimmet, P . Z . , Haffner, S . H ., 2006. A s ingle f actor u nderli es the m etabolic s yndrome . Diab . Care 29 , 113 - 122. Pludowski, P., Holick, M.F., Pilz, S., Wagner, C.L., Hollis, B.W., Grant, W.B., Shoenfeld, Y., Lerchbaum, E., Llewellyn, D.J., Kienreich, K., Soni, M., 2013. Vitamin D effects on musculoskeletal health, immunity, autoimmunity, cardiovascular disease, cancer, fertility, pregnancy, dementia and mortality - A review of recent evidence. Autoimmun. Rev. 12, 976 989. h ttps://doi.org/10.1016/j.autrev.2013 .02.004 Press, S . J ., Wilson, S ., 1978. Choosing b etween l ogistic r egression and d iscriminant a nalysis. J. Amer . Stat . Assoc . 73 , 699 - 705. Pryce, J.E., Coffey, M.P., Simm, G., 2001. The relationship between body conditio n score and reproductive performanc e. J. Dairy Sci. 84, 1508 1515. 
575 Quiroz - Rocha, G.F., LeBlanc, S., Duffield, T., Wood, D., Leslie, K.E., Jacobs, R.M., 2009. Evaluation of prepartum serum cholesterol and fatty acids concentrations as predictors of postpartum retention of the placenta in da iry cows. J. Am. Vet. Med. Assoc. 234, 790 793. https://doi.org/10.2460/javma.234.6.790 Rabe - Hesketh, S . , Skrondal, A ., 2008. Multilevel and l ongitudinal m od el u sing Stata, 2 nd ed . Stata Press, Colle ge Station, Texas. Rabiee, A.R., Lean, I.J., 2013. The effect of internal teat sealant products (Teatseal and Orbeseal) on intramammary infection, clinical mastitis, and somatic cell counts in lactating dairy cows: A meta - analysis. J. Dairy Sci. 96, 6915 - 6931. http://dx.doi.org/10.3168/jds.2013 - 6544 Re, R., Pellegrini, N., Proteggente, A., Pannala, A., Yang, M., Rice - Evans, C., 1999. Antioxidant activity applying an improved ABTS radical cation decoloriz ation assay. Free. Rad. Bio. & Med. 26, 1231 - 1237. Reichenheim, M.E., 2002. Two - graph receiver operating characteristic. The Stata J. 2, 351 - 357. Richards, S.A., Whittingham, M.J., Stephens, P.A., 2011. Model selection and model averaging in behavioural ecology: the utility of the IT - AIC framework. Behav . Ecol. Sociobiol. 65, 77 - 89. Richert, R . M . , Cicconi, K . M . , Gamrock, M . J . , Schukken, Y . N . , Stiglbauer, K . E . , Ruegg, P . L ., 2013. Risk factors for clinical mastitis, ketosis, and pneumonia in dairy cattle on organic and small conventional farms in the Unit ed States. J . Dairy Sc i. 96 , 1 - 17. Robinson, C., Schumacker, R.E., 2009. Interaction effects: centering, variance, inflation factor, and interpretation issues. Mult. Linear. Reg. Viewp. 35(1), 6 - 11 . Roche, J.R., Friggens, N.C., Kay, J.K., Fisher, M.W., Stafford, K.J., Berry, D.P., 2009. Invited review: Body condition score and its association with dairy cow productivity, health, and welfare. J. Dairy Sci. 92, 5769 5801. 
https://doi.or g/http://dx.doi.org/10.3168/jds.20092431 Rollin, E., Dhuyvetter, K.C., Overton, M.W., 2015. The cost of clinical mastitis in the first 30 days of lactation: An economic modeling tool. Prev. Vet. M ed. 121, 257 - 264. Sainani, K.L., 2014. Explanatory versus predictive modeling. PM&R. 6, 841 844. https://doi.org/10.1016/j.pmrj.2014.08.941 Santos, J.E.P., Bisinotto, R.S., Ribeiro, E.S., Lima, F.S., Thatcher, W.W., 2012. Impacts of Metabolism and Nutri tion During the Transition Period on Fertility of Dairy Cows. In: 2012 High Plains Dairy Conference, Amarillo, Texas. pp. 97 - 112. 576 Saun, V ., 2004. Metabolic p rofiling and h ealth r isk in t rans ition c ows. Proceedings 37 th Annual American Association of Bovin e Practitioners Convention, Ft. Worth, Texas (2004). Sarkar, S.K., Midi, H., Rana, S., 2010. Model selection in logistic regression and performance of its predictive ability. Comput. Stat. 4, 5813 5822. Schmidt, C . O ., Kohlmann, T ., 2008. When to use the odds ratio or the relative risk? Intern . J . Pub . Heal . 53 , 165 - 167. http://dx.doi.org/10.1007/s000 - 00 - 7068 - 3 . Schmidt - Catran, A.W., 2016. IGENERATE: Stata module to apply a variety of coding schemes, including weighted effect coded interactions. Statis tical Software Components S458259, Boston College Department of Economics. Seifi, H . A . , LeBlanc, S . J . , Leslie, K . E ., Duffield T . F ., 2011. Metabolic predictors of post - partum disease and c ulling risk in dairy cattle. Vet . J . 188 , 216 - 220. https://doi.org/10.1016/j.tvjl.2010.04.007 . Seifi, H.A., Mohri, M., Ehsani, A., Hosseini, E . , 2005. Interpretation of bovine serum total calcium: effects of adjustment for albumin and total protein. Comp. Clin. Pa thol. 14: 155 159. https://doi.org/10.1007/s00580 - 005 - 0582 - 2 . Seifi, H.A., Mohri, M., Zadeh, J.K., 2004. Use of pre - partum urine pH to predict the risk of milk fever in dairy cows. The Vet. J. 167 , 281 - 285. 
Serhan, C.N., Ward, P.A., Gilroy, D.W., 2010. Fundamentals of inflammation. Cambridge University Press, New York, NY. Shmueli, G. , 2010. To explain or to predict? Stat. Sci. 25, 289 310. https:// doi.org/10.1214/10STS330 Shtatland, E . S . , Cain, E . , Barton, M . B ., 2001. The perils of stepwise logistic regression and how to escape them using information criteria and the output delivery system. Proceedings from the 26th Annual SAS Users Group Internat ional Conference , Long Beach, California (2001). Shtatland, E . S . , Kleinman, K . , Cain E . M ., 2004. A n ew s trategy of m odel b uilding in P ROC LOGISTIC with a utomatic v ariable s election, v alidation, s hrinkage, and m odel a y, N orth Carolina (2004). Simensen, E . , Ostera, O . , Boe, K . E . , Kielland, C . , Rudd L . E . , Naess, G ., 2010. Housing system and herd size interactions in Norwegian dairy herds; association with performance and disease incidence. Acta Vet . Scand . 52 , 14. Sin gal, A . G . , Mukherjee, A . , Elmunzer, B . J . , Higgins, P . D . R . , Lok, A . S . , Zhu, J . , Marrero, J . A . , Walijee, A . K ., 2013. Machine l earning a lgorithms o utperform c onventional r egression 577 m odels in p redicting d evelopment of h epatocellular c arcinoma. Amer . J . Gastroe nt . 108 , 1723 - 1730. http://dx.doi.org/10.1038/ajg.2013.332. Sordillo, L.M., Aitken, S.L., 2009. Impact of oxidative stress on the health and immune function of dairy cattle. Vet. Imm. and Immunopath. 128, 104 109. https://doi.org/10.1016/j.vetimm.2008.10.305 Sordillo, L ., Erskine, R . 2010. Bovine l eukosis v irus u pdate II: i mpact on i mmunity and d isease r esistance. Mich . Dairy Rev . 15 , 4 - 5. Sordillo, L.M., Raphael, W., 2013. Significance of met abolic stress, lipid mobilization, and inflammation on transition cow disorders. Vet. Clin. Food. Anim. 29, 267 - 278. http://dx.doi.org/10.1016/j.cvfa.2013.03.002 Sordillo, L.M., Mavangira, V., 2014. 
The nexus between nutrient metabolism, oxidative stress and inflammation in transition cows. Anim. Prod. Sci. 54, 1204 1214. http://dx.doi.org/10.1071/AN14503 Sorensen, J.B., 2002. The use and misuse of the coefficient of variation in organizational demography research. Socio. Meth. & Res. 30, 475 - 491. Spears, J . W . , Weiss, W . P ., 2008. Role of antioxidants and trace elements in health and immunity of transition dairy cows. Vet . J . 176 , 70 76. http://d x.doi.org/10.1016/j.tvjl.2007.12.015. Steensels, M . , Antler, A . , Bahr, C . , Berckmans, D . , Maltz, E . , Halachmi, I ., 2016. A decision tree model to detect post - calving disease based on rumination, activity, milk yield, BW, and voluntary visits to the milkin g robots. Ani m. 10 , 1493 - 1500. http://dx.doi.org/10.1017/S1751731116000744. Sterne, J.A.C., White, I. R., Carlin, J.B., Spratt, M., Royston, P., Kenward, M.G., Wood., A.M., Carpenter, J.R., 2009. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 338, b2393. https://doi.org/10.1136/bmj.b2393 Steyerberg, E . W . , Eijkemans, M . J . C . , Habbema, J . D . F ., 2001a. Application of shrinkage techniques in logistic re gression analysis: a case study. Stat . Neerland . 55 , 76 - 88. Steyerberg, E.W., Harrell, F.E., Borsboom, G.J.J.M., Eijkemans, M.J.C., Vergouwe, Y., Habbema, J.D.F., 2001 b . Internal validation of predictive models: Efficiency of some procedures for logistic regression analysis. J. Clin. Epidemiol. 54, 774 781. https://doi.org/10.1016/S0895 - 4356(01)00341 - 9 Steyerberg, E.W., Bleeker, S.E., Moll, H.A., Grobbee, D.E., Moons, K.G.M., 2003. Internal and external validation of predictive models: A simulation study of bias and precision in 578 small s amples. J. Clin. Epidemiol. 56, 441 447. https://doi.org/10.1016/S0895356(03)00047 - 7 Strucken, E.M., Laurenson, Y.C.S.M., Brockmann, G.A., 2015. Go with the flow - biology and genetics of the lact ation cycle. Front. Gene. 
6. https://dog.org/ 10.3389/fgene.2015.00118 Suhr, D . D ., 2006. Exploratory or c onfirmatory f actor a nalysis. SAS Users Group Internation al Conference, San Francisco, California (2006). Symonds, M.R.E., Moussalli, A., 2011. A brief guide to model selection, multimodel inference Ecol. Sociobiol. 65, 13 21. https://doi.org/10.1007/s0026 5 - 010 - 1037 - 6 Tall, A.R., Yvan - Charvet, L., 2015. Cholesterol, inflammation and innate immunity. Nat. Rev. Immunol. 15(2), 104 - 116. Tóthová , C., Nagy, O., , G. , 2014. Relationship between some variables of protein profile and indicators of lipmobilization in dairy cows after calving. Archiv. Tierzucht. 57(19), 1 - 9. Treacher, R., Reid, I., Roberts, C., 1986. Effect of body condition at calving on the h ealth and perfor mance of dairy cows. Anim. Sci. 43 (1), 1 - 6. https://dx.doi.org/10.1017/S0003356100018286 Trevisi, E., Amadori M., Archetti, I., Lacetera, N., Bertoni, G., 2011. Inflammatory resp onse and acute phase proteins in the transition period of high - yielding dairy cows, in: Prof. Francisco Veas (Ed.), Acute phase proteins as early non - specific biomarkers of human and veterinary diseases. InTech, pp. 355 - 381. Turner, H ., 2008. Introduction to gener alized linear models . ESRC National Centre for Research Methods, UK and Department of Statistics, University of Warwick. Available from: http://statmath.wu.ac.at/cour ses/heat her_turner/glmCourse_001.pdf [June 23, 2018]. Van Breukelen, G . J . P ., Candel M . J . J . M ., 2012. Calculating sample sizes for cluster randomized trials: We can keep it simple and efficient! J. Clin . Epidem . 65 , 1212 - 1218. Van der Ark, L . A . , Croon, M . A . , Sijtsma, K ., 2012. New d evelopments in c ategorica d ata a nalysis for the s ocial and b ehavioral s ciences. Psychology Press, New York, NY and Hove, East Sussex . Vanholder, T., Papen, J., Bemers, R., Vertenten, G., Berge, A.C.B., 2015. 
Risk factors for subclinical and clinical ketosis and association with production parameters in dairy cows in the Netherlands. J. Dairy Sci. 98, 880 888. https://doi.org/10.3168/jds.2014 - 8362 579 Van Saun, R . J ., Sniffen, C . J ., 2014. Transition c ow n utrition and f eeding m anagement for d isease p revention. Vet . Clin . North America: Food Anim . Pract . 30 , 689 - 719. http://dx.doi.org/10.1016/j.cvfa.2014.07.009. Vaughan, T.S., Berry, K.E., 2005. Using Monte Carlo techniques to dem onstrate the meaning and implications of multicollinearity. J. Of Stat. 13, 1069 - 1898. https://doi.org/10.1080/10691898.2005.11910640 Vergara, C.F., Dopfer, D., Cook, N.B., No rdlund, K.V., McA rt, A., Nydram, D.V., Oetzel, G.R., 2014. Risk factors for postpartum problems in dairy cows: Explanatory and predictive modeling. J. Dairy. Sci. 97, 4127 - 4140. Vittinghoff, E., McCulloch, C.E., 2007. Relaxing the rule of ten events per v ariable in logist ic and cox regression. Amer. J. of Epi. 165, 710 - 718. https://doi.org/10.1093/aje/kwk052. Wagenmakers, E. - J., Farrell, S., 2004. AIC model selection using Akaike weights. Psychon. Bull. Rev. 11, 192 196. h ttps://doi.org/10.3758/BF03206482 Wang, L., Manson, J.E., Song, Y., Sesso, H.D., 2010. Systematic review: vitamin D and calcium supplementation in prevention of cardiovascular events. Ann. of. Intern. Med. 152, 315 - 323. Wathes, D . C . , Ch eng, Z . , Bourne, N . , Taylor, V . J . , Coffey, M . P . , Brotherstone, S ., 2007. Differences between primiparous and multiparous dairy cows in the inter - relationships between metabolic traits, milk yield and body condition score in the periparturient period. Dom . Anim . Endocrin . 3 3 , 203 - 225. https://doi.org/10.1016/j.domaniend.2006.05.004. Wei, W., Held, L., 2014. Calibration tests for count data. Test 23, 787 - 805. http://doi.org/10.1007/s11749 - 014 - 0380 - 8 Weiss, W . P ., 1998. Requirements of f at - soluble v itamins for d airy c ows: a r eview. J . Dairy Sci . 
81 , 2493 - 2501. Wiens, B.L., 2002. Choosing an equivalence limit for noninferiority or equivalence studies. Con. Clin. Tri. 23, 2 - 14. Wisnieski, L., Norby, B., Pierce, S.J., Be cker, T., Gandy, J.C., Sordillo L.M., 2019 a . Predictive models for early lactation diseases in transition dairy cattle at dry - off. Prev. Vet. Med., 163: 68 - 78. https://doi.org/10.1016/j.prevetm ed.2018.12.014 . Wisnieski, L., Norby, B., Pierce, S.J., Becker, T., Gandy, J.C., Sordillo, L.M., 2019b. Cohort level disease prediction using aggregate biomarker data in transition dairy cattle. Manuscript submit ted for publication. Yang, F . L . , Li, X . S . , 2015. Role of antioxidant vitamins and trace elements in mastitis in dairy cows. J. Adv . Vet . Anim . Res . 2 , 1 - 9. http://dx.doi.org/10.5455/javar.2015.b48. 580 Yang, R . Z . , Lee, M . J . , Hu, H . , Pollin, T . I . , Ryan, A . S . , Nicklas, B . J . , Snitker, S . , Horenstein, R . B . , Hull, K . , Goldberg, N . H . , Goldberg, A . P . , Shuldiner, A . R . , Fried, S . K . , Gong, D . W ., 2006. Acute - p hase s erum a m y loid A: a n i nflammatory a dipokine and p otential l ink between o besity and i ts m etabolic c omplications. PLoS Med . 3 , e287. Yin, K ., A grawal, D . K ., 2014. Vitamin D and inflammatory diseases. J . Inflamm . Res . 7 , 69 - 87. http://dx.doi.org/10.2147/JIR.S63898. Zaborski, D . , Grzesiak, W . , Pilarczyk, R ., 2016. Detection of difficult calvings in the Polish Holstein Black - and - White heifers. J . Appl . An im . Res . 44 , 42 - 53. https://doi.org/10.1080/09712119.2014.987293 Zhang, G., Hailemariam, D., Dervishi, E., Deng, Q., Goldansaz, S.A., Dunn, S.M., Ametaj, B.N., 2015. Alterations of innate immunity re actants in transition dairy cows before clinical signs o f lameness. Animals. 5, 717 747. https://doi.org/10.3390/ani5030381 Zhang, G., Hailemariam, D., Dervishi, E., Ali, S., Deng, Q., Dunn, S.M., Ametaj, B.N., 2016. 
Dairy cows affected by ketosis show alterat ions in innate immunity and lipid and carbohydrate metabolism during the dry off period and postpartum. Res. in Vet. Sci. 107, 246 256. https://doi.org/10.1016/j.rvsc.2016.06.012 Zorn, C ., 2005. A s o lution to s eparation in b inary r esponse m odels. Polit . A naly . 13 , 157 - 170. http://dx.doi.org/10.1093/pan/mpi009 . Zurbrigg, K . , Kelton, D . , Anderson, N . , Millman S ., 2005. Tie - s tall d esign and its r elationship to l ameness, i njury, and c leanliness on 317 On ta rio d airy f arms. J . Dairy Sci . 88 , 3201 3210.