POSTMORTEM MICROBIOME COMPUTATIONAL METHODS AND APPLICATIONS By Sierra Frances Kaszubinski A THESIS Submitted to Michigan State University in partial fulfilment of the requirements for the degree of Integrative Biology—Master of Science 2020 POSTMORTEM MICROBIOME COMPUTATIONAL METHODS AND APPLICATIONS ABSTRACT By Sierra Frances Kaszubinski Microbial communities have potential evidential utility for forensic applications. However, bioinformatic analysis of high-throughput sequencing data varies widely among laboratories and can potentially affect downstream forensic analyses and data interpretations. To illustrate the importance of standardizing methodology, we compared analyses of postmortem microbiome samples using several bioinformatic pipelines, while varying minimum library size or the minimum number of sequences per sample, and sample size. Using the same input sequence data, we found that pipeline significantly affected the microbial communities. Increasing minimum library size and sample size increased the number of low abundant and infrequent taxa detected. Our results show that bioinformatic pipeline and parameter choice significantly affect the resulting microbial communities, which is important for forensic applications. One such forensic application is the potential postmortem reflection of manner of death (MOD) and cause of death (COD). Microbial community metrics have linked the postmortem microbiome with antemortem health status. To further explore this association, we demonstrated that postmortem microbiomes could differentiate beta-dispersion among M/COD, especially for cardiovascular disease and drug-related deaths. Beta-dispersion associated with M/COD has potential forensic utility to aid certifiers of death by providing additional evidence for death determination. Additional supplemental files including tables of raw data and additional statistical tests are available in supplemental files online, denoted in the text as table ‘S’. This thesis is dedicated to my family- my biggest supporters and inspirations. Thank you for believing in me, and not letting me drop out to be a police officer. It was for the best. iii ACKNOWLEDGMENTS I first would like to thank my committee members, Drs. Mariah H. Meek, M. Eric Benbow, Jennifer L. Pechal, and Sarah E. Evans for their contribution to my scientific growth. Dr. Meek, I am forever grateful for the day I became your ‘bonus’ student, thank you for your endless support. Dr. Benbow, I’m glad I finally realized all my favorite forensic studies were coming from one floor below me, thank you for all the adventures. Dr. Pechal, thank you for all your help, you truly shaped me into a scientist I am proud to be. Dr. Evans, I appreciate your comments that made instrumental improvements to my studies, thank you for your support. I would also like to thank Drs. Carl J. Schmidt and Heather R. Jordan for their contributions to my research. I truly enjoyed our forensic conversations. Thank you to Drs. Ruth Smith and Mary Finn, you are a large part of the reason this thesis is completed. I also want to thank the Meek lab: Charlene, Miranda, Nadya, Sara, and Shannon. Thank you for all the coffee runs and conversations. To the Benbow lab (past and present): Breanna, Courtney L., Courtney W., Emily, Joe, Katelyn, and Nick. I enjoyed every part of fieldwork, most of lab work, and some of coding with you. Thanks for making my time in grad school so much more enjoyable. Thank you to the SMART scholarship program, and the Defense Forensic Science Center. Because of you, I have my dream job. To Thomas Meyer and Henry Maynard, thank you for your mentorship and great discussions. Thank you to the Department of Integrative Biology, despite my unusual admittance, I felt at home in the program. Finally, thank you to my friends and family. Thank you for believing in me, even when I didn’t believe in myself. Thank you for listening to my endless complaints about Michigan weather, and being there for me through every defeat and every triumph. I literally could not have done this without you, your support means everything to me. This one’s for you. iv TABLE OF CONTENTS LIST OF TABLES .................................................................................................................vi LIST OF FIGURES ...............................................................................................................vii CHAPTER I: EVALUATING BIOINFORMATIC PIPELINE PERFORMANCE FOR FORENSIC MICROBIOME ANALYSIS ............................................................................1 Introduction ................................................................................................................2 Material and Methods ................................................................................................8 Sample Collection, DNA Extraction, and Sequencing ..................................9 Pipeline Comparison ......................................................................................10 Data Analysis and Bioinformatics .................................................................13 A Priori Power Analysis ................................................................................16 Data Availability ............................................................................................16 Results ........................................................................................................................17 Pipeline Comparison ......................................................................................18 Minimum Library Size ...................................................................................23 Sample Size ....................................................................................................26 Discussion ..................................................................................................................29 Pipeline Comparison ......................................................................................30 Minimum Library Size ...................................................................................31 Sample Size ....................................................................................................33 Conclusion .................................................................................................................35 CHAPTER II: DYSBIOSIS IN THE DEAD: HUMAN POSTMORTEM MICROBIOME BETA-DISPERSION AS ONE INDICATOR OF MANNER AND CAUSE OF DEATH DURING AUTOPSY.............................................................................................................37 Introduction ................................................................................................................38 Materials and Methods ...............................................................................................43 Sample Collection, DNA Extraction, and Sequencing ..................................44 Data Analysis and Bioinformatics .................................................................44 Method Selection ...............................................................................45 Model Selection .................................................................................46 Case Studies .......................................................................................48 Data Availability ............................................................................................49 Results ........................................................................................................................50 Method Selection ...........................................................................................51 Model Selection .............................................................................................52 Case Studies ...................................................................................................59 Discussion ..................................................................................................................62 Conclusion .................................................................................................................68 APPENDIX ............................................................................................................................70 REFERENCES ......................................................................................................................75 v LIST OF TABLES Table 1: Summary of parameter differences among pipelines compared. .............................13 Table 2: Summary of sample number and sequence read differences among pipelines. .......20 Table 3: Random forest classification error among pipelines. ..............................................20 Table 4: Total number of sequences that remained after filtering, and unclassified sequences (phylum and family taxonomic level) among sample area, sample size, and minimum library size. ........................................................................................................................................25 Table 5: Random forest classification error among sample sizes and minimum library sizes. ................................................................................................................................................28 Table 6: Case metadata stratified by manner/cause of Death (M/COD). ..............................54 Table 7: Summary of logistic regression models classifying natural vs. accident, cardiovascular vs. drug-use, and disease vs. non-diseased state............................................58 vi LIST OF FIGURES Figure 1: Workflow for bioinformatic analysis. ....................................................................12 Figure 2: In silico data error among pipelines. ......................................................................19 Figure 3: Microbial community composition among pipelines. ............................................22 Figure 4: Microbial community composition among sample sizes and minimum library sizes. .......................................................................................................................................24 Figure 5: Core microbiome among sample sizes and minimum library sizes. ......................27 Figure 6: Beta-dispersion values among body sites and manners/causes of death (M/CODs). ................................................................................................................................................53 Figure 7: Multinomial logistic regression model comparison among full community and random forest indicator community beta-dispersion for manners/causes of death (M/CODs). .............................................................................................................................57 Figure 8: Logistic regression models of the best performing pair-wise comparisons. ..........59 Figure 9: Proposed workflow with suicide matched-design case study. ...............................61 Supplemental Figure 1: Summary of Kruskal-Wallis results for normalization strategies determining beta-dispersion. ..................................................................................................71 Supplemental Figure 2: Summary of Fligner-Killeen results for normalization strategies determining beta-dispersion. ..................................................................................................72 Supplemental Figure 3: Alpha-diversity metrics across normalization strategies and levels . ................................................................................................................................................73 Supplemental Figure 4: Shannon diversity across sequencing depth among microbial samples. ..................................................................................................................................74 vii CHAPTER I: EVALUATING BIOINFORMATIC PIPELINE PERFORMANCE FOR FORENSIC MICROBIOME ANALYSIS 1 Introduction 2 Before the widespread use of next-generation sequencing (NGS), forensic microbiology was limited to identifying pathogens of bio-crimes with culture-based methods, such as the 2001 anthrax letter attacks (Budowle et al. 2003). Now with the advent of NGS technology, amplicon sequencing can describe entire microbial communities from evidence rather than just targeting a single microbe of interest. Microbes are an important forensic resource as they are ubiquitous organisms, with community compositions specific to different environments or hosts (i.e., the location of a body or body part) and that vary over time, such as during decomposition (Pechal et al. 2018; Metcalf et al. 2016). NGS has expanded microbial forensics to many potential applications including: body fluid identification, human identification, and postmortem interval estimation (Schmedes, Sajantila, and Budowle 2016). Targeted amplicon sequencing of 16S rRNA identified potential microbial biomarkers for sensitive body fluid identification (Dobay et al. 2019), and a clade based, single nucleotide polymorphism approach identified human individuals using their skin microbiome (Schmedes, Woerner, and Budowle 2017). Postmortem microbiome studies have included a diverse set of investigative circumstances to better understand how microbial communities after death can inform forensic investigation. Studies have used human-surrogates, such as swine (Benbow et al. 2015), humans (Pechal et al. 2018), and grave soil (Metcalf et al. 2016), as well as experiments in anthropologic facilities (Pechal et al. 2014) and during routine autopsy for death investigation (Pechal et al. 2018). Researchers have developed models for forensic applications (i.e., postmortem interval estimation), and described microbial community function and succession during decomposition(Benbow et al. 2015; Metcalf et al. 2016; Pechal et al. 2018). While recent research suggests excellent potential for microbial community use in 3 forensics (Metcalf et al. 2013, 2017; Carter et al. 2016; Johnson et al. 2016), additional foundational work is needed before forensic microbiology using postmortem microbial community assemblages can be applied in the criminal justice system. Needed foundational work includes standardizing parameters for transforming raw microbial sequences into a usable data format for downstream analyses. For bioinformatic analysis of microbial data, raw sequence data files generated from NGS undergo a series of transformations using executable command line software known as pipelines (Leipzig 2017). Forensic microbiological data created by high-throughput sequencing platforms are processed using complex analyses that require users to make processing decisions along these pipelines (e.g., Should samples be normalized? Which taxonomic database should be used?). It is hypothesized that different decisions could have downstream effects on results and their interpretation (Sivarajah et al. 2017; Golob et al. 2017). For any downstream application of forensic microbiology in the criminal justice system, there is a need for streamlined standard operating procedures (SOPs) (Carter et al. 2016). There are self-contained pipelines for processing sequence data for characterizing microbial communities. Three of the most often used self-contained pipelines are: QIIME2 (Bolyen et al. 2019), mothur (Schloss et al. 2009), and MG-RAST (Keegan, Glass, and Meyer 2016). QIIME2 and mothur require some command line experience, whereas MG-RAST has a web-based graphical user interface (GUI) (Plummer and Twin 2015). Several studies determined that the microbial communities generated from different pipelines were comparable; however, these studies used simulated data (Golob et al. 2017; Siegwald et al. 2017; Mysara et al. 2017; Nilakanta et al. 2014; McMurdie and Holmes 2014; Weiss et al. 2017), small sample sizes (n < 40) 4 (Plummer and Twin 2015; McMurdie and Holmes 2014; Weiss et al. 2017; D’Argenio et al. 2014) or were composed of the same sample type (i.e., human gut microbial data) (Plummer and Twin 2015; D’Argenio et al. 2014), which does not readily apply to many forensic datasets that can include highly variable sampling locations or contextual circumstances (Benbow et al. 2015; Metcalf et al. 2016; Pechal et al. 2018). In addition to choosing the pipeline, microbial sequence data analyses can be confounded by different library sizes, or the number of sequence reads per sample (McMurdie and Holmes 2014). The common library size normalization procedure for microbial analysis is referred to as rarefying (McMurdie and Holmes 2014). To rarefy data, a minimum library size is chosen, samples with too few reads are discarded, and the remaining sample reads are subsampled without replacement to the minimum library size (McMurdie and Holmes 2014). Minimum library size is often chosen by selecting the smallest library size of a non-defective sample, which is a subjective assessment that can add uncertainty to microbial community analysis (McMurdie and Holmes 2014). Minimum library size can also be chosen based on rarefaction curves: taxon-based re-sampling curves that indicate species richness and coverage analysis to justify a certain library size. Potentially useful data are omitted during rarefying, which can decrease the power and specificity of analyses as samples are discarded, and the samples that remain may not be distinguishable using a fraction of the original reads (McMurdie and Holmes 2014). While rarefying can decrease power, sample size is an additional factor to consider for downstream analysis of microbial data. Most postmortem microbiome studies have small sample sizes (Pechal et al. 2018). Often, this is due to space, time or resource limitations of using human donated bodies or surrogate carcasses, such as swine. Prior to 2018, when Pechal et al. (2018) characterized and 5 modeled postmortem microbiomes using 188 autopsies, sample sizes for postmortem microbiome studies range from 2 to 48 (Metcalf et al. 2016; Hyde et al. 2013). Small samples sizes can decrease statistical power and inflate effect sizes, which can lead to unreliable conclusions that may not be generalizable to other data sets (Ioannidis 2005), which can have important consequences in forensics (Clarke et al. 2017). Given this, it is important to evaluate data analysis parameters with large, spatially and temporally heterogenous datasets with multiple sample areas (i.e., body locations) in order to improve the reliability of postmortem microbiome data. Downstream analytical methods for forensic microbiology studies are still being evaluated. Recently, machine learning algorithms and parameterization were comprehensively tested for potential direct forensic applications to postmortem interval estimation, manner of death determination, and location of death (i.e. inside, outside, hospital death) for this particular dataset (Zhang et al. 2019). Therefore, we focus on preceding steps of microbial analysis before modeling applications for comparing forensic predictions. The overall goal was to provide an initial assessment of how pipeline choice and data processing parameter differences affect data outcomes that are used as inputs for downstream modeling and prediction, with the intention that forensic researchers and examiners could make informed decisions about study design, data analysis methods, and applications relevant to their forensics needs. A better understanding of performance among bioinformatic pipelines and parameters is needed to reveal potentially significant differences in downstream analysis and data interpretation, especially for future use in the forensic sciences. To determine how different microbiome data processing parameters affect downstream analytical outcomes, we determined how pipeline software, library size normalization by rarefying, and sample size affect common microbiome community metrics and 6 machine learning model results using a large postmortem microbiome dataset (n=188). We have emphasized using standardized, recognizable, and user- friendly methods encountered in open- access microbiome analyses tools for forensic applications. 7 Materials and Methods 8 Sample Collection, DNA Extraction, and Sequencing Postmortem microbiome data were obtained from our previously published study that characterized the microbial communities from multiple body sites in 188 routine autopsy cases (Pechal et al. 2018). The postmortem microbiome data for this study represented the microbial communities from the mouth and rectum, with cases that spanned all four seasons (Spring, Summer, Fall, and Winter), manners of death (accident, homicide, suicide, natural), postmortem intervals (< 24h - > 73+ h), and ages (18-55+ years) (Table S1). Detailed methods for sample collection, DNA extraction, and sequencing are available in Pechal et al. (2018). In short, trained personnel at Wayne County Medical Examiner’s Office in Detroit, Michigan took swab samples during routine autopsy using DNA-Free sterile cotton-tipped applicators (25–806 1WC FDNA, Puritan®, Guilford, MA, USA). Each swab was rotated for 3-5 seconds on the body location to sample the microbial community. Samples were placed in sterile microfuge tubes and 200 μl of 100% molecular grade ethanol (BP2818-4, Fisher Scientific, Waltham, MA, USA) and stored at -20℃. Metadata were collected for each case including: sampling date (season), anatomic region sampled, sex, ethnicity, estimated age (years), postmortem interval (PMI), body mass index (BMI; kg/m2), event location (indoors, outdoors, hospital, vehicle), and manner of death (Table S1). Manner of death and PMI estimates were determined by a board-certified forensic pathologist at the time of autopsy. To determine the microbial communities, total DNA was extracted in a biological safety cabinet with aseptic conditions using PureLink® Genomic DNA Mini Kit (Thermo Fisher Scientific, Waltham, MA, USA) following manufacture instructions except an additional 5 ng/μl of lysozyme was added during the lysis step for each sample reaction (Pechal et al. 2017). The samples were quantified by Qubit 2.0 and the Quant-iT dsDNA HS Assay kit (Thermo Fisher 9 Scientific, Waltham, MA, USA). Microbial DNA was sequenced at the MSU Genomics Core Facility (East Lansing, MI, USA) using an Illumina MiSeq. The MSU Genomics Core Facility prepared the 16S rRNA gene amplicon library and sequenced the samples using a modified version of the protocol adapted for the Illumina HiSeq2000 and MiSeq. The V4 region 16S gene amplicon 2 x 250 bp paired-end reads were generated with region specific primers [515 f (5′ GTGCCAGCMGCCGCGGTAA) and 806 r (5′ GGACTACHVGGGTWTCTAAT)] that included Illumina flowcell adapters (Caporaso et al. 2010). PCR products were normalized and pooled with Invitrogen SequalPrep DNA Normalization Plates. A combination of Qubit dsDNA HS, Caliper LabChipGX HS DNA, and Kapa Illumina Library Quantification qPCR assays were used to quantify the pooled library. Amplicons were sequenced with custom primers complementing amplification primers to avoid primer sequencing after cluster formation as described by Kozich et al. (Kozich et al. 2013). Pooled sequences were loaded on an Illumina MiSeq standard flow cell (v2) and sequenced using a 500 cycle reagent cartridge. Filtering parameters were optimized for detecting low abundance phylogenetic diversity (Caporaso, Knight, and Kelley 2011; Caporaso et al. 2012). Bases were called by the Illumina Real Time Analysis (RTA) v1.18.54, and the output was demultiplexed and converted to FastQ format by Illumina Bcl2fastq v1.8.4. Pipeline Comparison Sequence reads from postmortem microbiome samples were analyzed with mothur v1.39.5 (Schloss et al. 2009), QIIME2 v2018.11 (Bolyen et al. 2019), and MG-RAST v4.0.3 (Keegan, Glass, and Meyer 2016) to determine how the microbial community composition and 10 diversity metrics (alpha- and beta-diversity) varied among pipelines. The SOP for mothur (Kozich et al. 2013), the QIIME 2 moving pictures tutorial (Caporaso et al. 2011), and the MG- RAST manual (Glass et al. 2010) were used for reference to analyze the samples. For mothur and QIIME2, the SILVA small subunit database v132 (Quast et al. 2013) was used at a 99% similarity cutoff for taxonomic classification. The database version of SILVA used by MG- RAST is unknown, as it is not reported by the authors or on the website, even after multiple inquires. An overview of each pipeline workflow, including commands used to run the mothur and QIIME2, are represented in Figure 1. Some steps were conserved among pipelines (i.e., quality control and classify sequences/assign taxonomy) but the number and order of steps occurred in different succession among pipelines depending on how the pipeline developers created the software (Figure 1). For QIIME2, DADA2 v1.8.0 (Callahan et al. 2016) corrected Illumina amplicon sequencing data, including removing phiX reads. Sequence were aligned using MAFFT v7.397 (Katoh and Standley 2013) and FastTree v2.1 (Price, Dehal, and Arkin 2010) created the phylogenetic tree. Biom tables were created using biom-format package v2.1(McDonald et al. 2012) and exported. For mothur, OTUs (Operational Taxonomic Units) and taxonomy tables were exported as column separated values (csv) files. For MG-RAST, identified sequences were clustered using UCLUST v6 (Edgar 2010). OTUs and taxonomy tables were exported as tab separated values (tsv) files. As many parameters were standardized as possible, but analysis methods among pipelines differed (Table 1). For example, all pipelines used the SILVA database for taxonomic classification, and VSEARCH v2.8.0 (Rognes et al. 2016) for chimera detection and removal, while the taxonomic classification algorithms and alignment methods differed (Table 1). 11 To calculate error rate for each pipeline, in silico sequences were taken from “mockrobiota” (Bokulich et al. 2016), an online repository of sequences used to assess pipeline error rate. The mock-3 16S rRNA dataset contained sequence data, corresponding taxonomy, and relative abundance of the OTUs. Using pipeline workflows mentioned above, four samples were run through each of the pipelines: two samples representative of an “even” (all taxa have the same relative abundance) community composition and two “staggered” (taxa have varied relative abundances) community composition samples. Figure 1: Workflow for bioinformatic analysis. Several of the steps were shared among pipelines including quality control and aligning sequences. Commands from the code are in square brackets, while parameters are in parentheses. OTU: Operational taxonomic unit. 12 Table 1: Summary of parameter differences among pipelines compared. Adapted from Plummer et al. (2015). Bolded text were parameters chosen for running the pipelines. License Language Current Version Cited (Google Scholar) Web Based Interface Quality Control 16S rRNA Database Alignment Method Taxonomic Assignment Chimera Detection User Support QIIME2 Open-Source Python 2019.1 mothur Open-Source C++ 1.39.5 MG-RAST Open-Source Perl 4.0.3 18,800 14,300 4,620 GUI, API, CLI API, CLI GUI, API YES YES YES SILVA, Greengenes, UNITE mafft Naive Bayes classifier VSEARCH SILVA, Greengenes, RDP SILVA, Greengenes, RDP, ITS Needleman-Wunsch BLAT Wang BLAT VSEARCH VSEARCH Forum, tutorials, FAQs Forum, SOPs, FAQs, user manual Video tutorials, FAQs, user manual, ‘How to’ section on website Data Analysis and Bioinformatics Pipeline outputs were quantitatively compared using Biom files from QIIME2, and OTUs and taxonomy files from mothur and MG-RAST were combined with metadata as phyloseq v1.24.0 (McMurdie and Holmes 2013) objects in R v3.5.1 (R Core Team) and rarefied to 1,000 sequences to account for the variability of library size among pipelines. Taxonomic names were corrected so that outputs could be properly merged with comparable taxa names (Table S2). Two methods were used to compare pipelines: statistical analyses of pipeline outputs from postmortem autopsy case microbiome data and pipeline error analysis using in silico data. These 13 comparison methods were chosen based on previous pipeline comparison research, which either compared pipeline output or in silico data (Plummer and Twin 2015; Sivarajah et al. 2017). Relative abundance was determined by combining all samples analyzed within each pipeline and sample area, removing taxa that were less than 1% abundance, and determining the proportional contribution of each taxa to the total community. Differentially abundant taxa among pipelines were determined from relative abundance at the phylum and family level using ANCOM (Mandal et al. 2015). Alpha-diversity metrics (observed richness, Chao1, Shannon diversity (1- D), Inverse Simpson diversity (1/D)) were calculated using phyloseq. Alpha-diversity metrics were compared using Kruskal-Wallis and Nemenyi post hoc tests with the R stats and PMCMR packages (Pohlert 2018). Beta-diversity metrics, evaluated using Principal Coordinates Analysis (PCoA) of Jaccard distances, were plotted using phyloseq. Jaccard was chosen as a presence/absence method to buffer against relative abundance differences found among parameters. PERMANOVA (permutational multivariate analysis of variance) from the vegan v2.5-2 (Oksanen et al. 2019) package confirmed beta-diversity and dispersion differences among pipelines. In addition to diversity metrics, a measure of divergence was assessed (Kullback- Leibler Divergence) (la Rosa et al. 2012) and found to be generally consistent with trends of diversity metrics (Table S3). Classifications were made of sample area and manner of death using random forest (randomForest package v4.6-14) (Breiman et al. 2018) among pipelines. Out-of-bag (OOB) error rates were reported. However, test-set validation (70% training sets, 30% test sets) was also tested and the error rates were within 2% of the OOB error rates. For a more extensive comparison of random forest methods on the larger dataset of these postmortem microbiome data, please see Zhang et al. (2019). 14 We also used “mockrobiota” (Bokulich et al. 2016) reference samples to compare the accuracy of the pipeline outputs. For the in silico data analysis, three metrics were assessed: correct taxa, false positives, and false negatives. Taxa that were present in both mockrobiota taxonomic reference dataset and the pipeline output was labeled ‘correct taxa’. Taxa present in only the mockrobiota taxonomic reference dataset were considered false negatives, while taxa present only in the pipeline outputs were considered false positives. Concordance with the mockrobiota dataset was assessed based on pipeline outputs of abundance (sequence number) regressed with the expected abundances of mockrobiota as the R2 value. False negative taxa were indicated by negative abundance values, while false positive taxa fell along the X- axis with no expected abundance value. Based on the results of comparisons (see Results: Pipeline Comparison), the QIIME2 pipeline was chosen for conducting sample size and library size comparisons. Subsamples of the original 188 cases were chosen at random [R; sample()] without replacement for 60 and 120 cases (see A Priori Power Analysis below). Rarefaction levels used for comparison were 1,000 sequences, 7,000 sequences, and no rarefaction. Rarefying to 7,000 sequences was based on the alpha-rarefaction curve generated in QIIME2, while rarefying to 1,000 sequences was based on a subjectively chosen minimum library size. Outputs among subsamples and rarefaction levels using statistical analyses were compared using methods described above, including relative abundance, ANCOM, alpha-diversity, and beta-diversity. Changes in core microbiome characterization and random forest accuracy were also evaluated. In this case, the core microbiome was defined as shared OTUs among defined groups (i.e., sample areas, minimum 15 library sizes, and sample sizes) (Shade and Handelsman 2012) and determined using log abundance vs. occupancy plots. Classifications were made of sample area and manner of death using random forest (randomForest package v4.6-14) (Breiman et al. 2018), and the model error rate was compared among subsamples and normalization methods. A Priori Power Analysis To determine statistically significant sample size, an a priori power analysis was completed using G*Power 3 v3.0.5 (Faul et al. 2007). Body locations (mouth and rectum) were compared using the mean and standard deviation of Faith’s phylogenetic diversity from previous bioinformatic analysis (Pechal et al. 2018) calculated using R. An independent mean two tail t- test was used to determine sample size needed for significant power. For each body location (mouth and rectum), 27 samples were required for significant power (α = 0.05; 1-β = 0.80). Three random subsample sizes were chosen based on the number of cases to include that all have significant power: 60, 120, 188. Random subsamples were generated using R, sampling without replacement. Data Availability Postmortem microbiome samples collected, extracted, and targeted amplicon 16S sequenced from 188 cases in Pechal et al. (2018) were used as a large dataset for comparing pipelines, sample size, and minimum library size for downstream analyses. Sequence data are archived through the European Bioinformatics Institute European Nucleotide Archive (www.ebi.ac.uk/ena) under accession number: PRJEB22642. The microbial community analysis is available as R code on GitHub (https://github.com/sierrakasz/postmortem-analysis). 16 Results 17 Pipeline Comparison Despite having an easy to implement web-based GUI, MG-RAST was not an appropriate tool for this forensic dataset. Although the concordance of MG-RAST (R2 = 0.24) was comparable to QIIME2 (R2 = 0.27) for the in silico dataset (Figure 2), the in silico dataset represented very low diversity (< 20 taxa present) and simplistic community structure (even and staggered), with a low sample size (n = 4) and was not an accurate approximation of forensic data. For postmortem data, MG-RAST had a much smaller effect size than mothur and QIIME2 due to the twofold reduction in samples (Table S4). MG-RAST had an average library size ~ 20- times smaller than QIIME2 and mothur, tenfold reduction in sequences after filtering, and the highest numbers of unclassified sequences at the family and phylum level (Table 2, Table S5), representing a loss of valuable and forensically relevant data. MG-RAST also had the highest error rates for the random forest modeling in all cases (Table 3; Table S6). Compared to mothur and QIIME2, MG-RAST had more differentially abundant taxa and significantly reduced alpha- and beta-diversity metrics indicative of a very different microbial community structure, despite the input of the same data. Although the top five most abundant phyla were consistent among pipelines, MG-RAST’s mean relative abundances at the phylum and family taxonomic level were reduced for both mouth and rectum sample areas (Figure 3A, Figure 3B, Table S7-S8). MG-RAST had the most differentially abundant taxa (taxa for which relative abundances significantly differ from each other based on statistical tests) at both the phylum (9) and family level (39), which were represented by about 20-times more sequences than mothur and QIIME (ANCOM; Table S9-S11). When considering all samples, MG-RAST produced around 50% lower alpha-diversity for all three metrics compared to mothur and QIIME2 (Kruskal-Wallis p < 0.05; post hoc Nemenyi p < 0.05; Table S12B; Table S13A- 18 Figure 2: In silico data error among pipelines. In silico raw sequence data were obtained from mockrobiota and processed through each pipeline using the same methods as the postmortem microbiome dataset. Four samples were processed: two ‘even’ and two ‘staggered’ samples representing different microbial community compositions. (A) Taxa output from each pipeline were compared to the taxonomic reference dataset available on mockrobiota. Taxa that were present in both the mockrobiota taxonomic reference dataset and the pipeline output was labeled ‘correct taxa.’ Taxa present in only the mockrobiota taxonomic reference dataset were considered false negatives, while taxa present only in the pipeline outputs were considered false positives. Samples within each pipeline are ordered along the x-axis as: even, even, staggered, staggered. (B-D) Abundance of taxa (based on sequence number) from pipeline output versus expected abundance of taxa based on the mockrobiota abundance reference dataset for each sample. Negative abundances were assigned to samples that were considered false negatives. The R2 line represented the actual concordance. (B) MG-RAST (C) mothur (D) QIIME2. 19 Table 2: Summary of sample number and sequence read differences among pipelines. The total number of samples among pipeline differed, as some samples were removed for not having the minimum required sequences (1,000), or in the case of MG-RAST, for not having the minimum 1,000,000 base pairs before filtering. Total Number of Samples Number of Mouth Samples Number of Rectum Samples Pipeline QIIME2 324 mothur MG- RAST 324 150 169 169 97 155 155 53 Total Number of Number of Unclassified Reads Phylum after Filtering Reads 11,375,659 2,399 Number of Unclassified Family Reads 2,832 12,861,356 594 2,419 2,167,164 40,530 43,576 Table 3: Random forest classification error among pipelines. Random forest classifications were made with 1,000 trees, and out-of-bag (OOB) error was reported. Classifications of sample area (mouth and rectum) and manner of death (suicide, homicide, accident, natural) were made for phylum and family taxonomic level. Percentages and number of misclassifications from the total number of samples were included. Taxonomic Level Pipeline Sample Area Error Rate (Misclassifications/ Total) Phylum QIIME2 Phylum mothur Phylum MG-RAST Family QIIME2 Family mothur Family MG-RAST 16.05% (52/324) 12.65% (41/324) 16.67% (25/150) 4.94% (16/324) 4.01% (13/324) 5.33% (8/150) Manner of Death Error Rate (Misclassifications/ Total) 58.64% (190/324) 56.79% (184/324) 72% (108/150) 57.72% (187/324) 57.41% (186/324) 60% (90/150) 13B). MG-RAST samples clustered closely together (Figure 3D), reflective of MG-RAST’s significantly lower dissimilarity (0.850 ± 0.139) than QIIME2 and mothur (PERMANOVA, p < 0.05, Table S14). The loss of sequences and samples during the filtering process was also 20 reflected in the reduced microbial community metrics downstream for MG-RAST. QIIME2 and mothur had more comparable results but key differences among the pipeline outputs made QIIME2 the most appropriate pipeline to use for downstream analysis. Richness and diversity (evenness and richness) metrics from mothur and QIIME2 quantified similar levels of diversity (Figure 3C, Table S12A-12B). However, QIIME2 had around 20% lower observed richness than mothur (Table S12B, Table S13B) in contrast to previous studies (Plummer and Twin 2015). While QIIME2 had slightly higher levels of unclassified sequences (Table 2), mothur had a higher number of unclassified taxa at the phylum (3) and family level (4) (ANCOM, Table S9-S10), which artificially inflated richness measured by mothur. For random forest classification, mothur had slightly lower error rates than QIIME2 (Table 3), indicator taxa differed among mothur and QIIME2 possibly due to the inflated richness from mothur (Table S15). Despite mothur and QIIME2 samples’ admixture (Figure 3D) and comparable dissimilarity (QIIME2: 0.874 ± 0.146; mothur: 0.872 ± 0.144), microbial community structure was distinguishable (PERMANOVA, p < 0.05, Table S14). However, the effect size was small (R2 < 0.1, Table S14). Despite the similar microbial communities, mothur and QIIME2 diverged when testing the in silico dataset. QIIME2 had no false positives, while mothur had the most false positives (range: 10-20) of all pipelines (Figure 2A, Table S16). Many of mothur’s false positives were unclassified taxa at multiple taxonomic levels, which disrupted the relative abundances of the whole microbial community reflected in concordance. QIIME2 had the highest concordance to the mockrobiota taxonomy reference dataset (R2 = 0.27), while mothur had the lowest (R2 = 0.08) (Figure 2C-2D). 21 Figure 3: Microbial community composition among pipelines. (A) Relative abundances of the five most predominate phyla for mouth samples among pipelines (MG: MG-RAST, M: mothur, Q: QIIME2). (B) Relative abundances of the five most predominate phyla for rectum samples among pipelines (MG: MG-RAST, M: mothur, Q: QIIME2). (C) Alpha-diversity metrics for each sample area (mouth and rectum) and pipeline including: observed richness, Chao1, Shannon diversity, and Inverse Simpson diversity (InvSimpson). Kruskal-Wallis and post hoc Nemenyi tests for alpha-diversity metrics detected significant differences (p < 0.05) among pipelines and sample areas. (D) The principal coordinate analysis (PCoA) of Jaccard distances among sample area and pipeline. Ellipses indicated sample area, for a 95% confidence interval. Permutational multivariate analysis of variance (PERMANOVA) detected significant differences (p < 0.05) among pipelines for each sample area, and all pairwise differences were statistically significant (p < 0.05). 22 Based on the pipeline comparison results above, QIIME2 was chosen as the best pipeline to examine the effect of rarefying and sample sizes on modeling results due to the lowest error rate and reduced unclassified taxa compared to mothur, and therefore more accurate microbial community structure. Minimum Library Size Key differences in microbial community metrics were among minimum library sizes, indicating the importance of considering minimum library size during forensic microbiome analysis. The number of unclassified sequences at phylum and family level decreased proportionally as the minimum library size decreased from no rarefaction, 7,000 sequences, to 1,000 sequences for each sample area (Table 4). Relative abundances of the top five phyla changed among minimum library size, as smaller minimum library sizes (1,000 vs. 7,000) showed low abundance taxa (not the top five) phyla as a larger component of the relative abundances due to the random subsampling to a specified library size (Figure 4A, Table S17). The differences in microbial community structure among minimum library sizes were evident in the very distinct clustering of minimum library sizes (Figure 4F). Minimum library size had large effects beta-diversity (PERMANOVA, p < 0.05, R2 = 0.297-0.426, Table S18). Due to the effects of minimum library size on microbial community structure, including beta-diversity and relative abundance, minimum library size should be considered for downstream analyses. However, a large majority of taxa were captured within each minimum library size, as indicated by alpha-diversity and core microbiome analysis. Only the minimum library size of 1,000 had around 25% reduced richness compared to 7,000 and no rarefaction library sizes. 23 Figure 4: Microbial community composition among sample sizes and minimum library sizes. (A) Relative abundances of the five most predominate phyla among subsamples with no rarefaction. (B) Relative abundances of the five most predominate phyla among minimum library sizes of subsample 188. (C) Alpha-diversity metrics for each sample size with no rarefaction including: observed richness, Chao1, Shannon diversity, and Inverse Simpson diversity (InvSimpson). Kruskal-Wallis and post hoc Nemenyi tests for alpha-diversity metrics did not detect significant differences (p < 0.05) among subsamples. (D) Alpha-diversity metrics for each minimum library size of subsample 188 including: observed richness, Chao1, Shannon diversity, and Inverse Simpson diversity. Kruskal-Wallis and post hoc Nemenyi tests for alpha-diversity metrics detected significant differences (p < 0.05) among pairwise comparisons including the 1,000 sequence minimum library size for observed richness. (E) The principal coordinate analysis (PCoA) of Jaccard distances among subsamples with no rarefaction. Permutational multivariate analysis of variance (PERMANOVA) did not detect significant differences (p < 0.05) among sample sizes. (F) The principal coordinate analysis (PCoA) of Jaccard distances among minimum library sizes of subsample 188. PERMANOVA detected significant differences (p < 0.05) among rarefaction levels and all pairwise differences were statistically significant (p < 0.05). 24 Table 4: Total number of sequences that remained after filtering, and unclassified sequences (phylum and family taxonomic level) among sample area, sample size, and minimum library size. Pipeline Sample Area Sample Size QIIME2 Mouth 60 QIIME2 Mouth QIIME2 Mouth 60 60 QIIME2 Rectum 60 QIIME2 Rectum 60 QIIME2 Rectum 60 QIIME2 Mouth 120 QIIME2 Mouth QIIME2 Mouth 120 120 QIIME2 Rectum 120 QIIME2 Rectum 120 QIIME2 Rectum 120 QIIME2 Mouth 188 QIIME2 Mouth QIIME2 Mouth 188 188 QIIME2 Rectum 188 QIIME2 Rectum 188 QIIME2 Rectum 188 Minimum Library Size No Rarefaction 7,000 1,000 No Rarefaction 7,000 1,000 No Rarefaction 7,000 1,000 No Rarefaction 7,000 1,000 No Rarefaction 7,000 1,000 No Rarefaction 7,000 1,000 Total Number of Sequences Unclassified Phylum Unclassified Family 2,115,130 405,998 60,000 4,988 1,059 157 7,660 1,481 207 2,349,815 27,026 30,707 405,981 59,996 4,182,126 811,998 119,000 5,200 733 7,721 1,935 265 5,438 772 12,999 2,703 381 3,834,632 35,019 36,583 734,993 106,999 6,154 862 6,762 958 5,874,406 14,099 22,323 1,154,999 168,999 5,463,539 1,056,993 155,000 4,044 612 63,667 12,498 1,798 5,424 819 70,089 13,963 2,017 (Kruskal-Wallis, post hoc Nemenyi; p < 0.05, Figure 4C, Table S19-S20) When comparing core microbiome taxa among rarefaction levels, most OTUs (695) were shared among all minimum library sizes and were represented in higher log abundance (greater than the mean log abundance) compared to non-core taxa (Figure 5B, Table S21). Some taxa only occurred in non-rarefied samples (62) and amounted to about 10% of the core taxa among all minimum library sizes (Figure 5B, Table S21). Despite changing minimum library sizes, a majority of taxa 25 were still represented. However, the low abundant and infrequent taxa still have downstream effects for forensic analysis. Random forest modeling was affected by low abundant and infrequent taxa. While sample area classification error (14% for phylum and 4% for family) was lower than manner of death classification error (phylum and family error rates were nearly 56%), among all minimum library sizes, sample area classification error and manner of death classification error were relatively similar (Table 5, Table S22). Although the error rates among library sizes were overall similar, some of the important indicator taxa for classifications were unique to certain groups. For example, Bifidobacteriaceae was an important (one of the highest mean Gini decrease) indicator of manner of death for subsamples 60 and 120 and at rarefaction levels 0 and 1,000, while Spirochaetales was an indicator for sample area using subsamples 120 and 188 (Table S23). For downstream applications in forensic microbiology, changing indicator taxa among minimum library sizes indicated that lack of standardization among studies can lead to differentiation among results interpretation. Sample Size Overall, sample size did not affect overall microbial community metrics in a significant way. Sample size remained relatively constant among minimum library sizes (standard deviations- no rarefaction: 47.4; 7,000: 45.1; 1,000: 46.0) (Table S24). The total number of sequences, as well as the number of unclassified sequences at phylum and family level, increased 1.5 to 2-times with increasing sample number from 60, 120, to 188 (Table S24), but there were no differentially abundant taxa among subsamples (Figure 4B). Alpha-diversity metrics did not significantly differ among sample sizes (Kruskal-Wallis; p > 0.05; Figure 4D, Table S19- 26 Figure 5: Core microbiome among sample sizes and minimum library sizes. In this case, core microbiome is considered by OTU (operational taxonomic unit) membership among defined groups (sample size or minimum library size). (A) Occupancy vs. log abundance of shared OTUs among sample sizes at no rarefaction. The horizontal line indicated mean log abundance. (B) Occupancy vs. log abundance of shared OTUs among minimum library sizes for 188 subsample. The horizontal line indicated mean log abundance. S20), but observed richness did increase with increasing sample size, as larger sample sizes captured more infrequent taxa. For subsamples within each sample area and minimum library size, there were no significant differences (PERMANOVA; p > 0.05) among beta-diversity and beta-dispersion and no clear clustering by subsample (Figure 4E; Table S18). However, increasing sample size changed the number of low abundant and infrequent taxa. Comparing the core microbiome, most OTUs (569) were shared among all sample sizes, and were represented in higher log abundance (greater than the mean log abundance) compared to non-core taxa (Figure 5A, Table S25). Although, there were a few OTUs shared among subsamples and unique to the subsample 188 (Figure 5A, Table S25). Much like minimum library size, error rates for predicting sample area among sample sizes were around 14% for 27 Table 5: Random forest classification error among sample sizes and minimum library sizes. Random forest classifications were made with 1,000 trees, and out-of-bag (OOB) error was reported. Classifications of sample area (mouth and rectum) and manner of death (suicide, homicide, accident, natural) were made for each taxonomic level, rarefaction level, and subsample. Percentages and number of misclassifications from the total number of samples were included. Taxonomic Level Phylum Phylum Phylum Phylum Phylum Phylum Phylum Phylum Phylum Family Family Family Family Family Family Family Family Family Minimum Library Size 1,000 1,000 1,000 7,000 7,000 7,000 No Rarefaction No Rarefaction No Rarefaction 1,000 1,000 1,000 7,000 7,000 7,000 No Rarefaction No Rarefaction No Rarefaction Sample Area Error Rate (Misclassifications/ Total) 14.17% (17/120) 15.49% (35/226) 17.59% (57/324) 14.66% (17/116) 14.03% (31/221) 14.24% (45/316) Manner of Death Error Rate (Misclassifications/ Total) 55.83% (67/120) 58.41% (132/226) 57.41% (186/324) 57.76% (67/116) 57.47% (127/221) 56.33% (178/316) Sample Size 60 120 188 60 120 188 60 120 188 60 120 188 60 120 188 60 120 188 12.50% (15/120) 54.17% (65/120) 14.29% (33/231) 58.44% (135/231) 14.50% (48/331) 5.83% (7/120) 3.10% (7/226) 4.63% (15/324) 5.17% (6/116) 4.07% (9/221) 5.06% (16/316) 58.61% (194/331) 54.17% (65/120) 56.64% (128/226) 56.17% (182/324) 55.17% (64/116) 59.28% (131/221) 58.54% (185/316) 4.17% (5/120) 54.17% (65/120) 4.76% (11/231) 57.58% (133/231) 3.93% (13/331) 58.31% (193/331) phylum and 4% for family, and around 56% for manner of death classifications (Table 5). Again, there were indicator taxa unique to certain subsamples (Table S23). The increase of infrequent and low abundant taxa among the sample sizes are indicative that sample size should be taken into consideration for downstream application for forensic microbiology studies, and comparisons across studies. 28 Discussion 29 Our bioinformatic parameter comparison using a large postmortem microbiome dataset showed that parameters can affect downstream analyses, including microbial community structure results. We also show that sample size and minimum library size affect the resulting number of low and infrequent taxa and potential indicator taxa for model building. While the results here are specific to this postmortem dataset of targeted amplicon (16S rRNA) sequencing on the Illumina platform, similar considerations should be made for other pipelines and platforms that may be used in downstream applications of forensic microbiology in the criminal justice system. Pipeline Comparison Pipelines differed in many ways, including the development, parameters, and usability (Plummer and Twin 2015). To accurately compare pipelines, we standardized as many steps as possible; i.e. using SILVA as a reference database. The steps that could not be standardized among pipelines, including different taxonomic assignment and alignment algorithms, were likely responsible for the differences in microbial community characterizations (Siegwald et al. 2017). Overall, we showed that MG-RAST finds a different microbial community structure compared to mothur and QIIME2, due to reduced diversity metrics and increased unclassified and differentially abundant taxa. These results are similar to previous work comparing MG- RAST to other pipelines (D’Argenio et al. 2014). However, there are few studies that did not find the same reduction in diversity metrics and increased unclassified reads (Plummer and Twin 2015; Golob et al. 2017). Although, both studies focused on a smaller sample size of similar 30 microbial communities (35 infant gut microbiome samples) or in silico generated data that does not approximate forensic data. Due to the results from this postmortem dataset, the reduction in microbial information (samples, classified sequences), the varied community structure (differentially abundant taxa, reduced alpha- and beta-diversity), and the higher random forest classification error, MG-RAST is not the appropriate tool for this type of forensic dataset moving forward. The overall microbial community structure had minimal differences among QIIME2 and mothur, which was a similar result to previous studies (Plummer and Twin 2015; Golob et al. 2017). It is important to note that we rarefied samples to 1,000 sequences, to account for the sequence reduction in MG-RAST, but this could be limiting the differences among mothur and QIIME2. This study was comparison of bioinformatic pipelines using machine learning outcomes. Rather than an exhaustive search of machine learning algorithms (Zhang et al. 2019), we used a standardized user-friendly methodology of random forest classification (out-of-bag error) after comparison to the test-set validations were within 2% error rate as OOB error rate. QIIME2 had a higher overall classification error than mothur but resulted in less than 1% difference from mothur at the family taxonomic level. Overall, the increased unclassified taxa and in silico data error for mothur made QIIME2 the more appropriate choice for downstream analyses in this study. Minimum Library Size Rarefying, as a method of normalizing varying library sizes, should be of interest for forensic applications in the future as standardizing methodology among labs will be important for actual casework. Many of the current postmortem microbiome studies have a variety of 31 minimum library sizes chosen based on alpha-rarefaction curves (Metcalf et al. 2016; Pechal et al. 2018) or based on minimum library size of the samples (DeBruyn and Hauther 2017). We chose three levels of rarefying for the postmortem data, based on these different approaches: no rarefaction, 7,000 (based on alpha-rarefaction curves), and 1,000 (minimum library size). Previous research determined that in large enough datasets of very different microbial communities (postmortem data across body site) (Pechal et al. 2018), rarefying should not negatively impact downstream analysis (Weiss et al. 2017) . However, we found differences among the normalization strategies, indicating that rarefying should be taken into consideration for forensic microbiome analysis. While we showed that more low abundance and infrequent taxa among cases are captured without rarefying, downstream statistical analyses commonly used in forensic microbiology studies (i.e. ANCOM, PERMANOVA) assume similar library sizes (Weiss et al. 2017). Demonstrated by the PCoA plot (Figure 4F) a large majority of the postmortem samples in this study clustered by minimum library size, a result that was found in previous normalization research (Weiss et al. 2017). This clustering, and indication of distinct microbial communities, could possibly limit the ability for laboratories to compare data across studies if normalization was not standardized. For standardization in forensic analyses, normalization of library size should be taken into consideration due to the effect on downstream statistical analyses. Random forest classification has potential forensic applications for many analyses including manner of death determination. Error rates of classifications (both of sample area and manner of death) were generally stable among minimum library size but decreased with larger library sizes, perhaps as more of the infrequent taxa become important for classification. While classification of sample area is generally consistent with other studies (Pechal et al. 2018), the classifications of manner 32 of death in this study are generally lower than previously reported studies with the same dataset (Zhang et al. 2019). However, a previous study provided a comprehensive evaluation of machine learning algorithms, which was not the major focus of this study. Rather, by using a standardized, user-friendly open-access methodology of random forest classification, we illustrated the importance to future researchers to consider parameters, such as library size, even if they are not a primary focus of the study. Indicator taxa that are important for classifications changed among minimum library sizes, which has implications for forensic cases, as indicator taxa among classifications will be potentially very important for downstream applications in casework. Sample Size There is a tradeoff for researchers to consider when including more samples in analyses. Sample sizes for studies are mostly limited by resource availability, including funding. It is not realistic for all studies to have very large sample sizes. However, to improve forensic microbiology for future use in the criminal justice system, more robust tools that capture low abundant and infrequent taxa encountered in real cases are needed. Our postmortem microbiome dataset is the largest analyzed to date for postmortem microbiome studies (Pechal et al. 2018). For downstream forensic applications, including more samples in predictive models makes those models more robust to the variability present in real forensic cases. We found patterns of relative abundance did not change as more samples were included in the analysis, but observed richness did increase with increasing sample size, as larger sample sizes captured more infrequent taxa. As expected, increasing subsample size increased 33 error rate for the random forest classifications (sample area and manner of death). However, those random forest models are arguably more robust for downstream forensic applications (Zhang et al. 2019), therefore including as many samples as possible in forensic microbiology studies is best practice moving forward. 34 Conclusion 35 This work was the first to compare pipeline and parameters for a forensically relevant, large, and heterogenous dataset. Based on the results of this study, we make the following recommendations for future forensic microbiology studies: 1. The QIIME2 pipeline is the most suitable pipeline for this type of postmortem microbiome dataset; 2. Rarefying data is the best normalization practice for downstream statistical analyses (Weiss et al. 2017). However, an appropriate minimum library size should be chosen based on richness captured (alpha-rarefaction plots), instead of the smallest library size among samples; and 3. Sample size should be maximized to captures lower abundant and infrequent taxa among the data for more robust model building. However, sample size must be weighed with other practical considerations, such as financial constraints. Considering the potential application of forensic microbiology to the criminal justice system, continued research to optimize computational methodology will be important for downstream applications. While model building was not the focus of this study, the preliminary results show how parameter choice can potentially affect downstream applications, which is important for future research and casework. Applying bioinformatic workflows necessary for microbial data in forensic casework will be challenging as command line software and microbial data analysis is not already part of the examiner workflow. The constant influx of pipelines available, changing parameter options, and updates will be a barrier to creating an SOP for forensic casework. However, this study is an important part of laying the groundwork for standardizing computational methodology for forensic microbiology research, and will help set the precedent for forensic casework in the future. 36 CHAPTER II: DYSBIOSIS IN THE DEAD: HUMAN POSTMORTEM MICROBIOME BETA- DISPERSION AS ONE INDICATOR OF MANNER AND CAUSE OF DEATH DURING AUTOPSY 37 Introduction 38 The organisms represented in microbiomes have important functional roles for host life, influencing health, development, and disease susceptibility, among many others (Zaneveld, McMinds, and Thurber 2017; Turnbaugh et al. 2007). Microbes also play an important functional role in decomposition (Pechal et al. 2014), as the communities change with dispersal, competition, and other interactions after host death (Pechal et al. 2014, 2018; Metcalf et al. 2016). These dynamic, yet predictable, microbial community profile changes after death make the postmortem microbiome a potential forensic resource for postmortem interval (PMI) estimation. PMI estimation is indeed the most studied forensic application of the postmortem microbiome (Pechal et al. 2014; Metcalf et al. 2013); but, this community has additional potential for forensic applications as well, like indicating antemortem health conditions (e.g., cardiovascular disease or violent death) (Pechal et al. 2018) and the living environment (e.g., neighborhood blight) (Pearson et al. 2019). The postmortem microbiome is thought to be structured by a decedents’ antemortem health condition and the suite of stressors that impact the human host including drug/alcohol abuse or high stress lifestyle conditions such as neighborhood blight, that are associated with certain manners of death (i.e., homicide) (Pechal et al. 2018; Pearson et al. 2019; Zhang et al. 2019). Importantly, the postmortem microbiome does not significantly change from the antemortem microbiome for approximately 48 hours after death (Pechal et al. 2018). Due to the stability of the postmortem microbiome within 48 hours after death, and potential connection to lifestyle condition, microbial community metrics have been shown to indicate certain manners of deaths (MOD) or causes of deaths (COD). However, there have been few studies that have tested associations of postmortem microbial communities with MOD or COD (Pechal et al. 2018; Pearson et al. 2019; Zhang et al. 2019). 39 In past work, microbial diversity and indicator taxa were shown to reflect antemortem health conditions and MOD (Pechal et al. 2018; Pearson et al. 2019; Zhang et al. 2019). In some cases lower microbial alpha-diversity was associated with cardiovascular disease, non-violent deaths, and neighborhood blight (e.g., abandoned building, inactivity, and dumping) (Pechal et al. 2018; Pearson et al. 2019); however, it is difficult to capture the variability in a large sample set using diversity alone, as it does not account for taxon relative composition that makes up the community. In one of two studies, Zhang et al. combined microbial indicator taxa and case metadata found in autopsy reports (e.g., decedents age, sex, race, etc.) to test machine learning models for classifying M/COD (Zhang et al. 2019).While indicator taxa are a useful reflection of antemortem conditions, microbial indicator taxa may not be present in all cases (e.g., Haemophilus influenzae), or they may be ubiquitous (e.g., Staphylococcus), and therefore less useful (Pechal et al. 2018). For this study, we tested a new metric that captures microbial variability and does not specifically rely on indicator taxa: beta-dispersion. We hypothesized that postmortem microbiome beta-dispersion could be an additional tool for determining M/CODs during death investigation. Increased beta-dispersion among living individuals has been associated with obesity, infection, and smoking (Zaneveld, McMinds, and Thurber 2017; Barbian et al. 2015). Following the conceptual context of the ‘Anna Karenina Principle’ (AKP), after prolonged exposure to any array of stressors, the microbiomes of unhealthy individuals becomes more variable compared to the microbial communities of healthy individuals (Zaneveld, McMinds, and Thurber 2017). In other words, increased variation in the microbial communities reflects dysbiosis, and this community variability can be quantified through calculations of beta-dispersion (Zaneveld, McMinds, and Thurber 2017; Barbian et al. 2015). Beta-dispersion is calculated within the 40 context of a dataset, based on the multivariate distance from the centroid for each case sample, as defined by a grouping factor (Oksanen et al. 2019). Based on the link between increased beta- dispersion and health status, we considered M/COD as grouping factors to quantify microbial signatures associated with M/COD determinations. Such association of beta-dispersion with M/COD could conceivably be additional evidence during death investigation. Microbial community metrics could potentially aid medical examiners and other certifiers of death, as determining M/COD can be error prone. While COD is the injury/disease a person died from which spans a variety of causes, MOD encompasses only five major categories: natural, accident, suicide, homicide, and undetermined (Randy, Hunsaker III, and Davis 2002). Medical examiners and other certifiers of death qualify their MOD determination with incremental degrees of certainty considering multiple pieces of evidence (Randy, Hunsaker III, and Davis 2002). Given the possibility for mismatch between the MOD determination and the actual MOD, the postmortem microbiome could provide another piece of evidence to justify M/COD determination. To evaluate how postmortem microbiome variability associated with M/CODs, we modeled postmortem microbiome beta-dispersion from five body sites of 188 routine autopsy with known M/COD (as determined by a board-certified medical examiner). We predicted that certain M/CODs, such as natural deaths and cardiovascular disease, would have higher beta-dispersion than other M/CODs due to the previous antemortem health condition links found in previous studies (Pechal et al. 2018). However, the effect of life environment was predicted to increase beta-dispersion as well, in a way that could potentially factor into homicides or blunt force trauma/gunshot wounds (Pearson et al. 2019). Quantifying beta-dispersion by using M/COD as a 41 grouping factor could provide reliable and usable tool in death investigation for M/COD determination. 42 Materials and Methods 43 Sample Collection, DNA Extraction, and Sequencing The postmortem microbiome data used in this study were acquired from Pechal et al. (2018) but re-analyzed to test M/COD determination from beta-dispersion. This dataset contains postmortem microbiome samples obtained from 188 Wayne County Medical Examiner’s Office autopsy cases (Detroit, MI 2014-2016), representing multiple MODs (accident, homicide, suicide, natural) and CODs (asphyxiation, blunt force trauma, cardiovascular disease, drug- related deaths, gunshot wounds, etc.) (Table S1) (Pechal et al. 2018). The cases also represent a cross-section of the Detroit area populace and were nearly evenly divided among females and males (83:105) and black and white (90:98) (Table S1). Cases in the dataset used were from people 18–88 years with a body mass index (BMI) from 8.5–67.5 kg/m2 (Table S1). The dataset is the largest postmortem microbiome available to test beta-dispersions potential to aid M/COD determination. Detailed methods for sample collection, DNA extraction, and sequencing can be found in (Pechal et al. 2018). In summary, trained personnel at Wayne County Medical Examiner’s Office took microbial community swab samples from five body sites (nose, mouth, rectum, ears, and eyes) during routine autopsy. Microbial DNA was extracted and sequenced to characterize the microbial communities. The Michigan State University (MSU) Genomics Core Facility (East Lansing, MI, USA) sequenced the 16S rRNA V4 region using an Illumina MiSeq standard flow cell (v2) using a 500-cycle reagent cartridge. Data Analysis and Bioinformatics Sequence reads from postmortem microbiome samples were analyzed with QIIME2 (v2018.11) (Bolyen et al. 2019) following the methodology outlined in (Kaszubinski et al. 2019). Taxonomy and amplicon sequencing variant (ASV) tables were exported as csv files to be used 44 as input data for all downstream analysis. ASV and taxonomy files were combined with metadata obtained from autopsy reports (age, sex, race, BMI etc.) as phyloseq (v1.28.0) objects in R (v3.6.1) (McMurdie and Holmes 2013; R Core Team). ASVs less than 0.01% of the mean library size were trimmed, removing 22,214 ASVs for a total of 8,692 ASVs. Phyloseq objects were split among the body sites (nose, mouth, rectum, ears, and eyes) and analyzed separately. Method Selection We needed to determine the optimal methodology for calculating beta-dispersion before moving forward with classifying M/COD. To do this, we compared standardization approaches of the microbial communities, distance matrices, the number of significant M/COD comparisons, and alpha-diversity to select the optimal method for beta-dispersion calculation. For standardization, rarefying (randomly subsampling ASVs to a specified minimum library size) and normalizing (removing ASVs not present in a specified percentage of samples) were compared for each body site using three minimum library sizes (3,000, 5,000, and 7,000 sequences) sample percentage cut off (1%, 3%, 10%), respectively. While rarefaction has been debated (Weiss et al. 2017; McMurdie and Holmes 2014), we sought to eliminate bias associated with different library sizes that could inflate differences in beta-dispersion among M/CODs (Kaszubinski et al. 2019). We also compared unweighted and weighted unifrac distances matrices for calculating beta-dispersion. Unifrac is commonly used in forensic studies (Javan et al. 2017; Pechal et al. 2018). We wanted to determine whether considering abundances would affect the beta- dispersion calculation and should be considered for downstream modeling. Beta-dispersion was calculated among MODs and CODs using the vegan package (v2.5-5) in R (v3.6.1) for each minimum library size and sample percentage cutoff (Oksanen et al. 2019). The betadisper 45 function from vegan reports the distance from the centroid for each sample, as defined by a grouping factor (in this case, M/COD). Every postmortem microbiome sample had two corresponding beta-dispersion values, with either MOD or COD as a grouping factor. Kruskal- Wallis and Fligner-Killeen tests among beta-dispersion values determined differences among M/CODs and were reported with a Bonferroni correction (Pohlert 2018). Additionally, alpha- diversity metrics (Chao1 and Shannon diversity) for each minimum library size and sample percentage cutoff were calculated using phyloseq (v1.28.0). We selected a methodology (standardization approach, distance matrix) for calculating beta-dispersion based on the number of significant differences in beta-dispersion identified by Kruskal-Wallis as well as the highest alpha-diversity (see Results: Method Selection). We wanted to select a method that had potential for distinguishing M/COD but did not lose microbial diversity. For subsequent analyses, microbial communities were rarefied to 5,000 sequences and beta-dispersion was calculated using unweighted unifrac distance. Model Selection We chose logistic regression modeling to showcase how beta-dispersion and case metadata could be reliable and usable tool in death investigation for M/COD determination. We built multinomial logistic regression models to classify M/COD from beta-dispersion values and case metadata (Böhning 1992) using the lme4 (v1.1-21) and mlogit package (v1.0-1) (Bates et al. 2019; Croissant 2019). Multinomial logistic regression models included all categories of interest for classifying M/COD (e.g., homicide, suicide, natural, and accidental death for MOD; cardiovascular disease, drug related deaths, blunt force trauma, asphyxiation, gunshot wounds, and other deaths for COD). Metadata of interest [age, BMI, sex, race, PMI (<48 h; > 49 h), season, and event location (outdoors, indoors, hospital, vehicular)] were summarized and tested 46 with Kruskal-Wallis among M/CODs with a Bonferroni correction to determine the covariates for model building. (Pohlert 2018). We wanted to identify if metadata were indicative of certain M/CODs (e.g., age significantly higher in natural deaths or for cardiovascular disease) and were useful to supplement beta-dispersion in downstream modeling. We built and tested multiple logistic regression models for each body site, classifying M/COD. Only models with significant (p < 0.1) beta-dispersion contribution were selected for further modeling with case metadata. We chose a slightly less conservative cut off for the p- value so that potentially important metadata were not prematurely removed. The best performing models were considered based on Akaike information criterion (AIC) (Bozdogan 1987), McFadden R2, and classification success (correct classifications / total number of samples). We used multinomial logistic regression model results were used to determine which body sites beta- dispersion classified M/COD well. Based on initial model building results, models could be improved (see Results: Model Selection). To improve logistic regression models, we considered three microbial community types for beta-dispersion calculation: full communities, random forest indicator communities, and ‘non-core’ communities. While the grouping factor (M/COD) remained the same, the microbial communities used to calculate beta-dispersion differed. “Full community’ beta-dispersion was calculated from the entire filtered and rarefied community within each body site. “Random forest indicators” were determined by Boruta (v6.0.0) using confirmed and tentative taxa of importance (p < 0.05) (Kursa and Rudnicki 2018), and beta-dispersion was calculated from identified significant indicator taxa. Random forest classification error was determined using the randomForest package (v4.6-14) (Breiman et al. 2018). While the “core microbiome” definition is widely debated (Shade and Handelsman 2012), we determined ‘non-core’ taxa in this case as 47 taxa present in only one M/COD of interest (e.g., accidents, blunt force trauma, etc.). ‘Non-core’ taxa were removed, and beta-dispersion was calculated from the remaining taxa. Lastly, we also test binary logistic regression (classifying between two categories) compared to multinomial logistic regression (multiple categories). Binary logistic regression models were also built using the lme4 (v1.1-21) and mlogit package (v1.0-1) (Bates et al. 2019; Croissant 2019), and the best performing models were considered based on Akaike information criterion (AIC) (Bozdogan 1987), McFadden R2, and classification success (correct classifications / total number of samples) much like the multinomial logistic regression models. Case Studies Using the methodology outlined above, we tested data from two case studies for classifying MOD from nose communities, to showcase the forensic potential beta-dispersion has as a tool for medical examiners. For the first case study, we chose a matched design of cases examining suicide vs other manners of death, including similar ages, races, and sexes to limit the effect of metadata for a total of 43 cases (22 suicides; 21 accidents/homicide/natural) (Table S26). For the second case study, we examined MOD within COD; examining gunshot wound homicides (n = 25) vs. suicides (n = 4) (Table S26). As previous studies have identified indicator taxa (Pechal et al. 2018; Pearson et al. 2019; Zhang et al. 2019), we determined potential indicator taxa using Boruta. We also evaluated the potential beta-dispersion differences using Permutational multivariate analysis of variance (PERMANOVA), and classified MOD using binary logistic regression (Anderson 2017). We calculated a priori power using G*Power 3 v3.0.5 (Faul et al. 2007). Case study beta-dispersion was compared using the mean and standard deviation using an independent mean two‐tailed t-test (α = 0.05; 1−β = 0.80). 48 Data Availability All postmortem microbiome case data can be found as supplemental material or associated archived data of (Pechal et al. 2018). Sequence data were archived through the European Bioinformatics Institute European Nucleotide Archive (www.ebi.ac.uk/ena) under accession number: PRJEB22642. The microbial community analysis is available as R code on GitHub (https://github.com/sierrakasz/AKP-betadisp-paper). 49 Results 50 Method Selection Beta-dispersion among M/CODs was calculated using either unweighted or weighted unifrac distance matrices from microbial communities that were either rarefied or normalized (Figure S1-2, Table S27). Microbial communities were rarified to three minimum library sizes (3,000, 5,000, and 7,000 sequences), or normalized to three sample percentage cut offs (1%, 3%, 10%). Each body site was considered separately. Therefore, we had a total of 120 comparisons to determine which beta-dispersion calculation method was most appropriate for downstream model building. We determined that unweighted compared to weighted unifrac distances were the optimum distance matrix for this study. Unweighted unifrac distances had three-times as many significant comparisons than weighted unifrac (Kruskal-Wallis p < 0.05; Figure S1, Table S27), while unweighted unifrac distances had four-times less significant deviations of variance compared to weighted unifrac (Fligner-Killeen p < 0.05; Figure S2, Table S27). We showed that unweighted unifrac distances were more robust against rarefying and normalizing, as significant comparisons occurred with both standardization (Kruskal-Wallis and post hoc Nemenyi p < 0.05; Figure S1; Table S27). For normalizing combined with weighted unifrac, we found a bias of library size among the significant comparisons (Table S28). Most of the significant comparisons (7 out of 10) had differential library sizes, reflecting that the significance did not originate from M/COD differences in beta-dispersion (Kruskal-Wallis and post hoc Nemenyi p < 0.05; Table S28). Due to the number of significant comparisons across standardization strategies, less deviations of variance, and lack of library size bias, we chose unweighted unifrac as the appropriate metric for additional analyses. We determined that rarefaction was the more appropriate standardization strategy than 51 normalization for this study. While normalizing had more than double (19 vs. 9) the significant comparisons than rarefying, normalizing microbial communities led to a significant decrease in richness (Chao1) and alpha-diversity (Shannon diversity) compared to rarefying (Kruskal-Wallis and post hoc Nemenyi p < 0.05; Figure S1, S3, Table S27, S29). Due to loss of alpha-diversity and library size bias of normalizing, we chose rarefying as the appropriate standardization strategy. In conclusion, we chose a minimum library size of 5,000 sequences as more body sites yielded significant comparisons (Kruskal-Wallis p <0.05) and was the appropriate minimum library size based on alpha-rarefaction curves of sequencing depth (Figure S4). For downstream model building, samples were rarefied to 5,000 sequences and beta-dispersion was calculated using the unweighted unifrac dissimilarity matrix. Model Selection Beta-dispersion significantly differed among body sites and M/CODs (Kruskal-Wallis p <0.05, Figure 6, Table S30). Every postmortem microbiome sample had two corresponding beta- dispersion values (distances from centroid), with MOD or COD as the grouping factor. On average, eye microbiomes had highest beta-dispersion [MOD: 0.646 (SD = 0.0346); COD: 0.642 (SD = 0.035); Table S30] while mouths had the lowest beta-dispersion [MOD: 0.567 (SD = 0.0779); COD: 0.563 (SD = 0.0800); Table S30]. Beta-dispersion for all body site communities was significantly different for M/COD except the ears and eyes, but we considered all body sites for downstream modeling with logistic regression (Kruskal-Wallis p < 0.05; Table S30). Natural death postmortem microbiomes had the highest average beta-dispersion [0.628 (SD = 0.056); Table S30] compared to homicides [0.606 (SD = 0.0694)] and accidents [0.608 (SD = 0.0683); Kruskal-Wallis p < 0.05; Table S30]. Microbiomes of cases with cardiovascular disease had 52 Figure 6: Beta-dispersion values among body sites and manners/causes of death (M/CODs). A) Beta-dispersion values for MODs among body sites. B) Beta-dispersion values for CODs among body sites. C) Beta-dispersion values among MODs. D) Beta-dispersion values among CODs (Cardio = cardiovascular disease; Drug = drug-related deaths; BFT = blunt force trauma; Asphyx = Asphyxiation). significantly higher beta-dispersion among all body sites [0.625 (SD= 0.0565); Table S30] compared to asphyxiation, which had the lowest [0.598 (SD= 0.0678); Table S30]. Beta- dispersion for cardiovascular disease was significantly higher than gunshot wounds [0.605 (SD 53 Table 6: Case metadata stratified by manner/cause of death (M/COD). Age and BMI (kg/m2) were reported with means and standard deviations, while sex, race, PMI, event location, and season were reported as counts within M/COD. Manner of Age BMI Sex Race Death mean sd mean sd Female Male Black White Accident (n=71) Homicide (n=39) 39.9 13.8 30 8.48 36.2 12.3 26.9 6.42 Natural (n=57) 53.7 10.7 29.8 10.1 44.9 15.6 27.2 6.57 Age BMI 35 9 25 11 Sex 36 30 32 12 26 34 29 3 45 5 28 20 Race mean sd mean sd Female Male Black White Suicide (n=22) Cause of Death Asphyxiation (n=11) Blunt force trauma (n=21) Cardiovascular disease (n=42) Drug-related deaths (n=70) Gunshot wounds (n=30) 46.5 20.1 25.4 4.35 4 42.6 14.7 28.9 10.3 12 54.5 11.6 29.9 9.56 20 41 12.7 29.7 7.51 34 33.8 11.8 27.5 6.34 Other (n=16) 47.3 12 28.8 12.1 5 5 PMI <48 hr > 49 hr 62 37 45 21 9 2 12 2 PMI <48 hr > 49 hr Event Location Season Hospital Indoors Outdoors Vehicular Autumn Spring Summer Winter 10 9 6 1 51 14 49 18 7 12 1 3 3 4 1 1 4 2 2 2 41 20 33 10 Event Location Season 17 6 14 6 9 11 8 5 Hospital Indoors Outdoors Vehicular Autumn Spring Summer Winter 7 9 22 36 25 11 2 14 23 22 24 7 9 7 19 48 6 9 10 21 34 58 28 14 1 0 8 12 2 2 1 4 6 5 7 3 7 7 35 60 10 13 2 8 1 3 9 0 1 2 0 2 4 0 1 4 2 2 1 0 3 9 26 43 16 7 3 3 7 17 6 7 4 5 7 8 7 2 54 =0.0708)], blunt force trauma [0.601 (SD = 0.0624)] and drug-related deaths [0.611(SD = 0.0684); Kruskal-Wallis p < 0.1; Table S30]. While beta-dispersion means differed significantly among MODs and CODs, there was an overlap among beta-dispersion values, indicating that other variables contribute to microbiome beta-dispersion (Figure 6). Therefore, we considered additional case metadata for downstream modeling. Metadata were differentially represented in MODs and CODs (Table 6; Table S31). Age, sex, race, and event location were significantly different among MODs/CODs (Kruskal-Wallis p < 0.05; Table S31); however, so as to not prematurely remove potentially important metadata we included all metadata of interest in downstream modeling [age, BMI, sex, race, PMI (<48 h; > 49 h), season, and event location (outdoors, indoors, hospital, vehicular)]. We found that initial multinomial logistic regression models were useful for determining which body site beta-dispersion had the best classification potential for M/COD. Classifying among all MODs, nose and mouth community was a significant covariate, while nose, mouth, and ear beta-dispersion were a significant covariate for classifying all CODs (p < 0.1; Table S32). Nose community beta-dispersion on average successfully classified MODs at 61.05%, while ears and nose beta-dispersion successfully classified CODs on average 62.87% and 62.79%, respectively (Table S32). ). Based on these results, we used only body sites with statistically significant (p < 0.05) beta-dispersion as a covariate in the models for further model building: nose, mouth, and ears. While initial multinomial logistic regression models were able to classify among all M/CODs at higher success rate than random (overall average 50.23%; random change: MOD: 25%; COD: 16.67%), models could be improved (Table S32). Models classifying M/COD with only beta-dispersion (no case metadata) were significant (p < 0.05) but had low classification 55 success and fit (~40% average classification success; McFadden R2 of ~ 0.0244; Table S32). Adding case metadata to the models led to better fit (~60% average classification success; McFadden R2: ~0.298; Table S32). However, we attempted to improve our models by testing different microbial community beta-dispersion and binary logistic regression. Overall, we determined microbial community type (i.e., full communities, random forest indicator communities, and ‘non-core’ communities) did not improve logistic regression models (Figure 7; Table S33; Table S34). Models of non-core taxa community beta-dispersion were not significant (p > 0.05) and so were removed from further consideration (Table S33). Even though random forest indicator communities were specific to M/COD, associated beta-dispersion models were not more successful than full communities in classifying M/COD, and were less successful in some cases [full: 0.590(SD=0.0264); RF: 0.578(SD=0.0306); Figure 7; Table S34]. For the random forest indicator communities and full communities, models were within 7% of each other [full: 0.298(SD=0.0367); RF: 0.318 (SD=0.0445); Figure 7; Table S34]. As changing the microbial communities did not significantly improve regression models, we considered binary logistic regression. For some M/COD comparisons (natural vs. accidental death; cardiovascular disease vs. drug-related death; disease vs. non-diseased state), binary logistic regression models performed best with an average classification success of 83.04% (Table S35). For nose communities, pairwise comparisons beta-dispersion was a significant (p < 0.1) covariate (full community and random forest indicator communities) included natural vs accidental death, cardiovascular disease vs. drug-use, and disease (natural deaths) vs. non-diseased (accidental, homicide, suicide) deaths (Table 7; Table S35). While random forest indicator communities compared to full communities had marginally higher successful classification (full: 78.9%; RF: 83.6%) and higher 56 Figure 7: Multinomial logistic regression model comparison among full community and random forest indicator community beta-dispersion for manners/causes of death (M/CODs). For the bottom panel, the y axis indicated percent correct, or the number of correct classifications/ total number of samples. Each bar represented a multinomial logistic model. For the top panel, the y axis indicated McFadden R2. McFadden R2 (full: 0.347; RF: 0.369), in all cases the sample size was smaller (Table 7; Table S35). Therefore, we considered full community beta-dispersion as the most appropriate metric 57 Table 7: Summary of logistic regression models classifying natural vs. accident, cardiovascular vs. drug-use, and disease vs. non-diseased state. Significance of the model was determined by the p-value. For model comparisons, the McFadden R2 and percent correct (correct classifications/total number of samples) were considered within each body site. Beta-dispersion Community Profile Significant Metadata Race + Event Location + McFadden R2 Degrees of Freedom Full community (n=120) Age Random Forest Indicators (n=117) Event Location + Age 0.314 0.388 0.399 0.356 0.328 0.364 6 5 5 3 5 7 Chi Sq 51.9 62.5 56.9 47.6 70.3 73.1 p 0 0 0 0 0 0 Percent correct 0.783 0.829 0.804 0.820 0.779 0.859 Comparison Natural - Accident Cardio - Drug Full community (n=107) Random Forest Indicators (n=100) Disease - Non Full community (n=172) BMI + Event Location + PMI + Age BMI + Age BMI + Race + Event Location + Age BMI + Race + Event Random Forest Indicators (n=163) Location + Age 58 Figure 8: Logistic regression models of the best performing pair-wise comparisons. Nose body site was selected, as well as beta-dispersion from full communities. A-C) Logistic regression models with a 95% confidence interval of beta-dispersion on the x-axis and the binary classification on the y-axis. A) Accident: 0, Natural: 1. B) Drug-related:0, Cardiovascular disease:1. C) Disease: 0, Non-diseased: 1. D-F) Principle Coordinate Analysis (PCoA) plots of microbial samples included in the logistic regression model. Colors corresponded with the M/COD, while shape indicated if the sample was correctly classified by the model. D) Natural vs. Accidental deaths. E) Drug-related vs. cardiovascular disease deaths. F) Diseased vs. non- diseased deaths. (Figure 8). There were no distinct clustering of samples suggesting that misclassification was randomly distributed among samples (Figure 8). Case Studies Of the 188 cases, we matched nose communities of 22 cases by age, sex, and race with other deaths (natural, homicide, and accidental) for a total of 43 cases (Table S26). We identified 59 three significant indicator taxa of suicide (Table S36). Suicide communities had higher beta- dispersion [0.659(SD=0.0433)] than non-suicides [0.654(SD=0.0427); Table S36]. Using PERMANOVA there were no significant differences in beta-dispersion among suicides (p = 0.144; Table S36). Using logistic regression beta-dispersion without metadata classified suicide cases with a 58% success rate (Table S36), likely associated with low power (A priori power sample size needed: 916). For future studies, we proposed a potential workflow using this matched-design case study for other researchers to use as a reference (Figure 9; Table S36). Despite the low sample size, we identified ten potential indicator taxa for gunshot wound homicide vs. suicide (Boruta; Table S37). Beta-dispersion among gunshot wound homicides was significantly higher [0.626(SD=0.049)] than gunshot wound suicides [0.543(SD=0.0959); PERMANOVA permuted p < 0.05; Table S37)]. Significant logistic regression models of gunshot wound homicides vs. suicides accurately classified MOD 93.1% of the time (Table S37), despite uneven and low sample sizes (n=25 homicides; n=4 suicides; Table S37). For significant power, an a priori power sample size needed for each category was 14. 60 Figure 9: Proposed workflow with suicide matched-design case study. Left column indicates potential steps researchers and practitioners can follow for future studies. Right column indicates the results from the matched-design case study, following the workflow. Twenty-three suicide cases were matched by age, sex, and race with other deaths (natural, homicide, and accidental), and nose samples were included in the analyses. Indicator taxa were identified by Boruta, while beta-dispersion was calculated using unweighted unifrac distances and tested with PERMANOVA. A logistic regression model of beta-dispersion was constructed to classify suicide vs. non-suicide deaths. 61 Discussion 62 In previous research the AKP concept was tested with distinct treatment vs. control groups based on living health conditions and in living hosts (Zaneveld, McMinds, and Thurber 2017). However, we hypothesized that using the postmortem microbiome within 48 h or death, M/CODs would associate with differential beta-dispersion, which could potentially be used as additional evidence in future forensic investigation. We hypothesized that cardiovascular disease and/or natural death would have the highest beta-dispersion, as AKP predicts with disease state in the antemortem life condition (Zaneveld, McMinds, and Thurber 2017; Barbian et al. 2015); and diversity metrics in previous studies also associated with manner and cause of death (Pechal et al. 2018). High microbiome beta-dispersion was also predicted to be related to a stressful life environment often associated with homicides, gunshot wound, and blunt force trauma deaths (Pearson et al. 2019). Using published case data from 188 postmortem microbiome autopsies our best performing binary logistic regression models were able to confirm medical examiner M/COD assessment ~79% of the time. Our best performing multinomial logistic regression models were able to confirm medical examiner M/COD assessment ~62% of the time. While better than random chance, including all M/CODs for classification with uneven sample sizes was likely resulted reduced classification success in multinomial logistic regression models. Our dataset represents a diverse cross-section of death cases from a large metropolitan area, with multiple body sites, using targeted sequencing of the 16S rRNA gene. Cases included were predominately natural cardiovascular disease deaths and accidental drug related deaths. Therefore, direct comparison of the results of this study would be most applicable for cities with similar demographics (“U.S. Census Bureau QuickFacts: Detroit City, Michigan; Michigan”), such as Chicago (“U.S. Census Bureau QuickFacts: Chicago City, Illinois”) or Cincinnati (“U.S. Census Bureau QuickFacts: Cincinnati City, Ohio”). While the metadata lends classification 64 ability in multinomial logistic regression for cases in Detroit, areas with differing demographics will also need to make their own baselines. It is also important to note that beta-dispersion is calculated in reference to the samples included in the dataset (Anderson, Ellingsen, and McArdle 2006). Therefore, future work should include data and cases from other geographic areas with a range of socio-economic diversity and overall living conditions. While we included five body sites (ears, nose, mouth, eyes, and rectum) that had differential success in classifying M/COD with beta-dispersion, there are more body sites of interest for the forensic community. As body site drives the microbial community composition more than any other factor (Pechal et al. 2018), comparisons to other body sites may be limited. For example, body sites sampled for the “thanatomicrobiome” (i.e., internal organs and blood) (Javan et al. 2016), or skin microbiome (Kodama et al. 2019) could harbor different microbial communities than the ones included in this study. Beta-dispersion among all body sites was significantly different, but mouth, nose, and ears from this data set showed the most potential for downstream forensic applications. This work revealed that beta-dispersion has potential to inform the M/COD decision making process during death determination. Accidental deaths, which were predominately drug related deaths, had overall lower beta-dispersion than natural deaths, mirroring the dysbiosis found in non-forensic studies (Meckel and Kiraly 2019). Accidental deaths and homicides were not distinguishable by beta-dispersion. While we hypothesized that high-stress lifestyle associated with homicidal deaths would increase beta-dispersion (Pearson et al. 2019), homicides had the lowest beta-dispersion among MODs. The antemortem link of high-stress lifestyle was not as strong as antemortem disease status in this study, compared to previous results that indicated higher microbial diversity associated with neighborhood blight and vacancy (Pearson 64 et al. 2019). This may be because those decedents who were victims of homicide lived relatively healthy lifestyles compared to decedents with a disease status. However, we do not have access to that information, and are constrained to information in the autopsy reports. Suicide postmortem microbiomes, while representing the lowest sample size, had similar beta-dispersion to natural deaths, similar to other, antemortem studies (Liang et al. 2018; Naseribafrouei et al. 2014). Microbiomes of suicidal people have higher diversity than healthy controls, specifically increased taxa associated with inflammation (Naseribafrouei et al. 2014). Therefore, there is a potential link between high microbial beta-dispersion and mental health that would be a promising area of future research. A previous study has documented postmortem microbiome diversity and other metrics being associated with heart disease (Pechal et al. 2018). In the current research, cardiovascular disease had significantly higher beta-dispersion than any other type of death. Dysbiosis in the microbiomes of people with cardiovascular disease has been documented, specifically as there may be a microbiome link to disease pathogenesis (Wilson Tang and Hazen 2017). Based on our results, some deaths may benefit from microbial evidence more than others. Specifically, drug-related deaths, cardiovascular disease, and suicides prompt further investigation with the postmortem microbiome. We chose multinomial logistic regression (MLR) as a simple, easy to apply model that is often used in clinical settings (de Jong et al. 2019). However, MLR has limitations, and biases towards classifying categories with larger sample sizes (de Jong et al. 2019). We achieved marginal improvement in these models with random forest indicator taxa compared to models using the full microbial community data set. This result was not entirely unexpected, as random forest model error rates ranged from: 53.1% - 64.4% (Table S38). Random forest indicators derived from error prone random forest models did not improve multinomial regression models 65 classifying M/COD. This illustrates that beta-dispersion can be calculated in a variety of ways, which has downstream effects on distinguishing categories of interest. Therefore, an objective approach to selecting beta-dispersion calculation should be used as outlined in this study. Instead, binary logistic regression models were most effective at improving model success. The categories with the highest classification success also had largest sample size (natural deaths/ accidents; cardiovascular disease/ drug-related deaths), and were highly correlated, as most natural deaths were cardiovascular disease (42/57) and most accidents were drug-related deaths (59/71). There was some overlap in pathology among cardiovascular disease deaths and drug-related deaths (Molina et al. 2020), showcasing how our best performing logistic regression models have potential applications in forensic death determination. While our case studies would benefit from further exploration with larger datasets, we provided strong evidence that other comparisons differentiating MOD, such suicide vs. non-suicides could also prove useful for forensic death determination. This was not the first study to classify M/COD from microbial communities; however, compared to previous studies using random forest classification, logistic regression performance of beta-dispersion was similar for the same dataset (Kaszubinski et al. 2019; Zhang et al. 2019; Pechal et al. 2018). Using random forest, MOD classification success with microbial communities alone (Kaszubinski et al. 2019) was comparable to multinomial logistic regression models built with beta-dispersion alone (~40%). Adding case metadata, random forest classification accuracy was consistent with multinomial logistic regression models in two body site communities (~60%) (Zhang et al. 2019). MLR does not depend on specific indicator postmortem microbial taxa, which can change across studies (Kaszubinski et al. 2019; Pechal et al. 2018). Furthermore, we suggest that microbial community information, either taxon 66 dependent (e.g., indicator taxa) or not (e.g., beta-dispersion) could be an addition piece of evidence in M/COD determination. By including metadata into our models, successful classification was improved rather than using microbial data alone, something to consider in future death investigation. Indeed, to our knowledge case metadata have never been modeled or used alone for M/COD determination. 67 Conclusion 68 Microbial community metrics, such as beta-dispersion, have potential forensic use in contributing to classification of M/COD during death investigation. This reflection is due to the antemortem link to the postmortem microbiome. We showed beta-dispersion increased based on disease status (cardiovascular disease) according to AKP, and beta-dispersion reflected M/COD, especially for cardiovascular disease and drug related deaths. While random forest is a useful tool for these types of datasets, MLR produced comparable results without reliance on specific indicator taxa. Furthermore, using two case studies we demonstrated circumstances where beta- dispersion could be used to distinguish MOD; however, low and uneven sample size was an issue for all case studies. Despite the reduced power of these case studies, this workflow may be useful for other forensic practitioners to test within their own sample set, that encompass new locations and metadata, to strengthen the antemortem link to the postmortem microbiome. The methods outlined in this study serve as a guide to developing non-taxonomic microbiome tools for other researchers and medical examiners in other geographic locations and investigation contexts. Ultimately, modeling beta-dispersion with case metadata is a tool that could be useful for medical examiners during death investigation, to combine with other methods of M/COD determination. 69 APPENDIX 70 Supplemental Figure 1: Summary of Kruskal-Wallis results for normalization strategies determining beta-dispersion. Each body site is represented across the x-axes. Each level for the corresponding normalization strategy (percent cutoff and minimum library sizes) are on the y- axes. Each box is indicative of the results of a Kruskal-Wallis test for either manner of death (A and B) or cause of death (C and D). Left-hand side of the figure includes unweighted unifrac distances (A and C), while the right-hand side includes weighted unifrac distances (B and D). Significant (p < 0.05) Kruskal-Wallis tests are indicated by gold, while nearly significant (p < 0.1) are indicated by blue. 71 Supplemental Figure 2: Summary of Fligner-Killeen results for normalization strategies determining beta-dispersion. Each body site is represented across the x-axes. Each level for the corresponding normalization strategy (percent cutoff and minimum library sizes) are on the y- axes. Each box is indicative of the results of a Fligner-Killeen test for either manner of death (A and B) or cause of death (C and D). Left-hand side of the figure includes unweighted unifrac distances (A and C), while the right-hand side includes weighted unifrac distances (B and D). Significant (p < 0.05) Fligner-Killeen tests are indicated by gold, while nearly significant (p < 0.1) are indicated by blue. 72 Supplemental Figure 3: Alpha-diversity metrics across normalization strategies and levels. Chao1 (richness) and Shannon diversity (richness and evenness) are reported. A significant decrease in alpha-diversity was found among percent cutoffs, and compared to minimum library sizes (Kruskal-Wallis and post hoc Nemenyi p < 0.05). 73 Supplemental Figure 4: Shannon diversity across sequencing depth among microbial samples. 74 REFERENCES 75 REFERENCES Anderson, Marti J. 2017. “Permutational Multivariate Analysis of Variance (PERMANOVA).” In Wiley StatsRef: Statistics Reference Online, 1–15. Chichester, UK: John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118445112.stat07841. Anderson, Marti J., Kari E. Ellingsen, and Brian H. McArdle. 2006. “Multivariate Dispersion as a Measure of Beta Diversity.” Ecology Letters 9 (6): 683–93. https://doi.org/10.1111/j.1461-0248.2006.00926.x. Barbian, Hannah J., Yingying Li, Miguel Ramirez, Zachary Klase, Iddi Lipende, Deus Mjungu, AH. Moeller, et al. 2015. “Destabilization of the Gut Microbiome Marks the End-Stage of Simian Immunodeficiency Virus Infection in Wild Chimpanzees.” American Journal of Primatology 80 (1): 1–29. https://doi.org/10.1002/ajp.22515. Bates, Douglas, Martin Maechler, Ben Bolker, Steven Walker, Rune Haubo Bojesen Christensen, Henrik Singmann, Bin Dai, et al. 2019. “Package ‘Lme4.’” Benbow, Mark Eric, Jennifer L. Pechal, Jennifer M. Lang, Racheal Erb, and John R. Wallace. 2015. “The Potential of High-Throughput Metagenomic Sequencing of Aquatic Bacterial Communities to Estimate the Postmortem Submersion Interval.” Journal of Forensic Sciences 60 (6): 1500–1510. https://doi.org/10.1111/1556-4029.12859. Böhning, Dankmar. 1992. “Multinomial Logistic Regression Algorithm.” Annals of the Institute of Statistical Mathematics 44 (1): 197–200. https://doi.org/10.1007/BF00048682. Bokulich, Nicholas A., Jai Ram Rideout, William G. Mercurio, Arron Shiffer, Benjamin Wolfe, Corinne F. Maurice, Rachel J. Dutton, Peter J. Turnbaugh, Rob Knight, and J. Gregory Caporaso. 2016. “Mockrobiota: A Public Resource for Microbiome Bioinformatics Benchmarking.” MSystems 1 (5). https://doi.org/10.1128/msystems.00062-16. Bolyen, Evan, Jai Ram Rideout, Matthew R. Dillon, Nicholas A. Bokulich, Christian C. Abnet, Gabriel A. Al-Ghalith, Harriet Alexander, et al. 2019. “Reproducible, Interactive, Scalable and Extensible Microbiome Data Science Using QIIME 2.” Nature Biotechnology 37 (8): 852–57. https://doi.org/10.1038/s41587-019-0209-9. Bozdogan, Hamparsum. 1987. “Model Selection and Akaike’s Information Criterion (AIC): The General Theory and Its Analytical Extensions.” Psychometrika 52 (3): 345–70. https://doi.org/10.1007/BF02294361. Breiman, Leo, Adele Cutler, Andy Liaw, and Matthew Wiener. 2018. “Package ‘RandomForest,’” 1–29. https://doi.org/10.1023/A:1010933404324. Budowle, Bruce, Steven E. Schutzer, Anja Einseln, Lynda C. Kelley, Anne C. Walsh, Jenifer A.L. Smith, Babetta L. Marrone, James Robertson, and Joseph Campos. 2003. “Building 76 Microbial Forensics as a Response to Bioterrorism.” Science 301 (5641): 1852–53. https://doi.org/10.1126/science.1090083. Callahan, Benjamin J., Paul J. McMurdie, Michael J. Rosen, Andrew W. Han, Amy Jo A. Johnson, and Susan P. Holmes. 2016. “DADA2: High-Resolution Sample Inference from Illumina Amplicon Data.” Nature Methods 13 (7): 581–83. https://doi.org/10.1038/nmeth.3869. Caporaso, J. Gregory, Rob Knight, and Scott T. Kelley. 2011. “Host-Associated and Free-Living Phage Communities Differ Profoundly in Phylogenetic Composition.” PLoS ONE 6 (2). https://doi.org/10.1371/journal.pone.0016900. Caporaso, J. Gregory, Justin Kuczynski, Jesse Stombaugh, Kyle Bittinger, Frederic D. Bushman, Elizabeth K. Costello, Noah Fierer, et al. 2010. “QIIME Allows Analysis of High- Throughput Community Sequencing Data.” Nature Methods. https://doi.org/10.1038/nmeth.f.303. Caporaso, J. Gregory, Christian L. Lauber, Elizabeth K. Costello, Donna Berg-Lyons, Antonio Gonzalez, Jesse Stombaugh, Dan Knights, et al. 2011. “Moving Pictures of the Human Microbiome.” Genome Biology 12 (5): R50. https://doi.org/10.1186/gb-2011-12-5-r50. Caporaso, J. Gregory, Christian L. Lauber, William A. Walters, Donna Berg-Lyons, James Huntley, Noah Fierer, Sarah M. Owens, et al. 2012. “Ultra-High-Throughput Microbial Community Analysis on the Illumina HiSeq and MiSeq Platforms.” ISME Journal 6 (8): 1621–24. https://doi.org/10.1038/ismej.2012.8. Carter, David O., Jeffery K. Tomberlin, M. Eric. Benbow, and Jessica L. Metcalf. 2016. Forensic Microbiology. John Wiley & Sons, Incorporated. Clarke, Thomas H., Andres Gomez, Harinder Singh, Karen E. Nelson, and Lauren M. Brinkac. 2017. “Integrating the Microbiome as a Resource in the Forensics Toolkit.” Forensic Science International: Genetics 30 (2017): 141–47. https://doi.org/10.1016/j.fsigen.2017.06.008. Croissant, Yves. 2019. “Package ‘Mlogit,’” 1–40. https://doi.org/10.1017/CBO9780511805271. D’Argenio, Valeria, Giorgio Casaburi, Vincenza Precone, and Francesco Salvatore. 2014. “Comparative Metagenomic Analysis of Human Gut Microbiome Composition Using Two Different Bioinformatic Pipelines.” Biomed Res Int 2014: 325340. https://doi.org/10.1155/2014/325340. DeBruyn, Jennifer M., and Kathleen A. Hauther. 2017. “Postmortem Succession of Gut Microbial Communities in Deceased Human Subjects.” PeerJ. https://doi.org/10.7717/peerj.3437. Dobay, Akos, Cordula Haas, Geoffrey Fucile, Nora Downey, Hilary G. Morrison, Adelgunde 77 Kratzer, and Natasha Arora. 2019. “Microbiome-Based Body Fluid Identification of Samples Exposed to Indoor Conditions.” Forensic Science International: Genetics 40 (May): 105–13. https://doi.org/10.1016/j.fsigen.2019.02.010. Edgar, Robert C. 2010. “Search and Clustering Orders of Magnitude Faster than BLAST.” Bioinformatics 26 (19): 2460–61. https://doi.org/10.1093/bioinformatics/btq461. Faul, Franz, Edgar Erdfelder, Albert Georg Lang, and Axel Buchner. 2007. “G*Power 3: A Flexible Statistical Power Analysis Program for the Social, Behavioral, and Biomedical Sciences.” In Behavior Research Methods, 39:175–91. Psychonomic Society Inc. https://doi.org/10.3758/BF03193146. Glass, Elizabeth M., Jared Wilkening, Andreas Wilke, Dionysios Antonopoulos, and Folker Meyer. 2010. “Using the Metagenomics RAST Server (MG-RAST) for Analyzing Shotgun Metagenomes.” Cold Spring Harbor Protocols 5 (1): pdb.prot5368. https://doi.org/10.1101/pdb.prot5368. Golob, Jonathan L., Elisa Margolis, Noah G. Hoffman, and David N. Fredricks. 2017. “Evaluating the Accuracy of Amplicon-Based Microbiome Computational Pipelines on Simulated Human Gut Microbial Communities.” BMC Bioinformatics 18 (1): 1–12. https://doi.org/10.1186/s12859-017-1690-0. Hyde, Embriette R., Daniel P. Haarmann, Aaron M. Lynne, Sibyl R. Bucheli, and Joseph F. Petrosino. 2013. “The Living Dead: Bacterial Community Structure of a Cadaver at the Onset and End of the Bloat Stage of Decomposition.” Edited by Corrie S. Moreau. PLoS ONE 8 (10): e77733. https://doi.org/10.1371/journal.pone.0077733. Ioannidis, John P. A. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124. https://doi.org/10.1371/journal.pmed.0020124. Javan, Gulnaz T., Sheree J. Finley, Ismail Can, Jeremy E. Wilkinson, J. Delton Hanson, and Aaron M. Tarone. 2016. “Human Thanatomicrobiome Succession and Time since Death.” Scientific Reports. https://doi.org/10.1038/srep29598. Javan, Gulnaz T., Sheree J. Finley, Tasia Smith, Joselyn Miller, and Jeremy E. Wilkinson. 2017. “Cadaver Thanatomicrobiome Signatures: The Ubiquitous Nature of Clostridium Species in Human Decomposition.” Frontiers in Microbiology. https://doi.org/10.3389/fmicb.2017.02096. Johnson, Hunter R., Donovan D. Trinidad, Stephania Guzman, Zenab Khan, James V. Parziale, Jennifer M. DeBruyn, and Nathan H. Lents. 2016. “A Machine Learning Approach for Using the Postmortem Skin Microbiome to Estimate the Postmortem Interval.” PLoS ONE 11 (12): 1–23. https://doi.org/10.1371/journal.pone.0167370. Jong, Valentijn M.T. de, Marinus J.C. Eijkemans, Ben van Calster, Dirk Timmerman, Karel G.M. Moons, Ewout W. Steyerberg, and Maarten van Smeden. 2019. “Sample Size 78 Considerations and Predictive Performance of Multinomial Logistic Prediction Models.” Statistics in Medicine 38 (9): 1601–19. https://doi.org/10.1002/sim.8063. Kaszubinski, Sierra F, Jennifer L Pechal, Carl J Schmidt, Jordan R Heather, M Eric Benbow, and Mariah H Meek. 2019. “Evaluating Bioinformatic Pipeline Performance for Forensic Microbiome Analysis.” Journal of Forensic Sciences, 1–13. https://doi.org/10.1111/1556- 4029.14213. Katoh, Kazutaka, and Daron M. Standley. 2013. “MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability.” Molecular Biology and Evolution 30 (4): 772–80. https://doi.org/10.1093/molbev/mst010. Keegan, Kevin P., Elizabeth M. Glass, and Folker Meyer. 2016. “MG-RAST, a Metagenomics Service for Analysis of Microbial Community Structure and Function.” In Methods in Molecular Biology, 1399:207–33. Humana Press Inc. https://doi.org/10.1007/978-1-4939- 3369-3_13. Kodama, Whitney A., Zhenjiang Xu, Jessica L. Metcalf, Se Jin Song, Nicholas Harrison, Rob Knight, David O. Carter, and Christopher B. Happy. 2019. “Trace Evidence Potential in Postmortem Skin Microbiomes: From Death Scene to Morgue.” Journal of Forensic Sciences 64 (3): 791–98. https://doi.org/10.1111/1556-4029.13949. Kozich, James J., Sarah L. Westcott, Nielson T. Baxter, Sarah K. Highlander, and Patrick D. Schloss. 2013. “Development of a Dual-Index Sequencing Strategy and Curation Pipeline for Analyzing Amplicon Sequence Data on the MiSeq Illumina Sequencing Platform.” Applied and Environmental Microbiology 79 (17): 5112–20. https://doi.org/10.1128/AEM.01043-13. Kursa, Miron Bartosz, and Witold Remigiusz Rudnicki. 2018. “Package ‘Boruta.’” https://notabug.org/mbq/Boruta/. Leipzig, Jeremy. 2017. “A Review of Bioinformatic Pipeline Frameworks.” Briefings in Bioinformatics 18 (3): 530–36. https://doi.org/10.1093/bib/bbw020. Liang, Shan, Xiaoli Wu, Xu Hu, Tao Wang, and Feng Jin. 2018. “Recognizing Depression from the Microbiota–Gut–Brain Axis.” International Journal of Molecular Sciences. MDPI AG. https://doi.org/10.3390/ijms19061592. Mandal, Siddhartha, Will Van Treuren, Richard A. White, Merete Eggesbø, Rob Knight, and Shyamal D. Peddada. 2015. “Analysis of Composition of Microbiomes: A Novel Method for Studying Microbial Composition.” Microbial Ecology in Health & Disease 26 (0). https://doi.org/10.3402/mehd.v26.27663. McDonald, Daniel, Jose C. Clemente, Justin Kuczynski, Jai R. Rideout, Jesse Stombaugh, Doug Wendel, Andreas Wilke, et al. 2012. “The Biological Observation Matrix (BIOM) Format or: How I Learned to Stop Worrying and Love the Ome-Ome.” GigaScience. Oxford 79 University Press. https://doi.org/10.1186/2047-217X-1-7. McMurdie, Paul J., and Susan Holmes. 2013. “Phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data.” PLoS ONE 8 (4). https://doi.org/10.1371/journal.pone.0061217. McMurdie, Paul J., and Susan Holmes. 2014. “Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible.” PLoS Computational Biology 10 (4). https://doi.org/10.1371/journal.pcbi.1003531. Meckel, Katherine R., and Drew D. Kiraly. 2019. “A Potential Role for the Gut Microbiome in Substance Use Disorders.” Psychopharmacology. Springer Verlag. https://doi.org/10.1007/s00213-019-05232-0. Metcalf, Jessica L., Laura Wegener Parfrey, Antonio Gonzalez, Christian L. Lauber, Dan Knights, Gail Ackermann, Gregory C. Humphrey, et al. 2013. “A Microbial Clock Provides an Accurate Estimate of the Postmortem Interval in a Mouse Model System.” ELife 2013 (2): 1–19. https://doi.org/10.7554/eLife.01104.001. Metcalf, Jessica L., Zhenjiang Z. Xu, Amina Bouslimani, Pieter Dorrestein, David O. Carter, and Rob Knight. 2017. “Microbiome Tools for Forensic Science.” Trends in Biotechnology 35 (9): 814–23. https://doi.org/10.1016/j.tibtech.2017.03.006. Metcalf, Jessica L., Zhenjiang Zech Xu, Sophie Weiss, Simon Lax, Will Van Treuren, Embriette R Hyde, Se Jin Song, et al. 2016. “Microbial Community Assembly and Metabolic Function during Mammalian Corpse Decomposition.” Science 351 (6269): 158–62. https://doi.org/10.1126/science.aad2646. Molina, D. Kimberly, Kathryn Vance, Maci L. Coleman, Veronica M. Hangrove. 2020. “Testing an Age‐old Adage: Can Autopsy Findings be of Assistance in Differentiating Opioid Versus Cardiac Deaths?” Journal of Forensic Sciences 65 (1):112–6. https://onlinelibrary.wiley.com/doi/abs/10.1111/1556-4029.14174 Mysara, Mohamed, Mercy Njima, Natalie Leys, Jeroen Raes, and Pieter Monsieurs. 2017. “From Reads to Operational Taxonomic Units: An Ensemble Processing Pipeline for MiSeq Amplicon Sequencing Data.” GigaScience 6 (2): 1–10. https://doi.org/10.1093/gigascience/giw017. Naseribafrouei, A., K. Hestad, E. Avershina, M. Sekelja, A. Linløkken, R. Wilson, and K. Rudi. 2014. “Correlation between the Human Fecal Microbiota and Depression.” Neurogastroenterology & Motility 26 (8): 1155–62. https://doi.org/10.1111/nmo.12378. Nilakanta, Haema, Kimberly L. Drews, Suzanne Firrell, Mary A. Foulkes, and Kathleen A. Jablonski. 2014. “A Review of Software for Analyzing Molecular Sequences.” BMC Research Notes 7 (1). https://doi.org/10.1186/1756-0500-7-830. 80 Oksanen, Jari, F Guillaume Blanchet, Michael Friendly, Roeland Kindt, Pierre Legendre, Dan Mcglinn, Peter R Minchin, et al. 2019. “Vegan: Community Ecology Package.” Pearson, Amber L., Amanda Rzotkiewicz, Jennifer L. Pechal, Carl J. Schmidt, Heather R. Jordan, Adam Zwickle, and M. Eric Benbow. 2019. “Initial Evidence of the Relationships between the Human Postmortem Microbiome and Neighborhood Blight and Greening Efforts.” Annals of the American Association of Geographers 109 (3): 958–78. https://doi.org/10.1080/24694452.2018.1519407. Pechal, Jennifer L., Tawni L. Crippen, M. Eric Benbow, Aaron M. Tarone, Scot Dowd, and Jeffery K. Tomberlin. 2014. “The Potential Use of Bacterial Community Succession in Forensics as Described by High Throughput Metagenomic Sequencing.” International Journal of Legal Medicine 128 (1): 193–205. https://doi.org/10.1007/s00414-013-0872-1. Pechal, Jennifer L., Carl J. Schmidt, Heather R. Jordan, and M. Eric Benbow. 2017. “Frozen: Thawing and Its Effect on the Postmortem Microbiome in Two Pediatric Cases.” Journal of Forensic Sciences 62 (5): 1399–1405. https://doi.org/10.1111/1556-4029.13419. Pechal, Jennifer L., Carl J. Schmidt, Heather R. Jordan, and M. Eric Benbow. 2018. “A Large- Scale Survey of the Postmortem Human Microbiome, and Its Potential to Provide Insight into the Living Health Condition.” Scientific Reports 8 (1): 1–15. https://doi.org/10.1038/s41598-018-23989-w. Plummer, Erica, and Jimmy Twin. 2015. “A Comparison of Three Bioinformatics Pipelines for the Analysis of Preterm Gut Microbiota Using 16S RRNA Gene Sequencing Data.” Journal of Proteomics & Bioinformatics 8 (12): 283–91. https://doi.org/10.4172/jpb.1000381. Pohlert, Thorsten. 2018. “PMCMR: Calculate Pairwise Multiple Comparisons of Mean Rank Sums.” Price, Morgan N, Paramvir S Dehal, and Adam P Arkin. 2010. “FastTree 2--Approximately Maximum-Likelihood Trees for Large Alignments.” PloS One 5 (3). https://doi.org/10.1371/journal.pone.0009490. Quast, Christian, Elmar Pruesse, Pelin Yilmaz, Jan Gerken, Timmy Schweer, Pablo Yarza, Jörg Peplies, and Frank Oliver Glöckner. 2013. “The SILVA Ribosomal RNA Gene Database Project: Improved Data Processing and Web-Based Tools.” Nucleic Acids Research 41 (January). https://doi.org/10.1093/nar/gks1219. Randy, Hanzlick, John C. Hunsaker III, and Gregory J. Davis. 2002. “A Guide for Manner of Death Classification.” Rognes, Torbjørn, Tomáš Flouri, Ben Nichols, Christopher Quince, and Frédéric Mahé. 2016. “VSEARCH: A Versatile Open Source Tool for Metagenomics.” PeerJ, 1–22. https://doi.org/10.7717/peerj.2584. 81 Rosa, Patricio S. la, J. Paul Brooks, Elena Deych, Edward L. Boone, David J. Edwards, Qin Wang, Erica Sodergren, George Weinstock, and William D. Shannon. 2012. “Hypothesis Testing and Power Calculations for Taxonomic-Based Human Microbiome Data.” PLoS ONE 7 (12): e52078. https://doi.org/10.1371/journal.pone.0052078. Schloss, Patrick D., Sarah L. Westcott, Thomas Ryabin, Justine R. Hall, Martin Hartmann, Emily B. Hollister, Ryan A. Lesniewski, et al. 2009. “Introducing Mothur: Open-Source, Platform- Independent, Community-Supported Software for Describing and Comparing Microbial Communities.” Applied and Environmental Microbiology 75 (23): 7537–41. https://doi.org/10.1128/AEM.01541-09. Schmedes, Sarah E., Antti Sajantila, and Bruce Budowle. 2016. “Expansion of Microbial Forensics.” Journal of Clinical Microbiology 54 (8): 1964–74. https://doi.org/10.1128/JCM.00046-16. Schmedes, Sarah E., August E. Woerner, and Bruce Budowle. 2017. “Forensic Human Identification Using Skin Microbiomes.” Applied and Environmental Microbiology 83 (22). https://doi.org/10.1128/AEM.01672-17. Shade, Ashley, and Jo Handelsman. 2012. “Beyond the Venn Diagram: The Hunt for a Core Microbiome.” Environmental Microbiology 14 (1): 4–12. https://doi.org/10.1111/j.1462- 2920.2011.02585.x. Siegwald, Léa, Hélène Touzet, Yves Lemoine, David Hot, Christophe Audebert, and Ségolène Caboche. 2017. “Assessment of Common and Emerging Bioinformatics Pipelines for Targeted Metagenomics.” PLoS ONE 12 (1): 1–26. https://doi.org/10.1371/journal.pone.0169563. Sivarajah, Uthayasankar, Muhammad Mustafa Kamal, Zahir Irani, and Vishanth Weerakkody. 2017. “Critical Analysis of Big Data Challenges and Analytical Methods.” Journal of Business Research 70 (January): 263–86. https://doi.org/10.1016/j.jbusres.2016.08.001. Team, R Core. “R: A Language and Environment for Statistical Computing.” R Foundation for Statistical Computing. Accessed October 23, 2019. https://www.gbif.org/tool/81287/r-a- language-and-environment-for-statistical-computing. Turnbaugh, Peter J., Ruth E. Ley, Micah Hamady, Claire M. Fraser-Liggett, Rob Knight, and Jeffrey I. Gordon. 2007. “The Human Microbiome Project.” Nature. Nature Publishing Group. https://doi.org/10.1038/nature06244. “U.S. Census Bureau QuickFacts: Chicago City, Illinois.”. https://www.census.gov/quickfacts/chicagocityillinois. “U.S. Census Bureau QuickFacts: Cincinnati City, Ohio.”. https://www.census.gov/quickfacts/cincinnaticityohio. 82 “U.S. Census Bureau QuickFacts: Detroit City, Michigan; Michigan” https://www.census.gov/quickfacts/fact/table/detroitcitymichigan,MI/PST045219. Weiss, Sophie, Zhenjiang Zech Xu, Shyamal Peddada, Amnon Amir, Kyle Bittinger, Antonio Gonzalez, Catherine Lozupone, et al. 2017. “Normalization and Microbial Differential Abundance Strategies Depend upon Data Characteristics.” Microbiome 5 (1): 1–18. https://doi.org/10.1186/s40168-017-0237-y. Wilson Tang, W. H., and Stanley L. Hazen. 2017. “The Gut Microbiome and Its Role in Cardiovascular Diseases.” Circulation 135 (11): 1008–10. https://doi.org/10.1161/CIRCULATIONAHA.116.024251. Zaneveld, Jesse R., Ryan McMinds, and Rebecca Vega Thurber. 2017. “Stress and Stability: Applying the Anna Karenina Principle to Animal Microbiomes.” Nature Microbiology 2 (August): 1–8. https://doi.org/10.1038/nmicrobiol.2017.121. Zhang, Yu, Jennifer L. Pechal, Carl J. Schmidt, Heather R. Jordan, Wesley W. Wang, M. Eric Benbow, Sing Hoi Sze, and Aaron M. Tarone. 2019. “Machine Learning Performance in a Microbial Molecular Autopsy Context: A Cross-Sectional Postmortem Human Population Study.” PLoS ONE. https://doi.org/10.1371/journal.pone.0213829. 83