Michigan State University

This is to certify that the thesis entitled RHIZOBIAL T-RFLP ANALYSIS FOR DIFFERENTIATING SOILS AND HABITATS presented by Erin Jennifer Lenz has been accepted towards fulfillment of the requirements for the Master of Science degree in Forensic Science.

RHIZOBIAL T-RFLP ANALYSIS FOR DIFFERENTIATING SOILS AND HABITATS

By

Erin Jennifer Lenz

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Forensic Science

2008

ABSTRACT

Soil can be of broad evidentiary value as it is commonly found in many locations that may link a victim or suspect to the scene of a crime. Soil samples from a shoe, tire, clothing, or any other material may be collected by the crime scene investigator and taken back to the laboratory for further analysis. While traditional methods of soil analysis can differentiate soils, they lack the ability to positively link the questioned soil to the area from which it originated. Recent research has examined microbial communities as a tool for differentiating soil from diverse locations in a forensic setting. A vast number of microorganisms, particularly bacteria, are present in soil. These differ throughout geographical regions and have the potential to establish the site from which a soil sample originated.
This study takes advantage of rhizobia, soil bacteria that fix nitrogen after becoming established inside root nodules of legumes. Rhizobia require a plant host, so their presence is highly variable based on plant species in an area. Terminal Restriction Fragment Length Polymorphism (T-RFLP) was studied as a means of establishing temporal variability, local heterogeneity, and whether the five habitats being compared could be differentiated from each other. Using multidimensional scaling (MDS), when all five habitats were compared, only two could frequently be distinguished from others. However, by comparing only two habitats at a time, each habitat but the agricultural field could be separated.

ACKNOWLEDGEMENTS

There are many individuals I would like to thank for their contributions and help throughout the course of this project. First, I would like to express profound gratitude to my advisor, Dr. David Foran. His advice throughout this research work as well as help with thesis revisions enabled this project to be as accurate and precise as possible. I am also very thankful to Dr. Rich Merritt, who offered encouragement and advice even with trips to Africa and beyond scheduled throughout the summer. Finally, I am grateful for the help of Dr. Steve Chermak, who offered to help someone he had never met before.

I would also like to thank Dr. Steve Mauro and Luis Cabo, two very special professors at Mercyhurst College who helped me on numerous occasions throughout both my undergraduate and graduate experiences. Their support and encouragement have given me the confidence to pursue a career in forensic biology.

Finally, I am so lucky to have such good friends and family. My colleagues in the forensic biology laboratory have offered advice and support on countless occasions. And, last but certainly not least, my parents and grandmother have been an inspiration in my life. I owe much of where I am now to Keith and Pamela Lenz as well as Nancy Lynn.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
    Conventional Forensic Soil Analyses
    Microbial Community Analysis
    Terminal Restriction Fragment Length Polymorphism Analysis
    Multidimensional Scaling of T-RFLP Profiles
    The Utility of T-RFLP in the Comparison of Soils
    Complications with T-RFLP Analysis
    T-RFLP Analysis at the Forensic Biology Laboratory of MSU
    Rhizobia in Microbial Community Analysis
    The Utility of Rhizobial T-RFLP Forensic Soil Analysis
MATERIALS AND METHODS
    Sample Collection
    DNA Extractions
    DNA Amplification of Rhizobial DNA from Soil
    Restriction Digestion of Amplified RecA Gene
    Capillary Electrophoresis of Restriction Digests
    Analysis of T-RFLP Profiles
    Comparison of Replicates
RESULTS
    Amplification of the RecA Gene
    T-RFLP Results
    Reproducibility of T-RFLP Profiles
    Multidimensional Scaling as a Measure of Temporal Variability
    Multidimensional Scaling as a Measure of Habitat Heterogeneity
    Peak Number as a Source of Outliers
    The Utility of Multidimensional Scaling as Determined by Addition of "Unknown" Samples
    Common Peaks for the Production of Genetic 'Fingerprints'
DISCUSSION
    Conclusions
APPENDIX
REFERENCES

LIST OF TABLES

Averaged Similarity Indices for T-RFLP Replicates
Similarity Indices for Technical Replicates
Genetic 'Fingerprints' for each Habitat and Enzyme
Similarity Indices for PCR Replicates
Indicative Peaks for DpnII
Indicative Peaks for MspI
Indicative Peaks for RsaI

LIST OF FIGURES

Diagram Illustrating the Classification of Ped Structures
Illustration of the Terminal Restriction Fragment Length Polymorphism Process
Typical T-RFLP Profile
Multidimensional Scaling Results for Main Collection Sites using DpnII
Multidimensional Scaling Results for Main Collection Sites using MspI
Multidimensional Scaling Results for Main Collection Sites using RsaI
Multidimensional Scaling Results for Directional Samples using DpnII, in June
Multidimensional Scaling Results for Directional Samples using MspI, in December
Multidimensional Scaling Results for Directional Samples using RsaI, in June
Multidimensional Scaling Results for the Agricultural Field and Woodlot, with Added "Unknown"
Pie Chart Depicting Groupings of Unknowns by Samples Digested with DpnII
Pie Chart Depicting Groupings of Unknowns by Samples Digested with MspI
Pie Chart Depicting Groupings of Unknowns by Samples Digested with RsaI
Multidimensional Scaling Results for Directional Samples using DpnII, in March
Multidimensional Scaling Results for Directional Samples using DpnII, in September
Multidimensional Scaling Results for Directional Samples using DpnII, in December
Multidimensional Scaling Results for Directional Samples using MspI, in March
Multidimensional Scaling Results for Directional Samples using MspI, in June
Multidimensional Scaling Results for Directional Samples using MspI, in September
Multidimensional Scaling Results for Directional Samples using RsaI, in March
Multidimensional Scaling Results for Directional Samples using RsaI, in September
Multidimensional Scaling Results for Directional Samples using RsaI, in December

INTRODUCTION

Soil can be of broad evidentiary value as it is very common and can potentially link a victim or suspect to the scene of a crime. In many situations, investigations may be strengthened if soil from, for instance, a shoe, tire, or clothing, is analyzed (Hopkins et al. 2000; Pye et al. 2007). Soil samples can also help determine whether a body has been moved from another location, perhaps where a crime occurred, or any other site that may provide a clue to the identity of a suspect or aid in reconstructing a crime.

Soil analysis has been used in a forensic context on countless occasions (McKinley and Ruffell 2007). In one instance, war crime investigations were aided substantially by soils taken from mass graves (Brown 2006). Officials of the United Nations International Criminal Tribunal for the Former Yugoslavia (ICTY) traveled to Bosnia from 1997 through 2002 with the goal of locating and exhuming mass graves that resulted from the Srebrenica Massacre. Multiple bodies had been transported from their original grave sites to secondary sites, which were later exhumed by the ICTY. Soil analysis assisted in the generation of an environmental profile that was used to match bodies to original mass graves. This helped to complete a chain of evidence that confirmed what originally happened in the Srebrenica Massacre and subsequent attempts to conceal evidence related to it.

Soil analysis can also be a key factor in criminal court hearings. One such case transpired in Belfast, Northern Ireland in 2005 (McKinley and Ruffell 2007). A shooting occurred when the victim arrived at the entrance to his work.
Samples of soil located around the scene and nearby abandoned land were collected by a crime scene investigator. Later, additional soils from a gutter adjacent to the shooting, as well as the land where the suspect had been standing when he fatally shot the victim, were collected and compared to samples taken from the shoes of the suspect. Unique particles (pieces of plaster that were <0.5 mm in diameter) located in some of the control samples were consistent with those found in the suspect's shoe treads, linking him to the area of the shooting.

Conventional Forensic Soil Analyses

Traditional methods of forensic soil analysis involve the physical comparison of soil colors and particle sizes, assessment of any other materials in the soil, and examination of chemical features such as pH and organic content (Pye et al. 2007). Determination of ped structure (units of soil composed of individual particles of sand, silt, clay, and organic material) as being platy, prismatic, columnar, blocky, or granular (Figure 1) is one of the first steps in a soil investigation. Soils with a platy structure are flat and are often found in subsurface samples that have been compacted by animals or machinery (United States Department of Agriculture Soil Survey Manual 1993). In the prismatic structure, individual units are composed of flat to rounded vertical faces and are longer vertically than horizontally. Columnar structures are similar to prismatic units, but have rounded column tops rather than the angled vertices of prismatic units. Segments of the blocky structure are square in shape and commonly composed of clay. Finally, components with a granular structure appear to be spherical or polyhedral and are said to look like "cookie crumbs."

Figure 1. Diagram Illustrating the Classification of Ped Structures

[Figure panels: granular; blocky (subangular and angular); prismatic; columnar; platy.]

Peds (soil aggregates) may be classified as granular, blocky, platy, prismatic, or columnar as defined by the United States Department of Agriculture. (Image modified from Soil Survey Data for New England States, http://nesoil.com/images/structuregif)

Determinations of color and grain size are often included in the forensic examination of soils. Historically, the Munsell Color System was used to differentiate among soil colors (Adderley et al. 2002). More recently, Munsell values have been digitally recorded and compared to a computer database to establish the color of the soil, reducing user subjectivity. Grain size distribution analysis may also aid in the investigation of soils, as samples from differing geographic regions can be collected and measured to give a distribution profile that will characterize them (Murray and Solebello 2002). By using grain size distributions coupled with color analysis, a high degree of discrimination among soils can be achieved (Junger 1996; Sugita and Marumo 2001).

Texture is also analyzed when generating a physical profile of soil particles. This involves the classification of soil as sand, sandy loam, loam, clay loam, light clay, medium clay, or heavy clay (Williams et al. 1983). The Soil Science Society of America defines clay as soil that contains 40% or more clay, the amount of which differentiates among the specific types. A clay loam is soil that consists of 27 to 40% clay and 20 to 45% sand. To obtain a sand classification, the soil must be composed of more than 85% sand. A sandy loam has 7 to 20% clay, more than 52% sand, and a percentage of silt plus twice the percentage of clay of 30 or more. Finally, loam is defined as soil that includes 7 to 27% clay, 28 to 50% silt, and less than 52% sand.

Organic composition is frequently determined in the forensic and environmental analysis of soil (Wanogho et al. 1989). This is done by measuring the weight loss after organic matter is removed through ignition (Boyle 2004).
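The weight-loss measurement described above reduces to a single ratio. The following is a minimal sketch, assuming the loss is expressed as a percentage of the oven-dry sample mass; the function name and the example masses are hypothetical and are not taken from the cited studies.

```python
def loss_on_ignition_percent(dry_mass_g: float, ignited_mass_g: float) -> float:
    """Return organic matter lost on ignition as a percentage of dry mass.

    dry_mass_g     -- mass of the oven-dried soil before ignition (grams)
    ignited_mass_g -- mass of the same sample after ignition (grams)
    Both arguments are illustrative inputs, not values from the thesis.
    """
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    if not 0 <= ignited_mass_g <= dry_mass_g:
        raise ValueError("ignited mass must lie between 0 and the dry mass")
    # Mass lost during ignition is attributed to organic matter.
    return 100.0 * (dry_mass_g - ignited_mass_g) / dry_mass_g


if __name__ == "__main__":
    # Hypothetical sample: 10.00 g dry soil, 9.35 g after ignition.
    print(f"{loss_on_ignition_percent(10.00, 9.35):.2f} % organic matter")
```

For the hypothetical sample shown, the loss is about 6.5% of the dry mass. Two soils can then be compared on this single value alongside color and texture.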
Determination of carbon content can also be used to establish the organic composition of soils (Chan et al. 2002). Recently, automated methods have allowed for a more precise measurement of total carbon content in soil. LECO Carbon Analyzers and Carlo Erba CN Elemental Analyzers have been frequently used for this purpose (Cai et al. 2002; Galantini and Rosell 2006).

A variety of other techniques may be used to differentiate soils. Light microscopy can help identify plant or animal material, as well as many additional types of debris (Saferstein 2001). Transmission electron microscopy has been utilized to study the surface of a particle, while scanning electron microscopy achieves a greater magnification of the surfaces of particles through three-dimensional images (Morgan and Bull 2007). Fourier transform infrared spectroscopy can discriminate among soils when other methods fail, as it achieves a high degree of resolution among similar organic groups found in soil (Cox et al. 2000). As an alternative, density gradient techniques can be applied to separate components of the sampled material (Petraco and Kubic 2000). Finally, elemental variability has been employed as a method to differentiate among soil samples (Jarvis et al. 2004) through the use of inductively coupled plasma atomic emission spectroscopy and inductively coupled plasma mass spectrometry (Pye et al. 2006).

There have also been advancements in biological techniques used to 'fingerprint' soil samples. Some forensic laboratories utilize palynologists (experts who identify pollen), who can potentially determine the location from which pollen originated (Horrocks 2004). Likewise, the use of plant wax (lipids found in dust and sediments that are derived from the outer wax covering of terrestrial plant leaves) may provide a link between compared soils (Dawson et al. 2003).
It is also feasible to associate the epicuticular wax of most plants, which contains mixtures of hydrocarbons, with vegetation that currently exists or used to live at a particular site (Dawson et al. 2004).

Microbial Community Analysis

Currently, research is being performed to determine the potential of microbial communities in soil (or any other medium that bacteria find attractive) to form a microbial 'fingerprint' specific for a certain geological area. The number of microbial species is in the millions, if not billions (Cohan 2002; Oren 2004). Less than 1% of microorganisms that can be viewed microscopically have been cultivated and characterized, meaning that soil ecosystems are largely uncharted (Torsvik and Ovreas 2002). Microbial diversity is complex at many levels, encompassing variability within species as well as in their number and relative abundance. It has been proposed that microbial communities differ throughout geographical regions, and thus can be used to establish the site from which a soil sample originated (Heath and Saunders 2006).

In 1992, de Bruijn (of Michigan State University) utilized families of short intergenic repeated sequences found in enterobacteria such as Escherichia coli and Salmonella typhimurium to generate unique profiles for individual bacterial strains. Polymerase chain reaction (PCR) amplification, with primers specific for the conserved region of interest, was performed and amplicons visualized on an agarose gel. Results yielded multiple DNA products that created distinct fingerprint patterns.

A high degree of resolution must be achieved in microbial community analysis owing to the amount of molecular conservation present in most bacterial species (Perret and Broughton 1998). Genetic fingerprinting methods such as single strand conformation polymorphism analysis (Schwieger and Tebbe 1998), restriction fragment length polymorphism (RFLP) (Widmer et al.
2001), and denaturing gradient gel electrophoresis (DGGE) (Muyzer and Smalla 1998) have been used to differentiate microbial communities existing in soil, but are time consuming and often cannot distinguish among bacterial strains that differ only marginally. Amplified fragment length polymorphism (AFLP) has a high throughput capacity, screening for many different DNA regions distributed randomly throughout the genome, but is not locus-specific (Franklin and Mills 2003). PCR amplification of phylogenetic markers like those encoding the small or large subunit of ribosomal RNA (rRNA) followed by DGGE or RFLP has helped to discriminate microbial communities originating from diverse environments (Casamayor et al. 2002), but may provide too much information pertaining to all bacterial species existing in soil for a reproducible genomic profile to be obtained.

Due to its high resolving power, terminal restriction fragment length polymorphism (T-RFLP; explained below) analysis can provide a representative picture of microbial existence in soil (Pesaro et al. 2004). Capillary electrophoresis allows for bacterial species with very few genetic differences to be differentiated and analyzed, generating a community profile (Marsh 1999). Soils taken from one habitat have proven to be more similar to each other than samples taken from different habitats (Heath and Saunders 2006; Meyers and Foran 2008), and distinct terminal restriction fragments with peak areas above 1% of the total peak area are highly reproducible (Lukow et al. 2000). T-RFLP analysis is a potentially useful tool for forensic analysis as it can be performed with equipment already existing in laboratories. Little new training and expertise is required and analyses are less time consuming than standard techniques used to characterize soil.

Terminal Restriction Fragment Length Polymorphism Analysis

T-RFLP analysis has been used increasingly in microbial ecology research (Marsh 2005).
One of the first studies published showed that related bacterial strains can be distinguished from each other (Liu et al. 1997). Since then, T-RFLP has been used to determine bacterial (Liu et al. 1998) and archaeal (Van der Maarel et al. 1998) population diversity. Blackwood et al. (2003) were also able to discriminate among similar microbial communities present in soil but stressed that it is imperative to determine the significance of results using the statistical technique that best fits the data, as data with unique patterns (such as many outliers or a high degree of similarity between samples) can best be characterized by different analyses.

The first stage in obtaining a T-RFLP profile is to extract total DNA/RNA (Figure 2, Step 1). This can be done by culturing the bacteria on selective media and then extracting DNA from the colonies (Avaniss-Aghajani et al. 1996). It is also possible to isolate DNA by homogenizing the soil and performing an organic extraction (Liu et al. 1997). Alternatively, commercial kits can be used to extract bacterial DNA through cell lysis followed by DNA purification (Marsh 2005), although they may not always yield DNA that can be amplified through PCR (LaMontagne et al. 2003). Meyers and Foran (2008) concluded that the success of DNA amplification depended on the kit used to extract DNA as well as the location from which the soils were taken.

The next stage of T-RFLP analysis is to amplify the locus of interest (Figure 2, Step 2). Amplification is performed with a 5' fluorescently tagged primer for detection during capillary electrophoresis (Liu et al. 1997; Marsh 2005). The amplicon is then digested with a restriction enzyme that produces different sized products (Figure 2, Step 3). Once subjected to capillary electrophoresis (Figure 2, Step 4), only the restriction fragments that include the labeled primer are detected.
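The detection rule just described, in which only the fragment carrying the 5' label is seen, can be illustrated with a short in silico digest. This is a minimal sketch under stated assumptions: the amplicon is read 5' to 3' from the labeled primer, digestion is complete, and the toy sequence is hypothetical. The recognition sequences and cut positions used (MspI C^CGG, RsaI GT^AC, DpnII ^GATC) are the standard ones for those enzymes; the function and variable names are illustrative only.

```python
# Recognition sequence and the number of bases of the site that remain
# on the 5' (labeled) side of the cut on the labeled strand.
ENZYMES = {
    "MspI": ("CCGG", 1),   # C^CGG
    "RsaI": ("GTAC", 2),   # GT^AC
    "DpnII": ("GATC", 0),  # ^GATC
}


def terminal_fragment_length(amplicon: str, site: str, cut_offset: int) -> int:
    """Return the length in bases of the labeled 5' terminal fragment.

    Only the first site from the labeled end matters, because fragments
    downstream of it carry no fluorescent label and are not detected.
    """
    position = amplicon.upper().find(site.upper())
    if position == -1:
        # No recognition site: the uncut amplicon is detected whole.
        return len(amplicon)
    return position + cut_offset


def profile(amplicon: str) -> dict:
    """Terminal fragment length for each enzyme in ENZYMES."""
    return {
        name: terminal_fragment_length(amplicon, site, offset)
        for name, (site, offset) in ENZYMES.items()
    }


if __name__ == "__main__":
    toy_amplicon = "ATGCATGCCGGTTAGTACGATCAA"  # hypothetical 24-base sequence
    print(profile(toy_amplicon))
```

For the toy sequence, the labeled fragments are 8 bases (MspI), 16 bases (RsaI), and 18 bases (DpnII), which shows how one amplicon yields a different terminal fragment with each enzyme. Real profiles also reflect incomplete digestion and sizing error, which this sketch ignores.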
Products can then be compared by generating a graph of fragments separated by size, known as an electropherogram (Liu et al. 1997; Marsh 2005).

Figure 2. Illustration of the Terminal Restriction Fragment Length Polymorphism Process

[Figure panels: T-RFLP generates a community profile using length heterogeneity in the terminal fragment of the target gene. (1) DNA extraction from soil, yielding mixed community genomic DNA; (2) PCR amplification with fluorescent primer, yielding PCR amplicons; (3) restriction digestion, yielding restriction fragments; (4) electrophoresis and excitation of fluor, yielding fluorescent terminal fragments plotted as fluorescence against base pairs, from small to large TRFs.]

Genomic DNA is first extracted from the soil (Step 1). It is then subjected to PCR amplification (Step 2) with primers that are specific to the locus of interest. One primer is 5' fluorescently tagged so that it can later be detected through capillary electrophoresis. The PCR product must be purified to remove proteins and dNTPs. The amplicon is digested (Step 3) with a restriction enzyme chosen to offer the clearest picture of the genetic diversity present. The resulting fragments are separated through capillary electrophoresis (Step 4) and compared to other samples. (Image modified from Molecular Approaches to Soil, Rhizosphere and Plant Microorganism Analysis edited by Cooper and Rao, 2006).

Samples may then be analyzed using multivariate methods such as UPGMA cluster analysis, Q-mode factor analysis, multidimensional scaling, or principal components analysis (Quinn and Keough 2002). Another option for comparing profiles derived from T-RFLP is to compute the degree of similarity among pairs of samples. A similarity index between zero and one is calculated by determining the number of peaks that two samples have in common, with zero indicating that the samples share no peaks and one signifying all peaks are shared.
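A similarity index of this kind can be sketched as follows. The text above does not give a formula, so this example assumes the Sorensen-Dice coefficient (twice the number of shared peaks divided by the total number of peaks in both profiles), which is one common index with the stated range of zero to one. Peaks are assumed to be already binned to whole-base fragment sizes, and the example profiles are hypothetical.

```python
def similarity_index(peaks_a, peaks_b) -> float:
    """Sorensen-Dice similarity between two T-RFLP peak lists.

    peaks_a, peaks_b -- iterables of fragment sizes (in bases), assumed to be
    binned so that the same fragment has the same integer size in both.
    Returns 0.0 when no peaks are shared and 1.0 when all peaks are shared.
    """
    a, b = set(peaks_a), set(peaks_b)
    if not a and not b:
        raise ValueError("at least one profile must contain peaks")
    shared = len(a & b)  # peaks present in both profiles
    return 2 * shared / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical profiles from two soil samples.
    sample_1 = [72, 118, 150, 203]
    sample_2 = [72, 118, 160, 203, 310]
    print(round(similarity_index(sample_1, sample_2), 3))
```

The two hypothetical profiles share three peaks out of nine in total, giving an index of 6/9, or about 0.667. Working on presence or absence alone ignores peak height, which matches the description above but discards abundance information.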
Multidimensional Scaling of T-RFLP Profiles

Within any scientific study, choosing the statistical analysis that best determines similarities among the data is vital (Blackwood et al. 2003). A multivariate technique is required for T-RFLP analysis, as every peak that can be generated is yet another variable that must be accounted for. Although T-RFLP profiles have been analyzed many ways, multidimensional scaling (MDS) provides some of the most accurate and robust results possible (Rees et al. 2004). MDS generates distances among variables through the creation of a similarity index. These distances are then used to generate a plot of geometric distances (Norusis 2007). Each sample creates a point in a multidimensional space, arranged so that similar T-RFLP profiles are represented by points that are close together, while dissimilar profiles correspond to points that are farther apart.

Nonmetric MDS is used for T-RFLP analysis as the peaks being evaluated are qualitative, and distances need to be calculated by comparing profiles. The Euclidean distance model, which is the default mode of generating similarity indices, is an extension of the Pythagorean Theorem. It calculates the similarity between two samples when a metric distance is unknown by summing the squared differences between two profiles and then calculating the square root of that sum (Norusis 2007). Finally, weighted MDS assumes that samples differ from each other in nonlinear ways, which is necessary in T-RFLP analysis so as to account for all peaks that are shared among profiles and to place an emphasis on unique peaks by including a "weight space" in the Euclidean equation.

Factors that cannot be accounted for through MDS include the influences of time and soil heterogeneity. It is impossible for a statistical technique to determine which peaks are included or excluded at certain months of the year due to environmental factors.
Therefore, any samples that fall outside of the clustered area of soils from a specific habitat must be analyzed separately. Moreno et al. (2006) collected nine soil samples from four different habitats and found that through the use of non-metric MDS of microbial communities, each habitat generated clusters with few outliers. In the dry season of Florida (February), the samples all formed distinct clusters with no outliers. However, in the wet season (August), soils from each habitat grouped together, but their plotted location changed and several outliers were noted. This could have been due to seasonal shifts in bacterial strains or merely the amount of soil heterogeneity present. Nonetheless, by optimizing the options MDS has to offer, a multidimensional plot can be generated that shows the similarities or dissimilarities among all samples being compared.

The Utility of T-RFLP Analysis in the Comparison of Soils

Although T-RFLP analysis may link a victim or suspect to the scene of a crime, many obstacles must be overcome before such a technique is accepted by the scientific community. While previous studies (Osborn et al. 2000; Sakamoto et al. 2003) indicate that T-RFLP results can be highly reproducible, it is necessary to follow a strict protocol to ensure that as little variability is introduced as possible (Lueders and Friedrich 2003). It is also important to determine if samples from differing habitats are truly unique enough to form DNA 'fingerprints' that can safely link a soil sample back to its origin.

Time has been shown to impact microbial communities in soil (Lukow et al. 2000; Wolsing and Priemé 2004; Noll et al. 2005; Meyers and Foran 2008). Lukow et al. (2000) found that soils vary over space and time. Three sampling locations were separated by a distance of 2.5 to 6 meters, with 7 samples taken from each, collected in May, June, August, and September 1998. In the resulting analysis, 29 peaks were considered.
The overall T-RFLP pattern indicated that 7 terminal restriction fragments varied among plots while 19 differed over time, suggesting that time is more relevant to changes in the microbial community than space. Wolsing and Priemé (2004) collected 10 samples from 5 locations in March, July, and October 2002. Primers specific for two nitrite reductase gene fragments (nirK and nirS) were used. Similarities in T-RFLP patterns revealed a significant seasonal shift in the community structure of nirK-containing bacteria (p=0.0077) for each sampling location.

Heterogeneity within a habitat must also be accounted for when generating T-RFLP profiles (Heath and Saunders 2006). The authors performed T-RFLP analysis on soils collected at three habitats. Although local heterogeneity existed, samples from one habitat were still more alike (and contained the main terminal restriction fragments for each site needed to generate a genomic 'fingerprint') than soils from different habitats. However, if bacterial composition differs substantially within a habitat, soil heterogeneity could make it difficult to link soil samples back to a location and to generate a microbial community fingerprint.

Plant species and soil types may cause within or among habitat heterogeneity. In a habitat, different management regimes can influence which bacterial species reside there. Blackwood and Paul (2003) collected soil samples in September 1998 and August 1999 from four agricultural fields with different crops. Resulting profiles were separated into two distinct clusters that were found to be partially dependent on the crops present, suggesting that plant species existing in a region affect bacterial species residing in surrounding soil.

Soil pH has also been found to impact which bacterial strains exist in soil.
Fierer and Jackson (2006) collected 98 soils from across North and South America and, based on T-RFLP analyses, found that bacterial diversity was highest in soils with a neutral pH and lower in acidic soils, with samples from the Peruvian Amazon being the most acidic and least diverse. Similarly, Kennedy et al. (2004) concluded that the effect of pH on bacterial community structure is greater than that of plant species. In their study, seven perennial plant species typical of grasslands in Ireland were left unchanged or altered with lime, nitrogen, or lime plus nitrogen. T-RFLP profiles revealed that lime (P<0.001) and nitrogen (P<0.001), both of which changed the pH of the soil, altered soil bacterial structure while plant species had little effect. However, it is possible that the bacteria could have been thriving on nutrients present, as lime and nitrogen are both fertilizers.

Complications with T-RFLP Analysis

Certain procedures must be followed to achieve the most accurate T-RFLP results possible. A potential source of variation in T-RFLP results is the type of polymerase used for PCR. Osborn et al. (2000) isolated bacterial DNA from soil to determine the effects of different PCR reagents and parameters on the profiles produced. They noted that Taq polymerase from Qiagen resulted in more bands than the same samples amplified with AmpliTaq (Perkin-Elmer). During the restriction digest, it is also necessary to follow a strict protocol to ensure repeatable results. Additional peaks may be caused by incomplete digestion (Bruce 1997; Clement et al. 1998), resulting in an overestimation of the total amount of diversity in each community (Osborn et al. 2000). It is essential that the same PCR reagents, primers, and restriction enzymes be used throughout a research project to reduce potential variability.
Variation in T-RFLP profiles can be caused by differences in amplification specificity as well as a reduced ability to detect all fragments when a large amount of DNA is injected during capillary electrophoresis, causing smaller, potentially informative peaks to be overlooked (Irwin et al. 2003; Hartmann and Widmer 2008). Because of this, it is necessary to determine relative fluorescence unit (RFU) thresholds that can potentially eliminate uninformative peaks. Chao et al. (2007) recommended eliminating peaks that are <50 bases, as these fragments can result from primer dimer. Samples with peaks over 6000 RFU may need to be diluted with formamide and rerun to reduce the amount of DNA and obtain more precise results, whereas if the main peaks are less than 1000 RFU it may be necessary to reamplify the gene of interest, as there is not enough DNA present to differentiate between background noise and informative peaks.

T-RFLP Soil Analysis at the Forensic Biology Laboratory of MSU

Previous research performed at Michigan State University investigated the effectiveness of T-RFLP analysis for the typing of forensic soil samples. The 16S ribosomal RNA (rRNA) gene was targeted (Meyers and Foran 2008). Samples were collected monthly over a one-year period from five habitats in central Michigan, including an agricultural field, a marsh, a yard, a woodlot, and a sandy woodlot that was approximately 100 miles distant from the other habitats. Every third month, additional samples were collected 10 feet in each direction (N, S, E, and W). Total DNA was isolated and the 16S rRNA gene amplified, with the resulting amplicon being digested with MspI. T-RFLP profiles from the five habitats were compared and similarity indices calculated to determine habitat individuality, local heterogeneity, and temporal variation of the bacterial communities.
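The peak-screening rules of thumb cited earlier in this section (discarding fragments under 50 bases, and treating peaks above 6000 RFU or main peaks below 1000 RFU as grounds for a rerun) can be written down as a small triage function. The following is only a sketch under stated assumptions: the function and constant names are hypothetical, "main peaks" is interpreted as the single tallest retained peak, and the cutoffs are the literature values quoted above rather than a validated laboratory protocol.

```python
from typing import List, Tuple

# A peak is (fragment size in bases, peak height in RFU).
Peak = Tuple[float, float]

MIN_SIZE_BASES = 50   # smaller fragments may be primer dimer (Chao et al. 2007)
OVERLOAD_RFU = 6000   # taller peaks suggest diluting in formamide and rerunning
WEAK_RFU = 1000       # weaker main peaks suggest reamplifying the gene


def screen_profile(peaks: List[Peak]) -> Tuple[List[Peak], str]:
    """Drop short fragments, then suggest an action for the profile.

    Assumption: the "main peak" is taken to be the tallest retained peak.
    """
    kept = [(size, rfu) for size, rfu in peaks if size >= MIN_SIZE_BASES]
    if not kept:
        return kept, "reamplify"
    tallest = max(rfu for _, rfu in kept)
    if tallest > OVERLOAD_RFU:
        return kept, "dilute and rerun"
    if tallest < WEAK_RFU:
        return kept, "reamplify"
    return kept, "accept"


# Invented peak list for illustration only, not data from any study.
example = [(35.0, 4000.0), (120.5, 2500.0), (310.2, 900.0)]
kept_peaks, action = screen_profile(example)
print(kept_peaks, action)
```

In practice the thresholds would likely need tuning per instrument and injection time, so they are kept as named constants rather than hard-coded in the logic.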
In general, the five locations could be differentiated from each other, as samples from one habitat were more similar than samples originating from different habitats. The highest level of similarity within a habitat was found in the two locations that were manipulated by humans: the yard and the agricultural field. The low average similarity values among habitats did not appear to be caused by the type of soil, as the two sandy loam habitats (the yard and the woodlot) had an average similarity that was no higher than that of other sites. The habitats that were the most visually alike, the two woodlots, produced profiles that were as different as any of the others. However, distance may have directly affected the similarity of bacterial species among habitats, as the sandy woodlot, 100 miles away from the other tested sites, had the lowest similarity compared to the remaining locations. Similarity indices between consecutive months showed that microbial communities change over short periods of time, introducing a factor that must be accounted for if known soil samples are collected after a crime occurred. Autumn had the least amount of variability within each habitat, while samples from winter and summer months had a higher level of heterogeneity.

Although the results were informative enough to conclude that microbial communities vary temporally and among habitats, assaying the 16S rRNA gene, which is conserved in all bacterial species, resulted in community profiles that were so complex that they likely contained too much information to differentiate between two soils with similar bacterial species. Profiles generated were often repeatable, but sometimes contained so many different strains of bacteria that it may have been impossible for common strains to be amplified with equal intensity in each sampling reaction. This could have caused important bacterial strains to be overlooked.
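Similarity indices of the kind discussed above compare two profiles by the fragments they share. The text does not state which formula was used, so the sketch below assumes the Sørensen-Dice coefficient on presence/absence data; the function name and the example profiles are hypothetical and serve only to show the arithmetic.

```python
from typing import Set


def sorensen_similarity(profile_a: Set[int], profile_b: Set[int]) -> float:
    """Sorensen-Dice similarity between two presence/absence profiles.

    Each profile is a set of fragment sizes in bases, already binned to
    integers. Using this particular coefficient is an assumption.
    """
    total = len(profile_a) + len(profile_b)
    if total == 0:
        return 0.0  # neither profile has fragments, so nothing is shared
    shared = len(profile_a & profile_b)
    return 2.0 * shared / total


# Invented profiles (fragment sizes in bases), not data from the study.
month_1 = {72, 118, 203, 260, 341}
month_2 = {72, 118, 260, 402}
print(round(sorensen_similarity(month_1, month_2), 3))
```

A value of 1 means the two profiles contain exactly the same fragments and 0 means they share none, which is why a low index between consecutive months points to temporal change.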
In order to obtain a higher degree of accuracy, a more specific assay that targets a select set of bacteria known to vary based on plant species or soil pH might be informative, as far fewer potential bacterial species will be present in the resulting electropherogram. It is possible that similar soils will be differentiated if a more specific assay is employed.

Rhizobia in Microbial Community Analysis

In the current study, the rhizobial family of bacteria was targeted, as it is widespread and well-studied (Cooper et al. 1998; Perret and Broughton 1998). An estimated 67 species of rhizobia exist among 12 genera (Weir 2006). Most of these are located in the family Rhizobiaceae, but many species are found in the Rhizobium, Mesorhizobium, Ensifer, or Bradyrhizobium genera. Rhizobia have a symbiotic relationship with legumes, a family of flowering plants (McIntyre et al. 2007). Common legume products include alfalfa, clover, peas, beans, and peanuts; however, legumes can also be shrubs or trees such as the black locust or Kentucky coffee tree. In the beginning stages of legume growth, rhizobia invade the plant root and multiply, forming nodules on the surface of the root. The bacteria absorb air from the soil and convert, or fix, nitrogen (N2) into ammonia, which can be used by the plant to produce protein (Adjei et al. 2002).

The ability to differentiate among similar strains of rhizobia is well documented. Perret and Broughton (1998) utilized targeted PCR fingerprinting to analyze the nitrogenase reductase (nifH) and recombination (recA) genes. Selected regions of the genome known to be phenotypically important were amplified. RFLP analyses of the products were then performed on high resolution denaturing polyacrylamide gels. The authors were able to differentiate between the two closely related Rhizobium species NGR234 and R. fredii USDA257.
RecA was chosen as it is one of the most highly conserved bacterial genes, being essential for the repair and maintenance of DNA, but still contains hypervariable regions that allow for species identification. NifH also is conserved and variable among species but is located on a plasmid that aids in the nitrogen fixation process.

Another method for differentiating among rhizobial species involves species-specific probes designed through subtractive hybridization (Cooper et al. 1998; Wulf 2003). In this technique, shared DNAs are removed until only regions unique to the organism in question remain. Probes specific to the sequences are then designed, labeled with a fluorescent dye, and hybridized to genomic DNA from the bacteria. Only strains with the conserved sequence of interest hybridize with the probe. Cooper et al. (1998) designed probes that were unique for R. leguminosarum bv. trifolii and R. leguminosarum bv. phaseoli, making it possible to differentiate between these two closely related strains.

The Utility of Rhizobial T-RFLP Forensic Soil Analysis

The goal of the research presented here was to investigate the possibility of generating simplified, and therefore potentially more specific and reproducible, T-RFLP soil profiles. Rhizobia were chosen as they are highly conserved and yet many strains exist that may make it possible to produce unique genetic ‘fingerprints’. A particular species of rhizobia can nodulate a limited number of host species; therefore, strains correlate with the plants of a region, causing bacterial community structures that may be specific to certain habitats (Spaink et al. 1987). Soil samples were collected and DNA extracted as previously described by Meyers and Foran (2008). The rhizobia recA gene was targeted for amplification. Profiles were analyzed through MDS, with similar samples grouping closer together than dissimilar samples.
By reducing the number of bacterial families amplified, and by simplifying the statistical technique used to analyze T-RFLP profiles, the utility of rhizobial T-RFLP analysis for differentiating soils and habitats was examined with attention to habitat uniqueness, local heterogeneity, and temporal variation of microbial communities.

MATERIALS AND METHODS

Sample Collection

Soil samples were collected in 2004 and 2005 (Meyers and Foran 2008). Five habitats in central Michigan were examined: an agricultural field (A), a marsh (M), a yard (R), a woodlot (W), and a sandy woodlot (S). Soils were gathered at a habitat’s main site at the beginning of each month throughout the course of a year. In addition, every three months, samples were collected ten feet in all directions (north, south, east, and west) of the main collection site. The north site at the marsh was inaccessible. Samples were acquired 0 to 5 centimeters below surface level and stored at -20°C until DNA extraction could be performed. They were labeled with the month and year, the habitat, and the direction from the main site from which they were collected.

DNA Extractions

The DNA from each sample was extracted as described by Meyers and Foran (2008). An UltraClean™ Soil DNA Isolation Kit (MO BIO Laboratories, Inc.) or, depending on amplification success, a PowerSoil™ DNA Isolation Kit (MO BIO Laboratories, Inc.) was used, following the manufacturer’s instructions. The former kit required one gram of soil while the latter used 0.25 grams.

DNA Amplification of Rhizobial DNA from Soil

Initial PCR reactions included 1X Eppendorf® PCR MasterMix along with 2 µM of each primer. A single primer, serving as both the forward and reverse primer, was used to target a conserved domain in the rhizobium nif promoter (5’-AATTTTCAAGCGTCGTGCCA-3’) (Watson and Schofield 1985; Richardson et al. 1995). Two µL of template DNA was added in a final reaction volume of 20 µL. DNA organically extracted from a slant culture of R. trifolii served as a positive control, while no DNA was added to a negative control. A hot start polymerase was required in later reactions, leading to the use of 1 unit of HotMaster™ Taq DNA Polymerase (Eppendorf), 1X HotMaster™ Taq Buffer with 2.5 mM Mg2+ (specific salt not specified) (Eppendorf), and 0.2 mM of each dNTP (Promega).

Problems with amplification specificity led to the use of new primers and Taq polymerase. Reactions contained 1 unit of AmpliTaq Gold® DNA Polymerase (Applied Biosystems) along with 1X of the included GeneAmp® PCR Gold Buffer, 2.5 mM of MgCl2 solution, and 2 µL of 1 µg/µL BSA. Two µM of each primer was used in all reactions. Primers were specific for the nifH (nitrogenase reductase) and recA (recombination protein A) genes of rhizobia (Perret and Broughton 1998):

nifH forward primer: 5’-CGTTTTACGGCAAGGGCGGTATCGGCA-3’
nifH reverse primer: 5’-[6FAM]TCCTCCAGCTCCTCCATGGTGATCGG-3’
recA forward primer: 5’-CATGCRCTGGATCCGGTCTATGC-3’
recA reverse primer: 5’-[6FAM]CTTGTTCTTGTCGACCTTGACGCG-3’

Amplicon length was approximately 781 base pairs for the nifH gene and 462 base pairs for the recA gene. Amplification was successful on samples tested with primers specific for the recA gene, while products were only visualized on an intermittent basis with primers specific for the nifH gene; therefore, analysis of the nifH gene was discontinued.

Preliminary amplification parameters for the recA gene consisted of an activation of the enzyme for 10 minutes at 94°C, followed by 5 cycles of 30 seconds denaturation at 94°C, primer annealing at 55°C for 30 seconds, and a 45 second extension at 72°C (Perret and Broughton 1998). DNA was then subjected to 30 more cycles of a 30 second denaturation at 94°C followed by a 30 second annealing time at 62°C, and a 45 second extension step at 72°C with a final extension of 5 minutes at 72°C.
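The preliminary recA cycling conditions above can be captured as structured data, which makes the cycle count and total hold time easy to check. This is a minimal sketch: the `Step` and `Stage` classes and the `total_seconds` helper are hypothetical names, and the total ignores ramp times between temperatures, which the text does not report.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: List[Step]
    cycles: int = 1


def total_seconds(program: List[Stage]) -> int:
    """Sum of programmed hold times; ramping between steps is not included."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


# Preliminary recA program as described in the text above.
PRELIMINARY_RECA = [
    Stage([Step("activation", 94, 600)]),
    Stage(
        [Step("denature", 94, 30), Step("anneal", 55, 30), Step("extend", 72, 45)],
        cycles=5,
    ),
    Stage(
        [Step("denature", 94, 30), Step("anneal", 62, 30), Step("extend", 72, 45)],
        cycles=30,
    ),
    Stage([Step("final extension", 72, 300)]),
]

# Count only the stages that contain more than one step (the cycled stages).
amplification_cycles = sum(s.cycles for s in PRELIMINARY_RECA if len(s.steps) > 1)
print(amplification_cycles, total_seconds(PRELIMINARY_RECA) / 60)
```

Holding the program as data rather than prose makes later adjustments, such as changing an annealing temperature or a cycle count, a one-line edit that can be diffed against the original.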
The annealing temperature was lowered to 55°C for all cycles and the number of cycles reduced to 32 in subsequent PCRs to ensure the amplification of each sample. Amplicons were visualized by electrophoresing 2 µL of the PCR product on a 2% agarose gel along with 2 µL of a 100 base pair DNA ladder (New England Biolabs). The gel was stained with ethidium bromide, and DNA yields were estimated by comparing the amplicon to the 400 bp size marker, which contained approximately 38 ng of DNA.

Restriction Digestion of Amplified RecA Gene

The remaining 18 µL of PCR product was purified using a Montage® PCR Centrifugal Filter Device (Millipore) with one rinse of 400 µL of TE (10 mM Tris, 1 mM EDTA) and centrifugation for 15 minutes at 1000 x g. The retentate was brought up to 18 µL. Restriction digests were performed with three enzymes in separate reactions (RsaI, MspI, DpnII; New England Biolabs). Restriction digestions consisted of 1 unit of enzyme, 1X of the corresponding buffer (NEBuffer 1 for RsaI, NEBuffer 2 for MspI, NEBuffer DpnII for DpnII), and approximately 150 ng of purified PCR product in a total volume of 10 µL. Digestions were incubated at 37°C overnight and terminated by heating at 75°C for 20 minutes. Digested DNAs were purified using a Montage® PCR Centrifugal Filter Device with two rinses of 300 µL TE, centrifuged for 15 minutes at 1000 x g. The final volume was returned to 10 µL.

Capillary Electrophoresis of Restriction Digests

Capillary electrophoresis was executed using an ABI Prism® 310 Genetic Analyzer. Three µL of purified restriction digest, 21.5 µL of formamide, and 0.5 µL of GeneScan™ 500 LIZ™ Size Standard (Applied Biosystems) were heat denatured in a 0.5 mL microcentrifuge tube (95°C for 3 minutes), then chilled on ice. T-RFLP profiles were initially generated using ABI 310 Genetic Analyzer Data Collection Software version 3.0.0, GS STR POP4 (1 mL) G5.md5 module with 60 second injection, 15 kV run voltage, and 35 minute run time.
This was modified to include a 10 second injection time, which was used for all samples.

Analysis of T-RFLP Profiles

T-RFLP profiles were generated through the use of ABI GeneMapper® ID, version 3.1 software. Terminal restriction fragments from 40 to 450 bases were included. Profiles were aligned using T-Align (University College Dublin, http://inismor.ucd.ie/~talign/index.html) with a confidence interval of 0.5 bases and exported as a binary file. Resulting data were imported into Microsoft Office Excel 2007 for further analyses.

To generate genetic ‘fingerprints’ for each habitat, terminal restriction fragments from every profile (main site as well as those collected ten feet in each direction) were compared. Peaks found at least 75% of the time in at least one habitat were labeled as indicative peaks. Strongly indicative peaks (indicative peaks found less than 25% of the time in all other habitats) were also determined.

Statistical analyses were completed using SPSS® version 15.0 (SPSS Inc.). Nonmetric multidimensional scaling was chosen to examine site heterogeneity, the effect of time on bacterial samples collected from soils, and the uniqueness of habitats. Weighted Euclidean measurements were used to calculate similarity indices for each of the samples, which were then plotted geometrically in two-dimensional space. For each set of samples being compared, SPSS® produced a matrix of the Euclidean distances determined among each of the samples and a plot of the generated distances.

To test the utility of multidimensional scaling analysis for forensic soil samples, habitats were also compared in pairs. Thirty plots were generated for each enzyme, with a different directional sample added to each to represent an unknown sample.
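The pairwise distance matrix that feeds the scaling step can be illustrated on binary presence/absence profiles. The sketch below is an assumption-laden simplification: it computes plain (unweighted) Euclidean distances in pure Python, it does not reproduce the weighted measure or the SPSS procedure, and the sample labels and peak vectors are invented.

```python
import math
from typing import Dict, List


def euclidean(a: List[int], b: List[int]) -> float:
    """Unweighted Euclidean distance between two aligned binary profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must be aligned to the same peak positions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(profiles: Dict[str, List[int]]) -> Dict[str, Dict[str, float]]:
    """All pairwise distances, keyed by sample label."""
    names = list(profiles)
    return {
        row: {col: euclidean(profiles[row], profiles[col]) for col in names}
        for row in names
    }


# Hypothetical aligned profiles: 1 = peak present, 0 = peak absent.
profiles = {
    "X1": [1, 0, 1, 1],
    "X2": [1, 0, 1, 0],
    "X3": [0, 1, 0, 0],
}
matrix = distance_matrix(profiles)
for name, row in matrix.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

For binary data the squared distance is simply the number of peak positions at which two samples disagree, so samples sharing most peaks end up close together once the matrix is scaled into two dimensions.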
All “unknowns” were then rated as to whether they plotted correctly (within 0.5 units of another sample from that habitat), incorrectly (within 0.5 units of a sample from a different habitat), or inconclusively (located approximately equidistant between the two habitats).

Comparison of Replicates

Triplicate PCR reactions were analyzed for both March and September samples from all main collection sites and each enzyme. Technical replicates were examined by performing five injections of a single restriction digest. Data generated were used to create a similarity index through the Ribosomal Database Project II (Center of Microbial Ecology, Michigan State University, http://rdp8.cme.msu.edu/cgis/trflp.cgi?su=SSU).

RESULTS

Amplification of the RecA Gene

RecA amplification was successful for all but one (904 WE) of the 136 soil samples. A 460 bp amplicon indicated a positive result while no DNA band signified a negative result. DNAs extracted using the PowerSoil DNA Kit were utilized first, followed by those extracted with the UltraClean DNA Kit, since a higher rate of amplification efficiency was noted in previous studies with the former (Meyers and Foran 2008). The DNA extracted from the main marsh site in July was not analyzed as it had been exhausted in previous studies.

T-RFLP Results

All amplified DNAs yielded T-RFLP profiles. Although DNA quantities were estimated through comparison with a DNA size standard, total fluorescence varied. T-RFLP profiles that had a total number of fragments between 20 and 80 allowed for the most samples from a habitat to be grouped together in multidimensional scaling (detailed below). An exemplary T-RFLP profile can be seen in Figure 3. Every sample that amplified was used in statistical analysis, regardless of peak number.

Figure 3. Typical T-RFLP Profile
An exemplary T-RFLP profile. Fragment size in number of bases is on the x axis and relative fluorescence units are on the y axis.
Only fragments that were between 40 and 450 bases were included in the analysis (shown in gray).

Reproducibility of T-RFLP Profiles

Similarity indices were calculated for each of ten triplicate PCR reactions. The lowest observed similarity was 0.300 (Appendix), which was noted between two RsaI 904 SM replicates. The highest similarity was 1.00, which was observed between 904 WM replicates that were digested with DpnII. Values were averaged among each of the three replicates (Table 1). Average similarity values for replicates digested with RsaI were often much lower than those digested with DpnII and MspI: six of the ten RsaI averages fell below the lowest average for any DpnII or MspI digested sample, while the latter two enzymes had similar values.

Technical reproducibility was examined by injecting two digested amplicons five separate times (Table 2). Similarity indices ranged from 0.917 to 1.00 with an average of 0.977 for DpnII 605 WM and from 0.954 to 1.00 with an average of 0.978 for MspI 1204 SW. MspI 1204 SW likely had more DNA, as there was a larger number of peaks in each replicate than in the DpnII 605 WM replicates.

Table 1. Averaged Similarity Indices for T-RFLP Replicates
Triplicate PCR reactions were performed for main site samples in March and September for all habitats, and similarity indices calculated; the three similarity values were then averaged. In addition, overall averages were calculated for each sample for all restriction enzymes. There were six RsaI digested samples with similarity values lower than any DpnII or MspI digested samples (marked with an asterisk).

Sample    DpnII    MspI     RsaI      Average
305 AM    0.741    0.847    0.747     0.778
305 MM    0.892    0.698    0.866     0.819
305 RM    0.718    0.865    0.651*    0.745
305 SM    0.895    0.930    0.649*    0.825
305 WM    0.860    0.881    0.817     0.853
904 AM    0.710    0.872    0.751     0.778
904 MM    0.816    0.793    0.687*    0.765
904 RM    0.812    0.804    0.635*    0.750
904 SM    0.956    0.711    0.508*    0.725
904 WM    0.786    0.894    0.643*    0.774

Table 2. Similarity Indices for Technical Replicates
Two digested samples were injected five times. Replicates were compared to determine similarity values.

Sample          Number of Peaks (five injections)
DpnII 605 WM    37    37    32    33    31
MspI 1204 SW    52    56    53    51    52

Multidimensional Scaling as a Measure of Temporal Variability

In the multivariate analyses of temporal variability, DpnII digested samples overall allowed most habitats to be separated from each other, while profiles generated from RsaI or MspI digestions did not show as much distinct clustering (Figures 4, 5, 6). Two habitats (the woodlot and sandy woodlot) were distinctly separated from all others when DpnII digested samples collected throughout the course of a year were compared, whereas MspI digested samples could only be differentiated along one dimension, and RsaI digested samples from three habitats (the sandy woodlot, woodlot, and yard) were located in distinct regions but relatively near samples from other habitats. The agricultural field, marsh, and yard samples were largely indistinguishable with all enzymes.

DpnII digested sandy woodlot samples were located between -1 and -2.5 for dimension 1, and -1 and 1 for dimension 2, with the exception of SM1004, which was located roughly at 0.25 (x axis), 1.5 (y axis) (Figure 4). The only sample from another habitat that was also included in this region was RM1204. The remaining yard samples intersected largely with agricultural field and marsh samples, which were all located between 2 and -2 for dimension 1 and -1 and 2 for dimension 2. There was no overlap with samples from the woodlot, which were located between -1 and 1 for dimension 1 and -1 and -2 for dimension 2.

Figure 4. Multidimensional Scaling Results for Main Collection Sites using DpnII
[Scatter plot; axes: Dimension 1 and Dimension 2]
A plot produced from similarities among main collection sites by samples digested with DpnII. The sandy woodlot (SM; in green) and woodlot (WM; in blue) habitats were distinguishable from other locations while the agricultural field (AM), marsh (MM), and yard (RM) overlapped substantially. Two outliers, RM1204 and SM1004 (circled), did not group with other samples originating from the same habitat. This image is presented in color.

Amplicons digested with MspI were distinguishable along dimension 2, but dimension 1 estimates did not provide distinct coordinates for any habitat (Figure 5). Once again, the sandy woodlot plotted the furthest away from the other habitats, located between -2 and 1.5 on dimension 1 and 1 and 2 on dimension 2. SM1004 was an outlier, found roughly at -0.5, 0. AM205 and AM505 overlapped with the sandy woodlot samples. Remaining agricultural field samples were located mostly between -0.5 and 0.25 (dimension 2), though AM605 was located with marsh samples, which were located between -0.75 and 0.75 (dimension 2). Although all samples from the marsh were located in this geometric region, there was a high degree of overlap with field and yard samples, as the yard samples were located between -0.75 and 0 (dimension 2). This included all samples but RM505 (located with the woodlot samples). Woodlot samples were not as distinguishable with MspI as they were with DpnII, but still were situated in a separate area along dimension 2 (-0.75 to -2).

Figure 5. Multidimensional Scaling Results for Main Collection Sites using MspI
[Scatter plot; axes: Dimension 1 and Dimension 2]
A plot produced from similarities among main collection sites by samples digested with MspI. Habitats were mostly distinguishable along dimension 2, but were indistinguishable along dimension 1. The sandy woodlot (SM; green) and woodlot (WM; blue) habitats displayed the highest amount of separation from the others.
Five outliers (SM1004, AM205, AM505, AM605, and RM505; circled) did not group with other samples originating from the same habitat. This image is presented in color.

Amplicons digested with RsaI often could not be differentiated (Figure 6). Sandy woodlot and woodlot samples generally grouped with other samples that originated from the same habitat. The sandy woodlot was positioned between 0 and -2 for dimension 1 and 0.25 and 2 for dimension 2, although SM1004 and SM305 were not situated in this region. AM205, AM505, and WM1004 were at the same location as the sandy woodlot. Remaining woodlot samples were located mostly between 0.75 and 2 for dimension 1 and 0 and 1 for dimension 2. However, there was overlap noted with MM1004, MM105, AM605, and SM305. The rest of the field samples frequently plotted between 1 and -1.5 for dimension 1 and 0 and -1 for dimension 2, while most of the marsh samples were between 0 and 1.5 for dimension 1 and -1.5 and 1.5 for dimension 2. MM305 and MM505 were the only samples outside of this area; instead, they intersected with the field and woodlot. Finally, all yard samples were located between 0 and -1.5 for dimension 1 and -2 and 0 for dimension 2; however, some samples from other habitats were similarly situated. RsaI allowed for the yard samples to be separated from all other habitats, which was not seen with MspI or DpnII digested samples.

Figure 6. Multidimensional Scaling Results for Main Collection Sites using RsaI
[Scatter plot; axes: Dimension 1 and Dimension 2]
A plot produced from similarities among main collection sites by samples digested with RsaI. A large intersection was noted with amplicons digested with this enzyme. The sandy woodlot (SM; green) once again displayed the highest amount of separation from the other locations, although yard samples (RM; purple) were also located within one cluster with overlap from samples originating from other habitats. Woodlot (WM; blue) samples also grouped together.
Ten outliers (AM205, AM505, AM605, MM105, MM305, MM505, MM1004, SM305, SM1004, WM1004; circled) did not group with other samples originating from the same habitat. This image is presented in color.

Multidimensional Scaling as a Measure of Habitat Heterogeneity

Samples were collected every three months ten feet in each direction of the main site to examine within-habitat heterogeneity. Each month was plotted separately to avoid temporal effects and to determine variability in bacterial species present, which was indicated by how many samples from each habitat grouped together. In addition, including all months in one plot produced a virtually uninterpretable plot due to the large number of variables (samples) present.

The woodlot and sandy woodlot could be distinguished from other habitats for most months and enzymes (ten out of twelve plots) (Figures 7, 8, 9; Appendix). However, plots produced when comparing certain months showed a greater difference among these habitats. Amplicons digested with DpnII resulted in plots with the highest amount of differentiation among habitats in June (Figure 7), followed by December, then March and September (Appendix). In June, all sandy woodlot samples were between -1 and -2 (dimension 1) and -0.5 and 1.5 (dimension 2) with no overlap from other habitats. All woodlot samples were located between 1 and -1 (dimension 1) and -0.5 and -2 (dimension 2), with RW605 and RS605 also positioned in this region. Field, marsh, and yard samples were indistinguishable.

Figure 7. Multidimensional Scaling Results for Directional Samples using DpnII, in June
[Scatter plot; axes: Dimension 1 and Dimension 2]
A plot depicting local heterogeneity within habitats (in June 2005) by samples digested with DpnII. The sandy woodlot (S; green) and woodlot (W; blue) were the most distinguishable from the other habitats. Two outliers (RW605 and RS605; circled) did not group with other samples originating from the same habitat.
This image is presented in color.

December sandy woodlot samples were also located in one cluster, with the exception of SN1204, while woodlot samples grouped together except for WW1204 (Appendix). Once again, field, marsh, and yard samples could not be differentiated. In March, four of the five sandy woodlot samples clustered together (SM305 was the exception) and three of the five woodlot samples were located roughly in the same plot area (WN305 and WE305 were not). Yard samples were in a distinct cluster, along with WE305, AW305, and SM305. Field and marsh samples were indistinguishable. Finally, in September, three of the sandy woodlot samples, two of the woodlot samples, three of the yard samples, and three of the field samples were located in separate groupings. The rest, including all marsh samples, were positioned around location (0, 0).

Results for amplicons digested with MspI also varied based on month of collection. Samples collected in December could frequently be separated into the habitats they originated from (Figure 8). All sandy woodlot samples were tightly clustered together between -1 and -2 (dimension 1) and -1 and 1 (dimension 2). The only sample from another habitat similarly situated was RW1204. Woodlot samples were located in the same region along dimension 2 (-0.5 to -2), but were spread from 1 to -0.5 for dimension 1. RS1204 was interspersed with woodlot samples. All marsh samples were located from 1 to -1 (dimension 1) and 1 to 2 (dimension 2). All field samples were tightly clustered together, between 1 and 1.5 (dimension 1) and 0.5 and -0.5 (dimension 2), with the exception of AW1204. Yard samples were interspersed amongst other habitats.

Figure 8. Multidimensional Scaling Results for Directional Samples using MspI, in December
[Scatter plot; axes: Dimension 1 and Dimension 2]
A plot depicting local heterogeneity within habitats (in December 2004) by samples digested with MspI.
All habitats were within distinct plotted regions except for the yard (R) samples, which were interspersed amongst other habitats. One outlier (AW1204; circled) did not group with other samples originating from the same habitat. This image is presented in color.

The March samples digested with MspI did not form as many distinct clusters as those collected in December (Appendix). Although all sandy woodlot samples were located together, a large amount of overlap occurred with other habitats. The woodlot samples plotted with samples from the yard, field, and marsh. Three of the four marsh samples grouped together, but field and yard samples scattered amongst other habitats. Field samples collected in June produced the most distinct cluster. Only AM605 was located near samples from any other habitat. Sandy woodlot samples were all in one region, although near samples from many other habitats. Three of the five woodlot samples separated from other habitats, with WE605 and WS605 located near yard and sandy woodlot samples. Marsh and yard samples intermingled. Finally, MspI digested samples from September had the smallest amount of differentiation among habitats. Only field and woodlot samples formed distinct clusters, with MM904 in the field region and RW904 and RS904 in the woodlot region. Three sandy woodlot samples clustered together, while SS904 and SE904 were located near samples from other habitats. Once again, the yard and marsh could not be differentiated from samples originating from other habitats.

RsaI digested amplicons collected in June and December also often were grouped corresponding to habitat (Figure 9; Appendix). In June, all samples from the sandy woodlot and woodlot clustered together with no overlap. Sandy woodlot samples were located between -1 and -2 (dimension 1) and 0 and -1 (dimension 2), while woodlot samples were between 0.5 and 2 (dimensions 1 and 2). However, samples from all other habitats were indistinguishable.

Figure 9. Multidimensional Scaling Results for Directional Samples using RsaI, in June
[Scatter plot; axes: Dimension 1 and Dimension 2]
A plot depicting local heterogeneity within habitats (in June 2005) by samples digested with RsaI. The sandy woodlot (S; green) and woodlot (W; blue) samples clustered together, while the field (A), marsh (M), and yard (R) samples were indistinguishable. This image is presented in color.

Amplicons digested with RsaI that were collected in December resulted in the yard being the only habitat that formed a distinct cluster that did not overlap with other habitats, although field samples were located nearby (Appendix). Remaining field samples were interspersed with samples from many other habitats. Three of the four marsh samples clustered together; however, MW1204 was in the same location as the field samples. Clusters were also formed for the sandy woodlot and woodlot habitats, with SW1204 and WN1204 located outside of their respective regions, and field samples being positioned in the same location as woodlot samples. March samples digested with RsaI resulted in overlap among all habitats. Samples collected in September formed clusters for the field and woodlot, while marsh, sandy woodlot, and yard samples could not be differentiated.

Peak Number as a Source of Outliers

Soils collected ten feet in each direction of the main sampling site included a much higher percentage of samples with very few or very many peaks than those collected at main sites, possibly contributing to the frequent inability to differentiate among habitats when all five were compared using MDS. Of the 80 samples that contained a total peak number below 20 or above 80, 14 did not group with remaining samples from the same habitat and 37 more were located in habitats that were “indistinguishable” from each other.
Conversely, when 80 representative samples that had a total peak number between 20 and 80 were compared, only 28 could not be grouped with samples from their respective habitats.

The Utility of Multidimensional Scaling as Determined by Addition of “Unknown” Samples

When two habitats were analyzed concurrently, all samples grouped into their respective habitats. These plots were then used for subsequent analyses examining whether an “unknown” sample grouped with other samples from the same habitat. The only exception was the agricultural field, which often overlapped with other habitats and therefore could not be used to classify “unknown” samples. Figure 10 shows an unusual plot in which the agricultural field and woodlot samples clustered apart from each other. The added agricultural field sample (AS605) plotted with those originating from the same habitat.

Out of plots generated for DpnII digestions on samples from all habitats, the “unknown” grouped with other samples from the same habitat 70% of the time (Figure 11), while 17% of “unknowns” were grouped incorrectly and 13% were inconclusive. MspI digestions were plotted correctly 50% and incorrectly 23% of the time, while 27% were inconclusive (Figure 12). Finally, multidimensional scaling of RsaI digestions grouped the “unknown” with samples from the same habitat with 40% accuracy, while 13% were grouped incorrectly and 47% remained inconclusive (Figure 13).

Figure 10. Multidimensional Scaling Results for the Agricultural Field and Woodlot, with Added Unknown

[MDS plot: Dimension 1 versus Dimension 2, each spanning -2 to 2; individual sample labels are not recoverable]

A plot comparing two habitats: the agricultural field (AM) and the woodlot (WM). The “unknown” sample (AS605; circled) grouped with other samples from the field.

Figure 11.
Pie Chart Depicting Groupings of Unknowns by Samples Digested with DpnII

Percentages of “unknowns” digested with DpnII that grouped correctly (70%), incorrectly (17%), or inconclusively (13%) with other samples originating from the same habitat. Habitats that did not form distinct groupings were not included in this analysis. This image is presented in color.

Figure 12. Pie Chart Depicting Groupings of Unknowns by Samples Digested with MspI

Percentages of “unknowns” digested with MspI that grouped correctly (50%), incorrectly (23%), or inconclusively (27%) with other samples originating from the same habitat. Habitats that did not form distinct groupings were not included in this analysis. This image is presented in color.

Figure 13. Pie Chart Depicting Groupings of Unknowns by Samples Digested with RsaI

Percentages of “unknowns” digested with RsaI that grouped correctly (40%), incorrectly (13%), or inconclusively (47%) with other samples originating from the same habitat. Habitats that did not form distinct groupings were not included in this analysis. This image is presented in color.

Common Peaks for the Production of Genetic ‘Fingerprints’

Peaks that were commonly found in habitats as well as peaks that were unique to specific habitats created a genetic ‘fingerprint’ for each habitat. Indicative peaks (those noted at least 75% of the time) included 27 peaks from DpnII digestions, 37 peaks from MspI digested samples, and 36 peaks from RsaI digestions. The woodlot contained the highest number of indicative peaks for all enzymes (Table 3). Indicative peaks found less than 25% of the time in all other habitats were considered strongly indicative, and are noted in bold print in Table 3. The sandy woodlot had three strongly indicative peaks when samples were digested with DpnII, two with MspI digestion, and one with RsaI digestion.
The woodlot samples digested with DpnII had four strongly indicative peaks, with five for MspI digestions and five for samples digested with RsaI. Across all three enzymes, the agricultural field had one strongly indicative peak and the yard had two, while the marsh did not have any. Two strongly indicative peaks (originating from RsaI digested woodlot samples), 301.31 and 336.64, were completely unique: neither was detected in any other habitat. These results support those found with MDS, in that the woodlot had the highest number of strongly indicative peaks and also commonly separated from the remaining habitats.

Table 3. Genetic ‘Fingerprints’ for each Habitat and Enzyme

Frequencies of each peak were calculated for every habitat and enzyme. Peaks were considered indicative if they were present with at least 75% frequency, and were included in the genetic ‘fingerprint’ for that habitat. Peaks found at least 75% of the time for one habitat and less than or equal to 25% of the time for all other habitats were considered strongly indicative and are presented in bold print. Peaks 301.31 and 336.64 (originating from RsaI digested woodlot samples) were completely unique, with neither detected in any other habitat.
Habitat and Enzyme     Indicative Peaks
Field, DpnII           49.94, 150.13, 178.29, 203.92
Marsh, DpnII           49.94, 178.29
Yard, DpnII            49.94, 84.25, 96.59, 104.39, 149, 150.13, 162.89, 178.29, 200.03
Sandy Woodlot, DpnII   130.78, 143.38, 178.29, 194.23, 235.53, 262.84
Woodlot, DpnII         40.97, 41.97, 60.35, 62.02, 64.91, 86.95, 93.99, 150.13, 162.89, 164.04, 165.72, 178.29, 179.42, 230.24, 327.34
Field, MspI            61.63, 115.8, 117.06, 120.46, 129.23, 136.17, 197.16, 211.77, 252.95
Marsh, MspI            66.76, 117.06, 120.46, 240.6
Yard, MspI             104.5, 115.8, 117.06, 129.23, 132.53, 142.28, 193.42, 320.67
Sandy Woodlot, MspI    66.76, 80.89, 115.8, 129.23, 145.44, 148.95, 197.16, 209.53, 210.84, 234.55, 283.34, 343.27
Woodlot, MspI          53.52, 66.76, 69.76, 72.85, 91.94, 100.7, 115.8, 119.56, 120.46, 133.36, 138.81, 144.14, 160.83, 163, 193.42, 201.72, 212.6, 252.95
Field, RsaI            112.86, 137.49, 139.02, 158.76, 240.61
Marsh, RsaI            69.62, 118.56, 137.49, 222.8, 240.61
Yard, RsaI             93.78, 107.88, 137.49, 139.02, 140.17, 185.27, 350.16
Sandy Woodlot, RsaI    52.23, 69.62, 83.99, 137.49, 208.49, 294.47
Woodlot, RsaI          63.67, 64.58, 69.62, 72.8, 73.86, 79.93, 83.99, 91.77, 100.89, 102.91, 112.86, 137.49, 139.02, 144.79, 158.76, 169.73, 199.29, 206.12, 214.69, 222.8, 240.61, 241.71, 297.84, 302.31, 323.77, 336.64, 348.99

DISCUSSION

Meyers and Foran (2008) used the bacterial 16S rRNA gene, a conserved sequence containing highly variable regions that make species differentiation possible, for T-RFLP analysis. Soils collected over time from the same habitat were, on average, more similar than those collected from separate habitats. However, because the profiles captured all bacterial strains present in the soil, it was difficult to distinguish between background noise and important peaks that permit habitat differentiation.
The current study was designed to reduce the number of bacterial species amplified and generate T-RFLP profiles that target a single family of bacteria while taking into consideration the effects of time and habitat heterogeneity. A genetic ‘fingerprint’ can then potentially be established that may link soil collected from a suspect or victim to the area from which it originated.

T-RFLP analyses have been widely used in microbial ecology research, but have yet to meet the Daubert standards needed for use in a courtroom setting. T-RFLP profiles are generally reproducible, as seen with the similarity indices generated in this research. However, a similarity value of one is uncommon, suggesting that results from T-RFLP analysis are not yet fully reproducible and that strict protocols need to be followed to eliminate potential variation (Osborn et al. 2000; Meyers and Foran 2008). There have also been numerous peer-reviewed publications involving T-RFLP analysis for differentiating soils, another requirement of Daubert. However, there is no known error rate, as only one study (Blackwood et al. 2003) attempted to count the number of times samples grouped differently during cluster analysis. Finally, as habitats are not yet fully distinguishable through comparing T-RFLP profiles, it has been suggested that much more work needs to be done before this technique is accepted by the courts (Heath and Saunders 2006). After the T-RFLP process has been optimized to give the greatest possible precision in resulting profiles, the validity and reliability of this analysis should be tested through blind proficiency tests before it is used in a forensic examination.

All but one (904WE) of the 136 soils amplified using the recA primers specific for rhizobia in the current study. Both PowerSoil and UltraClean extracted DNAs were utilized for this sample, and the cycle number was increased from 32 to 40, with no results.
In contrast, 16S rRNA amplification of 904WE was successful in the earlier study. As 16S rRNA is a conserved gene in all prokaryotes, it is likely that, although some species of bacteria were present, there were no rhizobia. However, all remaining samples produced amplifiable DNA, suggesting that rhizobia existed in them, and that unique strains could be used to differentiate among habitats.

Variation in T-RFLP profiles resulting from the comparison of multiple PCRs can sometimes be attributed to the method itself (Osborn et al. 2000). Differences in the number and size of peaks may be caused by variability in the amounts of DNA being injected. Also, peak heights, which ideally represent how prevalent each species of bacteria is, can vary among profiles. Biases can exist if preferential annealing of a primer to certain templates occurs. Therefore, a measure of peak heights was not utilized in the current study to avoid inaccurate representations of species commonality, despite its usage by many researchers to delete peaks below a specific threshold value.

Reproducibility of T-RFLP profiles was assayed by comparing triplicate PCRs performed on soils collected from each habitat’s main site in both March and September, for every enzyme. The lowest similarity value, 0.300, was observed between two RsaI 904 SM replicates (Appendix), while the highest, 0.985, resulted from DpnII digested 305 MM replicates. In general, RsaI replicates had the lowest reproducibility (Table 1). On the other hand, samples digested with DpnII and MspI displayed higher similarity indices, with average values for DpnII ranging from 0.710 to 0.956 and for MspI from 0.698 to 0.866.

Osborn et al. (2000) obtained similar results when they collected soils five meters apart within one habitat. The authors examined variability during capillary electrophoresis, multiple DNA isolations of the same soil, different amplifications from a DNA extraction, and several digestions of a PCR product.
Variation from capillary electrophoresis was low and mostly resulted from the presence/absence of small peaks, which were likely background noise, as they were not repeatable. Replicate digests of a PCR product or DNA isolations showed no more variation than that found during capillary electrophoresis. The authors found that the greatest variability occurred between replicate PCRs, although specific data were not included. Different starting amounts of DNA in each PCR reaction or the preferential binding of primers to specific templates could have caused some variability. It is likely that a similar phenomenon also occurred in the current study, which would explain why technical replicates (discussed below) comparing multiple injections had less variability than PCR replicates.

Dunbar et al. (2000) used threshold values to delete background noise when comparing four soil communities representing two environments between trees and two located in the tree rooting system. The authors concluded that, at a threshold of 25 RFU, 24 of 193 peaks could not be reproduced. When the threshold was increased to 100 RFU, 13 of the 24 were not generated, resulting in increased reproducibility when profiles were compared. In contrast, during the current study, initial similarity indices between soils from the same habitat did not include an RFU threshold and similarity values ranged from 0.300 to 0.970 (data not shown). When a preliminary threshold of 40 RFU was employed, lower similarity values resulted. A similar phenomenon was seen by Blackwood et al. (2003), who, when comparing T-RFLP profiles from two soils, noted that the number of errors (the number of times samples from each soil community were grouped in different ways during cluster analysis) increased when deleting peaks below 200 RFU, indicating that informative peaks were being deleted.
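
The way an RFU threshold can lower similarity between replicates can be illustrated with a minimal sketch. It assumes each profile is a list of (fragment size in bp, peak height in RFU) pairs and uses the Sørensen-Dice index on fragment sizes rounded to the nearest base pair as a stand-in similarity measure; the similarity formula actually used in this study is not restated here, and the two replicate profiles and the function names are hypothetical.

```python
def apply_threshold(peaks, min_rfu):
    """Return the set of fragment sizes (rounded to the nearest bp)
    whose peak height is at least min_rfu."""
    return {round(size) for size, height in peaks if height >= min_rfu}


def dice_similarity(a, b):
    """Sorensen-Dice index for two sets of fragment sizes (1.0 = identical).
    Assumption: used here only as an illustrative similarity index."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical replicate profiles: (fragment size in bp, peak height in RFU).
replicate_1 = [(49.9, 310), (150.1, 95), (178.3, 520), (203.9, 35)]
replicate_2 = [(50.1, 280), (150.2, 38), (178.2, 480), (203.8, 60)]

for min_rfu in (0, 40, 100):
    similarity = dice_similarity(apply_threshold(replicate_1, min_rfu),
                                 apply_threshold(replicate_2, min_rfu))
    print(f"threshold {min_rfu:>3} RFU: similarity {similarity:.3f}")
```

In this made-up case the 40 RFU threshold removes a different low peak from each replicate, so peaks that were shared without a threshold are lost from one profile only and the similarity falls.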
The use of a threshold value can apparently cause inaccurate results to be obtained due to the removal of important information, which likely occurred in the current study when a preliminary threshold value was employed.

Technical reproducibility was examined by injecting two digested samples five times each. Similarity values ranged from 0.917 to 1.00. Dissimilarities were likely due to small variations caused by the capillary electrophoresis process itself, such as laser intensity or the injection voltage/current, as the number of peaks detected for each replicate did not increase or decrease sequentially. On two occasions (Table 2), a similarity value of one was noted when replicates with a different number of peaks were compared. This was due to an increased confidence interval, which allowed peaks within ±1 base pair to be binned together; i.e., replicates with a larger number of peaks had two peaks that were within one base pair of each other, causing them to be considered as a single element.

Comparison of samples taken monthly from the main collection sites showed that the sandy woodlot and woodlot could be separated in plots of all five habitats with each of the three enzymes. However, the remaining habitats rarely formed distinct clusters (the yard could be differentiated from other habitats once, while the field and marsh samples always overlapped with other habitats). Similar results were obtained from soils collected 10 feet in each direction from the main sampling site: the sandy woodlot and woodlot could be distinguished in 10 out of the 12 plots, the agricultural field was differentiated in 5 plots, and the marsh and yard were separated from other habitats in 3 plots. Rees et al. (2004) had similar results when they collected ten soil samples from three distant sites along a river in Australia. 16S T-RFLP profiles from Hin6I (HhaI) digestion were analyzed using MDS.
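
MDS on presence/absence profiles can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study or by Rees et al.: it substitutes classical (Torgerson) MDS, computed by power iteration, for whatever MDS variant a statistical package would apply, and it assumes Jaccard distance between binned peak sets. The six profiles and all function names are hypothetical.

```python
import math


def jaccard_distance(a, b):
    """1 - (shared peaks / total distinct peaks) for two peak sets."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def classical_mds(dist, dims=2, iters=500):
    """Classical (Torgerson) MDS via power iteration with deflation."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + 1.0 + k) for i in range(n)]  # deterministic start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: eigenvalue estimate for the unit vector v.
        lam = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        if lam <= 1e-12:
            break  # no positive axis left; later columns stay at 0
        for i in range(n):
            coords[i][k] = v[i] * math.sqrt(lam)
        for i in range(n):  # deflate so the next pass finds the next axis
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Hypothetical binned profiles (fragment sizes in bp) for two habitats.
names = ["W1", "W2", "W3", "A1", "A2", "A3"]
profiles = [{41, 60, 150, 178, 230}, {41, 60, 150, 178, 327},
            {41, 62, 150, 178, 230}, {50, 150, 178, 204},
            {50, 150, 178}, {50, 178, 204}]
dist = [[jaccard_distance(p, q) for q in profiles] for p in profiles]
coords = classical_mds(dist)
for name, (x, y) in zip(names, coords):
    print(f"{name}: dimension 1 = {x:+.3f}, dimension 2 = {y:+.3f}")
```

With only two well-separated groups, the first dimension places the hypothetical woodlot (W) and field (A) profiles on opposite sides of zero, which is the kind of separation read from the plots in this study.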
When a binary matrix was used to determine the presence/absence of peaks with an RFU threshold that deleted all peaks below 1% of the total peak area, two of the three sites clustered together. It is possible that complete separation of all sites is difficult with MDS analysis of T-RFLP profiles if more than two habitats are analyzed, as was noted in both the cited research and the current study.

The idea that analyzing more than two habitats concurrently makes MDS analysis difficult was supported when only two habitats were compared at a time. All habitats grouped separately, with no overlap or samples that did not group with others from the same habitat. The only exception was the agricultural field, which was not distinguishable from any other habitat except in one plot comparing it to the woodlot (Figure 10). It seems that analyzing multiple habitats caused a large number of dissimilar samples to be compared, resulting in the highly dissimilar habitats (the woodlots) occupying a distinct region of the plot and more similar habitats being indistinguishable.

The restriction enzyme used sometimes made it possible for habitats with similar vegetation and levels of human manipulation to be differentiated. Selection of restriction enzymes is critical because the conservation of the restriction sites within the target sequence varies considerably (Marsh 2005). Some restriction enzymes will reveal certain bacterial strains but will not represent total community diversity, others will expose all diversity present, while still others will disclose very little because the restriction site is highly conserved. In the current study, there was no clear distinction as to which restriction enzyme caused most samples from a habitat to be plotted together. DpnII digested DNAs produced “unknown” samples that were accurately grouped with other samples originating from the same habitat with a higher frequency than the other two enzymes.
Additionally, samples collected monthly from all habitats that were digested with DpnII produced the clearest distinction among habitats. Conversely, when comparing MspI, RsaI, and DpnII digested amplicons to determine the number of unique bacterial species detected with each enzyme, MspI and RsaI digested DNAs had the largest number of peaks that were present at least 75% of the time (Table 3). However, all three enzymes exhibited roughly the same number of strongly indicative peaks. It is possible that MspI and RsaI cut at conserved regions, resulting in a large number of common peaks that were not unique to a specific habitat, as the cut sites would be present in all species of rhizobia. DpnII may cut at more unique sequences, resulting in more distinct species being detected and a comparable number of strongly indicative peaks.

Variability within and among habitats could have resulted from the total number of peaks generated for each sample, environmental factors, human manipulation of a habitat, geographic distance, or soil type. T-RFLP profiles that contained a large or small number of peaks were often outliers, with 51 out of the 80 samples that had a peak number below 20 or above 80 not grouping with other samples from the same habitat, or located in areas of the MDS plot that intersected with other habitats. In contrast, 28 outliers were seen when 80 representative samples with a ‘normal’ number of peaks were compared. Few rhizobial strains in the sample or poor digestion by the restriction enzyme could have caused fewer peaks to be detected, with those present being larger in size (Irwin et al. 2003; Hartmann and Widmer 2008). This was noted in many profiles resulting from digestion with RsaI in the current study, where peaks representing larger fragments were more numerous.

Although some variability inevitably results from the T-RFLP method itself (Osborn et al.
2000), within-habitat heterogeneity can largely be attributed to environmental factors (Mummey and Stahl 2003; Meyers and Foran 2008). Even when comparing samples collected relatively close to one another in a single habitat, local heterogeneity (probably resulting from different plant species, amount of sunlight reaching the soil, and differing moisture levels) can cause variability in T-RFLP profiles. Mummey and Stahl (2003) obtained samples from two different types of grasslands in Wyoming and found variability, with similarities decreasing rapidly at distances greater than 3.6 meters. The authors proposed that simple differences in rooting patterns, such as direction and amount of growth, could induce different species of bacteria to be present even in such a small area. This may cause some soils collected from sites with diverse plant species to not group with each other. Heterogeneity could have occurred in the current study due to the same effect, as rhizobia only exist in the root nodules of legumes.

Human manipulation may have caused increased similarity within habitats. The two habitats that commonly separated from all others, the sandy woodlot and woodlot, had little human influence. In contrast, the marsh was located near a home, the yard was mowed weekly in the summer, and the field was tilled and fertilized in May, leading to decreased variation in the plant species present. Human modifications seem to have caused reduced heterogeneity and, therefore, the existence of relatively few species of bacteria. Meyers and Foran (2008) noted increased similarity in the habitats that were modified by humans, with the highest similarity value (0.773) from agricultural field samples. Similar results were found by Wu et al.
(2007), who studied soil fungal communities that had five years of different agricultural land management, such as crops with soil fumigation, no vegetation present, soil being left undisturbed with weeds free to vegetate, organic techniques with no pesticides, and soil from which pasture grass had not been removed. They found that management practices that disturbed soil greatly diminished fungal diversity.

Geographic distance may also be a factor in the similarity of bacterial species in a habitat. The sandy woodlot, which often separated from other habitats in MDS plots, was located approximately 100 miles from the other habitats. Meyers and Foran (2008) also found that the sandy woodlot was the most different using the 16S rRNA gene, suggesting that distance can have a marked effect on the microbial makeup of soil. Liu et al. (2003) obtained similar results when they collected marine sediment from four locations off the Pacific coast of Mexico in an effort to measure the genetic diversity of denitrifiers (bacteria that convert nitrate to nitrogen gas) by using primers specific for nitrite reductase genes. Principal components analysis showed that the bacterial communities were more similar in sites closer to each other. The results indicate that bacterial community structure is affected by distance, with those communities geographically closer having similar bacterial strains. Yet when comparing rhizobia in the current study, the woodlot also separated from the remaining habitats, suggesting that geographic distance is not the only factor causing habitat diversity.

Soil type did not appear to have an impact on the rhizobial similarity of the habitats. Both the yard and the woodlot were classified as sandy loams (Meyers and Foran 2008); however, in the current study the yard rarely formed a distinct cluster while the woodlot almost always did. In contrast, the field, marsh, and yard were composed of different soil types, yet their bacterial profiles were similar.
Meyers and Foran (2008) also concluded that the yard and woodlot were no more similar to one another than to the remaining habitats. Results of the current research as well as the results of Meyers and Foran (2008) correspond to those of Tan et al. (2003), who collected samples from rice fields for fifteen days. The authors found that similar T-RFLP profiles resulted when like rice plants existed in the regions rather than when the soil types of the fields were the same. It appears that soil type does not affect the strains of bacteria that exist in a region.

It is possible that some local heterogeneity was caused by variation in soil pH. Rhizobia have a higher survival rate in soils that have a pH greater than 7. Brockwell et al. (1991) found that Rhizobium meliloti existed at 890,000 colonies per gram in soils with a pH above 7, and 37 colonies per gram when the pH was below 6. In the current study, soil pH could have changed throughout the year, either naturally or through human activity, causing increased variability within habitats.

Temporal effects, examined when comparing samples collected once a month, also caused increased heterogeneity within habitats. Plots of soils collected in June and December consistently separated the sandy woodlot and woodlot from remaining habitats (Figures 7, 8, and 9; Appendix). In contrast, plots comparing soils collected in March and September resulted in the greatest overlap for each habitat compared. Outliers often came from the spring months and October when main site soils were compared. It seems likely that seasonal variation caused variability in rhizobial strains. When the 16S rRNA gene was analyzed (Meyers and Foran 2008), the field, marsh, and yard showed the highest within-habitat similarity in June while the highest similarity for the woodlot and sandy woodlot was in December.
The temporal fluctuation of all bacterial species suggested that many prokaryotes act similarly, with the most within-habitat heterogeneity occurring in the spring months.

Local heterogeneity could also have been caused by fluctuations in temperature, which impact the growth of bacteria (Drouin et al. 2000). Bacterial survival can be affected by fluctuation in temperature, wet-dry and freeze-thaw cycles, and precipitation (Adams and Wall 2000). Specifically, rhizobia have limited growth at 4 to 10°C; consequently, the development of legumes such as alfalfa and soybean slows when temperatures are below this level due to the reduced amount of rhizobia available to fix the nitrogen needed for plant growth (Drouin et al. 2000). In central Michigan, average temperatures do not reach 4 to 10°C until March or April (Federal Research Division of the Library of Congress 1998). It is possible that, in the winter months, rhizobial strains lie dormant, and reproduction slows, until optimal temperatures for growth are reached. Once temperatures rise, all rhizobial strains flourish, leading to common strains existing in every habitat before competition among strains begins and more unique profiles can be formed.

George et al. (1987) studied interstrain competition among Bradyrhizobium japonicum and its effects on five soybean varieties in Hawaii. Results indicated that certain bacterial species found on soybean roots outcompeted others, with one strain in particular occupying up to 81% of the nodules in the field. This may also occur in Michigan during the summer months, leading to some rhizobial strains being largely present in certain habitats and different genomic ‘fingerprints’ being formed. This is likely the reason why more unique profiles with less heterogeneity were formed when soils collected in June were compared. It is not until the summer months that certain bacterial species outcompete others and prevalent strains from each habitat are noted.
Finally, low temperatures in central Michigan average 4 to 10°C in October. It is possible that, in reaching a sub-optimal temperature, some common strains of rhizobia lie dormant while others still flourish, leading to increased variance when comparing bacterial profiles from the same habitat.

It is also possible that any inability to differentiate among habitats does not lie purely in the experimental method or environmental factors, but results from statistical analyses that do not describe the data. Previous techniques for analyzing T-RFLP data, including visually comparing electropherograms for the presence or absence of peaks, principal components analysis, cluster analysis, or self-organizing networks, are valid, but may not represent the data well if the data are not normally distributed (Rees et al. 2004). Multivariate techniques have been suggested as the most accurate way to evaluate T-RFLP results because complex datasets and non-linear data can be analyzed and the results are easy to interpret. These techniques have allowed for differentiation among collection sites when other statistical methods failed (Harder et al. 2004; Rees et al. 2004). This was seen in the present study, as similarity indices did not accurately reflect the underlying data patterns, and MDS produced greater differentiation among habitats.

Indicative peaks (in this study, peaks found at least 75% of the time in a habitat; Table 3) can be used in conjunction with MDS plots to confirm results or to calculate a measure of similarity among habitats. For example, in the current research, the woodlot frequently plotted separately from other habitats. This may be explained by the fourteen indicative peaks that were present in this habitat and rarely found in the remaining habitats.
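
The indicative-peak criteria can be expressed as a short sketch. It assumes each profile has already been binned into a set of peak sizes, applies the Table 3 cutoffs (at least 75% within the habitat; at most 25% in every other habitat for a strongly indicative peak), and uses small hypothetical profiles; the function names are illustrative and not taken from any software used in this study.

```python
def peak_frequencies(profiles):
    """Fraction of a habitat's profiles in which each binned peak occurs."""
    counts = {}
    for profile in profiles:
        for peak in profile:
            counts[peak] = counts.get(peak, 0) + 1
    return {peak: n / len(profiles) for peak, n in counts.items()}


def indicative_peaks(habitats, target, present=0.75, rare=0.25):
    """Return (indicative, strongly indicative) peaks for one habitat.
    Cutoffs follow the Table 3 caption; the data layout is an assumption."""
    freqs = {name: peak_frequencies(p) for name, p in habitats.items()}
    indicative = sorted(p for p, f in freqs[target].items() if f >= present)
    strongly = [p for p in indicative
                if all(freqs[other].get(p, 0.0) <= rare
                       for other in habitats if other != target)]
    return indicative, strongly


# Hypothetical profiles (not study data): four samples per habitat.
habitats = {
    "woodlot": [{40.97, 150.13, 178.29}, {40.97, 150.13, 178.29},
                {40.97, 178.29}, {40.97, 150.13, 178.29}],
    "field": [{49.94, 150.13, 178.29}, {49.94, 150.13, 178.29},
              {49.94, 178.29, 203.92}, {49.94, 150.13}],
}

for name in habitats:
    indicative, strongly = indicative_peaks(habitats, name)
    print(name, "indicative:", indicative, "strongly indicative:", strongly)
```

In this toy example peaks 150.13 and 178.29 are indicative of both habitats and therefore discriminate poorly, while 40.97 and 49.94 are each strongly indicative of a single habitat.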
It is also possible to use common peaks as a measure of similarity among habitats; for instance, the number of peaks shared between two habitats can be divided by the total number of peaks present in both habitats (Kerkhof et al. 2000). This was not performed in the current study as more descriptive and accurate statistics were available.

The utility of MDS for forensic soil analysis was tested by regenerating a plot with an “unknown” sample to determine if it grouped with known samples from the same habitat. In this study, two habitats were compared at a time, which represented a more realistic approach and allowed for better separation of habitats. This also allowed the “prosecutor’s hypothesis,” that the unknown soil had a similar profile to known soils originating from the site of the crime, to be tested against the “defense hypothesis,” that the suspect was not at the scene of the crime and the soil therefore originated from a different location. The habitats were distinguishable from each other, and the added “unknown” grouped with samples from the same habitat 40% to 70% of the time. The number of inconclusive groupings varied from 13% with DpnII digested DNAs to 47% with RsaI, and the level of incorrect groupings remained similar among plots generated from all three enzymes (13% to 23%). There was large variation in the number of peaks from RsaI digestions used as unknown samples, likely raising the number of inconclusive results. Nonetheless, a standard error rate of “unknowns” grouping with other samples from the same habitat must be obtained before this technique meets Daubert standards.

The results of the current study suggest that T-RFLP analysis could be a useful component of the “forensic toolbox.” The research provides insight into steps that must be followed in order to obtain accurate results. First, it is best to collect soil samples from the crime scene as soon as possible.
These soils should be gathered for several months, perhaps at weekly intervals, to allow outliers to be identified, to determine whether temperature correlates with heterogeneity, and to build a full set of known samples for comparison with the unknown sample. Duplicate or triplicate soils should also be collected to identify outliers caused by bacterial heterogeneity or by the T-RFLP process itself that may skew results. It may be beneficial to use multiple enzymes to digest the samples, as each will illustrate different amounts of bacterial diversity. In the current study, DpnII digested DNAs resulted in “unknown” samples most often being grouped with samples from the same habitat, while MspI and RsaI digested soils had more inconclusive results, suggesting that DpnII has cut sites that are more representative of the unique rhizobial strains present.

Variability among PCR replicates is expected in T-RFLP analysis (Osborn et al. 2000), and can often be quickly and accurately corrected by aligning profiles before statistical analysis. T-Align, a program developed by University College Dublin and utilized in the current study to form binary data matrices, has an option for determining a consensus profile. Only peaks that appear in duplicate will be analyzed, resulting in the removal of irreproducible peaks (Smith et al. 2005).

MDS analysis in the current study allowed for easily interpretable results that separated all habitats but the agricultural field when pairs of habitats were analyzed. The comparison of only two habitats concurrently was the most applicable scenario for the current research, since most likely only two habitats will be compared in a forensic setting. However, it may be necessary on occasion to analyze more than two habitats concurrently. Although not yet studied in detail with T-RFLP, discriminant function analysis (DFA) may allow for separation of all habitats in these instances.
The technique operates in the same manner as multidimensional scaling: by separating samples into groups. Unlike MDS, however, DFA removes and adds back variables from the dataset until no further improvements can be achieved that emphasize unique species in specific habitats. In this manner, common peaks are excluded, reducing the total number of variables and improving the ability to differentiate among habitats. Schwarzenbach et al. (2006) concluded that this statistical analysis was accurate in a study in which six replicates of ten soil samples, collected in June 2006 within 400 m² of each other in a grassland (sandy loam), were compared. By using DFA, the replicates were clustered consistently into the ten groups; however, the influence of time was not examined.

Conclusions

T-RFLP has become a common technique in microbial ecology. It can be performed quickly and easily, with a resolution that allows related strains of bacteria to be differentiated. When several habitats are compared, those with unique peaks (such as the sandy woodlot and woodlot examined in this study) can often be separated from other habitats with little or no overlap. More similar habitats may be differentiated through simplifying the statistical analysis by comparing only two habitats at a time. Also, by analyzing a bacterial family instead of all prokaryotes, more reproducible profiles can be obtained. However, T-RFLP analysis, although promising, is not yet ready to be implemented in a forensic setting. It is possible that, if the recommendations set forth in this study are followed, T-RFLP analysis may be used to accurately and reliably identify the origin of an unknown soil. However, this may require that a confidence value be determined explaining the likelihood that the questioned soil originated from a specific habitat.
Most likely this can be achieved through perfecting the T-RFLP process and fully appreciating the effects of time, soil heterogeneity, and the statistical method used to analyze the data.

APPENDIX

Table 4. Similarity Indices for PCR Replicates
Samples from the main site of each location were amplified three separate times and compared in both March and September. Sample name is in the first column while number of peaks in each profile is in the second column. The third replicate for DpnII 305 RM was not included because it only had 2 peaks from 40 to 450 base pairs.

Sample         Number of Peaks (Samples 1, 2, 3)   Samples 1 and 2   Samples 1 and 3   Samples 2 and 3
DpnII 305 AM   21, 12, 14                          0.655             0.710             0.857
DpnII 305 MM   17, 38, 29                          0.800             0.891             0.985
DpnII 305 RM   46, 24, N/A                         0.718             N/A               N/A
DpnII 305 SM   35, 44, 30                          0.924             0.815             0.946
DpnII 305 WM   76, 20, 38                          0.823             0.930             0.828
DpnII 904 AM   8, 25, 20                           0.606             0.679             0.844
DpnII 904 MM   9, 12, 12                           0.810             0.762             0.875
DpnII 904 RM   44, 24, 20                          0.750             0.844             0.841
DpnII 904 SM   30, 25, 32                          0.949             0.919             1.00
DpnII 904 WM   9, 14, 20                           0.870             0.724             0.765
MspI 305 AM    27, 32, 25                          0.831             0.808             0.902
MspI 305 MM    31, 45, 13                          0.842             0.649             0.603
MspI 305 RM    42, 31, 17                          0.932             0.831             0.833
MspI 305 SM    67, 32, 42                          0.909             0.936             0.946
MspI 305 WM    70, 23, 31                          0.860             0.950             0.833
MspI 904 AM    41, 39, 23                          0.950             0.828             0.839
MspI 904 MM    30, 13, 16                          0.698             0.783             0.897
MspI 904 RM    57, 26, 18                          0.855             0.693             0.864
MspI 904 SM    28, 14, 41                          0.786             0.783             0.564
MspI 904 WM    20, 17, 19                          0.892             0.872             0.917
RsaI 305 AM    22, 16, 20                          0.737             0.810             0.694
RsaI 305 MM    24, 30, 54                          0.870             0.846             0.881
RsaI 305 RM    16, 9, 12                           0.680             0.607             0.667
RsaI 305 SM    21, 37, 15                          0.638             0.500             0.808
RsaI 305 WM    82, 24, 51                          0.784             0.947             0.720
RsaI 904 AM    22, 15, 34                          0.784             0.857             0.612
RsaI 904 MM    32, 14, 9                           0.630             0.561             0.87
RsaI 904 RM    78, 23, 12                          0.673             0.489             0.743
RsaI 904 SM    25, 5, 12                           0.300             0.811             0.412
RsaI 904 WM    19, 14, 12                          0.727             0.548             0.654

Figure 14.
Multidimensional Scaling Results for Directional Samples using DpnII, in March
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 15. Multidimensional Scaling Results for Directional Samples using DpnII, in September
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 16. Multidimensional Scaling Results for Directional Samples using DpnII, in December
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 17. Multidimensional Scaling Results for Directional Samples using MspI, in March
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 18. Multidimensional Scaling Results for Directional Samples using MspI, in June
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 19. Multidimensional Scaling Results for Directional Samples using MspI, in September
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 20. Multidimensional Scaling Results for Directional Samples using RsaI, in March
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 21. Multidimensional Scaling Results for Directional Samples using RsaI, in September
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Figure 22. Multidimensional Scaling Results for Directional Samples using RsaI, in December
Each habitat was compared to determine local heterogeneity. These samples were collected once every three months in four directions from the main collection site.
[MDS plot; horizontal axis: Dimension 1]

Table 5. Indicative Peaks for DpnII
Frequencies of rhizobial species were calculated (in percentages) for each habitat using DpnII. Peaks were considered indicative if they were found with at least 75% frequency, and included in the genetic ‘fingerprint’ for that habitat. Peaks found at least 75% of the time for one habitat and less than or equal to 25% of the time for all other habitats were considered strongly indicative and are in bold print.
TRF (bp)   Field   Marsh   Yard    Sandy Woodlot   Woodlot
40.97      25.0    0.0     41.7    25.0            75.0
41.97      41.7    63.6    58.3    50.0            83.3
49.94      100.0   81.8    91.7    58.3            66.7
60.35      66.7    45.5    58.3    41.7            83.3
62.02      25.0    27.3    41.7    33.3            83.3
64.91      16.7    18.2    25.0    8.3             83.3
84.25      33.3    18.2    83.3    58.3            50.0
86.95      8.3     18.2    25.0    50.0            83.3
93.99      8.3     9.1     8.3     8.3             75.0
96.59      41.7    27.3    75.0    58.3            33.3
104.39     33.3    9.1     75.0    25.0            16.7
130.78     16.7    0.0     25.0    75.0            58.3
143.38     8.3     0.0     0.0     75.0            0.0
149.00     25.0    63.6    75.0    58.3            66.7
150.13     83.3    63.6    100.0   66.7            83.3
162.89     66.7    72.7    83.3    33.3            83.3
164.04     41.7    36.4    50.0    66.7            83.3
165.72     8.3     0.0     16.7    8.3             83.3
178.29     91.7    100.0   75.0    83.3            83.3
179.42     50.0    18.2    58.3    16.7            83.3
194.23     16.7    18.2    0.0     75.0            16.7
200.03     8.3     27.3    75.0    0.0             58.3
203.92     83.3    9.1     16.7    8.3             0.0
230.24     0.0     0.0     8.3     8.3             91.7
235.53     16.7    0.0     0.0     75.0            8.3
262.84     58.3    27.3    16.7    83.3            8.3
327.34     0.0     0.0     33.3    8.3             75.0

Table 6. Indicative Peaks for MspI
Frequencies of rhizobial species were calculated (in percentages) for each habitat using MspI. Peaks were considered indicative if they were found with at least 75% frequency, and included in the genetic ‘fingerprint’ for that habitat. Peaks found at least 75% of the time for one habitat and less than or equal to 25% of the time for all other habitats were considered strongly indicative and are in bold print.
TRF (bp)   Field   Marsh   Yard    Sandy Woodlot   Woodlot
53.52      41.7    45.5    33.3    8.3             91.7
61.63      83.3    27.3    58.3    66.7            41.7
66.76      58.3    90.9    33.3    100.0           91.7
69.76      58.3    54.5    33.3    50.0            83.3
72.85      8.3     0.0     16.7    16.7            75.0
80.89      25.0    54.5    0.0     100.0           16.7
91.94      8.3     18.2    0.0     8.3             75.0
100.70     8.3     0.0     16.7    16.7            75.0
104.50     25.0    18.2    83.3    41.7            8.3
115.80     75.0    18.2    75.0    83.3            100.0
117.06     83.3    100.0   75.0    41.7            41.7
119.56     33.3    9.1     0.0     33.3            83.3
120.46     83.3    81.8    50.0    8.3             75.0
129.23     100.0   27.3    91.7    91.7            66.7
132.53     41.7    9.1     75.0    16.7            8.3
133.36     8.3     18.2    8.3     41.7            91.7
136.17     75.0    72.7    66.7    8.3             0.0
138.81     25.0    54.5    16.7    0.0             83.3
142.28     16.7    36.4    83.3    25.0            8.3
144.14     8.3     27.3    8.3     0.0             91.7
145.44     25.0    45.5    41.7    75.0            33.3
148.95     25.0    9.1     16.7    75.0            66.7
160.83     0.0     9.1     16.7    16.7            91.7
163.00     25.0    9.1     33.3    16.7            91.7
193.42     0.0     0.0     83.3    8.3             83.3
197.16     83.3    9.1     25.0    75.0            0.0
201.72     16.7    9.1     8.3     8.3             83.3
209.53     33.3    72.7    33.3    75.0            66.7
210.84     50.0    45.5    41.7    91.7            58.3
211.77     91.7    45.5    41.7    16.7            8.3
212.60     0.0     45.5    41.7    0.0             75.0
234.55     16.7    27.3    0.0     75.0            0.0
240.60     58.3    100.0   41.7    8.3             33.3
252.95     91.7    45.5    33.3    33.3            75.0
283.34     16.7    9.1     8.3     75.0            0.0
320.67     0.0     0.0     83.3    25.0            0.0
343.27     16.7    18.2    0.0     75.0            8.3

Table 7. Indicative Peaks for RsaI
Frequencies of rhizobial species were calculated (in percentages) for each habitat using RsaI. Peaks were considered indicative if they were found with at least 75% frequency, and included in the genetic ‘fingerprint’ for that habitat. Peaks found at least 75% of the time for one habitat and less than or equal to 25% of the time for all other habitats were considered strongly indicative and are in bold print.
TRF (bp)   Field   Marsh   Yard    Sandy Woodlot   Woodlot
52.23      8.3     36.4    25.0    83.3            25.0
63.67      41.7    45.5    16.7    33.3            75.0
64.58      66.7    9.1     50.0    58.3            83.3
69.62      50.0    81.8    50.0    83.3            100.0
72.80      25.0    27.3    41.7    58.3            83.3
73.86      66.7    63.6    25.0    58.3            83.3
79.93      8.3     36.4    8.3     16.7            91.7
83.99      58.3    54.5    33.3    83.3            75.0
91.77      16.7    9.1     8.3     8.3             75.0
93.78      50.0    18.2    83.3    16.7            33.3
100.89     16.7    18.2    8.3     50.0            75.0
102.91     58.3    36.4    0.0     0.0             91.7
107.88     41.7    9.1     75.0    33.3            50.0
112.86     83.3    63.6    41.7    58.3            75.0
118.56     25.0    81.8    16.7    41.7            16.7
137.49     91.7    90.9    100.0   75.0            75.0
139.02     75.0    72.7    83.3    25.0            75.0
140.17     50.0    27.3    83.3    16.7            33.3
144.79     0.0     0.0     16.7    16.7            83.3
158.76     75.0    27.3    16.7    8.3             91.7
169.73     25.0    45.5    8.3     16.7            83.3
185.27     0.0     18.2    83.3    8.3             0.0
199.29     8.3     9.1     8.3     50.0            91.7
206.12     33.3    9.1     66.7    8.3             91.7
208.49     8.3     9.1     16.7    75.0            0.0
214.69     41.7    27.3    0.0     0.0             83.3
222.80     41.7    81.8    41.7    41.7            75.0
240.61     91.7    100.0   66.7    8.3             83.3
241.71     8.3     18.2    0.0     8.3             91.7
294.47     8.3     0.0     41.7    75.0            50.0
297.84     50.0    27.3    8.3     41.7            83.3
302.31     0.0     0.0     0.0     0.0             83.3
323.77     33.3    36.4    8.3     8.3             91.7
336.64     0.0     0.0     0.0     0.0             75.0
348.99     8.3     27.3    16.7    16.7            75.0
350.16     8.3     27.3    75.0    8.3             0.0

REFERENCES

Adams, G.A. and D.H. Wall. 2000. Biodiversity above and below the surface of soils and sediments: linkages and implications for global change. Bioscience. 50(12):1043-1048.
Adderley, W.P., I.A. Simpson, and D.A. Davidson. 2002. Colour description and quantification in mosaic images of soil thin sections. Geoderma. 108:181-195.
Adjei, M.B., K.H. Quesenberry, and C.G. Chambliss. 2002. Nitrogen fixation and inoculation of forest legumes. Accessed 2008 March 19.
Avaniss-Aghajani, E., K. Jones, A. Holtzman, T. Aronson, N. Glover, M. Boian, S. Froman, and C.F. Brunk. 1996. Molecular technique for the rapid identification of mycobacteria. Journal of Clinical Microbiology. 34(1):98-102.
Blackwood, C.B., T.L. Marsh, K. Sang-Hoon, and E.A. Paul. 2003.
Terminal restriction fragment length polymorphism data analysis for quantitative comparison of microbial communities. Applied and Environmental Microbiology. 69(2):926-932.
Blackwood, C.B. and E.A. Paul. 2003. Eubacterial community structure and population size within the soil light fraction, rhizosphere, and heavy fraction of several agricultural systems. Soil Biology and Biochemistry. 35(9):1245-1255.
Boyle, J. 2004. A comparison of two methods for estimating the organic matter content of sediments. Journal of Paleolimnology. 31(1):125-127.
Brockwell, J., A. Pilka, and R.A. Holliday. 1991. Soil pH is a major determinant in the numbers of naturally occurring Rhizobium meliloti in non-cultivated soils in south central Wales. Australian Journal of Experimental Agriculture. 31(2):211-219.
Brown, A.G. 2006. The use of botany and geology in war crimes investigations in NE Bosnia. Forensic Science International. 163(3):204-210.
Bruce, K.D. 1997. Analysis of mer gene subclasses within bacterial subcommunities in soils and sediments resolved by fluorescent-PCR-restriction fragment length polymorphism. Applied and Environmental Microbiology. 63:4914-4919.
Cai, Y., J.C. Cabrera, M. Georgiadis, and K. Jayachandran. 2002. Assessment of arsenic mobility in the soils of some golf courses in South Florida. The Science of the Total Environment. 291(1-3):123-134.
Casamayor, E.O., R. Massana, S. Benlloch, L. Ovreas, B. Diez, V.J. Goddard, J.M. Gasol, I. Joint, F. Rodriguez-Valera, and C. Pedros-Alio. 2002. Changes in archaeal, bacterial and eukaryal assemblages along a salinity gradient by comparison of genetic fingerprinting methods in a multipond solar saltern. Environmental Microbiology. 4:338-348.
Chan, K.Y., D.P. Heenan, and A. Oates. 2002. Soil carbon fractions and relationship to soil quality under different tillage and stubble management. Soil and Tillage Research. 63(3-4):133-139.
Chao, L.S.L., R.E. Davis, and C.L. Moyer. 2007.
Characterization of bacterial community structure in vestimentiferan tubeworm Ridgeia piscesae trophosomes. Marine Ecology. 28:72-85.
Clement, B.G., L.E. Kehl, K.L. DeBord, and C.L. Kitts. 1998. Terminal restriction fragment patterns (TRFPs), a rapid, PCR based method for the comparison of complex bacterial communities. Journal of Microbiological Methods. 31:135-142.
Cohan, F.M. 2002. What are bacterial species? Annual Review of Microbiology. 56:457-487.
Cooper, J.E., A.J. Bjourson, W. Streit, and D. Werner. 1998. Isolation of unique nucleic acid sequences from rhizobia by genomic subtraction: applications in microbial ecology and symbiotic gene analysis. Plant and Soil. 204:47-55.
Cooper, J.E. and J.R. Rao. 2007. Molecular approaches to soil, rhizosphere, and plant microorganism analysis. United Kingdom: Cabi Publishing.
Cox, R.J., H.L. Peterson, J. Young, C. Cusik, and E.O. Espinoza. 2000. The forensic analysis of soil organic by FTIR. Forensic Science International. 108(2):107-116.
Dawson, L.A., W. Towers, R.W. Mayes, S. Hillier, A. Fraser, J. Craig, and E.C. Waterhouse. 2003. Use of plant wax signatures in understanding soils. Proceedings of the Clay Minerals Group of the Mineralogical Society and the Forensic Science Society - Trace Metals, Isotopes, and Minerals in Forensic Science.
Dawson, L.A., W. Towers, R.W. Mayes, J. Craig, R.K. Vaisanen, and E.C. Waterhouse. 2004. The use of plant hydrocarbon signatures in characterizing soil organic matter. Geological Society Special Publications. 232:269-276.
DeBruijn, F.J. 1992. Use of repetitive (repetitive extragenic palindromic and enterobacterial repetitive intergenic consensus) sequences and the polymerase chain reaction to fingerprint the genomes of Rhizobium meliloti isolates and other soil bacteria. Applied and Environmental Microbiology. 58:2180-2187.
Denton, M.D., D.R. Coventry, P.J. Murphy, J.G. Howieson, and W.D. Bellotti. 2002.
Competition between inoculant and naturalised Rhizobium leguminosarum bv. trifolii for nodulation of annual clovers in alkaline soils. Australian Journal of Agricultural Research. 53(9):1019-1026.
Drouin, P., D. Prévost, and H. Antoun. 2000. Physiological adaptation to low temperatures of strains of Rhizobium leguminosarum bv. viciae associated with Lathyrus spp.(1). FEMS Microbiology Ecology. 32(2):111-120.
Dunbar, J., L.O. Ticknor, and C.R. Kuske. 2000. Assessment of microbial diversity in four southwestern United States soils by 16S rRNA gene terminal restriction fragment analysis. Applied and Environmental Microbiology. 66(7):2943-2950.
Fierer, N. and R.B. Jackson. 2006. From the cover: the diversity and biogeography of soil bacterial communities. Proceedings of the National Academy of Sciences. 103:626-631.
Franklin, R.B. and A.L. Mills. 2003. Multi-scale variation in spatial heterogeneity for microbial community structure in an eastern Virginia agricultural field. FEMS Microbiology Ecology. 44(3):335-346.
Galantini, J. and R. Rosell. 2006. Long-term fertilization effects on soil organic matter quality and dynamics under different production systems in semiarid Pampean soils. Soil and Tillage Research. 87(1):72-79.
George, T., B.B. Bohlool, and P.W. Singleton. 1987. Bradyrhizobium japonicum environmental interactions: nodulation and interstrain competition in soil along an elevational transect. Applied and Environmental Microbiology. 53(5):1113-1117.
Harder, T., S.C.K. Lau, W.Y. Tam, and P.Y. Qian. 2004. A bacterial culture-independent method to investigate chemically mediated control of bacterial epibiosis in marine invertebrates by using T-RFLP analysis and natural bacterial populations. FEMS Microbiology Ecology. 47(1):93-99.
Hartmann, M. and F. Widmer. 2008. Reliability for detecting composition and changes of microbial communities by T-RFLP genetic profiling. FEMS Microbiology Ecology. 63(2):249-260.
Heath, L.E. and V.A. Saunders. 2006.
Assessing the potential of bacterial DNA profiling for forensic soil comparisons. Journal of Forensic Sciences. 51(5):1062-1068.
Hopkins, D.W., P.E.J. Wiltshire, and E.D. Turner. 2000. Microbial characteristics of soils from graves: an investigation at the interface of soil microbiology and forensic science. Applied Soil Ecology. 14(3):283-288.
Horrocks, M. 2004. Sub-sampling and preparing forensic samples for pollen analysis. Journal of Forensic Sciences. 49:1024-1027.
Irwin, D.L., K.R. Mitchelson, and I. Findlay. 2003. PCR product cleanup methods for capillary electrophoresis. BioTechniques. 34:932-936.
Jarvis, K.E., H.E. Wilson, and S.L. James. 2004. Assessing element variability in small soil samples taken during forensic investigation. Geological Society, London, Special Publications. 232:171-182.
Junger, E.P. 1996. Assessing the unique characteristics of close-proximity soil samples: just how useful is soil evidence? Journal of Forensic Sciences. 41(1):27-34.
Kennedy, N., E. Brodie, J. Connolly, and N. Clipson. 2004. Impact of lime, nitrogen and plant species on bacterial community structure in grassland microcosms. Environmental Microbiology. 6(10):1070-1080.
Kerkhof, L., M. Santoro, and J. Garland. 2000. Response of soybean rhizosphere communities to human hygiene water addition as determined by community level physiological profiling (CLPP) and terminal restriction fragment length polymorphism (T-RFLP) analysis. FEMS Microbiology Letters. 184(1):95-101.
LaMontagne, M.G., J.P. Schimel, and P.A. Holden. 2003. Comparison of subsurface and surface soil bacterial communities in California grassland assessed by terminal restriction fragment length polymorphisms of PCR-amplified 16S rRNA genes. Microbial Ecology. 46:216-227.
Liu, W.T., T.L. Marsh, H. Cheng, and L.J. Forney. 1997. Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Applied and Environmental Microbiology.
63(11):4516-4522.
Liu, W.T., T.L. Marsh, and L.J. Forney. 1998. Determination of the microbial diversity of anaerobic-aerobic activated sludge by novel molecular biological technique. Water Science and Technology. 37:417-422.
Liu, X., S.M. Tiquia, G. Holguin, L. Wu, S.C. Nold, A.H. Devol, K. Luo, A.V. Palumbo, J.M. Tiedje, and J. Zhou. 2003. Molecular diversity of denitrifying genes in continental margin sediments within the oxygen-deficient zone off the Pacific coast of Mexico. Applied and Environmental Microbiology. 69(6):3549-3560.
Lueders, T. and M.W. Friedrich. 2003. Evaluation of PCR amplification bias by terminal restriction fragment length polymorphism analysis of small-subunit rRNA and mcrA genes by using defined template mixtures of methanogenic pure cultures and soil DNA extracts. Applied and Environmental Microbiology. 69(1):320-326.
Lukow, T., P.F. Dunfield, and W. Liesack. 2000. Use of the T-RFLP technique to assess spatial and temporal changes in the bacterial community structure within an agricultural soil planted with transgenic and non-transgenic potato plants. FEMS Microbiology Ecology. 32(3):241-247.
Marsh, T.L. 1999. Terminal restriction fragment length polymorphism (T-RFLP): an emerging method for characterizing diversity among homologous populations of amplification products. Current Opinion in Microbiology. 2:323-327.
Marsh, T.L. 2005. Culture-independent microbial community analysis with terminal restriction fragment length polymorphism. Methods in Enzymology. 397:308-329.
McIntyre, H.J., H. Davies, T.A. Hore, S.H. Miller, J.P. Dufour, and C.W. Ronson. 2007. Trehalose biosynthesis in Rhizobium leguminosarum bv. trifolii and its role in desiccation tolerance. Applied and Environmental Microbiology. 73:3984-3992.
McKinley, J. and A. Ruffell. 2007. Contemporaneous spatial sampling at scenes of crime: advantages and disadvantages. 172(2-3):196-202.
Meyers, M.S. and D.R. Foran. 2008.
Spatial and temporal influences on bacterial profiling of forensic soil samples. Journal of Forensic Sciences. 53(3):652-660.
Moreno, L.I., D.K. Mills, J. Entry, R.T. Sautter, and K. Mathee. 2006. Microbial metagenome profiling using amplicon length heterogeneity-polymerase chain reaction proves more effective than elemental analysis in discriminating soil specimens. 51(6):1315-1322.
Morgan, R.M. and P.A. Bull. 2007. The philosophy, nature and practice of forensic sediment analysis. Progress in Physical Geography. 31(1):43-58.
Mummey, D.L. and P.D. Stahl. 2003. Spatial and temporal variability of bacterial 16S rDNA-based T-RFLP patterns derived from soil of two Wyoming grassland ecosystems. FEMS Microbiology Ecology. 46(1):113-120.
Murray, R.C. and L.P. Solebello. 2002. Forensic Examination of Soil. In Forensic Science Handbook Volume 1. 2nd ed., ed. R. Saferstein. Upper Saddle River, New Jersey: Prentice Hall.
Muyzer, G. and K. Smalla. 1998. Application of denaturing gradient gel electrophoresis (DGGE) and temperature gradient gel electrophoresis (TGGE) in microbial ecology. Antonie Van Leeuwenhoek International Journal of General and Molecular Microbiology. 73:127-141.
Noll, M., D. Matthies, P. Frenzel, M. Derakshani, and W. Liesack. 2005. Succession of bacterial community structure and diversity in a paddy soil oxygen gradient. Environmental Microbiology. 7(3):382-395.
Norusis, Marija. 2007. SPSS 15.0 Advanced statistical procedures companion. Upper Saddle River, New Jersey: Prentice Hall.
Oren, A. 2004. Prokaryote diversity and taxonomy: current status and future challenges. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 359:623-638.
Osborn, A.M., E.R.B. Moore, and K.N. Timmis. 2000. An evaluation of terminal-restriction fragment length polymorphism (T-RFLP) analysis for the study of microbial community structure and dynamics. Environmental Microbiology. 2(1):39-50.
Perret, X. and W.J. Broughton. 1998.
Rapid identification of Rhizobium strains by targeted PCR fingerprinting. Plant and Soil. 204:21-38.
Pesaro, M., G. Nicollie, J. Zeyer, and F. Widmer. 2004. Impact of soil drying-rewetting stress on microbial communities and activities and on degradation of two crop protection products. Applied and Environmental Microbiology. 70:2577-2587.
Petraco, N. and T. Kubic. 2000. A density gradient technique for use in forensic soil analysis. Journal of Forensic Sciences. 45:872-873.
Pye, K., S.J. Blott, D.J. Croft, and J.F. Carter. 2006. Forensic comparison of soil samples: assessment of small-scale spatial variability in elemental composition, carbon and nitrogen isotope ratios, colour, and particle size distribution. Forensic Science International. 163(1-2):59-80.
Pye, K., S.J. Blott, D.J. Croft, and S.J. Witton. 2007. Discrimination between sediment and soil samples for forensic purposes using elemental data: an investigation of particle size effects. Forensic Science International. 167(1):30-42.
Quinn, G.P. and M.J. Keough. 2002. Experimental design and data analysis for biologists. Cambridge: Cambridge University Press.
Rees, G.N., D.S. Baldwin, G.O. Watson, S. Perryman, and D.L. Nielsen. 2004. Ordination and significance testing of microbial community composition derived from terminal restriction fragment length polymorphisms: application of multivariate statistics. Antonie van Leeuwenhoek. 86(4):339-347.
Richardson, A.E., L.A. Viccars, J.M. Watson, and A.H. Gibson. 1995. Differentiation of Rhizobium strains using polymerase chain reaction with random and directed primers. Soil Biology and Biochemistry. 27(4-5):515-524.
Saferstein, R. 2001. Criminalistics: An Introduction to Forensic Science. 7th ed. Upper Saddle River, New Jersey: Prentice Hall.
Sakamoto, M., Y. Takeuchi, M. Umeda, I. Ishikawa, and Y. Benno. 2003. Application of terminal RFLP analysis to characterize oral bacterial flora in saliva of healthy subjects and patients with periodontitis.
Journal of Medical Microbiology. 52:79-89.
Schwarzenbach, K., J. Enkerli, and F. Widmer. 2006. Objective criteria to assess representativity of soil fungal community profiles. Journal of Microbiological Methods. 68(2):358-366.
Schwieger, F. and C.C. Tebbe. 1998. A new approach to utilize PCR single-strand-conformation polymorphism for 16S rRNA gene-based microbial community analysis. Applied and Environmental Microbiology. 64:4870-4876.
Soil Survey Division Staff. 1993. Soil survey manual. Soil Conservation Service. US Department of Agriculture Handbook 18.
Spaink, H.P., R.J.H. Okker, C.A. Wijffelman, E. Pees, and B.J.J. Lugtenberg. 1987. Promoters in the nodulation region of Rhizobium leguminosarum Sym plasmid pRL1JI. Plant Molecular Biology. 9(1):27-39.
Smith, C.J., B.S. Danilowicz, A.K. Clear, F.J. Costello, B. Wilson, and W.G. Meijer. 2005. T-Align, a web-based tool for comparison of multiple terminal restriction fragment length polymorphism profiles. FEMS Microbiology Letters. 54(3):375-380.
Sugita, R. and Y. Marumo. 2001. Screening of soil evidence by a combination of simple techniques: validity of particle size distribution. Forensic Science International. 122:155-158.
Tan, Z., T. Hurek, and B. Reinhold-Hurek. 2003. Effect of N-fertilization, plant genotype, and environmental conditions on nifH gene pools in roots of rice. Environmental Microbiology. 5(10):1009-1015.
Torsvik, V. and L. Ovreas. 2002. Microbial diversity and function in soil: from genes to ecosystems. Current Opinion in Microbiology. 5(3):240-245.
T-RFLP Profile Matrix. 1998. The Ribosomal Database Project. <http://rdp8.cme.msu.edu/cgis/trflp.cgi?su=SSU>. Accessed 2008 April 24.
Federal Research Division of the Library of Congress. 1998. Country Studies/Area Handbook Series. <http://countrystudies.us/united-states/weather/michigan/lansing.htm>. Accessed 2008 June 28.
Van der Maarel, M.J.E.C., R.R.E. Artz, R. Haanstra, and L.J. Forney. 1998.
Association of marine archaea with the digestive tracts of two marine fish species. Applied and Environmental Microbiology. 64:2894-2898.
Wanogho, S., G. Gettinby, B. Caddy, and J. Robertson. 1989. Determination of particle size distribution of soils in forensic science using classical and modern instrumental methods. Journal of Forensic Sciences. 34(4):823-835.
Watson, J.M. and P.R. Schofield. 1985. Species-specific, symbiotic plasmid-located repeated DNA sequences in Rhizobium trifolii. Molecular and General Genetics MGG. 199(2):279-289.
Weir, B.S. 2006. The current taxonomy of rhizobia. Accessed 2008 March 12.
Widmer, F., A. Fliessbach, E. Laczko, J. Schulze-Aurich, and J. Zeyer. 2001. Assessing soil biological characteristics: a comparison of bulk soil community DNA-, PLFA-, and Biolog™-analyses. Soil Biology & Biochemistry. 33:1029-1036.
Williams, J., R.E. Prebble, W.T. Williams, and C.T. Hignett. 1983. The influence of texture, structure, and clay mineralogy on the soil moisture characteristic. Australian Journal of Soil Research. 21:15-32.
Wolsing, M. and A. Priemé. 2004. Observation of high seasonal variation in community structure of denitrifying bacteria in arable soil receiving artificial fertilizer and cattle manure by determining T-RFLP of nir gene fragments. FEMS Microbiology Ecology. 48(2):261-271.
Wu, T., D.O. Chellemi, K.J. Martin, J.H. Graham, and E.N. Rosskopf. 2007. Discriminating the effects of agricultural land management practices on soil fungal communities. Soil Biology and Biochemistry. 39(5):1139-1155.
Wulf, A., K. Manthey, J. Doll, A.M. Perlick, B. Linke, T. Bekel, F. Meyer, P. Franken, H. Kuster, and F. Krajinski. 2003. Transcriptional changes in response to Arbuscular Mycorrhiza development in the model plant Medicago truncatula. Molecular Plant-Microbe Interactions. 16(4):306-314.
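
Supplementary sketch. The indicative-peak criteria stated in the captions of Tables 5 through 7 (a peak is indicative of a habitat when found with at least 75% frequency, and strongly indicative when, in addition, every other habitat shows it 25% of the time or less) can be illustrated with a few lines of Python. This is only a minimal sketch under those stated thresholds: the function name, constant names, and data layout are hypothetical and are not taken from the study, which does not describe any code. The example rows are copied from Table 5.

```python
from typing import Dict, List, Optional, Tuple

# Thresholds as stated in the captions of Tables 5-7 (percent frequency).
INDICATIVE_MIN = 75.0  # peak counts as indicative for a habitat at or above this
OTHERS_MAX = 25.0      # all other habitats must be at or below this for "strongly"


def classify_peak(freqs: Dict[str, float]) -> Tuple[List[str], Optional[str]]:
    """Classify one TRF peak from its per-habitat frequencies (percent).

    Returns (habitats for which the peak is indicative,
             habitat for which it is strongly indicative, or None).
    Hypothetical helper for illustration only.
    """
    indicative = [h for h, f in freqs.items() if f >= INDICATIVE_MIN]
    strongly = None
    if len(indicative) == 1:
        target = indicative[0]
        others = [f for h, f in freqs.items() if h != target]
        if all(f <= OTHERS_MAX for f in others):
            strongly = target
    return indicative, strongly


if __name__ == "__main__":
    # Two rows from Table 5 (DpnII): TRF 143.38 bp and TRF 49.94 bp.
    row_143 = {"Field": 8.3, "Marsh": 0.0, "Yard": 0.0,
               "Sandy Woodlot": 75.0, "Woodlot": 0.0}
    row_49 = {"Field": 100.0, "Marsh": 81.8, "Yard": 91.7,
              "Sandy Woodlot": 58.3, "Woodlot": 66.7}
    print(classify_peak(row_143))  # (['Sandy Woodlot'], 'Sandy Woodlot')
    print(classify_peak(row_49))   # (['Field', 'Marsh', 'Yard'], None)
```

Under these assumptions, the 143.38 bp peak is flagged as strongly indicative of the sandy woodlot, while the 49.94 bp peak is indicative of three habitats at once and so cannot be strongly indicative of any of them.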