Michigan State University

This is to certify that the thesis entitled ASSOCIATION AND DISCRIMINATION OF DIESEL FUELS USING CHEMOMETRIC PROCEDURES FOR FORENSIC ARSON INVESTIGATIONS, presented by LUCAS JAMES MARSHALL, has been accepted towards fulfillment of the requirements for the M.S. degree in Criminal Justice.

ASSOCIATION AND DISCRIMINATION OF DIESEL FUELS USING CHEMOMETRIC PROCEDURES FOR FORENSIC ARSON INVESTIGATIONS

By

Lucas James Marshall

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

School of Criminal Justice

2008

ABSTRACT

The identification of an ignitable liquid in fire debris is indicative of an intentional fire and hence is significant evidence in arson investigations. Currently, gas chromatography-mass spectrometry (GC-MS) is the conventional analytical technique used for the identification of ignitable liquids through chromatographic pattern matching.
Chemometric procedures such as Pearson Product Moment Correlation (PPMC) coefficients and Principal Components Analysis (PCA) provide a more objective method to statistically associate and discriminate burned and unburned diesel fuels based on chemical composition in both the total ion chromatogram (TIC) and the extracted ion profiles (EIP) corresponding to characteristic compound classes in the diesel samples. Data pre-treatment options, such as retention time alignment and area normalization, were also investigated in order to determine their effects on the chemometric results.

The association and discrimination of burned and unburned diesels were also examined. Diesels were spiked onto different matrices commonly found in the home (cotton cloth, magazine, and carpet) and burned in order to simulate arson conditions. The burned diesels were extracted using a solvent extraction procedure and analyzed by GC-MS. The data generated from the burnings were compiled into the same set as the data generated from the neat diesels so that PPMC and PCA could be applied to the entire data set. The potential for the association and discrimination of the burned diesels using these procedures was investigated.

ACKNOWLEDGMENTS

I would first and foremost like to thank my family and my fiancée Molly for always being there to support me. It hasn't been easy, but they have continued to encourage me throughout the process, and without them, I would never have made it this far.

I would also like to thank Dr. Ruth Smith for the opportunity to work with her as a forensic chemistry student. She is not only a top-notch scientist, but she has also become my friend. I know that I am lucky to have had daily contact with my advisor, but again, without her, none of this work would be possible. She has constantly motivated me to take the project to the next level, while along the way challenging me to think and do for myself.

I would also like to thank Dr.
Victoria McGuffin for her presence and advice throughout this collaborative project. She always seemed to have the answers when I needed help, and without her I would not be where I am as a confident scientist. I would also like to thank Dr. Vince Hoffman for taking time out of his busy schedule to sit on my thesis committee, and Dr. Kathy Severin for instrument time and assistance.

Finally, I would like to thank the forensic science program at MSU, as well as all the people in it. I have made some wonderful friends here, especially in the forensic chemistry section and the McGuffin group. To Dr. Melissa Meaney, Amber Hupp, Sarah Meisinger, Lisa LaGoo, Ruth Udey, Jamie Baerncopf, Patty Joiner, and John McIlroy, I thank you. You guys have made my time here so much fun, and I will never forget you!

TABLE OF CONTENTS

List of Tables
List of Figures
Chapter 1 - Introduction
  1.1 Arson
  1.2 Ignitable Liquids
  1.3 Analysis of Fire Debris
  1.4 Current Literature Review
    1.4.1 Statistical and Chemometric Analysis of Petroleum Distillates
    1.4.2 Chromatographic Pre-processing
    1.4.3 Burn Conditions and Matrix Interferences
  1.5 Research Objectives
Chapter 2 - Analytical and Chemometric Theory
  2.1 Gas Chromatography-Mass Spectrometry
  2.2 Data Pre-treatment
    2.2.1 Retention Time Alignment
    2.2.2 Area Normalization
    2.2.3 Mean-Centering
  2.3 Pearson Product Moment Correlation Coefficients
  2.4 Principal Components Analysis
Chapter 3 - Association and Discrimination of Neat Diesels Using PPMC Coefficients and PCA
  3.1 Introduction
  3.2 Sample Collection and Analysis
  3.3 Data Pre-treatment
  3.4 Data Analysis
  3.5 Results and Discussion
    3.5.1 Initial Data Set
    3.5.2 Data Set using Timed Injection
    3.5.3 Improving Abundance Levels of TIC and EICs
    3.5.4 Final Neat Data Set
      3.5.4.1 RSD Values
      3.5.4.2 PPMC Coefficients
      3.5.4.3 PCA
Chapter 4 - Association and Discrimination of Burned Diesels Extracted from Fire Debris Using PPMC Coefficients and PCA
  4.1 Introduction
  4.2 Analysis of New Set of Diesels
    4.2.1 Procedure
    4.2.2 Results and Discussion
  4.3 Efficiency of Solvent Extraction Procedure
    4.3.1 Procedure
    4.3.2 Results and Discussion
  4.4 Analysis of Burned Diesels
    4.4.1 Procedure
    4.4.2 Results and Discussion
      4.4.2.1 Assessment of Potential Interferences from Unburned and Burned Matrices
      4.4.2.2 Solvent Extraction of Diesels from Unburned and Burned Matrices
      4.4.2.3 PPMC Coefficients of Neat and Burned Diesels
      4.4.2.4 PCA of Neat and Burned Diesels
      4.4.2.5 Identification of Blind Diesels
Chapter 5 - Conclusions and Future Work
  5.1 Conclusions
    5.1.1 Association and Discrimination of Neat Diesels Using PPMC Coefficients and PCA
    5.1.2 Association and Discrimination of Burned Diesels Extracted from Fire Debris Using PPMC Coefficients and PCA
  5.2 Future Work
Appendix A - ASTM Classification Scheme
Appendix B - PPMC Coefficients for the TIC and EIPs of Ten Neat Diesels Analyzed in Triplicate
Appendix C - PPMC Coefficients for the TIC and EIPs of Five Neat Diesels Analyzed in Triplicate
Appendix D - PPMC Coefficients for the TIC and EIPs of Compiled Neat and Burned Diesels
References

LIST OF TABLES

Table 1.1 Major Ions Included in Extracted Ion Profiles for Ignitable Liquids [12]
Table 3.1 Diesel Collection Log
Table 3.2 RSD Values for Eight Peaks Across the Retention Time Range for Ten Diesel Samples Analyzed in Triplicate
Table 3.3 PPMC Coefficients for Replicate Sets of Ten Diesel Samples Analyzed in Triplicate
Table 3.4 RSD Values for Eight Peaks Across the Retention Time Range for Ten Diesel Samples Analyzed in Triplicate Using a Timed Injection Method
Table 3.5 PPMC Coefficients for Replicate Sets of Five Diesel Samples Analyzed in Triplicate Using a Timed Injection Method
Table 3.6 PPMC Coefficients for Individual m/z Values and Their Summed EIPs for the Alkane, Aromatic, and Indane Profiles
Table 3.7 RSD Values for Eight Peaks Across the Retention Time Range for Final Neat Diesel Data Set Analyzed in Triplicate Using a Timed Injection Method
Table 3.8 PPMC Coefficients for TIC and EIPs for Replicate Sets of Ten Diesel Samples (Final Data Set)
Table 4.1 Diesel Collection Log
Table 4.2 RSD Values for Eight Peaks Across the Retention Time Range for Five Diesel Samples Analyzed in Triplicate
Table 4.3 PPMC Coefficients for the TIC of Replicate Sets of Five Diesel Samples Analyzed in Triplicate
Table 4.4 PPMC Coefficients for TIC and EIPs for Replicate Sets of Five Diesel Samples
Table 4.5 Extraction Recoveries from Cloth for Each Calibration Standard (1,3,5-TMB = 1,3,5-trimethylbenzene, C10 = decane, C12 = dodecane, C14 = tetradecane)
Table 4.6 Extraction Recoveries from Magazine for Each Calibration Standard (1,3,5-TMB = 1,3,5-trimethylbenzene, C10 = decane, C12 = dodecane, C14 = tetradecane)
Table 4.7 Extraction Recoveries from Carpet for Each Calibration Standard (1,3,5-TMB = 1,3,5-trimethylbenzene, C10 = decane, C12 = dodecane, C14 = tetradecane)
Table A.1 ASTM Ignitable Liquid Classification Scheme [4,5]
Table A.2 Chromatographic and Mass Spectral Characteristics of ASTM Ignitable Liquid Classes [4,5]
Table B.1 PPMC Coefficients for the TIC of Triplicate Analyses of Diesels 1-10
Table B.2 PPMC Coefficients for the Alkane EIP of Triplicate Analyses of Diesels 1-10
Table B.3 PPMC Coefficients for the Aromatic EIP of Triplicate Analyses of Diesels 1-10
Table B.4 PPMC Coefficients for the Indane EIP of Triplicate Analyses of Diesels 1-10
Table B.5 PPMC Coefficients for the OCP EIP of Triplicate Analyses of Diesels 1-10
Table B.6 PPMC Coefficients for the PNA EIP of Triplicate Analyses of Diesels 1-10
Table C.1 PPMC Coefficients for the TIC of Triplicate Analyses of Diesels 21-25
Table C.2 PPMC Coefficients for the Alkane EIP of Triplicate Analyses of Diesels 21-25
Table C.3 PPMC Coefficients for the Aromatic EIP of Triplicate Analyses of Diesels 21-25
Table C.4 PPMC Coefficients for the Indane EIP of Triplicate Analyses of Diesels 21-25
Table C.5 PPMC Coefficients for the OCP EIP of Triplicate Analyses of Diesels 21-25
Table C.6 PPMC Coefficients for the PNA EIP of Triplicate Analyses of Diesels 21-25
Table D.1 PPMC Coefficients for the TIC of Neat and Burned Diesels (Cloth A-C = Replicates of Burned Diesel 21 from Cloth, Mag A-C = Replicates of Burned Diesel 21 from Magazine, Car A-C = Replicates of Burned Diesel 21 from Carpet, B-A = Blind A, B-B = Blind B, C1 = Cloth, Mag = Magazine, Car = Carpet)
Table D.2 PPMC Coefficients for the Alkane EIP of Neat and Burned Diesels (Cloth A-C = Replicates of Burned Diesel 21 from Cloth, Mag A-C = Replicates of Burned Diesel 21 from Magazine, Car A-C = Replicates of Burned Diesel 21 from Carpet, B-A = Blind A, B-B = Blind B, C1 = Cloth, Mag = Magazine, Car = Carpet)
Table D.3 PPMC Coefficients for the Aromatic EIP of Neat and Burned Diesels (Cloth A-C = Replicates of Burned Diesel 21 from Cloth, Mag A-C = Replicates of Burned Diesel 21 from Magazine, Car A-C = Replicates of Burned Diesel 21 from Carpet, B-A = Blind A, B-B = Blind B, C1 = Cloth, Mag = Magazine, Car = Carpet)
Table D.4 PPMC Coefficients for the Indane EIP of Neat and Burned Diesels (Cloth A-C = Replicates of Burned Diesel 21 from Cloth, Mag A-C = Replicates of Burned Diesel 21 from Magazine, Car A-C = Replicates of Burned Diesel 21 from Carpet, B-A = Blind A, B-B = Blind B, C1 = Cloth, Mag = Magazine, Car = Carpet)
Table D.5 PPMC Coefficients for the OCP EIP of Neat and Burned Diesels (Cloth A-C = Replicates of Burned Diesel 21 from Cloth, Mag A-C = Replicates of Burned Diesel 21 from Magazine, Car A-C = Replicates of Burned Diesel 21 from Carpet, B-A = Blind A, B-B = Blind B, C1 = Cloth, Mag = Magazine, Car = Carpet)
Table D.6 PPMC Coefficients for the PNA EIP of Neat and Burned Diesels (Cloth A-C = Replicates of Burned Diesel 21 from Cloth, Mag A-C = Replicates of Burned Diesel 21 from Magazine, Car A-C = Replicates of Burned Diesel 21 from Carpet, B-A = Blind A, B-B = Blind B, C1 = Cloth, Mag = Magazine, Car = Carpet)

Images in this thesis are presented in color.

LIST OF FIGURES

Figure 2.1 The Effects of a Retention Time Alignment Algorithm on the Decane Peak of Diesels 1-5
Figure 2.2 The Effects of Area Normalization on the Decane Peak of Diesels 1-5
Figure 2.3 The Application of Mean-Centering to the Decane Peak of Diesels 1-5
Figure 3.1 Distribution of Select Peak Areas Among Replicate Injections of Diesel 9
Figure 3.2 Scores Plot - TIC (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3
Figure 3.3 Scores Plot - Alkane EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3
Figure 3.4 Scores Plot - Indane EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3
Figure 3.5 Scores Plot - PNA EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3
Figure 3.6 Scores Plot - Aromatic EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3
Figure 3.7 Scores Plot - OCP EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3
Figure 3.8 Loadings Plots of First Two Eigenvectors from PCA of the TICs of Ten Diesels Analyzed in Triplicate
Figure 3.9 Loadings Plots of First Two Eigenvectors from PCA of the Alkane EIPs of Ten Diesels Analyzed in Triplicate
Figure 3.10 Misaligned Peak in the Alkane EIPs of Diesel 1 Replicates
Figure 4.1 Scores Plot for the TIC of the Neat Diesels
Figure 4.2 Scores Plot for the Alkane EIP of the Neat Diesels
Figure 4.3 Scores Plot for the Aromatic EIP of the Neat Diesels
Figure 4.4 Scores Plot for the Indane EIP of the Neat Diesels
Figure 4.5 Scores Plot for the OCP EIP of the Neat Diesels
Figure 4.6 Scores Plot for the PNA EIP of the Neat Diesels
Figure 4.7 Calibration Curve for the Neat Calibration Standards for Cloth Matrix Recovery Study
Figure 4.8 Calibration Curve for the Neat Calibration Standards for Magazine Matrix Recovery Study
Figure 4.9 Calibration Curve for the Neat Calibration Standards for Carpet Matrix Recovery Study
Figure 4.10 Representative Chromatograms of Unburned Matrices: (a) Cloth, (b) Magazine, and (c) Carpet
Figure 4.11 Representative Chromatograms of Burned Matrices: (a) Cloth, (b) Magazine, and (c) Carpet
Figure 4.12 Representative Chromatograms for Unburned Matrices Spiked with Diesel 21: (a) Cloth, (b) Magazine, and (c) Carpet
Figure 4.13 Representative Chromatograms for Burned Matrices Spiked with Diesel 21: (a) Cloth, (b) Magazine, and (c) Carpet
Figure 4.14 Representative Chromatograms for Matrices Spiked with Diesel 21 and then Burned: (a) Cloth, (b) Magazine, and (c) Carpet
Figure 4.14 Scores Plot for the TIC of the Neat and Burned Diesels
Figure 4.15 Representative PCA Scores Plots for (a) TIC and (b) Alkane EIP for the Burned Data Scored with the Eigenvectors from the Neat Data
Figure 4.16 Scores Plot for the TIC of the Neat Diesels and the Projected Burned Diesels After a 10X Correction
Figure 4.17 Scores Plot for the Alkane EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction
Figure 4.18 Scores Plot for the Aromatic EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction
Figure 4.19 Scores Plot for the Indane EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction
Figure 4.20 Scores Plot for the OCP EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction
Figure 4.21 Scores Plot for the PNA EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction

Chapter 1
Introduction

1.1 Arson

Arson is defined as the deliberate setting of a fire with the malicious intent to cause damage [1]. The United States Fire Administration (USFA) estimates that more than 30,000 intentional structural fires were set in 2006, which resulted in more than 300 civilian deaths [2]. These intentional fires were reported to have caused approximately $755 million in property damage. An estimated 20,000 vehicle fires were also set, which caused an additional $134 million in damages.

Arson is a destructive and expensive crime, yet only a small number of cases ever result in an arrest or a conviction. The Federal Bureau of Investigation (FBI) reports that in 2002 only 16.5% of arson cases were closed [3].
This low rate of closure demonstrates the difficulties encountered in arson investigations and fire debris analysis, and indicates that research to improve the methods in place is essential for advancements in this field.

1.2 Ignitable Liquids

Ignitable liquids are commonly used as accelerants in arson cases. The presence of an ignitable liquid, which is used to increase the rate and spread of a fire, may indicate that the fire was intentionally set. Ignitable liquids frequently used as accelerants include petroleum distillates, such as gasoline, kerosene, and diesel fuel, and other hydrocarbon products, such as aromatic products and oxygenated solvents.

The American Society for Testing and Materials (ASTM) has developed a classification scheme for ignitable liquids based on chemical composition [4,5]. Nine classes are differentiated: (1) gasoline (all brands, including gasohol), (2) petroleum distillates, (3) isoparaffinic products, (4) aromatic products, (5) naphthenic paraffinic products, (6) normal alkanes products, (7) de-aromatized distillates, (8) oxygenated solvents, and (9) others (miscellaneous). These classes are further divided into three subclasses based on the range of carbon content present in the liquid. The first subclass covers the light carbon range of C4-C9, with no components present above C11. The medium carbon range includes C8-C13 compounds, with none present below C7 or above C14. The heavy carbon range includes compounds at C9 and above, with a typically observed range of C9-C23. It should be noted that these demarcations are not rigid, and that samples are often reported as "light to medium" or "medium to heavy" when the observed carbon range fits both profiles. The details of the ASTM classification scheme, including examples of common ignitable liquids that are representative of each class, can be found in Appendix A [4,5].
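As an illustration only, the carbon-range subclasses described above can be written as a small lookup. This is a minimal sketch, not part of the ASTM standard or of this research: the names `Subclass`, `SUBCLASSES`, and `fitting_subclasses` are hypothetical, the light-range ceiling is assumed to be C11, and the outer limits are treated as hard cut-offs even though the text notes that the demarcations are not rigid.

```python
from typing import List, NamedTuple, Optional


class Subclass(NamedTuple):
    name: str
    lowest: int             # no components expected below this carbon number
    highest: Optional[int]  # no components expected above it; None = open-ended


# Outer limits as quoted in the text (assumption: light range ends at C11).
SUBCLASSES = (
    Subclass("light", 4, 11),
    Subclass("medium", 7, 14),
    Subclass("heavy", 9, None),
)


def fitting_subclasses(lowest_c: int, highest_c: int) -> List[str]:
    """Return every subclass whose limits contain the observed carbon range."""
    if lowest_c > highest_c:
        raise ValueError("lowest_c must not exceed highest_c")
    return [
        s.name
        for s in SUBCLASSES
        if lowest_c >= s.lowest and (s.highest is None or highest_c <= s.highest)
    ]


# A sample spanning C8-C10 fits two profiles and would be reported as such.
print(" to ".join(fitting_subclasses(8, 10)))   # light to medium
print(" to ".join(fitting_subclasses(10, 13)))  # medium to heavy
```

An empty result means the observed range fits none of the simplified profiles, which in practice would call for the analyst's judgment rather than an automatic label.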
1.3 Analysis of Fire Debris

At a fire scene, ignitable liquid residues (ILRs) are most likely to be found in the area where the fire started, known as the origin. ILRs are also commonly found in pour patterns known as trailers, which occur when an ignitable liquid is poured in a continuous stream from room to room in order to force the fire to spread throughout the entire structure. For evidence collection, the debris items most likely to contain ILRs are those with porous surfaces, because the liquid may soak into them and thus be protected to some extent from the heat of the fire. Similar debris types that are not suspected to contain ILRs are collected as control samples, which are necessary to prove that a detected ILR is not a natural component or pyrolysis product of the debris matrix itself.

The laboratory analysis of fire debris encompasses the identification of both suspected neat liquids and ILRs extracted from burned and unburned substrates in the debris. The current protocols in forensic laboratories follow the ASTM standards that have been established and are maintained by technical subcommittee E30.01, the criminalistics subcommittee of the committee on forensic sciences [6]. ASTM standards are in place for both the analysis of ignitable liquids and the extraction of ILRs from debris.

A suspected neat ignitable liquid is simply diluted in an acceptable solvent and then analyzed by gas chromatography-mass spectrometry (GC-MS). The analysis of fire debris for the presence of ILRs is similar to the analysis of a neat liquid. The difference arises in the need to extract the ILR from the debris before it can be analyzed by GC-MS. A variety of extraction techniques can be used, though each has its advantages and disadvantages.
Five common extractions are listed by ASTM for the separation of ILRs from debris: steam distillation, solvent extraction, passive headspace concentration with activated charcoal, dynamic headspace concentration, and passive headspace concentration with solid-phase microextraction (SPME) [7-11]. Currently, the most frequently used extraction technique in crime labs is passive headspace concentration with activated charcoal strips, though solvent extractions are still valuable in some cases.

The ASTM standard E1412-07 details the procedure for the passive headspace extraction of ILRs from fire debris using activated charcoal [9]. The activated charcoal is suspended inside the submitted evidence container, and the container is resealed. The container is placed in an oven and heated to a temperature of 50-80 °C for 2-24 hours. The heating temperature and time are sample dependent, as higher temperatures and longer durations may be necessary to promote sufficient adsorption of less volatile compounds in the debris. After the adsorption step, the activated charcoal is washed with 50-1000 µL of an appropriate elution solvent, typically carbon disulfide, n-pentane, or diethyl ether. The eluate is collected and then analyzed by GC-MS. This procedure is capable of extracting ILRs across a wide range of concentrations with an extremely high level of sensitivity. It is considered a nondestructive technique for the extraction of ILRs from fire debris.

Solvent extractions, though less frequently used, can provide significant additional information when activated charcoal extractions are limited by volatility. Solvent extractions offer a more complete extraction that is representative of the entire range of compounds in a sample, whereas activated charcoal extractions are biased against less volatile compounds and those that are not selectively adsorbed onto the carbon strip. The ASTM standard E1386-00(2005) details the specifics of solvent extractions [8].
A suitable organic solvent, typically carbon disulfide, pentane, petroleum ether, or diethyl ether, is added to the debris. The solvent-debris mixture is thoroughly agitated for approximately one minute. The solvent is poured off and then filtered if particulates remain in the decantate. The extract is evaporated to approximately 1 mL and then analyzed by GC-MS. This technique is very sensitive, and is especially useful for the extraction of nonporous surfaces or when a small amount of sample needs to be extracted. The major disadvantage of solvent extraction is the concurrent extraction of interferences from the burned matrix. It must also be noted that some of the lighter classes of ignitable liquids may be lost in the evaporation phase of the procedure. The ASTM standard suggests that, because solvent extraction is a destructive technique. it only be used in tandem with another extraction procedure. For example, if the results of an extraction with a charcoal strip indicate a heavy petroleum distillate, then a solvent extraction may also be performed in order to overcome the volatility bias in the charcoal strip extraction and to obtain a more representative chromatogram. Both neat ignitable liquids and ILRs extracted from fire debris are analyzed by GC-MS per ASTM Standard El6l8-O6el [4]. The GC-MS is operated under conditions that are capable of adequately separating a test mixture consisting of common ignitable liquid constituents. The typical test mixture contains the even-numbered normal alkanes from C3 to C20, as well as toluene, p-xylene, o-ethyltoluene, m-ethyltoluene, and 1,2,4- trimethylbenzene. This test solution is usually prepared at a concentration of 0.005% (v/v) per component to ensure the sensitivity of the instrument [4,5]. Once a chromatogram has been obtained of neat ignitable liquid or extracted ILR. 
the pattern of its peaks and the relative peak ratios are used to determine whether or not it is consistent with a common ignitable liquid. In order for an ignitable liquid to be identified based on the aforementioned ASTM classification scheme, certain criteria in the chromatographic patterns must be met for each ignitable liquid class [4,5]. Information about the specific characteristics observed in the chromatographic profiles for each defined ASTM class can be found in Appendix A [4,5]. In addition to the chromatographic patterns, the mass spectral data are also used to identify the questioned sample. Extracted ion profiles (EIP) are generated for specific compound classes that are common to ignitable liquids. Table 1.1 lists the major ions that are used to create these summed profiles [12]. The patterns of the EIPs are also utilized in the identification of the questioned sample.

Table 1.1 Major Ions Included in Extracted Ion Profiles for Ignitable Liquids [12]

Compound Class                  Ion Mass-to-Charge (m/z) Ratio
Alkane                          43, 57, 71, 85
Cycloalkane and alkene          55, 69
n-Alkylcyclohexane              82, 83
Aromatic                        91, 105, 119; 92, 106, 120
Indane                          117, 131; 118, 132
Alkylnaphthalene                128, 142, 156, 170
Alkylstyrene                    104, 117, 118, 132, 146
Alkylanthracene                 178, 192, 206
Alkylbiphenyl/acenaphthene      154, 168, 182, 196
Monoterpene                     93, 136
Ketone                          43, 58, 72, 86
Alcohol                         31, 45

The method described above for the identification of ignitable liquids is subjective in nature, and is often based on the analyst’s experience. The pattern recognition is not statistically based and frequently is not straightforward [4,5]. The questioned sample and reference samples used for comparison seldom correlate perfectly. In fact, the chromatograms of the questioned samples can be skewed due to weathering or insufficient recovery. Intense heat exposure from the fire can cause the loss of more volatile components at the beginning of the chromatogram.
Interferences from the pyrolysis of substrate materials indigenous to the collected fire debris can create additional peaks in a questioned chromatographic profile. A fire debris analyst must be aware of these possible complications when making ignitable liquid identifications in order to prevent false positive identifications.

1.4 Current Literature Review

Fire debris analysis and ignitable liquid characterization are well represented in the literature [13-25]. Because the current method of ignitable liquid identification is subjective, research into more objective methods, such as association and discrimination by chemometric procedures, is essential for the advancement of fire debris analysis. Current research primarily discusses the utility of multivariate statistical methods in the association and discrimination of ignitable liquids [16-20, 23]. The importance of data processing prior to chemometric analysis is also discussed [26,27]. In addition to chemometrics, interferences and burning conditions have also been studied to determine their effects on fire debris analysis [28-32]. Although gasoline is the most commonly encountered ignitable liquid in arson cases [13] and most often discussed in the literature, diesel fuel was selected as the ignitable liquid for this research project. Diesel was chosen because it is a complex petroleum distillate with a number of components that span the boiling point range, and it is less frequently discussed in the literature.

1.4.1 Statistical and Chemometric Analysis of Petroleum Distillates

Statistical comparisons of chromatograms produced from the analysis of various ignitable liquids have been limited to an evaluation of peak area ratios of specific components or to chemometric procedures performed on small sections of the chromatogram. Mann employed a method of comparing peak area ratios for eight components from the n-pentane to n-heptane region of the chromatogram to successfully discriminate gasoline samples [13].
Barnes et al. utilized a similar approach in which normalized peak area ratios of various aliphatic and aromatic constituents present in the headspace (e.g., substituted cycloalkanes, alkylbenzenes, and alkylnaphthalenes) were used to associate unevaporated gasoline samples to those from the same source that had been evaporated to 75% and 50% [14]. More recently, Sigman and co-workers described a statistical technique called covariance mapping in which gasoline samples could be distinguished based on distance metrics calculated between covariance matrices of replicates of both the same and different samples [15]. It should be noted that in a blind study of two unknown samples from a set of ten, one of the two was determined to be statistically different from its known source. It was suggested that evaporation of the sample between analyses could explain this Type I error. However, the presence of a Type I error, which is defined as a difference identified by the technique when one is not actually present, significantly limits the utility of the proposed identification methodology.

More advanced multivariate statistical techniques have also been utilized to associate and discriminate ignitable liquid samples. Sandercock and Du Pasquier used chemometric procedures such as linear discriminant analysis (LDA) and principal components analysis (PCA) on GC data collected using selected ion monitoring (SIM) for the C0-C2 naphthalene components in order to group similar gasoline samples based on grade, country of origin, and the season in which the sample was collected [16,17]. They employed similar methods to link evaporated samples to unevaporated samples [18]. Tan et al. utilized PCA and a soft independent modeling of class analogy (SIMCA) approach to differentiate GC-MS-SIM data of various ignitable liquids spiked onto wood and carpet as background matrices [19].
Successful classification was achieved for all ignitable liquids in the presence of wood and carpet using the developed SIMCA model. Doble et al. demonstrated the use of artificial neural network (ANN) algorithms to successfully classify 88 gasoline samples as either regular or premium grade based on the percent peak area of 44 compounds that were identified in all samples [20]. Similar ANN algorithms were also reported to distinguish the gasoline samples based on their season of collection with a 97% success rate.

While the majority of the forensic literature focuses on the identification and classification of gasoline samples as ignitable liquids, other more complex petroleum distillates such as diesel fuel have been discussed in the environmental science literature with respect to the association of samples collected from oil spills to their likely source. Environmental literature tends to focus on the use of unique classes of compounds in diesels in order to discriminate samples. In two recent studies, Wang et al. researched the ability of the sesquiterpane and diamondoid compound classes to discriminate diesel samples by analyzing more than 100 crude oils and refined products by GC-MS [21,22]. Diagnostic peak area ratios of several compounds from both the sesquiterpane and diamondoid classes were determined in an effort to identify the ratios that would provide the most discriminatory information. Ultimately these diagnostic ratios were effectively applied to a case in which the source of an oil spill was determined based on these two compound classes alone. The discriminatory potential of the sesquiterpane class was further investigated by Gaines et al., who analyzed fourteen diesel samples by GC-MS and generated extracted ion chromatograms (EIC) for 22 different characteristic compound classes [23].
The peak ratios of several compounds within each of the 22 classes were determined and PCA was performed in order to select the ratios that generated maximum discrimination among samples. Results illustrated that full discrimination of all samples was possible with only nine peak ratios from the alkylbenzene, alkylphenanthrene, and sesquiterpane classes, and two samples from an actual oil spill were differentiated by the model developed.

More novel analytical methods have also been discussed in oil spill identification. Another study by Gaines et al. employed a qualitative and quantitative GC x GC approach to analyze a controlled oil spill [24]. By comparing two-dimensional ordered chromatograms, they observed that the patterns of alkanes, cycloalkanes, alkylbenzenes, alkylnaphthalenes, and anthracenes/phenanthrenes were useful in comparing the oil spill to one of its two potential sources. Quantitatively, four panels of integrated peak area ratios of the abovementioned compound classes were used to compare samples. Using both these methods a controlled oil spill was correctly matched to one of two potential sources. A study by Wang et al. showed a novel procedure for the visualization of GC-MS data as a two-dimensional separation that is valuable in specific compound class separation when a nonfragmentation ionization technique is used [25]. This method was demonstrated on complex petroleum fractions, so that even compound classes at low abundances can be utilized for characterization.

1.4.2 Chromatographic Pre-processing

The need for pre-treatment of the chromatographic data prior to statistical and chemometric analysis has been more recently discussed in the literature, although it has yet to be applied specifically in a forensic science context [26]. Some pre-treatment processes, such as area normalization and mean-centering prior to PCA, have long been demonstrated [27].
Another more significant pre-treatment is that of retention-time alignment of chromatograms to correct retention time drift anomalies caused by column degradation and random fluctuations in analysis conditions. Retention time alignment is an important step prior to PCA, since PCA maximizes the variance among a sample set. If the sample chromatograms contain multiple shifts in retention time, the PCA will focus on those differences in the samples instead of actual chemical variation. Johnson et al. proposed a peak matching algorithm for the retention time alignment of GC peaks in order to reduce retention time shifts among several chromatograms [26]. This approach applies an estimation of the first derivative throughout the chromatograms and searches for zero crossings to identify peaks in a specified target chromatogram and the sample chromatograms. Peaks in the samples are matched to those in the target within a set window size, and the retention time axis is then interpolated to include either more or fewer data points so that the retention times of the peaks in the sample will be equal to those in the target. The authors demonstrated the utility of this approach by analyzing a set of diesel samples by GC with a flame ionization detector (FID) and then subjecting them to the proposed algorithm followed by PCA. The unaligned set of 60 chromatograms exhibited differences in retention times up to 300 ms, but after alignment, the variation was only 17 ms with a relative standard deviation of 0.003%. PCA demonstrated that the alignment did not cause a decrease in chemical selectivity, but instead the resolution of clusters of different samples was shown to be greatly improved after alignment [26].

1.4.3 Burn Conditions and Matrix Interferences

The analysis of ILRs extracted from fire debris is not straightforward. In fact, the burning conditions and contributions from the burned matrix can hinder chemometric analysis.
Artifacts present in the chromatogram from sample weathering or matrix interferences may cause samples that are actually similar to be difficult to associate because of the extraneous components. These issues have been moderately discussed in the literature, though most studies only look at the matrices alone without any ignitable liquid present, much less an ignitable liquid that has been burned as would be the case in an actual arson situation [28-32].

Bertsch examined potential interferences from carpet samples in the identification of ILRs, specifically gasoline, from fire debris [28]. It was observed that when carpets and carpet paddings were burned, some amounts of alkylbenzenes and naphthalenes were produced. These same compounds are typically used as markers in the identification of gasoline. It was determined, however, that by observing EICs of the C2-C5 alkylbenzenes (m/z 106, 120, 134, and 148) and the methylnaphthalene isomers (m/z 142), the relative amounts of these compounds observed in gasoline can be distinguished from those produced by the pyrolysis of carpet. It was noted, however, that when the carpet has been freshly manufactured some remnants of petroleum-based products used in the manufacturing process may still be detectable. In addition to these remnants, carpets treated with petroleum-based water-proofing agents or insecticides may also produce a gasoline-like chromatogram. The use of EICs in this case is not beneficial for the identification of gasoline because the petroleum product is inherent to the carpet itself. In a similar study, Cavanagh et al. examined carpets and mats from cars of unknown history, and a small percentage of these carpets exhibited a chromatographic profile similar to that of evaporated gasoline [29].
The authors suggested that inherent interferences from car carpets are uncommon, which would increase the evidentiary value of the presence of a more concentrated ILR sample or one that does not compare with an evaporated profile.

Lentini et al. presented a study in which several commonly encountered materials were shown to produce petroleum-like chromatograms [30]. Several samples of different matrix types were analyzed using passive headspace concentration followed by GC-MS. The matrix types analyzed included clothing, shoes, household products, building materials, paper products, cardboard, and adhesives. Many of the substrates examined exhibited TICs or EIPs very similar to those of ignitable liquids, though patterns consistent with gasoline were much less frequently observed than those of medium or heavy petroleum distillates. It was concluded that, because of these potential sources of interference, which are likely due to the regular use of petroleum products in manufacturing, cleaning, and treating processes, control samples are crucial as a reference in fire debris analysis.

In another similar study, Almirall and Furton analyzed common household products that had been burned in controlled conditions and also reported consistency in composition with ignitable liquids [31]. They determined that the five compounds most frequently encountered in these substrates that are also observed in ignitable liquids are toluene, styrene, naphthalene, benzaldehyde, and ethylbenzene. In addition, Almirall and Furton analyzed several of the same samples using pyrolysis GC-MS to characterize the products formed during burning. These pyrolysis products, most frequently toluene, naphthalene, styrene, and ethylbenzene, may serve as another source of interference in the identification of ignitable liquids extracted from fire debris. Borusiewicz et al.
studied several factors that may influence the collection and detection of ILRs, and, unlike the previous studies mentioned, actually included the spiking of ignitable liquids onto the matrices being analyzed [32]. By reproducing burn conditions and visually comparing chromatograms, the authors were able to examine how the ignitable liquid type, burned matrix type, burn time, and air availability affect the ability to detect ILRs. They concluded that the material being burned (e.g., wood, carpet, etc.) is the most significant factor in ILR analysis. The other parameters were determined to be less significant, even when compared to random variables in fire debris analysis that are difficult to replicate in a laboratory setting, such as ignitable liquid dispersion or the arrangement of the material when burned. The disadvantage of this study is that the ignitable liquids analyzed were not actually burned; instead, they were only spiked onto already burned matrices.

1.5 Research Objectives

In the preliminary work of this project, conducted throughout the summer and fall of 2006, neat diesels were analyzed by GC-MS and characterized using PPMC and PCA [33]. Twenty-five diesels from eight different brands of service stations were collected for analysis and characterization. In addition to the TICs, EICs were generated for representative compounds of both aliphatic (m/z 57) and aromatic (m/z 91 and m/z 141) constituents present in the diesel samples. PPMC coefficients were generally observed to be higher between diesels from the same brand and lower between those from different brands. A wider array of PPMC coefficients was observed in the EICs, especially for m/z 91, which suggests that the aromatic composition of the diesel may be the source of the variance between those diesel samples. PCA scores plots typically showed four diesel sample clusters, three that contained only samples from one brand and one with the remaining samples.
This trend was generally observed for both the TIC and EIC data. The PCA loadings plots, which demonstrate the contribution of each component to the variance in the data set, suggested that the relative aliphatic and aromatic composition, similar to that observed by PPMC calculations, was the most significant source of variation among the samples.

The fundamental objective of this research project is to expand on the previous work and study the possibility of linking ignitable liquid residues extracted from fire debris to a neat liquid sample. Again, diesel fuel was selected as the ignitable liquid for this project because it is more chemically complex than other ignitable liquids such as gasoline, and because it is less frequently discussed in the literature. Basic statistical and more advanced chemometric procedures, such as PPMC coefficients and PCA, are implemented to associate and discriminate both neat diesels and those that have been extracted from burned debris based on their chemical composition. Data pre-treatment issues, which include chromatographic alignment and normalization, are also considered.

This project expands on the current research described in the literature by focusing primarily on the ability to link an unburned diesel sample to a diesel sample that has been extracted from burned debris. This project examines data from the entire chromatogram instead of just peak ratios or sections of the chromatogram as reported in the literature. Using the entire chromatogram may increase discrimination capability, and no information needs to be known or assumed about the samples in question. EIPs of several potential characteristic compound classes (e.g., aliphatics, aromatics, polynuclear aromatics, and indanes) are also used for comparison. Again, this project implements the entire EIP to further enhance discrimination.
The increased chemical information from the full TICs and EIPs may potentially show more significant association and dissociation of diesel based on chemical composition in chemometric procedures. The chemometric procedures may overcome the changes produced in the diesel sample due to burning as well as matrix interference contributions so that a burned sample can be accurately associated with its neat counterpart, which minimizes the risk of false positive ILR identifications. This research, which is applicable to almost any petroleum product with potential as an ignitable liquid, serves as the groundwork for future projects and could ultimately conclude with scientific validation for use in a forensic laboratory setting.

Chapter 2 Analytical and Chemometric Theory

2.1 Gas Chromatography-Mass Spectrometry

Gas chromatography-mass spectrometry (GC-MS) is a well developed analytical technique that has become commonplace in forensic laboratories because of its wide application range. The combination of chromatographic and mass spectral results offers a multidimensional analytical tool. It is commonly used in forensic laboratories for drug analysis, toxicology, environmental contamination, and explosives detection. It is also the standard technique used for ignitable liquid identification and fire debris analysis. In fact, as previously described in Chapter 1, a standard method has been developed by ASTM for the specific use of GC-MS in the analysis of ignitable liquid residues in fire debris extracts [4].

Chromatography is a technique in which separations of mixtures are driven by interactions of individual components in the mixture with a mobile and a stationary phase. In gas chromatography, the mobile phase is an inert carrier gas, typically helium or nitrogen.
A GC system consists of several individual components: a heated injection port, through which the sample is introduced; a column housed in a temperature regulated oven, where the separation takes place; and a detector. Several different types of columns and detectors are available for use based on the sample and the type of separation required. In GC-MS, the typical column is a capillary column with the stationary phase coated on the inner walls, and the mass spectrometer serves as the detector. In order to analyze a sample by GC-MS, it must be volatile and thermally stable.

In a GC-MS system, the sample must be introduced in the vapor phase. Liquid injections, however, are commonly performed, because the injection port is sufficiently heated (250 °C and above) to immediately vaporize the sample. The mobile phase constantly flows through the system and sweeps the vaporized analytes onto the column. The column, as mentioned before, contains the chromatographic stationary phase to cause separation. The stationary phase is usually coated directly on the interior surface of the capillary column, though it can also be coated onto particles that are used to pack the column. The affinity of the analyte for this stationary phase dictates its retention on the column. For example, if a component has a significant affinity for the stationary phase (e.g., a polar component with a polar stationary phase), it will interact more with it than a component with a lesser affinity (e.g., a non-polar component with a polar stationary phase). The component with a higher affinity will be more retained and will elute later than the component with the lesser affinity. The parameters of the column, such as length, diameter, and stationary phase thickness, as well as the composition of the stationary phase itself, have an effect on retention and ultimately the quality of the separation.
The positioning of the column in a temperature regulated oven allows for it to be heated to facilitate analyte separation. In an isothermal analysis, the oven is held at a constant temperature throughout the course of the run. This temperature is typically at or just above the boiling point of the sample to be analyzed. For samples that contain components with a broad range of boiling points, a temperature programmed analysis provides better resolution, quicker analysis time, and minimal band broadening in comparison to isothermal conditions. In a temperature program, the oven is held at some initial temperature in order to initiate the separation as well as to prevent condensation of the analytes on the top of the column. The temperature is then ramped over the course of time to a final temperature. The rate of the ramp is dependent on the sample. A slower ramp rate is often used when components that have similar boiling points need to be separated, such as in a complex mixture like diesel. Faster ramp rates, however, provide quicker analysis times, so ramp rates are often selected to provide sufficient resolution in the minimal amount of time. The oven is then held at the final temperature to ensure complete analyte elution.

The column passes through a heated transfer line into the ionization source of the mass spectrometer detector. The transfer line must be maintained at a high temperature, typically 300 °C, in order to prevent condensation of the separated analytes during transfer into the detector. From the transfer line, the analytes are introduced directly into the ionization source where they are ionized and fragmented. Several types of ionization sources are available, each with its advantages and disadvantages. Electron impact (EI) ionization is the most common in most bench-top GC-MS instruments used in forensic laboratories for fire debris analysis.
In EI, a filament with electric current running through it is heated in order to produce high-energy electrons (70 eV). These electrons interact with the vaporized analyte molecules to produce molecular ions that are fragmented through further collisions. These positive molecular and fragment ions are then transported by a series of focusing lenses into the mass analyzer, where they are separated based on their individual mass-to-charge (m/z) ratio.

As with ionization sources, several types of mass analyzers are available for use, but the most common in bench-top instruments is the quadrupole mass analyzer. The quadrupole consists of four parallel conducting rods, with one pair connected to the negative terminal of a DC source and the other pair connected to the positive end. Each pair is also connected to a radio frequency (RF) AC source. These four rods define a cylindrical region through which the ions travel. Under the influence of the electric field, the positive ions are attracted to the negatively charged rods. As the ion begins to move toward the negative rod, the potentials on the rods are switched, and the ion moves toward the opposite rod that is now negative. As the potential is varied, the ion travels in an oscillating pathway through the quadrupole and on to the detector. At a given DC/RF ratio, only ions within a narrow range of m/z values will travel an oscillation path that allows them to pass through the region defined by the rods without hitting the rods. Ions of other m/z values will travel a wider path that will cause them to come in contact with the charged rods, become neutralized, and therefore not be detected. A full mass scan is obtained by increasing the DC and RF potentials while still maintaining a constant ratio between the two.

As the ions pass through the quadrupole, they are directed to a transducer, typically a continuous conversion dynode.
The positive ions contact the highly negatively charged conversion plate, which causes the release of secondary particles, including electrons. These electrons then enter the continuous dynode electron multiplier, where they strike the multiplier plates, which are coated with a substance that readily emits secondary electrons. This release of electrons caused by the contact with the plates results in a cascade of electrons that reach the end of the multiplier where they are detected.

GC-MS results are two-fold. The retention time of a component, as well as its mass spectrum, are used in making comparisons among samples. In a conventional GC-MS analysis, the entire mass range is scanned during detection and a total ion chromatogram (TIC) is generated. The TIC is a plot of the summed intensities of all m/z values in each scan versus time. From this TIC, a specific m/z ratio can be selected as a filter so that only the signal from that ion is retained while the signal from all other ions is eliminated. The result is an extracted ion chromatogram (EIC) in which the abundance for a particular ion is plotted against retention time. EICs can be useful to reduce highly complex and convoluted samples to a more manageable format for interpretation. In fire debris analysis, EICs are beneficial to reduce a petroleum distillate, which is a very complex mixture of many components, to only those components of a specific compound class, such as aliphatics or aromatics. The amount of aliphatic and aromatic components present in an ignitable liquid is valuable information for ignitable liquid classification. In fire debris analysis, several individual EICs are summed to generate an extracted ion profile (EIP) of multiple representative ions of a characteristic compound class. These EIPs are at higher abundance levels and contain more characteristic features than EICs alone.
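
As a minimal sketch of how a TIC, an EIC, and an EIP relate to one another, assume the scan data are stored as a two-dimensional array with one row per scan and one column per integer m/z value. The array layout, the function names, and the toy intensities are assumptions made for illustration only; the alkane ions (m/z 43, 57, 71, 85) are those listed in Table 1.1.

```python
import numpy as np

def total_ion_chromatogram(scans):
    """Sum the intensities of all m/z values in each scan."""
    return scans.sum(axis=1)

def extracted_ion_chromatogram(scans, mz_axis, mz):
    """Keep only the signal of a single m/z value."""
    column = np.flatnonzero(mz_axis == mz)
    if column.size == 0:
        raise ValueError(f"m/z {mz} is outside the scanned range")
    return scans[:, column[0]]

def extracted_ion_profile(scans, mz_axis, ions):
    """Sum the EICs of several ions that represent one compound class."""
    return sum(extracted_ion_chromatogram(scans, mz_axis, mz) for mz in ions)

# Toy data (hypothetical): 4 scans, m/z 40-90 recorded at unit resolution.
mz_axis = np.arange(40, 91)
scans = np.zeros((4, mz_axis.size))
scans[1, mz_axis == 57] = 100.0   # alkane fragment
scans[1, mz_axis == 71] = 40.0    # alkane fragment
scans[2, mz_axis == 78] = 25.0    # ion outside the alkane class

alkane_ions = (43, 57, 71, 85)    # from Table 1.1
tic = total_ion_chromatogram(scans)
eic_57 = extracted_ion_chromatogram(scans, mz_axis, 57)
alkane_eip = extracted_ion_profile(scans, mz_axis, alkane_ions)
```

In this toy case the TIC contains signal in the second and third scans, while the alkane EIP retains only the second scan, because the m/z 78 signal in the third scan does not belong to the alkane class.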
2.2 Data Pre-treatment

Data pre-treatment is necessary prior to the application of chemometric procedures in order to correct for any variation among the diesels not actually due to differences in chemical composition. For example, minor shifts in retention time can occur between sample injections due to instrumental drift in the GC-MS. Retention time alignment procedures can correct for these nominal variations in retention time. Another source of variation among the samples includes slight differences in injection volume. A normalization process can correct for these slight differences. Data pre-treatment procedures are essential in order to ensure accurate results from the PPMC coefficients and PCA.

2.2.1 Retention Time Alignment

The first data pre-treatment step that must be taken after the raw data are collected is a retention time alignment. The alignment is necessary to account for normal instrumental drift, such as slight changes in temperature, pressure, and flow rate, from run to run. Retention time shifts can also be caused by the natural aging process of the column, which degrades over the course of several runs. A retention time alignment algorithm can correct for these inevitable sources of retention time drift. The algorithm used in this work was developed specifically for diesels and is available in the literature [26]. Figure 2.1, which highlights the decane peak for Diesels 1-5, illustrates the effect of the application of the retention time alignment algorithm to the raw chromatographic data. The decane peak in Diesel 2 is severely shifted when compared to the other diesels, but the algorithm is capable of correcting for that shift.

In the algorithm a representative target chromatogram is chosen by the user from the sample set to be aligned. The first algorithm function is to perform a baseline correction on the target and sample chromatograms.
The baseline correction factor is determined by a linear regression through the first few points and the last few points of the chromatograms. This correction is applied to account for minor changes in the baseline from run to run, though it is contained in a sub-routine in the algorithm and is easily edited or even removed if deemed unnecessary. The algorithm then cycles through each chromatogram in turn and estimates the first derivative of the chromatogram by calculating the difference in signal strength between consecutive data points.

[Figure 2.1 The Effects of a Retention Time Alignment Algorithm on the Decane Peak of Diesels 1-5. Two panels, Unaligned and Aligned; x-axis: Retention Time (min), 9.35 to 9.60; y-axis maximum: 200000.]

Once this difference exceeds a threshold for baseline noise defined by the user, the data points are considered to be the leading edge of a chromatographic peak. The algorithm then searches for a zero crossing, or a change in sign, within this first derivative estimation, which indicates the apex and tailing edge of a chromatographic peak. The algorithm then interpolates the subsequent retention time at which the first derivative is equal to zero, rounds that retention time to the nearest integer value, and then adds that retention time to a list being generated for that chromatogram. Once the retention time tables for each chromatogram have been compiled, the algorithm compares the peaks present in each sample with the peaks identified in the target chromatogram. If the sample chromatogram contains a peak within a user-defined window of a peak in the target, then it is considered a match and the retention time axis will be interpolated to include more or fewer data points so that the sample peak has the same retention time as the target peak and the peaks align.
If the target chromatogram contains a peak that the sample chromatogram does not, or vice versa, those peaks are not considered in the retention time axis interpolation, which allows the alignment to accommodate samples that contain varying numbers of peaks.

Prior to implementation, however, certain user-specified parameters of the alignment algorithm need to be investigated in order to ensure optimal alignment. The target chromatogram must be selected carefully so that it contains as many of the peaks present in the chromatograms to be aligned as possible. It is also important that the peaks in the target chromatogram are relatively centered in terms of retention time and peak symmetry in comparison to the peaks from the sample chromatograms. A poor target selection will lead to a greater propensity for misalignments. Another user-defined factor that must be considered is the baseline noise threshold for peak identification. By default, the noise of the chromatogram is determined from the first several data points in the chromatogram. However, because diesel chromatograms contain features within those first points, in this research the algorithm was amended to calculate the noise from the last several points in the chromatogram, where a baseline was discernible. The standard threshold for determining the presence of a peak is five times the standard deviation of the calculated noise. The last user-defined variable that requires investigation is the window size. This parameter dictates the magnitude of retention time shift at which a peak in a sample chromatogram is still considered a match to the same peak in the target chromatogram. Ideally, the window size, for which the algorithm default is five, is large enough to account for normal retention time drift, but smaller than the average distance between adjacent peaks. A brief examination of these parameters can lead to fewer errors and improved alignments.
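The noise threshold and window matching described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the published algorithm [26]: the function names, the `n_tail` parameter, and the choice of the nearest candidate when several peaks fall inside the window are all hypothetical.

```python
import statistics


def noise_threshold(signal, n_tail=50, factor=5.0):
    """Threshold = factor x standard deviation of the last n_tail points.

    The text states that the noise was taken from the last several points
    and that the standard threshold is five times its standard deviation;
    n_tail=50 is an assumed value, not one given in the source.
    """
    tail = signal[-n_tail:]
    return factor * statistics.stdev(tail)


def match_peaks(target_peaks, sample_peaks, window=5):
    """Pair each target peak with a sample peak lying within the window.

    Peaks without a partner are skipped, so chromatograms with different
    numbers of peaks can still be aligned. Picking the nearest candidate
    is an assumption made for this sketch.
    """
    matches = []
    for target in target_peaks:
        candidates = [s for s in sample_peaks if abs(s - target) <= window]
        if candidates:
            nearest = min(candidates, key=lambda s: abs(s - target))
            matches.append((target, nearest))
    return matches


# Only the first peak lies within the window of five data points.
print(match_peaks([100, 200, 300], [103, 290, 400], window=5))
```

A window that is too wide risks pairing a sample peak with the wrong neighbor in the target, which is why the text recommends keeping it smaller than the average spacing between adjacent peaks.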
Even after optimization of these variables, the alignment algorithm is not devoid of limitations. Peaks with small signal-to-noise ratios are often not identified because they still fall below the optimal baseline noise threshold. These smaller peaks also tend to be noisier, which can result in the determination of multiple zero crossings and lead to peak misidentifications. Another limitation of the alignment algorithm is that the peak identification method of first derivative estimations is entirely dependent on the scan rate of the mass spectrometer. Smaller peaks are often overlooked by the algorithm's peak finding function when the scan rate is too rapid because the first derivative is not substantial enough to exceed the noise threshold. The alignment algorithm also lacks any method for the determination and alignment of shouldered peaks. Their identification depends on the size of the shoulder in comparison to its parent peak, so that some shoulders are well aligned while others are not. Despite these limitations, the alignment algorithm has been demonstrated in the literature to be adequate for GC peak alignment of diesel samples [26]. The user, however, must be aware of the consequences of these limitations not only on the alignment itself, but on the ensuing chemometric analyses as well.

2.2.2 Area Normalization

In addition to retention time alignment, other pre-treatment steps are necessary to minimize sources of variation in the chemometric analyses introduced by the analytical technique or injection method. Minimizing artificial variation ensures that the results of Pearson product moment correlation (PPMC) coefficients and principal components analysis (PCA) are based on the true sample variation. Normalization is performed in order to account for minor differences in injection volume from run to run.
In a basic normalization process, each chromatographic peak is ratioed to a reference that is present at a consistent concentration throughout every analysis. In many cases, the reference peak is a purposely added component known as an internal standard that is not inherent to the sample being analyzed. However, because diesels are complex and contain numerous components that cover a broad range of boiling points, the choice of an internal standard that would not co-elute with a native component is difficult. Therefore, in this research, an area normalization process was used instead. In this method, the areas of all peaks present in a chromatogram are summed, and the individual peak areas are then divided by the total area. In this research, however, the entire chromatogram is analyzed, not just the areas of each peak. In order to area normalize the individual data points of the full chromatogram, the total area underneath the chromatographic curve is calculated to serve as the reference. The individual data points are then divided by the total area. The data points are then multiplied by an average of the total areas of all sample chromatograms in the data set in order to return the chromatograms to the original order of magnitude so as to maintain discernible features for further data analysis.

Figure 2.2, which shows the same decane peak of the five diesels as before, demonstrates the results of the area normalization process. In the aligned data only, it appears that little similarity exists among the five diesels in the concentration of decane present. However, once the data is normalized, clear associations can be made for Diesels 3 and 4 as well as for Diesels 1, 2, and 5. These two sets are easily differentiated based on the concentration of decane present.

2.2.3 Mean-Centering

Another pre-treatment step, mean-centering, is performed specifically in preparation for PCA. In this process, the mean of an individual variable (in this case,
the abundance level at a single retention time) is calculated across the entire data set. This mean is then subtracted from each value so that the resultant sum across the sample set equals zero. Figure 2.3 shows the same decane peak of the five diesels as before, but now the data has been aligned, normalized, and mean-centered. The same diesels are associated as before, only in the mean-centered plot, Diesels 1, 2, and 5, which were lower in decane concentration, exhibit a negative peak. Diesels 3 and 4, on the other hand, remain positive. It should be noted that the two positive peaks are greater in magnitude than the three negative peaks, a characteristic that is observed because the mean-centered values are mathematically forced to sum to zero.

[Figure 2.2 The Effects of Area Normalization on the Decane Peak of Diesels 1-5. Panels: Aligned; Aligned and Normalized (groups labeled Diesels 3 and 4, and Diesels 1, 2, and 5). x-axis: Retention Time (min), 9.35 to 9.60.]

[Figure 2.3 The Application of Mean-Centering to the Decane Peak of Diesels 1-5. Axes: Retention Time (min) versus Abundance, -40000 to 40000 (positive group labeled Diesels 3 and 4).]

Mean-centering is an important step prior to PCA to ensure that the maximum variance described by the principal components selected does not contain any deviations from the mean. Oftentimes when PCA is performed on data that has not been mean-centered, the primary principal component reflects how the samples deviate from the mean, not necessarily how they inherently vary. Upon mean-centering, the variance becomes centered around the origin so the principal components are allowed to describe actual sources of variation among the samples.

2.3 Pearson Product Moment Correlation Coefficients

The PPMC coefficient (r) indicates the degree of linear correlation between two samples.
The correlation coefficient is calculated by dividing the covariance between the two samples by the product of each sample's standard deviation, as shown in Equation 2.1:

$$ r = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2}\,\sqrt{\sum_i (Y_i - \bar{Y})^2}} $$   (Equation 2.1)

Values for r range between -1 and 1, where a correlation of 1 indicates a perfect positive linear relationship and a correlation of -1 indicates a perfect negative linear relationship. It has been suggested that correlation coefficients between 0.8 and 1 indicate a strong correlation between samples, while coefficients between 0.5 and 0.8 indicate a medium correlation. Coefficients less than 0.5 imply a weak correlation [34]. Correlation coefficients close to zero signify a lack of any correlation.

To assess the correlation between two diesel samples, coefficients are calculated as in Equation 2.1, where the X variables denote abundances at individual retention times for one diesel, and the Y variables correspond to abundances at the respective retention times for the other diesel. In order to determine the strength of association between two diesels, the standard convention mentioned above is not applicable because of the level of inherent similarity of the samples being compared. The conventional model relates to samples that range significantly in their chemical composition. Diesels, however, are very similar in their basic composition, so their correlation coefficients would be expected to be closer to unity. Therefore, although differences in correlation coefficients between diesels will be smaller, these differences may still be significant. A more appropriate guideline for samples like diesel that are already so similar is the range of coefficients between several replicate analyses of the same sample. Replicate coefficients, which represent the highest possible level of similarity, should be close to unity.
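Equation 2.1 can be written out directly. The following Python sketch is illustrative only (the thesis reports using SAS and the Excel CORREL function for this calculation); the function name `ppmc` and the toy abundance lists are hypothetical.

```python
import math


def ppmc(x, y):
    """Pearson product moment correlation coefficient (Equation 2.1).

    x and y are abundance lists for two chromatograms sampled at the same
    (aligned) retention times.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator terms: sums of squared deviations.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return sxy / math.sqrt(sxx * syy)


# Made-up abundances: a scaled copy correlates perfectly with the original.
print(ppmc([1.0, 4.0, 2.0, 8.0], [2.0, 8.0, 4.0, 16.0]))
```

Because r is unchanged when one signal is multiplied by a positive constant, the coefficient compares the shapes of two chromatograms rather than their overall scale.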
If correlation coefficients between samples that are not replicates fall within the same range as the replicates, then those samples must be considered very similar. Diesels with correlations outside that range can be considered less similar due to differences in chemical composition. With the application of this guideline, PPMC coefficients are useful in assessing the association and discrimination of diesels based on chemical composition.

Another application of PPMC coefficients is the assessment of method precision. As previously mentioned, replicate analyses should result in correlation coefficients of unity if the instrumental technique and manual injection procedure are repeatable. Because of random instrumental and analyst variations, perfect correlation coefficients are unlikely; however, coefficients very close to one should be attainable between replicate analyses. Therefore, average correlation coefficients of replicates can serve as an indicator of how precise the instrument and the manual injection process are.

2.4 Principal Components Analysis

PCA is a multivariate statistical technique most commonly used to reduce larger data sets to dimensions that can be easily visualized by standard procedures while still retaining the most significant information. PCA, on the most basic level, determines the sources of maximum variation among samples in a data set and clusters the samples based on their relative contribution to these sources of variation. It is a clustering technique that is capable of associating and discriminating samples based on how they are correlated.

PCA consists of several steps of complex matrix mathematics. The first step is to generate the covariance matrix, which consists of covariance calculations between all dimensional pairs in the data set.
The covariance between a pair is calculated by Equation 2.2:

$$ \mathrm{cov}(X, Y) = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} $$   (Equation 2.2)

The covariance matrix for a data set of n dimensions contains n!/[(n-2)! * 2] different covariance values, as well as commutative repeats [cov(X,Y) = cov(Y,X)] and covariance identities [cov(X,X)], which are equal to the variance of that variable [35]. These commutative repeats make the covariance matrix symmetric about the main diagonal.

The next step in PCA is to calculate the eigenvectors and eigenvalues of the covariance matrix. Eigenvectors are derived from the basic properties of transformation matrices. A vector that, when multiplied by a square matrix, results in a scalar multiple of itself is considered an eigenvector of that matrix. Any nonzero multiple of that vector is also considered an eigenvector because the directionality of the vector is not affected, only the length. In simpler terms, an eigenvector α is a nonzero vector that satisfies the following relationship with a square matrix A: A * α = λ * α, where λ is a scalar known as the eigenvalue. In PCA of a multidimensional data set, computerized iterative algorithms are often employed to calculate all the eigenvectors and their associated eigenvalues of a covariance matrix. The eigenvectors are then ranked in descending order by their respective eigenvalues. The maximum eigenvalue corresponds to the most significant eigenvector. It should be noted that the eigenvectors of a covariance matrix are orthogonal to one another, and that n eigenvectors are calculated for a matrix of n x n dimensions.

For PCA, the eigenvector with the largest eigenvalue is the first principal component (PC1). This primary eigenvector can be thought of as the axis on which the original data can be projected that results in the maximum amount of spread among the samples in the data set.
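As an illustration of the covariance matrix and of an iterative eigenvector calculation, the sketch below builds the matrix from Equation 2.2 and then uses power iteration to estimate the first principal component. Power iteration is chosen here only because it is a simple iterative method; it is an assumption that this resembles what Matlab or SAS do internally, and all function names are hypothetical.

```python
import math


def covariance_matrix(samples):
    """Covariance matrix (Equation 2.2); rows are samples, columns variables."""
    n = len(samples)
    p = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    return [[sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(p)]
            for j in range(p)]


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def first_principal_component(cov, max_iter=1000, tol=1e-12):
    """Estimate the largest eigenvalue and its eigenvector by power iteration."""
    p = len(cov)
    # Non-uniform start vector; a start exactly orthogonal to PC1 would fail.
    v = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(max_iter):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        # Rayleigh quotient v.(cov v) for a unit vector v.
        new_eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        eigenvalue = new_eigenvalue
        if converged:
            break
    return eigenvalue, v


# Toy data: three samples, two perfectly correlated variables.
cov = covariance_matrix([[1, 2], [2, 4], [3, 6]])
print(cov)
print(first_principal_component(cov))
```

For full chromatograms with more than 13,000 variables, the covariance matrix is very large, which is consistent with the need for cluster computing described later in the thesis; a practical implementation would use an optimized linear algebra library instead of nested Python lists.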
Additional PCs, positioned orthogonally in the sample space to the previous PC, describe lesser amounts of variation corresponding to the magnitude of their eigenvalues. The amount of variation afforded by each PC is represented as the percentage of its eigenvalue relative to the sum of all eigenvalues. Often, only two to three PCs are needed to describe 85-90% of the variance among the data [27]. The dimensional reduction property of PCA is reflected in the significant eigenvalues. If a complex data set of many dimensions can be accurately described by only two or three dimensions, which are capable of being plotted and visually observed, the analysis of that data set becomes much easier by conventional methods.

Once the eigenvectors and eigenvalues have been determined, they are utilized to calculate scores for each sample. Scores are calculated for each sample by a simple matrix multiplication of the original data (in columns) by the eigenvectors (in rows). This multiplication results in a single score for each sample for each principal component. A scores plot usually consists of an x-y scatterplot of the PC1 score versus the PC2 score for each sample. Three-dimensional plots of the scores for the first three PCs are also frequently used. In these scores plots, the samples will cluster based on how they are each affected by the eigenvectors selected. For diesels, the eigenvectors correspond to individual chemical components that are the most variable among them. Therefore, the scores plots cluster diesels based on their similarities and differences in chemical composition. A plot of the individual eigenvectors, known as a loadings plot, indicates those components that contribute the most to the variance. If a component loads high, then it contributes more significantly to the score of a sample than other components do.
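The percentage of variance per PC and the scores calculation can be sketched as follows. This is a hedged illustration: the function names and the numerical values are hypothetical, and the data layout (samples as rows, one eigenvector per list) is chosen for readability and differs from the column and row arrangement described in the text, although the products are the same.

```python
def explained_variance_percent(eigenvalues):
    """Percentage of total variance described by each principal component."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


def scores(mean_centered, eigenvectors):
    """Project each mean-centered sample onto each eigenvector.

    mean_centered: one row per sample, one column per retention time.
    eigenvectors: one list per principal component (same length as a row).
    Returns one row per sample with one score per principal component.
    """
    return [[sum(x * w for x, w in zip(row, vector)) for vector in eigenvectors]
            for row in mean_centered]


# Made-up eigenvalues: the first two PCs describe 95% of the variance.
print(explained_variance_percent([8.0, 1.5, 0.5]))

# Made-up mean-centered data projected onto two orthonormal axes.
print(scores([[1.0, 2.0], [-1.0, -2.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

Plotting the first column of the result against the second would give the PC1 versus PC2 scores plot described above.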
Although clustering patterns observed in scores plots can be used for sample association, PCA is an unsupervised technique, because the method is not guided by or modeled after a known training set. Instead, PCA clusters samples in a data set solely on how those particular samples vary. As samples are added or removed, the clustering results may change. For this reason, empirical classification by PCA is difficult; instead, it is more useful to discuss how the samples fall into the clustering patterns by degrees of association with and discrimination from one another.

Chapter 3 Association and Discrimination of Neat Diesels Using PPMC Coefficients and PCA

3.1 Introduction

The fundamental objective of this research is to investigate the potential of linking ILRs extracted from fire debris to the corresponding neat liquid sample using chemometric procedures, specifically PPMC coefficients and PCA. The initial step, however, is to demonstrate that similar neat ignitable liquids, diesel fuel samples from different service stations in this case, can be associated and discriminated based on differences in their chemical composition. This preliminary work is essential in order to show that PPMC and PCA are capable of distinguishing diesel samples that are known to be very similar in their basic chemical composition. If the neat diesels cannot be differentiated, then burned diesels will be nearly impossible to discriminate, because they will have lost the more volatile components, and hence some discriminatory information, during the burning process.

Although diesels have a similar basic composition, some amount of variation is introduced through the refining process. Diesel is a heavy distillate fraction collected from the refining of crude oil.
Oil refining is a multistep process that includes a temperature-based fractional distillation, several chemical processing steps (thermal or catalytic cracking, and hydrocarbon unification or alteration), and multiple purification and product blending phases [36]. Refineries may also include various additives in their final product blends to improve their quality and performance. Refineries differ in how they complete these processes to produce their final diesel fuel product, which adds some amount of variation to a distillate fraction that is generally similar in composition.

The differences in the refining process allow for potential association and discrimination of diesels based on how they were produced. However, it must also be noted that the refining process itself is variable, so that even products from the same refinery may vary. In addition to random variations in the refining process, diesel fuel products must be transported to a distributor and then finally to a service station for purchase by the consumer. While some service station brands such as Marathon refine their own oil to sell, other brands such as Speedway or Meijer purchase their products from a regional distributor [37]. Some stations have long-term contracts with distributors so that they always receive similar products, while others purchase their products on an as-needed basis from the most cost-efficient distributor. Therefore, it is impossible to generate a database of diesel samples from different service stations because of these constant changes. This variation must be considered in the interpretation of the PPMC and PCA results for the association and discrimination of diesel samples.

In order to examine a neat diesel data set, diesel samples were collected, analyzed by GC-MS, and subjected to both PPMC and PCA in order to assess the degree of association and discrimination achieved among them.
3.2 Sample Collection and Analysis

Ten diesel samples were collected from various service stations located within the surrounding area of Lansing, Michigan. Table 3.1 lists the samples, which were numbered consecutively, and the details of their collection. It should be noted that five brands of service station (Sunoco, Speedway, Meijer, Marathon, and Mobil) are represented in the sample set. Diesel samples were collected from two different locations of each of the five brands in order to have a representative but diverse data set.

[Table 3.1: collection details for Diesels 1-10. The table contents are illegible in the source scan.]

Approximately one-half gallon of diesel was collected from each station in a five-gallon yellow diesel container (Wedco, Willowbrook, IL). The containers were cleaned with hexane (reagent grade, Mallinckrodt, Phillipsburg, NJ) between sample collections. After collection, the diesels were brought to the laboratory, transferred to previously acid-washed 250 mL amber bottles, and stored in a refrigerator.

Immediately prior to analysis, the diesels were diluted 200:1 with dichloromethane (CH2Cl2, spectrophotometric grade, Sigma-Aldrich, St. Louis, MO) in clear glass vials (Kimble Chase, Vineland, NJ). One µL of the sample was injected into the GC-MS running the Diesel.M method, which employs the following temperature program: initial temperature 50 °C, 2 °C/min ramp to 150 °C, 3 °C/min ramp to 280 °C, hold for 15 minutes.
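As a quick arithmetic check of the temperature program above (a worked calculation added for illustration, assuming no hold at the initial temperature), the total run time follows from the two ramps and the final hold:

```latex
\[
t_{\mathrm{total}}
  = \frac{(150 - 50)\,^{\circ}\mathrm{C}}{2\,^{\circ}\mathrm{C/min}}
  + \frac{(280 - 150)\,^{\circ}\mathrm{C}}{3\,^{\circ}\mathrm{C/min}}
  + 15\ \mathrm{min}
  \approx 50 + 43.33 + 15
  = 108.33\ \mathrm{min}
\]
```

This agrees with the 108.33 minute end time quoted for the run where the truncation of the data files is described.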
For the MS, a heated transfer line (300 °C) was positioned in an electron impact (EI) ionization source operating at 70 eV. A quadrupole mass analyzer scanned the 40-550 m/z range at a rate of 2.94 scans per second. Each sample was analyzed in triplicate to assess the precision of the instrument as well as the injection technique.

For each diesel sample, the data files of the TIC were extracted as Comma Separated Values (CSV) files, along with the data files of the individual EICs for aliphatic compounds (m/z 57) and aromatic compounds (m/z 91). Once all diesel samples were analyzed, their individual data files were compiled into one set for the TIC and one set for each EIC using Microsoft Excel spreadsheet software (Redmond, WA). The Diesel.M method contains a 15 minute hold at the final temperature to serve as a cleaning step between runs and to prevent carryover. Prior to data analysis, the data points corresponding to this extended hold, from 80.51 minutes to 108.33 minutes, were removed from the files since they contained no information about the samples themselves. Once truncated, each chromatogram, both TIC and EIC, contained 13,531 data points, all of which were used in subsequent data analysis.

3.3 Data Pre-treatment

The compiled data sets for the TICs and each EIC were taken through a series of pre-treatment steps prior to data analysis. These pre-treatments served to minimize any artifacts in subsequent data analysis procedures resulting from retention time shifts and slight differences in injection volume. First, individual data sets were subjected to a peak matching alignment algorithm available in the literature [26]. A discussion of how the alignment algorithm functions can be found in Chapter 2. The alignment was applied in order to account for minor drifts in retention time from run to run that can be caused by routine column degradation and nominal differences in the flow rate of the carrier gas.
The alignment algorithm required a target chromatogram to be selected from the data set. The target was chosen at random to eliminate any potential bias from its selection. Once chosen, the target chromatogram was inserted into the data spreadsheet, and the entire file was imported into Matlab (Version 7.4.0, The MathWorks, Natick, MA). After the file was imported, the target and data columns were designated as variables within the Matlab workspace and the algorithm was performed. The aligned data was exported from Matlab into spreadsheet form in Microsoft Excel for further pre-treatment.

After alignment, the data sets were normalized to compensate for slight differences in injection volume from run to run. An area normalization procedure, as described in Chapter 2, was selected as the most practical for the diesel samples. For each diesel sample, the abundance values at each retention time were summed in Microsoft Excel to obtain the total area under the chromatogram. These sum totals were then averaged to obtain a mean area. Then, for each diesel sample, every individual data point was divided by its respective area sum, and then multiplied by the mean area sum. The multiplication by the mean was performed only to return the abundance values to the original order of magnitude.

A final pre-treatment step, mean-centering, was performed on the aligned and normalized data specifically prior to PCA. Mean-centering is applied so that the variation described by the calculated principal components is centered about the mean, as discussed in Chapter 2. The aligned and normalized data was taken in Microsoft Excel, and the mean was calculated for the abundances of all samples at each individual retention time. The respective means were then subtracted from each data point to obtain the mean-centered data.
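The normalization and mean-centering steps above were carried out in Microsoft Excel; the same arithmetic can be sketched in Python as follows. The function names and the toy abundance values are hypothetical.

```python
def area_normalize(chromatograms):
    """Divide each point by its chromatogram's total area, then rescale.

    chromatograms: one list of abundances per sample, all the same length.
    The rescaling by the mean total area only restores the original order
    of magnitude, as described in the text.
    """
    totals = [sum(chrom) for chrom in chromatograms]
    mean_total = sum(totals) / len(totals)
    return [[value / total * mean_total for value in chrom]
            for chrom, total in zip(chromatograms, totals)]


def mean_center(chromatograms):
    """Subtract, at each retention time, the mean abundance over all samples."""
    n = len(chromatograms)
    means = [sum(column) / n for column in zip(*chromatograms)]
    return [[value - mean for value, mean in zip(chrom, means)]
            for chrom in chromatograms]


# Toy data: two samples, two retention times (made-up abundances).
normalized = area_normalize([[1.0, 1.0], [2.0, 6.0]])
print(normalized)               # each row now sums to the mean area, 5.0
print(mean_center(normalized))  # each column now sums to zero
```

After normalization every chromatogram has the same total area, so remaining differences reflect relative composition and not injection volume.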
3.4 Data Analysis

Once the data sets were pre-treated, several data analysis steps were performed to determine the precision of both the instrumental technique and the injection method. Precision was assessed through both relative standard deviations (RSD) of select peak areas and PPMC coefficients.

For RSD calculations, eight peaks across the entire retention time range were selected to represent the wide array of boiling points of components contained in a diesel sample. These peaks were identified using a mass spectral search program from the National Institute of Standards and Technology (NIST, Version 2.0d, Gaithersburg, MD). The peaks were integrated in order to determine their area, which corresponds to the concentrations of the components present. The peak areas were then imported into Microsoft Excel and normalized to the total area under the chromatographic curve, and the individual RSD values were calculated using the mean and standard deviation for each peak. For replicate analyses, peak areas should be consistent, with less than 5% relative standard deviation [38].

PPMC coefficients were also used to gauge precision. In order to calculate these coefficients, the aligned and normalized data spreadsheet was imported into Statistical Analysis Software (SAS, Version 9.1, The SAS Institute, Cary, NC) and the default internal PCA function was run. This default function performs PCA on the correlation matrix of the data instead of the usual covariance matrix, and part of the output of the function is a table of PPMC coefficients. This table was exported into Microsoft Excel while the rest of the data was discarded. For smaller data sets (fewer than 15 samples), the PPMC coefficients were also calculated in Microsoft Excel using the CORREL function.

For the actual association and discrimination of diesels based on their chemical composition, the PPMC coefficients calculated for the precision determination were utilized.
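The RSD precision check described earlier in this section can be sketched as a short Python function. The use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used, and the function name and peak areas below are made up for illustration.

```python
import statistics


def rsd_percent(peak_areas):
    """Relative standard deviation (%) of replicate peak areas.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    mean_area = statistics.mean(peak_areas)
    return 100.0 * statistics.stdev(peak_areas) / mean_area


# Hypothetical normalized areas of one peak in three replicate injections.
replicate_areas = [95.0, 100.0, 105.0]
value = rsd_percent(replicate_areas)
print(value, "acceptable" if value <= 5.0 else "exceeds 5% threshold")
```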
In addition to PPMC coefficients, PCA was performed on the data sets, since PPMC coefficients only allow a pairwise comparison, while PCA allows the full data set to be assessed simultaneously. The data was aligned, normalized, and mean-centered prior to PCA. The pre-treated data was then converted from spreadsheet form to CSV form and imported into a temporary workspace on the Michigan State University High Performance Computing Center (HPCC) servers. From this workspace, Matlab was accessed on a powerful cluster of computers from which the default PCA function (using the covariance matrix) was performed on the data. Use of the HPCC was necessary due to the large CSV file size generated from several replicate samples, each containing more than 13,000 data points. Once PCA had been performed on the data, the scores, eigenvalues, and first three eigenvectors were saved in ASCII format in the HPCC temporary workspace. Only the first three eigenvectors could be saved due to storage space complications. The first three PCs, though, typically describe 85-90% of the variance of the sample set; thus, they are sufficient to allow association and discrimination among the samples.

Once saved, the PCA files were transferred back to a personal computer where they were converted into a usable spreadsheet form in Microsoft Excel. The scores values for PC1 and PC2 for each sample were then plotted using an x-y scatterplot in Microsoft Excel. For 3D scores plots, Origin Pro graphing software (OriginLab, Version 8, Northampton, MA) was used to plot the scores values for PC1, PC2, and PC3 for each sample.

3.5 Results and Discussion

3.5.1 Initial Data Set

Initially, the ten diluted diesel samples were analyzed in triplicate in order to examine the reproducibility of the analytical method. RSD values were first calculated to assess precision. For a normal error distribution, a 5% error rate indicates that the measurement falls within two standard deviations from the mean.
Therefore, a threshold of 5% RSD or less is typically considered acceptable for analytical precision [38]. Peak areas that fall outside of the normally accepted range indicate imprecision in either the instrument or the injection method.

Table 3.2 lists the RSD values for eight peaks across the retention time range from the triplicate analysis of the ten diesel samples. Though some RSD values fall below the normally accepted 5% error threshold, several do not. GC-MS is widely known for its analytical precision, as its instrumental errors tend to be insignificant. Therefore, it is more likely that the error observed in the RSD values is being introduced in the manual injection process.

Table 3.2 RSD Values for Eight Peaks Across the Retention Time Range for Ten Diesel Samples Analyzed in Triplicate
(RT = retention time; D1 to D10 = Diesels 1 to 10; Avg = average)

RT (min)  Potential Peak Identity                      D1     D2     D3     D4     D5     D6     D7     D8     D9    D10    Avg
20.784    dodecane                                   7.7%   5.9%   5.7%   0.6%   6.9%   3.3%  12.7%   4.6%  22.1%   4.8%   7.4%
26.207    1,2,3,4-tetrahydro-8-methylnaphthalene     5.9%   4.4%  13.3%   6.8%   2.1%   3.3%   9.6%   3.2%  13.2%   5.5%   6.8%
27.064    tridecane                                  4.7%   2.9%   2.7%   2.2%   4.0%   0.7%   5.8%   2.5%  11.1%   0.4%   3.7%
31.782    2,6,10-trimethyldodecane                   4.2%  15.6%  12.4%  11.9%  10.2%   6.0%  14.9%  11.7%   9.3%   8.5%  10.5%
33.251    tetradecane                                3.4%   2.2%   2.4%   2.4%   6.9%   2.4%   6.8%   3.6%   3.9%   2.3%   3.6%
39.211    pentadecane                                2.0%   7.9%   4.6%   2.1%   4.2%   3.5%   4.0%   4.4%  10.5%   1.4%   4.5%
44.905    hexadecane                                 3.3%   6.9%   2.8%   2.5%   3.9%   2.5%  12.2%   4.8%  14.4%   8.2%   6.2%
50.327    heptadecane                                4.9%   5.8%   9.8%   3.5%   7.1%   2.7%  16.9%   4.9%  15.9%   4.2%   7.6%

An alternative method to examine reproducibility is to consider PPMC coefficients among replicates. These correlations offer some measure of the similarity between two samples. For two replicate injections of the same sample, correlation coefficients should be very close to one if both the instrument and injection method are precise.
Table 3.3 shows the combinations of PPMC coefficients for the replicate analyses of the ten diesels. It should be noted that the correlation coefficients range from 0.7769 to 0.9886, with an average of 0.9581 ± 0.0519. While the minimum correlation is low (0.7769) for a replicate analysis, the average is reasonably acceptable by normal convention, in which a coefficient greater than 0.8 indicates a strong correlation [34]. But upon further consideration, this mean indicates that replicate injections of the same sample are only 95.81% similar on average, when in fact a higher degree of correlation should be attainable.

The unacceptable RSD values combined with poor correlation coefficients among replicates suggest that the analytical methodology is not sufficiently precise. Because of the well-known precision of GC-MS, it is feasible that variation in the method of injection is responsible for the lack of precision. Further evidence that the injection procedure was imprecise appeared in a preliminary scores plot from PCA of the ten triplicate diesels. The three replicates of each diesel should have been tightly clustered; instead, considerable spread was observed in the replicate clusters (data not shown).
The consistently poor precision of the data analysis results compelled a closer look at the chromatograms of the replicate samples themselves. Figure 3.1 depicts two sections (9-10 minutes and 50-51 minutes) of three overlaid chromatograms, all of which are replicate injections of Diesel 9. The chromatograms have been aligned and normalized. In the 9-10 minute range of the chromatogram (Figure 3.1a), the components in the second replicate are most abundant, while those in the first replicate are the least abundant. In the 50-51 minute range (Figure 3.1b), however, the opposite trend is observed. This change in the composition distribution among replicates is surprising, since component levels should remain constant throughout the entire range of the chromatogram. It is possible that, as seen with the high RSD values and low PPMC coefficients, an error in the manual injection method is causing this undesirable change in composition distribution. Hence, further investigation of the injection method was warranted.

Table 3.3 PPMC Coefficients for Replicate Sets of Ten Diesel Samples Analyzed in Triplicate

Diesel Sample        Replicate Pair   PPMC Coefficient
Diesel 1             1 and 2          0.9815
                     1 and 3          0.9748
                     2 and 3          0.9808
Diesel 2             1 and 2          0.9768
                     1 and 3          0.9824
                     2 and 3          0.9854
Diesel 3             1 and 2          0.9540
                     1 and 3          0.9585
                     2 and 3          0.9877
Diesel 4             1 and 2          0.7896
                     1 and 3          0.7769
                     2 and 3          0.9868
Diesel 5             1 and 2          0.9805
                     1 and 3          0.9747
                     2 and 3          0.9845
Diesel 6             1 and 2          0.9673
                     1 and 3          0.9867
                     2 and 3          0.9664
Diesel 7             1 and 2          0.9326
                     1 and 3          0.9466
                     2 and 3          0.9688
Diesel 8             1 and 2          0.9785
                     1 and 3          0.9886
                     2 and 3          0.9881
Diesel 9             1 and 2          0.8903
                     1 and 3          0.9465
                     2 and 3          0.9650
Diesel 10            1 and 2          0.9761
                     1 and 3          0.9797
                     2 and 3          0.9876
Average                               0.9581
Standard Deviation                    0.0519

Because the components found in diesels cover a very broad range of boiling points (approximately 70-400 °C), their vaporization differs in the injection port.
In a normal injection technique, one µL of sample is drawn up into the syringe, the syringe is placed into the inlet, and the sample is immediately injected. This rapid injection does not allow the syringe needle to be sufficiently heated to the temperature of the inlet, which causes the less volatile components to be insufficiently vaporized. For these reasons, using the normal injection method for diesel samples introduces variability in the distribution of the compounds present by preferentially loading the column based on volatility. For example, the more volatile components in the diesel may be completely vaporized and passed onto the column while the less volatile components are only partially vaporized. The vaporization of all components is linked to both the current temperature of the injection port and the time the needle spends in the port, so that the loading preference varies with each injection, as demonstrated in Figure 3.1.

Figure 3.1 Distribution of Select Peak Areas Among Replicate Injections of Diesel 9: (a) 9-10 minutes and (b) 50-51 minutes (abundance versus retention time in minutes for Replicates 1, 2, and 3)

To overcome the volatility bias, the injection procedure was amended to draw one µL of air first, then one µL of sample, followed by another µL of air. These surrounding air pockets protect the plug of sample from the heat of the injection port until it is actually injected. The injection procedure was also changed to include a two-second delay after the insertion of the needle in the inlet prior to actual injection. This timed injection allows the needle of the syringe to come to temperature without beginning to vaporize any of the sample, thus ensuring reproducible vaporization.

3.5.2 Data Set using Timed Injection

After it was determined that the original method of injection resulted in unacceptable precision in the first data set, five of the diesels were re-analyzed in triplicate using the timed injection method described above. The data were pre-treated and analyzed in the same manner as before in order to investigate the precision of the modified injection method. The RSD values for the data from the timed injection method are listed in Table 3.4. For the same eight peaks that were examined previously, the RSD values were improved for the timed injection compared to those from the previous injection procedure. Only the RSD value for the 2,6,10-trimethyldodecane peak at 31.782 minutes exceeded the 5% error threshold in all five diesels, whereas for the previous injection procedure, five of the eight peaks had unacceptable RSD values.

Table 3.4 RSD Values for Eight Peaks Across the Retention Time Range for Five Diesel Samples Analyzed in Triplicate Using a Timed Injection Method [table values not legible in the source]

In addition to RSD values, PPMC coefficients were also calculated. Table 3.5 lists the correlations among replicates for the five diesels. The range of correlation
values among replicates, 0.9445 to 0.9898, is narrower than that observed in the data collected using the previous injection procedure. The average correlation value, 0.9782 ± 0.0131, indicates that the timed injection generally provides improved precision throughout replicate analysis. However, an average correlation coefficient of 0.9782 is still lower than would be expected among replicates. Although the timed injection did improve the precision of replicates, any manual injection carries the opportunity for analyst error.

Table 3.5 PPMC Coefficients for Replicate Sets of Five Diesel Samples Analyzed in Triplicate Using a Timed Injection Method

Diesel Sample        Replicate Pair    PPMC Coefficient
Diesel 1             1 and 2           0.9884
                     1 and 3           0.9870
                     2 and 3           0.9878
Diesel 2             1 and 2           0.9898
                     1 and 3           0.9852
                     2 and 3           0.9831
Diesel 3             1 and 2           0.9870
                     1 and 3           0.9659
                     2 and 3           0.9599
Diesel 4             1 and 2           0.9821
                     1 and 3           0.9700
                     2 and 3           0.9445
Diesel 5             1 and 2           0.9850
                     1 and 3           0.9721
                     2 and 3           0.9846
Average                                0.9782
Standard Deviation                     0.0131

3.5.3 Improving Abundance Levels of TIC and EICs

Once the precision of the injection method was established, further data analysis was performed in an effort to begin associating and discriminating the diesels based on their chemical compositions. Through the course of this data analysis, however, it became obvious that the level of association and discrimination demonstrated in the PPMC coefficients and PCA results was being limited by the relatively low abundances of the chromatograms. The TICs, though low in comparison to abundances achieved through standard crime lab protocols (approximately 10^6 to 10^7) [39], were at least sufficient for analysis (approximately 10^4 to 10^5), but the real limitation occurred in the data analysis of the EICs. Because ions of a single m/z value were extracted from an already low TIC, it became difficult to differentiate peaks in the EIC from the noise level in the background.
When abundances are sufficiently high, the variation in the baseline is insignificant, but at these low levels, any variation can create artifacts in the subsequent data analysis. In order to achieve meaningful results from the PPMC coefficients and PCA, especially for the EICs, and potentially to improve further upon the precision of the technique, adequate abundance levels must be reached.

In order to improve the abundance levels in the chromatograms, two steps were taken. First, the dilution factor used to prepare the diesel samples for GC-MS analysis was decreased from 200:1 to 50:1. This more concentrated sample yielded TICs approximately five times more abundant than the previous 200:1 dilution. These increased abundances in the TIC allowed for more abundant EICs, though some amount of variation was still discernible in the baseline. Therefore, the second step to improve abundances was taken specifically for the EICs. The particular EICs were chosen in order to represent compound classes that are characteristic of petroleum distillates. It was determined that instead of using only a single m/z value from each compound class, multiple values from each characteristic class could be used to improve the signal-to-noise ratio for the extracted ions. The individual ions from each class were summed to create extracted ion profiles (EIPs) for the specific compound classes. The following ions were summed to create the EIPs for their respective classes: alkane, 57 + 71 + 85 + 99; aromatic, 91 + 105 + 119 + 133; indane, 117 + 131 + 145 + 159; olefin-cycloparaffin (OCP), 55 + 69 + 83 + 97; and polynuclear aromatic (PNA), 128 + 142 + 156.

The EIPs offer a significant amount of additional information in comparison to a single EIC. Table 3.6 illustrates this fact by showing PPMC coefficients between single extracted ions from the same compound class and their summed EIPs for the alkane, aromatic, and indane profiles.
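The summation of individual extracted ion chromatograms into a single profile can be sketched as below. This is only an illustration of the idea: the ion sets are those listed in the text, while the function name, the dictionary layout, and the four-point traces are hypothetical.

```python
# Ion sets (m/z values) as listed in the text for each compound class.
EIP_IONS = {
    "alkane": (57, 71, 85, 99),
    "aromatic": (91, 105, 119, 133),
    "indane": (117, 131, 145, 159),
    "OCP": (55, 69, 83, 97),
    "PNA": (128, 142, 156),
}

def summed_eip(eics, ions):
    """Sum the extracted ion chromatograms for the given m/z values, point by point."""
    traces = [eics[mz] for mz in ions]
    return [sum(points) for points in zip(*traces)]

# Hypothetical four-point EICs for the alkane ions (illustrative values only).
alkane_eics = {
    57: [5.0, 120.0, 8.0, 60.0],
    71: [3.0, 80.0, 6.0, 41.0],
    85: [2.0, 55.0, 4.0, 30.0],
    99: [1.0, 20.0, 3.0, 12.0],
}
alkane_eip = summed_eip(alkane_eics, EIP_IONS["alkane"])
```

Because every ion trace shares the same retention time axis, the sum raises the signal of peaks common to the class while the baseline noise of the individual ions does not add coherently, which is the motivation given above for using EIPs.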
The correlations among the individual alkane ions, as well as the correlations between the individual ions and the summed EIP, are all relatively high, which indicates that few differences exist between them and that additional information is not obtained using the alkane EIP. On the other hand, the correlations among the individual aromatic ions, and between those ions and the summed EIP, are considerably lower (0.1249 to 0.8040), which suggests that the EIP contains more information than the single ions themselves. The same trend is observed for the individual indane ions and the corresponding EIP. These differences confirm that additional information concerning chemical composition is garnered from the EIP in comparison to a single EIC. These two final improvements to the method resulted in a sufficiently optimized analytical methodology and data analysis procedure.

Table 3.6 PPMC Coefficients for Individual m/z Values and Their Summed EIPs for the Alkane, Aromatic, and Indane Profiles

Alkane Profile
m/z Values    57        71        85        99        EIP
57            1.0000    0.9917    0.9876    0.9689    0.9971
71                      1.0000    0.9897    0.9758    0.9972
85                                1.0000    0.9794    0.9949
99                                          1.0000    0.9797
EIP                                                   1.0000

Aromatic Profile
m/z Values    91        105       119       131       EIP
91            1.0000    0.4333    0.2943    0.2558    0.7288
105                     1.0000    0.2255    0.1249    0.8040
119                               1.0000    0.2875    0.6470
131                                         1.0000    0.4171
EIP                                                   1.0000

Indane Profile
m/z Values    117       131       145       EIP
117           1.0000    0.3990    0.2892    0.7160
131                     1.0000    0.2846    0.7599
145                               1.0000    0.7429
EIP                                         1.0000

3.5.4 Final Neat Data Set

The modifications made to the experimental and data analysis methods throughout the course of the preliminary trials were taken into account when a final neat diesel data set was analyzed. By applying the improved method, any random variations or artifacts in the results of the PPMC calculations or the PCA were expected to be negligible. The final neat data set consisted of all ten diesels, previously described in Table 3.1, re-analyzed in triplicate.
The data were compiled for the TICs, as well as the EIPs for the five characteristic compound classes (alkanes, aromatics, indanes, OCPs, and PNAs). For the TICs, RSD values were calculated for the same eight peaks as before. For the TIC and each EIP, PPMC coefficients were calculated and PCA was performed.

3.5.4.1 RSD Values

Table 3.7 shows the RSD values for the ten diesels analyzed in triplicate. As before, the average RSD values for all eight peaks except the 2,6,10-trimethyldodecane peak at 31.782 minutes were lower than 5%. In fact, the RSD values for the less dilute samples were, on the whole, even lower than before. It is possible that the shape and size of the 2,6,10-trimethyldodecane peak are more variable than those of the other peaks investigated because it is a small, shouldered peak, which would explain RSD values greater than 5%.

Table 3.7 RSD Values for Eight Peaks Across the Retention Time Range for Final Neat Diesel Data Set Analyzed in Triplicate Using a Timed Injection Method

Retention Time (min)   Potential Peak Identity                   Diesel 1  Diesel 2  Diesel 3  Diesel 4  Diesel 5  Diesel 6  Diesel 7  Diesel 8  Diesel 9  Diesel 10  Average
20.784                 dodecane                                  2.6%      3.7%      1.2%      2.8%      5.2%      1.6%      4.2%      3.7%      4.7%      7.2%       3.7%
26.207                 1,2,3,4-tetrahydro-8-methylnaphthalene    1.6%      0.6%      1.6%      3.2%      6.1%      2.6%      2.0%      2.2%      1.5%      3.3%       2.5%
27.064                 tridecane                                 3.3%      1.7%      0.7%      2.1%      0.4%      0.5%      1.7%      0.5%      0.7%      0.4%       1.2%
31.782                 2,6,10-trimethyldodecane                  20.0%     8.5%      1.6%      5.8%      1.7%      0.7%      11.8%     2.3%      4.6%      2.7%       6.0%
33.251                 tetradecane                               2.1%      0.7%      0.6%      2.4%      2.1%      1.0%      2.5%      1.4%      0.3%      1.1%       1.4%
39.211                 pentadecane                               1.5%      0.8%      0.8%      1.6%      2.3%      0.7%      2.3%      1.5%      4.2%      1.0%       1.7%
44.905                 hexadecane                                1.3%      1.2%      0.4%      3.3%      2.2%      0.6%      4.1%      0.3%      2.1%      3.5%       1.9%
50.327                 heptadecane                               5.6%      0.5%      1.2%      8.2%      5.5%      0.2%      7.6%      1.8%      3.5%      2.5%       3.7%

3.5.4.2 PPMC Coefficients

The PPMC coefficients for triplicate analysis of the ten diesel samples were utilized to determine the range of correlations that corresponds to diesels of known similar origin.
Correlation coefficients between two samples of different origin must fall outside this range in order to be considered statistically distinguishable from the replicates. Table 3.8 lists average correlation coefficients and observed ranges among replicates for the TICs and each of the EIPs. It should be noted that correlations are highest among replicates in the TICs (0.9895 ± 0.0072), which may suggest that the potential of the TIC for discrimination is higher than that of the individual EIPs. The average correlation coefficient for the TICs is higher than the previous correlation coefficient for the TICs (0.9782 ± 0.0131), which indicates that the amendments to the injection method and the dilution factor improved precision. The average correlation coefficients for the EIPs are lower than that of the TIC, which could be caused by a number of factors. It is possible that the abundance levels are still problematic though they have improved. From later data analysis, however, it became clear that the lower correlations for the EIPs are more likely due to errors in the alignment algorithm, as discussed in the next section.

Table 3.8 PPMC Coefficients for TIC and EIPs for Replicate Sets of Ten Diesel Samples (Final Data Set)

Data Set      Average    Standard Deviation    Minimum    Maximum
TIC           0.9895     0.0072                0.9649     0.9963
Alkane        0.9824     0.0146                0.9299     0.9945
Aromatic      0.9791     0.0115                0.9376     0.9922
Indane        0.9690     0.0171                0.9157     0.9881
OCP           0.9631     0.0142                0.9271     0.9778
PNA           0.9513     0.0389                0.8163     0.9884

The average and standard deviation of correlation coefficients among replicates serve as a benchmark in the interpretation of PPMC coefficients for samples that are not of similar origin. Using these values, a 90% confidence interval (CI) can be calculated for the means of the PPMC coefficients for replicate samples as well as for sample pairs of different origin in order to assess statistical similarities among the diesels [27]. A confidence level of 90% was selected instead of the more conventional 95% due to the small sample size. If the 90% CI calculated for the PPMC coefficients for different diesels overlaps the CI for the replicate samples, then those samples are considered statistically the same at a 90% confidence level. If the CI does not overlap that of the replicates, then the samples are considered statistically different. The CI for the PPMC coefficients is more useful than the empirical range of correlation coefficients for all replicate measurements. The minimum correlation coefficient calculated between two replicates for the TIC, for example, is 0.9649, which is well outside the defined confidence interval (0.9800-0.9989). With the opportunity for occasional disparity in the manual injection, outliers, though rare, have to be anticipated.

Using the confidence intervals for each data set, the PPMC coefficients for samples of known different origin could be assessed for any potential association and discrimination capabilities. The complete PPMC tables for each data set are shown in Appendix B. For some sample pairs, such as Diesels 6 and 8, the correlation for the TICs (0.9875-0.9944) is statistically similar to that of the replicates (0.9800-0.9989), which indicates that those two samples are not statistically different. However, Diesels 2 and 3 (0.8859-0.8976) are less correlated than the replicate TIC samples, and those are statistically different. It should be noted that the ranges of correlation coefficients observed are typically wider for the EIPs. For example, the minimum correlation among the TICs is 0.8668. While the minimum for the alkane EIP is similar (0.8727), the minima for the other EIPs are considerably lower (aromatic, 0.7440; indane, 0.7594; OCP, 0.8380; PNA, 0.6675).
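The decision rule described above, in which two diesels are considered statistically the same when their confidence interval overlaps that of the replicates, can be sketched as follows. The interval values are those quoted in the text for the TIC data (read with their decimal points, for example 0.9800 to 0.9989); the function name and the dictionary are hypothetical, and the calculation of the intervals themselves is not reproduced here.

```python
def intervals_overlap(a, b):
    """Return True when closed intervals a = (low, high) and b = (low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

# 90% CI for replicate TIC correlations, as quoted in the text.
replicate_ci = (0.9800, 0.9989)

# 90% CIs for two pairs of different diesels, as quoted in the text.
pair_cis = {
    "Diesels 6 and 8": (0.9875, 0.9944),
    "Diesels 2 and 3": (0.8859, 0.8976),
}

# True means "not statistically different from replicates" under this rule.
associated = {
    pair: intervals_overlap(ci, replicate_ci) for pair, ci in pair_cis.items()
}
```

Under this rule Diesels 6 and 8 are associated with one another, while Diesels 2 and 3 are discriminated, which matches the interpretation given in the text.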
This wide range of correlation coefficients may indicate that these compound classes are more discriminatory than the TIC as a whole. However, the average correlation coefficients for replicates based on these EIPs are much lower than that of the TIC, which limits the statistical discrimination capabilities to those samples that fall below that decreased range. These two contradicting properties place some amount of question on the validity of the application of PPMC coefficients to EIPs. If low abundance levels or misalignments are indeed the sources of error for the EIPs, then it is possible that they can be corrected and more reliable information can be gained from the EIPs. Currently, however, any data interpretation is limited to the lower average correlation thresholds established by the replicates.

3.5.4.3 PCA

PCA was also performed on the final data set using the TIC and each EIP. The scores plots for the TIC are shown in Figure 3.2. In Figure 3.2a, PC1 is plotted against PC2, where PC1 accounts for 35% of the variance and PC2 for 31%. This plot affords a moderate amount of discrimination among the samples while still maintaining tight clustering of replicate samples. Diesels 1 and 2 are separated from the rest of the large cluster of samples in PC2, as PC1 appears only to distribute the samples while exhibiting very little discrimination. The separation afforded in PC2 indicates that explicit differences in chemical composition contribute to the variance described by that principal component, whereas the even distribution in PC1 indicates that a uniform range of variables contributes to its described variance. However, even within PC1, some amount of sample discrimination can be determined. For example, although Diesels 5 and 6 fall within that large group, they are significantly different in the first principal component.
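How a principal component, its scores, and its share of the variance are obtained can be sketched with a small pure-Python example. This is a minimal illustration under stated assumptions, not the software or settings used in this work: the data are assumed to be mean-centered and analyzed through the covariance matrix, only the first principal component is extracted (by power iteration), and the function name and data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=500):
    """Return (loading, scores, variance fraction) for PC1 of samples-by-variables data.

    Assumes the starting vector is not orthogonal to PC1, which holds for the
    example below but is not guaranteed for arbitrary data.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p) of the mean-centered data.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)
    )
    total_variance = sum(cov[a][a] for a in range(p))
    # The score of each sample is its centered data projected onto the loading.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance

# Hypothetical data: four samples, three variables, all variation along one direction.
data = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0], [4.0, 8.0, 0.0]]
loading, scores, fraction = first_principal_component(data)
```

For a real chromatographic data set each row would be one chromatogram and each column one retention time point, so the loading vector has one entry per data point and can be plotted against retention time.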
Other samples, however, such as Diesels 9 and 10, are very close to one another within that large grouping, so it can be reasonably concluded that these samples are not sufficiently distinguishable. With the addition of PC3 in the three-dimensional (3D) plot, further discrimination is observed. The third principal component accounts for 16% of the variance, for a total of 82% described by all three PCs. Diesels 1 and 2 are still separated, but the addition of PC3 allows for differentiation within the larger cluster observed in the two-dimensional (2D) scores plot. Diesels 5, 6, and 8 are moderately separated in PC3. On the other hand, Diesels 3 and 7 remain closely associated with one another, as do Diesels 4, 9, and 10. The diesels in these two clusters are not distinguishable from one another, and thus are considered to be very similar in their TIC content.

Figure 3.2 Scores Plot - TIC (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3

Additional information about how the individual diesel samples cluster can be garnered from the scores plots of the five different EIPs. The scores plots for the alkanes are shown in Figure 3.3. The 2D plot, in which PCs 1 and 2 describe 45% and 12% of the variance respectively, does not offer any further discrimination than the TIC. While Diesels 1 and 2 are still reasonably separated in both PC1 and PC2, the rest of the samples are scattered amongst themselves. Even the replicates are less tightly clustered, as PC2 tends to spread them. For example, the replicates of Diesels 4 and 7 are so widely spread that it would be difficult even to associate them with one another.
When PC3 is also included for the 3D plot, an additional 9% of the variance is described, for a total of 66%. PC3 does nothing to further associate or discriminate samples. In fact, it causes additional spread in the replicate diesels.

Figure 3.3 Scores Plot - Alkane EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3

Figure 3.4 Scores Plot - Indane EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3

Figure 3.5 Scores Plot - PNA EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3

Similar trends are observed in both the 2D and 3D scores plots for the indane and PNA profiles, which are shown in Figures 3.4 and 3.5. Diesels 1 and 2 are still somewhat separated from the main cluster, but the spread in the replicates is even worse (again, likely due to misalignments in the EIPs). It is possible that the diesel samples do not vary greatly in their alkane, indane, or PNA content, which would explain the lack of discrimination in the scores plots. A brief visual comparison of overlaid chromatograms for different diesels for each profile indicated that some amount of variation is noticeable, while some areas are very similar.
It is likely that the differences are not significant enough to be isolated by PCA, which would imply that these compound classes are not useful for determining association and discrimination among the diesels.

The other two compound classes, however, do provide discrimination of the diesels in their scores plots. The scores plots for the aromatic profiles, Figure 3.6, show tight clustering for replicates. In the 2D plot, Diesels 1 and 2 are still isolated from the main cluster, but in this case, that main cluster is not spread along the first principal component as it was in the TIC. Instead, the eight other diesels are very tightly clustered, which suggests that they are all very similar in their aromatic content. The 3D plot demonstrates the same trend, though PC3 spreads the tightly clustered eight diesels, as Diesels 5 and 7 begin to be separated from the larger group. The differentiation in PC3 is difficult to discern; however, PC3 describes only a small amount of the total variance (3%) in comparison to PCs 1 and 2 (75% and 8%, respectively).

The scores plots from the OCP profile, shown in Figure 3.7, afford considerably more discrimination than the other profiles. Replicates are still closely associated in the 2D plot, with the exception of Diesels 2 and 8, which are spread in comparison to the other replicates. This spread could be caused by a number of different factors, though the most likely source is misalignment. Individual diesel samples, most notably Diesels 1 and 2, are beginning to be isolated in the 2D plot as well, though it should be noted that their separation is due to only 48% of the total variance of the data. The 3D plot, which includes an additional 10% of the variance, exhibits similar traits to the 2D plot.
Replicate spread worsens in PC3, though on a minor level, some diesels are more separated. For example, in the 2D plot, Diesels 4 and 8 are closely associated, but in the 3D plot, they are more separated in PC3. Based on these results, the aromatic and OCP profiles are more useful in associating and discriminating diesels than the alkane, indane, and PNA profiles.

Figure 3.6 Scores Plot - Aromatic EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3

Figure 3.7 Scores Plot - OCP EIP (a) PC1 v. PC2 and (b) PC1 v. PC2 v. PC3

In addition to the scores plots, PCA also provides information about which components in the samples contributed most to the determination of each principal component. This eigenvector, which can be displayed in a loadings plot, determines how the score of a sample is calculated. The loadings plots for the first and second principal components of the TICs are shown in Figure 3.8. It is clear that, while the alkane components dominate the loadings for the first PC (Figure 3.8a), a mixture of compounds from other characteristic classes is responsible for separation along the second PC (Figure 3.8b). The fact that the alkanes dominate PC1 does not necessarily indicate that they are the primary source of variation among the diesels.
In fact, the diesels are not clearly separated along PC1 in the TIC as shown in Figure 3.2; they are relatively evenly distributed by their differences in alkane content. The lack of sufficient clustering of the diesels in the scores plot of the alkane profiles also indicates that the alkanes alone are not adequate for associating and discriminating the samples. The presence of components other than alkanes at lower levels in PC1 and at higher levels throughout PC2 implies that some combination of aliphatics and non-aliphatics is responsible for the discrimination among the diesels demonstrated in the scores plots for the TIC (Figure 3.2).

Figure 3.8 Loadings Plots of First Two Eigenvectors from PCA of the TICs of Ten Diesels Analyzed in Triplicate: (a) Principal Component 1 and (b) Principal Component 2, plotted against retention time (min)

The loadings plot is not only useful for determining the components contributing the most to the eigenvectors; it can also highlight any anomalies caused by the analytical technique or the data pre-treatment steps that can potentially contribute to the sources of variation observed in scores plots. For example, in the loadings plots for the first two PCs of the alkane EIP, shown in Figure 3.9, several derivative-shaped curves are noticeable. In fact, these derivative-shaped peaks dominate the second eigenvector (Figure 3.9b). Not normally observed in loadings plots, these peaks are indicative of an artifact in the samples. Because these peaks are not observed in the loadings plots for the TICs, it is possible that the artifact causing them has been introduced in the data pre-treatment steps. The most likely source of error is in the alignment procedure.
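Why a misaligned peak produces a derivative-shaped feature can be illustrated with a simple numerical sketch. The example below assumes an idealized Gaussian peak near 33 minutes and a small retention time shift between two replicates; the peak width, shift, and sampling interval are hypothetical values chosen only for illustration.

```python
import math

def gaussian_peak(times, center, width=0.05, height=1.0):
    """Idealized chromatographic peak sampled at the given retention times."""
    return [height * math.exp(-((t - center) ** 2) / (2 * width ** 2)) for t in times]

# Retention time axis from 33.0 to 33.5 minutes (hypothetical sampling interval).
times = [33.0 + i * 0.005 for i in range(101)]

# Same peak in two replicates, with the second shifted by 0.02 minutes.
aligned = gaussian_peak(times, center=33.25)
shifted = gaussian_peak(times, center=33.27)

# Point-by-point difference between the two replicates.
difference = [s - a for s, a in zip(shifted, aligned)]

# The difference has a negative lobe followed by a positive lobe,
# resembling the first derivative of the peak.
lowest = difference.index(min(difference))
highest = difference.index(max(difference))
```

Because PCA describes variance among the samples, a shift of this kind appears as variation that is negative on one side of the peak and positive on the other, which is consistent with the derivative-shaped curves described for the alkane EIP loadings.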
Figure 3.10 overlays three replicate chromatograms of Diesel 1 around the peak with a retention time of approximately 33 minutes, which corresponds to derivative-shaped curves in the first two PCs as shown in Figure 3.9. It is obvious that this peak has been misaligned for this sample. Further inspection revealed that this peak has been misaligned throughout the alkane EIP data set, and this introduction of error in the data pre-treatment stage has propagated through to the data analysis results. The increase in range for the correlation coefficients of the replicates, as well as the spread observed among the replicates in the scores plots, can most likely be attributed to significant misalignments. It can be concluded that the alignment algorithm is by no means perfect, and misalignments must be considered when the PPMC and PCA results are being interpreted.

In the analysis of a neat diesel data set, several obstacles were encountered and overcome. A precise manual injection method was developed, and sufficient abundance was achieved for accurate PPMC coefficients and PCA results. It was then demonstrated that PPMC coefficients and PCA were sufficient for the association and discrimination of neat diesels based on chemical composition.

Figure 3.9 Loadings Plots of First Two Eigenvectors from PCA of the Alkane EIPs of Ten Diesels Analyzed in Triplicate: (a) Principal Component 1 and (b) Principal Component 2, plotted against retention time (min)

Figure 3.10 Misaligned Peak in the Alkane EIPs of Diesel 1 Replicates (abundance versus retention time, 33.0-33.5 min)

Chapter 4 Association and Discrimination of Burned Diesels Extracted from Fire Debris Using PPMC Coefficients and PCA

4.1 Introduction

Chapter 3 demonstrated the principle that diesels can be associated and discriminated, based on variation in chemical composition, by chemometric procedures (PPMC coefficients and PCA). These procedures, however, have potential applications beyond the association and discrimination of neat diesels alone. The next step in the research is to investigate the associative and discriminatory potential of these chemometric procedures for diesels that have been burned and extracted from fire debris. These conditions are more similar to those encountered in an actual arson case, and are therefore more practical in a forensic laboratory setting.

When ignitable liquids are utilized as accelerants, they are often poured throughout the structure and then ignited in order to start the fire and guarantee its spread. The ignitable liquid can be affected in many ways both during the fire and while the fire is being extinguished. While much of the ignitable liquid is consumed in the burning process, it can also be absorbed by the substrate on which it was poured, where it remains protected to some extent from the heat of the fire. More porous substrates, such as wood and carpet, have a greater potential for retention of ignitable liquids, and these substrates are frequently sampled in fire debris analysis.

Although the ignitable liquid can be absorbed by porous substrates and shielded from the fire, the intense heat is still capable of causing the loss of the more volatile components present in the ignitable liquid.
For these reasons, the chromatograms obtained from the GC-MS analysis of ignitable liquid residues (ILRs) extracted from fire debris are often skewed toward the less volatile components; that is, the chromatograms contain more of the less volatile components and little to none of the more volatile components. This distortion in chemical composition may lead to a misidentification of an ignitable liquid when a visual assessment of chromatographic patterns is performed. However, the application of chemometric procedures such as PPMC coefficients and PCA to associate and discriminate the burned ignitable liquids is more objective in nature, which could potentially overcome the subjectivity issues of a visual assessment. These procedures can be performed not only on the TIC, which may potentially be too skewed to be associated with its neat counterpart, but also on the EIPs of the characteristic compound classes of ignitable liquids. The classes that consist mostly of low-volatility compounds may be more discriminatory than the TIC or other compound classes that are more significantly affected by the burning process.

In order to investigate the potential for associating a burned diesel extracted from fire debris with its unburned counterpart, a complete set of burning experiments was performed on a newly collected set of diesels. Since a new instrument was used for this part of the research, the precision of the instrument as well as the injection method was re-assessed with the analysis of a new set of neat diesels. These diesels were analyzed in triplicate to examine the association and discrimination observed among the new sample set. Prior to conducting the burning experiments, Diesel 21, chosen at random, was spiked onto three different matrices (cotton cloth, magazine, and carpet) and a series of spike and recovery studies was performed to examine the efficiency of the solvent extraction procedure selected.
Then, Diesel 21 was subjected to a series of burning conditions on the three different matrices. For each matrix, five analyses were performed in triplicate: unburned matrix, burned matrix, unburned then spiked (50 µL) matrix, burned then spiked matrix, and finally a spiked then burned matrix to simulate actual arson conditions. The final phase was to perform a blind association study in which two diesels that had been previously analyzed were treated as unknowns. The blind diesels were spiked separately onto each of the three matrices and then burned. The debris was extracted and analyzed in an attempt to associate the blind diesel sample to its neat counterpart. PPMC coefficients and PCA were used to associate and discriminate the diesels, both burned and unburned, based on chemical composition.

4.2 Analysis of New Set of Neat Diesels

4.2.1 Procedure

Five of the previous ten diesel samples (Table 3.1) were re-collected from various Lansing service stations during the winter. Table 4.1 lists the samples and the details of their collection. After collection, the diesels were stored as described previously in Chapter 3.

[Table 4.1 Diesel samples and details of their collection]

The five neat diesels were analyzed in triplicate by GC-MS, using a newly purchased instrument. The GC was the same make and model (Agilent 6890 GC) as before, while the MS was a newer version of the previously used MS (Agilent 5975 MSD compared to Agilent 5973). After initial analyses of the freshly collected diesel samples, it was observed that, in order to obtain chromatograms of similar abundances to before, a different dilution (10:1 in CH2Cl2 instead of 50:1) of the neat diesels was necessary. The higher abundance levels in the TIC are essential in order for the peaks present in the EIPs to be sufficiently distinguished from baseline noise. The sensitivity discrepancies between the two instruments are most likely due to minor differences in the mass spectrometers, such as the position of the column in the ionization source or the voltages of the accelerating lenses.

It was also concluded that the previous temperature program was too long (108.33 minutes) to be practical in a forensic laboratory setting. The temperature program employed by the National Center for Forensic Science (NCFS) and the Technical Working Group for Fire and Explosions (TWGFEX) to generate the Ignitable Liquid Reference Collection (ILRC) was used instead [40]. This temperature program (NCFS.M) was as follows: initial temperature 50 °C, hold for 3 minutes, 10 °C/min ramp to 280 °C, hold for 4 minutes. The total run time was reduced to 30 minutes. All other instrumental parameters were the same as described in Chapter 3. Preliminary assessments of the chromatograms, both of the TIC and each EIP, indicated that the change to the temperature program did not noticeably sacrifice any chemical features. The increase in the ramp rate, however, offered a significant improvement in total analysis time which, in turn, allows the methodology to be practical in the traditionally backlogged setting of an operating forensic laboratory. This compromise between improved analysis time and retention of discriminatory chemical features is a necessary consideration in the development of this methodology.
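The 30 minute run time quoted for the NCFS.M program follows directly from its hold times and ramp rate. A minimal sketch of that arithmetic is given below; the function name `oven_program_minutes` is hypothetical and assumes a single linear ramp between two isothermal holds.

```python
def oven_program_minutes(initial_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total run time (minutes) of a single-ramp GC oven program."""
    # Time spent ramping is the temperature span divided by the ramp rate.
    ramp_min = (final_c - initial_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# NCFS.M as described in the text: 50 C, hold 3 min, 10 C/min to 280 C, hold 4 min.
ncfs_run_time = oven_program_minutes(50, 3, 10, 280, 4)
print(ncfs_run_time)  # 3 + 23 + 4 minutes
```

The same function could be used to compare candidate programs when trading analysis time against chromatographic resolution.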
Once the neat diesel data were collected and analyzed, the TIC and the ions used to generate the five EIP sets were extracted into Microsoft Excel as described in Chapter 3. The shorter temperature program, however, eliminated the need for truncating the end of the chromatograms. For the diesels analyzed by the NCFS.M method, each chromatogram consisted of a total of 4702 data points (compared to 13,000 data points previously), all of which were considered in subsequent data analysis. The data were pre-treated as described in Chapter 3. RSD values and PPMC coefficients were also calculated for the TICs in order to demonstrate the precision of the new GC-MS instrument. The TIC and EIPs were also analyzed by PPMC coefficients and PCA in order to evaluate the association and discrimination of the neat diesels based on chemical composition.

4.2.2 Results and Discussion

Initially, RSD values were calculated to demonstrate the precision of the instrument. Table 4.2 lists the average RSD values for eight peaks throughout the chromatographic range for triplicate analyses of the five diesels. For the same eight peaks as selected in Chapter 3, average RSD values were well below the accepted threshold of 5% [38]. In the previous data set collected with the timed injection, one of the eight peaks still exhibited an average RSD greater than 5%. The lack of precision for that peak has been eliminated in this data set. It should be noted that the RSD values for Diesel 21 are higher than those of the other samples, which could indicate an aberration in the timed injection procedure for one of the replicate analyses for Diesel 21. However, all the average RSD values were below 3%, which indicates that the instrumental technique and the injection method are precise. PPMC coefficients were also calculated to assess precision among replicate analyses. Table 4.3 lists the PPMC coefficients for the replicate analyses of the five diesels based on the TIC.
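The RSD precision check described above can be sketched as follows. This is a minimal illustration only: the peak areas are hypothetical values, not data from this work, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev


def percent_rsd(values):
    """Relative standard deviation (%) = 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical areas for one peak across three replicate injections.
peak_areas = [1.02e6, 1.00e6, 0.98e6]

rsd = percent_rsd(peak_areas)
# The 5% figure is the acceptance threshold quoted in the text.
meets_threshold = rsd < 5.0
print(f"RSD = {rsd:.2f}%, within threshold: {meets_threshold}")
```

Averaging `percent_rsd` over the five diesels for each of the eight selected peaks would give values of the kind summarized in Table 4.2.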
Correlation coefficients for the replicate analyses ranged from 0.9780 to 0.9975, with an average of 0.9921 ± 0.0061. This correlation is higher than that of a previous data set run on a different instrument (0.9895 ± 0.0072) as described in Chapter 3. This high correlation indicated that the new instrument was precise.

[Table 4.2 RSD values for eight peaks throughout the chromatographic range for triplicate analyses of the five diesels]

Table 4.3 PPMC Coefficients for the TIC of Replicate Sets of Five Diesel Samples Analyzed in Triplicate

Diesel Sample        Replicate Pair    PPMC Coefficient
Diesel 21            1 and 2           0.9941
                     1 and 3           0.9962
                     2 and 3           0.9975
Diesel 22            1 and 2           0.9879
                     1 and 3           0.9841
                     2 and 3           0.9899
Diesel 23            1 and 2           0.9908
                     1 and 3           0.9836
                     2 and 3           0.9780
Diesel 24            1 and 2           0.9969
                     1 and 3           0.9967
                     2 and 3           0.9969
Diesel 25            1 and 2           0.9955
                     1 and 3           0.9973
                     2 and 3           0.9956
Average                                0.9921
Standard Deviation                     0.0061

As discussed in Chapter 3, 90% confidence intervals (CI) for the average PPMC coefficient for replicate analyses for the TICs and EIPs were used as the threshold for the comparison of samples from different origins [27]. If a CI for a diesel pair of different origin overlaps with the CI defined by the replicates, then that pair would be considered statistically indistinguishable. On the other hand, if the CI for that pair did not overlap the CI as determined by the replicates, then the diesel pair could be considered statistically distinguishable. The average correlation coefficient and respective standard deviations are given in Table 4.4 for the TIC and each EIP set for triplicate analyses of the five diesels.
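For reference, the PPMC coefficient between two chromatograms of equal length can be computed point by point, as sketched below. The three short signals are hypothetical stand-ins for the 4702-point chromatograms used in this work, and the function name `ppmc` is an illustrative choice.

```python
from math import sqrt


def ppmc(x, y):
    """Pearson product moment correlation between two equal-length signals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signals must have the same length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant signal has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical, heavily shortened chromatograms (abundance at each scan).
replicate_1 = [0.0, 2.0, 10.0, 3.0, 0.0, 6.0, 1.0]
replicate_2 = [0.0, 2.2, 9.6, 3.1, 0.1, 6.3, 0.9]
other_fuel = [0.0, 6.0, 3.0, 9.0, 1.0, 2.0, 4.0]

r_replicates = ppmc(replicate_1, replicate_2)  # near 1 for similar patterns
r_different = ppmc(replicate_1, other_fuel)    # much lower for a different pattern
print(round(r_replicates, 4), round(r_different, 4))
```

Because the coefficient is computed scan by scan, a retention time shift between two otherwise identical chromatograms lowers it, which is why retention time alignment precedes this step.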
It should be noted that, on the whole, a higher level of association among replicates for the EIPs is observed as compared to previous data sets (Chapter 3). This narrow range for replicates improves the discrimination capability afforded by each EIP.

Table 4.4 PPMC Coefficients for TIC and EIPs for Replicate Sets of Five Diesel Samples

Data Set        Average    Standard Deviation    Minimum    Maximum
TIC             0.9921     0.0061                0.9780     0.9975
Alkane EIP      0.9729     0.0384                0.8788     0.9971
Aromatic EIP    0.9813     0.0145                0.9597     0.9984
Indane EIP      0.9960     0.0019                0.9916     0.9987
OCP EIP         0.9869     0.0086                0.9682     0.9973
PNA EIP         0.9910     0.0064                0.9772     0.9985

Data analysis was initially performed on the neat diesel data set in order to determine the sources of variance in chemical composition among the new set of five diesels. PPMC coefficients were calculated for all diesel pairs in the TIC data set as well as each EIP compilation (alkane, aromatic, indane, OCP, and PNA). Each data set was also subjected to PCA, and scores plots were generated to assess the natural clustering of the diesels based on their inherent chemical composition.

The complete PPMC coefficient tables for each data set are shown in Appendix C. The 90% CI for the replicates for the TIC was calculated to be 0.9813–1.0029. When examining CIs for different diesel pairs, Diesels 21 and 23 (0.9688–0.9934), Diesels 21 and 24 (0.9692–0.9818), and Diesels 23 and 25 (0.9809–0.9889) exhibited overlapping ranges with the CI established by the replicates, which indicated that these three pairs were statistically similar. All other sample pairs did not exhibit CIs that overlapped with the replicates, which indicated that they are statistically different. Similar observations concerning statistical similarities between diesel pairs can be made for the EIPs as well.

PCA was also performed on the neat data set using the TIC and each EIP. Only 2D scores plots are shown for these data sets for ease of comparison.
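The confidence-interval comparison used above can be sketched with the replicate coefficients from Table 4.3. The interval formula here (mean ± t × sample standard deviation, with t = 1.761 for 14 degrees of freedom) is an assumption: it reproduces the reported TIC interval, but the procedure actually followed is the one defined in Chapter 3. The non-overlapping pair at the end is hypothetical.

```python
from statistics import mean, stdev

# PPMC coefficients for the 15 replicate pairs (TIC), taken from Table 4.3.
replicate_r = [
    0.9941, 0.9962, 0.9975,  # Diesel 21
    0.9879, 0.9841, 0.9899,  # Diesel 22
    0.9908, 0.9836, 0.9780,  # Diesel 23
    0.9969, 0.9967, 0.9969,  # Diesel 24
    0.9955, 0.9973, 0.9956,  # Diesel 25
]

# Two-sided 90% Student's t critical value for 14 degrees of freedom.
T_90_DF14 = 1.761


def interval(values, t_crit):
    """Assumed interval form: mean +/- t * sample standard deviation."""
    m, s = mean(values), stdev(values)
    return (m - t_crit * s, m + t_crit * s)


def overlaps(a, b):
    """True when two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


replicate_ci = interval(replicate_r, T_90_DF14)

# Intervals for pairs of different origin, as quoted in the text.
pair_21_23 = (0.9688, 0.9934)
pair_21_24 = (0.9692, 0.9818)
# Hypothetical pair, included only to show a non-overlapping case.
pair_hypothetical = (0.9500, 0.9700)

for name, ci in [("21/23", pair_21_23), ("21/24", pair_21_24),
                 ("hypothetical", pair_hypothetical)]:
    verdict = "indistinguishable" if overlaps(replicate_ci, ci) else "distinguishable"
    print(name, verdict)
```

Under this assumed form, Diesels 21 and 24 overlap the replicate interval only marginally, so their classification is sensitive to small changes in the threshold.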
The scores plot for the TIC is shown in Figure 4.1. In this plot, PC1 accounts for 53% of the variance, while PC2 accounts for 26%. The replicates for each diesel are well clustered, though the replicate clusters for Diesels 21 and 23 exhibit minor spread in PC2. For the five diesels, 89% of the variance for the first two PCs reasonably separates all five samples. Diesel 22 is significantly positive in PC1, while Diesel 24 is negative. The other three diesels are all relatively close to zero in PC1. Diesels 22 and 24 are negative in PC2, and Diesels 23 and 25 are positive. The combination, then, of PC1 and PC2 is sufficient for separation of the replicate clusters of each diesel.

[Figure 4.1 Scores Plot for the TIC of the Neat Diesels: PC1 (53%) versus PC2]

Additional information about how the individual diesels cluster can be garnered from the scores plots of the five different EIPs. Figure 4.2 shows the scores plot for the alkane EIP, in which PC1 accounts for 51% of the variance and PC2 24%. The same general clustering trend as observed in the TIC scores plot can be seen in PC1 in the alkane plot. Diesel 22 is positive, Diesel 24 is negative, and Diesels 21, 23, and 25 are close to zero. However, in PC2, most sample scores are close to zero, with the exception of one replicate analysis of Diesel 23, which is significantly distant from its replicate cluster. As discussed previously, this separation of one replicate analysis from the others is often due to a misalignment introduced in the data pre-treatment steps.

[Figure 4.2 Scores Plot for the Alkane EIP of the Neat Diesels: PC1 (51%) versus PC2 (24%)]
The alkane EIP has been prone to misalignments as seen in Chapter 3, and the fact that PC2 describes 24% of the variance indicates that a considerable misalignment, or potentially more than one misalignment, has been introduced in the data pre-treatment steps for that particular replicate sample. Therefore, association and discrimination for the alkane plot can only be accurately assessed in the first principal component.

The aromatic scores plot, shown in Figure 4.3, indicates that Diesel 22 differs significantly in aromatic content from the other four diesels. In fact, the other four diesels appear to be highly similar in their aromatic content. It should be noted that the clustering of the samples in the aromatic EIP is almost completely dictated by PC1. Very little spread is observed in PC2. Because of the considerable amount of variance described by PC1 (87%) as compared to PC2 (5%), it can be assumed that PC2 does little to separate the diesels based on chemical composition. Therefore, clustering in the aromatic profile is only considered as based on PC1.

[Figure 4.3 Scores Plot for the Aromatic EIP of the Neat Diesels: PC1 (87%) versus PC2 (5%)]

The indane scores plot, shown in Figure 4.4, shows a similar trend in PC1 to the aromatic scores plot. Diesel 22 is largely positive in PC1 (which describes 83% of the variance), while the other four diesels are each slightly negative and are not well separated from one another. However, unlike in the aromatic scores plot, PC2 (which describes 8% of the variance) contributes to the natural clustering of the samples. Diesels 23 and 25 are separated from each other in PC2, as well as from Diesels 21 and 24; however, due to minor spread in the replicate clusters in PC2, Diesels 21 and 24 cannot be reasonably separated from one another and are likely similar in indane content.

[Figure 4.4 Scores Plot for the Indane EIP of the Neat Diesels: PC1 (83%) versus PC2 (8%)]

The OCP scores plot, shown in Figure 4.5, is similar in its cluster patterns to the alkane scores plot (Figure 4.2). In PC1 (which describes 62% of the variance), Diesel 22 is positive, Diesel 24 is negative, and the other three diesels center around zero. However, in the alkane plot, differentiation in PC2 is potentially due to a misalignment instead of actual chemical variation. In the OCP plot, differentiation in PC2 is not characteristic of a misalignment, and it can be assumed that any discrimination is likely caused by differences in chemical composition. In PC2, Diesels 21, 23, and 25 are separated, while Diesels 22 and 24 are similar in PC2. Again, the combination of PC1 and PC2 allows for moderate separation of the five diesels in the OCP scores plot.

[Figure 4.5 Scores Plot for the OCP EIP of the Neat Diesels: PC1 (62%) versus PC2 (14%)]

In the PNA scores plot (Figure 4.6), PC1 accounts for 50% of the variance, and PC2 accounts for 17%. The five diesels are somewhat separated in PC1, where Diesel 22 is negative, Diesel 21 is around zero, Diesels 24 and 25 are slightly positive, and Diesel 23 is slightly more positive. PC2 also causes separation of the samples, with Diesels 22, 23, and 25 being negative and Diesels 21 and 24 positive in PC2. When combined with the separation caused by PC1, the five diesel replicate clusters are reasonably separated from one another.

It should be observed that, with the exception of the misalignment in PC2 of the alkane EIP, the clustering of the diesels in this data set is significantly improved over prior data sets, most likely due to improved precision in the manual injection method that comes with analyst experience.
For example, in the final neat data set described in Chapter 3, the alkane, indane, and PNA plots were deemed less useful because they demonstrated spread in the replicates and a lack of natural clustering among the samples. In this data set (analyzed using a new instrument), however, the TICs and EIPs provide considerable association among replicates and discrimination among samples for all five diesels. This association and discrimination capability demonstrates the utility of PCA, not only for the TIC but also for characteristic EIPs, for comparing neat diesel samples.

[Figure 4.6 Scores Plot for the PNA EIP of the Neat Diesels: PC1 (50%) versus PC2 (17%)]

4.3 Efficiency of Solvent Extraction Procedure

4.3.1 Procedure

After the analysis of the neat diesels, the next step was to perform the burned studies. First, however, a method to extract the diesel residue from the charred debris was investigated. Of the ILR extraction procedures maintained by ASTM, passive headspace using activated charcoal is the most commonly used [9]. However, because it is a headspace method, it can be biased against heavier, less volatile components. Diesel is considered a heavy petroleum distillate because it contains high carbon number compounds; therefore, a headspace method has the potential to under-represent the less volatile diesel components. For this reason, solvent extraction is the preferred extraction method for diesel and thus was selected as a more representative procedure. Based on preliminary studies, the solvent extraction method chosen was to add 5 mL of carbon disulfide (CS2, spectrophotometric grade, Sigma-Aldrich, St. Louis, MO) to the matrix to be extracted in a 50 mL round bottom flask, which was then attached to the rotary arm of a Rotovap (R110, Büchi, Switzerland).
The arm, which during normal operation of the Rotovap sits in a heated water bath, was positioned outside the water bath to prevent evaporation. The rotary function was adjusted to a setting of 4 out of 10, and the extraction mixture was allowed to agitate for 5 minutes. The solvent was then removed with a glass Pasteur pipet and, if necessary, filtered through a disposable syringe (Becton Dickinson & Co., Franklin Lakes, NJ) with a 25 mm, 0.2 µm pore size PTFE membrane syringe filter (Grace Davison Discovery Sciences, Deerfield, IL). The filtrate was then analyzed by GC-MS using the same temperature program and instrument parameters described previously (Section 4.2.1).

In order to demonstrate the efficiency of the solvent extraction method, a spike and recovery study was performed for the three matrices (cotton cloth, magazine, and carpet) to be considered in the burned studies. The cotton cloth was taken from a white cotton wash cloth (100% cotton, Wholesale Merchandisers, Grand Rapids, MI). The magazine was cut from a copy of ESPN The Magazine (20 pages including cover and glossy pages, ESPN, New York, NY). The carpet was taken from a roll of unused nylon carpeting (origin unknown). A series of standard solutions containing 1,3,5-trimethylbenzene (98%, Aldrich, Milwaukee, WI), decane (99+%, Aldrich, Milwaukee, WI), indane (97%, Aldrich, Milwaukee, WI), dodecane (99+% olefin free, Matheson, Coleman, & Bell, Norwood, OH), and tetradecane (99%, Aldrich, Milwaukee, WI) were prepared in dichloromethane (CH2Cl2, spectrophotometric, Sigma-Aldrich, St. Louis, MO) at different concentrations: 0.01%, 0.03%, 0.05%, 0.07%, and 0.10% (v/v). These components were selected to represent the common compound classes present in diesels that span the retention time range. One mL of each calibration standard was first spiked with no matrix present, i.e. straight into the round bottom flask.
It was then subjected to the extraction procedure as if a matrix were present and analyzed by GC-MS. The peak areas were integrated using the ChemStation software (Agilent, Santa Clara, CA) and plotted against mass injected (calculated using density) to create a calibration curve for each component in the calibration standard. The same extraction procedure was then performed with the matrix present for all concentrations of the calibration standard solutions. The mass of each component recovered was determined by substituting the corresponding peak area into the appropriate calibration equation. The calculated mass was then divided by the theoretical mass in order to determine a percent recovery and hence assess the efficiency of the extraction procedure for each component at each of the spiked concentrations.

4.3.2 Results and Discussion

The first spike and recovery study was performed for the cotton cloth matrix. The calibration curve for the neat standards by mass injected (no matrix present) is shown in Figure 4.7 for all five components of the calibration standard. All five linear regressions show R² values of greater than 0.9900. Table 4.5 shows the percent recovery values of each calibration standard from the cloth. With the exception of the 0.01% (v/v) standard solution, all recovery values are approximately 90% with a standard deviation of less than 0.5%. The higher than average recovery for the 0.01% (v/v) standard is most likely due to the fact that it is at the lower end of the linear range, where the theoretical mass to be recovered is already small. In fact, the linear regression equations alone nominally accounted for a significant portion of the recovery because the x-intercept values for each were close to the theoretical recoveries for the 0.01% (v/v) standard.
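The recovery calculation described in Section 4.3.1 (fit a calibration line, convert a peak area back to mass, then divide by the theoretical mass) can be sketched as below. All numbers are hypothetical and exactly linear for illustration; they are not the calibration data of Figures 4.7 to 4.9, and the function names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_recovery(peak_area, slope, intercept, theoretical_mass):
    """Invert the calibration line to get mass, then express it as a percentage."""
    recovered_mass = (peak_area - intercept) / slope
    return 100.0 * recovered_mass / theoretical_mass


# Hypothetical calibration points: mass injected (ug) and peak area (counts).
mass_injected = [0.02, 0.06, 0.10, 0.14, 0.18]
peak_area = [2.0e8 * m - 1.0e6 for m in mass_injected]

slope, intercept = fit_line(mass_injected, peak_area)

# Hypothetical extract whose peak area corresponds to 0.09 ug when 0.10 ug was spiked.
extract_area = 2.0e8 * 0.09 - 1.0e6
recovery = percent_recovery(extract_area, slope, intercept, theoretical_mass=0.10)
print(f"{recovery:.1f}% recovered")
```

A nonzero intercept matters most at the lowest concentration, where it is a large fraction of the measured area; this is consistent with the elevated recoveries reported for the 0.01% (v/v) standard.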
It should also be noted that the low standard deviations for each calibration standard indicate that the solvent extraction procedure does not preferentially extract one type of component.

[Figure 4.7 Calibration Curve for the Neat Calibration Standards for Cloth Matrix Recovery Study: peak area versus mass injected (µg) for 1,3,5-trimethylbenzene, decane, indane, dodecane, and tetradecane]

Table 4.5 Extraction Recoveries from Cloth for Each Calibration Standard (1,3,5-TMB = 1,3,5-trimethylbenzene, C10 = decane, C12 = dodecane, C14 = tetradecane)

Standard Concentration    Percent Recovery                                  Average Recovery
(% v/v)                   1,3,5-TMB   C10      Indane   C12      C14        (%)
0.01                      106.33      101.89   105.16   101.27   101.76     103.28 ± 2.30
0.03                      87.86       87.92    87.54    87.27    87.19      87.56 ± 0.33
0.05                      87.65       88.29    88.02    88.33    87.71      88.00 ± 0.31
0.07                      87.35       87.96    88.11    87.26    87.21      87.58 ± 0.42
0.10                      92.97       93.09    92.78    92.97    93.00      92.96 ± 0.11

Figure 4.8 shows the calibration curves for the neat standards by mass injected (no matrix present) prepared for the day in which recoveries were examined for the magazine matrix. Again, R² values of greater than 0.9900 indicate a linear calibration curve for the standard solutions. Table 4.6 lists the recovery values for each calibration standard from the magazine. Similar trends to the cloth recovery values are observed. The 0.01% (v/v) standard is higher than average, while the rest of the standards show recoveries all greater than 85%. Even though the recovery value for the 0.07% (v/v) standard is slightly higher than 100%, it still falls within an acceptable range for recoveries (5% error, or a range of 95-105% recovery) [38]. As before, the low standard deviations indicate no compound bias in the solvent extraction procedure from the magazine matrix.

[Figure 4.8 Calibration Curve for the Neat Calibration Standards for Magazine Matrix Recovery Study: peak area versus mass injected (µg) for 1,3,5-trimethylbenzene, decane, indane, dodecane, and tetradecane]

Table 4.6 Extraction Recoveries from Magazine for Each Calibration Standard (1,3,5-TMB = 1,3,5-trimethylbenzene, C10 = decane, C12 = dodecane, C14 = tetradecane)

Standard Concentration    Percent Recovery                                  Average Recovery
(% v/v)                   1,3,5-TMB   C10      Indane   C12      C14        (%)
0.01                      115.78      111.98   114.44   110.62   116.21     113.81 ± 2.43
0.03                      98.35       97.74    97.85    98.28    99.53      98.35 ± 0.71
0.05                      86.35       86.66    85.83    86.38    87.29      86.50 ± 0.53
0.07                      104.22      105.04   104.54   105.56   106.04     105.08 ± 0.74
0.10                      99.13       99.95    99.51    99.86    100.54     99.80 ± 0.52

Figure 4.9 shows the calibration curves for the neat standards by mass injected (no matrix present) prepared for the day in which recoveries were examined for the carpet matrix.
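The argument that the extraction shows no compound bias rests on the small spread of recoveries across the five compounds at each concentration. The sketch below recomputes that spread for two rows of the magazine recoveries (Table 4.6). Treating the tabulated "average ± value" as the mean and sample standard deviation across the five compounds is an assumption, though it agrees with the tabulated figures for these rows.

```python
from statistics import mean, stdev

# Percent recoveries from the magazine matrix (Table 4.6), in the order
# 1,3,5-TMB, C10, indane, C12, C14, keyed by concentration (% v/v).
magazine_recovery = {
    0.05: [86.35, 86.66, 85.83, 86.38, 87.29],
    0.07: [104.22, 105.04, 104.54, 105.56, 106.04],
}

summary = {}
for concentration, recoveries in magazine_recovery.items():
    # A small SD relative to the mean suggests no preference for any compound.
    summary[concentration] = (mean(recoveries), stdev(recoveries))
    avg, sd = summary[concentration]
    print(f"{concentration:.2f}% v/v: {avg:.2f} +/- {sd:.2f}")
```

The between-compound spread stays below 1% even where the average recovery itself shifts by nearly 20% between concentrations, which is the distinction the text draws between extraction efficiency and compound bias.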
R² values are also above 0.9900 for the linear regression equations. Table 4.7 lists the recovery values for each calibration standard from the carpet matrix. Again, the same trends as for the cloth and magazine are observed. In this instance, however, the recovery value for the 0.05% (v/v) standard is low by comparison with the other recovery values. In the corresponding calibration curves, the peak areas for the 0.05% (v/v) standard also appear lower than the expected peak areas determined by the regression equations. It is possible that the standard was poorly prepared. It could have been omitted from the calibration curve entirely, but because acceptable R² values were still attained, the 0.05% (v/v) standard values were retained even with the disparity. Since the same standard that was used for the neat calibration curve was also used for the spike and recovery from the matrix, it is plausible that, if poorly prepared, this standard would also yield low recovery values.

[Figure 4.9 Calibration Curve for the Neat Calibration Standards for Carpet Matrix Recovery Study: peak area versus mass injected (µg) for 1,3,5-trimethylbenzene, decane, indane, dodecane, and tetradecane]

Table 4.7 Extraction Recoveries from Carpet for Each Calibration Standard (1,3,5-TMB = 1,3,5-trimethylbenzene, C10 = decane, C12 = dodecane, C14 = tetradecane)

Standard Concentration    Percent Recovery                                  Average Recovery
(% v/v)                   1,3,5-TMB   C10      Indane   C12      C14        (%)
0.01                      135.02      133.22   136.10   137.25   138.01     135.92 ± 1.89
0.03                      97.39       98.13    97.37    99.53    100.41     98.57 ± 1.35
0.05                      73.40       74.78    73.10    75.29    75.55      74.42 ± 1.11
0.07                      90.72       92.25    89.77    92.84    93.43      91.80 ± 1.52
0.10                      81.68       82.39    80.34    83.43    84.20      82.41 ± 1.50

The results of the spike and recovery studies using a representative standard solution indicate that a five minute extraction by agitation in 5 mL CS2 is likely to be sufficient for the extraction of diesel from the three matrices investigated. The recoveries are certainly not optimal; however, they are adequate for the purposes of this research project, especially since fire debris analysis is typically qualitative and not quantitative in nature. The more important feature of the solvent extraction procedure is that it is not biased toward any particular component or compound class in diesel, as any preference would skew the results of the PPMC coefficients and the PCA. Due to the sufficiency of the results of the spike and recovery studies, the solvent extraction procedure was deemed appropriate for subsequent extractions of spiked diesels from burned matrices.

4.4 Analysis of Burned Diesels

4.4.1 Procedure

Three matrices (cotton cloth, magazine, and carpet) were investigated for the extraction and analysis of burned diesels. Each matrix was subjected to a series of burning conditions, with and without diesel present, in order to examine the matrix interferences and burning effects on the diesels. All substrates were cut to three centimeter squares, placed in a glass petri dish, spiked with diesel as necessary, and then burned. All GC-MS analyses were performed using the previously described NCFS.M method.
First, the unburned matrix was extracted using the solvent extraction procedure and analyzed in order to identify the presence of any potential interferences inherent to the matrix itself. Next, the matrix was burned, extracted, and analyzed to determine any pyrolysis products created during the burning process that could also act as potential interferences. To burn the matrix, a butane grill lighter (BIC, Shelton, CT) was used to ignite it, and it was allowed to burn until self-extinguished. If, after extinguishing, the majority of the surface was not charred, the substrate was re-ignited until significant charring had occurred. Next, 50 µL of Diesel 21 was spiked onto the unburned matrix, which was then extracted and analyzed in order to examine recovery of the diesel and potential interferences. The same conditions were repeated for a burned matrix, i.e. the diesel was spiked onto the burned matrix, in order to assess the efficiency of the diesel recovery from the burned matrix. The final conditions examined were those most similar to an actual arson case. Diesel 21 was spiked, again 50 µL, onto the matrix, which was then ignited. The debris was extracted and analyzed by GC-MS in order to assess the potential for the association of the burned diesel to its unburned counterpart.

After Diesel 21 was used to carry out each burning series for all three matrices, two other previously analyzed samples were treated as blind unknowns and analyzed in an attempt to assess the potential for the association of a burned diesel extracted from fire debris to its unburned counterpart. A colleague selected two diesels and placed them in vials labeled “A” and “B”. These two blind samples were then spiked (50 µL) onto each of the three matrices, burned, and finally extracted and analyzed by GC-MS.
For the data sets collected for this portion of the research, the need for data pre-treatment is the same as described before, though some minor adaptations were necessary to accommodate the alterations to the chromatograms effected by both the burning process and the extraction procedure, such as loss of volatiles and variation in extraction efficiency. The TICs from the burned diesel data set (the samples that were spiked onto the unburned matrix and then burned) were similarly compiled and retention time aligned, though a window size of five was insufficient for their alignment (the algorithm failed with a window of five). For that reason, the burned TIC data set was aligned using a window size of seven. Targets were selected at random for each data set. The EIPs were similarly compiled and aligned. For some EIPs, the same parameters used to align the TIC were sufficient for alignment. However, other EIP sets required a manual selection of the noise threshold for peak identification as opposed to using the default calculation applied by the alignment algorithm. Both the indane and PNA profiles required a threshold of 100, while the OCP profile required a threshold of 300.

Normalization and mean-centering were also performed in the same manner as detailed in Chapter 3, though each data set was treated separately. For example, the burned diesels were normalized and mean-centered separately from the neat diesels. Separate treatments were necessary because of the differences in abundances between a 1:10 dilution of a neat diesel and an extracted 50 µL diesel spike that had been burned.

After pre-treatment, the TICs and EIPs were then analyzed as described in Chapter 3. PPMC coefficients were used to determine the level of correlation among sample pairs. PCA was performed in order to observe the natural clustering of the diesels based on chemical composition. It should be noted that PCA was not actually performed on the burned diesels.
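Scores for the burned extracts were obtained by projecting each mean-centered chromatogram onto eigenvectors taken from the neat diesel PCA, a point-by-point multiply-and-sum that this paragraph goes on to describe. A minimal sketch of that projection is given below; the three-point chromatograms and the eigenvector are hypothetical, and the function names are illustrative rather than taken from the spreadsheet used in this work.

```python
def column_means(data_set):
    """Mean abundance at each data point across all chromatograms in a data set."""
    n = len(data_set)
    return [sum(column) / n for column in zip(*data_set)]


def mean_center(chromatogram, means):
    """Subtract the data-set mean from each point of one chromatogram."""
    return [value - m for value, m in zip(chromatogram, means)]


def score(eigenvector, centered_chromatogram):
    """Multiply each eigenvector point by the matching centered point, then sum."""
    return sum(e * c for e, c in zip(eigenvector, centered_chromatogram))


# Hypothetical normalized burned-diesel chromatograms (three points each).
burned = [
    [1.0, 4.0, 2.0],
    [3.0, 2.0, 4.0],
]
# Hypothetical PC1 eigenvector, standing in for one from the neat diesel data set.
pc1_neat = [0.6, 0.0, 0.8]

# The burned data set is centered on its own means, separately from the neat data.
means = column_means(burned)
burned_scores = [score(pc1_neat, mean_center(c, means)) for c in burned]
print(burned_scores)
```

Repeating the calculation with the PC2 eigenvector gives the second coordinate needed to place each burned extract on the neat diesel scores plot.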
Instead, the eigenvectors calculated for the neat diesel data set were used because they represent the chemical components that are the sources of maximum variance among the five diesels. The eigenvectors were used to calculate a score for the diesels that were extracted from the burned matrices. This score was calculated in Microsoft Excel by multiplying each point in the eigenvector by its respective point in each mean-centered chromatogram. These products were then summed to calculate the score for that diesel corresponding to that specific principal component. The objective was to associate and discriminate the samples based on their chemical composition, not on changes introduced in the burning and extraction process. If the diesels from the burning trials were included in the PCA, then the PCA would specifically search for any differences among the neat and burned samples instead of associating and discriminating them only on their inherent composition.

4.4.2 Results and Discussion

4.4.2.1 Assessment of Potential Interferences from Unburned and Burned Matrices

Figure 4.10 shows sample chromatograms from the extraction of the three unburned matrices. In the cloth matrix, no peaks of interest were observed; in fact, the chromatogram resembled that of a blank solvent injection. In the magazine extract, however, a relatively significant peak eluted at approximately 16 minutes. A mass spectral library search tentatively identified this peak as a long chain ester, likely from the ink used to print the magazine. Other peaks at lower abundance are also present in the unburned magazine. In the carpet extract, two significant peaks were observed after 20 minutes. Mass spectral library searches indicated that these two peaks were binaphthalene products, potentially from the carpet backing, adhesives, or a chemical treatment such as a water repellent or stain-resistant coating. Again, smaller peaks are also observed throughout the chromatogram.
These smaller peaks, however, are of insignificant abundance compared to the abundances expected for the diesel components; therefore, it is unlikely that they will interfere in any way with the results of the chemometric procedures.

[Figure 4.10 Representative Chromatograms of Unburned Matrices: (a) Cloth, (b) Magazine, and (c) Carpet; abundance versus retention time (min)]

Figure 4.11 shows sample chromatograms from the extraction of the three burned matrices. Again, the cloth extract contains no significant peaks. The magazine and carpet extracts contain the same peaks as observed in the unburned extracts, though at higher abundance. These increased levels could be due to variation in the extraction efficiency, or even differences in injection volume. It is possible that these peaks could interfere in the chemometric results if they remain significant in abundance in comparison to the presence of a diesel sample.

[Figure 4.11 Representative Chromatograms of Burned Matrices: (a) Cloth, (b) Magazine, and (c) Carpet; abundance versus retention time (min)]

It should be noted that no significant peaks that could be attributed to pyrolysis products (that is, peaks present in the burned matrix but not in the unburned) were observed for any of the three matrices.
It is possible that pyrolysis product formation is limited in these three matrices, or again that the pyrolysis products are volatilized and lost to the environment during the burning process. It is also possible that the solvent extraction procedure is not selective for any pyrolysis products present, i.e., the pyrolysis products may not be soluble in the solvent. Whatever the reason, the lack of interfering pyrolysis products is advantageous for the accuracy of both the PPMC coefficients and PCA.

4.4.2.2 Solvent Extraction of Diesels from Unburned and Burned Matrices

Figure 4.12 shows sample chromatograms from the extraction of the three unburned matrices spiked with 50 µL of Diesel 21. All three chromatograms show similar diesel chromatographic patterns. No extraneous peaks are observed that could potentially interfere with the chemometric results. Figure 4.13 shows sample chromatograms from the extraction of the three burned matrices spiked with 50 µL of Diesel 21. Again, a similar chromatographic pattern is observed, and no interference peaks are obvious. The lack of any interference peaks in either the unburned or burned matrices indicates that the diesel is present at a sufficient concentration that inherent hydrocarbons from the matrices do not make significant contributions to the chromatograms. The consistency in the diesel peak pattern from the extracts of both burned and unburned matrices also signifies that the presence of the matrix does not negatively affect the solvent extraction of the broad range of diesel components.

[Figure 4.12 Representative Chromatograms for Unburned Matrices Spiked with Diesel 21: (a) Cloth, (b) Magazine, and (c) Carpet. Axes: retention time (min) versus abundance.]

[Figure 4.13 Representative Chromatograms for Burned Matrices Spiked with Diesel 21: (a) Cloth, (b) Magazine, and (c) Carpet. Axes: retention time (min) versus abundance.]

4.4.2.3 PPMC Coefficients of Neat and Burned Diesels

Sample chromatograms for the extraction of spiked and burned diesels from each of the three matrices are shown in Figure 4.14. PPMC coefficient tables for the compiled neat and burned data can be found in Appendix D. For the TIC PPMC coefficients, the burned Diesel 21 replicates (spiked on unburned matrix, then burned) were assessed for the precision of the burning process on all three matrices. The average correlation coefficient for each matrix was as follows: cotton cloth, 0.9847; magazine, 0.9344; and carpet, 0.8411. These correlations, which are lower than those observed for neat diesel replicates, indicate the degree of uncertainty introduced by the burning process, which appears to be matrix dependent.

The cotton cloth yields higher PPMC coefficients for Diesel 21 when it is burned and extracted than the other two matrices. The higher level of correlation for cotton cloth indicates either that the burning process is more reproducible for the cloth matrix, or that the solvent extraction procedure is most efficient for diesel from the cloth. The replicate spiked and burned extracts from the magazine are less correlated, most likely because of the way the magazine itself burns. Unlike the cloth, which only chars, the magazine becomes ashen. This difference makes the surface area of the burned matrix variable, and thus introduces variability in the PPMC coefficients.
The magazine is also less absorbent than the cloth, and is less able to shield the diesel from the burning process. The carpet yielded the least precise correlation coefficients for the replicate spiked and burned extracts of Diesel 21. This imprecision is most likely due to the relatively high retention of the diesel by the carpet. This increased retention may protect the diesel from the fire, but it can also hinder the solvent extraction procedure. The carpet also has a greater surface area than the other two matrices because of the height of the carpet fibers above the carpet backing, as well as the presence of multiple fibers. This additional surface area can lead to a more variable burning process. The difficulty in extraction and the imprecision in the burning process may prevent replicate burnings of diesel on carpet from being precise.

[Figure 4.14 Representative Chromatograms for Matrices Spiked with Diesel 21 and then Burned: (a) Cloth, (b) Magazine, and (c) Carpet. Axes: retention time (min) versus abundance.]

In addition, PPMC coefficients were used to associate the burned Diesel 21 with its neat counterpart. However, for none of the three matrices did the highest correlation correspond to Diesel 21. The cloth extracts were on average more closely associated with Diesel 23 (0.8840) than with Diesel 21 (0.8657), the magazine extracts with Diesel 25 (0.8183) instead of Diesel 21 (0.7977), and the carpet extracts with Diesel 22 (0.6203) instead of Diesel 21 (0.6028). These differences, however, are not statistically significant at the 90% confidence level, which indicates that the extracts are as well correlated with Diesel 21 as with the other neat samples.
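
The association step described above, in which a burned extract is compared with each neat diesel and the highest coefficient is taken as the most likely match, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions and is not the procedure used in this research: the chromatograms are synthetic sums of Gaussian peaks, and the names `ppmc`, `rank_references`, `synthetic_chromatogram`, and the labels "ref A" to "ref C" are hypothetical.

```python
import numpy as np

def ppmc(x, y):
    """Pearson product moment correlation between two equal-length chromatograms."""
    return float(np.corrcoef(x, y)[0, 1])

def rank_references(unknown, references):
    """Return (label, r) pairs ordered from highest to lowest correlation."""
    pairs = [(label, ppmc(unknown, ref)) for label, ref in references.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

def synthetic_chromatogram(times, heights, centers=(5, 10, 15, 20, 25), width=0.3):
    """Sum of Gaussian peaks; purely illustrative, not real diesel data."""
    signal = np.zeros_like(times)
    for height, center in zip(heights, centers):
        signal += height * np.exp(-0.5 * ((times - center) / width) ** 2)
    return signal

times = np.linspace(0, 30, 600)  # assumed retention time axis (min)
references = {
    "ref A": synthetic_chromatogram(times, [1, 2, 3, 2, 1]),
    "ref B": synthetic_chromatogram(times, [3, 1, 2, 1, 3]),
    "ref C": synthetic_chromatogram(times, [2, 3, 1, 3, 2]),
}

# Simulated extract: ref B at one tenth of the abundance plus random noise
rng = np.random.default_rng(0)
unknown = 0.1 * references["ref B"] + rng.normal(0.0, 0.005, times.size)

for label, r in rank_references(unknown, references):
    print(f"{label}: r = {r:.4f}")
```

Because the coefficient is unchanged by a positive rescaling of either chromatogram, the low abundance of the simulated extract does not by itself alter the ranking; only the noise and the differences in peak pattern do.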
Several steps were taken to improve the correlation coefficients between Diesel 21 and the three matrix extracts: different target chromatograms were selected for the alignment algorithm, the user-defined window size for the alignment algorithm was altered, and replicate samples were examined statistically for outliers. These changes, however, did not affect the trends observed for the correlation coefficients calculated for the matrix extracts.

For the blind samples, it should first be noted that the correlations for the TICs of the magazine and carpet extracts for Blind B (Appendix D) are all either poor or negative, which indicates a complete lack of correlation to any of the neat diesels. It is possible that these samples are insufficient for comparison because of poor recovery by the solvent extraction procedure. The cloth extract for Blind B has its highest average correlation with Diesel 25 (0.9076). The three extracts for Blind A all have maximum correlations with Diesel 22 (cloth, 0.8665; magazine, 0.8575; and carpet, 0.8339). This regularity in maximum correlation indicates that Blind A is most consistent with Diesel 22.

Closer examination of the PPMC coefficients for the EIPs reveals trends similar to those observed in the TIC. The burned Diesel 21 extracts are somewhat correlated with neat Diesel 21. However, extracts from all three matrices for Blind A consistently have a maximum correlation with Diesel 22, and the cloth extract for Blind B consistently has its maximum correlation with Diesel 25.

The PPMC coefficient results for the burned diesels indicate that the burning process itself is highly variable. This variability inevitably affects the precision of the PPMC coefficients and can impede the association of burned diesels to their unburned counterparts.
This variability also seems to depend on the features of the matrix being burned, which include how burning affects the surface area of the matrix and how the matrix retains the diesel and shields it from the heat of the fire. On the other hand, the maximum correlations of the blind diesel samples to a neat diesel sample were consistent, which implies that some degree of association is still possible after burning, even though association was not demonstrated for the replicates of Diesel 21.

4.4.2.4 PCA of Neat and Burned Diesels

Initially, the burned data were concatenated with the neat data, and the combined set was carried through the previously described data pre-treatment processes. The mean-centered data were then subjected to PCA, and a scores plot was generated for the TICs. This scores plot, shown in Figure 4.15, demonstrates the data set dependency of the PCA procedure. PCA describes the variance within the given data set. As an unsupervised procedure, it cannot associate a neat diesel with one that has been altered during burning on the basis of chemical composition, because obvious differences exist between those chromatograms. In the scores plot, the neat diesels are clustered on the negative side of PC1, while the burned diesels are clustered on the positive side. Thus, PCA is not a feasible option for a compilation of neat and burned samples.

[Figure 4.15 Scores Plot for the TIC of the Neat and Burned Diesels. Horizontal axis: PC1 (92%).]

When considering PCA as an associative and discriminatory tool, the nature of the samples to be associated and discriminated must be determined. In this research, the objective is two-fold: to discriminate burned diesels of different origin from one another, and to associate burned diesels to their unburned counterparts. To achieve this objective, the factors that are capable of associating and discriminating neat diesels based on chemical composition (i.e., the eigenvectors from the neat diesel PCA) can also be applied to burned diesels from the same data set in order to associate and discriminate them based on chemical composition. The assumption is that enough significant features of the diesels persist through the burning process to associate and discriminate them using the eigenvectors calculated for the neat diesel set. It is also possible that a specific compound class is more likely to remain after burning, so that the EIPs may provide more discriminatory information than the TIC alone.

Once the scores were calculated for the first two principal components for all the burned samples for the TIC and the EIPs, they were added to the scores calculated for the neat diesel set, and a new scores plot was generated. Each blind sample was consolidated into a set of scores for all three matrices. However, an initial assessment of the resulting scores plots indicated that the scores for the burned diesels all seemed to center around the origin. Exemplary scores plots for the TIC and alkane EIP can be found in Figure 4.16. It was first thought that this location of the scores for the burned data indicated a lack of chemical features in the burned chromatograms. However, a closer look at the scores calculation method revealed that a significant disparity in relative abundances between the neat and burned diesels would lead to a difference in the magnitude of their respective scores. More explicitly, the eigenvectors were calculated for the normalized abundances of the neat diesels.
These same eigenvectors were then multiplied by the less abundant chromatograms of the burned diesels, and the products were summed to calculate each score. The scores were therefore lower in magnitude because of the difference in abundance between the neat and burned diesels. This finding is important when considering the applicability of the method to actual arson cases: some form of correction appears necessary before pre-determined eigenvectors can be used to calculate scores for samples that differ greatly in abundance. In this case, the neat diesel chromatograms appeared to be approximately an order of magnitude greater in abundance than the burned diesel chromatograms, so a correction factor of ten was applied. The normalized burned data were multiplied by ten, mean-centered again, and the scores recalculated. This multiplication led merely to an approximately ten-fold increase in the calculated scores, so from then on the already calculated scores were simply multiplied by ten. The scores plots were regenerated, and observations were made about their probative value for the association and discrimination of the neat and burned diesels.

[Figure 4.16 Representative PCA Scores Plots for (a) TIC and (b) Alkane EIP for the Burned Data Scored with the Eigenvectors from the Neat Data. Horizontal axes: (a) PC1 (53%), (b) PC1 (51%).]

Figure 4.17 shows the TIC scores plot containing the neat and burned diesels.
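
The projection of burned samples onto eigenvectors calculated from the neat set, and the effect of a ten-fold abundance correction, can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the data are random numbers rather than chromatograms, the function names `fit_pca` and `project_scores` are hypothetical, and the burned set is assumed to be centered on its own mean chromatogram, a detail the text does not specify. Under that assumption the scores scale exactly with the data, whereas the text reports only an approximate ten-fold change.

```python
import numpy as np

def fit_pca(neat):
    """PCA of the neat set (rows = samples, columns = data points)."""
    mean = neat.mean(axis=0)
    # Rows of vt are the eigenvectors of the covariance matrix of the centered data
    _, singular_values, vt = np.linalg.svd(neat - mean, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return mean, vt, explained

def project_scores(samples, eigvecs, center, n_components=2):
    """Score = sum over data points of eigenvector * mean-centered chromatogram."""
    return (samples - center) @ eigvecs[:n_components].T

rng = np.random.default_rng(1)
neat = rng.random((5, 50))                                 # synthetic "neat" data
burned = 0.1 * neat[:3] + rng.normal(0.0, 0.001, (3, 50))  # about ten-fold lower abundance

mean, eigvecs, explained = fit_pca(neat)
neat_scores = project_scores(neat, eigvecs, mean)

# Assumption: the burned set is centered on its own mean chromatogram
burned_center = burned.mean(axis=0)
burned_scores = project_scores(burned, eigvecs, burned_center)

# Scaling the data and its center by ten scales the scores by ten (linearity)
corrected = project_scores(10 * burned, eigvecs, 10 * burned_center)

print("largest |neat score|:     ", np.abs(neat_scores).max())
print("largest |burned score|:   ", np.abs(burned_scores).max())
print("largest |corrected score|:", np.abs(corrected).max())
```

In this sketch the uncorrected burned scores fall close to the origin relative to the neat scores simply because the input abundances are lower, which is the behavior described for Figure 4.16.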
The first observation is the obvious spread in the replicates of the burned diesels compared with the replicate analyses of the neat diesels. The spread occurs in both PC1 and PC2, and because these eigenvectors correspond to actual chemical differences, as suggested by the discrimination of the neat diesels, it is likely that the spread in the replicates is caused by genuine dissimilarities in their chromatograms. These discrepancies can be attributed to variability in both the burning process and the extraction efficiency; therefore, any interpretation of the association and discrimination among samples must be prefaced with this potential for variability.

[Figure 4.17 Scores Plot for the TIC of the Neat Diesels and the Projected Burned Diesels After a 10X Correction. Horizontal axis: PC1 (53%).]

The scores for Diesel 21 spiked onto the three matrices and extracted are only moderately close to the scores for neat Diesel 21. Two replicates of Diesel 21 spiked onto a magazine and burned are more closely associated with the neat diesel than those extracted from the other two matrices, though one cloth replicate is close as well. The carpet replicates are not remotely close to the neat diesel. This trend is sensible, as the magazine is less absorbent than the cloth and carpet matrices.

Neither of the blind samples can be logically associated with a specific neat diesel cluster. For Blind A, the carpet extract is located near Diesel 22, which is consistent with the PPMC coefficient results, while the magazine extract is closer to the Diesel 21 cluster. The cloth extract is located between the other two extracts and cannot reasonably be associated with any neat diesel cluster. For Blind B, the cloth extract is located within the cluster for Diesel 21, which is inconsistent with the PPMC coefficient results, while the magazine and carpet extracts lie in the upper left quadrant of the scores plot, far from any neat diesel cluster.

The alkane scores plot for the neat and burned data, shown in Figure 4.18, exhibits traits similar to those observed in the TIC scores plot. The replicate analyses for burned Diesel 21 are inconsistent and barely associated with their neat counterpart. For Blind A, the cloth and carpet extracts are associated with the clusters for Diesels 21, 23, and 25, while the magazine extract is more closely linked with Diesel 24. For Blind B, the cloth extract is associated with the Diesel 24 cluster, while the magazine and carpet extracts are again not remotely close to any other clusters. In fact, they are not shown because their scores are considerably different from those of the rest of the data set.

At this point, it became obvious that the magazine and carpet extracts for Blind B were extremely different from the cloth extract and the rest of the data set being analyzed. An examination of their chromatograms revealed exceptionally low abundances compared with the other samples. The TIC for the cloth extract of Blind B exhibited an abundance level of approximately 500,000, whereas the TICs for the magazine and carpet extracts exhibited abundance levels of approximately 70,000. The burned samples are already low in abundance, and these aberrant values for the magazine and carpet extracts prevent them from being scored on a similar level as the other samples. It is possible that these two samples were poorly extracted.
To maintain consistency for all samples throughout the data analysis procedures, it does not seem feasible to use a different correction factor for these two samples. Whatever the reason for their low abundance, any interpretation of their association to a neat diesel is omitted from this point forward.

[Figure 4.18 Scores Plot for the Alkane EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction. Horizontal axis: PC1 (51%).]

[Figure 4.19 Scores Plot for the Aromatic EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction. Horizontal axis: PC1 (87%).]

The aromatic scores plot for the neat and burned diesels is shown in Figure 4.19. Greater precision among the burned replicates is observed than in the other scores plots, and the burned Diesel 21 scores for all three matrices fall within the four-diesel cluster (Diesels 21, 23, 24, and 25) observed in the neat aromatic scores plot. The cloth extract for Blind B also falls within that cluster. An interesting feature is the location of the scores for the three extracts of Blind A, which are separated from the four-diesel cluster along PC1 toward Diesel 22. The carpet extract is the closest to Diesel 22, while the cloth extract is the furthest away.
The relative precision among replicates within this plot suggests that the identity of Blind A is potentially Diesel 22, whereas no reasonable conclusions could be drawn from the other scores plots.

The indane scores plot for the neat and burned diesels is shown in Figure 4.20. As before, significant spread is observed among the burned replicates, though in this case the majority of the spread is along the second principal component. Again, little consistency with the neat sample is observed for the burned Diesel 21 replicates for all three matrices, though the magazine extracts are once more the closest. For Blind A, the extracts from the three matrices are generally closest to the cluster for Diesel 22, though by no means are they close enough to be considered statistically similar. Then again, Diesel 22 is the only neat sample that is positive in PC1, and the Blind A extracts are all significantly positive in PC1; therefore, it is plausible that some degree of association can be suggested by these similarities between the two samples. The carpet extract is once again the closest to Diesel 22. The cloth extract for Blind B is again located in the large four-diesel cluster centered about the origin.

[Figure 4.20 Scores Plot for the Indane EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction. Axes: PC1 (83%) versus PC2 (8%).]

[Figure 4.21 Scores Plot for the OCP EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction. Horizontal axis: PC1 (62%).]

The OCP scores plot for the neat and burned diesels is shown in Figure 4.21. The same trends are observed for the known burned Diesel 21 scores. Few are actually associated with Diesel 21, and the replicate analyses are imprecise, as shown by the lack of clustering. The extracts for the three matrices for Blind A are again most closely associated with Diesel 22, with the cloth and carpet extracts located the closest. The cloth extract for Blind B is most closely associated with the clusters for Diesels 21 and 23.

The PNA scores plot for the neat and burned diesels is shown in Figure 4.22, and again the same trends are observed for the known burned Diesel 21 scores. The blind samples, however, are more informative in this plot. The cloth and carpet extracts for Blind A are reasonably associated with the Diesel 22 cluster. The magazine extract is located farther from the Diesel 22 cluster, though it is still negative in PC1, just like the other Diesel 22 and Blind A scores and unlike the other neat diesel samples. The cloth extract for Blind B is not associated with any neat diesel cluster.

4.4.2.5 Identification of Blind Diesels

Through the PPMC coefficients and minor corroborating evidence from the PCA scores plots, it was determined that the identity of Blind A was Diesel 22, and that the identity of Blind B was Diesel 25. Although these were the correct assignments, the identifications could not be made with sufficient statistical confidence. Too much variation is present throughout the PPMC coefficients and the PCA results to identify the blind samples conclusively. Instead, only the most likely candidates could be determined.
However, given the considerable variability involved in the extraction procedure and the burning process itself, the identification of a likely candidate among samples that are so inherently similar is useful as a preliminary assessment. A greater level of confidence may be attainable when associating a burned diesel to its unburned counterpart if the diesels are compared within a larger set containing different classes of ignitable liquids, because the differences in chemical composition would be more significant.

[Figure 4.22 Scores Plot for the PNA EIP of the Neat Diesels and the Projected Burned Diesels After a 10X Correction. Horizontal axis: PC1 (50%).]

Chapter 5 Conclusions and Future Work

5.1 Conclusions

5.1.1 Association and Discrimination of Neat Diesels Using PPMC Coefficients and PCA

Throughout the initial analysis of the neat diesel samples, several obstacles were overcome in order to improve the precision of both the analytical method and the chemometric procedures, as well as the significance of the results obtained. It was determined that a two-second timed injection of a one µL sample encased in a µL air pocket is necessary to ensure consistent vaporization of all components in the diesel in the heated injection port. This consistency is imperative to the precision of the injection method. It was also determined that a less dilute sample is essential to achieve sufficient abundances in the chromatograms, so that any variation in the instrumental background is inconsequential in comparison to the variation among the diesels.
However, poor abundances were still observed in some of the EIPs, which indicates that further investigation into the abundance issue is necessary. In addition to a less dilute sample, it was determined that summed EIPs are more informative than a single EIC because of their improved signal-to-noise ratio.

Once these improvements were applied, a final neat diesel data set was analyzed to investigate the potential for association and discrimination based on chemical composition. PPMC coefficients illustrated that, on a pairwise basis, the diesel samples can be either associated or discriminated by comparing their correlation coefficient with that of replicate samples. The ranges of correlation coefficients based on EIPs, both for replicates and for samples of different origin, extend further than those for the TIC, which suggests that the EIPs could provide additional discriminatory information.

The PCA results for the TICs demonstrated that the diesels are relatively well differentiated, and that replicate samples are tightly clustered. The PCA of the alkane, indane, and PNA profiles did not provide any additional discrimination capabilities compared with the TIC. It should be noted that errors in the retention time alignment were observed in both the scores plots and the eigenvectors for these profiles. These errors cause the results to misrepresent the variation among the samples based on chemical composition alone, so the discrimination capabilities of these profiles cannot be fairly gauged until the errors are corrected. The PCA results for the aromatic and OCP profiles did, however, offer some additional association and discrimination among the samples. These profiles did not exhibit the alignment errors observed in the alkane, indane, and PNA profiles, most likely because of the higher abundances and improved peak shapes of the aromatic and OCP EIPs compared with the other EIPs.
The results from the analysis of this neat diesel data set have shown that it is possible to associate and discriminate diesel samples based on their chemical composition using both PPMC and PCA. The fact that any amount of discrimination is possible among diesels, which are so inherently similar in chemical composition, is significant for the progress of this research. This potential for association and discrimination indicates that it may be possible to apply the same instrumental techniques and chemometric procedures to associate and discriminate diesels that have been burned and extracted from fire debris.

5.1.2 Association and Discrimination of Burned Diesels Extracted from Fire Debris Using PPMC Coefficients and PCA

The objective of this part of the research was to examine the potential of PPMC coefficients and PCA for the association and discrimination of burned diesels that have been extracted from fire debris. To accomplish this objective, neat diesel samples were analyzed for comparative purposes. This sample set was analyzed on a new instrument, and it was determined that the sensitivity of the new instrument required a dilution of 10:1 instead of 50:1 in order to achieve abundance levels comparable to prior data sets collected on another instrument. It was also determined through RSD calculations and PPMC coefficient determinations that the precision of the neat diesel set was improved over prior data sets. The five diesels in the data set were associated and discriminated based on chemical composition not only in the TIC but in each of the EIPs as well, which indicates the utility of both the TIC and the EIPs in association and discrimination determinations.

Once the neat diesel set had been analyzed, a series of spiking and burning experiments was performed on three common household matrices: cotton cloth, magazine, and carpet.
A solvent extraction, consisting of agitation of the matrix in five mL of CS2 for five minutes, was first selected on the basis of spike and recovery studies that demonstrated acceptable extraction efficiency. It was then determined that, although some unburned and burned matrices exhibit potential interference products, these interference peaks are masked by the components of the diesel itself when a diesel is present.

Finally, a set of spiked and burned diesels (two of which were treated as blind samples) was analyzed in an effort to simulate actual arson conditions. First, severe spread was observed among replicate burnings in both the PPMC coefficients and the scores plots from PCA, which confirms that the burning process is variable. Secondly, for PCA, it was determined that abundance plays a significant role when attempting to project scores for burned samples from eigenvectors calculated for a neat data set. In this case, a correction factor was applied to remedy the disparity in abundances between the neat and burned diesels. Ultimately, PPMC coefficients were a better marker for the identification of the blind samples, as differences in abundances did not affect the correlation coefficients. While it was possible to associate burned diesels to their neat counterparts using PCA, the association could only be made with minimal confidence. The PCA results for the EIP sets also showed promise for the association of burned samples to their neat counterparts.

Overall, it is important to note that the diesels contained in the data set are, by nature, very similar chemically. As a result, the ability to begin to make associations between neat and burned diesels, even at low confidence levels, is promising for the utility of the described methodology.
It is likely that associations between burned samples and their neat counterparts can be made with higher confidence when the methodology is applied to a range of ignitable liquids that differ greatly in chemical composition.

5.2 Future Work

The future direction of this project lies mostly in the optimization of the already developed methodology. A few issues in both the analytical technique and the data pre-treatment and analysis phases must be addressed. It was concluded that a reproducible injection method is crucial to the precision of the technique, and consequently to the accuracy of the results of the chemometric procedures. PCA is a procedure that focuses on differences present in the chromatograms, which should be due only to actual differences in chemical composition. An imprecise injection method can introduce artificial variations in the data set, and thus lead to erroneous results. Although an effort was made to standardize a manual injection method, a programmed injection method performed by an autosampler could improve precision and further reduce the chance of introducing artifacts.

The other obvious issue that needs to be addressed is the retention time alignment procedure. The current alignment algorithm is prone to misalignments, especially in the EIPs, because of their lower abundances compared with the TIC. This particular peak-matching algorithm could be amended, with full optimization of all user-defined parameters, to be more appropriate for these specific diesel data sets. However, the fault may lie in the method by which the alignment itself is performed. More recent work published on retention time alignment suggests that piecewise alignment algorithms or warping algorithms may be more effective and less likely to result in misalignments than more basic peak-matching algorithms such as the one used in this research [41-44].
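
To make the idea of a piecewise alignment concrete, the following Python sketch shifts each fixed-length segment of a sample chromatogram by the integer number of data points that best matches a target chromatogram. It is a deliberately simplified illustration, and it is neither the peak-matching algorithm used in this research nor any of the published algorithms cited above: the function name `piecewise_align`, the segment length, the maximum shift, the use of a sum of products as the matching criterion, and the synthetic two-peak signals are all assumptions made for the example.

```python
import numpy as np

def piecewise_align(target, sample, seg_len=100, max_shift=5):
    """Align sample to target by an independent integer shift in each segment."""
    padded = np.pad(sample, max_shift, mode="edge")  # lets windows run past the ends
    aligned = np.empty_like(sample)
    for start in range(0, len(sample), seg_len):
        end = min(start + seg_len, len(sample))
        width = end - start
        best_lo, best_score = start + max_shift, -np.inf
        for shift in range(-max_shift, max_shift + 1):
            # Window of the sample moved `shift` points later in time
            lo = start + max_shift - shift
            score = float(np.dot(target[start:end], padded[lo:lo + width]))
            if score > best_score:
                best_lo, best_score = lo, score
        aligned[start:end] = padded[best_lo:best_lo + width]
    return aligned

def gaussian_peak(points, center, sigma=3.0):
    """Single synthetic peak; illustrative only."""
    return np.exp(-0.5 * ((points - center) / sigma) ** 2)

points = np.arange(200, dtype=float)
target = gaussian_peak(points, 50) + gaussian_peak(points, 150)
# Sample with the first peak eluting 3 points late and the second 2 points early
sample = gaussian_peak(points, 53) + gaussian_peak(points, 148)

aligned = piecewise_align(target, sample)
print("max difference before alignment:", np.abs(sample - target).max())
print("max difference after alignment: ", np.abs(aligned - target).max())
```

A single global shift could not correct both peaks in this example, because they are displaced in opposite directions; treating each segment separately is what allows both to be matched. A sum of products favors the most abundant peaks in each segment, so a practical implementation would need a more careful criterion and a treatment of segment boundaries.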
The last issue to be dealt with is the abundance problem in the scores projections for the burned diesel samples to be associated and discriminated. It was determined that significant disparities in the chromatographic abundance levels between the burned, extracted diesels and the neat diesels from which the eigenvectors were calculated can cause the scores for the burned data to be significantly different from the neat diesel scores. In this case, a correction factor was applied to the scores for the burned diesels based on average differences in abundance levels. However, no investigation was made into the correction factor process, and further study is required before it can be considered a valid step. Different or even multiple correction factors may be necessary in order to better represent the burned data, especially in actual arson cases, in which the abundances for the chromatograms collected from extracted fire debris will be highly variable.

Once these aspects of the research are sufficiently remedied, broader objectives can be addressed. Diesel was the ignitable liquid chosen for this study because of its complex chemical composition, but in reality diesel is rarely seen in arson cases. Typically, lighter petroleum distillates such as gasoline and lighter fluid are the more frequently encountered ignitable liquids. When applied to different classes of ignitable liquids, the methodology could prove more useful as an associative and discriminatory tool. Further investigation into matrix interferences is also essential before this methodology can be adapted to a forensic laboratory setting.

With further research, this methodology could be beneficial in the fire debris analysis section of a forensic laboratory. At the moment, ignitable liquid identifications are purely subjective in nature. The chemometric procedures offer a more objective approach for the determination of ignitable liquids.
The methodology has the potential for complete automation, from the actual instrumental analysis all the way through the data analysis stage. Coupled with an extensive reference collection, this methodology could provide a powerful statistical method for the identification of ignitable liquid class, and potentially even the association of a burned ignitable liquid to its unburned counterpart.

APPENDICES

Appendix A
ASTM Classification Scheme

Table A.1 ASTM Ignitable Liquid Classification Scheme [4,5]

| Class | Light (C4-C9) | Medium (C8-C13) | Heavy (C8-C20+) |
|---|---|---|---|
| Gasoline-all brands, including gasohol | Fresh gasoline is typically in the range of C4-C12 | | |
| Petroleum Distillates | Petroleum Ether; Some Cigarette Lighter Fluids; Some Camping Fuels | Some Charcoal Starters; Some Paint Thinners; Some Dry Cleaning Solvents | Kerosene; Diesel Fuel; Some Jet Fuels; Some Charcoal Starters |
| Isoparaffinic Products | Aviation Gas; Specialty Solvents | Some Charcoal Starters; Some Paint Thinners; Some Copier Toners | Some Commercial Specialty Solvents |
| Aromatic Products | Some Paint and Varnish Removers; Some Automotive Parts Cleaners; Xylenes, Toluene-based Products | Some Automotive Parts Cleaners; Specialty Cleaning Solvents; Some Insecticide Vehicles; Fuel Additives | Some Insecticide Vehicles; Industrial Cleaning Solvents |
| Naphthenic Paraffinic Products | Cyclohexane-based Solvents/Products | Some Charcoal Starters; Some Insecticide Vehicles; Lamp Oils | Some Insecticide Vehicles; Lamp Oils; Industrial Solvents |
| n-Alkanes Products | Solvents: Pentane, Hexane, Heptane | Some Candle Oils; Copier Toners | Some Candle Oils; Carbonless Forms; Copier Toners |
| De-Aromatized Distillates | Some Camping Fuels | Some Charcoal Starters; Some Paint Thinners | Some Charcoal Starters; Odorless Kerosene |
| Oxygenated Solvents | Alcohols, Ketones; Some Lacquer Thinners; Fuel Additives; Surface Preparation Solvents | Some Lacquer Thinners; Some Industrial Solvents; Metal Cleaners/Gloss Removers | |
| Others-Miscellaneous | Single Component Products; Some Blended Products; Some Enamel Reducers | Turpentine Products; Some Blended Products; Various Specialty Products | Some Blended Products; Various Specialty Products |

Table A.2 Chromatographic and Mass Spectral Characteristics of ASTM Ignitable Liquid Classes [4,5]

| Class | Alkanes | Cycloalkanes | Aromatics | Polynuclear Aromatics |
|---|---|---|---|---|
| Gasoline-all brands, including gasohol | Present, less abundant than aromatics | Present, less abundant than aromatics | Abundant | Present |
| Petroleum Distillates | Abundant, Gaussian distribution | Present, less abundant than alkanes | Present, less abundant than alkanes | Present (depending on boiling range), less abundant than alkanes |
| Isoparaffinic Products | Branched alkanes abundant, n-alkanes absent or strongly diminished | Absent | Absent | Absent |
| Aromatic Products | Absent | Absent | Abundant | Abundant (depending on boiling range) |
| Naphthenic Paraffinic Products | Branched alkanes abundant, n-alkanes absent or strongly diminished | Abundant | Absent | Absent |
| n-Alkanes Products | Abundant | Absent | Absent | Absent |
| De-Aromatized Distillates | Abundant, Gaussian distribution | Present, less abundant than alkanes | Absent or strongly diminished | Absent or strongly diminished |
| Oxygenated Solvents | Composition may vary, presence of oxygenated organic compounds | | | |

Appendix B
PPMC Coefficients for the TIC and EIPs of Ten Neat Diesels Analyzed in Triplicate

Table B.1 PPMC Coefficients for the TIC of Triplicate Analyses of Diesels 1-10

Table B.2 PPMC Coefficients for the Alkane EIP of Triplicate Analyses of Diesels 1-10

Table B.3 PPMC Coefficients for the Aromatic EIP of Triplicate Analyses of Diesels 1-10

Table B.4 PPMC Coefficients for the Indane EIP of Triplicate Analyses of Diesels 1-10

Table B.5 PPMC Coefficients for the OCP EIP of Triplicate Analyses of Diesels 1-10

Table B.6 PPMC Coefficients for the PNA EIP of Triplicate Analyses of Diesels 1-10

Appendix C
PPMC Coefficients for the TIC and EIPs of Five Neat Diesels Analyzed in Triplicate