
AN INVESTIGATION OF UNSUPERVISED AND SUPERVISED MULTIVARIATE
STATISTICAL PROCEDURES FOR THE ANALYSIS OF FIRE DEBRIS
By
Suzanne Towner

A THESIS
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
MASTER OF SCIENCE
Forensic Science
2012

ABSTRACT
AN INVESTIGATION OF UNSUPERVISED AND SUPERVISED MULTIVARIATE
STATISTICAL PROCEDURES FOR THE ANALYSIS OF FIRE DEBRIS
By
Suzanne Towner
Gas chromatography-mass spectrometry (GC-MS) is the method of choice for analyzing
fire debris. Analysts perform a visual comparison between chromatograms of fire debris and
ignitable liquid standards. The analysis is both complex and subjective due to evaporation of the
liquid and interference compounds from the matrix as well as thermal degradation of both the
matrix and liquid. This research investigates the use of unsupervised and supervised multivariate
statistical procedures for simplifying the analysis and creating a more objective approach.
Principal components analysis, an unsupervised technique, was used in conjunction with
Pearson product moment correlation coefficients to successfully associate simulated fire debris
to corresponding ignitable liquid standards. To do this, liquid standards of gasoline and kerosene
were evaporated to different evaporation levels. The liquids were spiked onto unburned and
burned wood that had been previously treated with Danish Oil. Additionally, simulated debris
samples were generated by spiking the liquids onto the matrix prior to burning. The samples
were extracted, analyzed by GC-MS, and subjected to the unsupervised data analysis procedures.
Soft independent modeling of class analogy, a supervised classification technique, was
applied to replicate chromatograms from a set of six ignitable liquid standards, different from
those used above. The standards’ chromatograms were split into training and test sets. The
training set was used to generate models of each liquid to which the test set was classified.
Classification of the liquids was successfully performed using the total ion chromatograms and
extracted ion chromatograms.

Table of Contents
List of Tables
List of Figures

Chapter 1: Introduction
1.1 Background
1.2 Ignitable Liquid Classification
1.3 Extraction of Volatile Compounds from Fire Debris
1.4 Current Analysis of Fire Debris
1.5 Difficulties in Analysis of Fire Debris
1.6 Literature Review
1.6.1 Effects of Matrix Interferences and Thermal Degradation
1.6.2 The Application of Multivariate Statistical Procedures
1.7 Considerations for Statistical Analyses
1.8 Research Objectives and Goals
REFERENCES

Chapter 2: Theory
2.1 Passive Headspace Extraction
2.2 Gas Chromatography-Mass Spectrometry
2.3 Data Pretreatment
2.3.1 Smoothing
2.3.2 Retention Time Alignment
2.3.3 Normalization
2.4 Data Analysis
2.4.1 Principal Components Analysis
2.4.2 Pearson Product Moment Correlation Coefficients
2.4.3 Soft Independent Modeling of Class Analogy
REFERENCES

Chapter 3: Association of Simulated Fire Debris Samples to Corresponding Standards
Using Unsupervised Statistical Procedures
3.1 Introduction
3.2 Materials and Methods
3.2.1 Ignitable Liquid Standards
3.2.2 Surface-Treated Wood Samples
3.2.3 Inherent Matrix Interference Samples
3.2.4 Determination of Optimal Burn Time
3.2.5 Matrix Interference/Thermal Degradation Samples
3.2.6 Simulated Fire Debris Samples
3.2.7 Analysis of Samples by GC-MS
3.2.8 Data Pretreatment
3.2.9 Principal Components Analysis
3.2.10 Pearson Product Moment Correlation Coefficients
3.3 Results and Discussion
3.3.1 Characterization of Compounds Present in Ignitable Liquid Standards
3.3.1.1 Gasoline
3.3.1.2 Kerosene
3.3.2 Association and Discrimination of Ignitable Liquid Standards
3.3.3 PPMC Coefficients for Ignitable Liquid Standards
3.3.4 Characterization of Compounds Present in Surface-Treated Wood Flooring
3.3.5 Optimization of Burn Times
3.3.6 Association of Samples to Corresponding Standards in the Presence of Inherent
Matrix Interferences and Thermal Degradation
3.3.7 PPMC Coefficients for Inherent Matrix Interference Samples
3.3.8 PPMC Coefficients for Matrix Interference/Thermal Degradation Samples
3.3.9 Association of Simulated Fire Debris Samples to Corresponding Standards
3.3.10 PPMC Coefficients for Simulated Fire Debris Samples
3.4 Summary
REFERENCES

Chapter 4: Classification of Ignitable Liquid Standards using Soft Independent Modeling
of Class Analogy
4.1 Introduction
4.2 Materials and Methods
4.2.1 Liquid Standards
4.2.2 Analysis of Standards by GC-MS
4.2.3 Data Pretreatment
4.2.4 Principal Components Analysis
4.2.5 Soft Independent Modeling of Class Analogy
4.3 Results and Discussion
4.3.1 Characterization of Ignitable Liquid Standards
4.3.2 Principal Components Analysis of the Entire TIC Data set
4.3.3 Classification of Ignitable Liquid Standard TICs Using SIMCA
4.3.3.1 Coomans’ plots
4.3.3.2 Sample-to-Model Distance Versus Leverage Plots
4.3.3.3 The Unclassified Gasoline Sample
4.3.4 Classification of Ignitable Liquid Standard EICs Using SIMCA
4.3.4.1 Alkane EIC, m/z 99
4.3.4.2 EICs: m/z 91, 83, and 128
4.4 Summary
REFERENCES

Chapter 5: Conclusions
5.1 Summary of Research
5.1.1 Research Objectives and Goals
5.1.2 Unsupervised Multivariate Statistics Study Summary
5.1.3 Supervised Multivariate Statistics Study Summary
5.2 Future Work

List of Tables

Table 1.1: ASTM International classification of ignitable liquids.

Table 3.1: Mean Pearson product moment correlation coefficients ± standard deviations
calculated for replicates of standards at each evaporation level (n=105).

Table 3.2: Mean Pearson product moment correlation coefficients ± standard deviations
for replicates of the inherent matrix interference samples (n=105) and for samples to 0%
evaporated gasoline and kerosene (n=225).

Table 3.3: Mean Pearson product moment correlation coefficients ± standard deviations
for replicates of the matrix interference/thermal degradation samples (n=105) and for
samples to 0% evaporated gasoline and kerosene (n=225).

Table 3.4: Mean Pearson product moment correlation coefficients ± standard deviations
for replicates of the simulated fire debris samples (n=105) and for samples to 0%
evaporated gasoline and kerosene (n=225).

Table 4.1: The suggested number of principal components for soft independent modeling
of class analogy on total ion chromatograms.

Table 4.2: Classification Table of Ignitable Liquid TICs at 10% Significance Level.

Table 4.3: The suggested number of principal components for soft independent modeling
of class analogy on extracted ion chromatograms (m/z 99).

List of Figures

Figure 2.1: Schematic of a gas chromatograph.

Figure 2.2: Schematic of a mass spectrometer.

Figure 2.3: Diagram of a quadrupole mass analyzer.

Figure 3.1: Total ion chromatograms of A) 0%, B) 50%, and C) 90% evaporated
gasoline. The internal standard used was nitrobenzene.

Figure 3.2: Total ion chromatograms of A) 0%, B) 50%, and C) 90% evaporated
kerosene. The internal standard used was nitrobenzene.

Figure 3.3: Scores plot of PC1 versus PC2 based on the total ion chromatograms for
gasoline and kerosene at the three different evaporation levels. In terms of color, blue,
green, and purple represent 0%, 50%, and 90% evaporated kerosene while red, orange,
and yellow represent 0%, 50%, and 90% evaporated gasoline. For interpretation of the
references to color in this and all other figures, the reader is referred to the electronic
version of this thesis.

Figure 3.4: Loadings plot of PC1 based on the total ion chromatograms of the
unevaporated and evaporated ignitable liquid standards.

Figure 3.5: Loadings plot of PC2 based on the total ion chromatograms of the
unevaporated and evaporated ignitable liquid standards.

Figure 3.6: Mean-centered total ion chromatogram of the 50% evaporated gasoline
standard demonstrating the introduction of n-alkanes from the kerosene standards.

Figure 3.7: Total ion chromatograms of extracts of surface-treated wood burned for A) 0
seconds, B) 30 seconds, and C) 150 seconds.

Figure 3.8: Scores plot of PC1 versus PC2 based on the total ion chromatograms for the
ignitable liquid standards, represented by the squares, and the projected scores of the
inherent matrix interference samples, represented by the circles. In terms of color, blue,
green, and purple represent 0%, 50%, and 90% evaporated kerosene while red, orange,
and yellow represent 0%, 50%, and 90% evaporated gasoline.

Figure 3.9: Total ion chromatograms of a 50% evaporated gasoline standard (green) and
two 50% evaporated gasoline inherent matrix interference samples (red and black),
demonstrating the differences in abundance between the standards and samples.

Figure 3.10: Scores plot of PC1 versus PC2 based on the total ion chromatograms for the
ignitable liquid standards, represented by the squares, and the projected scores of the
matrix interference/thermal degradation samples, represented by the circles. In terms of
color, blue, green, and purple represent 0%, 50%, and 90% evaporated kerosene while
red, orange, and yellow represent 0%, 50%, and 90% evaporated gasoline.

Figure 3.11: Total ion chromatograms of a kerosene standard (red) and a matrix
interference/thermal degradation sample (black), demonstrating the difference in peak
width between the standards and samples.

Figure 3.12: Scores plot of PC1 versus PC2 based on the total ion chromatograms for the
ignitable liquid standards, represented by the squares, and the projected scores of the
simulated fire debris samples, represented by the circles. In terms of color, blue, green,
and purple represent 0%, 50%, and 90% evaporated kerosene while red, orange, and
yellow represent 0%, 50%, and 90% evaporated gasoline.

Figure 3.13: Total ion chromatograms of the C2-alkylbenzenes from the five simulated
fire debris samples generated using gasoline, demonstrating the variation in abundances
across samples.

Figure 4.1: Total ion chromatograms of A) insect repellent, B) gasoline, and C) paint
thinner, D) fuel stabilizer, E) fuel injector cleaner, and F) diesel with selected peaks
labeled.

Figure 4.2: Scores plot of PC1 versus PC2 based on the total ion chromatograms of the
ignitable liquid standards training and test sets: insect repellent (green), gasoline
(orange), paint thinner (yellow), diesel (blue), fuel injector cleaner (black), and fuel
stabilizer (red).

Figure 4.3: Loadings plot of PC1 based on the total ion chromatograms of the ignitable
liquid standards (training and test sets).

Figure 4.4: Loadings plot of PC2 based on the total ion chromatograms of the ignitable
liquid standards (training and test sets).

Figure 4.5: Coomans’ plot for the gasoline and insect repellent models (at a 10%
significance level) based on the total ion chromatograms of the training sets. The
sample-to-model distances are plotted for each of the ignitable liquids in the test set: insect
repellent (green), gasoline (orange), paint thinner (yellow), diesel (blue), fuel injector
cleaner (black), and fuel stabilizer (red). The class membership limit for the gasoline
model is overlaid on the plot in orange while the limit for the insect repellent model is
in green.

Figure 4.6: Coomans’ plot (at 10% significance level) for the gasoline and insect
repellent models based on the total ion chromatograms of the training sets. The
sample-to-model distances are plotted for the gasoline test samples (orange). The class
membership limit for the gasoline model is overlaid on the plot in orange while the limit
for the insect repellent model is in green.

Figure 4.7: Sample-to-model distance versus leverage plot for the gasoline model (at a
10% significance level) based on the total ion chromatograms of the gasoline training set.
The sample-to-model distances and leverage are plotted for each of the ignitable liquids
in the test set: insect repellent (green), gasoline (orange), paint thinner (yellow), diesel
(blue), fuel injector cleaner (black), and fuel stabilizer (red). The class membership limit
of both sample-to-model distance and leverage for the gasoline model is overlaid on the
plot in orange.

Figure 4.8: Sample-to-model distance versus leverage plot for the gasoline model (at a
10% significance level) based on the total ion chromatograms of the gasoline training set.
The sample-to-model distances and leverage are plotted for gasoline test samples
(orange). The class membership limit of both sample-to-model distance and leverage for
the gasoline model is overlaid on the plot in orange.

Figure 4.9: Coomans’ plot (at 25% significance level) for the gasoline and insect
repellent models based on the total ion chromatograms of the training sets. The
sample-to-model distances are plotted for the gasoline test samples (orange). The class
membership limit for the gasoline model is overlaid on the plot in orange while the limit
for the insect repellent model is in green.

Figure 4.10: Sample-to-model distance versus leverage plot for the gasoline model (at a
25% significance level) based on the total ion chromatograms of the gasoline training set.
The sample-to-model distances and leverage are plotted for gasoline test samples
(orange). The class membership limit of both sample-to-model distance and leverage for
the gasoline model is overlaid on the plot in orange.

Figure 4.11: Loadings plot of PC1 of the gasoline model based on the total ion
chromatograms of the gasoline standards.

Figure 4.12: Modeling power for the gasoline model based on the total ion
chromatograms of the gasoline training samples. The red line represents modeling power
of 0.3. Peaks that extend above this line significantly impact the model.

Figure 4.13: Modeling power for the insect repellent model based on the total ion
chromatograms of the insect repellent training samples. The red line represents modeling
power of 0.3. Peaks that extend above this line significantly impact the model.

Figure 4.14: A total ion chromatogram of insect repellent demonstrating the rise in
baseline that occurs at the end of the chromatogram.

Figure 4.15: Scores plot of PC1 versus PC2 based on the extracted ion chromatograms
(m/z 99) of the ignitable liquid standards training and test sets: insect repellent (green),
gasoline (orange), paint thinner (yellow), diesel (blue), fuel injector cleaner (black), and
fuel stabilizer (red).

Figure 4.16: Modeling power for the gasoline model based on the extracted ion
chromatograms (m/z 99) of the gasoline training samples. The red line represents
modeling power of 0.3. Peaks that extend above this line significantly impact the
model.

Figure 4.17: Scores plot of PC1 versus PC6 based on the extracted ion chromatograms
(m/z 99) of the ignitable liquid standards training and test sets: insect repellent (green),
gasoline (orange), paint thinner (yellow), diesel (blue), fuel injector cleaner (black), and
fuel stabilizer (red).

Chapter 1: Introduction
1.1 Background
Every year approximately 267,000 fires are the result of arson, which may be defined as
the setting of a fire with intent to cause damage or harm¹. Arson is a prevalent and destructive
crime that costs the United States an estimated $684,000,000 in damages annually¹. As a result,
numerous fire investigations are conducted yearly.
It is the job of fire investigators to determine if the fire was the result of an accident or
arson. Oftentimes, in intentional fires, an accelerant is used to maximize the spread and damage
of the fire. As a result, debris is collected from the fire scene and taken to a forensic laboratory
where it is extracted and analyzed for the presence of accelerants, such as ignitable liquids.

1.2 Ignitable Liquid Classification
Ignitable liquids are volatile and easy to ignite. They have a broad range of uses and
chemical compositions. ASTM International developed a classification scheme for ignitable
liquids based on their chemical composition (Table 1.1)². The eight classes are gasoline, petroleum
distillates, isoparaffinic products, aromatic products, naphthenic paraffinic products, normal
alkane products, oxygenated solvents, and miscellaneous. Each class, except gasoline, is further
characterized by the length of carbon chains present in the liquid as light (C4-C9), medium
(C8-C13), or heavy (C9-C20).

Table 1.1: ASTM International classification of ignitable liquids.

Class: Gasoline (all brands, including gasohol)
Composition: C3- and C4-alkylbenzenes and various aliphatic compounds
Carbon range: Fresh gasoline is typically in the range of C4-C12

Class: Petroleum Distillates
Composition: Homologous series of n-alkanes; less significant isoparaffinic, cycloparaffinic, and aromatic compounds
Light (C4-C9): Petroleum ether, cigarette lighter fluids, camping fluids
Medium (C8-C13): Charcoal starters, paint thinners, dry cleaning solvents
Heavy (C8-C20+): Kerosene, diesel fuel, jet fuels, charcoal starters

Class: Isoparaffinic Products
Composition: Branched chain (isoparaffinic); cyclic (naphthalenic) alkanes and n-alkanes insignificant or absent
Light (C4-C9): Aviation gas, specialty solvents
Medium (C8-C13): Charcoal starters, paint thinners, copier toners
Heavy (C8-C20+): Commercial specialty solvents

Class: Aromatic Products
Composition: Aromatic compounds; aliphatic compounds absent or insignificant
Light (C4-C9): Paint and varnish removers, automotive parts cleaners, xylenes, toluene-based products
Medium (C8-C13): Automotive parts cleaners, specialty cleaning solvents, insecticide vehicles, fuel additives
Heavy (C8-C20+): Insecticide vehicles, industrial cleaning solvents

Class: Naphthenic Paraffinic Products
Composition: Branched chain (isoparaffinic) and cyclic (naphthalenic) alkanes insignificant or absent
Light (C4-C9): Cyclohexane-based solvents/products
Medium (C8-C13): Charcoal starters, insecticide vehicles, lamp oils
Heavy (C8-C20+): Insecticide vehicles, lamp oils, industrial cleaning solvents

Class: n-Alkane Products
Composition: Only n-alkanes, typically containing 5 or less
Light (C4-C9): Solvents, pentane, hexane, heptane
Medium (C8-C13): Candle oils, copier toners
Heavy (C8-C20+): Candle oils, carbonless forms, copier toners

Class: Oxygenated Solvents
Composition: Oxygenated products including alcohols, esters, ketones; major components include toluene or xylene
Light (C4-C9): Alcohol, ketones, lacquer thinners, fuel additives, surface preparation solvents
Medium (C8-C13): Lacquer thinners, industrial solvents, metal cleaners/gloss removers
Heavy (C8-C20+): (no entry)

Class: Others-Miscellaneous
Composition: Liquids that cannot otherwise be classified
Light (C4-C9): Single component products, blended products, enamel reducers
Medium (C8-C13): Turpentine products, blended products, specialty products
Heavy (C8-C20+): Blended products, specialty products

1.3 Extraction of Volatile Compounds from Fire Debris
There are a number of different procedures that are approved by ASTM International for
the extraction of ignitable liquids from fire debris. The extraction method is tailored to the
sample matrix that is to be analyzed. For example, the ASTM standard does not recommend that
a liquid extraction be used for porous debris as the matrix may trap the solvent and result in an
inefficient extraction. Passive headspace extraction, on the other hand, is extremely sensitive
and efficient and is consequently the more commonly used method in many forensic
laboratories.
When debris is collected from the fire scene, it is sealed in an airtight container to prevent
any volatile compounds from being lost to the atmosphere. Analysts perform a passive headspace
extraction by suspending an activated charcoal strip (ACS) in the headspace of the container.
The sample is then placed in an oven for 2 to 24 hours at a temperature ranging from 50 °C to
80 °C³. The volatile compounds from the debris adsorb onto the ACS. An organic solvent such
as carbon disulfide, n-pentane, diethyl ether, or methylene chloride is used to elute the volatiles
from the ACS. The resulting extract is then most commonly analyzed by gas chromatography-mass
spectrometry (GC-MS).

1.4 Current Analysis of Fire Debris
Gas chromatography-mass spectrometry is the gold standard by which fire debris samples
are analyzed for evidence of residues from ignitable liquids. A total ion chromatogram (TIC) is
the product of analysis by GC-MS. A TIC is a graph in which retention time is on the abscissa
and abundance is on the ordinate axis. This graph shows the ion current from all ions for each
peak present in the chromatogram and, consequently, represents every extractable compound in

!

(!

the debris. A mass spectrum is also generated for each compound and can be used to determine
their identities.
Fire debris analysts perform a visual comparison between the TIC of an ignitable liquid
reference standard and that of the fire debris. Analysts look for similar compounds or peaks in
the two chromatograms, aiming to identify the presence of an ignitable liquid. The standards
used for comparison are typically generated in-house. Standard operating procedures are
established so that certain compounds characteristic of the ignitable liquid must be present in the
chromatogram in order for the analyst to determine that the liquid is indeed present in the debris
sample. Additionally, analysts use the relative ratios of peak abundances or peak patterns from
the standards to identify the presence of an ignitable liquid in the debris. One difficulty with this
type of analysis is that peaks from the debris itself, or matrix interference compounds, may mask
the presence of compounds from the ignitable liquid. As a result, the TIC can be very complex
and difficult to interpret. In order to overcome this problem and simplify the interpretation,
different types of chromatograms can be generated using computer software.
An extracted ion chromatogram (EIC) shows only the contribution of a specific selected
ion to peak abundances. An EIC can be more sensitive than a TIC because the selected ion may
be present only in compounds from the ignitable liquid and not in the matrix. The EIC could
therefore reveal the presence of compounds indicative of an ignitable liquid despite the
additional matrix compounds. Similarly, an extracted ion profile (EIP), which consists of
multiple extracted ions, can also be used to reveal concealed peaks. Another alternative approach
is to use selected peaks based on their retention times in the chromatogram. Selected peaks may
be especially helpful when analyzing samples according to pattern recognition or similarities in
peak ratios. In this method, compounds characteristic of the ignitable liquid are selected while all
others are removed from the chromatogram. Using this procedure, all compounds at the other
retention times, which are likely to originate from the matrix, are removed to decrease the
complexity of the data analysis. However, ions from the matrix may contribute to the height of
the selected peaks because the abundances are typically based on the total ion current as opposed
to extracted ions.
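The relationship between the TIC, an EIC, and an EIP can be illustrated with a minimal sketch. It assumes the GC-MS data are available as a matrix with one row per scan (retention time) and one column per recorded m/z value; the function names and the abundance values are hypothetical and chosen only for illustration. The EIP is taken here as the sum of the selected ions at each scan, which is one common convention and an assumption rather than a procedure stated in the text.

```python
import numpy as np

def total_ion_chromatogram(intensity):
    """Sum the abundance of all recorded ions at each scan."""
    return intensity.sum(axis=1)

def extracted_ion_chromatogram(intensity, mz_axis, mz):
    """Return the abundance of one selected ion at each scan."""
    columns = np.flatnonzero(mz_axis == mz)
    if columns.size == 0:
        raise ValueError(f"m/z {mz} was not recorded")
    return intensity[:, columns[0]]

def extracted_ion_profile(intensity, mz_axis, mz_values):
    """Sum the abundances of several selected ions at each scan (assumed convention)."""
    traces = [extracted_ion_chromatogram(intensity, mz_axis, mz) for mz in mz_values]
    return np.sum(traces, axis=0)

# Hypothetical data: 3 scans (rows) by 4 recorded ions (columns).
mz_axis = np.array([83, 91, 99, 128])
intensity = np.array([
    [10.0,  0.0, 40.0, 0.0],
    [ 0.0, 50.0,  0.0, 5.0],
    [ 5.0, 20.0, 30.0, 0.0],
])

tic = total_ion_chromatogram(intensity)
eic_99 = extracted_ion_chromatogram(intensity, mz_axis, 99)
eip = extracted_ion_profile(intensity, mz_axis, [91, 128])
print(tic, eic_99, eip)
```

In this toy matrix the second scan contributes to the TIC but not to the m/z 99 trace, which is the sense in which an EIC can suppress peaks that do not contain the selected ion.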

1.5 Difficulties in Analysis of Fire Debris
While the visual comparison between the chromatogram from fire debris and that of a
standard is, conceptually, quite simple, it is greatly complicated by many factors such as the
evaporation of the ignitable liquid, interference compounds from the debris matrix, and thermal
degradation of both the liquid and the matrix.
The evaporation of the ignitable liquid is quite problematic because it can lead to the loss
of volatile compounds that are characteristic of a specific liquid and that could aid in its
identification. Many times the ignitable liquids used to commit arson are not purchased specially
for the deed; instead, they are taken from garages and storage sheds where they may have been
sitting for a period of time. Evaporation of the volatile components can occur during this time.
Evaporation can also occur during the burning of the fire since the heat of the flame can cause
the more volatile compounds in the ignitable liquid to evaporate. Certain volatile compounds
may be partially or totally removed from the debris sample. Evaporation of volatiles is reflected
in the debris chromatogram and the partial or total loss could result in peak ratios differing from
those observed in the corresponding reference standard.
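The effect of evaporation on peak ratios can be shown with a short worked example. The abundances and the fraction of each compound retained after evaporation are invented for illustration and are not measurements from this work; the only assumption carried over from the text is that the more volatile compounds are lost preferentially.

```python
def peak_ratios(abundances, reference):
    """Express each peak abundance relative to a chosen reference peak."""
    ref = abundances[reference]
    if ref == 0:
        raise ValueError("reference peak has zero abundance")
    return {name: value / ref for name, value in abundances.items()}

# Hypothetical peak abundances for an unevaporated standard.
standard = {"toluene": 100.0, "C2-alkylbenzenes": 80.0, "C3-alkylbenzenes": 40.0}

# Hypothetical fraction of each compound remaining after evaporation;
# the most volatile compound (toluene) is assumed to lose the most.
retained = {"toluene": 0.2, "C2-alkylbenzenes": 0.5, "C3-alkylbenzenes": 1.0}

evaporated = {name: standard[name] * retained[name] for name in standard}

ratios_standard = peak_ratios(standard, "C3-alkylbenzenes")
ratios_evaporated = peak_ratios(evaporated, "C3-alkylbenzenes")

for name in standard:
    print(f"{name}: {ratios_standard[name]:.2f} -> {ratios_evaporated[name]:.2f}")
```

With these made-up numbers the toluene ratio falls from 2.5 to 0.5 relative to the C3-alkylbenzenes, so a debris sample and its unevaporated standard would no longer show the same peak pattern even though they share a source.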
While compounds can be removed from the chromatogram through evaporation, they can
also be added to the chromatogram from the debris matrix. Common items such as clothing and
building materials contain volatile compounds that may be incorporated into the chromatogram
of the debris sample. As a result, it is extremely important to know the type of debris matrix
being analyzed and to identify the compounds that the matrix is likely contributing to the
chromatogram. To combat this problem, analysts are often given more debris from the scene that
is unlikely to be contaminated with an ignitable liquid. The additional debris sample is analyzed
and a chromatogram is generated, which is used as an exclusionary tool to determine the
compounds that come from the matrix itself.
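The exclusionary use of a control sample can be sketched as a simple comparison of peak retention times. The function name, the retention times, and the 0.05-minute matching tolerance are all hypothetical; in practice a retention-time match alone would not establish identity, and the mass spectrum of each peak would also need to be compared.

```python
def split_by_control(debris_peaks, control_peaks, tolerance=0.05):
    """Separate debris peaks that coincide with a control (matrix) peak from those that do not.

    Peaks are given as retention times in minutes; `tolerance` is an assumed
    matching window, not a value taken from any standard.
    """
    matrix_peaks, candidate_peaks = [], []
    for rt in debris_peaks:
        if any(abs(rt - control_rt) <= tolerance for control_rt in control_peaks):
            matrix_peaks.append(rt)
        else:
            candidate_peaks.append(rt)
    return matrix_peaks, candidate_peaks

# Hypothetical retention times (minutes).
debris = [4.10, 6.52, 8.30, 11.75]
control = [6.50, 11.78]

matrix_peaks, candidate_peaks = split_by_control(debris, control)
print("Attributed to matrix:", matrix_peaks)
print("Remaining candidates:", candidate_peaks)
```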
Thermal degradation, which occurs at temperatures between 100 °C and 300 °C, affects
both the debris and the ignitable liquid⁴. Thermal degradation is the breakdown of compounds
that occurs due to the heat of the fire. This can further complicate the chromatogram of a debris
sample because the degradation of the compounds can lead to the generation of new and
sometimes unexpected compounds. Additionally, thermal degradation can lead to a change in
peak ratios in the chromatogram of the debris as compared to the chromatogram of the reference
standard.
All of the aforementioned factors work together to complicate the already subjective
visual comparison between the chromatograms of the debris and standards. These complications
can lead to analysts testifying in court that an ignitable liquid was used to set a fire when, in fact,
it was not, and vice versa. There is an obvious need for improved data analysis and interpretation
procedures, as well as safeguards to reduce the number of incorrect conclusions by analysts. In a
2009 report entitled Strengthening Forensic Science in the United States: A Path Forward, the
National Academy of Sciences criticized the entire forensic community for the lack of
peer-reviewed research able to withstand Daubert hearings and provide statistical evaluations of the
evidence⁵.


1.6 Literature Review
While arson investigation is extremely complex, research has been performed to improve
current methods in an attempt to simplify fire debris analysis. Some studies have identified
interference compounds that come from different matrices, as well as their thermal degradation
products. Other studies have investigated the usefulness of statistical procedures to increase the
certainty of analysts’ findings and avoid the subjectivity involved in a simple visual comparison
of chromatograms.

1.6.1 Effects of Matrix Interferences and Thermal Degradation
Lentini et al. addressed the issue of inherent matrix interference compounds from many
common items⁶. The matrices examined included materials such as clothing, shoes, and building
materials. The aim of this study was to demonstrate that compounds indicative of petroleum
products can routinely be detected in common items even though no ignitable liquids have been
added to them. These materials, examined without being burned, were extracted using the
passive headspace method and analyzed by GC-MS. The results show that some items contained
compounds indicative of an ignitable liquid; however, the peak ratios were such that an
experienced analyst would not likely mistake them as coming from an ignitable liquid. Other
items, on the other hand, such as spandex, gave a “strong pattern” indicative of kerosene. These
results were not unexpected since petroleum products are used to manufacture many common
household items. In terms of building materials, the authors concluded that the presence of floor
coatings may be a larger problem than previously recognized because petroleum distillate
solvents are used in many coatings including stains and may be detected months after
application.
Most wood used in homes is treated with a finish such as paint, stain, or other surface
protectants. In many cases, the treatments contain compounds also found in ignitable liquids and
so can be particularly problematic in fire debris analysis. Hetzel and Moss attempted to
determine the point after the last application at which the petroleum distillates from a wood
waterproofing coating could no longer be detected on an outdoor patio⁷. The wood purchased for
the study was pretreated with a preservative and fungicide, as outdoor wood commonly is. The
pretreated pine contained some aldehydes that could further mimic the presence of a medium
petroleum distillate. The authors performed two identical experiments in which waterproofer was
applied to a treated lumber deck and small samples of decking were collected over several days
and analyzed using GC-MS. The combined results indicated that medium petroleum distillates
could still be isolated from the decking 16 days after the last application, but not more than 20
days. Conditions during the studies were warm and predominantly dry, with average
temperatures of 76 °F and 67 °F and average rainfalls of 1 and 9 cm for the two
studies.
In a similar study pertaining to indoor treated lumber, Lentini treated pine and oak
flooring with either stain and a polyurethane sealer or with an oil finish⁸. The treated wood was
sampled over a 24-month period and analyzed by GC-MS. Solvents from the surface treatments
were characterized on both types of wood up to two years after the application. In addition, the
solvents were all present in essentially the same amounts as when they were first applied to the
wood boards, regardless of the time point at which they were sampled.


In a study on inherent matrix interferences and thermal degradation products, Almirall
and Furton characterized compounds found in common residential and commercial objects (both
old and new) by burning the materials, extracting volatiles via the passive headspace procedure,
and analyzing the extracts by GC-MS⁹. The burning was performed at different temperatures and
with varying amounts of oxygen present. Some volatiles inherent to the matrix and created by
thermal degradation were found to produce certain target compounds indicative of an ignitable
liquid residue, but did not generate the same peak ratios that would be found in a chromatogram
of a neat ignitable liquid.
In a similar series of experiments, Fernandes et al. partially burned many common
household items (newspaper, carpet flooring, painted wood, etc.) in an attempt to characterize
the matrix interferences in both new and one-month-old items and to determine whether more
compounds were extracted from the new or the one-month-old items¹⁰. All volatiles were
extracted using the passive headspace procedure and analyzed by GC-MS. For most tested items,
it was concluded that new items contained more volatiles and created more interferences in
chromatograms than the older items. The authors also concluded that the majority of the matrix
interferences were inherent to the substrate and did not occur as a result of thermal degradation.
While volatiles were a source of interferences in the chromatogram, the authors stated that they
could not be misidentified as an ignitable liquid because they lacked the characteristic peak
profile of a neat ignitable liquid and that any potential for misidentification could be overcome
by using control samples. While the authors did examine painted wood, they did not investigate
the effects of matrix interference and thermal degradation compounds on the misidentification of
an ignitable liquid from other surface treatments.

While the studies by Almirall and Furton and by Fernandes et al. concluded that matrix
interferences do not mimic the correct pattern of ignitable liquids and should not be mistaken for
such a liquid, neither study attempted to burn samples in the presence of an ignitable liquid [9, 10].
It is unclear in either study whether the authors took into account the change that occurs in the
pattern of an ignitable liquid that is burned due to the loss of volatiles (evaporation) and the
thermal degradation of the liquid itself. Also, when a small volume of an ignitable liquid is
applied to a substrate, the residue may be at levels just above the detection limit of the GC-MS
and matrix interferences as well as thermal degradation products could mask the visible pattern
of peaks from the ignitable liquid residue. Additionally, the studies by Almirall and Furton and
Fernandes et al. did not address the concerns raised by the three previous studies showing
that wood, especially surface-treated wood, could contribute compounds to the chromatogram
that are indicative of an ignitable liquid. Almirall and Furton did investigate the thermal
degradation products of pine wood; however, the thermal degradation products of a surface
treatment, which can be detected over two years after application, were not investigated.
In a study by Dehaan and Bonarius, gasoline, paint thinner and camp fuel were used as
accelerants in a real-life experimental burning of floor coverings such as carpet, padding, and
synthetic turf [11]. Debris samples were immediately removed from the fire scene, extracted using
the passive headspace procedure, and analyzed using GC-MS. The authors found that, while
floor coverings do produce volatiles when burned, the liquids were still identifiable on the debris.
It was noted, however, that the volatiles from the flooring could lead to a misidentification of the
liquid as a special blend liquid due to deviations from the characteristic patterns of certain
ignitable liquids. The results of this study were very encouraging; however, the experimenters
did not look specifically at surface-treated wood flooring.

In an ambitious attempt to determine the element of a fire that most affects the
identification of an accelerant, Borusiewicz et al. performed a study on the effect of the type of
accelerant, type of burned matrix, the length of burn time, and the availability of air on the
detection and identification of ignitable liquid residues [12]. Five different ignitable liquids,
including gasoline and kerosene, were investigated. The liquids were spiked onto the matrices (carpet,
wood logs, chipboard) and the samples were burned until they self-extinguished. The samples
were immediately collected, extracted using the passive headspace technique, and analyzed by
GC-MS. The authors found that the type of burned matrix has the biggest effect on the
identification of accelerants. No ignitable liquid was identified in the wood logs; however, in
reality, wood does not always burn until it self-extinguishes. Additionally, it would be useful to
investigate treated wood as that is the most likely form of wood in a structure fire. Lastly, the
spike volume used for each of the liquids was not optimized before or during the study, so an
ignitable liquid in a sufficiently large volume may be detected from wood fire debris.

1.6.2 The Application of Multivariate Statistical Procedures
In a series of three studies, Sandercock and Du Pasquier attempted to use principal components
analysis (PCA) and linear discriminant analysis (LDA) to differentiate various gasoline samples.
In the first study, 35 randomly collected gasoline samples were analyzed based on their trace
polar and polycyclic aromatic hydrocarbon content [13]. The gasoline samples were unevaporated
and consisted of three different grades: regular unleaded, premium unleaded, and lead
replacement. A solid phase micro-extraction (SPME) procedure was used to extract the
compounds from the gasoline samples. The extracts containing the different types of compounds
were analyzed by GC-MS in selected ion monitoring mode for each sample. The authors
concluded that the trace polar compounds did not vary significantly among gasoline samples and
should not be used as distinguishing compounds. The polycyclic aromatic hydrocarbons,
specifically the C0- to C2- naphthalenes, on the other hand, were sufficiently variable across
gasoline samples and were able to distinguish the samples using PCA and LDA.
In the second study, the change in the C0- to C2- naphthalene content across evaporation
levels of gasoline samples, as well as unevaporated gasoline samples, collected over an extended
period of time was investigated [14]. For the first part of the study, 35 gasoline samples, of the
same three grades as before, were evaporated to different extents (25, 50, 75, and 90% by
weight). The samples were analyzed in a similar manner as in the first study and PCA was
performed in conjunction with LDA on the resulting chromatograms. Using PCA on the C0- to
C2-naphthalene compounds, the evaporated gasoline samples were successfully associated to
their respective unevaporated counterparts. In the second part of the study, 96 unevaporated
gasoline samples were collected from three stations over a 16-week time period and analyzed as
described above. Again, the C0- to C2- naphthalenes were used to differentiate samples from one
another and associate samples collected from the same stations. Using PCA and LDA, all 96
gasoline samples could be distinguished from one another based on differences in the
naphthalene peak ratios.
In an almost identical manner, the third study investigated the ability of unevaporated
gasoline samples from different locations in two different countries to be differentiated from one
another, again using only the C0- to C2- naphthalenes for each sample [15]. By applying PCA to
the data set, 28 samples from New Zealand could be differentiated from 24 samples collected in
Australia. All of the samples from Australia could also be differentiated from one another, but
only half of the samples from New Zealand could be differentiated from one another using these
compounds and PCA.
The series of studies performed by Sandercock and Du Pasquier demonstrates the success
and potential of multivariate statistical procedures in differentiating multiple samples of one
ignitable liquid [13, 14, 15]. These are encouraging data that may, someday, help link ignitable liquids
at fire scenes to those found in a suspect’s possession. These studies, however, do not investigate
the usefulness of these statistical procedures when applied to real fire debris where matrix
interferences are present and ignitable liquid residues must be extracted from the debris.
Hupp et al. used PCA and Pearson product moment correlation (PPMC) coefficients to
investigate the discrimination of 25 different diesel samples across 13 brands that were analyzed
by GC-MS [16]. It was demonstrated that PCA on the TICs could differentiate the diesel samples
into 4 distinct groups based on their chemical compositions. The groupings observed with the
PCA were further confirmed by PPMC coefficients for intragroup samples, which indicated
strong similarities between samples. Additionally, the authors performed PCA on the alkane and
aromatic EIPs of each diesel sample and found that even greater discrimination of the samples
was obtained, which was again reflected in the calculated values of the PPMC coefficients. The
authors demonstrated that these statistical procedures could be used to differentiate between
diesel samples; however, supervised statistical procedures were not investigated in this study.
Principal components analysis was further investigated along with canonical variate
analysis (CVA) and orthogonal canonical variate analysis (OCVA), which was used in
conjunction with LDA, in a study by Petraco et al. [17]. The authors used 15 selected compounds to
differentiate replicates of gasoline chromatograms accumulated from 20 separate fire scene
investigations. All of the statistical procedures allowed for discrimination of the samples. For
CVA, OCVA, and PCA, the number of dimensions required for accurate differentiation was 3, 4,
and 10, respectively. This demonstrates that all of the statistical procedures can be used to
differentiate samples, but that there is room for improvement. Oftentimes, the tenth dimension in
PCA accounts for a very small percentage of overall variance and, as a result, a weak
differentiation of samples; therefore, statistical procedures that allow for a more definitive
differentiation would be beneficial in a forensic setting. The authors acknowledged that even
though the results are promising, preliminary studies they have performed with evaporated or
degraded gasoline samples currently limit the usefulness of some of these statistical procedures
as the samples could not be associated to corresponding ignitable liquid standards.
Bodle and Hardy investigated the potential use for other statistical analyses such as soft
independent modeling of class analogy (SIMCA), in addition to hierarchical cluster analysis
(HCA) and PCA [18]. In a study aimed at optimization of an extraction by SPME and analysis by
gas chromatography-flame ionization detection, the authors generated chromatograms of
ignitable liquids including gasoline, diesel, and kerosene. To condense the data set, the resulting
TICs were divided into 30-second or 60-second intervals, the signal intensities of which were
summed, such that the statistical analyses were performed on the 114 or 57 newly calculated
variables. The ultimate goal of this project was to investigate whether a supervised classification
procedure such as SIMCA could be used to group ignitable liquids according to the ASTM
International classification scheme (Table 1.1). Hierarchical cluster analysis was used to
determine natural linkages or groups within the data set of ignitable liquids collected. Later, PCA
models were generated using the previously generated variables, which showed strong
correlations between ignitable liquids and their respective classes. Lastly, the authors concluded
that SIMCA was potentially useful as it was able to correctly classify 97.2% of the ignitable
liquid samples. The samples that were not correctly classified were clear outliers of the entire
data set and were not assigned to any other ignitable liquid classes.
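The data-condensing step described above, in which a chromatogram is divided into fixed time intervals and the signal within each interval is summed, can be illustrated with a minimal sketch. The function name, the one-point-per-second sampling, and the two-minute example chromatogram are assumptions made for illustration only and are not taken from the study by Bodle and Hardy. As a check on the reported numbers, 114 intervals of 30 seconds and 57 intervals of 60 seconds both correspond to a run of 57 minutes.

```python
def bin_chromatogram(times_s, intensities, width_s):
    """Sum signal intensities over consecutive retention-time windows of width_s seconds."""
    if len(times_s) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    if not times_s:
        return []
    start = times_s[0]
    n_bins = int((times_s[-1] - start) // width_s) + 1
    bins = [0.0] * n_bins
    for t, y in zip(times_s, intensities):
        # Each data point contributes to exactly one interval.
        bins[int((t - start) // width_s)] += y
    return bins


# Hypothetical chromatogram: one data point per second for two minutes,
# with a constant signal of 1.0 (illustrative values only).
times = list(range(120))
signal = [1.0] * 120

binned_30 = bin_chromatogram(times, signal, 30)  # four 30-second variables
binned_60 = bin_chromatogram(times, signal, 60)  # two 60-second variables
print(len(binned_30), len(binned_60))
```

Wider intervals give fewer variables for the statistical analysis, at the cost of merging neighboring peaks into a single value.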
The method of selecting the variables used for differentiation in the studies by Bodle and
Hardy and Petraco et al. may not be practical in a forensic laboratory where analysts will not
likely know the identity of the ignitable liquid before testing [17, 18]. Even though TICs, EICs, and
EIPs are likely to be more realistically useful, the authors of these studies did not investigate the
advantages or disadvantages of using these chromatograms, rather than selected variables, for
classification procedures.
A study by Tan et al. also investigated the use of SIMCA and PCA for the identification
and classification of over 50 ignitable liquids by the ASTM International classification, which
were extracted from unburned wood and carpet matrices [19]. After the ignitable liquids were
exposed to the matrix, the samples were solvent extracted and analyzed by GC-MS. The
resulting TICs and selected EICs were divided into 19 equal parts and the signal was summed for
each section, which generated the 19 new variables that were used for the statistical analyses. All
liquids were correctly classified using this procedure. Simulated fire debris samples were also
generated by adding some of the ignitable liquids to carpet and then burning it. The identity of
the liquid used to make the simulated debris was also correctly determined using a SIMCA
model. While these results are extremely promising, the authors did not investigate the effects of
a surface treatment on the wood, which could complicate identification of a liquid, nor did they
investigate the use of the original TICs, EICs, or EIPs for classification purposes.

Baerncopf et al. conducted a study that accounted for thermal degradation as well as the
matrix interferences encountered in fire debris analysis [20]. Six ignitable liquids, from different
ASTM International classes, were spiked onto a carpet matrix and burned. Samples underwent a
passive headspace extraction and subsequent analysis by GC-MS. Principal components analysis
and PPMC coefficients were successfully applied to the full TICs to objectively associate the
ignitable liquid residues back to their corresponding neat liquids. The effect of evaporation on
association was not examined in these experiments; however, the positive results from this study
demonstrate the potential for the use of some multivariate statistical procedures and provide a
foundation for the research performed in this thesis with surface-treated wood as a matrix.
The aforementioned studies demonstrate the potential of using multivariate statistical
procedures for the purpose of classifying ignitable liquids; however, very few of the analyses
were performed on chromatograms representative of those that would result from an actual fire
debris sample. Other than the work by Tan et al. and Baerncopf et al., the effects of thermal
degradation on the ignitable liquid and the difficulties that arise from extracting the volatiles
from a matrix were not investigated [19, 20]. Furthermore, no statistical analyses were performed
on simulated fire debris containing surface-treated wood. This type of investigation is a
necessary next step since surface-treated wood is commonly used in building and decorating and,
as a result, is likely to be contained in fire debris submitted to forensic laboratories for analysis.

1.7 Considerations for Statistical Analyses
While multivariate statistical procedures have been shown to be promising in a research
setting, they introduce new difficulties. Principal components analysis, for example, is such a
powerful tool because it describes the data set in terms of the factors corresponding to the
greatest variance. This type of analysis procedure is so sensitive to variation, however, that it will
sometimes place more emphasis on meaningless nonchemical variations as opposed to the
chemical variations that actually describe the data. To minimize these meaningless differences,
data pretreatment procedures can be, and often are, performed on chromatographic data prior
to data analysis. These procedures include smoothing, retention time alignment, and
normalization of chromatograms in the data set.
A smoothing algorithm is often applied to the data first because chromatograms consist
of both noise and signal. Noise is unintentionally introduced as part of the data collection process
and can come from many different sources, such as random fluctuations in measurements made
by a detector. Signal, on the other hand, is the desired output, which describes the data. Noise
can be extremely detrimental to data analysis because it is possible for the noise to mask or
misrepresent the signal and, therefore, distort the results of the data analysis. A smoothing
algorithm minimizes the noise while enhancing the true signal of the data.
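As a simple illustration of smoothing, the sketch below applies a centered moving average to a short, hypothetical signal. The moving average is chosen here only because it is easy to follow; it is an assumption for illustration and is not necessarily the smoothing algorithm applied in this research. The function name, window size, and data values are likewise hypothetical.

```python
def moving_average(signal, window=5):
    """Smooth a signal with a centered moving average (window shrinks at the edges)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        segment = signal[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical noisy baseline (alternating 0 and 1) with one peak of height 10.
noisy = [0.0, 1.0, 0.0, 1.0, 10.0, 1.0, 0.0, 1.0, 0.0]
smooth = moving_average(noisy, window=3)
print(smooth)
```

In this example the baseline fluctuations are reduced, but the peak maximum is also lowered from 10 to 4, which shows why the window size must be kept small relative to the peak width.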
After the signal of chromatographic data has been enhanced, it is commonly retention
time aligned. Retention time drift can cause the same peak in different chromatograms to have
different retention times. This drift occurs naturally when samples are analyzed over a period of
time. As a result of retention time drift, variation is identified across chromatograms that should
not exist. This can be corrected by applying alignment algorithms to the chromatographic data.
Ideally, the end result of alignment is that corresponding compounds should have the same
retention time across all chromatograms.
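The idea behind retention time alignment can be sketched with the simplest possible case: a single rigid shift of one chromatogram relative to a target chromatogram, chosen to maximize the overlap between the two. This is an assumption made for illustration; practical alignment algorithms generally allow different corrections in different regions of the chromatogram, and the function names and data values below are hypothetical.

```python
def best_shift(target, sample, max_shift):
    """Return the integer shift (in data points) that best aligns sample to target."""
    def overlap_score(shift):
        total = 0.0
        for i, y in enumerate(sample):
            j = i + shift
            if 0 <= j < len(target):
                total += y * target[j]
        return total
    return max(range(-max_shift, max_shift + 1), key=overlap_score)


def apply_shift(sample, shift, fill=0.0):
    """Move every data point by shift positions, padding vacated positions with fill."""
    shifted = [fill] * len(sample)
    for i, y in enumerate(sample):
        j = i + shift
        if 0 <= j < len(sample):
            shifted[j] = y
    return shifted


# Hypothetical target chromatogram and a copy whose peaks elute two points later.
target = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0]
drifted = [0.0, 0.0, 0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 2.0, 0.0]

shift = best_shift(target, drifted, max_shift=3)
aligned = apply_shift(drifted, shift)
print(shift, aligned == target)
```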
Normalization is commonly performed next and is used to reduce the non-significant
variations in peak abundance between replicates, between samples, or between sample
populations. These variations have many different sources and may be inherent to the data
collection process. Again, for an analysis procedure such as PCA, which describes the greatest
sources of variance, random differences in peak abundance may result in the data being
inaccurately described.
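One common form of normalization, shown below as a minimal sketch, divides each point in a chromatogram by the total signal so that every chromatogram sums to one. Total-signal normalization is used here as an assumed example and is not necessarily the normalization applied in this research; the replicate values are hypothetical.

```python
def normalize_total_area(signal):
    """Scale a chromatogram so that its intensities sum to 1."""
    total = sum(signal)
    if total == 0:
        raise ValueError("cannot normalize an all-zero chromatogram")
    return [y / total for y in signal]


# Two hypothetical replicates with the same peak profile but a twofold
# difference in abundance (for example, from a difference in injection volume).
rep_a = [0.0, 2.0, 6.0, 2.0, 0.0]
rep_b = [0.0, 4.0, 12.0, 4.0, 0.0]

norm_a = normalize_total_area(rep_a)
norm_b = normalize_total_area(rep_b)
print(norm_a)
print(norm_b)
```

After normalization the two replicates have the same profile, so the abundance difference no longer contributes variance to a subsequent analysis such as PCA.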

1.8 Research Objectives and Goals
The current methods of fire debris analysis are extremely subjective, even with standard
operating procedures and other safeguards. This research attempts to demonstrate the potential of
both unsupervised (PCA) and supervised (SIMCA) statistical procedures for performing
objective fire debris analyses.
The combination of PPMC coefficients and PCA has been successfully used in the
literature to associate evaporated liquids and simulated fire debris made of carpet. The first
objective of this research was to investigate the effects of evaporation, matrix interferences, and
thermal degradation on the association of surface-treated wood samples containing ignitable
liquids to their respective standards using PCA and PPMC coefficients. In order to meet the first
objective, standards and three data sets were generated. Each data set demonstrated the effects of
evaporation, matrix interferences, and thermal degradation in a piecewise manner using a
surface-treated wood matrix.
All statistical analyses were performed on the full TICs of each data set. To evaluate the
unsupervised association of samples to their standards, PCA was first performed on the liquid
standards. The samples from each data set were later projected separately onto the scores plot of
the standards. A visual assessment of the resulting scores plots was used to gauge the association
of the samples to the standards in light of the complicating factors. For each association, mean
PPMC coefficients were also calculated to provide a numerical value of the similarity between
samples.
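The mean PPMC coefficient mentioned above can be sketched as follows: a coefficient is calculated point-by-point for every pairing of a chromatogram from one group with a chromatogram from another group, and the coefficients are averaged. The function names and the four-point data vectors are hypothetical and serve only to show the calculation.

```python
import math


def ppmc(x, y):
    """Pearson product moment correlation coefficient between two equal-length signals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("signals must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant signal has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def mean_ppmc(group_a, group_b):
    """Average the PPMC coefficients over every pairing between two groups."""
    coefficients = [ppmc(a, b) for a in group_a for b in group_b]
    return sum(coefficients) / len(coefficients)


# Hypothetical signals: y has the same profile as x, while z has the reverse profile.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
z = [4.0, 3.0, 2.0, 1.0]

print(ppmc(x, y), ppmc(x, z), mean_ppmc([x], [y, z]))
```

Because the coefficient depends only on the shape of the two signals, x and y are perfectly correlated even though their abundances differ by a factor of two.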
The second objective of this research was to perform a preliminary investigation on the
potential of SIMCA for providing a supervised classification of ignitable liquids without the
effects of the aforementioned complicating factors. In order to fulfill the second objective, a
new set of ignitable liquid standards was generated. Six ignitable liquids were chosen for this
preliminary study, all from different ASTM International classes. The liquids used were fuel
stabilizer, gasoline, paint thinner, insect repellant spray, diesel, and fuel injector. Each liquid was
diluted in methylene chloride and analyzed in replicate by direct injection GC-MS, generating
fifteen chromatograms per liquid. Initially, PCA was performed on the entire data set to
determine natural groupings of the liquids, according to chemical composition. Next, SIMCA
models were generated and validated using the TICs as well as selected EICs and EIPs in an
attempt to determine which type of chromatogram, if any, is more successful for classification
purposes. Lastly, SIMCA models were developed based on the unevaporated and evaporated
gasoline and kerosene standards from the previous study to demonstrate the effects of
evaporation and passive headspace extraction on the supervised classification.
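The principle of SIMCA can be illustrated with a heavily simplified sketch, in which a separate one-component PCA model is built for each class from training data and an unknown is assigned to the class whose model leaves the smallest residual. This is an assumption-laden illustration and not the procedure used in this research: a full SIMCA implementation selects the number of components for each class and applies a statistical limit to the residuals, so that a sample may be assigned to no class or to more than one class. All names and data values below are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def first_component(centered, iterations=200):
    """Leading principal component of mean-centered rows, found by power iteration."""
    start = max(centered, key=lambda row: sum(c * c for c in row))
    norm = math.sqrt(sum(c * c for c in start))
    if norm == 0.0:
        return [0.0] * len(start)  # no variance in the training data
    v = [c / norm for c in start]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(len(v))]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def fit_class_model(training_rows):
    """Class model: the class mean and its first principal component."""
    mean = mean_vector(training_rows)
    centered = [[x - m for x, m in zip(row, mean)] for row in training_rows]
    return mean, first_component(centered)


def residual_distance(sample, model):
    """Distance from a sample to the one-component model of a class."""
    mean, pc = model
    diff = [x - m for x, m in zip(sample, mean)]
    score = sum(a * b for a, b in zip(diff, pc))
    residual = [a - score * b for a, b in zip(diff, pc)]
    return math.sqrt(sum(r * r for r in residual))


def classify(sample, models):
    """Assign the sample to the class with the smallest residual distance."""
    return min(models, key=lambda name: residual_distance(sample, models[name]))


# Hypothetical training set: three replicates of two liquids, three variables each.
training = {
    "A": [[10.0, 1.0, 0.0], [12.0, 1.0, 0.0], [14.0, 1.0, 0.0]],
    "B": [[0.0, 1.0, 10.0], [0.0, 1.0, 12.0], [0.0, 1.0, 14.0]],
}
models = {name: fit_class_model(rows) for name, rows in training.items()}

# Hypothetical test set samples.
print(classify([11.0, 1.2, 0.1], models))
print(classify([0.2, 1.0, 13.0], models))
```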

REFERENCES

1. Karter MJ, Jr. Fire Loss in the United States During 2009. Quincy (MA): National Fire
Protection Association; 2010 Aug. Report No. FLX09.
2. ASTM International, ASTM E 1618-06e1. Annual Book of ASTM Standards 14.02.
3. ASTM International, ASTM E 1412-07. Annual Book of ASTM Standards 14.02.
4. Stauffer E. Concept of pyrolysis for fire debris analysts. Science & Justice 2003; 43(1):
29-40.
5. Committee on Identifying the Needs of the Forensic Sciences Community, National
Research Council. Strengthening Forensic Science in the United States: A Path Forward.
Washington, D.C.: National Academies Press, 2009.
6. Lentini JJ, Dolan JA, Cherry C. The Petroleum-Laced Background. Journal of Forensic
Sciences 2000; 45(5): 968-989.
7. Hetzel SS, Moss BA, Moss RD. How long after waterproofing a deck can you still isolate
an ignitable liquid? Journal of Forensic Sciences 2005; 50(2): 269–276.
8. Lentini JJ. Persistence of Floor Coating Solvents. Journal of Forensic Sciences 2001;
46(6): 1470-1473.
9. Almirall JR, Furton KG. Characterization of background and pyrolysis products that may
interfere with forensic analysis of fire debris. J. Anal. Appl. Pyrolysis 2004; 71: 51–67.
10. Fernandes MS, Lau CM, Wong WC. The effect of volatile residues in burnt household
items on the detection of fire accelerants. Science & Justice 2002; 42: 7-15.
11. Dehaan JD, Bonarius K. Pyrolysis products of structure fires. Journal of the Forensic
Science Society 1988; 28(5-6): 299-309.
12. Borusiewicz R, Zięba-Palus J, Zadora G. The influence of the type of accelerant, type of
burned material, time of burning and availability of air on the possibility of detection of
accelerants. Forensic Science International 2006; 160: 115-126.
13. Sandercock PML, Du Pasquier E. Chemical fingerprinting of unevaporated automotive
gasoline samples. Forensic Science International 2003; 134: 1-10.
14. Sandercock PML, Du Pasquier E. Chemical fingerprinting of gasoline 2. Comparison of
unevaporated and evaporated automotive gasoline samples. Forensic Science
International 2004; 140: 43-59.
15. Sandercock PML, Du Pasquier E. Chemical fingerprinting of gasoline Part 3.
Comparison of unevaporated automotive gasoline samples from Australia and New
Zealand. Forensic Science International 2004; 140: 71-77.
16. Hupp AM, Marshall LJ, Campbell DI, Smith RW, McGuffin VL. Chemometric analysis
of diesel fuel for forensic and environmental applications. Analytica Chimica Acta 2008;
606(2): 159-171.
17. Petraco NDK, Gil M, Pizzola PA, Kubic TA. Statistical Discrimination of Liquid
Gasoline Samples from Casework. Journal of Forensic Sciences 2008; 53(5): 1092-1101.
18. Bodle ES, Hardy JK. Multivariate pattern recognition of petroleum-based accelerants by
solid-phase microextraction gas chromatography with flame ionization detection.
Analytica Chimica Acta 2007; 589: 247-254.
19. Tan B, Hardy JK, Snavely RE. Accelerant classification by gas chromatography/mass
spectrometry and multivariate pattern recognition. Analytica Chimica Acta 2000; 422:
37-46.
20. Baerncopf JM, McGuffin VL, Smith RW. Association of ignitable liquid residues to neat
ignitable liquids in the presence of matrix interferences using chemometric procedures.
Journal of Forensic Sciences 2011; 56: 70-81.

Chapter 2: Theory
2.1 Passive Headspace Extraction
Passive headspace extraction is one of the methods recommended by ASTM
International and is commonly used as an extraction method for fire debris samples suspected of
resulting from arson [1].
For a passive headspace extraction, fire debris samples are placed in a sealed container
and an activated charcoal strip (ACS) is suspended within the container. The samples are left in
an oven at a temperature ranging from 50 to 80 °C over a period of 2 to 24 hours [1]. When the
sample is heated, the volatile compounds are released into the headspace of the container and
adsorb onto the ACS. The types of volatile compounds that adsorb on the strip depend on
the temperature, as well as the duration, of the extraction. At higher temperatures, heavier, less volatile
compounds are released into the headspace, while at lower temperatures, the smaller, more
volatile compounds are primarily collected. Longer extraction times also favor heavier molecules
because, if at some point in the extraction the strip becomes saturated, the heavier molecules
have a tendency to displace the smaller molecules. After the headspace extraction is performed,
the ACS is eluted with an organic solvent such as carbon disulfide, n-pentane, diethyl ether, or
methylene chloride. The resulting extract is then typically analyzed by gas chromatography-mass
spectrometry (GC-MS).

2.2 Gas Chromatography-Mass Spectrometry
Gas chromatography-mass spectrometry is the method of choice for analyzing suspected
arson fire debris samples in forensic laboratories. Chromatography techniques are used to
separate sample mixtures into individual analytes through interaction of the sample between a
mobile phase and a stationary phase [2]. In modern gas chromatography, the mobile phase is a gas,
commonly referred to as the carrier gas, while the stationary phase is typically a liquid. The
mobile phase gas is contained in a pressurized cylinder that is connected to the injection port of
the GC (Figure 2.1). The gas flows through the column that contains the stationary phase. The
column is housed in an oven to allow careful control of temperature during the analysis. The
column is fed through the transfer line and emerges directly into the mass spectrometer detector
where additional information about the sample is generated and collected.
The separation begins when a syringe is used to inject a liquid sample mixture into the
injection port of the GC. Once the sample is injected, it is quickly volatilized by
the high temperature of the port. It is important to note that because the separation of the mixture
occurs while it is in the gas state, all of the analytes that can be separated and detected using this
method must be sufficiently volatile or they will not be converted to the gas state and carried
through the column. The temperature of the injection port is typically 50 °C above the boiling
point of the least volatile compound in the mixture to ensure volatilization and separation in the
column [2]. If the injection port were any cooler than that, the mixture would not be volatilized
rapidly and would enter the column over too broad a period of time, which could result in
inefficient separations. Additionally, inadequate volatilization may result in only part of the
sample being analyzed. Specifically, if the injection port is at a lower temperature than the
highest boiling point of an analyte, that analyte would not enter the column and the results of the
analysis would not be representative of the actual sample.

Figure 2.1: Schematic of a gas chromatograph (labeled components: gas cylinder, inlet, column, oven, and detector).

The injection can be performed in four different modes: split, splitless, pulsed split, and
pulsed splitless. A split injection disposes of a fraction of the sample before it even reaches the
column. Some common split ratios are 50:1 or 100:1. This is used, and can be beneficial, for
highly concentrated samples. Discarding some of the sample prevents the column from being
overloaded or contaminated. Overloading the column, which is discussed later, leads to poor
separation of the sample. A splitless injection, on the other hand, injects the entire volume of the
sample onto the column. This mode is ideal for low concentration samples and allows for the
maximum amount of sample to reach the column and undergo separation. In pulsed split or
splitless injection, a pressure pulse is applied to transfer all or part of the sample quickly from
the inlet onto the column. This results in the sample entering the column in a tight plug with
minimal spread of the analytes.
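The effect of the split ratio on the amount of sample reaching the column can be shown with a short calculation. The sketch assumes that a split ratio of R:1 means that R parts of the vaporized sample are vented for every one part that enters the column, and it treats a splitless injection as the idealized case of R = 0; the function name is hypothetical.

```python
def fraction_on_column(split_ratio):
    """Fraction of the injected sample that enters the column for a split ratio of split_ratio:1."""
    if split_ratio < 0:
        raise ValueError("split ratio cannot be negative")
    return 1.0 / (split_ratio + 1.0)


# Idealized splitless injection (0) and the two split ratios mentioned in the text.
for ratio in (0, 50, 100):
    percent = 100.0 * fraction_on_column(ratio)
    print(f"{ratio}:1 split -> {percent:.2f}% of the sample reaches the column")
```

Under this assumption, roughly 2% of the sample reaches the column at a 50:1 split and roughly 1% at a 100:1 split.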
Also, within the injection port is an inlet that allows a constant flow of carrier gas
(mobile phase) to enter and flow through the system (typically ~1 mL/min for GC-MS). The
sample mixture is carried in the flow of gas from the injection port through the column and to the
detector. For GC-MS applications, the most commonly used carrier gas is helium due to its inert
nature and low molecular weight.
Ideally, the sample mixture should be introduced onto the column in as narrow a band as
possible. As the mixture travels through the column, analytes within the sample mixture interact
differently with the stationary and mobile phases, depending on the properties of the analyte
molecules. In gas chromatography, interaction with the stationary phase is mainly through
absorption, which is also known as partition. This occurs when molecules of the analyte diffuse
into the thin coating of the liquid stationary phase. Ideally, in a mixture each analyte will have a
slightly different affinity for both phases. An analyte that spends more time in the mobile phase,
for example, will travel more quickly through the column, while an analyte that spends more
time in the stationary phase will travel more slowly. As a result of the varying affinities, when
the carrier gas carries the mixture through the column, the sample mixture is separated into
several distinct bands, which ideally each contain one type of analyte.
The choice of stationary phase is very important for optimal separation. The mobile phase
merely carries the sample through the column whereas the stationary phase interacts with and
retards specific compounds differently so that they can be separated from one another. The
stationary phase is chosen based on the extent of its thermal stability in the high oven
temperatures, inertness, and compatibility (similar polarities) with the compounds to be
separated [2]. The most common stationary phases used in forensic laboratories are those
with polysiloxane backbones. One such column is known commercially as HP-5 where HP
stands for the manufacturer and the 5 indicates that the stationary phase is 5% phenyl- and 95%
methyl-polysiloxane. These columns are very useful for separating a large range of polar,
nonpolar, basic, and acidic compounds, including those routinely seen when performing an
analysis of fire debris [3].
For efficient chromatographic separation of the sample analytes, any band broadening of
the sample mixture should be minimized. There are multiple factors that lead to band broadening
such as longitudinal diffusion and the efficiency of the mass transfer between the mobile and
stationary phases [2].
Longitudinal diffusion occurs over time as the analyte molecules diffuse from a more
concentrated region to a less concentrated region within the mobile phase. This can occur in a
column depending on the length of time it takes for the separation to occur. The less time a
mixture spends in the column, the less time is available for diffusion to occur. As a result, a
higher flow rate of the carrier gas is usually considered to be best to decrease diffusion since it
decreases the amount of time the mixture spends in the column. However, a high flow rate is not
always optimal because it can adversely affect the efficiency of the mass transfer of analytes
during the separation. Additionally, higher flow rates can negatively impact the separation by
moving the sample through the column so quickly that it does not have enough time to separate.
This may result in the co-elution of compounds and a decrease in overall resolution.
Mass transfer is the transfer of analyte molecules from the mobile phase to the stationary
phase and back again. Ideally, equilibrium should exist between the analytes in the mobile phase
and stationary phase during the separation; however, equilibrium is established so slowly that
separations never occur under equilibrium conditions [2]. Some adjustments can be made so that
equilibrium is more closely approximated, resulting in an increase in mass transfer efficiency as well
as a decrease in band broadening.
Some factors affecting mass transfer are the flow rate, the concentration of the analyte,
and the length of the column. High flow rates decrease the efficiency of the mass transfer since
there is less time for equilibrium to occur. An analyte in a mobile phase with a high flow rate
will travel a long distance down the column while some of the analyte is partitioned into the
stationary phase. This results in irreversible band broadening. The same analyte traveling at a
lower flow rate will not travel as far ahead of the analyte in the stationary phase; therefore, lower
flow rates aid in increasing the efficiency of the mass transfer and decreasing band broadening.
Similarly, a separation using a long column may result in more band broadening than a short
column. A common column length is 30 meters. Regardless of flow rate and length of the
column, a high concentration of the analyte, known as overloading the column, also decreases
the efficiency of mass transfer. This occurs when excess analyte is present such that there are no more sites
in the stationary phase for the analyte to partition into. As a result, most of the analyte is in the
mobile phase and band broadening occurs. To increase mass transfer efficiency and consequently
decrease band broadening, the stationary phase in the column is applied as a very thin layer of
liquid, commonly less than one micron thick, on the inner walls of the column. The analytes can
completely partition into the stationary phase more quickly because there is less distance or
width to travel through before they begin to partition back into the mobile phase.
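
The opposing effects of flow rate on longitudinal diffusion and mass transfer are commonly summarized by the van Deemter equation, H = A + B/u + C·u, where H is the plate height (a measure of band broadening), u is the linear velocity of the carrier gas, B/u represents longitudinal diffusion, and C·u represents resistance to mass transfer. The equation is not stated in this chapter, and the coefficients in the sketch below are hypothetical values chosen only to show the shape of the curve.

```python
import math

def plate_height(u, a, b, c):
    """Van Deemter plate height H = A + B/u + C*u for a linear velocity u > 0."""
    return a + b / u + c * u

def optimal_velocity(b, c):
    """Velocity that minimizes H, found by setting dH/du = -B/u**2 + C to zero."""
    return math.sqrt(b / c)

# Hypothetical coefficients (not measured values), for illustration only.
A_TERM, B_TERM, C_TERM = 0.05, 0.40, 0.01

u_opt = optimal_velocity(B_TERM, C_TERM)
for u in (u_opt / 4, u_opt, u_opt * 4):
    h = plate_height(u, A_TERM, B_TERM, C_TERM)
    print(f"u = {u:6.2f}  H = {h:.3f}")
```

Velocities below the optimum are penalized by the diffusion term and velocities above it by the mass transfer term, which is the compromise described above.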
Since the separation of the molecules is also dependent on the temperature at which the
separation occurs, the column is housed within an oven to allow strict control and close
monitoring of temperature. Two types of temperature programs can be used in gas
chromatography. In an isothermal program, the oven, and consequently the column, is
maintained at one temperature for the entire analysis. This type of program is typically used for
the separation of molecules with very similar boiling points and provides the best resolution.
Some limitations to isothermal analyses are that they require a longer period of time to be
completed and cannot separate mixtures that contain analytes that have a wide range of boiling
points. In temperature programming, the oven temperature, and hence the column temperature, is
increased from low to high temperatures in a controlled manner. Temperature programming in this way has
many benefits. One such benefit is that a more complex mixture containing analytes with a wide
range of boiling points can be separated in a short period of time. Since the separation can occur
over a shorter time, there is less band broadening and more efficient separation. Conversely, one
disadvantage of temperature programming is that the ramp rate may be too high so that
compounds with similar boiling points co-elute, leading to poor separation efficiency.
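
As a minimal sketch of the two program types, the function below returns the oven temperature at a given time for a simple hold-ramp-hold program. The function name and all default values (initial temperature, hold time, ramp rate, final temperature) are hypothetical and are not the conditions used in this research. Setting the ramp rate to zero gives an isothermal program.

```python
def oven_temperature(t_min, start_c=40.0, hold_min=3.0,
                     ramp_c_per_min=10.0, final_c=280.0):
    """Oven temperature (deg C) at time t_min for a hold-ramp-hold program."""
    if t_min <= hold_min:
        return start_c                      # initial isothermal hold
    ramped = start_c + ramp_c_per_min * (t_min - hold_min)
    return min(final_c, ramped)             # hold at the final temperature

for t in (0.0, 3.0, 13.0, 27.0, 40.0):
    print(f"{t:5.1f} min -> {oven_temperature(t):6.1f} C")
```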
Once the separated molecules reach the end of the column, they travel into the detector.
While many different detectors are available for GC, the mass spectrometer is widely used for
forensic applications. The analytes are transferred to the mass spectrometer directly from the
column by way of a heated transfer line, which is kept at a temperature equal to the highest
temperature used in the oven temperature program, typically 250-300 °C. The transfer line is
heated to prevent or minimize condensation of the separated analytes.
As the analyte is carried into the mass spectrometer, the carrier gas is pumped away while
allowing the analyte to reach the ion source. This is important because gas chromatography is
performed under atmospheric pressure whereas mass spectrometry must be performed under
vacuum conditions. Typical pressures needed for this analysis range from 10⁻⁴ to 10⁻⁸ torr [2].
Because the flow rate of the mobile phase is so low (~1 mL/min) in gas chromatography when
using a capillary column, a specialized interface is not needed to remove the carrier gas from the
sample; the vacuum pumps associated with the MS are able to pump the carrier gas away and
maintain the low pressures needed.
Vacuum conditions are needed in mass spectrometry to ensure that the ions being created
and analyzed do not undergo any reactive collisions between ionization and detection. Multiple
types of pumps work together to generate and maintain the low pressures that are required for
mass spectrometry to be performed. The pumps work by removing excess molecules and
therefore increasing the mean free path, or the distance that the ion can travel without chance of
collision with another molecule.
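
The dependence of the mean free path on pressure can be estimated from the kinetic theory of gases, λ = kT / (√2 π d² p). The sketch below applies this relation; the temperature and the collision diameter (roughly that of a small gas molecule) are assumed values for illustration, not instrument specifications.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K
PA_PER_TORR = 133.322      # pascals in one torr

def mean_free_path_m(pressure_torr, temperature_k=298.0, diameter_m=3.7e-10):
    """Kinetic-theory mean free path: k*T / (sqrt(2) * pi * d**2 * p), in meters."""
    pressure_pa = pressure_torr * PA_PER_TORR
    return BOLTZMANN * temperature_k / (
        math.sqrt(2) * math.pi * diameter_m ** 2 * pressure_pa
    )

# Atmospheric pressure compared with two high-vacuum pressures.
for p in (760.0, 1e-4, 1e-8):
    print(f"{p:g} torr -> {mean_free_path_m(p):.3g} m")
```

Under these assumptions the mean free path at atmospheric pressure is far smaller than the dimensions of the instrument, whereas under high vacuum it becomes comparable to or larger than the flight path of an ion.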
There are three major parts of a mass spectrometer: an ion source, a mass analyzer, and
an ion detector (Figure 2.2). In a GC-MS bench top instrument, the most commonly used ion
source is an electron ionization source, while the mass analyzer is typically a quadrupole
analyzer and the detector is an electron multiplier detector. Overall, the mass spectrometer works
by ionizing and fragmenting molecules, determining the masses and charges of the fragments
produced, and then detecting the fragments. Under given conditions, the molecules fragment in a
unique and reproducible manner and so the resulting fragmentation patterns can be used to
definitively identify the separated analytes.

Figure 2.2: Schematic of a mass spectrometer. Components shown: inlet system (from GC), ion
source, mass analyzer, detector, data acquisition system, and vacuum pump.

Electron ionization (EI) requires the creation of free electrons to remove electrons from a
neutral molecule, which causes a cascade of events leading to fragmentation. Free electrons are
produced by applying a current across a thin filament. The electrons are accelerated across a
potential toward an anode, which imparts the electrons with energy (typically 70 eV although
this energy can be varied). The ion source also contains a collimating magnet, which causes the
electrons to travel in spiral pathways. The neutral analytes from the gas chromatograph are
introduced perpendicularly to the flow of electrons. The compounds have to travel through the
flow of electrons in order to reach the mass analyzer and, on doing so, are bombarded with
electrons. The spiral motion of the electrons increases the probability of interaction with the
neutral molecules. When an electron passes close to a neutral molecule, energy is transferred. If
the electron can transfer sufficient energy, the ionization potential of the neutral molecule will be
surpassed, which creates a positively charged ion. Excess energy from the electron or from
interactions with subsequent electrons can lead to an excess of energy in the once neutral
molecule. The excess energy cannot be disposed of quickly enough, resulting in fragmentation.
Because EI results in extensive fragmentation, it is referred to as a ‘hard’ ionization technique
and, due to the extent of fragmentation, EI is very useful in structural determinations. Once the
positive fragment ions have been produced, they are directed to the mass analyzer by way of a
positively-charged repeller plate and a negatively-charged extractor plate.
A quadrupole mass analyzer consists of four rods running parallel to one another in a
diamond formation (Figure 2.3). Rods located oppositely from one another are paired. The rods
are connected to a direct current (DC) source; one set of rods is positive, while the other is
negative. Additionally, radio-frequency (RF) alternating current is applied to both sets such that
one set of rods is always out of phase with the other set. Quadrupole mass analyzers are used to
perform mass selective stability scans, which are performed by scanning the DC and RF
potentials at a fixed ratio [2]. At a given ratio, only ions with a specific mass-to-charge (m/z) value
will have stable trajectories that allow them to pass through the cavity defined by the rods and
reach the detector. Ions that are lighter or heavier than the stable m/z value will have unstable
trajectories that cause them to hit the rods where they are neutralized and pumped away by the
vacuum system. In order to collect an entire mass spectrum of each analyte, the entire range of
DC/RF fixed ratio potentials is scanned so that ions with a large range of m/z values can pass
through and reach the detector.

Figure 2.3: Diagram of a quadrupole mass analyzer. Labels shown: ions from source, paired
positive (+) and negative (-) rods, resonant ions (to detector), and non-resonant ions.

Once an ion of a specific m/z has passed through the mass analyzer, it is detected using a
continuous-dynode electron multiplier detector. The detector is horn-shaped and made of glass
doped with lead, which easily emits secondary electrons [2]. The opening, where the positive ions
enter, is held at a slight negative potential while the other end is held at ground. This produces a
potential gradient down the length of the horn. When a positive ion strikes the opening of the
detector, electrons are emitted. The emitted electrons are then attracted toward a less negative
part of the detector where they again strike the surface and more electrons are emitted. This
occurs several times until the signal of the original ion has been amplified (by approximately
10⁵-10⁸) [2]. Since the response of the detector is constant, meaning that each ion leads to the
emission of a constant number of electrons, the amplified signals are comparable for all of the
ions generated in the mass spectrometer. As a result, the amplified signal can be used to quantify
the analytes in a mixture.
An analog-to-digital converter is used to transform the electrical current to a digital signal
that is interpreted by a computer. The end result of a GC-MS analysis is the generation of a total
ion chromatogram. This is a graph where the abscissa is the retention time of the molecules, in
minutes, and the ordinate is the total ion current, which is reported in arbitrary units. The retention time of
the molecule is simply the time that it takes the molecule to travel from the beginning of the
column to the detector. The molecules are represented as peaks on the chromatogram and the
area underneath the peak can be used to quantify the amount of the molecule present in the
mixture. Furthermore, the identity of the molecule represented by each peak can be deduced by
analyzing the mass spectrum associated with the peak, which is reproducible and characteristic to
that molecule, under a given set of conditions.
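
The construction of a total ion chromatogram can be sketched as follows: the abundances of all ions recorded in each scan are summed, and the area under a peak is then estimated numerically. The scan values, m/z values, and function names below are hypothetical, and the trapezoidal rule is only one possible way to integrate a peak.

```python
def total_ion_chromatogram(scans):
    """Sum the ion abundances over all m/z values in each scan."""
    return [sum(scan.values()) for scan in scans]

def peak_area(times, signal, start, stop):
    """Trapezoidal area under the signal between indices start and stop."""
    area = 0.0
    for i in range(start, stop):
        area += 0.5 * (signal[i] + signal[i + 1]) * (times[i + 1] - times[i])
    return area

# Hypothetical scans: each maps m/z -> abundance at one retention time (min).
times = [5.00, 5.01, 5.02, 5.03, 5.04]
scans = [
    {57: 10, 71: 5},
    {57: 120, 71: 60, 85: 20},
    {57: 400, 71: 210, 85: 90},
    {57: 130, 71: 55, 85: 15},
    {57: 12, 71: 3},
]

tic = total_ion_chromatogram(scans)
area = peak_area(times, tic, 0, len(tic) - 1)
print(tic, area)
```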

2.3 Data Pretreatment

2.3.1 Smoothing
All data sets, including chromatograms, consist of both signal and noise. The signal is the
part of the data that is intentionally collected and is the desired output. Noise, on the other hand,
is incidental and can come from a number of sources. For example, natural fluctuations in the
measurements made by a detector will lead to small, but random, variations within the data.
Noise can be detrimental to the analysis or characterization of a data set because it has the
potential to partially or completely mask trends that would otherwise be visible. Smoothing of
the data set can be performed as a way to minimize noise and its effects. In short, smoothing
methods can be used to increase the signal-to-noise ratio of the data set.
A very popular method of smoothing is the Savitzky-Golay smoothing algorithm. This
procedure uses a local polynomial regression over a set number of data points to describe the
data and reduce the noise [4]. A certain number of data points, called a window size, is used to
perform a polynomial regression, which essentially fits a polynomial to that chosen set of points.
Once the polynomial has been fitted, the y-value of the centermost point in the window size is
replaced by the new y-value predicted from the fitted polynomial. After the center point for that
window size has been smoothed, the algorithm moves to the next x-value and continues to
smooth the center point, one at a time until it has gone through all of the x-values. Eventually the
entire data set will be smoothed with one notable exception. The first and last few points at the
beginning and the end of the data set will not be smoothed because they do not fulfill the window
size requirement: that is, these data points can never be the center point of a window and so
cannot be smoothed [5].
The number of points used for the window size is crucial and is modified according to the
data set. Larger window sizes smooth the data more than smaller window sizes, but a window
size that effectively smooths one data set will not necessarily smooth another data set as well.
Large window sizes, for example, may over-smooth the data and remove some of the actual
signal as well as the noise. Smaller window sizes, on the other hand, may do the opposite and not
remove enough of the noise [5]. It should also be noted that window sizes must contain an odd
number of points [4]. This is because the center point is smoothed using this method and a center
point can only exist if the window size is an odd number of points. Typically, for
chromatographic data, a window size that has a similar number of points as an average peak
within the chromatogram is used as a starting point. From there, adjustments are made and
evaluated to determine if larger or smaller sizes should be used.
The order of the polynomial used for the regression can also be modified to best
complement the data set. Higher-order polynomials tend to preserve tall and narrow peak shapes
better, while lower-order polynomials tend to work best for wide peaks. The order of the
polynomial chosen depends not only on the trends in the data that need to be preserved, but also,
the window size that is chosen for the smooth to occur. The order of the polynomial has to be
less than the number of points chosen to compose the window size because a specific number of
points is needed to create a polynomial depending on the order. A first-order equation, for
example, is a straight line. In order to draw a line, at least two points must be present, so the
window must contain at least two points (in practice three, since the window size must be odd).
The same logic follows for polynomials of every other order.
The Savitzky-Golay smoothing algorithm has many advantages over other types of
smoothing. The fact that a polynomial is fitted to a window means that this method is extremely
good for preserving the overall trend of the data set, while minimizing the noise. This is in
contrast to other smoothing procedures that may replace the center point of the window with the
average across all data points that make up the window, which can distort peak shapes and trends
in the data and can decrease, or even completely remove, some of the signal from the data set.
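
The windowed polynomial fit described in this section can be sketched with NumPy as shown below. This is an illustrative implementation written for this discussion, not the software used in this research; the function names and the test peak are assumptions. As described above, the first and last few points are left unsmoothed, and a simple moving average is included for comparison.

```python
import numpy as np

def savitzky_golay(y, window, order):
    """Fit a polynomial in each window and keep the fitted value at its center."""
    if window % 2 == 0 or order >= window:
        raise ValueError("window must be odd and larger than the polynomial order")
    y = np.asarray(y, dtype=float)
    half = window // 2
    x = np.arange(-half, half + 1)      # positions in the window, center at 0
    smoothed = y.copy()                 # edge points remain unsmoothed
    for i in range(half, len(y) - half):
        coeffs = np.polyfit(x, y[i - half:i + half + 1], order)
        smoothed[i] = np.polyval(coeffs, 0.0)
    return smoothed

def moving_average(y, window):
    """Replace each center point with the mean of its window (edges unchanged)."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    out = y.copy()
    for i in range(half, len(y) - half):
        out[i] = y[i - half:i + half + 1].mean()
    return out

# Hypothetical narrow peak with a true apex height of 1.0 at index 10.
points = np.arange(21)
peak = np.exp(-0.5 * ((points - 10) / 2.0) ** 2)
sg = savitzky_golay(peak, window=7, order=2)
ma = moving_average(peak, window=7)
print(f"apex after Savitzky-Golay: {sg[10]:.3f}, after moving average: {ma[10]:.3f}")
```

For this noise-free peak, the polynomial fit retains more of the apex height than the moving average with the same window size, which illustrates the peak-shape argument made above.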

2.3.2 Retention Time Alignment
After smoothing, it is often necessary to perform retention time alignment on
chromatographic data. The alignment minimizes drift in retention time of the same analyte in
samples analyzed over a period of time. Oftentimes, the sample chromatograms are aligned to a
chromatogram of a consensus target, which contains all of the compounds known to exist in the
samples.
A correlation optimized warping (COW) algorithm can be used to align chromatographic
data. The chromatogram is divided into sections, each containing the same number of data points
as defined by the analyst, and referred to as the segment length [6]. Another parameter, known as
the warp, defines the maximum number of data points that can be added to or removed from each
segment in order to produce the best alignment [6]. For example, a warp of 3 means that 0, 1, 2, or
3 data points can be added or subtracted from each segment. This number is also chosen by the
analyst.
Alignment of the sample chromatogram to the target chromatogram is performed from
the end to the beginning of the chromatogram, so the last segment is optimized first. This is
performed by calculating a local correlation coefficient between corresponding segments in the
sample and the consensus target chromatograms. The coefficients are calculated for each
possible warp and segment combination. All of the local coefficients are then summed, for each
specific warp and segment combination, to generate multiple global correlation coefficients. The
warp and segment combination resulting in the highest global correlation coefficient is
considered to be the optimal alignment. A high coefficient, however, does not always mean that
the consensus target and sample chromatogram are well aligned because the slope of a peak in
one chromatogram may be aligned to the apex of a peak in the other chromatogram.
Consequently, the optimal alignment is best determined through a visual comparison of the
aligned chromatograms.
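
The building block of this procedure, testing each allowed warp for one segment and keeping the one with the highest local correlation coefficient, can be sketched as below. This is a deliberately simplified, single-segment illustration and not the full COW algorithm, which optimizes all segments together; the function names, the linear interpolation used for stretching, and the test data are assumptions.

```python
import numpy as np

def resample(segment, n_points):
    """Stretch or compress a segment to n_points by linear interpolation."""
    old_x = np.linspace(0.0, 1.0, len(segment))
    new_x = np.linspace(0.0, 1.0, n_points)
    return np.interp(new_x, old_x, segment)

def best_segment_warp(sample, target, start, seg_len, max_warp):
    """Return the (warp, correlation) that best matches one sample segment to the target."""
    target_seg = target[start:start + seg_len]
    best_warp, best_r = 0, -np.inf
    for warp in range(-max_warp, max_warp + 1):
        stop = start + seg_len + warp
        if stop > len(sample) or stop - start < 2:
            continue                    # this warp does not fit in the data
        warped = resample(sample[start:stop], seg_len)
        r = np.corrcoef(warped, target_seg)[0, 1]
        if r > best_r:
            best_warp, best_r = warp, r
    return best_warp, best_r

# Hypothetical target: one Gaussian peak sampled at 40 points.
points = np.arange(40)
target = np.exp(-0.5 * ((points - 20) / 3.0) ** 2)
# Hypothetical sample: the same peak stretched over 43 points (retention time drift).
sample = resample(target, 43)

warp, r = best_segment_warp(sample, target, start=0, seg_len=40, max_warp=3)
print(warp, round(r, 4))
```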

2.3.3 Normalization
Normalization is a very common pretreatment procedure that is used to reduce the non-significant
variations in abundance between replicates, between samples, or between sample
populations. These variations in abundances are expected and have many sources such as
variations in the volume of sample injected into the GC for analysis. It is necessary to minimize
the fluctuations in abundance so that later data analysis procedures can describe actual
differences or trends in the data as opposed to changes in abundance inherent to the data
collection process. Ideally, this ensures that any changes described by later statistical procedures
come from meaningful, chemical, differences in the data and not random fluctuations.
As with any other pretreatment procedure, there can be negative consequences when
normalizing data, depending on the goal of the subsequent analysis steps and the information that
is gained from the data set. One such drawback of normalization is that since it stretches or
compresses the data to be comparable across a set, all information about concentration or
abundance is completely removed [5]. This means that normalization in certain situations can be
detrimental to analysis and should not be performed. In chromatograms used for this project,
however, it is the presence and pattern of the peaks that is most important and not the
abundances, which makes this type of data set an ideal candidate for undergoing the
normalization process.
There are many different types of normalization that can be performed on
chromatographic data. One commonly used type is total area normalization. In this method, the
abundance at each retention time in a single chromatogram is divided by the total area of that
chromatogram and then multiplied by the average total area of the chromatograms within the
data set. Dividing each chromatogram by its own total area reduces all of the chromatograms in
the data set to the same scale, where the abundances vary from 0 to 1. The later step of
multiplying by the average total area restores the rescaled chromatograms to abundances similar
to those they began with.
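
A minimal sketch of total area normalization is given below. It assumes that the data points are evenly spaced so that the total area of a chromatogram can be approximated by the sum of its abundances; the function name and the two small chromatograms are hypothetical.

```python
def total_area_normalize(chromatograms):
    """Divide each chromatogram by its total area, then multiply by the mean total area."""
    areas = [sum(c) for c in chromatograms]     # area approximated by the sum
    mean_area = sum(areas) / len(areas)
    return [
        [value / area * mean_area for value in c]
        for c, area in zip(chromatograms, areas)
    ]

# Hypothetical replicates: the second has twice the injected amount of the first.
replicate_1 = [0.0, 10.0, 40.0, 10.0, 0.0]
replicate_2 = [0.0, 20.0, 80.0, 20.0, 0.0]

normalized = total_area_normalize([replicate_1, replicate_2])
print(normalized)
```

After normalization the two replicates have the same total area and the same profile, so the difference in injected amount no longer contributes variance to later analyses.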

2.4 Data Analysis

2.4.1 Principal Components Analysis
Principal components analysis (PCA) is an extremely powerful tool that is used to
analyze multidimensional data and reduce them to fewer dimensions that capture the
greatest sources of variance among the samples. This multivariate statistical procedure allows the user to
condense the data and discover trends that might otherwise be masked by the overwhelming
dimensionality of the data. Since chromatographic data are multidimensional, PCA is a good
statistical procedure to identify the variations in the data and assess natural groupings of the data.
The first step in performing PCA is to mean center the data. Specifically for
chromatographic data, this is done by calculating the average abundance at each retention time,
across the entire data set. This average is then subtracted from the abundance at the
corresponding retention time in each individual chromatogram. Mean centering is a way to
ensure that the principal components (discussed later) describe the maximum amount of variance
by redefining the average or mean as zero.
The next step is to calculate the covariance matrix on the mean-centered data. Covariance
(Equation 2.1) is a measure of how two sets of values vary together [7]; it can be thought of as
the variance shared between two samples. In the equation below, x and y are the individual data
points in samples x and y, respectively, x̄ and ȳ are the corresponding means, and n is the
number of data points being evaluated [7].

\[ \mathrm{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} \]

Equation 2.1

The covariance identifies how one variable changes across two samples. In
chromatographic data, this occurs by a point-by-point comparison between the abundances at
each retention time in two chromatograms. In order to accomplish these comparisons, a
covariance matrix is developed, in which the covariance calculated between all retention times in
each pair of chromatograms is displayed.
Once the covariance matrix is established, its eigenvectors and eigenvalues are
calculated. The eigenvector is a unit vector that, when multiplied by the covariance matrix,
produces a scalar multiple of itself. The eigenvalue is the scalar by which the eigenvector
is multiplied. Eigenvectors are used to identify the sources of variance within the entire
data set. Many eigenvectors, located orthogonally to one another, can be calculated to satisfy the
same data set; however, each accounts for a different amount of variance. The number of
eigenvectors that can be calculated for a given covariance matrix is equal to the number of
samples being evaluated [7]. For example, if a covariance matrix was developed using 90 samples
(90 × 90) then 90 eigenvectors could be calculated to describe it. Along with each eigenvector is
an associated eigenvalue, which defines the amount of variance described by the eigenvector.
Thus, the eigenvector with the highest corresponding eigenvalue accounts for the most variance
and is considered to be the first principal component [7]. The eigenvectors are ranked in this way
so that the second highest eigenvalue corresponds to the second principal component and
describes the next greatest amount of variance, and so on.
The mean-centered data are then multiplied by each eigenvector, separately. This
essentially results in the original data being described by the eigenvectors. For chromatographic
data, the abundance at each retention time is multiplied by the corresponding element of the
eigenvector, which is called the loading. These products are summed across all retention times,
which results in the score of the sample on that principal component. From this a scores plot can
be developed, which is a visual representation of the scores
of the samples within the two dimensions used for the analysis. In a scores plot, chemically
similar samples are clustered more closely to one another than other dissimilar samples.
The eigenvector can be graphed to identify the variables that are responsible for the
variance in the data set. The resulting graph, called a loadings plot, can also be used to explain
the positioning of the samples in the scores plot. The loadings plot identifies not only the
variables that are leading to the positioning of each sample, but also how heavily each is
weighted in determining the position. A loadings plot can be generated to describe each principal
component. For chromatographic data, the eigenvector can be plotted versus retention time such
that each variable (in this case, compound) can be identified based on retention time. The sign of
the components (positive or negative) within each loadings plot is assigned arbitrarily and only
serves as a way to place the samples positively or negatively on PC1 or PC2 in the scores plot.
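
The sequence of steps described in this section (mean centering, covariance matrix, eigenvectors and eigenvalues, ranking, and projection to obtain scores) can be sketched with NumPy as follows. The layout is an assumption: rows are chromatograms, columns are retention times, and the covariance is computed between retention times. The function name and the six small chromatograms are hypothetical.

```python
import numpy as np

def pca(data, n_components=2):
    """PCA by eigen-decomposition; rows are chromatograms, columns are retention times."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)        # mean center each retention time
    cov = np.cov(centered, rowvar=False)       # covariance between retention times
    eigenvalues, eigenvectors = np.linalg.eigh(cov)   # symmetric input, ascending order
    order = np.argsort(eigenvalues)[::-1][:n_components]
    loadings = eigenvectors[:, order]          # one column per principal component
    scores = centered @ loadings               # sum of abundance x loading per sample
    explained = eigenvalues[order] / eigenvalues.sum()
    return scores, loadings, explained

# Hypothetical data: three chromatograms with a peak at the second retention
# time and three with a peak at the fourth retention time.
data = [
    [0, 10, 1, 0, 0], [0, 11, 1, 0, 0], [0, 9, 1, 0, 0],
    [0, 0, 1, 10, 0], [0, 0, 1, 11, 0], [0, 0, 1, 9, 0],
]
scores, loadings, explained = pca(data, n_components=2)
print(np.round(scores, 2))
print(np.round(explained, 4))
```

In this example the two groups fall on opposite sides of PC1, and the loadings for PC1 are largest at the two retention times where the groups differ. Because the sign of an eigenvector is arbitrary, which group is positive may differ between implementations.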

2.4.2 Pearson Product Moment Correlation Coefficients
Pearson product moment correlation (PPMC) coefficients provide a pairwise comparison
between two different samples and result in a numerical value that describes the relationship
between the two samples. The coefficient is calculated by dividing the covariance calculated
between two variables by the product of the standard deviations of both variables (Equation 2.2) [8]. The
standard deviation is a measure of spread in the data for each variable, while the covariance
identifies how the x-variable changes in relation to the y-variable. In the equation below, x and y
represent the individual data points from samples x and y, respectively [8].

\[ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \]

Equation 2.2

In terms of chromatographic data, the two variables would be the abundance, at a specific
retention time, in two chromatograms. The resulting coefficient represents a pairwise comparison
of the two chromatograms, on a point-by-point basis. For a given peak in the two
chromatograms, differences in the point at which the peak begins, reaches the apex, and ends
result in lower coefficients.
The value of a PPMC coefficient can range from -1 to +1. A coefficient of +1 means that
the two samples are perfectly positively correlated to one another, while a coefficient of -1
means that the two samples are perfectly negatively correlated to one another. A coefficient
with an absolute value of 0.8 or greater indicates that two samples are strongly correlated, an
absolute value between 0.5 and 0.79 indicates a moderate correlation, and an absolute value of
0.49 or less indicates a weak correlation [8]. A coefficient of zero indicates no correlation
between the two samples.
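
The point-by-point calculation can be sketched as below. The function name and the small chromatograms are hypothetical; they are intended only to show that a difference in abundance alone does not lower the coefficient, whereas a shift in retention time does.

```python
import math

def ppmc(x, y):
    """Pearson product moment correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical chromatograms sampled at the same retention times.
reference = [0, 1, 5, 10, 5, 1, 0, 0]
scaled = [3 * v for v in reference]       # same pattern, higher abundance
shifted = [0, 0, 1, 5, 10, 5, 1, 0]       # same peak, one point later

print(ppmc(reference, scaled))
print(ppmc(reference, shifted))
```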

2.4.3 Soft Independent Modeling of Class Analogy
Unlike PCA, soft independent modeling of class analogy (SIMCA) is a supervised
pattern recognition procedure. In this case, ‘supervised’ means that unknown samples can be
identified as belonging to groups or classes, which are predetermined by the user. To perform
SIMCA, the data set is divided into a training set and a test set. ‘New’ samples may later be
classified using the SIMCA model that was developed on the training set and validated using the
test set.
The first step in SIMCA is to generate statistical models for each predefined group within
the training set. This is typically done using PCA, in which a PCA model is developed for each
of the known groups in the training set. The resulting models characterize each group in the
training set independently of one another. The number of PCs used to describe one group is
determined independently of the number of PCs used to describe the others [9]. The resulting
scores for each sample can be plotted on a scatter plot to visualize the results of PCA for each
group. A loadings plot is also generated for each PC and can be used to explain the positioning
of the samples on the plot as well as identify the sources of variance for each PCA model.
The PCA models are validated to evaluate how well the models describe the predefined
groups. While many validation procedures are available, cross validation using the ‘leave one
out’ method is commonly used. In this validation procedure, one sample is removed from the
training set and is used as a testing sample. A new model is made using the remaining samples
from the training set and then the model is applied to the test sample. This procedure is repeated
numerous times until each sample has been used as a test sample to validate the model created
from the remaining samples in the training set.
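
The bookkeeping of the 'leave one out' procedure can be sketched as below: each sample is held out once while the remaining samples form the set from which a new model would be built. The function name and sample labels are hypothetical, and the model-building step itself is omitted.

```python
def leave_one_out(samples):
    """Yield (remaining_samples, held_out_sample) pairs, one pair per sample."""
    for i, held_out in enumerate(samples):
        yield samples[:i] + samples[i + 1:], held_out

# Hypothetical training set labels.
training_set = ["gasoline_1", "gasoline_2", "gasoline_3"]
splits = list(leave_one_out(training_set))
for remaining, held_out in splits:
    print(f"model built from {remaining}, tested on {held_out}")
```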
When performing SIMCA, the original data set is split into a training and testing set, as
mentioned previously. The training set is used to develop the PCA models, as described above.
The test set is used to assess the ability of the models to classify samples according to the
appropriate group. The assessment is performed by projecting each sample in the test set on to
each of the PCA models, in a manner identical to that used when projecting scores for PCA.
Next, a distance measurement, known as the object-to-model distance, is calculated to determine
how far the test samples are from each model [9]. Equation 2.3 below describes the calculation
performed to determine the object-to-model distance of a new sample (S_i) where m is the model,
ResXCal is the variance per x-variable, and a is the principal component number [10].

\[ S_i(m) = \sqrt{\mathrm{ResXCal}_{i,a}(m)} \]

Equation 2.3

Another measurement, called the leverage (H_i), is calculated and describes the distance
of the test samples from the mean score of the group [9]. Equation 2.4 describes how leverage is
calculated for each sample (H_i) where I is the number of samples, a/A are the principal
component number/number of principal components, and t/T are the scores (vector/matrix) [10].

\[ H_i = \frac{1}{I} + \sum_{a=1}^{A} \frac{t_{ia}^{2}}{\mathbf{t}_a^{\mathsf{T}} \mathbf{t}_a} \]

Equation 2.4

A combination of the object-to-model distance and the leverage is used to determine
whether the test samples fall within the predefined group membership limit set for each PCA
group model [9]. If the test samples do, then they are assigned to that class. If the test samples do
not, they are not assigned to that class.
If the test set does not validate the SIMCA model developed using the training set, the
model must be revised; specifically, the PCA models must be re-evaluated. This may be done by
changing the number of PCs used to describe the PCA models for each group. Models may also
be revised by removing outliers, or samples that unduly and negatively impact the model, from
the training set. Lastly, if the PCA models developed from the training set do not have enough
discriminating power, other statistical procedures may be used separately or in conjunction with
SIMCA to improve the classification.
After the model has been validated, the ‘unknown’ samples are subjected to each PCA
model to classify the samples according to group. Again, this involves calculating the
object-to-model distance as well as the leverage to determine which unknown samples fall within each
PCA group model limit.
The SIMCA procedure is considered to be a ‘soft’ classification procedure because there
are three possible outcomes for the identification of each unknown sample. The sample could be
assigned to one, multiple, or none of the groups [9]. The ability not to force a classification differs
from other statistical procedures and may be beneficial in a real world scenario because an
unknown sample may not belong to any of the predefined groups. Additionally, the assignments
can be calculated at different confidence levels and may change depending on which confidence
level is used. Thus, SIMCA not only offers the ability to classify unknown samples to groups,
but also provides a statistical confidence associated with the classification.
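
The overall logic of a soft classification can be sketched as below: one PCA model is built per class, and a new sample is assigned to every class for which both its residual distance to the model and its leverage fall within limits. This is a simplified illustration and not the procedure implemented in the software cited above. In particular, the fixed limits (`distance_factor` and `leverage_limit`) are hypothetical stand-ins for limits derived from a chosen confidence level, and the function names and simulated data are assumptions.

```python
import numpy as np

def fit_class_model(X, n_pcs):
    """Build a PCA model for one class from its training samples (rows of X)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal directions of the centered class data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_pcs].T
    scores = centered @ loadings
    residuals = centered - scores @ loadings.T
    return {
        "mean": mean,
        "loadings": loadings,
        "score_ss": (scores ** 2).sum(axis=0),     # t_a' t_a for each PC
        "s0": np.sqrt((residuals ** 2).mean()),    # typical residual of the class
        "n": X.shape[0],
    }

def distance_and_leverage(model, x):
    """Residual distance to the model and leverage (form of Equation 2.4)."""
    centered = np.asarray(x, dtype=float) - model["mean"]
    t = centered @ model["loadings"]
    residual = centered - t @ model["loadings"].T
    distance = np.sqrt((residual ** 2).mean())
    leverage = 1.0 / model["n"] + (t ** 2 / model["score_ss"]).sum()
    return distance, leverage

def classify(models, x, distance_factor=3.0, leverage_limit=0.6):
    """Return every class whose (hypothetical) membership limits the sample meets."""
    members = []
    for name, model in models.items():
        distance, leverage = distance_and_leverage(model, x)
        if distance <= distance_factor * model["s0"] and leverage <= leverage_limit:
            members.append(name)
    return members

# Simulated training data: two classes with different profiles, varying in
# overall abundance, with a small amount of random noise.
rng = np.random.default_rng(0)
base_a = np.array([1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
base_b = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])
scales = np.linspace(0.8, 1.2, 10)
train_a = np.outer(scales, base_a) + rng.normal(0.0, 0.01, (10, 6))
train_b = np.outer(scales, base_b) + rng.normal(0.0, 0.01, (10, 6))
models = {"A": fit_class_model(train_a, 1), "B": fit_class_model(train_b, 1)}

print(classify(models, 1.05 * base_a))
print(classify(models, 1.05 * base_b))
print(classify(models, np.full(6, 5.0)))
```

Because `classify` returns a list, a sample can be assigned to one class, to several classes, or to none, which mirrors the three possible outcomes described above.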

REFERENCES

1. ASTM International, ASTM E 1412-07. Annual Book of ASTM Standards.
2. Skoog DA, Holler FJ, Crouch SR, Principles of Instrumental Analysis. 6th edition.
Belmont, CA: Thomson, 2007.
3. Optimized Sensitivity, Accuracy and Reproducibility on a SINGLE Column. 2012.
(http://www.labplus.co.kr/catalog/detailed_pages/Hp1n5.pdf)
4. Chau F, Liang Y, Gao J, Shao X, Chemometrics: From Basics to Wavelet Transform.
Hoboken, NJ: John Wiley & Sons, Inc., 2004: 25-31.
5. Beebe KR, Pell RJ, Seasholtz MB, Chemometrics: A Practical Guide. New York, NY:
John Wiley & Sons, Inc., 1998: 32-34.
6. LineUp™ User Manual (version 1.0.62, Infometrix, Inc., Bothell, WA).
7. Smith LI, A tutorial on principal components analysis. 2002.
(http://www.sccg.sk/~haladova/principal_components.pdf).
8. Devore JL, Probability and Statistics for Engineering and the Sciences. Belmont, CA:
Duxbury Press, 1991: 487-490.
9. Unscrambler X SIMCA Theory Section of User Manual (version 10.2, Camo, Inc.,
Woodbridge, NJ).
10. Unscrambler X Methods Manual (version 10.2, Camo, Inc., Woodbridge, NJ).

Chapter 3: Association of Simulated Fire Debris Samples to Corresponding Standards
Using Unsupervised Statistical Procedures

3.1 Introduction
Wood, of all kinds, is an extremely common product used in building structures as well
as furnishing and decorating them. Oftentimes, the wood is treated whether it be to keep away
pests, to lend extra strength to the surface, or to make it more aesthetically pleasing. These
treatments also introduce compounds to chromatograms of the wood, which can make
identification of an ignitable liquid during an arson investigation extremely complex.
In this chapter, the use of unsupervised statistical procedures to associate simulated fire
debris samples to their corresponding standards is investigated. A set of standards for gasoline
and kerosene, at three different evaporation levels, was generated by spiking the ignitable liquid
onto a Kimwipe™ and analyzing it by gas chromatography-mass spectrometry (GC-MS). Next,
three different data sets were generated. The first data set (known as the inherent matrix
interferences data set) was generated by spiking each ignitable liquid onto unburned,
surface-treated wood. These samples were extracted using a passive headspace procedure and analyzed
by GC-MS. This data set was used to demonstrate the effects of evaporation and inherent matrix
interferences on the association of samples to their respective standards. The second data set
(matrix interference/thermal degradation data set) consisted of burned, surface-treated wood that
was spiked with each ignitable liquid, then extracted and analyzed. This data set was used to
demonstrate the effects of evaporation, matrix interferences, and thermal degradation of the
matrix on the association of the samples to their corresponding liquid standards. The final data
set (simulated fire debris data set) consisted of surface-treated wood samples that were spiked
with each ignitable liquid and then burned. This data set was used to investigate the effects of
thermal degradation of both the ignitable liquid and the matrix, in addition to evaporation and
matrix interferences, on the association of the samples to their corresponding standards. Thus,
each data set illustrates, in a piecewise manner, the effect of each complicating factor on the
association of the sample to its respective standard.
Principal components analysis (PCA) was performed on the chromatograms of the
standards to generate a standards scores plot. The three data sets of samples were projected,
separately, onto the standards scores plot to investigate the objective association of the samples
to their respective standards, in the presence of each of the complicating factors. The calculated
Pearson product moment correlation (PPMC) coefficients also provided pairwise comparisons
between chromatograms of the standards and samples. The PPMC coefficients provide a
numerical value, which describes the similarities between the chromatograms.

3.2 Materials and Methods

3.2.1 Ignitable Liquid Standards
The gasoline and kerosene used for this research were available in the laboratory. The
fuels were previously collected from fuel stations and stores in the Lansing, MI area. Both were
stored at refrigerated temperatures in acid-washed amber containers that were capped and
covered with Parafilm® (American National Can™, Greenwich, CT).
Both liquids were evaporated to three different levels by volume: 0, 50, and 90%. To do
this, a 10-mL acid-washed graduated cylinder was filled with the liquid, which was then
evaporated using a stream of nitrogen. A star-shaped stir bar was placed in each cylinder in order
to maintain the homogeneity of the liquid as it evaporated. This evaporation was done multiple
times and aliquots of each evaporated liquid were thoroughly mixed together, once again, to
ensure a homogenous sample of each evaporated liquid. The liquids were stored as described
above.
Prior to analysis, each ignitable liquid was diluted (1:10 v/v) in methylene chloride (J.T.
Baker, Phillipsburg, NJ), which contained nitrobenzene (Mallinckrodt, Inc., Paris, KY) as an
internal standard at a concentration of 0.2 M. Twenty microliters of the diluted liquid was spiked
onto a 4 × 4 cm² piece of Kimwipe™ (Kimberly-Clark Global Sales, LLC, Roswell, GA) in a
nylon bag (Grand River Products, LLC, Grosse Pointe Farms, MI). A quarter of an activated
charcoal strip (Albrayco Technologies, Inc., Cromwell, CT) hanging on a paperclip (previously
rinsed with methylene chloride) was inserted into the nylon bag, which was then sealed. Five
samples were generated in this manner for each evaporation level of both ignitable liquids. The
samples underwent a passive headspace extraction where the bags were placed in an 80 °C oven
for 4 hours, as recommended by ASTM International¹. Following extraction, the activated
charcoal strips were removed from the bags and eluted with 200 µL of methylene chloride. The
resulting extracts were analyzed, in triplicate, by gas chromatography-mass spectrometry (GC-MS).
In addition to the liquid standards, a consensus target was also prepared. The target was
made in a manner identical to the liquid standards, except that gasoline and kerosene were both
diluted (1:10 v/v) in the same aliquot of methylene chloride (containing nitrobenzene) and that
mixture was spiked onto the Kimwipe™. The consensus target was extracted and analyzed as
described above.

3.2.2 Surface-Treated Wood Samples
Unfinished red oak hardwood flooring was purchased from a local home improvement
store. The flooring boards were cut into 4.2 cm x 7 cm rectangles using a compound miter saw
(Delta® Power Equipment Corporation, Anderson, SC). The boards were 2.9 cm thick. A
Watco™ Danish Oil finish (Rust-oleum® Corporation, Vernon Hills, IL) was applied to the
wood with a disposable foam brush, as indicated by the manufacturer. More finish was applied to
areas that soaked up the finish. Thirty minutes after the first application, another coat was
applied and allowed to soak for an additional 15 minutes before the excess oil was removed with
a dry cloth, as per the manufacturer’s instructions.
Samples of untreated (n=3) and treated (n=3) wood were placed in separate nylon bags
containing an activated charcoal strip, then extracted using the passive headspace procedure
described above. Following extraction, the charcoal strips were eluted with methylene chloride
and the extracts analyzed by GC-MS, as described above. A NIST library search was performed
on the TICs of these samples in order to identify the compounds inherent to the wood and to the
surface treatment.

3.2.3 Inherent Matrix Interference Samples
Gasoline (1:10 v/v) and kerosene (9:100 v/v) were diluted in methylene chloride
containing nitrobenzene (0.2 M) as the internal standard. The same dilution factor was used for
all of the evaporation levels of that liquid. Next, 20 µL of the diluted ignitable liquid was spiked
onto a 4.2 cm x 7 cm rectangle of treated, unburned wood. This procedure was used to create
five samples per evaporation level of both ignitable liquids, resulting in a total of 30 samples.
The samples were then sealed in nylon bags and underwent the passive headspace extraction
with subsequent analysis by GC-MS as described previously. Each sample was analyzed in
triplicate, resulting in a final data set of 90 chromatograms. This data set was used to investigate
the effect of interferences inherent to the matrix on the association of the samples to their
respective standards.

3.2.4 Determination of Optimal Burn Time
The optimal burn time for the wood samples was determined by applying a propane torch
(Bernzomatic®, Medina, NY) to the surface-treated wood squares for 30, 60, 90, 120, 150, and
180 seconds. An overturned beaker was used to extinguish any flames still observed beyond the
burn time evaluated. The wood squares were sealed in nylon bags with activated charcoal strips,
then subjected to the same extraction and analysis procedures described previously.
Unburned, but treated, wood was also analyzed, in a similar manner, and used for
comparison with the chromatograms from the burned samples. The burn time that generated the
most abundant matrix interferences was selected and used throughout the rest of the study for the
matrix interference/thermal degradation and simulated fire debris samples.

3.2.5 Matrix Interference/Thermal Degradation Samples
The diluted ignitable liquid standards prepared in section 3.2.1 were spiked onto separate
4.2 cm x 7 cm rectangles of treated wood, which were previously burned for 30 seconds by
applying a propane torch. This procedure was repeated to create five samples per evaporation
level of both ignitable liquids, resulting in a total of 30 samples. The samples were then sealed in
nylon bags containing activated charcoal strips and, again, underwent the passive headspace
extraction with subsequent analysis by GC-MS. Each sample was analyzed in triplicate, resulting
in a final data set of 90 chromatograms. This data set was used to investigate the effect of
thermal degradation of the surface treatment, as well as the inherent matrix interferences and
evaporation of the ignitable liquids, on the association of samples to their respective standards.

3.2.6 Simulated Fire Debris Samples
Each ignitable liquid standard was spiked onto separate 4.2 cm x 7 cm rectangles of
treated wood, then a propane torch was applied for 30 seconds. The spike volumes were 225 µL
of gasoline and 115 µL of kerosene. These spike volumes were used for each evaporation level
of both ignitable liquids. The burned samples were placed in separate nylon bags containing
activated charcoal strips. To each sample, 20 µL of methylene chloride with nitrobenzene (0.2
M) as the internal standard was added. This procedure was used to create five samples per
evaporation level of both ignitable liquids, resulting in a total of 30 samples. The samples, again,
underwent the passive headspace extraction procedure and were analyzed by GC-MS. Each
sample was analyzed in triplicate, resulting in a final data set of 90 chromatograms. This data set
was used to investigate the effects of thermal degradation of both the surface treatment and the
ignitable liquid, as well as the evaporation of the liquid and inherent matrix interferences, on the
association of the samples to their respective standards.

3.2.7 Analysis of Samples by GC-MS
All samples were analyzed using an Agilent 6890N gas chromatograph, coupled to an
Agilent 5975C mass spectrometer, and equipped with an Agilent 7683B autosampler (Agilent
Technologies, Palo Alto, CA). The GC contained an Agilent HP-5MS capillary column (30 m x
0.25 mm I.D. x 0.25 µm film thickness). The carrier gas was ultra high purity helium (Airgas,
East Lansing, MI), at a nominal flow rate of 1 mL/min. One µL of each sample was injected
using the pulsed, splitless mode, with a pressure of 15 psi for 0.25 minutes. The inlet was
maintained at 250 °C. The GC oven temperature program was as follows: 40 °C for 3 min, 10
°C/min to 280 °C, hold for 4 min. The transfer line was maintained at 280 °C and the mass
spectrometer was operated in electron ionization mode (70 eV). Full mass scan mode was used,
scanning the range 50 to 550 amu, with a scan rate of 2.91 scans/s.
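
As a quick check on the acquisition settings above, the total run time and the approximate number of scans per chromatogram follow directly from the oven program and the scan rate. The short Python sketch below is illustrative only: the function and variable names are hypothetical, and the scan count ignores any solvent delay, which is not stated in this section.

```python
def oven_run_time_min(hold_start_min, start_c, ramp_c_per_min, final_c, hold_end_min):
    """Total run time (min) for a single-ramp GC oven program."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return hold_start_min + ramp_min + hold_end_min

# Program from the text: 40 C for 3 min, 10 C/min to 280 C, hold for 4 min.
run_min = oven_run_time_min(3, 40, 10, 280, 4)

# Scan rate from the text: 2.91 scans/s (no solvent delay assumed).
approx_scans = run_min * 60 * 2.91
```

Under these assumptions each TIC contains on the order of a few thousand data points, which is the number of variables carried into the multivariate procedures.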

3.2.8 Data Pretreatment
Data pretreatment was performed on the total ion chromatograms (TICs) of the standards’
and samples’ extracts within each data set. A Savitzky-Golay smooth was performed in the
ChemStation© Enhanced Data Analysis Software (version E.01.01.335, Agilent Technologies).
A correlation optimized warp (COW) alignment was used to align all the TICs to the TIC of the
consensus target. This alignment was performed using LineUp™ (version 1.0.62, Infometrix,
Inc., Bothell, WA). Many combinations of the warp and segment size were investigated and the
alignment afforded by each combination was evaluated based on visual assessment of the aligned
chromatograms. The parameters offering optimal alignment were a warp of 3 and a segment size
of 75 and this combination was used to align all data sets.
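
The smoothing step can be illustrated with a minimal Python sketch of a Savitzky-Golay filter. The window size and polynomial order used in ChemStation are not stated here, so the 5-point quadratic window below is an assumption chosen only for illustration, and the function name is hypothetical. The COW alignment is not reproduced.

```python
# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
# The 5-point window is an assumption, not a setting reported in the text.
SG5 = (-3, 12, 17, 12, -3)
SG5_NORM = 35

def savgol5(signal):
    """Smooth a list of abundances; the two points at each end are left unchanged."""
    smoothed = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        smoothed[i] = sum(c * y for c, y in zip(SG5, window)) / SG5_NORM
    return smoothed

quadratic = [0, 1, 4, 9, 16, 25, 36]   # a quadratic trend passes through unchanged
spike = [0, 0, 0, 35, 0, 0, 0]         # a single-point spike is lowered and broadened
```

A filter of this kind reduces point-to-point noise while largely preserving peak shape, which is why it is commonly applied before alignment.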
Next, the TICs were subjected to a total area normalization procedure, which was
performed using Microsoft Excel (version 12.0.6425.1000, Microsoft Corp., Redmond, WA).
For a specific evaporation level, the total area of each chromatogram (n=15) across all retention
times was calculated and then the average area of all 15 chromatograms was calculated. To
perform the normalization, each chromatogram was divided by its total area and then multiplied
by the corresponding average. This process was repeated for each evaporation level of each
ignitable liquid, for both the standards and the samples.
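
The total area normalization described above can be sketched in a few lines of Python. This is an illustration rather than the Excel procedure actually used: the function names are hypothetical, and the total area is approximated as the sum of abundances across all retention times.

```python
def total_area(chromatogram):
    """Total area, approximated as the summed abundance over all retention times."""
    return sum(chromatogram)

def normalize_group(chromatograms):
    """Scale every chromatogram in one group to the group's average total area."""
    areas = [total_area(c) for c in chromatograms]
    mean_area = sum(areas) / len(areas)
    return [[y / area * mean_area for y in c]
            for c, area in zip(chromatograms, areas)]

# Toy group of three chromatograms (the thesis used n=15 per evaporation level).
group = [[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [3.0, 6.0, 3.0]]
normalized = normalize_group(group)
```

After normalization, every chromatogram in the group has the same total area, so differences in injected or extracted amount no longer dominate the comparison.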

3.2.9 Principal Components Analysis
Principal components analysis was performed on the pretreated TICs of the ignitable
liquid standards using MatLab® (version 7.11.0.584, Mathworks®, Natick, MA). Scores for each
standard were generated, along with the eigenvectors and corresponding eigenvalues for each
principal component described. The scores for the standards on PC1 and PC2 were graphed in
Microsoft Excel to create the scores plot for the ignitable liquid standards. The eigenvectors for
PC1 and PC2 were plotted against the retention time (also in Microsoft Excel) to create the
loadings plot for each PC.
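
For readers without access to MatLab, the calculation of the first two principal components can be sketched in plain Python. This is not the MatLab routine used in this work: it is a minimal illustration that assumes PCA on the covariance matrix of mean-centered data and finds the eigenvectors by power iteration with deflation. All function names and the toy data are hypothetical.

```python
import math

def mean_center(rows):
    """Subtract the mean at each retention time (column) from every chromatogram (row)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return centered, means

def covariance(centered):
    """Sample covariance matrix between retention-time variables."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]            # arbitrary non-zero start vector
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * matrix[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v

def pca_two_components(rows):
    """Means, eigenvalues, eigenvectors, scores and total variance for PC1 and PC2."""
    centered, means = mean_center(rows)
    cov = covariance(centered)
    p = len(cov)
    total_variance = sum(cov[i][i] for i in range(p))
    eigenvalues, eigenvectors = [], []
    for _ in range(2):
        lam, vec = leading_eigenpair(cov)
        eigenvalues.append(lam)
        eigenvectors.append(vec)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - lam * vec[i] * vec[j] for j in range(p)] for i in range(p)]
    scores = [[sum(x * e for x, e in zip(row, vec)) for vec in eigenvectors]
              for row in centered]
    return means, eigenvalues, eigenvectors, scores, total_variance

# Toy "standards": four chromatograms measured at two retention times.
toy_standards = [[3.0, 1.0], [1.0, 1.0], [2.0, 1.5], [2.0, 0.5]]
means, eigenvalues, eigenvectors, scores, total_variance = pca_two_components(toy_standards)
explained = [lam / total_variance for lam in eigenvalues]
```

Plotting the two columns of `scores` against each other gives a scores plot, plotting each eigenvector against retention time gives the corresponding loadings plot, and `explained` gives the fraction of variance described by each PC.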
The samples in the inherent matrix interference data set were then projected onto the
scores plot generated for the liquid standards. In order to project the scores, the TICs of the
samples were mean centered. To do this, the average abundance at each retention time in the
liquid standards was calculated and then subtracted from the corresponding abundance in the
TIC of the sample. To calculate the score for a sample on PC1, the mean-centered data for that
sample was multiplied by the eigenvector for PC1 (generated from Matlab). The product was
summed across all retention times to generate the score on PC1. The score on PC2 was
calculated in a similar manner, using the eigenvector for PC2. This was repeated for all samples
and the calculated scores of the samples were graphed onto the scores plot of the standards. This
procedure was repeated for the remaining two data sets, resulting in a total of three scores plots,
in addition to the scores plot of the liquid standards.
Each scores plot was used to visually assess the association of samples to their respective
standards despite evaporation, matrix interferences, and thermal degradation.
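
The projection described above amounts to mean centering each sample with the average of the standards and taking the sum of products with the eigenvector. A minimal Python sketch follows; the function name and all numerical values are hypothetical and serve only to show the arithmetic.

```python
def project_sample(sample_tic, standards_mean, eigenvector):
    """Score of one sample on one PC that was defined by the standards."""
    # Mean center with the average abundance of the standards, not of the samples.
    centered = [y - m for y, m in zip(sample_tic, standards_mean)]
    # Multiply by the eigenvector and sum across all retention times.
    return sum(c * e for c, e in zip(centered, eigenvector))

standards_mean = [10.0, 20.0, 30.0]   # hypothetical mean TIC of the standards
pc1 = [0.6, 0.0, -0.8]                # hypothetical unit-length eigenvector for PC1
sample = [15.0, 20.0, 20.0]           # hypothetical pretreated sample TIC

score_pc1 = project_sample(sample, standards_mean, pc1)
```

Because the samples are centered with the standards' mean and do not influence the eigenvectors, their positions on the plot reflect only how they compare to the standards.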

3.2.10 Pearson Product Moment Correlation Coefficients
Pearson product moment correlation coefficients were calculated in MatLab. Coefficients
were calculated for all pairwise comparisons of the standards and the samples within each data
set, as well as among the replicates of the standards and samples. The comparison of the
standards and samples was used to investigate the similarity between the sample and its
respective standard. The comparison of standards’ replicates illustrated the precision of the
sample preparation, headspace extraction, and GC-MS analysis procedures, while the
comparison of the samples’ replicates additionally included variability introduced by the burning
process.
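
For reference, the PPMC coefficient between two chromatograms x and y is the covariance of their abundances divided by the product of their standard deviations, taken point by point across retention time. The Python sketch below shows the calculation; it is not the MatLab code used in this work, and the function name and toy chromatograms are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation coefficient between two chromatograms."""
    if len(x) != len(y):
        raise ValueError("chromatograms must have the same number of points")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant chromatogram has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

standard = [0.0, 5.0, 1.0, 8.0, 0.0]
scaled_copy = [0.0, 10.0, 2.0, 16.0, 0.0]     # same profile, twice the abundance
with_interference = [0.0, 5.0, 6.0, 8.0, 0.0]  # extra matrix peak at the third point

r_scaled = pearson_r(standard, scaled_copy)
r_interference = pearson_r(standard, with_interference)
```

A coefficient of 1 indicates identical profiles regardless of overall abundance, while an added interference peak lowers the coefficient.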

3.3 Results and Discussion

3.3.1 Characterization of Compounds Present in Ignitable Liquid Standards
3.3.1.1 Gasoline
Exemplar total ion chromatograms (TICs) of each evaporation level for the gasoline
standards are shown in Figure 3.1. The compounds characteristic of 0% evaporated gasoline
(Figure 3.1A) are toluene, the C2-, C3-, and C4-alkylbenzenes, and the methylnaphthalenes.
The evaporation of gasoline leads to a loss, or decrease in abundance, of the early-eluting,
more volatile, compounds. At 50% evaporation (Figure 3.1B), there is a decrease in
abundance of toluene, while the later-eluting compounds become concentrated. At 90%
evaporation (Figure 3.1C), there is significant evaporative loss of toluene as well as a decrease in
abundance of the C2-alkylbenzenes. The later-eluting C3- and C4-alkylbenzenes, as well as the
methylnaphthalenes, become more concentrated, leading to an increase in abundance of these
compounds.

Figure 3.1: Total ion chromatograms of A) 0%, B) 50%, and C) 90% evaporated
gasoline. The internal standard used was nitrobenzene. (Axes: abundance versus retention
time (min). Labeled peaks: toluene, C2-, C3-, and C4-alkylbenzenes, methylnaphthalenes,
and the internal standard.)

3.3.1.2 Kerosene
Exemplar TICs of each evaporation level for the kerosene standards are shown in Figure
3.2. The 0% evaporated kerosene (Figure 3.2A) contains normal (n)-alkanes in a Gaussian
distribution. The kerosene used in this project contained n-alkanes C9 through C17, in addition to
a myriad of branched alkanes and aromatic compounds. The C17 peak, however, is in such low abundance
that its existence is not immediately obvious in the 0% evaporated standard.
Once again, the evaporative process results in the loss, or decrease in abundance, of
early-eluting, more volatile, compounds. At 50% evaporation (Figure 3.2B), there is evaporative
loss of the early-eluting aromatic compounds, as well as the C9 and C10 n-alkanes, while the
abundances of the later-eluting alkanes increase. At this evaporation level, a Gaussian
distribution of the remaining n-alkanes is still obvious. At 90% evaporation (Figure 3.2C), there
is a significant loss of the n-alkanes up to C13. The abundance of the remaining compounds is
markedly increased. At this evaporation, the C17 peak is also visible for the first time.
Additionally, the ratios of the later-eluting n-alkanes change from the Gaussian distribution
described for the 0% and 50% evaporated kerosene. At 90% evaporation, C15 becomes the most
abundant compound, followed by C14, then C16, whereas, before, the abundance of C15 was
between C14 and C16.

Figure 3.2: Total ion chromatograms of A) 0%, B) 50%, and C) 90% evaporated
kerosene. The internal standard used was nitrobenzene. (Axes: abundance versus retention
time (min). Labeled peaks: C9 through C17 n-alkanes and the internal standard.)

3.3.2 Association and Discrimination of Ignitable Liquid Standards
A combined total of approximately 84% of the variance within the data set is described
by the first and second principal components (PC1 and PC2, respectively) in the scores plot of
the ignitable liquid standards (Figure 3.3). Replicates of each liquid are clustered closely and
each liquid forms a distinct cluster from the others. From visual assessment of the scores plot,
ignitable liquid type can be discriminated on PC1. Additionally, each evaporation level for both
liquids can be distinguished when using both PCs.
The gasoline standards are positioned positively on PC1, whereas the kerosene standards
are positioned negatively. This difference in positioning can be explained by using the loadings
plot for PC1 (Figure 3.4). The plot shows that toluene and the C2-, C3-, and C4-alkylbenzenes are
weighted positively. These compounds are present in the gasoline standards, thus explaining the
positive positioning on the scores plot. The n-alkanes (C11-C17) are weighted negatively on PC1
in the loadings plot. These compounds are present in the kerosene standards, thus explaining the
negative positioning on PC1 in the scores plot. It should be noted that since the n-alkanes C9 and
C10 have a low weighting in the loadings plot, these compounds do not contribute significantly
to the positioning of the kerosene standards on PC1 in the scores plot.

Figure 3.3: Scores plot of PC1 versus PC2 based on the total ion chromatograms for
gasoline and kerosene at the three different evaporation levels. In terms of color, blue,
green, and purple represent 0%, 50%, and 90% evaporated kerosene while red, orange,
and yellow represent 0%, 50%, and 90% evaporated gasoline. For interpretation of the
references to color in this and all other figures, the reader is referred to the electronic
version of this thesis. (Axes: Principal Component 1 (68.82%) versus Principal
Component 2 (15.61%).)

Figure 3.4: Loadings plot of PC1 based on the total ion chromatograms of the
unevaporated and evaporated ignitable liquid standards. (Axes: Principal Component 1
loading, -0.25 to 0.25, versus retention time (min). Labeled peaks: toluene, C2- and
C3-alkylbenzenes, and the C11 through C17 n-alkanes.)

The 50% and 90% evaporated standards of gasoline are positioned more positively, while
the 50% and 90% evaporated kerosene standards are positioned more negatively, on PC1 than
their 0% evaporated counterparts. These positioning shifts can be explained by examining the
chromatograms of the standards at each evaporation level (Figures 3.1 and 3.2), as well as the
loadings plot for PC1 (Figure 3.4). As gasoline is evaporated to 50%, toluene decreases in
abundance, while the remaining compounds increase in abundance. A slight increase in the C2-
and C3-alkylbenzenes, which are more positively weighted than toluene in the loadings plot,
results in a more positive positioning of the 50% evaporated standard on PC1 in the scores plot,
compared to the 0% evaporated standard. At 90% evaporation, toluene is present at very low
abundance and the C2-alkylbenzenes decrease in abundance. Conversely, the C3- and
C4-alkylbenzenes increase significantly in abundance. Collectively, the C3- and C4-alkylbenzenes
are more positively weighted than toluene and the C2-alkylbenzenes in the loadings plot. The
marked increase of these late-eluting compounds overcompensates for the decrease in abundance
of toluene and the C2-alkylbenzenes, which results in a more positive positioning of the 90%
evaporated gasoline standards on PC1 in the scores plot, compared to the 0% and 50%
evaporated standards.
The more negative positioning of the evaporated kerosene standards on PC1, compared to
the 0% evaporated standard, can be explained similarly. As kerosene is evaporated, some of the
earlier eluting compounds (C9-C13) undergo varying degrees of evaporative loss. While the
characteristic kerosene compounds are all weighted negatively in the loadings plot for PC1, these
earlier eluting compounds are less heavily weighted and, therefore, do not contribute greatly to
the positioning of the standards on the scores plot. The most heavily weighted compounds in the
loadings plot are C13 through C16, which are concentrated as evaporation level increases. An
increase in concentration of the most heavily weighted compounds as kerosene is evaporated,
therefore, explains the more negative position of the 50% and 90% evaporated standards on PC1
compared to the 0% evaporated standard.
The loadings plot for PC2 (Figure 3.5) can be used, in a similar manner, to explain the
positioning of the standards on PC2 in the scores plot. The 0% and 50% evaporated gasoline
standards are positioned negatively, whereas the 90% evaporated standard is positioned
positively. According to the loadings plot, the only compounds present in gasoline that
contribute significantly to its positioning on this PC are toluene and the C2-alkylbenzenes. These
compounds are weighted negatively and are present in highest abundances in the 0% and 50%
standards, explaining the negative positioning of these standards in the scores plot. In addition,
since these compounds are present in similar abundances in the 0% and 50% evaporated
standards, the standards are positioned similarly on PC2. The 90% evaporated standard is
positioned positively on PC2 in part, because of the lower abundance of toluene and the
C2-alkylbenzenes as a result of evaporation, and in part, due to the mean centering of the data.

Figure 3.5: Loadings plot of PC2 based on the total ion chromatograms of the
unevaporated and evaporated ignitable liquid standards. (Axes: Principal Component 2
loading, -0.25 to 0.25, versus retention time (min). Labeled peaks: toluene,
C2-alkylbenzenes, and the C9 through C17 n-alkanes.)

Figure 3.6: Mean-centered total ion chromatogram of the 90% evaporated gasoline
standard demonstrating the introduction of n-alkanes from the kerosene standards.
(Axes: abundance versus retention time (min). Labeled peaks: toluene, C2- and
C3-alkylbenzenes, and the C13 through C17 n-alkanes.)

When the chromatographic data are mean centered, the average abundance at each
retention time across all standards is subtracted from each standard chromatogram at the
corresponding retention time. The result of this procedure is that sometimes compounds that are
not originally present in the standard can be introduced into the chromatogram. For the gasoline
standards, n-alkanes C13 through C17 were introduced into the chromatograms (Figure 3.6). The
n-alkanes were present in high abundance in the kerosene standards, which means that the
average value at that retention time was a large positive number. As a result, when the
average was subtracted from the gasoline standards, the mean-centered data contained a negative
contribution from these n-alkanes.
The mean-centered data are then multiplied by the eigenvector for the PC to generate the
score of the sample on that PC. For example, in the case of the 90% evaporated gasoline
standard, the n-alkanes C14 through C17, which contribute negatively in the mean-centered data,
are also weighted negatively on PC2. When these two negatives are multiplied, a positive
loading results for each of the n-alkanes. It should be noted that C13, which is also negatively
introduced in the chromatogram, is weighted positively in the loadings plot for PC2, resulting in
one negative loading.
Additionally, the average abundance of toluene, calculated across all standards, is greater
than its abundance in the 90% evaporated gasoline standard; therefore, toluene is also negatively
introduced into the chromatogram. Since toluene is negatively weighted in the loadings plot for
PC2, and negatively introduced into the mean-centered chromatogram, it contributes positively
to the positioning of the 90% evaporated gasoline standard in the scores plot. The final score for the
sample, which is graphed in the scores plot, is the sum of the loadings across all retention times.
Overall, for the 90% evaporated gasoline standard, the final score is positive.
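
The sign logic in the preceding paragraphs can be followed with a small numerical illustration in Python. The abundances and eigenvector values below are invented for illustration and are not taken from the data; only their signs follow the description of the PC2 loadings given above (toluene and C15 weighted negatively, C13 weighted positively).

```python
# Three retention times only: toluene, C13 and C15 (all values hypothetical).
labels = ["toluene", "C13", "C15"]
mean_of_standards = [40.0, 50.0, 60.0]   # average abundance across all standards
gasoline_90 = [5.0, 0.0, 0.0]            # 90% evaporated gasoline: no n-alkanes present
pc2_eigenvector = [-0.5, 0.2, -0.6]      # signs as described for the PC2 loadings plot

# Mean centering makes every value negative, which "introduces" the n-alkanes.
centered = [g - m for g, m in zip(gasoline_90, mean_of_standards)]

# Product at each retention time: negative x negative gives a positive contribution.
contributions = [c * w for c, w in zip(centered, pc2_eigenvector)]

# The score plotted on PC2 is the sum of the products across all retention times.
score_pc2 = sum(contributions)
```

In this toy case, toluene and C15 contribute positively and C13 contributes negatively, and the sum is positive, mirroring the positive position of the 90% evaporated gasoline standard on PC2.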
For the 0% and 50% evaporated gasoline standards, the mean-centered data also contain
negative contributions from the n-alkanes. However, in this case, more of these n-alkanes are
weighted positively in the PC2 loadings plot. When the mean-centered data are multiplied by the
eigenvector, the result is an increase in the number of negative loadings. When summed, the
negative loadings cancel out many of the positive loadings that contribute to the positive
positioning of the 90% evaporated gasoline standard in the scores plot. Hence, the introduction
of the n-alkanes, in the neat and 50% evaporated gasoline standards, does not contribute
significantly to their positioning in the scores plot. Thus, positioning of the 0% and 50%
evaporated gasoline standards is more affected by the decrease in abundance of toluene and the
C2-alkylbenzenes than by the compounds introduced during the process of mean centering.
The 0% and 50% evaporated kerosene standards are positioned positively on PC2, while
the 90% evaporated standard is positioned negatively on this PC in the scores plot. In the
loadings plot for PC2 (Figure 3.5), C9-C13 n-alkanes are weighted positively and C14-C17 are
weighted negatively. The 0% evaporated kerosene standard contains all of these compounds.
Overall, more of the n-alkanes are weighted positively, and are more heavily weighted, than the
n-alkanes that are weighted negatively. This results in the positive positioning of the 0%
evaporated kerosene standard in the scores plot. The 50% evaporated standard contains similar
abundances of the positively weighted (C11-C13) and negatively weighted (C14-C17) n-alkanes.
Because the positively weighted compounds contribute more on this PC than the negatively
weighted compounds, the 50% evaporated standard is also positively positioned in the scores
plot. However, the 50% standard is less positively positioned on PC2 than the 0% evaporated
standard due to evaporative loss of C9 and C10, which are weighted positively in the PC2
loadings plot. The 90% evaporated standard is positioned negatively on PC2 because it contains
only one compound (C13) that is weighted positively on this PC, while the remaining compounds
(C14-C17) are all weighted negatively.

3.3.3 PPMC Coefficients for Ignitable Liquid Standards
Mean PPMC coefficients calculated for pairwise comparisons of replicates, at each
evaporation level for the gasoline and kerosene standards, demonstrate the precision of the
extraction and analysis procedures (Table 3.1). In theory, the PPMC coefficients calculated for
replicates should be 1, indicating complete correlation. In reality, however, a value of 1 is
difficult to attain due to small imprecisions in the measured spike volume of the ignitable liquid, the
variability in the passive headspace extraction procedure, and variability in the GC-MS analysis.
All of the replicates for each evaporation level are strongly correlated with a coefficient greater
than 0.98. The strong correlations, coupled with the small standard deviations for the
coefficients, indicate that the extraction and analysis procedures are precise.
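
The n=105 reported in Table 3.1 is consistent with all unordered pairs among 15 replicate chromatograms (five preparations, each analyzed in triplicate). The Python sketch below shows how such a mean and standard deviation could be obtained; the function names and the toy replicates are hypothetical, and this is not the MatLab code used in this work.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson product moment correlation coefficient between two chromatograms."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def replicate_summary(chromatograms):
    """Number of pairs, mean r and standard deviation of r over all unordered pairs."""
    coefficients = [pearson_r(a, b) for a, b in combinations(chromatograms, 2)]
    return len(coefficients), mean(coefficients), stdev(coefficients)

pairs_for_15_replicates = comb(15, 2)   # 15 * 14 / 2 pairwise comparisons

# Fifteen toy replicates: one base profile plus a small, deterministic perturbation.
base = [0.0, 5.0, 1.0, 8.0, 0.0, 3.0]
replicates = [[b + 0.01 * ((k + i) % 3) for i, b in enumerate(base)] for k in range(15)]
n_pairs, mean_r, sd_r = replicate_summary(replicates)
```

A mean coefficient close to 1 with a small standard deviation, as in Table 3.1, indicates that the replicates are nearly indistinguishable.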

3.3.4 Characterization of Compounds Present in Surface-Treated Wood Flooring
The most identifiable compounds present in the chromatograms of the unburned,
surface-treated wood are the C9, C10, C11, and C12 n-alkanes (Figure 3.7). This is an important
observation since all of these alkanes are also present in the kerosene standards. Also present in
the surface treatment are branched and cyclic alkanes, as well as aldehydes.
It is important to note that all of the compounds present in the chromatogram of the
treated wood come from the treatment itself and not from the wood. Extraction and analysis of
untreated wood, in a similar manner, yielded two very small and unidentifiable peaks at the
beginning of the resulting chromatogram. Peaks at those retention times were not present in the
chromatogram of the treated wood.

Table 3.1: Mean Pearson product moment correlation coefficients ± the standard
deviations calculated for replicates of standards at each evaporation level (n=105).

Ignitable Liquid Standard      Mean PPMC Coefficient
Evaporation Level              ± Standard Deviation (n=105)
0% Gasoline                    0.9976 ± 0.0017
50% Gasoline                   0.9947 ± 0.0039
90% Gasoline                   0.9956 ± 0.0027
0% Kerosene                    0.9969 ± 0.0023
50% Kerosene                   0.9950 ± 0.0036
90% Kerosene                   0.9839 ± 0.0184

Figure 3.7: Total ion chromatograms of extracts of surface-treated wood burned for
A) 0 seconds, B) 30 seconds, and C) 150 seconds. (Axes: abundance versus retention
time (min). Labeled peaks: C9, C10, C11, and C12 n-alkanes.)

The absence of compounds in the chromatograms of the untreated wood could be for a
number of reasons. Before wood is used in homes, it oftentimes goes through an intense drying
stage; therefore, this wood may have been sufficiently dried so that all of the volatiles were
removed or present at extremely low abundance.

3.3.5 Optimization of Burn Times
The burn times investigated were 30, 60, 90, 120, 150, and 180 s. A sample of unburned,
surface-treated wood was also analyzed for comparison purposes. Exemplar chromatograms are
shown in Figure 3.7. The unburned, surface-treated wood showed the most abundant matrix
interferences (Figure 3.7A). As burn time increased, there was a marked decrease in abundance
of matrix interference compounds. At burn times greater than 150 s (Figure 3.7C), no peaks were
observed in the chromatograms. This is likely due to the flame removing the entire layer of the
surface treatment. This hypothesis is strengthened by the fact that, as the burn time increases, the
abundances of the peaks decrease until C10 and C11 are barely visible at 150 seconds.
Since the matrix interference/thermal degradation and simulated fire debris samples
require that the wood be burned, the burn time had to be balanced with the observed decrease in
abundance of the interferences. Based on this compromise, a burn time of 30 s (Figure 3.7B) was
used to create the matrix interference/thermal degradation and simulated fire debris samples.
This short burn time ensured that the abundances of the interferences were maximized. Shorter
burn times between 0 and 30 s were not investigated because a shorter time period could further
contribute to the irreproducibility of the burning process.
In terms of the compounds present, the chromatograms of the unburned and burned
surface-treated wood appear very similar (Figure 3.7A and B). It was expected that compounds
from the surface treatment would be degraded by the heat of the propane flame, generating additional
compounds. However, this does not appear to be the case for any of the burn times evaluated.

3.3.6 Association of Samples to Corresponding Standards in the Presence of Inherent Matrix
Interferences and Thermal Degradation
Principal components analysis was performed on the data set containing liquids extracted
from the unburned surface-treated wood to investigate the effect of matrix interferences on the
association to the liquid standard. Similarly, PCA on the data set containing liquids extracted
from the burned, surface-treated wood was used to investigate the effect of both matrix
interferences and thermal degradation on the association. In general, similar trends were
observed in the scores plots for both data sets and hence, only results from the liquids extracted
from the burned surface-treated wood will be discussed in detail.
Scores for the gasoline and kerosene samples extracted from the burned matrix were
calculated and projected onto the scores plot generated for the liquid standards (Figure 3.8). The
gasoline samples are all positively positioned on PC1 in the scores plot, similarly to the
corresponding standards, whereas the kerosene samples are negatively positioned on this PC,
similar to the kerosene standards. Thus, the gasoline and kerosene samples can be associated to
their corresponding standards by liquid type. The samples cannot, however, be associated to their
respective standards in terms of evaporation level. Even though the gasoline and kerosene
samples are clustered by evaporation level, the samples are spaced too far apart from their
respective standards to be associated to them based solely on visual assessment of the plot.
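
The projection step described above can be sketched in Python. This is a minimal illustration, assuming NumPy, mean-centered data, and PCA computed by singular value decomposition; the function names and the random stand-in data are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def fit_pca(standards, n_components=2):
    """Fit PCA to the standards only (rows = chromatograms, columns = time points)."""
    mean = standards.mean(axis=0)
    # Rows of vt are the loadings, ordered by explained variance.
    _, _, vt = np.linalg.svd(standards - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(chromatograms, mean, loadings):
    """Scores of chromatograms on PCs that were defined by the standards."""
    return (chromatograms - mean) @ loadings.T

# Hypothetical stand-in data: 6 standards, 50 retention-time points each.
rng = np.random.default_rng(0)
standards = rng.random((6, 50))
mean, loadings = fit_pca(standards)

standard_scores = project(standards, mean, loadings)
# Hypothetical "samples": two standards recovered at half abundance.
sample_scores = project(0.5 * standards[:2], mean, loadings)
```

Because the samples are projected rather than included in the fit, the PCs (and therefore the loadings) are determined by the standards alone, and the samples are simply located on that existing scores plot.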

[Figure 3.8 image: scores plot; axes Principal Component 1 (68.82%) and Principal Component 2 (15.61%)]

Figure 3.8: Scores plot of PC1 versus PC2 based on the total ion chromatograms for the
ignitable liquid standards, represented by the squares, and the projected scores of the
matrix interference/thermal degradation samples, represented by the circles. In terms of
color, blue, green, and purple represent 0%, 50%, and 90% evaporated kerosene while
red, orange, and yellow represent 0%, 50%, and 90% evaporated gasoline.

The less positive positioning of the gasoline samples on PC1, relative to their corresponding
standards in the scores plot, is due to differences in abundance of the gasoline compounds in the
chromatograms. While these differences in abundance can be observed in all of the gasoline
compounds, only those compounds present in the loadings plot for PC1 will affect the
positioning of the samples in the scores plot. This negative shift is exhibited by all of the
gasoline samples, but it is most pronounced between the 50% evaporated gasoline samples and
standards. In the chromatograms of the 50% evaporated samples, the abundances of the
characteristic compounds are 0.25 to 0.5 times the abundances of the corresponding
compounds in the standards (Figure 3.9). The decrease in abundance is translated in the scores
plot such that the scores on PC1 for the 50% evaporated samples are still positive, but one-fourth
to one-half of the magnitude of those of their respective standards. This decrease in abundance of varying
degrees between standards and samples is true at all evaporation levels, resulting in less positive
positioning of the samples on PC1 compared to the standards.
The range in abundance of gasoline compounds across all gasoline samples, regardless of
evaporation level, also resulted in the spread of the samples on PC1. The variations in abundance
of compounds are likely due to the porous nature of the wood, since some of the ignitable liquid
soaked into the wood. Additionally, the presence of the surface treatment may have affected the
extent to which the gasoline soaked into the wood. The compounds may not have been entirely
available for extraction using the passive headspace procedure, which led to the range in
abundance of gasoline compounds observed in the chromatograms and illustrated by the scores
plot.

[Figure 3.9 image: abundance versus retention time (0 to 13 min)]

Figure 3.9: Total ion chromatograms of a 50% evaporated gasoline standard (green) and
two 50% evaporated gasoline inherent matrix interference samples (red and black),
demonstrating the differences in abundance between the standards and samples.

The positive shift of the kerosene samples in comparison to their respective standards on
PC1 can be explained in a similar manner. A decrease in abundance of the n-alkanes in the
samples, which are negatively weighted in the loadings plot for PC1, translated into the positive
shift of the kerosene samples in the scores plot. The differences in the abundances of the
n-alkanes are greater than those observed in the gasoline samples, resulting in a greater shift of the
kerosene samples compared to the corresponding standards than previously observed for the
gasoline samples and standards.
The shift in positioning of the gasoline and kerosene samples compared to their standards
on PC2 can be explained in a similar manner, based on differences in abundance. All samples
exhibit spread on PC2 except the 90% evaporated gasoline and 50% evaporated kerosene
samples. For gasoline, only toluene and the C2-alkylbenzenes affect the positioning of the
samples on PC2, according to the loadings plot for this PC (Figure 3.5). At 90% evaporation, the
gasoline samples do not contain a significant abundance of toluene, and the C2-alkylbenzenes are
present at the lowest abundance of all the evaporation levels. As a result, differences in
abundances of these compounds result in minimal spread on this PC.
The 50% evaporated kerosene samples exhibit less spread on PC2 than on PC1
because of the weighting of the n-alkanes present in the loadings plots for both PCs. The
chromatograms of the 50% evaporated kerosene samples contain n-alkanes C11-C16 in a
Gaussian distribution. In the loadings plot for PC2, C11-C13 are positively weighted and
C14-C16 are negatively weighted. The positively and negatively weighted n-alkanes are present in
collectively equal abundances in the sample chromatograms; however, the positively weighted
n-alkanes are more heavily weighted in the loadings. As a result, any variation in abundance of
these n-alkanes contributes both positively and negatively to the scores of the samples.
Specifically, the effect of a decrease in abundance of the positively weighted n-alkanes is offset by
the proportional decrease in abundance of the negatively weighted n-alkanes. Consequently,
differences in abundances of n-alkanes among replicates will not create significant spread. This can
be contrasted with the loadings for PC1, in which all of the n-alkanes load negatively; therefore, any
decrease in the abundance of n-alkanes will result in an entirely positive shift in positioning of
the samples.
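
The offsetting effect on PC2, and its absence on PC1, can be illustrated with a small numerical sketch in Python. The abundances and loadings below are hypothetical values chosen only to mimic the pattern described (all-negative loadings on PC1; mixed-sign loadings on PC2 with slightly heavier positive weights); they are not the loadings from Figures 3.4 and 3.5. The shift in a score caused by a uniform fractional loss equals the loss fraction times the dot product of the abundances and loadings, with the sign reversed, and this holds whether or not the data are mean-centered.

```python
import numpy as np

# Hypothetical abundances for C11-C16, roughly Gaussian across carbon number.
abundance = np.array([1.0, 2.0, 3.0, 3.0, 2.0, 1.0])

# Hypothetical loadings: all negative on PC1; mixed sign on PC2, with the
# positive (C11-C13) loadings slightly heavier than the negative (C14-C16) ones.
pc1_loadings = np.array([-0.30, -0.40, -0.50, -0.50, -0.40, -0.30])
pc2_loadings = np.array([0.50, 0.45, 0.35, -0.30, -0.40, -0.45])

def score_shift(abundance, loadings, fraction_lost):
    """Change in score when every abundance drops by the same fraction."""
    return float((-fraction_lost * abundance) @ loadings)

shift_pc1 = score_shift(abundance, pc1_loadings, 0.30)  # entirely positive shift
shift_pc2 = score_shift(abundance, pc2_loadings, 0.30)  # contributions largely cancel
```

With these illustrative numbers, a 30% loss moves the PC1 score by about +1.56 but the PC2 score by only about -0.09, which is the qualitative behavior described for the 50% evaporated kerosene samples.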
It should be noted that the presence of the surface treatment does not significantly affect
the overall positioning of the samples for two reasons. First, although the surface treatment
contributes C9-C12 to the chromatograms of the samples, only C11 and C12 are present in the
PC1 loadings plot. Furthermore, C11 and C12 are not heavily weighted in the loadings plot for
PC1 so they do not contribute significantly to the positioning of the samples on PC1 in the scores
plot. Second, the surface-treated wood was burned for 30 s prior to being spiked, which diminishes
the abundances of these compounds in the chromatograms. Because of the low abundances, the
surface treatment does not contribute significantly to the positioning of the scores on PC2 either,
even though the compounds are more heavily weighted in the loadings plot for this PC.
The scores plot generated for the liquids extracted from the unburned surface-treated
wood samples displayed the same general trends in terms of positioning of the samples compared
to the corresponding liquid standards (Figure 3.10). However, there was one notable difference:
on PC2, all samples, except for 90% evaporated gasoline, exhibited a positive shift compared to
the corresponding standard. This shift was not apparent in the scores plot for the liquids extracted
from the burned wood samples (Figure 3.8). The positive shift is due to the addition of the
C9-C12 n-alkanes from the surface treatment, which are weighted positively in the loadings plot for
PC2. A visual comparison between the matrix interference samples and the corresponding matrix
interference/thermal degradation samples reveals that the abundance of the n-alkanes from the
surface treatment is much higher in the chromatograms of the former data set. The differences in
abundance of n-alkanes between the two data sets are due to the fact that a flame was applied to
the surface-treated wood to generate the matrix interference/thermal degradation samples, but not
to generate the inherent matrix interference samples. The burning resulted in a decrease in
abundance of n-alkanes in one data set, but not in the other. Therefore, samples from the matrix
interference data set are positioned more positively on PC2 than the standards due to the high
abundance of the matrix interference compounds.

[Figure 3.10 image: scores plot; axes Principal Component 1 (68.82%) and Principal Component 2 (15.61%)]

Figure 3.10: Scores plot of PC1 versus PC2 based on the total ion chromatograms for the
ignitable liquid standards, represented by the squares, and the projected scores of the
inherent matrix interference samples, represented by the circles. In terms of
color, blue, green, and purple represent 0%, 50%, and 90% evaporated kerosene while
red, orange, and yellow represent 0%, 50%, and 90% evaporated gasoline.

The positive shift in positioning on the scores plot is not exhibited for the 90%
evaporated gasoline inherent matrix interference samples, which are similarly positioned to the
corresponding samples in the matrix interference/thermal degradation scores plot. A visual
comparison of the 90% evaporated gasoline sample chromatograms from both data sets reveals
little difference in the abundance of the n-alkanes from the surface treatment or the compounds
characteristic of gasoline, thus explaining their similar positioning. A possible explanation for
the similarities of the sample chromatograms from both data sets is analyst error. The 90%
evaporated gasoline samples from both data sets were generated on the same day. It is possible
that the surface-treated wood pieces were all burned before the evaporated gasoline was spiked
onto them. In this way, two sets of 90% evaporated gasoline matrix interference/thermal
degradation samples may have been inadvertently generated. This would explain the lower
abundance of the n-alkanes from the surface treatment in the 90% evaporated gasoline samples
in the inherent matrix interference data set as opposed to the higher abundance of n-alkanes in
the rest of the samples within this data set. It also explains why these same 90% evaporated
gasoline samples did not exhibit the expected positive shift on PC2 in the scores plot.
Differences in abundances of compounds between standards and samples, as well as
among replicates, are the main factors contributing to the shift in positioning of the samples on
the scores plot, compared to the corresponding standards, as well as the spread exhibited by
them. This is a result of inadequate normalization procedures. Specifically, the total area
normalization performed was able to minimize the variations between abundances of replicates
of each individual sample, but not across all samples. This could potentially have been corrected
by normalizing to an internal standard in addition to the total area normalization, but the internal
standard in this study was affected by the porosity of the wood to the same extent that the
ignitable liquids were. This resulted in variations in abundance of the internal standard even
though it was applied to the samples and standards at the same concentration. The raw
chromatograms of the sample replicates contain identical compounds, but in different
abundances; therefore, better normalization procedures should be able to reduce the spread in the
scores plot. Improved normalization procedures may also increase the association of the samples
to the standards in both data sets because the positioning of the samples is predominantly
determined by the abundance of compounds from the ignitable liquids, which vary greatly from
standards to samples.
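
The two normalization options discussed above can be sketched as follows. This is a minimal Python illustration, assuming NumPy and that total area normalization means dividing each chromatogram by its summed abundance; the exact procedure used in this study may differ, and the function names, data, and internal standard window are hypothetical.

```python
import numpy as np

def total_area_normalize(chromatogram):
    """Scale a chromatogram so that its summed abundance equals 1."""
    chromatogram = np.asarray(chromatogram, dtype=float)
    return chromatogram / chromatogram.sum()

def internal_standard_normalize(chromatogram, is_window):
    """Scale a chromatogram by the summed abundance inside the internal standard window."""
    chromatogram = np.asarray(chromatogram, dtype=float)
    return chromatogram / chromatogram[is_window].sum()

# Hypothetical replicate pair: same compounds, second replicate recovered at 40%.
replicate_a = np.array([0.0, 2.0, 8.0, 2.0, 0.0, 1.0, 4.0, 1.0])
replicate_b = 0.4 * replicate_a
is_window = slice(5, 8)  # hypothetical data points covering the internal standard peak

area_a = total_area_normalize(replicate_a)
area_b = total_area_normalize(replicate_b)
istd_a = internal_standard_normalize(replicate_a, is_window)
istd_b = internal_standard_normalize(replicate_b, is_window)
```

In this idealized case both approaches make the two replicates identical, because the loss is the same for every compound. Dividing by a single number cannot correct a loss that differs from compound to compound, which is the situation an uneven uptake by the wood would create.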

3.3.7 PPMC Coefficients for Inherent Matrix Interference Samples
The calculated mean PPMC coefficients for replicates of the inherent matrix interference
samples are greater than 0.92, indicating strong correlation among replicates for each
evaporation level of each liquid (Table 3.2). Even the samples, such as the 90% evaporated
kerosene samples, that are positioned both negatively and positively on PC2, are strongly
correlated to one another. Initially these results may seem to conflict with one another, but they
do not. This is because PCA and PPMC coefficients are two fundamentally different statistical
procedures that highlight different aspects of the data (variations and similarities) within the data
set.

Table 3.2: Mean Pearson product moment correlation coefficients ± standard
deviations for replicates of the inherent matrix interference samples (n=105) and for
samples to 0% evaporated gasoline and kerosene (n=225) standards.

                        Mean PPMC Coefficient ± Standard Deviation
Ignitable Liquid        Sample Replicates  0% Evaporated      0% Evaporated
Sample Evaporation      (n=105)            Gasoline (n=225)   Kerosene (n=225)
Level
0% Gasoline             0.9356 ± 0.0525    0.5205 ± 0.1370    0.4492 ± 0.0530
50% Gasoline            0.9672 ± 0.0270    0.5978 ± 0.0830    0.4346 ± 0.0282
90% Gasoline            0.9794 ± 0.0179    0.6251 ± 0.0557    0.4813 ± 0.0242
0% Kerosene             0.9726 ± 0.0264    0.2756 ± 0.0481    0.5639 ± 0.0751
50% Kerosene            0.9650 ± 0.0245    0.3011 ± 0.0421    0.6476 ± 0.0674
90% Kerosene            0.9240 ± 0.0586    0.3535 ± 0.0485    0.6184 ± 0.0588

Principal components analysis identifies and emphasizes specific variables across a data
set that describe the majority of the variance, in order to discriminate samples from one another.
Consequently, samples are not discriminated based on all of the compounds in the
chromatograms; only specific compounds are considered. The extent of the discrimination is
based on the magnitude of each compound’s contribution to the variance in the data set, as well as the
abundance of the compound in the sample chromatogram.
Pearson product moment correlation coefficients, on the other hand, provide a
point-by-point comparison between two chromatograms in an effort to describe the similarity or extent of
correlation between samples. As a result, coefficients are affected by differences in the retention
time at which a peak begins, reaches the apex, and ends. Even for peaks with an apex at the same
point, differences in the width of the peak translate into differences in the beginning and end
retention times of the peak, which lowers the coefficient.
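
The behavior described above can be illustrated with a minimal Python sketch, assuming NumPy, idealized Gaussian peak shapes, and a hypothetical retention-time window. The function names and peak parameters are illustrative assumptions and do not come from the original analysis.

```python
import numpy as np

def ppmc(a, b):
    """Pearson product moment correlation between two equal-length signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da = a - a.mean()
    db = b - b.mean()
    return float((da @ db) / np.sqrt((da @ da) * (db @ db)))

def gaussian_peak(t, apex, width):
    """Idealized chromatographic peak with unit height."""
    return np.exp(-0.5 * ((t - apex) / width) ** 2)

t = np.linspace(7.63, 7.80, 200)              # hypothetical retention-time window (min)
reference = gaussian_peak(t, 7.70, 0.010)
scaled = 0.25 * reference                     # same peak at one-fourth the abundance
broadened = gaussian_peak(t, 7.70, 0.020)     # same apex, wider peak

r_scaled = ppmc(reference, scaled)
r_broadened = ppmc(reference, broadened)
```

Here `r_scaled` remains at 1 to within rounding, because scaling leaves the point-by-point pattern unchanged, while `r_broadened` falls below 1 even though both peaks share the same apex.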
As a result of these fundamental differences, samples that contain the same compounds,
in different abundances, may seem to be discriminated by PCA, yet be strongly correlated
according to PPMC coefficients. This is demonstrated by the 90% evaporated kerosene samples,
which are positioned positively and negatively in the scores plot (Figure 3.10) but are strongly
correlated (Table 3.2). For PCA, the C9-C13 n-alkanes are weighted positively on PC2, while
C14-C17 are weighted negatively. Kerosene evaporated to 90% by volume contains C13-C17,
which results in an overall negative positioning of the standards, but the addition of the
positively weighted C9-C13 from the surface treatment results in a positive shift of the samples.
Differences in abundance of the positively weighted n-alkanes in the surface treatment
correspond to the extent of the positive shift; large abundances will result in a positive score
whereas small abundances will result in a negative score for the samples. Since PPMC
coefficients are insensitive to differences in overall abundance, a point-by-point comparison of
the peaks in the chromatograms of the 90% evaporated kerosene samples resulted in a strong
correlation because the peak widths and, consequently, the relative abundance of the data points
at each retention time do not vary significantly between the chromatograms of samples.
Even though replicates of samples are strongly correlated, the coefficients are less than
those calculated for replicates of the standards (Table 3.1). This observation can be explained by
differences in the width of the peaks for corresponding compounds across normalized sample
chromatograms. These differences are not significant and, as a result, the calculated coefficients
were not significantly impacted; however, these minor differences in the width of the peaks led
to small differences in the abundance between the data points at each retention time, which
reduced the correlation between replicates of samples in comparison to replicates of standards.
Some compounds from the ignitable liquids, surface treatment, and the internal standard
vary enough in abundance to impact the peak widths across sample replicates (Figure 3.11). This
variation is further reflected by the increased standard deviations for the coefficients of the
sample replicates as opposed to those of the standard replicates. Since the extraction and analysis
procedures were demonstrated to be precise, the differences in abundance of compounds from the
ignitable liquids and internal standard, and therefore the variations in peak width and in the
relative abundance of corresponding data points, are likely due to the porous nature of
the wood matrix. As a result of this porosity, some of the ignitable liquids may have soaked into
the wood and, therefore, been unavailable for adsorption onto the charcoal strip during the
passive headspace extraction. Additionally, the surface treatment works by penetrating into the
wood, which could affect the degree to which the ignitable liquid can soak into the wood and,
therefore, the extent of its availability during the extraction. Thus, the variability observed in the
chromatograms for these particular samples is likely due to the properties of the matrix before
the burning process even occurs.

[Figure 3.11 image: abundance versus retention time (7.63 to 7.80 min)]

Figure 3.11: Total ion chromatograms of a kerosene standard (red) and a matrix
interference/thermal degradation sample (black), demonstrating the difference in peak
width between the standards and samples. The peak depicted here is the C10 n-alkane.

Mean PPMC coefficients demonstrate that most samples are moderately correlated to
their corresponding 0% evaporated standard. Strong correlation between samples and standards
was not expected, especially between the gasoline samples and standards, due to the addition of
the C9-C13 n-alkanes from the surface treatment. The gasoline standards do not contain these
n-alkanes; therefore, the introduction of these compounds into the chromatograms of the gasoline
samples decreases the correlation between the standards and samples. Since the kerosene
standards already contain the n-alkanes found in the surface treatment, its application should not
have a significant negative impact on the correlation between standards and samples. The surface
treatment does, however, contain compounds other than the n-alkanes that the kerosene does not
contain, such as aldehydes, which will negatively impact the correlation.
Weak correlation was observed between most samples and the other non-corresponding
0% evaporated standard. For example, the 90% evaporated kerosene samples are moderately
correlated to 0% evaporated kerosene standards and weakly correlated to the 0% evaporated
gasoline standards. However, when the standard deviation is added to the mean coefficient,
some of these weak correlations exceed the threshold of 0.5, which indicates a moderate
correlation. This is true of the correlation of the 0% and 90% evaporated gasoline samples
with the 0% evaporated kerosene standard.
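
The threshold comparison above amounts to simple arithmetic on the values in Table 3.2, sketched here in Python. The function name is hypothetical, and only the 0.5 boundary for moderate correlation stated in the text is assumed.

```python
def crosses_moderate_threshold(mean, sd, threshold=0.5):
    """True when the mean is below the threshold but mean + sd reaches it."""
    return mean < threshold <= mean + sd

# Mean and standard deviation versus the 0% evaporated kerosene standard (Table 3.2).
gasoline_vs_kerosene = {
    "0% gasoline": (0.4492, 0.0530),
    "50% gasoline": (0.4346, 0.0282),
    "90% gasoline": (0.4813, 0.0242),
}
flags = {
    name: crosses_moderate_threshold(mean, sd)
    for name, (mean, sd) in gasoline_vs_kerosene.items()
}
```

The 0% and 90% evaporated gasoline samples reach the threshold (0.4492 + 0.0530 = 0.5022 and 0.4813 + 0.0242 = 0.5055), whereas the 50% evaporated gasoline samples do not (0.4346 + 0.0282 = 0.4628).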


While the gasoline samples are weakly correlated to the 0% evaporated kerosene
standard, the coefficient is higher than that between the kerosene samples and the 0% evaporated
gasoline standard. The higher correlation between gasoline samples and kerosene standards is
likely due to the addition of n-alkanes to the gasoline samples, which are present in the kerosene
standards. In these cases, the chromatograms of the gasoline samples become more similar to the
kerosene standards, due to the presence of these alkanes in both the samples and standards.

3.3.8 PPMC Coefficients for Matrix Interference/Thermal Degradation Samples
The mean PPMC coefficients calculated for replicates of the matrix interference/thermal
degradation samples at each evaporation level are greater than 0.91, indicating that replicates are
strongly correlated (Table 3.3). The calculated coefficients of replicate samples, however, are not
as high as the coefficients of replicate standards. The overall decrease in mean coefficients is due
to differences in relative abundance of data points comprising peaks from the ignitable liquids,
surface treatment, and internal standard as well as misalignment of the peak apexes. This is
likely due to the liquids soaking into the wood and being retained so that the compounds from
the liquid are not entirely available during the passive headspace extraction step. Furthermore,
the large differences in relative abundance of corresponding data points within the peaks are
reflected by the large standard deviations associated with the coefficients.
The samples are moderately to strongly correlated to their corresponding 0% evaporated
standards, even with the high standard deviations associated with the calculated coefficients.
This range of coefficients was expected because the n-alkanes from the surface treatment are
introduced into the samples, but the burning process decreases the abundances of these
compounds, resulting in fewer data points comprising their peaks. Fewer
data points in a peak translate into fewer retention times at which the relative abundance
between corresponding data points can differ. For example, the differences in relative abundance
between data points in the small peak widths in the gasoline samples, to which the n-alkanes
were introduced, and the gasoline standards, which do not contain these n-alkanes, will be
minimized.

Table 3.3: Mean Pearson product moment correlation coefficients ± standard deviations
for replicates of the matrix interference/thermal degradation samples (n=105) and for
samples to 0% evaporated gasoline and kerosene (n=225) standards.

                        Mean PPMC Coefficient ± Standard Deviation
Ignitable Liquid        Sample Replicates  0% Evaporated      0% Evaporated
Sample Evaporation      (n=105)            Gasoline (n=225)   Kerosene (n=225)
Level
0% Gasoline             0.9192 ± 0.0742    0.7237 ± 0.1123    0.5392 ± 0.0267
50% Gasoline            0.9716 ± 0.0248    0.8293 ± 0.0559    0.4785 ± 0.0314
90% Gasoline            0.9360 ± 0.0599    0.6545 ± 0.0658    0.4999 ± 0.0245
0% Kerosene             0.9563 ± 0.0359    0.4323 ± 0.0324    0.7092 ± 0.0513
50% Kerosene            0.9690 ± 0.0354    0.4479 ± 0.0702    0.8266 ± 0.1001
90% Kerosene            0.9500 ± 0.0409    0.4492 ± 0.0569    0.6951 ± 0.0694

When accounting for the large standard deviations, the samples can be weakly to
moderately correlated to the non-corresponding 0% evaporated standard, for reasons similar to
those noted for the inherent matrix interference data set. It should be noted that the burning
process minimizes peak widths from the application of the surface treatment and the number of
data points comprising each peak; however, the mere presence of the peaks from the surface
treatment will negatively impact the correlations between the standards and samples.

3.3.9 Association of Simulated Fire Debris Samples to Corresponding Standards
Scores for simulated fire debris samples were calculated and projected onto the scores
plot generated for the liquid standards to illustrate the effects of evaporation and matrix
interferences, as well as thermal degradation of the liquids and matrix (Figure 3.12). The
gasoline samples are positioned positively on PC1, as are their respective standards. Similarly,
the kerosene samples are positioned negatively on this PC, as are their respective standards. This
demonstrates that the fire debris samples containing gasoline and kerosene can be associated
according to liquid type and differentiated from each other on PC1. The samples, however, could
not be associated to their respective standards in terms of evaporation level for either ignitable
liquid. This is due to spread in the samples on the scores plot as well as some shifts in
positioning relative to their respective standards. Regardless, the explanation concerning the
general positioning of the samples on PC1 and PC2 remains the same as that described
previously for the standards.

[Figure 3.12 image: scores plot; axes Principal Component 1 (68.82%) and Principal Component 2 (15.61%)]

Figure 3.12: Scores plot of PC1 versus PC2 based on the total ion chromatograms for the
ignitable liquid standards, represented by the squares, and the projected scores of the
simulated fire debris samples, represented by the circles. In terms of color, blue, green,
and purple represent 0%, 50%, and 90% evaporated kerosene while red, orange, and
yellow represent 0%, 50%, and 90% evaporated gasoline.

An obvious difference between the standards and samples is that the standards are tightly
clustered, while the samples exhibit considerable spread, mainly on PC1. This is true for all
samples and is mainly due to differences in abundances of compounds as a result of the porous
nature of the wood, as well as the variability in the burning process. This is illustrated by the
C2- and C3-alkylbenzenes in the 50% evaporated gasoline samples (Figure 3.13). Even after
normalization, there are still differences in abundance of these compounds among samples of the
same evaporation level, despite using the same spike volume to generate the samples. The
loadings plots (Figures 3.4 and 3.5) illustrate that the C2- and C3-alkylbenzenes are more heavily
weighted on PC1 than on PC2. As a result, spread in the abundances of these compounds will
lead to greater spread in the positioning of the samples on PC1 than on PC2.
In spite of the spread observed for the kerosene samples, there is a clear negative shift on
PC1 of the samples in comparison to their respective standards. This shift is also due to
differences in abundance, but here it is an increase in abundance of the n-alkanes in the
samples compared to the standards. The spike volume used to generate the samples
was greater than that used to generate the standards, resulting in the increase in abundance. A
larger spike volume was needed so that compounds from the ignitable liquid would survive the
burning process and exhibit thermal degradation effects; the smaller spike volume used to
generate the standards would not allow for this to happen. The n-alkanes present in kerosene
have a high, negative weighting on PC1 (Figure 3.4). As a result, an increase in abundance of
these compounds in the samples will translate to a more negative positioning of the samples in
comparison to their respective standards on PC1.

[Figure 3.13 image: vertical axis 0 to 1E6; retention time 7.63 to 7.80 min]

Figure 3.13: Total ion chromatograms of the C2-alkylbenzenes from the five simulated
fire debris samples generated using gasoline, demonstrating the variation in abundances
across samples.

It is important to note that the surface treatment does not have a large effect on the
positioning of the samples on PC1. The surface treatment contains n-alkanes C9 through C12, but
only C11 and C12 affect the positioning of the samples, according to the loadings plot for this PC
(Figure 3.4). While C11 and C12 load negatively on PC1, these compounds are not very heavily
weighted; therefore, the surface treatment provides only minimal contributions to positioning of
the samples on PC1.
The surface treatment does, however, greatly affect the positioning of the gasoline and
kerosene samples on PC2. The loadings plot illustrates this for PC2 (Figure 3.5), where the
n-alkanes that are present in the surface treatment (C9-C12) load positively and are, collectively,
heavily weighted. The addition of these compounds from the surface treatment results in the
samples being more positively positioned on PC2 compared to their respective standards. This is
especially illustrated by the positioning of the gasoline samples. The 90% evaporated gasoline
samples are positioned even more positively on PC2 than the other gasoline samples because the
more heavily weighted n-alkanes C12 and C13, which load positively, are present in higher
abundances (by more than an order of magnitude) in the 90% evaporated samples in comparison
to the other gasoline samples.
This positive shift is also observed for the kerosene samples, albeit to a lesser extent. The
shift is less obvious than for the gasoline because of the increase in abundance of the negatively
weighted n-alkanes that resulted from using a larger spike volume to create these samples. The
increase in abundance of the n-alkanes may offset some of the positive contributions of the
surface treatment. The 90% evaporated and two of the 50% evaporated kerosene samples are not
shifted on PC2 compared to the corresponding standards. In addition to the previous explanation,
these samples also display smaller matrix contributions, in terms of abundance, from the surface
treatment than replicates of the same samples.

3.3.10 PPMC Coefficients for Simulated Fire Debris Samples
Mean PPMC coefficients calculated for pairwise comparisons of replicates were greater
than 0.89, indicating strong correlation, even though the samples exhibited spread in the scores
plot (Table 3.4). It may seem like the strong correlation conflicts with the extent of the spread
observed; however, PPMC coefficients provide a measure of similarity using an entire
chromatogram while PCA identifies and emphasizes specific peaks in the chromatogram that
lead to the variance. The PPMC coefficients, in this case, demonstrate that the sample
chromatograms contain similar peaks, while PCA highlights the differences in abundance of the
peaks.
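
The pairwise comparison of replicates can be sketched in Python as follows, assuming NumPy and the sample standard deviation. The replicate count of 15 is an assumption, chosen because 15 chromatograms compared pairwise give 105 comparisons, which matches the n=105 reported in Table 3.4; the simulated data and the function name are hypothetical.

```python
from itertools import combinations
import numpy as np

def pairwise_ppmc_summary(chromatograms):
    """Mean and standard deviation of PPMC coefficients over all replicate pairs."""
    r = np.corrcoef(chromatograms)  # each row is treated as one chromatogram
    pairs = list(combinations(range(len(chromatograms)), 2))
    values = np.array([r[i, j] for i, j in pairs])
    return values.mean(), values.std(ddof=1), len(pairs)

# Hypothetical data: 15 noisy replicates of one profile, 200 data points each.
rng = np.random.default_rng(1)
profile = rng.random(200)
replicates = profile + 0.05 * rng.standard_normal((15, 200))

mean_r, sd_r, n_pairs = pairwise_ppmc_summary(replicates)
```

By the same reasoning, comparing 15 sample chromatograms against 15 standard chromatograms would give 15 × 15 = 225 coefficients, which is consistent with the n=225 reported for the sample-to-standard comparisons.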
The mean coefficients, however, are less than those calculated for pairwise comparisons
of replicates of the standards (Table 3.1). The overall decrease in mean coefficients is due to
differences in abundance of compounds from the ignitable liquids, surface treatment, and internal
standard, which ultimately lead to differences in peak width and larger differences in relative
abundance between more data points. This is reflected in the large standard deviations associated
with the coefficients. Again, these differences are likely due to the liquids soaking into the wood,
which leads to the compounds from the liquid not being entirely available for extraction. In
addition, when the liquids soak into the wood they become protected from the full effects of the
burning process. Furthermore, these mean coefficients are similar to those calculated for the
other two data sets. This similarity suggests that the majority of changes in the chromatograms
are due to the matrix, which is characteristically porous, and have less to do with the
irreproducible effects of the burning process.

Table 3.4: Mean Pearson product moment correlation coefficients ± standard deviations
for replicates of the simulated fire debris samples (n=105) and for samples to 0%
evaporated gasoline and kerosene (n=225) standards.

                        Mean PPMC Coefficient ± Standard Deviation
Ignitable Liquid        Sample Replicates  0% Evaporated      0% Evaporated
Sample Evaporation      (n=105)            Gasoline (n=225)   Kerosene (n=225)
Level
0% Gasoline             0.9530 ± 0.0323    0.3369 ± 0.0731    0.4414 ± 0.0334
50% Gasoline            0.9262 ± 0.0600    0.5322 ± 0.0760    0.4655 ± 0.0435
90% Gasoline            0.9870 ± 0.0095    0.2869 ± 0.0313    0.4857 ± 0.0125
0% Kerosene             0.9831 ± 0.0106    0.1808 ± 0.0180    0.7526 ± 0.0182
50% Kerosene            0.8976 ± 0.0889    0.1404 ± 0.0982    0.6872 ± 0.0919
90% Kerosene            0.9731 ± 0.0245    0.0860 ± 0.0444    0.4646 ± 0.0766

The 0% and 50% evaporated kerosene samples could be moderately associated to the 0%
evaporated kerosene, even with the significant standard deviations associated with the calculated
coefficients. However, the 90% kerosene samples could only be weakly to moderately correlated
to the same standard. The n-alkanes from the surface treatment are present in very low
abundances in the chromatograms of the 90% evaporated kerosene as opposed to the 0% and
50% evaporated samples. The 90% evaporated kerosene samples would not contain these
compounds if not for the application of the surface treatment, but the 0% evaporated standards
do. The addition of these compounds to the 90% evaporated kerosene should increase the
correlation to the 0% evaporated standard; however, the small peak widths of the n-alkanes in the
samples, as opposed to the large peak widths in the standard, prevent the correlation from
increasing further.
The 0% and 90% evaporated gasoline samples exhibited weak correlation to the 0%
evaporated gasoline standard, whereas the 50% evaporated sample was weakly to moderately
correlated to the same standard. The weak correlations are due to the addition of compounds
from the surface treatment that are not present in the standards. In addition, the 0% evaporated
gasoline sample chromatograms exhibit large variation in the abundance of toluene and the
C2-alkylbenzenes so that the peak widths of these compounds vary across the chromatograms. The
difference in peak widths between the chromatograms of the samples and the standards leads to a
decrease in the extent of correlation. This, along with a significant increase in abundances, and
change in peak width, of the C4-alkylbenzenes in comparison to the standard, explains the low
coefficients calculated for 90% evaporated gasoline.
A higher correlation was observed between gasoline samples and kerosene standards than
between kerosene samples and gasoline standards. The addition of the n-alkanes from the surface
treatment to the gasoline samples makes these samples more similar in composition to the
kerosene standards, resulting in a slightly higher correlation.

3.4 Summary
The addition of compounds from the surface treatment can greatly complicate the visual
assessment of a chromatogram from fire debris. This is especially true of the surface-treated
wood investigated in this study because the treatment contains n-alkanes (C9-C12), which are
also present in kerosene.
Principal components analysis can be used to provide a more objective assessment of a
chromatogram from fire debris. This statistical procedure can be used to associate simulated
debris samples to their respective standard by type of ignitable liquid despite evaporation, matrix
interferences, and thermal degradation of the liquid and matrix. The debris samples, however,
could not be accurately associated to their respective standards in terms of evaporation level.
This was due, primarily, to differences in abundances of compounds for which normalization
procedures could not account.
Pearson product moment correlation coefficients can be used in conjunction with PCA.
The coefficients could only provide a weak to moderate correlation for two of the three data sets,
including the simulated fire debris samples. As a result, the coefficients do not provide a
numerical value of the association between samples and their respective standards, as was
intended, but instead, can be used to associate replicates at each evaporation level to one another
in order to minimize the effects of spread within the scores plot.

REFERENCES

1. ASTM International, ASTM E 1412-07. Annual Book of ASTM Standards.

Chapter 4: Classification of Ignitable Liquid Standards using Soft Independent Modeling
of Class Analogy

4.1 Introduction
According to a report by the National Academy of Sciences, the forensic sciences are in
dire need of ways to assess the accuracy and significance of analysis results¹. This is especially
true for fire debris analysis, which consists of a subjective visual assessment of chromatograms
from fire debris to identify the presence of an ignitable liquid. One statistical procedure that can
potentially be used to link fire debris back to the ignitable liquid used to generate it, in a more
objective manner, is soft independent modeling of class analogy (SIMCA).
Since the application of SIMCA to fire debris data is a relatively new concept, the
investigation performed in this chapter was simplistic and aimed at classifying ignitable liquid
standards based on their chemical compositions as a proof-of-concept study. This supervised
procedure provides a more objective approach to association because it can identify the class to
which unknown samples are likely to belong, based on statistically meaningful class membership
limits. Additionally, classifications at various significance levels are calculated to indicate the
probability that an unknown sample belongs to the class to which it was assigned.
A set of six ignitable liquids was generated by diluting each liquid in methylene chloride
and analyzing it by gas chromatography-mass spectrometry (GC-MS). Each diluted liquid was
analyzed in replicate, resulting in 15 chromatograms per liquid and 90 chromatograms in total.
The ignitable liquids used, which span five ASTM International classes, were insect repellent
spray, gasoline, paint thinner, fuel stabilizer, fuel injector cleaner, and diesel².

Principal components analysis (PCA) was applied to the total ion chromatograms (TICs)
of the entire set of ignitable liquid standards to assess the natural grouping of the liquids. Next,
the data were subjected to SIMCA. To do this, the TICs were split into a training and test set.
The training set consisted of 72 chromatograms (12 chromatograms per liquid) and the
remaining chromatograms formed the test set. The TICs for each liquid within the training set
were subjected to PCA by liquid type to generate models that described the chemical
composition of each liquid. Then, the models were used to classify the ignitable liquids in the
test set according to their chemical compositions. Soft independent modeling of class analogy
was also applied to selected extracted ion chromatograms (EICs) to investigate whether
improvements in classification were possible.

4.2 Materials and Methods

4.2.1 Liquid Standards
The six ignitable liquids used for this research were purchased from stores in the Lansing,
MI area. Each was diluted, by volume, in methylene chloride (J.T. Baker, Phillipsburg, NJ), as
follows: insect repellent spray, 1:1600; gasoline, 1:200; paint thinner, 1:350; fuel stabilizer,
1:150; fuel injector cleaner, 1:100; and diesel, 1:50. The diluted liquids were directly injected
and analyzed by GC-MS.

4.2.2 Analysis of Standards by GC-MS
All liquids were analyzed using an Agilent 6890N gas chromatograph, coupled to an
Agilent 5975C mass spectrometer, and equipped with an Agilent 7683B autosampler (Agilent
Technologies, Palo Alto, CA). The GC contained an Agilent HP-5MS capillary column (30 m x
0.25 mm I.D. x 0.25 µm film thickness). The carrier gas used was ultra-high purity helium
(Airgas, East Lansing, MI), at a nominal flow rate of 1 mL/min. One µL of each liquid was
injected using the pulsed, splitless mode, with a pressure of 15 psi for 0.25 minutes. The inlet
was maintained at 250 °C. The GC oven temperature program was as follows: 40 °C for 3 min,
10 °C/min to 280 °C, hold for 4 min. The transfer line was maintained at 280 °C and the mass
spectrometer was operated in electron ionization mode (70 eV). Full mass scan mode was used,
scanning the range 50 to 550 amu, with a scan rate of 2.91 scans/s.
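As a quick arithmetic check, the total run time implied by the oven temperature program can be computed from the values stated above. The function and variable names are hypothetical and the sketch assumes a single linear ramp between the two holds.

```python
def oven_run_time(start_c, initial_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total run time (min) for a hold, single linear ramp, hold oven program."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min

# 40 °C for 3 min, 10 °C/min to 280 °C, hold for 4 min
total_min = oven_run_time(40.0, 3.0, 10.0, 280.0, 4.0)
```

The 24-minute ramp plus the two holds gives a 31-minute run, which agrees with the upper retention time shown on the chromatogram axes in this chapter.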
Each liquid was analyzed in replicate (n=15) and TICs were generated. Additionally,
EICs for m/z 83, 91, 99, and 128 were generated from the TICs using the ChemStation©
Enhanced Data Analysis Software (version E.01.01.335, Agilent Technologies).

4.2.3 Data Pretreatment
Total ion chromatograms and EICs of the six ignitable liquids were treated as separate
data sets. Data pretreatment was performed in a similar manner on each data set, separately.
First, Savitzky-Golay smoothing was performed in the ChemStation© Enhanced Data Analysis
Software.
Next, each data set was subjected to a total area normalization procedure, which was
performed in Microsoft Excel (version 12.0.6425.1000, Microsoft Corp., Redmond, WA). For a
specific ignitable liquid, the total area of each chromatogram (n=15) across all retention times
was calculated and then the average area of all 15 chromatograms was calculated. The
abundance at each retention time was divided by the total area of the chromatogram and then
multiplied by the corresponding average. This process was repeated for each ignitable liquid.
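The normalization steps above can be sketched as follows. This is an illustration only: the function name and example values are hypothetical, and the total area is approximated here as the sum of the abundances across all retention times.

```python
def total_area_normalize(chromatograms):
    """Scale each replicate so that its total area equals the group's average area."""
    areas = [sum(c) for c in chromatograms]      # total area of each chromatogram
    mean_area = sum(areas) / len(areas)          # average area across the replicates
    return [
        [value / area * mean_area for value in c]
        for c, area in zip(chromatograms, areas)
    ]

# Hypothetical replicates of one liquid that differ only in overall abundance.
replicates = [
    [10.0, 30.0, 60.0],    # total area 100
    [20.0, 60.0, 120.0],   # total area 200
    [30.0, 90.0, 180.0],   # total area 300
]
normalized = total_area_normalize(replicates)
```

After normalization every replicate has the same total area (200 in this example), so differences in injected amount no longer contribute to the variance, while the relative abundances within each chromatogram are preserved.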

4.2.4 Principal Components Analysis
Principal components analysis was performed on the ignitable liquid TICs (n=90) using
Unscrambler X (version 10.2, Camo, Inc., Woodbridge, NJ). The scores plots were used to
visually assess the natural groupings of the ignitable liquid standards. The loadings plots were
used to explain the positioning of the samples in the scores plots. The EICs were also subjected
to PCA and assessed in a similar manner.

4.2.5 Soft Independent Modeling of Class Analogy
Soft independent modeling of class analogy was applied to the TICs using Unscrambler
X. Each data set consisted of 90 chromatograms (n=15 for six liquids), which were further
divided into training and test sets. The training set consisted of 12 of the 15 replicate
chromatograms from each ignitable liquid, while the remaining chromatograms formed the test
set. Chromatograms of liquids in the training set were separately subjected to PCA, by liquid
type, to generate six distinct models. The PCA models, which consist of loadings and scores
plots, identify the compounds that describe each ignitable liquid. The PCA models were
validated using a full validation procedure in the software. In this procedure, one chromatogram
was removed from the training set, a new model was generated and a new score of the TIC that
was removed was calculated using the new model to assess how well the training sample fit the
model. This was repeated for all TICs in the training set. Next, the test samples were classified
by projecting the test set TICs onto each model. The probability of each TIC in the test set
belonging to each of the modeled ignitable liquid groups was determined. The classifications
were investigated at a 0.1%, 1%, 5%, 10%, and 25% significance level. Later, the EICs were
subjected to SIMCA in a similar manner.
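The bookkeeping behind the training and test sets and the full validation procedure can be sketched as follows. The names are hypothetical, and two details are assumptions rather than statements from the text: that the first 12 replicates of each liquid form the training set, and that one chromatogram at a time is held out within each liquid's model.

```python
LIQUIDS = ["insect repellent", "gasoline", "paint thinner",
           "fuel stabilizer", "fuel injector cleaner", "diesel"]
N_REPLICATES = 15
N_TRAIN = 12

def split_replicates(liquids, n_replicates, n_train):
    """Return (training, test) lists of (liquid, replicate_index) labels."""
    training, test = [], []
    for liquid in liquids:
        for rep in range(n_replicates):
            # Assumption: the first n_train replicates are used for training.
            (training if rep < n_train else test).append((liquid, rep))
    return training, test

def leave_one_out(samples):
    """Yield (held_out, remaining) pairs, one for each training sample."""
    for i, held_out in enumerate(samples):
        yield held_out, samples[:i] + samples[i + 1:]

training, test = split_replicates(LIQUIDS, N_REPLICATES, N_TRAIN)
gasoline_train = [s for s in training if s[0] == "gasoline"]
folds = list(leave_one_out(gasoline_train))
```

Each fold would be used to rebuild the PCA model from the remaining chromatograms and then to score the held-out chromatogram against that model.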

4.3 Results and Discussion

4.3.1 Characterization of Ignitable Liquid Standards
Exemplar TICs of each ignitable liquid are shown in Figure 4.1. Classified as a member
of the aromatic class, the insect repellent contains substituted aromatics such as
C3-alkylbenzenes, as well as malathion (Figure 4.1A). The gasoline fuel, classified as gasoline,
contains substituted aromatics such as the C2-, C3-, and C4-alkylbenzenes, as well as
methylnaphthalenes (Figure 4.1B). The paint thinner contains mostly branched alkanes in the
C9-C12 range (Figure 4.1C) and is classified as isoparaffinic. The fuel stabilizer is a member of the
naphthenic paraffinic class due to the presence of branched and cyclic alkanes (Figure 4.1D).
The fuel injector cleaner is classified as a heavy petroleum distillate due to the presence of
n-alkanes C9-C15, as well as substituted aromatics (Figure 4.1E). The diesel fuel is classified as a
heavy petroleum distillate, due to the presence of n-alkanes in the range C10-C19 and some
aromatic compounds (Figure 4.1F).

Figure 4.1: Total ion chromatograms of A) insect repellent, B) gasoline, C) paint thinner,
D) fuel stabilizer, E) fuel injector cleaner, and F) diesel with selected peaks labeled.
Axes: abundance versus retention time (3-31 min). Labeled peaks: A) C3-alkylbenzenes and
malathion; B) toluene, C2-, C3-, and C4-alkylbenzenes, and methylnaphthalenes;
C) 2,2,6-trimethyloctane, 2,2,8-trimethyldecane, and 3-methyl-5-propylnonane;
D) 2,6-dimethylundecane; E) n-alkanes C9-C16; F) n-alkanes C9-C24.

4.3.2 Principal Components Analysis of the Entire TIC Data set
Prior to SIMCA, principal components analysis was performed on the full data set to
assess natural groupings of the liquids. A combined total of approximately 83% of the variance
within the data set is described by the first and second principal components (PC1 and PC2,
respectively) in the scores plot (Figure 4.2). Replicate TICs of each ignitable liquid were
clustered, resulting in the six expected groups according to liquid type. Both principal
components were necessary to fully differentiate the ignitable liquids from one another.
The diesel and fuel injector samples are located positively on PC1. The positioning of
these samples can be explained by the loadings plot for PC1 (Figure 4.3). The plot shows that all
n-alkanes (C9-C24) and many of the branched alkanes are weighted positively on PC1.
n-Alkanes are present in both diesel and fuel injector cleaner explaining why both are positively
positioned on PC1 in the scores plot. Diesel contains more n-alkanes, specifically C9-C24, while
fuel injector cleaner contains fewer n-alkanes, specifically C9-C16, thus explaining why diesel is
the more positively positioned of the two liquids.

Figure 4.2: Scores plot of PC1 (67%) versus PC2 (16%) based on the total ion chromatograms of the
ignitable liquid standards training and test sets: insect repellent (green), gasoline (orange),
paint thinner (yellow), diesel (blue), fuel injector cleaner (black), and fuel stabilizer (red).

Figure 4.3: Loadings plot of PC1 based on the total ion chromatograms of the ignitable
liquid standards (training and test sets). Axes: loading on PC1 (-0.12 to 0.12) versus
retention time (3-31 min).

Branched and cyclic alkanes, which are weighted positively in the PC1 loadings plot, are
present in fuel stabilizer; however, this liquid is negatively positioned on PC1 in the scores plot.
When PCA is performed, there is a mean-centering step in which the average abundance at each
retention time is calculated across all chromatograms in the data set and then the average is
subtracted from each individual chromatogram. The aromatics and n-alkanes that are present in
high abundance in the diesel and fuel injector cleaner result in a large average abundance for the
corresponding retention times. Consequently, when the averages were subtracted from the fuel
stabilizer chromatograms, negative contributions from the aromatics and n-alkanes were
introduced into the mean-centered fuel stabilizer data. The negative contributions of the
mean-centered data, multiplied by the positive weighting in the PC1 loadings, result in the negative
positioning of the fuel stabilizer samples in the scores plot on PC1. Consequently, fuel stabilizer
is negatively positioned in the scores plot even though it predominantly contains compounds that
are positively weighted in the loadings plot.
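The effect of mean-centering described above can be illustrated with a small numerical sketch. All abundances and loadings below are hypothetical and were chosen only so that every variable is positively weighted; they are not taken from the data in this study.

```python
def mean_center(rows):
    """Subtract the average abundance at each retention time from every chromatogram."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def score(centered_row, loadings):
    """Score on one principal component: centered data multiplied by the loadings."""
    return sum(v * w for v, w in zip(centered_row, loadings))

# Hypothetical abundances at three retention times.
data = {
    "diesel":          [90.0, 80.0, 70.0],
    "fuel_injector":   [70.0, 60.0, 30.0],
    "fuel_stabilizer": [5.0, 10.0, 0.0],
}
loadings = [0.6, 0.6, 0.5]   # hypothetical loadings, all positive

names = list(data)
centered = mean_center([data[name] for name in names])
scores = {name: score(row, loadings) for name, row in zip(names, centered)}
```

In this sketch the fuel stabilizer abundances are all zero or positive and every loading is positive, yet the fuel stabilizer score is negative because its abundances fall below the data set average at every retention time.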
Also positioned negatively on PC1 in the scores plot are gasoline, insect repellent, and
paint thinner. The positioning of these liquids can also be explained by the loadings plot for PC1.
Toluene, as well as some C2- and C3-alkylbenzenes and malathion, are negatively weighted in
the plot. Many of these compounds are present in gasoline, explaining this liquid’s negative
positioning on PC1 in the scores plot. Some of the C3-alkylbenzenes and malathion are also
present in insect repellent, thus explaining its negative location in the scores plot. Paint thinner,
on the other hand, is positioned negatively on PC1 due to the presence of some substituted
alkanes in the C9-C12 range that are negatively weighted in the loadings plot in addition to the
negative contributions from the previously mentioned mean-centering of the data.
The positioning of the standards on PC2 in the scores plot can be explained, in a similar
manner, by the loadings plot for PC2 (Figure 4.4). Diesel is positively positioned on PC2 in the
scores plot because it contains higher abundance of the positively weighted n-alkanes (C14-C24)
in the loadings plot than the negatively weighted n-alkanes (C9-C13). Insect repellent and
gasoline are positively positioned on PC2 in the scores plot because toluene and malathion, as
well as the C2- and C3-alkylbenzenes, all of which are contained in one or both of the liquids, are
positively weighted on PC2. Paint thinner is also positively positioned on PC2 in the scores plot
because the major compounds present in the TIC are weighted positively in the loadings plot for
PC2.

Figure 4.4: Loadings plot of PC2 based on the total ion chromatograms of the ignitable
liquid standards (training and test sets). Axes: loading on PC2 (-0.13 to 0.13) versus
retention time (3-31 min).

Fuel stabilizer and fuel injector cleaner, on the other hand, are negatively positioned on
PC2 in the scores plot. The branched and cyclic alkanes contained in the fuel stabilizer are all
weighted negatively on the loadings plot for PC2. Fuel injector contains compounds that are both
positively (C14-C16) and negatively (C9-C13) weighted on PC2; however, more of the
compounds are weighted negatively, thus explaining the overall negative positioning of the
samples on the scores plot on PC2.

4.3.3 Classification of Ignitable Liquid Standard TICs Using SIMCA
Principal components analysis was performed first on the entire set of TICs (training and
test samples) not only to assess the natural groupings of ignitable liquid chromatograms, but also
to determine the number of PCs necessary to distinguish between the different liquid types.
Because the overall PCA scores plot demonstrated that differentiation of ignitable liquid types
was possible using 2 PCs, SIMCA was performed using only PC1 and PC2. This does not match
the recommended number of PCs to use for SIMCA that was suggested by the software program
(Table 4.1). As a result, SIMCA was performed using 2 PCs, as well as the recommended
number of PCs. However, since classification of the test set was unaffected by the number of
PCs used in SIMCA, only the results using 2 PCs are discussed below.

Table 4.1. The suggested number of principal components for soft
independent modeling of class analogy on total ion chromatograms.

Ignitable Liquid    Suggested PCs
Fuel Stabilizer     5
Gasoline            3
Paint Thinner       7
Insect Repellent    4
Diesel              2
Fuel Injector       5

The first step in SIMCA is to generate models that will be used for sample classification.
To do this, PCA was performed on the TICs of liquids in the training set, by liquid type, thus
generating a total of six models (one for each liquid). Using two PCs in each of the models, all
TICs in the test set were correctly classified according to liquid type between significance levels
of 0.1% and 10%. However, at the 25% significance level, one gasoline replicate was left
unclassified to any model while all other test liquids were correctly classified (Table 4.2).
The significance level, as calculated in the computer software, is a p-value, which
indicates the likelihood that a sample was classified to a model by chance. Since smaller p-values
and, consequently, smaller significance levels indicate that the classification of a sample is less
likely to have occurred by chance, smaller significance levels (particularly less than 5%) are
considered to be more statistically significant³. Later in the chapter, the reasoning for the
replicate not being classified is discussed; however, since the larger significance levels are
considered to be less statistically significant, the lack of classification of a gasoline replicate at
25% is not of great consequence.
In the initial PCA scores plot of all liquids (Figure 4.2), differentiation according to type
was possible using only two PCs. As a result, correct classification of the test samples using
SIMCA was expected at all significance levels. To further investigate the unclassified gasoline
replicate in the test set at the 25% significance level, Coomans’ plots and plots of
sample-to-model distance versus leverage were assessed.

Table 4.2. Classification Table of Ignitable Liquid TICs at 10% Significance Level.

Test Sample         Fuel Stabilizer  Gasoline  Paint Thinner  Insect Repellent  Diesel  Fuel Injector
Fuel Stabilizer 1   *
Fuel Stabilizer 2   *
Fuel Stabilizer 3   *
Gasoline 1                           *
Gasoline 2                           *
Gasoline 3                           *
Paint Thinner 1                                *
Paint Thinner 2                                *
Paint Thinner 3                                *
Insect Repellent 1                                            *
Insect Repellent 2                                            *
Insect Repellent 3                                            *
Diesel 1                                                                        *
Diesel 2                                                                        *
Diesel 3                                                                        *
Fuel Injector 1                                                                         *
Fuel Injector 2                                                                         *
Fuel Injector 3                                                                         *

4.3.3.1 Coomans’ plots
Coomans’ plots are plots of the sample-to-model distance for two models. The sample-to-model
distance describes how far the PCA score of a test sample lies from a model after the
model is used to calculate a score for the test sample. Specifically, the sample-to-model distance
is the square root of the residual distance from the score of the projected sample with respect to
the principal components used to describe the model⁴. An equation describing how the
sample-to-model distance is calculated is located in the SIMCA Theory section of this thesis
(Equation 2.3). The Coomans’ plot visually demonstrates how and why test samples are likely to
be classified. As an example, a Coomans’ plot comparing the models generated for gasoline and
insect repellent, at a 10% significance level, is shown in Figure 4.5. In the plot, the
sample-to-model distance for gasoline is on the abscissa while the sample-to-model distance for insect
repellent is on the ordinate. The sample-to-model distance for all TICs in the test set is
determined for both models and then plotted.
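A minimal sketch of a sample-to-model distance calculation is given below. Equation 2.3 is not reproduced in this section, so the normalization used here (the residual sum of squares divided by the number of variables minus the number of PCs) is an assumption based on one common formulation and may differ from the one applied by the software. The model mean, loadings, and samples are hypothetical.

```python
import math

def sample_to_model_distance(sample, model_mean, loadings):
    """Residual distance of a sample from a PCA model.

    loadings: list of orthonormal loading vectors, one per retained PC.
    """
    centered = [x - m for x, m in zip(sample, model_mean)]
    residual = centered[:]
    for p in loadings:
        t = sum(c * w for c, w in zip(centered, p))          # score on this PC
        residual = [r - t * w for r, w in zip(residual, p)]  # remove the modeled part
    dof = len(sample) - len(loadings)                        # assumed normalization
    return math.sqrt(sum(r * r for r in residual) / dof)

# Hypothetical one-PC model in a three-variable space.
model_mean = [10.0, 20.0, 30.0]
loadings = [[1.0, 0.0, 0.0]]

on_model = [15.0, 20.0, 30.0]    # differs from the mean only along the PC
off_model = [15.0, 23.0, 34.0]   # also differs in directions the PC cannot describe
d_on = sample_to_model_distance(on_model, model_mean, loadings)
d_off = sample_to_model_distance(off_model, model_mean, loadings)
```

A sample that is fully described by the retained PCs has a distance of zero, whereas variation that the model cannot describe remains in the residual and increases the distance.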
Class membership limits, which describe the maximum distance the score of a sample
can be from a model and still be classified as that liquid, are overlaid on the Coomans’ plot. Test
samples likely to be classified to a model are positioned between zero and the class membership
limit for that model. The class membership limits can differ for each model; in the plot for the
gasoline and insect repellent models, class membership limits are approximately 780 and 878,
respectively. Test samples likely to be classified to one model only will fall within membership
limits of that model and outside of the membership limits of the other model. For example,
replicates of the gasoline in the test set are plotted on the abscissa within the class membership
limits for that liquid; hence, these replicates are classified as gasoline (Figure 4.6). On the
ordinate, however, these replicates are positioned outside the class membership limits for the
insect repellent model, thus indicating that the gasoline replicates in the test set are not classified
as insect repellent. A similar explanation can be used to describe why replicates of insect
repellent in the test set are classified as insect repellent and not as gasoline.
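The way the two class membership limits divide a Coomans' plot into regions can be sketched as follows. The limits are the approximate values quoted above for the 10% significance level; the sample distances are hypothetical, and treating a distance equal to the limit as inside the class is an assumption.

```python
def coomans_region(dist_a, dist_b, limit_a, limit_b, name_a="A", name_b="B"):
    """Return the region of a Coomans' plot in which a test sample falls."""
    in_a = dist_a <= limit_a
    in_b = dist_b <= limit_b
    if in_a and in_b:
        return "both"                # models are not discriminated for this sample
    if in_a:
        return name_a + " only"
    if in_b:
        return name_b + " only"
    return "neither"

GASOLINE_LIMIT = 780.0     # approximate limit at the 10% significance level
REPELLENT_LIMIT = 878.0    # approximate limit at the 10% significance level

# Hypothetical (distance to gasoline model, distance to insect repellent model) pairs.
samples = {
    "gasoline replicate": (650.0, 30000.0),
    "insect repellent replicate": (90000.0, 500.0),
    "diesel replicate": (120000.0, 110000.0),
}
regions = {
    name: coomans_region(d_gas, d_rep, GASOLINE_LIMIT, REPELLENT_LIMIT,
                         "gasoline", "insect repellent")
    for name, (d_gas, d_rep) in samples.items()
}
```

A result of "both" would correspond to a sample lying between the origin and the intersection of the two limits, which is the region that indicates poorly discriminated models.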

Figure 4.5: Coomans’ plot for the gasoline and insect repellent models (at a 10%
significance level) based on the total ion chromatograms of the training sets. The
sample-to-model distances are plotted for each of the ignitable liquids in the test set:
insect repellent (green), gasoline (orange), paint thinner (yellow), diesel (blue),
fuel injector cleaner (black), and fuel stabilizer (red). The class membership limit for the
gasoline model is overlaid on the plot in orange while the limit for the insect repellent
model is in green. Axes: sample distance to model gasoline (0 to 1.6E5) versus sample
distance to model insect repellent (0 to 1.6E5).

Figure 4.6: Coomans’ plot (at 10% significance level) for the gasoline and insect repellent
models based on the total ion chromatograms of the training sets. The sample-to-model
distances are plotted for the gasoline test samples (orange). The class membership limit
for the gasoline model is overlaid on the plot in orange while the limit for the insect
repellent model is in green. Axes: sample distance to model gasoline (0 to 1E3) versus
sample distance to model insect repellent (0 to 4E4).

In addition to illustrating likely classifications, the positioning of test samples within the
Coomans’ plot also indicates how well the two models being compared are discriminated from
one another. When performing SIMCA, all of the models should be well discriminated to
minimize the possibility of incorrect classification of the test samples. The models are considered
poorly discriminated if any test samples fall within the area between the origin and where the
two class membership limits intersect because this indicates that the test samples could be
classified to either of the two models. This is not the case in the Coomans’ plot for the gasoline
and insect repellent models; no test samples fall within this area, indicating that these two models are
well discriminated at the 10% significance level.

4.3.3.2 Sample-to-Model Distance Versus Leverage Plots
The Coomans’ plot cannot alone be used to determine the classification of test samples.
Classification is determined using a combination of two variables for each test sample: the
sample-to-model distance and the leverage. The leverage is the distance calculated between the
projected score of a test sample and the mean score of the training samples used to generate the
model³. The equation describing specifically how leverage is calculated is located in the SIMCA
Theory section of this thesis (Equation 2.4). Essentially, leverage is a measure of the variation
between the test sample and the model. A sample can only be classified to a model if both the
sample-to-model distance and leverage fall within the class membership limits for the model.
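The two-part decision rule can be sketched as follows. Equation 2.4 is not reproduced in this section, so the leverage expression below (1/N plus the squared score on each PC divided by the sum of squared training scores on that PC) is an assumption based on a commonly used form and may differ from the software's calculation. The training scores and the leverage limit are hypothetical.

```python
def leverage(sample_scores, training_scores):
    """Leverage of a projected sample relative to the training scores of a model.

    sample_scores: scores of the test sample, one per PC.
    training_scores: score vectors of the mean-centered training samples.
    """
    n = len(training_scores)
    h = 1.0 / n
    for a, t_new in enumerate(sample_scores):
        ss = sum(row[a] ** 2 for row in training_scores)   # spread of training scores
        h += t_new ** 2 / ss
    return h

def is_class_member(distance, lev, distance_limit, leverage_limit):
    """A sample is classified only if both quantities fall within their limits."""
    return distance <= distance_limit and lev <= leverage_limit

training_scores = [[-2.0], [-1.0], [1.0], [2.0]]   # hypothetical, one PC, mean of zero
near = leverage([0.5], training_scores)            # close to the training mean
far = leverage([6.0], training_scores)             # far from the training mean

LEVERAGE_LIMIT = 1.0    # hypothetical limit
member_near = is_class_member(650.0, near, 780.0, LEVERAGE_LIMIT)
member_far = is_class_member(650.0, far, 780.0, LEVERAGE_LIMIT)
```

In this sketch both samples have the same sample-to-model distance, but only the sample whose projected score lies close to the training scores is accepted, which is why the Coomans' plot alone cannot determine classification.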
A sample-to-model distance versus leverage plot can be generated for any model to
describe why samples are or are not classified to that model. Unlike the Coomans’ plot, this plot
cannot be used to directly compare models, but is instead used to understand classification of the
test samples for an individual model. An example of a sample-to-model distance versus leverage
plot for the gasoline model at a 10% significance level is shown in Figure 4.7, in which the
model leverage is on the abscissa while sample-to-model distance is on the ordinate. The
sample-to-model distance and leverage for all of the test samples, with respect to the gasoline model, are
plotted along with the class membership limits. Samples that are positioned near the origin where
these membership limits overlap are within the corresponding class membership limits and will
be classified as the liquid type represented by the model. In this example, all of the gasoline test
samples fall within the class membership limits for both distance and leverage, indicating that
these test samples are classified as gasoline at the 10% significance level (Figure 4.8). No other
test samples fall within these limits, indicating that no other samples will be incorrectly classified
as gasoline.

4.3.3.3 The Unclassified Gasoline Sample
Coomans’ plots and sample-to-model distance versus leverage plots were used to
investigate the unclassified gasoline replicate in the test set at the 25% significance level. A
sample can be classified at the 10% significance level, but not at 25% because a change in the
significance level translates to a change in class membership limits.
The significance level calculated in the software is a P-value and is used to draw general
conclusions about a larger population from a small experimentally-collected sample population.
P-values are used in SIMCA to calculate the percent probability that the classification of a
sample to a model occurred by chance. For example, in terms of class membership limits,
classifications assessed using a P-value of 0.25 indicate that, in a larger population, 25% of the
samples would have sample-to-model distances greater than the calculated membership limit and
would be misclassified. Therefore, in the same larger population, 75% of the samples would
have sample-to-model distances less than the class membership limit and would be classified
appropriately.

Figure 4.7: Sample-to-model distance versus leverage plot for the gasoline model (at a
10% significance level) based on the total ion chromatograms of the gasoline training set.
The sample-to-model distances and leverage are plotted for each of the ignitable liquids
in the test set: insect repellent (green), gasoline (orange), paint thinner (yellow), diesel
(blue), fuel injector cleaner (black), and fuel stabilizer (red). The class membership limits
of both sample-to-model distance and leverage for the gasoline model are overlaid on the
plot in orange. Axes: leverage (0 to 180) versus sample distance to model gasoline (0 to 1.6E5).

Figure 4.8: Sample-to-model distance versus leverage plot for the gasoline model (at a
10% significance level) based on the total ion chromatograms of the gasoline training set.
The sample-to-model distances and leverage are plotted for gasoline test samples
(orange). The class membership limits of both sample-to-model distance and leverage for
the gasoline model are overlaid on the plot in orange. Axes: leverage (0 to 1) versus sample
distance to model gasoline (0 to 1E3).

A P-value of 0.25 is the highest used for classifications in the software; the next highest
is 0.10. As the P-value decreases, the probability that a sample was classified by chance
decreases. In order to decrease the likelihood of classification by chance to 10% (P-value=0.10),
the class membership limits need to be made larger so that 90% of the test samples have
sample-to-model distances less than the membership limits. The change in class membership limits
corresponding to the change in significance level from 10% to 25% is reflected in both the
Coomans’ plot (Figure 4.9) and sample-to-model distance versus leverage plots (Figure 4.10).
The Coomans’ plot for the gasoline and insect repellent spray at a 10% significance level
was previously discussed. At the 10% significance level, the class membership limit for the
gasoline model is 780 while, at the 25% significance level, the limit is 530 (Figure 4.9). The
sample-to-model distance for the unclassified gasoline replicate in the test set was approximately
677. In this case, the sample-to-model distance was outside the membership limits at the largest
significance level, meaning that the sample was not classified as gasoline at the 25% level. In
addition, this sample was outside the membership limits of all other models; as a result, this
particular gasoline replicate was not classified as belonging to any of the previously defined
liquid classes using SIMCA.
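The comparison described above can be illustrated with a minimal sketch. It uses only the approximate values quoted in the text (limits of 780 and 530, and a distance of about 677); the function name is hypothetical, and the sketch does not reproduce how the software derives the limits themselves.

```python
def is_within_limit(distance, limit):
    """Return True when a sample-to-model distance is inside the class membership limit."""
    return distance < limit

# Approximate values quoted in the text for the gasoline model.
gasoline_limits = {"10%": 780.0, "25%": 530.0}
unclassified_replicate = 677.0  # approximate sample-to-model distance of the replicate

for level, limit in gasoline_limits.items():
    verdict = "inside" if is_within_limit(unclassified_replicate, limit) else "outside"
    print(f"{level} significance level: limit {limit:.0f}, replicate is {verdict} the limit")
```

The same replicate therefore passes the distance criterion at the 10% level but fails it at the 25% level, where the limit is tighter.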
The sample-to-model distance versus leverage plot for gasoline (Figure 4.10) illustrates
the same decrease in class membership limits for the sample-to-model distance, as the
significance level increases. This plot indicates that the gasoline test sample only falls outside
membership limits for the sample-to-model distance, not for leverage. This is because the
membership limit for leverage is a fixed value across all significance levels and is calculated
from the number of components and training samples used to make the models [3]. As a result,
only the sample-to-model distance prevents correct classification of this replicate.

Figure 4.9: Coomans’ plot (at 25% significance level) for the gasoline and insect repellent
models based on the total ion chromatograms of the training sets. The sample-to-model
distances are plotted for the gasoline test samples (orange). The class membership limit
for the gasoline model is overlaid on the plot in orange while the limit for the insect
repellent model is in green.
[Plot not reproduced. Axes: Sample Distance to Model Gasoline (25%), 0 to 1E3; Sample Distance to Model Insect Repellent (25%), 0 to 4E4.]

Figure 4.10: Sample-to-model distance versus leverage plot for the gasoline model (at a
25% significance level) based on the total ion chromatograms of the gasoline training set.
The sample-to-model distances and leverage are plotted for gasoline test samples
(orange). The class membership limits of both sample-to-model distance and leverage for
the gasoline model are overlaid on the plot in orange.
[Plot not reproduced. Axes: Sample Distance to Model Gasoline (25%), 0 to 1E3; Leverage Gasoline (25%), 0 to 1.]
Since this is proof-of-concept work to investigate the potential of SIMCA for ignitable
liquid classification, the data set was intentionally generated to minimize variation between
replicate TICs of individual ignitable liquids. The lack of variation in the TICs of each ignitable
liquid resulted in models that poorly describe the liquids. This is especially true of the
gasoline model and is likely the reason that the gasoline replicate in the test set was not
classified. This is illustrated by the PCA loadings plot for the gasoline model (Figure 4.11). The
majority of the peaks in the loadings plots for both PC1 and PC2 are derivative-shaped, which
is the result of trivial differences in peak shape among the replicate TICs. Due to the high
degree of similarity among the TICs for the gasoline samples, PCA identified the trivial
difference in peak shape, likely a result of instrument variation, as a major source of variance
(i.e., non-chemical variance). As a result, the PCA model for the gasoline class was,
unintentionally, built on insignificant variation that occurred due to instrument variations during
analysis, rather than chemical differences among samples. One gasoline sample remained
unclassified at the largest significance level because of those natural and chemically
insignificant variations. In the future, classification could be improved by introducing additional
samples of different gasoline brands into the training set. This would ensure that the PCA model
would be built on chemically meaningful differences between gasoline TICs.
As mentioned earlier, all of the ignitable liquid models generated describe the chemically
insignificant variations that were illustrated by the gasoline model; however, only one gasoline
replicate was left unclassified. The loadings plots for gasoline already established that the model
is built on chemically insignificant variations between TIC replicates. The modeling power of the
gasoline model can be used in conjunction with the loadings plots to demonstrate that the
gasoline model is more strongly affected by these trivial fluctuations than the other liquid models
(Figure 4.12). The modeling power highlights the influence that each variable has on the model.
An influence above 0.3 is considered significant to the model [3]. The modeling power of gasoline
shows many peaks ranging from approximately 3 to 17 minutes that significantly impact the
model. In fact, the plot of the modeling power shows more peaks impacting the gasoline
model than the number of peaks in the TIC of gasoline replicates. Consequently, some of the
peaks significantly impacting the model represent the chemically insignificant variations that
correspond to the noise from the baseline.

Figure 4.11: Loadings plot of PC1 of the gasoline model based on the total ion
chromatograms of the gasoline standards.
[Plot not reproduced. Axes: Principal Component 1, -0.3 to 0.3; Retention Time (min), 3 to 31.]
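As a rough illustration of how a modeling power trace and the 0.3 threshold can be evaluated, the sketch below uses the commonly quoted definition of modeling power (one minus the ratio of residual to original standard deviation for each variable). Whether the software used here applies exactly this form, including any degrees-of-freedom correction, is an assumption; the function names and the tiny data set are hypothetical.

```python
import math

def modeling_power(centered, residuals):
    """Per-variable modeling power: 1 - sqrt(residual SS / original SS)."""
    n_vars = len(centered[0])
    powers = []
    for k in range(n_vars):
        original_ss = sum(row[k] ** 2 for row in centered)
        residual_ss = sum(row[k] ** 2 for row in residuals)
        if original_ss == 0.0:
            powers.append(0.0)  # a constant variable carries no information
        else:
            powers.append(1.0 - math.sqrt(residual_ss / original_ss))
    return powers

def significant_variables(powers, threshold=0.3):
    """Indices of variables whose modeling power exceeds the threshold."""
    return [k for k, p in enumerate(powers) if p > threshold]

# Hypothetical mean-centered training data (rows: replicates, columns: retention-time points).
centered = [[1.0, 0.2, 0.0], [-1.0, -0.2, 0.0]]
# Hypothetical residuals left after the PCA model has been subtracted.
residuals = [[0.0, 0.2, 0.0], [0.0, -0.2, 0.0]]

powers = modeling_power(centered, residuals)
print(powers)                         # variable 0 is fully modeled; variables 1 and 2 are not
print(significant_variables(powers))  # only variable 0 exceeds the 0.3 threshold
```

With near-identical replicates, the original variance at each retention time is dominated by noise, so any variable the model happens to fit well can exceed the threshold even when it carries no chemical information.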
The modeling power for the gasoline model can be contrasted with that representing insect
repellent (Figure 4.13). Fewer of the peaks significantly impacting the insect repellent model are a
result of the chemically insignificant variations from the baseline that were seen in the gasoline
model. This is highlighted in the modeling power for insect repellent by the fact that the peaks
impacting the model correspond, by retention time, to peaks in the TIC of insect repellent.
It should also be noted that the loadings plots and modeling power for all of the ignitable
liquid models incorporate an additional source of nonchemical variation. A rise in baseline at the
end of all ignitable liquid TICs is identified as variance in the models and is described in the
loadings plots (Figure 4.14). The rise in baseline occurs as a consequence of the column being
heated to high temperatures at the end of the temperature program. It occurs in every TIC and
affects how the models are described, but it is inherent to the analysis process and not the
chemical makeup of the sample. Because the samples are so similar, PCA identifies it as a
major source of variance.

Figure 4.12: Modeling power for the gasoline model based on the total ion
chromatograms of the gasoline training samples. The red line represents modeling power
of 0.3. Peaks that extend above this line significantly impact the model.
[Plot not reproduced. Axes: Modeling Power (Gasoline), 0 to 1; Retention Time (min), 3 to 31.]

Figure 4.13: Modeling power for the insect repellent model based on the total ion
chromatograms of the insect repellent training samples. The red line represents modeling
power of 0.3. Peaks that extend above this line significantly impact the model.
[Plot not reproduced. Axes: Modeling Power (Insect Repellent), 0 to 1; Retention Time (min), 3 to 31. Labeled features: C3-alkylbenzenes, malathion, and the rise in baseline.]

Figure 4.14: A total ion chromatogram of insect repellent demonstrating the rise in
baseline that occurs at the end of the chromatogram.
[Plot not reproduced. Axes: Abundance, 0 to 7E5; Retention Time (min), 3 to 31.]

The effect of the rise in baseline is further reflected in a plot of the modeling power
versus the variable for each individual model. As the modeling power plot for the insect repellent
model shows, variations associated with the C3-alkylbenzenes and malathion peaks significantly
influence the model (Figure 4.13). Unfortunately, according to the modeling power, the rise in
baseline, a source of nonchemical variation, influences the model as much as the variation
associated with the actual peaks in the insect repellent. To circumvent this problem in the future,
it may be necessary to truncate the TICs before SIMCA is performed to reduce the negative effects
of the rise; however, truncating the chromatogram may not be a viable solution because compounds
may be detected during the rise in baseline in the chromatograms of ignitable liquids investigated
in the future.

4.3.4 Classification of Ignitable Liquid Standard EICs Using SIMCA
While SIMCA was shown to successfully classify the test samples using TICs up to the
10% significance level, classification using EICs was also investigated across the same
significance levels. Extracted ion chromatograms can provide many benefits over TICs, including
improved sensitivity and a reduction in the negative effects of matrix interference compounds
that do not contain the ion extracted.
The ions used to generate the EICs for each ignitable liquid were selected because they
represent different classes of compounds and are commonly used in forensic laboratories for
EICs or as part of extracted ion profiles. In addition, the selected ions were present in similar
abundances in the ignitable liquids used in this research.
Extracted ion chromatograms for ions m/z 99, 91, 83, and 128 were generated separately
from the TICs of the ignitable liquid standards in both the training and test sets. Each EIC was
treated as a separate data set. Principal components analysis was performed on the EIC training
and test samples to assess the natural groupings of the ignitable liquids. Lastly, SIMCA was
performed to classify the EIC test set to the ignitable liquid models generated from the EIC
training set.
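The relationship between a TIC and the EICs generated from it can be sketched as follows. This is a minimal illustration, assuming each scan is stored as a mapping from nominal (integer) m/z to abundance; the scan values and function names are hypothetical and are not taken from the thesis data.

```python
def total_ion_chromatogram(scans):
    """Sum the abundance of every ion in each scan."""
    return [sum(scan.values()) for scan in scans]

def extracted_ion_chromatogram(scans, mz):
    """Keep only the abundance of one ion in each scan (zero if the ion is absent)."""
    return [scan.get(mz, 0.0) for scan in scans]

# Hypothetical scans: one dictionary of {m/z: abundance} per retention-time point.
scans = [
    {57: 120.0, 91: 40.0, 99: 10.0},
    {57: 300.0, 99: 85.0},
    {83: 60.0, 91: 200.0, 128: 15.0},
]

tic = total_ion_chromatogram(scans)
# One EIC per ion investigated in this section; each would be treated as its own data set.
eics = {mz: extracted_ion_chromatogram(scans, mz) for mz in (99, 91, 83, 128)}

print(tic)
print(eics[99])
```

Because an EIC ignores every ion other than the one extracted, baseline contributions and interference compounds that lack that ion do not appear in the trace.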

4.3.4.1 Alkane EIC, m/z 99
Using extracted ion chromatograms of m/z 99, which is representative of the alkane
compound class, all six ignitable liquids were differentiated in the PCA scores plot (Figure 4.15).
The first two PCs account for approximately 75% of the total variance in the EIC data set. As for
the TICs, the plot indicated that classification by SIMCA should be possible using 2 PCs for
each ignitable liquid model. The number of PCs suggested from visual assessment of the overall
scores plot differs from the number of PCs recommended by the computer software (Table 4.3).
As a result, SIMCA was performed using 2 PCs and using the recommended number.
Using 2 PCs for each model, correct classification of the test samples by SIMCA was
possible at all significance levels investigated. The correct classification of test samples includes
the gasoline replicate that was previously unclassified using SIMCA on TICs. To explain why
the classification of gasoline replicates was successful at a 25% significance level using EICs as
opposed to the TICs, it is necessary to investigate the modeling power of the gasoline model
based on EICs (Figure 4.16). By using EICs to generate the models, the problem of the rise in the
baseline, which contributed significantly to the TIC models, was greatly reduced. Additionally,
in terms of the gasoline model specifically, the problem with the variation in the baseline being
detected early in the chromatogram, before the rise at the end, was also reduced. The peaks in the
modeling power plot for the gasoline EIC model more accurately correspond to the peaks in the
EICs of gasoline. As a result, even though the gasoline model still poorly characterizes gasoline,
the trivial variations in the baseline do not significantly affect the model.

Figure 4.15: Scores plot of PC1 versus PC2 based on the extracted ion chromatograms
(m/z 99) of the ignitable liquid standards training and test sets: insect repellent (green),
gasoline (orange), paint thinner (yellow), diesel (blue), fuel injector (black), and fuel
stabilizer (red).
[Plot not reproduced. Axes: Principal Component 1 (59%), -2E5 to 2E5; Principal Component 2 (16%), -1E5 to 1E5.]

Table 4.3. The suggested number of principal components for
soft independent modeling of class analogy on extracted ion
chromatograms (m/z 99).
Ignitable Liquid      Suggested PCs
Fuel Stabilizer       1
Gasoline              4
Paint Thinner         3
Insect Repellent      6
Diesel                2
Fuel Injector         2

Figure 4.16: Modeling power for the gasoline model based on the extracted ion
chromatograms (m/z 99) of the gasoline training samples. The red line represents
modeling power of 0.3. Peaks that extend above this line significantly impact the model.
[Plot not reproduced. Axes: Modeling Power (Gasoline), 0 to 1; Retention Time (min), 3 to 31.]
Using the recommended number of PCs for each model, all test samples were classified
as fuel stabilizer, regardless of the liquid they contained. The paint thinner, insect repellent,
and diesel test samples were also correctly classified to their
corresponding liquid models. The misclassifications that resulted are likely a product of using
too many PCs to describe the liquid models. Generally, only the first few PCs describe
chemically significant variation in the data set. Beyond the first PCs, much of the variation
described is considered chemically insignificant. This insignificant variation has many different
sources such as random instrumental fluctuations that occur as a result of the method of analysis.
These incorrect classifications highlight the need to carefully choose the number of PCs
to use when performing SIMCA. Scores plots can be used to determine the number of PCs that
should be used for SIMCA. For example, using the first two PCs, the overall scores plot of the
entire EIC data set exhibits six well-clustered groupings of samples corresponding to the six
different ignitable liquids. This trend is maintained when a two-dimensional scores plot is
generated using any two of the first five principal components. The
tight clustering of samples indicates that the PCs used to generate the scores plot describe
chemically significant variation in the data set. If PC6 or higher is used to generate a scores plot,
the tight clustering of replicates is no longer observed (Figure 4.17). This is especially obvious in
the EICs of the diesel samples, which exhibit spread on the scores plot due to PC6 (and higher)
describing the insignificant instrumental variation.
Since the models are used for classification, it is necessary to examine the corresponding
scores plots to determine the optimal number of PCs to use for each model. In the case of this
research, it was not possible to determine the optimal number of PCs to use for each model by
evaluating the corresponding two-dimensional scores plot. This is due to the fact that the
replicate training samples used to generate each individual model were so similar that, when
PCA was performed on the training samples, chemically insignificant variation in the data set
was emphasized. As a result, the training samples were not well-clustered in the scores plot and
the method of selecting the optimal number of PCs discussed above was not possible.

Figure 4.17: Scores plot of PC1 versus PC6 based on the extracted ion chromatograms
(m/z 99) of the ignitable liquid standards training and test sets: insect repellent (green),
gasoline (orange), paint thinner (yellow), diesel (blue), fuel injector cleaner (black), and
fuel stabilizer (red).
[Plot not reproduced. Axes: Principal Component 1 (67%), -8E6 to 8E6; Principal Component 6 (<0%), -3E6 to 3E6.]
The results of SIMCA demonstrate that using the correct number of PCs is essential for
accurate classifications. If too few PCs are used, the chemically significant variation may not be
described sufficiently to allow accurate classification among samples that contain similar
compounds. If too many PCs are used, noise or chemically insignificant variation is accounted for
in the model, which can result in the misclassification of samples.

4.3.4.2 EICs: m/z 91, 83, and 128
Ions m/z 91 and 83 are representative of the aromatic and olefinic/cycloparaffinic
compounds, respectively. Correct classification of the test samples was possible up to the 10%
significance level for EICs of m/z 91 and 83 when only 2 PCs were used to model each liquid. In
addition, correct classification of all samples occurred at all significance levels when the
recommended number of PCs was used in the SIMCA models. That classification succeeded across
all significance levels only with the recommended number of PCs is likely a result of PC1 and
PC2 not accounting for enough of the chemically significant variation. As discussed earlier, it is
difficult to determine the optimal number of PCs to use for classifying this data set by SIMCA
due to the high similarity between replicates.
For ion m/z 128, which represents polycyclic aromatic hydrocarbons, correct
classification of the test samples by SIMCA was possible at all significance levels when 2 PCs
were used to describe each liquid model. Classification using the recommended number of PCs,
on the other hand, was only possible up to the 10% significance level. At 25%, one gasoline
sample was not classified to any model. The gasoline sample was likely not classified at 25%
using the recommended number of PCs because the recommended number for the gasoline
model was five. As discussed previously, the higher PCs tend to describe chemically
insignificant variation of models that already poorly describe the ignitable liquids for which they
were generated; therefore, the lack of classification of one gasoline replicate is likely due to the
model describing too much noise.

4.4 Summary
SIMCA, a supervised classification procedure, was used to
successfully classify TICs of ignitable liquids in a test set to the corresponding liquid standards
in a training set up to a 10% significance level regardless of the number of PCs used to make the
models. At a significance level of 25%, the high similarity of the replicates within the data set
used to create the models resulted in one test sample not being classified. Since the larger
significance levels are considered to be less statistically significant, the successful classification of
all test samples at the smaller significance levels outweighs the lack of classification of a
gasoline replicate at the larger significance level.
The use of EICs instead of TICs for SIMCA was demonstrated. Correct classification of
all test samples resulted at all significance levels when using either 2 PCs or the
number recommended by the software, depending on the ion.
The classification of standards using TICs and EICs has highlighted the importance of
selecting the optimal number of PCs with which to perform SIMCA. The optimal number of PCs
will describe only chemically significant variation within the models. As a result, using the
optimal number of PCs for each model when performing SIMCA could potentially minimize the
possibility of false positives.

REFERENCES

1. Committee on Identifying the Needs of the Forensic Sciences Community, National
Research Council. Strengthening Forensic Science in the United States: A Path Forward.
Washington, D.C.: National Academies Press, 2009.
2. ASTM International, ASTM E 1412-07. Annual Book of ASTM Standards.
3. Unscrambler X SIMCA Theory Section of User Manual (version 10.2, Camo, Inc.,
Woodbridge, NJ)
4. Unscrambler X Methods Manual (version 10.2, Camo, Inc., Woodbridge, NJ)

Chapter 5 Conclusions

5.1 Summary of Research

5.1.1 Research Objectives and Goals
This research investigated the use of multivariate statistical procedures for the objective
analysis of fire debris. Specifically, unsupervised statistical procedures such as principal
components analysis (PCA) coupled with Pearson product moment correlation (PPMC)
coefficients and supervised statistical procedures such as soft independent modeling of class
analogy (SIMCA) were explored. While both types of procedures offer an objective approach for
a currently subjective visual analysis of chromatographic fire debris data, they achieve this
objectivity in two distinct manners.

5.1.2 Unsupervised Multivariate Statistics Study Summary
The goal of this study was to investigate the potential of unsupervised statistics,
specifically PCA and PPMC coefficients, for the association of simulated fire debris samples
to their corresponding ignitable liquid standards in spite of evaporation, matrix
interferences, and thermal degradation. Both PCA and PPMC coefficients were used to
analyze ignitable liquid standards and three data sets, which illustrate the effects of
evaporation, matrix interferences, and thermal degradation in a piecewise manner.
A surface-treated wood matrix was used to investigate the effects of matrix
interferences. Wood is commonly used in building, furnishing, and decorating structures
and, in such capacities, is usually coated with a surface treatment for durability as well as
decorative purposes. The compounds from the surface treatments are persistent and can
mimic the peak patterns of ignitable liquids.
Firstly, ignitable liquid standards were generated. In order to demonstrate the
effects of evaporation, gasoline and kerosene were evaporated to three different levels, by
volume (including 0% evaporation). Each of the liquids was spiked separately onto
KimWipes. The standards were extracted using the passive headspace procedure with an
activated charcoal strip and analyzed by gas chromatography-mass spectrometry (GC-MS).
The resulting chromatograms were subjected to data pretreatment procedures to minimize
the effects of any non-chemical variation that may have been introduced during the
extraction or analysis procedure. The pretreatments included a Savitzky-Golay smoothing
algorithm, a retention time alignment using a correlation optimized warping algorithm, and
normalization by total area for each ignitable liquid evaporation level.
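Two of the pretreatment steps named above, smoothing and total-area normalization, can be sketched in a few lines. The window length and polynomial order of the smoothing filter are not stated in this section, so the 5-point quadratic filter below is an assumption chosen for illustration; the retention time alignment step is omitted, and the function names are hypothetical.

```python
# Standard 5-point quadratic Savitzky-Golay smoothing coefficients (to be divided by 35).
SG_5PT_QUADRATIC = (-3.0, 12.0, 17.0, 12.0, -3.0)

def savitzky_golay_5pt(signal):
    """Smooth interior points with a 5-point quadratic filter; the two points at each end are left unchanged."""
    smoothed = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_5PT_QUADRATIC, window)) / 35.0
    return smoothed

def normalize_by_total_area(signal):
    """Scale a chromatogram so that its abundances sum to one."""
    total = sum(signal)
    if total == 0:
        raise ValueError("cannot normalize a chromatogram with zero total area")
    return [value / total for value in signal]

# Hypothetical noisy abundances at consecutive retention-time points.
raw = [0.0, 2.0, 9.0, 21.0, 8.0, 3.0, 1.0]
pretreated = normalize_by_total_area(savitzky_golay_5pt(raw))
print(pretreated)
```

Total-area normalization rescales each chromatogram to the same overall abundance, which is why it can reduce, but not necessarily remove, differences caused by spike volume.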
After data pretreatment procedures, the chromatograms were subjected to PCA and
then PPMC coefficients were calculated. These two statistical procedures can be thought of
as complementary. Principal components analysis identifies the greatest source of variance
within a data set to differentiate and, consequently, group similar samples based on their
chemical characteristics. Pearson product moment correlation coefficients on the other
hand assess the similarity between two chromatograms and are calculated by performing a
point-by-point comparison. The resulting coefficient provides a numerical value that
describes the extent of the similarity. The combination of these procedures provides both a
visual and numerical method of comparing chromatographic fire debris data.
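The point-by-point comparison behind a PPMC coefficient can be written out directly. The sketch below is a generic implementation of the Pearson product moment correlation for two chromatograms sampled at the same retention-time points; the function name and the example abundances are hypothetical.

```python
import math

def ppmc(x, y):
    """Pearson product moment correlation between two equally sampled chromatograms."""
    if len(x) != len(y):
        raise ValueError("chromatograms must have the same number of points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant chromatogram")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical abundances for two replicate chromatograms.
replicate_a = [0.0, 5.0, 20.0, 6.0, 1.0]
replicate_b = [0.5, 5.5, 19.0, 6.5, 1.0]
print(ppmc(replicate_a, replicate_b))
```

Because the coefficient depends on every point, the chromatograms must be aligned in retention time before it is calculated, which is consistent with the alignment step in the pretreatment.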
When PCA was applied to the ignitable liquid standards, the standards could be
differentiated from one another by liquid type as well as evaporation level of each liquid.
Additionally, the PPMC coefficients calculated for replicates according to ignitable liquid
evaporation level demonstrated that replicates could be strongly correlated to one
another. This, coupled with the low standard deviations associated with the PPMC
coefficients, indicated that the analysis and extraction procedures used in this study were
precise.
Next, to investigate the effects of matrix interferences and thermal degradation of
the matrix on the association of fire debris samples to respective standards, the ignitable
liquid standards were spiked onto unburned and burned surface-treated wood, which was
then extracted and analyzed by GC-MS. The liquids spiked onto the unburned surface-treated
wood demonstrated the effects of inherent matrix interferences on the association
of the simulated fire debris samples to their respective standards. The liquids spiked onto
the burned surface-treated wood demonstrated the effects of thermal degradation of the
wood matrix. Lastly, a simulated fire debris data set was generated by spiking each of the
ignitable liquids separately onto surface-treated wood, which was subsequently burned for
30 seconds. Again samples were extracted and analyzed by GC-MS. This data set takes into
account evaporation, matrix interferences, and thermal degradation on the association of
the samples to their respective standards.
The chromatograms from all data sets were subjected to data pretreatment
procedures. Next, for each data set, the scores of the samples were calculated and
projected, separately, onto the original scores plot of the ignitable liquid standards
resulting in three new scores plots. PPMC coefficients were calculated for sample
replicates in each data set as well as between samples and the unevaporated ignitable
liquid standards.
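The projection step described above can be sketched as follows, assuming the PCA of the standards provides a mean chromatogram and a set of loading vectors. The function name and the small numerical example are hypothetical; real chromatograms would have one value per retention-time point.

```python
def project_scores(sample, mean, loadings):
    """Project one chromatogram onto existing principal components.

    sample   -- abundances of the new sample
    mean     -- mean chromatogram of the standards used to build the PCA model
    loadings -- one loading vector per principal component
    """
    centered = [value - m for value, m in zip(sample, mean)]
    return [sum(c * p for c, p in zip(centered, pc)) for pc in loadings]

# Hypothetical three-point example with two orthonormal loading vectors.
standards_mean = [1.0, 2.0, 3.0]
standards_loadings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
debris_sample = [2.0, 2.5, 9.0]

print(project_scores(debris_sample, standards_mean, standards_loadings))
```

Because the mean and loadings come only from the standards, the projected samples do not influence the model, so their position on the scores plot reflects how they differ from the standards.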

Regardless of evaporation, matrix interferences or thermal degradation, in all three
data sets, the PCA scores plot could be used to differentiate the samples by ignitable liquid
type used to generate them. However, in none of the plots were the samples able to be
visually associated to their corresponding standard in terms of evaporation level. In each
case, this lack of association was due to the considerable amount of spread seen in the
evaporation level replicate samples as well as a shift in positioning in the scores plot of the
samples with respect to the standards.
The spread in the scores plot is likely due to the porous nature of the wood as
opposed to the variability of the burning process because the spread was observed in even
the unburned samples. The porosity of the wood may allow the ignitable liquid to soak into
the wood and negatively impact the efficiency of the passive headspace extraction. This is
further reflected in the poorer (but still strongly correlated) PPMC coefficients among
replicates of the samples for each data set when compared to those calculated for replicates
of the standards.
The shifts in positioning of sample replicates away from the standards, which
prevented correct association to standards by evaporation level in the scores plots, are
mostly due to differences in abundances of compounds resulting from using different spike
volumes to generate the data sets. The different abundances could not be compensated for
with current normalization procedures.
Even though the samples from each data set could not be associated to their
respective standards by evaporation level in the scores plot, this project successfully
demonstrated the potential of a more objective method of fire debris analysis using PCA and
PPMC coefficients. In forensic laboratories, analysts are not concerned with how
evaporated the ignitable liquid used to commit arson is. The job of the analyst is to identify
the presence of an ignitable liquid and then to identify its class. In spite of the presence of
the surface-treated wood matrix, the samples generated in all data sets could be identified
as containing the ignitable liquid used to generate the sample.

5.1.3 Supervised Multivariate Statistics Study Summary
The goal of this study was to investigate the potential for using supervised
multivariate statistical procedures such as SIMCA for the classification of ignitable liquid
standards spanning five different ASTM International classes. Because this was a proof-of-concept
study, the previously mentioned complicating factors of fire debris analysis were
not investigated.
Insect repellent, gasoline, paint thinner, fuel stabilizer, fuel injector, and diesel were
all selected as ignitable liquid standards. Each was diluted in methylene chloride and
analyzed by GC-MS using a direct injection. The liquids were analyzed in replicate (n=15).
The resulting total ion chromatograms were subjected to PCA to assess the natural
groupings of the liquids and determine the number of PCs that should be used to generate
each liquid model. All liquids could be differentiated from one another using 2 PCs. The
chromatograms were then split into training (n=12 per liquid) and test (n=3 per liquid)
sets. The training chromatograms were used to generate a PCA model for each ignitable
liquid, resulting in six total models. The test chromatograms were then classified as
belonging to one, multiple or no models. The classifications were performed using either 2
PCs or the number of PCs recommended by the software to describe each of the ignitable
liquid models. This process was also repeated using four different extracted ion
chromatograms (EIC). Each EIC chosen represented a different class of compounds.
Using TICs the classifications of the test chromatograms, which were performed
using both 2 and the recommended number of PCs, were successful at a 0.1%, 1%, 5%, and
10% significance level. One gasoline test TIC remained unclassified at a 25% significance
level, regardless of the number of PCs used to describe the models. Significance levels less
than 5% are more statistically significant; therefore, the lack of classification of one
gasoline replicate at 25% was not of great consequence. Furthermore, the reasoning for the
unclassified gasoline replicate at 25% was mainly due to poorly described liquid models,
which were the result of developing models from a highly similar set of replicate
chromatograms for each ignitable liquid. When ignitable liquid models were developed by
performing PCA on a set of practically identical chromatograms, the insignificant non-chemical
variation was identified and emphasized in the models. While this was true of all
of the liquid models generated for this study, the gasoline was particularly poorly
described since the non-chemical variation from the baseline contributed more to the
gasoline model than any other model. It was for these reasons that the gasoline replicate
was not classified at a higher significance level.
Classification using SIMCA on EICs was also investigated. Classification for every test
EIC for the four ions investigated was successful across all significance levels using either 2
PCs or the number recommended by the software. Classifications for each m/z EIC did vary
from using 2 PCs to using the recommended number, thus highlighting the importance of
choosing the optimal number of PCs with which to describe the ignitable liquid models.
Using too many PCs could result in a model that is significantly influenced by chemically
insignificant variations whereas using too few PCs could result in a model that does not
adequately describe the chemical variations in the data.

5.2 Future Work
The potential of unsupervised and supervised statistical procedures has been
demonstrated in fire debris analysis under laboratory conditions, but before that potential
can be realized in forensic laboratories, more research needs to be performed. While PCA has
been used to investigate the effects of evaporation, matrix interferences, and thermal
degradation on the association of simulated fire debris to their respective standards, SIMCA
has not. In the proof-of-concept study presented in this thesis, the potential for SIMCA was
demonstrated by classifying the TICs and EICs of ignitable liquid standards. These
standards could easily have been classified by fire debris analysts by visually analyzing the
chromatograms. Next, the potential use of SIMCA on chromatographic data that is less
visually obvious must be investigated because it is for this type of data that SIMCA is
needed.
The analysis of fire debris from a crime scene is always complicated by evaporation
of the liquid, interference compounds from the matrix and thermal degradation of both the
liquid and matrix. These effects on the analysis of fire debris were already investigated
using PCA and should be investigated next using SIMCA. As with PCA, the complicating
factors should be investigated with SIMCA in a piecewise manner, using multiple data sets
to illustrate the individual and combined effects of each complicating factor. The
successfulness of the classification would demonstrate whether or not SIMCA has a future
in fire debris analysis. Additionally, this type of study would allow investigators to
determine which complicating factor most negatively impacts the classification and,
potentially, investigate ways to minimize these effects such as using EICs and extracted ion
profiles (EIPs).
Once the initial study described above using SIMCA has been performed,
demonstrating that the classification of simulated fire debris according to ignitable liquid
type is possible, SIMCA and PCA will be at the same stage in research and development for
use in fire debris analysis. At this point, both procedures could be used to investigate the
effects of more common matrices such as linoleum flooring, decorative paneling, or
countertop surfaces. Additionally, fire scenes are comprised of the burning of multiple
matrices in close proximity to one another; it is highly unlikely that only one burned matrix
would be present. As a result, future studies should investigate the effects of mixed
matrices.
The association and classification of fire debris samples using other ignitable liquids
should also be investigated in conjunction with evaporation, matrix interferences, and
thermal degradation. Gasoline and kerosene were investigated in this study because of
their prevalent use during the commission of arson, but other liquids such as lighter fluid
are also commonly used. Consequently, more liquids spanning multiple ASTM International
classes need to be investigated in the presence of the afore-mentioned complicating factors.
Additionally, mixed liquids should be investigated because arsonists use ignitable liquids
that are available to them and may use a mixture of multiple ignitable liquids to start fires.
This research demonstrated the potential for use of unsupervised and supervised
multivariate statistical procedures such as PCA and SIMCA in performing an objective
analysis of fire debris. While the results of this research are promising, much more
research needs to be performed before these procedures can be implemented in forensic
laboratories. The research performed in this thesis is a necessary step in bridging the gap
between research and the application of these procedures for the objective analysis of fire
debris in forensic laboratories.