This is to certify that the
dissertation entitled
An Assessment of the Impacts
of Alternative Factor Analyses
on the Stability of Cluster Membership
presented by
Sheng Jung Ou
has been accepted towards fulfillment
of the requirements for
PhD degree in the
Department of Park and Recreation Resources
Major professor
Datej / 3 /9/
MSU is an Affirmative Action/Equal Opportunity Institution 0-12771
AN ASSESSMENT OF THE IMPACTS OF ALTERNATIVE FACTOR
ANALYSES ON THE STABILITY OF CLUSTER MEMBERSHIP
BY
SHENG JUNG OU
A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirement
for the degree of
DOCTOR OF PHILOSOPHY
Department of Park and Recreation Resources
1990
ABSTRACT
AN ASSESSMENT OF THE IMPACTS OF ALTERNATIVE FACTOR
ANALYSES ON THE STABILITY OF CLUSTER MEMBERSHIP
By
Sheng Jung Ou
Even though the use of factor scores as input data for cluster
analysis is a relatively common procedure, there has been very little
research on the effect of alternative factor analyses on the results of
cluster analysis, especially cluster membership. The primary purpose of
the study was to examine the impact of factor analyses on cluster
membership when clustering is based on factor scores. Specifically, the
study examined the effect of alternative factor solutions (number of
factors) and factor rotation on cluster membership.
The study used the importance ratings of 20 different campground
attributes/facilities collected in a study of the 1988 National Campers
and Hikers Association Campvention. To achieve the three study objectives,
principal component analysis with and without varimax rotation, cluster
analysis (Ward's method with squared Euclidean distance as the distance
measure), crosstabulation, and the entropy (information) measure were
employed.
Three major conclusions were drawn from the analyses. First, when
factor analysis is used in conjunction with cluster analysis, the factor
solution (number of factors) selected has an effect on the cluster
membership. Second, whether or not the initial factors are rotated does
not affect cluster membership. However, rotation will affect the
interpretation of the clustering results (i.e., the cluster labels).
Third, clustering on raw data rather than factor scores results in more
stable cluster membership.
The study resulted in two primary recommendations regarding the
use of factor analysis and cluster analysis. First, when factor
analysis is performed as a preliminary step to cluster analysis, the two
should not be treated as distinct analyses. Decisions regarding the
number of factors should be based on both the factor analysis criteria
(eigenvalues greater than one, percentage of variance explained, scree
test) and the impact on the cluster solution. Second, researchers may
first perform cluster analysis based on raw data for classification
(segmentation) purposes, and then use factor analysis as a means of
describing clusters.
Copyright by
Sheng Jung Ou
1990
ACKNOWLEDGEMENTS
I would like to thank my major adviser, Dr. Edward M. Mahoney, for
his support and instruction. He was essential in the success of my
Ph.D. program at Michigan State University. Without his patience,
thoughtful response, and intellectual stimulation, I would not have been
able to finish this dissertation.
I would also like to thank my committee members Dr. Donald F.
Holecek (Department of Park and Recreation Resources), Dr. Rene C.
Hinojosa (Department of Urban Planning), and Dr. Paul E. Nickel
(Department of Resource Development) for their patience and constructive
criticisms of this dissertation. Special thanks are extended to Douglas
B. Jester, for his input and assistance, especially his advice regarding
research and statistical methods.
I would also like to express my appreciation to Dr. James L.
Bristor for his friendship and spiritual support, to Ms. Tsao Fang Yuan
for her assistance, and to Mr. Jong Pyng Li for his help with the
computer programming.
My deepest and warmest gratitude must go to my parents, fiancee,
sisters, and brothers for their support throughout my graduate program.
Without their contributions, I would not have achieved my personal
goals. Finally, to everyone else who helped me and made my stay at
Michigan State University truly enjoyable and unforgettable, thanks.
TABLE OF CONTENTS
Page
LIST OF TABLES ................................................. v
LIST OF FIGURES ................................................ x
Chapter
I. INTRODUCTION ........................................... 1
Problem Statement .................................. 4
Study Objectives ................................... 7
Organization of the Study .......................... 7
II. LITERATURE REVIEW ...................................... 9
Factor Analysis and Cluster Analysis ............... 9
Factor Analysis ............................. 9
Cluster Analysis ............................ 13
Comparisons of Factor Analysis
and Cluster Analysis ................... 17
Literature Supporting the Combined Use of Factor
Analysis and Cluster Analysis ................. 18
Studies on the Combined Use of Factor Analysis
and Cluster Analysis .......................... 19
Potential Impacts of Factor Solutions on
Clustering Results ............................ 36
Summary ............................................ 37
III. RESEARCH METHODS ....................................... 38
Source and Description of Data ..................... 38
The 1988 Michigan Campvention Study ......... 38
Data Collection Methods and Response Rate. 39
Profile of Persons Who Completed
Questionnaires ......................... 41
Data Used in the Present Study .............. 42
Statistical Methods Used to Achieve the Study
Objectives ..................................... 44
The Effect of Different Factor Solutions on
Cluster Membership ........................ 44
Procedures .........................
The Effect of Factor Rotation on Cluster
Membership ................................ 57
Procedures ......................... 57
Comparison of Different Clustering Approaches. 60
Procedures ......................... 60
IV. RESULTS ................................................. 66
Importance Ratings of 20 Campground Attributes ..... 66
Appropriateness of the Data for Factor Analysis.... 68
Assessment of the Effect of Different Factor
Solutions on the Clustering Results ............ 69
Factoring Results ......................... 69
Clustering Results ........................ 92
Factor Score Pattern ...................... 94
Comparison of Cluster Membership .......... 119
Assessment of the Effect of Rotation on Cluster
Membership ..................................... 133
Comparison of Clustering on Factor Scores with
Clustering on Raw Data ......................... 136
Clustering Results ...................... 137
Comparisons Between Clustering Approaches. 137
VI. CONCLUSIONS ............................................. 174
Summary of the Study ................................ 174
Major Conclusions ................................... 176
Study Limitations ................................... 177
Recommendations Regarding the Use of Factor Analysis
and Cluster Analysis ........................... 178
BIBLIOGRAPHY ................................................... 181
APPENDIX A .....................................................
APPENDIX B .....................................................
APPENDIX C .....................................................
APPENDIX D .....................................................
LIST OF TABLES
Table Page
1. A summary of studies in which combined factor analysis
and cluster analysis was employed ........................ 20
2. Illustration of the crosstabulations of clusters
across different factor solutions ........................ 49
3. Illustration of major elements in calculating
conditional entropy ...................................... 55
4. The calculation process for the information measure
(changes in cluster membership) between the 20-factor
and the 19-factor solution ............................... 56
5. Artificial data for information (entropy) measure ........ 58
6. Illustration of (factor score) centroids for each of
the six clusters across different factor solutions .......
7. Illustration of crosstabulation comparison of the
membership of clusters derived from rotated factor
scores with clusters derived from unrotated factor
scores ...................................................
8. Illustration of the calculation of the sum of
squared distance .........................................
9. Illustration for the measure of cluster similarity .......
10. Importance ratings (assigned the campground attributes)
which were used in the factor analyses and cluster
analyses ................................................. 67
11. Eigenvalue, percent of variance explained, and
cumulative percent of variance explained for
20 campground attributes .................................
12. Campground attribute sought factor pattern matrix for
"20 factor" principal component analysis with varimax
rotation ................................................. 73
13. Campground attribute sought factor pattern matrix for
"19 factor" principal component analysis with varimax
rotation ................................................. 74
14. Campground attribute sought factor pattern matrix for
"18 factor" principal component analysis with varimax
rotation ................................................. 75
15. Campground attribute sought factor pattern matrix for
"17 factor" principal component analysis with varimax
rotation ................................................. 76
16. Campground attribute sought factor pattern matrix for
"16 factor" principal component analysis with varimax
rotation ................................................. 77
17. Campground attribute sought factor pattern matrix for
"15 factor" principal component analysis with varimax
rotation ................................................. 78
18. Campground attribute sought factor pattern matrix for
"14 factor" principal component analysis with varimax
rotation ................................................. 79
19. Campground attribute sought factor pattern matrix for
"13 factor" principal component analysis with varimax
rotation ................................................. 80
20. Campground attribute sought factor pattern matrix for
"12 factor" principal component analysis with varimax
rotation ................................................. 81
21. Campground attribute sought factor pattern matrix for
"11 factor" principal component analysis with varimax
rotation ................................................. 82
22. Campground attribute sought factor pattern matrix for
"10 factor" principal component analysis with varimax
rotation ................................................. 83
23. Campground attribute sought factor pattern matrix for
"9 factor" principal component analysis with varimax
rotation ................................................. 84
24. Campground attribute sought factor pattern matrix for
"8 factor" principal component analysis with varimax
rotation ................................................. 85
25. Campground attribute sought factor pattern matrix for
"7 factor" principal component analysis with varimax
rotation ................................................. 86
26. Campground attribute sought factor pattern matrix for
"6 factor" principal component analysis with varimax
rotation ................................................. 87
27. Campground attribute sought factor pattern matrix for
"5 factor" principal component analysis with varimax
rotation ................................................. 88
28. Campground attribute sought factor pattern matrix for
"4 factor" principal component analysis with varimax
rotation ................................................. 89
29. Campground attribute sought factor pattern matrix for
"3 factor" principal component analysis with varimax
rotation ................................................. 90
30. Campground attribute sought factor pattern matrix for
"2 factor" principal component analysis with varimax
rotation ................................................. 91
31. Mean attribute sought factor scores for the eight-cluster
candidate solution when clustering on factor scores ...... 95
32. Mean attribute sought factor scores for the six-cluster
candidate solution when clustering on factor scores ...... 96
33. Mean attribute sought factor scores for the three-cluster
candidate solution when clustering on factor scores ...... 97
34. Number of respondents in each of the cluster candidate
solutions when clustering on factor scores ............... 98
35. Cluster membership crosstabulation of the 20-factor
solution and the 20-factor solution ...................... 120
36. Cluster membership crosstabulation of the 20-factor
solution and the 19-factor solution ...................... 120
37. Cluster membership crosstabulation of the 20-factor
solution and the 18-factor solution ...................... 121
38. Cluster membership crosstabulation of the 20-factor
solution and the 17-factor solution ...................... 121
39. Cluster membership crosstabulation of the 20-factor
solution and the 16-factor solution ...................... 122
40. Cluster membership crosstabulation of the 20-factor
solution and the 15-factor solution ...................... 122
41. Cluster membership crosstabulation of the 20-factor
solution and the 14-factor solution ...................... 123
42. Cluster membership crosstabulation of the 20-factor
solution and the 13-factor solution ...................... 123
43. Cluster membership crosstabulation of the 20-factor
solution and the 12-factor solution ...................... 124
44. Cluster membership crosstabulation of the 20-factor
solution and the 11-factor solution ...................... 124
45. Cluster membership crosstabulation of the 20-factor
solution and the 10-factor solution ...................... 125
46. Cluster membership crosstabulation of the 20-factor
solution and the 9-factor solution ...................... 125
47. Cluster membership crosstabulation of the 20-factor
solution and the 8-factor solution ...................... 126
48. Cluster membership crosstabulation of the 20-factor
solution and the 7-factor solution ...................... 126
49. Cluster membership crosstabulation of the 20-factor
solution and the 6-factor solution ...................... 127
50. Cluster membership crosstabulation of the 20-factor
solution and the 5-factor solution ...................... 127
51. Cluster membership crosstabulation of the 20-factor
solution and the 4-factor solution ...................... 128
52. Cluster membership crosstabulation of the 20-factor
solution and the 3-factor solution ...................... 128
53. Cluster membership crosstabulation of the 20-factor
solution and the 2-factor solution ...................... 129
54. Entropy measures (using the 20 factor solution as a
basis of comparison) of cluster membership for
different factor solutions ............................... 131
55. Crosstabulation of clustering results based on
rotated and nonrotated factors ........................... 135
56. Comparison of factor score centroids for clusters
based on rotated and nonrotated factor scores for
the "20 factor" solution ................................. 135
57. Mean attribute sought factor scores for the six-cluster
candidate solution when clustering on raw data ........... 139
58. Mean attribute sought factor scores for the five-cluster
candidate solution when clustering on raw data ........... 140
59. Mean attribute sought factor scores for the four-cluster
candidate solution when clustering on raw data ........... 141
60. Mean attribute sought factor scores for the three-cluster
candidate solution when clustering on raw data ........... 142
61. Number of respondents in each of the cluster candidate
solutions when clustering on raw data .................... 143
62. Comparison of stability of factor score patterns between
two approaches ........................................... 170
63. Differences in the importance ratings of different
campground attributes between two subsamples ............. 197
64. Comparison of factoring results between two subsamples .. 198
LIST OF FIGURES
Figure Page
1. Illustration of a plot of the coefficient of hierarchy
by number of clusters .................................... 47
2. Illustration of the plot of 19 entropy measures .......... 56
3. Illustration of the plot of factor centroids ............. 59
4. Scree test for selecting candidate factor solutions ...... 72
5. Coefficient of hierarchy by number of attribute sought
clusters when clustering is based on factor scores ....... 93
6. The "factor 1" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 99
7. The "factor 2" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 100
8. The "factor 3" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 101
9. The "factor 4" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 102
10. The "factor 5" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 103
11. The "factor 6" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 104
12. The "factor 7" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 105
13. The "factor 8" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 106
14. The "factor 9" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 107
15. The "factor 10" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 108
16. The "factor 11" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 109
17. The "factor 12" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 110
18. The "factor 13" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 111
19. The "factor 14" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 112
20. The "factor 15" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 113
21. The "factor 16" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 114
22. The "factor 17" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 115
23. The "factor 18" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 116
24. The "factor 19" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 117
25. The "factor 20" factor score centroids for six clusters
across the different factor solutions when clustering
on factor scores ......................................... 118
26. Entropy pattern of cluster membership across the
different factor solutions ............................... 133
27. Coefficient of hierarchy by number of clusters when
clustering is based on raw data .......................... 138
28. The "factor 1" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 144
29. The "factor 2" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 145
30. The "factor 3" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 146
31. The "factor 4" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 147
32. The "factor 5" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 148
33. The "factor 6" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 149
34. The "factor 7" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 150
35. The "factor 8" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 151
36. The "factor 9" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 152
37. The "factor 10" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 153
38. The "factor 11" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 154
39. The "factor 12" factor score centroids for six clusters
across the different factor solutions when clustering
on raw data .............................................. 155
CHAPTER I
INTRODUCTION
Cluster analysis is a statistical method commonly used to classify
individuals or objects into groups (clusters) based on their similarity
with respect to specific characteristics/variables so that the resulting
clusters possess high internal (within-cluster) homogeneity and high
external (between-cluster) heterogeneity. In addition to the grouping
function, cluster analysis can also be used to perform data reduction
and to test hypotheses (Anderberg, 1973; Everitt, 1974). Cluster
analysis has been applied in many fields such as business, social
science, psychology, biology, political science, remote sensing
research, and leisure research.
Clustering methods have been recognized throughout this century,
but most of the literature on cluster analysis and its application has
been written during the past two decades. Cluster analysis was first
discussed by social scientists during the 1930s (Driver & Kroeber, 1932;
Tryon, 1939; Zubin, 1938). However, it was not until the late 1950s
that cluster analysis attracted significant attention. The main stimuli
for this increased interest were the publication of Principles of
Numerical Taxonomy by Sokal and Sneath (1963), and the development of
high-speed computers and cluster analysis software. At least 14
different computer software programs are now available for cluster
analysis (Punj & Stewart, 1983), including SPSS (Statistical Package for
the Social Sciences), SAS (Statistical Analysis System), BMDP, and
CLUSTAN.
Cluster analysis has been utilized extensively to segment various
product and service markets including different recreation and tourism
markets (Boggis & Held, 1971; Calantone & Johar, 1984; Calantone,
Schewe, & Allen, 1980; Crask, 1980; Davis, Allen, & Cosanza, 1988;
Ditton, Goodale, & Jonsen, 1975; Funk & Hudon, 1988; Goodrich, 1980;
Green, Frank, & Robinson, 1967; Green, Sommers, & Kernan, 1973;
Harrigan, 1985; Huszagh, Fox, & Day, 1985; Lessig & Tollefson, 1971;
Mazanec, 1984; Perreault, Darden, & Darden, 1977; Saunders, 1985; Sethi,
1971; Shoemaker, 1989; Stynes & Mahoney, 1980; Tatham & Dornoff, 1971;
Woodside & Motes, 1981). Besides market segmentation, cluster analysis
also has been used in the field of recreation and tourism to classify
leisure activities (Devall & Harry, 1981; Ellis & Rademacher, 1987;
Tinsley & Johnson, 1984) and to identify different types of experiences,
preferences and attributes (Hautaluoma & Brown, 1979; Heywood, 1987;
Knopp, Ballman, & Merriam, 1979; Manfredo, Driver, & Brown, 1983).
The increased use of cluster analysis has resulted in greater
attention to various clustering/methodological decisions including (a)
the clustering algorithm, (b) the similarity measure, and (c) the number
of clusters. These decisions are all critical elements in the
clustering process. Another primary concern in cluster analysis is the
degree of correlation between the clustering variables. Correlation
among clustering variables results in an implicit weighting (double
counting) problem; correlated variables have more weight in determining
the cluster solution. To address the implicit weighting problem,
researchers have proposed/used factor analysis (principal component
analysis) as a prelude to cluster analysis (Aldenderfer & Blashfield,
1984; Anderberg, 1973; Everitt, 1979; Gorsuch, 1983; Green et al., 1967;
Rohlf, 1970; Skinner, 1979; Smith, 1989). Factor analysis is also used
as a preparatory step to reduce potential clustering variables to a core
set of dimensions in order to make the results more interpretable
(Kikuchi, 1986).
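
The implicit weighting (double counting) problem described above can be illustrated with a small sketch. The ratings below are hypothetical, not data from the study, and duplicating an attribute stands in for the extreme case of two perfectly correlated clustering variables. The sketch only shows how the duplicated attribute's share of the squared Euclidean distance grows.

```python
# Hypothetical importance ratings for two respondents on three attributes.
# These values are illustrative assumptions, not data from the study.

def squared_euclidean(x, y):
    """Sum of squared coordinate differences between two rating vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

resp_1 = [4.0, 2.0, 5.0]
resp_2 = [3.0, 2.0, 1.0]

# Distance on the three original attributes.
d_original = squared_euclidean(resp_1, resp_2)

# Duplicate the third attribute to mimic a perfectly correlated variable.
d_duplicated = squared_euclidean(resp_1 + [resp_1[2]], resp_2 + [resp_2[2]])

# Share of the total distance that comes from the third attribute.
third_attr = (resp_1[2] - resp_2[2]) ** 2
share_before = third_attr / d_original
share_after = 2 * third_attr / d_duplicated

print(d_original, d_duplicated)
print(round(share_before, 3), round(share_after, 3))
```

With real data the correlation is rarely perfect, so the extra weight is smaller, but it acts in the same direction.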
Factor (principal component) analysis is a process for grouping
variables. It is a multivariate statistical technique in which a large
number of interrelated variables is summarized/reduced to a smaller
number of factors (dimensions) without appreciable loss of information.
By performing factor (principal component) analysis, the original data
are reduced to some independent (noncorrelated) dimensions or factors.
Factor scores (calculated by multiplying the original raw data
measurements by the corresponding factor score coefficients) are often
used as input variables in cluster analysis.
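
A minimal sketch of the factor score calculation follows. It assumes that the raw measurements are standardized to z-scores before they are multiplied by the coefficients, which is common when components are extracted from a correlation matrix but is not stated in the text. The ratings, the coefficient values, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def standardize(columns):
    """Convert each variable (one list per variable) to z-scores."""
    out = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        out.append([(v - m) / s for v in col])
    return out

def factor_scores(columns, coefficients):
    """Return one list of scores per factor.

    columns[j] holds the raw ratings for variable j.
    coefficients[j][k] is the (hypothetical) factor score coefficient
    of variable j on factor k.
    """
    z = standardize(columns)
    n_obs = len(columns[0])
    n_factors = len(coefficients[0])
    return [
        [sum(z[j][i] * coefficients[j][k] for j in range(len(z)))
         for i in range(n_obs)]
        for k in range(n_factors)
    ]

# Three hypothetical attributes rated by four respondents.
ratings = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
    [5.0, 5.0, 1.0, 1.0],
]
# Assumed coefficients for a two-factor solution (illustrative only).
coef = [[0.5, 0.1], [0.5, -0.1], [-0.2, 0.9]]

scores = factor_scores(ratings, coef)
for k, factor in enumerate(scores, start=1):
    print("factor", k, [round(v, 3) for v in factor])
```

Each respondent ends up with one score per retained factor, and those scores replace the original ratings as the clustering variables.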
Beyond data reduction, clustering on principal component scores
rather than raw data (e.g., ratings of attributes) offers two further
benefits. First, the dimensions (factors) are
independent, thereby avoiding the collinearity or multicollinearity
problem associated with correlated data. Second, the resultant factors
are given equal weight which avoids the implicit weighting problem.
Although factor scores (derived from principal component analysis) are
commonly used as input to clustering algorithms, researchers have raised
questions or concerns about this practice. Anderberg (1973) questioned
whether the factors reflect the relationship among variables that are
actually observed in the clusters. Rohlf (1970) voiced the concern that
principal component analysis tends to maintain the representation of
widely separated clusters in a reduced space but minimizes the distances
between clusters or groups that are not widely separated.
Factor analysis can affect/determine cluster solutions in three
potential ways: (a) the number of factors that determine factor scores
(Coovert & McNelis, 1988; Zwick & Velicer, 1986), (b) factor rotation
(Dielman, Cattell, & Wagner, 1972; Gorsuch, 1983), and (c) factor
weighting (DeSarbo, Carroll, & Clark, 1984; Sneath & Sokal, 1973).
Relatively little attention has been directed at the potential effects
of alternative factor solutions on clustering results. A review of 32
studies in which factor scores were used as the basis for clustering
identified only one which analytically compared clustering results based
on two different factor solutions (as the bases for clustering) (Day,
Fox, & Huszagh, 1988). In another study, Bartko, Strauss, and Carpenter
(1971) compared clustering results based on raw data and factor scores.
Shutty and DeGood (1987) compared clustering results based on
standardized scores and factor scores.
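
One way to compare cluster membership between two clustering runs is to crosstabulate the two sets of labels and summarize the table with a conditional entropy. The sketch below is an assumption about how such a comparison might be coded, not the procedure used in the studies cited above. The label vectors and their names are hypothetical, and the formula is the standard conditional entropy H(B | A) in bits.

```python
from collections import Counter
from math import log2

def crosstab(labels_a, labels_b):
    """Count how many cases fall in each (cluster in A, cluster in B) cell."""
    return Counter(zip(labels_a, labels_b))

def conditional_entropy(labels_a, labels_b):
    """H(B | A) in bits; zero when membership in A fully determines B."""
    n = len(labels_a)
    joint = crosstab(labels_a, labels_b)
    row_totals = Counter(labels_a)
    h = 0.0
    for (a, _b), count in joint.items():
        h -= (count / n) * log2(count / row_totals[a])
    return h

# Hypothetical cluster labels for eight respondents under three factor solutions.
solution_20 = [1, 1, 1, 2, 2, 2, 3, 3]
solution_19 = [2, 2, 2, 1, 1, 1, 3, 3]  # same partition, clusters relabeled
solution_5 = [1, 1, 2, 2, 2, 3, 3, 1]   # some respondents change clusters

h_same = conditional_entropy(solution_20, solution_19)
h_changed = conditional_entropy(solution_20, solution_5)
print(round(h_same, 3), round(h_changed, 3))
```

Because the measure depends only on the crosstabulation, it is unaffected by arbitrary relabeling of clusters, which makes it suited to comparing solutions whose cluster numbers do not correspond.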
Problem Statement
Although the use of factor scores as input data for cluster
analysis is a relatively common procedure, very little research has
been done on the effect of factor analysis--number of factors and
rotation--on the results of cluster analysis, especially cluster
membership. Numerous researchers have raised various methodological
questions regarding factor analysis as an independent procedure
(Armstrong & Soelberg, 1968; Bobko & Schemmer, 1984; Browne, 1968b;
Hakstian & Muller, 1973; Heeler, Whipple, & Hustad, 1977; Horn, 1965a;
Mooijaart, 1985; Tucker, 1971) and cluster analysis (Bayne, Beauchamp,
Begovich, & Kane, 1980; Dreger, Fuller, & Lemoine, 1988; Funkhouser,
1983; Krzanowski & Lai, 1988; Lathrop, 1987; Marriott, 1971; McIntyre &
Blashfield, 1980; Milligan & Cooper, 1985; Mojena, 1977; Rand, 1971;
Skinner, 1978). However, as previously mentioned, only one study was
found that examined the effect of alternative factor analyses on
clustering when factor scores were the basis for clustering.
Factor analysis and cluster analysis are usually treated as
distinct analyses even when used in conjunction with each other
(Collins, Cliff, & Cudeck, 1983; Hooper, 1985; Shutty & DeGood, 1987).
The factor analysis is performed first; then the factor solution--the
number of factors extracted--is decided based on different factoring
criteria (e.g., eigenvalues greater than one, percentage of variance
explained, scree test, interpretability of factors), and not (also) on
the potential effect on the clustering solution--number of clusters,
cluster membership, homogeneity of clusters, and identification
(description) of clusters (Calantone & Johar, 1984; Crask, 1981;
Kikuchi, 1986; Meade, 1987). Although eigenvalues greater than one,
percentage of variance explained, and scree test are useful in
evaluating and selecting a factor solution, a great deal of subjectivity
is still associated with arriving at a factor solution and interpreting
the resultant factors.
An important decision in factor analysis is the method to be used
in rotating the initial factors that are extracted from the correlation
matrix. Rotating the factor matrix redistributes the variance from
earlier factors to later ones to achieve a simpler, theoretically more
meaningful, factor pattern (Hair, Anderson, & Tatham, 1987; Kim &
Mueller, 1989). Rotating factors generally improves the interpretation
by reducing some of the ambiguities that often accompany initial
unrotated factor solutions. Although rotating the factor matrix may
create more interpretable factors, Frank and Green (1968) pointed out that
rotation of factor axes also lends a certain arbitrariness to the
procedure. Most studies on rotation have focused on alternative
methods, either orthogonal or oblique (Arbuckle & Friendly, 1977;
Carroll, 1953; Hakstian, 1976; Saunders, 1961); no studies of the effect
of rotation on cluster membership were found.
Although the combined use of factor analysis and cluster analysis
has been commonly employed in segmentation and classification studies,
it has also been used for other purposes, such as differentiating small
geographic areas on the basis of well-established sociological
constructs, understanding social differentiation in modern industrial
society, revealing consumer search patterns, and measuring the concept
of social identity.
Factor analysis in conjunction with cluster analysis is also widely
used in recreation and tourism, for example in segmenting the
vacationer market based on lifestyle variables, segmenting the tourism
market on benefit-seeking choices, exploring aspects of lifestyles with
respect to vacation activities, establishing lifestyle profiles of
elderly female travelers, and ascertaining the barriers to recreation.
The primary purpose of this study was to assess the effect of
different approaches to factor analysis on cluster membership when
clustering is based on factor scores. Specifically, the study examined
the effect of alternative factor solutions (number of factors) and factor
rotation on cluster membership. Another purpose was to compare the
stability of clusters based on factor scores with the stability of
clusters based on raw data.
Study Objectives
To address the aforementioned purposes, three objectives were
defined to guide and evaluate this study.
Objective 1. To assess the effect of different factor
solutions (number of factors) on cluster
membership.
Objective 2. To ascertain the effect of factor rotation on
cluster membership.
Objective 3. To compare clustering on factor scores with
clustering on raw data.
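Comparing cluster membership across solutions, as the objectives above require, needs some measure of agreement between two partitions of the same cases. The following is a minimal sketch, assuming the pair-counting index of Rand (1971) as that measure. The function name and the example memberships are hypothetical, and the sketch is not necessarily the comparison procedure used in this study.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of case pairs on which two partitions agree (Rand, 1971)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same cases")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        # A pair counts as agreement when both partitions keep the two
        # cases together, or both partitions keep them apart.
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total


# Hypothetical memberships of six cases under two factor solutions.
three_factor = [0, 0, 1, 1, 2, 2]
four_factor = [1, 1, 0, 0, 0, 2]
print(rand_index(three_factor, four_factor))
```

The index depends only on which cases are grouped together, so the arbitrary numbering of clusters in each solution does not affect it.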
Organization of the Study
Chapter II is a review of relevant literature, focusing on
previous studies, especially in the fields of marketing, recreation, and
tourism, that have employed both factor analysis (principal component
analysis) and cluster analysis. Chapter III contains a description of
the data--ratings of 20 campground attributes--used in the study,
including how they were collected, and a discussion of the statistical
procedures used for the different objectives. Chapter IV includes
descriptive statistics on the ratings of the 20 campground
attributes, the appropriateness of data for factor analysis, an
assessment of the effect of different factor solutions on the clustering
results, an assessment of the effect of rotation on cluster membership,
and comparison of clustering on factor scores with clustering on raw
data. Chapter V includes a summary of the study, major conclusions,
study limitations, and recommendations regarding the combined use of
factor analysis and cluster analysis.
CHAPTER II
LITERATURE REVIEW
The primary objective of this chapter is to acquaint the reader
with the literature on the combined use of factor analysis and cluster
analysis and its application in the fields of marketing (especially
market segmentation), recreation, and tourism.
Factor Analysis and Cluster Analysis
Factor Analysis
As mentioned previously, factor analysis is a multivariate
statistical tool for exploring the similarity of relationships among
variables. The primary purpose of factor analysis is to reconstruct
original variables into an underlying multivariate space that specifies
the positions of original variables rather than establishing which
variables go together (Gorman, 1983; Gorsuch, 1983). Factor analysis
starts out with a correlation matrix, which is a table showing the
intercorrelations among all variables. The interrelationships between
variables are typically determined by Pearson product-moment
correlation.
The underlying factors are extracted using either a component
model or a common factor model. There are a number of differences
between the two models. The major difference is the elements comprising
the diagonal of the correlation matrix. The component model uses total
variance (unity) in the diagonal of the correlation matrix, whereas the
common factor model uses communalities (common variance). The component
model is used to summarize most of the original information (variance) in
the minimum number of factors. The common factor model is used to identify
underlying factors or dimensions not easily recognized (Hair et al.,
1987; Kim & Mueller, 1988).
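The difference in the diagonal of the correlation matrix can be illustrated with a short sketch. The correlations and communality estimates below are hypothetical values chosen for illustration, and the function names are not taken from any statistical package.

```python
def analysis_matrix(corr, communalities=None):
    """Return the matrix that is actually factored.

    With communalities=None the diagonal stays at unity (component model).
    Otherwise the diagonal is replaced by the supplied communality
    estimates (common factor model).
    """
    reduced = [row[:] for row in corr]
    if communalities is not None:
        for i, h2 in enumerate(communalities):
            reduced[i][i] = h2
    return reduced


def trace(matrix):
    """Sum of the diagonal, i.e., the amount of variance being analyzed."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Hypothetical correlations among three variables.
R = [
    [1.00, 0.60, 0.50],
    [0.60, 1.00, 0.40],
    [0.50, 0.40, 1.00],
]
# Hypothetical communality estimates for the same three variables.
h2 = [0.55, 0.45, 0.35]

component = analysis_matrix(R)
common = analysis_matrix(R, h2)
print(trace(component))  # total variance: one unit per variable
print(trace(common))     # common variance only
```

Under these assumptions the component model analyzes three units of variance, whereas the common factor model analyzes only the portion treated as shared among the variables.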
Although both factoring models are capable of extracting common
factors, the initial result seldom represents the final solution because
the initial factors are difficult to interpret and may not adequately
represent the simple structure. Frequently, the initial factors are
rotated. Two rotation procedures are commonly used, orthogonal and
oblique. In orthogonal rotation the factors are mutually independent.
Three major types of orthogonal rotation--varimax, equimax, and
quartimax--are most commonly used in practice. Of the three, varimax
rotation is used most frequently (Bieber & Smith, 1986; Norusis, 1988).
In oblique rotation the factors are correlated (Bieber & Smith, 1986;
Gorsuch, 1983; Hair et al., 1987; Kim & Mueller, 1988). When the results
(e.g., factor scores) of factor analysis are to be used in subsequent
statistical analyses (e.g., cluster analysis), an orthogonal rotation is
appropriate because collinearity is eliminated. In contrast, oblique
rotation is appropriate if the objective is to obtain theoretically
meaningful constructs or dimensions.
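What an orthogonal rotation does to a loading matrix can be sketched for the two-factor case. The loadings are hypothetical, and the rotation angle is fixed by hand; this is not varimax itself, which would search for the angle that optimizes its own criterion.

```python
import math


def rotate(loadings, degrees):
    """Orthogonally rotate a two-factor loading matrix by a fixed angle."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def communalities(loadings):
    """Row sums of squared loadings (variance explained per variable)."""
    return [a * a + b * b for a, b in loadings]


def factor_variances(loadings):
    """Column sums of squared loadings (variance explained per factor)."""
    return [sum(row[k] ** 2 for row in loadings) for k in range(2)]


# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [(0.7, 0.5), (0.6, 0.6), (0.7, -0.5), (0.6, -0.6)]
rotated = rotate(unrotated, 45)

print(factor_variances(unrotated))  # variance concentrated in factor 1
print(factor_variances(rotated))    # variance redistributed across factors
print(communalities(unrotated))
print(communalities(rotated))       # unchanged by the rotation
```

In this example the rotation leaves each variable's communality untouched while shifting variance from the first factor to the second, and each variable ends up loading mainly on one factor.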
There is no agreement in the literature regarding the best
rotation method. Bartholomew (1985) indicated that there is no
significant difference between orthogonal and oblique rotation
procedures in terms of factoring results. Stewart (1981) contends that
the basic solutions provided by most rotational programs result in the
same factors; thus, the rotation method should have relatively little
impact on the interpretation of factor analysis results.
A primary step/decision in factor analysis concerns how many
factors should be extracted. Several criteria are typically used to
decide on the number of factors. The most common one is the Kaiser
criterion (Kaiser, 1960), whereby all factors having eigenvalues greater
than one are accepted. This criterion often is used in conjunction with
percentage of variance explained and the scree test (Cattell, 1966).
Other methods, including significance tests associated with the maximum
likelihood and least squares solutions, Horn's (1965b) parallel
analysis, Bartlett's (1950, 1951) chi-square test, Velicer's (1976a)
minimum average partial method, and interpretability of the factors are
also used to determine the number of factors.
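The eigenvalue-greater-than-one rule and the percentage of variance explained can be sketched as follows. The eigenvalues are hypothetical values for a six-variable correlation matrix (they sum to six), and the function names are illustrative only.

```python
def kaiser_count(eigenvalues):
    """Number of factors retained under the eigenvalue-greater-than-one rule."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def cumulative_variance(eigenvalues):
    """Cumulative percentage of total variance explained, largest factor first."""
    total = sum(eigenvalues)
    running = 0.0
    percentages = []
    for ev in sorted(eigenvalues, reverse=True):
        running += ev
        percentages.append(100.0 * running / total)
    return percentages


# Hypothetical eigenvalues of a correlation matrix of six variables.
eigenvalues = [2.5, 1.4, 1.05, 0.5, 0.35, 0.2]

n_factors = kaiser_count(eigenvalues)
explained = cumulative_variance(eigenvalues)
print(n_factors)
print(explained[n_factors - 1])
```

In this example the third eigenvalue only barely exceeds one, which illustrates why the rule is usually combined with other criteria rather than applied alone.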
Although each criterion has its supporters, Zwick and Velicer
(1986) contend that which criterion is most appropriate depends on a
number of different factors--sample size, number of variables, component
saturation (scale of factor loading), component identification, and
special variables (variables having a nonzero loading on more than one
component). Based on their research, they
concluded that parallel analysis and the minimum average partial method
are generally the best across situations. However, a review of factor
analysis studies showed that the majority used combined criteria, such
as eigenvalue greater than one, percentage of variance explained, and
the scree test (Allen, 1982; Beard & Ragheb, 1983; Connelly, 1987;
Hollender, 1977; Lounsbury & Hoopes, 1988; Tinsley & Kass, 1979; Wahlers
& Etzel, 1985).
Once the number of factors is decided, the next step in factor
analysis is to interpret the factor solution. The most common
interpretation approach involves analyzing the size and pattern of
factor loadings. Factor loadings are key in understanding the nature of
factors. A factor loading indicates the relationship between a variable
and a factor. The higher the factor loading, the stronger the
relationship. Hair et al. (1987) suggested that factor loadings greater
than ±0.30 are significant, those greater than ±0.40 are more
important, and loadings greater than or equal to ±0.50 are very
significant. Their suggestions can be viewed as a rule of thumb. In
addition, Gorsuch (1980) indicated there are more exacting but
computationally more difficult ways of determining the significant
loadings including: Archer and Jennrich's (1973) formulas, Jöreskog's
(1978) confirmatory maximum likelihood factor analysis, and Lindell and
St. Clair's (1980) jackknife approach.
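The loading rule of thumb described above can be written as a small classifier. The cutoffs are passed as a parameter; the default lower cutoff of 0.30 is an assumption about the rule as it is usually stated, and the attribute names and loadings are hypothetical.

```python
def classify_loading(loading, cutoffs=(0.30, 0.40, 0.50)):
    """Label a factor loading by absolute size using rule-of-thumb cutoffs."""
    significant, more_important, very_significant = cutoffs
    size = abs(loading)  # the sign shows direction, not strength
    if size >= very_significant:
        return "very significant"
    if size > more_important:
        return "more important"
    if size > significant:
        return "significant"
    return "not significant"


# Hypothetical loadings of four attributes on one factor.
loadings = {
    "clean restrooms": 0.72,
    "shaded sites": -0.45,
    "hiking trails": 0.35,
    "camp store": 0.12,
}
for attribute, value in loadings.items():
    print(attribute, classify_loading(value))
```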
A review of factor analysis studies showed that the factor loading
rule of thumb is used most often. However, researchers contend that
valid interpretation of a factor solution should depend on examination
of high, medium, and low loadings. High loadings indicate variables
which are highly related to a particular factor, whereas low loadings
indicate variables which are not related to a particular factor (Bieber
& Smith, 1986).
The final stage in factor analysis is to calculate factor scores,
which are commonly used as input variables in other statistical analyses
such as cluster analysis, discriminant analysis, and regression
analysis. There are several different methods for estimating factor
scores. According to Tucker (1971), the least squares solution
characterized by Horst (1965) and Bartlett (1937) would yield
appropriate factor score estimates for evaluating group differences on
factors. Thurstone (1935) also suggested that if group membership is to
be predicted from factor scores, the regression estimates method would
be appropriate. Although Velicer (1976b) found that there is little
practical difference among factor score estimates, image scores, and
principal component scores, he suggested using principal component or
rescaled image scores. However, unless the principal components model
is used, factor scores can only be estimated (Kass & Tinsley, 1979;
McDonald & Burr, 1967).
Cluster Analysis
The purpose of cluster analysis is to formulate relatively
homogeneous groupings of individuals/objects based on one or more
similarity criteria. Cluster analysis starts with a similarity measure
of the proximity or closeness between all possible pairs of
individuals/objects. There are four types of similarity measures:
correlation coefficients, distance measures (e.g., Euclidean distance
measure), association coefficients, and probabilistic similarity
coefficients (Aldenderfer & Blashfield, 1984). The last two are
infrequently used. Although it has been demonstrated that using
correlation coefficients as the similarity measure reduces the ratio of
misclassification (Hamer & Cunningham, 1981), correlation coefficients
are relatively insensitive to differences in the magnitude of the
variables and fail to satisfy the triangle inequality (i.e., d(x,y) ≤
d(x,z) + d(y,z), given that x, y, and z are different entities). In
contrast, distance measures provide the actual distance between cases
and satisfy the triangle inequality.
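Both limitations of correlation coefficients can be checked on a small example. The sketch below assumes that a correlation is converted to a dissimilarity as 1 - r, which is one common convention and not the only one. The three profiles are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_dissimilarity(x, y):
    """Assumed conversion of a correlation into a dissimilarity: 1 - r."""
    return 1.0 - pearson_r(x, y)


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Three hypothetical profiles measured on three variables.
x = (1, -1, 0)
y = (1, 1, -2)
z = (2, 0, -2)

# Triangle inequality: d(x,y) <= d(x,z) + d(y,z)
print(corr_dissimilarity(x, y), corr_dissimilarity(x, z) + corr_dissimilarity(y, z))
print(euclidean(x, y), euclidean(x, z) + euclidean(y, z))

# Insensitivity to magnitude: same shape, tenfold difference in level.
low, high = (1, 2, 3), (10, 20, 30)
print(corr_dissimilarity(low, high), euclidean(low, high))
```

For these profiles the correlation-based dissimilarity between x and y exceeds the sum of the other two, whereas the Euclidean distances respect the inequality. The last line shows two profiles that correlate perfectly although they lie far apart in Euclidean terms.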
The literature indicated that distance measures are the most
commonly used measures of similarity (Aldenderfer & Blashfield, 1984;
Bieber & Smith, 1986; Everitt, 1974; Hair et al., 1987). Three types of
distance measures are commonly used: Euclidean distance, Manhattan
distance, and Mahalanobis D². Euclidean distance (assuming that
variables are independent) is most commonly used, even though some
researchers argue that Mahalanobis D² is more versatile in that it can
be used even if the clustering variables are correlated. Euclidean
distance is often criticized as not having the ability to preserve distance
ranking (Everitt, 1974). However, this problem can be solved by
standardizing the data (Aldenderfer & Blashfield, 1984).
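The three distance measures can be compared on a pair of cases. The sketch is restricted to two variables so that the covariance matrix can be inverted by hand; the cases and the covariance matrices are hypothetical.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def mahalanobis_d2(x, y, cov):
    """Mahalanobis D-squared for two variables, given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - y[0], x[1] - y[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


# Two hypothetical cases measured on two variables.
p, q = (1.0, 2.0), (4.0, 6.0)

identity = ((1.0, 0.0), (0.0, 1.0))    # uncorrelated, unit variances
correlated = ((1.0, 0.8), (0.8, 1.0))  # strongly correlated variables

print(euclidean(p, q))
print(manhattan(p, q))
print(mahalanobis_d2(p, q, identity))    # equals squared Euclidean distance
print(mahalanobis_d2(p, q, correlated))  # discounts the shared variation
```

With uncorrelated unit-variance variables D² reduces to the squared Euclidean distance; when the variables are correlated, a difference along their shared direction counts for less.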
Which clustering algorithm to use is obviously an important
clustering decision. Most researchers prefer to use hierarchical rather
than nonhierarchical clustering algorithms because nonhierarchical
clustering algorithms start with the selection of an appropriate
starting partition/seed point which is relatively subjective
(Blashfield, 1978).
The five popular hierarchical methods--single linkage (minimum
distance), complete linkage (maximum distance), average linkage (average
distance), Ward's method (minimum variance), and the centroid method
(distance between means)--differ in terms of how the distance between
clusters is calculated. However, results of a number of studies
indicated that Ward's method consistently outperforms the other methods
in terms of the accuracy of the cluster solution (Bayne et al., 1980;
Blashfield, 1976; Edelbrock, 1979; Edelbrock & McLaughlin, 1980; Mojena,
1977).
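How four of these methods define the distance between two clusters can be sketched directly from the definitions given above (Ward's method is based on the error sum of squares and is therefore omitted here). Euclidean distance between cases is assumed, and the two clusters are hypothetical.

```python
import math
from itertools import product


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage(a, b):
    """Minimum distance between any member of a and any member of b."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Maximum distance between any member of a and any member of b."""
    return max(dist(p, q) for p, q in product(a, b))


def average_linkage(a, b):
    """Average of all between-cluster pairwise distances."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid(points):
    return tuple(sum(values) / len(points) for values in zip(*points))


def centroid_distance(a, b):
    """Distance between the cluster means."""
    return dist(centroid(a), centroid(b))


# Two hypothetical clusters of two cases each.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

for name, method in [("single", single_linkage), ("complete", complete_linkage),
                     ("average", average_linkage), ("centroid", centroid_distance)]:
    print(name, method(cluster_a, cluster_b))
```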
Ward's (1963) method is used to optimize the minimum variance
within clusters. In Ward's procedure, the distance between two clusters
is the sum of squares between the two clusters summed over all
variables. At each step in the clustering process, the union of every
possible pair of clusters is considered. The two clusters whose fusion
results in the minimum increase in the error sum of squares become a new
cluster (Aldenderfer & Blashfield, 1984; Everitt, 1974; Hair et al.,
1987; Norusis, 1988).
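A single step of Ward's procedure, as described above, can be sketched as follows. The clusters are hypothetical, and the code covers only the choice of the next pair to fuse, not a complete hierarchical clustering routine.

```python
from itertools import combinations


def centroid(points):
    return tuple(sum(values) / len(points) for values in zip(*points))


def ess(points):
    """Error sum of squares: squared deviations from the centroid, all variables."""
    c = centroid(points)
    return sum((v - m) ** 2 for p in points for v, m in zip(p, c))


def merge_cost(a, b):
    """Increase in the error sum of squares if clusters a and b were fused."""
    return ess(a + b) - ess(a) - ess(b)


def best_merge(clusters):
    """Indices of the pair of clusters whose fusion increases ESS the least."""
    pairs = combinations(range(len(clusters)), 2)
    return min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))


# Three hypothetical clusters at some stage of the clustering process.
clusters = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(1.0, 1.0)],
    [(10.0, 10.0)],
]
i, j = best_merge(clusters)
print(i, j, merge_cost(clusters[i], clusters[j]))
```

Because every possible pair is evaluated at each step, the distant single-case cluster in this example is left alone until no cheaper fusion remains.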
Although many researchers recommend Ward's method, it has two
problems/limitations. First, it is sensitive to outliers. Also, there
is no function for reallocating entities that might have been poorly
classified at early clustering stages (Everitt, 1974). Some researchers
have suggested that the outlier problem can be eliminated by using both
the hierarchical clustering method and the iterative partitioning method
(Milligan, 1980; Punj & Stewart, 1983).
A critical step in cluster analysis is deciding on a clustering
solution--the number of clusters to form. There are a number of
procedures for determining the number of clusters (Aldenderfer &
Blashfield, 1988; Dubes & Jain, 1979; Everitt, 1974; Milligan & Cooper,
1985). In many studies, the decision has been based on an examination
of different levels of the fusion dendrogram or a similar scree test. A
similar scree test involves plotting the fusion coefficients (the
numerical values at which various cases merge to form a cluster) against
the number of clusters. Sudden jumps or breaks in the scree plot
indicate that two relatively dissimilar clusters have been merged. The
solutions (number of clusters) prior to these mergers are likely
candidate solutions (Thorndike, 1953). Both the fusion dendrogram and
the similar scree test approaches are subjective.
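The logic of reading jumps in the fusion coefficients can be sketched as follows. Picking the single largest jump is a simplification of the visual inspection described above, and the fusion coefficients are hypothetical.

```python
def candidate_solution(fusion):
    """Return (k, jump) for the solution just before the largest jump.

    `fusion` maps a number of clusters k to the fusion coefficient at
    which the k-cluster solution was formed.
    """
    ks = sorted(fusion, reverse=True)  # from many clusters to few
    best_k, best_jump = None, float("-inf")
    for k_before, k_after in zip(ks, ks[1:]):
        jump = fusion[k_after] - fusion[k_before]
        if jump > best_jump:
            best_k, best_jump = k_before, jump
    return best_k, best_jump


# Hypothetical fusion coefficients for the last six solutions.
fusion = {6: 1.0, 5: 1.3, 4: 1.7, 3: 2.2, 2: 6.5, 1: 7.4}

k, jump = candidate_solution(fusion)
print(k, jump)
```

In this example the coefficient rises sharply when three clusters are reduced to two, so the three-cluster solution is the candidate. In practice several breaks may be visible, which is one reason the approach remains subjective.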
Other less subjective approaches for deciding on cluster solutions
have also been discussed (Everitt, 1979; Milligan & Cooper, 1985). For
example, Marriott (1971) suggested that a possible criterion for
selecting the number of groups/clusters is to take that value of k for
which k²|W| is a minimum, where k is the number of clusters and |W| is
the determinant of the pooled within-group variance-covariance matrix.
Beale (1969) proposed using an F-ratio to test the hypothesis of the
existence of K2 versus K1 clusters in the data (K2 > K1). Wolfe (1970)
proposed a likelihood ratio criterion to test the hypothesis of k
clusters against k-1 clusters.
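Marriott's criterion can be sketched for two clustering variables. In this sketch W is taken as the pooled within-group sum-of-squares-and-cross-products matrix, which is an assumption about scaling; the data and the candidate partitions are hypothetical.

```python
def scatter(points):
    """Sums of squares and cross-products about the group centroid (2 variables)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    return sxx, sxy, syy


def marriott(points, labels):
    """k squared times the determinant of the pooled within-group matrix W."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    wxx = wxy = wyy = 0.0
    for members in groups.values():
        sxx, sxy, syy = scatter(members)
        wxx += sxx
        wxy += sxy
        wyy += syy
    k = len(groups)
    return k * k * (wxx * wyy - wxy * wxy)


# Twelve hypothetical cases forming three compact groups.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 0), (11, 0), (10, 1), (11, 1),
          (5, 10), (6, 10), (5, 11), (6, 11)]
# Hypothetical partitions into one to four clusters.
partitions = {
    1: [0] * 12,
    2: [0] * 8 + [1] * 4,
    3: [0] * 4 + [1] * 4 + [2] * 4,
    4: [0] * 4 + [1] * 4 + [2, 2, 3, 3],
}

scores = {k: marriott(points, labels) for k, labels in partitions.items()}
best_k = min(scores, key=scores.get)
print(scores)
print(best_k)
```

The k² factor penalizes additional clusters, so splitting one of the compact groups in the four-cluster partition raises the criterion even though the determinant itself falls.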
Despite the numerous criteria that have been proposed, Everitt
(1979) believes that no one completely satisfactory solution is
available. The best way to decide on the number of clusters seems to be
to utilize a combination of the decision criteria along with the
interpretability of results (Bieber & Smith, 1986; Everitt, 1979;
Gnanadesikan & Wilk, 1969). Other criteria, such as identifiability,
substantiality, variation in responses, and exploitability, are also
important in deciding a final cluster solution, especially if the
purpose is market segmentation (Kikuchi, 1986; Kotler, 1984; Stynes,
1983).
Comparisons of Factor Analysis and Cluster Analysis
There still is some confusion regarding the differences between
factor analysis and cluster analysis. This frequently results in
inappropriate applications of both methods.
The major distinction between factor analysis and cluster analysis
is that the former detects relationships between variables and thereby
reconstructs original variables into fewer dimensions, whereas the
latter is concerned with the classification of individuals/objects.
Neither method alone may be sufficient if researchers are trying to
reduce a large set of data and to classify individuals into groups (on
the basis of the reduced data). In this situation, the use of factor
analysis in conjunction with cluster analysis is often suggested
(Anderberg, 1973; Everitt, 1979; Gorsuch, 1983; Green et al., 1967;
Mark, 1980; Punj & Stewart, 1983; Rohlf, 1970; Skinner, 1979; Smith,
1989).
Literature Supporting the Combined Use of
Factor Analysis and Cluster Analysis
A number of researchers have determined that factor analysis is
helpful in identifying meaningful dimensions/factors on which to cluster
individuals/objects. Mark (1980) suggested using principal component
analysis as a preparatory step to cluster analysis to identify
neighborhoods for preservation and renewal. Swinyard and Struman (1986)
found that clustering consumers after a factor analysis, thereby
reducing various measures to fewer factors, resulted in
(restaurant/dining) clusters/segments that were easier to describe and
act on. Smith (1989) preferred the combined factor-cluster analysis
approach over the "a priori" method because it results in more
homogeneous clusters. Gorsuch (1983) indicated that factoring before
cluster analysis helps clarify the basis on which individuals are
grouped, and provides empirical methods of producing typologies. Wind
(1978) suggested performing a principal component analysis as a way to
obtain a more reliable and meaningful factor structure before
clustering.
Combined factor and cluster analysis can be used to solve the
problem of independency of variables and to deal with the implicit weighting
problem in clustering procedures (Green et al., 1967; Punj & Stewart,
1983). In addition, the combined approach can be used to identify a
"best" set of dimensions for depicting the relationships among
individuals (Skinner, 1979).
Punj and Stewart (1983) contend that when a researcher desires
that all dimensions or attributes be given equal weight in the
clustering process, it is necessary to correct for interdependencies.
They suggested two approaches to correct for interdependencies: (a)
using Mahalanobis D² or (b) completing a preliminary principal component
analysis with orthogonal rotation. Component (factor) scores can then
be used as input variables for computing the similarity measure in the
clustering process.
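The implicit weighting that arises from interdependent variables can be seen in a small example. The ratings are hypothetical, and the reduced scores are written by hand as a stand-in for orthogonal component scores; no principal component analysis is actually computed here.

```python
def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


# Hypothetical ratings on three variables. The first two variables measure
# the same underlying dimension; the third measures a different one.
raw = {
    "A": (0.0, 0.0, 0.0),
    "B": (1.0, 1.0, 0.0),  # differs from A by 1.0 on the duplicated dimension
    "C": (0.0, 0.0, 1.2),  # differs from A by 1.2 on the other dimension
}

# One score per underlying dimension (hand-written, not estimated).
scores = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (0.0, 1.2),
}

print(squared_euclidean(raw["A"], raw["B"]), squared_euclidean(raw["A"], raw["C"]))
print(squared_euclidean(scores["A"], scores["B"]), squared_euclidean(scores["A"], scores["C"]))
```

On the raw variables B appears farther from A than C does, only because one dimension is counted twice. With one score per dimension the ordering reverses, which is the sense in which the correction gives each dimension equal weight.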
Studies on the Combined Use of Factor
Analysis and Cluster Analysis
As previously stated, combined factor-clustering analysis has been
utilized by researchers in many fields, such as marketing, recreation,
tourism, psychology, medical science, and sociology. This section
contains a review of a number of studies that used factor scores as a
basis for clustering, with special attention to the factoring method,
criteria for selecting the number of factors, the clustering method, and
the criteria for selecting the number of clusters. Table 1 summarizes
22 of the 32 studies which were reviewed.
[Table 1. A summary of studies in which combined factor analysis and
cluster analysis was employed. Columns: Author(s); Nature of Data;
Factoring Method and Rotation; Criteria for Selecting the Factor
Solution; Clustering Method; Criteria for Selecting the Number of
Clusters; Discussion of the Interaction of Factor Analysis and Cluster
Analysis; Results.]
Day and Heeler (1971) used a randomized block experiment with five
strata composed of three stores to test the sales effect of three
price-level changes in a new food product. Principal component analysis
was first performed on 12 store attributes (e.g., selling area of store,
average household income). Five mutually independent factors were
identified, which accounted for 12% of the total variance. Factor
scores were then calculated to obtain two different similarity measures:
modified matching coefficient and Euclidean distance. Both similarity
measures were used as the basis for hierarchical and nonhierarchical
clustering processes to test the homogeneity and representativeness of
strata. Although the factor-cluster approach was used in this study,
only one criterion, percentage of variance explained, was used to decide
on the number of factors. The authors did not indicate any concern
regarding the impact of the factor analysis on the clustering results.
Helge (1978) analyzed data on profiles of 113 occupation groups,
using three different clustering procedures: (a) hierarchical grouping
of standard scores, (b) hierarchical grouping of orthogonal factor
scores, and (c) NORMIX analysis assuming equal covariance matrices for
each group. Ward's method and Euclidean distance were used in all three
cluster analyses. The hierarchical grouping of standard scores resulted
in 13 groups, which were used as the basis of comparison with the
results of the other two methods. The results showed that the NORMIX,
in which the distance measures were calculated based on component
(factor) scores, produced a solution having the most intuitive
psychological sense. The results also showed that the hierarchical
grouping of orthogonal factor scores provided clustering results nearly
as good as NORMIX, whereas the hierarchical grouping of standard scores
was the worst of the three approaches in terms of cluster homogeneity.
The author did not discuss the impact of alternative factor solutions on
the clustering results.
Green et al. (1967) proposed a factor-cluster approach that not
only included a data-condensation function but also changed the implicit
weighting of characteristics. Principal component analysis was
performed on the data matrix first; then objects were clustered, based
on principal component scores. They employed this technique to classify
88 cities for the purpose of selecting test markets. Two factors were
derived from 14 variables (e.g., population, retail sales, and
television coverage), and three clusters were formed. The authors did
not provide information on the criteria to decide on the number of
clusters, nor did they discuss the potential effect of the factor
analyses on the clustering results.
Skinner (1979) presented a hybrid approach to integrate the
dimensional and discrete clusters approaches to classification research.
Two major steps are involved in this approach. First, a parsimonious
set of dimensions is identified by performing a preliminary principal
component analysis with orthogonal rotation, and evaluated by
replication across samples. Second, relatively homogeneous subgroups
are identified (using a clustering or density search algorithm), based
on factor scores derived from the first step. This hybrid approach
helped Skinner successfully cluster male delinquent adolescents, who had
completed the Basic Personality Inventory (i.e., an 11-scale structured
inventory of psychology), into three modal profiles (groups). These
three groups are similar to what most clinical psychologists would
describe. The criteria used to decide on the number of clusters and the
potential impact of alternative factor solutions on the clustering
results were not discussed.
To develop taxonomies of search behavior by new car buyers, Kiel
and Layton (1981) used factor analysis to reduce 12 different search
variables (e.g., search time, trips made) to four initial factors. The
factors were then rotated by oblique rotation, and the four factors were
retained. Factor scores were calculated and used to derive an aggregate
search index. A K-means clustering algorithm was used to group buyers,
based on the index number. The authors provided no information on the
criteria they used to decide on the number of clusters, nor did they
discuss the rotational effect of the factor solution on the clustering
results.
Stanley, Powell, and Danko (1987) factor analyzed ratings of the
desirability of 22 "upscale" financial service offerings (e.g.,
investment management and advice, immediate access to credit), and
developed seven "upscale" financial service factors. Scores for those
seven factors were used to categorize financial service customers (using
Ward's clustering method) into four clusters/segments. The authors did
not report on the factoring method or the criteria for selecting a
factor solution. Nor did they discuss the potential impacts of the
factor analyses on cluster/segment membership.
To differentiate small geographic areas in Rhode Island on the
basis of well-established sociological constructs, Humphrey, Buechner,
and Velicer (1987) proposed using combined factor-cluster analysis.
Principal component analysis with varimax rotation was performed to
reduce 60 original variables (e.g., families with income below poverty
level in 1979, females in labor force) to four factors. To demonstrate
the clustering procedure, the authors used only two factors (wealth and
education factors). Ward's method (using squared Euclidean distance) was
performed on factor scores. Fifteen socioeconomic status clusters
emerged. The potential impacts of alternative factor solutions on the
clustering results were not discussed.
To understand social differentiation in modern industrial society,
Jones (1968) used combined factor-cluster analysis. Principal component
analyses were performed on three domains: socioeconomic status (24
variables), household composition (24 variables), and ethnic composition
(22 variables). Three factors emerged for each domain. Factor scores
for each domain were computed to test the independence of the three
dimensions. Another principal component analysis was performed, based
on 24 variables (eight variables were selected from each dimension).
Two constructs/factors were identified (socioeconomic status/ethnicity
and household composition). Factor scores for these two factors were
used as the basis for clustering. Twenty groups were identified using
the centroid clustering method (with the squared Euclidean distance
measure). Again, the author did not discuss criteria for selecting the
number of clusters or the possible effect of the factor
analysis/solutions on the clustering results.
To study the strategic positioning of product (car) range by
manufacturers, Meade (1987) employed factor analysis to condense the
information contained in 10 observable (e.g., engine capacity, maximum
speed) variables to fewer factors. Three factor analyses were
performed, which resulted in three-factor, two-factor, and
single-factor solutions. The three-factor solution was used only to
evaluate pricing policy; no cluster analysis was performed. The
two-factor solution was used as the basis for clustering; 10 car
segments emerged. The one-factor solution was used to provide the
measure for cluster analysis; three groups/segments (small, medium, and
large) were formulated. Meade indicated that the combined use of factor
analysis and cluster analysis allowed the researcher to superimpose some
structure on the ranges of products offered in the market. However, the
criteria for deciding on the number of factors or clusters, the
factoring method, the clustering method, and the possible effect of
factor analysis on the clustering results were not discussed.
Day et al. (1988) used combined factor and cluster analysis to
segment the global market for industrial goods based on economic
indicators. Two different factor analyses were performed. The first
factor analysis was conducted on 18 economic indicators; three factors
emerged. In the second factor analysis, two of the original 18 economic
indicators were dropped because they did not have any strong affiliation
with any of the three factors. Three factors emerged from the second
factor analysis on the 16 remaining variables. Factor scores were
computed for the factors from both factor analyses. A K-means clustering
algorithm was used to group countries. Cluster analyses on the factor
scores from both the first and second factor analyses resulted in two
six-cluster solutions. Comparison of the two solutions indicated that
countries were grouped similarly in both analyses. The authors failed
to discuss the criteria used to decide on the number of
factors and clusters. However, they examined the clustering results
between two different factor solutions (as the bases for clustering).
Sorce, Tyler, and Loomis (1989) employed factor analysis and
cluster analysis to segment older Americans based on lifestyle
variables. Eight lifestyle dimensions, each containing four to six
statements, were submitted to a principal component analysis with
varimax rotation. Five factors emerged, which accounted for 31% of the
variance. A complete linkage clustering method (using the squared
Euclidean distance measure) was used to group the older Americans based
on factor scores; eight clusters/segments emerged. The authors did not
provide information on the criteria they used to decide the cluster
solution, nor did they discuss the potential effects of factor analysis
on the clustering results.
In Gartner's study (1990), combined factor and cluster analysis
was employed to explore the underlying meanings of entrepreneurship.
Ninety different attributes were identified from various definitions of
entrepreneurship. Factor analysis was employed to reduce the 90
variables to eight dimensions (factors). Two different clustering
methods--hierarchical clustering and K-means clustering--were then
used to discover whether participants (academic researchers in
entrepreneurship, business leaders, and politicians) in a Delphi study
could be grouped together based on their rating (not factor scores) of
the eight entrepreneurship factors. Two groups/clusters emerged from
both cluster analyses. The memberships of the clusters derived from the two
clustering methods were compared. The criteria used to decide the
number of clusters and the potential impact of alternative factor
solutions on the clustering results were not discussed.
Bishara (1984) used combined factor and cluster analysis to
investigate whether the size of companies, their organizational
structure, or the availability and stability of funds, most influenced
the dividend decisions of life insurance companies. Factor analysis
with varimax rotation was performed on 63 original variables (e.g.,
policy loans, income before taxes, ratio of policy loans to total
assets); seven factors emerged based on the criterion of percentage of
variance explained. Factor scores were computed and submitted to a
cluster analysis (Ward's method) for each of the four years selected
(1965, 1970, 1975, and 1979). Two clusters were identified for the four
selected years, with slight changes in cluster membership. Bishara did
not discuss the criteria for choosing the cluster solution or the
possible impacts of factor solutions on the clustering results.
Gau (1978) undertook factor analysis and cluster analysis to
assess the relative levels of default risk inherent in residential
mortgages. Sixty-four variables describing the financial, property, and
borrower characteristics of residential mortgages were reduced to 28
independent factors using principal component analysis and varimax
rotation. Factor scores were then utilized as input in a two-group
discriminant analysis. A stepwise-determined subset of 17 factors was
employed in the formation of discriminant functions that would
differentiate between mortgage defaulters and nondefaulters. After
weighting the factor scores on the basis of their respective
discriminant coefficients, a nonhierarchical clustering algorithm
(iterative partitioning method) was employed to identify a six-cluster
solution. Gau did not discuss the potential impact of alternative
factor solutions on the clustering results.
Krzystofiak, Newman, and Anderson (1979) used factor-cluster
analysis to develop a quantified job analysis system for a power utility
firm. Common factor analysis with varimax rotation was performed on 594
job-related items, and 60 factors emerged. Factor scores then were used
as the basis for job profiling. Jobs were identified at approximately
the same organizational level, and six organizational levels were
identified. Within each of the organizational levels, jobs were grouped
into job clusters based on Ward's clustering (using Mahalanobis
distance). The authors did not provide information on the criteria they
used to decide on either the factor analysis or clustering solution, nor
did they discuss the potential impact of the factor analyses on the
clustering results.
Kim and Lim (1988) concluded that factor analysis and cluster
analysis are useful ways to examine the relationship between task
environment and strategy. Factor analysis with orthogonal rotation was
performed separately on two domains--environmental (e.g., scope of
distribution channel, price change of materials/parts) and strategic
(e.g., new product development, operating efficiency). Based on the
criteria of eigenvalues greater than one and percentage of variance
explained, 13 environmental variables were reduced to five factors, and
the original 15 strategic variables were reduced to four factors.
Ward's method (using the Euclidean distance measure) was performed on
factor scores for both the environmental and strategic domains, and four
clusters were formulated for both domains. Kim and Lim did not discuss
the potential impact of alternative factor solutions on the clustering
results.
Using factor analysis and cluster analysis, Furse, Punj, and
Stewart (1984) replicated and extended previous research on consumer
search patterns. In the first case study (new car buyer study), a
principal component analysis was carried out on 24 items related to
various search activities (e.g., time spent talking to salespersons,
number of different dealers visited). Five factors were extracted and
then rotated using both varimax and oblique rotation methods. The
rotated factors, both varimax and oblique, were similar to the original
factors. The five oblique rotation factors were retained because
oblique rotation reduced moderate factor loadings. Factor scores were
computed and used as the basis for clustering. Ward's hierarchical
clustering method with Euclidean distances then was performed to obtain
five to seven candidate cluster solutions, which served as seed points
in a K-means clustering procedure; six clusters were formulated. In the
second case study (new car dealer salesperson study), the same factoring and
clustering procedures were performed, and three factors and six clusters
were identified. The authors did not discuss the potential impact of
alternative factor solutions on the clustering results.
Hooper (1985) utilized combined factor-cluster analysis to measure
the concept of social identity more comprehensively and precisely than
previous researchers had done. Principal component analysis (with
oblique rotation) was performed on 59 sociological variables (e.g.,
marital status, physical attraction, race). Fifteen factors were
extracted. Factor scores were computed and then weighted by multiplying
a weighted average of the stimuli defining each social identity
according to the importance in the composition of the social-identity
factor. The weighted scores then were submitted to cluster analysis.
Based on the ratio of between-cluster variance to within-cluster
variance and interpretability, 13 clusters were identified. Although
Hooper used the weighted scores as the input to cluster analysis,
neither the weighting scheme, the clustering algorithm, nor the relationship
between factor and cluster solutions was discussed.
Rescorla (1988) employed combined factor-cluster analysis to
explore the major issues of classification regarding autistic children.
A principal component analysis with varimax rotation was performed on 73
items derived from Achenbach's Child Behavior Checklist (e.g., child’s
clinic symptoms--strange behavior, disobedient at home, trouble
sleeping). Based on three criteria--eigenvalues greater than one,
number of variables with loadings above .30, and interpretation--eight
factors emerged. Unweighted factor scores were computed by summing each
child's scores on the symptom items with loading of .30 or above. Each
child's unweighted sums were then converted to T scores. The T scores
then were submitted to K-means clustering analysis (using the Euclidean
distance measure). Cluster runs were made for 2, 3, 4, 5, and 6
clusters. The relation between cluster assignment and diagnostic
grouping was examined. However, the author did not discuss the
potential impact of alternative factor solutions on the clustering
results.
Calantone and Johar (1984) attempted to segment the tourism market
on benefit-seeking choices in different seasons. Factor analysis was
first performed for each season on 20 variables (e.g., familiarity with
the state, scenery, historical attractions). Based on eigenvalues
greater than one and percentage of variance explained, five significant
benefits-sought factors emerged for the spring season. Six significant
factors were identified for the summer, fall, and winter seasons.
Factor scores for the seasonal benefits factors were then used as input
for clustering. Ward's method was used in the clustering for each
season. Based on the ratio of [illegible] to total variance
and interpretation, a five-cluster solution was selected for each
season. Calantone and Johar did not discuss the potential impact of
alternative factor solutions on the clustering results.
Crask (1981) used both factor analysis and cluster analysis to
segment the vacationer market based on lifestyle variables. A principal
component analysis with a varimax rotation was performed on 15 vacation
attribute statements (e.g., scenic beauty of the area, distance from
home, opportunity for fishing and hunting). Based on eigenvalues
greater than one and percentage of variance explained, five factors
emerged, which accounted for 56.9% of the total variance. Factor scores
were computed and submitted to a hierarchical clustering algorithm.
Based on within-group variance criteria, five vacationer segments, which
had distinct vacation interests and socioeconomic profiles, were
identified. Crask did not specify the clustering method, nor did he
discuss the possible effect of the factor solution on the clustering
results.
Perreault et al. (1977) used factor-cluster analysis to explore
aspects of lifestyles with respect to vacation activities. Factor
analysis was carried out on 285 vacation-specific statements, and 28
vacation-specific dimensions (factors) emerged. Factor scores were
computed and used as input data to Ward's method (using the Euclidean
distance measure). Five different vacation segments were identified.
The authors did not provide information on the criteria they used to
decide on either the number of factors or clusters, nor did they discuss
the potential impact of factor solutions on their clustering results.
Kikuchi (1986) used factor-cluster analysis to evaluate two
different approaches for segmenting Michigan's sport fishing market:
attributes sought and preferred species and locations to fish. For each
segmentation approach, factor analysis with varimax rotation was
performed before clustering. Based on four criteria--eigenvalues
greater than one, scree test, variance explained, and interpretability
of factors--five attributes-sought and nine species-location factors
were identified. Factor scores were computed and used as input to the
two-stage clustering process. In the first stage, Ward's method (using
the Euclidean distance measure) was performed to obtain preliminary
cluster solutions based on the criterion of error sum of squares. In
the second stage, these candidate cluster solutions were submitted to a
reallocation clustering algorithm to determine the final cluster
solution. Eight attributes-sought and eight species-location segments
were identified. Kikuchi did not address the potential effects of
alternative factor solutions on the clustering results.
Hawes (1988) attempted to establish lifestyle profiles of elderly
(50+ years old) female travelers by using both factor analysis and "a
priori" cluster analysis. The respondents were categorized into five-
year "a priori" age clusters/segments (five clusters). Factor analysis
with varimax rotation was performed on 38 variables/characteristics (33
AIO statements and 5 demographic variables) for each of the five age
segments. Hawes did not discuss the potential impact of alternative
factor solutions on the clustering results.
Henderson and Stalnaker (1988) also used factor analysis and "a
priori" cluster analysis to ascertain the barriers to recreation
confronting women and to determine the relationship between perceived
barriers and gender-role traits. Factor analysis with varimax rotation
was performed on 55 barrier-related variables (e.g., work schedule, lack
of equipment). Based on eigenvalues greater than one and percentage of
variance explained, ten factors emerged. The authors did not discuss
the potential effect of factor solutions on the clustering results.
Potential Impact of Factor Solutions on Clustering Results
Very few studies have analytically examined (or mentioned) (1) the
differences between clustering solutions based on raw data and factor
scores, or (2) the impact of alternative factoring methods or solutions
on clustering results. The most critical impact of factor analysis on
the clustering results is the change in cluster membership that results
from the different input variables (factor scores rather than raw data)
to the clustering procedures.
Bartko et al. (1971) compared raw data and factor scores as the
basis for clustering and obtained different clustering solutions.
Shutty and DeGood (1987) compared clustering on standardized scores and
clustering on factor scores and concluded that the results derived from
clustering on factor scores might provide a more accurate description of
clusters/segments. Schaninger (1986) compared clustering on raw data
and clustering on standardized data, and concluded that the standardized
data-cluster solution is better than the raw data-cluster solution
because the standardized data solution resulted in clearer and more
meaningful clusters.
Summary
A review of 32 studies shows that most researchers express little
concern about the impact of alternative factor solutions on cluster
membership. Some researchers even failed to specify the factoring
method, the criteria for selecting a factor solution, the clustering
method, or the criteria for deciding a cluster solution.
CHAPTER III
RESEARCH METHODS
This chapter details the methods employed to achieve the study
objectives. It begins with a description of the data on which the
different factor and cluster analyses were performed. This is followed
by a discussion of the different statistical methods employed to achieve
the three objectives.
Source and Description of Data
The 1988 Michigan Campvention Study
Several different data sets were evaluated to determine whether
they were appropriate with respect to the study objectives. The data
obtained from a study of the 1988 National Campers and Hikers
Association (NCHA) Campvention were used in this study. The NCHA is one
of the largest and most active camping organizations in the country,
with more than 25,000 members. Each year the NCHA holds a Campvention.
The 1988 Campvention was held from July 8 to July 14 at Highland State
Recreation Area, located in southeast Michigan. Approximately 4,000
parties from all over the country attended the Campvention.
The Michigan Association of Private Campground Owners (MAPCO) and
State Parks requested that Michigan State University assist them in
conducting a marketing and economic study of the Campvention. There
were three major purposes for the study: (a) developing a profile of
Campvention attendees that could be used to develop and target
camping-related marketing efforts (see Mahoney, Oh, & Ou, 1989); (b) assessing
the economic impact of the Campvention in Michigan; and (c) evaluating a
sales promotion offering $1.00 off per night of camping, designed to increase the
amount of camping in Michigan before and after the Campvention (see Oh,
1990).
Data Collection Methods and Response Rate
Two data-collection methods were employed in the Michigan
Campvention study (for a more detailed discussion of the data collection
methods, refer to Mahoney et al. (1989) and Oh (1990)). A
self-administered pretrip questionnaire and postage-paid return envelope
were mailed eight weeks before the 1988 Michigan Campvention to a
systematic random sample of 1,575 (33%) of the 4,729 members who were
preregistered for the Campvention. One week after the Campvention, the
1,575 persons who had received a pretrip questionnaire were sent a
four-page posttrip questionnaire and a postage-paid return envelope.
Even if no one in a sampled household had completed the pretrip
questionnaire, they were urged to complete the posttrip questionnaire.
The four-page pretrip questionnaire was used to collect a variety
of information, including: (a) Campvention trip plans (i.e., trip
length); (b) likelihood that they would take advantage of the $1.00 off
sales promotion offer; (c) pretrip perceptions of Michigan campgrounds;
(d) their annual volume of camping activity and participation in
off-season (before Memorial Day and after Labor Day) camping; (e) the
importance they assigned to different attributes when selecting
campgrounds; and (f) socioeconomic characteristics--state of residence,
gender, work status, marital status, and whether they had children
living at home.
Information collected on the posttrip questionnaire included: (a)
respondents' evaluation of the Campvention; (b) the number of nights
they camped in Michigan before, during, and after the Campvention; (c)
posttrip perceptions of Michigan campgrounds; (d) likelihood that they
would camp again in Michigan; (e) whether they planned to take advantage
of the sales promotion offer; (f) spending on their Campvention trip;
(g) membership in camping clubs/organizations and subscription to
camping magazines; and (h) additional socioeconomic characteristics,
such as family income and education (for detailed information on the
development, form, and content of the questionnaires see Oh (1990)).
About fifty percent (794) of the 1,575 pretrip questionnaires were
returned; 778 of them were usable. The response rate was somewhat
higher for the posttrip questionnaire. A total of 860 (54.6%) posttrip
questionnaires were returned; 847 were complete enough to be used in the
analysis. A relatively high percentage of the sample (38%) completed
and returned both a pretrip and a posttrip questionnaire. Thirty-two
percent did not complete either of the questionnaires.
A random sample of 100 (19.6%) of the 510 persons/parties who
failed to return either a pretrip or a posttrip questionnaire were
mailed an abbreviated questionnaire in an effort to assess possible
nonresponse bias. Fifty percent of the nonrespondents returned the
"nonresponse bias" questionnaire. The results showed that there was
little difference between respondents and nonrespondents in their
ratings of the Campvention, the Campvention party size, number of nights
on the Campvention trip, likelihood of camping again in Michigan, work
status, marital or family status, and presence of children living at
home. However, as would be expected, nonrespondents were less likely to
have attended the Campvention and less likely to have been aware of or
taken advantage of the sales promotion offer.
Profile of Persons Who Completed Questionnaires
The findings from the Michigan Campvention study are detailed in
Mahoney et al. (1989) and Oh (1990). The majority of persons who
attended the Campvention were retired. Almost all of them (94.6%) were
married. Approximately 29% had children living with them at home. Over
three quarters (77.2%) had family incomes of $20,000 or more.
Twenty-seven percent (27%) had incomes of $40,000 or more. This is
relatively high given that the majority were retired persons. Almost
80% of the parties were from other states and Canada. About a quarter
(22.6%) of the nonresidents traveled from the bordering states of Ohio
(12.4%), Indiana (6.4%), and Illinois (3.8%). Thirteen percent (13.2%)
were from Canada.
They were very active, high-volume campers. About 98% camped every
year, and they averaged 51 nights of camping annually. About 29% camped
60 or more nights a year. A high proportion of their camping nights
(53%, 27 nights) were outside the state where they resided. On
average, they camped in five states in addition to the one where they
lived. Most said that selecting where to camp was a family decision.
Approximately three quarters (74.8%) subscribed to some camping-related
magazine/publication/club other than the NCHA. The majority of these
were members of Good Sam. Sixty-nine percent (69%) attended camping or
outdoor shows.
They were also very active off-season campers. A high percentage
camped before Memorial Day (85.8%) or after Labor Day (93.3%). About
83% camped both before Memorial Day and after Labor Day.
More than half (55.8%) had no preference for either public or
private campgrounds. About a quarter (25.3%) preferred to stay in
private/commercial campgrounds while 18.8% preferred public campgrounds.
Data Used in the Present Study
The factor and cluster analyses were performed on the importance
ratings of different campground attributes/facilities (see pretrip
questionnaire, Appendix A). Respondents were asked to rank the
importance (on a five-point scale, "1" being crucial, and "5" being not
important) of 20 campground attributes/facilities: large sites, shaded
sites, cleanliness, quietness, site privacy, security, hospitality of
campground staff, low price, flush toilets, electricity, showers,
laundromat, campground store, water hookups, sewer hookups, natural
surroundings, situated on a lake/stream, hiking trails, pool, and
playgrounds.
Even though the ratings of the campground attributes are ordinal,
they are still appropriate for factor analysis. Usually, an interval or
ratio scale is expected for calculating correlation coefficients (e.g.,
Pearson product-moment correlation coefficient) in factor analysis,
because factor analysis is based on linear relationships of variables.
However, Gorsuch (1983) indicated that this is not necessary. He pointed
out that when rank (ordinal) data are submitted to a standard computer
program for Pearson product-moment correlations, the results will be
Spearman rank correlation coefficients, which are a special case of the
Pearson product-moment correlation coefficient and are appropriate for
factor analysis.
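Gorsuch's point can be checked numerically. The following is a minimal sketch, assuming untied ranks and using made-up values rather than the Campvention ratings; the function names are illustrative only. It computes the Pearson coefficient on the ranks and compares it with the textbook Spearman formula. With tied ratings, as occur on a five-point scale, average ranks would be needed and the shortcut formula would no longer hold exactly.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks 1..n of the values (this sketch assumes there are no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(x, y):
    """Textbook Spearman formula, valid only when there are no ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical data, for illustration only.
x = [3.1, 1.2, 4.8, 2.2, 5.0]
y = [2.0, 1.0, 3.5, 4.0, 5.5]

rho_formula = spearman_no_ties(x, y)
rho_pearson = pearson(ranks(x), ranks(y))
print(rho_formula, rho_pearson)
```

Both routes give the same coefficient, which is the sense in which the Spearman coefficient is a special case of the Pearson coefficient.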
Only the 424 respondents who rated all 20 attributes were included
in this study because missing values on any attribute would have
affected the calculation of the correlation matrix and thus have
directly affected the parameter estimation (factor loading). However,
because of the sample-size limitations of the cluster program and for
cross-validation purposes, the total sample was divided into two
subsamples, each containing 212 randomly selected cases. T-tests (see
Appendix B) showed that there was no statistically significant
difference in the importance ratings of different campground
attributes/facilities between the two subsamples. Factor analysis was
also performed for each subsample. The results of the factor analyses
for both subsamples were similar (see Appendix C).
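The split-half step can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run for Appendix B: the ratings are synthetic, the seed and function names are hypothetical, and a pooled-variance t statistic is assumed because the form of the t-test is not specified in the text.

```python
import random
from statistics import mean, variance


def split_half(n_cases, seed=0):
    """Randomly divide case indices 0..n_cases-1 into two equal halves."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    half = n_cases // 2
    return indices[:half], indices[half:]


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (an assumed test form)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


# Synthetic ratings for one attribute (1 = crucial, 5 = not important).
rng = random.Random(1)
ratings = [rng.randint(1, 5) for _ in range(424)]

first, second = split_half(len(ratings), seed=0)
t_stat = pooled_t([ratings[i] for i in first], [ratings[i] for i in second])
print(len(first), len(second), round(t_stat, 3))
```

In the study this comparison would be repeated for each of the 20 attributes; a t statistic near zero is consistent with the two subsamples rating an attribute similarly.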
Statistical Methods Used to Achieve the Study Objectives
This section describes the statistical methods which were employed
to achieve the study objectives.
The Effects of Different Factor Solutions on Cluster Membership
Objective 1. To assess the effect of different factor solutions (number
of factors) on cluster membership.
A seven-step procedure was employed to achieve Objective 1.
Step 1: Principal component analyses with varimax rotation were
performed on the ratings of the 20 campground attributes/facilities.
Nineteen different factor analyses were performed. Each analysis
extracted a different number of factors, from 20 factors to 2 factors.
In the "20 factor" factor solution, each variable represents a factor.
Principal component analysis is a method for extracting principal
factors under the component model, which summarizes the data by means of
a linear combination of the observed data. The first extracted factor
maximizes the variance accounted for in the correlation matrix. Each
succeeding factor is extracted to maximize the residual variance
explained (Gorsuch, 1983).
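The extraction described above can be illustrated with an eigendecomposition of the correlation matrix. This is a minimal sketch on synthetic ratings, assuming NumPy; the function name and data are hypothetical, and it is not the statistical package used in the study.

```python
import numpy as np


def principal_components(data, n_factors):
    """Return eigenvalues (largest first) and unrotated component loadings."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # ascending order
    order = np.argsort(eigvals)[::-1]            # reorder, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings: eigenvectors scaled by the square roots of the eigenvalues.
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return eigvals, loadings


# Synthetic five-point ratings: 212 cases by 20 attributes.
rng = np.random.default_rng(0)
data = rng.integers(1, 6, size=(212, 20)).astype(float)

eigvals, loadings = principal_components(data, n_factors=20)
print(eigvals[:3], eigvals.sum())
```

Each eigenvalue is the variance accounted for by one component, so the eigenvalues decrease and sum to the number of variables. With all 20 components retained, the loadings reproduce the correlation matrix exactly.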
A frequent criticism of factor analysis is that the choice of
technique is crucial to the final result. However, this criticism has
not been supported by empirical evidence comparing the several types of
factor analysis (Browne, 1968a; Gorsuch, 1983; Harris & Harris, 1971;
Tucker, Koopman, & Linn, 1969). Stewart (1981) also indicated that when
communalities are high there are virtually no differences among
different factor extracting methods.
There are three primary types of orthogonal factor
rotation--varimax, quartimax, and equimax. Varimax rotation is used to
simplify the columns of the factor matrix. It maximizes the variance of
the squared loadings for each factor. Quartimax rotation is used to
simplify the rows of the factor matrix. Instead of maximizing variance
of squared loadings for each factor, it maximizes the variance of the
squared loadings for each variable so that a variable loads high on one
factor and as low as possible on all other factors. Equimax rotation is
a compromise between the varimax and quartimax criteria (Hair et al.,
1987).
With the varimax rotational approach, there tend to be some high
loadings close to -1 or +1 (indicating a clear association between the
variable and the factor) and some loadings near 0 (indicating a clear
lack of association) in each column of the matrix. Thus, the results of
varimax rotation are easier to interpret than are those of quartimax
rotation, which often produces a general factor with high-to-moderate
loadings on most variables.
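A varimax rotation can be sketched with a commonly used SVD-based iteration. The code below is an assumption-laden illustration, not the routine of the statistical package used in the study: it omits Kaiser normalization, uses a hypothetical random loading matrix, and the function name and tolerances are arbitrary.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward the varimax criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix whose SVD yields the next orthogonal rotation.
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):   # no further improvement
            break
        criterion = s.sum()
    return loadings @ rotation, rotation


# Hypothetical unrotated loadings: 20 variables on 4 factors.
rng = np.random.default_rng(0)
unrotated = rng.normal(scale=0.4, size=(20, 4))

rotated, rotation = varimax(unrotated)
print(np.round(rotated[:3], 2))
```

Because the rotation matrix is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across factors changes.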
Step 2: Factor scores from the "20 factor" factor analysis were
used as input variables for cluster analyses. Factor scores were
obtained by multiplying the raw variables (ratings of attributes) by the
factor score coefficients. They were treated as independent variables
and received equal weight in the clustering procedures.
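The score computation in Step 2 can be sketched as a matrix product. Two assumptions are made here that the text does not confirm: the ratings are standardized before multiplication, and the coefficients are regression-type coefficients (the inverse correlation matrix times the loadings). The data are synthetic, unrotated loadings are used, and five components are kept purely for illustration.

```python
import numpy as np

# Synthetic five-point ratings: 212 cases by 20 attributes.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(212, 20)).astype(float)

# Standardize each attribute (an assumption; see the note above).
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0, ddof=1)

# Unrotated principal component loadings for the five largest components.
corr = np.corrcoef(ratings, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
keep = np.argsort(eigvals)[::-1][:5]
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])

# Regression-type factor score coefficients: solve corr @ B = loadings.
coefficients = np.linalg.solve(corr, loadings)

# One row of factor scores per respondent.
scores = z @ coefficients
print(scores.shape)
```

Under these assumptions the component scores have zero mean, unit variance, and zero intercorrelation, which fits their treatment as independent, equally weighted clustering variables.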
Step 3: The squared Euclidean distance measure and Ward's method
were used to cluster respondents based on factor scores.
Squared Euclidean distance is defined as the square of the
distance between two cases. It is generally used along with Ward's
method (Norusis, 1988; Saunders, 1985). Ward's method involves a series
of clustering steps that begins with N clusters, each containing one
case, and ends with one cluster containing all cases. At the first
stage, each case is in its own cluster and the error sum of squares
(within-groups sum of squares) is 0. In the following stages, the two
clusters whose merger produces the smallest increase in the error sum of squares are
merged. This clustering procedure results in a series of fusion
coefficients (coefficient of hierarchy). Small increases in the
coefficients indicate that fairly homogeneous clusters are being merged.
Larger increases of coefficients indicate that clusters containing quite
dissimilar members are being combined.
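The merging rule can be made concrete with a deliberately naive sketch of Ward's method that records the cumulative error sum of squares after each merge. It assumes NumPy and synthetic points, and the function name is hypothetical; for real data a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be used instead.

```python
import numpy as np


def ward_fusion_coefficients(points):
    """Cumulative error sum of squares after each Ward merge (naive search)."""
    clusters = [(1, p.astype(float)) for p in points]   # (size, centroid)
    ess = 0.0
    coefficients = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (ni, ci), (nj, cj) = clusters[i], clusters[j]
                # Increase in the error sum of squares if i and j merge.
                increase = ni * nj / (ni + nj) * np.sum((ci - cj) ** 2)
                if best is None or increase < best[0]:
                    best = (increase, i, j)
        increase, i, j = best
        (ni, ci), (nj, cj) = clusters[i], clusters[j]
        merged = (ni + nj, (ni * ci + nj * cj) / (ni + nj))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        ess += increase
        coefficients.append(ess)
    return coefficients


# Synthetic "factor scores": 30 cases on 3 factors.
rng = np.random.default_rng(0)
points = rng.normal(size=(30, 3))

coefficients = ward_fusion_coefficients(points)
total_ss = np.sum((points - points.mean(axis=0)) ** 2)
print(round(coefficients[-1], 6), round(total_ss, 6))
```

The coefficient starts near 0 and ends at the total sum of squares about the grand mean, which is the value reached when all cases are in a single cluster.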
Step 4: The next step was to select a final cluster solution
(number of clusters) for the clustering based on the "20 factor" factor
solution. The selection criteria were: (a) error sum of squares
(coefficient of hierarchy), (b) significance of the inter-cluster
differences, and (c) size of clusters.
The coefficient of hierarchy for each clustering stage was
plotted, beginning at the 25 cluster solution (see Figure 1 for
illustration). The plot was examined to identify break points. A break
point indicates a relatively large loss of information resulting from
the fusion (of the clusters) at that point/level. Cluster solution(s)
immediately preceding a break point(s) are candidates for a final
cluster solution. ’
The three candidate solutions were then examined for significance
of the inter-cluster differences. The factor score centroids for each
cluster (for each of the three candidate solutions) were compared using
analysis of variance to determine differences between the clusters. The
assumptions of ANOVA, such as independence, normality, and homogeneity
of variances, were tested using the Bartlett-Box F test. The tests
indicated that the ANOVA assumptions were not violated. The six-cluster
solution had the greatest significance of the inter-cluster differences
and was selected as the final cluster solution.
Figure 1. Illustration of a plot of the coefficient of hierarchy by
number of clusters.
Step 5: In order to compare the effects of alternative factor
solutions on cluster membership, Ward's method (using the squared
Euclidean distance) was used to formulate six clusters for each of the
other 18 factor analyses (19, 18, ..., 2).
Step 6: Changes in cluster membership across the different factor
solutions (20, 19, ..., 2) were assessed by calculating and plotting
information/entropy measures derived from crosstabulations of clusters.
Table 2 illustrates how cluster memberships were crosstabulated.
It compares membership of clustering based on the "20 factor" factor
solution with clustering based on the "19 factor" factor solution and
clustering based on the "20 factor" factor solution with clustering
based on the "18 factor" factor solution.
Information theory is derived from probability theory. It is
concerned with how events/symbols are affected by various processes
(Jones, 1979). Jones defined the self-information $I(E_k)$ of the event
$E_k$ as the negative logarithm of the event's probability $p_k$. The
mathematical expression is $I(E_k) = -\log p_k$. The smaller $p_k$ is,
the larger $I(E_k)$ is. This means that the rarer an event is, the more
information is conveyed by its occurrence. For example, in Table 2 (page
49), the probability of cases being assigned to cluster 1 in the
20-factor solution is 44 (number of cases in cluster 1) divided by 212
(the total sample size); $p_1$ is 0.208. Therefore,
$I(E_1) = -\log 0.208 = 0.682$ is the measure of information in
assigning cases to cluster 1.
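The worked figure above can be checked with a short Python sketch. The function name `self_information` is hypothetical, and the use of base-10 logarithms is an inference from the reported value of 0.682 rather than something the text states.

```python
import math

def self_information(p, base=10):
    """Self-information I(E) = -log(p) of an event with probability p."""
    return -math.log(p, base)

cluster_size, sample_size = 44, 212
# The text rounds the probability to three decimals before taking the log.
p_cluster_1 = round(cluster_size / sample_size, 3)
info_cluster_1 = self_information(p_cluster_1)
print(p_cluster_1, round(info_cluster_1, 3))  # 0.208 0.682
```

A rarer event gives a larger value: for the 16 cases of cluster 6, `self_information(16 / 212)` is larger than the value for cluster 1.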
Table 2. Illustration of the crosstabulations of clusters
across different factor solutions.
20-Factor Solution                19-Factor Solution
                                       Cluster
Cluster/Size^a         1      2      3      4      5      6
                                     (percent)^b
1 (44)              68.2   11.4    4.5    4.5   11.4    0.0
2 (46)               6.5   45.7    6.5   28.3   13.0    0.0
3 (29)              31.0    0.0   17.2   41.4   10.3    0.0
4 (32)               0.0    9.4   46.9   25.0   12.5    6.3
5 (45)               0.0    2.2    6.7   48.9   37.8    4.4
6 (16)              18.8    6.3    0.0   12.5    0.0   62.5
                                  18-Factor Solution
                                       Cluster
                       1      2      3      4      5      6
                                     (percent)^b
1 (44)              40.9   13.6    0.0   36.4    4.5    4.5
2 (46)              30.4   26.1   10.9    6.5    2.2   23.9
3 (29)              34.5   10.3    0.0   31.0   20.7    3.4
4 (32)              50.0    3.1    6.3    3.1   12.5   25.0
5 (45)               8.9   48.9    4.4   24.4   11.1    2.2
6 (16)               6.3    0.0   81.3    0.0   12.5    0.0
^a Cases in cluster 1 derived from the 20-factor solution.
^b Percent of cases assigned to the same cluster number in
both factor solutions (e.g., 20-19, 20-18).
Information can be seen as the measure of uncertainty. As Donderi
(1988) pointed out, information quantifies the effect of choice on
uncertainty measured over a finite set of objects. In other words,
information is a measure of what you have gained by your choice.
Therefore, information gained is uncertainty reduced. For example,
assume that a person planning a vacation originally has 8 possible
destinations to choose among. After some initial consideration the list
of possible destinations is reduced to four. Choosing four destinations
reduces the set size from the original eight possible destinations,
which required three binary choices (bits) to select a single
destination, to a subset of four destinations, which requires only two
bits to select a single destination. Narrowing the original eight
possible destinations to four results in a gain of one bit of
information, which means that the uncertainty has been reduced.
The concept of entropy introduced by Shannon (1948a, 1948b) is
fundamental in information theory. Entropy can be interpreted either as
a measure of how unexpected the event was, or as a measure of the
information (uncertainty) yielded by the event (Aczél & Daróczy, 1975).
Shannon (1948a, 1948b) defined entropy (H) as the negative of the sum,
over all events, of each event's probability ($p_i$) multiplied by the
logarithm of that probability ($\log p_i$). Jones (1979) integrated
information theory and the concept of entropy. He defined the entropy
of a system, $H(S)$, as the average of the self-information:
$$H(S) = E(I) = -\sum_{i=1}^{n} p_i \log p_i \qquad (1)$$
Entropy is either positive or zero because $p_i$ ranges from 0 to 1.
When $p_i$ is 0, the value 0 is assigned to $p_i \log p_i$. When
$H(S) = 0$, there is complete certainty about which event will occur.
In addition, entropy $H(S)$ cannot exceed the maximum entropy
$H(S)_{\max}$ (Jones, 1979; Krippendorff, 1986). The maximum value
of $H(S)$ is attained when the probabilities of events in system S are
all equal.
$$0 \le H(S) \le H(S)_{\max} = \log\bigl(\min(N_S, n)\bigr)$$
where:
$N_S$ : the number of events in system S.
$n$ : the sample size.
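A small Python sketch makes the entropy definition and its bounds concrete. It reuses the vacation example of eight and four equally likely destinations; the function name `entropy` is hypothetical, and base-2 logarithms are assumed so that the result is expressed in bits.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)), with 0 * log(0) taken as 0."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

eight_destinations = [1 / 8] * 8   # equal probabilities give the maximum, log2(8)
four_destinations = [1 / 4] * 4
bits_gained = entropy(eight_destinations) - entropy(four_destinations)
certain_outcome = entropy([1.0, 0.0])  # one event is certain, so H = 0

print(entropy(eight_destinations), entropy(four_destinations), bits_gained)
```

With equal probabilities the entropy reaches its maximum, log2 of the number of events, and any unequal distribution over the same events gives a smaller value.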
Entropy as the measure of uncertainty has been applied to
different fields, such as biological science, behavioral science,
economics, geography, marketing, management, finance, and accounting.
For example, Attaran and Guseman (1988) used entropy as a measure of the
level of economic activity within the service sector of the United
States to assess the changes in employment concentration between or
within the manufacturing and service sectors over a 20-year period.
Attaran and Zwick (1987) demonstrated that entropy is a useful measure
for comparing industrial diversity either among regions or for a
particular region over time. Lesser (1988) used entropy to predict the
relationship between belief-behavior prediction and shopping style.
Starr (1980) proposed a unique modification of the entropy level measure
to explain switching patterns of loyalty. Beecher (1989) used entropy
to measure the information capacity of an animal's "signature system"
(the set of cues by which individuals are identified). Love (1986) used
entropy to detect the relationship between concentration and export
instability. Garrison (1974) applied an entropy measure of geographical
concentration to examine the extent to which rural and small-town
counties competed with urban areas for manufacturing employment in the
Tennessee Valley region.
Conditional self-information (entropy) was used to measure the
stability of cluster membership across different factor solutions (20
vs. 20, 20 vs. 19, 20 vs. 18, ..., 20 vs. 2). Similar to
self-information, conditional self-information is based on conditional
probability (the probability of event $E_j$ given that event $F_k$ has
occurred). Conditional entropy is likewise an analogue of entropy,
obtained by taking the average of conditional self-information over all
pairs of events, one from each system. Jones (1979) defined the
conditional self-information $I(E_j \mid F_k)$ of $E_j$ given that $F_k$
has occurred (see Formula 2) and the conditional entropy
$H(S_1 \mid S_2)$ (see Formula 3).
$$I(E_j \mid F_k) = -\log p(E_j \mid F_k) = -\log \ldots$$
Figure 2. Illustration of the plot of 19 entropy measures.
crosstabulation of the 20-factor solution and the 18-factor solution.
The difference in bits of information indicates how cluster membership
has changed during the process of reducing the factor solution
(i.e., reducing the factor solution from 20 to 19 and from 19 to 18).
Third, the information (entropy) serves as an indicator for assessing
the stability of cluster membership. Because the level of changes in
cluster membership is uncertain during the process of reducing the
factor solution, plotting all the information measures (derived from the
crosstabulation of the 20- and 19-factor solutions, 20- and 18-, ...,
20- and 2-factor solutions) will provide the stability/change pattern of
cluster membership.
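As a rough illustration of how one such information measure could be obtained from a crosstabulation, the Python sketch below computes a conditional entropy from the 20- versus 19-factor panel of Table 2. It assumes the standard definition, the probability-weighted average of -log2 p(column | row); the function name `conditional_entropy` is hypothetical, and the result is not claimed to reproduce the values plotted in Figure 2.

```python
import math

def conditional_entropy(row_sizes, row_percents):
    """H(column cluster | row cluster) in bits from cluster sizes and row percentages."""
    total = sum(row_sizes)
    h = 0.0
    for size, percents in zip(row_sizes, row_percents):
        p_row = size / total
        for pct in percents:
            p_cond = pct / 100.0
            if p_cond > 0:  # 0 * log(0) is taken as 0
                h -= p_row * p_cond * math.log2(p_cond)
    return h

# Cluster sizes and row percentages copied from Table 2 (20- vs. 19-factor).
sizes_20 = [44, 46, 29, 32, 45, 16]
percents_20_vs_19 = [
    [68.2, 11.4, 4.5, 4.5, 11.4, 0.0],
    [6.5, 45.7, 6.5, 28.3, 13.0, 0.0],
    [31.0, 0.0, 17.2, 41.4, 10.3, 0.0],
    [0.0, 9.4, 46.9, 25.0, 12.5, 6.3],
    [0.0, 2.2, 6.7, 48.9, 37.8, 4.4],
    [18.8, 6.3, 0.0, 12.5, 0.0, 62.5],
]
# Reference case: every case keeps its cluster number, so nothing is uncertain.
identical = [[100.0 if i == j else 0.0 for j in range(6)] for i in range(6)]

h_20_vs_19 = conditional_entropy(sizes_20, percents_20_vs_19)
h_identical = conditional_entropy(sizes_20, identical)
print(round(h_20_vs_19, 3), h_identical)
```

A value of 0 corresponds to unchanged membership (the 20 vs. 20 comparison), and larger values indicate that knowing a case's cluster in the 20-factor solution says less about its cluster in the reduced solution.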
Step 7: In order to assess the stability of the (factor) centroids
for each cluster, the (factor score) centroids of each of the six
clusters were calculated for each of the 19 factor analyses (see Table 6
for illustration). The (factor score) centroids of the six clusters
were then plotted for the 19 different factor solutions (see Figure 3
for illustration).
The Effects of Factor Rotation on Cluster Membership
Objective 2. To ascertain the effect of factor rotation on cluster
membership.
Procedures
A four-step procedure was used to achieve Objective 2. The first
two steps, factor analysis and clustering on the factor scores, were the
same as steps 1 and 2 used to achieve Objective 1 except that the
initial factors were not rotated.
Table 6. Illustration of (factor score) centroids for each of the six
clusters across different factor solutions.
                              Cluster
Factor
Solution        1        2        3        4        5        6
                  (Factor 1 Factor Score Centroid)
2            .655     .048     .567    -.698    -.953   -1.800
3            .866    -.129     .573    -.860    -.866    -.201
4           -.777     .338     .811   -1.213    -.369    1.956
5           -.686     .343     .808   -1.333    -.372    1.736
6           -.716     .354     .775   -1.199    -.379    1.767
7           -.662     .173     .782   -1.079    -.342    1.970
8           -.700     .090     .775    -.945    -.302    2.192
9           -.683     .160     .785   -1.055    -.325    1.992
10          -.665     .138     .799   -1.119    -.288    1.878
11          -.693     .137     .768    -.923    -.268    1.547
12          -.697     .134     .769    -.945    -.245    1.513
13          -.591     .316     .676    -.946    -.304    1.449
14          -.583     .217     .667    -.941    -.302    1.437
15          -.535     .236     .631    -.909    -.327    1.363
16          -.533     .247     .620    -.913    -.325    1.369
17          -.523     .265     .610    -.908    -.342    1.373
18           .733    -.754     .500    -.923    -.170    -.195
19           .721    -.752     .490    -.898    -.156    -.209
20          -.436    -.176     .540    -.452    -.053    1.125
                              Cluster
                1        2        3        4        5        6
                  (Factor 2 Factor Score Centroid)
2           -.484     .133     .916   -1.619    -.349    1.635
3           -.500     .106     .868   -1.574    -.269    1.715
4            .742    -.421     .554   -1.353    -.289    -.928
5           -.006     .099     .291    -.363    -.328    -.035
6            .411     .343     .377    -.230    -.978   -1.780
7            .436     .081     .363    -.032    -.867   -1.338
8            .449     .210     .355    -.143    -.918   -1.571
9            .441     .167     .329    -.067    -.888   -1.413
10           .435     .184     .332    -.048    -.926   -1.332
11           .446     .174     .326    -.066    -.926   -1.217
12           .433     .201     .308    -.008    -.959   -1.135
13           .458     .198     .293     .004    -.969   -1.122
14           .472     .261     .255     .024   -1.017   -1.058
15           .464     .245     .262    -.014   -1.001    -.931
16           .731    -.752     .503    -.907    -.197    -.099
17           .736    -.746     .503    -.906    -.208    -.104
18          -.470     .253     .520    -.910    -.196     .975
19          -.262     .194     .309    -.848    -.034     .659
20           .435    -.456     .390    -.652    -.145    -.29a
Figure 3. Illustration of the plot of factor centroids.
Step 3: The clusters (memberships) formulated on the basis of
unrotated factor scores were compared (crosstabulated) with the clusters
(memberships) formulated on the basis of rotated factor scores. Table 7
illustrates how the comparison was performed.
Step 4: The cell percentages were analyzed to determine the degree
of similarity in cluster memberships. If the diagonal percentages
equaled 100%, the cluster memberships were the same. The greater the
deviation from 100%, the greater the difference in cluster memberships.
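The crosstabulation and diagonal check in steps 3 and 4 can be sketched as follows. The function name `crosstab_row_percent` and the eight cluster assignments are hypothetical and serve only to show the mechanics; as in the procedure described here, clusters are compared by their cluster numbers.

```python
from collections import Counter

def crosstab_row_percent(labels_a, labels_b, n_clusters):
    """Row percentages: share of each cluster in A assigned to each cluster in B."""
    counts = Counter(zip(labels_a, labels_b))
    sizes = Counter(labels_a)
    return [
        [100.0 * counts[(i, j)] / sizes[i] if sizes[i] else 0.0
         for j in range(1, n_clusters + 1)]
        for i in range(1, n_clusters + 1)
    ]

# Hypothetical cluster numbers for eight respondents.
rotated = [1, 1, 1, 1, 2, 2, 3, 3]
unrotated = [1, 1, 1, 2, 2, 2, 3, 1]

table = crosstab_row_percent(rotated, unrotated, 3)
diagonal = [table[k][k] for k in range(3)]
print(diagonal)  # [75.0, 100.0, 50.0]
```

Here only cluster 2 keeps all of its members; the diagonal values of 75% and 50% show how far the other two clusters deviate from identical membership.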
Comparison of Different Clustering Approaches
Objective 3. To compare clustering on factor scores with clustering on
raw data.
Procedures
A seven-step procedure was employed to achieve Objective 3.
Step 1: Respondents were first clustered on the raw data
(importance ratings of the 20 attributes). Ward's method (using the
squared Euclidean distance measure) was employed. The error sum of
squares, significance of the inter-cluster differences, and size of
clusters were again used as the criteria to decide on a cluster
solution. A six-cluster solution was selected.
Step 2: Nineteen principal component analyses with varimax
rotation were performed on the ratings of the 20 campground
attributes/facilities, as was done in step 1 for Objective 1 (see page
44). Each factor analysis extracted a different number of factors,
from 20 factors to 2 factors.
Table 7. Illustration of crosstabulation comparison of the memberships
of clusters derived from rotated factor scores with clusters
derived from unrotated factor scores.
Rotated Factor Analysis        Unrotated Factor Analysis
(20, 19, 18, ..., 2)           (20, 19, 18, ..., 2)
                                      Clusters
Clusters                  1     2     3     4     5     6
                                     (percent)^a
1                         %     %     %     %     %     %
2                         %     %     %     %     %     %
3                         %     %     %     %     %     %
4                         %     %     %     %     %     %
5                         %     %     %     %     %     %
6                         %     %     %     %     %     %
^a Percentage of cases assigned to cluster 1 in both the rotated and
unrotated factor analysis.
Step 3: The (factor score) centroids for each of the six clusters
were calculated for each of the 19 factor analyses (see Table 6 for
illustration). The (factor score) centroids of each of the six clusters
were then plotted for each factor solution (see Figure 3 for
illustration).
Step 4: The sum of squared distance for each cluster on each
factor (factor score) centroid was computed when clustering on raw data.
For example, in Table 8, the sum of squared distance for cluster 1 on
the "factor 1" factor score centroid is calculated by adding the squared
distance of centroid points between the 2-factor solution and the
3-factor solution, the squared distance of centroid points between the
3-factor solution and the 4-factor solution, ..., and the squared
distance of centroid points between the 19-factor solution and the
20-factor solution.
Table 8. Illustration of the calculation of the sum of squared
distance.
                                  Cluster
               1         2         3         4         5         6
Factor
Solution    Cen.  D1  Cen.  D2  Cen.  D3  Cen.  D4  Cen.  D5  Cen.  D6
            (Cen. = Factor 1 Factor Score Centroid)
2             1         3         2         2         1         4
3             1    0    2    1    0    4    1    1    3    4    1    9
4             3    4    0    4    1    1    3    4    0    9    1    0
5             2    1    0    0    2    1    1    4    1    1    2    1
6             0    4    1    1    2    0    3    4    2    1    1    1
7             3    9    1    0    1    1    3    0    3    1    2    1
20
Sum of
Squared           18         6         7        13        16        12
Distance
Note: For illustration purposes, this table shows only five squared
distances.
D1, D2, ..., D6 mean the squared difference of the factor 1 factor score
centroid between different factor solutions for clusters 1, 2, ..., 6,
respectively.
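The calculation illustrated in Table 8 can be reproduced with a few lines of Python. The function name `sum_of_squared_distance` is hypothetical; the centroid values are the illustrative integers shown in Table 8 for the 2- to 7-factor solutions.

```python
def sum_of_squared_distance(centroids):
    """Sum of squared differences between centroids of successive factor solutions."""
    return sum((b - a) ** 2 for a, b in zip(centroids, centroids[1:]))

# Illustrative factor 1 centroids from Table 8, keyed by cluster number,
# listed for the 2-, 3-, 4-, 5-, 6-, and 7-factor solutions.
table_8_centroids = {
    1: [1, 1, 3, 2, 0, 3],
    2: [3, 2, 0, 0, 1, 1],
    3: [2, 0, 1, 2, 2, 1],
    4: [2, 1, 3, 1, 3, 3],
    5: [1, 3, 0, 1, 2, 3],
    6: [4, 1, 1, 2, 1, 2],
}
sums = {c: sum_of_squared_distance(v) for c, v in table_8_centroids.items()}
print(sums)  # {1: 18, 2: 6, 3: 7, 4: 13, 5: 16, 6: 12}
```

The six totals match the "Sum of Squared Distance" row of Table 8. A cluster whose centroid barely moves between successive factor solutions receives a small total.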
Step 5: The sum of squared distance for each cluster on each
factor (factor score) centroid was also computed when clustering on
factor scores.
Step 6: The similarity of each of the clusters formulated on raw
data and factor scores was assessed using a specially designed computer
program (see Appendix D). The program identified the best set of
matched clusters for each factor (factor score) centroid. For example,
for the factor 1 factor score centroid, cluster 6 derived from
clustering on factor scores is most similar to cluster 1 derived from
clustering on raw data (see Table 9).
The program was specially written to determine the best set of
matched clusters between the two clustering approaches, raw data and
factor scores. The sums of squared distance calculated in step 4 and
step 5 were used as input to this computer program. In each iteration,
the program generates a set of matched clusters. For example, cluster 1
(based on raw data) matched with cluster 6 (derived from factor scores)
is marked as $C_{16}$; cluster 2 (based on raw data) matched with
cluster 5 (derived from factor scores) is marked as $C_{25}$; the other
four matched clusters are marked in the same way.
The difference of the sum of squared distance is then calculated
for each of the six matches (e.g., $C_{16}$, $C_{25}$, ...) and summed.
The computer program then generates other sets of matched clusters. For
each set of cluster matches, the total difference of the sum of squared
distance is calculated. Based on the criterion of minimum total
difference of the sum of squared distance, the computer program
identifies the best set of matched clusters.
Table 9. Illustration for the measure of cluster similarity.
    Clustering on Factor Scores          Clustering on Raw Data
            Sum of     Standard                 Sum of     Standard
Cluster    Distance    Deviation    Cluster    Distance    Deviation
6           12.783        1.3          1         5.686        0.8
4            9.453        2.1          2         1.672        0.6
3            6.656        1.5          3         0.084        1.1
2            6.909        1.2          4         0.472        0.5
1            4.612        0.5          5         0.305        1.4
5           15.527        1.7          6        20.342        0.7
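The matching criterion can be sketched as an exhaustive search over all one-to-one pairings. This is not the Appendix D program: the function name `best_cluster_match` is hypothetical, and taking the absolute difference of the two sums as the "difference of the sum of squared distance" is an assumption. The input values are the illustrative sums shown in Table 9.

```python
from itertools import permutations

def best_cluster_match(raw_sums, factor_sums):
    """Try every one-to-one pairing and keep the one with the smallest total difference."""
    raw_ids = sorted(raw_sums)
    best_pairs, best_total = None, float("inf")
    for perm in permutations(sorted(factor_sums)):
        # Total difference for this pairing of raw-data and factor-score clusters.
        total = sum(abs(raw_sums[r] - factor_sums[f]) for r, f in zip(raw_ids, perm))
        if total < best_total:
            best_pairs, best_total = list(zip(raw_ids, perm)), total
    return best_pairs, best_total

# Illustrative sums of squared distance from Table 9, keyed by cluster number.
raw = {1: 5.686, 2: 1.672, 3: 0.084, 4: 0.472, 5: 0.305, 6: 20.342}
factor = {1: 4.612, 2: 6.909, 3: 6.656, 4: 9.453, 5: 15.527, 6: 12.783}

pairs, total = best_cluster_match(raw, factor)
print(pairs, round(total, 3))
```

With six clusters there are only 720 pairings, so a brute-force search is sufficient. Under the absolute-difference assumption, several pairings tie for the minimum total with these illustrative figures, and the pairing shown in Table 9 is one of them, so the pairing printed here need not coincide with the table.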
Step 7: The standard deviations of factor score centroids for each
cluster across different factor solutions were calculated. The values
of the standard deviation for each of the six matched clusters were used
as the basis for comparing the stability of each factor score centroid
between clustering on raw data and clustering on factor scores. Six
sets of stability comparisons were made. The higher the standard
deviation, the more unstable the cluster membership (factor score
centroid). The "best" approach results in more stable clusters.
To demonstrate how the stability comparisons were made, the
following example is presented. The computer program identified a set
of six matched clusters ($C_{16}$, ...). As stated above, standard
deviations were calculated for each of the six matched clusters.
Suppose that the standard deviation of cluster 1 (based on raw data)
is 0.8 and the standard deviation of cluster 6 (based on factor
scores) is 1.3. The cluster membership of cluster 1 (based on raw
data) is then more stable than that of cluster 6 (based on factor
scores). The other five matched clusters were also compared based on
the value of their standard deviations. If clustering on raw data has
more stable clusters than clustering on factor scores, clustering on raw
data is identified as the better approach.
CHAPTER IV
RESULTS
The chapter is divided into five sections dealing with (1) the
importance ratings of the twenty different campground attributes, (2)
the appropriateness of data for factor analysis, (3) an assessment of
the effect of different factor solutions on the clustering results, (4)
an assessment of the effect of rotation on cluster membership, and (5) a
comparison of clustering on factor scores with clustering on raw data.
Importance Ratings of 20 Campground Attributes
The importance ratings assigned to the 20 campground
attributes/facilities by respondents are shown in Table 10. The ratings
ranged from crucial (1) to not important (5). The distribution of
ratings, mean and median scores, and standard deviation for each
attribute are also reported in Table 10.
Cleanliness of a campground (mean = 1.877) was the most important
attribute. This was followed by security (mean = 2.160), hospitality of
campground staff (mean = 2.500), electricity (mean = 2.750), quietness
(mean = 2.759), and low price (mean = 2.896). Campers as a whole were
less concerned with whether a campground had a laundromat (mean =
3.925), a swimming pool (mean = 3.976), or a hiking trail (mean =
4.000), whether it was situated on a lake/stream (mean = 4.033), and
whether it had playgrounds (mean = 4.443).
Table 10. Importance ratings (assigned the campground attributes) which were used in
the factor analyses and cluster analyses.
                                        Importance Rating^a
                                     1     2     3     4     5                   Standard
Campground Attributes                       (percent)           Mean   Median   Deviation
Large sites                         6.6  17.9  41.5  25.9   8.0   3.108   3.0     1.008
Shaded sites                        1.9  20.8  40.6  29.2   7.5   3.198   3.0     0.918
Cleanliness                        30.2  55.2  11.8   2.4   0.5   1.877   2.0     0.738
Quietness                           6.1  32.1  45.3  12.7   3.8   2.759   3.0     0.889
Site privacy                        2.4  17.5  37.3  30.2  12.7   3.335   3.0     0.986
Security                           23.1  46.2  22.6   7.5   0.5   2.160   2.0     0.883
Hospitality of campground staff    12.3  41.5  33.0  10.4   2.8   2.500   2.0     0.936
Low price                           8.5  26.4  35.8  25.5   3.8   2.896   3.0     1.002
Flush toilets                       6.1  18.9  29.7  25.9  19.3   3.335   3.0     1.167
Electricity                        13.2  29.2  32.5  19.3   5.7   2.750   3.0     1.088
Showers                             9.0  25.9  31.1  23.6  10.4   3.005   3.0     1.129
Laundromat                          1.9   5.7  24.5  34.0  34.0   3.925   4.0     0.990
Campground store                    1.4   9.4  20.8  43.4  25.0   3.811   4.0     0.965
Water hookups                       9.4  26.4  25.5  22.2  16.5   3.099   3.0     1.233
Sewer hookups                       4.7  11.3  23.6  25.9  34.4   3.741   4.0     1.182
Natural surroundings                4.7  20.8  34.9  27.4  12.3   3.217   3.0     1.058
Situated on a lake/stream           1.4   8.0  18.4  30.2  42.0   4.033   4.0     1.028
Hiking trails                       1.4   9.4  15.1  35.8  38.2   4.000   4.0     1.021
Pool                                1.4  10.4  20.3  25.0  42.9   3.976   4.0     1.086
Playgrounds                         0.9   6.6   8.5  15.1  68.9   4.443   5.0     0.965
^a The importance ratings of campground attributes ranged from crucial (1) to not important (5).
Appropriateness of the Data for Factor Analysis
Prior to performing a factor analysis, the data (importance
ratings) were examined with respect to their appropriateness (sample
size and correlation between variables) for factor analysis. A number
of criteria for determining whether a factor analysis should be applied
to a set of data were reviewed. A common criterion is the size of the
sample. Comrey (1973) suggested that if the sample size is equal to
100, the appropriateness for factor analysis is poor; 200 it is fair;
300 it is good; 500 it is very good; and 1000 it is excellent. Stewart
(1981) suggested six methods of determining whether the data are
appropriate for factor analysis. These include the examination of the
correlation matrix, the plotting of the eigenvalues obtained from matrix
decomposition, the examination of communality estimates, the inspection
of the off-diagonal elements of the anti-image covariance or correlation
matrix, Bartlett's test of sphericity, and the Kaiser-Meyer-Olkin
measure of sampling adequacy (MSA).
The criteria used were (a) the sample size, (b) Bartlett's test of
sphericity, and (c) the Kaiser-Meyer-Olkin measure of sampling adequacy
(MSA). In the present study, there are two split subsamples each
containing 212 cases, which is an adequate size for factor analysis.
Bartlett's test of sphericity was used to test (using a
chi-square test) the hypothesis that the correlation matrix is an
identity matrix (i.e., variables correlate perfectly with themselves
but are uncorrelated with other variables). That is, all diagonal terms
are 1 and all off-diagonal terms are 0. Rejecting the hypothesis
indicates that the data are appropriate for factor analysis (Bartlett,
1950, 1951).
Bartlett's test of sphericity was performed. The chi-square value
was 1441 (with 190 degrees of freedom), which is highly significant.
Thus, based on this test, the data are appropriate for factor analysis.
The Kaiser-Meyer-Olkin measure of sampling adequacy (MSA) provides a
measure of the extent to which the variables belong together (Kaiser,
1970). Small values of the MSA (less than .50) indicate that the data
may not be appropriate for factor analysis because correlations between
pairs of variables cannot be explained by the other variables (Norusis,
1988). In this study, the MSA is 0.81, which indicates that the data are
appropriate for factor analysis (Kaiser & Rice, 1974).
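For reference, Bartlett's sphericity statistic is usually written as below. This is the standard textbook form, not an expression quoted from the study; $n$ denotes the number of cases, $p$ the number of variables, and $|R|$ the determinant of the correlation matrix.

```latex
\chi^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right] \ln |R|,
\qquad
df = \frac{p(p - 1)}{2}
```

With the $p = 20$ attributes rated here, $df = 20 \times 19 / 2 = 190$, which agrees with the degrees of freedom reported for the test.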
Assessment of the Effect of Different Factor
Solutions on the Clustering Results
Factoring Results
Nineteen (20, 19, 18, ..., 2 factors) different principal
component analyses with varimax rotation were performed. The
eigenvalues and percentages of variance explained are reported in Table
11 along with the cumulative percentage of variance explained by the
different numbers of factors. For each factor, the eigenvalue is the sum
of squared factor loadings. Eliminating factors one at a time, starting
from the 20th factor, reduced the percentage of total variance explained
in proportion to the eigenvalues of the factors eliminated from the
solution; the eigenvalues and percentages of variance explained of the
retained factors remained the same. For example, the first 18
eigenvalues of the "19 factor" principal component analysis are
identical to the 18 eigenvalues of the "18 factor" principal component
analysis.
Table 11. Eigenvalue, percent of variance explained, and
cumulative percent of variance explained for 20
campground attributes.
                                       Cumulative
                        Percent of     Percent
                        Variance       of Variance
Factor    Eigenvalue    Explained      Explained
1          5.60131        28.0           28.0
2          1.93845         9.7           37.7
3          1.69936         8.5           46.2
4          1.32863         6.6           52.8
5          1.16849         5.8           58.7
6          1.09119         5.5           64.1
7          1.02010         5.1           69.2
8          0.80158         4.0           73.2
9          0.67725         3.4           76.6
10         0.61859         3.1           79.7
11         0.57406         2.9           82.6
12         0.54578         2.7           85.3
13         0.50535         2.5           87.9
14         0.47601         2.4           90.2
15         0.44025         2.2           92.4
16         0.38502         1.9           94.4
17         0.32611         1.6           96.0
18         0.29376         1.5           97.5
19         0.27759         1.4           98.8
20         0.23112         1.2          100.0
The next step was to identify the "best" factor solution based on
factor analysis criteria. The scree test/plot, which was used to select
candidate factor solutions, is presented in Figure 4. The scree plot
identified three candidate factor solutions (2 factors, 4 factors, and 7
factors). A seven-factor solution was selected from among all possible
solutions because (a) eigenvalues from factor 1 to factor 7 were greater
than 1, and (b) the percentage of total variance explained was about
70%. In many studies, the seven-factor solution would have been used as
the basis for clustering. However, the purpose of this study was to
assess the effects of alternative factor solutions on the clustering
results, so the seven-factor solution was only one of 19 different
factor solutions which were considered.
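The two selection criteria can be checked directly against the eigenvalues in Table 11. The Python sketch below is only an illustration; the function names `kaiser_count` and `cumulative_percent` are hypothetical.

```python
# Eigenvalues of the 20 principal components, copied from Table 11.
eigenvalues = [
    5.60131, 1.93845, 1.69936, 1.32863, 1.16849, 1.09119, 1.02010,
    0.80158, 0.67725, 0.61859, 0.57406, 0.54578, 0.50535, 0.47601,
    0.44025, 0.38502, 0.32611, 0.29376, 0.27759, 0.23112,
]

def kaiser_count(values):
    """Number of factors whose eigenvalue is greater than 1."""
    return sum(1 for v in values if v > 1.0)

def cumulative_percent(values, k):
    """Percentage of total variance explained by the first k factors."""
    return 100.0 * sum(values[:k]) / sum(values)

n_retained = kaiser_count(eigenvalues)
explained = cumulative_percent(eigenvalues, n_retained)
print(n_retained, round(explained, 1))  # 7 69.2
```

Seven eigenvalues exceed 1, and together they account for about 69.2% of the total variance, which is the "about 70%" cited for the seven-factor solution.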
Next, one factor at a time was eliminated beginning with the
20-factor solution. The impact of the "one at a time" factor
elimination on the factor pattern matrix is shown in Tables 12-30.
Only the loadings of variables with a factor loading of 0.40 or greater
are shown in the tables. For example, Table 12 shows the factor pattern
matrix for the 20 factor principal component analysis (with varimax
Figure 4. Scree test for selecting candidate factor solutions.
Table 12. Campground attribute sought factor pattern matrix for "20 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
(Factor Loadings)
Large sites
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity
Shower
Laundromat
Store
water hookups
Sewer hookups
Natural surroundings
Lake/stream
Hiking trail
Swimming pool
Playgrounds
.89
.92
.95
.96
.91
.89
.85
.89
.88
.86
.87
.90
.89
.88
Note: Only variables whose loadings are .40 or greater are shown.
Table 13. Campground attribute sought factor pattern matrix for "19 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
(Factor Loadings)
Large sites .96
Shaded sites .96
Cleanliness .90
Quietness .89
Privacy .90
Security .92
Hospitality .92
Low price .96
Flush toilets .90
Electricity .91
Shower .84
Laundromat .89
Store .88
water hookups .79
Sewer hookups .88
Natural surroundings .89
Lake/stream .89
Hiking trail .88
Swimming pool .91
Playgrounds .94
Note: Only variables whose loadings are .40 or greater are shown.
Table 14. Campground attribute sought factor pattern matrix for "18 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
(Factor Loadings)
Large sites .96
Shaded sites .96
Cleanliness .90
Quietness .86
Privacy .90
Security .91
Hospitality .91
Low price .96
Flush toilets .89
Electricity .87
Shower .86
Laundromat .87
Store .86
water hookups .72
Sewer hookups .91
Natural surroundings .88
Lake/stream .89
Hiking trail .88
Swimming pool .89
Playgrounds .94
Note: Only variables whose loadings are .40 or greater are shown.
Table 15. Campground attribute sought factor pattern matrix for "17 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
(Factor Loadings)
Large sites .96
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity
Shower
Laundromat
Store
water hookups .
Sewer hookups
Natural surroundings
Lake/stream
Hiking trail
Swimming pool
Playgrounds
.91
.96
.94
.95
.90
.82
.85
.89
.87
.89
Note: Only variables whose loadings are .40 or greater are shown.
Table 16. Campground attribute sought factor pattern matrix for "16 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
(Factor Loadings)
Large sites .96
Shaded sites .95
Cleanliness .90
Quietness .65 .42
Privacy .89
Security .90
Hospitality .90
Low price .96
Flush toilets .89
Electricity .44 .80
Shower .86
Laundromat .86
Store .84
Water hookups .85
Sewer hookups .87
Natural surroundings .85
Lake/stream .90
Hiking trail .76
Swimming pool .88
Playgrounds .94
Note: Only variables whose loadings are .40 or greater are shown.
Table 17. Campground attribute sought factor pattern matrix for "15 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
(Factor Loadings)
Large sites .96
Shaded sites .95
Cleanliness .89
Quietness .65
Privacy .90
Security .86
Hospitality .90
Low price .94
Flush toilets .89
Electricity .45 .80
Shower .86
Laundromat .85
Store .69
Water hookups .86
Sewer hookups .86
Natural surroundings .81
Lake/stream .85
Hiking trail .82
Swimming pool .89
Playgrounds .94
Note: Only variables whose loadings are .40 or greater are shown.
Table 18. Campground attribute sought factor pattern matrix for "14 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3 4
(Factor Loadings)
Large sites .95
Shaded sites .94
Cleanliness .90
Quietness .62
Privacy .92
Security .50
Hospitality .90
Low price .92
Flush toilets .89
Electricity .56 .68
Shower .86
Laundromat
Store .
Water hookups .87
Sewer hookups .85
Natural surroundings .79
Lake/stream .85
Hiking trail .82
Swimming pool .88
Playgrounds .93
Note: Only variables whose loadings are .40 or greater are shown.
Table 19. Campground attribute sought factor pattern matrix for "13 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2 3
(Factor Loadings)
Large sites .95
Shaded sites .90
Cleanliness .77
Quietness .41 .57 -.40
Privacy .91
Security .73
Hospitality .76
Low price .91
Flush toilets .88
Electricity .57 .67
Shower .86
Laundromat .83
Store .75
Water hookups .87
Sewer hookups .85
Natural surroundings .56 .63
Lake/stream .84
Hiking trail .85
Swimming pool .88
Playgrounds .93
Note: Only variables whose loadings are .40 or greater are shown.
Table 20. Campground attribute sought factor pattern matrix for "12 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0 0 1 1 1
Attributes 1 2 3 4 5 6 7 8 9 0 1 2
(Factor Loadings)
Large sites .92
Shaded sites .90
Cleanliness .44 .63
Quietness .79
Privacy .80
Security .75
Hospitality .81
Low price
Flush toilets .86
Electricity .80
Shower .87
Laundromat .79
Store .78
water hookups .83
Sewer hookups .80
Natural surroundings .61 .53
Lake/stream .84
Hiking trail .85
Swimming pool
Playgrounds .92
.91
.87
Note: Only variables whose loadings are greater than .40 are shown.
82
Table 21. Campground attribute sought factor pattern matrix for "11 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0
Attributes 1 2 3 4 5 6 7 8
(Factor Loadings)
Large sites
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity
Shower
Laundromat
Store
Water hookups
Sewer hookups
Natural surroundings
Lake/stream
Hiking trail .83
Swimming pool
Playgrounds
size
.90
.78
.89
Note: Only variables whose loadings are greater than .40 are shown.
83
Table 22. Campground attribute sought factor pattern matrix for "10 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0
Attributes 1 2 3 4 5 6 7
(Factor Loadings)
Large sites
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity
Shower
Laundromat
Store
Water hookups
Sewer hookups
Natural surroundings .70
Lake/stream .84
Hiking trail .83
Swimming pool
Playgrounds
'3
sass
.87
.74
.78
.43
.48
.47
.86
.92
.89
-.42
.76
Note: Only variables whose loadings are greater than .40 are shown.
84
Table 23. Campground attribute sought factor pattern matrix for "9 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0
Attributes 1 2 3 4 5 6 7 8
(Factor Loadings)
Large sites .92
Shaded sites .85
Cleanliness .72
Quietness .79
Privacy .79
Security .52
Hospitality .79
Low price .82
Flush toilets .86
Electricity .75
Shower .86
Laundromat .44 .51
Store .41 .41 .55
Water hookups .85
Sewer hookups .81
Natural surroundings .69
Lake/stream .84
Hiking trail .82
Swimming pool .72
Playgrounds .77
Note: Only variables whose loadings are greater than .40 are shown.
85
Table 24. Campground attribute sought factor pattern matrix for "8 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0 0
Attributes 1 2 3 4 5 6 7 8
(Factor Loadings)
Large sites
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity .79
Shower
Laundromat .41
Store
Water hookups .83
Sewer hookups .78
Natural surroundings
Lake/stream
Hiking trail
Swimming pool
Playgrounds
.75
.72
.73
.58
.70
.42
.62
Note: Only variables whose loadings are greater than .40 are shown.
86
Table 25. Campground attribute sought factor pattern matrix for "7 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0 0
Attributes 1 2 3 4 5 6 7
(Factor Loadings)
Large sites
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity .77
Shower
Laundromat .47
Store .42
Water hookups
Sewer hookups .81
Natural surroundings
Lake/stream
Hiking trail
Swimming pool
Playgrounds
.81
.82
.41 .55
Note: Only variables whose loadings are greater than .40 are shown.
87
Table 26. Campground attribute sought factor pattern matrix for "6 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0 0
Attributes 1 2 3 4 5 6
(Factor Loadings)
Large sites .60
Shaded sites .45
Cleanliness .76
Quietness .59 .55
Privacy .72
Security .57
Hospitality .73
Low price .81
Flush toilets .80
Electricity .71
Shower .83
Laundromat .53 .45
Store .49 .55
Water hookups .83
Sewer hookups .81
Natural surroundings .72
Lake/stream .78
Hiking trail .80
Swimming pool .56
Playgrounds .48
Note: Only variables whose loadings are greater than .40 are shown.
88
Table 27. Campground attribute sought factor pattern matrix for "5 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0 0
Attributes 1 2 3 4 5
(Factor Loadings)
Large sites
Shaded sites
Cleanliness
Quietness
Privacy
Security
Hospitality
Low price
Flush toilets
Electricity .72
Shower
Laundromat .61
Store .56
Water hookups .83
Sewer hookups .82
Natural surroundings
Lake/stream
Hiking trail
Swimming pool
Playgrounds
-.54
.42
.65
.65
.81
.74
.78
.79
.53
.50
Note: Only variables whose loadings are greater than .40 are shown.
89
Table 28. Campground attribute sought factor pattern matrix for "4 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0 0
Attributes 1 2 3 4
(Factor Loadings)
Large sites
Shaded sites .42
Cleanliness .66
Quietness .77
Privacy .62
Security .73
Hospitality .42
Low price .51
Flush toilets .65
Electricity .74
Shower .75
Laundromat .56 .49
Store .46 .54
Water hookups .82
Sewer hookups .78
Natural surroundings
Lake/stream
Hiking trail
Swimming pool .53
Playgrounds .48
'r :33
Note: Only variables whose loadings are greater than .40 are shown.
90
Table 29. Campground attribute sought factor pattern matrix for "3 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0 0
Attributes 1 2 3
(Factor Loadings)
Large sites
Shaded sites
Cleanliness .61
Quietness .78
Privacy .67
Security .72
Hospitality .41
Low price
Flush toilets .53
Electricity .73
Shower .60
Laundromat .69
Store .60
Water hookups .77
Sewer hookups .75
Natural surroundings .65
Lake/stream .61
Hiking trail .65
Swimming pool .60
Playgrounds .65
Note: Only variables whose loadings are greater than .40 are shown.
91
Table 30. Campground attribute sought factor pattern matrix for "2 factor" principal component
analysis with varimax rotation.
Factor
Campground 0 0
Attributes 1 2
(Factor Loadings)
Large sites
Shaded sites .52
Cleanliness .56
Quietness .45
Privacy .44
Security .50
Hospitality .52
Low price
Flush toilets .46
Electricity .74
Shower .48
Laundromat .70
Store .61
Water hookups .79
Sewer hookups .77
Natural surroundings .71
Lake/stream .67
Hiking trail .70
Swimming pool
Playgrounds .48
Note: Only variables whose loadings are greater than .40 are shown.
92
rotation). Only one variable was significantly loaded on each of the 20
factors.
Tables 12 through 30 reveal two major changes as the number of
factors is reduced from 20 to 2. First, the size of the factor loadings
changes. Second, certain factors come to have two or more variables with
significant (> .40) loadings. Changes in factor loadings and in the number
of variables with significant loadings on different factors result in
different factor interpretations and different factor scores. When
factor scores are used as the basis for the clustering process, the
clustering results (cluster membership and cluster description) would
therefore be expected to differ across factor solutions (20, 19, ..., 2).
Clustering Results
Factor scores were computed for each factor in each of the 19
different principal component analyses. The regression estimates method
was used to obtain the factor scores. The original raw data
measurements were multiplied by the corresponding factor score
(regression) coefficients. The factor scores were used as the basis for
clustering.
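The multiplication of measurements by factor score coefficients can be sketched as follows. This is only a minimal illustration: the function name, the ratings, and the coefficient values are hypothetical and are not taken from the study, and the sketch assumes the measurements have already been put on whatever scale the coefficients expect (regression-method coefficients are commonly applied to standardized variables).

```python
def factor_scores(measurements, coefficients):
    """Return one row of factor scores per case.

    measurements: list of cases, each a list of p variable values.
    coefficients: p rows of factor score coefficients, one column per factor.
    """
    n_factors = len(coefficients[0])
    scores = []
    for case in measurements:
        if len(case) != len(coefficients):
            raise ValueError("each case needs one value per coefficient row")
        # Score on factor k = sum over variables of value * coefficient.
        scores.append([
            sum(value * coefficients[j][k] for j, value in enumerate(case))
            for k in range(n_factors)
        ])
    return scores


# Hypothetical data: 2 respondents, 3 attributes, 2 factors.
ratings = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.0]]
coefficients = [[0.6, 0.0], [0.4, 0.1], [0.0, 0.9]]
scores = factor_scores(ratings, coefficients)
```

Each row of `scores` would then serve as one respondent's input record for the clustering step.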
The factor scores from the "20 factor" principal component
analysis were used as input data to Ward’s clustering method with the
squared Euclidean distance as the distance measure. Figure 5 shows the
increase in the coefficient of hierarchy (which resulted from fusion of
clusters) plotted against the number of clusters. As stated previously,
the break points along the plot mean that a relatively large loss of
information resulted from the fusion of two clusters. Based on the
93
[Plot not recoverable from the scan: coefficient of hierarchy plotted against the number of clusters, with a column of coefficient values for each cluster solution.]
Figure 5. Coefficient of hierarchy by number of attribute sought
clusters when clustering is based on factor scores [remainder of caption illegible].
94
coefficient of hierarchy and the examination of plot slopes, three
candidate cluster solutions were identified: eight clusters, six
clusters, and three clusters.
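The search for break points can be illustrated with a small sketch. The study identified candidates by inspecting the plot; the numeric rule below (flag a fusion whose increase exceeds 1.5 times the mean increase) is an assumption made only for illustration, and the coefficient values are hypothetical.

```python
def candidate_solutions(coef_by_k, ratio=1.5):
    """Suggest cluster counts that precede an unusually large fusion cost.

    coef_by_k maps a number of clusters k to the coefficient of hierarchy
    of the k-cluster solution. A large rise when going from k + 1 clusters
    to k clusters means that fusion lost much information, so k + 1 is
    flagged as a candidate solution.
    """
    jumps = {
        k: coef_by_k[k] - coef_by_k[k + 1]
        for k in sorted(coef_by_k)
        if k + 1 in coef_by_k
    }
    mean_jump = sum(jumps.values()) / len(jumps)
    return sorted(k + 1 for k, jump in jumps.items() if jump > ratio * mean_jump)


# Hypothetical coefficients of hierarchy for 1 to 7 clusters.
coefficients = {1: 100.0, 2: 70.0, 3: 60.0, 4: 55.0, 5: 30.0, 6: 27.0, 7: 25.0}
candidates = candidate_solutions(coefficients)
```

With these made-up values the flagged solutions are the two-cluster and five-cluster solutions; in practice the threshold is a judgment call and the plot should still be examined.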
The three candidate solutions were evaluated on (a) the
significance of inter-cluster differences and (b) the size of clusters.
ANOVA was used to test for inter-cluster differences. The results of
the ANOVA tests on the three candidate cluster solutions are presented
in Tables 31-33. In the eight-cluster solution (Table 31), there were
significant differences across clusters on all but two (flush toilet and
campground store) of the 20 factors/variables. The six clusters
differed significantly on 16 of the 20 factors/variables (Table 32).
The three-cluster solution showed the least amount of inter-cluster
differences (Table 33); clusters differed significantly on only 10 of
the 20 factors/variables.
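For reference, the F-ratio reported for each factor in Tables 31-33 is the usual one-way ANOVA statistic: the between-cluster mean square divided by the within-cluster mean square. A minimal sketch follows; the three small groups of scores are hypothetical and stand in for the factor scores of the members of three clusters.

```python
def f_ratio(groups):
    """One-way ANOVA F statistic for a list of groups of numeric scores."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares around each group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical factor scores for the members of three clusters.
f_value = f_ratio([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

The resulting value is compared with the critical F for (k - 1, N - k) degrees of freedom at the chosen significance level (.05 in the tables).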
Even though the eight-cluster solution exhibited more
inter-cluster differences, the six-cluster solution was selected as the
final solution because one of the eight clusters was disproportionately
small; it had only 5 (2.4%) cases (see Table 34). In the six-cluster
solution, the smallest cluster contained 16 (7.5%) cases.
Factor Score Pattern
The (factor score) centroids for each of the six clusters were
calculated for each of the 19 principal component analyses (20, 19, 18,
..., 2). The (factor score) centroids are graphically presented in
Figures 6-25. Each graph shows the factor centroids for each cluster
95
Table 31. Mean attribute sought factor scores for the eight-cluster
candidate solution when clustering on factor scores.
Cluster
Factor 1 2 3 4 5 6 7 8 F-ratio
Electricity -.48 .06 -.49 .54 .31 .13 -.17 -.01 3.72*
Toilet .02 .22 .15 -.20 .36 -.21 -.21 -.69 1.67
Playground .36 .17 .33 -.14 .23 .16 -2.29 .47 24.21*
Price .12 -.22 .23 .81 -.16 -.31 -.21 -.18 4.37*
Large sites -.46 -.02 .10 .53 .14 -.24 -.05 .82 2.98*
Shade sites -.12 .83 -.51 -.11 -.13 -.32 -.43 1.26 10.39*
Pool -.74 .02 .07 .50 -.31 .43 -.70 -.09 6.57*
Hospitality .30 .26 .30 -.68 .28 -.28 -.14 -.01 4.10*
Security -.33 .29 -.10 -.39 .09 .31 -.19 -.76 2.87*
Privacy .07 .09 -.32 .63 -.00 -.37 -.25 1.42 5.05*
Natural surr. .45 .35 -.37 -.33 .41 -.27 .09 -.93 4.56*
Lake/stream .31 .03 -.72 .08 -.33 .48 -.05 -1.03 5.85*
Cleanliness .00 -.45 -.27 -.29 1.78 .01 -.08 .69 16.69*
Laundromat .64 -.41 .01 .16 .15 -.24 -.04 1.40 5.13*
Quietness .14 -.05 -.15 .47 .62 -.26 -.15 -1.53 4.80*
Sewer hookups .49 .26 -.14 .23 .39 -.45 -.22 -1.92 7.34*
Natural trail .59 -.22 -1.16 .09 .41 .47 -.21 .28 12.74*
Store .35 -.21 .22 -.25 -.27 .24 -.11 -.54 2.02
Water hookups -.92 .09 .18 .19 .54 -.11 .17 .11 4.82*
Shower -.03 -.48 .36 .18 -.09 .31 -.18 -.41 3.25*
* Significant at .05 level.
96
Table 32. Mean attribute sought factor scores for the six-cluster
candidate solution when clustering on factor scores.
Cluster
Factor 1 2 3 4 5 6 F-ratio
Electricity -.14 .06 -.49 .45 .13 -.17 3.37*
Toilet .17 .22 .15 -.27 -.21 -.21 1.87
Playground .03 .17 .33 -.05 .16 -2.29 33.08*
Price .00 -.22 .23 .65 -.31 -.21 4.92*
Large sites -.20 -.02 .10 .58 -.24 -.05 3.22*
Shade sites -.12 .83 -.51 .10 -.32 -.43 11.98*
Swimming pool -.56 .02 .07 .41 .43 -.70 8.31*
Hospitality .29 .26 .30 -.57 -.28 -.14 5.32*
Security -.15 .29 -.10 -.45 .31 -.19 3.48*
Privacy .04 .09 -.32 .75 -.37 -.25 6.42*
Natural surr. .43 .35 -.37 -.43 -.27 .09 6.05*
Lake/stream .04 .03 -.72 -.09 .48 -.05 5.70*
Cleanliness .77 -.45 -.27 -.14 .01 -.08 9.19*
Laundromat .43 -.41 .01 .35 -.24 -.04 4.94*
Quietness .35 -.05 -.15 .16 -.26 -.15 2.11
Sewer hookups .45 .26 -.14 -.11 -.45 -.22 5.00*
Natural trail .51 -.22 -1.16 .12 .46 -.21 17.80*
Store .08 -.21 .22 -.30 .24 -.11 1.88
Water hookups -.29 .09 .18 .18 -.11 .17 1.40
Shower -.06 -.48 .36 .09 .31 -.16 4.24*
* Significant at .05 level.
97
Table 33. Mean attribute sought factor scores for the three-cluster
candidate solution when clustering on factor scores.
Cluster
Factor 1 2 3 F-ratio
Electricity -.14 .06 .03 0.58
Toilet .17 .22 -.14 3.01
Playground .03 .17 -.17 4.71*
Price .00 -.22 .08 1.59
Large sites -.20 -.02 .08 1.29
Shade sites -.12 .83 -.27 25.12*
Pool -.56 .02 .19 9.82*
Hospitality .29 .26 -.20 6.30*
Security -.15 .29 -.05 2.61
Privacy .04 .09 -.05 0.35
Natural surr. .43 .35 -.29 13.30*
Lake/stream .04 .03 -.02 0.09
Cleanliness .77 -.45 -.11 22.21*
Laundromat .43 -.41 .00 8.53*
Quietness .35 -.05 -.11 3.53*
Sewer hookups .45 .26 -.26 10.93*
Natural trail .51 -.22 -.10 8.05*
Store .08 -.21 .05 1.27
Water hookups -.29 .09 .07 2.35
Shower -.06 -.48 .20 8.53
* Significant at .05 level.
98
Table 34. Number of respondents in each of the cluster candidate
solutions when clustering on factor scores.
Number of Relative Size
Cluster Respondents (percent)
Eight Cluster Solution
1 25 11.8
2 46 21.7
3 29 13.7
4 27 12.7
5 19 9.0
6 45 21.2
7 16 7.5
8 5 2.4
Total 212 100.0
Six Cluster Solution
1 44 20.8
2 46 21.7
3 29 13.7
4 32 15.1
5 45 21.2
6 16 7.5
Total 212 100.0
Three Cluster Solution
1 44 20.8
2 46 21.7
3 122 57.5
Total 212 100.0
99
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 6. The "factor 1" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6) [remainder of caption illegible].
100
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 7. The "factor 2" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
101
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 8. The "factor 3" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
102
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 9. The "factor 4" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
103
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 10. The "factor 5" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
104
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 11. The "factor 6" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
105
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 12. The "factor 7" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
106
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 13. The "factor 8" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
107
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 14. The "factor 9" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
108
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 15. The "factor 10" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
109
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 16. The "factor 11" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
110
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 17. The "factor 12" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
111
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 18. The "factor 13" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
112
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 19. The "factor 14" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
113
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 20. The "factor 15" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
114
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 21. The "factor 16" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
115
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 22. The "factor 17" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
116
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 23. The "factor 18" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
117
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 24. The "factor 19" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
118
[Plot omitted: not recoverable from the scan. Horizontal axis: Number of Factors.]
Figure 25. The "factor 20" factor score centroids for six clusters across
the different factor solutions when clustering on factor scores (C1:
cluster 1, C2: cluster 2, C3: cluster 3, C4: cluster 4, C5: cluster 5,
C6: cluster 6).
119
for each factor solution. For example, Figure 6 shows "factor 1" factor
score centroids for the six clusters across different factor solutions.
The graphs show that factor score centroids differ markedly across
the different factor solutions. In Figure 6, "factor 1" factor score
centroid for cluster 1 changes significantly across the 19 different
factor solutions. The same is true for the centroids of the other five
clusters. Figures 7 ("factor 2" factor score centroids) to 24 ("factor
19" factor score centroids) show similar instability of factor score
centroids across the factor solutions (20, 19, 18, ..., 2). In Figure 7,
the "factor 2" factor score centroid for cluster 1 changes across the
different factor solutions. The results indicate that, when clustering
on factor scores, different factor solutions yield very different
clustering results in terms of cluster membership and cluster
description.
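A (factor score) centroid is simply the mean factor score of the members of a cluster, computed factor by factor. The sketch below shows the calculation for one factor solution; the scores and cluster labels are hypothetical.

```python
def cluster_centroids(scores, labels):
    """Return {cluster label: list of mean factor scores} for one solution."""
    sums, counts = {}, {}
    for row, label in zip(scores, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        # Accumulate the factor scores of this case into its cluster total.
        sums[label] = [s + x for s, x in zip(sums[label], row)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


# Hypothetical example: three cases, two factors, two clusters.
centroids = cluster_centroids(
    [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],
    [1, 1, 2],
)
```

Repeating the calculation for each factor solution and plotting one factor's centroid per cluster against the number of factors would give graphs of the kind described for Figures 6-25.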
Comparison Of Cluster Membership
As described in Chapter III, a crosstabulation technique and an
entropy (information) measure were employed to assess the effects of
alternative factor solutions on cluster membership. Tables 35 to 53
show the crosstabulation of cluster membership. In each table, the "20
factor" factor solution serves as the basis for (cluster membership)
comparison. Crosstabulations serve two primary functions. First, the
crosstabulations show the percentage of cases assigned to the same
cluster numbering (e.g., cluster 1) in two different clustering analyses
each based on factor scores from a different factoring solution (e.g.,
"20 factor" factor solution vs. "19 factor" factor solution). For
120
Table 35. Cluster membership crosstabulation of the 20-factor
solution and the 20-factor solution.
20-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 100.0 0.0 0.0 0.0 0.0 0.0
2 0.0 100.0 0.0 0.0 0.0 0.0
3 0.0 0.0 100.0 0.0 0.0 0.0
4 0.0 0.0 0.0 100.0 0.0 0.0
5 0.0 0.0 0.0 0.0 100.0 0.0
6 0.0 0.0 0.0 0.0 0.0 100.0
Table 36. Cluster membership crosstabulation of the 20-factor
solution and the 19-factor solution.
19-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 68.2 11.4 4.5 4.5 11.4 0.0
2 6.5 45.7 6.5 28.3 13.0 0.0
3 31.0 0.0 17.2 41.4 10.3 0.0
4 0.0 9.4 46.9 25.0 12.5 6.3
5 0.0 2.2 6.7 48.9 37.8 4.4
6 18.8 6.3 0.0 12.5 0.0 62.5
121
Table 37. Cluster membership crosstabulation of the 20-factor
solution and the 18-factor solution.
18-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 40.9 13.6 0.0 36.4 4.5 4.5
2 30.4 26.1 10.9 6.5 2.2 23.9
3 34.5 10.3 0.0 31.0 20.7 3.4
4 50.0 3.1 6.3 3.1 12.5 25.0
5 8.9 48.9 4.4 24.4 11.1 2.2
6 6.3 0.0 81.3 0.0 12.5 0.0
Table 38. Cluster membership crosstabulation of the 20-factor
solution and the 17-factor solution.
17-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 34.1 4.5 29.5 9.1 22.7 0.0
2 17.4 17.4 23.9 39.1 2.2 0.0
3 24.1 0.0 20.7 51.7 3.4 0.0
4 9.4 0.0 37.5 31.3 12.5 9.4
5 22.2 0.0 55.6 17.8 0.0 4.4
6 0.0 0.0 6.3 25.0 0.0 68.8
122
Table 39. Cluster membership crosstabulation of the 20-factor
solution and the 16-factor solution.
16-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 54.5 6.8 9.1 0.0 0.0 29.5
2 23.9 19.6 34.8 2.2 19.6 0.0
3 62.1 17.2 10.3 6.9 3.4 0.0
4 12.5 31.3 37.5 15.6 3.1 0.0
5 51.1 11.1 22.2 13.3 2.2 0.0
6 6.3 0.0 0.0 12.5 81.3 0.0
Table 40. Cluster membership crosstabulation of the 20-factor
solution and the 15-factor solution.
15-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 56.8 18.2 4.5 4.5 15.9 0.0
2 32.6 19.6 13.0 13.0 21.7 0.0
3 24.1 24.1 17.2 6.9 27.6 0.0
4 15.6 3.1 56.3 9.4 9.4 6.3
5 8.9 8.9 31.1 40.0 11.1 0.0
6 12.5 0.0 0.0 6.3 6.3 75.0
123
Table 41. Cluster membership crosstabulation of the 20-factor
solution and the 14-factor solution.
14-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 29.5 11.4 18.2 4.5 36.4 0.0
2 17.4 21.7 8.7 13.0 39.1 0.0
3 0.0 6.9 27.6 0.0 55.2 10.3
4 3.1 18.8 31.3 9.4 31.3 6.3
5 13.3 33.3 31.3 17.8 4.4 0.0
6 0.0 0.0 0.0 6.3 12.5 81.3
Table 42. Cluster membership crosstabulation of the 20-factor
solution and the 13-factor solution.
13-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 29.5 31.8 6.8 15.9 15.9 0.0
2 19.6 30.4 10.9 30.4 4.3 4.3
3 13.8 13.8 41.4 27.6 3.4 0.0
4 12.5 37.5 9.4 18.8 15.6 6.3
5 15.6 33.3 33.3 8.9 0.0 8.9
6 0.0 0.0 0.0 12.5 0.0 87.5
124
Table 43. Cluster membership crosstabulation of the 20-factor
solution and the 12-factor solution.
12-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 50.0 11.4 18.2 0.0 18.2 2.3
2 26.1 39.1 26.1 0.0 2.2 6.5
3 27.6 0.0 51.7 3.4 3.4 13.8
4 12.5 15.6 15.6 9.4 18.8 28.1
5 40.0 4.4 17.8 4.4 11.1 22.2
6 0.0 6.3 6.3 87.5 0.0 0.0
Table 44. Cluster membership crosstabulation of the 20-factor
solution and the 11-factor solution.
11-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 29.5 36.4 6.8 9.1 11.4 6.8
2 13.0 39.1 8.7 23.9 2.2 13.0
3 6.9 6.9 65.5 0.0 13.8 6.9
4 6.3 28.1 28.1 18.8 6.3 12.5
5 22.2 26.7 35.6 11.1 2.2 2.2
6 12.5 6.3 18.8 0.0 0.0 62.5
125
Table 45. Cluster membership crosstabulation of the 20-factor
solution and the 10-factor solution.
10-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 20.5 2.3 6.8 54.5 13.6 2.3
2 6.5 17.4 13.0 21.7 30.4 10.9
3 3.4 48.3 17.2 24.1 6.9 0.0
4 9.4 6.3 43.8 15.6 21.9 3.1
5 17.8 37.8 15.6 17.8 8.9 2.2
6 0.0 6.3 18.8 0.0 0.0 75.0
Table 46. Cluster membership crosstabulation of the 20-factor
solution and the 9-factor solution.
9-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 36.4 11.4 20.5 15.9 11.4 4.5
2 19.6 34.8 10.9 26.1 2.2 6.5
3 41.4 10.3 6.9 10.3 27.6 3.4
4 9.4 6.3 15.6 31.3 12.5 25.0
5 46.7 17.8 15.6 4.4 2.2 13.3
6 12.5 0.0 0.0 81.3 0.0 6.3
126
Table 47. Cluster membership crosstabulation of the 20-factor
solution and the 8-factor solution.
8-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 38.6 36.4 6.8 4.5 11.4 2.3
2 21.7 23.9 30.4 6.5 8.7 8.7
3 17.2 6.9 6.9 69.0 0.0 0.0
4 9.4 18.8 59.4 12.5 0.0 0.0
5 46.7 15.6 4.4 11.1 4.4 17.8
6 18.8 0.0 25.0 0.0 56.3 0.0
Table 48. Cluster membership crosstabulation of the 20-factor
solution and the 7-factor solution.
7-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 36.4 15.9 27.3 13.6 2.3 4.5
2 15.2 21.7 17.4 28.3 8.7 8.7
3 13.8 0.0 6.9 10.3 62.1 6.9
4 9.4 3.1 40.6 31.3 12.5 3.1
5 22.2 28.9 13.3 22.2 8.9 4.4
6 6.3 0.0 0.0 18.8 0.0 75.0
127
Table 49. Cluster membership crosstabulation of the 20-factor
solution and the 6-factor solution.
6-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 22.7 22.7 0.0 22.7 29.5 2.3
2 19.6 19.6 4.3 43.5 4.3 8.7
3 13.8 10.3 37.9 3.4 24.1 10.3
4 6.3 0.0 15.6 53.1 3.1 21.9
5 6.7 26.7 11.1 37.8 13.3 4.4
6 6.3 31.3 31.3 12.5 0.0 18.8
Table 50. Cluster membership crosstabulation of the 20-factor
solution and the 5-factor solution.
5-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 11.4 25.0 34.1 11.4 15.9 2.3
2 34.8 21.7 4.3 6.5 23.9 8.7
3 0.0 24.1 13.8 6.9 24.1 31.0
4 18.8 31.3 0.0 25.0 12.5 12.5
5 31.1 20.0 11.1 0.0 11.1 26.7
6 0.0 31.3 6.3 6.3 25.0 31.3
128
Table 51. Cluster membership crosstabulation of the 20-factor
solution and the 4-factor solution.
4-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 29.5 36.4 13.6 4.5 6.8 9.1
2 10.9 37.0 17.4 15.2 10.9 8.7
3 20.7 0.0 17.2 24.1 37.9 0.0
4 3.1 28.1 31.3 21.9 15.6 0.0
5 22.2 22.2 4.4 26.7 2.2 22.2
6 6.3 0.0 43.8 12.5 12.5 25.0
Table 52. Cluster membership crosstabulation of the 20-factor
solution and the 3-factor solution.
3-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 61.4 4.5 18.2 9.1 4.5 2.3
2 52.2 6.5 15.2 17.4 6.5 2.2
3 20.7 10.3 27.6 3.4 3.4 34.5
4 28.1 15.6 28.1 18.8 0.0 9.4
5 40.0 0.0 28.9 8.9 6.7 15.6
6 0.0 18.8 12.5 56.3 0.0 12.5
129
Table 53. Cluster membership crosstabulation of the 20-factor
solution and the 2-factor solution.
2-Factor Solution
20-Factor 1 2 3 4 5 6
Solution (percent)
1 38.6 0.0 9.1 29.5 2.3 20.5
2 39.1 10.9 8.7 23.9 8.7 8.7
3 6.9 27.6 34.5 20.7 10.3 0.0
4 9.4 25.0 21.9 28.1 3.1 12.5
5 40.0 8.9 13.3 17.8 15.6 4.4
6 0.0 25.0 31.3 6.3 37.5 0.0
example, in Table 36, about sixty-eight percent (68.2%) of the cases
that were grouped into cluster 1 when clustering was based on "20
factor" factor scores were also assigned to cluster 1 when clustering
was based on "19 factor" factor scores. Second, as indicated in the
methods chapter, the crosstabulations were used as the basis for
calculating entropy measures.
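The row percentages in Tables 35 to 53 can be produced from two vectors of cluster assignments, as in the sketch below. The function name and the five example cases are hypothetical; cluster labels are assumed to run from 1 to k.

```python
def crosstab_percent(labels_a, labels_b, k):
    """Row-percentage crosstabulation of two cluster assignments.

    Row i holds the percentage of the cases placed in cluster i by the
    first analysis that the second analysis placed in each cluster.
    """
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        counts[a - 1][b - 1] += 1
    table = []
    for row in counts:
        total = sum(row)
        # An empty cluster yields a row of zeros rather than a division error.
        table.append([100.0 * c / total if total else 0.0 for c in row])
    return table


# Hypothetical memberships of five cases under two factor solutions.
table = crosstab_percent([1, 1, 1, 2, 2], [1, 1, 2, 2, 1], k=2)
```

Here two of the three cases in the first solution's cluster 1 stay in cluster 1, so the first row reads about 66.7 and 33.3 percent.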
Table 35 shows the comparison of cluster membership between the
"20 factor" factor solution and the same "20 factor" factor solution when
clustering on 20 factor scores. The reason for this self-comparison is
to serve as a foundation (starting point) for calculating the entropy
measure. This self-comparison shows complete certainty (entropy is 0)
because all the diagonal elements in Table 35 are 100%, which means
that cluster 1 in the "20 factor" factor solution is exactly the same as
cluster 1 in the "20 factor" factor solution.
The membership crosstabulations (Tables 36 to 53) reveal two major
things about clustering and the membership of clusters. First,
numbering of the different clusters appears to have changed across
130
different cluster analyses. For example, in Table 36, cluster 3
formed on factor scores from the "20 factor" factor solution is
not likely to be the same as cluster 3 formed on the "19 factor"
factor scores. Only 17.2% of the cases assigned to cluster 3 are the
same for the "20 factor" and "19 factor" factor solutions. Cluster 3 in
the "20 factor" factor solution is more likely cluster 4 in the "19
factor" factor solution: about forty-one percent (41.4%) of cluster 3
("20 factor" factor solution) members are also in cluster 4 ("19 factor"
factor solution). This created a problem when it came to assessing the
impacts of the factor-cluster solution on the stability of clusters.
Second, cluster membership is not stable; it changes across
different factor solutions (e.g., "19 factor" factor solution vs. "18
factor" factor solution). The percentage of cases assigned to clusters
changed significantly. For example, comparing Table 36 with Table 37,
the percentage of cases (68.2%) assigned to cluster 1 when clustering
was based on the "20 factor" factor scores and "19 factor" factor scores
(see Table 36) changed to 40.9% (percentage of cases assigned to cluster
1) when clustering on "20 factor" factor scores and "18 factor" factor
scores (see Table 37). About twenty-seven percent (27.3%) of cases were
redistributed to other clusters.
Both the uncertainty of cluster numbering and the shift of cluster
membership led to the use of an entropy measure to assess the effects of
alternative factor solutions on cluster membership.
Based on the crosstabulation results (Tables 35 to 53, pages
120-129) and Formula 3 (discussed in Chapter III, page 52), an entropy
measure was calculated for each crosstabulation/comparison. The entropy
measures are presented in Table 54. The lower the entropy value, the
131
Table 54. Entropy measures (using the 20 factor solution as a basis of
comparison) of cluster membership for different factor
solutions.
Factor Solution
Comparison Entropy
20 - 20 0.0000
20 - 19 0.5181
20 - 18 0.5756
20 - 17 0.5170
20 - 16 0.5371
20 - 15 0.6174
20 - 14 0.5849
20 - 13 0.5964
20 - 12 0.5487
20 - 11 0.6083
20 - 10 0.5979
20 - 09 0.6112
20 - 08 0.8245
20 - 07 0.5942
20 - 06 0.6377
20 - 05 0.6788
20 - 04 0.6572
20 - 03 0.5727
20 - 02 0.6552
less the uncertainty of cluster membership between two different
factor-cluster analytic solutions. That is, when the entropy value is
low, changes in cluster membership between two different factor-cluster
analytic solutions are small, and cluster membership
is relatively stable. Large entropy values indicate instability:
the membership of clusters based on different factor solutions
is very different. For example, the uncertainty (membership
instability) of cluster membership increases when the basis for clustering
is the "15 factor" factor solution rather than the "16 factor" factor
132
solution. Uncertainty (membership instability) decreases when the
clustering basis changes from the "13 factor" factor solution to the "12
factor" factor solution.
The entropy measures for different factor solution comparisons are
plotted in Figure 26. The sudden downward and upward movements in
the plot indicate that cluster membership is very unstable across
factor solutions. The result also indicates that the greatest
instability occurs between the "9 factor" factor solution and the "7
factor" factor solution. Selecting a "9 factor" factor solution would
result in a clustering solution that is very different from a clustering
solution based on "8 factor" factor scores.
The entropy (information) measures indicate that cluster
membership is very unstable across clustering solutions based on
different factor scores (solutions). Thus, when clustering on factor
scores, different factor solutions (number of factors) will affect
cluster membership. The implication is that alternative factor
solutions (number of factors) will result in different clustering
results.
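The entropy comparison can be illustrated with a short sketch. Formula 3 is not reproduced in this section, so the function below is an assumption: it computes a conditional entropy of the comparison clustering given the reference clustering from a crosstabulation of counts, scaled by the log of the number of clusters. The function name and the example tables are hypothetical, and the values in Table 54 may rest on a different normalization.

```python
import math

def conditional_entropy(crosstab, normalize=True):
    """Entropy of the column clustering given the row clustering.

    crosstab[i][j] is the number of cases in cluster i of the reference
    solution and cluster j of the comparison solution (made-up layout).
    """
    total = sum(sum(row) for row in crosstab)
    n_cols = len(crosstab[0])
    h = 0.0
    for row in crosstab:
        row_total = sum(row)
        for count in row:
            if count > 0:
                # joint probability times log of the conditional probability
                h -= (count / total) * math.log(count / row_total)
    if normalize and n_cols > 1:
        h /= math.log(n_cols)  # scale to the range [0, 1]
    return h

identical = [[30, 0], [0, 20]]   # same members, same labels
relabeled = [[0, 30], [20, 0]]   # same members, cluster numbers swapped
shuffled = [[15, 15], [10, 10]]  # membership unrelated to the reference

for table in (identical, relabeled, shuffled):
    print(round(conditional_entropy(table), 4))
```

Because the measure depends only on how cases are redistributed, the relabeled table scores the same as the identical one, which is why an entropy measure sidesteps the cluster-numbering problem described earlier.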
Assessment of the Effect of Rotation on Cluster Membership
Objective two was to ascertain the effect of factor rotation on
the results of clustering (cluster membership). Nineteen (20, 19, 18,
..., 2) principal component analyses were again performed on the
importance ratings of the 20 campground attributes. However, the
initial factors were not rotated. The eigenvalues and percent of
variance explained for the factors are the same as the results derived
from factor analysis with varimax rotation (see Table 11). Factor
scores were again calculated using the regression estimate method.

[Figure 26. Entropy pattern of cluster membership across the different factor solutions.]
The factor scores were again used as input variables for a Ward's
clustering method (using squared Euclidean distance). Nineteen
different cluster analyses were performed, one on factor scores for each
of the 19 (nonrotated) factor analyses. In each case, a six-cluster
solution was selected to permit comparison of cluster membership with
the clusters generated on rotated factor scores (see previous section).
Table 55 shows the results of crosstabulation of clusters based on
rotated and nonrotated factor scores for the "20 factor" factor
solution. It shows that there is no difference in cluster membership.
The same is true for the other factor solutions (19, 18, 17, ..., 2).
Rotation (or nonrotation) of factors does not affect clustering results
when clustering is based on factor scores. Clustering results do not
change because rotating factors does not affect the goodness of fit of a
factor solution. This is because the communalities and the percentage
of total variance explained do not change.
Although rotation changes the factor matrix, the cluster
(membership) solution does not change because rotation does not change
the original relationship between variables. The distance between cases
for each variable is not changed by rotation.
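The invariance argument can be checked numerically. The following is a minimal sketch, assuming an orthogonal rotation (varimax is one) applied to made-up two-factor scores; the helper names are hypothetical. Since Ward's method with squared Euclidean distance depends only on the distances between cases, unchanged distances imply unchanged clusters.

```python
import math

def rotate(scores, angle):
    """Apply an orthogonal (rigid) rotation to two-factor scores."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in scores]

def squared_distances(scores):
    """Squared Euclidean distance for every pair of cases."""
    return [
        (ax - bx) ** 2 + (ay - by) ** 2
        for i, (ax, ay) in enumerate(scores)
        for (bx, by) in scores[i + 1:]
    ]

# Made-up factor scores for four cases (illustration only).
unrotated = [(1.2, -0.4), (0.3, 0.9), (-1.1, 0.2), (-0.4, -0.7)]
rotated = rotate(unrotated, math.radians(35))

# Every pairwise distance is the same before and after rotation.
same = all(
    math.isclose(d0, d1, abs_tol=1e-12)
    for d0, d1 in zip(squared_distances(unrotated), squared_distances(rotated))
)
print(same)  # True
```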
However, rotation of the initial factors can lead to a different
interpretation of clustering solutions because of the difference in
factor scores. Table 56 presents a comparison of factor score centroids
for clusters based on rotated and nonrotated factor scores for the "20
factor" solution. It shows that the cluster centroids are different for
Table 55. Crosstabulation of clustering results based on rotated
and nonrotated factors.

20 Rotated          20 Nonrotated Factors (percent)
Factors          1       2       3       4       5       6
1            100.0     0.0     0.0     0.0     0.0     0.0
2              0.0   100.0     0.0     0.0     0.0     0.0
3              0.0     0.0   100.0     0.0     0.0     0.0
4              0.0     0.0     0.0   100.0     0.0     0.0
5              0.0     0.0     0.0     0.0   100.0     0.0
6              0.0     0.0     0.0     0.0     0.0   100.0
Table 56. Comparison of factor score centroids for clusters based on rotated and
nonrotated factor scores for the "20 factor" solution.

              Rotated Approach                         Nonrotated Approach
                  Cluster                                   Cluster
Factor     1     2     3     4     5     6       1     2     3     4     5     6
 1       -.14   .06  -.49   .45   .13  -.17     .57   .09  -.51   .25  -.10 -1.13
 2        .17   .22   .15  -.27  -.21  -.21     .34   .29  -.61  -.55   .32  -.48
 3        .30   .17   .33  -.05   .16 -2.29     .20   .18  -.36   .16  -.55   .85
 4        .00  -.22   .23   .65  -.31  -.21     .07  -.27   .93  -.10  -.23  -.24
 5       -.20  -.02   .10   .58  -.24  -.05    -.39  -.02   .07   .51  -.03   .07
 6       -.12   .83  -.51   .10  -.32  -.43    -.24   .04  -.46   .83  -.09  -.01
 7       -.56   .02   .07   .41   .43  -.70    -.17  -.25   .72   .25   .16 -1.08
 8        .29   .26   .30  -.58  -.28  -.14    -.30   .44   .03   .40  -.48   .03
 9       -.15   .29  -.10  -.45   .31  -.19    -.19  -.34   .03  -.10   .30   .80
10        .04   .09  -.32   .75  -.37  -.25     .46   .17  -.13  -.50   .12  -.86
11        .43   .35  -.37  -.43  -.27   .09    -.59   .44  -.02   .39   .23 -1.05
12        .04   .03  -.72  -.09   .48  -.05     .11   .35   .60   .03  -.74  -.36
13        .77  -.45  -.27  -.14   .01  -.08    -.12   .02  -.18   .34   .05  -.22
14        .43  -.41   .01   .35  -.24  -.04    -.37   .64   .16  -.37  -.07  -.15
15        .35  -.05  -.15   .16  -.26  -.15     .16  -.31  -.02   .27   .08  -.25
16        .45   .26  -.14  -.11  -.45  -.22     .46  -.41  -.27   .55  -.15  -.25
17        .51  -.22 -1.16   .12   .46  -.21     .34  -.16  -.49   .27   .04  -.25
18        .08  -.21   .22  -.30   .24  -.11    -.47  -.03  -.12   .16   .47  -.08
19       -.29   .09   .18   .18  -.11   .17    -.27  -.11  -.05   .06   .34   .12
clusters on rotated and nonrotated factor scores (because the factor
matrix changes), even though the cluster membership is the same. Since
cluster centroids are used to label/describe clusters, rotating factors
will affect the interpretation of the clustering results. For example,
cluster 1 based on rotated factor scores would be labeled based on
factor 13 (.77), factor 17 (.51), and factor 7 (-.56). Cluster 1
formulated on nonrotated factor scores would be labeled based on factor
1 (.57), factor 16 (.46), and factor 11 (-.59). So, clusters comprised
of the same members would be described differently depending on whether
the clusters are based on rotated or nonrotated factor scores.
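A small sketch shows why the same members can earn different labels. It assumes made-up two-factor scores for a three-member cluster and a 45-degree orthogonal rotation; the names are hypothetical and the numbers are not taken from Table 56.

```python
import math

def rotate(scores, angle):
    """Apply an orthogonal rotation to two-factor scores."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in scores]

def centroid(scores):
    """Mean score on each factor for the members of one cluster."""
    n = len(scores)
    return tuple(sum(column) / n for column in zip(*scores))

# Made-up scores for the three members of one cluster.
cluster = [(1.0, 0.2), (0.8, -0.1), (1.2, 0.2)]

before = centroid(cluster)
after = centroid(rotate(cluster, math.radians(45)))

print(tuple(round(v, 2) for v in before))  # (1.0, 0.1)
print(tuple(round(v, 2) for v in after))   # (0.64, 0.78)
```

Before rotation the cluster would be described by the first factor alone; after rotation it scores high on both factors, although its members have not changed.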
Comparison of Clustering on Factor Scores
with Clustering on Raw Data
As mentioned previously, factor analysis is often performed as a
preliminary step to clustering in order to reduce a large number of
variables and make it easier to describe/label the resultant clusters.
Shutty and DeGood (1987) contended that clustering on factor scores
results in clusters which can be described more accurately. However,
reducing variables to a smaller number of dimensions also results in a
loss of information (e.g., percentage of total variance explained) which
affects the clustering results (e.g., membership). This section
compares clustering based on factor scores with clustering on raw data
(the importance ratings assigned to different campground attributes).
Clustering Results
Ward's clustering method (with squared Euclidean distance as the
distance measure) was used to group respondents based on the importance
they assigned to the 20 different campground attributes. Figure 27
shows the increase in the coefficient of hierarchy (which resulted from
the fusion of clusters) plotted against the number of clusters. Four
candidate cluster solutions were identified: six clusters, five
clusters, four clusters, and three clusters.
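The fusion criterion behind Ward's method can be sketched as follows. This is an illustration on made-up points, not the software used in the study; the function names are hypothetical. The cost of merging two clusters is the increase in the within-cluster error sum of squares, and a large jump in that cost is the kind of signal a plot such as Figure 27 is used to locate.

```python
def centroid(points):
    """Mean of each coordinate over the points in a cluster."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))

def ess(points):
    """Error sum of squares: squared distances of points to their centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)

def ward_cost(a, b):
    """Increase in the error sum of squares caused by merging a and b."""
    ca, cb = centroid(a), centroid(b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * gap

# Made-up clusters (illustration only).
a = [(1.0, 2.0), (3.0, 2.0)]
b = [(5.0, 6.0)]

# The closed-form cost equals the direct difference in error sums of squares.
direct = ess(a + b) - ess(a) - ess(b)
print(round(ward_cost(a, b), 4), round(direct, 4))
```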
ANOVAs were conducted to determine the extent of inter-cluster
differences among the four potential cluster solutions. For each of the
four potential solutions, there were statistically significant
differences among clusters on all 20 attributes (see Tables 57-60). The
primary weakness of the six-cluster solution is that one of the clusters
has fewer than 10 cases (see Table 61). However, the six-cluster
solution was still selected to enable comparisons with the six clusters
formulated on factor scores.
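For reference, the F-ratio reported for each attribute in Tables 57-60 is the usual one-way ANOVA statistic. The sketch below computes it for made-up ratings in three groups; the function name and the data are hypothetical.

```python
def f_ratio(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up importance ratings for one attribute in three clusters.
ratings = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(f_ratio(ratings), 2))  # 13.0
```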
Comparisons Between Clustering Approaches
Nineteen principal component analyses with varimax rotation were
performed on the importance ratings of the 20 campground attributes.
Again, the regression estimates method was used to calculate factor
scores. The (factor score) centroids for each of the six clusters
(based on raw data) were then calculated for each of 19 factor analyses
(20, 19, 18, ..., 2). They are graphically presented in Figures 28-47.
[Figure 27. Coefficient of hierarchy by number of clusters when clustering is based on raw data.]
Table 57. Mean attribute sought factor scores for the six-cluster
candidate solution when clustering on raw data.
Cluster
Factor 1 2 3 4 5 6 F-ratio
Large sites 3.27 3.39 3.60 2.48 2.75 3.50 8.80*
Shaded sites 3.53 2.91 3.55 2.81 2.90 2.83 6.23*
Cleanliness 2.02 1.77 2.24 1.52 1.55 1.50 7.09*
Quietness 2.70 3.02 3.05 2.20 2.32 3.33 7.09*
Site privacy 3.00 3.61 3.81 2.67 3.05 3.33 8.17*
Security 2.23 2.43 2.43 1.76 1.70 1.50 6.41*
Hospitality 2.67 2.68 2.88 1.95 1.92 2.00 8.66*
Low price 3.11 2.98 3.10 1.95 2.85 2.33 5.66*
Flush toilets 3.98 2.77 4.19 1.95 2.80 3.00 32.31*
Electricity 2.21 2.61 3.67 1.67 2.48 4.33 30.11*
Showers 3.81 2.30 3.76 1.67 2.58 2.67 37.93*
Laundromat 3.79 4.07 4.60 2.48 3.58 4.67 26.34*
Campground store 3.98 3.95 4.41 2.48 3.28 4.00 23.98*
Water hookups 2.44 3.34 4.16 1.86 2.42 4.67 36.68*
Sewer hookups 3.30 4.09 4.87 2.14 3.28 4.83 30.16*
Natural surr. 3.58 3.11 3.76 2.76 2.58 2.00 11.91*
Lake/stream 4.47 4.41 4.48 3.57 2.85 3.33 27.60*
Hiking trails 4.42 4.09 4.53 3.57 3.10 2.67 19.79*
Pool 4.21 4.23 4.60 3.14 3.02 3.67 19.26*
Playground 4.72 4.75 4.88 3.38 4.10 2.00 30.00*
* Significant at .05 level.
Table 58. Mean attribute sought factor scores for the five-cluster
candidate solution when clustering on raw data.
Cluster
Factor 1 2 3 4 5 F-ratio
Large sites 2.74 3.39 3.60 2.48 2.85 10.03*
Shaded sites 3.53 2.91 3.55 2.81 2.89 7.81*
Cleanliness 2.02 1.77 2.24 1.52 1.54 8.89*
Quietness 2.70 3.02 3.05 2.19 2.46 6.73*
Site privacy 3.00 3.61 3.81 2.67 3.09 10.12*
Security 2.23 2.43 2.43 1.76 1.67 7.96*
Hospitality 2.67 2.68 2.88 1.95 1.93 10.86*
Low price 3.11 2.98 3.10 1.95 2.78 6.67*
Flush toilets 3.98 2.77 4.19 1.95 2.83 40.47*
Electricity 2.21 2.61 3.67 1.67 2.71 27.88*
Showers 3.81 2.30 3.76 1.67 2.59 47.60*
Laundromat 3.79 4.07 4.60 2.48 3.71 29.10*
Campground store 3.98 3.95 4.41 2.48 3.37 28.36*
Water hookups 2.44 3.34 4.16 1.86 2.71 32.99*
Sewer hookups 3.30 4.09 4.59 2.14 3.48 31.68*
Natural surr. 3.58 3.11 3.76 2.76 2.50 14.33*
Lake/stream 4.47 4.41 4.48 3.57 2.91 33.89*
Hiking trails 4.42 4.09 4.53 3.57 3.04 24.36*
Pool 4.21 4.23 4.60 3.14 3.11 23.24*
Playground 4.72 4.75 4.88 3.38 3.83 22.62*
* Significant at .05 level.
Table 59. Mean attribute sought factor scores for the four-cluster
candidate solution when clustering on raw data.
Cluster
Factor 1 2 3 4 F-ratio
Large sites 2.74 3.39 3.60 2.73 12.54*
Shaded sites 3.53 2.91 3.55 2.87 10.42*
Cleanliness 2.02 1.77 2.24 1.54 11.91*
Quietness 2.70 3.02 3.05 2.37 8.48*
Site privacy 3.00 3.61 3.81 2.96 12.34*
Security 2.23 2.43 2.43 1.70 10.60*
Hospitality 2.67 2.68 2.88 1.94 14.55*
Low price 3.12 2.98 3.10 2.52 4.99*
Flush toilets 3.98 2.77 4.19 2.55 46.32*
Electricity 2.21 2.61 3.67 2.39 27.82*
Showers 3.81 2.30 3.76 2.30 53.11*
Laundromat 3.79 4.07 4.60 3.33 23.43*
Campground store 3.98 3.95 4.41 3.09 29.07*
Water hookups 2.44 3.34 4.16 2.45 38.33*
Sewer hookups 3.30 4.09 4.59 3.06 28.66*
Natural surr. 3.58 3.11 3.76 2.58 18.73*
Lake/stream 4.47 4.41 4.48 3.12 40.32*
Hiking trails 4.42 4.09 4.53 3.21 29.97*
Pool 4.21 4.23 4.60 3.12 31.13*
Playground 4.72 4.75 4.88 3.69 28.26*
* Significant at .05 level.
Table 60. Mean attribute sought factor scores for the three-cluster
candidate solution when clustering on raw data.
Cluster
Factor 1 2 3 F-ratio
Large sites 3.07 3.60 2.73 13.09*
Shaded sites 3.22 3.55 2.87 9.42*
Cleanliness 1.90 2.24 1.53 16.27*
Quietness 2.86 3.05 2.37 10.99*
Site privacy 3.31 3.81 2.96 13.07*
Security 2.33 2.43 1.70 15.25*
Hospitality 2.68 2.88 1.94 21.93*
Low price 3.05 3.10 2.52 7.29*
Flush toilets 3.37 4.19 2.55 42.84*
Electricity 2.41 3.67 2.39 39.07*
Showers 3.05 3.76 2.30 34.34*
Laundromat 3.93 4.60 3.33 33.81*
Campground store 3.97 4.41 3.09 43.80*
Water hookups 2.90 4.16 2.45 45.05*
Sewer hookups 3.70 4.59 3.06 34.20*
Natural surr. 3.34 3.76 2.58 24.92*
Lake/stream 4.44 4.48 3.12 60.68*
Hiking trails 4.25 4.53 3.21 42.93*
Pool 4.22 4.60 3.12 46.91*
Playground 4.74 4.88 3.69 42.57*
* Significant at .05 level.
Table 61. Number of respondents in each of the candidate cluster
solutions when clustering on raw data.
Number of Relative Size
Cluster Respondents (percent)
Six Cluster Solution
1 43 20.3
2 44 20.8
3 58 27.4
4 21 9.9
5 40 18.9
6 6 2.8
Total 212 100.1
Five Cluster Solution
1 43 20.3
2 44 20.8
3 58 27.4
4 21 9.9
5 46 21.7
Total 212 100.1
Four Cluster Solution
1 43 20.3
2 44 20.8
3 58 27.4
4 67 31.6
Total 212 100.1
Three Cluster Solution
1 87 41.0
2 58 27.4
3 67 31.6
Total 212 100.0
[Figures 28-47. The "factor 1" through "factor 20" factor score centroids for six clusters across the different factor solutions when clustering on raw data (C1: Cluster 1, C2: Cluster 2, C3: Cluster 3, C4: Cluster 4, C5: Cluster 5, C6: Cluster 6). Horizontal axis: number of factors.]
For example, Figure 28 shows the "factor 1" factor score centroids for
each of the six clusters for each of the 19 factor solutions.
The graphs show that factor score centroids of the different
clusters based on raw data do not differ significantly across the
different factor solutions. For example, Figure 28 shows that the
"factor 1" factor score centroid for each of the six clusters is
relatively stable across the different factor solutions. In comparison,
the factor score centroids of clusters formulated on the basis of factor
scores differ significantly across the different factor solutions (see
Figure 6). Figure 48 compares the cluster centroid stability across
factor solutions (20, 19, ..., 2) for clusters based on factor scores
and raw data. It reveals that the cluster centroids/memberships are more
stable when clustering is based on raw data.
The sum of squared distances between centroid points was calculated
for each of the six clusters for each of the two clustering approaches
(i.e., raw data and factor scores). Table 62 reports the sum of squared
distances for each of the six clusters for 18 different factor score
centroids.
The sums of squared distances were used as input to a computer
program (see discussion in Chapter III, page 63) to determine the best
set of matched clusters between clusters formulated on factor scores and
clusters formulated on raw data. The results are also shown in Table
62. The table shows which clusters are most similar. For example,
cluster 1 (based on factor scores) is most similar to cluster 5 (based
on raw data) for the "factor 1" factor score centroid pattern.
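The matching step can be pictured with a brute-force sketch. The program described in Chapter III is not reproduced here, so this is only one plausible approach under stated assumptions: given a hypothetical dissimilarity matrix between the clusters of the two approaches, try every one-to-one pairing and keep the one with the smallest total. With six clusters there are only 720 pairings to check.

```python
from itertools import permutations

def best_match(cost):
    """Return the pairing p minimizing the total of cost[i][p[i]].

    cost[i][j] is a dissimilarity between cluster i formulated on factor
    scores and cluster j formulated on raw data (hypothetical input).
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(best)

# Made-up 3 x 3 dissimilarities (illustration only).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(best_match(cost))  # [1, 0, 2]
```

In the example, factor-score cluster 0 pairs with raw-data cluster 1, cluster 1 with cluster 0, and cluster 2 with cluster 2, for a total dissimilarity of 5.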
Within the best set of matched clusters for different factors,
standard deviations of 18 different factor score centroids were
[Figure 48. Comparisons of the stability of cluster centroids based on factor scores with clustering based on raw data. Paired panels (Clustering on Factor Scores, Clustering on Raw Data) plot the centroids of Clusters 1-6 against the number of factors.]
Table 62. Comparison of stability of factor score patterns between two approaches.

   Clustering on Factor Scores       Clustering on Raw Data
Cluster  Sum of    Standard      Cluster  Sum of    Standard     Comparison
Order    Distance  Deviation     Order    Distance  Deviation    of Stability
Factor 1
1 4.612 0.374 5 0.305 0.208 a
2 6.909 0.488 4 0.472 0.186 a
3 6.656 0.466 3 0.084 0.109 a
4 9.453 0.510 2 1.672 0.312 a
5 15.527 0.584 6 20.342 1.197 b
6 12.783 0.620 1 5.685 0.569 a
Factor 2
1 5.455 0.384 5 1.137 0.367 a
2 4.477 0.426 3 0.290 0.182 a
3 11.008 0.447 2 3.120 0.333 a
4 10.812 0.547 1 4.333 0.401 a
5 11.919 0.670 6 14.028 1.040 b
6 7.999 0.599 4 1.952 0.552 a
Factor 3
1 3.257 0.290 4 3.152 0.318 b
2 5.611 0.405 3 0.692 0.157 a
3 11.800 0.559 6 41.624 0.914 b
4 5.820 0.432 2 4.486 0.505 b
5 8.807 0.478 1 3.093 0.400 a
6 10.647 0.690 5 2.464 0.280 a
Factor 4
1 2.926 0.388 5 0.973 0.233 a
2 6.696 0.406 4 4.364 0.330 a
3 10.353 0.500 6 7.702 0.588 b
4 8.567 0.514 1 3.276 0.375 a
5 17.068 0.829 2 4.177 0.378 a
6 4.331 0.413 3 0.858 0.179 a
Factor 5
1 2.700 0.374 5 0.354 0.127 a
2 5.959 0.410 4 4.697 0.353 a
3 11.189 0.509 6 9.398 0.573 b
4 5.479 0.437 2 0.759 0.154 a
5 2.506 0.314 3 0.510 0.128 a
6 15.803 0.174 1 2.332 0.263 b
Factor 6
1 3.188 0.342 5 0.915 0.184 a
2 16.430 0.708 6 15.796 0.786 b
3 7.591 0.465 2 1.667 0.222 a
4 5.346 0.407 4 3.180 0.369 a
5 8.322 0.464 3 0.378 0.102 a
6 14.872 0.678 1 1.175 0.198 a
Factor 7
1 3.642 0.374 5 1.062 0.200 a
2 13.045 0.658 4 1.768 0.332 a
3 0.953 0.256 3 0.373 0.108 a
4 4.180 0.366 2 1.136 0.217 a
5 9.233 0.563 1 0.965 0.175 a
6 31.822 0.900 6 15.005 0.704 a
Factor 8
1 1.975 0.400 5 0.566 0.189 a
2 6.710 0.543 6 3.072 0.428 a
3 1.867 0.354 4 0.338 0.161 a
4 4.038 0.412 3 0.263 0.117 a
5 10.315 0.550 1 2.084 0.256 a
6 2.966 0.298 2 1.481 0.250 a
Factor 9
1 1.500 0.290 5 0.202 0.171 a
2 3.886 0.431 4 1.231 0.264 a
3 2.287 0.389 3 0.124 0.095 a
4 4.670 0.357 2 2.164 0.302 a
5 5.048 0.700 1 0.688 0.154 a
6 5.519 0.440 6 21.845 0.022 a
Factor 10
1 1.449 0.281 5 1.334 0.287 b
2 0.662 0.172 3 0.155 0.080 a
3 2.549 0.311 2 2.172 0.266 a
4 10.712 0.741 6 16.980 0.996 b
5 6.052 0.474 1 0.382 0.187 a
6 9.541 0.840 4 1.128 0.304 a
Factor 11
1 6.897 0.531 5 1.256 0.221 a
2 1.026 0.214 4 0.639 0.240 b
3 1.491 0.361 3 0.134 0.090 a
4 1.833 0.306 2 0.756 0.222 a
5 7.628 0.469 1 0.178 0.081 a
6 11.581 0.687 6 10.224 0.693 b
Factor 12
1 7.001 0.487 6 2.201 0.292 a
2 1.176 0.234 5 1.126 0.266 b
3 3.188 0.422 4 1.112 0.229 a
4 3.832 0.442 2 0.672 0.193 a
5 0.619 0.247 3 0.062 0.069 a
6 0.671 0.281 1 0.188 0.127 a
Factor 13
1 2.348 0.366 6 1.304 0.389 b
2 1.687 0.312 4 0.011 0.036 a
3 0.945 0.314 2 0.791 0.270 a
4 2.476 0.346 1 0.290 0.204 a
5 1.945 0.290 3 0.086 0.104 a
6 1.757 0.275 5 1.122 0.366 b
Factor 14
1 0.476 0.238 5 0.394 0.194 a
2 0.561 0.237 3 0.044 0.076 a
3 0.647 0.223 2 0.369 0.165 a
4 4.965 0.575 6 7.415 0.592 b
5 0.841 0.222 1 0.394 0.194 a
6 1.302 0.280 4 1.830 0.338 b
Factor 15
1 0.550 0.213 5 0.277 0.153 a
2 0.845 0.214 3 0.058 0.087 a
3 0.562 0.298 2 0.359 0.170 a
4 3.446 0.483 6 8.554 0.733 b
5 2.221 0.389 4 1.952 0.308 a
6 0.327 0.187 1 0.185 0.110 a
Factor 16
1 0.604 0.311 3 0.056 0.089 a
2 1.002 0.242 4 1.289 0.354 b
3 0.200 0.266 2 0.151 0.134 a
4 0.630 0.214 1 0.583 0.287 b
5 3.447 0.490 6 3.817 0.698 b
6 0.736 0.237 5 0.353 0.222 a
Factor 17
1 0.609 0.427 5 0.249 0.225 a
2 0.238 0.160 1 0.215 0.258 b
3 1.398 0.554 3 0.263 0.196 a
4 0.060 0.068 4 0.118 0.141 b
5 1.816 0.399 6 3.890 0.772 b
6 0.439 0.188 2 0.390 0.224 b
Factor 18
1 0.071 0.092 6 0.011 0.048 a
2 0.003 0.020 5 0.000 0.009 a
3 0.289 0.310 4 0.004 0.033 a
4 0.346 0.233 3 0.002 0.019 a
5 0.014 0.041 2 0.000 0.012 a
6 0.527 0.320 1 0.005 0.037 a
Note: Two approaches are (i) clustering on factor scores and (ii) clustering on raw data.
a Clustering on factor scores has a larger standard deviation.
b Clustering on raw data has a larger standard deviation.
calculated for each cluster for the two clustering approaches. The
results are reported in Table 62. The higher the standard deviation,
the more unstable the cluster membership. Overall, the results indicate
that the approach of clustering on raw data was better than the approach
of clustering on factor scores in terms of cluster membership stability.
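The stability comparison in Table 62 can be sketched in a few lines. The centroid values below are made up, and the use of the sample standard deviation is an assumption, since the text does not say which form was used; the function name is hypothetical. The returned letter follows the table's footnotes: "a" when clustering on factor scores has the larger standard deviation, "b" otherwise.

```python
from statistics import stdev

def stability_label(factor_score_centroids, raw_data_centroids):
    """Compare the spread of one cluster's centroid across factor solutions."""
    sd_factor = stdev(factor_score_centroids)  # sample standard deviation
    sd_raw = stdev(raw_data_centroids)
    return "a" if sd_factor > sd_raw else "b"

# Made-up centroids of one matched cluster pair across four factor solutions.
on_factor_scores = [0.5, -0.3, 0.8, -0.6]
on_raw_data = [0.20, 0.25, 0.15, 0.20]

print(stability_label(on_factor_scores, on_raw_data))  # a
```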
CHAPTER V
CONCLUSIONS
The primary purpose of this study was to examine the impact of
factor analyses on cluster membership when clustering is based on factor
scores. Although many researchers have utilized factor analysis as a
prelude to clustering, very few have examined the potential effects of
alternative factor solutions (number of factors) on clustering results.
The study had three objectives: (1) to assess the effect of different
factor solutions (number of factors) on cluster membership, (2) to
ascertain the effect of factor rotation on cluster membership, and (3)
to compare clustering on factor scores with clustering on raw data.
This chapter presents a summary of the study, major conclusions, a
discussion of study limitations, and recommendations regarding the
combined use of factor analysis and cluster analysis.
Summary of the Study
The study utilized the importance ratings of 20 different
campground attributes/facilities collected in a study of the 1988
Michigan Campvention. Respondents rated the importance of these
attributes/facilities on a five-point scale ("1" being crucial and "5"
being not important).
Nineteen (20, 19, 18, ..., 2) different principal component
analyses with varimax rotation were performed on these data. Cluster
analysis was performed on the factor scores from the "20 factor" factor
analysis. A six-cluster solution was selected. Cluster analyses were
also performed on the factor scores from the other 18 factor analyses.
A six-cluster solution was derived for each of the other 18 factor
analyses. The stability of cluster membership was compared across the
18 different factor-cluster analyses using an entropy (information)
measure.
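As an illustration of how an entropy (information) measure can summarize membership stability, the sketch below takes a crosstabulation of a baseline clustering (rows) against a comparison clustering (columns) and computes the weighted average Shannon entropy, in bits, of each baseline cluster's members across the comparison clusters. The function names, the row/column layout, and the toy crosstabulations are hypothetical, and the sketch assumes the standard Shannon form; the exact formulation used in this study may differ.

```python
import math

def row_entropy(counts):
    """Shannon entropy (bits) of one row of a membership crosstabulation."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return sum(-p * math.log2(p) for p in probs)

def membership_entropy(crosstab):
    """Average row entropy, weighted by row size.

    Rows are baseline clusters and columns are comparison clusters
    (an assumed layout, not taken from the study).
    """
    n = sum(sum(row) for row in crosstab)
    return sum(sum(row) / n * row_entropy(row) for row in crosstab if sum(row) > 0)

# Hypothetical crosstabulations of 20 cases in two clusters.
stable = [[10, 0], [0, 10]]    # each baseline cluster stays intact
unstable = [[5, 5], [5, 5]]    # each baseline cluster splits evenly

print(membership_entropy(stable))
print(membership_entropy(unstable))
```

A value of zero means every baseline cluster maps intact onto a single comparison cluster; larger values mean its members are scattered across several clusters.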
Nineteen different principal component analyses without rotation
were performed on the attributes/facilities data. Cluster analyses were
again performed on the factor scores from each of these factor analyses.
A six-cluster solution was selected for each factor-cluster analysis.
The cluster memberships derived from the nonrotated factor scores were
compared (using membership crosstabulation) with the memberships of
clusters based on rotated factor scores.
Cluster analysis was performed to group respondents based on the
importance they assigned to the 20 different campground attributes. A
six-cluster solution was selected. Nineteen principal component
analyses with varimax rotation were performed on the 20 campground
attributes. Factor score centroids were calculated and graphed for each
of the six clusters across different factor solutions. The sum of
squared distances for each cluster on each factor was computed for both
clustering on raw data and clustering on factor scores. A computer
program was utilized to determine the best set of matched clusters
between the two clustering approaches. The standard deviations of factor
score centroids for each cluster across different factor solutions were
calculated and used as the basis for comparing the stability of cluster
membership derived from clustering on raw data with the stability of
cluster membership derived from clustering on factor scores.
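The stability comparison described above can be sketched as follows: for one matched cluster and one factor, collect the cluster's factor score centroid in each factor solution, take the standard deviation across solutions, and flag which approach has the larger value. The centroid values and function names are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption; the flags "a" and "b" follow the table notes.

```python
from statistics import stdev

def centroid_sd(centroids):
    """Sample standard deviation of one cluster's centroid on one factor,
    taken across factor solutions (sample vs. population SD is an assumption)."""
    return stdev(centroids)

def larger_sd_flag(sd_factor_scores, sd_raw_data):
    """Return 'a' if clustering on factor scores has the larger SD,
    otherwise 'b' (ties are reported as 'b' in this sketch)."""
    return "a" if sd_factor_scores > sd_raw_data else "b"

# Hypothetical centroids of one matched cluster on one factor
# across five factor solutions.
on_factor_scores = [0.9, -0.4, 1.3, 0.1, -0.8]
on_raw_data = [0.5, 0.4, 0.6, 0.5, 0.3]

sd_fs = centroid_sd(on_factor_scores)
sd_raw = centroid_sd(on_raw_data)
print(round(sd_fs, 3), round(sd_raw, 3), larger_sd_flag(sd_fs, sd_raw))
```

The larger the standard deviation, the more a cluster's position shifts from one factor solution to the next, which is read here as less stable membership.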
Major Conclusions
Three major conclusions were drawn from the analyses. First, when
factor analysis is used in conjunction with cluster analysis, the factor
solution (number of factors) selected has an effect on the cluster
membership. Different factor solutions generate different factor
scores, which result in different similarity measures. Different
similarity measures lead to different cluster solutions. As a result,
cluster membership is very unstable across clustering solutions based on
factor scores.
Second, whether or not the initial factors are rotated does not
affect cluster membership. Because the original relationship between
variables does not change when the initial factors are rotated, the
distance measure between cases for each variable in the clustering
procedure will not be changed. The difference between clustering on
rotated factor scores and clustering on nonrotated factor scores is that
clusters will be labeled differently.
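A small numerical check can make the second conclusion concrete. The sketch below assumes an orthogonal rotation (varimax is one) and that all retained factors enter the distance calculation; under those assumptions the squared Euclidean distance between any two cases is identical before and after rotation. The two-factor scores and the 35-degree angle are hypothetical. An oblique rotation would not in general preserve these distances.

```python
import math

def rotate(point, degrees):
    """Apply a 2-D orthogonal rotation to a pair of factor scores."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def sq_euclid(p, q):
    """Squared Euclidean distance between two cases."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Hypothetical unrotated factor scores for three cases on two factors.
scores = [(1.2, -0.5), (-0.3, 0.8), (0.4, 2.1)]
rotated = [rotate(p, 35) for p in scores]

pairs = [(0, 1), (0, 2), (1, 2)]
before = [sq_euclid(scores[i], scores[j]) for i, j in pairs]
after = [sq_euclid(rotated[i], rotated[j]) for i, j in pairs]
print(all(math.isclose(b, a) for b, a in zip(before, after)))
```

Because the distances are unchanged, a distance-based procedure such as Ward's method receives the same input either way, and only the interpretation of the factor axes differs.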
Third, clustering on raw data rather than factor scores results in
more stable cluster membership. Because factor analysis is used to
reduce observed variables into fewer dimensions by means of a linear
combination of the observed data, a certain amount of information
(percentage of variance explained) will be lost depending on the number
of factors selected. Thus, when clustering on the different factor
scores, the loss of information will result in significant changes in
cluster membership as compared to the cluster membership derived from
clustering on raw data (no information is lost).
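The information loss referred to here can be illustrated with eigenvalues from a principal component analysis. The sketch assumes the analysis is run on a correlation matrix, so the eigenvalues sum to the number of variables; the five eigenvalues and the function name are hypothetical and are not results from this study.

```python
def variance_lost(eigenvalues, n_factors):
    """Proportion of total variance not captured by the first n_factors
    components, given eigenvalues sorted from largest to smallest."""
    total = sum(eigenvalues)
    retained = sum(eigenvalues[:n_factors])
    return 1.0 - retained / total

# Hypothetical eigenvalues for five standardized variables (they sum to 5).
eigenvalues = [2.5, 1.2, 0.7, 0.4, 0.2]

for k in range(len(eigenvalues), 0, -1):
    print(k, round(variance_lost(eigenvalues, k), 3))
```

Retaining every component loses nothing, which corresponds to clustering on the full set of variables; each component dropped removes its share of the variance from the distances used in clustering.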
Although this study found that alternative factor solutions
affect cluster membership, this does not mean that the results of
previous studies using factor analysis in conjunction with cluster
analysis are methodologically and statistically incorrect. However, this
study raises some significant concerns about the impact of alternative
factor analyses on cluster analysis. These concerns should be
incorporated into future studies which utilize combined factor analysis
and cluster analysis.
Study Limitations
The study had five major limitations. First, the number of cases
that could be analyzed by the clustering software was limited. Not all
of the 424 respondents (cases) who rated all 20 campground attributes
could be clustered. This required selection of a subsample of 212
cases. As a result, some of the formulated clusters had fewer than 10
members. Calculation of chi-square statistics to compare cluster
membership differences was not possible because one or more of the cells
in the cluster crosstabulation tables had fewer than five members.
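The small-cell problem can be checked before attempting a chi-square test. The sketch below assumes the usual rule of thumb, which is stated in terms of expected (not observed) cell frequencies, and counts the cells of a membership crosstabulation whose expected frequency falls below five. The function names and the toy crosstabulation are hypothetical.

```python
def expected_counts(crosstab):
    """Expected cell frequencies under independence of rows and columns."""
    row_totals = [sum(row) for row in crosstab]
    col_totals = [sum(col) for col in zip(*crosstab)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def cells_below(crosstab, minimum=5.0):
    """Number of cells whose expected frequency is below `minimum`."""
    return sum(e < minimum for row in expected_counts(crosstab) for e in row)

# Hypothetical crosstabulation of two small clusterings (20 cases).
small = [[8, 1], [2, 9]]
print(expected_counts(small))
print(cells_below(small))
```

With six clusters in each solution the crosstabulation has 36 cells, so a subsample of 212 cases with some clusters of fewer than 10 members can easily leave many cells below the threshold.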
Second, although considerable thought was given to identifying
relevant campground attributes, there is no assurance that they
represent a complete list of all the relevant attributes sought. The
problem of identifying relevant attributes is not unique to this study,
but is rather inherent in classification, especially in segmentation
studies based on attributes and/or benefits sought.
Third, only Ward’s method with the squared Euclidean distance was
used. Other clustering techniques are available that have different
characteristics and procedures. These clustering techniques often yield
different clustering results because different similarity measures (for
hierarchical clustering methods) and different partitioning rules (for
nonhierarchical clustering methods) are used.
Fourth, although the entropy (information) measure was used to
assess the stability of cluster membership, no statistical test was used
to reject or accept the hypotheses.
Fifth, because the correspondence between the six clusters formed from
raw data and the six formed from factor scores is uncertain, a computer
program was used to identify the best set of matched clusters based on
the criterion of minimum total difference in the sums of squared
distances. However, there might be more appropriate ways to select the
matched clusters.
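One way such a matching could be carried out is an exhaustive search over all one-to-one pairings, which is feasible for six clusters (720 pairings). The sketch below is not the program used in this study; it assumes that "total difference" means the sum, over matched pairs and factors, of absolute differences between the two approaches' sums of squared distances. The three-cluster data and all names are hypothetical.

```python
from itertools import permutations

def best_match(ss_a, ss_b):
    """Find the one-to-one pairing of clusters with minimum total difference.

    ss_a[i][f] and ss_b[j][f] hold the sum of squared distances of
    cluster i (approach A) and cluster j (approach B) on factor f.
    Returns (pairing, total), where pairing[i] is the B cluster
    matched to A cluster i.
    """
    k = len(ss_a)

    def cost(i, j):
        # Assumed criterion: absolute difference summed over factors.
        return sum(abs(x - y) for x, y in zip(ss_a[i], ss_b[j]))

    def total_cost(perm):
        return sum(cost(i, perm[i]) for i in range(k))

    best = min(permutations(range(k)), key=total_cost)
    return best, total_cost(best)

# Hypothetical sums of squared distances: three clusters on two factors.
on_raw_data = [[1.0, 0.2], [0.3, 3.4], [0.6, 0.7]]
on_factor_scores = [[0.5, 0.8], [1.1, 0.3], [0.2, 3.8]]

pairing, total = best_match(on_raw_data, on_factor_scores)
print(pairing, round(total, 3))
```

Exhaustive search guarantees the minimum under the stated criterion, but the choice of criterion itself (absolute versus squared differences, or matching on shared members instead) remains open, which is the concern this limitation raises.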
Recommendations Regarding the Use of
Factor Analysis and Cluster Analysis
Six major recommendations are offered regarding the use of factor
analysis and cluster analysis. First, when factor analysis is performed
as a preliminary step to cluster analysis, the two should not be treated as
distinct analyses. The findings show that alternative factor solutions
will affect the clustering results (cluster membership). Researchers
who use factor scores as the basis for clustering should examine the
impact of alternative factor solutions on the clustering results.
Decisions regarding the number of factors should be based on both the
factor analysis criteria (eigenvalues greater than one, percentage of
variance explained, scree test) and the impact on the cluster solution.
Second, researchers may first perform cluster analysis based on
raw data for classification (segmentation) purposes, and then use factor
analysis as a means of describing clusters. Selection of variables (raw
data) to be used in cluster analysis should have theoretical support.
Also, when many variables are included in the study, researchers should
consider alternative methods (e.g., multiple discriminant analysis) to
determine which variables can contribute the most to the correct group
classification.
Third, the findings indicate that the entropy (information)
measure can be used as an indicator of cluster stability. The entropy
(information) measure has been commonly used in the fields of marketing,
management, finance, accounting, biology, communication, and geography.
It has rarely been used in the field of recreation. The results of this
study show that the entropy (information) measure provides a good
indicator with which to assess the uncertainty of cluster memberships.
The information measure can also be used to assess the stability of
derived clusters over time.
Fourth, the assessment of the impacts of alternative factor
analyses on the clustering results should be repeated with different
clustering data, similarity measures, and other clustering techniques
that produce different clustering results. Fifth, although a specially
designed computer program was used to assess the similarity of
clustering results formulated on raw data and factor scores,
alternatives to solve the problem of cluster matching should be examined
in the future. Finally, the entropy (information) measure was used to
assess the stability of cluster membership derived from clustering on
raw data and factor scores; however, researchers should investigate
appropriate statistical tests to use with the entropy (information)
measure.
BI BLIOGRAPHY
BIBLIOGRAPHY
Aczél, J., & Daréczy, Z. (1975). On measures of information and their
chagacterizations. New York: Academic Press.
Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis.
Newbury Park, CA: Sage Publications.
Allen, L. R. (1982). The relationship between Murrary's personality
needs and leisure interests. Journal of Leisure Research, l3(1),
63-76.
Anderberg, M. R. (1973). Cluster analysis for applications. New York:
Academic Press.
Arbuckle, J., & Friendly, M. L. (1977). On rotating to smooth functions.
Psychometrika, 4;, 127-140.
Archer, C. O., & Jennrich, R. I. (1973). Standard errors for rotated
factor loadings. Psychometrika 38 581-592.
Armstrong, J. 8., & Soelberg, P. (1968). On the interpretation of factor
analysis. Psychological Bulletig, 19(5), 361-364.
Attaran, M., & Guseman, D. (1988). An investigation into the nature of
structural changes within the service sector in the U.S. Journal
of the Market Researgh Society, ;Q(3), 387-396.
Attaran, M., & Zwick, M. (1987). Entropy and other measures of
industrial diversification. ngxgerly Journal of Business and
mm. 16(4). 17-34.
Bartholomew, D. J. (1985). Foundations of factor analysis: Some
practical implications. British Jouggal of Mathematical and
Statistical Psychology, 38(1), 1-10.
Bartko, J. J., Strauss, J. S., & Carpenter, W. T. (1971). An evaluation
of taxometric techniques for psychiatric data. Classification
Society Bulletin, 2(1), 1-27.
Bartlett, M. S. (1937). The statistical conception of mental factors.
British Journal oi Psychology, gg, 97-104.
181
182
Bartlett, M. S. (19SO,June). Tests of significance in factor analysis,
8pitish Journal of §tatistical Psychology, 3, 77-85.
Bartlett, M. S. (1951,march). A further note on tests of significance in
factor analysis. tish Journal 0 Statistical Psychology,
3, 1-2.
Bayne, C. K., Beauchamp, J. J., Begovich, G. L., & Kane, V. E. (1980).
Monte Carlo comparisons of selected clustering procedures.
Pattern Recognitiop, l8, 51-62.
Beale, E. M. L. (1969). Euclidean cluster analysis. Bulletin of the
International Statistical Institute, 83, 92-94.
Beard, J. G., & Ragheb, M. G. (1983). Measuring leisure motivation.
Journal of Leisure Research, l3(3), 219-228.
Beecher, M. D. (1989). Signaling systems for individual recognition: An
information theory approach. Animal Behaviour, 38(2), 248—261.
Bieber, S. L., 6 Smith, D. V. (1986). Multivariate analysis of sensory
data: A comparison of methods. Chemical Senses, ll(1), 19-47.
Bishara, H. I. (1984). Aggregate dividend decision making in Canadian
life insurance companies. 5310p 8uaipe§§ and Economic Review,
13(2), 6-14.
Blashfield, R. K. (1976). Mixture model tests of cluster analysis:
Accuracy of four agglomerative hierarchical methods.
Psychological Bulletin, 83(3), 377-388.
Blashfield, R. K. (1978). The literature on cluster analysis.
Multivapiate 8ehavioral Research, l3, 271-295.
Bobko, P., & Schemmer, F. M. (1984). Eigenvalue shrinkage in principal
components based factor analysis. Applied Psychological
Measuzemepp, 8(4), 439-451.
Boggis, J. G., & Held, 1. (1971). Cluster analysis-a new tool in
electricity usage studies. Jourpal of the Market Research
Society, l3(2), 49-66.
Browne, M. W. (1968a). A comparison of factor analytic techniques.
Psychometrika 33 267-334.
Browne, M. W. (1968b). A note on lower bounds for the number of common
factors. Psychometrika, 33(2), 233-236.
Calantone, R. J., & Cross, A. C. (1980). The impact of segment dynamics
on retail bank advertising strategies. In D. W. Scotton & R. L.
Zallocco (Ed.), Readings in market segmentation (pp. 126-142).
Chicago, IL: American Marketing Association.
183
Calantone, R. J., 6 Johar, J. S. (1984). Seasonal segmentation of the
tourism market using a benefit segmentation framework. Journal
pf Travel gesearch 83(2), 14-24.
Calantone, R., Schewe, C., 6 Allen, C. T. (1980). Targeting specific
advertising messages at tourist segments. In Hawkings, Shafer,
and Rovelstad (Eds.), Tourism marketing and management issues
(pp. 149-160). Washington, DC: George Washington University
Press.
Carroll, J. B. (1953). An analytic solution for approximating simple
structure in factor analysis. Psychometrika, l8, 23-38.
Cattell, R. B. (1966). The scree test for the number of factors.
Multivariate Behavioral Research, l(2), 245-276.
Collins, L. M., Cliff, N., 6 Cudeck, R. A. (1983). Patterns of crime in
a birth cohort. Multivariate Behavioral Research, l8(3),
235-258.
Comrey, A. L. (1973). A first course in factor apalysis. New York:
Academic Press.
Connelly, N. A. (1987). Critical factors and their threshold for camper
satisfaction at two campgrounds. Jourpal pr Leisure Research,
l8(3), 159-173.
Coovert, M. D., 6 McNelis, K. (1988). Determining the number of common
factors in factor analysis: A review and program. Educational and
anghplogical Measuremehr, 88(3), 678-692.
Crask, M. R. (1980). Segmenting the vacation market: Identifying the
vacation preferences, demographics, and magazine readership of
each group. Journa o rave esearch, 38(2), 29-34.
Davis, D., Allen, J., 6 Cosenza, R. M. (1988). Segmenting local
residents by their attitudes, interests, and opinions toward
tourism. Journal of Travel Research, 81(2), 2-8.
Day, E., Fox, R. J., 6 Huszagh, S. M. (1988). Segmenting the global
market for industrial goods: Issues and implications.
lpterpational Marketing Review, 3(3), 14-27.
Day, G. S., 6 Heeler, R. M. (1971). Using cluster analysis to improve
marketing experiments. ou na a ketin esearch
340-347.
DeSarbo, W. 8., Carroll, J. D., 6 Clark, L. A. (1984). Synthesized
clustering: A method for amalgamating alternative clustering
bases with differential weighting of variables. Psychometrika,
38, 57-78.
184
Devall, B., 6 Garry, J. (1981). Who hates whom in the great outdoors:
The impact of recreational specialization and technologies of
play. Laisure 3cienpe, 8(4), 399-418.
Dielman, T. E., Cattell, R. B., 6 Wagner, A. (1972). Evidence on the
simple structure and factor invariance achieved by five
rotational methods on four types of data. Mpltivariate Behavioral
gesearph, 1, 223-231.
Ditton, R. B., Goodale, T. L., 6 Jonsen, P. K. (1975). A cluster
analysis of activity, frequency, and environment variables to
identify water-based recreation types. Journal of Leisure
Reaearch, 1(4), 282-295.
Donderi, D. C. (1988). Information measurement of distinctiveness and
similarity. Perceptioh aha Payphophysics, 88(6), 576-584.
Dreger, R. M., Fuller, J., 6 Lemoine, R. L. (1988). Clustering seven
data sets by means of some or all of seven clustering methods.
M v te eh v a e a , 23(2), 203-230.
Driver, H. E., 6 Kroeber, A. L. (1932). Quantitative expression of
cultural relationships. Arphaeplogy and Ethnology, 3i, 211-256.
Dubes, R., 6 Jain, A. K. (1979). Validity studies in clustering
methodologies. garterp Repogpition, ll, 235-254.
Dulewicz, S. V., 6 Keenay, G. A. (1979). A practically oriented and
objective method for classifying and assigning senior jobs.
Jou a o 0 cu at ona s cho , 52(3), 155-166.
Edelbrock, C. (1979). Comparing the accuracy of hierarchical clustering
algorithms: The problem of classifying everybody. Multivariate
Behavioral Research, l8, 367-384.
Edelbrock, C., 6 McLaughlin, B. (1980). Hierarchical cluster analysis
using intraclass correlations: A mixture model study.
Mulrivariata 8ehavioral geaearch, l3, 299-318.
Ellis, G. D., 6 Rademacher, C. (1987). Development of a typology of
common adolescent free time activities: A validation and
extension of Kleiber, Larson, and Csikszentminalyi. Journal of
Laiaure gesearch, l3(4), 284-292.
Everitt, B. S. (1974). Cluster apalysis. London: Heinemann Educational
Books Ltd.
Everitt, B. S. (1979). Unresolved problems in cluster analysis.
Biometrics, 33(1), 169-181.
Frank, R. E., 6 Green, P. E. (1968). Numerical taxonomy in marketing
analysis: A review article. Jourpal of Marketing Research, 3(1),
83-98.
185
Funkhouser, G. R. (1983). A note on the reliability of certain
clustering algorithms. qurnal of harhetipg Research, 88(1),
99-102.
Furse, D. H., Punj, G. N., 6 Stewart, D. W. (1984). A typology of
individual search strategies among purchasers of new automobiles.
gpprpal of Cppapmer Research, 18(4), 417-431.
Garrison, C., 6 Paulson, A. (1973). An entropy measure of the geographic
concentration of economic activity. Epopomic Geography, 88,
319-324. '
Gartner, W. B. (1990). What are we talking about when we talk about
entrepreneurship? Joprpal pf 8paipa§§ Venruring, 3, 15-28.
Gau, G. W. (1978). A taxonomic model for the risk-rating of residential
mortgages. qurnal oi Business, 31(4), 687-706.
Gnanadesikan, R., 6 Wilk, M. B. (1969). Data analytic methods in
multivariate statistical analysis. In P. R. Krishnaiah (Ed.),
Multivariate analysia 11 (pp. 593-638). New York: Academic Press.
Goodrich, J. N. (1980). Benefit segmentation of U.S. international
travelers: An empirical study with American Express. In Hawkins,
Shafer, and Rovelstad (Eds.), Ippriap parkeripg and management
ifiéflfifi (pp. 133-147). Washington, DC: George Washington
University Press.
Gorman, B. S. (1983). The complementary use of cluster and factor
analysis methods. ou a of e m nta ducation, 3l(4),
165-168.
Gorsuch, R. L. (1983). Eactor analysia. Hillsdale, NJ: Lawrence
Erlbaum Associates, Publishers.
Green, D. W., Sommers, M. 8., 6 Kernan, J. B. (1973). Personality and
implicit behavior patterns. Jourpal 98 Marketing Research, l8,
63-69.
Green, P. E., Frank, R. E., 6 Robinson, P. J. (1967). Cluster analysis
in test market selection. Managemept Science, l3(8), 387-400.
Green, P. E., 6 Rao, V. R. (1969). A note on proximity measures and
cluster analysis. Joarpal pf harkaripg gesearch, 8(3), 359-364.
Hair, J. F., Anderson, R. E., 6 Tatham, R. L. (1987). Multivariate data
analyaia wirh readipga. New York: Macmillan.
Hakstian, A. R. (1976). Two-matrix orthogonal rotation procedures.
W0 41! 267'272 '
186
Hakstian, A. R., 6 Muller, V. J. (1973). Some note on the number of
factors problem. Multivariate Behavioral Research, 8(4),
461-475.
Hamer, R., 6 Cunningham, J. (1981). Cluster analyzing profile data
confounded with interrater differences: A comparison of profile
association measures. Applieg Esychological Measurement, 3,
63-72.
Harraigan, K. R. (1985). An application of clustering for strategic
group analysis. Strategic Managemepr Journal, 8(1), 55-73.
Harris, M. L., 6 Harris, C. W. (1971). A factor analytic interpretation
strategy. Educational and Esychological Measurement, 31,
589-606.
Hautaluoma, J., 6 Brown, P. J. (1979). Attributes of the deer hunting
experience: A cluster-analytic study. Journal of Leisure
Research, 18(4), 271-287.
Hawes, D. K. (1988). Travel-related lifestyle profile of older women.
Journal of Travel Research, 81(2), 22-32.
Heeler, R. M., Whipple, T. W., 6 Hustad, T. P. (1977). Maximum
likelihood factor analysis of attitude data. Journal of Marketing
Research, 18(1), 42-51.
Hegwood, J. L. (1987). Experience preferences of participants in
different types of river recreation groups. Journal of Leisure
Research, 18(1), 1-12.
Henderson, K. A., 6 Stalnaker, D. (1988). The relationship between
barriers to recreation and gender-role personality trait for
women. Jourpal of Leisure gesearch, 88(1), 69-80.
Hollender, J. W. (1977). Motivational dimensions of the camping
experience. Journal of Leisure Research, 8(2), 133-141.
Hooper, M. (1985). A multivariate approach to the measurement and
analysis of social identity. Payphplogical Report, 31(1),
315-325.
Horn, J. L. (1965a). An empirical comparison of methods for estimating
factor scores. Educational and Psychological Measurement, 83(2),
313-322.
Horn, J. L. (1965b). A rationale and test for the number of factors in
factor analysis. Psychomatriha, 38(2), 179-185.
Horst, P. (1965). Factor analysis of data matrices. New York: Holt,
Rinehart 6 Winston.
187
Humphrey, A. B., Buechner, J. S., 6 Velicer, W. F. (1987).
Differentiating geographic areas by socioeconomic
characteristics. Northeast qurnal of Business 6 Economics,
13(2), 47-64.
Huszagh, S. M., Fox, R. J., 6 Day, E. (1985). Global marketing: An
empirical investigation. Columbia Journal of World Business,
88(4), 31-43.
Jones, D. S. (1979). E ementa orma o theor . Oxford: Clarendon
Press.
Jones, F. L. (1968). Social area analysis: Some theoretical and
methodological comments illustrated with Australian data. British
Journal of Sociology, 18, 424-444.
Jéreskog, K. G. (1978). Structural analysis of covariance and
correlation matrices. Psychometrika 43 443-477.
Kaiser, H. F. (1960). The application of electronic computers to factor
analysis. Educational and Paychological Measurement, 88,
141-151.
Kaiser, H. F. (1970,December). A second generation little jiffy.
anchometrika, 33, 401-415.
Kaiser, H. F., 6 Rice, J. (1974,8pring). Little jiffy mark IV.
Enucational and Esychological Measurement, 38, 111-117.
Kass, R. A., 6 Tinsley, H. E. A. (1979). Factor analysis. Journal of
Laisure Research, 11(2), 120-138.
Kiel, G. C., 6 Layton, R. A. (1981). Dimensions of consumer information
seeking behavior. qurnal of Markering Research, 18(2), 233-239.
Kikuchi, H. (1986). Segmenting Michigan's sport fishing market:
valu t o 0 two a oac es. Unpublished doctoral dissertation,
Michigan State University.
Kim, J. 0., 6 Mueller, C. W. (1989). Eaptpr analysis: Statistical
narhods and practical iasues. Newbury Park, CA: Sage
Publications.
Kim, L., 6 Lim, Y. (1988). Environment, generic strategies, and
performance in a rapidly developing country: A taxonomic
approach. Apadeny pf flanagemenr Journal, 31(4), 802-827.
Knopp, T. B., 6 Merriam, L. C. (1979). Toward a more direct measure of
river use preferences. Journal of eisure Research, 11(4),
317-326.
Kotler, P. (1984). Marketing managemenr; Analysis, planning, and
ponrrpl. Englewood Cliffs, New Jersey: Prentice-Hall.
188
Krazanowski, W. J., 6 Lai, Y. T. (1988). A criterion for determining the
number of groups in a data set using sum-of-square clustering.
W. 55(1). 23-34.
Krippendorff, K. (1986). Infornaripn rhapry; §rrnctura1 nodels for
gnalirapiya_8ara. Newburg Park, CA: Sage Publications.
Krzystofiak, F., Newman, J. M., 6 Anderson, G. (1979). A quantified
approach to measurement of job content: Procedures and payoffs.
Eersonnel Esychology, 38(2), 341-358.
Lathrop, R. G. (1987). The reliability of inverse scree tests for
cluster analysis. Educatipnal ana Rsychological Measurement,
81(4), 953-959.
Lesser, J. A. (1988). Entropy and the prediction of consumer behavior.
8ehavioral 8cience, 33(4), 282-291.
Lessig, V. P., 6 Tollefson, J. D. (1971). Market segmentation through
numerical taxonomy. Journal pf flarkating fiesearch, 8, 480-487.
Lindell, M. K., 6 St. Clair, J. B. (1980). Tukknife: A jacknife
supplement to canned statistical packages. Educational and
Esychologipal heaauramenr, 88, 751-754.
Lounsbury, J. W., 6 Hoopes, L. L. (1988). Five-year stability of leisure
activity and motivation factors. Journal pf Laisure Research,
88(2), 118-134.
Love, J. (1987). Commodity concentration and export instability: The
choice of concentration measure and analytical framework.
Journal of Developing Areas, 81(1), 63-74.
Mahoney, E. M., Oh, I. K., 6 On, 3. J. (1989). A study of the National
Campers and hikers Association's 1988 Michigan Campvention. Dept.
of Parks and Recreation Resources, Michigan State University.
Manfredo, M. J., Driver, B. L., 6 Brown, P. J. (1983). A test of
concepts inherent in experience based setting management for
outdoor recreation areas. Jpprnal pf Laianra gesearch, 13(3),
263-283.
Mark, J. H. (1980). Identifying neighborhoods for preservation and
renewal: Comment. Grpwrh and Qhanga, 11(4), 47-48.
Marriott, F. H. C. (1971). Practical problems in a method of cluster
analysis. 8ipmetrips, 81(3), 501-514.
Mazanec, J. A. (1984). How to detect travel market segments: A
clustering approach. Journal of Iraval Research, 83(1), 17-21.
McDonald, R. P., 6 Burr, E. J. (1967). A comparison of four methods of
constructing factor scores. anghpnetrika, 38(4), 381-401.
189
McIntyre, R. M., 6 Blashfield, R. K. (1980). A nearest-centroid
technique for evaluating the minimum-variance clustering
procedure. Multivariate Behavioral Research, 13(2), 225-238.
Meade, N. (1987). Strategic positioning in the UK car market. European
Journal of flarhating, 81(5), 43-56.
Milligan, G. W. (1980). An examination of the effects of six types of
error perturbation on fifteen clustering algorithms.
Psychometrika, 83, 325-342.
Milligan, G. W., 6 Cooper, M. C. (1985). An examination of procedures
for determining the number of clusters in a data set.
Psychometrika, 38(2), 159-179.
Mojena, R. (1977). Hierarchical grouping methods and stopping rules: An
evaluation. Ihe Computer Journal, 88(4), 359-363.
Moojjaart, A. (1985). Factor analysis for non-normal variables.
Psychometrika, 38, 323-342.
Norusis, M. J. (1988). Spssch+ advanced statistics V2.0. Chicago, IL:
SPSS Inc.
Oh, I. K. (1990). Evaluatio o the ect've ess of a cam in refund
0 a d he elat on o ' c aracteristics.
Unpublished doctoral dissertation, Michigan State University.
Overall, J. E. (1964). Note on multivariate methods for profile
analysis. Psychological 8ulletin, 81(3), 195-198.
Perreault, W. D., Darden, D. K., 6 Darden, W. R. (1977). A psychographic
classification of vacation life styles. Journal of Leisure
Research, 8(3), 208-225.
Punj, G., 6 Stewart, D. W. (1983). Cluster analysis in marketing
research: Review and suggestions for application. Journal of
Marketing gesearch, 88, 134-48.
Rand, W. M. (1971). Objective criteria for the evaluation of clustering
methods. Jou a o the Ame ica s ical Association,
88, 846-850.
Rescorla, L. (1988). Cluster analytic identification of autistic
preschoolers. Journal of Autism and Developmental Disorders,
18, 475-492.
Rohlf, F. J. (1970). Adaptive hierarchical clustering schemes.
Systamarip zoology, 19, 58-82.
Sampson, P., 6 Pergentino, de Fmendes de Almeida. (1979). A note on
selecting the appropriate factor analytic solution from several
available. European Research, 1(5), 212-217.
190
Saunders, D. R. (1961). The rationale for an oblimax method of
transformation in factor analysis. Esychometrika, 88, 317-324.
Saunders, J. A. (1985). Cluster analysis for market segmentation.
Enrppean Journal of harkering, 18, 422-435.
Schaninger, C. M., 6 Buss, W. C. (1986). Removing response—style effects
in attribute-determinance ratings to identify market segments.
Journal of 8usine§§ Reaearph, 18(3), 237-252.
Sethi, S. P. (1971). Comparative cluster analysis for world markets.
Journal of Marketing Research, 8, 348-354.
Shannon, C. E. (1948a). A mathematical theory of communication. Bell
System Technology Journal, 81, 379-423.
Shannon, C. E. (1948b). A mathematical theory of communication. Bell
System Iechnplogy Journal, 81, 623-656.
Shoemaker, S. (1989). Segmentation of the senior pleasure travel market.
qurnal or Iravel Reaearch, 81(3), 14-21.
Shutty, M. S., 6 DeGood, D. E. (1987). Cluster analyses of responses of
low-back pain patients to the SCL-90: Comparison of empirical
versus rationally derived subscales. Rehabilitation Psychology,
38(3), 133-144.
Skinner, H. A. (1978). Differentiating the contribution of elevation,
scatter, and shape in profile similarity. Educational and
Paychplogical Measurenenr, 38(2), 297-308.
Skinner, H. A. (1979). Dimensions and clusters: A hybrid approach to
classification. Applied Psyphological Measurement, 3(3),
327-341.
Smith, S. L. J. (1989). Tourism analysis; A handbook. New York: John
Wiley 6 Sons.
Sneath, P., 6 Sokal, R. (1973). Numerical taxonomy. San Francisco: w. H.
Freeman.
Sokal, R., 6 Rohlf, F. (1962). The comparison of dendrograms by
objective methods. Taxon, 11, 33-40.
Sokal, R., 6 Sneath, P. (1963). rin les of numerical taxonomy. San
Francisco: W. H. Freeman.
Sorce, P., Tyler, P. R., 6 Loomis, L. M. (1989). Lifestyles of older
Americans. The Journal of §eryice Marketing, 3(4), 37-47.
Stanley, T. J., Powell, T., 6 Danko, W. D. (1987). Trust marketing:
Courting a segmented market. Irusts 8 Estates, 126(11), 14-20.
191
Starr, M. K. (1980). Some new fundamental considerations of
variety-seeking behavior. 8ehavioral Science, 83(3), 171-179.
Stewart, D. W. (1981). The application and misapplication of factor
analysis in marketing research. Journal of Marketing Research,
18(1), 51-62
Stynes, D. J. (1983). Marketing Tourism. Leisure today. Journal of
WW. 3(4). 21-23.
Stynes, D. J., 6 Mahoney, E. M. (1980). a downhill ski marketing
WWW- (Research Report No. 391). East
Lansing: Michigan State University, Agricultural Experiment
Station.
Swinyard, W. R., 6 Struman, K. D. (1986). Market segmentation: Finding
the heart of your restaurant's market. Cornell Hotel 6 Restaurant
Administration Quarrerly, 81(1), 89-96.
Tatham, R. L., 6 Dornoff, R. J. (1971). Marketing segmentation for
outdoor recreation. Jonrnal or Laisure Research, 3(1), 5-16.
Thorndike, R. L. (1953). Who belongs in a family? anchometrika, 18,
267-276.
Thurstone, L. L. (1935). Iha veprpra pf nind. Chicago: University of
Chicago Press.
Tinsley, H. E. A., 6 Johnson, T. L. (1984). A preliminary taxonomy of
leisure activities. qurnal pf Laiaure Eesearch, 18(3), 234-244.
Tinsley, H. E. A., 6 Kass, R. A. (1979). The latent structure of the
need satisfying properties of leisure activities. Journal of
Leisure Research, 11(4), 278-291.
Tucker, L. R. (1971). Relations of factor score estimates to their use.
mm. 33(4). 427-436.
Tucker, L. R., Koopman, R. F., 6 Linn, R. C. (1969). Evaluation of
factor-analytic research procedures by means of simulated
correlation matrices. fiayphpnarriha, 38, 421-459.
Velicer, W. F. (1976a). Determining the number of components from the
matrix of partial correlations. Esyphpmetrika, 81, 321-327.
Velicer, W. F. (1976b). The relation between factor score estimates,
image scores, and principal component scores. Educational and
anphplogipal geasurenanr, 38, 149-159.
Wahlers, R. G., 6 Etzel, M. J. (1985). Vacation preference as a
manifestation of optimal stimulation and lifestyle experience.
W. 11(4). 283-295.
192
Ward, J. H. (1963). Hierarchical grouping to optimize an objective
function. Jou na 0 the er ca tatistical Association,
38, 236-244.
Williams, W. (1971). Principles of clustering. Annual Review of Ecology
and Systematipa, 8, 303-326.
Wind, Y. (1978). Issues and advances in segmentation research. Journal
pf flarkering gaaearch, 13, 317-337.
Wolfe, J. H. (1970). Pattern clustering by multivariate mixture
analysis. Multivariate 8ehavipral Rasearch, 3, 329-350.
Wolfe, J. H. (1978). Comparative cluster analysis of patterns of
vocational interest. hultivariate Behavioral Research, 13(1),
33-44.
Woodside, A. G., 6 Motes, W. H. (1981). Sensitivity of market segments
to separate advertising strategies. Journal pf Marketing, 83,
63-73.
Zubin, J. (1938). A technique for measuring like-mindedness. Journal of
Ab orma Social 5 c 10 , 33, 508-516.
Zwick, W. R., 6 Velicer, W. F. (1986). Comparison of five rules for
determining the number of components to retain. Psychological
Bulletin, 88, 432-442.
APPENDIX A
Pretrip Questionnaire
1 H CH1 Ail if" ST Y
Michigan State University, Michigan Division of State Parks, Michigan Association of Private Campground Owners, and the
National Campers and Hikers Association are conducting a comprehensive study of persons who attend the 1988 MICHIGAN
CAMPVENTION being held at Highland Recreation Area. The study will provide information which will be useful in decisions
regarding future campventions.
We will also be sending you another brief questionnaire after you return home from your trip to gather information on
your satisfaction with the 1988 Campvention and camping in Michigan.
If you are planning to attend the 1988 Campvention, please complete the following questionnaire and return it to us in the
attached postage paid envelope. Please take the time to complete the questionnaire. Without your help the study will not
be successful. We guarantee that your response will remain strictly confidential.
1. DATE YOU COMPLETED this QUESTIONNAIRE: __/__/__ (MONTH/DAY/YEAR)
2. Will the 1988 Campvention be the FIRST National Campers and Hikers Campvention you have attended?
_ Yes (the 1988 will be my FIRST CAMPVENTION) (GO TO QUESTION [illegible])
_ No -> Did you attend the 1987 [illegible]? _ Yes _ No (GO TO QUESTION [illegible])
3. [illegible]:
3a) On your entire CAMPVENTION TRIP (This includes nights at the Campvention, nights in [illegible] before and after the
Campvention, and nights in other states traveling to and from the Campvention)
Number of total NIGHTS away from home
3b) At the [illegible] CAMPVENTION SITE: Number of NIGHTS
3c) At campgrounds in [illegible]: Number of nights at other campgrounds
3d) At [illegible]: Number of nights
(3a should equal the sum of 3b, 3c, and 3d)
1988 MICHIGAN CAMPVENTION TRIP
4. [illegible]
4a) List the AGES of the persons who will [illegible]: Person 1 __, Person 2 __, Person 3 __,
Person 4 __, Person 5 __, Person 6 __, Person 7 __.
5. On your MICHIGAN CAMPVENTION TRIP what type of camping equipment will you utilize?
_ Tent          _ Camping Trailer        _ Travel Trailer
_ Motorhome     _ Van/Van Conversion     _ 5th Wheel
_ Other
6. [illegible] nights at the Campvention, nights in Michigan before and after the Campvention, and nights in other
states traveling to and from the Campvention.
Total 1988 CAMPVENTION TRIP NIGHTS ____
7. How many nights are you planning to camp AT THE (Michigan) CAMPVENTION SITE
located in Highland Recreation Area? (The CAMPVENTION will last 7 nights)
Number of nights at the Michigan CAMPVENTION SITE ____
8. Other than the nights AT THE CAMPVENTION SITE, are you planning to camp additional nights in MICHIGAN either
before or after the Campvention?
_ No (GO TO QUESTION 15)
_ Yes -> How many ADDITIONAL NIGHTS (not counting nights at the CAMPVENTION SITE) are you planning to camp in
Michigan? Number of additional nights ____ (GO TO QUESTION 9)
9. Will you likely SELECT, or have you already selected, the campground(s) (OTHER THAN CAMPVENTION SITE)
you will stay at in Michigan BEFORE LEAVING HOME on the trip?
_ No (GO TO QUESTION 11)    _ Yes (GO TO QUESTION 9a)
9a) Have you already selected the campground(s) (OTHER THAN CAMPVENTION SITE) you will stay at in Michigan?
_ No
_ Yes -> How many Michigan campgrounds [illegible]? Number of campgrounds ____
10. Will you make, or have you already made, reservations at these campgrounds (OTHER THAN CAMPVENTION SITE) before
leaving home on the trip?
_ Yes -> 10a) Have you ALREADY made reservations at campgrounds in Michigan? _ No _ Yes
11. In your registration package there is an offer for a REFUND OF [illegible] for each night you spend camping at a
Michigan State Park or member of Michigan Association of Private Campground Owners.
The refund offer will not apply to other Michigan campgrounds or nights at the Campvention site.
[illegible]
_ No    [illegible]
12. What SOURCES OF INFORMATION will you rely on [illegible] to select the campground(s) [illegible]
you will stay at in Michigan? (Please check all that apply)
_ Rand McNally Camping Directory    _ Campground brochures
_ Woodall's Camping Directory       _ Recommendations from other campers at CAMPVENTION
_ Michigan Campground Directory     _ Recommendations from campers you meet in Michigan campgrounds
_ Highway signs                     _ Past camping experience in Michigan
_ Trailer Life                      _ Recommendations of friends & relatives
_ Other (specify)
13. CIRCLE THE NUMBERS (1-6) on the map at the right TO SHOW THE REGIONS of
Michigan YOU PLAN TO CAMP IN WHILE ON THE 1988 CAMPVENTION TRIP.
CIRCLE the numbers of ALL REGIONS you are planning to camp in.
ONLY CIRCLE REGION 1 IF YOU PLAN TO CAMP AT CAMPGROUNDS (OTHER THAN
CAMPVENTION SITE) in this region.
14. Have you already written or called, or do you plan to write or call, for additional Michigan travel/recreational
information?
_ No
_ Yes -> 14a) Which Organization(s) have you written or called, or will you write or call for more information?
_ Michigan Travel Bureau                    _ West Michigan Tourism Association
_ Michigan Dept. of Natural Resources       _ Southwest Michigan Tourism Association
_ East Michigan Tourism Organization        _ Upper Peninsula Tourism Association
_ Southeast Michigan Tourism Organization   _ Other (Specify)
15. Please rate the IMPORTANCE of the following CAMPGROUND AMENITIES AND FACILITIES WHEN SELECTING A CAMPGROUND.
CAMPGROUND ATTRIBUTES    Crucial    Very Important    Important    Somewhat Important    Not Important
Large sites
Shaded sites
Cleanliness
Quietness
Site privacy
Security
Hospitality of campground staff
Low price
Flush toilets
Electricity
Showers
Laundromat
Campground store
Water hookups
Sewer hookups
Natural surroundings
Situated on a lake/stream
Hiking trails
Pool
Playgrounds
16. Do you USUALLY prefer to camp in public or private (commercial) campgrounds?
_ Public campground    _ Private (commercial) campground    _ No preference
17. Who is USUALLY MOST INFLUENTIAL in deciding which campground you stay at?
_ Myself    _ My spouse    _ Children    _ Family (Group) decision    _ Other
18. Approximately how many nights did you camp LAST YEAR (1987)? (If you didn't camp, write "0" on the line)
19. How many of these nights were OUTSIDE THE STATE WHERE YOU LIVE? (If none, write "0" on the line)
20. How many states (not including your home state) did you camp in during 1987? (If no other states write "0")
21. Do you USUALLY camp BEFORE Memorial Day? _ No _ Yes
22. Do you USUALLY camp AFTER Labor Day? _ No _ Yes
23. Have you EVER camped in Michigan? _ No _ Yes -> When was the last year you camped in MICHIGAN? 19__
24. Based on your impressions, experience, information from others, or travel/camping literature, please complete
the following perception of MICHIGAN CAMPGROUNDS, which include public and private campgrounds.
Michigan campgrounds:    Strongly Agree    Agree    Disagree    Strongly Disagree    No Impression
are very large (number of campsites)
are inexpensive
are crowded
have hospitable campground staff
offer many (in-campground) recreation facilities
provide large campsites
are clean
are quiet
are family oriented
offer modern hookups (electric, sewer, water)
are secluded
provide modern restroom/shower facilities
are safe/secure
are well maintained
25. Are you a RESIDENT of MICHIGAN? _ Yes _ No -> 25a) Have you [illegible] in Michigan? _ Yes
25b) Do you have family/friends LIVING in Michigan? _ Yes _ No
25c) Will you VISIT them on your Campvention trip? _ Yes _ No
26. What is the [illegible] of YOUR PERMANENT RESIDENCE?
27. Are you male or female? _ Female _ Male
28. Are YOU retired? _ No _ Yes
29. Are you presently: _ Single _ Divorced/widowed _ Separated
_ Married -> Is your spouse retired? _ Yes _ No
30. Do you have children LIVING AT HOME [illegible]?
_ No _ Yes -> What are their ages? Child 1 __ Child 2 __ Child 3 __ Child 4 __ Child 5 __
APPENDIX B
Differences in The Importance Ratings of Different
Campground Attributes Between The Two Subsamples
Table 63. Differences in the importance ratings of different
campground attributes between two subsamples.
                      Subsample I            Subsample II
Campground                   Standard               Standard
Attributes           Mean    Deviation      Mean    Deviation    Significance
Large sites          3.11      1.01         3.08       .96          .730
Shade sites          3.20       .92         3.10       .94          .250
Cleanliness          1.88       .74         1.82       .62          .435
Quietness            2.76       .89         2.74       .92          .872
Privacy              3.33       .99         3.33       .96         1.000
Security             2.16       .88         2.18       .89          .784
Hospitality          2.50       .94         2.42       .82          .057
Low price            2.90      1.00         3.02       .93          .193
Flush toilets        3.33      1.17         3.20      1.24          .085
Electricity          2.75      1.09         2.56       .99          .062
Shower               3.00      1.13         2.96      1.21          .083
Laundromat           3.92       .99         4.05       .89          .164
Store                3.81       .97         3.76       .98          .583
Water hookups        3.10      1.23         2.98      1.14          .289
Sewer hookups        3.74      1.18         3.63      1.18          .325
Natural Surr.        3.22      1.06         3.18      1.00          .707
Lake/stream          4.03      1.03         3.95       .95          .378
Hiking trail         4.00      1.02         4.17       .92          .081
Swimming pool        3.98      1.09         3.87      1.12          .333
Playgrounds          4.44       .97         4.46       .94          .839
Note: Significant at .05 level.
APPENDIX C
Comparisons of Factoring Results Between Subsamples
Table 64. Comparisons of factoring results between two subsamples.
          Subsample I                          Subsample II
Factor   Eigenvalue   Percent(a)     Factor   Eigenvalue   Percent(a)
   1       5.601        28.0            1       4.087        20.4
   2       1.938         9.7            2       2.244        11.2
   3       1.699         8.5            3       1.878         9.4
   4       1.329         6.6            4       1.402         7.0
   5       1.168         5.8            5       1.149         5.7
   6       1.091         5.5            6       1.110         5.5
   7       1.020         5.1            7       1.024         5.1
   8        .802         4.0            8        .959         4.8
   9        .677         3.4            9        .831         4.2
  10        .619         3.1           10        .745         3.7
  11        .574         2.9           11        .668         3.3
  12        .546         2.7           12        .629         3.1
  13        .505         2.5           13        .616         3.1
  14        .476         2.4           14        .520         2.6
  15        .440         2.2           15        .489         2.4
  16        .385         1.9           16        .402         2.0
  17        .326         1.6           17        .387         1.9
  18        .294         1.5           18        .324         1.6
  19        .278         1.4           19        .293         1.5
  20        .231         1.2           20        .243         1.2
(a) Percent of variance explained.
APPENDIX D
The computer program for finding sets of matched clusters
#include <stdio.h>
#define Max 6                 /* rows and columns 1..Max of table are searched */
#define One '\x01'            /* single bit, shifted to build the column mask */
#define Maximum 99999999
#define TRUE 1
#define FALSE -1
int index[7], Best_Choice[7];
float table[7][7], Min;
float t1[7], t2[7];
main(argc, argv)
int argc;
char *argv[];
{
    int i,j,k, depth;
    unsigned a,b,c, mask;
    float sum;
    FILE *fp;
    if(argc >1)
        if( (fp = fopen( *++argv,"r")) == NULL ) printf("error\n");
    Read_data(fp);
    Min = Maximum;
    /* pair row 1 with each column i in turn, then search the remaining rows */
    for( i = 1; i <= Max; i++ ){
        depth = 1;
        mask = One << i;
        index[depth] = i;
        sum = table[1][i];
        Comb_Search( mask, depth, sum );
    }
    PrintResult();
} /* End of Program */
Comb_Search( mask, depth, sum )
unsigned mask;
int depth;
float sum;
{
    int i, j, k;
    float T_sum;
    unsigned T_mask;
    depth ++;
    for( i = 1; i <= Max; i++ ){
        if( ( mask & ( One << i )) == 0 ){    /* column i not yet used */
            T_mask = mask | (One << i);
            index[depth] = i;
            T_sum = sum + table[depth][i];
            if( depth < Max )
                Comb_Search( T_mask, depth, T_sum );
            else {
                if( T_sum <= Min ){           /* all rows paired: keep the smallest total */
                    Min = T_sum;
                    for( k = 1; k <= Max; k++ )
                        Best_Choice[k] = index[k];
                } /* end if */
            } /* end else */
        } /* end if */
    } /* end for */
} /* end Comb_Search */
Read_data(p)
FILE *p;
{
    char c;
    float a, b, diff;
    int i,j,k, count;
    int flag, terminate, start;
    for(i=1;i<3;i++){
        for(j=1;j