FACTOR ANALYSIS AND MULTIPLE SCALOGRAM ANALYSIS: A LOGICAL AND EMPIRICAL COMPARISON

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
Donald M. Wilkins
1962

This is to certify that the thesis entitled "Factor Analysis and Multiple Scalogram Analysis: A Logical and Empirical Comparison," presented by Donald M. Wilkins, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Psychology.

Major professor
Date: 17 August 1962

ABSTRACT

FACTOR ANALYSIS AND MULTIPLE SCALOGRAM ANALYSIS: A LOGICAL AND EMPIRICAL COMPARISON

by Donald M. Wilkins

The purpose of the studies undertaken by this author was to investigate the difficulties involved in using either factor analysis or multiple scalogram analysis and also to study the relationships which exist between them. The results should give those contemplating the use of either technique on dichotomous data additional information, so that the investigator can make a better decision as to which of these methods, if either, would be most appropriate for his purposes.

The theoretical analysis of the two techniques indicated that factor analysis has three major advantages over multiple scalogram analysis: it allows for multiple classification of the variables, it results in orthogonal factors (independent groupings), and it utilizes more of the relationships among the variables (lesser correlations as well as the highest). The empirical data suggested that for dichotomous data only the last of these advantages is salient (although for other data the multiple classification of items might be important). That the factors were orthogonal was not important, because multiple scalogram analysis produced similar groupings which were correlated without the weightings (factor loadings).
The theoretical discussion of multiple scalogram analysis indicated the following advantages for that technique: it requires fewer complicated mathematical manipulations of the data; it indicates order among the variables and relations among the respondents; and it gives an indication of how well the model of multiple scalogram analysis is met, namely the reproducibility coefficient. The analyses of empirical data suggest that the first of these is unimportant, because despite the differences in the mathematical demands the results were similar.

One of the major difficulties pointed out in the theoretical discussion of factor analysis was the possibility of difficulty factors. In the empirical data, at no time did a factor appear which could be considered a difficulty factor, which indicates that such factors do not always occur among the main factors, even when the items being studied differ widely in level of difficulty.

One difficulty with multiple scalogram analysis in the study of dichotomous data is that items with extreme marginal proportions will scale with almost any item, whether the two items are related or not. Another problem is that when items are approximately equal in difficulty level it is harder to achieve an acceptable scale; however, as the results of the analyses of empirical data indicate, if the marginals differ only slightly and the correlations are sufficiently high, multiple scalogram analysis can still group them together.

The major finding of this study is that factor analysis and multiple scalogram analysis generally give quite similar results in a wide variety of empirical data, particularly when the factoring is done in conjunction with the quartimax rotation. This similarity of results seems to stem primarily from the close association between the starting points for the two techniques: the calculation of phi coefficients for factor analysis and the calculation of agreement scores for multiple scalogram analysis.
FACTOR ANALYSIS AND MULTIPLE SCALOGRAM ANALYSIS: A LOGICAL AND EMPIRICAL COMPARISON

By
Donald M. Wilkins

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Psychology
1962

ACKNOWLEDGEMENTS

I would like to express my gratitude to Dr. Charles F. Wrigley, who offered constant intellectual stimulation and encouragement. Dr. Terrence M. Allen also gave freely of his time and energies. I would also like to thank Dr. Alfred G. Dietze for his help in the final phases of the thesis.

A special note of thanks is offered to Dr. James Shaffer, Department of Agricultural Economics, who supported this research in its early phases from funds in one of his grants.

Many others have helped during this research, and, while all cannot be named specifically, Hilda Jaffe should be thanked for her help in editing the manuscript, and my wife, Karel, should be thanked for her help in typing the rough copies and also for aiding in the phraseology.

To all these and many others I acknowledge a great debt of gratitude.

DEDICATION

To my wife and my parents, who waited patiently through it all.

TABLE OF CONTENTS

INTRODUCTION
    The Model of Factor Analysis
    Problems in the Factor Analytic Model
    The Model of Guttman's Scalogram Analysis
    The Multiple Scalogram Analysis Model
RELATIONSHIPS AMONG THE MODELS
    Factoring Single Scale Data
    Factoring Multiple Scale Data
    Scaling Unifactorial Data
    Scaling Multifactorial Data
ANALYSES OF EMPIRICAL DATA
    The Guttman Data
    The Knudsen Data
    The Trier Data
    Discussion of the Empirical Results
SUMMARY AND CONCLUSIONS
BIBLIOGRAPHY

LIST OF TABLES

TABLE
 1   Rotated Factors Illustrating the Appearance of a Difficulty Factor
 2   Example of Four Types of Persons Responding to Three Questions
 3   Fourfold Table of Question 1 With Question 2
 4   Fourfold Table of Question 2 With Question 3
 5   Fourfold Table of Question 1 With Question 3
 6   The Latent Roots of a Perfect Six-Item Guttman Scale With Rectangular Distribution of Scale Scores
 7   Two Uncorrelated Scales With Perfectly Correlated Items
 8   Factor Analysis of Form A of Guttman's "Attitude Towards Officers" Scale
 9   Factor Analysis of Form B of Guttman's "Attitude Towards Officers" Scale
10   A Factor Analysis of the Votes in the Twelfth United Nations General Assembly Meeting
11   A Scalogram Analysis of the Votes in the Twelfth United Nations General Assembly Meeting
12   The Three Highest Correlations With Each Variable and Their Classifications Into the Same Groups By the Two Analyses
13   Number of Variables Classified in Groups of Variables With Which They Are Most Highly Correlated
14   Items From Decision Making Questionnaire Arranged According to Factors in Which They Appear
15   Items From Decision Making Questionnaire Arranged According to Scales in Which They Appear
16   The Three Highest Correlations With Each Variable and Their Classification Into the Same Group By the Two Analyses
17   Number of Variables Classified in Groups of Variables With Which They Are Most Highly Correlated

LIST OF FIGURES

FIGURE
 1   Curves of Loadings of Items on Factors
 2   A Perfect Guttman Scale
 3   Cell-frequency Table for Computation of Phi Coefficients or Agreement Scores

LIST OF APPENDICES

APPENDIX
  I   An Example With More Levels of Difficulty Than Factors
 II   Unrotated and Rotated Factor Loadings of Items on Factors
III   Correlation Matrices of the Two Guttman "Attitude Towards Officers" Forms
 IV   Intercorrelations Among the Items In the Knudsen Data
  V   Intercorrelations Among the Items In the Trier Data

INTRODUCTION

In recent years electronic computers have become available for use by the social scientist. This has made it possible for masses of data to be analyzed rapidly. Previously, the analysis of such masses of data would have taken months, or even years, if one tried to do it at all. The social scientist with large amounts of data has, therefore, begun to turn to the computer. Unfortunately, some investigators have decided how to analyze data almost solely on the basis of the techniques available on the computer. Others have used computer techniques without sufficient knowledge of some of the pitfalls which might be involved. It should be appropriate, therefore, to investigate some of the problems which might be encountered while utilizing some of these computer techniques.

One of the first techniques used by social scientists to be made available on the computer was factor analysis. Over the years many controversies have arisen concerning factor analysis, many of which are unresolved today. In addition, special problems arise when factor analysis is applied to dichotomous data.

Recently Lingoes has developed multiple scalogram analysis, an extension of Guttman's scalogram analysis, which he feels is more appropriate for the analysis of dichotomous data than factor analysis. He has also made this technique available for use by computers.
The purpose of the studies undertaken by this author was to investigate the difficulties involved in using either of these techniques and also to study the relationships which exist between them. The results should give those contemplating the use of either technique additional information, so that the investigator can make a better decision as to which of these methods, if either, would be most appropriate for his purposes.

Guttman (1950) stated that factor analysis in the Spearman-Thurstone sense was designed for the study of "quantitative" variables and criticized the use of factor analysis for the study of "qualitative" variables. "Quantitative" variables, as he defined them, can have linear regressions, have normal distributions, have their interrelationships measured by product-moment correlation coefficients, and have all their partial and higher-order relationships calculated from the zero-order correlation coefficients alone. "Qualitative" variables, to Guttman, lack two of these basic features: the relationships between items cannot be expressed in simple analytical expressions like linear equations and product-moment correlation coefficients, and the higher-order relationships cannot be calculated from the zero-order relationships alone. He offered as examples of "qualitative" variables attitudinal items such as questions requiring yes-no, true-false, and agree-disagree types of answers.

Guttman (1941) proposed scalogram analysis as a means of analyzing certain "qualitative" data. See the section on the Scalogram Analysis Model for an explanation of the restrictions on such data.

Guttman's scalogram analysis has itself been criticized, particularly for his use of the concept of "universe of content" (Festinger, 1947; Loevinger, 1948). Lingoes (1960) proposed his own multiple scalogram analysis as an extension of the Guttman model and an alternative to factor analysis.
When alternative methods of analyzing data are available, it is valuable to know the difficulties likely to be encountered in each method and the similarities and differences in the results obtained by each method. The primary purpose of this series of studies was to provide this information for factor analysis and multiple scalogram analysis; however, a thorough investigation of the problem requires consideration of certain aspects of scalogram analysis as well. In the following discussion the three models are first taken up separately, and the difficulties inherent in them are discussed. Then follows a logical comparison of factor analysis and multiple scalogram analysis, primarily in regard to what happens when data which fit one model are analyzed by the other method. Finally, the results of analyzing empirical data by each technique are presented and compared, and conclusions are drawn regarding their relative effectiveness under various conditions.

The Model of Factor Analysis

Traditionally there are two ways of thinking about factor analysis. The first is as a data reduction technique. When the method is used for this purpose, no model is necessary, the prime considerations being parsimony and "meaningfulness." The second way is based on the assumption that factors represent real variables underlying the tests. It is from this second viewpoint that factor analysis was originally developed and the model constructed.

Factor analysis rests upon two basic formulae. The first of these is R = FF', where R is the correlation matrix, F is the matrix of factor loadings, and F' is the transposed matrix of factor loadings. This means that the original matrix of correlations can be reproduced from the matrix of factor loadings. However, since an infinite number of solutions will satisfy this requirement, various factor analytic techniques have been developed and proposed as the most appropriate solutions for determining the underlying factors.
In this series of studies the technique used was the principal axes method, where the basic equation above becomes $R = U \Lambda U'$ (with $U \Lambda^{1/2} = F$ and $\Lambda^{1/2} U' = F'$), in which U is the matrix of latent vectors and $\Lambda$ is the matrix of latent roots of the R matrix.

The second formula underlying factor analysis is M = FP, where M denotes the original measures of the people on the tests, F is the matrix of factor loadings, and P is the matrix of the factor scores (scores of the people on the factors). This means that the original scores of the people on the tests can be reconstructed from the factor loadings and the factor scores.

Problems in the Factor Analytic Model

Factor analysis, when applied to dichotomous data, presents several problems. The first relates to the basic algebraic concept of "significant digits." When the calculations are performed with greater accuracy (i.e., using more decimal places) than was present in the original measures, errors are likely to occur in the results. The original measures in dichotomous data are 1's and 0's. When such data are factored, the necessary calculations assume a much greater accuracy than is justified in actuality.

The second difficulty relates directly to the second basic formula of factor analysis, M = FP. Applied to dichotomous data, this relationship will hold only in the rare instances when the factor loadings are either 1 or 0 and the factor scores of the individuals are either 1 or 0. Burt (1951) indicated that he could reproduce the original measurement matrix of dichotomous variables, but he did not achieve this by the use of the simple procedure suggested by the above formula and did not use the principal axes method of factor analysis. Generally, both factor loadings and factor scores will be fractions; hence, their products must also be fractions, not 1's and 0's as in the original measures.
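The connection between the principal axes form and the first basic formula, R = FF', can be written out in a single line. The following is a sketch under two stated assumptions that the text does not spell out: that U is orthonormal, and that all latent roots in $\Lambda$ are non-negative so that $\Lambda^{1/2}$ is real.

```latex
\begin{aligned}
F   &= U \Lambda^{1/2}, \qquad F' = \Lambda^{1/2} U' \\
FF' &= U \Lambda^{1/2} \Lambda^{1/2} U' = U \Lambda U' = R
\end{aligned}
```

If only the columns of U belonging to the largest roots are retained, FF' reproduces R approximately rather than exactly, and the discrepancy is the residual matrix.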
Another difficulty inherent in the factoring of dichotomous data is the problem of selecting the coefficient to be used to form the original correlation matrix. Three coefficients will be discussed: phi coefficients, tetrachorics, and phi/phi max.

The use of phi coefficients produces two problems. One is that only when two items are both split 50-50 does the phi coefficient have a possible range from plus 1 to minus 1. Under all other marginal conditions, that is, proportional response frequencies, the range is attenuated. When both items have identical response frequencies, the correlation can equal plus 1 but cannot equal minus 1 under conditions other than a 50-50 split. When the response frequencies of two items are exactly reciprocal, the coefficient can equal minus 1 but cannot equal plus 1. No other marginal conditions will permit the coefficient to equal either plus or minus 1.

The other problem in using phi coefficients is that most investigators (e.g., Ferguson, 1941; Dingman, 1958; Wherry and Gaylord, 1944) report that difficulty factors are produced. Difficulty factors are factors which result entirely from the differences in the marginals of the items. While Comrey and Levonian (1958) did not find any difficulty factors in their use of phi coefficients in factoring, they do not deny that such factors could occur in other data. Unfortunately, they do not present sufficient information in their study to suggest why difficulty factors did not result from their analysis.

Ferguson (1941) stated that, when phi coefficients were used, the number of factors could not be less than the number of levels of difficulty or number of different response frequencies, but this is not strictly true.
If there are n groups of people, and within each group all members respond identically to all items, there can still be as many levels of difficulty as there are items, but if n (the number of groups) is less than the number of items, there can be no more than n factors. See Appendix I for an example.

The proof of this principle is relatively simple. Burt (1937) has shown that if there is a matrix W of scores, centered by rows and by columns, with t traits and p persons, two matrices of covariances can be formed. The non-zero roots of the two matrices will be the same. If their dimensions differ, i.e., t does not equal p, the larger will have additional roots, which will be zero. The matrix considered has p people and t traits (items), but the p people fall into n groups, and n is less than p. Consider the ith of these n groups, and suppose that it has s members. These s people will correlate perfectly with each other and have identical correlations with individuals in other groups. When the latent roots of the p matrix are found, these persons cannot separate into different factors. The p matrix, then, cannot have more than n non-zero roots, and the t matrix, therefore, cannot have more than n non-zero roots. Generally, however, there will be more types of response patterns than there are items.

The findings of this study support those who report the emergence of difficulty factors. For example, see Table 1 for an illustration of the factors produced in the analysis of a perfect seven-item Guttman scale.

TABLE 1
ROTATED FACTORS ILLUSTRATING THE APPEARANCE OF A DIFFICULTY FACTOR*

Variable Number     Factor I     Factor II
       1             -.000         -.817
       2             -.178         -.891
       3             -.399         -.797
       4             -.617         -.617
       5             -.797         -.399
       6             -.891         -.178
       7             -.817         -.000

*Quartimax and Varimax results are identical.

The advantage of using tetrachoric coefficients is that they have a possible range from plus 1 to minus 1 under all marginal conditions.
Wherry and Gaylord (1944) also stated that tetrachorics will not produce difficulty factors. Gourlay (1951), however, reported finding difficulty factors in some work done with this coefficient. The explanation of this apparent contradiction is that Gourlay was referring to factors resulting from failure to correct for "chance" correct responses, rather than to factors resulting from different marginals of the items (the usual definition of difficulty factors).

One major objection to tetrachorics is, as Guttman (1950) has shown, that it is possible by the use of tetrachorics for each of two variables to have a correlation of plus 1 with a third variable but be correlated minus 1 with each other. Guttman gave the following example. Suppose that three questions are asked of a population and that the answers show only four categories of people, with frequencies according to the following table:

TABLE 2
EXAMPLE OF FOUR TYPES OF PERSONS RESPONDING TO THREE QUESTIONS

                        Question
Type of person      1      2      3      Frequency
      I            Yes    Yes    Yes        10
      II           Yes    No     Yes        30
      III          Yes    No     No         40
      IV           No     No     Yes        20

There are three tetrachorics to be computed among the items. The fourfold table of frequencies for questions 1 and 2 is as follows:

TABLE 3
FOURFOLD TABLE OF QUESTION 1 WITH QUESTION 2

                      Question 1
                     Yes      No
Question 2    Yes     10       0
              No      70      20

Since the upper righthand cell is zero, the resulting tetrachoric between question 1 and question 2 is plus 1. The fourfold table of frequencies for questions 2 and 3 is as follows:

TABLE 4
FOURFOLD TABLE OF QUESTION 2 WITH QUESTION 3

                      Question 2
                     Yes      No
Question 3    Yes     10      50
              No       0      40

Since the lower lefthand cell is zero, the resulting tetrachoric between questions 2 and 3 is plus 1, also. The fourfold table of frequencies for questions 1 and 3 is as follows:

TABLE 5
FOURFOLD TABLE OF QUESTION 1 WITH QUESTION 3

                      Question 1
                     Yes      No
Question 3    Yes     40      20
              No      40       0

Since the lower righthand cell is zero, the resulting tetrachoric between questions 1 and 3 is minus 1.
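The three fourfold tables can be rebuilt directly from the Table 2 frequencies, which also makes it easy to see what the phi coefficient gives for the same data. The following is a minimal sketch, not the author's computation; the function names `fourfold` and `phi` and the coding of Yes as 1 and No as 0 are assumptions made for illustration.

```python
from math import sqrt

# Response types from Table 2, coded (Q1, Q2, Q3) with 1 = Yes, 0 = No.
TYPES = {
    (1, 1, 1): 10,  # type I
    (1, 0, 1): 30,  # type II
    (1, 0, 0): 40,  # type III
    (0, 0, 1): 20,  # type IV
}


def fourfold(i, j):
    """Cell counts for questions i and j (0-based indices).

    Returns (both yes, i yes / j no, i no / j yes, both no).
    """
    a = b = c = d = 0
    for pattern, freq in TYPES.items():
        x, y = pattern[i], pattern[j]
        if x and y:
            a += freq
        elif x:
            b += freq
        elif y:
            c += freq
        else:
            d += freq
    return a, b, c, d


def phi(a, b, c, d):
    """Phi coefficient of a fourfold table."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


for i, j in [(0, 1), (1, 2), (0, 2)]:
    cells = fourfold(i, j)
    print(f"Q{i + 1} x Q{j + 1}: cells={cells}, phi={phi(*cells):.3f}")
```

In each table one cell is empty, which is what drives the tetrachoric to plus or minus 1. The phi coefficients for the same tables stay well inside that range: two small positive values and one moderate negative value.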
The last of these tetrachorics, that between questions 1 and 3, completely contradicts the results of the first two coefficients.

In addition to the above objection to using tetrachorics for factoring, Comrey and Levonian (1958) have shown that anomalous factor analytic results may be produced when they are used. One such anomaly is that non-Gramian matrices (matrices with negative latent roots when unities are inserted in the leading diagonal of the correlation matrix) sometimes result. Another is the fact that the size of the residual matrix is sometimes not much reduced despite the extraction of additional factors.

Like the tetrachoric, the phi/phi max coefficient also has the advantage of a possible range from plus 1 to minus 1 under all marginal conditions. However, this coefficient is only a descriptive statistic, i.e., its sampling distribution and variance are unknown. In addition, the phi/phi max coefficients would be identical to the tetrachoric coefficients in the above example from Guttman. Moreover, Comrey and Levonian (1958) found the same anomalies when using this coefficient as with tetrachorics.

Since tetrachorics and phi/phi max sometimes yield contradictory correlations, and since anomalies may result in factor analysis using them, the empirical studies here reported used the phi coefficient. Despite its inherent difficulties, this index does not permit such contradictions to arise.

The Model of Guttman's Scalogram Analysis

Guttman (1950) stated that the concept of "universe of content" was basic to the theory of scales. This concept has been a major point of criticism of scalogram analysis (Festinger, 1947; Loevinger, 1948). Generally, the phrase "universe of content" refers to the set of all statements which may be made which concern a given attribute. Among possible universes are all statements which might comprise people's attitude towards Negroes, their attitude toward war, their morale, their knowledge of arithmetic, etc.
The most important part of the concept is that the "universe of content" should have reference to a single variable. The major difficulty with the concept seems to center around this word "single," since the only criterion as to whether a set of statements refers to only a "single" variable is the decision of the investigator.

Festinger (1947) criticizes the concept of "universe of content" on the grounds of subjectivity. Statements of Guttman (1950) regarding the scalability of universes seem to support Festinger's argument, although Guttman (1947) disputed Festinger's contention. Guttman (1950) made the following statements regarding the scalability of universes:

"A universe may form a scale for a population at a given time and may not at a later time."

"Conversely, a universe may not be scalable at one time, but scalable at another."

"A universe may form a scale for one population, but not for another."

"A universe may not form a scale for the total population, but still form a scale for subgroups of that population."

Since Guttman argued that a universe does not necessarily scale, and that scalability is not a "proof" of a universe, the question of whether a group of statements constitutes a universe is left entirely to the investigator. This is supported by Guttman's statement that "the essential definition of a scale is that of 'single-meaning'," and "single-meaning" can only be defined by the investigator.

Guttman (1954) spent an entire chapter discussing "The Principal Components of Scalable Attitudes." This is particularly interesting, since he had criticized Burt's (1953) proposal that attitudes could be studied as well by the use of factor analysis as by the use of scale analysis. It is also interesting in view of Guttman's (1950) criticism of the use of factor analysis to study "qualitative" data.
However, in the above named chapter Guttman discussed the first four principal components of scales, defining them, respectively, as the underlying least-squares metric, intensity, closure, and involution (involvement). He presented these components in the form of graphic curves.

In the present study perfect scales were factored by the principal components method, and curves similar to those Guttman indicated were also found. (See Figure 1 for an example of such curves plotted in order of size of the factors, from unrotated loadings.)

[FIGURE 1. CURVES OF LOADINGS OF ITEMS ON FACTORS. One panel per factor, each plotting the loading against item number; the panel labels include "intensity," "closure," and "involution."]

The curves are artifacts of the system, since the original data were simply created.
As Torgerson (1958) points out, since such curves are artifacts, the statement that one of them represents "intensity" (see factor 3 in Figure 1) simply because it is similar to curves found in studies of intensity is meaningless. Meanings ascribed to the other curves may be similarly questioned.

Guttman (1950) said that any set of items from a "universe of content" which can be ordered in degree of favorableness from "most favorable" to "least favorable" comprises a scale if subjects respond favorably to the items up to a certain point (depending upon the subject) and unfavorably to all subsequent items. The number of items to which the subject agrees (responds favorably) is then his scale score, and his responses to the entire set of items can then be perfectly reproduced from the scale score alone. For an example of a Guttman scale see Figure 2.

FIGURE 2
A PERFECT GUTTMAN SCALE

Subject              Item Number
 type       1    2    3    4    5    6
   1        1    1    1    1    1    1
   2        1    1    1    1    1    0
   3        1    1    1    1    0    0
   4        1    1    1    0    0    0
   5        1    1    0    0    0    0
   6        1    0    0    0    0    0
   7        0    0    0    0    0    0

Guttman recognized that all subjects would not follow this model exactly; some subjects would not show this complete consistency of response. To measure the extent to which a group of items matched the model, Guttman introduced the coefficient of reproducibility:

$$\text{Coefficient of Reproducibility} = 1 - \frac{\text{number of errors}}{(\text{number of items})(\text{number of Ss})}$$

An "error" is defined as an unfavorable response to a scale item which is followed by a favorable response to an item after the items have been ordered in degree of favorableness. The coefficient of reproducibility measures the extent to which all responses by the subjects can be reproduced with no knowledge other than the scale score of each subject and the order of the items. Groups of items with reproducibilities greater than .90 are accepted by Guttman as scalable but called "quasi-scales," since they are not perfect scales, their reproducibilities being less than 1.
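The coefficient of reproducibility can be computed in a few lines. The sketch below is illustrative only: the function name `reproducibility` is hypothetical, and errors are counted as deviations from the ideal response pattern implied by each subject's scale score. That is one common counting convention and is an assumption here; it is not necessarily the exact rule Guttman applied.

```python
# The seven subject types of Figure 2, one row per type, six items each.
FIGURE_2 = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def reproducibility(responses):
    """Coefficient of reproducibility for rows of 0/1 item responses."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    # Order the items from most to least often endorsed.
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])
    errors = 0
    for row in responses:
        ordered = [row[j] for j in order]
        score = sum(ordered)  # the subject's scale score
        ideal = [1] * score + [0] * (n_items - score)
        # Assumed rule: every mismatch with the ideal pattern is one error.
        errors += sum(o != i for o, i in zip(ordered, ideal))
    return 1 - errors / (n_items * n_subjects)


print(reproducibility(FIGURE_2))
print(reproducibility(FIGURE_2 + [[1, 0, 1, 0, 0, 0]]))
```

The perfect scale of Figure 2 yields a coefficient of 1. Adding a single subject who endorses item 3 but not item 2 introduces two deviations from that subject's ideal pattern and lowers the coefficient slightly.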
The Multiple Scalogram Analysis Model

Lingoes (1960) introduced an extension of Guttman's scalogram analysis, viz., multiple scalogram analysis. He stated that this method differs from Guttman's technique in several important respects. First, the concept of a "universe of content" is not necessary to the technique; the technique relies entirely on the interrelationships of the data. Secondly, the technique yields multiple scales when the data require, rather than rejecting scalability when all items in a set do not fit into a single scale. Thirdly, Lingoes uses Goodenough's (1944) technique of counting errors, which is more conservative and tends to lower the coefficient of reproducibility. The Guttman coefficient of reproducibility can be estimated from this coefficient.

Multiple scalogram analysis is a procedure for finding scales in a matrix of dichotomous data. It is generally carried out on a computer using the following procedures. Positive responses are indicated by a "1" and negative responses by a "0". All responses for a given item are input together, subjects being kept in some arbitrary fixed order which must be the same for all items. The investigator selects the number of errors that he will accept between any pair of items in a given scale, thus limiting the minimum reproducibility of any scale. In a Guttman scale, the items are ordered from the one with the highest number of positive responses to the one with the smallest number; if a subject then responds negatively to an earlier item but responds positively to a subsequent item, an error is said to have occurred. If the subject responds positively or negatively to both items, or responds positively to one item and negatively to a subsequent one, his response pattern is considered to be errorless for this pair of items. The computer first reflects (i.e., changes all 0's to 1's and all 1's to 0's) all items which have marginals of less than .5, this reflection being indicated in the output.
The item which has the highest marginal value is now selected. The computer then searches for the item which has the highest agreement with the initially selected item. The number of subjects with 1's for both items and 0's for both items is counted as the agreement between the two items. If this item does not produce more errors than the criterion selected by the investigator, it is added to the scale. The computer searches next for the item which agrees most with the last selected item, and the error criterion is again applied. This process is continued until no more items can be added to the scale without producing too much scale error. If the process does not yield at least three items, the set is discarded as not scaling. When either of these contingencies occurs, the computer starts the sequence of operations over again, selecting as a new starting point the item with the highest marginal which has not yet appeared in any previous scale. No item that has once been brought into a scale is allowed to enter any subsequent scale.

When all possible scales have been formed under these restrictions, the following information is printed out: 1) the ordered series of items in each scale with the larger marginal response proportion for each item, reflections being indicated by minus signs in front of pertinent item numbers; 2) the ordered responses of each subject to the items in each scale; 3) each subject's scale score and error score on each scale; and 4) the reproducibility for each scale.

Lingoes' method, accordingly, analyzes a set of data in such a way that if they form a single Guttman scale this will be apparent in the results. However, if each of several groups of items within such a set forms a Guttman scale pattern, these also will become apparent in the results. The latter property of Lingoes' method is its major departure from Guttman's technique.
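The grouping procedure described above can be sketched as a short program. This is an illustrative reading of the prose, not Lingoes' program: all function names are hypothetical, and two details the description leaves open are filled in by assumption. Candidates are tried in order of decreasing agreement with the last selected item until one meets the error criterion, and when a set is discarded only its starting item is barred from starting again.

```python
def reflect(items):
    """Reflect (swap 0's and 1's in) items with marginals below .5."""
    cols, flipped = [], []
    for col in items:
        flip = sum(col) / len(col) < 0.5
        cols.append([1 - v for v in col] if flip else list(col))
        flipped.append(flip)
    return cols, flipped


def agreement(x, y):
    """Number of subjects answering 1,1 or 0,0 on the two items."""
    return sum(a == b for a, b in zip(x, y))


def pair_errors(x, y):
    """Number of 0,1 patterns, with the more popular item taken first."""
    first, second = (x, y) if sum(x) >= sum(y) else (y, x)
    return sum(a == 0 and b == 1 for a, b in zip(first, second))


def multiple_scalogram(items, max_errors, min_items=3):
    """items: one list of 0/1 responses per item, subjects in a fixed order."""
    cols, flipped = reflect(items)
    free = set(range(len(cols)))   # items not yet placed in a scale
    seeds = set(free)              # items not yet used as a starting point
    scales = []
    while seeds & free:
        # Start from the unused item with the highest marginal.
        seed = max(seeds & free, key=lambda j: (sum(cols[j]), -j))
        seeds.discard(seed)
        scale = [seed]
        while True:
            last = scale[-1]
            ok = [j for j in free if j not in scale
                  and pair_errors(cols[last], cols[j]) <= max_errors]
            if not ok:
                break
            # Assumed rule: best-agreeing item among those within the criterion.
            scale.append(max(ok, key=lambda j: (agreement(cols[last], cols[j]), -j)))
        if len(scale) >= min_items:
            scales.append(scale)
            free -= set(scale)
    return scales, flipped


# Two interleaved three-item scales over eight subjects (made-up data).
DATA = [
    [1, 1, 1, 1, 1, 1, 1, 0],  # item 0
    [0, 1, 1, 1, 1, 1, 1, 1],  # item 1
    [1, 1, 1, 1, 1, 1, 0, 0],  # item 2
    [0, 0, 1, 1, 1, 1, 1, 1],  # item 3
    [1, 1, 1, 1, 1, 0, 0, 0],  # item 4
    [0, 0, 0, 1, 1, 1, 1, 1],  # item 5
]
print(multiple_scalogram(DATA, max_errors=0))
```

With a criterion of zero errors the made-up data separate into two scales, items 0, 2, 4 and items 1, 3, 5, and no item is reflected because every marginal is above .5.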
The interpretation of the results is, naturally, left to the investigator, for he must decide whether or not these are actually Guttman scales.

RELATIONSHIPS AMONG THE MODELS

The major advantage of factor analysis over multiple scalogram analysis is that it permits multiple classification of the variables. Also, the scores of persons on factors are orthogonal, or uncorrelated. Moreover, factor analysis deals more effectively with badly split items, i.e., items on which the marginal response frequencies approach unanimity. In multiple scalogram analysis, such items contain very few zeroes (if most of the responses were negative, the item would be reflected), and therefore very few errors are possible. As a result, such items tend to scale with almost any other item. Factor analysis usually depends on correlations. A badly split item cannot correlate highly with less badly split items when phi coefficients are used (when tetrachorics or phi/phi max coefficients are used, the same problem arises as in multiple scalogram analysis); therefore, factor analysis will largely ignore the item because of the low correlations.

Factor analysis has still another property, which is sometimes a weakness and sometimes a strength: under almost all conditions factoring of data will produce results, while scaling will sometimes produce essentially nothing even when variables are related. Thus, factor analysis has an advantage when there is a relation among the variables; contrariwise, multiple scalogram analysis has the advantage if there are no large relations among the variables.

Multiple scalogram analysis, on the other hand, has certain advantages over factor analysis. First, the technique will work only when the data meet the model to a predetermined extent, viz., the error rate allowed by the investigator. In addition, the results of multiple scalogram analysis are much easier to study.
It is, for instance, much easier to study the relation among the individuals on the groups of items in multiple scalogram analysis. To get comparable figures from a factor analysis (factor scores on the individuals) one must go through a series of complicated mathematical manipulations. A third advantage is that the original data can be reproduced from the results of a multiple scalogram analysis. As was previously shown, this cannot generally be done from the results of a factor analysis of dichotomous data.

Up to now the discussion has concerned the differences between the two methods. They have, however, one very great point of similarity. Consider Figure 3. Using this array of frequencies, the formula for the phi coefficient is as follows:

$$\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$$

For any two items with given marginals the correlation will have its maximum positive value when the B and C cells are minimum. Also, when the two items differ in difficulty so that B and C cannot both become zero, the maximum positive correlation would be achieved when the C cell is zero. Factor analysis groups items which have the largest intercorrelations, and it will therefore tend to group items which have the smallest B and C cell frequencies. Multiple scalogram analysis groups together the items which have the largest agreement scores, agreement here being defined as pairs of responses that fall in either the A cell or the D cell, i.e., either 1,1 or 0,0. The other restriction that multiple scalogram analysis places on two variables before grouping them is that the C cell must be small because, as has been pointed out, pairs of responses appearing in this cell are called errors, being of the type 0,1. Recalling the conditions necessary for a maximum positive phi coefficient, it is clear that the conditions demanded by multiple scalogram analysis are exactly those which produce the maximum phi.
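The relation between the two coefficients can be illustrated with a small computation on the fourfold table. The sketch below is an illustration only; the function names and the three response vectors are hypothetical. It compares two second items with the same marginal, one leaving the C cell empty and one placing a single response in it.

```python
import math


def fourfold(item1, item2):
    """Cell counts (A, B, C, D) of the fourfold table for two 0/1 response lists.

    A = 1,1   B = 1,0   C = 0,1   D = 0,0   (item 1 response, item 2 response)
    """
    a = sum(1 for x, y in zip(item1, item2) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(item1, item2) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(item1, item2) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(item1, item2) if x == 0 and y == 0)
    return a, b, c, d


def phi(a, b, c, d):
    """Phi coefficient computed from the four cell counts."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


def agreement_score(a, b, c, d):
    """Agreement score: responses falling in the A cell or the D cell."""
    return a + d


# Hypothetical responses of eight subjects.
easy_item = [1, 1, 1, 1, 1, 1, 0, 0]
hard_item_no_error = [1, 1, 1, 1, 0, 0, 0, 0]   # C cell is empty
hard_item_one_error = [1, 1, 1, 0, 0, 0, 1, 0]  # same marginal, one 0,1 response

for label, item2 in [("no error", hard_item_no_error),
                     ("one error", hard_item_one_error)]:
    cells = fourfold(easy_item, item2)
    print(label, cells, agreement_score(*cells), round(phi(*cells), 3))
```

With the marginals held fixed, moving a single response into the C cell lowers both the agreement score and phi, which is the parallel drawn in the text.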
Since factor analysis is based on maximum correlations, and multiple scalogram analysis on maximum agreements, one would expect, in general, quite similar results from the two methods. However, certain other differences between the two techniques lead to differences in their results.

FIGURE 3
CELL FREQUENCY TABLE FOR COMPUTATION OF PHI COEFFICIENTS OR AGREEMENT SCORES

                     Item 2
                   1       0
Item 1     1       A       B       A + B
           0       C       D       C + D

Factoring Single Scale Data

The single scale case is identical for multiple scalogram analysis and for scalogram analysis. When a perfect Guttman scale is factored, using tetrachorics or phi/phi max coefficients, the matrix of correlations will consist entirely of 1.0's; only one factor is obtained, and all variables will be equally loaded on that factor.

When phi coefficients are used, the results depend on whether or not the distribution of scale scores is rectangular. If the distribution of scale scores is rectangular, there will be as many factors as there are items. The latent roots indicate the relative size of the factors. Analyzing the total variance, i.e., using 1's in the diagonal, the latent roots will be as follows: when there are n items, the first latent root will be (n+1)/2; the second, 1/3 of the first; the third, 1/6 of the first; the fourth, 1/10 of the first; etc. The general formula is as follows:

$$k\text{th latent root} = \frac{(n+1)/2}{k(k+1)/2} = \frac{n+1}{k(k+1)},$$

where n = number of items and k = number of the root when the roots are arranged in order of size.

This relationship was arrived at inductively, using a computer. The sum of the latent roots should equal n. It can be shown that the sum of this series is always n, which strengthens this inductive argument. The proof is as follows. Assume

$$n = \frac{1}{1}\cdot\frac{n+1}{2} + \frac{1}{3}\cdot\frac{n+1}{2} + \cdots + \frac{1}{n(n+1)/2}\cdot\frac{n+1}{2} = \frac{n+1}{2}\left[\frac{1}{1} + \frac{1}{3} + \cdots + \frac{1}{n(n+1)/2}\right]$$

Then

$$\frac{2n}{n+1} = \frac{1}{1} + \frac{1}{3} + \cdots + \frac{1}{n(n+1)/2} \qquad (1)$$

When the series is one item longer, that is, when n is n + 1, the left-hand part of equation 1 becomes 2(n+1)/(n+2) and the right-hand part becomes

$$\frac{1}{1} + \frac{1}{3} + \cdots + \frac{1}{n(n+1)/2} + \frac{1}{(n+1)(n+2)/2},$$

an increase of

$$\frac{1}{(n+1)(n+2)/2} = \frac{2}{(n+1)(n+2)}.$$

The question then is:

$$\frac{2n}{n+1} + \frac{2}{(n+1)(n+2)} = \frac{2(n+1)}{n+2} \qquad (2)$$

$$\frac{2n}{n+1} + \frac{2}{(n+1)(n+2)} = \frac{2n(n+2) + 2}{(n+1)(n+2)} = \frac{2n^2 + 4n + 2}{(n+1)(n+2)} = \frac{2(n^2 + 2n + 1)}{(n+1)(n+2)} = \frac{2(n+1)(n+1)}{(n+1)(n+2)} = \frac{2(n+1)}{n+2},$$

which is what was to be proved (see equation 2). The only remaining thing that is necessary is to show that it holds for one case of n, for it will then hold for all larger n's, since n was not fixed. Assume n = 2 (i.e., there are only 2 items); then, according to the above statements, the following should hold true:

$$\frac{2+1}{2} + \frac{1}{3}\cdot\frac{2+1}{2} = \frac{3}{2} + \frac{1}{2} = 2$$

Since the series holds for 2, it holds for all series containing more than 2 items. A typical case is illustrated in Table 6.

Another interesting feature of factoring a single perfect Guttman scale when the distribution of the scores is rectangular is that quartimax and varimax rotations yield the same results.

If the distribution is not rectangular, n factors are still obtained, but the closer the distribution of the scale scores approaches rectangularity, the more the latent roots will resemble the roots as indicated.

TABLE 6
THE LATENT ROOTS OF A PERFECT SIX-ITEM GUTTMAN SCALE WITH RECTANGULAR DISTRIBUTION OF SCALE SCORES

Latent root     Predicted value         Value of latent root
number          of latent root          obtained from computer

   1            (1/1)  (6 + 1)/2             3.50000
   2            (1/3)  (6 + 1)/2             1.16667
   3            (1/6)  (6 + 1)/2             0.58333
   4            (1/10) (6 + 1)/2             0.35000
   5            (1/15) (6 + 1)/2             0.23333
   6            (1/21) (6 + 1)/2             0.16667
                                             -------
                                             6.00000

Factoring Multiple Scale Data

Here the concern is with multiple scalogram analysis, since scalogram analysis does not admit this possibility.
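Before the multiple scale case is taken up, the latent-root formula given for the single scale case, kth latent root = (n+1)/(k(k+1)), can be checked against the six-item figures of Table 6 with exact arithmetic. The sketch below is an illustration only; the function name is hypothetical, and the code evaluates the stated formula rather than factoring any data.

```python
from fractions import Fraction


def predicted_latent_roots(n):
    """Predicted kth latent root, (n + 1) / (k (k + 1)), for k = 1..n, as exact fractions."""
    return [Fraction(n + 1, k * (k + 1)) for k in range(1, n + 1)]


roots = predicted_latent_roots(6)
for k, root in enumerate(roots, start=1):
    # Five decimals, for comparison with the computer-obtained values in Table 6.
    print(k, root, f"{float(root):.5f}")
print("sum of roots:", sum(roots))
```

Because the fractions are exact, the sum of the roots can be compared with n directly, with no rounding involved.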
The problem is what happens when more than one Guttman scale is superimposed on the same set of people. If everyone who gets a given score on one scale also gets given scores on all subsequent scales (although there may be no correlation between the sets of scale scores), there will not be more than j + 1 latent roots (and therefore factors), where j is the length of the longest scale, because there will not be more than j + 1 types of people (see Chapter 1 for a discussion of this principle). Using phi coefficients, then, there are two restrictions on the rank of the matrix (number of factors): (1) the number of levels of difficulty or (2) the number of linearly independent items or persons, whichever is less.

Factoring more than one Guttman scale superimposed on the same group of people, using tetrachorics or phi/phi max coefficients, results in unusual matrices. Within each scale all interitem correlations are equal to 1. However, correlations between items on different scales will not be consistent with these correlations: two items on one scale correlate perfectly but have differing correlations with an item on another scale (see Table 7) unless the scales are perfectly correlated. These matrices will be non-Gramian. The factors that are produced will split each scale into more than one factor.

Another finding of this study was that unrotated factor analysis solutions can be misleading. Two completely orthogonal scales, the correlation between the scale scores being zero and all correlations between items on different scales being zero, were formed on the same people. The matrix of phi coefficients was factored. Items from both scales loaded equally on all factors, and the sizes of the loadings were similar (see Appendix 2). However, after rotating the factors, it was found that the two scales separated. Only items from one of the scales were loaded on any factor; the loadings for all items on the other scale were zero.
TABLE 7
TWO UNCORRELATED SCALES WITH PERFECTLY CORRELATED ITEMS

Subject                       Item number
number     1   2   3   4   5   6   7   8   9  10  11  12  13  14

   1       1   1   1   1   1   1   1   1   1   1   1   1   1   1
   2       1   1   1   1   1   1   0   0   0   0   0   0   0   0
   3       1   1   1   1   1   0   0   1   1   1   1   1   0   0
   4       1   1   1   1   0   0   0   1   0   0   0   0   0   0
   5       1   1   1   0   0   0   0   1   1   1   1   0   0   0
   6       1   1   0   0   0   0   0   1   1   0   0   0   0   0
   7       1   0   0   0   0   0   0   1   1   1   0   0   0   0
   8       0   0   0   0   0   0   0   1   1   1   1   1   1   0

Scaling Unifactorial Data

When data actually fit the unifactorial model, will the items scale? In the unifactorial model of factor analysis each group of variables having loadings on a factor has no loadings on any other factor. As Guttman (1950) pointed out, when dichotomous data are factored, the results are generally not unifactorial. To be unifactorial all the off-diagonal tetrads must disappear. This is rare in any type of data, and with dichotomous data seemingly even more rare. Since Ferguson (1941) has shown that there will be as many factors as there are levels of difficulty, except when there are fewer "types" of people than levels of difficulty, the only condition under which data could possibly be unifactorial is when all items on each factor are of the same level of difficulty, or when all items on each factor produce only one "type" of person. Only in the trivial case, then, where all items for a given factor have identical responses could the items scale without error. As will be recalled, a perfect scale with a large sample of items will always produce as many factors as items, which strengthens this conclusion.

When the data form a matrix containing several factors which are unifactorial, scales will not form which take items from several of these factors. For items to appear on separate factors in such a case the correlation between two items must be low. This means that the 0,1 (C) cell must be large, producing too much error for the items to scale together.

Scaling Multifactorial Data

What will happen when data which actually fit the multifactorial model are scaled?
Multifactorial data are those in which each variable measures parts of several factors so that it should have loadings on more than one factor. When dichotomous data are factored, a multifactorial solution generally results.

Two general conditions must be considered separately: (1) when all items have approximately the same difficulty level and (2) when the difficulty levels of the items loaded on one factor differ widely.

If items falling on a single factor have approximately equal levels of difficulty and are truly multifactorial, the correlations among most of the items have to be less than 1.0 or the items would not be multifactorial. Since the items are equal in difficulty and the correlations (phi coefficients) are less than 1.0, the 0,1 cell must have been large enough to produce considerable error in scaling; therefore, the items will not all scale together.

When the difficulty levels of the items falling on a single factor differ widely and the items are actually multifactorial, if the phi coefficients approach phi max (when the 0,1 cell is small), some of the items will scale together. In general, if the data are multifactorial, not all of the items will scale together, because the variables should not all be maximally related (phi should not approach phi max in all cases); therefore, too much error will be produced for all the items to scale together.

It can further be stated that if multifactorial items do form a scale it will be a distortion of the conditions. Items are not allowed to appear on more than one scale, but this condition requires that items should appear in more than one grouping.

In the multifactorial case the resulting scales generally include items from more than one factor. As was pointed out previously, scales when factored form more than one factor.
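The equal-difficulty argument in this section can be made explicit from the phi formula and the cell notation of Figure 3. The short derivation below is a sketch added for illustration; the symbols N (number of subjects), p (the common proportion of positive responses), and q = 1 - p are introduced here and are not part of the original notation.

```latex
% Equal marginals: A + B = A + C = Np, hence B = C.
A = Np - C, \qquad D = Nq - C, \qquad B = C
% Numerator of phi:
AD - BC = (Np - C)(Nq - C) - C^2 = N^2 pq - NC
% Denominator of phi: every marginal product is (Np)(Nq), so the square root is N^2 pq.
\phi = \frac{N^2 pq - NC}{N^2 pq} = 1 - \frac{C}{Npq}
% Solving for the error cell:
C = Npq\,(1 - \phi)
```

For instance, with N = 100 and p = .5, a phi of .8 implies 5 responses in the error cell, while a phi of .5 implies 12.5, which would exceed a ten percent error criterion. For items of equal difficulty, then, the number of scale errors is fixed by the size of phi.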
ANALYSES OF EMPIRICAL DATA

For a further investigation of the relationships between factor analysis and multiple scalogram analysis, three sets of empirical data were analyzed by each technique and the results compared. Since these three sets (the Guttman data, the Knudsen data, and the Trier data) were collected at different times on different subjects for different reasons, it was assumed that they would represent sufficiently varied conditions to produce further insights into these relationships. Only one of the sets of data, the Guttman data, was purposely collected to show that it fit the scalogram model. The other two sets of data were collected for other purposes and without concern for which technique would better analyze them.

One problem in the factoring was the number of factors to rotate in each solution. It was decided to follow the Kiel-Wrigley (1960) criterion. By this criterion the maximum number of factors should be rotated where each factor had at least two variables with their highest loading on that factor. This criterion was applied in the Guttman data and the Knudsen data.

The Guttman Data

In the past Guttman has presented many examples of sets of items he considered scalable. To throw more light on the relationships among the techniques, it was decided to apply factor analysis to some of these data. Two such scales were selected arbitrarily, primarily because both were considered scalable by Guttman, and because they were alternate forms of a questionnaire. The scales (Guttman, 1950; p. 284, 285) were administered to a random sample of army personnel and concerned "Attitude Toward Officers." Guttman gave the reproducibility of Form A as .89 and of Form B as .921. Each form was seven items long. The scales were factored separately. The results for Form A are presented in Table 8, and those for Form B in Table 9.

TABLE 8
FACTOR ANALYSIS OF FORM A OF GUTTMAN'S "ATTITUDE TOWARD OFFICERS" SCALE

Item   Response     Factor
no.    proportion   loading   Item

 1       .29         .776     Do your officers give you a good chance to ask questions as to the reason why things are done the way they are?
 2       .39         .806     In general, how good would you say your officers are?
 3       .38         .819     How many of your officers use their rank in ways that seem unnecessary to you?
 4       .43         .421     How many of the officers in your company (battery, squadron, troop) are the kind you would want to serve under in combat?
 5       .43         .798     How much do you personally like your officers?
 6       .52         .676     How many of your company officers are the kind who are willing to go through anything they ask their men to go through?
 7       .56         .583     How do you feel about the officers that have been picked for your outfit?

TABLE 9
FACTOR ANALYSIS OF FORM B OF GUTTMAN'S "ATTITUDE TOWARD OFFICERS" SCALE

Item   Response     Quartimax
no.    proportion   loadings   Item

 1       .31         .667      How many of your officers take a personal interest in their men?
 2       .38         .644      Do you think that your officers generally do what they can to help you?
 3       .41         .827      How well do you feel that your officers understand your problems and needs?
 4       .51         .815      Do you feel that your officers recognize your abilities and what you are able to do?
 5       .51         .799      When you do a particularly good job do you usually get recognition or praise for it from your officers?
 6       .61         .700      When your officers give you something to do, do they tell you enough about it so that you can do a good job?
 7       .74         .586      Can you count on your officers to back you up in your duties?

Using the Kiel-Wrigley (1960) criterion, the quartimax rotation for each form yielded one factor. This indicated a close agreement between scalogram analysis and factor analysis; by both methods the items belonged to the same group. However, scalogram analysis seemed preferable since the results made comparisons among the subjects readily achievable.

Several comments are necessary here to relate these results to material presented earlier.
It should be noted that the first scale (Form A) has less variation in item difficulty (.29 to .56) than Form B, the second scale (.31 to .74). It should also be noted that the scales are quite short, only seven items in length. It should be stated further that these are not unifactorial solutions (where all the off-diagonal tetrads disappear); seven factors (seven latent roots larger than zero) were found in both cases, but the Kiel-Wrigley criterion indicated that only one of these was of interest. Finally, it should be observed that the data do not conform closely to the Guttman model; considerable error appears in each form. The reproducibility of Form A does not meet Guttman's criterion of scalability, which is .90. The correlation matrices for the two forms (Appendix III) do not follow the pattern for a correlation matrix among the ordered items of a Guttman scale. Generally, when a correlation matrix is formed on the ordered items of a Guttman scale, an item will correlate most closely with the items nearest it; the correlation will decrease with the item's distance from other items on the scale, whether the other item precedes or follows it.

In summary, the two types of analysis, applied to scales, produced similar results, although the results of scalogram analysis seemed easier to use. As shown earlier, the phi coefficients on which the factor analyses were based and the agreement scores on which the Guttman scalogram analyses were based were related; this, supplemented by the facts pointed out in the preceding paragraph, explained to a large degree the similarity of the results despite the differences in the two models.

The Knudsen Data

Knudsen (1962) analyzed the votes of nations in the Twelfth Plenary Session of the United Nations General Assembly, using both factor analysis and multiple scalogram analysis. The results of the quartimax rotation of the factor analysis are presented in Table 10. The multiple scalogram analysis results are presented in Table 11.
Seventy-four nations were included in both analyses. A ten percent error rate was used in the multiple scalogram analysis so that a maximum of seven errors was allowed between any pair of items in a scale.

TABLE 10
A FACTOR ANALYSIS OF THE VOTES IN THE TWELFTH UNITED NATIONS GENERAL ASSEMBLY MEETING

Variable   Factor
number     loading   Issue

Factor A. General Factor
    7       .911     Togoland under French administration
    6       .883     Committee on Southwest Africa
   27       .871     To adjourn discussion of the threat to Syria for three days
    8       .866     Transmission of information about non-selfgoverning territories
   32       .852     Future of Togoland under French administration
   31       .848     Study of defining aggression
   29       .843     Problem of West New Guinea
    9       .817     Transmission of information about non-selfgoverning territories
   33       .809     To dissolve the Disarmament Commission
   28       .797     To adjourn discussion of the threat to Syria for three days
   24       .768     Geographical distribution of staff personnel
   18       .763     Unify Korea peacefully
   10       .757     Placing West New Guinea on the agenda
   19       .737     Ask the Disarmament Commission to reach agreement on arms reduction
   11       .735     Composition of the General Committee of the General Assembly
   30       .709     Committee to work on ending nuclear tests
   34       .674     Self-determination for Cyprus
    5       .651     Cameroons under British and French administration
   26       .618     Continuation of the U.N. Emergency Force
   25       .543     Expansion of international trade
   16       .516     Composition of the General Committee of the General Assembly

Factor B. Cold War
   20       .817     Not to consider any proposal to include Red China and/or exclude Nationalist China
   21       .817     Not to discuss the question of Red China
   22       .796     Not to discuss the question of Red China
   23       .785     Not to discuss the question of Red China
   17       .477     Enlarge the Disarmament Commission

Factor C. Racial Problem
   14       .812     Place the issue of treatment of people of Indian descent in South Africa on the agenda
   12       .793     Problem of the treatment of people of Indian descent in South Africa
   13       .785     Negotiations of India, Pakistan, and South Africa on the treatment of people of Indian descent in South Africa
   15       .750     Plea to South Africa to heed this issue
    2       .306     Economic Commission for Africa

Factor D. Procedural Matters
    4       .632     Dissemination of information on modern arms having to do with reduction of arms
    3       .484     Establishing three new posts in geographical distribution of the staff
    1       .256     Peaceful coexistence of states

TABLE 11
A SCALOGRAM ANALYSIS OF THE VOTES IN THE TWELFTH UNITED NATIONS GENERAL ASSEMBLY MEETING

Variable
number     Issue

Scale 1. Procedural Matters
    1      Peaceful coexistence of states
    2      Economic Commission for Africa
    3      Establishing three new posts in the geographical distribution of the staff
    4      Dissemination of information on modern arms, having to do with the reduction of arms
    5      Cameroons under British and French administration
    6      Committee on Southwest Africa
    7      Togoland under French administration
    8      Transmission of information about non-selfgoverning territories
    9      Transmission of information about non-selfgoverning territories
   10      Placing of West New Guinea on the agenda
   11      Composition of the General Committee of the General Assembly
   12      Problem of the treatment of people of Indian descent in South Africa
   13      Negotiations of India, Pakistan, and South Africa on the problem of the treatment of people of Indian descent in South Africa
   14      Place the problem of the treatment of people of Indian descent in South Africa on the agenda
   15      Plea to South Africa to heed this issue
   16      Composition of the General Committee of the General Assembly
           R = .953

Scale 2. Cold War
   17      Enlarge the Disarmament Commission
   18      Unify Korea peacefully
   19      Ask the Disarmament Commission to reach agreement on arms reduction
   20      Not to consider any proposal to include Red China and/or exclude Nationalist China
   21      Not to discuss the question of Red China
   22      Red China
   23      Red China
   24      Geographical distribution of personnel
           R = .953

Scale 3. International Relations
   25      Expansion of international trade
   26      Continuation of the U.N. Emergency Force
   27      To adjourn discussion of the threat to Syria for three days
   28      To adjourn discussion of the threat to Syria for three days
   29      Problem of West New Guinea
           R = .881

Scale 4. Cold War
   30      Committee to work on ending nuclear tests
   31      Study of defining aggression
   32      Future of Togoland under French administration
           R = .973

Not Appearing
   33      To dissolve the Disarmament Commission
   34      Self-determination for Cyprus

Both analyses used four groupings to account for the variables. Knudsen favored the results of the quartimax rotation over the multiple scalogram analysis because in her judgment it grouped the issues more meaningfully. The variables that were closely associated in the factor analysis results, however, tended to be closely associated in the multiple scalogram analysis results also. The multiple scalogram results also provided the information for looking at the groupings of the nations on the various issues. Generally, the groupings of the countries on the issues were what might have been expected from world political alignments in the past.

The following tabular material, giving the results of the Knudsen study, will repay examination. (See the correlation matrix in Appendix IV.)
Table 12 lists for each variable the three (or four in case of ties) other variables with which it showed the highest correlation, as well as how many of the highest correlates were placed in the same group as the given variable by the two methods of analysis, Lingoes and quartimax. Table 12 thus shows that Lingoes was superior to quartimax on seven variables; quartimax was superior to Lingoes on 15 variables. As can be seen in Table 13, the two techniques were about equally effective in placing a variable in the same group with the variable with which it had the highest correlation. However, in grouping the variable with its second highest correlate quartimax is clearly superior, and even more so in grouping it with its third highest correlate. The comparisons support Knudsen's conclusion; the Lingoes technique tended to select the highest correlate consistently, but to ignore secondary correlations almost as large as the primary ones. One example should indicate the general pattern: Variable 6 correlated most highly with variable 7, r = .802; variable 7 correlated most highly with variable 8, r = .868; variable 8 correlated second highest with variable 9, r = .783; at the same time variable 32 correlated -.770 with variable 6, .802 with variable 7, and -.683 with variable 8. Although the magnitude of these correlations was essentially equal, the Lingoes technique did not include variable 32 in this group of items because it never became the first choice of any of these items. The quartimax technique did not make the same mistake.

TABLE 12
THE THREE HIGHEST CORRELATIONS WITH EACH VARIABLE AND THEIR CLASSIFICATIONS INTO THE SAME GROUPS BY THE TWO ANALYSES

Variable   Highest Three          Number in Same Group Classified by
no.        Variables (ordered)    Lingoes     Quartimax

    1      13, 16, 12                3            0
    2      13, 16, 14                3            2
    3      31, 7, 8, 9               3            0
    4      5, 17, 18, 19             1            0
    5      4, 6, 27                  2            2
    6      7, 27, 32                 1            3
    7      8, 29, 32, 6              2            4
    8      7, 9, 6                   3            3
    9      8, 10, 31                 2            3
   10      9, 7, 31                  2            3
   11      9, 8, 10                  3            3
   12      13, 14, 15                3            3
   13      12, 14, 15                3            3
   14      15, 12, 13                3            3
   15      14, 12, 13                3            3
   16      11, 12, 9                 3            2
   17      4, 18, 22                 2            1
   18      19, 20, 21, 33            3            1
   19      18, 20, 21, 33            3            1
   20      21, 22, 23                3            3
   21      20, 22, 23                3            3
   22      21, 20, 23                3            3
   23      22, 20, 21                3            3
   24      7, 8, 9                   0            3
   25      26, 5, 31                 1            3
   26      25, 5, 8, 9               1            4
   27      28, 7, 6                  1            3
   28      27, 6, 7                  1            3
   29      7, 8, 9                   0            3
   30      31, 33, 7                 1            3
   31      32, 7, 30                 2            3
   32      7, 29, 6                  0            3
   33      18, 19, 7                 0            3
   34      10, 9, 8                  0            3

TABLE 13
NUMBER OF VARIABLES CLASSIFIED IN GROUPS OF VARIABLES WITH WHICH THEY ARE MOST HIGHLY CORRELATED

By Each of                  Second      Third
the Analyses    Highest     Highest     Highest

Lingoes           28          21          22*
Quartimax         28          26          33*

*Includes variables tied for third highest.

Generally, then, since both the models are based on the same fourfold table and the two basic coefficients, phi coefficients and agreement scores, are closely related, the two analyses achieve similar results. The findings in the first two sets of empirical data support this conclusion. However, the Knudsen data demonstrate that factor analysis is superior under certain conditions, because it is not as dependent on the primary correlation or agreement. The next set of data (the Trier data) shows that this superiority does not always hold.

The Trier Data

Trier (1959) collected data from a random sample of 242 housewives in Lansing, Michigan. These housewives were asked to respond to a 37-item scale which measured the components that went into their decision-making on buying food. The data were then factored and rotated by the quartimax method. The results consisted of ten factors, accounting for 93 percent of the variance (Trier did not use the Kiel-Wrigley criterion).
All variables with loadings greater than .30 on any factor were included; the report included 29 variables, since 8 did not have a loading of .30 on any factor, as can be seen in Table 14.

TABLE 14
ITEMS FROM DECISION MAKING QUESTIONNAIRE ARRANGED ACCORDING TO FACTORS IN WHICH THEY APPEAR

Factor A. Cost of Food
  .629  25. Before I go to the store I figure which foods I can buy that will cost the least amount of money.
  .542  28. I read the newspapers to find which food stores are having specials and I shop at those stores which are having an attractive offer.
  .520  16. When I go food shopping I take and use a pencil and paper, or some other device to aid in figuring.
  .506   8. I look in the food pages of the newspapers for food items that can be quickly and easily prepared.
  .447   9. I buy something else when the price of a food item I usually buy goes up.
  .417  29. I follow the prices of food very closely. I know when the basic foods have either increased or decreased even by only a few cents.
  .412  24. The meat that I decide to buy is often determined by the number of servings and meals which I have figured it will supply.
  .397  22. I compare instructions on various food packages so that I can calculate which foods will be easiest and quickest to prepare.

Factor B. Friends (Indirect)
  .662  17. I pretty well know what foods my friends like or dislike.
  .598  14. The meals that we eat are very similar to the meals that our friends eat.

Factor C. Friends (Direct)
  .653  13. Conversations with my friends have changed some of my food buying habits.
  .498   2. I have received some excellent ideas about food buying from our friends.

Factor D. Parents
  .654  31. The type of meals and the kinds of foods that we eat are entirely different from those of our parents.
 -.562  23. The meals that my family eats are very similar to the meals that my parents eat or ate.
 -.487  18. The meals that my family eats are very similar to the meals that my spouse's parents eat or ate.
 -.406   5. My parents have given me many recipes and/or ideas on food buying.

Factor E. Preparation (Time)
  .654  33. I like to spend as little time as possible preparing meals, that is why I buy the good, frozen, ready made dishes that can be prepared easily.
  .612  34. I seldom spend more than 30 minutes in preparing the day's largest meal.

Factor F. Husband (Direct)
  .643  36. My husband wants nothing to do with the food buying or deciding what to eat.
  .561  35. I decide how much money our family can or will spend for food.
  .461  20. In our home, I am the boss of the kitchen and how much and what foods are purchased is my concern.

Factor G. Husband (Indirect)
  .618  15. My husband tells me how much money can be spent for food.
  .578   4. My husband is the business man in this family. I let him work out the budget and amount we have available for foods.

Factor H. Food Value
  .558   3. The food decisions I make before going to the store are based primarily on the flavorfulness and healthfulness of the food.
  .473   1. In buying foods I figure the amount of calories and nutrients they contain.

Factor I. Food Quality
  .605  30. When I buy food, I buy the very best quality no matter what price.

Factor J. Mass Media
  .488  27. I read Consumers' Union, Consumers' Research, Changing Times, government publications, or consumer service publications to get ideas on buying foods.
  .308   7. I listen to the information services offered on the radio or TV to find out which foods are good and nutritious.
  .302  12. I read many magazine articles concerning foods and meals.

Items Not Appearing in Above Factors
         6. Before I go to the market I make out a complete grocery list.
        10. I can tell very quickly when the flavor, freshness, or appearance of a food that I have been buying regularly changes.
        11. My spouse's parents have given me many recipes and/or ideas on food buying.
        19. My husband tells me what I should buy at the store.
        21. I plan my menu, meal for meal, a couple of days in advance.
        26. I figure out in advance before going to the store foods and meals that can be quickly and easily prepared.
        32. My friends and I discuss menus, meals, and foods with each other.
        37. Often before I buy a certain brand of food item, I compare the various sizes to determine the actual cost per ounce for each size.

For purposes of this study Trier's data were subjected to Lingoes' technique. A twenty percent criterion was used; i.e., a maximum of 48 errors was allowed between any pair of items within a scale. The results were 8 scales with reproducibilities ranging from .745 to .877. These 8 scales accounted for 34 of the variables, leaving only 3 unclassified.
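The arithmetic of the twenty percent criterion can be sketched as below. This is only an illustration under stated assumptions: the sample size of 242 is taken from the title of Trier's thesis, and the rule for counting an "error" between two dichotomous items (the smaller of the two disagreement cells of the fourfold table) is assumed here, since the counting rule is not spelled out in this section. The function names are hypothetical.

```python
import math


def max_pair_errors(n_respondents, criterion=0.20):
    """Largest whole number of errors allowed under a proportional criterion."""
    return math.floor(criterion * n_respondents)


def pair_errors(x, y):
    """Count errors between two 0/1 items (ASSUMED rule, see lead-in).

    b = respondents passing x but failing y; c = the reverse.
    The smaller cell is treated as the error count for the pair.
    """
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    return min(b, c)


def within_criterion(x, y, criterion=0.20):
    """True if the pair's error count does not exceed the allowed maximum."""
    return pair_errors(x, y) <= max_pair_errors(len(x), criterion)


# With 242 respondents, twenty percent is 48.4, so 48 whole errors are allowed.
print(max_pair_errors(242))
```

Under these assumptions the figure of 48 errors quoted above follows directly from the sample size.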
The results of the analysis are presented in Table 15.

TABLE 15
ITEMS FROM DECISION MAKING QUESTIONNAIRE ARRANGED IN ORDER ACCORDING TO SCALES IN WHICH THEY APPEAR

Scale 1. Dominant Spouse in Decisions (R = .823)
  -19. My husband tells me what I should buy at the store.
  -15. My husband tells me how much money can be spent for food.
   -4. My husband is the businessman in this family. I let him work out the budget amount we have for foods.
   20. In our home I am the boss of the kitchen, and how much and what foods are purchased is my concern.
   35. I decide how much our family can or will spend for food.
   36. My husband wants nothing to do with the food-buying or deciding what to eat.

Scale 2. Economy (R = .765)
   24. The meat that I decide to buy is often determined by the number of servings and meals which I have figured it will supply.
   25. Before I go to the store, I figure which foods I can buy that will cost the least amount of money.
   37. Often before I buy a certain brand of food item, I compare the various sizes to determine the actual cost per ounce of each size.
   29. I follow the prices of foods very closely; I know when the basic foods have increased or decreased by even a few cents.
   28. I read the newspapers to find which food stores are having specials, and I shop at those stores which are having an attractive offer.
    7. I listen to the information services offered on the radio to find out which foods are good and nutritious.

Scale 3. Influence of Friends (R = .877)
   32. My friends and I discuss menus, meals and foods with each other.
    2. I have received some excellent ideas about food-buying from my friends.
   13. Conversations with my friends have changed some of my food-buying habits.

Scale 4. Information Gathering (R = .755)
    3. The food decisions I make before going to the store are based primarily on the flavorfulness and healthfulness of the food.
   12. I read many magazine articles concerning food and meals.
   27. I read Consumer's Union, Consumers' Research, Changing Times, government publications, or consumer service publications to get ideas on buying foods.
   11. My spouse's parents have given me many recipes and/or ideas on food-buying.
   18. The meals that my family eat are very similar to the meals that my spouse's parents eat or ate.

Scale 5. Preparation Time (R = .745)
   34. I seldom spend more than thirty minutes in preparing the day's largest meal.
   33. I like to spend as little time as possible preparing meals, that is why I buy the good frozen ready-made dishes that can be prepared quickly.
    8. I look in the food pages of the newspapers for food items that can be quickly or easily prepared.
   26. I figure out in advance before going to the store foods and meals that can be quickly and easily prepared.
   22. I compare instructions on various food packages so that I can calculate which food can be easiest and quickest to prepare.

Scale 6. Organization (R = .775)
  -16. When I go food shopping, I take and use a pencil and paper or some other device to aid in figuring.
   -1. In buying foods I figure the amount of calories and nutrients they contain.
  -21. I plan my menu, meal for meal, a couple of days in advance.

Scale 7. Influence of Wife's Parents (R = .816)
   -5. My parents have given me many recipes and/or ideas on food buying.
  -23. The meals that my family eats are very similar to the meals that my parents eat or ate.
  -14. The meals that we eat are very similar to the meals that our friends eat.

Scale 8. Food Quality (R = .752)
   30. When I buy food, I buy the very best quality no matter what the price.
   -9. I buy something else when the price of a food item I usually buy goes up.
  -17. I pretty well know what my friends like or dislike.

Not Appearing
    6. Before I go to the market I make out a complete grocery list.
   10. I can tell very quickly when the flavor, freshness, or appearance of a good food that I have been buying regularly changes.
   31. The size of the meals and the kinds of foods that we eat are entirely different from those of our parents.

Certain similarities were apparent in the results of the two analyses: Scale 1 contained the two items which defined Trier's Factor G and the three which defined his Factor F; Scale 2 contained four of the eight items in Factor A; Scale 3 contained the two items which defined Factor C; Scale 4 contained two of the three items which defined Factor J; Scale 5 contained the two items which defined Factor E and two of the eight items which defined Factor A; Scale 7 contained two of the four items which defined Factor D; and both analyses failed to classify items 6 and 10. Obviously, then, many of the items were appearing together in both analyses.

No factor contained a complete scale, but Scale 1 contained Factors F and G. In addition, two other scales contained a factor each. Also, eight scales accounted for 34 items, while ten factors accounted for only 29 items. The two sets of results seemed about equal in interpretability with, in this author's judgment, a small advantage going to the multiple scalogram analysis results. The collapsing of Factors F and G and the greater length of the scales than that of the factors accounted for this advantage. Again, it should also be noted that multiple scalogram analysis gave the advantage of allowing direct comparisons of the subjects.

Examination of the information in Table 16 (which lists for each variable the three variables with which it showed the highest correlations) yielded further comparisons. The correlation matrix is in Appendix V. The Lingoes method was superior to quartimax 15 times; the quartimax method was superior to the Lingoes method 12 times.
Additional information, contained in Table 17, allowed yet further comparisons.

TABLE 16
THE THREE HIGHEST CORRELATIONS WITH EACH VARIABLE AND THEIR CLASSIFICATIONS INTO THE SAME GROUPS BY THE TWO METHODS

Variable   Highest Three    Number in Same Group Classified by
  no.        Variables         Lingoes      Quartimax
   1        7,  3,  2             0             1
   2       13, 22,  5             2             1
   3        1, 10,  4             0             1
   4       15, 35,  2             2             1
   5       23, 11,  2             1             1
   6       16, 34,  4             0             0
   7       12,  8, 13             0             1
   8       26, 25,  7             1             1
   9       25, 28, 37             0             2
  10        3, 22, 28             0             0
  11       18,  5, 31             1             0
  12       32,  7, 17             0             1
  13        2, 32,  7             2             1
  14       17, 23, 32             1             1
  15        4, 35, 25             2             1
  16       25, 29, 28             0             3
  17       14, 32, 12             0             1
  18       23, 11, 31             1             2
  19        4, 25,  8             1             0
  20       35, 36, 21             2             2
  21       26, 37, 29             0             0
  22       26, 13,  8             2             1
  23       31, 18,  5             1             3
  24       25, 26,  1             1             1
  25        9, 16, 24             1             3
  26       22,  8, 21             2             0
  27        7, 12, 16             1             2
  28       37, 29, 25             3             2
  29       37, 28, 16             2             2
  30       25,  9,  3             1             0
  31       23, 18,  5             0             3
  32       13, 17, 12             1             0
  33       34,  8, 26             3             1
  34       33,  6, 31             1             1
  35       36, 20,  4             3             2
  36       35, 20,  4             3             2
  37       28, 29,  9             2             0

TABLE 17
NUMBER OF VARIABLES CLASSIFIED IN GROUPS OF VARIABLES WITH WHICH THEY ARE MOST HIGHLY CORRELATED

Method       Highest   Second Highest   Third Highest
Lingoes         20           16               7
Quartimax       22           14               8

All of this information combined indicated that for this set of data neither technique was greatly superior to the other. Again it is illustrated that, despite differences in the two models, the close association between the basic starting computations, phi coefficients and agreement scores, led to similar results. There were no large technical differences, as there were in the Knudsen data, probably because the items were only loosely related--the largest correlation between any pair of items was only .58, between items 4 and 15. Since there was no empirical basis for preferring one technique over the other, the Lingoes method was favored, because it used fewer groupings and accounted for more items.
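The kind of tally reported in Table 16 can be sketched as follows. This is a minimal illustration and not the procedure actually used: it assumes that "highest" correlations are ranked by absolute size, that ties are broken by variable number, and that each variable carries at most one group label (None for unclassified items). The five-variable correlations and the group labels are invented toy values, and all function names are hypothetical.

```python
def symmetric(pairs):
    """Build a full lookup corr[i][j] from one-sided {(i, j): r} entries."""
    corr = {}
    for (i, j), r in pairs.items():
        corr.setdefault(i, {})[j] = r
        corr.setdefault(j, {})[i] = r
    return corr


def top_correlates(corr, var, k=3):
    """The k variables most strongly correlated with `var`, ranked by |r|."""
    others = [v for v in corr if v != var]
    others.sort(key=lambda v: (-abs(corr[var][v]), v))
    return others[:k]


def same_group_count(corr, groups, var, k=3):
    """How many of the k top correlates share the group assigned to `var`."""
    group = groups.get(var)
    if group is None:  # unclassified variables match nothing
        return 0
    return sum(1 for v in top_correlates(corr, var, k) if groups.get(v) == group)


# Invented toy data: five items, two groups.
corr = symmetric({
    (1, 2): 0.60, (1, 3): 0.50, (1, 4): 0.10, (1, 5): -0.40,
    (2, 3): 0.55, (2, 4): 0.05, (2, 5): -0.20,
    (3, 4): 0.15, (3, 5): 0.10,
    (4, 5): 0.45,
})
groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}

for var in sorted(corr):
    print(var, top_correlates(corr, var), same_group_count(corr, groups, var))
```

Running the same count once with the scale memberships and once with the factor memberships, and comparing the two counts variable by variable, gives the sort of comparison summarized in the text.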
Discussion of the Empirical Results

Despite the differences between multiple scalogram analysis and factor analysis, their separate applications to empirical data brought quite similar results, a similarity apparently due to the relationship between the calculation of a phi coefficient and the calculation of an agreement score.

In the empirical data the number of groups of items resulting from each of these analyses was in every case approximately the same, which makes statements about the parsimony of one technique over the other impossible. As was predicted in the more theoretical discussion, the effect of the difference in marginal difficulty was apparent: factors were more homogeneous than scales. The major advantage of factor analysis seemed to be its ability to utilize correlations that were smaller than the highest correlation for a given item. The major advantages of multiple scalogram analysis appeared to be that it gave more information for comparisons among the subjects and that it gave some information as to how well the data fit the analysis model--the reproducibility coefficient.

SUMMARY AND CONCLUSIONS

The theoretical analysis of the two techniques indicated that factor analysis has three major advantages over multiple scalogram analysis: it allows for multiple classification of the variables, it results in orthogonal factors (independent groupings), and it utilizes more of the relationships among the variables (lesser correlations as well as the highest). The empirical data suggested that for dichotomous data only the last of these advantages is salient (although for other data the multiple classification of items might be important). That the factors were orthogonal was not important, because multiple scalogram analysis produced similar groupings which were correlated without the weightings (factor loadings).

The theoretical discussion of multiple scalogram analysis indicated the following advantages for multiple scalogram analysis: it requires fewer complicated mathematical manipulations of the data; it indicates order among the variables and relations among the respondents; and it gives an indication of how well the model of multiple scalogram analysis is met--the reproducibility coefficient. The analyses of the empirical data suggest that the first of these is unimportant, because despite the differences in the mathematical demands the results were similar.

One of the major difficulties pointed out in the theoretical discussion of factor analysis was the possibility of difficulty factors. In the empirical data, at no time did a factor appear which could be considered a difficulty factor, which indicates that such factors do not always occur among the main factors, even when the items being studied differ widely in level of difficulty.

One difficulty with multiple scalogram analysis in the study of dichotomous data is that items with extreme marginal proportions will scale with almost any item whether the two items are related or not. Another problem is that when items are approximately equal in difficulty level it is harder to achieve an acceptable scale; however, as the results of the analyses of empirical data indicate, if the marginals differ only slightly and the correlations are sufficiently high, multiple scalogram analysis can still group them together.

The major finding of this study is that factor analysis and multiple scalogram analysis generally give quite similar results in a wide variety of empirical data, particularly when the factoring is done in conjunction with the quartimax rotation. This similarity of results seems to stem primarily from the close association between the starting points for the two techniques--the calculation of phi coefficients for factor analysis and the calculation of agreement scores for multiple scalogram analysis.
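The two starting computations can be placed side by side on the same fourfold table. The sketch below is illustrative only: phi is computed by the standard fourfold formula, while the agreement score is assumed here to be the number of respondents giving the same response to both items, which may differ in detail from the score used in the Lingoes program. The response vectors are invented, and the function names are hypothetical.

```python
import math


def fourfold(x, y):
    """Cell counts (a, b, c, d) for two 0/1 items: 11, 10, 01, 00."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def phi(x, y):
    """Phi coefficient: (ad - bc) / sqrt of the product of the four marginals."""
    a, b, c, d = fourfold(x, y)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def agreement(x, y):
    """ASSUMED agreement score: respondents answering both items alike."""
    a, _, _, d = fourfold(x, y)
    return a + d


# Two related items of middling difficulty (50 percent passing each).
x_mid = [1] * 50 + [0] * 50
y_mid = [1] * 40 + [0] * 10 + [1] * 10 + [0] * 40

# Two statistically independent items with extreme marginals (90 percent each).
x_ext = [1] * 90 + [0] * 10
y_ext = [1] * 81 + [0] * 9 + [1] * 9 + [0] * 1

print(phi(x_mid, y_mid), agreement(x_mid, y_mid))
print(phi(x_ext, y_ext), agreement(x_ext, y_ext))
```

In this invented example the independent pair with extreme marginals agrees on slightly more respondents than the clearly related pair, although its phi is zero; this is the sense in which items with extreme marginal proportions can scale with almost any item.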
Since the results of the two analyses do appear to be similar in many cases, how does the investigator decide which technique would be better for his purposes? If he hypothesizes that the items which should belong together are approximately equal in difficulty and he is interested in orthogonal weightings of the variables within groups with respect to other groups, he is probably better off using factor analysis. If the items differ widely in difficulty and he still prefers factor analysis, he must be careful not to overlook the possibility of difficulty factors. Only under three conditions will the investigator probably be better off choosing multiple scalogram analysis--if he feels that the items which belong together differ in level of difficulty, if he is interested in the ordering of the variables, or if he is interested in the relationships among the subjects.

Where it is feasible, this author suggests that the investigator will derive the maximum amount of information if he uses the two techniques in conjunction. The factor analysis will provide information about the variables, including aspects of all intercorrelations, while the scalogram analysis will provide information concerning the ordering of the variables and the relations among the individuals.

There are still some questions not answered by this research which deserve further study: What relationships do other techniques of analyzing dichotomous data have to these methods? Scalogram analysis gives an indication of the extent to which the model is met by the data; can any similar measure be found for factor analysis for use with dichotomous data? Is there a more appropriate measure of the interrelationships among dichotomous variables than either phi coefficients or agreement scores, which would lack their inherent problems?

BIBLIOGRAPHY

Burt, C. Correlations between persons. Brit. J. Psychol., 1937, 28, 59-96.
Burt, C. The factorial analysis of qualitative data. Brit. J. Psychol., Stat. Sect., 1950, 3, 166-185.
Burt, C. Scale analysis and factor analysis. Brit. J. Statist. Psychol., 1953, 6, 5-23.
Cattell, R.B. Factor Analysis, New York: Harper & Brothers, 1952.
Comrey, A.L., & Levonian, E. A comparison of three point coefficients in factor analysis of MMPI items. Educ. Psychol. Meas., 1958, 18, 739-755.
Cureton, E.E. Notes on phi/phi max. Psychometrika, 1959, 24, 89-91.
Dingman, H.F. The relation between coefficients of correlation and difficulty factors. Brit. J. Statist. Psychol., 1958, 11, 13-17.
DuBois, P.H. An analysis of Guttman's simplex. Psychometrika, 1960, 25, 173-182.
Eysenck, H.J., & Crown, S. An experimental study in opinion-attitude methodology. Int. J. opin. att. Res., 1949, 3, 47-86.
Ferguson, G.A. The factorial interpretation of test difficulty. Psychometrika, 1941, 6, 323-329.
Festinger, L. The treatment of qualitative data by "scale analysis". Psychol. Bull., 1947, 44, 149-161.
Gourlay, N. Difficulty factors arising from the use of tetrachoric correlations in factor analysis. Brit. J. Statist. Psychol., 1951, 4, 65-76.
Green, B.F. Attitude measurement. In G. Lindzey et al., Handbook of Social Psychology, pp. 335-368, Cambridge, Mass.: Addison-Wesley Publishing Co., Inc., 1954.
Guttman, L. Multiple rectilinear prediction and the resolution into components. Psychometrika, 1940, 5, 75-90.
Guttman, L. The quantification of a class of attributes. In P. Horst et al., The Prediction of Personal Adjustment, pp. 319-348, New York: Soc. Sci. Res. Council, 1941.
Guttman, L. A basis for scaling qualitative data. Amer. Sociol. Rev., 1944, 9, 139-150.
Guttman, L. In Stouffer et al., Measurement and Prediction, Princeton, N.J.: Princeton University Press, 1950.
Guttman, L. Scale analysis, factor analysis, and Dr. Eysenck. Int. J. opin. att. Res., 1951, 5, 103-120.
Guttman, L. In P. Lazarsfeld et al., Mathematical Thinking in the Social Sciences, Glencoe, Ill.: The Free Press, 1954.
Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol., 1933, 24, 417-441, 498-520.
Kaiser, H.F. The varimax criterion for analytic rotation in factor analysis. Psychometrika, 1958, 23, 187-200.
Kiel, 0., & Wrigley, C.F. A criterion for the number of factors to be extracted from a matrix. Unpublished research, M.S.U., 1960.
Knudsen, Karel G. A comparison of two methods, multiple scalogram analysis and factor analysis, for analyzing United Nations voting behavior. Unpublished M.A. thesis, M.S.U., 1962.
Lazarsfeld, P.F. The logical and mathematical foundations of latent structure analysis. In Stouffer et al., Measurement and Prediction, pp. 362-472, Princeton, N.J.: Princeton University Press, 1950.
Lingoes, J.C. Multiple Scalogram Analysis. Computational Rep., 1960, 1, Michigan State Computer Laboratory, East Lansing.
Lingoes, J.C. Multiple scalogram analysis: A generalization of Guttman's scale analysis. Unpublished Ph.D. thesis, M.S.U., 1960.
Loevinger, Jane. The technic of homogeneous tests compared with some aspects of "scale analysis" and factor analysis. Psychol. Bull., 1948, 45, 507-529.
Neuhaus, J.O., & Wrigley, C.F. The quartimax method: an analytic approach to orthogonal simple structure. Brit. J. Statist. Psychol., 1954, 7, 81-91.
Peatman, J.G. Descriptive and Sampling Statistics, New York: Harper & Brothers, 1947.
Restle, F.A. A metric and an ordering on sets. Psychometrika, 1959, 24, 207-220.
Stouffer, S.A. (Ed.) Measurement and Prediction, Princeton, N.J.: Princeton University Press, 1950.
Thomson, G.H. The Factorial Analysis of Human Ability, London: University of London Press, Ltd., 1951.
Thurstone, L.L. Multiple-Factor Analysis, Chicago: University of Chicago Press, 1947.
Torgerson, W.S. Theory and Methods of Scaling, New York: John Wiley & Sons, Inc., 1958.
Trier, H.E. Sociological variables, personality traits, and buying attitudes related to role conflicts among 242 Michigan wives. Unpublished Ph.D. thesis, M.S.U., 1959.
Wherry, R.J., & Gaylord, R.H.
Factor patterns of test items and tests as a function of the correlation coefficients: content, difficulty, and constant error factors. Psychometrika, 1944, 9, 237-244.

APPENDIX I
An Example With More Levels of Difficulty Than Factors

When there are n patterns of responses and more than n levels of difficulty, there will not be more than n factors. Suppose that there are 100 people with three separate seven-item Guttman scales superimposed on their responses and that the scale scores are distributed as indicated in the illustration given below. With this distribution of people nineteen levels of difficulty are obtained, but there are still only eight patterns of responses, because every time that a given pattern occurs on one of the scales a specific pattern occurs on each of the other scales. The principle indicates that not more than eight factors will be produced. In this particular example only seven roots were found. The latent roots (a non-zero latent root indicates a factor) from the analysis of this data are presented below.

The twenty-one latent roots are as follows:

6.514  0.000  0.000  0.000  3.536  0.000  0.000
5.109  2.535  0.000  0.000  0.000  0.000  1.103
1.729  0.000  0.000  0.474  0.000  0.000  0.000

Number of persons with the indicated response pattern, by item

                    Scale 1                     Scale 2                     Scale 3
Persons      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21
    10       1   1   1   1   1   1   1   1   1   1   0   0   0   0   1   1   1   1   1   1   1
    20       1   1   1   1   1   1   0   1   1   1   1   0   0   0   0   0   0   0   0   0   0
    15       1   1   1   1   1   0   0   1   1   1   1   1   0   0   1   1   1   1   1   0   0
    30       1   1   1   1   0   0   0   1   1   1   1   1   1   0   1   0   0   0   0   0   0
    15       1   1   1   0   0   0   0   0   0   0   0   0   0   0   1   1   1   1   0   0   0
     5       1   1   0   0   0   0   0   1   1   0   0   0   0   0   1   1   0   0   0   0   0
     3       1   0   0   0   0   0   0   1   0   0   0   0   0   0   1   1   1   0   0   0   0
     2       0   0   0   0   0   0   0   1   1   1   1   1   1   1   1   1   1   1   1   1   0

Item
difficulty .98 .95 .90 .75 .45 .30 .10 .85 .82 .77 .67 .47 .32 .02 .80 .50 .45 .42 .27 .12 .10

APPENDIX II
Unrotated and Rotated Factor Loadings of Items on Factors

Two perfect six-item Guttman scales were superimposed on the same theoretical group of people in such a way that the correlation between the two scales was zero and the interitem correlation between any pair of items, one from the first scale and the other from the second, was zero. No indication is found of this complete independence of the separate scales in the unrotated factor loadings below; loadings from separate scales are approximately equal. The rotated loadings do indicate the independence of the items and scales, for when the loadings of the items from one scale are high the loadings for the items of the other scale are essentially zero.

Variable   Unrotated   Rotated
number     loadings    loadings

Factor 1
    1        .433        .066
    2        .559        .247
    3        .612        .484
    4        .612        .704
    5        .559        .830
    6        .433        .745
    7       -.433        .001
    8       -.559        .001
    9       -.612       -.001
   10       -.612       -.003
   11       -.559       -.004
   12       -.433       -.004

Factor 2
    1       -.417       -.017
    2       -.323       -.018
    3       -.118       -.014
    4        .118       -.008
    5        .323       -.002
    6        .417        .002
    7        .417        .065
    8        .323        .245
    9        .118        .483
   10       -.118        .703
   11       -.323        .829
   12       -.416        .744

Factor 3
    1        .433       -.004
    2        .559       -.004
    3        .612       -.003
    4        .612       -.001
    5        .559        .001
    6        .433        .001
    7        .433       -.745
    8        .559       -.830
    9        .612       -.704
   10        .612       -.484
   11        .559       -.247
   12        .433       -.066

Factor 4
    1       -.416       -.744
    2       -.323       -.829
    3       -.118       -.703
    4        .118       -.483
    5        .323       -.245
    6        .416       -.065
    7       -.417        .002
    8       -.323       -.002
    9       -.118       -.008
   10        .118       -.014
   11        .323       -.018
   12        .417       -.017

APPENDIX III
Correlation Matrices of the Two Guttman "Attitude Towards Officers" Forms

Correlations for Form A*

Variable   1   2   3   4   5   6
    1     --
    2     57  --
    3     54  64  --
    4     31  28  25  --
    5     51  59  65  20  --
    6     47  42  40  19  53  --
    7     37  36  43  20  31  32

*Decimal points have been omitted.

Correlations for Form B*

Variable   1   2   3   4   5   6
    1     --
    2     37  --
    3     41  44  --
    4     57  40  61  --
    5     44  44  65  60  --
    6     36  33  54  49  49  --
    7     25  37  45  38  33  32

*Decimal points have been omitted.
APPENDIX IV
Intercorrelations Among the Items in the Knudsen Data*

Variable
     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34
 1  --
 2 -01  --
 3 -03 -03  --
 4 -05 -05 -09  --
 5 -06 -06 -12  78  --
 6 -09 -09  18  51  65  --
 7 -11 -11  25  41  52  80  --
 8 -12 -12  23  37  48  74  87  --
 9 -12 -12  23  37  48  63  71  78  --
10 -15 -15  19  32  41  63  73  68  74  --
11 -16 -16  18  29  37  57  66  72  72  69  --
12 -21 -21  13  22  28  43  53  58  58  68  68  --
13 -23 -23  12  20  25  39  49  53  53  53  62  92  --
14  06 -22  13  21  27  41  51  55  55  65  65  88  88  --
15  07 -20  14  23  30  46  57  62  62  67  67  86  78  89  --
16 -22 -22 -02  21  27  41  45  42  49  45  58  49  47  36  37  --
17 -07 -07 -01  58  40  18  29  31  25  22  11  25  22  23  28  08  --
18 -08 -08  09  57  58  71  65  60  60  50  45  31  28  29  34  29  45  --
19 -08 -08  04  57  58  65  60  54  60  50  45  31  28  29  34  29  39  94  --
20 -10 -10  03  45  44  50  46  44  50  31  24  34  30  33  39  13  42  79  74  --
21 -10 -10  03  45  44  50  46  44  50  31  24  34  30  33  39  13  42  79  74 1.0  --
22 -11 -11  02  43  48  51  51  44  49  35  22  38  33  35  42  16  45  75  69  95  95  --
23 -12 -12  00  40  44  56  60  51  57  42  28  42  37  39  46  20  40  69  64  87  87  92  --
24 -11 -11  14  43  48  62  73  71  71  46  57  38  40  35  42  35  39  69  64  56  56  57  60  --
25 -09 -09  06  51  58  48  47  52  52  46  34  36  32  34  40  27  37  47  41  50  50  51  50  40  --
26  15 -09  06  51  58  54  52  57  57  46  52  36  32  41  46  34  24  41  35  28  28  29  33  46  60  --
27 -10 -10  05  48  61  78  79  73  62  61  61  45  42  43  49  37  21  60  60  45  45  45  55  62  49  55  --
28 -10 -10  03  45  58  73  68  66  61  59  53  41  37  39  39  39  05  50  50  34  34  35  44  51  44  50  84  --
29 -12 -12  12  40  50  57  81  73  73  69  68  55  50  53  59  46  22  58  64  44  44  49  51  71  33  45  72  66  --
30  13  13 -21 -36 -39 -59 -66 -62 -52 -51 -54 -48 -49 -45 -41 -39 -34 -62 -62 -56 -56 -50 -52 -61 -36 -36 -58 -40 -57  --
31  11  11 -25 -42 -47 -71 -76 -74 -74 -71 -58 -52 -48 -43 -43 -43 -25 -67 -62 -54 -54 -48 -57 -59 -55 -49 -71 -59 -62  74  --
32  09  09 -18 -51 -58 -77 -80 -68 -63 -63 -52 -43 -39 -41 -46 -34 -31 -71 -71 -50 -50 -51 -56 -62 -43 -43 -72 -61 -78  59  77  --
33  10  10 -03 -45 -58 -73 -73 -66 -55 -53 -53 -41 -37 -33 -39 -40 -36 -80 -74 -56 -56 -56 -60 -67 -44 -50 -72 -56 -66  67  70  67  --
34  10  10 -05 -48 -47 -60 -57 -62 -62 -66 -49 -45 -42 -43 -49 -37 -37 -43 -43 -34 -34 -34 -39 -45 -49 -55 -61 -61 -55  35  54  55  39  --

*Decimal points have been omitted.

APPENDIX V
Intercorrelations Among the Items in the Trier Data*

Variable
     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37
 1  --
 2  25  --
 3  27  12  --
 4  11  22  20  --
 5  08  25  11  05  --
 6  03  05  02  18  02  --
 7  28  25  12  05  20  12  --
 8  18  20  17  05  11  08  29  --
 9  06  06  05 -01  10 -03  13  23  --
10  06  10  22  08  07 -05  13  06  06  --
11  04  17  00 -03  26  07  02  08 -07  01  --
12  19  10  19  01  12  09  31  14  04  13  06  --
13  01  49  10  06  24  16  29  19  09  08  08  21  --
14 -05  11  03  09  11  00  07  06  10  03  09  16  12  --
15  06  05  10  58  09  13  06  06  04  00  02  02  05  10  --
16  04  02 -09  10 -02  23  24  26  13  05 -10  01  01 -03  12  --
17  12  11  06 -05  11 -09  20  14  01  12  09  26  22  41  10  09  --
18 -03  04 -05 -01  16  06  00  01  04  04  33 -07  01  15  04 -06  12  --
19  05  13  08  18  04  04  10  15  09  08  11  02  07 -03  10  07 -03  12  --
20  08  05  09 -16  01 -12  15  08  14  05  00  14  10  02 -12  00  08 -05 -01  --
21  15  01  06 -02  17  12  12  15  09  12 -03  15  06 -05 -04  19  07  04  02  21  --
22  11  18  10  08  18  05  23  23  09  19  13  14  26  16  05  09  21  10  15  05  16  --
23  04  09 -02  07  30 -04  11  04  14  15  16  02  07  28  12 -08  12  38  02 -03 -01  17  --
24  20  07  17  00  14 -01  14  19  12  05  03  11  14  09  01  13  01  07  09  09  17  18  19  --
25  13  15  10  18  07  17  25  29  41  11 -02  15  19  12  19  36  11  10  17  09  14  22  14  35  --
26  10  16  13  08  19  12  18  32  07  02  11  13  20  13  03  08  16  14  12  07  26  40  16  22  22  --
27  09  05  05 -06 -01  06  29  14  00 -02 -05  23  15 -10 -06  22  09 -11  04  08  16  03 -14  01  09  06  --
28  11  18  00 -02  11  12  28  24  32  17  03  22  27  16  01  26  10  02  00  10  18  16  05  13  33  13  14  --
29  12  00  11 -09  01 -02  22  24  22  12 -02  23  07  02 -02  28  10 -01 -05  13  22  08  06  12  27  08  16  37  --
30  04  00  16 -10  01  01 -01  03 -29  12  06  07 -04 -10 -10  03  03  01 -05 -01  06  02 -10 -06 -30  07  06 -07  10  --
31  00 -08 -05 -04 -21  00 -12 -03 -02  04 -20 -04 -04 -16 -01  13  01 -33  01  08  15 -03 -45 -15 -04 -11  09  07  07  06  --
32  05  29  15  06  13  03  13  05  10  08  08  34  43  26  02  01  39  03  01  11  10  19  04  15  26  11  18  17  23 -02  08  --
33  00  14  05  09 -01 -11  03  19 -04  01  10  06  09  15  01 -03  15  00  03  01 -08  16  07  02 -05  16  01  05 -10  05 -09  00  --
34  02  02 -01  03 -06 -21  07 -02 -01  03 -01 -14 -03  03 -01 -07  06  01  01  04 -12 -03  08 -13 -10 -05 -01 -06 -12  00 -16 -14  33  --
35  10 -01 -03 -32  01 -09  12  12  09  05 -03  08  10 -05 -20  11  03 -03  02  38  16  10 -01  13  11  08  12  15  20  09  17  07  11  04  --
36  03  09  02 -18  07 -17  17  05  07  03  03  05  02  15 -03  06  13  00 -06  28  08  03  05  11 -02  01  05  07  11  09  08  06  00  07  44  --
37  22  10  12  03  09  01  19  18  30  03  00  22  20  09  07  19  13  01 -03  00  22  18  13  15  33  13  19  38  38 -02  02  21 -08 -03  14 -02  --

*Decimal points have been omitted.