EFFECTS UPON THE FACTORIAL SOLUTION OF ROTATING VARYING NUMBERS OF FACTORS WITH DIFFERING INITIAL COMMUNALITY ESTIMATES

Thesis for the Degree of M. A.
MICHIGAN STATE UNIVERSITY
Donald F. Kiel
1966

ABSTRACT

EFFECTS UPON THE FACTORIAL SOLUTION OF ROTATING VARYING NUMBERS OF FACTORS WITH DIFFERING INITIAL COMMUNALITY ESTIMATES

by Donald F. Kiel

The most crucial problem in the entire field of factor analysis has been its subjectivity. While a number of issues must be resolved in order to arrive at a truly objective method of factor analysis, this study concerned itself with two key issues -- (1) the number of factors which should be considered for a particular set of data and (2) the effect of different initial communality estimates on the final solution.

Four correlation matrices, well known in factor analytic literature, were selected for study -- the eight physical variables, eight political variables and twenty-four psychological variable matrices from Harman (1) and the eleven Air Force classification tests from Fruchter (2). Three principal axes factor analyses were carried out on each matrix, one using unities (1.0) in the diagonals of the correlation matrix, another using "Guttman communalities" and a third using squared multiple correlations. The unrotated factors from each of the twelve analyses were ordered from largest to smallest on their corresponding latent roots and an extensive series of rotations was calculated using the normalized Varimax method. The two largest factors were rotated, then the three largest, and so on until all real factors were included in the set of solutions.

The findings indicated that there is very little difference in the numeric values of the largest factors regardless of the initial communality estimates used.
A point is reached, however, at which unique factors begin to appear when unities are used which are not present when squared multiple correlations are used as initial communality estimates. The factors have a tendency to split as additional factors are rotated, emerging in a definite hierarchical pattern. For example, a large verbal-deductive factor present in one solution may split into verbal and deductive factors when an additional unrotated factor is included in the solution.

As a result of this study, a criterion has been proposed (frequently called the Kiel-Wrigley Criterion) for the number of factors which should be included in a factor analytic report, i.e. "when to stop factoring." It suggests that unities are satisfactory as diagonal entries in the correlation matrix to be factored. A series of rotations should be carried out, starting with the two largest factors, then the three largest, etc., until a solution is reached which includes a rotated factor on which fewer than three of the variables have their highest loadings. This is based on the theory that three or more points are necessary to define a hyperplane in n-dimensional space.

EFFECTS UPON THE FACTORIAL SOLUTION OF ROTATING VARYING NUMBERS OF FACTORS WITH DIFFERING INITIAL COMMUNALITY ESTIMATES

By
Donald F. Kiel

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
MASTER OF ARTS
Department of Communication
1966

ACKNOWLEDGMENTS

While the writing of theses is never regarded by graduate students as a simple and enjoyable task, the extraction of the present document from the author has been a particularly arduous, frustrating and frequently painful chore. It has been produced through years of alternate patience, threats, kindness, cooperation and cajoling by many wonderful individuals -- too many, in fact, to mention all of their names.

My deepest appreciation goes to Professor Charles F.
Wrigley, director of the Computer Institute for Social Science Research at Michigan State University, who has been most deeply involved in the progress of this research and has been most responsible for guiding and influencing my thoughts for a number of years, both professionally and personally.

Professors David K. Berlo and Hideya Kumata of the Department of Communication at Michigan State University and Professor Malcolm MacLean of the Department of Journalism, University of Iowa have been wise counselors, stimulating teachers, good friends, and long-suffering advisors.

The late Professor Paul Deutschmann gave me many hours of excellent advice and encouragement. I sincerely regret that his untimely death in 1963 prevented him from seeing the culmination of this effort.

A special word of heart-felt appreciation to my wife, Mary, without whose love, patience and frequent persuasion this thesis would never have been completed.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES

Chapter
I. INTRODUCTION
II. DATA AND ANALYTICAL METHODS
    Introduction
    Correlation Matrices Used in the Study
    Computing Procedures Used in this Investigation
    Factor Rotation
III. THE EFFECT ON THE FACTORIAL STRUCTURE OF INCREASING THE NUMBER OF FACTORS IN THE ROTATED FACTOR SOLUTION
IV. THE EFFECT OF DIFFERENT COMMUNALITY ESTIMATES ON VARYING ROTATIONAL SOLUTIONS
V.
CONCLUSIONS
LITERATURE CITED

LIST OF TABLES

Intercorrelations Among Eight Physical Variables for 305 Girls
Intercorrelations of Eight Political Variables for 147 Election Areas (1932 Presidential Election)
Intercorrelations of Eleven Air Force Aptitude Tests
Intercorrelations of Twenty-four Psychological Tests for 145 Children
Unrotated Principal Axes Factors for Eight Physical Variables -- Unities Used in Leading Diagonals of Correlation Matrix
Two, Three, Four, Five and Six Factor Varimax Rotation Solutions for Eight Physical Variables
Eight Political Variables -- Unrotated Principal Axes Factors -- Unities Used in Leading Diagonals of Correlation Matrix
Eight Political Variables -- Varimax Rotation Solutions for Two, Three, Four, Five and Six Factors
Eleven Air Force Classification Tests -- Unrotated Principal Axes Factors -- Unities Used in Leading Diagonals of Correlation Matrix
Eleven Air Force Classification Tests -- Varimax Rotational Solutions for Two, Three, Four, Five and Six Factors
Twenty-four Psychological Tests -- Unrotated Principal Axes Factors -- Unities in Leading Diagonals of Correlation Matrix
Two, Three and Four Factor Varimax Rotation Solutions for Twenty-four Psychological Tests
Communality Estimates for Eight Physical Variables
Effects of Differing Communality Estimates on the First Four Unrotated Factors
Remaining Unrotated Principal Axes Factors with Guttman Communalities
Communality Estimates for Eight Political Variables
4.5 Effects of Differing Communality Estimates on the First Three Unrotated Factors for Eight Political Variables
4.6 Communality Estimates for Eleven Air Force Classification Tests
4.7 Eleven Air Force Classification Tests -- Effects of Different Communality Estimates on the First Five Unrotated P.A. Factors
4.8 Twenty-four Psychological Variables -- Initial Communality Estimates
4.9 Variance Contribution (Eigenvalues) of First Eight Unrotated Factors as a Percent of Total Original Communality for Twenty-four Psychological Variables

LIST OF FIGURES

Figure
3.1 Hierarchical Structure of Eight Physical Variables -- Two Through Eight Factor Solutions -- Varimax Rotation -- Unities
Hierarchical Structure of Eight Political Variables -- Two Through Eight Factor Solutions -- Varimax Rotation -- Unities
Hierarchical Structure of Eleven Air Force Classification Tests -- With Unities as Initial Communality Estimates
Hierarchical Structure of Twenty-four Psychological Tests -- Starting with Unities in Correlation Matrix
Hierarchical Structure of Eight Physical Variables Starting with Guttman Communalities
Hierarchical Structure of Eight Physical Variables Starting with Squared Multiple Correlations
Hierarchical Structure of Eight Political Variables Starting with Guttman Communalities
Hierarchical Structure of Eight Political Variables Starting with Squared Multiple Correlations
Hierarchical Structure of Eleven Air Force Classification Tests -- Starting with Guttman Communalities
Hierarchical Structure of Eleven Air Force Classification Tests -- Starting with Squared Multiple Correlations
Hierarchical Structure of Twenty-four Psychological Tests -- Starting with Guttman Communalities
Hierarchical Structure of Twenty-four Psychological Tests -- Starting with Squared Multiple Correlations
Eight Physical Variables
Eight Political Variables
Eleven Air Force Classification Tests
Twenty-four Psychological Variables

CHAPTER I

INTRODUCTION

Factor analysis is a mathematical method by which a large number of variables can be resolved, i.e. described, in terms of a small number of underlying categories or "factors". The technique is often mistakenly labeled as psychological theory, primarily because it had its foundations among psychologists who were attempting to develop mathematical models for the description of human intellectual ability, and its development for many years continued to be a psychological problem. Only in fairly recent years has there been a concerted effort to bring factor analysis into the realm of a recognized statistical theory.

The father of factor analysis is generally conceded to be Charles Spearman, who in 1904 published a paper (41) on the nature of intelligence in which he described a factor analytical investigation of various cognitive tests.
He spent the rest of his life developing his Two-Factor Theory, which stated that intellect consists of a general ability factor and a number of specific factors -- one for each test used. In other words, his theory was that each test contains two factors, a factor common to all tests and a factor specific to the individual test.

By the 1930's it became apparent that a single factor was not adequate to describe many batteries of psychological tests, and group factor theory developed. While many investigators contributed to the development of multiple-factor theory, the name of L. L. Thurstone is most widely recognized for popularization of the theory. His pioneering work, "Vectors of the Mind", published in 1935 (42), set forth a set of principles which have become popular among factor analysts and are known as Simple Structure Theory.

Implicit in this brief history is the controversy which has been generated over the years as to the proper system of factor analysis. The controversy has been primarily between British and American schools of factor analytic thought and -- of particular salience to this thesis -- the key issue has been the number of factors worth extracting from a given battery of variables. The British have traditionally tended to stop with fewer factors than have the Americans.

Excellent presentations of details of the various factor theories are available elsewhere (17, 7, 11, 18), and will be discussed where applicable in this thesis. The overriding problem, however, with the entire field of factor analysis has been its subjectivity. As Wrigley wryly states it (52):

... In most statistical work two persons who start with the same data and calculate correctly will reach the same answer. This is not necessarily the case for factor analysis.
This remains a method which depends upon arbitrary judgments by the investigator, so that skill is acquired only after long experience in estimating communalities, deciding upon the number of factors to be extracted, selecting pairs of factors for rotation, and so on. ... In general, mathematical statisticians have ignored factor analysis, treating it as a jungle of the psychologists' making, into which any self-respecting statistician would be most unwise to stray.

While there are a number of issues which must be resolved in order to arrive at an objective method of factor analysis -- that is, a method which does not depend primarily on human judgment -- this study will concern itself with only two of these issues, namely, the determination of the number of factors which should be considered for a particular set of data and the effect of different initial communality estimates on the solution.

Various criteria have been proposed in the past for either determining the number of factors which should be extracted in factoring a correlation matrix or determining the number of extracted factors which should be rotated. Philip Vernon, in an unpublished manuscript (44) in 1949, listed twenty-four indices of significance for centroid factors. Other indices have since been advanced. The question of how many factors to extract was more important with non-computerized, short-cut methods of factor analysis, when factors were determined serially, one at a time. Since desk calculator computing methods were extremely laborious for even small selections of variables, investigators usually preferred to stop with as few factors as possible.

Present day analytic methods using digital computers usually extract as many factors as variables, making more important the question of determining which of these factors should be considered meaningful for further analysis, i.e. how many should be rotated?
Two general classes of criteria have been suggested:

(1) Judgmental methods, which are based on arbitrary judgments of such things as the proportion of variance accounted for by each factor (accepting only those factors which account for more than, say, five or ten or twenty percent of the total variance); the clarity of the factor solution, which, unfortunately, depends upon some vague definition of the concept of clarity; or the size of the sums of squares of the loadings. Kaiser, for example, proposes that only those principal-axes factors, obtained from a correlation matrix with unities in the diagonals, which have latent roots greater than 1.0 should be accepted, on the theory that it is a doubtful gain to accept into the system any factors which contribute less information than a single test (25).

(2) Statistical criteria, which involve tests with an associated probability level to determine if factor loadings or residuals are significantly different from zero. Such statistical criteria have the disadvantage of being dependent upon the size of the sample from which the correlation matrix was calculated and usually result in a number of statistically significant factors which have no practical value (28, 19). A number of factor analysts have found that empirical tests of significance frequently lead to about the same results as the more proper statistical tests (17, p. 363).

Factor analysts, following the Thurstonian concept of simple structure, have tended to regard each factor as having equal status with every other factor, and to hold that there is therefore only one "correct" solution. It has also been the tradition for published factor analyses to provide only one of a series of interpretable solutions -- the particular one depending upon the investigator's criterion for, or judgment about, completeness of factor extraction. However, little seems to be known about the effects upon the final solution of varying the number of factors (52).
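Kaiser's latent-root rule described above can be illustrated with a short sketch. The Python code below counts the latent roots of a correlation matrix (with unities in the diagonal) that exceed 1.0. It is only an illustration under stated assumptions: the function name `kaiser_count` and the 4 x 4 matrix are hypothetical and are not taken from the matrices or programs used in this study.

```python
import numpy as np


def kaiser_count(R):
    """Count latent roots greater than 1.0 (Kaiser's rule).

    R is assumed to be a symmetric correlation matrix with unities
    in the leading diagonal.
    """
    roots = np.linalg.eigvalsh(np.asarray(R, dtype=float))
    return int(np.sum(roots > 1.0))


# Hypothetical matrix: two pairs of highly intercorrelated variables.
R_example = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]

# The latent roots of this matrix are 2.0, 1.6, 0.2 and 0.2.
print(kaiser_count(R_example))
```

For this hypothetical matrix two roots exceed 1.0, so the rule would retain two factors, one for each pair of variables.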
A few authors have suggested that a hierarchical organization would be more effective (40, 15, 3, 48), and a few prior studies have noticed that when the number of factors is increased the larger factors may split into smaller ones (45, 51). No systematic study of this phenomenon has been made.

Some newer methods of factor analysis introduced in the past twenty-five years, such as Lawley's Maximum Likelihood method (30) and Rao's Canonical Analysis (38), attempt to bypass the separate extraction of factors followed by rotation to a meaningful and interpretable solution by going directly to a "final" solution. Such methods have the disadvantage that they also depend upon a statistical criterion making use of sample size, and in actual practice tend to produce far more factors than are considered meaningful by most practicing psychologists or others using factor analytic methods.

The problem of the number of factors which should be accepted in a factor analysis is difficult to isolate from the problem of communality, that is, the initial entries in the diagonal cells of the correlation matrix. In the Thurstonian simple structure model the two problems are interrelated. This problem is discussed in greater detail in Chapter IV.

The controversy between psychologists about the factors of mental ability and the foundations of factor analysis among psychologists should not be construed as limiting factor analysis to usefulness only by psychologists. Indeed, quite the contrary is the case. Among the examples used in this study is an early application in the field of physiology and one from political science.
Some stimulating applications have been and are being made in medicine, for classification of symptoms and clinical tests as an aid to diagnosis (4, 8, 35); in urban and regional planning, for comparison of American cities (37, 2); in advertising and marketing research (6); in political science (53); and in many other disciplines.

The use of factor analysis in communications research is not new, probably because many of the theoreticians in communications are also psychologists and have introduced the techniques into other communications oriented areas. Extensive use of factor analysis has been made by C. E. Osgood, et al. (34) in the study of the measurement of meaning and the development of the widely used semantic differential technique, by Berlo, Lemert and Mertz in the study of source credibility (1) and by Kumata (29) in the cross-cultural analysis of meaning. In the field of journalism some of the studies using factor analysis are those of Nafziger, MacLean and Engstrom on tools for newspaper readership data (32) and Deutschmann and Kiel on attitudes toward the mass media (9).

The present study was undertaken for two reasons: (1) "traditional" methods of factor analysis involving the rotation of factors are most common and have a vast background in empirical studies -- they will not soon be completely obsolete; and (2) no systematic study of the effect of rotating increasing numbers of factors has been done. The results of such a study may shed more light on efforts to reach the ephemeral goal of a "definitive" direct and completely objective method of factor analysis, or at least encourage researchers to report factor analytic results in a way which would make possible more and better comparisons between similar studies.
CHAPTER II

DATA AND ANALYTICAL METHODS

Introduction

Since the purpose of this study was to investigate various factorial solutions in the hope that an objective criterion for "when to stop factoring" could be developed, particularly one which was not dependent upon strictly statistical criteria, methods of analysis were selected which did not require subjective decisions at any point in the analysis. Since it was also intended that the findings of this investigation should have general applicability to the wide range of disciplines now using factor analysis, the data for this study were chosen not because of the relevance of the variables to any particular discipline but, rather, because the matrices had been intensively studied and reported on by other investigators and were therefore well-known, even classics, in factor analytic literature.

Correlation Matrices Used in the Study

Initially ten matrices were considered for analysis and some work was done on all of them. However, it became evident early in the study that many of them would merely duplicate the results of other matrices and unnecessarily confuse the presentation of results, so the list was narrowed to four matrices which clearly illustrated different types of factor solutions, and the major emphasis was placed upon the intensive study of these four.

In two of the four examples the matrices were made up of subsets of variables chosen from larger sets of variables for computational convenience by earlier investigators. This fact does not reduce their value for the purposes of the present study, but it would be of interest to apply the methods of this study to the larger matrices at some future time to validate the results of this study and to further insure that the criterion of "factorial invariance" is met.

Matrix I: Eight Physical Variables. -- This is a matrix of intercorrelations of eight physical measurements made on 305 girls between the ages of seven and seventeen.
They were chosen by Holzinger and Harman (21, p. 80) from a larger set of seventeen variables reported by Mullen (31) to be representative of two distinct factors which were bi-polar. The matrix has been intensively analyzed by Holzinger and Harman and by Harman (17, p. 82) as an example of a rank two matrix. The numbering and description of the variables and the complete correlation matrix are shown in Table 2.1. Most of the measurements are self-explanatory except for "bitrochanteric diameter" which is, in layman's terms, hip measurement. The first four variables were chosen to be measures of longitudinal growth or "lankiness" and the second four as measures of horizontal growth or "stockiness". Observe that the first four are more highly intercorrelated with each other than are the second four and that, in comparison with most empirical matrices, part-

[TABLE 2.1 INTERCORRELATIONS AMONG EIGHT PHYSICAL VARIABLES FOR 305 GIRLS. Variables: 1. Height; 2. Arm span; 3. Length of forearm; 4. Length of lower leg; 5. Weight; 6. Bitrochanteric diameter; 7. Chest girth; 8. Chest width. From Mullen, Frances, "Factors in the Growth of Girls Seven to Seventeen Years of Age," Ph.D. dissertation, Department of Education, U. of Chicago, 1939.]

[TABLE 2.2 INTERCORRELATIONS OF EIGHT POLITICAL VARIABLES FOR 147 ELECTION AREAS (1932 PRESIDENTIAL ELECTION)]

8. Education -- percentage of population, eighteen years of age and older, which had completed more than ten grades of school.

Matrix III: Eleven Air Force Classification Tests.
-- In Table 2.3 are the intercorrelations of eleven tests which were part of a battery of tests used by the United States Army Air Force during World War II to classify aviation cadets into training assignments for air-crew positions such as pilot, bombardier, and navigator. The product moment correlations are based on a sample of 8,158 unclassified aviation students. The complete description of these tests may be found in Guilford (14), although the matrix has been intensively studied by Fruchter (11, pp. 69-72) and it is Fruchter's analysis which has been used for comparison.

[TABLE 2.3 INTERCORRELATIONS OF ELEVEN AIR FORCE APTITUDE TESTS]

A brief description of the eleven tests follows:

1. Dial and Table Reading -- the test consists of two parts, the first of which measures how quickly and accurately the examinee can read the dials on an instrument panel; the second involves locating specific values within the body of tables.
2. Spatial Orientation I -- a perceptual-speed test in which the subject is required to locate small sections of an aerial photograph within a larger picture.
3. Reading Comprehension -- a test designed to measure understanding of paragraph material and the ability to make inferences based on the material read. An attempt was made to minimize mechanical and numerical content in the material presented.
4. Instrument Comprehension -- each of the 60 items consisted of pictures of two instruments, an artificial horizon and a compass, followed by pictures of a plane in five different attitudes. The problem was to determine which of the five planes had a position and direction consistent with the instrument readings.
5. Mechanical Principles -- each item pictured a mechanical situation, with questions designed to test the subject's ability to understand mechanical forces and movements.
6. Speed of Identification -- the subject was required to match airplane silhouettes.
7. Numerical Operations I -- 100 simple numerical-computation items involving addition and multiplication.
8. Numerical Operations II -- same as number 7 except the problems involved subtraction and division.
9. Mechanical Information -- a verbally stated mechanical knowledge test, relating particularly to operation of parts of automobiles. The items were quite brief, calling for only a limited amount of reading and requiring quite specific mechanical knowledge.
10. Practical Judgment -- a test requiring the subject to determine the most practical course of action to a verbally presented problem situation.
11. Complex Coordination -- an apparatus test of the speed and accuracy of hand and foot adjustments to a complex perceptual stimulus. The subject was faced with a panel containing three rows of red lights and three rows of corresponding green lights. When a particular stimulus pattern of red lights was presented the subject was required to move controls similar to those used in an airplane in flight so as to turn on the green lights corresponding to each of the red lights. As soon as the match had been completed, a new set of red lights was automatically presented.

Matrix IV: Twenty-four Psychological Tests. -- This example consists of twenty-four psychological tests administered to 145 seventh and eighth grade pupils of a suburban Chicago school in the late 1930's.
The data were gathered by Holzinger and Swineford (20) and subsequent analyses by Holzinger and Harman (21), Harman (17), Kaiser (27), Neuhaus and Wrigley (33) and others have made it a classic in factor analytic literature. The complete correlation matrix is presented in Table 2.4 and a brief description of the tests follows:

1. Visual Perception Test -- A non-language multiple-choice test composed of items from Spearman's Visual Perception Test.
2. Cubes -- A simplification of Brigham's test of spatial relations.
3. Paper Form Board -- A revised multiple-choice test of spatial imagery, with dissected squares, triangles, hexagons, and trapezoids.
4. Flags -- Adapted from a test by Thurstone. Requires visual imagery in two or three dimensions.
5. General Information -- A multiple-choice test of a wide variety of simple scientific and social facts.
6. Paragraph Comprehension -- Comprehension of written material measured by completion and multiple-choice questions.
7. Sentence Completion -- A multiple-choice test in which "correct" answers reflect good judgment on the part of the subject.
8. Word Classification -- Sets of five words, one of which is to be indicated as not belonging with the other four.
9. Word Meaning -- A multiple-choice vocabulary test.
10. Add -- Speed of adding pairs of one-digit numbers.
11. Code -- A simple code of three characters is presented and exercise therein given to measure perceptual speed.
12. Counting Groups of Dots -- Four to seven dots, arranged in random patterns, to be counted by the subject. A test of perceptual speed.
13. Straight and Curved Capitals -- A series of capital letters. The subject is required to distinguish between those composed of straight lines only and those containing curved lines. A test of perceptual speed.
TABLE 2.4
INTERCORRELATIONS OF TWENTY-FOUR PSYCHOLOGICAL TESTS FOR 145 CHILDREN

Test     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23    24
   1 1.000
   2  .318 1.000
   3  .803  .317 1.000
   4  .868  .230  .305 1.000
   5  .321  .285  .287  .227 1.000
   6  .335  .238  .268  .327  .611 1.000
   7  .308  .157  .223  .335  .656  .722 1.000
   8  .332  .157  .382  .391  .578  .527  .619 1.000
   9  .326  .195  .188  .325  .723  .718  .685  .532 1.000
  10  .116  .057 -.075  .099  .311  .203  .286  .285  .170 1.000
  11  .308  .150  .091  .110  .388  .353  .232  .300  .280  .888 1.000
  12  .318  .185  .180  .160  .215  .095  .181  .271  .113  .585  .828 1.000
  13  .889  .239  .321  .327  .388  .309  .385  .395  .280  .808  .535  .512 1.000
  14  .125  .103  .177  .066  .280  .292  .236  .252  .260  .172  .350  .131  .195 1.000
  15  .238  .131  .065  .127  .229  .251  .172  .175  .288  .158  .280  .173  .139  .370 1.000
  16  .818  .272  .263  .322  .187  .291  .180  .296  .282  .128  .318  .119  .281  .812  .325 1.000
  17  .176  .005  .177  .187  .208  .273  .228  .255  .278  .289  .362  .278  .198  .381  .385  .328 1.000
  18  .368  .255  .211  .251  .263  .167  .159  .250  .208  .317  .350  .389  .323  .201  .338  .388  .888 1.000
  19  .270  .112  .312  .137  .190  .251  .226  .278  .278  .190  .290  .110  .263  .206  .192  .258  .328  .358 1.000
  20  .365  .292  .297  .339  .398  .835  .851  .827  .886  .173  .202  .286  .281  .302  .272  .388  .262  .301  .167 1.000
  21  .369  .306  .165  .389  .318  .263  .318  .362  .266  .805  .399  .355  .825  .183  .232  .388  .173  .357  .331  .113 1.000
  22  .813  .232  .250  .380  .881  .386  .396  .357  .883  .160  .308  .193  .279  .283  .286  .283  .273  .317  .382  .863  .378 1.000
  23  .878  .388  .383  .335  .835  .831  .805  .501  .504  .262  .251  .350  .392  .282  .256  .360  .287  .272  .303  .599  .851  .503 1.000
  24  .282  .211  .203  .288  .820  .833  .837  .388  .828  .531  .812  .818  .358  .308  .165  .262  .326  .805  .378  .366  .888  .375  .838 1.000

Harman, Harry H. Modern Factor Analysis. Chicago, Ill.: U. of Chicago Press, 1960, p. 138.
14. Word Recognition -- Twenty-five four-letter words are studied for three minutes. These words are then to be checked from memory on a hundred-word list.
15. Number Recognition -- Similar to test 14. Fifteen three-digit numbers.
16. Figure Recognition -- Similar to test 14. Fifteen geometric designs.
17. Object-Number -- Twenty pairs of names of familiar objects and two-digit numbers are studied for three minutes. The words only are then presented to the subject, who is required to supply the proper numbers.
18. Number-Figure -- Similar to test 17. Ten pairs of numbers and geometric figures.
19. Figure-Word -- Similar to test 17. Ten pairs of geometric figures and words studied for one minute.
20. Deduction -- Logical deduction test using the symbols ) and ( and the letters A, B, C, and D.
21. Numerical Puzzles -- A numerical deduction test, the object being to supply four numbers which will produce four given answers employing the operations of addition, multiplication, or division.
22. Problem Reasoning -- A reasoning test in completion form. Each problem lists the steps in obtaining a required amount of water using two or three vessels of given capacity.
23. Series Completion -- From a series of five numbers the subject is supposed to deduce the rule of procedure from one number to the next, and thus supply the sixth number in the series.
24. Woody-McCall Mixed Fundamentals: Form I -- A series of thirty-five arithmetic problems, graduated for difficulty.

Computing Procedures Used in this Investigation

The decision as to the procedures to be used in this investigation was based on four criteria: (1) the solution produced should be as mathematically precise as possible, (2) no subjective decisions should be required in arriving at a solution, such as estimate of rank, significance of residual matrix, etc., (3) the solution should be psychologically acceptable, and (4) the methods should be programmable for computation on an electronic computer.
All of the original calculations for this study were made on MISTIC, Michigan State University's original electronic digital computer. Several new computer programs were written for MISTIC in conjunction with this study and, since MISTIC is now obsolete, most of them have been reprogrammed for, or incorporated into existing programs for, the Control Data Corporation 3600 computer currently in use at Michigan State University. A 3600 FORTRAN program (called FACTORA), incorporating all of the methods used in this study as well as provision for calculating from the raw data matrix, is available from the Michigan State University Computer Laboratory. All of the data for this study have been recalculated on the CDC 3600, and the complete set of results and a listing of the program are on file in the Michigan State University Library.

It is interesting to compare calculation time as an example of the growth of the computer "art". On MISTIC, which required laborious punching of paper tape for input and output and some desk calculator work between runs, the complete eigenvalue-factor loading solutions for each of three different communality estimates (discussed in Chapter 5) and all sets of rotations for all three sets of factors required approximately fifteen hours of computer time over a period of several months. All of the same results were obtained on the 3600 in one computer run in less than ten minutes.

For each of the four examples used in this study, factoring started from a previously calculated matrix of product-moment correlations. For the methods used in this study no assumptions are made about the statistical distribution of the variables; hence the matrices could have been any other of the frequently used non-parametric indices of relationship, e.g. the tetrachoric correlation coefficient, phi coefficient, biserial or point biserial correlation.
Whenever the assumptions of the product-moment correlations can be met, however, they are the preferred starting point for factor analysis, since they allow the calculation of additional statistical tests for the significance of factor loadings, etc. Most modern factor analysis programs for digital computers provide for starting with "raw" data; that is, the individual measurements of each of the variables for each observation point (subject).

For the initial phase of this investigation unities were used in the leading diagonals of the correlation matrices. The controversy over the appropriate initial diagonal entries, the so-called "communality question", is a burning issue in factor analytic literature and will be discussed further in Chapter 5. Unities, which are the self-correlations of each variable and represent the total variance of each variable in the linear factor model, were chosen because (1) they were computationally most convenient, since they were already in the diagonals in the output of the correlation program on MISTIC, (2) they preserve the Gramian property of the matrix, producing real, positive roots, and (3) they are one of the few unique communality estimates.

From both a mathematical and statistical point of view the preferred method of extracting the initial factors is the method of principal axes, first proposed by Pearson in 1901 (36) and developed by Hotelling in the 1930's (22). While this method was recognized as superior for many years, the computational method was so laborious that it was not until high-speed electronic computers came into general use in the 1960's that the method became practical for use with other than very small matrices.
The principal axes method has a number of properties which make it ideal from a factor analytic standpoint: (1) it produces a unique solution for any given correlation matrix; (2) the first factor extracted in the sequential method (the factor corresponding to the largest eigenvalue in the Jacobi method) accounts for the maximum possible proportion of variance in the matrix, the second factor for the maximum proportion of the remaining variance, the third factor for the maximum remaining after the first two were extracted, and so on (usually a small number of the total roots will account for almost all of the total communality); (3) the resulting columns of factor coefficients are "orthogonal"; that is, the correlation between any pair of factors is zero, meaning that the factors are mutually uncorrelated. This property, expressed mathematically, is:

$$a_p' a_q = \lambda_p \delta_{pq}$$

where $a_p$ is a vector of factor loadings, $\lambda_p$ is an eigenvalue of the correlation matrix, and the Kronecker $\delta_{pq} = 1$ if $p = q$ and $\delta_{pq} = 0$ if $p \neq q$.

The original method proposed by Hotelling (22), involving the sequential extraction of factors, is extremely arduous and inefficient even on high-speed calculators (although it has been programmed and used on MISTIC for factoring very large matrices when only a relatively small number of the total possible factors was desired and when capacity would otherwise have been exceeded).

Computer applications exploit the relationship between factor analytic theory and the mathematical problem of determining the eigenvalues and eigenvectors of a square, symmetric matrix. The most frequently used method, and the method programmed originally for MISTIC and more recently for the CDC 3600, makes use of work done by Jacobi (24) in the 1840's -- commonly known as the Jacobi method. It is an iterative method which produces the complete matrix of eigenvalues (latent roots) and eigenvectors without producing a residual matrix after each eigenvalue.
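As an illustration only, the extraction step can be sketched in present-day terms. The sketch below is not the MISTIC or CDC 3600 program: it assumes Python with NumPy, it substitutes NumPy's `eigh` routine for the Jacobi iteration, and the function name `principal_axes` and the small correlation matrix are hypothetical. Each eigenvector is scaled by the square root of its eigenvalue, so that the columns of loadings satisfy the orthogonality property stated above.

```python
import numpy as np

def principal_axes(R):
    """Eigenvalues (largest first) and unrotated principal axes loadings of R."""
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh returns roots in ascending order
    order = np.argsort(eigvals)[::-1]      # reorder from largest to smallest root
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each eigenvector by the square root of its eigenvalue.
    # Roots are clipped at zero because communality estimates below 1.0
    # can make some roots negative (such factors are not real).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings

# Hypothetical correlation matrix with unities in the leading diagonal.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
eigvals, A = principal_axes(R)
```

With unities in the diagonal and all factors retained, the product of the loading matrix with its transpose reproduces the correlation matrix, which is a convenient check on any implementation.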
The eigenvectors are transformed into factors by scaling each eigenvector by the square root of the corresponding eigenvalue.

Even the Jacobi method has disadvantages. For large matrices the number of iterations necessary frequently leads to very great rounding errors. Recent work by Householder (33), Givens (12), Wilkinson (46), and others is leading to more efficient methods of solving the eigenvalue-eigenvector problem on electronic computers. These methods are being rapidly added to the repertory of computer methods of factor analysis.

Factor Rotation

While an occasional factor analyst will stop once the principal axes factors have been extracted, most psychologists as well as users of factor analysis in other disciplines do not find the end product of the principal axes method useful or acceptable from the standpoint of interpretability. Most prefer to rotate some subset of the principal axes factors to some reference structure which provides more psychological "meaningfulness".

For years the methods of rotation were crude, subjective methods in which the final solution depended on the judgment of the investigator. Any two investigators analyzing the same data were not likely to reach the same solution (52). Just over ten years ago the first significant developments in analytical, objective methods of rotation were reported. Carroll (5), Neuhaus and Wrigley (33), Saunders (39), and Ferguson (10) almost simultaneously, although independently, developed criteria which were very similar. Neuhaus and Wrigley, who were the first to program and use the method on electronic computers, coined the now well-known name, Quartimax, for this method. It attempts to minimize the complexity of the individual variables, that is, to approach a unifactorial structure in which each variable has a high loading on only one factor.
This is extremely difficult to achieve with empirical data, especially for orthogonal structures, and usually leads to a fairly large general factor.

Since the quartimax method did not completely meet one of the important requirements for simple structure in the Thurstonian sense, namely that the number of zero loadings on a factor should be maximized, Kaiser (25) developed a modification of the quartimax method which he called the varimax method. Where the quartimax method placed the emphasis on the simplification of each row (variable) of the factor matrix, the varimax method places more emphasis on simplifying the columns (26).

It was also Kaiser who noted a disadvantage in both his original varimax method and the quartimax method, namely, that even after rotation there was more disparity in the variance contributions (sums of squares of the loadings on a factor) of the different factors than is desirable. In other words, one objective of rotation should be to give equal weight to each factor. Kaiser attributed this disparity to the fact that in either method of rotation, each variable contributes to the function being maximized as the square of its communality. In other words, a variable with a communality twice as great as another variable will influence the rotation four times as much. For this reason, Kaiser modified his original method (referred to as the "raw" varimax method) by weighting each variable so that it contributes equally to the rotation.
This procedure, known as normalizing, involves dividing each loading before rotation by the square root of the sum of squares of all of the loadings for that variable (the observed communality of the variable for that solution), which extends the vector representation of that variable in common factor space to unit length, performing the rotation on the "normalized" loadings, and then reducing the vectors back to their original length by multiplying all of the rotated loadings for a variable by the original scaling factor. Notice that under orthogonal rotation the communalities remain constant even though the individual loadings change. The method using this weighting process is known as "normal" varimax and requires that the final loadings be such as to maximize the following function:

$$V = n \sum_{p=1}^{m} \sum_{j=1}^{n} \left( b_{jp}/h_j \right)^4 - \sum_{p=1}^{m} \left( \sum_{j=1}^{n} b_{jp}^2/h_j^2 \right)^2$$

where $b_{jp}$ is the rotated loading of variable $j$ on factor $p$, $h_j^2$ is the communality of variable $j$, $n$ is the number of variables, and $m$ is the number of factors rotated. The mathematical details for achieving this maximization are available elsewhere (25). The same weighting or normalizing process is applicable to the quartimax method as well, and results in a considerable improvement in evenness of the factors over the "raw" quartimax. The normalized quartimax and varimax are the methods presently programmed for the CDC 3600.

The question of which rotational method is "correct" is unresolvable. The normal varimax method was selected for this study because it results in a solution which is probably closest to the concept of simple structure preferred by most American factor analysts. After comparing varimax, quartimax and subjective graphical solutions for four factors of the twenty-four psychological tests, Harman concludes (17, p. 306) that "the varimax solution seems to be the 'best' parsimonious analytical solution in the sense that it correlates best with the intuitive concept of that term as exemplified by the graphical solution." It has an additional attribute which is considered by Kaiser as of primary importance; namely, that of "factorial invariance."
As Thurstone describes this principle, it is that "the factorial description of a test must remain invariant when the test is moved from one battery to another which involves the same common factors." (43, p. 361)

For the purposes of this study, the unrotated principal axes factors were ranked in order by decreasing size of their corresponding eigenvalues and then the first two, the first three, four, five, and so on, factors were rotated. In the case of the factors obtained using squared multiple correlations as initial communalities (Chapter 5), only those factors corresponding to positive eigenvalues were rotated.

CHAPTER III

THE EFFECT ON THE FACTORIAL STRUCTURE OF INCREASING THE NUMBER OF FACTORS IN THE ROTATED FACTOR SOLUTION

Introduction

The initial phase of this investigation was concerned with observing the effects of rotating the first two, then the first three, etc. factors, where first refers to the unrotated factor corresponding to the largest eigenvalue, second to the next largest eigenvalue, and so on as described in the preceding chapter. In this phase only those factors obtained using unities in the principal diagonals of the various test correlation matrices were studied.

Eight Physical Variables

The eight variables in this matrix were specifically chosen by Holzinger and Harman to represent four "longitudinal" and four "horizontal" variables. In previously published analyses of this matrix only two principal axes factors have ever been given, and these using communalities calculated by estimating the rank. Table 3.1 shows the unrotated principal axes factors for this matrix. The maximum discrepancy between the unrotated factors calculated by Harman (17, p. 173) and those obtained in this study is only .09. As might be expected, since the variables were chosen to represent only two factors, slightly more than 80% of the total variance is accounted for by the first two factors.
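The series of rotations used throughout this chapter (the first two unrotated factors, then the first three, and so on, each set rotated by the normal varimax method) can be sketched as follows. This is a minimal illustration under stated assumptions, not the FACTORA program: it assumes Python with NumPy, it uses the matrix (singular value decomposition) form of the varimax iteration rather than Kaiser's rotation of factors two at a time, and the names `normal_varimax` and `rotation_series`, the iteration limits, and the four-variable correlation matrix are all hypothetical.

```python
import numpy as np

def normal_varimax(A, max_iter=500, tol=1e-10):
    """Rotate loadings A (variables x factors) by varimax with Kaiser normalization."""
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    h = np.sqrt((A ** 2).sum(axis=1))      # square roots of observed communalities
    h[h == 0.0] = 1.0                      # leave an all-zero row unchanged
    B = A / h[:, None]                     # extend each variable vector to unit length
    T = np.eye(m)                          # accumulated orthogonal transformation
    previous = 0.0
    for _ in range(max_iter):
        L = B @ T
        # Gradient-like matrix of the varimax criterion (assumed SVD formulation).
        G = B.T @ (L ** 3 - L * (L ** 2).sum(axis=0) / n)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                         # nearest orthogonal matrix to G
        current = s.sum()
        if previous > 0.0 and current < previous * (1.0 + tol):
            break                          # no further appreciable improvement
        previous = current
    return (B @ T) * h[:, None]            # restore the original vector lengths

def rotation_series(A):
    """Rotate the first 2, 3, ..., m unrotated factors; return {k: loadings}."""
    return {k: normal_varimax(A[:, :k]) for k in range(2, A.shape[1] + 1)}

# Hypothetical correlation matrix with unities in the leading diagonal.
R = np.array([[1.0, 0.6, 0.2, 0.1],
              [0.6, 1.0, 0.1, 0.2],
              [0.2, 0.1, 1.0, 0.5],
              [0.1, 0.2, 0.5, 1.0]])
roots, vectors = np.linalg.eigh(R)
order = np.argsort(roots)[::-1]            # largest latent root first
unrotated = vectors[:, order] * np.sqrt(roots[order])
solutions = rotation_series(unrotated)     # keys 2, 3 and 4
```

Because each rotation is orthogonal, the observed communalities for a given number of factors are the same before and after rotation, which provides a simple check on the sketch.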
The first factor is a large general factor with the first four variables (hypothesized as "lankiness" variables) most highly loaded. The second unrotated factor is of the bipolar type; that is, the two groups of variables have opposite signs.

[TABLE 3.1 EIGHT PHYSICAL VARIABLES -- UNROTATED PRINCIPAL AXES FACTORS -- UNITIES USED IN LEADING DIAGONALS OF CORRELATION MATRIX]
Variables: 1. Height; 2. Arm span; 3. Length of forearm; 4. Length of lower leg; 5. Weight; 6. Bitrochanteric diameter; 7. Chest girth; 8. Chest width.

[TABLE 3.2 EIGHT PHYSICAL VARIABLES -- VARIMAX ROTATIONAL SOLUTIONS]

Table 3.2 contains the rotational solutions for two, three, four, five and six factors. When rotated the two factors originally hypothesized are clearly evident, the longitudinal variables (1, 2, 3, and 4) appearing highly loaded on the first factor and the four horizontal variables (5, 6, 7, and 8) highly loaded on the second factor. Notice that the observed communalities (columns headed h², the usual abbreviation) indicate that 85% or more of the total original communality of the first five variables has been extracted in the first two factors. (Since the initial communality was 1.0 for each variable, the observed communality is a proportion.)

Notice that only 62% of the total variance of variable 8 appears in the first two factors and that the loading of this variable on the third unrotated factor is .60, an additional variance contribution of approximately 36%. The effect of including this third factor in a rotation is to split the eighth variable away from the group of "stockiness" variables to form a new factor on which only the eighth variable has a high loading, although there is still an appreciable loading of variable 8 on the second factor.

On the unrotated factors, variables 6 and 7 have appreciable loadings on the fourth factor, although of opposite signs, and each of the other variables also has relatively higher loadings on at least one of the remaining factors.
These additional non-zero loadings are probably attributable to "unique and specific" variance, since communality estimates were not used in the diagonals of the correlation matrix. In the four factor solution, variables 6 and 7 do not form a new factor but, instead, because of the opposite signs, cause the formation of two new factors, doublets of variables 5 and 6 and another of variables 5 and 7 (variable 5 has its highest loading on the factor with variable 6). Additional examples of bipolar factors splitting into either specific factors or doublets will be seen in other test matrices.

The rotation of the five largest factors produces no new factors. The fifth factor is of the type we shall call a "null" factor, signifying that none of the variables have their highest loading on that factor and, in most cases, none of the variables have any appreciable loading on such a factor.

The addition of the sixth factor to the rotational solution results in a specific factor on which variable 5 has its highest loading, although variable 5 still retains appreciable loadings on the two doublet factors produced in the four factor solution. The seven and eight factor solutions produce no new factors on which any other variable has its highest loading, although two factors appear which have relatively high loadings on variables 1 and 4. Variables 1 through 4, the "lankiness" or longitudinal variables, which had the highest correlation coefficients and the largest loadings on the first unrotated factor, remain grouped into a single factor.

Figure 3.1 gives a hierarchical diagram of the results of the seven sets of rotations. The variable numbers in solid boxes indicate that those variables had their highest loadings on that particular factor. When a variable had a loading greater than .40 (16% of the variance of that variable) on a particular factor it is indicated in dotted lines for that factor.
The symbol N indicates a "null" factor, one on which no variable had any appreciable loading. Factors on which at least three variables were most highly loaded are noted by rectangular boxes; those with fewer than three highest loadings are enclosed within circles.

From a physiological point of view the hierarchy seems to be eminently sensible. The "lankiness" variables are related to bone structure, which normally is proportional in an individual and is independent of leanness or obesity. The arm span and length of forearm (variables 2 and 3) would naturally be most closely related. The "stockiness" variables, on the other hand, are less closely related. It would not be unusual for girls seven to seventeen to have varying chest and hip girths, and the chest width and chest girth are measures of somewhat different types of bodily development. Additional speculation is not germane to this discussion.

[FIGURE 3.1 HIERARCHICAL STRUCTURE OF EIGHT PHYSICAL VARIABLES -- STARTING WITH UNITIES IN CORRELATION MATRIX]

Eight Political Variables

The unrotated principal axes factors for this matrix are shown in Table 3.3. In this example over 85% of the total variance is contributed by the first two factors, although almost 30% of the variance of the first variable appears in the third factor.

The results when the two largest, the three largest, etc. factors are rotated are presented in Table 3.4. The order of the variables has been rearranged in the table to group the variables in the factors in which they appear. Figure 3.2 presents the same information in the form of a hierarchical structure. Notice that it is much easier to interpret in this form.

When only two factors are rotated, six of the eight variables have their highest loadings on the first factor, and the other two have their highest loadings on the second factor. Variables 4 and 8 also have appreciable loadings on the second factor.
Observation of the signs of the loadings is necessary for interpretation of these factors, since both are of the bipolar type. These two factors have previously been identified by Holzinger and Harman (17, p. 178) as a large "Traditional Democratic Voting" factor (relating high Democratic party as well as straight ticket vote to high unemployment, high residential mobility, and low median rental and low education) and a smaller factor which has been called a "Home Permanency" factor (high home ownership negatively related to home mobility, education, and median rental).

[TABLE 3.3 EIGHT POLITICAL VARIABLES -- UNROTATED PRINCIPAL AXES FACTORS -- UNITIES USED IN LEADING DIAGONALS OF CORRELATION MATRIX]
Variables: 1. Percent Lewis vote; 2. Percent Roosevelt vote; 3. Percent straight party vote; 4. Median rental; 5. Percent home ownership; 6. Percent unemployment; 7. Residential mobility; 8. Education.

[TABLE 3.4 EIGHT POLITICAL VARIABLES -- VARIMAX ROTATION SOLUTIONS]

[FIGURE 3.2 HIERARCHICAL STRUCTURE OF EIGHT POLITICAL VARIABLES -- STARTING WITH UNITIES IN CORRELATION MATRIX]

When three factors are rotated the six variables which previously clustered on the first factor subdivide into two groups. The first variable (percent voting for Lewis) shifts completely to the third factor and "percent Roosevelt vote" also has its highest loading on this third factor, although still retaining a high loading on the first factor. While two high loadings do not uniquely determine a factor, we can see that there is a tendency for the "Traditional Democratic Vote" to subdivide into what might be interpreted as "Local" and "National" components. It is possible that had there been additional measures of local voting behavior in the selection of variables, this third factor might have been better established. In Chapter 5 it will be shown that when lower communality estimates are used, rotational stability is established with the three factor solution.
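The bookkeeping behind the hierarchical diagrams can be sketched as follows, as an illustration only. The sketch assumes Python with NumPy; the function names, the label "group" for a factor with three or more highest loadings, and the example loadings are hypothetical. The terms null, specific and doublet, the .40 threshold for an appreciable loading, and the rule of fewer than three highest loadings are those used in this study.

```python
import numpy as np

def classify_factors(loadings, appreciable=0.40):
    """Describe each rotated factor by the variables most highly loaded on it."""
    L = np.abs(np.asarray(loadings, dtype=float))
    top = L.argmax(axis=1)                 # factor holding each variable's highest loading
    kinds = {0: "null", 1: "specific", 2: "doublet"}
    report = []
    for p in range(L.shape[1]):
        # Variables (numbered from 1) with their highest loading on factor p.
        highest = [int(j) + 1 for j in np.flatnonzero(top == p)]
        # Other variables with a loading above the threshold on factor p.
        dotted = [int(j) + 1 for j in np.flatnonzero((L[:, p] > appreciable) & (top != p))]
        report.append({"factor": p + 1, "highest": highest, "appreciable": dotted,
                       "kind": kinds.get(len(highest), "group")})
    return report

def has_small_factor(report):
    """True if some factor has fewer than three variables with highest loadings."""
    return any(len(f["highest"]) < 3 for f in report)

# Hypothetical rotated loadings for six variables on three factors.
rotated = np.array([[0.9, 0.10, 0.0],
                    [0.8, 0.20, 0.1],
                    [0.7, 0.45, 0.0],
                    [0.1, 0.90, 0.1],
                    [0.2, 0.80, 0.0],
                    [0.1, 0.30, 0.2]])
report = classify_factors(rotated)
```

In this hypothetical example the third factor is a null factor, so `has_small_factor(report)` is true, while the two factor solution formed from the first two columns has three highest loadings on each factor.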
When the four largest factors obtained with unities in the diagonals of the original correlation matrix are rotated, the first factor, on which variables 3, 4, 6, and 8 had their highest loadings in the three factor solution, splits, with variables 4 and 8 forming a new factor on which variable 6 also has an appreciable negative loading. In this solution four factors are produced, on each of which two variables have their highest loadings. It should be noted that each of the four factors so obtained seems to "make sense" from the standpoint of interpretability.

The addition of a fifth factor produces only a "null" factor. The six factor solution, however, results in the splitting of variables 3 and 6 into two unique factors. Rotational stability is achieved at this point, with the seven and eight factor solutions adding only additional null factors.

Eleven Air Force Classification Tests

The analysis of this matrix is particularly interesting because it illustrates quite a different effect than previously observed, namely, the complete disintegration of the factorial structure. In each of the two preceding examples, at least one group of variables remained clustered in a single factor; that is, at least one factor remained on which more than one variable had its highest loading even after all factors had been included in the rotational solution. In this example, when unities are used in the diagonals of the original correlation matrix, complete fissioning takes place, eventually resulting in eleven unique factors.

The difference is apparent even with the unrotated factors (Table 3.5). Unlike the two preceding examples, in which the first two factors accounted for from eighty to ninety percent of the total variance, with this matrix the first two factors account for less than 50% of the total variance; in fact, the first five factors (this matrix has been used elsewhere in the literature (11, p.
149-151) as an example of a five factor test battery) account for only about 75% of the total variance. The total variance is apportioned much more evenly among the eleven unrotated factors, with the later factors accounting for a higher percentage of the variance than had been the case in the two preceding examples.

Figure 3.3 shows the hierarchical structure for the Eleven Air Force Classification Tests matrix. The effect of rotating the first two factors is that those variables (1, 2, 6, 7, 8) which had negative or zero loadings on the second factor grouped on one factor and those with positive loadings grouped on the other factor; this in spite of the fact that variables 5 and 9 had higher loadings on the second unrotated factor than on the first. When three factors are rotated, the three variables which had the largest negative loadings on the third unrotated factor (2, 6, 11) split away from the two preceding factors and form a new factor. Variable 4, which was also negatively loaded on the third unrotated factor, and variable 1, which had a zero loading on the third factor, also have appreciable loadings on this new third factor.

The four factor solution results in splitting of both the second and third factors of the three factor solution. It is at this point that the solution begins to stabilize. In this example less than fifty percent of the total variance of variables 2, 3, 4, 6, 10 and 11 was accounted for in the first three factors. (The communalities shown for each solution may be interpreted as proportions, since the total original communality was 1.0 for each variable.) Notice that there was a great deal of factorial instability between the two, three and four factor solutions -- that, in general, the variables which had the greatest increase in proportion of variance with the addition of a new factor tended to have the greatest influence in the formation of a new factor.

The five factor solution is that presented in the literature by Fruchter. It would be of little value to present a detailed interpretation in this study, but for reference the factors have been identified by Fruchter as follows (numbered in decreasing order of factor variance):

I -- Numerical. The three highest loadings are for numerical operations (1, 7, 8).
II -- Perceptual Speed, consisting of the spatial orientation test (a misnomer?) and speed of identification test (variables 2 and 6), both of which involve perception of small detail working against a time limit.
III -- Mechanical Experience, with highest loadings on the mechanical principles and mechanical information tests (variables 5 and 9).
IV -- Verbal Comprehension, with highest loadings on the reading comprehension (3) and practical judgment (10) tests.
V -- Spatial Relations, consisting of the tests of instrument comprehension (4) and complex coordination (11).

[TABLE 3.5 ELEVEN AIR FORCE CLASSIFICATION TESTS -- UNROTATED PRINCIPAL AXES FACTORS -- UNITIES USED IN LEADING DIAGONALS OF CORRELATION MATRIX]
Variables: 1. Dial and Table Reading; 2. Spatial Orientation; 3. Reading Comprehension; 4. Instrument Comprehension; 5. Mechanical Principles; 6. Speed of Identification; 7. Numerical Operations I; 8. Numerical Operations II; 9. Mechanical Information; 10. Practical Judgment; 11. Complex Coordination.

[TABLE 3.6 ELEVEN AIR FORCE CLASSIFICATION TESTS -- VARIMAX ROTATIONAL SOLUTIONS]

Fruchter points out that "in a new area of investigation it would ordinarily not be feasible to identify five factors derived from only eleven tests." (11, p. 149) This is certainly true.
This was a matrix based on a very large sample and consisting of test items which had been validated in many preceding analyses; it was therefore useful for demonstrating the effects of increasing the number of factors. The solutions for six through all eleven factors each result in the splitting of one of the preceding factors. The first factor, that composed of variables 1, 7 and 8, persists longest, through the nine factor solution. Table 3.6 gives the rotational solutions for two, three, four, five and six factors. The remaining solutions are on file in the Michigan State University library.

Twenty-four Psychological Tests

Each of the preceding examples has been a very well structured matrix, with variables chosen from larger batteries of tests to illustrate particular types of solutions. The twenty-four psychological test matrix was for a number of years considered a prime example of a large matrix for principal axes factor analysis, and has been extensively studied and factored in different ways by Holzinger and Harman, Wrigley and Neuhaus, Kaiser and many others. With the comparatively recent advent of large-capacity electronic computers, it can no longer be considered exceedingly large, since it is now possible to factor accurately matrices of many more variables, as psychologists and others have long desired. It does, however, serve as an excellent example of a large matrix containing variables which are not necessarily clearly representative of any single factor and for which more than one solution has been proposed in the past.

Table 3.7 gives the unrotated principal axes solution with unities as initial communality estimates. Five factors have sums of squares (eigenvalues) greater than 1.0 and together account for approximately 60% of the total variance. In prior analyses this matrix has always been presented as either a four or five factor set of variables.
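The eigenvalue count and the cumulative percentage quoted above can be checked directly from the latent roots printed in Table 3.7. The following is a minimal sketch in Python and is not part of the original analysis; the function names are hypothetical, and it assumes the roots are transcribed correctly from the table. With unities in the diagonal, the total variance equals the number of variables (24).

```python
# Latent roots (eigenvalues) as printed in Table 3.7, unities in the diagonal.
eigenvalues = [
    8.14, 2.10, 1.69, 1.50, 1.03, 0.94, 0.90, 0.82, 0.79, 0.71, 0.64, 0.54,
    0.53, 0.51, 0.48, 0.39, 0.38, 0.34, 0.33, 0.32, 0.30, 0.27, 0.19, 0.17,
]

# With unities as communality estimates, total variance = number of variables.
n_variables = len(eigenvalues)


def count_roots_above(roots, threshold=1.0):
    """Count latent roots strictly greater than the threshold."""
    return sum(1 for r in roots if r > threshold)


def cumulative_percent(roots, total):
    """Running percentage of total variance accounted for by the first k roots."""
    running = 0.0
    result = []
    for r in roots:
        running += r
        result.append(100.0 * running / total)
    return result


n_large = count_roots_above(eigenvalues)
cum = cumulative_percent(eigenvalues, n_variables)
print("roots greater than 1.0:", n_large)
print("percent of variance for those roots: %.2f" % cum[n_large - 1])
```

The printed roots sum to slightly more than 24 because each is rounded to two decimals, so the final cumulative percentage computed this way will not be exactly 100.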
It is interesting to notice that three of the variables have their highest unrotated loadings on factors other than the first four. Variable 19 (figure-word) is most highly loaded on the fifth factor, variable 2 (cubes) on the seventh factor, and variable 15 (number recognition) on the eleventh factor.

The two, three and four factor solutions are shown in Table 3.8. The complete set of 23 different rotational solutions is on file in the Michigan State University Library. With the rotation of only the two largest factors, tests which are essentially verbal and spatial are divided from those which are perceptual and numeric. Only a small proportion of the total variance of many of the variables is included in the two-factor solution, however. When three factors are rotated, verbal and spatial factors appear separately, with deductive tests common to both factors. With the addition of a fourth factor, a group of memory and recognition tests is isolated. The addition of a fifth factor results in a new factor on which only variable 19 (figure-word) has its highest loading, although variables 3 and 17 (Paper Form Board and Object-Number) also have quite high loadings on this factor.

The six factor solution illustrates an effect which is frequently observed with large matrices, namely, that segments of two or more former factors will sometimes combine to form an additional factor. In the six factor solution, the factor containing variables 14 through 17 of the five factor solution splits into two new factors, isolating the recognition tests (14 through 16) from the memory tests (17 through 19). Notice that variable 19 has recombined with the other memory tests and that variable 3 has split away from the other spatial tests into an additional factor on which variables 1 and 13 also have high loadings.
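The discussion of the five factor solution turns on how many variables have their highest loading on each rotated factor. That count is easy to automate. The sketch below is illustrative only: the loading matrices are invented for the example (they are not taken from Table 3.8), the function names are hypothetical, and the threshold of three variables is the one stated in the abstract for deciding when to stop factoring.

```python
def highest_loading_counts(loadings):
    """For each factor (column), count the variables (rows) whose largest
    absolute loading falls on that factor."""
    n_factors = len(loadings[0])
    counts = [0] * n_factors
    for row in loadings:
        best = max(range(n_factors), key=lambda j: abs(row[j]))
        counts[best] += 1
    return counts


def has_underdefined_factor(loadings, min_variables=3):
    """True if some factor carries the highest loading of fewer than
    min_variables variables."""
    return any(c < min_variables for c in highest_loading_counts(loadings))


# Hypothetical rotated loadings for seven variables (invented for illustration).
two_factor = [
    [0.80, 0.10], [0.75, 0.20], [0.70, 0.15],
    [0.10, 0.85], [0.20, 0.70], [0.15, 0.65], [0.30, 0.55],
]
three_factor = [
    [0.80, 0.10, 0.05], [0.75, 0.20, 0.10], [0.70, 0.15, 0.20],
    [0.10, 0.85, 0.05], [0.20, 0.70, 0.15], [0.15, 0.65, 0.30],
    [0.30, 0.25, 0.60],
]

for name, solution in (("two factors", two_factor), ("three factors", three_factor)):
    print(name, highest_loading_counts(solution), has_underdefined_factor(solution))
```

In the invented three factor solution the last factor is the highest loading for a single variable only, which is the situation described for variable 19 in the five factor solution.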
The seven, eight and nine factor solutions result in the splitting of variable 2 away from the spatial

TABLE 3.7
TWENTY-FOUR PSYCHOLOGICAL TESTS -- UNROTATED PRINCIPAL AXES FACTORS
UNITIES IN LEADING DIAGONALS OF CORRELATION MATRIX

Factors I through XII

Variable                                 I    II   III    IV     V    VI   VII  VIII    IX     X    XI   XII
 1. Visual Perception                   62   -01    43   -20   -01    07    20    22   -15    16    05    13
 2. Cubes                               40   -08    40   -20    35    09   -51   -02   -26   -26    07    02
 3. Paper Form Board                    45   -19    48   -11   -38    33   -08   -36   -06    02   -05   -13
 4. Flags                               51   -18    34   -22   -01   -19    46    14    14   -27    04   -21
 5. General Information                 70   -32   -34   -05    08    08   -12    03   -23   -01    00   -12
 6. Paragraph Comprehension             69   -42   -27    08   -01    12    00    13   -05   -13    02    16
 7. Sentence Completion                 68   -43   -36   -07   -04    01    08    01    00   -09   -10   -03
 8. Word Classification                 69   -24   -14   -12   -14    12    16   -17    12   -07   -26   -08
 9. Word Meaning                        69   -45   -29    08   -01   -07   -01    12   -12    03    07    11
10. Addition                            47    54   -45   -20    08   -09   -01   -08    08   -10   -06    04
11. Code                                58    43   -21    03    00    30   -04    32   -00    06    21    02
12. Counting Dots                       48    55   -13   -34    10    04    16   -30   -13    16    02    02
13. Straight-Curved Capitals            62    28    04   -37   -08    36    13    18   -04    10   -02   -05
14. Word Recognition                    45    09   -06    56    16    38   -08   -13    26    06    11   -31
15. Number Recognition                  42    14    08    53    31   -06    13    07   -30    21   -44   -07
16. Figure Recognition                  53    09    39    33    17    17    08    13    30   -21    02    29
17. Object-Number                       49    28   -05    47   -26   -11    25   -21   -15   -14    18    19
18. Number-Figure                       54    39    20    15   -10   -25   -02   -00   -35   -29    01   -17
19. Figure-Word                         48    14    12    19   -60   -14   -34    19    10    11   -21    05
20. Deduction                           64   -19    13    07    29   -19    03   -29    18    06    08    00
21. Numerical Puzzles                   62    23    10   -20    17   -23   -16    18    32   -00   -27   -07
22. Problem Reasoning                   64   -15    11    06   -02   -33   -05    13    02    37    36   -23
23. Series Completion                   71   -11    15   -10    06   -11   -08   -25    07    30   -02    31
24. Arithmetic Problems                 67    20   -23   -06   -10   -17   -23   -12    15   -19    10   -02
Contribution of Factor (Eigenvalue)   8.14  2.10  1.69  1.50  1.03  0.94  0.90  0.82  0.79  0.71  0.64  0.54
Percent of Total Variance             33.9   8.7   7.1   6.3   4.3   3.9   3.8   3.4   3.3   3.0   2.7   2.3
Cumulative Percentage                 33.9  42.6  49.7  55.9  60.2  64.1  67.9  71.3  74.6  77.5  80.2  82.5

Factors XIII through XXIV

Variable                              XIII   XIV    XV   XVI  XVII XVIII   XIX    XX   XXI  XXII XXIII  XXIV
 1. Visual Perception                  -04    32   -04    02    12   -27    12   -01   -13   -08    05    04
 2. Cubes                              -12   -22    04   -09   -12   -07    07   -07   -02   -07    02    04
 3. Paper Form Board                   -03   -01    10    24    07    00   -16    09    02    08   -04    09
 4. Flags                              -28   -16   -00   -20   -04    03    10    14    06    03   -07   -06
 5. General Information                 13   -01   -20    03    08   -15   -13    19    08   -08   -05   -22
 6. Paragraph Comprehension            -11    04    21    10   -02    05    02   -04   -08    25    20   -11
 7. Sentence Completion                 01    08    14   -14   -05   -10   -09   -24   -08    05   -25    06
 8. Word Classification                 22   -15   -23    10   -11    01    22   -16   -04   -12    12    03
 9. Word Meaning                       -02    02   -07   -11    06    08    01    15    23   -02    06    26
10. Addition                           -10   -03   -10    08   -17   -11   -10    22   -24    06    01    12
11. Code                                14   -15    13    24    03    06    21    02    05    00   -15    02
12. Counting Dots                      -06    06    03   -07   -12   -12    04   -11    30    14    03   -04
13. Straight-Curved Capitals            03   -01    09   -26   -04    26   -18    01   -08   -13    07   -03
14. Word Recognition                   -14    06   -09   -21    09   -08    11    02   -05    07    03    02
15. Number Recognition                 -18   -01    07    14   -06    08   -04   -02    01   -07   -02    01
16. Figure Recognition                  11    08   -21    06   -16   -00   -20   -05    11    03   -03   -01
17. Object-Number                      -00   -29    12   -08    13   -13   -07   -04   -04   -13    04   -01
18. Number-Figure                       27    18   -13   -07    07    16    05   -02   -06    17   -02    01
19. Figure-Word                        -04    02    06   -14   -21   -09    09    09    06    00   -02   -06
20. Deduction                           28    15    34   -02   -14    03    08    17   -02   -10   -00   -03
21. Numerical Puzzles                   13   -15    12   -01    32   -10   -10   -05    06    06    07    03
22. Problem Reasoning                   01   -11   -09    10   -15   -00   -15   -15   -06    03    06    02
23. Series Completion                  -10   -12   -20   -08    14    18    10    03   -11    10   -12   -07
24. Arithmetic Problems                -29    29   -01    16    10    14   -00   -11    06   -21    01   -04
Contribution of Factor (Eigenvalue)   0.53  0.51  0.48  0.39  0.38  0.34  0.33  0.32  0.30  0.27  0.19  0.17
Percent of Total Variance              2.2   2.1   2.0   1.6   1.6   1.4   1.4   1.3   1.2   1.1   0.8   0.7
Cumulative Percentage                 84.7  86.8  88.8  90.4  92.0  93.4  94.8  96.1  97.4  98.5  99.3 100.0
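Two properties of Table 3.7 can be checked for individual rows: the factor on which a variable has its largest absolute unrotated loading, and the sum of its squared loadings across all twenty-four factors, which should come out close to 1.0 when unities are used in the diagonal and every factor is retained. The sketch below does this for variables 2 and 19 only. It is an illustration rather than part of the original analysis: the function names are hypothetical, and the rows are transcribed from the table as printed, on the assumption that loadings are given with decimal points omitted (so that 48 stands for .48).

```python
# Rows transcribed from Table 3.7 as printed (decimal points omitted).
unrotated = {
    "2. Cubes": [40, -8, 40, -20, 35, 9, -51, -2, -26, -26, 7, 2,
                 -12, -22, 4, -9, -12, -7, 7, -7, -2, -7, 2, 4],
    "19. Figure-Word": [48, 14, 12, 19, -60, -14, -34, 19, 10, 11, -21, 5,
                        -4, 2, 6, -14, -21, -9, 9, 9, 6, 0, -2, -6],
}


def peak_factor(row):
    """1-based index of the factor with the largest absolute loading."""
    return max(range(len(row)), key=lambda j: abs(row[j])) + 1


def sum_of_squares(row):
    """Sum of squared loadings over all factors, after rescaling to proportions."""
    return sum((x / 100.0) ** 2 for x in row)


for name, row in unrotated.items():
    print(name, "peak factor:", peak_factor(row),
          "sum of squares: %.3f" % sum_of_squares(row))
```

Small departures of the sums from 1.0 are expected, since the printed loadings are rounded to two places.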
[TABLE 3.8. Twenty-four psychological tests: Varimax rotational solutions for two, three and four factors (table not reproduced)]

[Table, title incomplete: "... estimates on the first five unrotated ... factors" (table not reproduced)]