A STATISTICAL ANALYSIS FOR IDENTIFICATION OF KOREAN HANDWRITING

By
Sung Tai Cho

AN ABSTRACT OF A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

School of Police Administration and Public Safety

1964

APPROVED
Chairman
Member
Member

The research reported in this paper was an exploratory study of the possibility of using inferential statistical methods in handwriting analysis. Two inherent characteristics of handwriting--internal variation within the writing of a single individual and the coexistence of dissimilarities with similarities in writings made by different individuals--make handwriting identification problematic. Current methods of identification do not completely solve the problems presented by these factors because, while they do consider similarity and difference, they do not provide objective criteria for deciding how much similarity there must be before it can be concluded that samples of writing were made by the same person or how much dissimilarity there must be before it can be concluded that samples were made by different persons. The technique of inferential statistics was developed in order to deal with the same type of problem in other areas of inquiry, and an examination of the conceptual and mathematical structure of this technique suggests that it can be legitimately used in the area of handwriting analysis.

The present study was limited in scope in several ways. The study was based on the measurement of elements of a single character. The measuring instrument developed by the writer for the study was restricted to line and angular measurement and was not capable of measuring stroke curvature. Small samples were used. In spite of these limitations the statistical tests used led to correct identification in 69 percent of the cases for the least discriminating element and in 86 percent of the cases for the most discriminating element. Furthermore, it has been shown that the accuracy of identification based on a single element can be improved by increases in sample size and by changing the region of rejection of the null hypothesis. This, to emphasize, is possible. The findings, then, seem to demonstrate that the techniques of statistical inference hold great promise for improvements in handwriting identification.

A STATISTICAL ANALYSIS FOR IDENTIFICATION OF KOREAN HANDWRITING

A Thesis
Presented to
the Faculty of the School of Police Administration and Public Safety
Michigan State University

In Partial Fulfillment of the Requirements for the Degree
Master of Science

by
Sung Tai Cho
1964

ACKNOWLEDGEMENT

I wish to thank Professor Joseph P. Nicol for help in the review of the literature on handwriting, in pretesting, and in photographic work during the initial phase of the research; Professor Ralph F. Turner, my adviser, for his constant enthusiasm and encouragement throughout the research; Mr. P. Rajeswaran for his critical reading of various chapters; and Professor Patrick T. Cleaver of The Ohio State University for statistical advice and extremely valuable suggestions in revision of the thesis.

TABLE OF CONTENTS

ACKNOWLEDGEMENT
LIST OF TABLES

CHAPTER
I    INTRODUCTION
       The Problem
       The Nature of Handwriting
       Current Identification Methods
       The Study
II   THE KOREAN LANGUAGE AND WRITING SYSTEM
       The History and Classification of the Korean Language
       Characteristics of the Language
       Korean Writing System: Hangul
       Word Formation in Hangul
       Korean Handwriting
III  METHODOLOGY
       Inferential Statistics and Handwriting Identification
       The Preliminary Investigation
       Selection of Characters for Study
       Types of Measurement
       The Measuring Instrument
IV   ANALYSIS AND FINDINGS
       The Major Analysis
       Tests Using Small Samples
       Discussion of Findings
V    CONCLUSIONS

BIBLIOGRAPHY
APPENDICES

LIST OF TABLES

TABLE
2.1  Hangul Vowels
2.2  Hangul Consonants
2.3  LP and Distribution of Ratio
2.4  Syllable and Word Formation
3.1  Assignment of Numbers to Letter Positions
4.1  T-Tests for Pairs of Samples for Different Writers
4.2  Means and Standard Deviations for Pairs of Samples for the Same Individual
4.3  T-Tests for Pairs of Samples for the Same Writers
4.4  Correct Identification
4.5  Means and Standard Deviations for Small Samples
4.6  Means and Standard Deviations for Individual Samples
4.7  T-Tests for Pairs of Small Samples for Different Writers

CHAPTER I

INTRODUCTION

1.
The Problem

At present handwriting identification, unlike fingerprint identification, has not yet reached a level of validity and reliability which will permit its full acceptance either by criminological science or by the courts of law. The difficulty of developing a scientifically valid and legally acceptable method of handwriting identification lies in the nature of handwriting itself. In terms of consistency and individual uniqueness handwriting occupies the opposite end of a continuum from fingerprints. Repeated prints made by a given finger of a particular individual are both consistent, in that they exhibit no important variation, and unique, in that they are demonstrably different from prints made by any other individual. Handwriting, on the other hand, is both internally variable and not completely unique. Within the writings of a given person there will be variation in the way that the same letter is formed, while the writings of two individuals will generally show some dissimilarities as well as similarities. Present methods of handwriting identification are not entirely satisfactory because they do not solve the problems posed by internal variation and by the coexistence of similarity and dissimilarity in questioned and standard writings.

The present study was an attempt to develop a method of handwriting identification based on statistical analysis which, it is hoped, will constitute an advance in the precision of present identification methods. It is also hoped that this technique will meet the scientific criteria of objectivity, reliability and validity, since any criminal identification method must meet these criteria before it will be accepted by courts of law.

The remaining part of the present chapter will be concerned with a review of present handwriting identification methods and with an outline of the method developed in the present study.

2.
The Nature of Handwriting

As the preceding discussion pointed out, handwriting has, for identification purposes, two salient characteristics: internal variation and a lack of individual uniqueness which leads to the overlapping of characteristics in writing made by two individuals.

1 "Questioned writing" is a handwriting specimen whose authorship is disputed or unknown. Osborn uses the term "questioned writing" in referring to documents in general while Hilton uses the word both in this sense and in the sense of disputed writing. "Standard writing" or "sample writing" refers to a handwriting specimen which may be taken from documents known to be written by the suspect or which may be requested from the suspect. See Ordway Hilton, Scientific Examination of Documents, Chicago: Callaghan and Company, 1956, pp. 10-11, 141-142; Wilson R. Harrison, Suspect Documents: Their Scientific Examination, New York: Frederick A. Praeger Inc., 1958, pp. 292, 297-307; Albert S. Osborn and Albert D. Osborn, Questioned Document Problems: The Discovery and Proof of the Facts, Sixth Printing, Albany, N.Y.: Boyd Printing Company, 1947, pp. 14, 22, 205-206, 352.

It is generally accepted that no individual writes in a completely uniform manner--within a sample of handwriting made by a given individual there will be variations in the way that a given letter is made.2 This variation results from variation in writing conditions,3 writing materials and writing instruments and from the fact that there is a lack of machine-like precision on the part of the writer.4 This last factor is particularly important since variation occurs in a given individual's writing even when writing conditions and materials are held constant.

The second characteristic of handwriting--the lack of complete uniqueness in a given individual's writing which leads to a coexistence of similarity between writings made by different individuals--has been discussed by several writers.5 The coexistence of similarity may result from the fact that individuals whose writings show similarity have learned a similar "style characteristic."6 In some cases it may be merely the result of random chance.

2 To mention only a few: John J. Harris, "How much do people write alike: a study of signatures," Journal of Criminal Law, Criminology and Police Science, 48(1), Vol. 6, March-April, 1958, pp. 647-651; Wilson R. Harrison, Suspect Documents: Their Scientific Examination, New York: Frederick A. Praeger Inc., 1958; Ordway Hilton, Scientific Examination of Documents, Chicago: Callaghan and Company, 1956; Idem., "Proper Evaluation of Dissimilarities in Handwriting," International Criminal Police Review, 104, January, 1957, pp. 48-51; Albert S. Osborn and Albert D. Osborn, Questioned Document Problems: The Discovery and Proof of the Facts, Albany, N.Y.: Boyd Printing Company, 1944; Idem., The Problem of Proof, Sixth Printing, Albany, N.Y.: Boyd Printing Company, 1947.

3 Harrison, Ibid., pp. 3, 297, 331-333, 439; Hilton, Ibid., pp. 215-218, 246, 247; Osborn, loc. cit.

4 Harrison, Ibid., p. 298; Hilton, Ibid., p. 141; Osborn, Ibid., p. 205.

5 Wilson R. Harrison, Suspect Documents: Their Scientific Examination, New York: Frederick A. Praeger Inc., 1958; Ordway Hilton, Scientific Examination of Documents, Chicago: Callaghan and Company, 1956; Albert S. Osborn and Albert D. Osborn, Questioned Document Problems: The Discovery and Proof of the Facts, Albany, N.Y.: Boyd Printing Company, 1944. However, the lack of uniqueness in handwriting does not necessarily invalidate the individuality of the handwriting of a given individual. See Hilton, Ibid., pp. 136, 141; Osborn, Ibid., pp. 231, 270. Nor does internal variation necessarily preclude identification of individuality. In fact, this is the basis of the whole identification effort.

6 Harrison distinguished "style" and "personal" characteristics, and the first step in handwriting identification ought to be the distinction between them. See Harrison, Ibid., pp. 288-289. To borrow Harrison's distinction, identification is the pursuit of the "master pattern," which may be defined as the "personal" characteristics.

These factors have, to the present, put rather severe limitations on handwriting identification as a technique in criminal law. In any case where specimens of writing are compared there are two possibilities: either the writings were made by the same individual or by two different individuals. The factors of variation and coexistence mean that, regardless of the element or combination of elements of the writing used in the identification process, there will usually be differences between the writings made by the same individual and similarities in writings made by different individuals. This problem of ambiguity, which makes a positive decision about the authorship of a specimen of writing inherently difficult, has not been entirely solved by existing identification methods, which depend largely on finding similarities or dissimilarities in questioned and standard writings. Osborn, who was aware of this fact, warned against identification based solely on either similarity or dissimilarity:

    (By this same method) of ignoring differences (dissimilarities) and looking only for similarities almost any two things not altogether unlike, can be proved to be the same. This is the basis of the common error of the incompetent witness in identifying the writing in anonymous letters. Similarities can always be found in two writings in the same language or in two writings not utterly unlike. Mere similarities do not necessarily prove genuineness any more than mere superficial differences necessarily prove lack of genuineness. The incompetent witness, notwithstanding this fact, by dependence upon similarities alone reaches the conclusion of genuineness, or by dependence upon differences alone reaches the conclusion of lack of genuineness. . . .
It seems evident, then, that a legally and scientifically acceptable handwriting identification system must be based on a technique which can control for variation and coexistence of similarity. A review of the literature on handwriting identification, presented in the following section, indicates that current methods have not yet completely developed such a technique.

7 Albert S. Osborn and Albert D. Osborn, Questioned Document Problems: The Discovery and Proof of the Facts, Albany, N.Y.: Boyd Printing Company, 1944, pp. 240-241. See also Idem., pp. 237, 244.

3. Current Identification Methods

Handwriting identification methods can be classified as micro-examination, the examination of writing elements such as stroke lengths and angles between strokes, and macro-examination, which classifies handwriting into styles on the basis of letter design. Current handwriting identification analysis usually combines macro-examination with a form of micro-examination called the "comparative method."8 Better terms for the "comparative method" would be the "one-to-one" method or the "juxtaposition" method since, in the final analysis, all handwriting examination is comparative.9 In the one-to-one method an element of the questioned writing is juxtaposed with a similar element in the suspected writing and the two elements are compared for similarity or dissimilarity. In this type of comparison the average or modal pattern of the two writings is not considered. Such a method is inadequate since it does not take into consideration the internal variation of the two writings and the likelihood of similarity between at least some elements of the questioned and suspected writing. Random selection of elements to be compared may lead to similar elements being selected from dissimilar writings or dissimilar elements being selected from similar writings.

In recent studies, Sjoegren and Smith have attempted to overcome the problems caused by variation and coexistence of similarity by the use of a system of weighting.10 In this system of evaluation each similarity in a particular element is given a plus rating and each dissimilarity is given a minus rating. These weights are summed algebraically in order to determine the overall tendency to similarity or dissimilarity. Hilton has argued that this method is not entirely adequate and has suggested that the major emphasis in evaluation be placed on the factor of dissimilarity:11

    Rather it is an analysis of the true meaning of the dissimilarities and if they are found to be basic and without logical explanation, the realization that these differences are the controlling factors which establish that the known (standard) and unknown (questioned) writings are by two distinct persons.

Harrison is in general agreement with Hilton on the importance of dissimilarity.12 Osborn, although agreeing at some points with Hilton, suggests that similarity should be given equal weight:13

    The process of comparison in any field is reasoning regarding similarities and differences, and necessarily the subject has an important place in all kinds of investigations. Errors in identification problems are due not only to the failure to see the outside things but to the failure to recognize their real differences and their fundamental similarities and to understand them and interpret them when they are seen. Much of what is called science is merely accurate classification resulting from intelligent observation and reasoning leading to a correct recognition of similarities and differences.

8 For instance, Tore Sjoegren combined measured characteristics with general features of handwriting, such as arrangement, spacing, connections, etc. Tore Sjoegren, "Handwriting Comparison and Probability," International Criminal Police Review, Vol. 92, Nov. 1955, pp. 274-283. Stanley Smith suggested use of the latter group of characteristics in the "Secondary Examination." Stanley S. Smith, "A Method of Comparing Written Documents," ibid., Aug.-Sept. 1954, pp. 205-215. Others such as Harrison, Hilton and Osborn are of the same opinion.

9 Harrison, Hilton and Osborn used the juxtaposition method for illustrations in their texts.

10 Ibid.

11 Ordway Hilton, "Proper Evaluation of Dissimilarities in Handwriting," International Criminal Police Review, No. 104, January 1957, p. 49. Hilton has kept this view consistently in other places: Ordway Hilton, Scientific Examination of Documents, Chicago: Callaghan and Company, 1956, pp. 51, 136-137, 144.

12 Harrison, ibid., pp. 343-345.

13 Albert S. Osborn and Albert D. Osborn, Questioned Document Problems: The Discovery and Proof of the Facts, Albany, N.Y.: Boyd Printing Company, 1944, p. 237. See also Hilton, Scientific Examination of Documents, p. 143.

Three weaknesses seem apparent in current techniques. First, the method of juxtaposition is obviously inadequate since it does not take into consideration the variability factor. Second, the emphasis placed by some writers on the factor of dissimilarity has no adequate theoretical ground since it stresses only one aspect of variability. Third, the weighting method developed by Sjoegren and Smith, while avoiding the first two weaknesses, can be criticized on the ground that it fails to provide a method for determining the degree of positive or negative weighting necessary to permit acceptable inference about authorship. It seems evident that some fresh approach will be necessary in order to develop a handwriting identification method which can cope with the problems of variation and coexistence of similarity.

4.
The Study

The study reported in this paper was an exploratory attempt to apply the technique of statistical inference to the problem of handwriting identification.14 A comparison of the problems confronting the handwriting identification specialist and the problems typically encountered by biological or social scientists attempting to make inferences about samples of variable material shows that in many essential respects they are remarkably similar.

14 The theory and application of inferential statistics will be discussed in more detail in Chapter III. See Helen M. Walker and Joseph Lev, Statistical Inference, New York: Holt, Rinehart and Winston, 1953; W. Allen Wallis and Harry V. Roberts, Statistics: A New Approach, Glencoe: The Free Press, 1956.

Many research hypotheses in the biological and social sciences require that the investigator determine whether two samples, alike in some respects and different in others, were drawn from the same universe of measurement. The problem cannot be solved by a simple examination of the samples since differences may have occurred as a result of the sampling error inherent when samples are drawn from a heterogeneous universe of measurement. Inferential statistics allow the researcher to reject or accept the hypothesis that the samples were drawn from the same universe, not with absolute certainty, but with a specified small margin of error.

Handwriting identification can be conceptualized as a problem in sampling. The examiner has two (or more) samples of writing which will usually show both similarities and dissimilarities. These may be drawn from two different universes (i.e., made by different individuals) or they may have been drawn from the same universe (i.e., made by the same individual) and show differences because of sampling error.

In the present study specimens of Korean handwriting collected by the researcher were treated as statistical samples.
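The sampling view described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: it uses a pooled two-sample Student t statistic, which may differ from the computation actually used in the thesis, and the function names, the measurements and the critical value are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample Student t statistic with a pooled variance estimate (assumed form)."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


def same_universe_retained(sample_a, sample_b, critical_t):
    """True when |t| stays below the critical value, i.e. the null hypothesis
    that both samples come from the same universe is not rejected."""
    return abs(pooled_t(sample_a, sample_b)) < critical_t


# Hypothetical stroke-length measurements (arbitrary units), invented for illustration.
questioned = [3.1, 3.6, 2.9, 3.4, 3.8, 3.2]
standard = [3.3, 3.0, 3.7, 3.5, 2.8, 3.4]

# 2.228 is the standard two-tailed .05 critical value of t for 10 degrees of freedom.
print(same_universe_retained(questioned, standard, critical_t=2.228))
```

Retaining the null hypothesis here corresponds to concluding that the two specimens may have been made by the same writer; rejecting it corresponds to concluding that they were made by different writers.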
Measurements of certain micro-characteristics were made and these measurements were subjected to statistical analysis in order to determine whether inferential statistical methods could differentiate between samples whose authorship was known a priori.

The present study was limited in three respects. First, only Korean writing was used because of the writer's familiarity with this writing system. Second, only a limited set of measurements was used. No attempt was made to evaluate macro-characteristics statistically, although macro-characteristics were used in determining the final sample, and only a small set of micro-measurements was used. Finally, no attempt was made in the present study to analyze disguised writing. Although the present study was limited, there are obviously rich potentialities for statistical investigations of handwriting using other languages and more refined methods.

In the following chapter there is a discussion of the Korean language and handwriting system, with particular attention to problems in identification inherent in this writing system. Chapter III discusses in detail the statistical methods, samples and measuring techniques used in the study and presents an analysis of findings.

CHAPTER II

THE KOREAN LANGUAGE AND WRITING SYSTEM1

Because the technique of handwriting identification reported in this paper was based on samples of Korean writing, a brief discussion of the Korean language and writing system is appropriate.

The History and Classification of the Korean Language

Most linguists classify Korean in the Altaic language group, which in turn is considered to be closely related to the Ural group. If there is such a relationship, then Korean is related to such European languages as Finnish and Magyar. The opinion that the Ural and Altaic languages are related is based on the fact that both are agglutinative languages and the belief that the origins of both can be traced to central Asia.
The Altaic group is divided into three branches: Turkic, Mongolian and Tungusic. Korean belongs to the Tungusic branch. Tungusic variants are spoken in Siberia, in Manchuria and by some 35 million Koreans, both in Korea and in Japan. Korean shares similarities in its agglutinative structure with Japanese, but not with Chinese, although modified Chinese characters are used in Korean writing. The relationship of Korean and Japanese is not surprising since it is believed that Japan was settled in part by immigrants from Korea and from other areas where Tungusic languages were spoken.

1 This part of the paper is mainly taken from the book Korea--Its Land, People, and Culture of All Ages, Seoul, Korea: Hakwon-Sa, Ltd., 1960, Part III, People, Language, Chapter 2, Language, pp. 117-124. Some modifications and innovations were made, however, to meet the purpose of this research--especially design.

Characteristics of the Language

Although the systematic study of the Korean language has not yet been completed, certain characteristics have been identified.

1. Korean vowels are divided into three groups and vowel combinations tend to be made within these groups and not between them. (Such vowel harmony is a general characteristic of the Ural-Altaic languages.) The three groups are:

   a. Hard Vowels: ㅏ (a), ㅗ (o), ㅐ (ae), ㅚ (oe)
   b. Medium Vowels: ㅣ (i)
   c. Soft Vowels: ㅓ (ŏ), ㅜ (u), ㅡ (ŭ), ㅔ (e)

   The vowels of groups a and c tend to combine with others within their group but to resist combination with the other group. The single vowel of the second group may combine with vowels from either of the other groups.

2. Korean words, unlike words in Indo-European languages, never begin with more than one consonant. Such English words as "strike" or "break", for example, would not occur in Korean. Further, Korean words do not begin with liquid consonants such as the English "r" or "l". Finally, Korean lacks the consonants "f" and "v". These sounds are approximated in Korean by the consonant ㅂ (p or b).

3. The most distinctive characteristic which separates Korean and other Ural-Altaic languages from Chinese or from Indo-European languages is the agglutinative structure of its grammar. In Indo-European languages grammar is indicated by modifications in words in the forms of tenses, cases and numbers. In Chinese "full" words or denotative words are never inflected and grammar is indicated by word position and by the use of word particles or "empty" words which help convey meaning. Agglutinative languages such as Korean fall between these two types of languages. Words are fixed, as in Chinese, but the "empty" words or particles become "glued" or attached to the fixed words in somewhat the same way that inflected endings are attached to word roots in Indo-European languages.

Korean Writing System: Hangul

Hangul, the Korean writing system, is phonetic, like English, rather than ideographic, like Chinese, although the characters or letters were adopted from Chinese.

Hangul was developed in Korea's "Golden Age," which was ushered in with the reign of the fourth Yi king, Sejong, in the 15th century. King Sejong, who believed that the function of written language was communication with the common people, developed an alphabetic language in order to facilitate this communication.

There are 24 letters in the Hangul alphabet--10 vowels and 14 consonants. The total number of letters in the Korean alphabet has been reduced since the period of King Sejong, with the elimination of such vowels as ㆍ and such consonants as ㅿ and ㆁ.

Hangul vowels can be formed into diphthongs and can be classified into two categories:

   Simple: ㅏ (a), ㅑ (ya), ㅓ (ŏ), ㅕ, ㅗ, ㅛ, ㅜ, ㅠ, ㅡ, ㅣ (i)
   Compound: ㅐ (ae), ㅒ, ㅔ (e), ㅖ (ye), ㅚ (oe), ㅟ (ui), ㅢ, ㅘ (wa), ㅝ (wo), ㅙ (wae), ㅞ (we)

If the simple and compound vowels are considered together, there are 21 vowels.
Modern grammarians classify them as simple and diphthong vowels:

Table 2.1. Hangul Vowels
[Table listing the simple vowels and the diphthongs; glyphs not recoverable.]

This distribution is based on the following triangular chart for the simple vowels:

Figure 2.1
[Triangular chart of the simple vowels; not recoverable.]

Diphthongs are formed by the following principle of combination:

[List of vowel combinations; glyphs not recoverable.]

2 The McCune-Reischauer system is the best-known system for romanizing Korean pronunciation. An example is shown in Table 2.2.

The consonants of Hangul now in use are either simple or double:

Table 2.2. Hangul Consonants
Simple: ㄱ (k or g), ㄴ (n), ㄷ (t or d), ㄹ (r or l), ㅁ (m), ㅂ (p or b), ㅅ (s), ㅇ (silent or ng), ㅈ (ch or j), ㅊ (ch'), ㅋ (k'), ...
[Remainder of table not recoverable.]

Table 4.2. Means and Standard Deviations for Pairs of Samples for the Same Individual (N=25)
[Table body not recoverable.]

Table 4.3. T-Tests for Pairs of Samples for the Same Writers (N=25)

Individual      X1         X2        X3
    1           .13        .15      1.19
    2          2.23*(a)   1.63      1.73
    3           .29        .43      2.19*
    4          1.31       2.70*     2.33*
    5          2.80*       .45       .79
    6          6.65*      1.69      2.02
    7          1.61        .95       .07
    8           .27       2.00       .08

d.f. = 24; t.05 = 2.06; t.01 = 2.80; t.001 = 3.55
(a) Incorrect identifications are indicated by an asterisk.

Table 4.4. Correct Identification

                             X1             X2             X3
Type                       N      %       N      %       N      %     Total N
Different Individuals     23    82.14    24    85.71    19    67.85      28
Same Individual            5    62.50     7    87.50     6    75.00       8
Both Types                28    77.77    31    86.11    25    69.44      36

Table 4.5. Means and Standard Deviations for Small Samples (N=10)

                  X1              X2              X3
Individual     Mean     s      Mean     s      Mean     s
    1          3.55   1.19     2.05    .47     1.80    .72
    2          4.40   1.09     2.30   1.03     2.40   1.19
    3          2.80    .95     2.10    .89     1.68    .96
    4          4.95   1.69     1.95    .57     2.80   1.49
    5          5.93   1.02     3.60   1.09     1.81    .58
    6          9.45   1.71     2.15    .59     4.64   1.17
    7          3.95   1.89     3.75   2.45     1.73   1.26
    8          2.85   1.24     2.25    .51     1.35    .62

Discussion of Findings

An examination of the results of the analysis points to two general conclusions about the use of statistical inference in handwriting analysis.
First, it is evident that, in the case of samples of writing which are fairly large, a rather -high degree of accuracy in determining similarity and difference is possible. As the sample size decreases accuracy also decreases. Second, it must be stressed that the technique is not com- pletely accurate for any of the elements measured even for the larger samples. Because a high degree of accuracy is vital in the legal application of handwriting analysis,methods for improvement must be developed. Accuracy may be increased in part by combining macro-analysis with micro-analysis, in part by improvements in the micro-analysis. A detailed analysis of the statistical findings suggest several ways in which micro-analysis can be improved. 38 1. The use of more than one character as the basis for analysis. An examination of the mean measurements in Table 4.6 shows that some of the individuals are rather close in their mean measurements for a particular element. Normal handwriting falls within a restricted size range and it can be expected that a num- ber of individuals will have the same mean size and range of va- riability in a given letter element. The figures in Table 4.1 indicate, however, that there is no pair of individuals which is not significantly different in at least one of the measurements. This fact indicates that individuals may be more unique in their over- all writing than they are in terms of a single character and suggests that more than one character should be analyzed in applied hand- writing examination. Table 4.6. Means and Standard Deviations for Individual Samples (N=50) X X X Individual X 1 s X 2 s X 3 s l 3.46 1.06 5.98 .91 2.57 1.17 2 4.57 .94 7.32 .97 2.70 .76 3 3.26 .97 5.06 .99 2.31 .79 4 3.59 1.17 7.59 1.81 2.09 .66 5 6.73 1.26 4.78 .55 4.41 1.28 6 9.67 1.48 5.58 .94 2.53 .61 7 4.15 2.01 6.30 1.37 2.71 1.01 8 5.60 1.57 7.98 .97 3.21 .84 39 2. More accurate measurement. 
Some of the apparent similarity between individuals may be the result of inaccuracy or insufficient refinement in measurement. The instrument should be improved to permit more precise measurement and to permit the measurement of curvature, which is an important characteristic of a stroke.

3. Control of Type I and Type II errors.⁴

Because statistical inference is based on probability rather than certainty, it is possible that a decision about the null hypothesis may be in error. There are two kinds of error which can be made:

(a) Type I error: the rejection of the null hypothesis when it should be accepted. In terms of handwriting analysis this means that the examiner concludes that the writing was made by different individuals when it was actually made by the same individual. The errors found in Table 4.3 are of this type.

(b) Type II error: the acceptance of the null hypothesis when it should be rejected. In terms of handwriting analysis this means that the examiner concludes that the writing was made by the same individual when, in fact, it was made by different individuals. The errors in Tables 4.1 and 4.7 are of this sort.

There are several approaches to increased identification accuracy based on an analysis of the two types of errors. Type I error can be reduced simply by adopting a more stringent region of rejection. If the region of rejection were set at .01 instead of at .05 (which was used in this study), only one error instead of 6 would have occurred in comparisons of writings made by the same individual, as shown in Table 4.3. This would, however, have increased the number of Type II errors in Table 4.1 from 18 to 25.

⁴ For a discussion of types of error and the power efficiency of statistical tests see Blalock, op. cit., Chapter 14.

Table 4.7. T Tests for Pairs of Small Samples for Different Writers (N=10)

                      t
Pair        X1        X2        X3
1-2        1.66*      .70*     1.37*
1-3        1.55*      .16*      .31*
1-4        2.14*      .43*     1.91*
1-5        4.80      4.13       .34*
1-6        8.95       .42*     6.54
1-7         .56*     2.15       .12*
1-8        1.28*      .91*     1.50*
2-3        3.49       .47*     1.48*
2-4         .86*      .99*      .66*
2-5        3.24      2.74      1.43*
2-6        7.88       .40*     4.25
2-7         .65*     1.72*     1.22*
2-8        2.96      1.37*     2.48
3-4        3.49       .45*     1.99*
3-5        7.08      3.38       .34*
3-6       10.74       .15*     6.19
3-7        1.72*     2.00*     1.00*
3-8         .010*    4.62       .91*
4-5        1.57*     4.24      1.96*
4-6        5.91       .76*     3.07
4-7        1.35*     2.26      4.73
4-8        3.15      1.25*     4.84
5-6        5.60      3.69      3.69
5-7        2.92       .18*     1.76*
5-8        6.05       .35*     3.50
6-7        6.82      3.54      3.70
6-8        9.87      3.70      3.54
7-8        2.45      1.89*     1.89*

d.f. = 9     t.05 = 2.26     t.01 = 3.25
(a) Asterisks indicate incorrect identification.

One method of reducing both types of error is to make the region of rejection more stringent, in order to reduce Type I error, and to increase the sample size, in order to reduce Type II error. The reduction of Type II error by increasing sample size can be seen in the improvement in accuracy of samples of 50 (Table 4.1) over samples of 10 (Table 4.7). It can be suggested, then, that the examiner use a stringent region of rejection (.01 or beyond) and the largest sample possible.

An additional reduction in Type II error can be made by using a one-tailed test. In a two-tailed test, the type used in this study, the direction of the difference between means is not specified. More specifically, when two samples were compared, the test included both the probability that the sample mean was below the universe mean and the probability that the sample mean was above the universe mean. If the universe mean is known, a one-tailed test, in which the direction of the difference is specified, can be used. Such a test will have a smaller probability of Type II error than a two-tailed test for the same sample size.
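The trade-off described above can be illustrated with a short Python sketch. It is an illustration only, not the computing procedure of the study: the function names are hypothetical, the t values are the X1 entries printed in Table 4.7 for writer 2 against writers 3 to 8, and the unpooled two-sample formula used in `t_from_summary` is an assumption, since the computing formula of the study is not reproduced in this chapter.

```python
from math import sqrt

# Two-tailed critical values for d.f. = 9, as printed under Table 4.7.
T_05 = 2.26
T_01 = 3.25


def type_ii_errors(t_values, critical):
    """Count different-writer comparisons whose t fails to reach the critical value."""
    return sum(1 for t in t_values if abs(t) < critical)


def t_from_summary(mean1, sd1, mean2, sd2, n):
    """Unpooled two-sample t for two samples of equal size n (assumed formula)."""
    return abs(mean1 - mean2) / sqrt((sd1 ** 2 + sd2 ** 2) / n)


# X1 column of Table 4.7: writer 2 compared with writers 3, 4, 5, 6, 7 and 8.
x1_writer2 = [3.49, 0.86, 3.24, 7.88, 0.65, 2.96]

# A more stringent region of rejection lets more different-writer pairs pass as "same".
errors_05 = type_ii_errors(x1_writer2, T_05)
errors_01 = type_ii_errors(x1_writer2, T_01)
print("Type II errors at .05:", errors_05)
print("Type II errors at .01:", errors_01)

# Table 4.6 summary figures for writers 2 and 8 on X1, treated here as if the
# same means and standard deviations had been observed at two sample sizes.
t_small = t_from_summary(4.57, 0.94, 5.60, 1.57, n=10)
t_large = t_from_summary(4.57, 0.94, 5.60, 1.57, n=50)
print("t with N=10:", round(t_small, 2))
print("t with N=50:", round(t_large, 2))
```

For these six comparisons the count of Type II errors rises from 2 to 4 when the criterion moves from .05 to .01. Under the assumed formula, the same difference between means gives a t of about 1.78 with samples of 10, which falls short of 2.26, and about 3.98 with samples of 50, because t grows with the square root of the sample size.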
The use of a one-tailed test might be possible if a sample of questioned writing is being compared with a large standard writing of known authorship. This situation might arise if criminal investigation bureaus kept quantitative records of writing in the same way that fingerprint records are kept.⁵

⁵ Ibid., Chapter 14.

It is apparent that statistical inference can determine the identity of writing with a relatively high degree of accuracy, although not with absolute certainty. The probability of error in identification can be greatly reduced by the use of the methods suggested above.

CHAPTER V

CONCLUSIONS

The research reported in this paper was an exploratory study of the possibility of using inferential statistical methods in handwriting analysis. Two inherent characteristics of handwriting, variation within the writings of a single individual and overlap in writings made by different individuals, make handwriting identification problematic. Current methods of analysis do not completely solve the problems presented by these factors because, while they do consider similarity and difference, they do not provide objective criteria for deciding how much similarity there must be before it can be concluded that samples of writing were made by the same person, or how much dissimilarity there must be before it can be concluded that samples were made by different persons. The technique of inferential statistics was developed in order to solve the same type of problem in other areas of inquiry, and an examination of the conceptual and mathematical structure of this technique suggests that it can be legitimately used in the area of handwriting analysis.

The present study was limited in scope in several ways. The study was based on the measurement of elements of a single character. The measuring instrument developed by the writer for the study was restricted to line and angular measurement and was not capable of measuring stroke curvature.
Small samples were used. In spite of these limitations the statistical tests used led to correct identifications in 69 percent of the cases for the least discriminating element and in 86 percent of the cases for the most discriminating element. Further, it has been shown that the accuracy of identification based on a single element can be improved by increases in sample size and by changing the region of rejection of the null hypothesis. The findings, then, seem to demonstrate that the techniques of statistical inference hold great promise for improvements in handwriting identification.

The present study was exploratory and was not designed to develop a finished method of identification. The development of a finished method which can be used in criminal investigation can only result from continued research in this area. On the basis of the preliminary research made for the present study and the findings reported in the preceding chapter, the writer would suggest that research for the development of a mature statistical handwriting identification technique should include the following things:

1. The development of better measuring instruments. It would be particularly desirable to develop instruments which could measure curvature and stroke width and which were calibrated in standard linear measurements such as fractions of a millimeter. In the area of measurement, methods should also be developed for controlling the measurement error, which would vary from examiner to examiner.

2. The combination of micro-analysis and macro-analysis. The analysis of class-patterns, that is, the overall stylistic characteristics of writing, should be combined with the measurement of character elements. The development of a technique for quantifying macro-characteristics would be an important step in the development of a precise handwriting system.

3. The study of writing in various languages. The present study was based on the Korean alphabet.
If the technique of statistical analysis is to have general applicability, the factors of variability and coexistence of similarity must be studied in other languages. Although it can be assumed that the methods used in the present study are applicable to other languages, such an assumption must be tested before a universal system of identification can be developed.

4. The investigation of statistical analysis in disguised writing. The present study made no attempt to examine disguised writing. It is apparent that a great number of the cases processed by handwriting examiners may include disguised writing. For this reason the usefulness of statistical analysis in disguised writing must be thoroughly investigated.

5. The development of techniques for multiple character analysis. One of the findings of the present study suggested that identification can be more accurate if it is based on the examination of several characters. A mature identification system should probably include techniques for quantitatively combining various elements.

6. Exhaustive study of writing variation and coexistence. Before statistical identification methods can be completely developed, more data concerning patterns of variability, overlap of characteristics and internal combinations must be collected and analyzed. In other words, a great deal more must be known about the universe of handwriting before a mature technique can be developed. This could be facilitated if criminal investigation departments in various countries would collect and classify handwriting specimens in the same way that they collect fingerprints.

The writer feels that the use of statistical methods holds great promise for handwriting identification and is hopeful that more research in this area can lead to the development of a scientifically and legally acceptable handwriting identification system.

BIBLIOGRAPHY

A. Books

Blalock, Hubert M.
Jr., Social Statistics, New York: McGraw-Hill Book Company, Inc., 1960.

Dixon, W. J. and F. J. Massey, Jr., Introduction to Statistical Analysis, New York: McGraw-Hill Book Company, Inc., 1951.

Eisenhart, C., M. W. Hastay, and W. A. Wallis, Techniques of Statistical Analysis, New York: McGraw-Hill Book Company, Inc., 1947.

Harrison, Wilson R., Suspect Documents: Their Scientific Examination, New York: Frederick A. Praeger Inc., 1958.

Hilton, Ordway, Scientific Examination of Documents, Chicago: Callaghan & Company, 1956.

McCarthy, P. J., Introduction to Statistical Reasoning, New York: McGraw-Hill Book Company, Inc., 1957.

McCollough, Celeste and Loche Van Atta, Statistical Concepts: A Program for Self-Instruction, New York: McGraw-Hill Book Company, 1963.

Mood, A. M., Introduction to the Theory of Statistics, New York: McGraw-Hill Book Company, Inc., 1950.

Osborn, Albert S., The Problem of Proof, Sixth Printing, Albany, N.Y.: Boyd Printing Co., 1947.

________, Questioned Documents, Second Edition, Albany, N.Y.: Boyd Printing Co., 1946.

________ & Albert D. Osborn, Questioned Document Problems, The Discovery and Proof of the Facts, Albany, N.Y.: Boyd Printing Company, 1944.

Savage, L. J., The Foundations of Statistics, New York: John Wiley and Sons, Inc., 1954.

Selltiz, Claire, et al., Research Methods in Social Relations, 2nd Ed., New York: Henry Holt, 1959.

Somprasongk, Prathnadi, "Identification of Thai Handwriting: A Classification of Common Letter and Numerical Forms in Four Hundred Standards," An unpublished M.S. thesis, University of California, Berkeley, Cal., Oct. 13, 1960.

Walker, Helen M., and Joseph Lev, Statistical Inference, New York: Holt, Rinehart and Winston, 1953.

Wallis, W. Allen, and Harry V. Roberts, Statistics: A New Approach, Glencoe: The Free Press, 1956.

Young, Pauline, Scientific Social Surveys and Research, Englewood Cliffs, N.J.: Prentice-Hall, 1956.

B. Articles

Albarracin, Roberto, "Peritajes Judiciales (Elements of document examination)," Revista de la Mutualidad de la Policia Federal (Argentine), An 28, no 322, mars 1953, p. 40.

Bischoff, M. A., "The Possibilities to Improve Writing Comparison Evidence," Revue Internationale de Criminologie et de Police Technique (Suisse), Vol. 9, No. 4, Oct.-Dec. 1955, p. 273 (14,1).

Clark, P. F., "Classification and identification of criminal handwritings," The Australian Police Journal, Vol. 6, No. 4, Oct. 1952, p. 276.

Conway, J. V. P., "The Identification of Handwriting," J. Criminal Law Criminol. Police Science, Vol. 45, No. 5, January-February, 1953.

Domenici, Folco, "Valore et limiti dell' esame peritale dei manoscriti (The Value and limits of examination of written documents)," La Giustizia Penale (Italie), An 58, no. 4, avril 1953, I, p. 163.

Dvorak, A., "Zur Frage der Beweiskraft individueller Merkmale (The Probative Value of Individual Characteristics)," Arch. f. Krim., 122 (3/4): 90-100, Sept.-Oct.

Filho, Jose Del Picchia, "Metodo grafoscopico universal (Une methode graphoscopique universelle; A universal method of handwriting examination)," Investigacoes (Brazil), An 4, no 41, mai 1952, p. 75.

________, "Que es documentoscopia (La documentoscopie; Examination of documents)," Investigacion (Espagne), An 20, no 294, Oct. 1952, p. 77.

________, "Resenha bibliografica da pericia de documentos (A bibliography of document examination)," Investigacoes, An 4, no 43, juillet 1952, p. 95.

Goyle, D. N., "Identification of Handwritings," The Penal Reformer (Inde), Vol. 8, No. 4/5, Juin-Juillet 1952, p. 86, et no 6/7, Aout-Septembre 1952, p. 147.

Harris, J., "Disguised Handwriting," J. Criminal Law Criminol. Police Science, Vol. 43, No. 5, January-February, 1953.

Harris, John J., "How much do people write alike," J. Criminal Law Criminol. Police Science, Vol. 48, No. 6(1), pp. 647-651.

Hilton, Ordway, "Can the Forger be Identified from His Handwriting?" J. Criminal Law Criminol. Police Science, Vol. 43, No. 4, Nov.-Dec., 1952, pp. 547-555.

________, "The Collection of Writing Standards in Criminal Investigation," J. Criminal Law Criminol. Police Science, Vol. 32, No. 2, July-August, 1942, pp. 241-256.

________, "Der Beweis fur die Echtheit eines Schriftstuecks (Proof of Genuineness)," Kriminalistik, 12(11): 459-462, Nov.

________, "Proper Evaluation of Dissimilarities in Handwriting," International Criminal Police Review, 104, Feb. 1957, pp. 48-51.

Livingston, Orville B., "A Handwriting and Pen-printing Classification System for Identifying Law Violators," J. Criminal Law Criminol. Police Science, Vol. 49, No. 5, Jan.-Feb., 1959, pp. 487-506.

Malley, R., "Identification and Handwriting," International Criminal Police Review, No. 94, Jan., 1956.

Serrano, Pedro, "Grafistica (Elements de documentoscopie; Essentials of document examination)," Policia (Espagne), No. 128, Oct. 1952, supplement, et No. 129, Nov. 1952.

________, "Grafistica (The elements of document examination)," Policia (Espagne), an 12, no 131, janvier 1953, supplement, et no 132, fevrier, supplement.

________, "Grafistica (The elements of document examination)," Policia, An 12, no 132, avril 1953.

________, "Grafistica (Elements of document examination)," Policia, an 12, no 137, juillet 1953, no. 138, aout 1953, no. 139, septembre 1953.

Sjöegren, Tore, "Handwriting Comparison and Probability," International Criminal Police Review, No. 92, Nov. 1955, pp. 274-283.

Smith, Leh Theodora, "Six Basic Factors in Handwriting Classification," J. Criminal Law Criminol. Police Science, 44, 1954, pp. 810-816.

Smith, Stanley S., "A Method of Comparing Written Documents," International Criminal Police Review, Aug.-Sept., 1954, pp. 205-215.

Wrenshall, A. F. and D. M. Duke, "Statistical Methods and the Examination of Questioned Documents," Seminar No. 4, The Examination of Questioned Documents, RCMP Crime Detection Laboratories, May 10-11, 1956, pp. 89-98.

Appendix 1

Sample Text

[Handwritten Korean sample text; not reproducible in this transcription.]
Appendix 2

Major Divisions of Class Pattern

[Chart of handwritten class-pattern specimens; not reproducible in this transcription.]

Appendix 3

Application of the Measurement Plate

In the measurement of the lines and angles of the character analyzed in this study the following rules have been used:

1. Maximum Extension Rule (MER): The measuring instrument is placed on the line to be measured.

2. Inner Line Rule (ILR): The measuring instrument is placed inside the line to be measured.

3. Maximum Angle Rule (MAR): As in the maximum extension rule, the measuring instrument is placed on the two lines which intersect.

4. Inner Angle Rule (IAR): The instrument is placed inside the angle to be measured.

To illustrate the steps for application of the measurement plate and the aforementioned rules:

[Figure: a character element with points a, b and c, shown in three steps: (1) line-up, (2) linear measurement, (3) angular measurement.]

Note:
    ab = X1 = 2
    bc = X2 = 1
    abc = X3 = 8
    red lines = lines drawn according to the rules
    black lines = lines on the plate

All three steps may be illustrated:

[Figure not reproducible in this transcription.]

Appendix 4

Measurement Plate (Enlarged Positive)

[Photograph of the measurement plate; not reproducible in this transcription.]
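The three quantities defined in Appendix 3 (X1 for line ab, X2 for line bc, and X3 for angle abc) were read from a physical plate in this study. As a rough sketch only, the same quantities could be computed from the coordinates of the three points if a stroke were digitized. The coordinates, the units, and the function name `measure_element` are assumptions made for illustration; they are not part of the original instrument, and the plate's own scale units are not reproduced here.

```python
from math import atan2, degrees, hypot


def measure_element(a, b, c):
    """Return (X1, X2, X3) for points a, b, c given as (x, y) pairs.

    X1 is the length of line ab, X2 the length of line bc, and X3 the
    angle abc in degrees, measured at the vertex b. The units are those
    of the input coordinates (an assumption; the plate used its own scale).
    """
    x1 = hypot(a[0] - b[0], a[1] - b[1])
    x2 = hypot(c[0] - b[0], c[1] - b[1])
    # Directions of the two lines as seen from the vertex b.
    direction_ba = atan2(a[1] - b[1], a[0] - b[0])
    direction_bc = atan2(c[1] - b[1], c[0] - b[0])
    x3 = abs(degrees(direction_ba - direction_bc))
    if x3 > 180.0:
        x3 = 360.0 - x3  # always report the angle between the lines, at most 180
    return x1, x2, x3


# Hypothetical points: a vertical line ab of length 2 meeting a horizontal
# line bc of length 1 at a right angle.
x1, x2, x3 = measure_element((0.0, 2.0), (0.0, 0.0), (1.0, 0.0))
print(x1, x2, round(x3, 1))
```

A digitized procedure of this kind would still need the placement rules above (MER, ILR, MAR, IAR) to decide where on a stroke of finite width the points a, b and c are taken.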