THE ATTRIBUTION OF AUTHORSHIP: A COMPUTERIZED METHOD EVALUATED AND COMPARED WITH OTHER METHODS PAST AND FUTURE

Thesis for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
GEORGE W. ZIMMER
1968

This is to certify that the thesis entitled THE ATTRIBUTION OF AUTHORSHIP: A COMPUTERIZED METHOD EVALUATED AND COMPARED WITH OTHER METHODS PAST AND FUTURE presented by George W. Zimmer has been accepted towards fulfillment of the requirements for Ph. D. degree in English.

ABSTRACT

The proving of authorship by statistical means has a long but inglorious history in the field of English scholarship. What usually has happened is that an undisinterested scholar, out to "prove" that (for example) there was a "Pearl-poet" to whom can be attributed three or four additional Middle English poems, lists elements that the anonymous Pearl has in common with the other poems, and concludes that on the basis of his "statistics" the poems must have been by the same author. In his 1941 dissertation (University of Minnesota) John W. Clark takes great pains to disprove the attractive "Pearl-poet" theory by examining the large quantities of data used by scholars from 1876 on, and finds their data invariably faulty or misapplied. Even granting them sufficient accuracy, Clark maintains their data could prove mere influence of one poet on another as well as it proves common authorship of the several poems.

But basically the fault in the early attempts at proving authorship lay in the inaccuracy of the data, which was necessarily gathered "by hand." "Precision without accuracy" is the downfall of pre-computer statistical analyses of literature. We should expect a computer-aided project to avoid this pitfall.
One fairly typical study, Who Was Junius?, by Alvar Ellegard, does involve a computer, but the computer is put to work on data derived laboriously, intuitionally, and inaccurately by Ellegard's own hand. Therefore, his method, which purports to prove that the Letters of Junius were written by Sir Philip Francis, contains the traditional flaw of pre-computer studies of the same kind.

The task set in the present project was to develop an objective test of vocabulary. The data were derived only from the known poems of five nineteenth-century writers, with no goal of attributing an anonymous or doubtful poem to any of the five. The purpose was to test only the test itself. "Precision without accuracy" was avoided by having the computer do the first selecting of vocabulary items to be subjected to analysis.

Fed into the [illegible] Control Data computer were more than five hundred pounds of IBM cards, each one bearing a single line of verse carefully verified, and with the spelling of certain potentially ambiguous words conventionalized in order to dispel the ambiguity: MAY = the auxiliary verb; [illegible] = the month. A glossary program written by James D. Clark of Michigan State University gave me alphabetized lists for every text fed into the computer. A search of the largest texts having disclosed not one content word significantly present in one author and not as significantly present in at least one other, I determined to use the common words in my testing.

Accordingly, I took the forty-five most common words and word-groups (AM+ARE+IS+WAS+WERE = one word; to which I apply the term allomorphs in my thesis) with the exception of the personal pronouns, and made of them a forty-five point profile for each of my 230 texts.
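For a modern reader, the profile just described might be sketched as follows. This is only an illustration under stated assumptions, not the program used in the project: the sole word-group taken from the text is AM+ARE+IS+WAS+WERE, the other groups are hypothetical stand-ins for the forty-five items, and the scaling to occurrences per 1,000 running words is my own assumption.

```python
import re
from collections import Counter

# Only the BE group (AM+ARE+IS+WAS+WERE) comes from the text; the other
# entries are hypothetical stand-ins for the thesis's forty-five items.
# Personal pronouns are left out, as the text specifies.
WORD_GROUPS = {
    "BE": {"am", "are", "is", "was", "were"},
    "THE": {"the"},
    "AND": {"and"},
    "OF": {"of"},
    "TO": {"to"},
}

def profile(text, groups=WORD_GROUPS):
    """Return occurrences per 1,000 running words for each word-group.

    Scaling per 1,000 words is an assumption made for this sketch.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in groups}
    return {
        name: 1000.0 * sum(counts[w] for w in members) / total
        for name, members in groups.items()
    }

# An invented sample line, used only to exercise the function.
sample = "the cat was on the mat and the dog is in the yard"
sample_profile = profile(sample)
for name, rate in sample_profile.items():
    print(f"{name:4s} {rate:7.1f}")
```

Grouping the forms of the verb as one item follows the "allomorph" convention stated above; everything else in the sketch is a convenience for illustration.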
If a writer subconsciously chooses one function word in preference to another, the one he chooses will form a peak on the profile-chart of his texts, while the other, less-preferred, word will show up as a valley. When charts of texts even of different word-populations are compared, two by the same author should have more points in common than two charts of texts by different authors.

The comparing was done by enlarging the points on the 230 profile-charts to quarter-inch holes, and then laying one chart on top of another and seeing where the holes coincided. Thus every text was compared with every other text, and five indices of correspondence tabulated for each comparison. A chart that finally analyzed the half-million bits of information thus tabulated showed that the method has possibilities: the best of four criteria into which I combined my five indices of correspondence was able to call the correct author sixteen times more frequently than would chance guessing.

The test might be highly reliable in establishing which author of only two wrote a doubtful text. Its value, however, for attributing a truly anonymous selection to one of a larger number of possible authors I will insist upon denying. To do this, a statistical test must have the same accuracy as a chemical test; it must work every time under laboratory conditions if it is to be assumed workable when conditions are less controlled.

Although the project was a failure, and the test is untrustworthy as it now stands, there is some possibility that by sharpening my procedures (perhaps by using the computer at all stages) and by strengthening the word-list by dropping dead items and adding some overlooked before, a reliable test for proving authorship could yet be discovered.

THE ATTRIBUTION OF AUTHORSHIP: A COMPUTERIZED METHOD EVALUATED AND COMPARED WITH OTHER METHODS PAST AND FUTURE

By
George W.
Zimmer

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of English
1968

Copyright by GEORGE W. ZIMMER 196[?]

INTRODUCTION

One of the most tempting fields of scholarship is that involving the determination of authorship. If the historical-biographical approach to literature has ever had validity (and it shall here be considered a self-evident truth that it did and does have validity) it is necessary to know with certainty which author wrote which works. Our very important stereotypes of the various artists are largely dependent upon their canons of works: subtract a Timon of Athens from his canon and the percept "Shakespeare" becomes something other than it was with Timon in the canon. The authorship problem does, then, have its willing solvers, whose methods, however, vary widely.

The simplest method of settling an item within or outside a canon is by edict. By declaration of a scholar or a school of scholars an item is placed within or outside a canon. The logic for such edicts is partially as follows: a work can be placed in a canon only by the best authority; I am the best authority; therefore this work is declared to be part of the canon. Pragmatically, at least, the edict method is one of the best, especially when the placement of the work is agreed upon by those scholars most interested in the subject. But the edict method gives rise to the taking of exception by rival scholars or rival schools. Then the rule states that the burden of disproof rests upon the dissenter, who cannot therefore use the edict method. It
is at this point that statistics are brought into play on the question.

"Methods using statistics" is a category classifiable in several different ways: "successful" and non-successful; simple and intricate; pseudo and true; early, more recent and most recent, or pre-computer, early computer, recent computer. Generally, the early attempts use simple processes involving pseudo-statistics and declare themselves successful. More recent attempts are less certain about success, although, paradoxically, they use scientifically sounder procedure. To be dealt with in this thesis are species of early and recent, simple and intricate methods. Each species will be given its just share of criticism including the most recent which, according to my own classification, is also the most intricate and non-successful.

Sections of the thesis will show the deficiencies of several types of statistical methods for proving authorship, with particular attention to (A) the old, pre-computer methods that relied on pseudo-statistics and simplistic comparisons and that were so very positive in their results, and to (B) the computer-based method that involved me in my task. In between there will be short looks at hybrid methods: computer-aided and intricately statistical, but "positive" in their conclusions. This element of "positivity" cannot be treated too fully.

The efforts of the scholars in the field of attribution of authorship in the pre-computer age have already been characterized as simplistic and pseudo-statistical. In general the earlier the study the less scientific it is; but in about direct ratio to the lack of scientific rigor, results of such studies tend to be declared "positive." There are very human reasons for this: in most instances the "positive" results support a preconceived notion.
Later, as scientific rigor stiffened somewhat, an occasional scholar would claim to have had his opinion changed by his research, yet results would be set forth as positive, even though negative evidence was present in nearly as great proportion as positive. There are the further human inclinations to put an end to a piece of research, and to fill an unreasonable demand for a "happy ending" however false that ending might be. Such endings, which demand "positive" results, are easy to sell to a readership unwilling to study the data or the techniques upon which the results are based.

My dissertation presents in three chapters three stages or episodes in the quest for certainty by scholars in the field of English. The first chapter starts in 1941 and refers back to the beginnings of a particular authorship problem, that of the so-called Pearl-poet. It can be taken as typical of authorship problems, in that almost any argument adduced for the commonly-held opinion has been respected, while one most carefully prepared attack on the attractive theory has been all but totally ignored. The first chapter quotes heavily from that attack (John W. Clark's doctoral dissertation, University of Minnesota, 1941) in order to represent its tone of futility and dogged determination. Also, in presenting Clark's dissertation at such great length, I am repaying a debt I had not been aware of owing until I recently returned to its voluminous pages and found there many of the ideas and attitudes I had thought I had formed independently.

My second chapter advances to the 1960's, with an occasional flashback to Clark and the history of the authorship problem. In it I examine two types of computer-aided projects having as their ends the determining of authorship, and find both of them lacking integrity.
It is suggested in this chapter that the computer should be allowed to do all of the work; especially in the first stages of a project, human error should not be permitted to intrude.

The third chapter, a delineation of my own project, which is a study of vocabulary for proofs of authorship, should follow the rule of maximum computer use. But it does not follow my rule. The discrepancy is owing to the fact that this dissertation was written backwards. My decisions on how to proceed had been influenced by Clark and many other books and articles which I deliberately put aside while assembling my own project.1 The project was finished and the results were tabulated before the relationships among the various methods fully occurred to me.

1 Eminent among the writings are "Eras in English Poetry," Josephine Miles, PMLA LXX 853-75, which from examination of syntax posits three eras for each century of verse; Statistical Study of Literary Vocabulary, George Udny Yule (Macmillan, 1944), where formulae are given for determining the authorship of passages of 10,000 words or more; Type-Token Mathematics: A Textbook of Mathematical Linguistics, Gustav Herdan (Mouton, 's-Gravenhage, 1960), which concentrates on vocabulary items (types) and their occurrences (tokens) as the means of determining authorship; and "Montaigne-Shakespeare and the Deadly Parallel," George C. Taylor, Philological Quarterly XXII (1943) 330-37, which lists pseudo-seriously seventy-five kinds of evidence that canon and influence scholars use, of which "vocabulary" is number 31.

I had intended at the outset to demonstrate an impossible thesis: namely, that it is not possible to prove authorship by statistical examination of vocabulary. The project, however, proved the only thing that it was capable of proving: that the particular method used here is not capable of proving authorship. The positive knowledge of this negative fact, coupled with comparison of the method here originated with previous methods whose authors evince far less positive knowledge of the shortcomings of their work, has now led me to a new hope that the authorship problem may yet after all eventually be solved by statistical means.

The results of my research per se were negative and could have been presented in a few pages. Although I am properly as reluctant as anyone else to write at length about little, I have a purpose for so doing in the sections of this thesis outside Chapter III. The positive result I aim at is putting an end to misguided work in the field, for which there may have been an excuse a few years ago, but not now. Chapter III, however, is no longer than it has to be, in contrast to the too-explicit works I criticize. In keeping with this conciseness is my determination in Chapter III not to repeat, or to excerpt from, my charts and tables (thereby forcing the reader to consult the full tables in their proper context each time reference is made to one) unless to show how such excerpting can be made to give the appearance of validity to otherwise inconclusive results.

The fourth chapter is a retrospect of the three stages of work in the field, pre-computer, early computer and recent, with a look at some trends that started after this thesis was begun. It will there be shown how the problem of attribution of authorship will probably be solved, when and if it is ever solved.

CHAPTER I

A CLASSIC CRITIQUE OF PRE-COMPUTER STATISTICS

In March of 1941 John Williams Clark submitted for his doctoral degree from the University of Minnesota his dissertation, The Authorship of Sir Gawain and the Green Knight, Pearl, Cleanness, Patience, and Erkenwald in the Light of the Vocabulary. This work attacks the labors of all the scholars who up to that time had attributed the five poems to fewer than five poets.
The first scholars interested in the problem had taken their cue from the manuscript of four of the poems: since the first four have come down to us in one document, the tendency would be to attribute them to a single writer. This attribution had been disputed several times before Clark, but the important editors of the texts had persisted in the one-poet theory, in support of which various kinds of evidence were presented.

Professor Clark went to great pains to destroy his predecessors' arguments, sometimes by accepting their evidence but re-interpreting it, and other times by adding further evidence to show that they had used distorted data. His 531 pages of hard-hitting argumentation are comprehensive and attack every important argument adduced for the common authorship of any two of the five poems. Clark is thorough, completing and correcting other supposedly scholarly work. He gives the history of his problem:

    The earliest considerable attempt at deciding the question of single or multiple authorship came in 1876, in Moritz Trautmann's doctoral dissertation, Ueber Verfasser und Entstehungszeit Einiger Alliterierender Gedichte des Altenglischen; and the opinion there expressed--that the four poems were written by a single poet--was given currency a few years later by Ten Brink, in his Geschichte der Englischen Literatur, the English translation of which in the 'eighties extended Ten Brink's reputation, and the respect in which his pronouncements were held, beyond the then somewhat restricted circle of English scholars who willingly read German. Since that time, as Menner says (ed. Cl [for Cleanness] p. xi, n.1), "practically all those who have made special investigations of, or edited any of these poems . . ., have accepted the opinion" that a single author wrote Gaw, Prl, Cl, and Pat.
    This opinion--perhaps we may say this pious opinion--was erected almost into a dogma by being spread by Professor (later Sir) Israel Gollancz (who has done more than any other scholar to create the reputation--if not, indeed, the identity--of "the Pearl-poet") upon the sacred page of the Cambridge History of English Literature, of which the first volume, containing the account of the four poems, appeared in 1907. Sir Israel had published his adherence to the doctrine of single authorship as early as 1891 (in his first edition of Prl), and to the end of his life continued to proclaim it; his most absolute affirmation of faith being perhaps one that appears in the preface of his edition of Pat (1913): "It is now generally accepted, in respect of the four poems, that all the evidences of dialect, vocabulary, art, feeling, and thought conclusively point to identity of authorship ..." Italics mine. [Clark's]

    The opinion that Erk, also, was written by the author of the four poems (or of one or more of them) was first advanced somewhat later (by Carl Horstmann, in the editio princeps of Erk, in Altenglische Legenden (Neue Folge), 1881), and has never been so widely adopted or, in general, so confidently urged, as the opinion that the four poems are from a single hand. . . . In this attribution of the authorship of Erk to "the Gawain-poet," Horstmann (has) been followed by most scholars, notably R. W. Chambers, in Essays and Studies by Members of the English Association, 19. 126, n.2, but even Chambers does not express himself with assurance, and there is still a considerable amount of more or less half-hearted dissent. (Pages 3-4)

Clark sees his task as giving heart to the dissent. He disposes in turn of each of the proponents of single authorship, and of each of the theories based on dialect (sixty-three pages), prosody (twenty-one), interests, attitudes and opinions (eight), syntax and style (thirty-two), and parallel passages (forty-one pages).
After 192 pages, he is ready for Part II, the examination of the vocabulary. The chief contention that Clark advances in Part I is that similarities or even identity of dialect or the other criteria "prove" only that the writer of one poem lived in the same area as the writer of another, or was influenced by him. The evidence, he claims, cannot conclusively show either separate or common authorship, but, if the evidence is to be admitted, it has more force in proving diverse authorship. Always, however, there is the strong desire on the part of the earlier scholars to demonstrate the more attractive theory: to set up a "Pearl-poet" whose canon of works, together with those undoubtedly lost, would make him a worthy contender for honors commonly reserved for Chaucer.

Part II of Clark's dissertation is aimed at those scholars who, following Ten Brink's Geschichte der Englischen Literatur1 and Sir Israel Gollancz in the 1907 Cambridge History of English Literature, sought to give a statistical foundation to their preconceptions. His strictures are particularly directed toward J. P. Oakden's Alliterative Poetry in Middle English2 and Henry L. Savage, editor of St. Erkenwald,3 although some lesser sinners attract a share of attention.

1 History of English Literature: I, Eng. tr., H. M. Kennedy (New York, [date illegible]).

    The vocabulary of the five poems early engaged the attention of scholars intent on discovering whether--or, as it almost seems as if we must sometimes say, on proving that--the poems were by a single author. Trautmann published on the subject three times. . . . Trautmann's conclusion was, in brief, that the vocabularies of the five poems are so much more like each other than they are like the vocabularies of any other ME alliterative poems, that we must suppose the five, and only the five, to have been by a single author.
    There is no doubt about the exceptional degree to which the vocabularies of the poems resemble each other; but subsequent investigations both more extensive and more exhaustive than Trautmann pretended to have made, or, indeed, could have made in the absence of editions with more or less complete glossaries and of NED, have shown that the vocabularies of the five poems are nowhere nearly so similar or so peculiar as Trautmann thought. This is well shown by Savage, in his edition of Erk, pp. liv-lv: ". . . the value of Trautmann's findings has been somewhat reduced by the appearance of the later volumes of the NED and the progress of scholarship; yet," Savage adds cheerfully, "the test of vocabulary indicates an unusually close connection between the five poems, and has strong affirmative bearing on the possibility of common authorship."

    In other words, we are right back where we started--the vocabularies of the five poems are rather strikingly similar, but not by any means so similar that common authorship is the only possible (or even, I may add, the most probable) explanation. That this fact is perceived by the advocates of the theory of common authorship needs no further proof than that they all pay their respects to the vocabulary, and then look elsewhere for cogent arguments in favor of their view. (Pages 193-4)

2 University of Manchester Publications, CCV and CCXXXVI (Manchester, 1930 and 1935).
3 New Haven, 1926.

And, as Clark demonstrated in Part I, the arguments from elsewhere are not cogent, either. Why, then, does he devote the bulk of his dissertation to an examination of the vocabulary, especially when "attempts to prove the common authorship of the five poems, on the basis of their vocabularies . . . have clearly failed"?
(Page 195) At this point, Clark could take credit for examining the vocabulary, the most objective of all the criteria and the one at the same time that provides the most massive data; but instead he uses the device as a sort of tail to pin on one of his less preferred predecessors:

    that failure has been proclaimed by no one more emphatically than by some of the principal advocates of the theory of common authorship themselves. But an opening has been left for studies of certain special aspects of the vocabularies; and that opening has been seen by the indefatigable Mr Oakden, who, in the second volume of his Allit. Poetry in Middle English, investigated three of these special aspects: (1) "Chiefly alliterative" words (by which Oakden means, not words that usually alliterate, but words "found but rarely or even not at all, outside the alliterative poems"); (2) "synonyms for man, knight," and (3) "synonyms to express movement." . . . consideration of his findings will repay our efforts by the further suspicions it will arouse as to the validity of the theory of the common authorship of the five poems, and will serve as an appropriate introduction to the main part of this dissertation. (Pages 195-6)

Clark soon shows that Oakden's fifty-three "chiefly alliterative" words reduce to but twelve that are thinly distributed among the five poems, only one appearing in all five poems; that at least one synonym (douth) demonstrates "a fundamental difference in Sprachgefühl" (page 199) from poem to poem; and that Oakden's "definitely poetic" words for "go" have a characteristically non-significant distribution.
Of some little significance for this present dissertation is Clark's explanation following his table of "go"-words for the five poems:

    The glossaries [which Clark had thought he could rely on for his examination of Oakden's assertions] are complete except where "frequent" occurs under Gaw, and possibly also under Pat generally--here, as with the synonyms of man, knight, I have pieced out Bateson with Gollancz, and can hope for getting nothing more than an approach to the truth. All the zeros under Pat, however, are probably right; neither Gollancz nor Bateson, apparently, leaves words completely unnoticed in his glossary except by accident. (Page 213)

In this instance, Clark does not permit himself the dudgeon to which he ascends when he attacks Oakden directly; instead, he merely accepts the incomplete research for what it is not worth. Later, we see him re-doing far longer lists of French and Norse words, with the goal of achieving more nearly perfect accuracy. With an impossible foresight, he and all the other glossary-makers would have waited to let computers do the work with complete accuracy.

Yet, even accepting the imperfect lists for what they are, there must also be an accurate, honest reading of them. In one note, Clark accuses Oakden of carelessness (page 222), and of misreading "the glossaries, or silently 'correcting' them." The same note (no. 17, page 223) tabulates the etymologies given for four instances of the word note:

    line   Gollancz   Savage   Oakden
    8      OE?        OE       OF
    101    OE         OF       OF
    33     OF         OF       OE
    152    OE         OF       OE

    . . . all three scholars never agree. It makes no immediate difference who is right, or whether anybody is right; the point is that neither Oakden nor anyone else really knows (and probably the author or authors of the poems did not know) how often this allegedly "chiefly alliterative" word from OE [illegible] appears in the Five Poems. . . .
    As I have shown above, the editors of the poems sometimes disagree on the derivation of words . . . and yet neither Gollancz nor Savage (nor Oakden, for that matter) expresses the slightest doubt that the truth is attainable and that he has attained it. This sort of thing is common in the editions of the Five Poems. (Pages 222-3, n.)

The slogan is "precision without accuracy," and the further point that Clark makes, about even the author not being aware of etymologies, is one to store away for future reference. However, he himself acts on the counter theory that the author(s) had an awareness of words, since the bulk of Clark's research is precisely in his lists of French and Norse words. Furthermore, Clark makes no claims to perfect accuracy in his lists. This disclaimer is in accord with the footnote above, but not with his abjuring of the "precision without accuracy" slogan.

It is ever thus. If the size of a writer's data is impressive, the statistics will not fail to impress those who have no inclination to test them. Clark had the inclination, and he did test Oakden's figures on the incidence of Old Norse words in Gawain.

    Let us consider Oakden's statement . . . that Gaw contains 256 ON words. What is an ON word? For that matter, what is "a" word? Are [illegible], a., and [illegible], adv., two words or one, for our purposes? Much might be argued in favor of either answer; and so long as we give the same answer consistently, it makes no difference which we choose. But which has Oakden chosen? He doesn't say, and I don't know how to find out. Again, whichever he has adopted, does he stick to it? Again, I don't know. What I do know is that Oakden's habits of work, so far as I have observed them--and I have observed them pretty extensively--do not inspire me with confidence that he has given very careful consideration to the problem. (Page 353)

    . . . I do not claim . . . that my judgment has been infallible, or even that it has been better than the judgment of Mr Oakden and of the editors of the several poems. But I believe that I have shown, beyond reasonable doubt, that Oakden (like most of the editors) has overestimated the number of words, in the five poems, of which we can say with confidence that they are probably ON. (Page 355)

The aspersion cast (page 353), Mr. Clark concludes:

    Incidentally, the close similarity of Oakden's figures and mine for each of the poems severally (except Gaw) leads me to believe that Oakden must, after all, have been nearly as cautious (always with the exception of Gaw) as I about calling a word ON; (page 357)

thus he puts himself in the same pocket as his chief target:

    such statements as Oakden's that Gaw contains 238 ON words--or mine that it contains 202--are perfect examples of precision without accuracy, the fact being, of course, that no one knows how many "distinct words" the language contains, or how many of them are ON. (Page 358)

A weird sense of futility pervades the dissertation.

The Norse words disposed of, in a mere 120 pages, Clark turns to the French words. His predecessor, Hartley Bateson,4 had worked out the proportions of French words in the two poems Pearl and Patience as 32.57 to 19.92.

4 Ed., Patience (2nd ed., London, 1918).

[A block quotation from Clark on Bateson's figures stands at this point; the scanned text is illegible.]
Then, since he is unsure of the meaning of Bateson's ratio of 32.57 to 19.92, Clark does it over again; "and for good measure, I have extended the investigation to the other poems." (Page 46[?]) He laments:

    [illegible] lamentations [illegible] was, after all, a self-imposed task; but I cannot forbear to say that I wish (a) that editors of ME texts had agreed on such questions as the proper alphabetical position of ȝ (in both values), y (vowel), and u and v, and (b) that ME poets and scribes had, as Artemus Ward said, known how to spell. The mere mechanical complications introduced by these two factors into the compilation of such a list as that below, are beyond the powers of anyone who has not experienced them to imagine. But the job was at length finished, and I have the comfort of knowing that I have discovered some facts by it, and subsequent students--whose mathematical studies were not cut short, like mine, with plane geometry--may discover more. (Page 469)

But what has he discovered that is of any lasting worth? His "statistics" are of no more value than those of the scholars he attacks: the data are gathered with as many opportunities for error as were present when the earlier scholars worked without the aid of the Oxford English Dictionary, and the data are handled in a most unstatistical manner. Clark works out ratios of French to Norse words for each of the five poems. Pearl has 4.29 OF words to 1 ON word; Cleanness 3.56; Erkenwald 3.60; Gawain 3.56; and Patience 2.67 OF words to 1 ON word.
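How much such a ratio leans on the underlying counts can be seen in a short worked example. It is a sketch under a loudly stated assumption, not a reconstruction of Clark's tables: it assumes that the Gawain ratio of 3.56 was computed on Clark's own count of 202 ON words, backs out the French count that assumption implies, and then substitutes the rival count of 238 that Clark attributes to Oakden. The function names are mine.

```python
def implied_numerator(ratio, denominator):
    """Back out the whole-number count implied by a published ratio."""
    return round(ratio * denominator)

def of_to_on_ratio(of_count, on_count):
    """French (OF) words per one Norse (ON) word."""
    return of_count / on_count

# Figures quoted in the text: Clark's ratio for Gawain and the two
# rival counts of ON words in that poem.
GAWAIN_RATIO = 3.56
CLARK_ON = 202
OAKDEN_ON = 238

# ASSUMPTION (not stated in the source): the 3.56 ratio rests on
# Clark's count of 202 ON words.
implied_of = implied_numerator(GAWAIN_RATIO, CLARK_ON)

ratio_clark = of_to_on_ratio(implied_of, CLARK_ON)
ratio_oakden = of_to_on_ratio(implied_of, OAKDEN_ON)

print(f"implied OF count: {implied_of}")
print(f"ratio with Clark's ON count:  {ratio_clark:.2f}")
print(f"ratio with Oakden's ON count: {ratio_oakden:.2f}")
```

Under that assumption the implied French count is 719, and exchanging one scholar's Norse count for the other's moves Gawain from 3.56 to about 3.02. That shift of roughly half a point is far larger than the differences separating Cleanness, Erkenwald and Gawain in the list above.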
    It is my duty to remind the reader once more that these figures, like all those in this chapter, are only approximate, so that such a difference as that between Erk and Gaw, or even that between Erk and Cl, is probably not very significant; but such differences as . . . that between Prl and Pat can certainly not be attributed to the roughness of the basic figures. (Pages 478-9)

Perhaps not, but the reason for tabulating the ratios in the first place can be questioned, and the differences can hardly be said to indicate anything of importance. Clark sees the difference between Patience and Cleanness as showing that the former was written first, "before the author had become so well acquainted with French literature," (page 479) although an increased sophistication about French writings could conceivably have caused an author to eschew French terms in favor of native (including, all unaware, Norse) words.5

5 Later (page 555), Clark admits that it is "pointless and unrealistic for us to pretend to know when a ME poet was thinking of the native word and when he was thinking of the foreign one."

Again, Clark is precise though inaccurate when determining numbers of French words, counting as a single form the noun and verb taken from the same source-word, but counting them as two if they are derived independently (affray, n. + affrave, v. = one word; but afvaunce, n. + affve, v. = two words). (Page 431) Furthermore, he counts separately the full and aphetic forms of the same word: "This is perhaps not entirely reasonable, but it is convenient, and can hardly make any serious difference in any conclusions to be drawn from the list," (page 430) the reason being that Clark himself does not take his lists very seriously. When you are out to disprove a theory, any kind of statistics will serve. Nevertheless, I am happy that he did distinguish the full and aphetic forms, because thereby he "proved" the greater proportionate incidence of the full form in rimed poetry.
His care with the two kinds of Middle English poetry no doubt influenced my decision to consider verse forms in deriving glossaries. Just as Clark wasted effort in verifying Bateson's ratios, so also did my meticulousness go for almost naught. (See Chapter III.)

I find, upon re-reading Clark's dissertation four years after I first borrowed it (during which four years my own project received its form), that I am more indebted to it than I would ever have cared to admit. The impression I retained was almost entirely of his contentious tone stemming from exasperation and frustration. I see now that it was he who must have put into my mind the notion to settle upon the inconspicuous words as possible indicators of writing habits. From him came my technique of so dividing poems that the effect of rime could be either emphasized or nullified. And, finally, Clark's dissertation had, as I hope mine will have, the value of pointing the way to a method and a point of view, in determining questions of authorship, that may, when brought to perfection, lead to more positive and dependable conclusions than I have been able to make them yield. (Page 572)

His method, owing to his negative point of view, yielded Clark further frustrations. Positive thinking has a far better chance at publication than does mere objectivity. Or perhaps Clark's tone insulted too many too important scholars.6 The dissertation was never published in its entirety. Portions of it appeared in a variety of journals;7 but his powerful argument was completely ignored fifteen years later by a supposedly demolished target when Henry L. Savage8 again advanced the theory of common authorship as being all but universally accepted.

6 His targets include Gollancz, Savage, Oakden (of course), Miss Mary Serjeantson ("The Dialects of the West Midlands in Middle English," in Review of English Studies 3, 54, 186, 319 (1927)), J. R. R. Tolkien and E. V. Gordon edd., Sir Gawain and the Green Knight (London, 1925), and Lane Cooper and his doctoral students. Oakden and Serjeantson are accused of a suppressio veri (page 55), as are Tolkien and Gordon (pages 114-5). Oakden is treated quite harshly throughout, often through the medium of jokes and catch-words, the points of which are not always clear to me. Perhaps most typical of Clark's fits of ill-humor is the footnote on page 74: "This is neither Mr Chapman's first, nor his most ambitious, nor his most fruitless contribution to the study of the four poems. The only true and lawful claimant to those titles is Mr Chapman's doctoral dissertation, produced in 1927 at Cornell University under the Concordantifex Maximus, Professor Lane Cooper, and entitled 'A Lexical Concordance of the Middle English Pearl, Cleanness, Patience, and Sir Gawayn and the Grene Knight.' In a five-page preface (the only thing the dissertation contains besides the concordance proper), Mr Chapman writes: 'The work is recorded on about [illegible],000 slips ...; and in this shape the copy in due time will be sent to the printer...'" Clark is piqued because this concordance, which he intended to rely upon, covers only six letters of the alphabet. Nowadays they use the computer at Cornell for the production of concordances. A quality product results, but without that useful byproduct, the Ph.D.

7 "Observations on Certain Differences in Vocabulary Between Cleanness and Sir Gawain and the Green Knight," Philological Quarterly, XXVIII (1949), 261-73; "Paraphrases for 'God' in the Poems Attributed to 'The Gawain-Poet,'" Modern Language Notes, LXV (1950), 232-6; "'The Gawain-Poet' and the Substantival Adjective," Journal of English and Germanic Philology, XLIX (1950), 60-6; "On Certain 'Alliterative' and 'Poetic' Words in the Poems Attributed to 'The Gawain-Poet,'" Modern Language Quarterly, XII (1951), 587-98.

8 Henry Lyttleton Savage, The Gawain-Poet: Studies in his Personality and Background (Chapel Hill, 1956). John Conley's review points out that "John W. Clark's sobering studies of the vocabulary of the five poems are not even listed. Yet Professor Clark shows that J. P. Oakden's Alliterative Poetry in Middle English, to which Dr Savage appeals, is far from being trustworthy." Speculum, XXXII (1957), 858-61.

AUTHORSHIP PROBLEMS IN THE EARLY COMPUTER AGE

The main quarrel that a statistician would have with Clark or his predecessors would be over the tiny samples that they sometimes dealt with. Secondarily, the statistician might marvel over their lack [of a method that] takes into consideration populations and deviations. Finally, he could ask for a kind of accuracy that neither Clark nor his predecessors were prepared to provide. Pre-World War II scholars could not have foreseen the computer, and put off their researches until it was available. But the authorship question continued to interest scholars even into the nineteen-fifties and -sixties, when a number of studies were [...]. Who Was Junius? and its companion volume, A Statistical Method for Determining Authorship, by Alvar Ellegard (statistics by Per Sigurd Agrell), both make heavy use of the computer. The latter volume is almost wholly concerned with tables of statistics worked out by computer, with accompanying explanations. The pair of books is a convincing performance.

Ellegard's history of the problem shows that Philip Francis (1740-1818) had always been the prime candidate for the honor of having composed the Junius letters, although Ellegard claims that he began his study favoring another possible writer.
Ellegard's central point seems to be that direct testimony by the contemporaries of Junius can, now that those witnesses are dead, have no chance of pointing out the writer to blame for these incendiary pamphlets, and that therefore a more objectively deductive method must be used to determine their authorship. Sir Philip Francis had long been the favorite candidate among scholars who conjectured an author for the Letters. Ellegard was persuaded by the biographical and other evidence that the chances were against his having been the author, and he set out to prove his contention by statistical means.

His method has several faint echoes of techniques that Clark examined in his critique of the scholarship on the "Pearl-poet." Clark's unpatentable concepts of the favorite word2 and of the unconsciously-chosen expression3 alluded to at the end of the previous chapter form the base of Ellegard's system. Simply, his system would compare corpora of Junian writings with non-Junian writings by the presence or absence of certain words or expressions. These plus- and minus-words -- originally 458 of them, later reduced to 272 -- were culled rather subjectively4 from all of the Letters, from all of the known, identifiable writings of Sir Philip Francis, and from a "million-word sample" of contemporary writings. A Junian plus-word is one used with higher frequency in the Letters than in the "million-word sample" and a minus-word is one that occurs with less frequency in Junius than in the sample. The 458 words are on a sliding scale of positiveness or negativity, with words used not at all (or almost not at all) in the Junian letters designated the most minus. Ellegard himself culled the tables of occurrences of these 458 words or expressions from double readings of all of the texts involved (Junius, Francis, the million-word sample).

2 Clark dissertation, page 302.

3 Pages 9-11.
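The plus-word and minus-word definitions just given can be illustrated with a minimal modern sketch in Python. It assumes only what the text states, namely that a word is "plus" when its relative frequency in the target writings exceeds its relative frequency in the comparison sample and "minus" when it falls below. The function names and the tiny word lists are hypothetical, and Ellegard's own distinctiveness formula, which is not reproduced in this chapter, is not attempted.

```python
from collections import Counter

def relative_frequencies(tokens):
    """Map each word to its share of all tokens in the text."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}

def classify(target_tokens, sample_tokens, candidates):
    """Label each candidate word 'plus', 'minus', or 'neutral'.

    'plus'  : relatively more frequent in the target than in the sample
    'minus' : relatively less frequent in the target than in the sample
    (Illustrative rule only; not Ellegard's actual formula.)
    """
    target = relative_frequencies(target_tokens)
    sample = relative_frequencies(sample_tokens)
    labels = {}
    for word in candidates:
        f_target = target.get(word, 0.0)
        f_sample = sample.get(word, 0.0)
        if f_target > f_sample:
            labels[word] = "plus"
        elif f_target < f_sample:
            labels[word] = "minus"
        else:
            labels[word] = "neutral"
    return labels

# Toy data standing in for the Letters and the comparison sample.
target = "the king the king of law".split()
sample = "the law of the land of law".split()
labels = classify(target, sample, ["king", "law", "absent"])
```

Note that the sketch classifies a list of candidates supplied by the caller; how that list is chosen is precisely the point on which Ellegard's procedure is criticized below.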
From these tables came the raw data which were fed to the computer for manipulation by formula and for multiple grouping.

4 Despite Ellegard's claim of objectivity; see below, pages 26ff.

To repeat, the books by Ellegard and his statistician are a convincing performance. I wrote a too-favorable review of them for the Journal of English and Germanic Philology,5 asserting my belief that Ellegard had solved for all time the problem of the authorship of the Letters of Junius. I dampened my praise, it is true, by mentioning a factual error or two, and by not accepting completely either Ellegard's objectivity or his ability to scan so many texts for his 458 items as accurately as he would have his readers believe he did. My review was in the mail to the Journal of English and Germanic Philology before I saw the handling of the books by the reviewer for the Times Literary Supplement,6 who brought a nicer skepticism to the task.

5 June, 1963, pages 688-9.

6 "The Statistics of Style," January 5, 1963, page 1.

I was then working on a series of projects for Professor Arthur Sherbo at Michigan State University, taking over in the midst of one project that had already been started with another assistant. The goal of these investigations, as I understood it, was to determine which of several 1000-word texts belonged to Samuel Johnson and which were spurious.7 Therefore my interest in Ellegard's apparent success was colored by my hope for a like success, and slanted by my involvement in a Johnsonian project.

7 Though the sampling was wide enough for an effective analysis, having twenty-two texts (mostly Johnson's, but with examples from the known writings of other possible authors of the disputed texts), it was not deep enough: a thousand words could not provide sufficient data for the relatively crude tests to which they were to be put. There will be more about these projects later.

Looking at the Ellegard books now that I have become more blasé in the face of ponderous scholarship and somewhat conversant with, if not statistics, statisticians, I can well believe that Ellegard's statistician was sneering at him (he is called "subtle" by the Times Literary Supplement reviewer) while working up his tables of results. For statisticians seldom put any credence in the statistics of laymen, particularly when those laymen are scholars of literature. One noted statistician with whom I talked at Michigan State University even refused to accept the widely accepted results of the work done on the Federalist papers by Frederick Mosteller and David L. Wallace,8 where the problem involves only two possible writers.

8 Inference and Disputed Authorship: The Federalist (Addison-Wesley, Reading, Mass., 1964).

The professional statistician's skepticism stems from the tendency of dealers in authorship problems to want to sell their cause regardless of negative evidence. Thus, while Ellegard claims to have been converted from an anti-Franciscan stance, nevertheless his case is made to seem inevitably to lead to Francis. And although he may have chosen his plus and minus test words from the Letters before he believed them to be by Francis, he still knew Francis as a prime candidate, and he could have been subconsciously influenced in his choice of the 458 test words by his familiarity with Francis' writings.

I can cite no authority for the caveats in the preceding paragraph. The notions expressed formed in my mind as my belief about Ellegard and others changed under the influence of one or both of the statisticians to whom I talked at Michigan State University. Some statisticians -- and again I cannot cite texts or give names -- would further object to the way the analysis in Ellegard's books groups writers and texts. The million-word sample contains texts which are then regrouped into the writings of individual authors, and compared with the million-word sample. There is a bias factor in the formula used to evaluate the deviations of individual authors' writings from the standard of the sample. Naturally, Francis is found to deviate farthest from the sample and to be the closest to the Junian standard.9 To one of the statisticians with whom I spoke, a text is a text; and it loses its testability when divided or when combined with other texts. Ellegard's system repeatedly combines texts.

9 The possibility that Francis might have consciously imitated Junius is not given weight.

The 272 test-words are themselves combined also. The changes are rung on the combinations of groups of groups of plus- and minus-words. It is not enough that there is a descending scale of Junianity attaching to groups of plus-words; in order to test many texts in a variety of ways -- he fills forty-five pages with tables -- Ellegard combines his groups into the very Junian, the somewhat Junian, and the slightly Junian. Of course, the first super-group excludes non-Junian texts better than the other two. That this is a species of circular argument is not pertinent, since the results, by such manipulation, are rendered so much more positive.

Putting aside the subtleties read between the lines by the Times Literary Supplement reviewer of the Appendix written by Ellegard's statistician, Per Sigurd Agrell, perhaps the main mark against the theory, even assuming that it proves Francis the most likely author among those authors considered, is the contention that only a few of the several hundred thousand Englishmen capable of having written the Letters were entered into Ellegard's process. Ellegard here rightly falls back on biographical details: Francis was available and knowledgeable as the letters were being written, and they stopped when he went to India.
Yet the fact remains that almost any number of secretaries or mistresses of Opposition members in Parliament could have had access to the information displayed, and might have had the literary skills associated with Junius.

But if the "'one new fact' demanded by Dilke"10 is provided by Ellegard's finding that Francis' style is the closest to the Junian of all the feasible and known contenders, perhaps the question is after all solved. The method, however, is not convertible to other problems of contested authorship where the biographical details are not so clear-cut. It would be almost worthless where the data-providing subject matter is limited, where texts contain fewer than two thousand words, or where the writings are in verse.

10 Who Was Junius?, page 119. Charles Wentworth Dilke wrote in the Athenaeum in the mid-1800's of his disbelief in the theory of Franciscan authorship of the Letters.

My principal criticism of Ellegard's procedure concerns his method of gathering data.

    My procedure was as follows. After a cursory reading of the whole material -- all the Junius Letters --, in order to get a general impression of the language of the time, I carefully combed the Junius material for words and constructions which seemed to me to be used with remarkable frequency in it. I then did the same for the comparative sample of a million words, noting not only the words which struck me as remarkably frequent in the various texts, but also those words which, though not particularly frequent, I did not remember having seen in Junius. In this way I obtained a preliminary list of Junian plus words -- from my reading of Junius -- and of Junian minus words -- from my reading of the comparative material. After I had got the whole list of plus and minus words by heart, I read through the whole text material again, registering each occurrence of each word included in the list.
    When this had been done, the total number of occurrences was added up for each word, in Junius on one hand, and in the comparative material on the other. After this, it only remained to calculate the distinctiveness ratio, and the final testing list could be drawn up. In order to minimize as far as possible the number of occurrences lost by inadvertence, each page of text was read through twice over. Even so, however, mistakes have certainly been made.11

11 A Statistical Method . . ., pages 22-3.

At the very time I was reading Ellegard's books for review, I was having trouble maintaining accurate counts of my own. The old MISTIC computer seemed not to be as truthful as we thought it should be, in giving us counts of words in sentences and of words of certain lengths. I wondered if the program for alphabetized lists of words was also playing us false. The text I checked was supposed to have thirty-four instances of the word "of" according to the count of the MISTIC computer. I then discovered how difficult it is to find as few as thirty-four "of's" in a text of a mere thousand words; and I knew the number that I was trying to find, and was circling each "of" as I found it. It took about eight readings. Ellegard's assurance that two readings would not result in mistakes that would have significant bearing on his results is one statement that I cannot accept.12

Ellegard was aware of the means available for avoiding all mistakes, in deriving an unbiassed testing list, as well as in counting the items from that list in the separate texts.

    There are two ways of guaranteeing an unbiassed testing list. One is to examine completely the vocabulary of all the texts investigated, and draw up the lists of plus-expressions and minus-expressions wholly on the basis of such complete investigation.
    The other way is to select a sample of expressions according to a well-defined objective criterion, which can be reasonably assumed not to favour or disfavour any particular candidate. The selection may be either random, or systematic.

    I have followed neither course. To make a complete investigation would have been a Herculean task: it will have to wait until the whole process can be carried out by electronic computer. . . .13

12 A Statistical Method . . ., page 23.

13 Who Was Junius?, page 113.

Ellegard's choice was wrong. Unlike John W. Clark in 1941, he could have chosen the computer and he should have waited for suitable programs for sifting and culling. Instead, he repeats the error that Clark is so vociferous about, that of having precision without accuracy. His indices to the fifth and sixth decimal place, and his forty-five pages of perfectly worked-out tables are all based on shifty data. The precision and accuracy must begin at the beginning or there will obtain inexorably the dictum of the computer operators: Garbage in, garbage out.

Computer-use for authorship problems was very much in the atmosphere in the early sixties. The first phase of the Johnson problem initiated by Professor Sherbo ground to a halt: the thousand-word texts we were using were just not large enough to show distinctiveness in such aspects as sentence-length, or numbers of x-lettered words. We were looking for a determination of which of several disputed texts were Johnson's by means of a statistical process suggested by George Udny Yule.14 Even the excellent glossary program which gave us the occurrences of every vocabulary item in alphabetized lists failed to yield promising results, presumably for the same reason: the shortness of the texts. We then struck out in a new direction. Retaining the twenty-two thousand-word texts while finding some way to increase the kinds of possibly distinctive data was the immediate task. White Knight that I am, I invented my own system.
14 A Statistical Study of Literary Vocabulary (Cambridge, Macmillan, 1944), and "On Sentence-length as a Statistical Characteristic of Style in Prose; with Application to Two Cases of Disputed Authorship," Biometrika, XXX (1938), pages 563-90.

The system sought to multiply the data by counting, not words, but groups of words. The repetition of patterns might be the clue to an author's writing habits. Raw words would, of course, not serve this new purpose, because there is so much variety in the selection of content-words that the longer patterns would almost never be repeated. The raw words were given coded designations indicating their "part of speech" and their "function in sentence." Thus the expressions "in the house," "on the town," and "over the rainbow" would all have the same coded appearance: PJ DE NP, for "Preposition introducing adJective phrase," "the Determiner 'thE,'" and "Noun object of Preposition." All 22,000 words had to be so coded (by hand!) before the computer program for selecting patterns could begin to work. By this time the casual reader will recognize the same old fallacy of precision without accuracy in the work. The coding had to be done by hand, and I made the usual claim of consistency to refute in advance any argument that my data might be deficient. The program ran, the like patterns were collected, and the output was analyzed. The results were nothing if not inconclusive.

Not so, however, with the results of a very similar investigation that was being carried on at the same time at Columbia University by Louis T. Milic. Mr. Milic's dissertation also examined prose of the eighteenth century by tabulating patterns of words;15 but his coding of words was based on the structural grammar of C. C. Fries rather than on the traditional parts of speech, and his sample texts ran to four thousand words instead of our one thousand.
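The pattern-counting scheme described earlier in this passage (words replaced by hand-assigned codes, then repeated groups of codes collected) can be sketched in a few lines of modern Python. This is only an illustration under stated assumptions: the original program is not reproduced in the thesis, the function name pattern_counts is hypothetical, the group length of three is chosen to match the three-word examples, and the code labels in the toy input are illustrative stand-ins for the hand coding.

```python
from collections import Counter

def pattern_counts(codes, n=3):
    """Count every run of n consecutive codes in a coded text.

    `codes` is the text after each word has been replaced by its
    part-of-speech/function code; the coding itself is assumed to
    have been done beforehand (by hand, in the original project).
    """
    return Counter(
        tuple(codes[i:i + n]) for i in range(len(codes) - n + 1)
    )

# "in the house" followed by "on the town": two phrases, one coded pattern.
coded_text = ["PJ", "DE", "NP", "PJ", "DE", "NP"]
counts = pattern_counts(coded_text)
repeated = {pattern: k for pattern, k in counts.items() if k > 1}
```

Because the counting step is mechanical, any inconsistency in the hand coding passes straight through to the counts, which is the weakness the passage itself concedes.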
Perhaps, I reasoned, his results would tend to be more positive than ours.

15 A Quantitative Approach to the Style of Jonathan Swift (The Hague, Mouton, 1967). Mr. Milic and I exchanged several letters and talked long distance once as he was putting the finishing touches on his dissertation. I can at this moment understand his state of mind the day my call went through to him. By the time his work would have been available, I had already abandoned the system.

Despite Milic's apparent success (his "study claims to have produced a method of identification by internal evidence, free of the usual uncertainties, using statistical methods and computer technology"),16 I decided to reject the process we had discovered independently when it came time to undertake my own dissertation project. What course I took, how it failed, and why I think yet that it is, in the main, the right course form the substance of the remainder of this thesis.

16 Dissertation Abstracts, XXIV (1964), 3730.

CHAPTER III

DISCOVERING A TEST FOR PROVING AUTHORSHIP: A STATISTICAL TREATMENT OF MOST OF THE LONG POEMS OF WORDSWORTH, KEATS, SHELLEY, BROWNING, AND TENNYSON

The project I now undertook started out to be the impossible one of proving that authorship cannot be proved by a statistical examination of vocabulary. Translated into possibilities, it meant that I would devise the best test I could and use it under ideal conditions with the hope that it would work but with the expectation that it would not. Of all the segments of the initial project with Professor Sherbo, I had confidence only in the "glossary" program. The tests for sentence- and word-length were almost patently unworkable when used on our short, thousand-word texts. With alphabetized word lists, however, texts could be compared for every item of vocabulary.
Such was my intention: to omit consideration of no word out of fear of the charge of conscious or unconscious prejudice that Ellegard was subject to when he drew up his list of 458 items. I also left the Johnsonian milieu with its doubtful texts so that I could concentrate on the test itself and not on any immediately practical application thereof.

To the best of my limited knowledge, there had been no completely objective tests of authorship. Even if the researcher is without an axe to grind, so to speak; that is, even if he does not hold a prior belief that a certain author is to be credited with the disputed work, he nevertheless does begin with a strictly limited set of candidates, and omits from consideration the possible stray contributor, or the truly anonymous writer who was not known in his own time to have written anything. Always, in such research, there is the task set of attributing something of doubtful authorship to a known writer. And although controls are purportedly used, the methods are never tested entirely apart from the problem for which they were designed. Having a goal in mind can cause an experimenter to color his data, even unconsciously, by selecting items for analysis any other way but at random.

Another abuse of scientific methodology occurs when comparatively scant data, never gathered with perfect accuracy, are formulized and magnified into imposing tables of figures to the third and fourth decimal place. This could be called the Gold Bug distortion, whereby a mistake of inches near the trunk of the tree amounted to many feet when the final line was projected. The deeper you dig in a wrong location, the more foolish you appear in retrospect. Ellegard's two volumes on the Junius problem are a good example of this sort of abuse.

I take pains to avoid both of these pitfalls. My purpose is to seek a method of proving authorship by examination of vocabulary usage.
I avoid the first trap by choosing to examine only known works by known poets. And, secondly, I allow the perfectly accurate counting machine to amass my initial data, from which I subtract, by wholly objective means, the usable parts.

Can a poem by Keats be distinguished from one by Shelley, Wordsworth, Browning, or Tennyson through the use of a test involving the poets' choice of certain vocabulary items? With the aim of discovering such words, I set about to feed every poem of more than two thousand words by these five nineteenth-century poets into the 3600 Control Data computer then newly installed at Michigan State University. I did not have enough time to submit every poem of more than two thousand words by the five poets, but my omissions were completely by chance (see Appendix A for a list of the texts). If an authorship test by vocabulary does exist, it probably will not work on poems of much less than four thousand words, but my intentions included the determining of how small a text can be tested successfully.

Each line of each poem was punched on an IBM card and carefully verified (see Specimen 2 in Appendix F). Not knowing which words would enter into my analysis, I sought to eliminate ambiguity by following certain conventions of spelling. The auxiliaries "may" and "might" were to be separated from the nouns of the same spelling by appending an [...] to the nouns. The British "round" was always spelled "around" when it meant "around." Contractions were spelled out so that both parts of the word could be counted while the "word" itself would register only once: "itis" for "it's," and "cannot" for "can't." My failure to somehow differentiate "to" the preposition and "to" the infinitive signal is only slightly mitigated by the contention that all "to's" are equal inasmuch as the poet is choosing in either instance the same two-letter word.

Mr. James D. Clark, of the Department of Psychology at Michigan State University, wrote the program for data retrieval. Mr. Clark's glossary program yielded me alphabetized lists of words and their occurrences in more than two hundred texts (see Specimen 3 in Appendix F). The possibilities for expanding the number of "texts" are nearly limitless, since halves, thirds, fourths, or sixths can be combined in many ways. No text is made up of parts from different poems, however.

With the computer output I was ready to follow Ellegard's retrospective advice to select words entirely objectively (see quote, pages 26-7). I conned the lists for words significantly present in texts by one poet and not so present in texts by the others. No such words seem to exist; that is to say, if a word is used in several poems by one writer, it will be used with about the same frequency by at least one other writer.

It would be necessary, I decided, to work with those words appearing in practically every text. These are the words automatically excluded from most concordances: the non-content or function words. And since they all appear in almost every text, my treatment of them would have to consider their relations to one another: does a writer's repeated choice of "the" diminish his uses of "a" while at the same time, perhaps, his "and's" are impinging upon his "or's"?

I had thirteen of my texts (see Appendices A and B) of about eight thousand words scanned for the words common to all thirteen. After the disqualification of personal pronouns as too dependent upon content,1 forty-five words remained. Fourteen of the forty-five have variant forms, which were carefully combined to make single items (see Appendix C for the allomorphs of these fourteen items).
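The selection steps described above (an alphabetized glossary per text, merging of variant forms into single items, exclusion of personal pronouns, and retention of only the items common to every text) can be sketched as follows. This is a hedged modern illustration, not Mr. Clark's program: the function names are hypothetical, the pronoun set is a small illustrative subset, and the allomorph table shows only the AM/ARE/IS/WAS/WERE group mentioned in the abstract, under the invented label "be". The full lists are those of Appendix C.

```python
import re
from collections import Counter

# Illustrative tables only; the thesis's own lists are in its appendices.
ALLOMORPHS = {"am": "be", "are": "be", "is": "be", "was": "be", "were": "be"}
PRONOUNS = {"i", "me", "he", "him", "she", "her", "we", "us", "they", "them", "you", "it"}

def glossary(text):
    """Return an alphabetized word list with occurrence counts,
    merging variant forms into a single item."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(ALLOMORPHS.get(word, word) for word in words)
    return dict(sorted(counts.items()))

def common_items(glossaries, excluded=PRONOUNS):
    """Items present in every glossary, minus the excluded words."""
    shared = set.intersection(*(set(g) for g in glossaries))
    return sorted(shared - excluded)

texts = [
    "The sea was calm and I was glad",
    "She is the light and the air",
    "We were the first and the last",
]
glossaries = [glossary(t) for t in texts]
key_items = common_items(glossaries)
```

On real texts of several thousand words the surviving items are, as the passage observes, almost entirely function words.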
It was not possible at this stage to distinguish the usages of several ambiguous forms, such as "as" and "like," but even deliberate refusal to distinguish them could be justified by the argument that the poet did after all choose the word in question, and probably unconsciously, since most of the forty-five items tend to take light stress in their verses. This justification could perhaps extend to the single item "to," which might have been separated into its use as preposition and use as infinitive signal.

1 This decision is based on an experience that Professor Sherbo shared with me. An examination of three 12,000-word texts from consecutive issues of The Gentleman's Magazine of the 1740's for clues to identify the author of the doubtful middle text showed that the single word to vary significantly in usage from text to text was the feminine pronoun. One of the articles was about the Queen of Spain and also used the feminine pronoun for certain abstractions. Another article also used the feminine pronoun for abstractions, but was not concerned with the Queen. The third had no feminine pronouns, although some of the same abstractions were referred to.

The forty-five key words having been determined upon by purely objective means, it remained to find a way of using them for comparing texts. It will be remembered (Appendix A: list of texts) that the texts vary in length from one thousand words to eight thousand. I imagined that there could conceivably be distinct patterns of usage for these test or key words. Such patterns would have to have been determined entirely by unconscious selection by the poets. If, or since, they were beyond the ability of the poet to control, these patterns should be so much the more effective for use in establishing an authorship test that would distinguish a poet from his imitators.2 And if such a test existed, it could possibly be used on texts of vastly unequal length.
So, rather than concentrate on comparing texts of commensurate size, I decided to compare each text with every other.

Accordingly, I made profiles of all the texts by graphing the forty-five words. I gave the word most frequently used in a text the value of 100%, plotted at the top of the graph. Each of the other forty-four words was given a proportionate position (see Specimen 6, Appendix F). In order to compare the graphs visually, I enlarged the forty-five points to quarter-inch holes (see Specimen 7, Appendix F). Properly positioned, one on top of the other, all the points of comparison between two graphs were immediately visible and ready to be counted.

2 My interest in the subject somewhat antedates my quick Masters Thesis (University of Detroit) written in the summer of 1959 in which I "traced" "evidence" of "influence" of Shelley on four subsequent poets, mainly through their use of common vocabulary items. The kind of item I then concentrated on is typified by the word skiey, used by Shelley, of course, and by Francis Thompson in what must have been a conscious attempt to resemble Shelley. A word would not have to be as outrageously "poetic" as skiey for an admirer to borrow it; the other conscious borrowings would, however, tend to be the slightly out-of-the-ordinary. Apart from the function words in "turns of phrases" so borrowed, practically all of the borrowings would be content words. And no imitator would think to conform his own usage of all function words to the patterns of his master.

A gross count of simple coincidences would not have justified the use of graphs, since such data could be compiled merely by having the computer examine the charts of numbers that lay behind the graphs, and letting it do the ratios at the same time. Perhaps, I reasoned, the best profile similarities were skewed out of recognition by the lack of coincidence between the two leading words, that is, those given the value of 100%.
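The profile described above (the most frequent key word set at 100%, every other key word placed in proportion) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `profile`, the four-word key list, and the sample sentence are hypothetical and are not taken from the project, which used forty-five items and IBM cards.

```python
from collections import Counter

def profile(counts, key_words):
    """Scale key-word counts so that the most frequent key word is 100."""
    top = max(counts.get(word, 0) for word in key_words)
    if top == 0:
        raise ValueError("text contains none of the key words")
    return {word: 100.0 * counts.get(word, 0) / top for word in key_words}

# Hypothetical miniature example (the thesis uses forty-five items, not four).
text = "the cat and the dog and the bird of the air"
counts = Counter(text.split())
p = profile(counts, ["the", "and", "of", "to"])
print(p)
```

Because every value is relative to the leading word, two texts of very different length yield profiles on the same 0-100 scale, which is what allows each text to be compared with every other.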
Frequently it did happen that the greatest numbers of closest correspondences were found only after searching for them. This searching added but little time to the comparison of each pair of graphs.

Each comparison of two graphs took approximately one minute. After taping Graph #1 to a dark board, I positioned Graph #2 atop it and counted (1) the number of holes that corresponded at all, (2) the total that corresponded closely (that is, that showed more than half a hole-diameter), and (3) the number of correspondences above an arbitrary 15% line. For the fourth and fifth indices of correspondence, Graph #2 was slid up an inch toward the top of the board while I looked for the greatest number of additional correspondences above the 15% line on Graph #1. Then Graph #2 was slid down an inch (two inches, really) while correspondences above the 15% line on Graph #2 were sought.

Sometimes more than a minute was consumed in finding the scores for the fourth and fifth indices, the approximate and close correspondences above the 15% line. Time was also consumed in taping the bottom graph to the board and removing it, and in taking out and putting away the sets of graphs and the sheets onto which I was writing the five indices. I wrote five index numbers (sometimes seven) for each of 230 x 229 x 1/2 comparisons of graphs, and I counted about fifty correspondences for each comparison of two graphs. The 1,320,000 bits of information thus counted were recorded on a triangular chart made up of three hundred individual 8 1/2" by 12" sheets, and measuring twenty by twelve feet (see Specimen 8, Appendix F: a part of one of the 8 1/2 by 12 inch sheets).

A computer, which would not have had to tape graphs to a board, or would not even have had to use graphs at all, could have completed the counting in a matter of minutes once it had been programmed and the material had been prepared for it.
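How a computer might count the first three indices directly from the numbers behind two graphs can be sketched as follows. Several things here are assumptions rather than the author's procedure: the hole diameter is expressed as a hypothetical 5 percentage points on the 0-100 scale, "above the 15% line" is taken to mean that both points lie above it, and the function name and sample profiles are invented for illustration. The last two lines check the arithmetic of the comparison count.

```python
from math import comb

def first_three_indices(p, q, hole=5.0, line=15.0):
    """Count coincidences between two profiles on a 0-100 scale.

    hole: assumed hole diameter in percentage points (hypothetical value).
    line: the arbitrary 15% line mentioned in the text.
    """
    at_all = close = above_line = 0
    for word in p:
        gap = abs(p[word] - q[word])
        if gap < hole:                     # the two holes overlap at all
            at_all += 1
            if gap < hole / 2:             # more than half a diameter shows
                close += 1
            if p[word] > line and q[word] > line:
                above_line += 1            # overlap lying above the 15% line
    return at_all, close, above_line

# Hypothetical four-item profiles.
p = {"the": 100.0, "and": 60.0, "of": 40.0, "to": 10.0}
q = {"the": 100.0, "and": 58.0, "of": 44.0, "to": 30.0}
indices = first_three_indices(p, q)        # (3, 2, 3)

pairs = comb(230, 2)                       # 230 x 229 x 1/2 = 26,335 comparisons
counted = pairs * 50                       # about fifty correspondences each
print(indices, pairs, counted)
```

At roughly fifty correspondences per comparison, the 26,335 pairings give 1,316,750 counts, which agrees with the rounded figure of 1,320,000.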
I justify my performing this long phase without the aid of a computer by the fact that I was not exactly certain what I was looking for or how I would be able to use the data I was compiling.

At one stage of the comparison phase, when it was about one-fourth finished, I took notes on how a certain text (#56 by Shelley; see Appendix A) compared positively with other texts: that is, with what texts did it yield high indices of correspondence. I was most exhilarated when, seven times out of seven, high indices actually did point to other texts also by Shelley. But the eighth, tenth, and eleventh comparisons of the twelve made resulted in false identifications. So promising did the system appear at this time that I attempted to present an explanation of it at the April 30, 1966 meeting of the Midwest Modern Language Association in Iowa, where I learned that it is nearly impossible to present unconvincing facts convincingly. For by the time of the conference, when my facts should have been more firmly positive, because I was by then dealing with larger texts, they were more inconclusive than ever before or since: I knew they were not as positive as the Shelley #56 figures indicated, but neither could I say that they would be negative until I had finished the three hundred pages of my 20' x 12' chart.

With all the indices tabulated, the final step was to test the results. If there is a profile or a set of profiles made up of points on a graph representing the proportional occurrences of forty-five common words selected objectively, and typical for each of my five poets, then surely the following test will find it. I reduced the five indices for each comparison down to four criteria, three positive and one negative.
The first positive criterion consists of the five indices added together, the second is merely the third index (correspondences above the 15% line), the third adds the last two indices (the moveable ones), and the last is the same as the first, except that the low total is the test. That is to say, if two graphs have no points of correspondence then the poems behind the graphs should not be by the same poet. Any of the positive criteria should identify texts as being by the same poet, since what I looked for in each case were the extreme examples (combination of indices distorts the data in my favor; it was done because there was no way of predicting which criteria or indices would assay out). For each of the 230 texts, I rejected all but those texts of the remaining 229 that were most like it. By expectation, at least above some limit of about 3,500 words, each text should have selected matching texts; each Keats text should have selected Keats texts, each Shelley, other Shelleys. Beyond this, the negative criterion should never have selected texts by the same poet.

The chart of the last analysis (see Appendix E for a summation of this chart), showing the results of one of the grossest possible tests of the validity of my method, must measure 230 by 230 squares. Most of these squares will not contain an entry because only the extreme examples of merely four criteria are tabulated for each row of 230 squares. Let me present here (in Figure 1, next page) what might be a random sample of the chart of the last analysis. Divided into a hundred equal parts of twenty-three squares on a side, one such part of the chart, the twenty-third (counting from top left), contains twenty-one criteria of correspondence, every one of which indicates texts that are indeed by the same poet, or, in the case of the negative criterion, texts not by the same poet. These are exactly the results I had hoped for.
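The reduction of five indices to four criteria, and the selection of only the extreme examples for one text, can be sketched as below. The criterion names, the text labels, and the index values are hypothetical; only the definitions of the four criteria come from the description above, and ties are all kept, as the author later says all pairings were listed.

```python
def four_criteria(indices):
    """Reduce the five indices of one comparison to the four criteria."""
    i1, i2, i3, i4, i5 = indices
    total = i1 + i2 + i3 + i4 + i5
    return {
        "total": total,        # first positive criterion: all five added
        "third": i3,           # second: correspondences above the 15% line
        "moved": i4 + i5,      # third: the two moveable indices added
        "low_total": total,    # negative criterion: the LOW total is the test
    }

def extremes(comparisons):
    """For one text, keep only the most extreme other texts per criterion."""
    picks = {}
    for name in ("total", "third", "moved", "low_total"):
        values = {t: four_criteria(ix)[name] for t, ix in comparisons.items()}
        if name == "low_total":
            best = min(values.values())
        else:
            best = max(values.values())
        picks[name] = sorted(t for t, v in values.items() if v == best)
    return picks

# Hypothetical indices for one Shelley text compared with four other texts;
# the letter before the hyphen stands for the poet.
comparisons = {
    "S-19": (30, 20, 12, 4, 3),
    "S-27": (28, 18, 12, 5, 4),
    "W-11": (12, 5, 2, 1, 0),
    "T-20": (25, 15, 9, 6, 4),
}
picks = extremes(comparisons)
print(picks)
```

In this invented example the first two criteria pick Shelley texts (hits), the third picks a Tennyson text (a miss), and the negative criterion correctly picks a text by another poet.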
FIGURE 1
Square 23 of the Chart of the Last Analysis
KEY TO CRITERIA: total of 5 indices; 3rd index (correspondences above the 15% line); 4th + 5th indices; low total of indices.
This is the only square of the hundred into which I divided the chart of the last analysis where there are no misses. Thirty-one of the hundred have more misses than hits.

The next step is to use a finer test, broadening the criteria until negative results are reached. Rather, that would have been the next step, had Square #23 been typical. A look at the final tabulation (again, see Appendix E) will show that Square #23 is the only one of the hundred that is unanimous in supporting the original theory. The remaining ninety-nine squares range down to a perfectly negative correlation (Square #44, Appendix E), with all the criteria missing the mark.

Not even the negative criterion had reliability. By rights, the negative criterion should never have appeared when texts by the same poet were being compared. Yet it did, forty-five out of 295 times, a ratio only slightly better than pure chance. When a text selected more than one other text for positive or negative correlation all pairings were listed. For example, text #54 (Browning) correctly chose five Wordsworthian texts in the twenty-third square alone for negative correlation. In each of these five instances all of the indices added up to only ten correspondences. Text #54 selected non-Browning texts sixteen times in 229 trials, but selected another Browning text once. This is only three or four times better than guessing. Likewise the positive criteria, although far better than pure chance, which would score a hit approximately every fourth or fifth time, still did not have the consistency needed for an effective test (see Appendix G).

The present experiment, therefore, is a failure.
It did not discover a way of proving authorship by means of words selected objectively and graphed without regard to text length. Improvements in the method are beyond the scope of the present study, which has now lost some of its innocent objectivity. I have some faith that by working over the data that went into the positive results I can cull out certain of the dead vocabulary items and build a stronger test. It is encouraging that two of the positive criteria, the second and the third, were right roughly four times as often as they were wrong, and that the first criterion was almost sixteen times as effective as guessing; that is, it was right three or four times as often as it was wrong. Especially by concentrating on the relatively few data of Magic Square #23, something viable might be conceived (some of the magic of this square perhaps derives from the fact that seventeen of its twenty-three texts are from Wordsworth, thirteen from The Excursion alone). Future mining of these statistical lodes, however, will have to avoid the pitfall of circular reasoning, and any testing device so derived will itself have to be tested against poems not on my list. Until a test is devised that will work on any hundred out of a hundred texts of known authorship, I, for one, will refuse to accept its results when it is used with texts of doubtful authorship.

CHAPTER IV
CONCLUSION: OUTLOOK

The time has come for me to fulfill the promise made in the Introduction to point out the way that the attribution of authorship problem may be solved. To recapitulate: John W. Clark concluded that the problem probably will never be solved, since it is impossible to tell whether close similarities between texts indicate common authorship or mere influence of one author on another. This almost certainly will remain true of disputed texts such as his four or five poems, whose author(s) left no definitely attributable corpora to be used for comparison.
Circular arguments, however much pleasure they give the disputant, cannot be said to solve anything; and when one compares data from Gawain with data from Pearl, and data from Patience and Cleanness with data from the other two, all without knowing how many authors are involved or what their known works will yield for data, it seems more than a little vain. If the names of the poems were applied to cigarettes respectively noted for "manly flavour," "taste beyond price," "slower burning," and "less smoke per puff," it still could be that all the brands are rolled in the same shed in Lexington, although not necessarily by the same machine.

Almost as a postscript to the problem taken up in Chapter One comes in 1966 A Concordance to Five Middle English Poems, a computer-derived work by Barnet Kottler and Alan M. Markman (University of Pittsburgh Press). The concordance relies heavily on the work of Sir Israel Gollancz, whose editions of four of the poems are the basic texts, with Menner's, Savage's, and Tolkien and Gordon's for variants, and whose edition of the Pearl is one of three variant texts backing up the 1953 Gordon edition. The concordance is a volume that would have saved John W. Clark much of his labor, thereby perhaps depriving us of one of the most truculent pieces of scholarship existing.

In the Kottler and Markman Concordance all the words of the five poems are listed somehow: those in Appendix I are frankly "Partly Concorded," which means that some of the line numbers are given for their occurrences, although why all are not given is not explained. Of special interest here is the decision not to concord a list of 152 words -- a list strange in that it includes words used thousands of times ("the," "and") and words used only once ("foul") -- including all but fifteen of my forty-five test words. The list of 152 merely gives the numbers of occurrences of each item in the five poems as a group.
Four more pages added to the book's total of 794 could have presented the numbers of occurrences of these words in each of the five poems. But again the presumption on the part of the compilers is that there will be more interest by authorship scholars in the less common words. It is nevertheless an improvement over the Matthew Arnold concordance and older concordances generally, in which the more common words are merely listed without any tally whatever. Moreover, at the University of Pittsburgh, according to Kottler and Markman, retrieval of further information is possible, since their input data is kept stored on magnetic tape. But before that tape demagnetizes, it might be best to get everything of possible use in print.

But to return to the recapitulation. Ellegard's conclusion was that the Junius problem had been solved. Against him it can be maintained that circular reasoning was used to show that Francis' style is, not identical with, but closest to that of Junius. Moreover, the data going into his calculations is suspect because it was not objectively or accurately derived. This last objection also is sufficient to vitiate the claims that a project like Milic's might have to validity.

Yet, if there is still anyone who wants to know who the real Junius was, or whether the supposed author of Shakespeare's other works was the same one who wrote The Two Noble Kinsmen, there may be hope. I believe the hope to lie in the method of profiles.

If I had to try the project again, I would choose five other writers, and work with only half of their works. I would have a computer count and combine allomorphs for the forty-five items plus a few more: an item for all personal pronouns, and an extra form for "to" would find places in the new list. I would then have all of the operations for deriving my first three indices done by computer.
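Having a computer "count and combine allomorphs" amounts to mapping each variant form to its item before counting. The following is a minimal sketch under stated assumptions: the dictionary holds only an illustrative three-item subset loosely modeled on Appendix C, and the names `ALLOMORPHS`, `LOOKUP`, and `item_counts`, the tokenizing pattern, and the sample sentence are hypothetical.

```python
import re
from collections import Counter

# Illustrative subset only; the project combined fourteen such items.
ALLOMORPHS = {
    "is": {"is", "am", "are", "art", "be", "been", "being",
           "was", "wast", "were", "wert"},
    "have": {"have", "had", "hadst", "hath", "has", "hast", "having"},
    "will": {"will", "would", "wouldst"},
}

# Reverse table: every variant form points to its single item.
LOOKUP = {form: item for item, forms in ALLOMORPHS.items() for form in forms}

def item_counts(text):
    """Count occurrences per item, with allomorphs combined."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(LOOKUP[w] for w in words if w in LOOKUP)

counts = item_counts("I was what I am, and I would be what I have been")
print(counts)
```

Ambiguous forms are not separated here (for example, "will" as noun and as verb would be merged), which mirrors the difficulty the author notes with "as," "like," and "to."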
I would consciously use the circular device to find out which items of vocabulary are most usable in obtaining positive results: wherever "hits" were registered, the profiles would be scrutinized for points of correspondence. In this way maybe half of the items could be eliminated as having no bearing on the distinctive patterns of writers. Armed with this stripped-down list, I would turn the computer loose on the other half of my writers' works. I would fully expect the assaying power to be great -- much higher than 50%; but I would demand that it test out every time before I would claim for it any efficacy in proving the authorship of doubtful writings.

For if we insist on having recourse to the techniques of science, we must abide by the strict laws of scientific proof. No chemical test is valid unless it always works under controllably identical conditions. If there be an "essence of Keats" underlying subject matter, theme, word order, thought, emotion, and style, that essence ought to be detectible. And if it is truly his essence, then it will be found in every one of his works of appreciable length.

Given the existence of this "essence," "profile," or "handprint," will it ever be possible to feed into a computer a newly-discovered work and get a positive identification or non-identification within seconds of time? I can answer that easy question, "Absolutely yes."

APPENDIX A
The Texts Used

Code Poet Words in Text Poem Title Division
1 T 976 2 if 985 5 T 936 h T 100h 5 T 1007 6 T 1013 '7 K 1051 8 T 1128 9 T 1142 10 T 1151 11 W 1175 12 W 1269 13 W' 1377 14 T 1395 15 S 1415 16 K 1455 17 K 1478 H3 T 1h79 19 S 1536 20 T 1625 21 W 1623 22 T 1650 5 T 1650 21+ T 165:? 25 T 1653 26 T 16MB 27 5 161,6 8 s 1652 29 S 1632 50 S 1752 *51 I( 1747 32 S 1761 33 S 1831} 34 w 1896 55 S 1922 56 S 1968 37 T 1939 38 T 1991 59 W 1992 LLO K 2055 1:.1 K 2057 1+2 W 2055 15 S 2062 hlv, B 2062 [3-5 B 2075 Two Voi I!
H n H H C88 Hyperion Locksley Hall H H White Doe of Rylstone In Kenoriai Queen The Eve 11 Two Voi Queen H Locksley Hall+60 Years Rab of St. Agnes ces ab White Doe Locksley t 60 fl 1! II N 5rd Third 2nd Lines 5rd Lines 2nd Third 1st Jhird 1st Lines Book III Odd Lines Odd Couplets 1stHflf Canto VI CmfloV Canto IV 7th 206 Lines Part II 2nd Half 1st1kdf 2nd Half Part I Even Couplets Canto I 1stfiflf Odd Lines Even Lines 2nd Half Odd Couplets Queen Kab Part VI " Part III " Part VIII " Part IX Isabella Daemon of the World Part I Queen Nab Part V Michael 1st Half Queen Hab Part VII " Part IV The Princess Part I Oenone Michael 2nd Half The Fall of Hyperion 2nd Half " 1stHflf The Borderers Act IV Lines Written Among the Euganean Hills Pippa Passes Hisht H Iloor1 #9 Words in Code Poet Text Poem Title DlViSlon #6 W 2096 White Doe Canto I 1+7 W 2111'r The Excursion Ist ‘alf, Boo: VIII Lu?» W 2160 " 1st Third, 300-; II (,9 W 217’} ‘.'-."hite Doe Canto VTI 50 W 2193 " Canto III 51 S 2199 Mask of Anarchy 52 W 2219 The Excursion 2nd Half, Book VIII 55 T 2251; Locksley Hall 5b, B 226:9; Pippa Passes I-Iorning 55 1'.’ 2270 The Ez'tcursion 5rd Third, Book II 56 S 2278 liellas Rimed Lines 57 W 2279 The Prelude 2nd Half, Book III 53 W 2295 The Excursion 2nd Third, 3001; II 59 W 2516 " 5rd Fourth, Book IV 60 W 2552 The Prelude 1st Half, ‘dool: III 61 W 255L:. The Excursion 5rd Third, Book III 62 VI 2559 " 1st Fourth, Book IV 65 K 2357 0tho the Great Act II 61;. W 2560 The Excursion 2nd Third, 3001-: III 65 W 2565 " 2nd Fourth, Book IV 66 w 236.6 " 1st Third, [3001: I 67 1'! 
256.9» " 4th Fourth, Book IV 68 W 2575 " 181: Third, Book V 69 K 2575 Lamia Part II 70 S 2536 The Cenci 1st Half, Act V 71 S 2537 Daemon of the World Part II 72 S 2597 The Sensitive Plant 75 W 2&05 The Excursion 1st Third, Book III 713, S 21+L;.9 The Cenci 2nd Half, Act V 75 W 21.52 The Excursion 2nd Third, Book I 76 W 21:56 The Prelude Book XII 77 W 21:65 The Excursion 2nd Third, Book V 78 T 21+71 Pelleas and Ettarre 1st 'Ialf 79 T 21477 Balin and Balan 1st Half 80 W 2485 The Excursion 5rd Third, Book V 81 S 21491 Letter to I-Iaria Gisborne 82 S 2506 Hellas 1813 Third 85 K 2550 0tho the Great. Act IV 8L1. T 2555 Pelleas and Ettarre 2nd Half 85 T 2555 Balin and Balan 2nd Half 86 W 2575 The Excursion 1st Third, Book VII 87 w 2584 " 2nd Third, Book VII 88 W 2585 " 5rd Third, 3001; VII 89 W 2585 " 5rd Third, Book I 90 T 2615 Merlin and Vivian 31; Third 91 T 2627 " 5rd Third 92 S 265i; Hellas 5rd Third 95 K 2655 0tho the Great Act III Words in Code Poet Tex 94 95 ‘96 97 9’7) 99 100 101 102 105 104 105 106 107 108 109 110 111 112 115 114 115 116 117 118 119 120 121 122 3 124 125 126 127 129 129 150 151 152 133 154 135 156 137 158 159 110 1191 142 1 1+3 BLUUJHCUEEEIDU)VH8tDCJB=SHBB+3 m A =:- V WCUUJwtafijfi NLJJHFEll—BUJUJ“ H4 V raqutntaw 2653 2674 2695 2636 2693 2702 2704 2707 2709 2724 2725 2725 2729 2751 2754 2736 2 53 2749 2752 2754- 2760 2767 2775 2778 2905 2310 2712 2817 _ 3) ()3 C)\'] \1 (D (Db-J [‘0 J C) . 3’) J—*O<3\30) ,_ ') (4'3 1 ,— \— JQ -' OG\G\ I 7:) .. ‘ J \O\’)\:) .. 
...J O mruh)RHvr%DJh)merun) O\O \n 2922 Poem Title 51 Division Hellas In Hemoriam herlin and In Kenoriar Guilt and Sorrow In Memoriam The Pope (Ring and Book) 11 Vivian In Memoriam Hyperion Charles the First 11 Husings hear Aquapendente Guilt and Sorrow The Pope In Memoriam The Pope 11 In Kenorian Guilt and Sorrow In'fienoriam The Prelude Guisseppi Capnsacchi (R 6 E) The Princess The Pope Guissppi Caponsacchi H 11 The Cenci The Princess The Excursion 0tho the Great The Excursion ll 11 Gu1iss ejapi Caponsacchi 11 The Lover's Tale 1! H Half Tone (Bin? and Book) Hyperion IIal f Roz": e Tw Voices half Rome The Idiot Half Rome The Lover's Tale ”3 — '7 Joy 2nd 6th 2nd 2nd 2nd th LL 1: I1 2nd 4 VI .0 ‘r #004; 1st 2nd Third 412 Lines Third 412 Lines Half 412 Lines Sijzth Sizflni 412 Lines I half Half 3 Rimes st 5rd 5th 6th 1st 1st 7th Sixth 412 Lines Sijith Sixth 412 Lines Half 412 Lines Book XIII 5th Sixth art III 5rd 6th 2nd 1st Act Sixth Sixth Sixth Si .:4. II Part VII Third, Book VI 1st Act 2nd V Third, 300 k VI 1st Half, Book IX 2nd 4th 5rd 5rd st 2nd 2321(I 2001: 1st 5rd 4.1; h L1. th EIal f, Book Si::t}1 Siszth Fourth Fourth Fourth Fourth II Fourth Fourth 1 Fourth 7“ ‘V 4" £Oarbfl ‘7 1‘. Words in Code Poet Text Poem Title Division 144 K 5056 Lamia Part I 145 T 5060 The Princess Part VI 146 S 5061 Prometheus Unbound 4th 400 Blanks 147 T 5067 The Last Tournament 1st Half 143 S 5071 Prometheus Unbound 5rd 400 Blanks 149 K 5077 Sleep and Poetry 150 T 5114 The Last Tourn'ment 2nd Half 151 S 5151 Prometheus Unbound 1st 400 Blazfizs 152 S 5150 " 2nd 400 Blanks 155 T 5268 Loclzsley Hall Sixty Years After 154 1'] 5521 The Prelude 3001; XIV 155 S 5525 The Cenci Act I 156 S 5554 Queen liab Parts III 8.: VIII 157 S 5555 " Parts II 8: VII 158 T 5415 The E-iarriaf‘e of Gersint 1st Half 159 T 31mg N 2nd Half 60 ‘3.’ 5444 An Evening? 
Wall: 161 W 5464 The Prelude Book III 162 W 5470 " Book II 163 T 3501 The Ping: 164 ‘52" 55273 The Prelude 33001: IV 165 W 5553 The Brothers 166 T 5567 The Princess Part II 167 K 5605 Endymion 1st Half, Book I 161°; II 5622 " 1st Half, 13001: I". 169 B 5637 Pompilia (Ring and Book) 2nd Fourth 170 B 5759 " 5rd Fourth 171 K 5751 Endymion 2nd Half, Book I 172 T 5755 Gareth and Lynette 2nd Third 175 5 576-1 Prometheus Unbound Act III E 3% I77 K 5777 0tho the Great Act I 173 T 5779 Gareth and Lynette 5rd Third 179 K 3798 Endymion 1st Half, Book II 180 B 5801 Pompilia hth Fourth 181 S 5811 The Cenci Act III 182 T 5812 Gareth and Lynette 1st Third 185 B 5820 Pompilia 1st Fourth 184 K 5826 Endymion 2nd. Half, Book III 185 S 5857 Prometheus Unbound Act I 186 T 5868 The Passing of Arthur 18 K 5872 Endymion 2nd Half, Book II 188 S 538 The Cenci Act IV 189 K 573.92 Endymion 2nd Half, 3001*; IV 190 T 35925 Geraint and Enid 1st Half 191 K 5961 Endymion 1st Half, Book III 192 S 5‘6") Adonnis 195 T 4025 Geraint and Enid 2nd Half Words in Code Poet Tex Poem Title Division 194 VI 4245 The Borderers Act III 195 V! 4266 The Prelude Boo: IX 196 'r 4275 The Coming of Arthur 197 s 1.3-5.011. The Triumnh of Life ‘I’ I99 I? 4407 The Princess Part V 200 “I 4455 The Prelude Book X 201 S 4492 Rosalind and Helen 1st Half 202 5 454-7 " 2nd Half 205 ID 4562 The Princess Part IV * ’2‘05 1': 13-5711. The Prelude 3001; v 206 VI 4575 The Borderers Act I 207 ’T 4672 In Hemoriau 2nd Half, Even Lines 203 ID 4690 " 2nd Half, 3 Rimes 209 IT 4744 ” 1st Half, Even Lines 210 'T 4760 " 2nd Half, A Times 211 T ‘4761 " 1st Half, 3 Times 212 V1 4770 The Prelude Book I 215 T? 
4773 In Hemoriam 2nd Hilf, Odd Lines * 215 T 4323 In Hemoriam 1st Half, A Rimes 216 T L545 " 1st Half, Odd Lines 217 VI 4375 The Borderers Act II 2V3 S 4922 Julian and fiaddalo 219 V1 493’ The Prelude Book VIII 220 S 5023 Peter Bell the Third 221 VI 5107 Descriptive Sketches 222 S 5559 Swellfoot the Tyrant 225 S 5592 Alastor 224 {P 5411 In Hemoriam 2nd 824 Lines 225 T 5512 " 1st 8: 7th 412 Lines 226 W 5525 The Prelude 3001: VII 227 VI 5659 " Book VI 228 (P 5727 Guinevere 229 {P 589 Lancelot and Elaine 1st Half 250 T 5919 " 2nd Half 251 I( 6153 The Cap and Bells 252 h! 615 Tour of the Alps 255 IT 6615 Aylmer's Field 234 '1 7175 Peter Bell 255 1? 7229 Enoch Arden 256 CD 7536 The Holy Grail 257 I3 7951 Bishop Blougram's Apology * Text #51, Isabella, suffers from defective computer print: out. It was retained anyway, as a control. * Eight texts were temporarily misplaced when the charts were being made. 54 Q? In the listing on the preceding pages, the column headed Division has arabic numbers when the division is mine; roman numerals indicate the divisions I found in the poems. The original project planned was to make much of comparing within poems to see whether, for example, similarities were greater between the first and second halves of a poem than between its odd and even lines. Such divisions are magnificently easy to accomplish with IBM cards, as are the subsequent combinations of glossaries. The computer, just like anybody else, is able to alphabet- ize fifty vords much more than twice as fast as one hundred. Dividing the poems, therefore, not only gave greater flexi- bility to the project, but it also saved computer time. The poems on this list were, for the most part, key- punched from the following editions: Poems of Robert Browning, ed. Donald Shallcy, Houshton Hifflin, 1956. "u ete Poems pf heats and Shelley, Kodern Library, 86, no do Poems f Tennyson, ed. Jerome 3. Buckley, Xoujhton The Poetical Works 9: Alfred, Lord Tennyspn, Kiss and hnijht, Troy, I. 
Y., 1887.
Poetical Works of Wordsworth, ed. Thomas Hutchinson, Oxford University Press, 1904 (1960).

APPENDIX B
The Thirteen Texts From Which Were Derived the Forty-five Test Words

Text                                Division                                     Number of Words
Endymion                            2nd Half of Part II, 1st Half of Part III    7833
Geraint and Enid                                                                 7950
Bishop Blougram's Apology                                                        7951
The Cenci                           Acts I and V                                 8158
The Prelude                         Books X and XI                               7897
Hellas                                                                           7778
Merlin and Vivian                                                                7927
The Excursion                       2nd Half of Book VIII, Book IX               7995
Otho the Great                      Acts III, IV, V                              8025
The Pope (The Ring and the Book)    2nd Half                                     8191
The Princess                        Parts I, II, III                             8554
In Memoriam                         1st 1256 Lines                               8174
In Memoriam                         2nd 1256 Lines                               8085

Note the discrepancy in word population between the last two items. It often happens that the same number of verses will contain widely different numbers of words. A statistician would consider the word to be the unit, but the poet would more likely think of the single line of verse as the unit. Lines of regular verse, moreover, are more "like" one another than are individual words, which range from the zero-like content of an initial "it" or "there" to the most connotative abstract or concrete terms. The project I finally determined upon ignores both measures of size by touching only non-content words and discarding all others. Herein is some of my justification for not distinguishing text size during the main portion of the project.

APPENDIX C
The Forty-five Test Words with the Allomorphs of Fourteen

... for, from, have, how, if, in, is, it, let, like, may, more, no, not, now, of, on, one, or, shall, so, some, that, the, then, there, these, to, up, upon, what, when ...

Allomorphs
cannot, cannt (can't), could
do, done, dost, doth
had, hadst, hath, has, hast, having
into
am, are, art, be, been, being, is't, 'tis, 'twas, 'twere, was, wast, were, wert
is't, 'tis, 'twas, 'twere
likes
mayst, might, mightst
cannot, cannt
shalt, should, shouldst
this
whom, whose
would, wouldst

APPENDIX E
A Summation of the Chart of the Last Analysis

 1   17/5    3/2    12/3    10/7    9/7    12/7    15/7    8/4    10/2    5/1
 2   19/8    16/4   14/5    12/5    10/4   8/2     11/1    12/2   11/4    13/2
 3   16/7    11/6   [21/0]  16/2    8/3    26/7    10/5    7/3    9/2     8/2
 4   14/6    17/4   16/4    14/6    16/3   13/5    14/3    11/2   21/3    22/5
 5   13/7    8/5    10/3    3/3     17/2   11/7    4/3     7/4    4/2     9/3
 6   13/8    13/9   18/3    13/8    15/7   17/3    7/5     9/3    3/1     6/5
 7   12/8    21/4   11/2    14/4    14/8   10/3    11/3    17/8   6/2     12/6
 8   18/12   12/4   11/3    12/5    14/7   16/5    20/7    19/9   7/3     9/5
 9   11/5    14/5   17/9    20/6    15/4   9/4     18/6    14/5   16/4    26/8
10   11/9    13/9   10/6    20/6    17/6   14/10   19/8    13/4   20/6    14/4

Each of the "improper fractions" above represents the score for a square consisting of 23 x 23 smaller squares. Within most squares there is a possibility of as many as 1,587 indices of correspondence or of 529 indices of non-correspondence. The full chart is 230 by 230 squares or twelve by twelve feet.

The top figure in each square above is the number of criteria pertaining to the more than five hundred pairings of graphs represented by each square, and the lower number indicates the times the criteria were mistaken.
See Table 1, page 42, for a close view of one portion of this chart. Table 1 represents only 1% of the total chart; it is anything but typical of my results.

APPENDIX F
Specimens from Different Stages of the Project

(1) Program cards, the first seven of 115 in the deck of the word count program built by James D. Clark of the Psychology Department, Michigan State University.
(2) Input cards for this project. Each card contains one line of verse, every word verified and, in some instances, respelled according to a convention (see pages 54-5). The punching of the cards was done almost wholly in my home, on a machine graciously rented to me by IBM. Blank cards were purchased from the Computer Center at Michigan State University.
(3) Output sheet, to show how the computer completes its part of the project.
(4) Listing sheet with the 45 test words in various possible forms.
(5) Gathering sheet, with the 45 items totalled.
(6) Broken-line graph, in which the broken line does not really mean anything, but merely takes the eye from point to point (however, the broken line is the profile of a text).
(7) Same graph with the points expanded to 1/4" holes. This graph, and 229 others, were the core of my project; comparing each one with all the others consumed the better part of a full year. The same job done by computer (that is, what turned out to be the effective part of it) would have taken less than two weeks, including all preparation.
(8) One leaf from the 12' x 20' chart containing the indices derived from comparing graphs (see pages 38-9).

[Specimens 1 through 8]
APPENDIX G

The Effectiveness of the Four Criteria

Criterion*   (illegible)**   Ratio   Expectation   Effectiveness
2            222/67          3.3     .2            16.5
3            238/233         1.02    .2            5.1
II           184/168         1.1     .2            5.5
0            247/47          5.25    5.0           1.05

*For an explanation of the symbols 2, 3, II, and 0, see p. 40.

**The figures in this column are reasonably accurate. When I made the count last year the fractions were: 222/6?, 236/233, 189/162, and 250/45.

Since the texts behind the graphs used in this project came from five different poets, every fifth time two graphs were compared the texts behind them were by the same poet. A person guessing blindly the identity of the second poet in each case could expect to be correct one out of every five times (more often with Wordsworth, less often with Keats). Thus the expected ratio is one out of five (1/5), or .2. Yet the positive criteria (2, 3, and II) guessed right five to sixteen times as often as a blind guesser would. Therefore the positive criteria can be said to be five to sixteen times as effective as sheer guessing.

The negative criterion (0) is obviously worthless as an indicator of identity since its effectiveness quotient is practically the same as the quotient for chance guessing.

The positive criterion (2) shows some promise. With certain testing items removed -- I have no notion which ones -- and others added, and with a confining of the testing to texts of more nearly equal size, the effectiveness quotient could no doubt be raised considerably. Sixteen may be high enough for some purposes, as when only two possibilities are present in an authorship dispute. One hundred would be better when more potential authors are in the competition. But when almost anyone could have written a doubtful text even a quotient of infinity would not be a positive identification.

SOURCES CITED

Clark, John Williams. The Authorship of Sir Gawain and the Green Knight, Pearl, Cleanness, Patience, and Erkenwald in the Light of the Vocabulary.
University of Minnesota doctoral dissertation, unpublished, 1941.

Cited from Clark:

Bateson, Hartley, ed. Patience. Second Edition. London, 1913.

Chambers, R. W. "Long Will, Dante, and the Righteous Heathen," Essays and Studies by Members of the English Association, IX (1924).

Chapman, Coolidge Otis. "The Authorship of the Pearl," Publications of the Modern Language Association, XLVII (1932), 346.

------------. A Lexical Concordance of the Middle English Pearl, Cleanness, Patience, and Sir Gawayne and the Grene Knight. A--, fi--, h0--, Jy--, and Z-- only. Cornell University doctoral thesis, 1927.

Gollancz, Sir Israel, ed. Cleanness. Volume I, Introduction, Text, and Notes. London, 1921.

------------, and Mabel Day, edd. Cleanness. Volume II, Glossary. London, 1935.

------------, ed. Patience. London, 1913.

------------, ed. Pearl. Revised edition. London, 1921.

------------, ed. St. Erkenwald. London, 1922.

Horstmann, Carl, ed. Altenglische Legenden (Neue Folge). Heilbronn, 1881.

Oakden, J. P. Alliterative Poetry in Middle English. Volume I, The Dialectal and Metrical Survey. Publications of the University of Manchester, [number illegible] (1930).

------------. ------------. Volume II, A Survey of the Traditions. Publications of the University of Manchester, [number illegible] (1935).

Savage, Henry L., ed. St. Erkenwald. New Haven, 1926.

Serjeantson, Mary S. "The Dialects of the West Midlands in Middle English," Review of English Studies, III (1927), 54, 186, 319.

Ten Brink, Bernhard. History of English Literature. Volume I. Eng. tr., H. M. Kennedy. New York, [year illegible].

Tolkien, J. R. R., and E. V. Gordon, edd. Sir Gawain and the Green Knight. London, 1925.

Trautmann, Moritz. Ueber Verfasser und Entstehungszeit einiger alliterierender Gedichte des Altenglischen. Halle, 1876.

Conley, John. Review of H. L. Savage's The Gawain-Poet: Studies in his Personality and Background. Chapel Hill, 1956. In Speculum, XXXII (1957), 853-61.

Ellegard, Alvar. Who Was Junius? Stockholm, 1962.

------------. A Statistical Method for Determining Authorship. Stockholm, 1962.

Reviews of Ellegard:

"The Statistics of Style," Times Literary Supplement, Jan. 5, 1963, page 1.

Zimmer, George W. Journal of [title and volume illegible] (1963), [pages illegible].

Herdan, Gustav. Type-Token Mathematics: A Textbook of Mathematical Linguistics. 's-Gravenhage, 1960.

Kottler, Barnet, and Alan M. Markman. [remainder of entry illegible]

Miles, Josephine. "Eras in English Poetry," Publications of the Modern Language Association, LXX (1955).

Milic, Louis T. A Quantitative Approach to Jonathan Swift. The Hague, 1967.

------------. Dissertation Abstracts, XXIV ([year illegible]), 3730.

Mosteller, Frederick, and David L. Wallace. Inference and Disputed Authorship: The Federalist. Reading, Mass., 1964.

Taylor, George C. "Montaigne-Shakespeare and the Deadly Parallel," Philological Quarterly, XXII (1943), 330-3.

Yule, George Udny. The Statistical Study of Literary Vocabulary. Cambridge, 1944.

------------. "On Sentence-length as a Statistical Characteristic of Style," Biometrika, XXX (1939), 363-90.
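The arithmetic behind the effectiveness quotients of Appendix G can be checked with a short script. What follows is only a sketch under stated assumptions: the names `effectiveness`, `quotients`, and `APPENDIX_G` are introduced here for illustration and do not come from the thesis, and the numerator 247 for criterion 0 is inferred from the printed ratio of 5.25. Each quotient is taken to be the ratio as printed in the table divided by the expectation; the fractions are carried along only to confirm that they agree with the printed ratios.

```python
# Sketch only: all names are illustrative assumptions, not taken from the thesis.

def effectiveness(printed_ratio, expectation):
    """Effectiveness quotient: the printed ratio divided by the chance ratio."""
    return round(printed_ratio / expectation, 2)

# (criterion, numerator, denominator, ratio as printed, expectation)
# The numerator 247 for criterion 0 is inferred from the printed ratio 5.25.
APPENDIX_G = [
    ("2", 222, 67, 3.3, 0.2),
    ("3", 238, 233, 1.02, 0.2),
    ("II", 184, 168, 1.1, 0.2),
    ("0", 247, 47, 5.25, 5.0),
]

def quotients(rows=APPENDIX_G):
    """Map each criterion symbol to its effectiveness quotient."""
    return {crit: effectiveness(ratio, exp) for crit, _n, _d, ratio, exp in rows}

if __name__ == "__main__":
    for crit, num, den, ratio, exp in APPENDIX_G:
        # Show the fraction's decimal value beside the ratio printed in the table.
        print(f"{crit:>2}  {num}/{den} = {num / den:.3f}  "
              f"printed {ratio}  quotient {effectiveness(ratio, exp)}")
```

Dividing the printed ratios, rather than the unrounded fractions, is a deliberate choice: the table rounds its ratios to one or two places before dividing, so only the printed values reproduce the quotients exactly as tabulated.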