ABSTRACT

BAYLE AND L'AVIS AUX RÉFUGIÉS
A NEW APPROACH

By Richard Vern Wall

The principal aim of this dissertation was to discover, through a computational stylistic analysis, whether or not Pierre Bayle wrote l'Avis aux réfugiés. To this end, thirty-seven computer programs were developed which combined a series of variables and procedures, some of which had previously been tested individually in other attribution research. New elements were indices of French function words and word-patterns (fixed and variable expressions) and analyses of root-and-variation vocabularies found in works by Bayle and Daniel de Larroque--the most frequently mentioned candidates for the attribution of l'Avis--and in the disputed text. In addition, total texts, rather than samplings, were used. The great amount of data to be analyzed made nearly complete automation the most feasible approach.

Because no method to count them systematically has yet been devised, some aspects of style remain in the realm of subjective analysis. Others, however, when structure allows their incidence to be readily identified and counted, have lent themselves to a quantitative, and therefore more objective, analysis than that provided by earlier methods. The advent of the electronic computer has provided the literary scholar a means of processing quantities of information such as a researcher might not previously have assessed during a whole professional career.

In my study, some eight hundred test items were grouped into five major areas represented by the basic computer program which generated their quantitative stylistic measures. These five categories and their corresponding mnemonic program names are:

1. Sentence level measures (SENWOL)
2. Sentence beginnings and endings (STYLBEND)
3. Function word frequencies and usage (FREQFUN)
4. Expression frequencies and usage (EXSOR)
5. Vocabulary analysis (ENROOT)

The quantitative use of each variable was computed from works known to have been written by Bayle and Larroque. Figures thus derived were then compared to like values in the disputed text. The results were conclusive. Thirty-six percent of the postulated stylistic discriminants examined showed an appreciable discriminative ability in the works tested. Approximately eight percent of these variables suggested that Bayle authored the disputed text; less than one percent implied that Larroque wrote it. On the other hand, nearly one fourth of all variables tested revealed statistically significant differences of use by Bayle and the unknown author. Likewise, more than twenty-five percent of the variables separated Larroque from the author of the controversial text. Although procedures employed in my study do not represent a foolproof method for authorship attribution, they are, nonetheless, a viable tool for analysis, a means of obtaining objective internal evidence correlative to that obtained through other approaches.

Earlier qualitative and quantitative findings are evaluated and compared to the stylostatistical methods and results produced by my research. Emphasis was placed on procedures used in attribution studies, particularly those giving substance to the five categories of stylistic indices utilized. Technical sections outline the specific procedures followed and report statistical results of comparative quantitative use of the variables. Statistical test results led to the conclusion that neither Bayle nor Larroque wrote l'Avis aux réfugiés.

BAYLE AND L'AVIS AUX REFUGIES
A NEW APPROACH

by
Richard Vern Wall

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

Department of Romance Languages
1971
© Copyright by RICHARD VERN WALL 1971

TO MY WIFE AND CHILDREN FOR THEIR LOVE AND PATIENCE

ACKNOWLEDGMENTS

Sincere gratitude is extended to all those who contributed to the realization of this project. I would like to thank the Department of Romance Languages for permitting me to develop the subject of this dissertation. I am especially grateful to my committee chairman, Professor Kenneth R. Scholberg, members of my guidance committee, friends and colleagues for their counsel, assistance and encouragement. An especial note of gratitude is acknowledged to the Computer Laboratory of Michigan State University and to Indiana University whose grants-in-aid provided funds for program development and final data processing. To the Computer Institute for Social Science Research and to the Learning Systems Institute of M.S.U., I express my gratitude for computer time and programming support. This project could not have been realized without special concessions in the use of computing facilities by M.S.U. and Indiana-Purdue computer centers. A special debt of gratitude is due Paul Gabriel and Gordon Wakefield whose assistance in computer programming and data analysis was invaluable. Finally, to Ruth Harrod and Drs. Maurice Crane and Helen Lee I express unmeasured thanks.

TABLE OF CONTENTS

LIST OF TABLES

CHAPTER

I. INTRODUCTION: THE NEED FOR AN ATTRIBUTION STUDY OF L'AVIS AUX REFUGIES
II. THE APPROACH
III. PROCEDURES AND PROGRAM DESCRIPTIONS
    A. Section 1 - Data Description
    B. Section 2 - Program Descriptions
        1. SENWOL
        2. SYLAN
        3. STYLBEND
        4. EXSOR
        5. FREQFUN
    C. Section 3 - Miscellaneous Programs
        1. CANAL
        2. EDITING
        3. OUTPUT CONVERSION
        4. FAP
        5. PIP
        6. ENROOT
        7. ANOVAR
IV. TEST RESULTS AND ANALYSES
    A. Sentence-Level Measures
        1. Summary and Conclusions of SENWOL Results
    B. Sentence Beginnings and Endings
        1. Statistical Summary of Sentence Beginning Variables
        2. Conclusions on Sentence Beginnings
        3. Statistical Summary of Sentence Ending Variables
        4. Conclusions on Sentence Endings
    C. Function Word Frequencies and Usage
        1. FREQFUN Individual Word Analysis
            a. Verbal Summary of Individual Function Words
            b. Statistical Summary of Individual Function Words
        2. FREQFUN Grouped Data
            a. Statistical Summary of Grouped Function Words
        3. Function Word Conclusions
    D. Expression Frequencies and Usage
        1. EXSOR Individual Expression Analysis
        2. Verbal Summary of Individual Expressions
        3. Statistical Summary of Individual Expressions
        4. EXSOR Grouped Data
        5. Statistical Summary of Grouped Expressions
        6. Conclusions for EXSOR Data
    E. Vocabulary Analysis
    F. Summary of Test Results
V. SUMMARY AND CONCLUSIONS

LIST OF REFERENCES

APPENDIX A. LIST OF ARTICLES TESTED
APPENDIX B. INPUT DATA - TEXT SAMPLE
APPENDIX C. ANNOTATED PROGRAM LIST
APPENDIX D. SENWOL FLOWCHART
APPENDIX E. SAMPLE SENWOL OUTPUT
APPENDIX F. TABLE 4:6A. AVERAGE NUMBER OF LETTERS AND SYLLABLES PER WORD AND LETTERS PER SYLLABLE, LESS ONE AND TWO LETTER WORDS
APPENDIX G. SYLAN-SUBROUTINE BREAK FLOWCHART
APPENDIX H. STYLBEND FLOWCHART
APPENDIX I. SAMPLE STYLBEND OUTPUT
APPENDIX J. FREQFUN FLOWCHART
APPENDIX K. FUNCTION WORD ABSOLUTE FREQUENCIES
APPENDIX L. TABLE 4:34A. FREQFUN STATISTICAL COMPARISON, X:L1, 88 WORDS
APPENDIX M. TABLE 4:34B. FREQFUN STATISTICAL COMPARISON, P:B1, 31 WORDS
APPENDIX N. TABLE 4:41A. REFINED FUNCTION WORD LIST RESULTS, X = L1, 9 WORDS
APPENDIX O. TABLE 4:41B. REFINED FUNCTION WORD LIST RESULTS, X = B1, 15 WORDS
APPENDIX P. TABLE 4:41C. REFINED FUNCTION WORD LIST RESULTS, X ≠ B1 OR L1, 26 WORDS
APPENDIX Q. FUNCTION WORDS IN GROUPS
APPENDIX R. TABLE 4:41D. FREQFUN - GROUPED DATA, ABSOLUTE FREQUENCIES
APPENDIX S. EXSOR FLOWCHART
APPENDIX T. LIST OF EXSOR EXPRESSIONS
APPENDIX U. EXPRESSIONS IN GROUPS
APPENDIX V. TABLE 4:54A. EXSOR GROUPED DATA, ABSOLUTE FREQUENCIES
APPENDIX W. ENROOT FLOWCHART
APPENDIX X. ENROOT SYNONYM LIST
APPENDIX Y. GLOSSARY OF WORD LISTS FOR ENROOT
APPENDIX Z. ENROOT SAMPLE OUTPUT
APPENDIX AA. PIP FLOWCHART

LIST OF TABLES

TABLE

4:1. Average Number of Words per Sentence in One-Hundred-Sentence Units
4:2. Overview of Absolute Frequencies
4:3. Average Number of Words per Sentence
4:4. Average Number of Syllables per Sentence
4:5. Average Number of Letters per Sentence
4:6. Average Number of Letters per Word, Syllables per Word, and Letters per Syllable
4:6A. Average Number of Letters and Syllables per Word and Letters per Syllable, less one and two Letter Words
4:7. Regrouping of Test Articles
4:8. Average Number of Words, Syllables, and Alphabetic Characters per Sentence; Average Number of Letters per Word, Syllables per Word, and Letters per Syllable
4:9. SENWOL Summary
4:10. Adjectives as Sentence Beginnings and Endings
4:11. Adverbs as Sentence Beginnings and Endings
4:12. Articles as Sentence Beginnings and Endings
4:13. Conjunctions as Sentence Beginnings and Endings
4:14. Exclamations as Sentence Beginnings and Endings
4:15. Gerunds as Sentence Beginnings and Endings
4:16. Infinitives as Sentence Beginnings and Endings
4:17. Interrogatives as Sentence Beginnings and Endings
4:18. Latin Words as Sentence Beginnings and Endings
4:19. Negatives as Sentence Beginnings and Endings
4:20. Nouns as Sentence Beginnings and Endings
4:21. Numbers as Sentence Beginnings and Endings
4:22. Prepositions as Sentence Beginnings and Endings
4:23. Pronouns as Sentence Beginnings and Endings
4:24. Conjugated Verbs as Sentence Beginnings and Endings
4:25. Percentage of Use Summary for Sentence Beginnings
4:26. Percentage of Use Summary for Sentence Endings
4:27. Summary of Sentence Beginnings
4:28. Summary of Sentence Endings
4:29. Function Words, X = L1
4:30. Function Words, X = B1
4:31. Function Words, X ≠ B1
4:32. Function Words, B1 ≠ L1
4:33. Function Words, X ≠ B1 or L1
4:34. Function Words, X ≠ P
4:34A. FREQFUN Statistical Comparison X:L1
4:34B. FREQFUN Statistical Comparison P:B1
4:35. Summary of Significant Function Words
4:36. FREQFUN Individual Words--Statistical Summary
4:37. FREQFUN Grouped Data Results
4:38. Function-Word Groups Which Suggest Bayle Wrote l'Avis aux réfugiés
4:39. Function-Word Groups Which Suggest Larroque Wrote l'Avis aux réfugiés
4:40. Function-Word Groups Which Suggest neither Bayle nor Larroque Wrote l'Avis aux réfugiés
4:41. FREQFUN Grouped Words Probable Author Summary
4:41A. Refined Function Word List Results, X = L1
4:41B. Refined Function Word List Results, X = B1
4:41C. Refined Function Word List Results, X ≠ B1 or L1
4:41D. FREQFUN Grouped Data, Absolute Frequencies
4:42. Expressions, X = L1
4:43. Expressions, X = B1
4:44. Expressions, X ≠ B1
4:45. Expressions, B1 ≠ L1
4:46. Expressions, X ≠ B1 or L1
4:47. Expressions, X ≠ P
4:48. Summary of Significant Expressions
4:49. EXSOR Individual Expressions--Statistical Summary
4:50. EXSOR Grouped Data Results
4:51. Expression Groups which Suggest Bayle as Author of l'Avis aux réfugiés
4:52. Expression Groups which Suggest Larroque as Author of l'Avis aux réfugiés
4:53. Expression Groups which Suggest neither Bayle nor Larroque wrote l'Avis aux réfugiés
4:54. EXSOR Grouped Expressions Probable Author Summary
4:54A. EXSOR Grouped Data, Absolute Frequencies
4:55. Summary of ENROOT Comparative Data
4:56. Vocabulary Richness of Five Test Articles
4:57. Logarithmic Type/Token Projected Values and Unique Roots
4:58. "K" Characteristics of ENROOT Data
4:59. Results of Verbal to Non-Verbal and Content-Word to Function-Word Comparisons
4:60. Summary of Test Results
4:61. Percentage of Discriminants which reject Bayle and Larroque as Probable Authors of l'Avis aux réfugiés

CHAPTER I

INTRODUCTION

THE NEED FOR AN ATTRIBUTION STUDY OF L'AVIS AUX REFUGIES

The reasons for assigning certain pieces of literature of dubious authorship to a particular author's canon are diverse. Gerald Bentley once pointed out that theater managers and publishers might attribute plays to a famous playwright merely for advertising purposes.1 Like plays and other popular non-dramatic literature, polemical pamphlets on religious and political topics may suffer from faulty attribution. Reviewing a letter from M. Arnold in the Nouvelles de la république des lettres, Pierre Bayle examines some weaknesses in the argument for attributing a questionable work to Saint Athanasius. Bayle's arguments against Arnold suggest that he was aware of such publication practices as Bentley mentioned. He writes, for example, that although the title page bore Athanasius' name, it was the copyist who put it there, probably because Athanasius was "plus célèbre et moins odieux" than its real author.2

Even though it is frequently possible for human judgment to discern individual writing styles and to attribute passages, or even complete works, to certain authors, literary history is full of cases of disputed authorship and canonical uncertainties. In the seventeenth and eighteenth centuries, for example, publication of anonymous and pseudonymous polemical works was a common practice. Many of these works passed into obscurity, while others, such as l'Avis aux réfugiés,3 found their way into the canon of celebrated writers, even though their literary value and authorship remained in question.
The fact that so many disputed works remain suggests two possibilities: either scholars have seen no reason to resurrect an old problem (that of attribution), because stylistically or ideologically the work in question offers nothing new to current criticism, or researchers using human judgment and traditional methods have not been able to agree on definitive attribution.

Elisabeth Labrousse points out that the truth about l'Avis aux réfugiés is so elusive that "l'attribution à Bayle de ce pamphlet est une pomme de discorde parmi les spécialistes et elle a fait couler beaucoup d'encre."4 Because qualitatively the style of l'Avis is considered to be "poor," "careless," and "lacking in imaginative power,"5 its stylistic qualities would not seem to be justification for its survival. This being the case, there must be other reasons for its continued importance. The appearance of l'Avis was most opportune. Jurieu's Lettres Pastorales6 had stirred and offended members of both Protestant and Catholic factions; l'Avis aux réfugiés constituted a timely reply. Shortly after the work appeared, it was unobtrusively refuted7 and would likely have been forgotten had not Jurieu, by publicly attributing it to Bayle, revived interest in it, an interest which persists to the present. Therefore, the task remains for modern-day scholars to grapple with the difficulties of definitive attribution of l'Avis aux réfugiés, for if, in reality, Bayle did write l'Avis, it has earned a place in his canon and should be definitively assigned only there. Moreover, since the central theme of l'Avis is strongly political, and since it is the only politically oriented work in his published corpus, proof of his authorship would justify interpretations that Bayle's works may be infused with political overtones or intent. If, however, Bayle did not write l'Avis, this area of future scholarship should be foreclosed and such interpretations of particular works as Walter Rex's treatment of Bayle's article on David8 would have to be judged on inherent evidence alone.

Definitive attribution of l'Avis aux réfugiés is important, then, because 1) it is "one of the few occasional pieces of the period [1685-1715] to survive in French literature,"9 2) if Bayle wrote l'Avis, it has place only in his bibliography, and 3) definite knowledge of authorship will either open or foreclose additional areas of Baylian scholarship.

Studies of l'Avis made since Bayle's death make it clear that scholars of the Bayle canon are concerned about proper attribution of the controversial text, even though their conclusions have differed. Some of the names which have been associated with l'Avis are Pélisson, Larroque, Brueys, Coquelart, Chardon, and Bayle.

Very little is known of Brueys, Coquelart and Chardon,10 whose names have been left out of the most recent studies on l'Avis. Paul Pélisson-Fontanier was a Protestant courtier and French Academician converted to Catholicism, who had become well known for his polemical works directed against his former "co-religionnaires." A personal friend of Pélisson's and a former diplomat residing in England, Marc-Antoine de Crosat de la Bastide, named him as the author in "l'Auteur de l'Avis aux Réfugiés déchiffré," published in 1716, by l'abbé du Revest.11 In 1907, Charles Bastide reported that "on avait vu chez un libraire de Paris le manuscrit de l'Avis de l'écriture de Pélisson . . . ."12

Daniel de Larroque, the gifted son of a famous Protestant theologian, at the height of the dispute, just a few weeks after l'Avis appeared, returned to France and converted to Catholicism.
Larroque had served as Bayle's "secrétaire" and is reputed to have laid claim to l'Avis.13 A young and inexperienced writer, he gained the confidence of the older Bayle and is said to have supplied him with current information on British intellectual life.14

Many of the exiled Protestants held Bayle, a well-known polemicist among the moderates, responsible for the publication of the controversial pamphlet. Bayle was a master of polemical writing. He had gained a reputation as a literary and religious critic while editing Les Nouvelles de la République des Lettres, and for having written Les pensées diverses sur la comète (1682), La Critique Générale de l'Histoire du Calvinisme de M. Maimbourg (1682), and Le Commentaire philosophique sur ces paroles de Jésus-Christ 'Contrains-les d'entrer' (1686).

No one has denied that ideas expressed in l'Avis were concurrently Bayle's. It has even been suggested that he wrote the preface, having received the manuscript from Larroque.15 Even so, Bayle repeatedly denied having written l'Avis. However, some of his closest friends, fearing that authorship of the work would be grounds to exile the philosopher who had gone through a conversion to Catholicism and a relapse to Protestantism, who had been exiled from France and who had seen his Chair of Philosophy taken from him both by Louis XIV and by the Dutch Consistory, saw Bayle's hand in the controversial document.16

In April, 1691 (over a year after l'Avis' publication), the first published accusation was made, with Jurieu's Examen d'un libelle contre la religion, contre l'Etat et contre la Révolution d'Angleterre.... In this pamphlet Jurieu accused Bayle of plotting the overthrow of the Dutch government, stating, then, that Bayle wrote l'Avis as part of a vast "cabale." Thus began "la guerre de pamphlets"17 which saw the ultimate disintegration of the Bayle-Jurieu friendship.

The "philosophe de Rotterdam" sought to defend himself. He offered "de paraître devant les magistrats de Rotterdam pour être [illegible] contradictoirement avec son adversaire,"18 but since the consistory appeared to be taking no action, he published his Cabale Chimérique.19 The entire scandal was so embarrassing to the magistrates, who evidently saw in the "guerre de pamphlets" a personal quarrel rather than a political uprising, that they barred any further dialogue between the two protagonists.20

The first major attribution study of l'Avis after Bayle's death was made by Desmaizeaux as he prepared his Vie de M. Bayle which precedes the 1730 edition of the Dictionnaire historique et critique. Desmaizeaux recognized the similarities between Bayle's political views and those expressed in l'Avis. His conclusion was, however, that his respected friend did not write the pamphlet. Nevertheless, popular consensus caused l'Avis to be placed among Bayle's Oeuvres Diverses in 1737.

In 1878, M. Deschamps, in his book Genèse du scepticisme érudit chez Bayle, refused to recognize Bayle as author of l'Avis. According to Georges Ascoli,21 M. Deschamps's refusal was for sentimental, not factual reasons. Then, in 1906, Jean Delvolvé published his Religion, critique et philosophie positive chez Pierre Bayle, in which he associated ideas expressed in l'Avis with those expressed in earlier works known to be by Bayle. His conclusion was that Bayle wrote the work.22 Nevertheless in 1907, in the Bulletin de la Société d'Histoire du Protestantisme, Charles Bastide took the opposite stand.23 He admitted that the internal evidences, that is, the ideas, were attributable to Bayle, but claimed that the external evidences, word of mouth and correspondence of the period, failed to justify the hypothesis of Bayle's authorship. In 1913, Georges Ascoli produced perhaps the most exhaustive study of the external evidence yet made on l'Avis.24 He took as a guide many of the documents referred to by Bastide, using much greater care and perceptive analysis in correlation of the manuscripts he found. As indicated by Mme Labrousse, his arguments are, however, more ingenious than convincing.25 He conclusively eliminates Pélisson from the list of possible writers and through circumstantial, external evidence concludes that Bayle must have been the author.

The most recent study of l'Avis was made by Elisabeth Labrousse in 1965. In true Baylian fashion, she has provided those interested in Bayle with a wealth of documented information. However, her treatment of l'Avis, rather than producing substantial evidence for its proper attribution, offers only a very plausible explanation of why it was written. With regard to the attribution, she discounts most of Ascoli's arguments and agrees with Jacques Basnage's postulation, found in his letter to Desmaizeaux: "Je n'ai point encore abandonné ma première conjecture: c'est que le manuscrit lui en avait été confié. Il le fit imprimer, y ajouta une préface et quelques traits de sa main, [sic] M. Hartsoeker m'a confirmé dans ma conjecture, parce qu'il m'a assuré que M. Larroque, étant prisonnier à Paris, citait souvent cet ouvrage comme une production qui lui appartenait."26 Nevertheless, Mme Labrousse feels that Bayle is to be held totally responsible for the work, even though he might have received the original manuscript from Larroque.27

With the exception of Delvolvé, all of the studies mentioned dealt with bibliographical and external evidence in their attempts to determine authorship of l'Avis aux réfugiés. One type of evidence remained to be examined: ideas, structure and language in the text itself. This is internal evidence.
Delvolvé, Howard Robinson and, to a certain extent, Mme Labrousse approached the attribution of l'Avis through the text itself by comparing ideas in the pamphlet to those espoused by the philosopher of Rotterdam.

Delvolvé begins his argument concerning attribution of l'Avis by admitting that "Bayle ne s'est jamais avoué l'auteur de cet ouvrage, que l'on ne songea pas tout d'abord à lui attribuer, mais qui a figuré par la suite dans toutes les éditions de ses oeuvres."28 Nevertheless, he goes on to state that the "idée de la domination du pouvoir de l'Etat au-dessus de toutes prétentions religieuses est présentée dans l'Avis sous diverses formes ...."29 One of these forms is the doctrine embraced by ancient pagan moralists which centers on "devoirs absolus et sacrés, revêtus d'un caractère quasi religieux qu'on doit à la patrie."30 Delvolvé sees a direct relationship between this quasi religious view of state supremacy and Bayle's doctrine of toleration. To elucidate his interpretation, he draws a statement from Bayle's Commentaire philosophique31 wherein the philosopher criticizes Catholics whose intolerant attitudes toward the Protestants were not for the good of the public.32 He then points out that reproaches made against the Catholics have a striking similarity to those directed against Protestants by the author of l'Avis aux réfugiés. The link, then, that Delvolvé sees tying together le Commentaire and l'Avis is "le bien public," characterized in the former work by an appeal for toleration and in the latter by rejection of rebellion. Implementation of these two principles by the exiles would fulfill the most ardent desires of both moderates and zealots.
Delvolvé concludes this portion of his argument by suggesting that because orthodox Calvinists (led by Jurieu) exhibited great opposition to the doctrine of tolerance presented in the Commentaire, Bayle wrote l'Avis as "une attaque directe contre des principes et un esprit dont il a maintenant éprouvé l'hostilité."33

The remaining arguments Delvolvé proposes, dealing less with literary parallelisms and more with logical reasoning, may be summed up as follows:

1. The author of l'Avis encouraged the return of exiles to France: Bayle would have preferred living in Paris;34

2. Bayle, having become strongly embittered toward Jurieu, sought to "atteindre au vif Jurieu et l'Orthodoxie, qu'il commençait à haïr ...;"35

3. there was cause to conjecture that the French government "envisagerait peut-être ... un accord avec les plus modérés des réfugiés, qu'on ferait servir à calmer les irritations: en usant à leur égard de clémence, on désorganiserait le parti, rappelant les uns, laissant dehors les éléments les plus dangereux." Such conjectures, according to Delvolvé, prompted Bayle to profit from the situation by raising "au-dessus des intérêts religieux la suprématie de l'intérêt de tolérance ...;"36

4. and finally, Bayle "se défendit sur le point de fait avec une faiblesse qui fut remarquée de tous.--S'il maintient ses dénégations avec fermeté, jamais en revanche il ne désavoue ni ne blâme l'essentiel des doctrines soutenues. Sans doute il blâme ce qui traite à la glorification du roi de France et de la politique française. Mais il approuve et prend à son compte cette idée fondamentale de l'Avis que l'esprit de satire et de rébellion est toujours condamnable ...;"37

therefore, argued Delvolvé, Bayle must have written l'Avis aux réfugiés.

Not so, responded Robinson in Bayle the Sceptic. "Nothing could be more unlike what he had thus far written. It [l'Avis] was wholly taken up with the problem of sovereignty. Its . . . spirit was political and not moral; Bayle had not yet shown any interest in politics."38 Such are some of the positive statements used by Robinson to introduce his argument against Bayle's authorship of l'Avis. He seems to ignore the parallels and logic developed by Delvolvé, or else he feels that his arguments are not in keeping with the spirit of Bayle's corpus. Moreover, doctrines in harmony with Bayle's thought were likewise espoused by several other members of the moderate party. Bayle himself acknowledged that there were in the reformed party

    des bonnes âmes qui sont encore persuadées, malgré les déclamations et les livres de M. Jurieu, qu'il faut aimer ceux qui nous haïssent, prier pour ceux qui nous persécutent, souffrir patiemment pour le nom de Dieu, ne rendre point le mal pour le mal, l'injure pour l'injure, ni écrire des satires.39

Moreover, Eric Haase pointed out that many Huguenots arrived in exile still proclaiming fidelity to both France and Louis XIV,40 that is, to absolutism. Furthermore, Mme Labrousse confirmed that l'Avis was written "par un partisan convaincu de l'absolutisme--il n'en manquait pas ... parmi les ministres les plus pieux."41 Or, in other words, the position of the author of l'Avis was not unique among the exiles.

In any case, Robinson lists several additional disparities between the context of l'Avis and Baylian thought. He dwells upon Bayle's lack of knowledge of the English scene to establish his conclusions.
He points out that Bayle's "previous treatment of things English had been uniformly vague."42 Finding no detailed descriptions of the British political situation anywhere else in the Bayle canon, Robinson naturally assumed that "the elaborate handling of English history and the use of English authorities" were beyond Bayle's immediate command.43 Whereas Delvolvé saw a relationship between political aspects of l'Avis and le Commentaire, Robinson felt that the extreme position taken on royal sovereignty by the author of l'Avis nullified any right to tolerance. In addition, there is not a word about tolerance in the entire volume; the whole spirit of l'Avis is opposed to freedom of thought. Hence, in a very straightforward manner, Robinson sums up his argument against Bayle's authorship of the disputed work:

    Bayle's erring conscience is entirely absent. The argument throughout is strongly Catholic.44 The use of the first three centuries of Christian history is the opposite of that made by Bayle in his Philosophical Commentary. The defense of the killings of the Vaudois is in direct contradiction to the position taken in the reply to Maimbourg. If Bayle could have written France Entirely Catholic45 in 1685, and this work in 1690, he was a veritable chameleon.46

Thus, Robinson states in no uncertain terms the disparities his research revealed.

It now seems appropriate to ask: "Whom shall we believe?" Having examined external evidence, Bastide and Ascoli arrived at different conclusions: Bastide against, Ascoli for attribution to Bayle. Delvolvé saw l'Avis as a logical continuation of Bayle's thought patterns and, therefore, attributed it to him. Robinson, also approaching the text internally, produced evidence diametrically opposed to that which evolved from Delvolvé's reasoning.
Thus it becomes clear that coexistence of ideas has provided inferential, but not conclusive evidence for attribution of l'Avis aux réfugiés, and that use of traditional methods has not produced definitive internal evidence.

Intuitive analysis is much more elusive than the search for ideas or stylistic differences. Nonetheless, some critics47 are endowed with what seems to be an innate ability to sense an author's touch in a work. Most of the early attributions of l'Avis aux réfugiés were based upon intuition. To make accurate impressionistic judgments effectively and consistently demands a union of wide learning and "esthetic perception" brought about by a long and concentrated study of an author's works.48 For the most part, however, the average critic is "all too often . . . confronted with a passage which seems to offer no point of entry."49 If an objective method can be devised for penetrating beneath the surface of a text, the ultimate process of analysis may be considerably shortened. Relating intuition to attribution research, Stephen Wachal states that "intuition can indeed be useful, but . . . in an investigation of this kind [attribution studies] it should be supplemented by the routine examination of as many variables as the limits of practicality permit."50 Pinpointing of dominant stylistic traits should provide the necessary complement for such intuitive research.

An analysis of style is defined here as the sum total of the stylistic elements noted below, and the author's choice, consciously or intuitively, of how to use them. Interest in style investigation has grown rapidly since 1930,51 and penetrating research by such scholars as Spitzer,52 Bally,53 and Bruneau,54 has lengthened the ever growing list of possible stylistic determinants.
Until 1930, the staple categories of this type of evidence were primarily versification and vocabulary.55 Since then imagery, simile, metaphor, rhythm, inflection, grammar, syntax, use of sources, indeed even spelling and punctuation (where it can be established that differences are the author's and not the publisher's choice) have been added. Some of these elements remain in the realm of subjective analysis, simply because no method has yet been devised to count them systematically.56 Others, because their structure is such that their presence can be readily identified and counted, have lent themselves to a more objective type of analysis. In cases involving vocabulary, rhythm, inflection, punctuation and even some aspects of syntax, the electronic computer has been programmed to recognize, sort, count, and report occurrences of the specified items.57 Thereby the scholar engages a highly efficient but brainless clerk to process "information in such quantities as no man's lifetime or energy could previously have contained."58 It is to be noted that the computer scholar's intent is not to have a machine take over the art of literary criticism. Rather, he uses the machine to prepare the way, and, in some cases, find a "point of entry" so that the literary critic, the historian or the investigator of style may get on with his business. Because of its quantitative nature, the automated approach to stylistic analysis has received the names "stylostatistics" or "computational stylistics."

Whatever the approach taken to solve an attribution problem, the literary critic and the literary historian must bring to bear all possible external and internal evidence, using the most exact means at their command. To this end, it is conceivable that the process of subjective judgment could at least partially be formalized or rendered explicit, especially with regard to stylistic identification.
The union of human judgment, modern computer technology and statistical procedures might then provide reliable, replicable information about stylistic features of a particular author.

An author's style--conceived of as constant features or combinations of features in his writing habits, or in his choice of words--when analyzed, may reveal facts that a pseudonym might otherwise have kept hidden. In a well known quotation, Baudelaire wrote: "Pour deviner l'âme d'un poète, ou du moins sa principale préoccupation, cherchons dans ses oeuvres quel est le mot ou quels sont les mots qui s'y représentent avec le plus de fréquence. Le mot traduira l'obsession."59 Baudelaire uses the term "obsession" to describe that element in the soul of a writer which cannot be permanently disguised by an act of the author's will. In recent years a considerable body of evidence supporting the view that "an author's individuality is at least partially inherent in the frequency with which lexical and grammatical elements occur in his texts"60 has been published. Wachal reviews upwards of 150 documents in which possible objective correlates of style were treated and points out that frequency studies of readily discernible stylistic elements have successfully discriminated among authors by objective means.61 Because of the results reported in past studies, the basic hypothesis of a computational stylistic study--that there is, indeed, a relationship between grammatical and lexical frequencies and the paternity of texts--will be taken as proved in the development of this study.

If Bayle did write l'Avis, he took care to disguise the fact by all means available to a man eager to protect his own life and safety. Nonetheless, even in his own time there were those who assumed it was his. Some were likely swayed by hearsay evidence, some by textual elements, such as recurring word or thought patterns.
If one can "fingerprint" a style by discovering those characteristics so inextricable from the writer's personality that he does not even perceive them as pertaining peculiarly to himself and therefore is at no pains to obscure them, it should be possible to determine, through a computational stylistic analysis of l'Avis aux réfugiés, whether or not it was the work of Bayle's pen.

NOTES FOR CHAPTER I

1 Gerald E. Bentley, "Authenticity and Attribution in Jacobean and Caroline Drama," Evidence for Authorship, eds. David Erdman and Ephim Fogel (Ithaca: Cornell University, 1966), p. 180.
2 Pierre Bayle, "Nouvelles de la République des Lettres," juillet, 1685, art. iv., in his Oeuvres diverses, ed. Elisabeth Labrousse, 4 vols. (1727; rpt. Hildesheim: Georg Olms, 1964-1968), I, 327. All quotations from Bayle's Oeuvres diverses are taken from this 4-volume reprint of the 1727 edition. Cited hereafter as OD, I, II, III, IV.
3 Avis important aux Refugiez sur leur prochain retour en France, Donné pour Etrennes à l'un d'eux en 1690. Par Monsieur C.L.A.A. P.D.P. Reprinted in Bayle, OD, II, 579-633. Attributed also to Daniel de Larroque and Paul Pélisson-Fontanier.
4 Elisabeth Labrousse, Pierre Bayle I, Du Pays de Foix à la cité d'Erasme (The Hague: Nijhoff, 1963), p. 219.
5 Howard Robinson, Bayle the Sceptic (New York: Columbia University Press, 1931), p. 122, and Labrousse, Pierre Bayle I, 221.
6 Pierre Jurieu, Lettre [sic] pastorales addressées aux fidèles de France qui gémissent sous la captivité de Babylon [sic] ..., 3e éd. (Rotterdam: A. Acher, 1686-1689).
7 According to Desmaizeaux, Vie de Bayle, éd. Beuchot, prefacing Bayle's Dictionnaire historique et critique (Amsterdam, 1730), pp. 119 A, 124 B, it was refuted by Tronchin du Breuil, Basnage de Beauval and Antoine Coulan. See also Walter Rex, Essays on Pierre Bayle and Religious Controversy, International Archives of the History of Ideas (The Hague: Martinus Nijhoff, 1965), p. 225f.
8 Ibid., pp. 197-255.
9 Ibid., p. 225.
10 Mentioned by Georges Ascoli, "Bayle et l'Avis aux réfugiés," Revue d'Histoire littéraire de la France, XX (1913), 521.
11 Robinson, p. 529.
12 Charles Bastide, "Bayle est-il l'Auteur de l'Avis aux réfugiés?," Bulletin de la Société d'Histoire du Protestantisme, LVI (1907), 550.
13 Ascoli, p. 532.
14 Labrousse, Pierre Bayle I, p. 219.
15 Jean Delvolvé, Religion, critique et philosophie positive chez Pierre Bayle (Paris, 1906), p. 192.
16 Ascoli, pp. 539-544.
17 Labrousse, Pierre Bayle I, 226.
18 Bastide, p. 549.
19 OD, II, 637-685.
20 Bastide, p. 549.
21 Ascoli, p. 522.
22 Delvolvé, p. 191.
23 Bastide, pp. 544-558.
24 Ascoli, p. 522.
25 Labrousse, Pierre Bayle I, 220.
26 Lettre de Basnage à Desmaizeaux, du 19 avril 1707, cited in Delvolvé, p. 192f.
27 Labrousse, Pierre Bayle I, 221.
28 Delvolvé, p. 178.
29 Ibid., p. 183.
30 Ibid., pp. 183-184.
31 Commentaire philosophique sur ces paroles de Jésus-Christ: "Contrains-les d'entrer," OD, II, 355-496.
32 "Toute secte qui s'en prend aux lois des sociétés, et qui rompt les liens de la sûreté publique en excitant des séditions et en prêchant le vol, le meurtre, la haine, le parjure, mérite d'être exterminée par le glaive du magistrat." Ibid., p. 412. Cited in Delvolvé, pp. 184-185.
33 Delvolvé, p. 188.
34 Delvolvé suggests that Bayle would have known greater freedom from "la fournaise théologique du Refuge" and would have been able to "jouir en paix du libre commerce des esprits distingués et polis dont Paris était la patrie." Religion, p. 188. He seems to overlook the fact that Bayle knew greater freedom of the press in Holland than he could have possibly known in France. See also Edmond Lacoste, Bayle, nouvelliste et critique littéraire (Paris: Picart, 1929), p. 63f.
35 Labrousse, Pierre Bayle I, p. 189.
36 Ibid.
37 Ibid., p. 190.
38 Robinson, p. 120.
Robinson seems to have overlooked Bayle's interest in political affairs mentioned by Desmaizeaux, Vie de Bayle, Dictionnaire, I, xxi; referring to Bayle's desire to keep politics out of les Nouvelles de la République des lettres, or to the fact that politics are not the central issue of any of his other works. Politics--la politique, les politiques--account for only 24 entries in the "Table de matières" concordance to Bayle's complete works.
39 Entretiens sur la cabale chimérique, OD, II, 625.
40 Erich Haase, Einführung in die Literatur des Refuge. Der Beitrag der französischen Protestanten zur Entwicklung analytischer Denkformen am Ende des 17. Jahrhunderts (Berlin: Duncker & Humblot, 1959), pp. 275-276.
41 Labrousse, Pierre Bayle I, 221.
42 Robinson, p. 120.
43 Ibid.
44 Notwithstanding the fact that Scriptures quoted were drawn from the Protestant version of the Bible and an extremely well informed awareness of Protestant history was exhibited by the author. It seems only logical that when writing to Protestants an author should use terms--in this case Scriptures--with which Protestants are the most acquainted. Even though each study made of l'Avis which attributes the work to Bayle has used this argument to substantiate its position, I fail to see its import.
45 Ce que c'est que la France toute catholique sous le règne de Louis le Grand, OD, II, 336-354.
46 Robinson, pp. 120-121.
47 Such as Leo Spitzer, A Method of Interpreting Literature (Northampton, Mass.: Smith College, 1949), and Jean Starobinski, L'oeil vivant (Paris: Gallimard, 1961).
48 Richard Altick, The Art of Literary Research (New York: Norton, 1963), p. 72, likens the literary scholar in an attribution study to an art expert called in to authenticate a museum's new acquisition. After all other tests have proven inconclusive, "he must finally rely upon his knowledge of the way the artist customarily worked.
When a specialist who has spent years of his life reading and re-reading his author declares that a disputed piece is genuine, his intuitive expertness must be given respectful attention. Yet no such verdict can ever be regarded as final."
49 R. A. Sayce, Style in French Prose (Oxford: Oxford Univ. Press, 1953), p. 2.
50 Stephen Wachal, "Linguistic Evidence, Statistical Inference, and Disputed Authorship," Diss. Wisconsin 1966, p. 315.
51 Helmut A. Hatzfeld, A Critical Bibliography of the New Stylistics Applied to the Romance Literatures, 1900-1952 (Chapel Hill: Univ. of North Carolina Press, 1953), p. 1v.
52 Leo Spitzer, Stilstudien, 2 vols. (Munich: Huber, 1928); Linguistics and Literary History (Princeton, 1948); A Method of Interpreting Literature (Northampton, Mass., 1949); and "Stylistique et critique littéraire," in Critique, XI (1955), 595-609.
53 C. Bally, Traité de stylistique française, 2e éd., 2 vols. (Heidelberg: Winter, 1919).
54 C. Bruneau, "La Stylistique," Romance Philology, V (1951), 1-14.
55 Altick, p. 70.
56 Such as imagery, simile and metaphor. However, as a scholar reviews a questionable work and associates these (subjective) stylistic elements with those extant in other works by various authors, a counting and sorting process takes place subconsciously. This process does not differ greatly from that employed by the electronic computer, or, as Norman H. Holland so aptly put it: "The computer is the first agency outside the human mind to process symbolic data." "Futures: A Non-Summary of the EDUCOM Symposium on the Computer and Humanistic Studies," Computers and the Humanities, II (Nov. 1967), S9 (Abbreviated hereafter as CHum).
57 Akin to computer-aided stylistic and literary analysis is the field of computational linguistics which, as its name implies, makes extensive use of the computer's clerical abilities. Louis T.
Milic very aptly illustrates the difference between the fields of literature and linguistics vis-à-vis the study of language in his article, "Winged Words: Varieties of Computer Application to Literature," CHum, II (Sept. 1967), 24.
58 Ibid., p. 25.
59 Cited by Robert T. Cargo, A Concordance to Baudelaire's 'Les Fleurs du mal' (Chapel Hill: Univ. of North Carolina Press, 1963), p. x1, and in The French Review, XXXIX, No. 5 (April, 1966), 807.
60 Wachal, "Linguistic Evidence," p. 2.
61 Ibid.

CHAPTER II

THE APPROACH

Several hundred reports of research seeking possible objective correlates of style are scattered through journals and monographs representing such fields as literature, linguistics, biblical studies, journalism, education, statistics, and psychology. Research into that body of literature determined the selection of areas of computational stylistics treated in this project. Works referred to in this chapter, while representative of a larger body of scholarship, are limited to studies concerned with quantifiable indices of style that can be automated, and which are, therefore, directly applicable to the development of procedures used in this attribution study.

In his study of authorship attribution procedures, Wachal reviews 150 previous studies treating objective stylistic correlates. He then groups the problems, procedures, and direction taken by past research into three models: consistency, population, and resemblance. The "consistency model" involves the examination of a work attributed to an author for subjective reasons or on the basis of strong external evidence. After index values for the test works are established, they are compared to see if disputed text values fall in or near the range of those for the known texts.1 The "population model" involves an examination of samples of a substantial amount of material by different authors, evaluated in a complex probabilistic framework.
Finally, the "resemblance model" entails an examination only of works by likely candidates, excluding all other possible authors. As with the consistency model, attribution is made if index values for one of the hypothesized writers resemble those of the disputed text more closely than the others. This approach is especially effective if the possible-author field has been narrowed to two.2

An attribution study of l'Avis aux réfugiés is suitable to two of these three models, consistency and resemblance. Consequently, the "either-or hypothesis" used in the statistical analysis reported in Chapter IV (pp. 93-94) of this study is based upon the assumptions governing these two models. Past attribution studies of l'Avis aux réfugiés have been made by "strong subjective reasons or on the basis of external evidence." As a result of past research, the field of probable authors has been narrowed to two: Bayle and Larroque, and l'Avis remains in the bibliography of both writers. Since the consistency model is "primarily useful for rejecting works from membership in a canon,"3 and because the major purpose of this study is attribution of l'Avis, which can be done only by eliminating one of the two probable authors, reputed indices from the consistency model provide a valid point of departure for my research. By the same token, since works of unquestionable authenticity--of the same genre, from the same time, and concerning like subjects--are available for the two hypothesized authors, the resemblance model also validates the approach taken in this thesis. In the resemblance model, a set of putative stylistic discriminants drawn from these known works is compared to like indices obtained from the anonymous work. A close resemblance between stylistic elements in the disputed text and those in the known works strongly implies mutual authorship.
Conversely, lack of similarity to either or both known works provides a basis for rejection of the hypothesis that either writer is the author of the disputed text.

Concepts established by works in both the consistency and resemblance models suggested five broad categories of putative stylistic discriminants as being most applicable to this study. These five categories consist of: 1) sentence-level measures--number of words, letters, and syllables per sentence, number of letters and syllables per word, and number of letters per syllable; 2) sentence beginnings and endings--classified as to part of speech; 3) function word use--analyzed both individually and in grammatical groups; 4) fixed and variable phraseology--analyzed both individually and in grammatical groups; and 5) vocabulary analysis--examination of word roots and their variations.

Selection of variables for each of the five categories was determined by their suitability for automation--recognizable and countable incidence; their effectiveness as discriminants as demonstrated in previous studies; their grammatical and syntactical combinations which are likely to identify an author; and their probable use as distinguishable elements which an author could not consciously disguise.

George Udny Yule, a pioneer in the field of computational stylistics, proposed the sentence as an indicator of an author's style in an article published in Biometrika in 1936.4 More precisely, he sought to show that the variation of sentence-lengths about their average is a constant characteristic of an author's writing habits. The first few pages of his article treat the problem of punctuation. He quotes Ronald B. McKerrow5 who had argued that much, if not most, of the punctuation of sixteenth- and seventeenth-century manuscripts was handled by the compositor, not by the author of a work.
After noting that different versions of the texts with which he was working were punctuated differently, Yule concludes that where "punctuation, even as regards full stops, is largely the work of the compositor, there need be no hesitation in overriding them if necessary: indeed, the use of personal judgment seems unavoidable."6 Thus, in preparing his data he revised the punctuation as he saw fit. By doing so he disregarded a principle innate to objective analysis: the objectivity of the evidence provided is only as good as the means used to obtain it. If computational stylistics is to introduce objectivity into literary analysis, then Yule's theory of punctuation revision must be rejected, or at best accepted with reservation, for the following reasons:

(1) Even though it is generally admitted that many authors exercised little care about punctuation in the seventeenth and eighteenth centuries, no definitive study has been made distinguishing those who did from those who did not. A later edition of a given work might well contain corrections suggested or entered by the author.

(2) If a modern critic manipulates punctuation by replacing colons, for example, now with a comma, then with a period, he has increased the size of the "possible-author" field;7 assumed that rules (written or common) governing punctuation and sentence composition in earlier periods of writing had already undergone our modern refinement (for he is, in a sense, modernizing the syntax);8 or he has determined that the compositor was not versed in the grammatical usage of his time and punctuated at random.

(3) Attribution based on frequency distribution of sentence lengths where the investigator defines the sentence parameters becomes a somewhat personal venture, dependent upon the knowledge and sensitivity of the scholar.

Yule was not unaware of the high degree of subjectivity these procedures brought to his study.
In fact, he assented to the possibility that his study might serve "only as an exploratory piece of work," while hoping it would "still retain interest and value."9 It has done both. Yet, as he exposed additional difficulties he had encountered, he continued to combine objective and subjective measures. How should hyphenated words, numbers, the ampersand, and quotations be treated? His decisions on hyphenated words, numbers, and the ampersand were practicable and objective, whereas treatment of quotations posed a greater problem. He considered an author who incorporates brief quotations into the grammatical context of his own sentences to be merely substituting someone else's words. These quotations Yule considered part of the sentence itself. On the other hand, he felt that a complete sentence quoted by an author represents someone else's writing and should be excluded from the analysis. He soon recognized that this seemingly clear distinction became blurred in practice. Rather than make an individual decision with the appearance of each quotation--especially very long ones--he simply left out of his samples "all pages on which this source of trouble was serious."10 In spite of its shortcomings, Yule's work continues to be cited by scholars and students of computational stylistics.

Word, letter and syllable counts, as they apply to sentence length, vocabulary analysis, and rhythm in both prose and poetic writings have demonstrated discriminating ability.11 When Wachal prepared his data for syllable analysis in English, he determined syllable breaks "simply by reading the text aloud and estimating the number of stresses perceived."12 Since vocalization and perception of verbal stresses may vary from reader to reader, a more explicit means of counting and comparing letters and syllables was developed for the present study. Basic rules for syllabification, described in Chapter III (p.
49), were programmed into the computer, and accurate replicable results were obtained.

At Columbia University, Louis T. Milic used the IBM 1620 computer as a tool in his attempt to objectify certain traits of Swift's style. In his chapter headed "Connection," he refers to such neutral connectives as "and," "but," and "for," especially as sentence beginnings where they function as transition or reference "fillers." Although Milic accepted and examined them as possible discriminators of style, he feels this set of variables represents elements of style which the author can consciously control.13 This might be true. Upon further consideration of sentence beginnings as a putative index of an author's style, it was felt that sentence first words might indeed fall into the category of "conscious ordering" of stylistic elements, for it seems logical that, as a writer introduces his thoughts, he is much more aware of how he begins a given sentence than of how he closes it. Moreover, if he consciously observes this element of style in his own writing, it follows that he is likely to be aware of similar or divergent habits among other writers. Having such an awareness, an experienced writer might very well see diversity of sentence beginnings as a means of concealing his identity for any number of reasons. On the other hand, an inexperienced writer, as he develops his own style, might seek to imitate patterns of sentence beginnings he recognizes in celebrated writers.
In any case, if a writer, famous or unknown, is more aware of sentence beginnings than of sentence endings, it follows that the latter are more likely to be unconscious acts and, therefore, "are more likely to reveal something that the writer might deliberately wish to conceal."14

To an uninstructed layman, one of the most impressive statistically oriented attribution studies was made on the authorship of the disputed Federalist papers by Frederick Mosteller and David Wallace, two professional statisticians.15 Like many of their predecessors, they recognized human limitations in precision, accuracy, and objectivity. As a graduate student in 1941, Mosteller became interested in the Federalist authorship problem. In that pre-computer setting, he, Frederick Williams, and their wives, inspired by Yule's research, set out to count sentence-lengths in the known essays of Hamilton and Madison, the two contenders for authorship of the disputed Federalist papers. It wasn't long before they discovered "an important empirical principle: people can't count, at least not very high."16 Finally, after eliminating their errors and tabulating their calculations, they discovered, to their dismay, that Hamilton's average sentence-length was identical to Madison's. Other expected discriminants proved equally inconclusive and they abandoned the project. It lay dormant until 1962, when Mosteller and David Wallace revived and pursued it with statistical methods and the electronic computer, because "standard methods of historical research [had] not firmly settled this authorship problem."17 That the Federalist dispute provided them with "a case study that would give [them] an opportunity to compare the more usual methods of discrimination"18 using complex statistics was of much more importance to Mosteller and Wallace than the literary-historical conclusions they hoped to reach.
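The sentence-length tallies that Mosteller and his colleagues made by hand can be illustrated with a brief sketch. The Python fragment below is not a reconstruction of any program used in the studies cited; the function names, the sample sentences, and the rule that a sentence ends at a period, exclamation point, or question mark are all assumptions made for illustration. It shows how two texts may share an average sentence-length, as Hamilton and Madison did, while differing in the variation about that average, which is the quantity Yule proposed.

```python
import re
from statistics import mean, pstdev


def sentence_lengths(text):
    """Return the number of words in each sentence of `text`."""
    # Assumption: a sentence ends at '.', '!' or '?'; no editorial repunctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def summarize(text):
    """Mean and (population) standard deviation of sentence length in words."""
    lengths = sentence_lengths(text)
    return {"mean": mean(lengths), "stdev": pstdev(lengths)}


# Hypothetical samples: both average four words per sentence.
even_text = "The king has spoken. The people have answered. The matter is closed."
uneven_text = "Listen. The king has spoken. The people have answered him without delay."

even = summarize(even_text)      # lengths 4, 4, 4
uneven = summarize(uneven_text)  # lengths 1, 4, 7
```

Here the averages are equal, yet the spread is zero in the first sample and well above zero in the second, so the average alone would fail to separate the two writers.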
The two statisticians concentrated their study on function words: prepositions, conjunctions, and articles. Even though certain parts of speech, e.g., personal pronouns and auxiliary verbs, score highly as function words, Mosteller and Wallace considered them to be potentially dangerous because they "are likely to be related to external details, and inference from them is difficult."19 From Miller, Newman, and Friedman20 they obtained a list of 363 filler-type words. Even though some of the words from this list were not relevant to the Federalist period, they proved to be very useful, primarily because they were objective with respect to the Federalist problem, and because they relieved the investigators of "a large onus of choice."21 To this "unselected" list they added 28 words drawn from their 3000 word samples of 11 Federalist papers by screening out low-frequency words.22 Finally, they constructed an index of 18 Hamilton Federalists and 19 Madison papers external to the Federalist, compiled an additional register of 240 possible discriminators, discarded those they regarded as contextual, and compared the 103 remaining ones to the 98 previously chosen. The "new" terms were then added to the previous lists, raising the total number of possible discriminators to 165. Then came the task of scrutinizing the lists to rid them of terms that might even suggest contextuality. Sifting and culling, they arrived at a final list of 30 words upon which they based their study.23

A central theme pervades the non-technical sections of Mosteller and Wallace's study and is brought into focus as they summarize authorship and discrimination problems: "The function words of the language are a fertile source of discriminators. . . . Context is a source of risk. We need variables that depend on authors and nothing else. Some function words come close to this ideal, but most other words do not."
They go on to say that "narrow and specialized variables may be of more use than global and meaningful ones."24 Mosteller and Wallace strongly affirm a distinction between the scholar and the critic (the former seeks answers, the latter analyzes them). But not all computer-oriented literary researchers are so definitive in their discrimination. Stephen Parrish,25 Charles Muller,26 and Louis Milic,27 for example, attempt to remove the barrier between "measurement" and "judgment."28

Assuming that the "uniqueness of an author shows consistently in his style, regardless of the subject-matter or the conventions of the medium of the period,"29 and that an author may more readily change words than grammatical structure, Milic concentrated on syntactic, rather than vocabulary patterns. In making this assumption he considered three basic points: "(1) that style reflects personality; (2) that this is an unconscious process; and (3) that in mature writers the process is consistent."30 A careful reading of Swift's works revealed his tendency to use certain word groupings, which Milic discusses in his chapter "Seriation." He calls the making of lists, catalogues, series, or accumulations "the organizing principle of Swift's thought."31

Milic chose random samples from four writers (Addison, Johnson, Gibbon, and Macaulay) to serve as controls against which he would test samples of Swift's writing. All words were analyzed and "manually" assigned a word-class, consisting of a two-digit number. Each sample text was thus reduced to a series of significant numbers which were then punched onto IBM cards and fed into the computer. Having been programmed to identify the encoded word-classes, to tabulate, correlate, and print their frequency distributions along with a miscellany of comparative tables, the computer provided the objective information which allowed Milic to create a graphic profile of Swift's style. Finally, he compared ‘2225332 and attributed the work to Swift.
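Milic's reduction of a text to a series of word-class numbers lends itself to a brief illustration. In the Python sketch below, the two-digit codes and the meanings given to them are hypothetical and do not reproduce Milic's scheme; the function name is likewise an assumption. The sketch shows only the tabulating step, in which a coded text becomes a relative frequency distribution that can be set beside the distribution obtained from another sample.

```python
from collections import Counter


def class_profile(codes):
    """Relative frequency of each word-class code in a coded text."""
    counts = Counter(codes)
    total = len(codes)
    return {code: counts[code] / total for code in sorted(counts)}


# Hypothetical coding, for illustration only:
# 01 noun, 02 verb, 31 determiner, 41 conjunction, 51 preposition.
coded_sample = ["31", "01", "02", "51", "31", "01", "41", "31", "01", "02"]

profile = class_profile(coded_sample)
```

In this invented sample, determiners and nouns each account for three tenths of the coded words. A profile of this kind, computed for every sample, is what permits one writer's habits to be compared with another's.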
In his attempt to solve the problem of authorship of the Junius Letters, Alvar Ellegard first searched the Oxford and Cambridge matriculation lists, hoping to find a signature which resembled the Junian hand. Unsuccessful in his endeavors and dissatisfied with the historical, biographical, and inconsistent linguistic evidence,33 he turned to the statistical tests devised by Yule in hopes of finding a solution to the problem through a systematic study of the Letters' language or style. Finding that Yule's tests were not sensitive enough to treat his small samples34 (many of the Junius letters are under 2000 words in length), he devised his own system of stylistic tests. Ellegard made two assumptions in formulating his theory: first, that certain verbal patterns in a particular author's writing habits remain reasonably constant; second, that some aspects of his style are distinct enough to set him apart from his contemporaries. He then defined style in the context of his research as being synonymous with "constant features or combinations of features in an author's way of writing,"35 a definition which makes unnecessary any value judgment of "good" or "bad" style. The characteristic of style he chose to investigate was an author's use of "typical" words and turns of expression.36

Ellegard picked out words used more frequently by the writer in question than by his contemporaries. These he labeled "plus-words"; those used less frequently he labeled "minus-words." Phrases or expressions were counted only when a plus- or minus-word was part of their composition. For each of these terms he calculated a "distinctiveness ratio"--a ratio of frequency of occurrence in Junius to that in general usage37 as determined by his control-group samplings.38 Ellegard felt that those words with a distinctiveness ratio greater than 1.5 or less than 0.7 distinguished Junius clearly enough from his contemporaries to compare them with the writings of Francis, the suspected author.
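The arithmetic of the distinctiveness ratio can be shown in a few lines. The following Python sketch is an illustration under stated assumptions, not Ellegard's own procedure: the function names and the sample counts are hypothetical, and only the cut-off values of 1.5 and 0.7 come from the account above. The ratio is taken as the relative frequency of a term in the author's text divided by its relative frequency in the control samples.

```python
def distinctiveness_ratio(count_author, size_author, count_control, size_control):
    """Relative frequency in the author's text over that in the control sample."""
    if count_control == 0:
        raise ValueError("term absent from control sample; ratio undefined")
    # Cross-multiplied form of (count_author/size_author) / (count_control/size_control).
    return (count_author * size_control) / (count_control * size_author)


def classify(ratio, plus_cutoff=1.5, minus_cutoff=0.7):
    """Label a term by the cut-offs reported for Ellegard's study."""
    if ratio > plus_cutoff:
        return "plus"
    if ratio < minus_cutoff:
        return "minus"
    return "neutral"


# Hypothetical counts: occurrences of a term in 100,000 words of each sample.
favored = distinctiveness_ratio(30, 100_000, 10, 100_000)   # 3.0
avoided = distinctiveness_ratio(5, 100_000, 10, 100_000)    # 0.5
ordinary = distinctiveness_ratio(10, 100_000, 10, 100_000)  # 1.0
```

Only terms labeled "plus" or "minus" would be retained for comparison with the suspected author; a term whose ratio lies between the two cut-offs tells nothing about authorship under this scheme.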
Ellegard's procedures have received both criticism and praise. It has been said that his works contain "errors of exposition and fact," that he used such small samples in his research that "the tests were not sensitive enough to be wholly reliable,"39 that he failed in not using the electronic computer for "sifting and culling," and that he presented "forty-five pages of perfectly worked-out tables . . . based on shifty data."40 On the other hand, these same critics acclaim the work as a "remarkable achievement,"41 and commend its author for having "opened the way to further work on the statistics of style."42 The reviewer for the Times Literary Supplement best summarized the procedures Ellegard might have followed in making a more convincing study as he wrote: "If only Mr. Ellegard had [sic] more use of his electronic computer and had examined more stylistic features! The introduction of more variables might have reduced the 30043 figure, and convinced us completely that Francis was indeed Junius."44

A combination of content and function words characterized the study of Milton's influence on Shelley conducted by Professor Raben of Queen's College.45 He sought to show how often, in any sentence, Shelley used Milton's actual words. To Professor Raben's surprise, he found that the number was much greater than he had estimated. An observance of function words alone would certainly have been inappropriate in such a study.

In addition to his research on sentence-level measures, Yule also developed an approach to quantitative vocabulary analysis46 in which he compares ratios of words used by one author to those used by others. More specifically, he concentrated on the common noun and its distribution. For all nouns of a given text he computed frequencies of occurrence and constructed a measure of their incidence that he called the "characteristic."
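In its commonly cited form, Yule's characteristic is computed as K = 10,000 (S2 - S1) / S1^2, where S1 is the total number of noun occurrences in the text and S2 is the sum of the squared frequencies of the individual nouns. The Python sketch below follows that form; the function name and the sample word lists are assumptions, and the nouns are presumed to have been picked out of the text beforehand.

```python
from collections import Counter


def yule_k(nouns):
    """Yule's characteristic K for a list of noun occurrences."""
    freq = Counter(nouns)                     # occurrences of each distinct noun
    s1 = sum(freq.values())                   # S1: total noun occurrences
    s2 = sum(f * f for f in freq.values())    # S2: sum of squared frequencies
    return 10_000 * (s2 - s1) / (s1 * s1)


# Hypothetical samples of four nouns each.
varied = yule_k(["roi", "peuple", "loi", "glaive"])   # no noun repeated
repetitive = yule_k(["roi", "roi", "roi", "roi"])     # one noun throughout
```

A text in which no noun recurs yields a value of zero, and the value rises as the same nouns are used again and again.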
This "K" value is expressed as an integer and represents the repetitiveness of the vocabulary of the work being tested. Subjectivity and precise mathematical functions typify both of Yule's studies, which are now acclaimed as precursors of current computer-aided literary research.
It was not until 1957, when Paul E. Bennett tested two of Shakespeare's plays for homogeneity of authorship, that Yule's method was again put to use. Bennett felt that fear of the "statistical theory behind Yule's measure" accounted for the literary scholars' hesitation to utilize his approach, but immediately added that "the measure itself is quite simple to use; it can be used in much the same fashion that one uses a calculating machine, without bothering one's head about the mechanics or the theory of the machine."47 Bennett then briefly described his method, which differed only in part from that of his guide. Whereas Yule had used a method of sampling, Bennett used the entire vocabulary of the two plays he tested. He hand-counted every common noun and checked his results against a well-established concordance. Finally, he presented in tabular form the results of his research and concluded that "in regard to the aspect of style which we have measured objectively in the two plays, Shakespeare is very similar to himself."48 Bennett was very careful to note that Yule's K characteristic measures only one aspect of style and that "The real desideratum is to develop objective measures of several different significant aspects of style; authorship might then confidently be ascribed when two or three or four of these measures were in substantial agreement."49
Additional research in the area of vocabulary analysis was conducted by John N. Pappas.
The Institute for Computer Research in the Humanities (ICRH) Newsletter of December, 1966, introduced Pappas's study of a disputed eighteenth-century French text attributed to both Mlle de Lespinasse and Mme Suard.50 The procedures described by Pappas constitute a twofold method of stylistic determination and give rise to one of the very few computerized approaches to qualitative stylistics. He first dealt with sentences, paragraphs, and punctuation. Then he measured "the mean and standard deviation of each type of punctuation per sentence and per paragraph."51 Pappas next turned to the more standard "frequency count" of "the central word list, eliminating the 10% on top of the most frequently used words and those which occur so infrequently as to make their measure invalid."52 While preparing his data, he encountered a problem not foreign to all early computer literary scholars: the treatment of accents. In order to distinguish between à and a, où and ou, for example, he found it necessary to use a "key-word in context" concordance program to print, in context, all words of possible double meaning.
Keypunching special characters into a given text to represent all diacritical markings is at best a very tedious, time-consuming and, in certain instances, unnecessary task. Assuming, for example, that we use the asterisk (*), the slash (/), the percent sign (%), the dollar sign ($) and the plus (+) sign to represent diacritical marks, an encoding of the line "... on sçait combien peu vous êtes scrupuleux à détrôner les Rois, aiant même trouvé les moiens après cela ..." might look like this for proofreading:
ON SC$AIT COMBIEN PEU VOUS E%TES SCRUPULEUX A* DE/TRO%NER LES ROIS, AIANT ME%ME TROUVE/ LES MOIENS APRE*S CELA ...
The presence of the special characters, rather than aiding the reader, deters him. Only the "*" and, perhaps, the "/" in trouvé would be necessary to establish meaning in a lexicographical index of this sentence.
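The encoding illustrated in the sample line can be sketched in Python as follows. The pairing of symbols with accents is inferred from that sample ("$" after a cedilla, "%" after a circumflex, "*" after a grave, "/" after an acute); the use of "+" for the diaeresis is an assumption, and the function name is hypothetical.

```python
import unicodedata

# Combining mark -> keypunch symbol, as inferred from the sample line.
MARKS = {
    "\u0300": "*",  # grave accent
    "\u0301": "/",  # acute accent
    "\u0302": "%",  # circumflex
    "\u0327": "$",  # cedilla
    "\u0308": "+",  # diaeresis (assumed; not attested in the sample)
}


def encode_for_keypunch(text):
    """Upper-case the text and write each diacritic as a symbol after its letter."""
    decomposed = unicodedata.normalize("NFD", text.upper())
    return "".join(MARKS.get(ch, ch) for ch in decomposed)


print(encode_for_keypunch("on sçait combien peu vous êtes scrupuleux à détrôner les Rois"))
# ON SC$AIT COMBIEN PEU VOUS E%TES SCRUPULEUX A* DE/TRO%NER LES ROIS
```

Decomposing each accented letter into a base letter plus a combining mark keeps the mapping table small and leaves unaccented text untouched.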
Since most words whose meaning changes with the presence of a given diacritical mark are function words and will therefore have a high frequency rate, it would seem advantageous to encode only these relatively few, but often occurring, words. Had Pappas chosen to prepare his data in this manner, he would have accomplished two things: by eliminating the need to "see certain words in context" he would have reduced his computer costs and data processing time,53 and he would have limited the number of words requiring a key-word in context listing to content words with dual meanings (e.g., être) and verb forms of the first conjugation whose non-accented past participle resembles its present tense forms (e.g., trouvé, trouve). The first 15,000 words of l'Avis aux réfugiés were keypunched with markings similar to those demonstrated above since they were completed before the appearance of Pappas's article. The remaining 185,000 words of text were, however, keypunched in a more simplified, yet effective manner: distinguishing characters accompanied only those words having a potentially high frequency and whose substance is changed by presence or absence of a diacritical mark. A "pre-processing" computer program, designed to eliminate the "unnecessary" characters and to expand elided forms such as d', j', n', c', s', and m' by restoring their elided "e's," was written and used to standardize the input data (s'il and s'ils, however, were left unexpanded).54 Hence the number of words requiring a key-word in context listing was greatly reduced, although not entirely eliminated.55
To facilitate the recognition of words which have evolved from a common root, to count their frequencies, and to list them lexically with their roots, the French ENROOT program was developed56 and used in conjunction with a "printed indexing program."57 The resulting output data are lists, complete with absolute frequencies, of all occurring forms of a given root.
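The elision-restoring step of the pre-processing program might be sketched as follows. This is only an illustration: it covers the six forms that are legible in the description (d', j', n', c', s', m'), leaves s'il and s'ils alone as stated, works on lower-case text for simplicity, and does not attempt to reproduce the original program. All names are hypothetical.

```python
import re

# Elided forms whose dropped letter is unambiguously "e".
ELIDED = {"d": "de", "j": "je", "n": "ne", "c": "ce", "s": "se", "m": "me"}
KEEP = {"s'il", "s'ils"}  # left unexpanded, as the text states

ELISION = re.compile(r"^([a-z])'(.+)$")


def expand_elisions(line):
    """Split a single-letter elided prefix from its word and restore the 'e'."""
    out = []
    for token in line.lower().split():
        match = ELISION.match(token)
        if match and token.rstrip(".,") not in KEEP and match.group(1) in ELIDED:
            out.extend([ELIDED[match.group(1)], match.group(2)])
        else:
            out.append(token)
    return " ".join(out)


print(expand_elisions("j'ai dit qu'il n'est"))  # je ai dit qu'il ne est
```

Forms such as l', where the elided vowel is ambiguous, pass through unchanged in this sketch.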
From this output, vocabulary became evident for which a key-word in context listing was required.
A very similar program, called VIA (Verbally Indexed Associations), subsidized by the Office of Naval Research through the System Development Corporation of Los Angeles, was developed over a three-year period by Mrs. Sally Y. Sedelow. In her comments prefacing VIA's description, Mrs. Sedelow defined style as "the patterns formed in the linguistic encoding of information," and her working definition of stylistic analysis as "the perception of these patterns in language."58 She then stated the premise upon which her research project was based, namely, "that the choice of information bearing words, as well as all the other patterns which they help form as they are embedded in sentences, paragraphs, speeches, chapters . . . is the necessary province of stylistic analysis."59
The procedures employed in VIA evolved from the hypothesis that human perception of important ideas or themes is most often a function of sufficient repetition of words or word patterns to make an impression on the reader. Thus, hoping to pick out important ideas or concepts within a given body of text, Mrs. Sedelow programmed the computer to print out lexically the vocabulary of a given text with each individual word's accompanying location, according to chapter, paragraph, sentence, and position in sentence. These words were listed by "root-group," with a figure denoting the total number of words included in the group printed next to the last word of the set. Using this machine-generated list, Mrs. Sedelow had to study the vocabulary and identify the terms she wished to examine further as "primary words."60 The next step entailed a manual search through synonym dictionaries and thesauri and compilation of an associated-word list. This list, in turn, had to be checked against the original text to see if the new terms occurred in it.
Finally, having been fed this "thesaurus," the computer instituted "a rather complicated search for these possible associated-words, searching first for each word associated with the given primary word, next for words linked to such of those associated words as have primary status in their own right."61 Thus a multi-level investigation occurs, which leads to the isolation of themes within a text.
A second, even less conventional program developed by Mrs. Sedelow is called MAPTEXT62 and may be used in connection with VIA (using VIA's output as input) or completely independently of it. Whereas VIA deals primarily with vocabulary and themes developed through word usage, MAPTEXT seeks to reveal patterns of word usage which were hitherto covert. It allows one to visualize a text free from semantic consideration. For example, to study the verb-adverb distribution in a text, each verb and adverb would be assigned a number as the original data deck is prepared for input.63 The resulting output from the computer would furnish a symbolic picture of the text. Numbers representing encoded words would be printed, whereas the uncoded words in the text would be represented by zeros or dashes. Thus, if a researcher wanted to ascertain the density of occurrence of a particular part of speech, he would only have to read the symbolic output.
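The kind of symbolic picture attributed to MAPTEXT can be approximated in a few lines of Python. The sketch does not reproduce Mrs. Sedelow's program; the code table, the sample sentence, and the function name are invented for illustration, with 1 standing for a verb and 2 for an adverb.

```python
def map_text(tokens, codes, filler="0"):
    """Replace each coded word by its number and every other word by a filler."""
    return [str(codes.get(token.lower(), filler)) for token in tokens]


# Hypothetical code table: 1 = verb, 2 = adverb.
codes = {"parle": 1, "court": 1, "vite": 2, "bien": 2}
tokens = "il parle bien et court vite".split()

picture = map_text(tokens, codes)
print(" ".join(picture))                    # 0 1 2 0 1 2
print(picture.count("2") / len(picture))    # share of adverbs in the passage
```

Passing `filler="-"` gives the dashed variant of the same picture.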
Works discussed in this chapter represent but a few of the many reports published on computer-aided literary studies, for, just as the utilization of the computer has known an exponential growth in general, so, also, has its application to the fields of linguistics, stylistics, and literary content-analysis.64 Research reported here gave direction to the present project by suggesting examination of such putative indices of an author's style as word and sentence length, syllabification, sentence beginnings and endings, content and function words, fixed and variable word combinations, as well as numerous part-of-speech and grammatical elements. Discovering and defining possible objective correlates of style represent a considerable part of the research involved in this project, but two equally important areas remain to be reported: the delineation of procedures, i.e., how these variables were located mechanically, and the results obtained when 843 variables were compared across the test articles. Reporting of these two areas occupies the next two chapters.
NOTES FOR CHAPTER II
1. Wachal, "Linguistic Evidence," p. 5.
2. Ibid.
3. Ibid.
4. G. Udny Yule, "On Sentence Length," Biometrika, XXX, 1936, 363-390. Even though Yule was not able to utilize the computer in his study, I have included his article in this "survey" because his method paralleled that of computational stylistics.
5. "So far as punctuation is concerned, there seems very little evidence that many authors exercised any care about it whatever. After all, even at present, few authors trouble to punctuate their MSS. with any care or consistency. Such punctuation as is found in ordinary MSS. of the sixteenth and seventeenth centuries is indeed most erratic and seldom goes beyond full stops at the end of most of the sentences and some indication of the caesura in verse." An Introduction to Bibliography (Oxford: Clarendon Press, 1927), p. 250.
6. Yule, "On Sentence Length," p. 365.
7. Admitting to the subjectivity involved in so editing a text, Yule wrote: ". . . at first I by no means realized the full extent of this difficulty, and when I did often felt myself horribly incompetent to deal with it. I am sure my final decisions could often be contested, and were not infrequently inconsistent with one another." Still he considers his procedures valid "if only as an exploratory piece of work." "On Sentence Length," p. 365.
8. For example, Le Petit Robert gives the modern definition of "phrase" as "Tout assemblage d'éléments linguistiques capable de représenter pour l'auditeur l'énoncé complet d'une idée conçue par le sujet parlant," whereas its "old" meaning is, "Tour ou construction." Further research revealed that in classical French, "phrase" is defined as an "expression, une façon de parler, locution, tournure." See Oscar Bloch and W. Von Wartburg, Dictionnaire étymologique de la langue française, 5e éd. (Paris: Presses univ. de France, 1968), p. 482, and Gaston Cayrou, Le Français classique (Paris: Didier, 1948), p. 659. In the classical period definitions given, "a complete expression of a single thought" is not stipulated. The lack of such regimentation would seemingly allow the compositors greater freedom to edit. However, it must be remembered that the fact that a printer or compositor breathed the same social and political atmosphere as the author for whom he set type might well have provided him with more structural, as well as ideological, insight into a given text than can a reader examining the work three to four hundred years later rightly expect to possess.
9. Yule, "On Sentence Length," p. 365.
10. Ibid., p. 367.
11. Wachal, "Linguistic Evidence," p. 195.
12. Ibid.
13. Milic points out that Swift's redundancy "derives from the urge to control meaning." A Quantitative Analysis of the Style of Jonathan Swift (The Hague: Mouton and Co., 1967), p. 121.
14. Milic, "Unconscious Ordering in the Prose of Swift," The Computer and Literary Style, ed. Jacob Leed (Kent, Ohio: Kent State Univ. Press, 1966), p. 82.
15. Frederick Mosteller and David Wallace, Inference and Disputed Authorship: 'The Federalist' (Reading, Mass.: Addison-Wesley, 1964).
16. Ephim G. Fogel, "The Humanist and the Computer: Vision and Actuality," Proceedings of the IBM Literary Data Processing Conference, IBM, Yorktown Heights, New York, 1964, p. 17, quotes from a 1961 article by Mosteller. See also Mosteller and Wallace, Disputed Authorship, p. 7.
17. Ibid., p. 1.
18. Mosteller and Wallace, "Inference in an Authorship Problem: A Comparative Study of Discrimination Methods Applied to the Authorship of The Federalist Papers," paper read at the statistical meetings in Minneapolis, Minnesota, September 9, 1962, p. 1.
19. Mosteller and Wallace, Disputed Authorship, p. 39.
20. G. A. Miller, E. B. Newman and E. A. Friedman, "Length-frequency statistics of written English," Information and Control, I (1958), 370-389.
21. Mosteller and Wallace, Disputed Authorship, p. 39. To my knowledge, such a list does not exist for the French language. It was, therefore, necessary to cull, from grammars (e.g., Grevisse), studies on style (e.g., Sayce, Marouzeau), and syntax (e.g., Haase) a file of French function words according to the criteria presented by Mosteller and Wallace. See further explanation of function word selection, infra., pp. 56-57.
22. Ibid., pp. 11-13, 39-42.
23. Ibid., p. 67.
24. Mosteller and Wallace, Disputed Authorship, p. 265. These thoughts, from a statistician, parallel those of scholars of the New Criticism using the formalistic approach to literary analysis. See also the very interesting study, "litterae ex machina," Proceedings of the IBM Literary Data Processing Conference (Yorktown Heights, New York, Sept. 1964), pp. 37-54, in which Alan Markman relates computer techniques to critical theories.
25. Stephen M.
Parrish, "Computers and the Muse of Literature," Proceedings of the Conference on the Use of Computers in Humanistic Research, Rutgers, Dec. 1964, pp. 14-19, and in Edmund A. Bowles, Computers in Humanistic Research (Englewood Cliffs, New Jersey: Prentice-Hall), pp. 124-134 (revised).
26. Charles Muller, Essai de statistique lexicale: "L'illusion comique" de Pierre Corneille (Paris: Klincksieck, 1964), and Etude de statistique lexicale: Le vocabulaire du théâtre de Corneille (Paris: Larousse, 1967).
27. Milic, "Winged Words," A Quantitative Analysis, and "Unconscious Ordering."
28. One of the more sceptical critics of computational stylistics is Stephen Ullman who calls the statistical method "too crude to catch some of the subtle nuances of style: emotive overtones, evocative resonance, complex and delicate rhythmic effects and the like." He does, however, admit to three important "ancillary" uses: ". . . to establish the authorship of anonymous works," to obtain a "rough indication of the frequency of a particular device, its 'density' in a given work," and in some cases to "reveal a striking anomaly in the 'distribution' of stylistic elements which may thus raise important problems of aesthetic interpretation." Language and Style (Oxford: Basil Blackwell, 1966), pp. 118-121.
29. Milic, "Winged Words," p. 77.
30. Ibid.
31. Ibid., p. 83.
32. Milic, A Quantitative Analysis, p. 268.
33. The Times Literary Supplement reviewer of Ellegard's two works, Who Was Junius? (Stockholm, 1962), and A Statistical Method for Determining Authorship: The Junius Letters, 1769-1772 (Goteborg, 1962), called the previously accrued linguistic evidence "very unconvincing." January 25, 1963, p. 67.
34. Arthur Sherbo and George Zimmer also cautioned against using small samples. In a pilot study they found that one-thousand word samples were inadequate for testing word pattern repetition.
This problem has been eliminated in this study by the use of complete representative works in lieu of text samples. See Zimmer, "The Attribution of Authorship: A Computerized Method Evaluated and Compared with Other Methods Past and Future," Diss. Michigan State University, 1968, pp. 29, 32.
35. Ellegard, A Statistical Method, p. 9.
36. The "mots-clefs" approach was also taken by Pierre Guiraud and reported in his Caractères statistiques du vocabulaire (Paris: Presses univ. de France, 1954), and in his Problèmes et méthodes de la statistique linguistique (Dordrecht: Reidel, 1959), pp. 84-96. M. Guiraud's research, however, pursues a more technical statistical approach than Ellegard's. He examines and criticizes the formulae introduced into literary studies for vocabulary analysis by statisticians and linguists such as Yule, Herdan, and Zipf, and then develops his own formula. One of his principal concerns was the establishment of the "norm" (derived from the statistical "normal curve"), which he established for the end of the nineteenth century based on 1,200,000 words from prose writers of that period.
37. For example, the word "uniform" had a relative frequency in Junius' works of 280 per million words. In the control group sample of one million words its relative frequency is 65. Thus the distinctiveness ratio is 280/65 or 4.3, which is "clearly a Junian plus word" which he used about "four times as often as his contemporaries." Ellegard, A Statistical Method, p. 15.
38. Professor Ellegard used a control group of "one million words, drawn from about a hundred authors" to establish his plus and minus word list. Ibid., p. 21.
39. Times, January 25, 1963, p. 67.
40. Zimmer, "Attribution of Authorship," p. 28, and his review of Who Was Junius? and A Statistical Method, in Journal of English and Germanic Philology, June, 1963, pp. 688-689.
41. Ibid., p. 688.
42. Times, January 25, 1963, p. 67.
43. This figure represents the population of potential "Junii."
For "sound" attribution in Ellegard's study the number must be smaller than 300.
44. Times, January 25, 1963, p. 67. The fundamental criticisms of Ellegard's research stem from the inconsistency of his own subjective impressions and manual counts, not from his approach to the problem of attribution.
45. Joseph Raben, "A Computer Aided Study of Literary Influence: Milton to Shelley," IBM Proceedings, 1964.
46. The Statistical Study of Literary Vocabulary (Cambridge, 1944).
47. Paul E. Bennett, "The Statistical Measurement of A Stylistic Trait in Julius Caesar and As You Like It," Shakespeare Quarterly, 8 (1957), 33-34.
48. Ibid., p. 44.
49. Bennett, p. 45.
50. John N. Pappas, "Authentication of an Eighteenth-Century Text," Institute for Computer Research in the Humanities Newsletter, II, Nos. 4 and 5 (Dec. 1966, and Jan. 1967), 3-4, 3-4.
51. Ibid., Dec. 1966, p. 4.
52. Ibid.
53. Since the time Pappas developed his programs, and, for that matter, since the time I prepared my textual data, typewriter terminals which include upper and lower case letters, punctuation marks (the IBM 026 and 029 keypunches I used required special multiple punches for all punctuation marks except the comma and the period), and diacritical marks have been developed. As Robert Wachal has said, "One would hope that the days of the keypunch are numbered, at least for humanists." "Getting at Style through Statistics," rev. of Statistics and Style, eds. Lubomir Doležel and Richard W. Bailey (New York: American Elsevier, 1969), in CHum, IV (May 1970), 27.
54. Several other tests were performed by the pre-processing program, most of which were peculiar to the late seventeenth-century texts with which we were dealing. To ensure uniformity in the input data, all 200,000 words were run through this program.
55. For example, à occurred 1024 times and a 415 times in l'Avis aux réfugiés. Assuming one occurrence of either per line of text, 1439 lines or 24 pages of output would be required for this word alone.
Nevertheless, while preparing the input data, for example, I failed to fully realize that encoding past participles according to use--e.g., verbal, adjectival, or nominal--would facilitate comparisons of such stylistic elements as verb-tense, noun to adjective, or adjective to adverb ratios.
56. See infra., pp. 62-65, for a more detailed description of this program. A grant-in-aid of $1500 was provided by the Computer Laboratory of Michigan State University for the development of this program. Mary Rafter, a professional programmer at Michigan State University, wrote the preliminary version based on a much less sophisticated, less complex English version developed by John Hafterson of the Learning Systems Institute at M.S.U. See Basic Information and Retrieval System Technical Manual, BIRS 2.0 (Michigan State University, 1968), pp. 1211-1212.
57. See infra., pp. 61-62. John Morris reported that he found using PIP in conjunction with one of our early versions of ENROOT "very fruitful" in his article, "A Computer-Assisted Study of a Philosophical Text," CHum, III (Jan. 1969), 175-176.
58. Sally Y. Sedelow, "Stylistic Analysis: Report on the Second Year of Research," System Development Corporation Document, TM-1908/200/000, March 1, 1966, p. 7.
59. Ibid., p. 8. The technique Mrs. Sedelow used in composing VIA resembles one already in use for some time in the social sciences known as Content Analysis. The General Inquirer system developed at MIT, commencing in 1961, is a well-known example of this technique. Another is the Basic Index and Retrieval System (BIRS) designed to search and retrieve research documents. BIRS served as a point of departure, and its principal programmer, John Hafterson, as guide as we explored the possibilities of adapting parts of the system to the French language and to the authorship attribution of the present study. Even though Mrs.
Sedelow's program had some features, e.g., line location of desired vocabulary printed on preliminary output, that were not built into BIRS, we felt that the immediate availability of BIRS, the presence of its writer, and the fact that a program written for another machine would need some revising and appreciably delay the project, made more judicious the use of BIRS.
60. Defined by Mrs. Sedelow as "words occurring with high frequency relative to the rest of the text." Ibid., p. 15. She concludes her final report under the three-year research grant with a hopeful statement that VIA will one day be completely automated, thereby eliminating the tedious manual establishment of thesauri. "Stylistic Analysis: Report on the Third Year of Research," System Development Corporation Document, TM-1908/300/00, March 1, 1967, p. 91.
61. Sedelow, "Second Year Report," p. 16.
62. Ibid.
63. It is conceivable that adverbs could be further divided as to time, place, or manner and be encoded accordingly.
64. Its use in many other areas of humanistic research has also grown.
CHAPTER III
PROCEDURES AND PROGRAM DESCRIPTIONS
Because the data to be used in this attribution study of l'Avis aux réfugiés ultimately consist of frequency figures, making as much use as possible of modern technology and the science of statistics is an eminently sound procedure. Indeed, it is only by making intelligent use of the computer and statistical techniques that two goals of this project--validity and reliability--can be attained. In order to be certain that a different researcher, repeating the same procedures, will obtain identical results, each step of the experiment must be explicitly defined. Wachal points out that "One of the best means of guaranteeing explicitness is to write a set of computer programs which will perform this procedure.
To the extent that the computer programs guide the work, the results are replicable, inasmuch as the machine cannot act on any instructions that are not entirely explicit."1 The literary scholar, then, who chooses to use the computer and statistics as tools for research, has no easy task. He must translate literary terms--lexical, syntactical, phonemic, or grammatical--into a mathematical language which can be read and acted upon electronically. Once the selected elements have been identified and their incidence recorded, statistical techniques can be utilized for analysis or inference. Since one of the goals of this project was the development of a procedure for constructing a practicable, sensitive test for authorship, the STAR (1620 authorship report) system, a series of computer programs, was developed.
The STAR system evolved with the co-operation of the Indiana-Purdue Universities at Fort Wayne for use on their IBM 1620 computer. Although a small and relatively unsophisticated machine, the 1620 is especially suited to natural language analysis for several reasons. First, the structure of the 1620 lends itself to the handling of variable length character strings (i.e., words of varying lengths), characteristic of literary texts. Second, the simplicity of the "assembly language" is advantageous to the process of searching through lists of words. STAR, dealing primarily with vocabulary, syntactical elements and grammatical terms, involves a great deal of this type of processing. Third, STAR must read and print large amounts of data (this study put to use a data base of over 1.7 million characters). The instructions to read, print and punch for the 1620 are extremely simple. Whereas only one instruction is required to read a card on the 1620, as many as one hundred fifty instructions may be necessary on a more sophisticated system such as the IBM 360 or CDC 6500. Thus the STAR system was written largely in assembly language.
SPS (Symbolic Programming System) was chosen as the primary programming language because it provides the fastest possible character manipulation on the 1620. Its major inadequacy lies in its limited ability to be converted to a larger or more advanced system, e.g., it is totally incompatible with a fixed word-length machine. The use of FORTRAN would eliminate this difficulty; however, since FORTRAN is not so efficient as SPS, processing time would be increased at least ten-fold.
The data used by STAR closely resemble the natural language text found in any book, and may be prepared with a minimum of clerical work. Another advantage of the system is its general applicability. It is adaptable to almost any type of material, and is even independent of the natural language of the material used (with one minor exception which is explained in detail below, page 49, footnote 13). The programs were originally written to study French texts, but any alphabetic language would work as well.2
Basically, the results produced by STAR fall into two categories. First, numerous scores are derived from sentence and word lengths. The second lot consists of a group of scores dependent on the frequency of certain words or phrases. The results are not complicated statistical scores, but simple totals and averages. This output may be subjected to any required degree of statistical analysis.
Section 1 of the program description gives a brief summary of how the input data are prepared. Section 2 provides a description of the five independent programs and sub-routines used in this study and an analysis of how the different scores developed by each program are obtained. Finally, section 3 describes those editing and testing programs which were used to check the accuracy of the other parts of the system. This section also treats programs used to proofread the original data.
Data Description
The format chosen for preparation of the basic text provides one of the basic differences between quantitative authorship studies. In a very broad sense, the many different types of data may be conveniently reduced to two broad groups. The first consists of pre-processed or encoded data similar to that used by Louis T. Milic in his comprehensive study on the style of Jonathan Swift.3 Before punching any data on cards, Milic "translated" each word of the text to be examined into a two-digit code representing the grammatical part of speech of the encoded word based upon its function in the sentence. These codes were then punched into the data cards.4
The second group is comprised of data punched directly onto cards from the original text. The data prepared for the current study fall into this category. A comparison of this method with that of encoded data readily illustrates that the data processing procedures will be much slower because of the large number of cards to be read. For example, a card of encoded data may contain up to thirty-six words, whereas a card of the second type holds only ten. It is to be noted, however, that data of the second class have one very important advantage over those of the first class: they may be used for any number of different types of studies. The data prepared for this study may be used for almost any type of processing, and, as the reader will note, a wide variety of tests have been run using them. On the other hand, encoding data restricts analysis to the type of processing envisioned when "translated."5
In addition to alphabetical rather than digital format of the data, still another important distinction separates the data base used here from the majority of other authorship attribution studies. Most researchers have chosen to take samplings from the authors' works being studied.
Alvar Ellegard, for example, draws samplings from approximately 100 authors to establish a basis for his attribution study.6 In the present project, complete works, rather than random samples, were chosen as test articles. To be sure, this involved the accumulation and processing of significantly more data, but has produced (and we anticipate it will yet produce) more accurate results (i.e., results less subject to sampling error)7 than would have been attained had sampling of the test articles been employed.
In order to keep a constant check on the accuracy of the data, special identifying information was punched in the first five columns of each card. Columns six through 80 contain the text. Column one contains a code which signifies which author wrote the material.8
X--l'Avis aux réfugiés (referred to hereafter as the "unknown")
L and Q--Daniel de Larroque
B--Pierre Bayle
P--Paul Pélisson-Fontanier
D--René Descartes
Columns two through five contain a sequence number which serves to keep the data in the proper order.
The limited character set of the IBM 1620 computer necessitated making some minor modifications of the text. During the keypunching phase, all italics and underlining were ignored. Latin quotations which were not an integral part of the text were omitted. Because many of the accent marks used in French function to change sound rather than basic meaning, because the performing of a content analysis would necessitate the presence of more than a single word, and because omitting the "unnecessary" accents would extensively facilitate the keypunching operation, all accents were omitted except those whose presence changed the basic meaning of a given word.9 In addition the use of the apostrophe was ignored except for a few special cases such as l' (which could be either le or la) and l'on. The combination l'auteur, for example, was expanded to l' auteur, allowing the space between the apostrophe and the noun to become a word delimiter.
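The card layout described earlier in this section (author code in column one, sequence number in columns two through five, text in columns six through 80) can be sketched as a small Python parser. This is an illustration only, not a reconstruction of the STAR programs: the names are hypothetical, and the sequence field is assumed to hold a decimal number.

```python
from typing import NamedTuple

# Author codes as listed in the text.
AUTHORS = {
    "X": "l'Avis aux réfugiés (unknown)",
    "L": "Daniel de Larroque",
    "Q": "Daniel de Larroque",
    "B": "Pierre Bayle",
    "P": "Paul Pélisson-Fontanier",
    "D": "René Descartes",
}


class Card(NamedTuple):
    author_code: str
    sequence: int
    text: str


def parse_card(image):
    """Split an 80-column card image into author code, sequence number, and text."""
    image = image.ljust(80)[:80]
    code = image[0]
    if code not in AUTHORS:
        raise ValueError(f"unknown author code: {code!r}")
    return Card(code, int(image[1:5]), image[5:80].rstrip())


def in_sequence(cards):
    """True when the sequence numbers are strictly increasing."""
    numbers = [card.sequence for card in cards]
    return all(a < b for a, b in zip(numbers, numbers[1:]))


print(parse_card("B0012ON SCAIT COMBIEN PEU"))
```

A check such as `in_sequence` is one simple way a shuffled or duplicated card could be detected before any counting begins.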
In the case of l'on, it was felt that this particular use of the definite article might be a valid discriminant; therefore the union remained. Where the apostrophe was clearly the elision of e or i, these letters were restored, e.g., qu' became que; m', t', s', n' were expanded to me, te, se, ne; and s'il was changed to si il, etc. Finally, any parentheses appearing in the text were replaced by commas.

In order to simplify processing, and because the semicolon, colon, exclamation mark and question mark do not appear on the IBM 1620, punctuation was limited to the period and the comma. Thus the question mark, the colon and the exclamation mark were all punched as periods, while the dash, the semicolon and all parentheses became commas. These changes involved only minor revisions to the text and greatly simplified both the keypunching and program development.

Finally, to ensure accuracy of the text, a few small programs were written to check the exactness of the text with regard to the special characters. These programs are described in Section 3. In addition to the aforementioned accuracy checks, a major pre-processing program, written for the CDC 3600, also verified the basic text.10

Section 2 - Program Descriptions

SENWOL

Sentence and word length as quantitative stylistic indicators have challenged several attribution study scholars.11 Some have claimed success; others have admitted defeat. Proceeding under the assumption that sentence-level measures standing alone may not provide sufficient grounds for a conclusive argument, I felt they might at least combine with other "indicators" of internal evidence. Furthermore, discussions with statisticians at Michigan State University and at Indiana University at Fort Wayne assured me that the methods of past sentence length research, and not the theory of its applicability, had been questioned.
Therefore, feeling quite confident that sentence and word length could be valid discriminants, I proceeded to have SENWOL written. The following items represent the type of data generated by SENWOL:

1. The average number of words in a sentence, where a sentence is defined as a group of words between periods.12
2. The average number of letters in a sentence.
3. The average number of letters in a word.
4. The total number of letters, total number of words, and total number of sentences in all the data examined.
5. A provision is also made for the exclusion of words of a given length, e.g., one- or two-letter words. See SENWOL results, p. 88.

The program obtains these data by analyzing one sentence at a time. That is, the program reads in one complete sentence, counts all the letters and words in that sentence, prints the information for that sentence, and then goes on to the next sentence. When all the data have been read in, the program computes the averages and prints the final totals and averages. Appendix E demonstrates a sample of SENWOL output. Appendix D contains a simplified flowchart of the program.

SYLAN

The syllabication analysis program (SYLAN) and SENWOL are similar in that they produce analogous output. SENWOL counts and stores individual characters; SYLAN works on combinations of alphabetical characters. In order to compute average syllables per sentence and average syllables per word, a subroutine called BREAK was inserted into the basic SENWOL program. BREAK takes five basic rules of syllabication and works on the text one rule at a time.13 Like SENWOL, it analyzes one sentence at a time. In the preparation of the data base the article l' was separated from its noun or pronoun (except in the case of l'on) in order to be counted as a separate word. However, l' contains no vowel and thus does not fall within the bounds of the basic rules. It was therefore treated as a special case: as a word having one syllable.
The identification of l' is the first step in a series of word-editing operations performed by SYLAN. The remaining editing procedures serve to eliminate other words which cannot be analyzed by these rules. Specifically, these words are those containing special characters and numbers. For example, A* represents à; LA*, là; OU*, où; DE*S, dès; etc. Numbers which were punched as digits and not in alphabetic representation cannot be analyzed by these rules and were thus returned as special words having no syllables. To be certain that the number of unanalyzable words was insignificant, a tally was made of all such words, which showed one such term in every 1500 words.

All words which had been edited were then broken down into syllables according to the five rules referenced above. The first letter of each syllable was specially marked in computer memory with a procedure analogous to underlining that letter. After the word had been completely broken down, the program counted the total number of syllables in the word, added this number to previous totals, and then punched out all the required information as described above pertaining to words, sentences, and the entire article. Again, the output procedure is very similar to that utilized by SENWOL.

An additional check was made to ensure that the program properly handled all cases of syllabification. From the initial output of SYLAN, certain exceptions to the five rules were noted and insertions (additional instructions) were incorporated in the program. After the final results had been obtained and checked, the error was less than 1%, which was considered negligible in terms of the large sample size. Appendix G contains a simplified flowchart of the program.

STYLBEND

The way an author chooses to begin and end a sentence is considered to be a characteristic of his style.14 Therefore, STYLBEND (Stylistics Beginnings and Endings) was written.
This program searches for, prints, and punches the first and last words of each sentence of a text, along with the corresponding sequence number which ties each word back to the original data base.15 Because the computer was not programmed to recognize parts of speech, it was necessary to take the data thus punched onto machine cards and sort them by hand. Fifteen basic parts of speech were chosen as possible beginnings.16 In order to avoid as much ambiguity and chance for human error as possible, the same parts of speech were used as endings. Once these categories had been established, each of the punched cards was hand sorted into its proper category. In many cases it became necessary to refer back to the basic text in order to see the word in its context and classify it properly.17 The code/sequence numbers described above proved invaluable at this point. As soon as the sorting process was completed, a simple counting program was written which counted the total use of each part of speech. The results showed how often a given author chose each of the selected parts of speech to introduce or to terminate his sentences.

The accuracy of this program was determined by constant referral to the basic text, required by words which could be classed as various parts of speech. In addition, a second STYLBEND program was written, the first having been written for the CDC 3600. The second, written for the IBM 1620, produced a slightly different output. Through constant proofreading of the basic text, minor errors were noted and corrected. The 1620 version of STYLBEND served as a final check to ensure accuracy as well as precision.18 The CDC 3600 version printed out the total number of occurrences of each word as well as each line on which the word occurred, and separate cards were punched for beginnings and endings.
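The extraction step that STYLBEND automates can be sketched in a few lines of modern code. The sketch below is an illustration in Python, not a reconstruction of the original program: the function names are hypothetical, each card is assumed to arrive as a (sequence number, text) pair, and words are assumed never to be split across cards. The part-of-speech sorting, which the study did by hand, is deliberately left out.

```python
from collections import Counter


def first_last_words(cards):
    """Yield (sequence number of sentence start, first word, last word).

    `cards` is an iterable of (sequence_number, text) pairs. The period is
    the only sentence delimiter; spaces and commas delimit words.
    """
    results = []
    words = []         # words of the sentence currently being assembled
    start_seq = None   # sequence number of the card where it began
    for seq, text in cards:
        # Isolate periods as separate tokens; commas only separate words.
        tokens = text.replace(",", " ").replace(".", " . ").split()
        for token in tokens:
            if token == ".":
                if words:
                    results.append((start_seq, words[0], words[-1]))
                words, start_seq = [], None
            else:
                if not words:
                    start_seq = seq
                words.append(token)
    return results


def tally_beginnings(results):
    """Count how often each word opens a sentence."""
    return Counter(first for _, first, _ in results)
```

Given three cards reading `ILS NE NOUS L' AUROIENT`, `JAMAIS FAIT. PLUT A* DIEU` and `QUE NON.`, the sketch reports ILS/FAIT for the sentence starting on the first card and PLUT/NON for the one starting on the second, which is the information a human sorter would then classify by part of speech.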
The 1620 output produced two words per card, the first and the last of each sentence, plus the location of the sentence beginning. The final tabulations of STYLBEND represent the results obtained from the most current, corrected copy of the text. All discrepancies that had been noted through proofreading were corrected. Appendix H contains a simplified flowchart of the program.

EXSOR

One of the simplest of the quantitative analyses in stylostatistics is the study of word usage. However, this is not enough. Many times, a study of the use of groups of words or expressions may yield more meaningful results. It is for such a study that this program has been developed.

EXSOR is a program designed to search a sentence-structured text to find given word patterns called expressions. The searching process may be completely automatic, or it may be combined with manual searching by the user. Before the program itself is described, the input and output specifications will be given in detail.

The input consists of two groups of data: the expressions and the text. They are inserted in that order with an End of Job card at the end of each group. The text is assumed to be sentence-structured; that is, it must be divided into sentences, with the period as the only allowable sentence delimiter. The comma may also be used as punctuation, but all other punctuation marks are treated as separate words. The space is considered a word delimiter along with the period and comma; any number of consecutive spaces are condensed to just one space as the text is read in. The text is punched on cards in columns six through seventy-five; the first five columns contain a sequence number and will be ignored by the computer.

The expressions are of two types. First, an expression may contain only literals. Consider the examples:

PLUT A* DIEU QUE

and

ILS NE NOUS L' AUROIENT JAMAIS FAIT

The first is a fixed expression without variant forms.
The second is composed of fixed forms and accompanying grammatical variations which may or may not be selected for testing. EXSOR will find each case in which these expressions were used in the text within a single sentence and print out each expression and its frequency.

The second type of expression may contain one or both of the two special variables (WRD and WRDS), which must not themselves occur as words anywhere in the text. Using these two symbols, we can define a variable expression. As an example, suppose we wish to find all occurrences of the structure:

ne ... jamais

We are not interested in the words in between or in how far apart the two literals are. We would replace the dots with one of the special variables. WRD causes the program to skip exactly one word of the text, and WRDS causes the continual skipping of words until a match is found for the next literal. In the example above we would input the expression:

NE (WRDS) JAMAIS

A match is first found for the word NE, and then each consecutive word is tested until a match is found for JAMAIS or until the end of the sentence is reached (in no case will an expression overlap the end of a sentence). WRDS is therefore a variable which can stand for any number of words. WRD has a somewhat similar use, but it is restricted to exactly one word. Thus, if we wished to find all cases in which the NE is separated by exactly one word from the JAMAIS, we would input the expression NE WRD JAMAIS. If we wished them separated by exactly two words, we would use the expression NE WRD WRD JAMAIS. Finally, we may combine the two different variables. The expression NE WRD WRD WRDS JAMAIS will find all cases with a separation of at least two words.

Two special illegal constructions are checked for as the expressions are read in:

1. The construction WRDS WRD is arbitrarily chosen to be invalid; it is exactly equivalent to WRD WRDS, and considerable time is saved by not having to deal with both cases.
When this condition is encountered, an error message is typed, the correct expression is substituted, and the program continues.
2. An expression may not begin or end with WRDS or WRD. When this construction is encountered, the user must reload the expression list beginning with the first expression (after correcting or removing the incorrect expression).

The output of the program is straightforward. Each expression is printed with an identification number and a frequency score, both of which are clearly marked. The material may also be punched on cards by the computer if desired.

The program is broken up into two parts. The first of these is very fast and simple. It merely reads and stores all the expressions, using special markers to replace WRD and WRDS, and checking for the two errors mentioned above. Part two does the actual searching. First, it reads in a complete sentence. It then makes one pass through the sentence for each expression, until it has searched for all possible expressions. It then moves on to the next sentence; when it encounters the End of Job card which marks the end of the text, it prints out the results of the search. By breaking the text up into parts and operating on only one sentence at a time, the search is simplified.

Using the sense switches,20 the user may select any or all of the following options. If sense switch 1 is on, the complete text will be printed. If sense switch 2 is on, the program will restart itself and automatically accept another new set of data. If sense switch 3 is on, the memory position of each expression will be typed on the typewriter. This allows the user to see if the expression list is almost full, almost empty, or somewhere in between.

Using sense switches 3 and 4, the user may combine manual searching with automatic searching. Certain expressions, due to their variability of use, may need to be sorted by hand.
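The matching rules for literals, WRD and WRDS described above can be made concrete with a small sketch. It is written in Python for illustration only and is not the original 1620 code; the function names are hypothetical. Two points are assumptions rather than statements from the text: WRDS is allowed to stand for zero words (which the "at least two words" example implies), and matches are counted left to right without overlap.

```python
WRD, WRDS = "WRD", "WRDS"


def compile_expression(expression):
    """Validate an expression and rewrite WRDS WRD as WRD WRDS."""
    items = expression.split()
    if not items or items[0] in (WRD, WRDS) or items[-1] in (WRD, WRDS):
        raise ValueError("an expression may not begin or end with WRD or WRDS")
    changed = True
    while changed:
        changed = False
        for i in range(len(items) - 1):
            if items[i] == WRDS and items[i + 1] == WRD:
                items[i], items[i + 1] = WRD, WRDS   # substitute the legal order
                changed = True
            elif items[i] == WRDS and items[i + 1] == WRDS:
                del items[i + 1]                     # two WRDS in a row act as one
                changed = True
                break
    return items


def match_from(words, start, pattern):
    """Return the index just past a match beginning at `start`, else None."""
    pos = start
    for k, item in enumerate(pattern):
        if item == WRD:
            pos += 1                                 # skip exactly one word
        elif item == WRDS:
            target = pattern[k + 1]                  # always a literal here
            while pos < len(words) and words[pos] != target:
                pos += 1                             # skip until the next literal
            if pos == len(words):
                return None                          # sentence ended first
        else:
            if pos >= len(words) or words[pos] != item:
                return None
            pos += 1
    return pos


def count_expression(text, expression):
    """Count matches of one expression, never crossing a sentence boundary."""
    pattern = compile_expression(expression)
    total = 0
    for sentence in text.split("."):
        words = sentence.replace(",", " ").split()
        i = 0
        while i < len(words):
            end = match_from(words, i, pattern)
            if end is None:
                i += 1
            else:
                total += 1
                i = end
    return total
```

For the text `IL NE FAUT JAMAIS DIRE JAMAIS. ILS NE NOUS L' AUROIENT JAMAIS FAIT.` the sketch counts two matches for `NE WRDS JAMAIS`, one for `NE WRD JAMAIS`, and two for `NE WRD WRD WRDS JAMAIS`. Processing one sentence at a time, as the original does, is what guarantees that no match spans a period.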
Expressions that must be sorted by hand should be marked with a period on the input card, and the computer will ignore their tabulation as it searches the text. It will, however, print the sequence numbers where the expressions appear in the text, so that manual sorting will be much easier. Sense switch 3 is used to make the computer ignore the expressions as it searches, and sense switch 4 is used if the sequence numbers are to be printed.

Due to the small size of available memory, certain limitations were imposed on the program regarding the amount of material it could handle. Exceeding any of the following limits will not necessarily cause the program to stop, but may give faulty results:

1. The maximum number of expressions may not exceed 800.
2. The total number of words in all expressions may not exceed 800.
3. All sentences must be less than 150 words.
4. All sentences must take less than 15 cards.
5. Average sentence length over the whole text must be greater than one card.

FREQFUN

Another basic type of quantitative stylistic data is the frequency of use of selected words. FREQFUN (Frequency of Function Words) is a program which produces the frequency of a given list of words. The information thus obtained includes a listing of the words selected for study followed by their absolute and relative frequencies, and a total count of the vocabulary of the text. These figures may then be subjected to any number of statistical analyses. Two basic factors influenced the development of the program: (1) speed and efficiency of the sorting and counting process, and (2) maximum capacity of the selected word list. For this study a total of 361 words comprised the selected word list.

In an attempt to obtain objectivity in our choice of function words, use was made of FAP (File Analysis Program), a component of BIRS. FAP performs word-frequency analysis across (as opposed to merely within) paragraphs of a given file (literary work).
It selects as descriptors of a paragraph those terms which make that paragraph a distinct element of the work. The program has three options or methods: method one maximizes on "content" terms; method two, on content and rare terms; and method three, on function words. The scores obtained from this program are a function of the average proportional use of a term and its variability of use in a given document, abstract of a document, or paragraph not exceeding 50 cards in length.21 L'Avis aux réfugiés, Cabale Chimérique, and Le Prosélyte abusé (roughly 153,000 words) were fed through FAP, method 3, with a "V" value cutoff established at 2.0. This procedure extracted from the 153,000-token vocabulary a total of 184 "function" words. For this preliminary analysis, a synonym list was used which equated such terms as le, la, l', les (the synonym list was not, however, used with FREQFUN). The fact that the part-of-speech categories of these 184 terms, based upon the final FAP-3 tabulations, did not differ greatly from those reported by Mosteller and Wallace was encouraging. In addition to filler words classed as prepositions, conjunctions, pronouns, adjectives, adverbs, and auxiliary verbs, our FAP-3 output indicated that certain nouns (cas, cas-là, cause, considération(s), côté, dommage, face, force(s), moien, moral, personnes, propos, tas), negations (guère(s), non, ni, personne, nul(le), aucun(e), pas), and twenty-one present tense and present or past participial forms other than auxiliaries were so dispersed throughout the text as to qualify as function words. Mosteller and Wallace had the Miller-Newman-Friedman list of function words to draw upon; because no such list exists for French function words, FAP afforded the best available method to supplement the statisticians' list.
However, because of Mosteller and Wallace's success after carefully omitting words whose rates could depend dangerously on context, many of these nouns and verbs were removed from the list before the final data were run. Only in the case of pronouns did we depart from their procedures. It was felt that because each of the works to be tested dealt basically with the same topics, pronouns might be valid function rather than content words. Their analysis has therefore been included in the results described in Chapter IV, pp. 135-162. Appendix K lists the function words tested in this project.

The program provides locations for approximately two thousand words averaging five characters in length.22 In order to provide a fast, efficient program, all selected words were read in and stored on the disk, the 1620's external memory. Then they were ordered according to length so that words from the data base would be compared only to selected words of an equal length. The next step involved alphabetizing, within numeric groups, the now ordered selected word list. Finally, trailing blanks are removed as the words are stored in memory23 (this step increases the maximum number of words possible on the selected list by reducing the number of locations required).

After the selected words have been prepared as described above, the text is read in and the program begins the searching process. A typical card of text is processed as follows. The program searches for the first complete word on the card and counts the number of letters it contains. This word is then compared to all words on the selected word list of a length equal to that of the word on the text card. After the word is found, or after all words of equal size have been checked, the program proceeds to the next word on the card, to the next, and so on until each word has been tested against the selected word list.
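The length-grouped lookup just described can be sketched as follows. This is a Python illustration under stated assumptions, not the original program: the function names are hypothetical, cards are assumed to be plain text strings with the identifying columns already removed, and a binary search (`bisect`) within each alphabetized group is swapped in for the sequential comparison the text describes.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(selected_words):
    """Group the selected words by length and alphabetize each group."""
    groups = defaultdict(list)
    for word in set(selected_words):
        groups[len(word)].append(word)
    return {length: sorted(words) for length, words in groups.items()}


def count_function_words(cards, selected_words):
    """Return (absolute counts, relative frequencies, total words in text)."""
    index = build_index(selected_words)
    counts = {word: 0 for word in selected_words}
    total = 0
    for text in cards:
        # Spaces, commas and periods all delimit words.
        for word in text.replace(",", " ").replace(".", " ").split():
            total += 1
            group = index.get(len(word), [])   # only words of equal length
            i = bisect_left(group, word)
            if i < len(group) and group[i] == word:
                counts[word] += 1
    relative = {w: (c / total if total else 0.0) for w, c in counts.items()}
    return counts, relative, total
```

With the two cards `AFIN QUE A* DIEU, AINSI` and `A* CAUSE DE AFIN.` and the selected list A*, AFIN, AINSI, ALORS, the sketch reports counts of 2, 2, 1 and 0 over a nine-word text. Restricting each comparison to words of equal length is the design choice the text singles out for speed; the grouping dictionary plays the role of the length-ordered list stored on the 1620's disk.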
After all cards of the text have been examined, the program prints out the final frequencies of all words in the format shown in Appendix L. Appendix J presents a simplified flowchart of the program.

Section 3 - Miscellaneous Programs

CANAL

This is the basic program used to convert the format of the data produced by EXSOR and FREQFUN to an arrangement acceptable to the analysis of variance program. The word scores obtained from FREQFUN and the expression scores obtained from EXSOR are grouped individually according to author and article.24 That is to say, all word scores are recorded sequentially in the following format, where the headings P, X, B1, B2 and L refer to the final test articles as described on page 75 and listed in Appendix A.25

FREQFUN                    P      X     B1     B2      L
A*                        79   1365   1574    344    889
AFIN                       1     26     29      4     26
AINSI                      1     59     39     13     49
AILLEURS                   1     18     15      4     11
ALLER                      2     23     70     13     14
ALORS                      1     16      9      3      0
APRES                      6     56     45     24     52
ASSEZ                      5     28     43      6     23
AUCUN                      2     53     54      5     37
AUJOURD'HUI                2     17     10     11     17

EXSOR                      P      X     B1     B2      L
A* CAUSE DE                0     12     11      4      7
A* CAUSE QUE               0      8     13      4      0
AFIN DE                    0     18     18      4     13
AFIN QUE                   1      8     10      0     13
A* PLUS FORTE RAISON       -      3      1      1      1
AU PIS ALLER               0      2      4      2      0
A* WRDS EGARD              0     12      3      3      6
DE PART ET DE AUTRE        1      0      3      1      4

The analysis of variance program requires that all scores for a given word or expression be recorded and analyzed before subsequent word or expression scores may be considered. In order to rearrange the data received from EXSOR and FREQFUN so that the analysis of variance may be performed, all data are read onto the disk in their raw form. The frequencies for each word or expression are then retrieved and punched onto cards in the revised format.

Editing

Because of the changes necessitated by the limited character set of the IBM 1620 (as noted above, p. 46), numerous programs were written to check the text for keypunch or copy errors. These programs print out the sentence sequence numbers which mark the location of sentences which have some questionable characteristic.
Examples of questionable characteristics are sentences of fewer than three words, sentences containing numbers, and sentences containing mispunched characters (percent sign, dollar sign, parentheses, etc.). The sentences thus pinpointed were then checked by hand against the original document.

Output Conversion

Numerous smaller programs were written to rearrange the data from two or more programs into a format easier to analyze. Because these programs do not affect any numerical values, but only change their positions on the machine cards, they will not be discussed in detail even though their use may be briefly mentioned elsewhere.

Each of the preceding sections describing the STAR system, with the possible exception of Section 1 (Data Description), has extracted selected portions from the data base. It is now time to examine a series of programs designed to work on the entire text as an entity. The programs to be discussed at this point are either modified versions of existing programs obtained from the Learning Systems Institute at Michigan State University, or CDC 3600 versions of STYLBEND and EXSOR. To give a detailed description of these two would be superfluous, since their functions are identical to those of the 1620 versions discussed above, pp. 51-55.

FAP

The File Analysis Program is a component of BIRS (Basic Index and Retrieval System) which is designed to perform vocabulary analysis. However, it has one disadvantage for studies of this type: its basic unit, or sample, is the abstract (paragraph). Previous studies have shown that the average paragraph contains too few words to be an adequate sample.26 Because of its size it is much too small to yield reliable and consistent results. For this reason, FAP was not used as a principal source of vocabulary scores, but rather as a component program: as a framework for developing the PIP-ENROOT programs described below.
Using FAP in this way saved some programming effort and will allow future users of the system the chance to obtain and use vocabulary "content" scores, as provided through FAP output, for vocabulary analysis studies. The exact structure of the FAP program will not be described in detail here;27 however, it is very similar to that of the PIP program described below.

PIP

The Printed Index Program is also a component of the Basic Index and Retrieval System (BIRS) developed at Michigan State University. The PIP program is extremely complex, and a more detailed description of its various options can be found in the BIRS manual.28 Its basic use in this study has been to prepare the data base for use with ENROOT, a subroutine designed to reduce a word to its root as described below. PIP reads in the text, gives the material to ENROOT to be processed, and does all output operations required after ENROOT is finished.

The PIP program contains three basic parts. First, the program reads various parameters which describe the input and output that will be required. We are using only a small part of the PIP program and must specify exactly the processing which we require. The input parameters include an exact description of the data cards (see pp. 46-47). The output parameters call for a key-word in context index; this is a listing of each word in the vocabulary and its absolute frequency of occurrence as well as its root. After these parameters have been read, the program processes the text by paragraph. It finds a word, gives the word to ENROOT to be processed, and writes on a scratch tape the output received from ENROOT for that word. After this has been done for each word in the paragraph, the program proceeds to the next group, and continues this process until the complete text has been read in. When all words have been processed, the scratch tape contains each word and its root.
PIP next alphabetizes all the words and tallies the total number of occurrences of each individual word. Finally, a listing is printed which shows each root generated in the text along with the absolute frequency of each word returned to that root.

ENROOT

This is a subroutine whose purpose is to reduce words to a basic root form. As explained above, PIP reads in the data base and relays each word to ENROOT in the form of a one- to sixteen-character array, and ENROOT returns the root of each word in the same form. For the purpose of uniformity in this study, we chose the infinitive as the basic root, and all words etymologically derived from a given verb are returned to it. Only in cases where the noun or adjective form preceded the verb into the French language were these forms designated as roots. In instances where the orthography of a given term varied, the modern spelling was generally applied to such roots. The original subroutine, as developed by the Learning Systems Institute, was written to handle English texts and is much less sophisticated.

A grant-in-aid from the Computer Laboratory at M.S.U. provided funds for hiring a professional programmer, Miss Mary Rafter, to modify the English ENROOT. We spent several months researching the most efficient method of reducing the 55,551-word vocabulary of l'Avis aux réfugiés (minus its preface) to a final root vocabulary of 2689 words. The same process was then followed for the "known authors." The problems encountered were many. The orthography, both of l'Avis aux réfugiés and of the other texts to be analyzed, varied greatly. At first, it was thought that these differences might stem from regional habits, but further analysis revealed that works published in the same area contained the same variations of spelling. The English ENROOT provided for a synonym list designed to deal with such problems. It then became necessary to find a means of increasing the capacity of the synonym list.
The English version allowed locations for 500 synonyms; the capacity of the French version soon rose to 1500 as we sought greater accuracy.29 Before long we found ourselves swamped with list upon list of exceptions but still short of the degree of accuracy we were seeking.30 The adding of more lists and the eventual expansion of the synonym list to 2000 pairs of words slowed down the program considerably while increasing its accuracy to an acceptable 95-98%. At this point, the cost and time required for each additional degree of accuracy made continued automation impractical. The entire 166,000 words were run through ENROOT and, after ten hours of manual re-ordering and keypunching, the desired 99% accuracy (allowing one percent for human error) was reached.

Most of the instructions in this subroutine involved the searching of various word lists. For this reason a special subprogram called EQSRCH (search for terms of equal nature) was written. The subprogram was written in COMPASS because it provides a more efficient searching mechanism than FORTRAN.

A compilation of synonyms containing roots (usually infinitive forms) and corresponding irregular spellings of the same word is essential to ENROOT. Suffixes and prefixes were established based upon lists in Robert and Grevisse,31 and checked against a reverse alphabetical listing of the entire text vocabulary. Finally, lists were established which isolate words or prevent their further change or reduction by the basic program. A complete glossary of these special lists will be found in Appendix Y.

The analysis of the vocabulary proceeded one word at a time through four basic steps, as follows:

1. MISCELLANEOUS PREPARATION. This section consists of performing preliminary checks on each word in order to process words that would be erroneously classified by passing through the remaining procedures. The program searches the special lists described in the glossary and returns the word as a root if it is found on any list.
It also has special routines to remove (in some cases) a terminal e or terminal s. After this has been completed, ENROOT searches the synonym list for the word. If the word is found there, ENROOT obtains the proper root and stops processing that word.

2. SUFFIX TREATMENT. In this section the program looks for and deals with the suffixes contained in the suffix list. ENROOT compares the end of the word with each entry in the suffix lists, looking first for eight-letter suffixes, then seven-letter suffixes, then six, and so forth until a suffix is found or until the end of the suffix list is reached. If a suffix is found, ENROOT removes the suffix and searches the synonym list for the word without its suffix. Whenever a word is found on the synonym list, its proper root is found and processing of the word is completed.

3. PREFIX TREATMENT. Prefix treatment is much the same as suffix treatment: ENROOT compares the beginning of the word with each entry in the prefix list, searching for the longest prefixes first. If a prefix is found, it is removed and the synonym list is searched again; if the word is found there, ENROOT obtains the proper root and stops further processing.

4. INFINITIVE FORMATION. Because the desired root form is generally the infinitive, the program concludes with modifications to the last few letters of the word. ENROOT again searches the special word lists, and either adds letters to or drops letters from the end of the word in order to construct the proper verb form. When this is completed, ENROOT returns all roots to PIP for final output at a later time. Appendix Y is a glossary of word lists used with ENROOT.

ANOVAR (Analysis of Variance)

All of the programs described thus far have dealt with obtaining absolute and relative frequencies of vocabulary, expressions, or sentence and word length. When comparing the frequencies obtained from each of the test articles, some differences were obvious, others were slight.
In order to analyze these differences objectively, a straightforward, uncomplicated statistical approach was sought. After several hours of consultation with statisticians at Michigan State University and Indiana University at Fort Wayne, the analysis of variance was decided upon because it allows wide divergence from the underlying assumption of normality, and because the extremely large data base used in this study would add even more robustness to the test. The analysis of variance would provide a test of significance of the obtained frequencies based upon three assumptions: normality of the errors (within-author variation), equality of variance of the errors, and statistical independence of the errors.32 These assumptions are "made in deriving statistical methods and are usually . . . apt to be violated in applications and are introduced only to ease the mathematics of the derivation . . . Statistical methods have been called 'robust' if the inferences are not seriously invalidated by the violation of such assumptions."33 Henry Scheffé concludes that "Nonnormality has little effect on inferences about means . . ." and that "Inequality of variances in the cells of a layout has little effect on inferences about means if the cell numbers are equal, . . ."34 However, because of unequal cell size in the present study, a test of the homogeneity of variance was performed as illustrated by Robert Steel.35 The test results fell well within the prescribed limits. We therefore proceeded to perform the analysis of variance on the frequencies obtained from the programs described in this chapter, using ANOVAR.

A variable is generated measuring the distance of each of the two "known" articles from the "unknown". These two variables are analyzed by ANOVAR. Significance in this analysis indicates that one of the "knowns" is statistically closer to the "unknown" than is the other. We shall use sentence variables as an example of the form of the data analyzed by ANOVAR.
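Before that example, the comparison of the two distance variables can be sketched numerically. The Python below is a minimal illustration of a one-way analysis of variance, not ANOVAR itself. The function names and all figures are hypothetical, and the assumption that each known author contributes several article-level values (so that each group holds several distances) is made only so the arithmetic can be shown.

```python
def distances(unknown_value, known_values):
    """Absolute distance of each known article's value from the unknown's."""
    return [abs(unknown_value - v) for v in known_values]


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: spread of group means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations about their own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical average sentence lengths, invented for illustration only.
unknown = 30.0
bayle_distances = distances(unknown, [28.0, 31.0, 33.0])      # [2.0, 1.0, 3.0]
larroque_distances = distances(unknown, [40.0, 38.0, 42.0])   # [10.0, 8.0, 12.0]
f_ratio, df_b, df_w = one_way_f([bayle_distances, larroque_distances])
```

With these invented figures the group means are 2 and 10, the between-group mean square is 96, the pooled within-group mean square is 2.5, and the F ratio is 38.4 on 1 and 4 degrees of freedom. With only two groups this F equals the square of the pooled two-sample t statistic, which is why the two tests can be used interchangeably in that case.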
Letting X stand for the average sentence length of the unknown article, and letting B and L denote like values for the known authors, the variables |X-B|,36 and |X-L| measure the distances of B and L from X with respect to sentence length. A test of the "difference between these two distance variables" is performed by a one-way analysis of variance. Significance of the analysis of variance test indicates that with regard to average sentence length one of the known authors more closely parallels this aspect of style in the "unknown" text. ANOVAR computes the pooled within variance37 which thus measures the within-author variation. This within-author variation is compared to the "between-author variation" by means of the "F" ratio shown below.

$$F = \frac{\text{between variation}}{\text{within variation}}$$

This study employs two types of statistical tests: the F-test (associated with the analysis of variance), and the t-test. Since, in this study, the analysis of variance was applied to data containing only two treatment groups (|X-B| and |X-L|), the F-test and t-test were equivalent. As these two tests can be used interchangeably, programming convenience dictated which of the two tests would be employed in any given instance.

Once the F value is determined it is used to decide which values are most and least significant. Professor Steel38 includes a statistical table in his text which indicates exactly how large the "F" value must be in order to conclude that the test is significant at a given level. In the present study, an F value of 4.17 or significance at the .05 level was deemed adequate.39 ANOVAR uses a subroutine to look up the F value in a table and to determine the exact level of significance. Where .XXXX is the value which was obtained from a table, the fragment "RESULTS SIGNIFICANT AT THE .XXXX LEVEL," represents a typical output statement from ANOVAR.

NOTES FOR CHAPTER III

1. Wachal, "Linguistic Evidence," p. 10.
2. See the annotated listing of programs used in this study, Appendix C.
3. Louis T. Milic, Quantitative Approach, pp. 151-152.
4. Encoded data has three special advantages. First, because complete words are not punched on cards, the data is very compact. (Milic punched all his data on approximately 2000 cards. This study has used over 25,000 cards). Second, the data can be punched in fixed format; that is, each card will have data in exactly the same place. Milic, for example, began a new code in every third column. Data in fixed format is extremely simple to process. Third, because the data is compact and easy to process, processing time is reduced.
5. Milic's data, for example, could not be used to study word lengths.
6. Alvar Ellegård, A Statistical Method, p. 21. See also Milic, Quantitative Approach, p. 71 and George Zimmer, "Attribution of Authorship," p. 23 for discussion of sample sizes.
7. The larger sample size resulting from the analysis of entire texts rather than sample extracts from the texts increased the sensitivity of the statistical tests. The average length of the test articles in this study was 8505 words.
8. See Appendix A for complete listing of works tested.
9. Proofreading of the original text revealed that a distinction had to be made between a and à, la and là, ou and où, du and dû, des and dès, etc. Where words were noted to be etymologically the same, no attempt was made to distinguish between them. Only after the entire 196,000 words of text employed in this project had been keypunched and the data prepared for final analysis did I realize the extent to which the same words (e.g. entre nous, j'entre, il est entré, l'entrée, . . .) appeared as different parts of speech. This was particularly evident with "-er" verb forms. Failure to make this distinction at an early enough stage of the study limited somewhat the comparative analyses of verb terms originally envisioned.
Had, for example, the past participles been coded so as to discern their usage, such comparative studies as imperfect vs. passé composé and/or literary tenses would have been a simple matter without the researcher having to depend on a key-word-in-context concordance. To be sure, words may often change meaning when used idiomatically or in locutions. This problem is handled through an expression-finding program called EXSOR. See Section 3 of this chapter.
10. See discussion of pre-processing program, Chapter II (pp. 32-33).
11. The most prominent studies reported are G. Udny Yule, "On Sentence Length as a Statistical Characteristic of Style in Prose," Biometrika, XXX (1939), 363-390; C. B. Williams, "A Note on the Statistical Analysis of Sentence-Length as a Criterion of Literary Style," Statistics and Style, ed. Dolezel and Bailey (New York: Elsevier, 1969), pp. 69-75; Henning Spang-Hanssen, "Sentence Length and Statistical Linguistics," Structures and Quanta (New York: Humanities Press, 1963), pp. 58-73; and Milic, Quantitative Approach to the Style of Jonathan Swift, pp. 59-61. See List of References for additional entries.
12. As previously noted (p. 47), periods and commas are the only punctuation marks included in the data base. The periods indicate complete stops or the introduction of enumerated clauses, except in ten instances where the number of words between such complete stops caused an overflow of memory in the computer. In these cases the text was edited by the insertion of a period at what appeared to be the most logical breaking point. The addition of these periods did not affect the final results as they were inserted only for the expression sorting program, EXSOR (see pp. 51-55).
13. Maurice Grevisse, Le Bon usage, 7e éd. (Paris, 1959), p. 57, lists three basic rules for dividing words into syllables as do most introductory phonetics manuals. Professor Clelland E.
Jones conveniently sub-divides Grevisse's three rules in his Manual of French Pronunciation (Salt Lake City: Deseret News Press, 1961), p. 7. For BREAK's purposes, we proceeded from the basic assumption that all syllables in French begin with a consonant and end with a vowel. To this basic rule, then, were added those provided by Professor Jones.
14. Bernard O'Donnell, "Stephen Crane's The O'Ruddy: A Problem in Authorship Discrimination," The Computer and Literary Style, ed. Jacob Leed (Kent, Ohio: Kent State University Press, 1966), p. 111.
15. For example, the word ce occurred 117 times as the first word in a sentence in l'Avis aux réfugiés. Of these 117 occurrences, 5 were adjectives, 21 were the indefinite relative pronouns, ce que and ce qui, and 91 were the indefinite, impersonal pronoun. See Appendix L for sample output.
16. See Chapter IV (p. 95).
17. In cases of possible dual classification, the function of the word in the sentence, how it was used, took precedence over its dictionary classification.
18. The expression "precision without accuracy" is George Zimmer's. "Attribution of Authorship," p. 13.
19. This card is a control card for the IBM 1620 computer only. For a description of this and other features of the IBM 1620, see the IBM Monitor I Reference Manual, p. 10.
20. Daniel N. Leeson and Donald L. Dimitry, Basic Programming Concepts and the IBM 1620 Computer (New York: Holt, Rinehart and Winston, Inc., 1962), p. 290.
21. Mathematically this function is represented as Method 1: V.= sz/m. J J J Method 2: V.= s?/m_ J J J Method 3: V.= mg/ 45% J J J where m_j denotes the average proportional use of term j (t_j) in the abstracts (paragraphs or other designated units) of a given file and s_j^2 the variance of proportional use. V_j, then, represents the "value" of t_j as an indexing term in that file. See BIRS, Chapter 13.
22. The figure five represents the average word length of the 290,000 words of data base.
23. The maximum number of positions a word may occupy in the computer is sixteen. For words of less than sixteen alphabetic characters, the locations left unoccupied are called trailing blanks.
24. Article as used here refers to the individual works tested. See Appendix A.
25. It is to be noted that space limitations do not permit reproducing here all the word/expression scores thus recorded. The scores shown above are a sample drawn from the original output.
26. Milic, Quantitative Approach, pp. 71-72; Ellegård, A Statistical Method, p. 9.
27. For a more detailed explanation of the nature, function and possible uses of the PAP program see BIRS, Chapter 13.
28. Ibid.
29. The final version of ENROOT has locations for 2000 synonyms and has been somewhat streamlined, but still deals with several lists of exceptions. A much more simplified version, based upon our past successes and failures, is being written; too late, however, for use with this study. Appendix X is the final synonym list used in this project.
30. After September of 1969, the programmer with whom I had been working was no longer available and it became necessary to transfer all programs, data, results, etc. to Indiana-Purdue University at Ft. Wayne. Its computer laboratory facilities were made available for my use, and the lead programmer/operator, Mr. Paul Gabriel, assumed the task of adapting ENROOT (and the other programs which had previously been written for the CDC 3600) to the CDC 6500 computer at Lafayette, Indiana through an IBM 360 terminal.
31. Paul Robert, ed., Petit Robert (Paris, 1967), pp. 1954-1956 and Grevisse, Le Bon usage, pp. 78-96.
32. Henry Scheffé, The Analysis of Variance (New York: Wiley and Sons, 1959), p. 331.
33. Ibid., p. 360.
34. Ibid., p. 345.
35. Robert G. D. Steel and James H. Torrie, Principles and Procedures of Statistics (New York: McGraw-Hill, 1960), p. 82.
36. Vertical lines indicate "absolute value."
37. The variance of a sample/population is a statistical measure of the variation or range of the observations within the given sample/population. It is defined by the formula $v = \sum_i (x_i - \bar{x})^2 / (n - 1)$. In general, a large variability will give a large value for the variance, v, and, conversely, a small value of v reflects a small variability in the data.
38. Steel, Principles and Procedures (New York: McGraw-Hill, 1960), p. 440.
39. Significance at the .05 level means that 5% of all possible samples may lead to the erroneous rejection of a true hypothesis. Or, in the words of Professor Jerome Li, "it . . . is the probability that a Type I error [rejection of a hypothesis that is true] may be made on the basis of a single sample." Jerome C. R. Li, Introduction to Statistical Inference (Ann Arbor, Mich.: Edwards Bros., 1959), p. 49.

CHAPTER IV

TEST RESULTS AND ANALYSIS

Past computational stylistic studies of authorship attribution have for the most part concentrated on one aspect of style. Yule first analyzed sentence lengths and later developed a statistical indicator for noun distribution in a given text. Ellegård expanded Yule's vocabulary study, establishing "plus" and "minus" words, while making little distinction between content and function terms. Statisticians Mosteller and Wallace concentrated on function words. O'Donnell studied basic syntactical patterns which might be taught in an introductory English writing course. Finally, Milic concentrated his study on structural grammar.

One purpose of this project has been to combine, for the first time, a series of variables and procedures, some of which had heretofore been individually tested, and to add to them other variants which may prove to be valid stylistic discriminants. To call this study a grammatical, vocabulary, or syntactical approach would be to misname it, for it encompasses all of these stylistic elements. The great amount of data to be analyzed made nearly complete automation the most feasible approach.
As the project progressed, experience indicated additional areas in which increased automation would even further facilitate the final analysis.1 Thus, an essential part of this study is the establishment of automated procedures and combinations of variables where no previous pattern existed. In addition, program development, selection of variables, analysis of data, and articulation of interpretive patterns augment the contribution of this study. Moreover, this investigation is unique because it utilizes complete texts of French prose to distinguish between stylistic features of French authors.2 Finally, since an author's style is a combination of an almost unlimited number of characteristics, programs designed for use in this study were developed to examine as many combinations of variables as practicable. Eight hundred and forty-three test items were delineated in the project. Possible style discriminants were tested independently and finally compared collectively. Their total represents a profile which, while perhaps only partial, is at least more complete than could be produced by single-characteristic studies of an author's style. This profile should serve to distinguish one author from another.

The 843 test items were grouped into five major areas as discussed in Chapter II (p. 21). They are represented here with the computer program which produced their results. Each program has been given a mnemonic title which indicates its function and facilitates its discussion. The test items will be discussed in the following order:

1. Sentence-level measures (SENWOL)
2. Sentence beginnings and endings (STYLBEND)
3. Function word frequencies and usage (FREQFUN)
4. Expression frequencies and usage (EXSOR)
5. Vocabulary analysis (ENROOT)

Results from each category will be presented in tabular form showing statistical differences and implied similarities between Bayle's, Larroque's and the unknown author's styles.
Also included in each appropriate table are t-test values and their confidence levels. As previously explained in Chapter III (p. 67), the one-way analysis of variance which compares two items is statistically equivalent to the t-test. Whereas the significant F-value for the analysis of variance suggests similarity, a significant t-value indicates differences rather than equality in treatment means.3 Results tabulated in this chapter, with the exception of those from SENWOL and STYLBEND, which include both statistical tests, are presented with their level of significance based upon the t-value for the sake of uniformity and clarity.

After each of the 843 test items was subjected to the t-test, tables of significance were consulted to determine which of the obtained values were likely to occur due to chance or sampling error fewer than five times out of one hundred. Most of the tests which showed significance indicated a possible error of less than one, rather than five, out of a hundred. For a sample size of 30 or greater, a t-value of 2.58 is necessary to establish an .01 level of confidence, while a value of 1.96 is generally sufficient to indicate an .05 level.4 SENWOL and STYLBEND results will be presented with both analysis of variance and t-test scores. Variables whose analysis fell below the .05 confidence level are not reported in this chapter.

For the final analysis, 843 variables were tested across five articles.
Descartes' Méditations, Pélisson's "Discours sur les oeuvres de Monsieur Sarasin," and articles by Bayle and Larroque from Les Nouvelles de la république des lettres served as controls and were not included in the final analysis.5 Because of Ascoli's convincing argument seen earlier in Chapter I (p. 6) refuting attribution of l'Avis to Pélisson, and because preliminary tests revealed that his style resembles that of the author of l'Avis much less than that of either Bayle or Larroque, Pélisson's article was omitted from the final testing. Five test articles were finally chosen:

1. P - The preface to l'Avis aux réfugiés was retained to test the hypothesis that its writer was not the same as the author of l'Avis itself.
2. X - l'Avis aux réfugiés was tested apart from its preface.
3. B1 - Cabale Chimérique provided a text known to be by Bayle, in which he denies having written l'Avis and defends himself against the attacks of Jurieu. The subject matter, genre, and length are almost identical to l'Avis.
4. B2 - Réponse d'un nouveau converti was attributed to Bayle because it exposes ideas of tolerance with which he was in sympathy. He never avowed its authorship, but then he never openly denied it either. The Réponse is supposedly a prelude to l'Avis, having appeared in 1689. It was retained as a final test article because, if it is in reality a miniature Avis, there should be strong parallels between l'Avis and its forerunner.
5. L1 - Le Prosélyte abusé was chosen from among Larroque's works made available for this study by the Bibliothèque Nationale, because it most closely resembles l'Avis and Bayle's test work in content, genre, and length.

The data tabulated and interpreted from test articles in the study were directed into the five previously listed categories, for each of which a computer program had been devised. The purpose was to arrange the data in a format which facilitates further analysis of the relationships between articles.
The results from each of the five categories will be presented in tabular form followed by tentative conclusions, where warranted, and by a summary of statistical values which suggest them.

Sentence Level Measures

Tables 4:1 through 4:9 present distributions of frequencies of words, syllables and alphabetical characters per sentence. Distributions of the following information are summarized:

1. average number of words per sentence,
2. average number of syllables per sentence,
3. average number of letters per sentence,
4. average number of letters per word,
5. average number of syllables per word, and
6. average number of letters per syllable.

Additional distributions are presented which group into genres the works to be tested. Finally, summary tables are presented of all data regarding sentence level measures.

Table 4:1 shows the division of each article into units of one hundred sentences. Even though articles X, B1, and L1 contain 1811, 1839, and 1176 sentences respectively, the table only includes the first 1000 sentences of each. Because values beyond the 1000 sentence listing did not significantly change the established pattern, there was no need to include the averages of the remaining units. Their presence would merely have extended two of the five columns. The first column shows the division into one-hundred-sentence units. The remaining five columns list the average number of words per sentence per unit. The column headings P, X, B1, B2, and L1 correspond to the final five articles tested as described above.

The major articles in this study were tested in units of one hundred sentences for internal homogeneity. Table 4:1 represents results from these internal tests of homogeneity with reference to average number of words per sentence.

TABLE 4:1. AVERAGE NUMBER OF WORDS PER SENTENCE

Sentence Numbers    P         X         B1        B2        L1
1-100               40.667    30.950    31.510    50.330    30.430
101-200             -         32.530    32.100    50.650    30.?60
201-300             -         32.440    30.060    45.770    29.510
301-400             -         30.740    30.070    -         31.880
401-500             -         28.440    31.040    -         23.470
501-600             -         33.410    36.690    -         33.620
601-700             -         32.000    31.770    -         33.120
701-800             -         31.210    28.740    -         36.400
801-900             -         31.880    28.110    -         31.990
901-1000            -         31.510    31.250    -         26.990
Final Average       40.667    30.549    31.338    49.059    30.783

Line one of the sentence number column shows the average number of words per sentence in the first one hundred sentences, line two shows the second one hundred sentences, and each following line deals with a further 100 sentences. The preface, P, contains only eighty-four sentences; therefore its values are found only in the first line of the columns. Because the preface and B2 are short, their consistency values are not as readily apparent as those in X, B1, and L1.

Consistency within a given author's work is evidenced by small deviation between one-hundred-sentence units. Had these works been written by more than one author (cf., especially X) there would not likely be the consistency which is present, unless the collaborators had intentionally designed their work to match each other's style. Table 4:1 shows consistency from unit to unit of B1 except at the 501-600 sentence level where the author brings together three of the major topics of his work in a most unclear, complex series of dependent clauses and phrases.6 In the B2 column, variations are widely divergent from each of the other representative works. A follow-up study might pursue the possibility as suggested by Robinson7 that Bayle did not write the Réponse d'un nouveau converti. On the other hand, the relative shortness of the article may not have permitted a pattern to develop. Finally, in the L1 column, three groups of sentences disturb the internal homogeneity pattern for no apparent reason.
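The unit-by-unit homogeneity check described for Table 4:1 can be illustrated with a minimal sketch. It is not the original SENWOL program; the function names `unit_means` and `most_deviant` are hypothetical, and the rule used to flag a unit (greatest distance from the average of the unit means) is an assumption chosen for illustration. The B1 values are copied from Table 4:1.

```python
from statistics import mean


def unit_means(sentence_lengths, unit_size=100):
    """Return (first, last, mean words/sentence) for each block of sentences."""
    units = []
    for start in range(0, len(sentence_lengths), unit_size):
        block = sentence_lengths[start:start + unit_size]
        units.append((start + 1, start + len(block), mean(block)))
    return units


def most_deviant(labels, means):
    """Label of the unit whose mean lies farthest from the average of all unit means."""
    centre = mean(means)
    return max(zip(labels, means), key=lambda pair: abs(pair[1] - centre))[0]


# Unit labels and the B1 column of Table 4:1.
UNIT_LABELS = ["1-100", "101-200", "201-300", "301-400", "401-500",
               "501-600", "601-700", "701-800", "801-900", "901-1000"]
B1_UNIT_MEANS = [31.510, 32.100, 30.060, 30.070, 31.040,
                 36.690, 31.770, 28.740, 28.110, 31.250]

print(most_deviant(UNIT_LABELS, B1_UNIT_MEANS))  # prints 501-600
```

Under this simple rule the flagged B1 unit is the 501-600 block, the same unit singled out in the discussion of Table 4:1.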
The final average figures reflect the mean values for the complete articles and are typical of the visual closeness of authors X, B1, and L1 in the sentence level tests.

Table 4:2 presents an overview of absolute frequencies of number of sentences, words, syllables, and letters in the test articles. The author column presents last names of each author whose work was tested in this group: preface, unknown, Pélisson, Bayle, and Larroque. The word "preface," used in tables of this study, refers to the work and to its author. Numbers one through nine refer to works by Bayle listed by title in Appendix A. Numbers one through twenty-three refer to works by Larroque similarly listed. Numbers listed under sentences, words, syllables, and letters represent absolute frequencies of occurrence in each category for each article. These figures serve as raw data for the analyses to follow.

TABLE 4:2. OVERVIEW OF ABSOLUTE FREQUENCIES

Author         Sentences    Words     Syllables    Letters
Preface        84           3418      5583         14835
Unknown        1811         55551     91610        248328
Pélisson       322          12628     20659        55365
Bayle 1        1839         57787     92948        245521
Bayle 2        287          14147     23309        62662
Bayle 3        46           2077      3375         8981
Bayle 4        105          4608      7607         20096
Bayle 5        54           2484      4183         11236
Bayle 6        36           1341      2253         5970
Bayle 7        131          5345      8686         22802
Bayle 8        27           1117      1858         4910
Bayle 9        21           813       1352         3541
B Totals       2546         89495     145571       385719
Larroque 1     1176         36190     59003        157130
Larroque 2     154          6822      11333        30121
Larroque 3     55           2144      3462         9267
Larroque 4     70           3463      5669         15293
Larroque 5     49           1977      3234         8678
Larroque 6     26           972       1560         4133
Larroque 7     140          5073      8543         22252
Larroque 8     75           3027      5073         13348
Larroque 9     37           1735      2998         7836
Larroque 10    48           2028      3359         8916
Larroque 11    43           1746      2965         7665
Larroque 12    35           1341      2251         5911
Larroque 13    63           2189      3416         8917
Larroque 14    131          4865      7838         21199
Larroque 15    27           850       1357         3564
Larroque 16    155          4883      7802         21083
Larroque 17    162          4874      8222         21823
Larroque 18    112          3757      6530         17043
Larroque 19    140          5191      8703         22739
Larroque 20    171          6255      10338        27196
Larroque 21    173          5624      9280         24667
Larroque 22    84           2711      4208         11146
Larroque 23    35           1170      1829         4849
L Totals       3161         108898    178973       474776

Only data from SENWOL include tests of Pélisson's work. These sentence level results revealed a greater difference between Pélisson and the author of l'Avis than between either Bayle or Larroque and the unknown author. Since these results confirm the rather conclusive argument presented by Ascoli reported on page 6 of this study, Pélisson's work was dropped from further testing.

Table 4:3 records the average number of words per sentence and comparative statistical values for thirty-five test articles. The article column remains the same in Tables 4:3 through 4:6 and refers to the works listed by title and author in Appendix A. The second column lists the average number of words per sentence per article. Of some significance is the marked divergence between the mean number of words per sentence of B1 when compared to averages of B2 through B9. A similar difference is readily notable when comparing L1 to L2 through L23.
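As a cross-check on how the raw counts of Table 4:2 relate to the per-sentence averages, the sketch below derives syllables per sentence, letters per sentence, and letters per syllable as simple ratios of the totals. It is an illustration only: the `Counts` class and its method names are hypothetical, and treating each average as a ratio of totals is an assumption, although for the four rows shown it agrees with the syllable and letter figures printed in Tables 4:4 through 4:6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Absolute frequencies for one article, as in Table 4:2."""
    sentences: int
    words: int
    syllables: int
    letters: int

    def syllables_per_sentence(self) -> float:
        return self.syllables / self.sentences

    def letters_per_sentence(self) -> float:
        return self.letters / self.sentences

    def letters_per_syllable(self) -> float:
        return self.letters / self.syllables


# Four rows copied from Table 4:2.
TABLE_4_2 = {
    "Preface": Counts(84, 3418, 5583, 14835),
    "Unknown": Counts(1811, 55551, 91610, 248328),
    "Bayle 1": Counts(1839, 57787, 92948, 245521),
    "Larroque 1": Counts(1176, 36190, 59003, 157130),
}

for name, counts in TABLE_4_2.items():
    print(name,
          round(counts.syllables_per_sentence(), 3),
          round(counts.letters_per_sentence(), 3),
          round(counts.letters_per_syllable(), 3))
```

Keeping the counts in an immutable record and deriving every ratio on demand means the tabulated averages never have to be stored separately from the raw data they come from.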
Presence of these distinct differences, apparent also in Tables 4:4 and 4:5, prompted the regrouping of test articles according to genre as shown in Table 4:7. Figures defining "All of B" and "All of L" summarize the mean sentence length of articles by Bayle and Larroque respectively. This summary mean emphasizes the already observed divergence between article B1 and articles B2 through B9, as well as between L1 and Larroque's remaining test works.

Statistical calculations at the bottom of the table are ANOVAR (analysis of variance) program results, designed to determine the likelihood that either B or L wrote the disputed article. All statistical tests in this study are based on the hypothesis that either Bayle or Larroque wrote l'Avis aux réfugiés. Using the "All of B" and "All of L" mean values, the computer calculated an F-value of 4.31 representing a level of significance or confidence level of .04 in favor of the conclusion that Larroque was the author of l'Avis. This tentative conclusion was later invalidated by a regrouping of the test articles as shown in Table 4:7.

TABLE 4:3. AVERAGE NUMBER OF WORDS PER SENTENCE

Article        Average Words/Sentence
Preface        40.667
Unknown        30.549
Pélisson       39.217
Bayle 1        31.338
Bayle 2        49.059
Bayle 3        45.152
Bayle 4        43.886
Bayle 5        46.000
Bayle 6        37.250
Bayle 7        40.802
Bayle 8        41.370
Bayle 9        38.714
All of B       35.151
Larroque 1     30.783
Larroque 2     44.299
Larroque 3     38.982
Larroque 4     49.471
Larroque 5     40.347
Larroque 6     37.385
Larroque 7     36.236
Larroque 8     40.360
Larroque 9     46.892
Larroque 10    42.250
Larroque 11    40.605
Larroque 12    38.314
Larroque 13    34.746
Larroque 14    37.137
Larroque 15    31.481
Larroque 16    31.503
Larroque 17    30.086
Larroque 18    33.545
Larroque 19    37.079
Larroque 20    36.579
Larroque 21    32.509
Larroque 22    32.274
Larroque 23    33.429
All of L       34.450

Mean of |X-B| is 10.958
Mean of |X-L| is 6.720
Calculated F-value is 4.312
Level of significance is .044
Statistically significant in favor of |X-L|
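The ANOVAR figures at the foot of Table 4:3 can be approximated with a short sketch. It assumes that each article contributes one observation, namely the absolute difference between its average sentence length and that of the unknown text; this reading is an assumption, not a documented detail of ANOVAR, and the function names are hypothetical. The sketch also computes the pooled two-sample t statistic to illustrate the equivalence of the F-test and t-test for two groups.

```python
from math import sqrt
from statistics import mean

# Average words per sentence, copied from Table 4:3.
X_MEAN = 30.549
BAYLE = [31.338, 49.059, 45.152, 43.886, 46.000, 37.250, 40.802, 41.370, 38.714]
LARROQUE = [30.783, 44.299, 38.982, 49.471, 40.347, 37.385, 36.236, 40.360,
            46.892, 42.250, 40.605, 38.314, 34.746, 37.137, 31.481, 31.503,
            30.086, 33.545, 37.079, 36.579, 32.509, 32.274, 33.429]


def distances(unknown, known):
    """Distance variable |X - article mean| for each known article."""
    return [abs(unknown - value) for value in known]


def within_ss(group):
    """Sum of squared deviations of a group from its own mean."""
    centre = mean(group)
    return sum((value - centre) ** 2 for value in group)


def one_way_f(a, b):
    """One-way analysis of variance F ratio for two groups."""
    grand = mean(a + b)
    # Between-group mean square; with two groups it has 1 degree of freedom.
    between_ms = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    # Pooled within-group variance.
    within_ms = (within_ss(a) + within_ss(b)) / (len(a) + len(b) - 2)
    return between_ms / within_ms


def pooled_t(a, b):
    """Two-sample t statistic using the pooled variance."""
    pooled_var = (within_ss(a) + within_ss(b)) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


D_B = distances(X_MEAN, BAYLE)
D_L = distances(X_MEAN, LARROQUE)
F_VALUE = one_way_f(D_B, D_L)
T_VALUE = pooled_t(D_B, D_L)

print(round(mean(D_B), 2), round(mean(D_L), 2), round(F_VALUE, 2))
```

Under the stated assumption the sketch gives distance means of about 10.96 and 6.72 and an F ratio of about 4.31 on 1 and 30 degrees of freedom, in line with the tabulated values and above the 4.17 threshold cited in Chapter III. The square of the t statistic equals the F ratio, which is why the two tests could be used interchangeably.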
Table 4:4 shows the average number of syllables per sentence of each work analyzed. As noted in previous tables, the article column refers to numbered works named in Appendix A. The second column lists the average number of syllables per sentence per article. Again it is apparent that the average number of syllables per sentence of article B1 diverges sharply from the means of B2 through B9. This pattern of divergence is also repeated in the L1 vs. L2 through L23 figures. The mean values defining "All of B" and "All of L" summarize the average number of syllables per sentence in the works by Bayle and Larroque used in this study. This summary mean emphasizes the already mentioned differences between article B1 and articles B2 through B9, as well as between L1 and Larroque's remaining articles.

At the bottom of the table are found statistical calculations computed by the analysis of variance program. The mean of |X-B| was statistically compared to the mean of |X-L|. The comparison revealed a statistical difference with a calculated F-value of 3.95 and an .05 level of significance or confidence level in favor of the conclusion that Larroque was the author of l'Avis aux réfugiés. This tentative conclusion was later invalidated by a regrouping of test articles as shown in Table 4:7.

TABLE 4:4. AVERAGE NUMBER OF SYLLABLES PER SENTENCE

Article        Average Syllables/Sentence
Preface        66.464
Unknown        50.585
Pélisson       64.158
Bayle 1        50.543
Bayle 2        81.216
Bayle 3        73.370
Bayle 4        72.448
Bayle 5        77.463
Bayle 6        62.583
Bayle 7        66.305
Bayle 8        68.815
Bayle 9        64.381
All of B       57.176
Larroque 1     50.173
Larroque 2     73.591
Larroque 3     62.945
Larroque 4     80.986
Larroque 5     66.000
Larroque 6     60.000
Larroque 7     61.021
Larroque 8     67.640
Larroque 9     81.027
Larroque 10    69.979
Larroque 11    68.953
Larroque 12    64.314
Larroque 13    54.222
Larroque 14    59.832
Larroque 15    50.259
Larroque 16    50.335
Larroque 17    50.753
Larroque 18    58.304
Larroque 19    62.164
Larroque 20    60.456
Larroque 21    53.642
Larroque 22    50.095
Larroque 23    52.257
All of L       56.619

Mean of |X-B| is 17.993
Mean of |X-L| is 10.801
Calculated F-value is 3.955
Level of significance is .053
Statistically significant in favor of |X-L|

Table 4:5 lists the average number of letters per sentence of each work analyzed. As noted in previous tables, the article column refers to selected works delineated in Appendix A. In Table 4:5 the second column records the average number of letters per sentence in each test article. Once again, a distinct difference is observed when the mean values of B1 and B2 through B9 as well as L1 and L2 through L23 are compared. Figures corresponding to "All of B" and "All of L" in this case summarize the average number of letters per sentence over all Bayle and Larroque articles tested. Once more this summary mean emphasizes the divergence, observed also in Tables 4:3 and 4:4, between article B1 and articles B2 through B9. A similar distinction is reaffirmed when averages of L1 and L2 through L23 are compared with their summary mean.

Statistical calculations at the bottom of the table are analysis of variance program results which determine the likelihood that either Bayle or Larroque wrote the disputed article, based upon the hypothesis that one or the other was, in reality, its author.
Using "All of B" and "All of L" mean values, the computer calculated an F-value of 4.349 which represents a level of significance or confidence level of .04 in favor of the conclusion that Larroque was the author of l'Avis aux réfugiés. Again this tentative conclusion was invalidated by a regrouping of test articles as shown in Table 4:7.

TABLE 4:5. AVERAGE NUMBER OF LETTERS PER SENTENCE

Article        Average Letters/Sentence
Preface        176.607
Unknown        137.122
Pélisson       171.941
Bayle 1        133.508
Bayle 2        218.334
Bayle 3        195.239
Bayle 4        191.390
Bayle 5        208.074
Bayle 6        165.833
Bayle 7        174.061
Bayle 8        181.852
Bayle 9        168.619
All of B       151.500
Larroque 1     133.614
Larroque 2     195.591
Larroque 3     168.491
Larroque 4     218.471
Larroque 5     177.102
Larroque 6     158.962
Larroque 7     158.943
Larroque 8     177.973
Larroque 9     211.784
Larroque 10    185.750
Larroque 11    178.256
Larroque 12    168.886
Larroque 13    131.540
Larroque 14    161.824
Larroque 15    132.000
Larroque 16    136.019
Larroque 17    134.710
Larroque 18    152.170
Larroque 19    162.421
Larroque 20    159.041
Larroque 21    142.584
Larroque 22    132.690
Larroque 23    138.543
All of L       150.198

Mean of |X-B| is 45.560
Mean of |X-L| is 26.378
Calculated F-value is 4.349
Level of significance is .043
Statistically significant in favor of |X-L|

Table 4:6 presents the average number of letters per word, syllables per word, and letters per syllable of each text analyzed in this study. Names and numbers in the article column refer to works listed by title and author in Appendix A. Figures in the second and third columns represent the average word length, determined first by the number of alphabetical characters per word and then by the number of syllables per word. In the fourth column are listed the average number of letters per syllable. Whereas a distinct divergence of mean values appeared in Tables 4:3, 4:4, and 4:5, where letters and syllables were compared at the sentence level, no such difference is evident when these items are observed at the word level.
The numbers defining "All of B" and "All of L" summarize the mean values of articles B1 through B9 and L1 through L23 in the letters/word, syllables/word, and letters/syllable columns. A visual analysis of the averages reveals only slight differences between values in each column. In Bayle's works, the average number of letters per word, for example, ranges only from 4.2 to 4.5. Larroque's averages, on the other hand, extend from 4.0 to 4.5. In the average syllables/word column, an even greater degree of consistency appears in the representations of Bayle's corpus. This is not so, however, in the case of Larroque, whose averages range from 1.5 to 1.7. Finally, in the letters per syllable column, general consistency appears in all represented works, the overall range varying only from 2.5 to 2.7.

TABLE 4:6. AVERAGE NUMBER OF LETTERS PER WORD, SYLLABLES PER WORD, AND LETTERS PER SYLLABLE

                Average         Average           Average
Article         Letters/Word    Syllables/Word    Letters/Syllable
Preface         4.343           1.634             2.657
Unknown         4.489           1.656             2.711
Pélisson        4.384           1.636             2.680
Bayle 1         4.260           1.613             2.641
Bayle 2         4.450           1.655             2.688
Bayle 3         4.324           1.625             2.661
Bayle 4         4.361           1.651             2.642
Bayle 5         4.523           1.684             2.686
Bayle 6         4.452           1.680             2.650
Bayle 7         4.266           1.625             2.625
Bayle 8         4.396           1.663             2.643
Bayle 9         4.355           1.663             2.619
All of B        4.310           1.627             2.650
Larroque 1      4.340           1.630             2.663
Larroque 2      4.415           1.661             2.658
Larroque 3      4.322           1.615             2.677
Larroque 4      4.416           1.637             2.698
Larroque 5      4.389           1.636             2.683
Larroque 6      4.252           1.605             2.649
Larroque 7      4.386           1.684             2.605
Larroque 8      4.410           1.676             2.631
Larroque 9      4.516           1.728             2.614
Larroque 10     4.396           1.656             2.654
Larroque 11     4.390           1.698             2.585
Larroque 12     4.408           1.679             2.626
Larroque 13     4.074           1.561             2.610
Larroque 14     4.357           1.611             2.705
Larroque 15     4.193           1.596             2.626
Larroque 16     4.318           1.598             2.702
Larroque 17     4.477           1.687             2.654
Larroque 18     4.536           1.738             2.610
Larroque 19     4.380           1.677             2.613
Larroque 20     4.348           1.653             2.631
Larroque 21     4.386           1.650             2.658
Larroque 22     4.111           1.552             2.649
Larroque 23     4.144           1.563             2.651
All of L        4.360           1.643             2.653

Mean of /X-B/ is             .119            .019              .060
Mean of /X-L/ is             .148            .041              .064
Calculated F-value is        .511           3.875              .160
Level of significance is     .513            .055              .694

When the computed mean values did not make readily apparent a significant divergence, the value of using a statistical analysis to sharpen the distinctiveness ratio became more evident. For each summary value of Table 4:6 statistical calculations were performed, yielding the results shown at the bottom of the table. None of the three tests produced positive results pointing to the authorship of l'Avis aux réfugiés. In order to be considered significant the calculated F-value would have to be 4.17 or greater. Only the values for average syllables per word approach this figure. Again tentative conclusions drawn from these tests were later invalidated by a regrouping of test articles as shown in Table 4:7.
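The sentence-level and word-level averages reported in the SENWOL tables can be illustrated with a minimal sketch. The function name `senwol_averages` and the sample text are hypothetical; the sketch assumes that a period ends each sentence and that only alphabetic characters count as letters. It is not the original SENWOL program, and syllable counts are omitted because the syllabification rules are not restated in this section.

```python
def senwol_averages(text):
    """Return words/sentence, letters/sentence, and letters/word for a text.

    Assumptions (not taken from the original program): sentences end at
    periods, and a "letter" is any alphabetic character.
    """
    # Split into sentences, then into whitespace-separated words.
    sentences = [piece.split() for piece in text.split(".")]
    sentences = [words for words in sentences if words]  # drop empty pieces
    words = [w for sentence in sentences for w in sentence]
    # Count alphabetic characters only, so punctuation is ignored.
    letters = sum(ch.isalpha() for w in words for ch in w)
    return {
        "words_per_sentence": len(words) / len(sentences),
        "letters_per_sentence": letters / len(sentences),
        "letters_per_word": letters / len(words),
    }


sample = "Il vient. Elle part demain."  # hypothetical two-sentence text
averages = senwol_averages(sample)
```

Because letters per syllable equals letters per word divided by syllables per word, the three word-level columns of Table 4:6 are linked: for the preface, 4.343 / 1.634 is about 2.658, which agrees with the tabulated 2.657 up to rounding.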
Because word level tests as represented in Table 4:6 revealed no significant differences, one and two letter words, primarily functional in nature, were omitted from the test data so that a higher degree of significance or distinction between the test authors' prose rhythm might be obtained. However, even though a greater divergence did begin to reveal itself, differences were so slight that it was impossible to draw any conclusions from them. Tabulated values of the average number of letters per word, the average number of syllables per word and the average number of letters per syllable with one and two letter words removed are found in Appendix F.

Table 4:7 divides the test articles into two groups: expository criticism and literary criticism, and shows statistically significant differences obtained after the articles were regrouped with regard to sentence lengths.

TABLE 4:7. REGROUPING OF TEST ARTICLES

Groups    Articles
B-1       B1 and B2 (Cabale Chimerique and Réponse)
B-2       B3 through B9 (Nouvelles de la république des lettres)
L-1       L1 (Le Prosélyte abusé)
L-2       L2 through L23 (Nouvelles de la république des lettres)

Significant differences for the variable: Number of words per sentence

Articles       t-value    Conf. Level (CL)
B-1 & L-1      2.520      .025
B-1 & B-2      4.657      .001
L-1 & L-2      4.675      .001

Tables 4:3 through 4:5 showed a distinct numerical difference between the mean values of the three variables tested. In addition, slight variability was noted in the test results reported in Table 4:6. The consistency of this differentiation suggested the presence of another stylistic feature as cause of the distinction. In Tables 4:3 through 4:5 the change of mean seemed to coincide with a change in authorial intent, i.e., when the author stopped editorializing and began criticizing literary works. To test this hypothesis further, works which represent editorializing or expository criticism were grouped separately from those drawn from literary analysis.
Thus, the group B-1 comprises articles B1 and B2, and the group B-2 consists of B3 through B9. Likewise L-1 is L1, and L2 through L23 become L-2. T-tests were performed on grouped articles denominated B-1 and L-1, B-1 and B-2, and L-1 and L-2. A t-value of 2.520, with a confidence level of .025, was obtained from the test of B-1 and L-1. This confidence level indicates that a difference of this size between the two authors could occur by chance fewer than two and one-half times in each one hundred samples, yielding a 97.5% chance that the noted difference was not due to sampling error. Values of 4.657 and 4.675, both yielding a confidence level of .001, resulted from t-tests performed on groups B-1 vs. B-2 and L-1 vs. L-2 respectively. The .001 significance level suggests better than a 99% chance (999 in one thousand) that the noted difference was not due to sampling error or chance.

T-tests for significant differences thus performed on SENWOL data yielded substantiation for the hypothesis that grouping according to authorial intent was justified. Results from the regrouping tests tend to support Sherbo's hypothesis that "there is one style for the criticism of poetry, a second for the considerations on corn laws, a third for introducing a new periodical to the public, a fourth for reviewing books . . ."8 In addition, whereas the analysis of variance had failed to distinguish between authors B and L,9 the new groupings made computational differences in styles of the two writers more readily apparent. B-1 and L-1 are examples of expository criticism, while B-2 and L-2 represent literary criticism. A higher level of confidence was obtained from between-genre tests than from those within genres. This fact served as the basis for further use of genre groupings in seeking a solution for the disputed authorship of l'Avis aux réfugiés.
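The t-values reported for the regrouped articles compare two sets of sentence-length observations. As a minimal sketch, assuming a pooled-variance (Student) two-sample t-test, a variant that this section does not specify, the statistic could be computed as follows. The function name `pooled_t` and the sample values are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with a pooled variance estimate.

    Assumption: both samples share a common variance (Student's t-test).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    # Standard error of the difference between the two means.
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err


# Hypothetical words-per-sentence observations for two groups of articles.
group_one = [31, 28, 35, 30]
group_two = [48, 52, 45, 50]
t_value = pooled_t(group_one, group_two)
```

The absolute value of the statistic is then compared with the tabled critical value for the chosen confidence level and n_a + n_b - 2 degrees of freedom.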
Table 4:8 shows the average number of words, syllables, and letters per sentence; the average number of letters and syllables per word; and the average number of letters per syllable after articles dealing with literary criticism had been eliminated. The first column provides a shortened form of the article title followed by the letter abbreviation which has been assigned for the purpose of presenting these data. The following six columns summarize the average number of words per sentence, syllables per sentence, letters per sentence, letters per word, syllables per word, and letters per syllable for each article. These data accumulate the information presented in Tables 4:3 through 4:6 for the articles listed.

TABLE 4:8. AVERAGE NUMBER OF WORDS, SYLLABLES, AND ALPHABETIC CHARACTERS PER SENTENCE; AVERAGE NUMBER OF LETTERS PER WORD, SYLLABLES PER WORD, AND LETTERS PER SYLLABLE

                 Average   Average   Average   Average   Average   Average
Article          Wrd/Sen   Syl/Sen   Let/Sen   Let/Wrd   Syl/Wrd   Let/Syl
Preface    P     40.667    66.464    176.607   4.343     1.634     2.657
L'Avis     X     30.549    50.585    137.122   4.489     1.656     2.711
Cabale     B1    31.338    50.543    133.508   4.260     1.613     2.641
Réponse    B2    49.059    81.216    218.334   4.450     1.655     2.688
Prosélyte  L1    30.783    50.173    133.614   4.340     1.630     2.663

Before the test articles were regrouped (see Table 4:7), a linear representation of sentence level measures, e.g., words per sentence, looked like

    X ------------------ L  B

where X, L, and B represent the styles of the unknown author, Larroque, and Bayle with respect to words, letters, and syllables per sentence. The distance x between B (or L) and X was significantly measurable with Larroque's values resembling those of l'Avis more closely than Bayle's. However, L and B were so similar that the analysis of variance could not distinguish between them.
Similarly, a linear representation of word level measures, e.g., letters per word, appeared as

    X ---------- B  L

where the values for Bayle appear closer to those of l'Avis but still not statistically distinguishable from Larroque's. After the test articles were regrouped, separation between X, B, and L became much less obvious. Some values for Bayle and Larroque, e.g., words, syllables, and letters per sentence, now approached those of the unknown author so closely that statistical tests could not distinguish them. Still, as shown in Table 4:9, it was possible to statistically differentiate between all of them in four of six tests. In addition, three of six SENWOL tests showed significant divergence between the unknown text and its preface.

Summary and Conclusions of SENWOL Results

Table 4:9 is a summary of the SENWOL results based on the regrouped article totals.

The column denominated "Test Variable" lists six stylistic discriminators used to analyze test articles on the basis of sentence level and individual word measures. Column P ≠ X (read P not equal to X) contains the t-values obtained from comparing the test variables in l'Avis with those of its preface. P ≠ X symbolizes the presence of a statistically significant difference between articles with regard to corresponding test variables. Confidence level (CL) columns indicate the degree of confidence with which it may be judged that differences are not due to sampling error or to chance. Figures in the X ≠ B1 or B2 column corresponding to words per sentence, syllables per sentence, and letters per sentence represent a statistically significant difference between X and B2; whereas the t-values for the variables letters per word, syllables per word, and letters per syllable were derived from the X to B1 comparisons.

TABLE 4:9. SENWOL SUMMARY

Test Variable             P ≠ X    CL      X ≠ L1   CL
Words Per Sentence        2.931    .005    -        -
Syllables Per Sentence    2.801    .01     -        -
Letters Per Sentence      2.557    .025    -        -
Letters Per Word          -        -       4.892    .001
Syllables Per Word        -        -       2.704    .01
Letters Per Syllable      -        -       4.227    .001

Test Variable             X ≠ B1 or B2    CL      B1 ≠ L1   CL
Words Per Sentence        8.957           .001    -         -
Syllables Per Sentence    8.883           .001    -         -
Letters Per Sentence      8.653           .001    -         -
Letters Per Word          9.748           .001    3.702     .001
Syllables Per Word        6.764           .001    3.272     .005
Letters Per Syllable      7.021           .001    1.962     .05

In order to use t-test results to suggest a probable author, it was first necessary to establish a hypothesis against which tests of difference could be made. Since both Bayle's and Larroque's complete works include l'Avis aux réfugiés, and because literary critics and historians have seriously considered no other writer as its author, the initial hypothesis that either Bayle or Larroque wrote l'Avis seemed both justified and logical. To name a probable author under this hypothesis required that two conditions be satisfied. Because the possible-author field had been reduced to two candidates, the first condition required that, for any given variable, there be a statistically significant difference between Bayle and Larroque.10 The second condition required the presence of a significant difference between one of the supposed writers and the unknown author, but not between the unknown author and the remaining candidate. When these conditions were met, in view of the original hypothesis, the writer whose use of the stylistic element in question did not differ significantly from that of the unknown author was advanced as the probable author of l'Avis aux réfugiés.

There remains, however, another possible inference that may be drawn from t-test results. If, for example, significant values reveal a decided difference between B and X, as well as between L and X, whether or not B differs from L is immaterial.
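The conditions for naming a probable author can be restated as a small decision function. This is only a sketch of the rule as worded in the text; the function name `probable_author`, its boolean arguments, and the use of `None` for "no inference possible" are illustrative assumptions rather than part of the original programs. Each argument is True when the corresponding t-test showed a statistically significant difference.

```python
def probable_author(x_differs_from_b, x_differs_from_l, b_differs_from_l):
    """Apply the two-condition rule for one stylistic variable.

    Returns "B" (Bayle), "L" (Larroque), "N" (neither candidate),
    or None when the variable supports no inference.
    """
    # Both candidates differ from the unknown author: whether B differs
    # from L is then immaterial.
    if x_differs_from_b and x_differs_from_l:
        return "N"
    # First condition: the two candidates must differ from each other.
    if not b_differs_from_l:
        return None
    # Second condition: exactly one candidate differs from the unknown
    # author; the other candidate is advanced as the probable author.
    if x_differs_from_b:
        return "L"
    if x_differs_from_l:
        return "B"
    return None


# Hypothetical outcome: X differs from B, not from L, and B differs from L.
example = probable_author(True, False, True)
```

Applied variable by variable, a rule of this kind yields one label per test, which can then be tallied across the whole set of discriminators.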
Whenever both Bayle and Larroque differ significantly from the unknown author, the probable author is listed as neither (N).

On the basis of these criteria, t-tests reported in Table 4:9 were performed after regrouping test articles. Final statistical results from SENWOL data showed three tests, i.e., letters per word, syllables per word, and letters per syllable, which revealed statistically significant results meeting the conditions just stated (X ≠ B1 and X ≠ L1). In all, the SENWOL summary table shows nine values significant at the .001 level, two at the .005 level, two at the .01 level, one at the .025 and one at the .05 level, for a total of fifteen significant values of a possible twenty-four. Ten of these values, all having at least an .01 confidence level, suggest that neither Bayle nor Larroque authored the unknown text.

In summary, then, the SENWOL data provided information which led to two conclusions: first, there is, in fact, a quantitative difference between expository and literary criticisms in sentence level measures, and second, the number of observed and calculated differences in this section provides considerable evidence that the author of l'Avis aux réfugiés and Bayle employ different writing practices in their use of SENWOL variables.

Sentence Beginnings and Endings

Tables 4:10 through 4:28 present results obtained from the STYLBEND program which located and isolated first and last words of each sentence in thirty-four original test articles. Supplemental hand sorting placed raw STYLBEND data into fifteen grammatical categories as listed below.

    adjectives      gerunds           nouns
    adverbs         infinitives       numbers
    articles        interrogatives    prepositions
    conjunctions    Latin             pronouns
    exclamations    negations         verbs (finite)

A simple counting and averaging program then summed the absolute frequencies for each part of speech and calculated the percentage of occurrence of each part of speech as a sentence beginning or ending for each test article.
In order to construct a coherent pattern for presentation of results and to facilitate the hand sorting procedure, part of speech categories for both initial and terminal words were kept the same. As might well be expected, some zero values were obtained because of this decision (e.g., articles, interrogatives). Nevertheless, since zero values did not affect the final statistics for this set of variables, they were ignored, and the categories remained.

Tables presenting STYLBEND results fall into three groups. Tables 4:10 through 4:24 record frequencies, percentage of use, and statistical values relating to the use of each of fifteen parts of speech as sentence beginnings and/or endings for all thirty-four test works. Tables 4:25 and 4:26 summarize data drawn only from articles related to expository criticism. These five articles (see Table 4:7) had revealed themselves as being of a genre more like l'Avis aux réfugiés than those of literary criticism. Finally, Tables 4:27 and 4:28 summarize significant values obtained from the STYLBEND program, their t-test results, their levels of confidence and a probable author hypothesis.

Table 4:10 shows absolute frequencies, percentage of use, and statistical values relating to the use of adjectives as sentence beginnings and endings in thirty-four test articles.

Article columns identify position of the adjective as either initial or terminal. Alphabetical characters (P, X, B, or L) and digits following them in article columns identify works tested as listed in Appendix A. In frequency columns appear the number of times an adjective is used in its corresponding article as either a sentence beginning or ending. In order to correct for distortions of statistical results due to differences in article lengths, percentage of use values were calculated and are presented in percent columns.
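The percentage-of-use correction amounts to dividing each frequency by the number of sentences in the article. A minimal sketch follows; the function name `percent_of_use` is hypothetical, and the sentence count of 84 for the preface is not stated in the text but inferred here from the tabulated pair of 4 occurrences and 4.762 percent.

```python
def percent_of_use(frequency, sentence_count):
    """Percentage of sentences that begin (or end) with a given part of speech."""
    return round(100.0 * frequency / sentence_count, 3)


# Assumption: the preface (P) contains 84 sentences, inferred from 4 / 4.762%.
PREFACE_SENTENCES = 84

adjective_first = percent_of_use(4, PREFACE_SENTENCES)   # adjectives, first word
adjective_last = percent_of_use(10, PREFACE_SENTENCES)   # adjectives, last word
article_first = percent_of_use(12, PREFACE_SENTENCES)    # articles, first word
```

Under that assumed sentence count, the three results agree with the percent values tabulated for the preface (4.762, 11.905, and 14.286), which suggests that the same divisor serves for both the initial and the terminal position.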
TABLE 4:10. ADJECTIVES AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency   Percent     Article     Frequency   Percent
First P           4       4.762     Last P          10       11.905
First X          82       4.528     Last X         279       15.406
First B1        100       5.435     Last B1        254       13.804
First B2          9       3.136     Last B2         45       15.679
First B3          2       4.348     Last B3          8       17.391
First B4         12      11.321     Last B4         15       14.151
First B5          4       7.273     Last B5          8       14.545
First B6          5      13.889     Last B6          9       25.000
First B7         16      12.214     Last B7         17       12.977
First B8          4      14.815     Last B8          6       22.222
First B9          4      19.048     Last B9          2        9.524
First L1         60       5.102     Last L1        183       15.561
First L2          5       3.247     Last L2         25       16.234
First L3          4       7.143     Last L3          7       12.500
First L4          8      11.429     Last L4         12       17.143
First L5          7      14.286     Last L5          9       18.367
First L6          4      15.385     Last L6          4       15.385
First L7          5       3.597     Last L7         26       18.705
First L8          7       9.211     Last L8          5        6.579
First L9          2       5.405     Last L9         10       27.027
First L10         2       4.167     Last L10        11       22.917
First L11         0       0.000     Last L11         4        9.302
First L12         3       8.571     Last L12         6       17.143
First L13         2       3.175     Last L13        15       14.815
First L14         1        .763     Last L14        25       19.084
First L15         0       0.000     Last L15         4       14.815
First L16         8       5.161     Last L16        29       18.710
First L17         8       4.938     Last L17        39       24.074
First L18         4       3.571     Last L18        22       19.643
First L19         8       5.714     Last L19         0        0.000
First L20        12       6.977     Last L20        33       19.186
First L21        10       5.780     Last L21        16        9.249
First L22         1       1.190     Last L22        16       19.048
First L23         1       2.857     Last L23         3        8.571

Hypothesis of Equal Variance Fails with t-value of 2.777
Mean of /X-B/ is              5.985       3.410
Mean of /X-L/ is              2.997       4.887
Calculated F-value is         4.539       1.023
Level of significance is       .039        .320
Statistically significant in favor of /X-L/

Results of the test for internal homogeneity of variance and ANOVAR (analysis of variance) results for all values in Table 4:10 follow the first and last word values of L23. The hypothesis of equal variance supposes that use of the prescribed part of speech as a sentence first word is so consistent that discrimination between individual works by the same author is unlikely.11
For adjectives as sentence beginnings the hypothesis of homogeneity of variance was rejected as witnessed by the statement, "Hypothesis of equal variance fails with t-value of 2.777." Stylistically speaking, then, Bayle and Larroque are not consistent in their use of adjectives as sentence beginnings; but which, if either, most resembles the unknown author?

Below the test for internal variance results are found the ANOVAR results, where each B percent value is subtracted from the X percent value and the resulting absolute differences are averaged to obtain the mean absolute12 difference between the articles thus tested. The "slash" marks on either side of /X-B/ and /X-L/ define these differences as absolute values. Resulting figures were used to calculate the F statistic. The computed F-value of 4.539 for adjectives used as sentence beginnings represents a confidence level of .039 in favor of the conclusion that Larroque was the author of l'Avis aux réfugiés. Just as several other tentative conclusions were later negated by a regrouping of the test articles shown in Table 4:7, so it was with this one (see Table 4:27). Statistical computations for adjectives used as sentence endings yielded no significant results.

Table 4:11 shows absolute frequencies, percentage of use, and statistical values relating to use of adverbs as sentence beginnings and endings in thirty-four test articles.

TABLE 4:11. ADVERBS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency   Percent     Article     Frequency   Percent
First P           1       1.190     Last P           4        4.762
First X          44       2.430     Last X          66        3.644
First B1         26       1.413     Last B1         75        4.076
First B2          9       3.136     Last B2         12        4.181
First B3          3       6.522     Last B3          4        8.696
First B4          4       3.774     Last B4          2        1.887
First B5          2       3.636     Last B5          1        1.818
First B6          0       0.000     Last B6          0        0.000
First B7          3       2.290     Last B7          4        3.053
First B8          1       3.704     Last B8          0        0.000
First B9          0       0.000     Last B9          1        4.762
First L1         26       2.211     Last L1         41        3.486
First L2          1        .649     Last L2          8        5.195
First L3          1       1.786     Last L3          2        3.571
First L4          0       0.000     Last L4          6        8.571
First L5          0       0.000     Last L5          1        2.041
First L6          0       0.000     Last L6          1        3.846
First L7          2       1.439     Last L7          4        2.878
First L8          2       2.632     Last L8          3        3.947
First L9          4      10.811     Last L9          5       13.514
First L10         1       2.083     Last L10         3        6.250
First L11         0       0.000     Last L11         1        2.326
First L12         0       0.000     Last L12         1        2.857
First L13         0       0.000     Last L13         1        1.587
First L14         2       1.527     Last L14         5        3.817
First L15         1       3.704     Last L15         2        7.407
First L16         1        .645     Last L16         3        1.935
First L17         7       4.321     Last L17         4        2.469
First L18         1        .893     Last L18         3        2.679
First L19         2       1.429     Last L19        38       27.143
First L20         3       1.744     Last L20         7        4.070
First L21         5       2.890     Last L21         8        4.624
First L22         0       0.000     Last L22         0        0.000
First L23         0       0.000     Last L23         1        2.857

Mean of /X-B/ is              1.626       2.066
Mean of /X-L/ is              1.805       2.754
Calculated F-value is          .087        .159
Level of significance is       .766        .694

Article columns identify the position of the adverb as either the first or last word in sentences tested in the preface to l'Avis aux réfugiés, l'Avis itself, or in works of Bayle (B) and Larroque (L). Numbers immediately next to these author denominators further identify works tested as listed in Appendix A. The actual number of times an adverb is used in the adjacent article as either a beginning or ending is tabulated in the frequency column. In order to correct for distortions of statistical results due to differences in article lengths, percentage of use values were calculated and are presented in percent columns.

At the bottom of the table are located statistical calculations from ANOVAR used to determine the likelihood that either Bayle or Larroque wrote the disputed article, based upon the hypothesis that one or the other did, in reality, write it.
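The summary figures at the foot of the STYLBEND tables can be approximated with a short sketch. It assumes, as the tabulated means suggest, that the absolute difference between the X percent value and each individual article's percent value is taken first, and that a one-way analysis of variance is then run on the two resulting groups of differences. The function names are hypothetical and this is not the original ANOVAR program; the data are the "First" percent values for adverbs in Table 4:11.

```python
from statistics import mean


def abs_differences(x_value, article_values):
    """Absolute difference between the unknown text and each article."""
    return [abs(x_value - v) for v in article_values]


def one_way_f(group_a, group_b):
    """F statistic of a one-way analysis of variance on two groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    grand_mean = (sum(group_a) + sum(group_b)) / (n_a + n_b)
    # Between-group sum of squares (one degree of freedom for two groups).
    ss_between = n_a * (mean_a - grand_mean) ** 2 + n_b * (mean_b - grand_mean) ** 2
    # Within-group sum of squares, with n_a + n_b - 2 degrees of freedom.
    ss_within = (sum((v - mean_a) ** 2 for v in group_a)
                 + sum((v - mean_b) ** 2 for v in group_b))
    return ss_between / (ss_within / (n_a + n_b - 2))


# Percent of sentences beginning with an adverb (Table 4:11, "First" column).
x_percent = 2.430
bayle = [1.413, 3.136, 6.522, 3.774, 3.636, 0.000, 2.290, 3.704, 0.000]
larroque = [2.211, 0.649, 1.786, 0.000, 0.000, 0.000, 1.439, 2.632, 10.811,
            2.083, 0.000, 0.000, 0.000, 1.527, 3.704, 0.645, 4.321, 0.893,
            1.429, 1.744, 2.890, 0.000, 0.000]

diff_b = abs_differences(x_percent, bayle)
diff_l = abs_differences(x_percent, larroque)
mean_xb = mean(diff_b)   # close to the tabulated 1.626
mean_xl = mean(diff_l)   # close to the tabulated 1.805
f_value = one_way_f(diff_b, diff_l)
```

With 9 Bayle and 23 Larroque articles the F statistic has 1 and 30 degrees of freedom, for which the tabled .05 critical value is 4.17, the threshold cited earlier in this chapter.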
The absence of a "Hypothesis of equal variance fails" statement in this table and in Tables 4:14 through 4:24 indicates that homogeneity of internal variance does exist. Using mean absolute values of /X-B/ and /X-L/, an F-value was calculated for adverbs used both as beginnings and endings. Lack of significant results indicates that all authors tested use adverbs as beginnings and endings with such similarity that statistical tests used could reveal no substantial differences. The .76 level of significance means that there is roughly a three in four possibility that any difference observed is due to sampling error or to chance and not due to actual differences in writing habits. Both the .76 and .69 confidence levels fall far short of the .05 level considered acceptable in this study.

Table 4:12 shows absolute frequencies, percentage of use, and statistical values relating to use of articles (definite and indefinite) as sentence beginnings and endings in thirty-four test articles.

TABLE 4:12. ARTICLES AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency   Percent     Article     Frequency   Percent
First P          12      14.286     Last P           0        0.000
First X         126       6.957     Last X           0        0.000
First B1        150       8.152     Last B1          0        0.000
First B2         16       5.575     Last B2          0        0.000
First B3          2       4.348     Last B3          0        0.000
First B4         16      15.094     Last B4          0        0.000
First B5         14      25.455     Last B5          0        0.000
First B6          7      19.444     Last B6          0        0.000
First B7         20      15.267     Last B7          0        0.000
First B8          9      33.333     Last B8          0        0.000
First B9          0       0.000     Last B9          0        0.000
First L1        102       8.673     Last L1          0        0.000
First L2         19      12.338     Last L2          0        0.000
First L3         11      19.643     Last L3          0        0.000
First L4         10      14.286     Last L4          0        0.000
First L5          8      16.327     Last L5          0        0.000
First L6          3      11.538     Last L6          0        0.000
First L7         12       8.633     Last L7          0        0.000
First L8         17      22.368     Last L8          0        0.000
First L9          3       8.108     Last L9          0        0.000
First L10         7      14.583     Last L10         0        0.000
First L11         3       6.977     Last L11         0        0.000
First L12         9      25.714     Last L12         0        0.000
First L13         2       3.175     Last L13         0        0.000
First L14         4       3.053     Last L14         0        0.000
First L15         1       3.704     Last L15         0        0.000
First L16         9       5.806     Last L16         0        0.000
First L17        11       6.790     Last L17         0        0.000
First L18        12      10.714     Last L18         0        0.000
First L19        18      12.857     Last L19         0        0.000
First L20        18      10.465     Last L20         0        0.000
First L21        29      16.763     Last L21         0        0.000
First L22         8       9.524     Last L22         0        0.000
First L23         5      14.286     Last L23         0        0.000

Hypothesis of Equal Variance Fails with t-value of 3.705
Mean of /X-B/ is              9.550
Mean of /X-L/ is              5.687
Calculated F-value is         2.658
Level of significance is       .109

Columns designated "Article" identify the position of definite or indefinite articles used in either initial or terminal positions. Denominators P, X, B1 to B9 and L1 to L23 identify works tested as listed in Appendix A. In frequency columns appear the number of times articles were used in their corresponding works as either beginnings or endings. As might well be expected from basic rules of French syntax, the frequency column for articles used as sentence endings lists only zeros. However, such was not the case for articles used as beginnings. In order to correct for distortions of statistical results due to differences in lengths of the works analyzed, percentage of use values were calculated and are tabulated in the percent column.
Statistical results for this variable begin with rejection of the hypothesis of equal internal variance in articles by Larroque. Had there been internal consistency of use of definite and indefinite articles at the acceptable .05 confidence level, the t-value would have had to be less than 1.96.

Analysis of variance results showed divergency in use of definite and indefinite articles in initial and terminal positions in the writings of Larroque. However, when the mean absolute values of /X-B/ and /X-L/ for this variable were compared, the contrast was not great enough to register a statistically significant difference. The calculated F-value of 2.658 attests to this conclusion.

Table 4:13 records absolute frequencies, percentage of use, and statistical values relating to use of conjunctions as sentence beginnings and endings in thirty-four test articles.

The position of the test variable, conjunctions, as either initial or terminal, is defined in article columns. Designators P, X, B1 to B9 and L1 to L23 identify works tested as listed in Appendix A. In frequency columns appear the number of times conjunctions were used in their corresponding works as either beginnings or endings. Figures in the percent column represent percentage of use of conjunctions as first and last words of sentences in works tested.

TABLE 4:13. CONJUNCTIONS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency   Percent     Article     Frequency   Percent
First P          17      20.238     Last P           2        2.381
First X         613      33.849     Last X          11         .607
First B1        542      29.457     Last B1         23        1.250
First B2         83      28.920     Last B2          2         .697
First B3          2       4.348     Last B3          1        2.174
First B4          9       8.491     Last B4          1         .943
First B5          5       9.091     Last B5          0        0.000
First B6          3       8.333     Last B6          0        0.000
First B7          7       5.344     Last B7          0        0.000
First B8          3      11.111     Last B8          0        0.000
First B9          2       9.524     Last B9          1        4.762
First L1        383      32.568     Last L1          5         .425
First L2         57      37.013     Last L2          1         .649
First L3         10      17.857     Last L3          0        0.000
First L4         23      32.857     Last L4          0        0.000
First L5         12      24.490     Last L5          0        0.000
First L6          7      26.923     Last L6          0        0.000
First L7         30      21.583     Last L7          1         .719
First L8         21      27.632     Last L8          2        2.632
First L9          9      24.324     Last L9          0        0.000
First L10        11      22.917     Last L10         0        0.000
First L11         4       9.302     Last L11         0        0.000
First L12         9      25.714     Last L12         0        0.000
First L13        17      26.984     Last L13         0        0.000
First L14        49      37.405     Last L14         3        2.290
First L15        12      44.444     Last L15         0        0.000
First L16        46      29.677     Last L16         1         .645
First L17        51      31.481     Last L17         0        0.000
First L18        34      30.357     Last L18         0        0.000
First L19        47      33.571     Last L19         0        0.000
First L20        48      27.907     Last L20         2        1.163
First L21        45      26.012     Last L21         0        0.000
First L22        39      46.429     Last L22         0        0.000
First L23        12      34.286     Last L23         0        0.000

Hypothesis of Equal Variance Fails with t-value of 3.080
Mean of /X-B/ is             21.113       1.024
Mean of /X-L/ is              7.280        .624
Calculated F-value is        25.690       1.882
Level of significance          .000        .177
Statistically significant in favor of /X-L/

Statistical results for the conjunction-use variable begin with the statement, "Hypothesis of equal variance fails with t-value of 3.080." As with two previous variables, adjectives and articles, whose statistics report similar results (see Tables 4:10 and 4:12), Larroque's inconsistent use of conjunctions in the initial sentence position caused rejection of the hypothesis of homogeneity of variance.
Even though the internal test for homogeneity revealed that Larroque used conjunctions inconsistently as sentence first words, when the mean absolute value /X-L1/ of all his works tested was compared to the corresponding mean for B, /X-B1/, a strong significant difference appeared. As shown in Table 4:13, the calculated F-value of 25.690 confirms the visually observable difference between mean absolute differences. The 25.690 F-value represents a confidence level of .000085 in favor of the conclusion that Larroque wrote l'Avis aux réfugiés. However, just as several other tentative conclusions were later negated or revised because of a regrouping of test articles, so was this one (see Table 4:29).

Use of conjunctions as sentence endings is not common in French syntax, although both Grevisse and Le Bidois suggest that pourtant, ainsi, donc, en effet, and cependant are permitted as sentence closers.13 None of the writers tested used these conjunctive forms so frequently as to distinguish his writings from the others.

Table 4:14 lists absolute frequencies, percentage of use, and statistical values relating to use of exclamations as sentence beginnings and endings in thirty-four test articles.

TABLE 4:14. EXCLAMATIONS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency   Percent     Article     Frequency   Percent
First P           0       0.000     Last P           0        0.000
First X           5        .276     Last X           0        0.000
First B1         13        .707     Last B1          8         .435
First B2          2        .697     Last B2          0        0.000
First B3          0       0.000     Last B3          0        0.000
First B4          0       0.000     Last B4          0        0.000
First B5          0       0.000     Last B5          0        0.000
First B6          0       0.000     Last B6          0        0.000
First B7          0       0.000     Last B7          0        0.000
First B8          0       0.000     Last B8          0        0.000
First B9          0       0.000     Last B9          0        0.000
First L1         16       1.361     Last L1          4         .340
First L2          0       0.000     Last L2          0        0.000
First L3          0       0.000     Last L3          0        0.000
First L4          0       0.000     Last L4          0        0.000
First L5          0       0.000     Last L5          0        0.000
First L6          0       0.000     Last L6          0        0.000
First L7          1        .719     Last L7          0        0.000
First L8          0       0.000     Last L8          0        0.000
First L9          0       0.000     Last L9          0        0.000
First L10         0       0.000     Last L10         0        0.000
First L11         0       0.000     Last L11         0        0.000
First L12         0       0.000     Last L12         0        0.000
First L13         1       1.587     Last L13         0        0.000
First L14         0       0.000     Last L14         0        0.000
First L15         0       0.000     Last L15         0        0.000
First L16         0       0.000     Last L16         0        0.000
First L17         0       0.000     Last L17         0        0.000
First L18         0       0.000     Last L18         0        0.000
First L19         0       0.000     Last L19         0        0.000
First L20         0       0.000     Last L20         0        0.000
First L21         0       0.000     Last L21         0        0.000
First L22         0       0.000     Last L22         0        0.000
First L23         0       0.000     Last L23         0        0.000

Mean of /X-B/ is               .309        .048
Mean of /X-L/ is               .363        .014
Calculated F-value is          .353        .782
Level of significance is       .563        .612

Columns denominated "Article" identify position of the test variable, exclamations, as either initial or terminal. Designators P, X, B1 to B9 and L1 to L23 identify works tested as named in Appendix A. In frequency columns appear the number of times that exclamations were used in their corresponding works as either beginnings or endings. Because the exclamations group contains both exclamations and interjections, frequency of use varies between first and last words. The STYLBEND program uses periods as sentence delimiters. Therefore, since exclamation marks were keypunched as periods, frequency of occurrence of true interjectory forms, e.g., ah, ha, oh, or [illegible], is reflected in both beginnings and endings; whereas exclamatory forms, e.g., "Voilà que," "que je suis content," or "combien je souffre," not followed by a period are reflected only in first word totals.
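The effect of treating periods as sentence delimiters can be illustrated with a minimal sketch. The function name `first_and_last_words` and the sample text are hypothetical, and the punctuation stripping is an added convenience; this is not the original STYLBEND program.

```python
def first_and_last_words(text):
    """Return (first word, last word) for each period-delimited sentence."""
    pairs = []
    for sentence in text.split("."):
        words = sentence.split()
        if not words:
            continue  # skip empty pieces, e.g. after the final period
        # Strip surrounding punctuation so that "vient," and "vient" match.
        first = words[0].strip(",;:!?\"'()")
        last = words[-1].strip(",;:!?\"'()")
        pairs.append((first, last))
    return pairs


# Hypothetical text in which the exclamation mark after "Ah" has been
# keyed as a period, following the convention described above.
sample = "Ah. Que je suis content. Il vient, pourtant."
pairs = first_and_last_words(sample)
```

In this sample the isolated interjection forms a one-word sentence and is therefore counted both as a beginning and as an ending, whereas the exclamatory "Que" contributes only to the first-word totals.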
In order to correct for distortions of statistical results due to differences in article lengths, percentage of use values were calculated and are tabulated in percent columns.

At the bottom of the table are located statistical results which, for the exclamations variable, revealed no statistically significant differences between treatment means tested. The .5 and .6 levels of significance indicate a better than 50% chance that any observed difference was due either to sampling error or to chance.

Table 4:15 shows absolute frequencies, percentage of use, and statistical values relating to use of gerunds as sentence beginnings and endings in thirty-four test articles.

TABLE 4:15. GERUNDS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency   Percent     Article     Frequency   Percent
First P           1       1.190     Last P           0        0.000
First X           2        .110     Last X           2         .110
First B1          3        .163     Last B1          5         .272
First B2          2        .697     Last B2          0        0.000
First B3          0       0.000     Last B3          0        0.000
First B4          0       0.000     Last B4          0        0.000
First B5          0       0.000     Last B5          0        0.000
First B6          0       0.000     Last B6          0        0.000
First B7          0       0.000     Last B7          0        0.000
First B8          0       0.000     Last B8          0        0.000
First B9          0       0.000     Last B9          0        0.000
First L1          3        .255     Last L1          0        0.000
First L2          0       0.000     Last L2          0        0.000
First L3          0       0.000     Last L3          0        0.000
First L4          0       0.000     Last L4          1        1.429
First L5          0       0.000     Last L5          0        0.000
First L6          0       0.000     Last L6          0        0.000
First L7          0       0.000     Last L7          1         .719
First L8          1       1.316     Last L8          0        0.000
First L9          0       0.000     Last L9          0        0.000
First L10         0       0.000     Last L10         0        0.000
First L11         0       0.000     Last L11         0        0.000
First L12         0       0.000     Last L12         0        0.000
First L13         0       0.000     Last L13         0        0.000
First L14         0       0.000     Last L14         0        0.000
First L15         0       0.000     Last L15         0        0.000
First L16         0       0.000     Last L16         0        0.000
First L17         1        .617     Last L17         0        0.000
First L18         0       0.000     Last L18         1         .893
First L19         0       0.000     Last L19         0        0.000
First L20         0       0.000     Last L20         0        0.000
First L21         0       0.000     Last L21         0        0.000
First L22         0       0.000     Last L22         0        0.000
First L23         0       0.000     Last L23         0        0.000

Mean of /X-B/ is               .156        .116
Mean of /X-L/ is               .176        .213
Calculated F-value is          .052        .968
Level of significance is       .815        .665

Again, article columns identify position of the test variable as either initial or terminal. Alphabetical characters (P, X, B1, B2 or L1) and digits next to them identify works tested as listed in Appendix A. The number of times gerunds are used as first or last words in a sentence by a given author, in a given work, is noted in the frequency columns. Correction for distortions of statistical results caused by differences in article lengths was made by calculating percentage of use figures which are listed in percent columns.

Analysis of variance performed on the mean of transformed variables /X-B/ and /X-L/ yielded no statistically significant differences.

Table 4:16 shows absolute frequencies, percentage of use, and statistical values relating to use of infinitives as sentence beginnings and endings in thirty-four test articles.
As in previous tables presenting STYLBEND results, article columns identify the position of the test variable as either the first or last word in a sentence. Alphabetical characters (P, X, B1, B2, and L1) and figures next to them coincide with the list of authors and works tested in Appendix A. In frequency columns appear the number of times that infinitives were used in listed works as either sentence beginnings or endings. In order to compensate for different article lengths, which otherwise would cause distortions of statistical results, percentage of use values were calculated and are tabulated in percent columns.

Statistical results for the infinitives variable are found at the bottom of the table. As evidenced by calculated F-values of .407 and .199 for infinitives as beginnings and endings respectively, no statistically significant difference exists for this variable.

TABLE 4:16. INFINITIVES AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             0    0.000    Last P             3    3.571
First X             3     .166    Last X            92    5.080
First B 1           2     .109    Last B 1         105    5.707
First B 2           0    0.000    Last B 2          11    3.833
First B 3           0    0.000    Last B 3           2    4.348
First B 4           0    0.000    Last B 4           3    2.830
First B 5           0    0.000    Last B 5           3    5.455
First B 6           0    0.000    Last B 6           0    0.000
First B 7           0    0.000    Last B 7           3    2.290
First B 8           0    0.000    Last B 8           0    0.000
First B 9           0    0.000    Last B 9           0    0.000
First L 1           2     .170    Last L 1          68    5.782
First L 2           0    0.000    Last L 2           3    1.948
First L 3           0    0.000    Last L 3           2    3.571
First L 4           0    0.000    Last L 4           1    1.429
First L 5           0    0.000    Last L 5           2    4.082
First L 6           0    0.000    Last L 6           1    3.846
First L 7           0    0.000    Last L 7           4    2.878
First L 8           0    0.000    Last L 8           4    5.263
First L 9           0    0.000    Last L 9           4   10.811
First L 10          0    0.000    Last L 10          3    6.250
First L 11          0    0.000    Last L 11          3    6.977
First L 12          0    0.000    Last L 12          0    0.000
First L 13          0    0.000    Last L 13          0    0.000
First L 14          0    0.000    Last L 14          7    5.344
First L 15          1    3.704    Last L 15          3   11.111
First L 16          0    0.000    Last L 16         10    6.452
First L 17          0    0.000    Last L 17          9    5.556
First L 18          0    0.000    Last L 18          1     .893
First L 19          0    0.000    Last L 19          8    5.714
First L 20          0    0.000    Last L 20         10    5.814
First L 21          0    0.000    Last L 21         13    7.514
First L 22          0    0.000    Last L 22          5    5.952
First L 23          0    0.000    Last L 23          1    2.857

                              First       Last
Mean of /X-B/ is               .153      2.584
Mean of /X-L/ is               .305      2.252
Calculated F-value is          .407       .199
Level of significance is       .662       .534

Table 4:17 presents absolute frequencies, percentage of use, and statistical test results relating to use of interrogatives as sentence beginnings and endings in thirty-four test articles.

The article columns of Table 4:17 list the position of the test variable as either initial or terminal, and the author/number code of the test articles as identified in Appendix A. The frequency column contains the number of times interrogatives (interrogative adjectives, adverbs, or pronouns) occurred as sentence beginnings and/or endings in each work tested. In order to compensate for different article lengths, which otherwise would cause distortions of statistical results, percentage of use values or relative frequencies were calculated and are shown in percent columns. None of the test articles produced even one example of an interrogative in the terminal position. STYLBEND was designed to identify and isolate words immediately preceding and following periods. The words thus isolated and punched on IBM cards were hand sorted and placed into the appropriate part of speech category. Therefore, should an author have concluded a statement with ... comment?, ... combien?, ... pourquoi?, or ... n'est-ce pas?, or should any interrogative term have been used as an interjectory interrogative, it would have been tabulated as a sentence ending. One-word interjectory interrogatives were to have been counted as both a first and a last word, but none occurred in the 200,000 words of text analyzed.

Analysis of variance statistics which utilized the mean absolute difference between transformed variables (X-B and X-L) generated an F-value of 3.890.
The calculated F-value of 3.890 yields a confidence level of .054 in favor of the hypothesis that Bayle wrote l'Avis aux réfugiés. Even though this level of significance suggests a better than 94% chance that the observed difference between the treatment means of /X-B/ and /X-L/ was not due to a sampling error or to chance, this tentative conclusion in favor of Bayle was later negated because of the regrouping of the test articles, as shown in Table 4:7.

TABLE 4:17. INTERROGATIVES AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             1    1.190    Last P             0    0.000
First X            49    2.706    Last X             0    0.000
First B 1          55    2.989    Last B 1           0    0.000
First B 2           7    2.439    Last B 2           0    0.000
First B 3           2    4.348    Last B 3           0    0.000
First B 4           0    0.000    Last B 4           0    0.000
First B 5           0    0.000    Last B 5           0    0.000
First B 6           0    0.000    Last B 6           0    0.000
First B 7           0    0.000    Last B 7           0    0.000
First B 8           0    0.000    Last B 8           0    0.000
First B 9           0    0.000    Last B 9           0    0.000
First L 1          20    1.701    Last L 1           0    0.000
First L 2           0    0.000    Last L 2           0    0.000
First L 3           0    0.000    Last L 3           0    0.000
First L 4           0    0.000    Last L 4           0    0.000
First L 5           0    0.000    Last L 5           0    0.000
First L 6           0    0.000    Last L 6           0    0.000
First L 7           2    1.439    Last L 7           0    0.000
First L 8           0    0.000    Last L 8           0    0.000
First L 9           0    0.000    Last L 9           0    0.000
First L 10          0    0.000    Last L 10          0    0.000
First L 11          0    0.000    Last L 11          0    0.000
First L 12          0    0.000    Last L 12          0    0.000
First L 13          0    0.000    Last L 13          0    0.000
First L 14          0    0.000    Last L 14          0    0.000
First L 15          0    0.000    Last L 15          0    0.000
First L 16          0    0.000    Last L 16          0    0.000
First L 17          0    0.000    Last L 17          0    0.000
First L 18          0    0.000    Last L 18          0    0.000
First L 19          0    0.000    Last L 19          0    0.000
First L 20          0    0.000    Last L 20          0    0.000
First L 21          0    0.000    Last L 21          0    0.000
First L 22          0    0.000    Last L 22          0    0.000
First L 23          0    0.000    Last L 23          0    0.000

                              First
Mean of /X-B/ is              2.047
Mean of /X-L/ is              2.569
Calculated F-value is         3.890
Level of significance is       .054
Statistically significant in favor of /X-B/
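The step from a calculated F-value to a level of significance, as quoted for the interrogatives variable, can be sketched as follows. This is only an illustration under stated assumptions, not the routine used in the study: the degrees of freedom (1 and 30) are inferred from the nine Bayle and twenty-three Larroque works compared, and the function names `t_density` and `significance_of_f` are hypothetical. With one numerator degree of freedom, F equals the square of a t statistic, so the tail probability is obtained here by numerically integrating the Student t density with Simpson's rule.

```python
import math


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def significance_of_f(f_value, df_within, steps=2000):
    """Approximate tail probability for an F-value with 1 numerator df.

    Assumption: two groups only, so F = t**2 and the two-sided t tail
    applies. `steps` must be even for Simpson's rule.
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_density(0.0, df_within) + t_density(t, df_within)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_within)
    central_area = 2 * total * h / 3  # probability mass between -t and +t
    return 1 - central_area


# Inferred setting: 9 + 23 works, hence 30 within-group degrees of freedom.
print(round(significance_of_f(3.890, 30), 3))
print(round(significance_of_f(4.17, 30), 3))
```

Because the approximation used by the original program is not known, the value this sketch returns for 3.890 need not agree with the tabulated .054 to the third decimal; it does fall just above the .05 threshold, in line with the text's description of the result as suggestive rather than conclusive.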
Table 4:18 records absolute frequencies, percentage of use, and statistical test results relating to use of Latin words as sentence beginnings and endings in thirty-four test works.

Article columns identify the position of the test variable as either initial or terminal, and, by means of the designators P, X, B, and L and numbers adjacent to them, specify the works tested as listed in Appendix A. In frequency columns appear the number of times Latin terms occurred in the corresponding articles as first or last words. Relative frequency or percentage of use figures are listed in the percent column. These figures were used in statistical computations in order to avoid distortions in the analysis of variance results due to differences in article lengths.

At the bottom of the table are located statistical results from ANOVAR used to determine the likelihood that either Bayle or Larroque wrote the disputed article. For the Latin variable, computed F-values of .156 and 3.710 do not reflect a statistically significant difference between Bayle, Larroque and the unknown author at the confidence level required in this study.

TABLE 4:18. LATIN WORDS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             0    0.000    Last P             0    0.000
First X             7     .387    Last X            13     .718
First B 1           4     .217    Last B 1          12     .652
First B 2           0    0.000    Last B 2           2     .697
First B 3           0    0.000    Last B 3           0    0.000
First B 4           3    2.830    Last B 4           6    5.660
First B 5           0    0.000    Last B 5           0    0.000
First B 6           0    0.000    Last B 6           1    2.778
First B 7           1     .763    Last B 7           4    3.053
First B 8           0    0.000    Last B 8           0    0.000
First B 9           0    0.000    Last B 9           0    0.000
First L 1           4     .340    Last L 1           6     .510
First L 2           0    0.000    Last L 2           0    0.000
First L 3           0    0.000    Last L 3           0    0.000
First L 4           0    0.000    Last L 4           0    0.000
First L 5           0    0.000    Last L 5           0    0.000
First L 6           0    0.000    Last L 6           0    0.000
First L 7           0    0.000    Last L 7           0    0.000
First L 8           0    0.000    Last L 8           0    0.000
First L 9           1    2.703    Last L 9           1    2.703
First L 10          0    0.000    Last L 10          0    0.000
First L 11          0    0.000    Last L 11          0    0.000
First L 12          0    0.000    Last L 12          0    0.000
First L 13          0    0.000    Last L 13          0    0.000
First L 14          1     .763    Last L 14          2    1.527
First L 15          0    0.000    Last L 15          0    0.000
First L 16          3    1.935    Last L 16          2    1.290
First L 17          0    0.000    Last L 17          0    0.000
First L 18          0    0.000    Last L 18          0    0.000
First L 19          0    0.000    Last L 19          0    0.000
First L 20          0    0.000    Last L 20          0    0.000
First L 21          0    0.000    Last L 21          1     .578
First L 22          0    0.000    Last L 22          0    0.000
First L 23          0    0.000    Last L 23          0    0.000

                              First       Last
Mean of /X-B/ is               .589      1.366
Mean of /X-L/ is               .505       .723
Calculated F-value is          .156      3.710
Level of significance is       .697       .060

Although these tests did not reveal a statistically significant difference between the frequencies with which authors X, B, and L used Latin words, as noted above, data gathered by the STYLBEND program and reported in Table 4:18 present additional information that might warrant further study. An examination of the frequency columns reveals a similarity in the relative and absolute frequency of use of the test variable by the author of l'Avis aux réfugiés and by Bayle. Author X introduced seven sentences with Latin terms, while terminating thirteen in the same fashion. Likewise, in article B1, Bayle began four and ended twelve sentences with Latin. Using STYLBEND grouped-data listings, Latin quotations were readily located in context. A comparison of their use revealed that of seven Latin terms used by author X as beginnings, six are short quotations which occupy the entire sentence, while one is part of a phrase used to introduce an argument. All four of the Latin beginnings in article B1 introduce short, full-sentence quotations.
Seven additional Latinate constructions in X are parts of final quotations or phrases used to conclude an argument. Likewise, Bayle's introduction of eight additional quotations to substantiate his arguments parallels the pattern established in l'Avis aux réfugiés.

Table 4:19 presents absolute frequencies, percentage of use, and statistical values relating to use of negative forms as sentence beginnings and endings in thirty-four test articles.

As in previous tables presenting STYLBEND results, article columns identify the position of the test variable as either the first or last word in a sentence. Designators P, X, B, and L and digits next to them coincide with the author/work list in Appendix A. In frequency columns appear the number of times that negative forms were used in listed works as either sentence beginnings or endings. In order to compensate for different article lengths, which would otherwise cause distortions of statistical results, percentage of use values were calculated and are tabulated in percent columns.

TABLE 4:19.
NEGATIONS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             0    0.000    Last P             0    0.000
First X            36    1.988    Last X            22    1.215
First B 1          39    2.120    Last B 1          38    2.065
First B 2          11    3.833    Last B 2           0    0.000
First B 3           2    4.348    Last B 3           2    4.348
First B 4           0    0.000    Last B 4           1     .943
First B 5           2    3.636    Last B 5           1    1.818
First B 6           1    2.778    Last B 6           0    0.000
First B 7           1     .763    Last B 7           0    0.000
First B 8           0    0.000    Last B 8           0    0.000
First B 9           0    0.000    Last B 9           1    4.762
First L 1          22    1.871    Last L 1          23    1.956
First L 2           0    0.000    Last L 2           0    0.000
First L 3           1    1.786    Last L 3           1    1.786
First L 4           0    0.000    Last L 4           0    0.000
First L 5           0    0.000    Last L 5           0    0.000
First L 6           0    0.000    Last L 6           0    0.000
First L 7           0    0.000    Last L 7           2    1.439
First L 8           1    1.316    Last L 8           2    2.632
First L 9           0    0.000    Last L 9           0    0.000
First L 10          0    0.000    Last L 10          1    2.083
First L 11          0    0.000    Last L 11          0    0.000
First L 12          0    0.000    Last L 12          1    2.857
First L 13          0    0.000    Last L 13          1    1.587
First L 14          0    0.000    Last L 14          0    0.000
First L 15          1    3.704    Last L 15          3   11.111
First L 16          1     .645    Last L 16          2    1.290
First L 17          2    1.235    Last L 17          0    0.000
First L 18          0    0.000    Last L 18          1     .893
First L 19          0    0.000    Last L 19          2    1.429
First L 20          0    0.000    Last L 20          3    1.744
First L 21          0    0.000    Last L 21          1     .578
First L 22          0    0.000    Last L 22          0    0.000
First L 23          0    0.000    Last L 23          0    0.000

                              First       Last
Mean of /X-B/ is              1.551      1.473
Mean of /X-L/ is              1.678      1.289
Calculated F-value is          .254       .071
Level of significance is       .623       .786

Statistical results for the negations variable are found at the bottom of the table. Using mean absolute differences of /X-B/ and /X-L/, the computer calculated F-values of .254 and .071 for first and last words respectively. For the negations variable, the observed differences, represented by the tabulated F-values, were not statistically significant enough to formulate a hypothesis of attribution to either Bayle or Larroque.
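The computation behind these figures can be sketched in a few lines. The code below is a reconstruction under assumptions, not the ANOVAR program itself: it takes the first-word percentages for negations from Table 4:19, forms the absolute differences /X-B/ and /X-L/, and runs an ordinary one-way analysis of variance on the two groups of differences. The function name `one_way_f` and the variable names are hypothetical.

```python
def one_way_f(groups):
    """One-way analysis of variance F statistic for a list of groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# First-word percentages for negations, transcribed from Table 4:19.
x = 1.988
bayle = [2.120, 3.833, 4.348, 0.0, 3.636, 2.778, 0.763, 0.0, 0.0]
larroque = [1.871, 0.0, 1.786, 0.0, 0.0, 0.0, 0.0, 1.316, 0.0, 0.0, 0.0,
            0.0, 0.0, 0.0, 3.704, 0.645, 1.235, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Transformed variables: absolute difference from the disputed text X.
abs_b = [abs(x - v) for v in bayle]
abs_l = [abs(x - v) for v in larroque]

mean_b = sum(abs_b) / len(abs_b)
mean_l = sum(abs_l) / len(abs_l)
f_value = one_way_f([abs_b, abs_l])
print(round(mean_b, 3), round(mean_l, 3), round(f_value, 3))
```

Run on these data, the sketch gives group means that agree with the tabulated 1.551 and 1.678 and an F-value of about .25, which suggests that the tabulated statistics were obtained in essentially this way. A smaller mean indicates the author whose usage lies closer to that of X; the F-value indicates whether that difference in closeness is larger than chance variation among the works would explain.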
Table 4:20 lists absolute frequencies, percentage of use, and statistical values relating to use of nouns as sentence beginnings and endings in thirty-four test articles.

The first column of Table 4:20 identifies the test variable's position as either initial or terminal and denotes the authors and works tested as listed in Appendix A. The number of times that nouns were used in test articles as sentence beginnings and endings appears in frequency columns. In order to correct for distortions of statistical results due to differences in article lengths, percentage of use or relative frequency values were calculated and are presented in percent columns.

After the first and last word values for article L23, statistical results for all figures in Table 4:20 are summarized. Again, the analysis of variance program calculated the F-statistic. Computed F-values of .112 and 2.857 are not statistically significant and therefore do not suggest attribution to either Bayle or Larroque for the variables represented in Table 4:20.

TABLE 4:20.
NOUNS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             0    0.000    Last P            46   54.762
First X            20    1.104    Last X          1093   60.353
First B 1          10     .543    Last B 1        1050   57.065
First B 2           4    1.394    Last B 2         186   64.808
First B 3           0    0.000    Last B 3          24   52.174
First B 4           5    4.717    Last B 4          66   62.264
First B 5           3    5.455    Last B 5          34   61.818
First B 6           1    2.778    Last B 6          20   55.556
First B 7           2    1.527    Last B 7          83   63.359
First B 8           0    0.000    Last B 8          17   62.963
First B 9           0    0.000    Last B 9          13   61.905
First L 1          28    2.381    Last L 1         648   55.102
First L 2           3    1.948    Last L 2          98   63.636
First L 3           1    1.786    Last L 3          42   75.000
First L 4           0    0.000    Last L 4          44   62.857
First L 5           0    0.000    Last L 5          34   69.388
First L 6           0    0.000    Last L 6          17   65.385
First L 7           9    6.475    Last L 7          80   57.554
First L 8           5    6.579    Last L 8          49   64.474
First L 9           0    0.000    Last L 9          13   35.135
First L 10          1    2.083    Last L 10         24   50.000
First L 11          0    0.000    Last L 11         29   67.442
First L 12          0    0.000    Last L 12         21   60.000
First L 13          2    3.175    Last L 13         29   46.032
First L 14          0    0.000    Last L 14         74   56.489
First L 15          0    0.000    Last L 15         11   40.741
First L 16          0    0.000    Last L 16         96   61.935
First L 17          8    4.938    Last L 17         92   56.790
First L 18          0    0.000    Last L 18         73   65.179
First L 19          2    1.429    Last L 19         77   55.000
First L 20          2    1.163    Last L 20        105   61.047
First L 21         15    8.671    Last L 21        123   71.098
First L 22          0    0.000    Last L 22         52   61.905
First L 23          0    0.000    Last L 23         24   68.571

                              First       Last
Mean of /X-B/ is              1.580      3.473
Mean of /X-L/ is              1.814      7.131
Calculated F-value is          .112      2.857
Level of significance is       .738       .097

Table 4:21 shows absolute frequencies, percentage of use, and statistical values relating to use of numbers as sentence beginnings and endings in thirty-four test articles.

The article column of Table 4:21 identifies the test variable's position as the first or last element of a sentence. Designators P, X, B, and L and numbers adjacent to them coincide with the author/work list in Appendix A.
In frequency columns appear the number of times that numbers occurred in listed works as sentence beginnings or endings. Percentage of use values were calculated to correct for distortions of statistical results due to differences in article lengths and are tabulated in percent columns.

At the bottom of Table 4:21 are found statistical results for all values in the table. Using mean absolute differences of /X-B/ and /X-L/, the computer calculated F-statistics of 1.539 and 1.263 for numbers as sentence first and last words respectively. Falling well below the F of 4.17 required for statistical significance, the 1.539 and 1.263 figures do not suggest with any degree of certainty that either Bayle or Larroque wrote the disputed work.

TABLE 4:21. NUMBERS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             1    1.190    Last P             2    2.381
First X            37    2.043    Last X            20    1.104
First B 1          20    1.087    Last B 1          20    1.087
First B 2           1     .348    Last B 2           2     .697
First B 3           4    8.696    Last B 3           0    0.000
First B 4           0    0.000    Last B 4           1     .943
First B 5           0    0.000    Last B 5           1    1.818
First B 6           0    0.000    Last B 6           2    5.556
First B 7           8    6.107    Last B 7           7    5.344
First B 8           0    0.000    Last B 8           1    3.704
First B 9           0    0.000    Last B 9           1    4.762
First L 1          17    1.446    Last L 1          16    1.361
First L 2           0    0.000    Last L 2           0    0.000
First L 3           0    0.000    Last L 3           1    1.786
First L 4           0    0.000    Last L 4           0    0.000
First L 5           0    0.000    Last L 5           0    0.000
First L 6           0    0.000    Last L 6           0    0.000
First L 7          25   17.986    Last L 7           1     .719
First L 8           0    0.000    Last L 8           2    2.632
First L 9           0    0.000    Last L 9           0    0.000
First L 10          6   12.500    Last L 10          0    0.000
First L 11         18   41.860    Last L 11          1    2.326
First L 12          0    0.000    Last L 12          3    8.571
First L 13         15   23.810    Last L 13          2    3.175
First L 14          0    0.000    Last L 14          0    0.000
First L 15          4   14.815    Last L 15          0    0.000
First L 16          0    0.000    Last L 16          1     .645
First L 17          8    4.938    Last L 17          0    0.000
First L 18          8    7.143    Last L 18          0    0.000
First L 19          0    0.000    Last L 19          1     .714
First L 20         15    8.721    Last L 20          1     .581
First L 21          0    0.000    Last L 21          1     .578
First L 22          1    1.190    Last L 22          0    0.000
First L 23          3    8.571    Last L 23          0    0.000

                              First       Last
Mean of /X-B/ is              2.620      1.927
Mean of /X-L/ is              6.431      1.250
Calculated F-value is         1.539      1.263
Level of significance is       .222       .269

Table 4:22 gives absolute frequencies, percentage of use, and statistical values relating to use of prepositions as sentence beginnings and endings in thirty-four test articles.

As in previous tables presenting STYLBEND results, article columns identify the position of the variable as either initial or terminal. Alphabetical characters, P, X, B, or L, and numbers next to them identify works tested as listed in Appendix A. In frequency columns appear the number of times a preposition is used in its corresponding article as either a sentence beginning or ending. Percentage of use values were calculated in order to correct for distortions of statistical results due to differences in article lengths and are tabulated in percent columns.

TABLE 4:22. PREPOSITIONS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             2    2.381    Last P             0    0.000
First X           135    7.454    Last X             4     .221
First B 1         136    7.391    Last B 1           0    0.000
First B 2          35   12.195    Last B 2           0    0.000
First B 3           5   10.870    Last B 3           0    0.000
First B 4           7    6.604    Last B 4           0    0.000
First B 5           3    5.455    Last B 5           0    0.000
First B 6           0    0.000    Last B 6           0    0.000
First B 7           8    6.107    Last B 7           0    0.000
First B 8           0    0.000    Last B 8           0    0.000
First B 9           4   19.048    Last B 9           0    0.000
First L 1          93    7.908    Last L 1           0    0.000
First L 2          17   11.039    Last L 2           0    0.000
First L 3           2    3.571    Last L 3           0    0.000
First L 4           7   10.000    Last L 4           0    0.000
First L 5           6   12.245    Last L 5           0    0.000
First L 6           4   15.385    Last L 6           0    0.000
First L 7           9    6.475    Last L 7           0    0.000
First L 8           6    7.895    Last L 8           0    0.000
First L 9           0    0.000    Last L 9           0    0.000
First L 10          7   14.583    Last L 10          0    0.000
First L 11          9   20.930    Last L 11          0    0.000
First L 12          3    8.571    Last L 12          0    0.000
First L 13          7   11.111    Last L 13          0    0.000
First L 14          7    5.344    Last L 14          0    0.000
First L 15          1    3.704    Last L 15          0    0.000
First L 16         16   10.323    Last L 16          0    0.000
First L 17          7    4.321    Last L 17          0    0.000
First L 18         14   12.500    Last L 18          0    0.000
First L 19         13    9.286    Last L 19          0    0.000
First L 20         18   10.465    Last L 20          0    0.000
First L 21         13    7.514    Last L 21          0    0.000
First L 22         15   17.857    Last L 22          0    0.000
First L 23          3    8.571    Last L 23          0    0.000

                              First
Mean of /X-B/ is              4.324
Mean of /X-L/ is              3.946
Calculated F-value is          .075
Level of significance is       .781

Because rules of French syntax regarding position of the preposition are even more rigid than those governing conjunctions, it would seem natural to expect zero values in all positions of the last word section of Table 4:22. Nevertheless, four such occurrences are recorded in l'Avis aux réfugiés. The STYLBEND grouped-data listing shows that the words in question are located in lines 1077, 1512, 4548 and 3450 of the basic text as shown below:

X1077 distinguo. Les rois sont-ils dépendans de Dieu seul. (?)14 C'est selon. (:) Si ... Si ...
X1512 encore mieux peu de tems aprés. Ils regardent assez long-tems ...
X4548 universitez et des parlemens, fut cassé bien-tôt aprés. En sorte que ...
X3450 ... ou de n'avoir pas eu la force de crier contre. On ne peut ...

Of the four cases quoted, the first, although a preposition, is used as a coordinating conjunction, having a series of "if" statements following it. If the original punctuation had been retained, this particular occurrence would not have been counted, but because all colons were keypunched as periods, and because periods were recognized by the computer as sentence delimiters, this somewhat irregular usage was isolated and counted. The second and third "prepositions" are in reality used as adverbs and may be classed as such.
However, their usage when coupled with a second adverb of time (peu de temps, bientôt) forces the reader to supply an object, even if this object does consist of an idea or a series of events. The final term classed as a preposition appears to be a true breach of accepted French syntax, for once again the reader is forced to supply an object. Although each of these usages, excepting perhaps the fourth, might be classified as a different part of speech, the fact remains that common ground rules were established before the data were run, and X was the only one of the authors tested to use these forms in the ways described.

At the bottom of Table 4:22 are found the statistical results computed by the analysis of variance program. Once again the results fail to show statistical evidence suggesting either Bayle or Larroque as the probable author of the disputed text. No values are printed for the last-word column because the great number of zero occurrences made further calculation impossible.

Table 4:23 shows absolute frequencies, percentage of use, and statistical values relating to use of pronouns as sentence beginnings and endings in thirty-four test articles.

TABLE 4:23. PRONOUNS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P            44   52.381    Last P             2    2.381
First X           531   29.321    Last X            61    3.368
First B 1         681   37.011    Last B 1          88    4.783
First B 2          82   28.571    Last B 2           9    3.136
First B 3          23   50.000    Last B 3           4    8.696
First B 4          42   39.623    Last B 4           2    1.887
First B 5          21   38.182    Last B 5           1    1.818
First B 6          18   50.000    Last B 6           2    5.556
First B 7          62   47.328    Last B 7           2    1.527
First B 8          10   37.037    Last B 8           0    0.000
First B 9          11   52.381    Last B 9           1    4.762
First L 1         355   30.187    Last L 1          82    6.973
First L 2          51   33.117    Last L 2           7    4.545
First L 3          26   46.429    Last L 3           0    0.000
First L 4          22   31.429    Last L 4           1    1.429
First L 5          16   32.653    Last L 5           2    4.082
First L 6           8   30.769    Last L 6           1    3.846
First L 7          39   28.058    Last L 7           9    6.475
First L 8          14   18.421    Last L 8           3    3.947
First L 9          18   48.649    Last L 9           0    0.000
First L 10         13   27.083    Last L 10          3    6.250
First L 11          9   20.930    Last L 11          0    0.000
First L 12         10   28.571    Last L 12          2    5.714
First L 13         17   26.984    Last L 13          4    6.349
First L 14         67   51.145    Last L 14          7    5.344
First L 15          6   22.222    Last L 15          1    3.704
First L 16         70   45.161    Last L 16          7    4.516
First L 17         58   35.802    Last L 17          7    4.321
First L 18         39   34.821    Last L 18          3    2.679
First L 19         50   35.714    Last L 19          4    2.857
First L 20         54   31.395    Last L 20          7    4.070
First L 21         56   32.370    Last L 21          4    2.312
First L 22         20   23.810    Last L 22          2    2.381
First L 23         11   31.429    Last L 23          3    8.571

                              First       Last
Mean of /X-B/ is             13.082      2.088
Mean of /X-L/ is              6.510      1.890
Calculated F-value is         6.255       .133
Level of significance is       .017       .718
Statistically significant in favor of /X-L/

Article columns of Table 4:23 identify the test variable's position as either a sentence beginning or ending. Designators P, X, B, or L and numbers adjacent to them coincide with the author/work list in Appendix A. Frequencies with which pronouns occurred in listed works as sentence initiators or terminators appear in frequency columns. In order to correct for distortions of statistical results due to differences in article lengths, percentage of use or relative frequency values were calculated and are presented in percent columns.

Following the tabulated values for L23 are found the ANOVAR results. The mean absolute differences /X-B/ and /X-L/ were used to calculate an F-value of 6.255 for the sentence beginning statistics and a .133 figure for sentence endings. The 6.255 value is statistically significant at the .017 level in favor of the conclusion that Larroque wrote the disputed text. On the other hand, the .133 value for pronouns used as sentence endings is not significant enough to suggest a probable author.
Unlike other significant ANOVAR results, the conclusion suggested in Table 4:23 was confirmed, rather than rejected, after the regrouping of articles noted in Table 4:7.

Table 4:24 lists absolute frequencies, percentage of use, and statistical values relating to use of conjugated verbs as sentence beginnings and endings in thirty-four test articles.

Table 4:24 is the final table which lists values for all thirty-four test articles. As in Tables 4:10 through 4:23, article columns identify the test variable's position as either initial or terminal. Letters of the alphabet, P, X, B, or L, and numbers next to them coincide with the author/work list in Appendix A. The number of times that conjugated verbs occurred as sentence beginnings or endings in the listed works appears in frequency columns. The final columns, headed "Percent," list percentage of use or relative frequencies, which were calculated and used in order to correct for distortions of statistical results due to differences in article lengths.

TABLE 4:24. CONJUGATED VERBS AS SENTENCE BEGINNINGS AND ENDINGS

Article     Frequency  Percent    Article    Frequency  Percent
First P             1    1.190    Last P            15   17.857
First X           121    6.681    Last X           148    8.172
First B 1          59    3.207    Last B 1         162    8.804
First B 2          26    9.059    Last B 2          18    6.272
First B 3           1    2.174    Last B 3           1    2.174
First B 4           8    7.547    Last B 4           9    8.491
First B 5           1    1.818    Last B 5           6   10.909
First B 6           1    2.778    Last B 6           2    5.556
First B 7           3    2.290    Last B 7          11    8.397
First B 8           0    0.000    Last B 8           3   11.111
First B 9           0    0.000    Last B 9           1    4.762
First L 1          45    3.827    Last L 1         100    8.503
First L 2           1     .649    Last L 2          12    7.792
First L 3           0    0.000    Last L 3           1    1.786
First L 4           0    0.000    Last L 4           5    7.143
First L 5           0    0.000    Last L 5           1    2.041
First L 6           0    0.000    Last L 6           2    7.692
First L 7           5    3.597    Last L 7          11    7.914
First L 8           2    2.632    Last L 8           6    7.895
First L 9           0    0.000    Last L 9           4   10.811
First L 10          0    0.000    Last L 10          3    6.250
First L 11          0    0.000    Last L 11          5   11.628
First L 12          1    2.857    Last L 12          1    2.857
First L 13          0    0.000    Last L 13         11   17.460
First L 14          0    0.000    Last L 14          8    6.107
First L 15          0    0.000    Last L 15          3   11.111
First L 16          1     .645    Last L 16          4    2.581
First L 17          1     .617    Last L 17         11    6.790
First L 18          0    0.000    Last L 18          8    7.143
First L 19          0    0.000    Last L 19         10    7.143
First L 20          2    1.163    Last L 20          4    2.326
First L 21          0    0.000    Last L 21          6    3.468
First L 22          0    0.000    Last L 22          9   10.714
First L 23          0    0.000    Last L 23          3    8.571

                              First       Last
Mean of /X-B/ is              4.194      2.308
Mean of /X-L/ is              5.986      2.844
Calculated F-value is        10.043       .329
Level of significance is       .003       .576
Statistically significant in favor of /X-B/

At the bottom of the table are presented statistical results as computed by the ANOVAR program. The calculated F-value of 10.043 represents a confidence level of .003 in favor of the conclusion that Bayle wrote l'Avis aux réfugiés. Like other tentative conclusions, the one drawn in Table 4:24 was negated by a regrouping of test articles as shown in Table 4:7.

Table 4:25 summarizes percentage of use of each of the fifteen parts of speech as sentence beginnings in the final five test articles.

The part of speech column lists the fifteen part of speech categories used as sentence initiators. The designators P, X, B1, B2 and L1 symbolize the preface, l'Avis aux réfugiés, Bayle 1, Bayle 2, and Larroque 1 as listed in Appendix A. Figures in columns P, X, B1, B2 and L1 correspond to the percentage of sentences in a given work which begin with one of the fifteen parts of speech listed. For example, the author of the preface began 4.762 percent of his sentences with an adjective, none with infinitives, and 52.381 percent with pronouns. The fact that not all column totals equal one hundred percent is due to a computer rounding error and is of no significance.

TABLE 4:25.
PERCENTAGE OF USE SUMMARY FOR SENTENCE BEGINNINGS

Part of Speech          P        X       B1       B2       L1
adjectives          4.762    4.528    5.435    3.136    5.102
adverbs             1.190    2.430    1.413    3.136    2.211
articles           14.286    6.957    8.152    5.575    8.673
conjunctions       20.238   33.849   29.457   28.920   32.568
exclamations        0.000     .276     .707     .697    1.361
gerunds             1.190     .110     .163     .697     .255
infinitives         0.000     .166     .109    0.000     .170
interrogatives      1.190    2.706    2.989    2.439    1.701
Latin               0.000     .387     .217    0.000     .340
negations           0.000    1.988    2.120    3.833    1.871
nouns               0.000    1.104     .543    1.394    2.381
numbers             1.190    2.043    1.087     .348    1.446
prepositions        2.381    7.454    7.391   12.195    7.908
pronouns           52.381   29.321   37.011   28.571   30.187
verbs               1.190    6.681    3.201    9.059    3.827
Totals             99.998  100.000   99.995   99.960  100.001

Table 4:26 summarizes percentage of use of each of the fifteen parts of speech as sentence endings in the final five test articles.

The part of speech column lists the fifteen part of speech categories used as sentence terminators. The designators P, X, B1, B2 and L1 symbolize the preface, l'Avis aux réfugiés, Bayle 1, Bayle 2, and Larroque 1 as listed in Appendix A. In columns P, X, B1, B2, and L1 are listed the percentage of sentences in a given work which end with adjectives, adverbs, conjunctions, and other parts of speech as listed.

TABLE 4:26. PERCENTAGE OF USE SUMMARY FOR SENTENCE ENDINGS

Part of Speech          P        X       B1       B2       L1
adjectives         11.905   15.406   13.804   15.679   15.561
adverbs             4.762    3.644    4.076    4.181    3.486
articles            0.000    0.000    0.000    0.000    0.000
conjunctions        2.381     .607    1.250     .697     .425
exclamations        0.000    0.000     .435    0.000     .340
gerunds             0.000     .110     .272    0.000    0.000
infinitives         3.571    5.080    5.707    3.833    5.782
interrogatives      0.000    0.000    0.000    0.000    0.000
Latin               0.000     .718     .652     .697     .510
negations           0.000    1.215    2.065    0.000    1.956
nouns              54.762   60.353   57.065   64.808   55.102
numbers             2.381    1.104    1.087     .697    1.361
prepositions        0.000     .221    0.000    0.000    0.000
pronouns            2.381    3.368    4.783    3.136    6.973
verbs              17.857    8.172    8.804    6.272    8.503
Totals            100.000   99.998  100.000  100.000   99.999
The author of l'Avis aux réfugiés, for example, terminated 15.406 percent of his sentences with adjectives, .607 percent with conjunctions, none with exclamations, 60.353 percent with nouns, 3.368 percent with pronouns, and 8.172 percent with finite verbs. Bayle (column B1), on the other hand, recorded percentages of 13.804, 1.250, .435, 57.065, 4.783, and 8.804 for the same variables. The significance of these differences is discussed with the statistical t-test results reported after Table 4:28. The fact that not all column totals equal one hundred percent is again due to computer rounding error and is insignificant.

Statistical Summary of Sentence Beginning Variables

The information in Tables 4:25 and 4:26 served as input data for the final analysis of STYLBEND variables. Analysis of variance performed on the transformed variables /X-B/ and /X-L/ indicated which of the two authors (B or L) was closer to X in his use of a particular stylistic element. Since the format of the data after the articles were regrouped did not permit further use of the analysis of variance program, and since the t-test had proven effective in observing differences between expository and literary criticisms, it was adopted as the statistical measure for the remainder of the study. Tables 4:10 through 4:24 provided the data shown in Tables 4:25 and 4:26. The values in these two tables were essential to performing t-tests on the STYLBEND variables across the final five test articles.

A summary of t-values and confidence levels relating to use of STYLBEND variables as sentence beginnings comprises Table 4:27. The first column of Table 4:27 lists the fifteen part of speech categories used in STYLBEND analyses. Column headings X:P, X:B1, X:B2, X:L1, and B1:L1 symbolize author/article combinations for which the hypothesis of no difference was tested. Figures in each of these columns are the calculated t-values.
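A test of the hypothesis of no difference between two works' use of one variable can be sketched as follows. This section does not state the formula behind the t-values, so the pooled two-proportion statistic below is a stand-in chosen for illustration, not the study's own computation; the function names and the counts in the usage lines are hypothetical. Only the 1.96 cutoff comes from the text.

```python
import math


def two_proportion_t(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion statistic (an assumed stand-in for the t-test)."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        return 0.0  # both works use the variable always or never
    return (p_a - p_b) / std_err


def is_significant(t_value, critical=1.96):
    """Apply the 1.96 cutoff to the absolute t-value."""
    return abs(t_value) > critical


# Hypothetical counts: sentences beginning with a given part of speech.
t_far = two_proportion_t(50, 100, 20, 100)    # 50% against 20%
t_near = two_proportion_t(30, 100, 25, 100)   # 30% against 25%
print(is_significant(t_far), is_significant(t_near))
```

With these made-up counts the first pair differs significantly and the second does not, which is the distinction the 1.96 threshold is meant to draw.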
A t-value greater than 1.96 means that a significant difference does exist between the two test authors' quantitative use of the variable in question. On the other hand, a non-significant t-value (smaller than 1.96) prescribes that the [...]

[TABLE 4:27 and the table following it (t-values and confidence levels by part of speech for the comparisons X:P, X:B1, X:B2, X:L1, and B1:L1) were printed sideways and are illegible in this copy.]

22 JUN 70

TITLE OF ARTICLE

L'AVIS AUX REFUGIEZ
CABALE CHIMERIQUE
REPONSE D'UN NOUVEAU CONVERTI
REFLEXIONS SUR LES DIFFERENS DE LA
TRAITE SUR LE SERPENT QUI TENTA EVE
TRAITE DE LA BENEDICTION NUPTIALE
HISTOIRE DES GUERRES DE LA MAISON
SUPPLEMENT A* DIVERS ENDROITS DE CES
LA MORALE DU MONDE
TRAITE DES EVEQUES, ET DE LEURS
LE PROSELYTE ABUSE
ECONOMIE DIVINE
HISTOIRES DES REVOLUTIONS
EXTRAIT D'UNE LETTRE DE M. LARROQUE
LA PAIX DES BONNES AMES
APOLOGIE POUR L'EGLISE ANGLICANE
NOUVELLES ACCUSATIONS CONTRE M.
TRAIT DE L'AIMAN
REFLEXIONS NOUVELLES SUR LES CAUSES
RESPONSE DE M. LARROQUE A* LA
DISCOURS SUR UNE MEDAILLE DE FURIA
HISTOIRE DE LA MORT DES PERSECUTEURS
DEUX TRAITTEZ D'USSERIUS
HISTOIRE GENERAL DES CONCILES
C'EST A* DIRE, REFUTATION DE CE QUE
HISTOIRES DE PHILIPPE DE VALOIS
DEMONSTRATION DE LA VERITE ET DE LA
HISTOIRE DU DIVORCE DE HENRY VI
DE LA VERITABLE RELIGION
HISTOIRE DE LA MONARCHIE FRANCAISE
ESSAIS DE PHYSIQUE
LES OEUVRES POSTHUMES DE M. CLAUDE
HISTOIRE D'UNE DAME CHRETIENNE
LETTRE DE MONSIEUR PELISSON
MEDITATIONS METAPHYSIQUES

APPENDIX B

INPUT DATA - TEXT SAMPLE

$ABSTRACT PG 583 X
X 284 JE NE VOUS LE DIS PAS, MONSIEUR, POUR VOUS INSULTER. A* DIEU NE PLAISE.
X 285 VOUS SCAVEZ MES SENTIMENS. VOUS NE IGNOREZ PAS QUE NE AIANT AUCUNE PART AUX
X 286 AFFAIRES PUBLIQUES. JE AI VU AVEC UNE EXTREME REGRET CETTE SUITE DE
X 287 EVENEMENS. ET CETTE FATALE NECESSITE, PAR LAQUELLE LA FRANCE SE EST PRIVEE
X 288 DE TANT DE HONNETES GENS. ET DE PERSONNES DE MERITE. QUI ONT ETE CHERCHER
X 289 UN ASYLE DANS LES PAYS ETRANGERS. DE SORTE QUE SI JE VOIS AVEC PLAISIR QUE
X 290 L' ANNEE 1689, NE A POINT REPONDU A* VOS PREDICTIONS. CE NE EST NULLEMENT
X 291 A* CAUSE DU PREJUDICE QUE VOUS EN RECEVEZ. MAIS A* CAUSE QUE ON DOIT ETRE
X 292 BIEN-AISE, EN FAVEUR DE LA RAISON ET DU BON SENS. QUE LA SUPERSTITION DES
X 293 NOMBRES, ET LA CREDULITE POPULAIRE.
SOIT DEMENTIE PAR DES EXPERIENCES
X 294 PALPABLES QUI PUISSENT AUTANT L' AFFOIBLIR, QUE ELLE SE SEROIT FORTIFIEE
X 295 PAR LES EVENEMENS A* QUOI VOUS VOUS ETIEZ ATTENDUS. ET POUR VOUS MONTRER
X 296 QUE CE EST LA* LE VERITABLE SUJET DE MA JOIE. VOICI DE*S LE PREMIER JOUR DE
X 297 L' AN 1690, UNE LETTRE OU* JE VOUS FELICITE DE TOUT MON COEUR DES
X 298 FAVORABLES DISPOSITIONS. QUE ON DIT ETRE DANS L' ESPRIT DU ROI POUR LE
X 299 RETABLISSEMENT DE VOTRE PARTI. JE NE VOUS ASSURE PAS QUE TOUT LE MONDE SE
X 300 EN REJOUISSE. IL SE TROUVERA TOUJOURS DES IGNORANS ET DE FAUX SCAVANS. QUI
X 301 CONDAMNERONT LA TOLERANCE DE VOTRE SECTE DANS LE ROYAUME DU ROI
X 302 TRES-CHRETIEN, ET DU FILS AINE DE L' EGLISE. MAIS JE VOUS REPONDS QUE EN
X 303 GENERAL TOUT CE QUE IL Y A DE PLUS RAISONNABLE DANS LES TROIS ORDRES DU
X 304 ROYAUME, APPROUVERONT QUE ON VOUS LAISSE UNE HONNETE LIBERTE, PUISQUE IL
X 305 NE A PAS SEMBLE BON AU SAINT ESPRIT DE SECONDER LES INTENTIONS QUE ON A
X 306 EUES DE VOUS REUNIR A* L' EGLISE CATHOLIQUE. VOUS NE SCAURIEZ CROIRE LE
X 307 PLAISIR QUE JE RESSENS PAR AVANCE. EN ME IMAGINANT QUE VOUS NE SEREZ PAS
X 308 DES DERNIERS A* REVINIR. JE NE PARLE PRESQUE DE AUTRE CHOSE AVEC MES AMIS.
X 309 ET JE NE VOIS GUERRE DE GENS QUI NE AYENT PERDU. PAR LA SUPRESSION DE L'
X 310 EDIT DE NANTES. QUELQUE PERSONNE QUE ILS AIMOIENT. ET QUE ILS ESTIMOIENT
X 311 AVEC BEAUCOUP DE JOIE DES NOUVELLES FAVORABLES QUE ON DEBITE SUR VOTRE
X 312 SUJET. AINSI. MONSIEUR, PREPAREZ-VOUS. TOUS TANT QUE VOUS ETES, A* RECEVOIR
X 313 A* VOTRE RETOUR EN FRANCE, MILLE CARESSES ET MILLE EMBRASSEMENS DE CEUX
X 314 MEMES QUI SONT ATTACHEZ AVEC UN ZELE INVIOLABLE A* LA COMMUNION DE L'
X 315 EGLISE CATHOLIQUE.
$ABSTRACT . 1 PG 583 X
X 316 MAIS PERMETTEZ-MOI DE VOUS AVERTIR DE UNE CHOSE. VOUS, MONSIEUR. ET TOUS
X 317 VOS CONFRERES REFUGIEZ EN DIVERS PAYS ETRANGERS. CE EST DE FAIRE UNE ESPECE
X 318 DE QUARANTAINE AVANT QUE DE METTRE LE PIED EN FRANCE, AFIN DE VOUS PURIFIER
X 319 DU MAUVAIS AIR QUE VOUS AVEZ HUME DANS LES LIEUX DE VOTRE EXIL.
ET QUI VOUS
X 320 A INFECTEZ DE DEUX MALADIES TRES DANGEREUSES. ET TOUT-A*-FAIT ODIEUSES. L'
X 321 UNE EST L' ESPRIT DE SATYRE. L' AUTRE UN CERTAIN ESPRIT REPUBLICAIN QUI
X 322 NE VA PAS A* MOINS QUE A* INTRODUIRE L' ANARCHIE DANS LE MONDE. LE PLUS
X 323 GRAND FLEAU DE LA SOCIETE CIVILE. VOILA DEUX POINTS SUR LESQUELS JE PRENS
X 324 LA LIBERTE DE VOUS PARLER EN AMI. COMMENCONS PAR VOTRE ESPRIT DE SATYRE.
$ABSTRACT PG 583 X
X 325 LA FACILITE QUE VOUS AVEZ TROUVEE DANS LES PAYS ETRANGERS DE FAIRE
X 326 IMPRIMER IMPUNEMENT TOUT CE QUE IL VOUS A PLU, A PRODUIT PARMI VOUS UNE SI
X 327 GRANDE QUANTITE DE AUTEURS, QUE IL NE Y A PAS DE APPARENCE QUE AUCUNE SECTE
X 328 VOUS DISPUTE JAMAIS LE PREMIER RANG DE FECONDITE EN CE GENRE LA*. CES
X 329 AUTEURS SONT FORT DIFFERENS LES UNS DES AUTRES EN CAPICITE. MAIS ILS SE

APPENDIX C

ANNOTATED PROGRAM LIST

General Use

COUNTS
(Count Synonyms)
Written in SPS for the IBM 1620
This program reads in the synonym list and counts the total number of synonyms (see Chapter III for information about the synonym list in the ENROOT description).

EDIT 1
(Edit Program 1)
Written in FORTRAN for the CDC 6500
Because punctuation is vital to the sentence level tests, this program was used to check for any irregular punctuation such as " . . . " or " , . " so that these could be visually checked.

EDIT 3
(Edit Program 3)
Written in SPS for the IBM 1620
This program lists the text with each sentence separated by a blank line so that unusually long or short sentences can be checked for keypunching errors in punctuation.

EDIT 4
(Edit Program 4)
Written in SPS for the IBM 1620
This program prints the sequence numbers of all sentences of exactly 1 or 2 words so that they may be checked for errors.

STD
(Standard Deviation)
Written in FORTRAN for the CDC 6500
This program computes the mean and standard deviation of a set of numbers.

SENWOL
(Sentence and Word Lengths)
Written in SPS for the IBM 1620
This program generated all the basic data for the sentence level tests, as described in Chapter III.

(Sequence Text)
Written in SPS for the IBM 1620
This program was used to insert sequence numbers and "ABSTRACT" cards into the newly keypunched text (used on the Pelisson and Descartes material).

FREQFUN
(Frequency of Function Words)
Written in SPS for the IBM 1620
This program generates the basic data for the function word analysis described in Chapter III.

SEQTC
(Sequence Check Text)
Written in SPS for the IBM 1620
This program sequence checks the line numbers on the text.

STYLBEND
(Stylistic Beginnings and Endings)
Written in SPS for the IBM 1620
This program finds and punches the first and last word of each sentence. See Chapter III for a complete description.

SENDIS
(Sentence Distribution)
Written in FORTRAN for the CDC 6500
This program tests internal variance by grouping sentence level data into 100-sentence groups within each article.

SENDAT
(Sentence Level Data Analysis)
Written in FORTRAN for the CDC 6500
This program prints out a distribution chart based on the number of words per sentence for articles X, B1, and L1.

FUNW
(Function Word Analysis)
Written in FORTRAN for the CDC 6500
This program performs an analysis of function words similar to that of Mosteller and Wallace: elimination by repeated t-tests to obtain valid discriminators.

EXSOR
(Expression Sorting Program)
Written in SPS for the IBM 1620
This is the SPS equivalent of the CDC 3600 STYLEFRAZ program, which computes absolute frequencies of expressions. See Chapter III for a complete description.

General Use - French

SYLAN
(Syllabification Analysis)
Written in SPS for the IBM 1620
This program generates the syllabification data. See Chapter III for a complete description.
DATA 1
(Data Conversion Program 1)
Written in FORTRAN for the CDC 6500
This program compacts the data from FREQFUN for any given word: it puts the frequencies from all 33 articles on two cards. It performs a similar procedure on the EXSOR data.

DATA 2
(Data Conversion Program 2)
Written in FORTRAN for the CDC 6500
This program puts total values from FREQFUN and EXSOR groups into a common format so that they may be analyzed by one program.

588
(STYLBEND Summary)
Written in BAL for the IBM 360/20
This program reads in the raw data from STYLBEND and computes totals for each part of speech. These are printed and punched for further analysis.

FUNNY
(Function Word Analysis)
Written in FORTRAN for the CDC 6500
This program does the t-tests and other final statistical analyses on the function word data.

TOTFUN
(Compute Total Function Words)
Written in FORTRAN for the CDC 6500
This program reads in the function word data and computes the total number of function words in each article.

MISC
(Miscellaneous Tests)
Written in FORTRAN for the CDC 6500
This program performs the t-tests and other final statistical analyses on all the miscellaneous tests derived from the ENROOT data.

SANAL
(SENWOL Analysis)
Written in FORTRAN for the CDC 6500
This program prints out all reports and does all t-tests and other final statistical analyses for the SENWOL data.

STAN
(STYLBEND Analysis)
Written in FORTRAN for the CDC 6500
This program prints out all reports and does all t-tests and other final statistical analyses for the STYLBEND data.

PIP
(Printed Index Program)
Written in FORTRAN for the CDC 3600 and 6500
This program produces basic ENROOT data and is described in detail in Chapter III.

FAP
(File Analysis Program)
Written in FORTRAN for the CDC 3600 and 6500
This program was used in conjunction with the PIP program. See Chapter III for a complete description.

Specific to this Study

CANAL
(Convert Data for Analysis)
Written in SPS for the IBM 1620
The purpose of this program was to reorganize data for easier processing. See Chapter III for a complete description.

STAND
(Standardization)
Written in FORTRAN for the CDC 6500
This program was used in association with FAP; it converted FAP data from standardized form to non-standardized form.

ANOVAR
(Analysis of Variance)
Written in FORTRAN for the CDC 6500
This program, discussed at length in Chapter III, performs the analysis of variance on various sets of data.

EDIT 2
(Edit Program 2)
Written in SPS for the IBM 1620
The text was keypunched from manuscripts which contained a period after most numbers. This program finds all occurrences of a number followed by a period so that these periods will not affect the sentence level tests.

HTOR
(Convert H to R)
Written in BAL for the IBM 360/20
This program was used to modify the ENROOT source deck when converting from the CDC 3600 to the CDC 6500; it changed certain word lists from the 3600 form to the corresponding 6500 form.

CAT
(Form Categories)
Written in FORTRAN for the CDC 6500
This program was used to group the words from FREQFUN into the categories described in Chapter IV. It performed a similar function on the expressions from EXSOR.

APPENDIX D

SENWOL FLOWCHART

[Flowchart; legible labels: START; Count letters and add to totals; End of sentence? (yes/no); Print sentence results; Compute averages; Print averages, totals; STOP]

APPENDIX E

SAMPLE SENWOL OUTPUT

SEQ. NO.  TOT. WRDS  TOT LET  LET. FOR WORDS
xxxxx     XXX        XXX      XXXXXXXXX ...
X3916     009        036      331372656
X3917     008        028      33234319
X3918     030        120      324152565238214632637534225379
X3920     011        042      12338346246
X3921     034        145      222564329723643237248256124624135D
X3923     018        096      3521552C9732C64349
X3925     004        022      1056
X3925     034        161      232574435926362182936326B324483927
X3928     016        088      0372942824657125
X3929     008        043      10438764
X3930     012        060      37236A827137
X3931     004        018      8244
X3931     014        053      29447423332424
X3933     055        212      2264442732114222843824472657356222634459151634535238214
X3937     027        129      244637574683039A24243333445
X3940     021        098      366255458242439382160
X3941     011        043      24317248417
X3942     024        110      24364252256A624833762548
X3944     044        195      2422335414370861626228230225A1465252242734C9
X3947     024        108      582322364412543033992206
X3950     050        227      222355352162A483446325244263C23A437271828372370366
X3953     023        106      82828342252370861726343
X3956     023        106      23802362626282526223268
X3957     029        137      3234435613A2513684227483BA226
X3960     036        149      253544269234422632101366426367348236
X3963     034        134      1666255453245462818622613213454417
X3965     048        207      2443542372425325136232202342953834C824231861524A
X3969     015        078      1334763A3582660
X3970     029        152      133A262372768445460439342820C
X3973     008        032      23223307
X3973     020        079      42243633252468432736
X3975     029        126      23564144324308741632085452424

APPENDIX F

TABLE 4:6A. AVERAGE NUMBER OF LETTERS AND SYLLABLES PER WORD AND LETTERS PER SYLLABLE, LESS ONE AND TWO LETTER WORDS

Article        Average Let/Wrd   Average Syl/Wrd   Average Let/Syl
Preface             0.000*            0.000*            0.000*
Unknown             5.588             1.503             3.718
Bayle 01            5.491             1.521             3.611
Bayle 02            5.542             1.510             3.669
Bayle 03            5.429             1.445             3.758
Bayle 04            5.611             1.586             3.538
Bayle 05            5.677             1.580             3.594
Bayle 06            5.685             1.651             3.443
Bayle 07            5.421             1.513             3.582
Bayle 08            5.425             1.507             3.600
Bayle 09            5.728             1.636             3.502
All of B            5.509             1.525             3.614
Larroque 01         5.475             1.485             3.686
Larroque 02         5.551             1.518             3.657
Larroque 03         5.471             1.495             3.659
Larroque 04         5.504             1.464             3.760
Larroque 05         5.566             1.518             3.666
Larroque 06         5.325             1.474             3.613
Larroque 07         5.737             1.648             3.482
Larroque 08         5.537             1.598             3.466
Larroque 09         5.698             1.688             3.376
Larroque 10         5.688             1.567             3.630
Larroque 11         5.760             1.692             3.405
Larroque 12         5.595             1.625             3.443
Larroque 13         5.252             1.435             3.659
Larroque 14         5.466             1.429             3.826
Larroque 15         5.530             1.572             3.518
Larroque 16         5.400             1.428             3.782
Larroque 17         5.644             1.606             3.514
Larroque 18         5.777             1.696             3.406
Larroque 19         5.644             1.618             3.488
Larroque 20         5.463             1.517             3.602
Larroque 21         5.562             1.589             3.500
Larroque 22         5.162             1.332             3.874
Larroque 23         5.725             1.519             3.768
All of L            5.524             1.525             3.624

TABLE 4:6A. (CONTINUED)

                         Average Let/Wrd   Average Syl/Wrd   Average Let/Syl
Mean of /X-B/                 .109              .059              .138
Mean of /X-L/                 .127              .082              .155
Calculated F-value            .273              .965              .207
Level of Significance         .611              .644              .656

*The Preface was not tested separately for these variables; therefore "0" values are present.

APPENDIX G

SYLAN-SUBROUTINE BREAK FLOWCHART

[Flowchart; legible labels: Return 1 syllable; Edit bad word; Return 0 syllable (one in 1500 words); Begin processing with first [...]; Find and mark end of syllable; Move to beginning of next [...]; Count all syllables; Return no. of syllables]

APPENDIX H

[...] FLOWCHART

[Flowchart; legible labels: START; Read card; Save seq. number; Read card; Punch first [...]]

APPENDIX I

[Contents illegible in this copy.]

[Flowchart residue, apparently from the ENROOT root-finding routine; legible labels: Synonym; Terminal S; Modify word; Return root; Search suffixes, longest first; Modify base, depending on suffix; Search prefixes, longest first; Search synonym list without prefix; found / not found; Form infinitive; Return root]
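The legible labels of the root-finding flowchart (synonym lookup, terminal S, suffixes searched longest first) and the `(ROOT)=(VARIANT),(VARIANT)` layout of the synonym list in Appendix X suggest a lookup of roughly the following shape. This is a minimal sketch under stated assumptions, not the ENROOT program itself: the function names, the two-item suffix list, and the order of the steps are hypothetical, and the three sample entries are adapted from legible entries of the list.

```python
import re

# Sample entries in the "(ROOT)=(VARIANT),(VARIANT)" layout of Appendix X.
# Assumption: adapted from legible entries; the real list is far longer.
SYNONYM_LINES = [
    "(A* )=(AU), (AUX)",
    "(AGIR)=(AGISSANT)",
    "(AVERTIR)=(AVERTIT), (AVERTISS)",
]

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("EMENT", "ANT")


def parse_synonym_line(line):
    """Split one list entry into its root and its variant forms."""
    left, _, right = line.partition("=")
    root = left.strip().strip("()").strip()
    variants = [v.strip() for v in re.findall(r"\(([^()]*)\)", right)]
    return root, variants


def build_table(lines):
    """Map every variant (and each root itself) to its root."""
    table = {}
    for line in lines:
        root, variants = parse_synonym_line(line)
        table[root] = root
        for variant in variants:
            table[variant] = root
    return table


def find_root(word, table, suffixes=SUFFIXES):
    """Synonym lookup, then terminal S, then longest matching suffix."""
    if word in table:
        return table[word]
    if word.endswith("S") and word[:-1] in table:
        return table[word[:-1]]
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[: -len(suffix)]
            return table.get(base, base)
    return word  # nothing matched: the word stands as its own root


table = build_table(SYNONYM_LINES)
for word in ("AUX", "AGISSANTS", "AVERTISSANT", "FRANCE"):
    print(word, "->", find_root(word, table))
```

In this sketch a stem such as AVERTISS, left after a suffix is removed, is looked up again in the synonym table, which would explain why the list contains bare stems alongside full word forms; that reading is an inference from the list, not something the text states.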
APPENDIX X APPENDIX X ENROOT SYNONYM LIST (A* )=(AU), (AUX) (ABQNDER )=(ABHNDANT) (ABSOUDRE)=(ABSUUI (ABSURD)=(ABSURDITFSI,(ABSURDUM) (ABUSER I=(ABUSIF) (ACADFMI)=(ACADFMICIFNSI,(ACADEMIQUES) (ACCEPT)=(ACCEPTEZ) (ACCUSFR )=(ACCASATEUR) (ACCIDEN I=IACCID ) (ACTUEL =(ACTUFLLE) (ADRESSFR) = (ADRF) , (ANDRESS) (AFFERMIR)=(AFFFRM) (AEFLIGER)=(AEELICTII 9 (AFFICTION) (AGEN)=(AGENS)9(AGENT) (AGIR)=(AGISSANT) (AIMER)=(AIMENTI (AISEI=(AISEZ)‘ (ALARMER)=(ALLARMFRI (ALLEGUER)=(ALLFGUE)9(ALLEOUER) (ALLEMAN I=IALLFMANDI 9 (ALLEMAGN) (ALLER)=(VAI9(ALL)9(VAISI9IV0NT)9(AILLI,(IR),(AILLE) (ALLIER)=(ALLIFZ) (AMBASADFI=(AMBASSAD) (AMBITIUN)=(AMRITIFUI (AMQUR) = (AMUURFUX)9(AM0URER) (ANARCHI )=(ANARCHIOUF) (ANEANTIR)=(ANFANTISSEMENT) (AN)=(ANFE)9 (ANNFF) (ANDNYM)=(AN0MYMF)9(ANQNIMEI (APARENC)=(APARENCF), (APARENT),(APAREMMENT) (APARENC)=(APPARFMMENT)9(APPARENCE) (APFLLER =(APPFL) 9 (APPELL) 9 (APEL) (APFLLER) = (APPELLER) 9 (APPELFR) , (APELER) 7 (APPELLE) (APFRCFVOIR)=(APFRCQIVF) 9 (APFRCFVRAI (APLAUDIR)=(APLAUDISSFMFNT) (APLIOUFR)=(APLICATIDN), (APPLIQUAI) (APOSTASI)=(APQSTAT) (APPRFNDRE)=(APRFNANT),(APRFNOIS),(APRENS),(APREND) (APPRENDRF)=(APRFND-T-IL)9(APRENDRA),(APRENENT) (APPRFNDRE)=(APRFNNENT19 (APPRISI,(APRENER) (APPUYER )=(APPUI) (AQUFRIR)=(AQUIS),(AQUIFRT),(AOUIT) (ARDENT )=(ARD) 9 (ARDFUR) (ARGUMFN)=(ARGHI (ARMER )=(ARMFF) ,(ARMEMFN) 277 278 (ARTIFIC )=(ARTIFICI) (ASSURER) = (ASURFR) (ATHE)=(ATHEISMF)9(ATHEE)9(ATHFES) (ATTEINDRE)=(ATTEIGNIS)9(ATTEINTE) (ATTENDRE) = 9 (ATTENDR) (AUCUN )=(AUCUNF ) (AVALFR)=(AVAUX) (AVANTAGE)=(ANYTHING) (AVERTIR) = (AVFR) 9 (AVERTIT) 9 (AVERTISS) (AVIDITE )=(AVIDF) (AVOIR) = (A) 9 (AI) 9 (AV) 9 (ONT) 9 (EU) 9 (EUT) 9 (E) 9 (AURAI) (AVOIR)=(AIENT)9(AVEZ-VQU)9(AVONS-NO) (AVUIR) = (AURA) 9 (AURONS) 9 (AUREZ) 9 (AURONT) 9 (A15) (AVOIR) = (AY) 9 (AURQIS) 9 (AUROIT) 9 (AURIONS) 9 (AURIEZ) (AVOIR) = (AVE) 9 (AIT) 9 (AURAS) 9 (AURUIENT) 9 (EUES) 9 (EUE) (AVQIR) = (AUR) 9 (AVEZ) 9 (AVONS) 9 (AYER) 9 (EUTES) 9 (EUS) (AanER) = (AVE) 9(AVEU) 
(AZYLE)=(AZILE) (BABYLON)=(BABYLDNIEN) (BARASSER)=(BARR) (BARBARE)=(BARBARI) (BATTRE)=(BATUS)9(BATRE)9(BATTU) (BEAU )=(BEAL) 9(BEL) 9 (BELLE) (BENIR )= (BENEDICT) (BILE)=(BILIEUSF) (BOULVERSER)=(BOULFVFRSANT) (BRASSER )=(BRASSFMF) 9(BRER) (BRIGAN)=(BRIGANDAGE) (BRITANNI)=(BRITTANNIQUE) (CABAL)=(CABALISTI (CALOMNIE)=(CALOMIFS)9(CALOMNI)9(CALOMNIATFUR) (CALVIN )=(CALVINIST) (CANON )=(CANONICI) (CANTON)=(CANTONNFZ) (CARTESIEN)=(CARTFSIANISM) (CATHOLIC)=(CATHOLIQ) (CEDER )=(CESSInN) (CELA)=(CA) (CELLENT )=(CELLFNC) 9(CELLENS) (CELUI)=(CELL) 9 (CEUX) 9 (CELLE) (CENT )=(CEN I (CEPENDANT)=(CFPANDANT) (CERNER)=(CERNANT) (CERTAIN)=(CERTAINEMFNT)9(CERTES)9(CERTITUDE) (CESSER )=(CESSATION) (CHANT)=(CHANTS)9(CHANCET) (CHARME)=(CHARMANT) (CHAST)=(CHASTFTF) (CHAUD =(CHAUDFMENT) (CHEMINFR)=(CHFMIN) (CHIMFR)=(CHIMFRIOUE) (CHnISIR )=(CHnIX) (CINQ)=(CING) (CITER )=(CITATInN) 279 (CITnIEN )=(CITDYFN) (CLFRGE)=(CLERC) (COLOMNIER)=(COLOMIFZ) (COMBATTRE)=(CUMBATIRFNT)9(CDMBATRE) (COMFNCER) = (COMMENCE) (COMMUNIER)= (CnMUNION) , (COMUNIQUER)=(C0MMUNICATION)9 (COMMUNIOUASSE) (C0MISIDN)=(CUMMISSI)9(C0MMISIOI (COMMUD)=(COMMDOFMFNT) (anCEvnIR)=(anCFU)9(ancnI) (CUNCLUR)=(C0NCLURRF)9(CONCLUT)9(C0NCLUANTES)9(C0NCLURE) (CONDAMNER)=(CONDAMMNFI (CONDUIRF)=(CONDUIS) (CONFERER)=(CONFAIRF) (CONFONDRF) (CONFUSION) (CONDITRE) (anNnIST) 9(C0NNOITRE) 9 (CONNUT) (CONDITRE) (anNIR) 9 (CONNOISS) 9 (CONNnIR) 9 (CONNOSS) (CONDITRE) (comm) 9 (GONNA) 9 (CONNDI 9 (CUNNU) 9 (CONNOIT: (CONSFILLFR)=(C0NSFILFRF)9(CONSEIL) (CONSENTIR)=(CfiNSFNTFMENT) (CONVRTIR) = (anVFR) 9 (CONVERTI) (CONVIER )=(CUNVIFZ) (anVAINCRE)=(anVAINounIT) (CONVENIR)=(CONVIFNT)9(CUNVIENENT) (CONDITIU)=(CONOIRE) 9 (CONDITDN) (CONSTANT)=(CONSTANCE) 9 (CONSTAMMENT) (CONTFNT)=(CDNTFNS) (CORRECT)=(C0RRFCTFUR)9(C0RRECTIF) (C0UPER)=(C0UPOIT)9(CnUPEURS) (COURONER) = (COURONNE) (COURIR) = (c0) (COURIER)=(COURRIFR) (COURONER)=(COURONNANT) (C0UVRIR)=(C0UVFRT)9(COUVRE) (CRAINDRE)=(CRAIN)9(CRAINT)9(CRAIGN)9(CRAIGN ER)9(CRAIGNIS) (CRASSER )=(CRFR) 
(CREER)=(CREEE) (CRFTIEN) = (CHRFTIENT) 9 (CHRISTIAN) (CRIER)=(CRIS) (CRITIQUE)=(CRITQUER) (CRIMF)=(CRIMINFL) (CRDIRF)=(CR)9(CR0Y)9(CRUI9(CR0),(CROYER)9(CRUT)9(CROIR) (CROIRF)=(CROYARLF)9(CROYE)9(CRFANCE)9(CREDULE) (CRnITRE)=(CRnISSANT) (CRUE )=(CRUFMFNT)9(CRUFS) (CRUEL) = (CRUAUT) (CUFILLIR) = (CUFIL) (CUIRE)=(CUISANT) (CURER)=(CURAT) (CURIEUX)=(CURIOSITF) (DAUPHIN)=(DAUPHINF)9(DAUHPIN) (OF)=(DFS)9(DU)9(DFLFS) (nEBnNAIR) = (DFBONNAIR) (DECIDER )=(DECI) 280 (DECOUVRIR)=(DFCDUVRANT)9(DFCOUVRF)9(DEC0UVIR )9(DEC0UVROIT) (DECRIRF)=(UECRIRHIENT)9(DECRIEZ) (DFFAIRF)=(DEFAITF) (DEFENDRE)=(DEFFNS)9(DEFFENDRE)9(DEFFENDE) (DEFENDRE) =(DFFFNSIVF)9(DEFEENS) 9 (DEFENSE) 9 (DEFENSEU) (DEFAUT) = (DEFAUTS) (DELIVRFR)=(DELIVFR) (DELAI)= 9(DFLAIS) (DENT )=(DENTS) (DEREGLEMEN)=(DFRFGLFMENT) (DERNIER) = (DFRNIFRF) (DESHONORER)=(DFSHONNEUR) (DESTINFR)=(DESTINA)9(DESTFNIR) (DESIR)=(DESIRS) (DEVENIR )=(DFVFNOIF)9(DEVENOIS)9(DEVENOIT)9(DEVIENT) (DEVANT)=(DEVANS) (DEVOIR)=(DUIT)9 (DU*)9 (DFVOITI9 (DEVONS), (nnIVENT) (DEVOIR)=(DEVR)9(DUEI9IDEV)9(DDIVE)9(DUT)9(DOIV)9(DOIS) (DIABLE)=(DIABOLIDUF) (DIFFAMER) = (DIFFAMATOIR) (DIFFICIL)=(DIFFICUL) (DIGNITE)=(DIGNF) (DIRE)=(DIRER) (DIRE) = (01) 9 (an) 9 (DIS) 9 (DIR) 9 (DISER) 9 (DISAN) 9(0155) (DISCUTER)=(DISCUS) 9(DISCUSSI) (DISSENTION)=(DISSFNSIDNSI (DISTINCT)=(YFS) (DIVERTIR)=(DIVFRTIS) (DIVERS) = (DIVERSIT) (DIVINIT)=(DIVINITEZ) .(DOCILE )=(nnc1LITF) (DOGME )=(DUGMATIS) 9(DOGMATIOUE) (BONNER)=(DUNNFRF7-VUUS)9(DON) (noux )=(noucs) 9(DnUCEUR) (DUQUFL) = (DFSOUFL) (ECHEOIR) =(ECHFDIT) (FCLAIRFR)=(ECLAIRE) (ECRIRE)=(ECRIR) 9 (EC-IV) 9 (ECRIVANT) 9 (ECRIVIT) 9 (ECRIVOIE) (EDIFIFR) = (EDIFIANT) (FLECTION)=( FLU) 9(FLECTFUR) 9(ELECTIF) 9(ELECT0RA) (FMRARASSER)=(FMBARRASSANT) (EMPLOIFR)=(FMPLOY) (EMPIRF)=(EMPIRANT) (ENFANT )=(ENFAN) (FNDRMF )=(FNORMITF) (FNSUIVRF )=(FNS(|IT) (FNTRFPRFNDRE)=(FNTRFPRIS) (FNTIER ) = (curs ) (ENVDIFR )=(EanIR) (EPDUVANTAIL)=(FPnUVFNTAIL) (FPREHVF)=(EPRFUVFS) (ESCLAVE )=(FSCLAVAG) (FSPAGNQL)=(FSPANGNO) (ESPION)=(ESPIONS) 
281 (FSSAYER)=(ESSAIRFR) (FSSENTIEL)=(FSSFNCE) (ETABLIR )=(FSTABLIR)9 (ETARLISS) (ETFRNELL)=(ETFRNFL ) (ETRE) (ETAIT) 9 (FTES) 9 (EUT-CE) 9 (EUT-IL) 9 (FUSSENT-ILS) (ETRE) (ETOIT) 9 (FTIDNS) 9 (ETIEZ) 9 (ETUIENT) 9 (FUS) 9 (EUT) (ETRE)=(ETOIT-IL)9(FSTREI9(FUST)9(SUMMES-NUUS)9(ES) (FTRE) = (FUMES) 9 (FUTES) 9 (FURENT) 9 (ETANT) 9 (ESTE) 9 (SERAI) (ETRF) = (FUSSIFZ) 9 (FUSSENT) 9 (5018) 9 (SOIT) 9 (SOIENT) (ETRE) = (SERAS) 9 (SERA) 9 (SERONS) 9 (SEREZ) 9 (SERONT) 9 (SEROIS) (FTRF) = (SEROIT) 9 (SERIONS) 9 (SERIEZ) 9 (SEROIENT) 9 (SO) (ETRE)=(SOIT-IL) 9 (SONT-ILS) 9 (SONT-ELLES) 9 (SONT-CE) (FTRE) = (snvnNS) 9 (snYEZ) 9 (FUSSE) 9 (FUSSES) 9 (FUSSInNS) 9 (SER) (ETREI=(SUIS)9(FST)9(SflMMES)9(ETE)9(SONT)9(ETOIS) (FTRANGE)=(ETRANGFS) (ETUDIER)=(ETUDIF)9(FTUDE) (FUX-MEME)=(FUXMFM ) (EVANGIL)=(EVANGLIQUFS)9(FVANGELI) (FVFNFMFN)=(FVFNF) (EVFOUE )=(FVFCH ) (EVIDENT)=(EVIn)9(FVIDENC) (EXCEPTER)=(EXCFPTION) (EXCES)=(EXCESSIVF) (EXEMPTER)=(FXFMPTIO)9(EXEMT)9(EXEMPT) (EXFRCICE)=(EXERCISES) (EXPEDIER)=(FXPFOIF) (FXPLIQUER)=(EXPLICATION) (EXPRES) = (EXPRE) (EXPRIMER)=(EXPRESSInN) (FACILE) = (FACILITE) (FAIRE)=(F)9(FA)9(FAIR)9(FASSER)9(FER)9(FERA)9(FI) (FAIRE)=(FAIS) 9 (FAITES) 9 (FAISIONSI9IFIMES)9(FAISEUR) (FAIRE) = (FIRFR) 9 (FAITF) 9 (FAISER) 9 (PE) 9 (FERER) 9 (FIR) (FALLDIR) = (FALnIR) 9 (FAUT) 9 (FAILL) 9 (FAUDR) 9 (FALLU) 9 (FALL) (FALLOIR) = (FALU) 9 (FAUDRE) 9 (PAL) (FAMILLE)=(FAMILIE) (FANATISME)=(FANATIOUF) (FANTAISI)=(FANTAISES) (FAUTE )=(FAUTFS ) (FAUX I=(FAuv ) 9(FAUSSFI9IFAUSSETE) (FAVORISER)=(FAVORABLE) (FEINDRE)=(FEINS)9(FEINT)9(FEINTS) (FEROCE)=(FEROC)9(FERnCIT) (FEUILLETER)=(FFUILLES) (FFRIR )=(FFFRT ) (FIFR )=(FIFZ) 9(FIA) (FIFRTF)=(FIFR)9(FIERES) (FINIR)=(FINI)9(FINIS)9(FINIE) (FIDELITE)=(FIDFL) 9(FIDELE) (FISQUFR )=(FISC) (PLATTER )=(FLATE) 9(FLATERIE) (FLECHIR )=(FLFCHISSF) (FLFTRIR)=(FLETRI)9(FLFTRISSANTES) (FLEURIR)=(FLEUR)9(FLnRISS)9(FLORISSANT) 282 (FLEXION) = (FLFX) (FONDER)=(FONDAT)9(FDNDEUR)9(FUNDDIEN)9(FUNDANT)9(F0NDOIT) (F0NDATION)=(FONDFMEN)9 
(FONDEMENT) IFORMER)=(FORMIST) (FORMALITE)=(FORMFLS)9(FORMELLES) (FOU )=(FOLF) 9(F0LLE) (FRAINDRE)=(FRAINT) (FRAPPER )=(FRAPFR) (FRAIS )=(FRAICH) 9(FRAICHF) (FRAUD)=(FRAUDF)9(FRAUDULEUSE) (FRUIT)=(FRUITS) (FURIFUX)= (FURFUR) (GAZETTE)=(GAZFTTTFRS)9(GAZETIER) (GIROUET)=(GIROUETTE) (GLanFIER)=(GLnIRF) (GOUTER) = (GOHT) 9 (GnUST) (GOUVERNR)=(GUHVFRNE) (GRELE )=(GRFSLF) (GRIEF)=(GRIEVF)9(GRIEVES) (GROSSIR)=(GROS)9(GROSSE)9(GROSSES) (GRnSIERE)=(GRnSSIFR) (GUERE)=(GUERES) (HABILET)=(HABILF)9(HABILES) (HABITAN)=(HABITANT) (HAIR) =(HAIS) (HAILLON )=(HAILLFR ) (HARDI )=(HARnIF ) (HERITER )=(HERITIFR) (HISTOIRE) = (HISTnR)9(HISTnIR)9(HISTORIQ) (HONNORER)=(HONORABLF) (HUNNET) = (HONNFTFT) (HONTF)=(H0NTEUX)9(HONTEUSEMENT) (HOSTILF )=(HOSTILIT) (HUMAIN)=(HUMANITF)9(HUMAINES)9(HUMAINE) (ICFLUI )=(ICFLL) 9(ICEUX) (IDOLE =(IDnLATRE) 9(IDOLATRI) (IL)=(SIEN)9(SIFNNF)9(SIENS)9(LUI) (IMMEDIAT)=(IMMFOIA ) (IMPOSTURE)=(IMPnSTFURS) (INDEPEND)=(THIS) (INDIFFFR)=(THAT) (INOIGNE )=(1NDIGN ) (INFAILLIB)=(INFAILLIBLF) (INFIDELITE)=(INFIDFLLE) (INFINI)=(INFINIMFNT)9(INFIT) (INFIRM)=(INFIRMITFZ) (INJUR) = (INJURIFUX) 9 (INJURE) (INNOCFN)=(INNOCFMMENT)9 (INNUCENT)9 (IINNOCENTES) (INSCRIRF)=(INSCRIT) (INSULTER) (INSULTANT) 9 (INSnLENT) (INSDLFR) 9 (INSOLFN) 9 (INSOLENC) (INSU)=(INSCU) (INTFNTFR) = (INTENTION) (ITER )=(ITATInN) 9(ITATFUR) 283 (JALDU )=(JALnUSI)9(JALnu51E) (JESU)=(JESUS-CHRIST)9(MESSIE) (JEUNE) = (JFUN) (JUINDRE)=(JOIGNANT)9(JnIGNIS) (JOURNEF =(JOURNFLL) (JUURNAL)=(JOURNAAL) (JUSTIFIE)=(JUSTIC)9 (JUST)9 (JUSTIFIC)9 (JUSTE) (LACHE)=(LACHETF) (LAISSER )=(LAIR )9(LAIsan5) (LANGUE)=(LANGAGE) (LF)=(L8)9(LES)9(LA) (LECON)=(LECONS) (LEGEND)=(LEGENOAIRF) (LEGFR)=(LEGERFMFNT) (LEMENT )=(LEMFNTAI) (LEQUFL) = (LESQUFL)9 (LAQUELLI9 (LFSQUELL) (LEUR)=(LFURS) (LIBELL) = (LIBFLLI) 9 (LIRFLLAT) (LIBERAL)=(LIBFPALFMFNT)9(LIBFRALITE) (LIBERTF)=(LIBFRT)9(LIBERTRF) (LIRRAIRE)=(LIRRAIRIF) (LIBRE) = (LIBFRATF) (LIFU )=(LIFHX) (LIRE)=(LISER)9(LU)9(LIR)9(LISANT)9(LISE)9(LUE)9(LUS) 
(LITERATUR)=(LITFRAIRFS)9(LITTFRATURE) (LIVRFR )=(LIVFRFR) 9(LIVREZ) (LIVRF)=(LIVRFT)9(LIVRFTS) (LUGER)=(LUGF)9(LnGFnISI (LDGI)=(LDGICIFN)9(LOGIQUF) (LOISIR )=(LUISFR) (LnNGTEMP)=(LnNG-TEM) (L0NG)=(LUNGUF)9(LnNGUESI9(LnNcuEUR) (LORSQUE )=(LnRQUF) (LOUER)=(LDUANGF)9(L0UANGES) (MAGISTRA)=(anL) (MAITRE)=(MAITRISF) (MAJFSTE )=(MAJFST )9(MAJFSTAT) (MALHONFT) = (MALHONNF) (MALIC)=(MALICIFUSFMFNT) (MALIGNE) = (MALIGN) 9 (MALICNIT) (MANIER)=(MANIFRF)9 (MANIFRFS) (MANIFEST) = (MANIFES). (MARCHFR)=(MARCHInm) (MARQUFR)=(MAR0HARL)9(MAROUANT) (MARCHAND)=(GIRL) (MARTYR)=(MATYR) (MAUDIRF =(MAHDIR ) (MAUVAIS )=(MAUVAISF) (MECHANT)=(MFCHAN)9(MFCHAMMFNT)9(MECHANTS) (MECUNTFN)=(snsn) (MEDIRF)=(MEDISANTE) (MFDIATF)=(MFDIATFURS) (MFILLFUR) = (MFILLFU) (MEME)=(MFSMF)9(MFMFS) 284 (MEMQIRF)=(MEMOIRFS) (MENFR)=(MENANT)9(MENFMT)9(MENEUR)9(MENOIT) (MEPRISER)=(MEPRIS)9(MEPRI)9(MEPRISE)9(MEPRISANS) (MERVFILLE)=(JUNK) (METTRF)=(MET)9(MFTT)9(MIRFR)9(MIS)9(METTIQN)9(MIR)9(METTE)9(MISE) (METTRE) = (METTANT) (MIFUX)=(MIEH) {MINENT )=(INFNT) 9(INFNC) (MINUT)=(MINUTFF)9(MINUTIFS) (MIRACLF)=(MIRACL)9(MIRACULE) (MISSION)=(MISSIHNAIRE) (MQIEN =(MOIFNNAN) 9(MDYEN) (MOMENT) = (MOMFN) (MONTER )=(MUNTFZ) (M0N)=(MA)9(MFS)9(MDI)9(MDI-MFMF)9(MIEN),(MIENNE)9(MIENS)9(ME) (MONARCHI) (MUNARCH)9(MQNARCHQ)9(MUNARQUF)9(MUNACHAL)9(MONARQU) (MONSIEUR) (SIFUR) 9 (MESSIFUR)9(MDNSFUR) (MONSTRE )=(MONSTRFUX) 9 (MONSTRUFUX) (MONT) = (MUNTS) (M0RAL)=(MORALITF)9(MURALISFR) (MDRTFL)=(MURTFLLF) (MOURIR)=(MEURT)9(MORT)9(MURTS) (MYSTERF )=(MYSTFRIF) (NAIF)=(NAIVEMFNT) (NATURE) = (NATHR) (NCONTRER) = (NCQNTRF),(CQNTRER) (NEAMUIN)=(NEAMMOINS)9 (NEANMOIN) (NECESSAIRE)=(NFCCFSSITE) (NEGOCIER)=(NEGUTIATFUR) (NET )=(NFTTF) (NEUTRF )=(NFUTRALI) (NIFR)=(NIAIS)9(N )9(NIANT) (NOIRCIR) =(NOIR)9(NflIRF) (NUMMER =(NnMMF) ,(NnM5)9(NnM) (NDMBRF =(NnMBRFUX) (NUNCER) = (NDCFR) (NDTUIRF )=(NOTQRIFT) (NOURRIR)=(NDURRISSANT) (NOUS)=(NOS)9(NnTRE)9(NnTRES) (NUUVEAU)=(N0UVAUX)9 NDUVELLFMFNT) (NDUVEAU) = (NDUVFAUT) 9 (NflUVFL) 9 (NOUVELL) 9 (NOUVEAL) 
(NUIRE)=(NUIRA)9(NUIT)9(NUISFNT) (NUL) = (NULLF)9(NULLAM) , (NULS) 9 (NULLIT) (NVFRSFR )=(NVFRSF ) (OBFIR )=(flRFISS) 9(QRFISSAN) 9 (URFISSF) 9 (OBIIE) 9 (QBEI) (”RLIO)=(UBLIQF)9(OBLIOUFS)9(HBLIQUITF2) (HBLIGFR )=(URLIF7) (QHSCURFHIBSCHRCI) (URTFNIR)=(UBTIFNNF) (QCCASIDN)=(DCCQ) (ODIFIIX )=(U()IFIJSF) (QFHVRF)=(UEUVPES) (flFFFNSFR)=(flFFFNCANT) (UFFRIR)=(DFFFRT) 285 (HMFTTRF)=(UMISSInNs),(nMMISSInN),(nRMETTRF)9(nMIS) (0N )=(Snl)9 (snI-MFMF)9 (LmnN) (nPINFR)=(nPINInN)9(nPFNIATR)9(nPINATRE) (HPPOSER)=(0PDSFR)9(HPPnSFZ)9(nPPflSOIFNT)9(UPPOFER)9(OPPOSITI) (OPPRES )=(UPPRFSSF) (nPULEN)=(OPULENT) (URFILLF)=(UREIL) (URIGINAL)=(ORIGINFL) (ORIGINF )=(DRIGINAIR) (UTFR)=(0TANT) (OUTRAG) = (OUTRAGFA) 9 (OUTRAGFU) (OUVRIR)=(UUVERT)9(0UVERTFMENT)9(OUVERTS)9(OUVRANT)9(OUVRE) (DUVRAGF)=(0UVRAGFZ) (PACIFI)=(PACIFICATION)9(PACIFIERDIS)9(PACIFIQUES) (PAPF) = (PAPIST)9(PAP)9(PAPAL)9(PAPISTIO) (PARCOURIR)=(PARCOURU) (PARFAIT)=(PARFAITFMFNT) (PAREIL)=(PAREILLFMENT)9(PAREILL) (PAREN)=(PARFNTF)9(PARENT) (PARENCE )=(PARAT) (PARER)=(PARABLF) 9 (PARAISER) 9 (PARAISUN) (PARDITRE)=(PARUSSFNT) (PARTICUL)=(PARTICULIFRS), (PARTICLUIERS) (PARTIE) = (PARTIFS) (PARTIR)=(PARTANT) ' (PASSION) = (PASSIONN) (PATIBLE )=(PATIRILIT) (PATIENT)=(PATIFN)9 (PATIENC)9 (PATIER)9 (PATI) (PATRI)=(PATRIMHINF) (pAUVRE)=(PAUVRFTF) (PAYS) = (PAIS) (PC0NER)=(PCER)9(PCUNN) (PEINDRE)=(PEINT) (PERCER)=(PERCANT) (PERE)=(PERES) (PERIL) = (PERILLFU) (PERNIC)=(PERNICIFUSFS)9(PFRNICIEUX) (PERSONNE)=(PERSON )9(PERSUNNA) (PETIT)=(PITITFSSF)9(PFTITESSF),(PETIRE)9(PETIR) (PEUTETRE)=(PEUT-FTRF)9(PEUR-ETRE)9(PEUT-ESTRE) (PHILOSOP)=(PHILOSOH) (PIRE)=(PIS) (PLAINDRE) = (PLAIN) 9 (PLAINT) 9 (PLAIGN) 9 (PLAINDR) 9 (PLAIGNAN) (PLAINDRE)=(PLAIGNIR)9(PLAIGNER)9(PLAIGNIS) (PLUR)=(PLURALITF ) 9 (PLURI EL) (PnLIC)=(PnLICF) (P0LI)=(PULIF)9(POLITFSSF) (PUSFR)=(PUSANT) (POSSFDFR) = (POSSFS)9(PDSSFSSF) (POUSSER)=(PUUSSFNT)9(P0) (POUVOIR) = (PFHT) 9 (PEUVER) 9 (POUVER) 9 (PUIR)9(POURRFR)9(POUVANT) (POHVDIR)=(POURR) (POHRVU )=(POHRVUF ) 
(PRATI)=(PRATICABLE),(PRATIQUE)
(PRECIS)=(PRECISE)
(PREDIRE)=(PREDICAT),(PREDICTI)
(PREFERER)=(PREFEREN)
(PREJUGER)=(PREJUDIC)
(PRELAT)=(PRELATUR)
(PREMIER)=(PREMIERE)
(PRENDRE)=(PRIS),(PRISE),(PRENNER),(PRIRENT),(PRIT),(PRISES)
(PRESENC)=(PRESEN),(PRESENT),(PRESENTEMENT)
(PRESSER)=(PRESSANT)
(PRET)=(PRETE),(PRETES),(PRETS)
(PRETER)=(PRETATIF),(PRET),(PRETENT),(PRETEZ)
(PRIER)=(PRI),(PRIA),(PRE)
(PRIMER)=(PRESSION)
(PRINC)=(PRINCES)
(PRISER)=(PRISABLE)
(PRISON)=(PRISONIERS)
(PRIVER)=(PRIVAT)
(PROCHER)=(PROUCHER)
(PRODUIRE)=(PRODUCTION),(PRODUISIT),(PRODUITE)
(PRODUIRE)=(PRODUIS),(PRODUISE),(PRODUIR)
(PROTEGER)=(PROTECTE),(PROTECTI)
(PROBABLE)=(PROBABILITE)
(PROCE)=(PROCEZ),(PROCES)
(PRODIGUE)=(PRODIG),(PRODIGIE)
(PROGRE)=(PROGREZ),(PROGRES)
(PROMT)=(PROMPTEMENT),(PROMTEMENT)
(PROPHET)=(PROPHETE),(PROPHETIQUE),(PROHETI)
(PROPRE)=(PROPR),(PROPREMENT),(PROPRES)
(PROVINC)=(PROVINCIAL)
(PUBLIER)=(PUBLIASS),(PUBLIEZ)
(PUBLIC)=(PUBLIQUE)
(PUISER)=(PUISHIFN)
(PUNI)=(PUNITE),(PUNEMENT),(PUNE)
(PURIFIER)=(PURET),(PURE),(PUR)
(QUALITE)=(QUALITRE)
(QUERIR)=(QUERAN),(QUET),(QUIS),(QUER),(QUISE)
(QUEL)=(QUELL),(QUELE)
(QUITER)=(QUITTER)
(QUOIQUE)=(QUOI-QUE)
(RADOUCIR)=(ADOUCIES),(RADOUCIES)
(RAILL)=(RAILLEURS)
(RAISON)=(RAISER)
(RAISONER)=(RAISONNE),(RAISONNA)
(RAPORTER)=(RAPPORTS)
(RAPPELER)=(RAPELLER),(RAPEL),(RAPPELLE)
(REBATTRE)=(REBATU)
(REBELL)=(REBELLIO)
(RECEVOIR)=(RECEVER),(RECUE),(RECOIVER)
(RECOMENCER)=(RECOMENCAI)
(RECOURIR)=(RECOURRA),(RECOURER)
(RECUEILLIR)=(RECUEILS),(RECUEIL)
(REDEMPT)=(REDEMPTEUR),(REDEMPTION)
(REFUGIE)=(REGUGIEZ),(REFUGIEZ)
(RELIRE)=(RELUE)
(RELIGION)=(RELIG)
(RENDRE)=(RENDIRER)
(RENONCER)=(RENONCI)
(RENVOIER)=(RENVOYER)
(REPANDRE)=(REPAND),(REPANDENT),(REPANDIT)
(REPONDRE)=(REPONDIS),(REPONS)
(REPUT)=(REPUTATION)
(REQUIRIR)=(REQUER),(REQUIS)
(RESSUSCITER)=(RESSUCTITE)
(RESPECTE)=(RESPECT)
(RETABLIR)=(RETABLISSE)
(REUSSIR)=(REUSSI),(REUSSIT)
(REVENIR)=(REVIENT)
(REVENIR)=(REVINIR)
(RHETORIQ)=(RHETHORI)
(RIRE)=(RIROIT),(RIREZ)
(ROI)=(ROIS),(ROY),(ROIAL),(ROIAUM),(ROIAUT)
(ROMPRE)=(RUPTION),(RUPTIBLE)
(ROULLER)=(ROULE)
(SACCAGER)=(SAC)
(SACRE)=(SACREMENS)
(SAINT)=(SAINTETE)
(SALE)=(SALETEZ)
(SANG)=(SANGLAN),(SANGLANT)
(SATISFRE)=(SATISFAC),(SATISFAI)
(SATYRSER)=(FATYRE),(FATYR),(FATYRIQUES),(SATYISE)
(SATYRSER)=(SAYTRES),(SATIRE),(SATYRIQU),(SATYRE)
(SAUVER)=(SALUT),(SALUS),(SAUVEUR),(SALV)
(SAVOIR)=(SAIS),(SAIT),(SCAVOIR),(SCAVANS)
(SAVOIR)=(SAVANT),(SCAV),(SCAVAN),(SACH),(SAVANTE),(SAUR)
(SAVOIR)=(SCAIR),(SCAUR),(SAI),(SAV),(SCAI),(SCU)
(SCANDAL)=(SCANDALE),(SCANDALI)
(SCRUPUL)=(SCRUPULE)
(SECOURIR)=(SECOURUS)
(SECUTER)=(SECTEUR),(SECUTRIC)
(SECUTER)=(SECUT),(SECUTEUR),(SECUTRE),(SECUTION)
(SECOND)=(SECONDEM),(SECONDER),(SECONDE)
(SECRET)=(SECRETAIRE),(SECRETEMENT),(SECRETTE)
(SECT)=(SECTAIRES),(SECTATEURS),(SECTES)
(SEILLER)=(SEIL)
(SEJOURNER)=(SEJOUR)
(SEMBLER)=(SENBLEROIT),(SEMBL),(SSEMBLAT)
(SEMER)=(SEMAT)
(SENTIR)=(SENT)
(SENTER)=(SENTA),(SENTAT),(SENTEREN),(SENTE)
(SEOIR)=(SEANT),(ASSIS)
(SEQUENT)=(SEQUEMMENT)
(SEUL)=(SEULE)
(SEVER)=(SEVERITE)
(SIGN)=(SIGNE),(SIGNES)
(SIMPLE)=(SIMPLEMENT),(SIMPLES),(SIMPLICITE)
(SINGUL)=(SINGULARIEZ),(SINGULIER)
(SOIN)=(SOIGNEUSE),(SOIGNEUX)
(SOLID)=(SOLIDAIREMENT),(SOLIDEMENR),(SOLIDITE)
(SON)=(SA),(SES)
(SONNE)=(SONNELLE),(SONNELL)
(SOUDRE)=(SOL),(SOLU),(SOLUTIO),(SOLUTION)
(SOUFFRIR)=(SOUFFRABLE)
(SOUS)=(SOUS-GOUVERN)
(SPECTER)=(SPECT),(SPECTUEU)
(SPECTEUR)=(SPECTION)
(SPIRITUEL)=(RAG)
(SUADER)=(SUA),(SUADRE)
(SUCCEDER)=(SUCCEF),(SUCCES)
(SUFFIR)=(SUFFISENT),(SUFFIRE)
(SUIVRE)=(SUIT),(SUIVE),(SUIVR),(SUIVI),(SUIVER),(SUITE),(SUIVANT)
(SUPERIOR)=(SUPERIEU)
(SUPLIER)=(SUPLIE)
(SUPPOSER)=(SUPOSITION),(SUPPOSIT),(SUPOSER)
(SUPRIMER)=(SUPPRIMER),(SUPPRES),(SUPRES)
(SURPRENDRE)=(SUPRENANT),(SURPRIS)
(SURE)=(SURETE)
(SUREMENT) (SURLFY)
(SUSCITER)=(SUSCITRE)
(SUSPENDRE)=(SUSPENS),(SUSPENDE)
(TABLIR)=(TABLISSE)
(TAIRE)=(TU)
(TEL)=(TELLE)
(TEMPOR)=(TEMPOREL)
(TENDRE)=(TEND),(TENDR),(TENS),(TENDU),(TENTION),(TENSION),(TENR)
(TENIR)=(TIENS),(TIENT),(TEN),(TIENN),(TIENNER),(TIENNE)
(TENTER)=(TENTE)
(TESTANT)=(TESTAN)
(TESTABLE)=(BOY)
(TICULIER)=(TICULIE),(TICULIAR)
(TINGUER)=(TINGUOIT),(TINCTION),(TINGUANT),(TINGUO),(TINGUEZ)
(TIRER)=(TIRE)
(TORRENT)=(TORENT)
(TOTAL)=(TOTALEMENT)
(TOUT)=(TOUTE),(TOUS),(TOUTES)
(TRACTER)=(TRACT),(TR)
(TRADUIRE)=(TRADUCTEUR),(TRADUCTION)
(TRAHIR)=(TRAHI),(TRAITRE)
(TRAITER)=(TRAITEME)
(TRAINDRE)=(TRAINT),(TRAIGNIR)
(TRANCHER)=(TRANCHAS)
(TRAVAILL)=(TRAVAIL),(TRAVAL)
(TRAVAGAN)=(GOODIES)
(TRES-VRAI)=(TERS-VRAI)
(TRIBUER)=(TRIBU),(TRIBUTION)
(TRIOMPHER)=(TIOMPHER)
(TROMPER)=(TROMP),(TROMPRE)
(TRONER)=(THRONER)
(TROUVER)=(TROUVERE)
(TUER)=(TUOIT)
(UN)=(UNE)
(UTIL)=(UTILE),(UTILEMENT),(UTILITE)
(VAINCRE)=(VAINCRONT),(VAINQUER),(VAINQUEUR)
(VANTER)=(VENT)
(VARIER)=(VARIE),(VARIENT),(VARIETE)
(VENDRE)=(VEND),(VENDE),(VENDIT),(VENTE)
(VENIR)=(VENU),(VIENT),(VENOIT),(VIENS),(VEN),(VIENN),(VIENNER)
(VERSER)=(VERSE)
(VERIT)=(VERITABLE),(VERITE),(VERITE-CI),(VERITEZ)
(VERTU)=(VERTUEUX)
(VICTOIRE)=(VICTORIEUX)
(VIOLEN)=(VIOLEMMENT),(VIOLENTS)
(VIOLABLE)=(VOILABLE)
(VOEU)=(VOEUX)
(VISIB)=(VISIBILITE),(VISIBLE),(VISIBLEMENT),(VISIBLES)
(VOILA)=(THERE)
(VOICI)=(VOICE)
(VOQUER)=(VOQU),(VOCATIF),(VOCATION)
(VOIAGER)=(VOYAGE)
(VOIR)=(VERR),(VU),(VOYE),(VO),(VOYER),(VUE),(VUS),(VERRER)
(VOIR)=(VOIS),(VOIT),(VOY),(VOI)
(VORISER)=(VORI)
(VOULOIR)=(VEUILL),(VEUILLER),(VOUDRE),(VEULENT),(VE)
(VOULOIR)=(VEUX),(VEUT),(VOUL),(VOULU),(VOULUST),(VOUDR)
(VOULOIR)=(VOULOIS),(VOUNDROI),(VOULR)
(VOUS)=(VOS),(VOTRE)
(ALLER)=(AILLENT),(ALLE),(ALLEES)
(CEVOIR)=(COIT),(COIS),(CU),(CUI),(CEU)
(PRENDRE)=(PRENDR)
(POUVOIR)=(PUT),(POUV)
(PRENDRE)=(PREN),(PREMDRER),(PRI)
(FLATERIE)=(TRIXXX)
(CITER)=(CITEE)
(METTRE)=(M1)
(NAITRE)=(NEE)
APPENDIX Y

GLOSSARY OF WORD LISTS FOR ENROOT

LBAS, LSYN      These two lists make up the synonym list. LBAS contains
                the base or root words, and LSYN contains the synonyms
                to these roots.
KSUF            A list of all suffixes.
LPRE            A list of all prefixes.
IDROPCHR        These words drop a certain number of letters from the end
                of the word.
IER             These words drop from 0-6 letters and add ER.
LAD             These words have a number of letters added to the end of
                the word. The letters are obtained from the corresponding
                entry in JAD.
[illegible]     These words drop 0-6 letters and add an R.
LRE             These words drop 0-6 letters and add RE.
LIR             These words drop 0-3 letters and add IR.
LAD1            These words have one (1) letter added to them. The letter
                is obtained from the corresponding entry in JAD1.
IETOER          These words change the final E to an R.
ILOSES          These words lose their final S and have no other changes
                made.
IREMAIN         These words have no changes made at all.
IS34            These are three and four letter words which lose their
                final S.
IS5678          These are five, six, seven, and eight letter words which
                lose their final S.

APPENDIX Z

ENROOT SAMPLE OUTPUT

KEYWORD
ABAISSER ABANDONNER ABATTRE ABIMER ABIRAM ABJURER ABOLIR ABOMINABLE
ABONDER ABORD ABOUTIR ABSALON ABSOLU ABSOUDRE ABSTENIR ABSTRAIT ABSURD
ABUSER

FIELD RIPS CONTEXT
ABAISSE ABANDON ABANDONNE ABANDONNEE ABANDONNER ABATTRE ABIME ABIMER
ABIME ABIRAM ABJURER 2 ABOLIR ABOLIRENT ABOLISSENT ABOLIT 2 ABOMINABLE 2
ABOMINABLES ABOMINATION ABONDANCE ABONDANTE ABONDEROIT 3 ABORD ABOUTI
ABOUTIRONT ABOUTISSENT ABRI ABSALON ABSOLU ABSOLUE ABSOLUMENT ABSOLUTION
ABSOUDRE ABSOUS ABSOUS ABSTENIR ABSTENU 2 ABSTINT ABSTRAITES 7 ABSURDE
ABSURDUM ABUS ABUSANT ABUSE ABUSENT ABUSE?

APPENDIX AA

PIP FLOWCHART*

[Flowchart. Recoverable boxes, in order: START; Read parameters;
Construct I/O formats; Read text group; Get next word; Find root using
ENROOT; Store word-root pair; a decision box whose text is illegible,
with a branch labeled "no"; Alphabetize words; Print listing; STOP.]

*This flowchart shows only those portions of PIP used in this study.
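
The flowchart steps (get the next word, find its root using ENROOT, store the word-root pair, alphabetize, print the listing) can be illustrated with a minimal modern sketch. It is not the original program: the Python language, the function names `find_root` and `pip_listing`, the tiny sample tables, and the order in which the lists are consulted are all assumptions made for illustration. The sample entries are taken from the synonym list above, and the rule lists mimic the glossary descriptions of IER, ILOSES and IREMAIN.

```python
from typing import Dict, List, Tuple

# Hypothetical miniature of the synonym list: each LSYN variant maps to its LBAS root.
SYNONYMS: Dict[str, str] = {
    "ROY": "ROI",
    "ROIS": "ROI",
    "MESME": "MEME",
    "SCAVOIR": "SAVOIR",
}

# Hypothetical IER-style list: word -> number of letters to drop before adding ER.
DROP_ADD_ER: Dict[str, int] = {"MENANT": 3}

# Hypothetical ILOSES-style list: words that only lose their final S.
LOSE_FINAL_S = {"PERES"}


def find_root(word: str) -> str:
    """Return a root for one word; the lookup order here is an assumption."""
    w = word.upper()
    if w in SYNONYMS:
        return SYNONYMS[w]
    if w in DROP_ADD_ER:
        drop = DROP_ADD_ER[w]
        stem = w[: len(w) - drop]  # drop == 0 leaves the word intact
        return stem + "ER"
    if w in LOSE_FINAL_S and w.endswith("S"):
        return w[:-1]
    return w  # IREMAIN-style fallback: no change at all


def pip_listing(text: str) -> List[Tuple[str, str]]:
    """Collect (word, root) pairs for a text and return them alphabetized."""
    pairs: List[Tuple[str, str]] = []
    for token in text.split():
        word = token.strip(".,;:!?()\"'").upper()
        if word:
            pairs.append((word, find_root(word)))
    return sorted(pairs)


if __name__ == "__main__":
    for word, root in pip_listing("le roy, les rois"):
        print(f"{word:<12}{root}")
```

Keeping the variants and rule lists as plain lookup tables mirrors the glossary, where each list names the words to which one kind of change applies; a real run would load the full lists rather than the few sample entries used here.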