A METHOD OF TRANSLATING SIMPLE CHINESE TEXTS INTO ENGLISH BY MACHINE

Thesis for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
David Djen-Hsien Cheng
1963

This is to certify that the thesis entitled A METHOD OF TRANSLATING SIMPLE CHINESE TEXTS INTO ENGLISH BY MACHINE presented by David Djen-Hsien Cheng has been accepted towards fulfillment of the requirements for the Ph.D. degree in Electrical Engineering.

ABSTRACT

A METHOD OF TRANSLATING SIMPLE CHINESE TEXTS INTO ENGLISH BY MACHINE

by David Djen-Hsien Cheng

In translating Chinese to English by machine there are three areas to be considered, namely (1) dictionary storage and retrieval of Chinese characters, (2) selection of the correct meaning of the word, and (3) syntax. In view of the magnitude of the task involved, this thesis is confined to the first and third areas, for which an automatic translating system is developed. Moreover, the system is primarily designed to translate simple Chinese texts where the writings do not involve complicated syntactic structures.

Using the radical system, a method of storage and retrieval of Chinese characters by machine is described. An evaluation of the radical-oriented look-up system is made. In addition, algorithms are developed to translate two important word order phenomena in Chinese, the idiomatic expression and the frame construction. The dictionary storage and look-up program written for the CDC 160-A computer and machine results of a sample translation are also presented.
The syntax system involves the analysis of word order structures in a Chinese sentence and the synthesis into the correct counterpart in English. Certain Chinese words are used in a sentence for unique purposes, and the translation of such cases can be processed by machine easily. The main syntax translator is derived from the concept of the syntactic unit (synit), which is made of a word or a string of words bound by a certain relationship. Chinese syntactic structures can be analyzed by studying the relative sequences of the synits. Detection of particular sequences of synits therefore provides the key to successful translation into English. Algorithms for the complete syntax system are developed, and simulated results presented.

A METHOD OF TRANSLATING SIMPLE CHINESE TEXTS INTO ENGLISH BY MACHINE

by David Djen-Hsien Cheng

A THESIS
Submitted to the School of Advanced Graduate Studies of Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of Electrical Engineering
1963

ACKNOWLEDGEMENTS

The author wishes to express his sincere appreciation to Dr. Gerard P. Weeg, Professor of Electrical Engineering and Mathematics, for his guidance and encouragement in the development of this thesis.

To his parents, he is ever grateful for their constant encouragement and understanding.

Sincere thanks are extended to Dr. Lawrence W. Von Tersch, Dr. Richard J. Reid, Dr. Howard E. Campbell and Mr. James P. Wang for serving on the Guidance Committee.

TABLE OF CONTENTS

CHAPTER
INTRODUCTION
  1. Characteristics of a Language
  2. Goals of Mechanical Translation
  3. Developments Made in Machine Translation of Chinese to English
  4. Objectives of the Thesis
I. DICTIONARY RETRIEVAL AND STORAGE
  1. Nature of the Chinese Character
  2. The Radical System
  3. Word Look-Up Based on the Radical System
  4. Evaluation of the Radical Look-Up System
  5. Dictionary Information Storage
  6. Some Solutions to the Meaning Problem
     a. Idioms
     b. Frame Constructions
  7. An Algorithm of Idiom and Word Look-Up
II. SYNTAX
  1. Introduction
  2. Recognition Grammar Analysis
  3. Discussion of the Syntax Flow Chart and Associated Algorithms
     a. Processing "Past-Tensers"
     b. Deletion of Words
     c. Formation of Syntactic Unit ("Synit")
     d. Synit Sequences Involving 的
     e. Re-formation of Synit of Third Order
     f. Other Synit Sequences
     g. English Equivalents
  4. Conclusion

LIST OF TABLES

1. Various Locations of Radicals
2. Frame Constructions
3. Look-Up Results by Hand
4. An Illustrative List of Classifiers
5. Code Assignment for Synits I
6. Code Assignment for Synits II
7. A List of Frequently Used Prepositions in Prepositional Synits

LIST OF FIGURES

1. Information Box
2. Algorithmic Chart for Processing Frame Constructions
3. Dictionary Information Storage in CDC 160-A

[Figure 2. Algorithmic Chart for Processing Frame Constructions]

7. An Algorithm of Idiom and Word Look-Up

The system of word storage and look-up designed for the CDC 160-A computer is now described in two parts. The first part deals with storage of dictionary information into memory. The second part describes the look-up procedures.

Each word needs 10 (octal) memory locations to store its information, which consists of two idioms involving the word and one meaning*. For illustration take the word [illegible]
; from the radical list under its radical in Appendix 3, the word is identified to be numerically equivalent to 2150. Two two-word idioms involve this word as the first word. The numerical equivalents of the second words of the two idioms are 3050 and 3030 respectively, and the locations of the English equivalents of the idioms are 5110 and 5020 respectively. Both idioms are nouns (numerical code 12), and have no significant syntactic functions. The meaning of the word by itself is "to learn", which is a verb (numerical code 41) and whose English equivalent can be found at location 3020. Hence, the memory locations 2150-2157 inclusive shall store the information associated with the word in the following fashion:

*For convenience, we shall do away with the multiple meaning problem by incorporating only the correct meaning of the word in its context. This does not hamper the basic principles of storage and look-up, since any additional meaning of the word merely increases the memory space while the storage and look-up formats remain similar. Moreover, the restriction to two idioms clearly does not alter the basic process, but for a small computer such as the CDC 160-A it does make the memory go farther.

Location   Content   Description
2150       3050      Digital equivalent of the second word of the first idiom
2151       00* 12    Syntactic function and part-of-speech of the first idiom
2152       5110      Location of English equivalent of the first idiom
2153       3030      Digital equivalent of the second word of the second idiom
2154       0012      Syntactic function and part-of-speech of the second idiom
2155       5020      Location of English equivalent of the second idiom
2156       0041      Syntactic function and part-of-speech of the word
2157       3020      Location of English equivalent of the word

Figure 3 Dictionary Information Storage in CDC 160-A

The memory section of 1000-4777 inclusive in the CDC 160-A is allocated for storing dictionary information of all the words in Appendix 3.
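The eight-cell record of Figure 3 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the memory is modeled as a dictionary from octal address to a four-digit octal string, and the names `store_word`, `idiom1`, `idiom2` and `word` are hypothetical and do not come from the thesis or from the CDC 160-A program in Appendix 5.

```python
# Hypothetical model of the record layout of Figure 3.
# Memory is a dict: octal address (int) -> 4-digit octal string.

def store_word(memory, base, idiom1, idiom2, word):
    """Store one dictionary record in the eight cells base .. base+7.

    idiom1, idiom2: (code of second word, function + part-of-speech,
                     location of English equivalent)
    word:           (function + part-of-speech,
                     location of English equivalent)
    """
    cells = [*idiom1, *idiom2, *word]      # 3 + 3 + 2 = 8 cells
    for offset, value in enumerate(cells):
        memory[base + offset] = value
    return memory


# The worked example of the text: the word with numerical equivalent 2150.
memory = {}
store_word(
    memory,
    0o2150,
    idiom1=("3050", "0012", "5110"),
    idiom2=("3030", "0012", "5020"),
    word=("0041", "3020"),
)

for address in sorted(memory):
    print(f"{address:o}  {memory[address]}")
```

Because every record occupies 10 (octal) cells, the numerical equivalent of a word doubles as the base address of its record, which is why no separate index is needed in this layout.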
This information must be stored sequentially; that is to say, storage begins at 1000 with the dictionary information of the first word, continues at 1010 with the second word, and so on to 4770 with the punctuation mark "!". An end-of-list sign is necessary, and 0077 is chosen for that purpose. The algorithmic chart for inputting the dictionary is shown in Figure 4.

* 00 designates no syntactic function.

1. Set K = 1000.
2. Input 4 octal digits.
3. If the input is the end-of-list sign (0077), go to the next stage.
4. Otherwise store the input into location K, set K + 1 -> K, and return to step 2.

Figure 4 Algorithmic Chart for Inputting Dictionary Information

The look-up process begins with finding the equivalent digital codes of the words to be translated. The method is outlined in Section 3 of this chapter. Suppose the Chinese passage below is to be translated; then from Appendix 3 the equivalent codes are found and indicated directly below the corresponding words.

[Chinese passage illegible]
2470 1040 3550 4340 4320 1230 2150 3030 2000 3400 4670 2750 1240 3660 3110 4710

These numerical codes become the input to the machine. An end-of-list sign is again needed, and 0077 is again used.

The first step of machine look-up is the search for idioms. If a combination of adjacent words forms a single meaning, it must be found at this stage; otherwise the single meanings of the words will be output. It should be noticed that the problem of multiple meaning actually lessens with an increasing number of idioms in a passage, since these idiomatic meanings are unique. The procedure of idiom look-up is described in the example in Section 6a. The algorithmic chart for the combined idiom and word look-up is shown in Figure 5.

The routine to search frame constructions can be conveniently inserted after the idiom and word look-up. The combined program of dictionary storage and retrieval written for the CDC 160-A computer is shown in Appendix 5, and machine results of translating the given Chinese passage are also included. The results worked out by hand are shown below.

[Figure 5. Algorithmic Chart for Idiom and Word Look-Up]

Table 3 Look-Up Results by Hand

Words in Passage          Syntactic Function   Part-of-Speech   Location of English Equivalent
[illegible]               00                   12               2420
[illegible]               00                   31               3410
[illegible] (two words)   00                   41               2700
[illegible]               00                   15               6110
[illegible]               00                   63               6400
[illegible] (two words)   00                   12               5020
[illegible]               11                   [illegible]      0150
[illegible] (two words)   12                   82               7030
[illegible] (two words)   13                   11               6010
[illegible] (two words)   00                   42               6120
.                         00                   91               6600

It can be seen that the machine results and hand results are identical.

II. SYNTAX

1. Introduction

Syntax is the process by which words combine to form phrases and sentences according to certain grammatical rules. We therefore begin by assuming that every sentence has a structure, that is, that each string of words which purports to be a sentence is as a matter of fact a sentence. With the string of Chinese words and its correct meanings and grammatical properties supplied to the computer, the machine must be made to process these words into the proper order for synthesis into English. This means the grammatical contents of the words are to be manipulated while the conceptual content of the message remains constant throughout the automatic processing.
In general, grammatical analysis of a sentence may aim at various results. The method to be used in the analysis may differ considerably, depending on the particular aim at hand. One can set up a generative grammar to produce strings of words which are sentences of a given language. Another type of grammar may be aimed at distinguishing sentences from non-sentences. However, these grammars do not analyze existing features of a sentence. For this, a recognition grammar can be set up which takes a given sentence of a language and finds the syntactic structures of the sentence. The analysis to be taken here is that of recognition grammar, and assumes that in one way or another there is a decision involved concerning the syntactic relationship of each word string.

2. Recognition Grammar Analysis

The two common methods of describing a sentence are phrase structure analysis and dependency analysis. The former, also known as immediate-constituent analysis, starts at the syntactic level and considers a step-by-step breakdown into components of increasingly lower order of complexity. Dependency analysis, on the other hand, breaks the sentence up into a tree formation with the predicate as the stem. P. Garvin [7] conceived a method similar to that of phrase structure analysis, with the significant difference that the minimum units, the words, are taken first and these units are gradually fused into units of higher order of complexity. This thesis follows that approach; in this section a recognition grammar is developed. This leads to an algorithm for translating Chinese to English.

Similar to the process of translation by human beings, syntactic analysis should follow that of look-up and meaning determination of words. As Oettinger [8] has stated, word-by-word translation is a linear approximation to the more sophisticated transformations capable of mapping elements of the source language into elements of the target language.
It is generally recognized that these more sophisticated transformations will be functions, not of isolated words, but of words and their context. However, jumping from word-by-word analysis to an analysis of contextual influence in the broadest sense is not likely to be ideal. Initially restricting analysis to linguistic context, and to a small neighborhood of a word at that, is more likely to lead to results both linguistically significant and concrete.

Therefore, the initial system analysis is to provide for signs that lead first to the readily determinable relations and gradually shift to more complex relations. The initial indicating signs are derived from the part-of-speech codes. From there, groupings of words into possible structural, sequential, meaningful combinations shall take place in terms of the grammatical information contents of the units already determined. The process of recognition is not carried out in an arbitrary manner but in a definite sequence, in which completion of the work in one stage of recognition insures the successful translation at the next stage. This combining process is built up until the sentence level is reached. The flow chart in Figure 6 describes the procedures involved in the syntax system.

INPUT: A string of words W1 W2 ... Wk ... Wp + their information boxes.
1. Processing "Past-Tensers"
2. Deletion of Words
3. Formation of Syntactic Unit (Synit) of First Order
4. Formation of Synit of Second Order
5. Formation of Synit of Third Order
6. Synit Sequences Involving 的
7. Re-formation of Synit of Third Order
8. Other Synit Sequences
9. Output of English Equivalents

Figure 6 Syntax Flow Chart

3. Discussion of the Syntax Flow Chart and Associated Algorithms

3a.
Processing "Past-Tensers"

The fact that Chinese is not an inflected language leads to the necessity of contextual analysis of verbs as to their tenses or aspects*. In one situation, single characters following certain verbs carry the grammatical connotation of past action. There are two words, 了 and 過, commonly used in this sense, which shall be called "past-tensers."

The function of the word 了 in a sentence is manifold, but in the case when it is used to convert a verb into past tense or completed aspect, it follows immediately after the particular verb. Thus [illegible] "to eat" becomes [illegible]了 "eaten", and [illegible] "to return" becomes [illegible]了 "returned".

When 過 is used as a sign for past tense it is also placed immediately after the verb whose tense it affects. Thus, "to have seen" in Chinese is [illegible]過 ("to see" being [illegible]), and "to have eaten" is [illegible]過.

The procedure for making verbs into past tenses is shown in Figure 7.

* Chinese linguists have been debating whether verb tenses really exist in the Chinese language. Some assert that the word "aspect" is perhaps more fitting to indicate the temporal order. However, the analysis here does not go to such depth, and the words "tense" and "aspect" will be used interchangeably.

[Figure 7. Algorithmic Chart for Detecting "Past-Tensers": starting with K = 1, INFO(Wk) is loaded; when POS(Wk) = 4x, K + 1 -> K and POS(Wk) is tested against 85 (了, 過); if it matches, the tense of W(k-1), a verb, is changed to past tense and INFO(Wk) is deleted.]

3b. Deletion of Words

There are words in Chinese which, when translated into English, are considered excess baggage. One such case is the usage of 的 after adjectives. The function of 的 here is to carry the modifying tone which is missing when the adjective is used alone in Chinese. Generally it is also grammatically incorrect if the Chinese adjective is not followed by 的. However, this is not true in English. Therefore, 的 can be deleted in this case without any misinterpretation.
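The two clean-up passes described so far (past-tensers after verbs, and the particle deleted after adjectives) can be sketched in a few lines of Python. This is a hedged illustration, not the thesis program: each word is modeled as a dictionary with a two-digit part-of-speech string `pos`, the field `tense` and both function names are hypothetical, and the part-of-speech code of the deleted particle is passed in as a parameter because it is not legible in the source. The value "80" in the usage lines is a placeholder only.

```python
PAST_TENSER = "85"  # part-of-speech code tested in Figure 7


def process_past_tensers(words):
    """Mark a verb (4x) as past when a past-tenser follows it,
    and drop the past-tenser itself."""
    out = []
    for w in words:
        follows_verb = bool(out) and out[-1]["pos"].startswith("4")
        if w["pos"] == PAST_TENSER and follows_verb:
            out[-1] = {**out[-1], "tense": "past"}  # change tense of W(k-1)
            continue                                # delete INFO(Wk)
        out.append(w)
    return out


def delete_particle_after_adjective(words, particle_pos):
    """Drop the particle when it directly follows an adjective (2x).

    particle_pos is an assumed parameter: the actual code of the
    particle is not recoverable from the text.
    """
    out = []
    for w in words:
        follows_adjective = bool(out) and out[-1]["pos"].startswith("2")
        if w["pos"] == particle_pos and follows_adjective:
            continue
        out.append(w)
    return out


# Usage with placeholder data (English glosses stand in for the words).
sentence = [
    {"pos": "21", "english": "big"},
    {"pos": "80", "english": ""},      # particle, placeholder code
    {"pos": "11", "english": "horse"},
    {"pos": "41", "english": "eat"},
    {"pos": "85", "english": ""},      # past-tenser
    {"pos": "93", "english": "."},
]
result = delete_particle_after_adjective(process_past_tensers(sentence), "80")
print([w["english"] for w in result])
```

Both passes only look at the immediately preceding word, which mirrors the statement in the text that the past-tenser and the particle directly follow the word they affect.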
Another case of excess baggage involves certain usages of units or classifiers. It is a rather interesting phenomenon in Chinese that when a number is used with a substantive, a unit or a classifier must also be included. There are various types of classifiers to associate with the substantives. An illustrative list of these is shown below:

Table 4 An Illustrative List of Classifiers

The classifier for [illegible] (horse) is [illegible]
The classifier for [illegible] (car) is [illegible]
The classifier for [illegible] (letter) is [illegible]
The classifier for [illegible] (tree) is [illegible]
The classifier for [illegible] (dog) is [illegible]
The classifier for [illegible] (house) is [illegible]

Thus, "two horses" and "three cars" in Chinese literally mean "two classifier (horse) horses" and "three classifier (car) cars" respectively.

An extension of this phenomenon is the usage of more general terms like the adjectival pronouns "this" and "that" to describe a substantive instead of a specific number. Here, too, the classifier must be used: "this horse" and "that car" in Chinese each contain the classifier between the adjectival pronoun and the substantive. In the case of specific numbers, the classifier should be retained for the purpose of clearer interpretation. But when used with adjectival pronouns, it can be deleted because the modified terms, like horse and car, do not require the classifiers in English usage.

Algorithmic charts for the deletion processes are shown in Figures 8 and 9.

3c. Formation of Syntactic Units

Analyzing the relations among the components of any particular passage presupposes a prior cataloguing of both the possible significant relations among passage components and the properties of representations that reflect these relations. The relations in question are formal ones, as between adverb and verb, or between noun and adjectival modifier. These relations are typically reflected in the order of the related representations or in structural similarities of the related representations. For example, in Chinese the adjectival modifier always precedes its noun, and adverbs precede verbs.
These formally fixed relationships constitute the very information which is needed to construct correspondence with the target language.

[Figure 8. Algorithmic Chart for Deletion of 的 Preceded by Adjective: INFO(Wk) is loaded; if POS(Wk) = 9x, go to the next stage; if POS(Wk) = 2x, K + 1 -> K and, if Wk is 的, INFO(Wk) is deleted.]

[Figure 9. Algorithmic Chart of Deletion of Classifiers Preceded by Adjectival Pronoun: INFO(Wk) is loaded; if POS(Wk) = 9x, go to the next stage; if POS(Wk) = 15, K + 1 -> K and, if POS(Wk) = 63 (classifier), INFO(Wk) is deleted.]

Therefore, to establish the relationship among words, we set out to formulate constituent patterns of words in terms of their formal grammatical properties. The immediately accessible ones are their part-of-speech codes. We begin by listing the very simple relationships. When adjacent words satisfy any of the established relationships, they can be bound to form a unit which shall be called a syntactic unit, or "synit" for short. A synit is not necessarily restricted to one or two words only; rather, it can include a sequence of words as long as the adjacency relationship is satisfied. Gradually the synits of lower orders, starting with the first order, shall be developed into the composition of synits of higher orders.

In formulating the set of rules to form synits we rely not on linguistic intuition or pre-existing descriptive analysis but on the study of a large number of passages derived mainly from the two books previously mentioned [3] [4]. This has the immediate advantage that one can be sure, even before testing, that one has a set of rules of fairly wide applicability. Furthermore, in dealing directly with the writings, one gains familiarity with the problem areas that may arise and thus is in a better position to resolve the ambiguities.
Inadequacies will inevitably occur, but in the light of accumulated results, this empirical approach does provide a working method toward the paramount aim.

The synits of first order are formed by directly searching the part-of-speech codes associated with the words. Assume the following designations:

Noun synit of the first order          - (N)
Adjective synit of the first order     - (A)
Verb synit of the first order          - (V)
Preposition synit of the first order   - (P)
Number synit of the first order        - (#)
Conjunction synit of the first order   - (C)
Special term synit of the first order  - (S)
Punctuation mark synit of the first order - ($)

The combination of words, each of which is represented by its part-of-speech code identical with that in Appendix 4, to form synits of the first order can be made according to the rigid order as follows:

(N) - 1x 1x 1x
(A) - 2x 2x 2x 32 2x 32
(V) - 4x 4x 4x 31 4x 33 4x 33
(P) - 5x
(#) - 62 [illegible] 61 63 61 14 53 52 61 [illegible]
(C) - 7x
(S) - 8x
($) - 9x *

Figure 10 Formation of Synits of First Order

* Each pair of symbols, such as 1x or 4x, is to be read as a two-digit decimal number whose least significant digit is unspecified.

Thus, for example, number synits of the first order are of the form 61 63, i.e. a numerical number followed by a number-associate noun; etc.

Synits of first order are developed into the make-up of synits of higher orders. The following is a list of the synits of second and third orders, derived from synits of first and second orders respectively. Assume double brackets for a second-order synit and triple brackets for a third-order synit, i.e. the noun synit of second order is represented by ((N)), and of third order by (((N))).

((N)) - (N) - (A) (N)
((V)) - (V)
((P)) - (P) (P) (N)
((#)) - (#) (N) (#)
((C)) - (C)
((S)) - (S)
(($)) - ($)

Figure 11 Formation of Synits of Second Order

(((N))) - ((#)) ((N)) - ((#)) - ((N))
(((V))) - ((V))
(((P))) - ((P)) ((N)) - ((P))
(((C))) - ((C))
(((S))) - ((S))
((($))) - (($))

Figure 12 Formation of Synits of Third Order

It can be seen that there are nine part-of-speech classifications (Appendix 4), and the types of synits of the first order, derived directly from these part-of-speech codes, number eight. The numbers of different types of synits of second and third orders are further reduced to seven and six respectively. The formation of synits is to set the stage for the next step of the syntax process, that of examining the sequential ordering of the synits. Therefore, the contents of synits of the highest order must be so constituted that they are compatible with the ensuing needs. For certain, each synit shall represent a distinct entity whose properties are only known through the synit with which it is identified.

An examination of the synits of the third order shows that the conjunction, special term, and punctuation mark synits are identical in content with the respective synits of first and second orders. In the case of conjunction and punctuation mark synits, their functions are those of subordinating, coordinating, and ending a clause, and there is no meaning or concept relationship between these words and other types of words. Special terms, as indicated, are words or groups of words whose grammatical roles are quite complex and therefore should be isolated.

The remaining synits of third order are noun, verb, and preposition synits. One may contemplate a further merging of these synits into synits of higher order. However, a closer investigation shows that this will defeat the purpose of synit formation.
If the noun synit is combined with the verb synit, then their combination will generalize the conceptual or thought content of the new synit to such an extent that the unique distinction of synit content is lost. For the same reason, combining the preposition synit with the verb synit leads to a similar result. The remaining possibility is between noun and preposition synits, but they are both formed from noun synits of second order; therefore the significant relationship between these two synits is already processed during the formation of third order synits. It is then concluded that third order is the upper limit of forming synits. Further evidence will be found in the sections on synit sequences.

The algorithms for formations of synits are now described. The string of Chinese words ending with a punctuation mark is again represented by

W1 W2 ... Wk ... Wp, where
W1 = first word of string
Wk = kth word of string
Wp = a punctuation mark,

and abbreviations for the different fields of the information box also conform with those previously designated.

Each word shall be preceded by a symbol indicating which type of synit the word belongs to. The symbols for the different orders of synits are arbitrarily assigned numerically as follows:

Table 5 Code Assignment for Synits I

(N) - 11    ((N)) - 12    (((N))) - 13
(A) - 21
(V) - 41    ((V)) - 42    (((V))) - 43
(P) - 51    ((P)) - 52    (((P))) - 53
(#) - 61    ((#)) - 62
(C) - 73    ((C)) - 73    (((C))) - 73
(S) - 83    ((S)) - 83    (((S))) - 83
($) - 93    (($)) - 93    ((($))) - 93

Notice that the composition of words in all three orders of conjunction, special term and punctuation mark synits is identical; consequently the code for each is the same.

Each word which is an initial element of a synit is preceded by one of the above symbols, which serves the same purpose as the left-hand bracket. Other words belonging to the same synit also need preceding symbols, which are not necessarily identical to that of the initial element.
These symbols merely indicate that more words are included in the synit, and are assigned as:

Table 6 Code Assignments for Synits II

Word interior to the noun synit          - 15
Word interior to the adjective synit     - 25
Word interior to the verb synit          - 45
Word interior to the preposition synit   - 55
Word interior to the number synit        - 65
Word interior to the conjunction synit   - 75
Word interior to the special term synit  - 85

Suppose the words which form a noun synit of the first order are W3 W4 W5 (i.e., p > 5). Then they are recognized by the machine as

11 INFO(W3) 15 INFO(W4) 15 INFO(W5).

The symbols for the synits associated with each word, such as 11 and 15, shall be abbreviated in general by SYB_Wk.

As each word in the string is processed by the computer, the machine must be provided with three pieces of information in order that the appropriate choice of synit sign preceding the word, and of the location where the word is to be stored, can be made. These are (1) the symbol of the current synit, (2) the first location of the list of INFO(Wk)'s and their preceding SYB_Wk's starting with W1, and (3) the location where the current INFO(Wk) and its associated SYB_Wk must be stored. For this purpose, then, three memory locations, to be named s, f, and z, are allocated to store the respective information.

The procedures involved in forming the various synits follow identical patterns. Since the formations of the noun synits embrace more possibilities than others, they are described in detail in the following charts; formation of other synits can use the same charts, with the only difference being that the part-of-speech code and the synit symbol preceding the word are changed wherever applicable. The algorithmic charts for forming noun synits of first, second, and third order are given in Figures 13, 14 and 15 respectively.

3d. Synit Sequences Involving 的

If the same or very similar structural devices are used to represent relations in both source and target languages, the transformation of relation representations would present little difficulty.
However, Chinese and English do use different word orders to denote the same grammatical relationships. Therefore, synthesis of Chinese sentences into English requires not only a clear understanding of synit content, but of their sequential formations in a sentence as well.

[Figure 13. Algorithmic Chart for Forming Noun Synit of First Order: with K = 1 and the content of f loaded into z, INFO(Wk) is loaded. If POS(Wk) = 9x, 93 INFO(Wk) is stored starting at the location specified in z and control passes to the next stage. If POS(Wk) = 1x: when the content of s is 11, 15 INFO(Wk) is stored at the location specified in z; when the content of s is 61, 65 INFO(Wk) is stored there; otherwise 11 is stored in s and 11 INFO(Wk) is stored at the location specified in z. The content of z is then adjusted to indicate the storage location of the next word, and K + 1 -> K. For any other POS(Wk), which must satisfy one of the part-of-speech codes, the decision is made accordingly to go to the particular synit formation chart.]

[Figure 14. Algorithmic Chart for Forming Noun Synit of Second Order: with K = 1 and the content of f loaded into z, SYB_Wk and INFO(Wk) are loaded. If SYB_Wk = 93, 93 INFO(Wk) is stored starting at the location specified in z and control passes to the next stage. If SYB_Wk = 11: when the content of s is 12, 15 INFO(Wk) is stored at the location specified in z; when it is 62, 65 INFO(Wk) is stored; when it is 52, 55 INFO(Wk) is stored; otherwise 12 is stored into s and 12 INFO(Wk) is stored at the location specified in z. The content of z is then adjusted to indicate the storage location of the next word, and K + 1 -> K. For any other SYB_Wk, which must satisfy one of the symbols for synits of first order, the appropriate decision is made to go to the particular synit formation chart.]

[Figure 15. Algorithmic Chart for Forming Noun Synit of Third Order: with K = 1 and the content of f loaded into z, SYB_Wk and INFO(Wk) are loaded. If SYB_Wk = 93, 93 INFO(Wk) is stored starting at the location specified in z and control passes to the next stage. If SYB_Wk = 12: when the content of s is 13, 15 INFO(Wk) is stored at the location specified in z; when it is 53, 55 INFO(Wk) is stored; otherwise 13 is stored into s and 13 INFO(Wk) is stored at the location specified in z. The content of z is then adjusted to indicate the storage location of the next word, and K + 1 -> K. For any other SYB_Wk, which must satisfy one of the symbols for synits of second order, the decision is made accordingly to go to the particular synit formation chart.]

Synits partition a sentence into localized thought groups, but to weld these different groups into a complete expression requires further investigation of the relative sequences of these synits. Additionally, the special terms and the potential syntactic functions of individual words also provide key information toward syntax analysis.

As indicated in Section 5 of Chapter I, a classification of part-of-speech called special terms is necessary due to the complexity of the grammatical uses of these terms. Possibly the most frequently used word among the special terms is the word 的, which is one of the chief means for the detection of certain relationships among synits in a sentence.

All synits now considered are of third order. ([ ] is used as equivalent to ((( ))).) The particular sequence of synits that is significant is formed as follows:

[verb] [noun]1* [的] [noun]2

* Presence is optional in certain cases.

However, detection of such a sequence of synits is not sufficient to guarantee a synthesis procedure into English. The reason is that not all verb synits will satisfy the given formation. Consequently, the information box of the verb in the verb synit must be examined, and in particular the potential syntactic function part of the word. The indicating sign is thus found from the combination of its part-of-speech and its potential syntactic function.
In case of more than one verb in the verb synit, it is assumed that the initial verb should be examined; the assumption is made on the basis that the first element generates the action or thought involved in the verb synit. The verbs must also be separated into two categories. The first category includes intransitive verbs and verb-prepositions, and the second includes transitive verbs only. The intransitive verb-的 and the verb-preposition-的 sequences are formed in the following order:

1. [intransitive verb] [noun]1* [的] [noun]2
2. [verb-preposition] [noun]1* [的] [noun]2

* Presence is optional.

An example for each case is shown below, each synit being given by its word-for-word English equivalent:

1. [to live in] [U.S.] [的] [people]
   Translation: People who live in U.S.
2. [to be in] [Antarctic continent] [的] [one classifier place]
   Translation: One classifier place which is in Antarctic Continent.

The synthesis procedure takes the second noun synit, [noun]2, and places it in front of the predicate clause [verb] [noun]1; the word 的 is deleted. To make the English translation correct, the conjunctive pronoun "which", "who", or "where" must be inserted following [noun]2. The choice should agree with the substantive in [noun]2, but this is difficult if not impossible. We shall merely indicate that a [WH] term is to be inserted to stand for any of "which", "who", or "where".

The 的 in the above case has only one possible function because of the intransitivity of the action involved. If a transitive verb takes the place of the intransitive verb in the same formation of synits, then the noun synit [noun]1, whose presence was optional previously, is now a necessary element to follow the verb. The usage of 的 here can have two interpretations. Its function may be that of making the noun synit preceding it possessive. Synthesis then merely involves combining this noun synit, as a modifier, with the noun synit following 的; 的 is deleted but no order change is necessary. On the other hand, 的 may be used to effect a change similar to the previous intransitive verb case. Examples:

1. [to visit] [U.S.] [的] [scholar(s)]
   Translation: Scholar(s) [WH] visit U.S.
2. [to teach] [English] [的] [person(s)]
   Translation: Person(s) [WH] teach English.

It is important that the two different functions of 的 be distinctly recognized. The proper choice again goes back to the stage of meaning determination, where human intervention is necessary.

If 的 still remains in the passage without satisfying any of the formations thus far considered, then it must follow either a noun synit or a verb synit. In the [noun]-的 case, the result is to make the noun synit possessive, i.e. [noun]'s, and in the verb case the verb is put into the past tense. 的 is deleted after the change is completed. The algorithmic charts for processing the synit sequences involving 的 are shown in Figures 16 and 17.

3e. Re-formation of Synit of Third Order

It is evident that the synit order may be altered after processing all synit sequences involving 的. The possibility arises from the fact that some of the synits which were separated now form adjacent synits which are so related that they can be combined into a single synit according to the rules shown in Section 3c (Figure 15).
However, these established rules were based on information of synits of second order. Otherwise, the procedures follow the same format. In Figure 18 the algorithmic chart is given with modifications made to deal with synits of third order only.

Figure 16. Algorithmic Chart for Translating a Chinese Phrase Whose Sequence of Synits is [V] [N]1 [的] [N]2*. (Flow chart not reproduced. The chart loads INFO (Wk) for successive K and passes to the next stage when POS (Wk) = 9X. When POS (Wk) = 84 (的), it tests whether the synits in the neighborhood of 的 form the sequence [V] [N]1 [的] [N]2; if so, it rearranges the order of the synits into [N]2 [WH] [V] [N]1 and deletes 的.)

* The use of preceding signs to indicate the types of synits for words becomes significant here. The detection of a particular sequence of synits, such as [V] [N] [的] [N] above, is made on the basis of those signs. Subsequent syntax algorithms dealing with synit sequences also employ this scheme.

Figure 17. Algorithmic Chart for Processing Synit Sequences Formed by 的 Preceded by a Noun or Verb Synit. (Flow chart not reproduced. When POS (Wk) = 84 (的), the preceding word is examined: if it belongs to a noun synit, the noun synit is made possessive; if it belongs to a verb synit, the verb synit is put into the past tense. 的 is then deleted.)

Figure 18. Algorithmic Chart for Re-formation of Noun Synit of Third Order. (Flow chart not reproduced.)
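The two steps charted in Figures 16 and 17 can be illustrated with a minimal Python sketch. It is not the program of this thesis: the Synit record, the kind labels ("V", "N", "DE", "WH") and the verb_class strings are hypothetical names introduced here, the synits are assumed to be already formed, and only the intransitive and verb-preposition case is rearranged automatically, since the transitive case calls for human intervention.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Synit:
    kind: str             # hypothetical labels: "V", "N", "DE" (the word 的), "WH"
    text: str             # word-for-word English content of the synit
    verb_class: str = ""  # "intransitive", "verb_prep" or "transitive" (verbs only)

# Verb classes for which 的 has only one possible function.
UNAMBIGUOUS = {"intransitive", "verb_prep"}

def rearrange_relative(synits):
    """Figure 16 idea: [V] ([N]1) [的] [N]2 -> [N]2 [WH] [V] ([N]1)."""
    out = list(synits)
    d = 0
    while d < len(out):
        if out[d].kind == "DE" and d + 1 < len(out) and out[d + 1].kind == "N":
            start = None
            if d >= 2 and out[d - 1].kind == "N" and out[d - 2].kind == "V":
                start = d - 2      # [V] [N]1 precedes 的
            elif d >= 1 and out[d - 1].kind == "V":
                start = d - 1      # the optional [N]1 is absent
            if start is not None and out[start].verb_class in UNAMBIGUOUS:
                predicate = out[start:d]
                # [N]2 moves to the front, [WH] is inserted, 的 is dropped.
                out[start:d + 2] = [out[d + 1], Synit("WH", "[WH]")] + predicate
        d += 1
    return out

def resolve_remaining(synits):
    """Figure 17 idea: a leftover 的 marks a possessive noun or a past-tense verb."""
    out = []
    for s in synits:
        if s.kind == "DE" and out and out[-1].kind in ("N", "V"):
            prev = out.pop()
            suffix = "'s" if prev.kind == "N" else " (past tense)"
            out.append(replace(prev, text=prev.text + suffix))
        else:
            out.append(s)
    return out

def texts(synits):
    return [s.text for s in synits]
```

Applied to the first example of this section, rearrange_relative turns [to live in] [U.S.] [的] [people] into people, [WH], to live in, U.S.; a transitive verb in the same position is left untouched so that the choice between the two interpretations can be made elsewhere.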
The formation of the noun synit is again considered to exemplify the general procedure.

3f. Other Synit Sequences

It is obviously not possible to enumerate all the syntactic structures in the Chinese language which differ from English. Even if a certain sequence of synits is detected that effects an order change for synthesis into English, one may well find a similar sequence, using different words, which does not need such a change. In the synit sequences discussed in the preceding section, the order change is made on the basis of two factors, namely the presence of the word 的 and agreement on the syntactic function of the verb. Therefore, it is extremely difficult to establish general rules that indicate only the type of synits in a sequence without any regard to specific words.

In view of the irregularities in word structures in Chinese, only two significant synit sequences are found to be meaningful when their orders are changed. The first deals with a sequence initiated by a preposition synit. The second involves a statement of comparison. In both cases the syntactic function of certain specific words needs to be examined.

The formation of the prepositional sequence is

[preposition] [verb] [noun]*

* Presence is optional.

Some illustrations are first given, each synit being shown by its word-for-word English equivalent:

1. [to you] [to bid] [farewell]
   Translation: To bid farewell to you.
2. [for me] [to buy] [past-tenser]
   Translation: To buy past-tenser for me (or bought).
3. [from home] [to come]
   Translation: To come from home.

The synthesis involves rearranging these synits into [verb] [noun]* [preposition]. The flow chart for automatic processing is given in Figure 19. A preposition used for this particular function should be provided with a sign in its potential syntactic function field. A list of some of the prepositions more frequently used in the prepositional sequences is shown in Table 7.
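The rearrangement of the prepositional sequence can be sketched in Python as follows. This is an illustrative sketch only, under stated assumptions: each synit is a hypothetical (kind, text, movable) tuple, and the boolean movable stands in for the sign carried in the potential syntactic function field of the preposition.

```python
def reorder_prepositional(synits):
    """Rearrange [P] [V] ([N]) into [V] ([N]) [P] wherever the preposition is flagged."""
    out = []
    i = 0
    while i < len(synits):
        kind, _text, movable = synits[i]
        if kind == "P" and movable and i + 1 < len(synits) and synits[i + 1][0] == "V":
            group = [synits[i + 1]]
            if i + 2 < len(synits) and synits[i + 2][0] == "N":
                group.append(synits[i + 2])   # the noun synit is optional
            out.extend(group)                 # verb (and noun) come first
            out.append(synits[i])             # the preposition synit follows them
            i += 1 + len(group)
        else:
            out.append(synits[i])
            i += 1
    return out

def texts(synits):
    return [text for _kind, text, _movable in synits]
```

A preposition synit whose flag is not set is left in place, which is how a sequence that looks like [P] [V] but should not be reordered can be protected.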
Table 7. A List of Frequently Used Prepositions in Prepositional Synits

[The table lists seven prepositions, each in the frame [preposition] [verb] [noun]; the characters are illegible in the source.]

Figure 19. Algorithmic Chart for Translating a Chinese Phrase Whose Synit Sequence is [P] [V] [N]. (Flow chart not reproduced. When the synit symbol SYB (Wk) = 53, the chart tests whether the next two synits satisfy [V] [N]; if so, it rearranges [P] [V] [N] into [V] [N] [P].)

The statement of comparison usually uses certain comparative signs, one marking the comparative and one the superlative. Thus "taller" and "tallest" are formed in Chinese by placing the appropriate sign before the word for "tall". However, when a comparison is made in which one member is expressive of inequality, the word 比 is used. 比 is placed before the member expressive of inequality, and the measure involved is placed after it. Examples, in word-for-word form:

1. [I] [比] [he] [tall]
   Translation: I taller than he, or I am taller than he.
2. [比] [that] [classifier] [tree] [tall] [three] [times]
   Translation: Three times taller than that classifier tree.

The change of word order necessary to give a correct translation into English involves placing the adjective of comparison, in its comparative case, and the quantity, if specified, before the member which is compared. In Example 1, the word for "tall", or in general the adjective of comparison, is made comparative (i.e. "taller than") and placed before the element being compared ("he"). In a more complicated pattern such as Example 2, the additional part of the phrase, "three times", is placed before the comparative adjective "taller than". The word 比 is deleted after the word order is changed.

The sequence of synits is not directly responsible for recognition of a statement of comparison. The key to detection comes from the syntactic functions of the word 比 and the adjective of comparison.
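The word order change for a statement of comparison can be sketched in Python as below. The sketch rests on assumptions that are not part of the thesis: tokens are hypothetical (word, tag) pairs with the tags "BI" (the word 比), "ADJ_CMP" (adjective of comparison), "NUM" (number) and "CLF" (number-associated noun), and the small COMPARATIVE lookup is an illustrative stand-in for whatever supplies the English comparative form.

```python
COMPARATIVE = {"tall": "taller"}   # hypothetical lookup, for illustration only

def comparative(adjective):
    return COMPARATIVE.get(adjective, "more " + adjective)

def reorder_comparison(tokens):
    """比 - noun(s) - adjective (- number - associated noun)
    -> (number - associated noun -) comparative adjective - noun(s); 比 is deleted."""
    out = []
    i, n = 0, len(tokens)
    while i < n:
        if tokens[i][1] == "BI":
            # Look ahead for the adjective of comparison that closes the pattern.
            j = i + 1
            while j < n and tokens[j][1] != "ADJ_CMP":
                j += 1
            if j < n:
                compared = tokens[i + 1:j]          # the member being compared
                k = j + 1
                quantity = []
                if k + 1 < n and tokens[k][1] == "NUM" and tokens[k + 1][1] == "CLF":
                    quantity = tokens[k:k + 2]      # e.g. "three times"
                    k += 2
                out.extend(quantity)
                out.append((comparative(tokens[j][0]) + " than", "ADJ_CMP"))
                out.extend(compared)
                i = k
                continue
        out.append(tokens[i])
        i += 1
    return out

def words(tokens):
    return [word for word, _tag in tokens]
```

If no adjective of comparison follows 比, the tokens are passed through unchanged, which reflects the point made in the text that detection depends on the syntactic functions of both words.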
But 比 is a special term and therefore forms a synit by itself. The remaining words following 比 in the statement of comparison may form one noun synit. However, no ambiguity should arise, since the syntactic function of the adjective of comparison is detected first and the proper word order change is then made within the synit. The algorithmic charts for processing the statements of comparison are shown in Figures 20 and 21.

Figure 20. Algorithmic Chart for Processing Comparative and Superlative Words in Chinese*. (Flow chart not reproduced. When FUN (Wk) = 21 or 22, the next word is loaded; if its POS is 2X, that adjective is made comparative or superlative respectively, and INFO (Wk-1) is deleted.)

* For convenience, the syntactic functions of the comparative and superlative signs, in their roles of comparative and superlative cases, are arbitrarily assigned as 21 and 22 respectively.

Figure 21. Algorithmic Chart for Processing Statement of Comparison*. (Flow chart not reproduced. When POS (Wk) = 83 (比), the original word order 比 - noun(s) - adjective of comparison is rearranged to adjective of comparison (comparative case) - noun(s). When a number and its associated noun follow the adjective, the order becomes number - associated noun - adjective of comparison (comparative case) - noun(s). 比 is deleted.)

* The syntactic function of the adjective of comparison is arbitrarily assigned as 23.

3g. Output of English Equivalent

The final stage of translation, the output of the English counterpart of the Chinese passage, is a matching process, since the locations of the English equivalents of the words can be found from their information boxes. Presumably the English words are stored in the memory, and it is only necessary to go to the particular location and bring out the word for eventual print-out. The other fields in the information box, the part-of-speech and the potential syntactic function, no longer serve any purpose and can be deleted.

4. Conclusion

Although machine results are not available at this time, the syntax routines have been tested repeatedly by hand by following the algorithmic charts shown in this chapter. Some simulated results worked out by hand are shown in Appendix 6. The passages translated were selected to represent the various grammatical patterns in the Chinese language.

It has been emphasized previously that the translating system thus developed is designed to translate fairly elementary Chinese texts, as exemplified by the reference books used. There is no doubt that this system will encounter many difficulties when tried upon any text of a more complex nature, in both grammar and word usage. However, it is believed that the schemes developed, particularly the formation of synits, are of considerable value toward future work on mechanical translation from Chinese to English.

APPENDIX 1
THE RADICAL LIST

[The list of radicals is illegible in the source.]

APPENDIX 2
A LIST OF 1400 COMMONLY USED CHINESE CHARACTERS

[The character list is illegible in the source.]

APPENDIX 3
A SHORT LIST OF CHINESE CHARACTERS COMPILED ACCORDING TO THE RADICAL SYSTEM

The word is listed on the right side of the column. Its associated radical is indicated at the upper left corner of the first word of its list. The numerical equivalent for each word is arbitrarily assigned* and appears at the left of the word.

* Using the CDC 160-A computer, it is conceived that dictionary information for each word takes ten octal memory positions. Hence, starting at location 1000, the digital equivalents are accordingly assigned.

[The table of characters is illegible in the source. The legible numerical equivalents run from 1000 upward in steps of octal 10.]
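The assignment of numerical equivalents described in the footnote to Appendix 3 amounts to simple octal arithmetic, which the following Python sketch illustrates. The function names are hypothetical and the sketch is not the CDC 160-A program; it assumes only what the footnote states, namely a first location of 1000 (octal) and ten octal positions per word.

```python
FIRST_LOCATION = 0o1000   # first storage location, as stated in the footnote
ENTRY_SIZE = 0o10         # ten octal (eight decimal) memory positions per word

def numerical_equivalent(index):
    """Four-digit octal code of the word stored as entry number `index` (0-based)."""
    return format(FIRST_LOCATION + ENTRY_SIZE * index, "04o")

def entry_index(code):
    """Inverse mapping: position in the dictionary of a four-digit octal code."""
    offset = int(code, 8) - FIRST_LOCATION
    if offset < 0 or offset % ENTRY_SIZE:
        raise ValueError("not the first location of a dictionary entry: " + code)
    return offset // ENTRY_SIZE
```

Under this scheme the code of a word doubles as the memory address of its dictionary information, so no separate index table is needed for look-up.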
APPENDIX 4
CLASSIFICATION OF PART-OF-SPEECH OF THE CHINESE WORD

The numerical code for each classification is assigned arbitrarily. [The sample Chinese words given for each classification are illegible in the source.]

Nouns (1X)
   Nouns (11)
   Pronouns (12)
   Proper Nouns (13)
   Time-Associated Nouns (14)
   Adjectival Pronouns (15)
   Conjunctive Pronoun (16): [WH], i.e. which, who, others
Adjectives (2X)
   Adjectives (21)
   Adjectives of Comparison (22)
Adverbs (3X)
   Adverbs (31)
   Adverbs of Comparison (32)
   Adverbs of Condition (33)
Verbs (4X)
   Transitive Verbs (41)
   Intransitive Verbs (42)
   Auxiliary Verbs (43)
   Verb-Prepositions (44)
Prepositions (5X)
   Prepositions (51)
   Time-Related Prepositions (52)
Numbers (6X)
   Numerical Numbers (61)
   Pronominal Numbers (62)
   Number-Associated Nouns (Classifiers) (63)
Conjunctions (7X)
   Introductory (Subordinate) Conjunctions (71)
   Coordinate Conjunctions (72)
Special Terms (8X)
   Special Elements I (81)
   Special Elements II (82)
   比 (83)
   的 (84)
   Past Tensers (85)
Punctuation Marks (9X)
   Period, Comma, Colon, Semi-Colon, Brackets ( ) and [ ], Quotation Marks, Exclamation Mark

The two-digit number with X as second digit denotes the general category, such as 4X for verb, while a numerical second digit denotes the particular sub-field of part-of-speech, such as 42 for intransitive verb.

APPENDIX 5
LOOK-UP PROGRAM AND SAMPLE RESULTS USING CDC 160-A COMPUTER

Master Routine

Function   Execution Address
Code       or Operand          Comments
JPR        Subroutine I        Input dictionary information
LDD        75
STM        1000                First storage location = 1000
SBN        77                  End-of-list check
ZJF        03
AOB        03
NZB        10
JPR        Subroutine I        Input word to be looked up
LDD        75
STM        0105                Temporary storage 0105
SBN        77                  End-of-list check
NZF        02
           (Go to Next Stage)
JPR        Subroutine I        Input following word for idiom search
LDD        75
STM        0106                Temporary storage 0106
SBN        77                  End-of-list check
ZJF        05
LDN        03
RAM        0105
NZF        33
LDM        0105
STF        02
LDM        ****
SBM        0106                Check for first idiom in dictionary
ZJF        12
LDN        03
RAM        0105
STF        02
LDM        ****
SBM        0106                Check for second idiom in dictionary
NZF        12
LCN        02                  Output idiom information
STD        77
AOM        0105
JPR        Subroutine II
AOD        77
NZB        05
ZJB        54
LDN        [illegible]         Output word information
RAM        0105
JPR        Subroutine II
AOM        0105
JPR        Subroutine II
LDM        0106
STM        0105
SBN        77
NZB        63
           (Go to Next Stage)

Subroutine I

This routine inputs 4 octal digits and stores them in temporary location 0075.

Function   Execution Address
Code       or Operand
JFI        01
           ****
EXC        4102
INP        03
PJF        02
           0101
LDM        0101
LS3
SCM        0102
LS3
SCM        0103
LS3
SCM        0104
STD        75
PJB        24
NJB        25

Subroutine II

This routine prints out 4 octal digits stored in the location specified by the content of 0105.

Function   Execution Address
Code       or Operand
JFI        01
           ****
LDM        0105
STF        02
LDM        ****
STD        76
EXC        4104
LDD        76
LPC        7700
LS6
OTA
LDD        76
LPN        77
OTA
PJB        22
NJB        23

Sample Results

For the given Chinese passage: [Chinese text illegible in the source]

Input to machine:
2170 1040 3550 4340 4320 1230 2150 3030 2000 3100 4670 2750 1240 3660 3110 4710 0077

Output:
00 12 2420
00 31 3410
00 41 2700
00 15 6110
00 63 6400
00 12 5020
11 41 0150
12 82 7030
13 11 6010
00 42 6120

APPENDIX 6
SIMULATED TRANSLATION OF CHINESE PASSAGE INTO ENGLISH

The following three passages in Chinese are to be translated into English. The correct English translation is first presented beneath each passage. The translations using the syntax system developed in Chapter 2 are then carried out in the ensuing sections. Each string of words ending with a punctuation mark is first written, followed by the word-for-word translation. Details are stressed in processing the significant syntactic relationships existing in the passages. The trivial parts of the translating process are not indicated.

I. [Chinese text illegible in the source]

Translation: I do not know when this semester ends, and also do not know when our final exam begins. I only know that the teacher already said, "Before final examination begins, we must review well the old lessons."

II. [Chinese text illegible in the source]

Translation: Three years ago I went to your home to bid farewell to you, now I returned.

III. [Chinese text illegible in the source]
Translation: When the weather is very cold, even sea water can freeze. But at one place in the Antarctic Continent, there is a dead sea; when the temperature reaches sixty degrees below zero, this sea does not freeze. The scientists who went there to explore found, from experimental results, that it contained eleven times more salt than ordinary sea water. Therefore it does not freeze easily.

I. 1. Word-for-word: I / do not, not / to know / this / classifier / semester / at / what / time / to end
Deletion of word: Delete the classifier after "this".
Formation of synits of first order: (I) (do not, not to know) (this semester) (at) (what) (time) (to end)
Formation of synits of third order: [I] [do not, not to know] [this semester] [at] [what] [time] [to end]
Translation: I do not know this semester at what time end.

2. Word-for-word: Also / do not, not / to know / we / 的 / final exam / at / what / time / to begin
Formation of synits: [Also do not, not to know] [we] [的] [final exam] [at] [what] [time] [to begin]
Sequence of synits: [we] [的] -> [we] in possessive case -> [our]
Re-formation of synits: [Also do not, not to know] [our final exam] [at] [what] [time] [to begin]
Translation: Also do not know our final exam at what time begin.

3. Word-for-word: I / only / to know / teacher / already / to say / past-tenser
Processing past tense: to say + past-tenser -> said
Formation of synits: [I] [only to know] [teacher] [already said]
Translation: I only know teacher already said

4. Word-for-word: to be at / final examination / to begin / before
Frame construction: to be at ... before -> before
Formation of synits: [before final examination] [to begin]
The sequence of synits above is [P] [V], but the syntactic function of the word "before", as found by frame construction, will not be assigned so as to effect a change in the manner discussed in Section 3f of Chapter 2.
Otherwise the translation is incorrect.
Translation: before final examination begins

5. Word-for-word: we / must / with / old / 的 / lesson(s) / to review / well
Deletion of word: Delete 的 after "old", an adjective.
Formation of synits: [we] [must] [with old lesson(s)] [to review well]
Sequence of synits: [P] [V] [N] (optional) to [V] [N] (optional) [P], giving [to review well] [with old lesson(s)]
Translation: We must review well with old lesson(s).

II. 1. Word-for-word: Three year(s) ago / I / go to, to / your home / to you / to bid / farewell
Formation of synits: [three year(s) ago I] [go to, to] [your home] [to you] [to bid] [farewell]
Sequence of synits: [P] [V] [N] to [V] [N] [P]; [to you] [to bid] [farewell] becomes [to bid] [farewell] [to you]
Translation: Three year(s) ago I went to your home to bid farewell to you.

2. Word-for-word: Now / I / to return / past-tenser
Processing past-tenser: to return + past-tenser -> returned
Synit formation: [Now] [I] [returned]
Translation: Now I returned

III. 1. Word-for-word: to be at / weather / very / cold / 的 / time
Frame construction: to be at ... time -> when
Deletion of word: Delete 的 after "cold", as adjective.
Formation of synits: [When] [weather] [very cold]
Translation: When weather is very cold.

2. Word-for-word: Sea water / even / can / to freeze
Formation of synits: [Sea water] [even can to freeze]
Translation: Sea water even can freeze.

3. Word-for-word: But / to be in / Antarctic continent / 的 / one / classifier / place
Formation of synits: [But] [to be in] [Antarctic continent] [的] [one classifier place]
Sequence of synits: [verb-preposition] [noun]1 [的] [noun]2 -> [noun]2 [WH] [verb-preposition] [noun]1, giving [one classifier place] [WH] [to be in] [Antarctic continent]
Translation: But one classifier place which is in Antarctic continent.

4. Word-for-word: There / to be / one / classifier / dead sea
Formation of synits: [there to be] [one classifier dead sea]
Translation: There is one classifier dead sea.
5. Word-for-word: to be at / weather / cold / to / zero / below / sixty / degree(s) / 的 / time
Frame construction: to be at ... time -> when
Formation of synits: [when] [weather] [cold] [to zero] [below sixty degree(s)] [的]
Translation: When weather is cold to zero below sixty degrees.

6. Word-for-word: This / classifier / sea / also / do not, not / can / to freeze
Deletion of word: Delete the classifier after "this".
Formation of synits: [this sea] [also do not, not can to freeze]
Translation: This sea also not can freeze.

7. Word-for-word: to go to, to / to explore / 的 / science / expert(s) / to experiment / result
Formation of synits: [to go to, to to explore] [的] [science expert(s)] [to experiment] [result]
Sequence of synits: [verb-preposition] [的] [noun] -> [noun] [WH] [verb-preposition], giving [science expert(s)] [WH] [to go to, to to explore]
Translation: Science expert(s) who went to explore to experiment result

8. Word-for-word: To find / it / to contain / 的 / salt / to be / 比 / ordinary sea water / more / eleven times
Formation of synits: [to find] [it] [to contain] [的] [salt] [to be] [比] [ordinary sea water] [more] [eleven times]
Sequences of synits:
   1. [verb] [的] -> verb in past tense: [to contain] [的] -> contained
   2. Statement of comparison: [eleven times] [more than] [ordinary sea water]
Translation: Found it contained salt to be eleven times more than ordinary sea water.

9. Word-for-word: Therefore / it / do not, not / easily / to freeze
Formation of synits: [Therefore] [it] [do not, not easily to freeze]
Translation: Therefore it does not easily freeze.

REFERENCES

Current Research and Development in Scientific Documentation, No. 11, Chapter 3, "Mechanical Translation", pp. 190-228, National Science Foundation, November, 1962.

Richard Fairbanks Arnold, "A General Compiler Capable of Learning", A Thesis, Master of Science, Michigan State University, 1958.

Shau Wing Chan, "Chinese Reader for Beginners", Stanford University Press, Stanford University, California, October, 1943.

Yuen Ren Chao, "Character Text for Mandarin Primer", Harvard University Press, Cambridge, Massachusetts, 1954.

Tu Hsui-Chih, China Weekly, pp. 10-12, Marksman, July, 1962.

Yehoshua Bar-Hillel, "Machine Translation of Languages", edited by William N. Locke and A. Donald Booth, pp. 183-193, The Technology Press of the Massachusetts Institute of Technology and John Wiley and Sons, Inc., New York, October, 1957.

Paul L. Garvin, "Proceedings of the National Symposium on Machine Translation", H. P. Edmundson, Editor, pp. 286-292, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1961.

Anthony G. Oettinger, "Automatic Language Translation", Harvard University Press, Cambridge, Massachusetts, 1960.