U «TREE AUTOMATA: MACHENES FEAT CAN CLASS!” PAWERNS Thesis 11mm Eegree cf P31. E MiGHiGAN STATE ENWERSEY KENNETH LEROY WELLEAMS i973 'va“-(gm. ‘ .‘ M9 a. 5‘ LIBRARY ’ Michigan State 2 University 3: Cir-fin - -.¢ _ . 3 1- " HMS & SBNS’ BOOK BINDERY IND. LIBRARY BINDERS ‘ mmorqny. mama/w ABSTRACT U-TREE AUTOMATA: MACHINES THAT CAN CLASSIFY PATTERNS By Kenneth Leroy Williams A syntactic method for pattern recognition using an extension of standard tree automata theory to a theory for unordered trees is developed. A formalization is given for regular unordered tree grammars and automata. It is shown that regular unordered tree automata can serve as pattern classification devices. Many concepts inherent to syntactic pattern recognition methods are defined. The important differences between circular and noncircular primitive systems are noted. An exploration is made concerning what types of pattern classes used with what types of primitive systems are amenable to this method of classification. It is shown that deterministic pushdown automata can be used to simulate regular tree automata which in turn can be used to simulate regular unordered tree automata. The interesting class of languages generated as the frontier by regular unordered tree grammars is investigated. A characterization for various types of tree generating grammars based on the substitu- tions made in root-to-frontier paths is proposed. U-TREE AUTOMATA: MACHINES THAT CAN CLASSIFY PATTERNS BY Kenneth Leroy Williams A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Computer Science 1973 ‘9 .) ACKNOWLEDGEMENTS I want to thank each of the members of my doctoral committee for his unique contributions. Thanks go to Dr. Herbert Bohnert for crossing department and college lines and serving on the committee as a member of the Philosophy Department (but a member with a great deal of computing background). Thanks go to Dr. Hans Lee for serving as my academic advisor and for continually trying to get me to see the "big picture" and integrate my somewhat narrow field of special- ization into a broader context. Thanks go to Dr. Harry Hedges and the Department of Computer Science, as well as the Division of Engin- eering Research, for providing me with financial assistance during my doctoral studies. Thanks also are due to Dr. Hedges for his fine reading of earlier thesis drafts and pointing out many previously overlooked errors. Of course the committee member and faculty member I am most indebted to, is my thesis advisor, Dr. Carl Page. Most of the computer theory that I know was learned from him; either in the many classes I enjoyed where he was the instructor or as a result of his encourage- ment to carry my research forward. Special thanks are due to my wife Bonnie. Without her love and support I would never have reentered the academic world to pursue my doctorate. ii GLOSSARY Chapter 1. INTRODUCTION 1.1 1.2 TREE 2.1 2.2 2.3 TABLE OF CONTENTS MOTIVATIONS . . . . . . APPLICATION AREAS OF MULTIDIMENSIONAL AUTOMATA 1.2.1 1.2.2 1.2.3 1.2.4 A Logic Application . . . . Insights to Language Theory Natural Language Applications . . Recognizing Patterns with Multidimensional Automata . . . . . AUTOMATA FORMALIZATION . . BASIC DEFINITIONS . . . . . STRING REPRESENTATIONS . . . TREE ACCEPTORS . . . . . . . 2.4 TREE GRAMMARS . . . . UNORDERED TREES . . . . . . 3.1 MOTIVATIONS . . . . . 3.2 GRAPH THEORY BASES . . . . . . . 3.3 STRING REPRESENTATIONS AND THE IMPLIED RELATIONSHIPS 3.4 U-TREE GRAMMARS AND AUTOMATA . . . . . PATTERN RECOGNITION CONCEPTS . . . . . . 4.1 MOTIVATIONS . . . . . . . . 4.2 UNDEFINED TERMINOLOGY . . . . . . iii Page 10 11 l4 19 19 22 27 30 39 39 40 Table of Contents 4.3 BUILDING U-TREES FROM PATTERNS . . . . 41 4.4 THE PATTERN RECOGNITION PROCEDURE . . . . 50 4.5 SOME EXAMPLES . . . . . . . . 58 5. PROPERTIES OF U-TREE GRAMMARS AND LANGUAGES . . . 68 5.1 SIMPLIFICATION OF ACCEPTING MACHINES . . . 68 5.2 LANGUAGE PROPERTIES OF THE FRONTIERS OF U-TREE LANGUAGES . . . . . . . . 86 6. CHARACTERIZATIONS FOR TREE GRAMMARS . . . . 92 6.1 BASIS FOR CHARACTERIZATION . . . . . 92 6.2 CONTEXT-FREE TREE GRAMMARS . . . . . 95 6.3 CONTEXT-SENSITIVE TREE GRAMMARS . . . . 100 6.4 RECURSIVELY ENUMERABLE TREE GRAMMARS . . . 106 7. SUMMARY AND RECOMMENDATIONS . . . . . . 110 7.1 SUMMARY . . . . . . . . . 110 7.2 SUGGESTIONS FOR FUTURE WORK . . . . . 111 BIBLIOGRAPHY . . . . . . . . . . 114 iv a(a+B) b/a Ci’ci ,ci',c d(a) (13.5) y-l 1 GLOSSARY Meaning See Page Subtree of a at a 10 All strings with symbols from set A All non-empty strings with symbols from set A All strings of length n with symbols from set A Cartesian product of sets A1 and A2 A is a subset (or subclass) of B A is a proper subset (or subclass) of B A is not a subset (or subclass) of B a is an element of set A a is not an element of set A b is derived from a in one step (or a implies b) 15,31 b is derived from a 15,31 The complement of set A (Defined on page 9) 9 o-1(n) 9 Ranked alphabet A 9 A node of a d-tree 23 (Defined on page 14) 14 (Defined on page 10) 10 A11 mappings for ljij5 28 Context-free sets (in Chapter 5 only) 86 Depth of node(a) 9 An equivalence relation Inverse of mapping f V Glossary Term ? RUTG RTG Meaning A11 permutations on string f String languages of regular u-tree grammars (in Chapter 5 only) goes to or contains Left hand side String of length 0 Language of G Non-negative integers Positive integers The string composed of n (0)5 Projected derivation trees Response to a Right hand side Regular u-tree grammar Regular tree grammar Such that Number of symbols in x Replaced by Regular sets (in Chapter 5 only) Pushdown stack start symbol All subsets of set Q Tree automaton A11 trees over alphabetZ All pseudoterms over 2 Tree (or u-tree) production vi See Page 31 86 12 58 12,32 30 15 14 86 74 12 11 15,30 Glossary ’_1‘__£_:_r_m_ Meaning See Page tx Means ti for x=xi 12 U Universal tree domain 8 UTA U-tree automaton 32 Digraph with U = set of nodes, V = set of edges 23 WOLOG Without loss of generality vii CHAPTER 1 INTRODUCTION 1.1 MDTIVATIONS Most pattern recognition schemes fall into one of two classes. The more common type involves feature extraction followed by statistical classification. Within the last several years, however, more and more attention has been devoted towards applying techniques of formal lang- uage theory to recognizing patterns. Such approaches are usually called syntactic or linguistic pattern recognition. A good introduction to the syntactic approach is given in Fu and Swain(18). Syntactic methods are appealing since they provide a vehicle for formalizing both the rep- resentation of patterns and the corresponding classification methods. A major disadvantage of the syntactic approach is that language theory has customarily dealt with one-dimensional strings of symbols and sets of such strings as its primary objects of study. Meanwhile most pattern recognition has been concerned with patterns as represented by two- dimensional pictures or other high-dimensional representations. Strings that are amenable to syntactic techniques are usually inadequate repre- sentations for high-dimensional patterns. Interest in a wide variety of types of machines that may be called multidimensional automata also has developed within the last ten years. The multidimensional automata that have been best characterized and developed are those known as tree automata. Tree automata theory in- cludes the study of tree grammars, tree transformation systems, tree languages and tree acceptors or pseudoautomata. It seems natural to try to apply some of the results of tree automata theory towards pat- tern recognition. The first attempt in this direction was that given by Fu and Bhargava(20). Fu and Bhargava extablished a method for re- presenting the primitives of a pattern as nodes of a tree. This paper includes an extension of (20) in several ways. Real patterns can indeed be more adequately represented by trees than by strings but there is still something missing. Branches of a tree can never intersect, i.e. paths down two distinct branches can never reach the same point, but, as we trace over various parts of a pattern, the same point can often be reached by several different routes from a given start point. To overcome this handicap (and others) we will not restrict ourselves just to trees but rather to tree-like struct- ures called d-trees, where distinct branches can lead to the same node or nodes. In an actual pattern recognition scheme it is desirable to have the ability to construct the representative tree, or d-tree, using no outside information, once the set of primitives is known. The tree, or d-tree, can then be presented directly to a pseudoautomaton for acceptance or rejection, or possibly for classification among several different pattern classes. The method presented here has this ability. The study of tree grammars, tree acceptors and their extensions as presented in this thesis also leads to some interesting results in the theory of formal languages. Some of these results are developed here in Chapters 3, 5 and 6, although Chapters 3 and 5 are more con- cerned with the practical problems associated with pattern recognition. The work in this thesis also offers a contribution as it provides a step towards formalizing properties of graph transformations. The tree-like structures developed here are based on graphical properties rather than the functional definitions used in most other work (see Chapter 2). Although graphs in general are not treated here, d-trees (Definition 3.4) are a more general form of graphs than those previ- ously studied by tree automata theorists. This work, which is another application of the study of multidimensional automata, constitutes a partial bridge from tree automata theory to a theory of graph automata. 1.2 APPLICATION AREAS 9: MULTIDIMENSIONAL AUTOMATA We will use the term multidimensional automata to denote mathe- matical models of machines which can accept, reject or classify inputs from spaces with dimensionality order greater than one. (One dimensional encodements of higher dimensional structures are also possible inputs.) A brief survey of several applications of multidimensional automata theory follows. 1.2.1 ‘ALLOGIC APPLICATION Thatcher(40) presents a discussion and history of the question of decideability of the weak monadic second-order theory of multiple suc- cessors. The question was answered affirmatively for the one successor case using concepts from the area which has (later) become known as tree automata theory independently by Buchi(9) and Elgot(13). The problem was posed for the case of multiple successors by Buchi(9). Doner(12) was later able to generalize the tree theory methods of Buchi and Elgot and prove this result also. 1.2.2 INSIGHTS £2 LANGUAGE THEORY W.C. Rounds(36) shows how the newly developed theory of tree auto- mata can be used to provide clearer insights into, and better proofs of, some of the classical theory of formal languages. One example is a simple proof showing that the class of context-free languages is closed under intersection with regular sets. New areas of investigation are opened also. For example, the theorem given in Peters and Ritchie(32), that the language analyzable1 by a finite set of context-sensitive productions is context—free, is also shown in (36). (The proof origi- nally given by Peters and Ritchie was obscure if not incorrect.) Theorem 6.2 of this thesis provides another example of an applica- tion of tree automata theory for proving results in the area of formal language theory. 1.2.3 NATURAL LANGUAGE APPLICATIONS N. Chomsky in (11) and in other works has shown that phrase struct- ure grammars can not suffice to represent equivalence of meaning between such sentences as "Sincerity may frighten the boy.‘ and "The boy may be frightened by sincerity.". Rather, he has proposed a more general type of generative system called a "transformational grammar". A transform? ational grammar combines string generating phrase structure rules with rules that provide transformations between trees representing the deep structure of such systems. It has been observed by Aho and Ullman(2), by Martin and Vere(26) and by Rounds(35), that formalized tree transduction systems can provide 1"Analyzable by" essentially means "can be parsed using". For a more complete definition see (36). a system for carrying out these transformations. Tree transducers are also considered as possible machines to aid in translations from one language to another. No practical work along these lines has yet been attempted but these research papers lay some of the theoretical found- ations for it. 1.2.4 RECOGNIZING PATTERNS WITH MULTIDIMENSIONAL AUTOMATA Automata which can accept or reject two-dimensional patterns are described by Blum and Hewitt(4) and by Savitch(37). One might think of this form of automaton as being a "bug" which is given a maze with filled and unfilled cells to wander around in. The bug then accepts or rejects the maze depending on whether the maze meets some pre-estab- lished criterion. Some of the capabilities of the bug model given in (4) are that given a square maze it can decide if; (1) The maze contains precisely k filled cells. (2) The maze contains a single rectangle of filled cells. (3) The maze contains a single square of filled cells with edges parallel to the maze boundaries. Blum and Hewitt also show that bugs that can leave markers on cells, so that on future visits the cells can be recognized as previously visit- ed, are more powerful than those that cannot leave markers. The mazes considered in (37) are those with exactly two "doorways" from each cell leading to two other cells. Here a maze is accepted if there is a path of unfilled cells from a designated maze start cell to a designated maze goal cell. It is shown that constructing a bug to make such considerations is equivalent to constructing a Turing machine to accept or reject some coding of the maze. Similar automata are used to provide an unusual characterization for the context-sensitive languages in Fischer(l6). Generally, pattern recognizing automata go hand—in-hand with pattern generating grammars. In an important paper by Pfaltz and Rosenfeld(33) generative schemes called web grammars are introduced. No accepting automata are defined; acceptance of a pattern generated by a web grammar can only be accomplished by parsing with the grammar. Example 1.1 A simple web grammar from Pfaltz and Rosenfeld(33). 0- (INNTJA) VN-{A} : Nonterminal symbols VT-{a,b,c} : Terminal symbols S-{A°} : Start web } : Productions b P’ i v ~€> <:::: a “ —-€> ' A a A a c A derivation might be: @373“ o 2} <> 1? \t J A a c A a 'E/ a c A b b b 289% Any pattern, with only terminals labeling the joining of line seg- ments, which can be generated by a web grammar is said to be in the pattern language. Pfaltz and Rosenfeld defined web grammars in such general terms, and allowed so many types of productions through the use of "embedding rules" (which describe exactly how each production is to be applied) that their system effectively subsumes any tree or graph production system that one might define. Unfortunately, however, it does not shed much light on the nature of such systems. There is an essential difference in interpretation between the webs produced by the Pfaltz and Rosenfeld grammars and the multidi- mensional structures used by others (including this author). Web grammars produce structures which are themselves regarded as being patterns. On the other hand the strings, trees etc. used by others are regarded as partial representations for the pattern. Structures which represent patterns, rather than constitute patterns, provide an important step towards an abstraction of pattern recognition concepts. The scheme developed by Fu and Bhargava (to be discussed in some detail in Chapter 3) offers the first application of multidimensional automata used for pattern recognition involving both a generative sys- tem and a machine to do the acceptance/rejection/classification. In previous cases this has been accomplished only through parsing. The material presented in Chapters 3, 4 and 5 of this dissertation provides a significant extension to this work; an extension that may lead to a practical recognition scheme for a number of pattern recognizing applications. CHAPTER 2 TREE AUTOMATA FORMALIZATION It is difficult or impossible to discuss various aspects of tree automata theory such as subtrees, tree replacement, and tree grammars without an adequate formalism for the language of discussion. This chapter is an attempt to provide that framework and to introduce the principal results upon which the remainder of the dissertation will be based. 2.1 BASIC DEFINITIONS Many definitions and previous results in tree automata theory will be used for our pattern recognizing automata. We begin with definitions leading towards the idea of a tree acceptor or pseudoautomaton. Most of the definitions come from Thatcher(39) and Brainerd(7) although ap— ropriate notational changes have been made. A similar develOpment can be found in Brainerd(7). Definition 2.1 Let N+ be the set of positive integers. Let U, the universal tree domain, be the free semi-group with identity, 0, generated by N+ and a binary operation, -. So, for all acU we have a-OBO-a=a. U can be partially represented by the following figure: 1°l'l Definition 2.2 The depth of atU is denoted d(a) and is defined as fol- lows: d(0)=O, d(a°i)=d(a)+l for ib if azb and a#b. We say a and b are incomparable if afib and his. Definition 2.4 D is a tree domain if: (a) D is a finite subset of U, (b) beD and a a-ieD. Definition 2.5 A ranked alphabet is a pair Q50) where A is a finite set of symbols and o is a finite relation in AXN.1 (N=non—negative integers.) Let Ants-1(n) so for all n, A.n will be a subset of A. Definition 2.6 A tree (actually an ordered tree) over A (i.e. over 4A“) ) is a function a:D+A where D is a tree domain and for aeD, max{ 1 a-ieD}eo(a(a)). We denote the domain of a tree by D(a) or Da. We will denote the set of all trees over 2 by T2. Example 2.1 Let Da={0,l,2,lol,l-2,l~l-l,l-l-2}, E={ S,A,Z}, XO={Z}. £2={A.S}- The tree {(0.8).(1.A).(2.2).(1-1.A).(1-2.Z).(1-1-1,Z). lln most previous work 0 was assumed to be functional so each sym- bol had fixed rank. It was pointed out by Thatcher(43) that this is an unnecessary restriction and the theory holds for all finite o. (1.1-2,Z)} may be written as: The following definitions allow us to deal with parts (subtrees) of trees. Definition 2.7 Let a,b,b' be members of U (the universal tree domain) such that a-b'=b. Then blgfb' if afb. b£§_is undefined otherwise. We now have b/O-b, a/asO and (if b/a is defined) a’(b/a)=(a°b)/a=b. Definition 2.8 Let a be a tree and "a" be a member of D . gig? {(b,x)l(a°b,x0€a}. a/a is called a subtree of a at a. Note that aGT and aGD $a/a e T A A. 2.2 STRING REPRESENTATIONS Different authors use different methods to represent trees in string form. In this paper we will use what is commonly called pseudo- term notation2 so the tree of Example 2.1 is represented by S(A(A(ZZ)Z)Z). Although pseudoterm notation is somewhat more awkward to write than other forms (for example postfix form) it has the advantage of always providing a unique representation for a tree regardless of the ranking relation defined whereas other methods only define a unique tree when used with the ranking relation. The notation used does have an effect 2This is also called bracketed notation, list notation or functional notation. (See Brainerd(6).) 11 on the definition of tree automata, but an equivalent formalization can be given for each notation. Definition 2.9 The set 12 of pseudoterms, usually just called terms, * on X, is the smallest subset of (ZL){),(}) satisfying:3 (a) Sgt): (b) If n>0 and f9: and t1,t2,---,tnct then f(t1t2---tn)(12. X We now define a recursive algorithm for writing the pseudoterm corresponding to a tree: Algorithm 2.1 0. (We refer to the tree being operated upon as tree T.) 1. Write "a" where (O,a)€T. 2. If (l,a1)(T then done: exit. 3. Write "(". 4. Write the pseudoterms (by calling Algorithm 2.1), in order, for subtrees T/l, T/2,--',T/i where i=max{jl(j,aj)£T}. 5. Write ")". 6. Done: exit. After recursive applications of Algorithm 2.1 until an exit from the initial call occurs we will have the pseudoterm corresponding to the original tree. 2.3 TREE ACCEPTORS We are now ready to define tree accepting machines. The definition used will essentially be Brainerd's(7) formalization of Thatcher's(39) definition. 3We assume that parentheses are not symbols of 2. 12 Definition 2.10 Let be a ranked alphabet where A={X1,X2,o-oX.k}. A tree automaton, or pseudoautomaton, is a system M= <§,t1,~--,tk,€> where: (a) Q is a finite set of states. (b) For each i, ljiik, t is a relation in szQ for (Xi,z)(o. i (c) FSQ is a set of final or accepting states. If each ti is a function ti:Qz+Q for all (Xi,z)£o then M is deterministic, otherwise M is nondeterministic in which case we write ti(Xl,--°,Xn)~x04 5 iff ((X1,--°,Xn),XO)(ti. If (Xi,0)£o we write tfvx iff (A,X)£ti. Notation: If X=Xi£ A, then tx means ti. We now show how each automaton accepts or rejects a member of TA and thus defines a set of accepted trees. Definition 2.11 The response, p, of‘a pseudoautomaton M to an input is defined as follows: (a) If xtA (xyvx iff tgvx. 0: D (b) If xGAn, n>0, o(x(ol°°°an))~x iff 3X1,°-°,Xn€ Q, tx(X1°'°Xn)~X and 0(aiyvxi, liign. If M is deterministic p is a function with p:-%fQ characterized by the following: (a) If xGAo, p(x)=tx(A). (b) If chn. n>0. p(x(a1°"an))=tx(o(a1)°°°o(un))- Definition 2.12 The behavior of a tree automaton M: is {tIp(t)(F}. This is also called the language of M, L(M). We may read symbol "av" as "goes to" or "contains". 51 I the string of length 0. 13 Example 2.2 A={0,1,X,S,B}, AO={0,1,X}, A1={B}, A3={S}. Consider the deterministic pseudoautomaton M= . We define the t functions as: to(1)=0 t1(1)=1 tx(A)=X tB(X)-B ts(lBl)=s tS(OSO)=s We will find the response of M on tree: ./\ A X V' This tree has pseudoterm representation: S(“S(OS(1B(X)1)0)0)- Now p(S(OS(OS(lB(X)l)0)0))- ts(o(0)o(S(OS(lB(X)1)0))p(0))= ts(to(x)ts(p(0)0(S(lB(X)1))p(0))tO(A))= tS(0tS(to(A)tS(o(1)0(B(X))o(l))to(A))0)= ts(0ts(0ts(t1(x)t8(o(X))t1(A))0)0)= tS(OtS(OtS(1tB(tx(1))1)0)0)- ts(0ts(0ts(ltB(X)l)0)0)= tS(OtS(0tS(lBl)O)0)- tS(OtS(OSO)0)- tS(OSO)= S 14 Since SeF the tree is accepted, i.e. this tree is an element of L(M). Actually we can show that M as constructed here will accept a tree, T, iff T is a parse tree from the context-free phrase structure grammar G- . Note that L(G)= {0“1x1onlngp}. Although the procedure of recognizing a tree with a tree automaton seems to be a tedious task we will see later that it can be greatly simplified. We will be using tree automata as pattern recognition devices where a tree will represent a pattern to be accepted, rejected or perhaps classified by what final state the automaton ends in. Definition 2.13 Given tree a, the frontier of a (abbreviated fr(a)) is the string defined recursively as: (a) If u-(0,x) (i.e. a is a single node tree) fr(a)=x. (b) Otherwise, fr(a)=fr(a/l)fr(a/2)-o-fr(a/n) for n-max{i|i€N+, iéDa}. For example, the frontier of the tree in Example 2.1 is ZZZZ. The frontier of the tree in Example 2.2 is OOleOO. Theorem 2.1 (Thatcher(39)) A set of trees, T, is the behavior of a tree automaton iff T is a projection6 of the set of derivation trees generated by some context-free grammar. Corollary 2.1 The frontier of the behavior of a tree automaton is a context-free language. 2.4 TREE GRAMMARS Definition 2.14 Let acDa, a,B£TA. aga+B)-{(b,x)(aIbZa}L/{(a°b.x)I(b,x)€B}. 6A projection is a function from the (finite) alphabet of the con- text-free grammar onto a finite alphabet. It is known that the class of context-free languages is closed under projections(see (22))- 15 This is the result of replacing the subtree o/a at "a" by the tree 8. Definition 2.15 (Brainerd(7)) A regular tree grammar7 over A (i.e. over ranked alphabet <§.o ) is a system G= <§,o',P,fi> satisfying: (a) <8,o'> is a finite ranked alphabet such that AEB and o'|A=o (i.e. relation 0' over elements of A is exactly equivalent to o). The elements of A and B-A are called terminal and nggf terminal symbols respectively.8 (b) P is a finite set of production rules of the form ¢+W where ¢,weTB. (c) FSQTB is a finite set of axioms (the start trees). The following definition indicated how a regular tree grammar generates trees. Definition 2.16 (Brainerd(7)) gngE is in G iff J a rule 0+? in P such that Ola-9 and B-a(a+‘i‘). gig is in G iff lama with a 3 8. fl is in G iff 300,"',0m, mio such that G=00=§ 01:? "'=§am-B is in G. The se- quence 00,"',0m is called a derivation or deduction of B from a, and m is the length of the derivation. We will think of a regular tree grammar over <2,c> as generating trees in T2. Definition 2.17 If G is a regular tree grammar over A, then L(G)= {ceTAlayfl‘ such that y 30. is in G} is the set of trees generated by G. Regular tree grammars G and G' are equivalent if L(G)-L(G'). 7This was originally called a regular system. 8The terms terminal and nonterminal are used in a somewhat differ- ent sense than one may be accustomed to. Trees in the language of a tree grammar will have all their nodes labeled with terminal sym- bols but it is permissible to have tree productions whose left hand sides use terminals and whose right hand sides use nonterminals. 16 Example 2.3 (Brainerd(7)) Let S8 . A0={p.q}, A1=fi“}, A2={V}. ~~ 1‘41 }. 3P \/ a; P={ p9 .——>~ q 9 I .9N D P . P P \I N 4v q -—> V } N Po ‘1 p. This grammar has no nonterminal symbols. One derivation is: “‘ \l .. v k _r: -4>\: n,g q P I 'w—%:>na q “'77” p N P ”A A v! \l p J////\\\\\'q ///\\\\\ q d! / q ,1 17 Theorem 2.2 (Single Axiom Theorem, Brainerd(7)) For every regular tree grammar G-‘ one can effectively construct an equivalent regular tree grammar G'-‘ such that F' consists of a single nonterminal symbol, i.e. a tree of the form {(0,2)}, 2630- Definition 2.18 A tree grammar Gz over 2 one can effectively construct an equivalent expansive grammar with a single axiom. Theorem 2.4 (Thatcher(40)) A set of trees, T, is accepted by a non- deterministic tree automaton iff T is accepted by a deterministic automaton. Furthermore, given a nondeterministic tree automaton one can effectively construct an equivalent deterministic tree automaton. Theorem 2.4 is shown through a subset construction similar to that used to construct deterministic finite state machines. The following result is basic to our later pattern recognition as it allows us to build an acceptor for trees (and later, tree-like structures) that represent patterns from a given set. Theorem 2.5 (Brainerd(7)) For every regular tree grammar, G, one can effectively construct a deterministic tree automaton, M, such that L(M)-L(G). 9Intuitively an expansive grammar is one where at each application of a production the generated tree grows or expands, it can never contract. 18 For regular tree grammar G= (B,r',P,S> over <£={x1,x2,---,xk},r> the construction procedure may be summarized as follows: Step(l) Obtain an equivalent expansive regular tree grammar with single node trees as axioms. Step(2) The equivalent nondeterministic tree automaton is M= <(B'-£)US",t1,t2,~--,tk,S" where tx(X1---Xn)~XO iff X0+x(X1---Xn) is a rule in P', and S" is the set containing precisely the labels of start trees in S'. Step(3) Now construct a deterministic tree automaton equivalent to M. Brainerd has also shown the converse of Theorem 2.5: Theorem 2.6 For every tree automaton, M, one can effectively construct a regular tree grammar, G, such that L(M)-L(G). Taking Theorems 2.5 and 2.6 together we have: Theorem 2.7 The sets of trees generated by regular tree grammars are exactly those accepted by finite tree automata. CHAPTER 3 UNORDERED TREES 3.1 MDTIVATIONS A general theory of graph grammars, graph transducers and graph automata might be viewed as one goal of research of this type. The results presented in this chapter provide a partial bridge between tree automata theory and this goal. Grammars and acceptors for certain classes of digraphs are shown with a number of interesting results. The digraphs are, however, still restricted in important ways. On a more immediate level, the work done by Fu and Bhargava(ZO) who first used tree automata theory in pattern recognition was the ori- ginal stimulus. The following example of theirs will serve both to il- lustrate the methodology and to point the way towards necessary exten- sions. In this paper, we will view patterns as connected figures which are built up of more basic figures called primitives (see Chapter 4.) The object of this research is not to aid in finding the primitives; rather we assume we have the primitives (or at least have the portions of patterns which may be regarded as primitives) and want to work from them while attempting to identify the patterns. Example 3.1 (Fu and Bhargava(20)) We want to recognize "squares on top of and to the right of squares". A better pattern class definition might be to say that we want to recognize all patterns which can be 19 20 generated by the following procedure: (a) Draw a square with vertical and horizontal sides. (b) Halt or draw a square immediately to the right of or above an existing square. (c) Return to step(b). Let regular tree grammar G8 'where B={S,a,b,A,B} over Z= {S,a,b}. Bo-{a,b}, Bl-{a,b}, Bz-{a,b,S}. a = + , b - + , length(a)= length(b). a LetP-{ s. éA/\’B,Ao éA//\B ' b \\\\\ a] b I a A / 9 AC " > ,30 a } B ./ A ‘B b . a A Given figure: the following tree (which \ /’ can be derived using G) represents it: 21 This tree's pseudoterm representation is S(a(a(b)b(a(b)b(a)))b(a(b)b(a))). The tree automaton which accepts trees generated by G is M8 <{A,B}, ta,t {8}) where the t functions are defined by: b’tS’ tS(AB)=S ta(AB)=A tb(AB)=B ta(b)=A tb(a)-B tb(k)=b ta(X)-a Notice that the nonterminals of the regular tree grammar have become the states of the accepting tree automaton. It will be worthwile for the interested reader to follow the steps through as M accepts some short derivation from G, i.e. find the response, 9, to string, t, which represents (in pseudoterm form) a tree generated by G. Given a pattern and its primitives we would like to be able to generate (and later accept) trees (or tree-like structures) in a manner similar to that of Fu and Bhargava but with extensions meeting the following criteria: (1) When constructing a structure to represent a pattern we do not want to have to use any prior information concerning which of the descendants of a node must come first - note that in the example if S S we had generated \\\\ rather than we - a b, A a b would not have been able to accept the tree with our automaton. We do not want to continually have to check on this kind of order information. 22 (2) We want to be able to construct our structure without the necessity of referring back to the productions in the grammar to find out which new primitives must be adjoined to which current ones. In the example when constructing the representative tree, how did we know that we could not start at S and construct the following tree (which. does represent the figure but will not be accepted by M)? b,//' a The only way we know not to make the tree in this way is by referring back to the production rules and observing that this is not an accept- able tree. In fact, if we can exhaust our supply of primitives for a pattern by constructing a tree where we have consulted a production rule at each stage to see if we are making a prOper expansion we do not need to present the tree to an acceptor at all - it must be accept- able because of the way it was constructed. 3.2 GRAPH THEORY BASES The considerations of the previous section lead us to the following development with graph theoretical definitions. Graph definitions are from Harary(21). .nginition 3.1 A digraph, D, consists of a finite, nonempty set of Points or nodes, U, and a set of ordered pairs of distinct points in U, 23 V. We may write D= <1,» . Any pair (ul,u2) in V is called an are or directed edge. Arc (ul,u2) may be shown as an arrow from point ul to point uz. Definition 3-2 A walk in a digraph is an alternating sequence of points and arcs u -°,x ,u in which each arc x is (u. ,u.). The n n 1 1-1 1 o’xl’“1’x2" 123355 of a walk is the number of occurences of arcs in it. A path is a walk in which all points are distinct. A closed walk is a walk with the same first and last points and a cycle is a nontrivial closed walk with all points distinct (except the first and last). A digraph is acyclic if it contains no cycles. Definition 3.31 A digraph, D, is rooted if it contains a special point, called the root, r, such that D contains a walk from r to each other point in D. A digraph is labeled when each node is assigned a unique name as a label. Definition 3.4 A d-tree is a rooted, acyclic, labeled digraph. The labels used for d-trees will be ordered pairs, a:x. We call a the first label coordinate, "x" the second. In practice it will be the "x" which serves to uniquely identify different nodes. When writing d-tree D we will assume that node a:xa physically shown higher than node b:xb and line segment a:xa-bzxb shown means that there is an arc (a:xa,b:xb) in D. We are taking advantage of the 1Chapter 2 and the first part of this chapter have consisted primarily of results and definitions due to other researchers. From this point on, the definitions and results are original with this author unless otherwise specified. The work, however, does build on the work of others, most particularly Thatcher (39,42) and Brainerd(7). 24 fact that a d-tree has an implied partial ordering on its nodes with a least element, the root. Example 3.2 D-tree T = can be written as T - We define unordered tree differently than other authors (see Knuth (23) and Harary(21)). Definition 3.5 A u-tree or unordered tree is a rooted digraph with each node assigned a (not necessarily unique) label and with exactly one path from the root to each node. We denote the set of all u-trees with labels from alphabet, B, as uTB. The same convention will be used to represent arcs of u-trees without arrows as that used for d-trees. The concept of order is the essential difference between trees and u-trees. Given trees A.= (/:>P\\\\ and B - d////\\\\b we a - -b b a 25 would in no way want to call them the same but if we regard A and B as representations of u-trees they are just different representations for the same u-tree. Definition 3.6 If a d-tree or u-tree contains arc (a,b), node "a" will be called a predecessor of "b" and node "b" will be a successor of a . The set frontier of a d-tree or u-tree is the set of all nodes with no successors. Notation: Where there is no chance for confusion we may denote a node with label x as node x. Definition 3.7 A root-to-frontier path in a d-tree or u-tree is a path whose first point is the root and whose last point is an element of the set frontier. The dgp£h_of a u-tree, t, is the max {nIn-length of a root-to-frontier path in r}. Definition 3.8 We define the natural correspondence of u-trees to d- trees through the following algorithm, Algorithm 3.1. If u-tree U is the result of applying Algorithm 3.1 to detree D then we say U is the naturally corresponding_u-tree to D. Informally we construct U by "unfolding" the non-unique root-to-frontier paths in D to give unique paths in U. In this algorithm we use two pushdown stacks, P and S. P and S will point to distinct nodes of d-tree D and u-tree U respectively. Our notation will be: P-x means that the top of stack P points to node 8 and P+x means to push down P by placing x on it. We also have a me- thod of marking nodes of D and later erasing our marks. The algorithm attempts to simulate the steps a person might follow as he proceeds through D and U; P and S give him the ability to "back up" a root-to- frontier path and the marking allows him to see what has already been 26 expanded. Algorithm 3.1 (Algorithm to define and construct the naturally corresponding u-tree U, to d-tree D.) Initial conditions: All nodes of D are unmarked. P=r for r the root of D. 8-0. 0. Denote current node as node P of D with label a:x. 1. Construct new node P' of U with label "a". If S=0 make node P' the root of U. If S¥0 make node P' a successor of node S. 2. Mark node P. S+P'. 3. If node P has no unmarked successors then erase the mark from each marked successor (if any), pop 8, pop P and go to step 4. Else go to step 5. 4. If 8-0 then done: exit. Else go to step 3. 5. Pick an unmarked successor of P, say node M. P+M. Go to step 0. Example 3.3 A representation of the naturally corresponding u-tree to d-tree T of Example 3.2 is: T' I Definition 3.9 U-tree A1- is a sub-u—tree of u-tree A2- if: (a) U1§;U2 such that for all p1(U1 if (p1,p2)eV2a;p2(U1, furthermore for pGU label(p)=x in A iff labe1(p)-x in A 1’ 1 2° (b) V consists of exactly those pairs in V2 whose components are 1 both in U1. 27 Definition 3.10 U-tree A1 is identical to u-tree A2 if each is a sub-u- tree of the other, and we may say A l-AZ. 3.3 STRING REPRESENTATIONS AND THE IMPLIED RELATIONSHIPS We want to be able to represent d-trees and u-trees in linear form, as pseudoterms (see Definition 2.9). The correspondence between u-trees and their pseudoterms is given by: (a) For the single node t, the corresponding term is t. (b) For arcs (to,t1),(to,t2),°--,(t0,tn) being the only arcs with f first coordinate to, with f -°°,fn respectively being the 2’ ---,tn, the set of corresponding 1’ terms corresponding to t1,t2, terms is {to(Y)|Y is a permutation of {f1,f2,---,fn}}. The correspondence between d-trees and their pseudoterms is given (a) For the single node t:x the corresponding term is t. (b) For arcs (t0:xo,t1:x1),(to:xo,t2:x2),--f,(tozxo,tn:xn) being 0:x0 with f1,f2,--',fn respectively being the terms corresponding to t the only arcs with first coordinate t 1 232" " tn:xn the set of corresponding terms is {to(Y)|Y is a perm- :x1,t utation of {f1,---,fn}}. A corresponding pseudoterm for the d-tree in Example 3.2 and the u-tree in Example 3.3 is S(a(b)b(c(b))a(c(b)a)). Note that if B is the corresponding u-tree to d-tree A, both E and A will have the same pseudoterm representations. Again we stress the fact that neither the representation in Example 3.2 nor that in Example 3.3 is unique: there are several more ways of writing each of these structures. A tree can serve as a representation for a u—tree; we can regard T' of Example 3.3 28 as being a tree which is one representation for the u-tree. In order to firmly establish the relationships that exist between d-trees, u-trees, trees and pseudoterms we will define the following mappings. (Here we distinguish between "mapping" and "function" by allowing a mapping from set A to set B to be any relation on AXB such that each element of A will be related to one or more elements of B whereas a function will be a relation on AXE such that each element of A will be related to exactly one element of B.) Define c : d-trees+u-trees by the natural correspondence. 1 Define c2: u-trees+trees by t£c2(u) iff t is a tree which can serve as one representation of u-tree u. Define c3: trees+pseudoterms by Algorithm 2.1. Define c4: d-trees+pseudoterms by the above correspondence. Define c5: u-trees+pseudoterms by the above correspondence. The following digraph illustrates these mappings: /" A '\ 7 \- ' T c1 c2 pseudo- d-trees u-trees trees terms C4 All d-trees, u-trees and trees are over some ranked alphabet Z. Pseudoterms are over £1Jf),(}. 29 We may make the following observations concerning these mappings: c1 is a function since each d-tree has a unique naturally corres- ponding u-tree. c 1 c 'f 1 F 7\ a a 8 no . = . 2 a unct on or example c2(b c) { b./\c’ c\/\b } c is a function. (See Algorithm 2.1.) is not a function. Note that c4(d)-c3(c2(c1(d))) which (because of the inclusion of c2) is not a function. c5 is not a function. Note that c5(u)-c3(c2(u)) which is not a function. -1 2 c1 is not a function since c1(a:x)-c1(a:y)=a. cgl is a function since given a tree it can serve as a representa- tion of exactly one u-tree. c;1 is a function. (See Section 2.2.) -l c 4 is not a function. We note that c;1(p)-c11(c;1(c;1(p))) l -1 -l -1 is a function. We note that c5 (p)-c2 (c3 (p)). which (because of the inclusion of c 1) is not a function. c-l 5 We now define some equivalence relations based on the above funct- ions. Let (D,E) be the equivalence relation on d-trees with dlid2 iff c1(d1)-c1(d2). Let (T,E') be the equivalence relation defined on trees by tli't2 iff c;1(t1)-c;1(t2). Let (P,E") be the equivalence relation :n -1 :1 7'1 ' .00 defined on pseudoterms by p1- p2 iff c3 (p1)- c3 (p2). Define c , ,c5 ! in a similar manner to c ':-,c except they are now defined on equi- l’ 5 valence classes where appropriate, giving the following digraph: 2Notation: Regard (a:x) as the single node d-tree with label a:x. Similarly for (a:y) and a. 30 u-trees It is clear that cl',---,c5' are functions (as are c1'-1,°°°,c5'-1) ' must be one-one. Each equivalence class in (D’s) I l ’ 5 will be infinite but each equivalence class in (T,s') and in (P,5") so each of c "',c will be finite. These mappings and equivalence classes will be used as we develop the theory of unordered tree automata. Two things are perhaps most important to observe: (1) Given either a u-tree, a tree or a pseudoterm we can immedi- ately and uniquely determine the appropriate corresponding pair among u-trees, classes in (T,E') and classes in (P,E" (2) Given a d-tree, it immediately defines a class of pseudoterms which correspond to a unique u-tree. 3.4 U—TREE GRAMMARS AND AUTOMATA Definition 3.11 A regular u-tree grammar, abbreviated RUTG, over ranked alphabet satisfying: (a) (3,6) is a finite ranked alphabet such that A93 and o' |A=o. (b) P is a finite set of production rules of the form 6+? where ¢,Y£uTB. 31 (c) PguTB is a finite set of axioms. (d) For tGP or for t which is either the LHS or RHS3 of a prod- uction in P, if t has exactly n arcs with first coordinate c, then ceAn. We will think of a RUTG over «5% as generating u-trees with labels in A. The following indicates how a RUTG generates u-trees: Let a,8,¢,w be u-trees over B such that ¢+w and a- », -Uv -Uv -U‘v.w ggif b--t B,¢<¢,¢>,w esaya 3asuuree of a, say 8- ., such that {paths in S}={paths in ¢} and if UB= (Ha-US)L}U¢ and VB-(Vu-(VSLJ{(u1,urs)})Lij[){(u1,urw)} where (ui’urS) is the arc that connected a node of Ua’ node ui, to the root, urS’ of S i e a 3g if He sequence of u-trees ao,a1,---,an such that canoe---=>an=a. and (ui,urw) is a new arc connecting u to the root, “IW’ of w. We say Given regular u-tree gramar G- (B,o' ,P,I‘> over , the language * of G, L(G), is the set, T, of u-trees over A such that tcT iff y :3 t for some yer. Definition 3.12 RUTG S is equivalent to RUTG S' if L(S)-L(S'). Definition 3.13 Given set A, we define a set of equivalence classes, (AP,E), on the set of strings of length n over A by letting xsy iff x is a permutation of the symbols in y.for x,y(Ap. We denote an individual class in (AP,E) by overscoring any individual permutation in the class. Example: '35 means the class of permutations on string "ab". This is {ab,ba}. 3LHS - left hand side, RHS - right hand side. 32 Definition 3.14 Let where: (a) Q is a finite set of states. (b) For each i, liijk, t is a relation in (Qz,s)xQ for (Xi,z)eo. i (c) FQQ is a set of final (or accepting states. Definition 3.15 If each ti is a function t1:Qz+Q for all (X1,z)60 then M is deterministic, otherwise M is nondeterministic in which case we write ti(Xl---Xn)vxo iff ((X1---Xn),xo)cti. If (X1,O)£o we write tfvx iff (A,X)£ti. Notation: If x-X.€A, we will often write t 1 meaning t X 1' Definition 3.16 We now show how each automaton accepts or rejects a member of T and thus defines a set of accepted trees: The response, A p, of UTA M to an input is defined as follows: (a) If xEAO (b) If the input is as which has pseudoterm representation , D(x)~x iff tivx. x(olo2--°un), chn for n>0, where {oillfijn} contains the pseudoterm representations for sub-u-trees of a 3 (x,a1)£V for ljign then p(a)~x iff 3 X1,-°°,anQ such that tx(X1-°-Xn)‘~ X and p(oi)~xi, ljign. If M is deterministic, p is a function such that p:tA+Q character— ized by the following: (a) If xGAO, p(x)'tx(l). (b) If the input is as in (b) for nondeterministic u-tree automata then 0(a)-tx(o(al)-°-p(an)). Definition 3.17 Given UTA M= the behavior of M is {t|p(t)£F}. This is also called the language of M, L(M). 33 We now proceed to establish that RUTGs have the same prOperties in relation to generated u-trees that regular tree grammars (henceforth abbreviated RTGs) do for trees. Furthermore RUTGs bear the same rela— tionship to UTAs that RTGs do to tree automata (henceforth abbreviated TAs). The development presented herein for unordered trees is similar to Brainerd's development for ordered trees. Definition 3.18 RUTG G- (B,o,P,I‘> is reduced if 1‘ consists of exactly one single-node start tree. Definition 3.19 RUTG G-T over A is expansive if each rule in P is of the form Xo+x(R;777R;) where xeA.n and Ro,Xl,---,Xd£B-A or of the form Xo+x where xGAO. Lima; 31:}. For each RUTG G-~ over 2 one can effectively con- struct an equivalent reduced expansive grammar. Proof WOLOG assume P-{¢1+¢1Ilfii§r} and S-{ajllfijjk}.4 Referring to function c2' of Section 3.3 we see that for all u£{u-trees over 2} c2'(u)#¢. Therefore for all ¢1.W1,0 we can pick t:,t; and ti respect- 1 ively such that t:6c2'(¢1), t$£c2'(w1) and tj-c21(a a ). we now construct J RTG c'- (s',o',1>',s'>9 B'-B,_o'-o, P‘dti—rtiligiir} and S'-{tgll§j_<_k}. Now for all t€L(G') we know that t£c2'(u) for some u£L(G) and further- more for all u€L(G)]t(L(G') such that t€c2'(u). (G' is a "microcosm" of G; it functions similarly but at each application of a production it produces one representation of the u-tree that G produces.) we know by Theorem 2.3 that we can construct RTG G"- such that B =8", 0 f Xo+x(x1° - ~xn)€ P"}. =0 , s -s and Pf={XO+xIX0+xéP }U{xo+x(x1---xn)| f f f MG). (2.13.1). So RUTG G is clearly reduced and expansive and furthermore L(Gf)= LEE _3___2_ For each reduced expansive RUTG S- <3,0,P,Z> over A, one can effectively construct a nondeterministic UTA M, such that L(M)=L(S). ‘Ppggf_ Let M? (O-(B-A)L){Z},tl,-°-,tk,{ZP> where the move functions are tx(W)~xO iff X0+x(—X—F:}-(;) is a rule of P (in pseudoterm form). We first prove that X :3 a in S iff p(o)~X in M, by induction on the depth of a. (a) Depth(a)-0#a-XGAO, thus x-m in 5 iff X=Dx in 5, since 5 is expansive, iff X+x is a rule in P iff tivx, by definition of M, iff p(o)§p(x)~X, by definition of p. (b) Assume a- has pseudoterm representation x(a1a2°°-an), xéAn, where {ailljifn} is the set of all sub-u—trees of a)(x,x1)(V and xicai for lfign. Our induction hypothesis is that if depth(a')for Q-X1,-°-,Xj we construct deterministic M'-<2Q,t1',---,tk',F'> . We have 2Q: Xl',---XZJ' where each Xi', 13i§2j, denotes a separate subset of Q. We define ti' by ti'(XITTT:X;T)-Lj{xr|ti(E;:TTX;)VXr for xlex ',-o-, Xh(Xn'}. We let F'-{X1'IX1'f\F*¢}- Clearly M' is deterministic since each ti' is a function. To complete the proof (i.e. to show that L(M)=L(M')) we first show, by induction on the depth of a, that pM(a)~g5 iff ng(a)=G with geG. (a) Assume depth(a)-O=}o-x£AO. gec, by our definition of tx', iff pM,(a)-G with 366. Now pM(o)vg iff ting iff tx'-G with (b) Assume depth(a)>0 and a- with pseudoterm representation x(alo2°-°an), fon, where {a liign} is the set of all sub-u-trees of 1| a3(x,xi)€V and xien1 for lfiin. Our induction hypothesis is that if depth(a') be a u-tree automaton over . One can effectively construct a regular u-tree grammar 8- over such that L(M)-L(S). £593; Define tree automaton M'- by: Q'=Q, F'BF and we include one rule of form t1'(X1°°'Xn)~X in M' iff t1(XITTTR;)~X in M (so we include one specific permutation from each permutation class). It is clear that for each u-tree a£L(M) one specific permutation of a will be accepted by M' and for each tree o' accepted by M' the u-tree c;1(a') will be accepted by M. By Theorem 2.6 we know we can construct regular tree grammar G such that L(G)-L(M'). By Theorem 2.3 we know we can then construct a reduced, expansive regular tree grammar G' such that L(G')-L(G)-L(M'). Assume G'- over . G' will produce exactly one representation of each u-tree accepted by M. We can now define reduced, expansive RUTG S= over (Am) such that S will produce the u-trees in c;1(L(G')). We define B S-B', oS-o', and PS-F' so PS contains a singlex P'} U{x0+x(§1_77i;)|xo+x(xlmxn) EP'}. 80 we have L(S)-L(M). Q.E.D. Theorem 3.3 The sets of u-trees generated by regular u-tree grammars are exactly the sets accepted by u-tree automata. Iggggf Follows immediately from Theorems 3.1 and 3.2. Theorem 3.4 Given regular u-tree grammar G- (B,o,P,I'> , one can effect- ively construct regular tree grammar G'- be a regular u-tree granmar S wi‘thP-f OS -———> Sx/>a\c, a/\c A/o } '- Then regular tree grammar G length(a)=-x length(b)-x+y, y>0 .. .b- N a. _>b._:r In each of these cases "a" is a structural part of "b". 4.3 BUILDING U-TREES FROM PATTERNS Definition 4.1 A primitive feature system is a triple P8==<§,P,D> where: (a) P is a finite set of primitive types. (b) D is a finite set of informal descriptions of elements of P. (c) SGP, S is designated the start point. (d) No primitive type (except 8) is allowed to be a structural part of any other primitive type. We normally refer to a primitive feature system simply as a pggmif tive system. Example 4.2 Given a pattern in quadrant l of an X/Y coordinate system. Let P30. , P-{S=-,a- —),b-T}, D-{ (S is the closest point of 42 the pattern to the origin),(length(a)=length(b)=w, but w may vary and will assume an appropriate size for a given figure),(a is horizontal), (b is vertical),(the point of entry for a and for b is the end away from the arrow point, the point of exit is the arrow point)}. In future examples where there should be no chance of misinterpretation we will usually only give a description for S. The rest of the descriptions will be clear from the geometric form of the primitive types. Using the previously defined PSo let pattern F1 be given: (1.2) (2.2) F1 - (1.1) (2.1) Then the primitives will be: (a.[(1.2), 92)]) A A (b.[(1.l).(l.2)]) (b.[(2.1).(2.2)]) (S.(1.1)) (a.[(1.1). 2.1)]) Definition 4.2 Given a primitive system PSB , a corresponding d—tree, T to pattern, F, is a d-tree constructed using a representation of F by P8 with the following procedure: (a) Root of T is S:identifier of S. (b) Node a :b is a successor of node a1:b1 in T iff the point 2 2 of entry of primitive (a2,b2) coincides with the point of exit 43 of primitive (a1,b1) in the representation. Clearly, the corresponding d-trees when constructed in this manner will satisfy criterion (2) of our pattern recognition system. No refer- ence needs to be made to anything, we just construct the d-tree taking all primitives as they occur in the representation. It is important to note that a given primitive, (a,x), may have its point of entrance co- incide with more than one point of exit; in this case node a:x of a corresponding d-tree would occur on more than one root-to—frontier path. and F as defined earlier, the corresponding d- Example 4.3 Using P80 1 tree will be: S:(l,l) a:[(l,l),(2,l)] . b=[(1.1).(1.2)] b=[(2.1).(2.2)] a=[(1.2).(2.2)] Where there is no possibility of misinterpretation we can leave off the unique identifiers and get: 44 If we let F2 = a Ex then the primitives (using P30) will be: b b a at a b b S a 1' and the corresponding d—tree will be: S a b b ,a a >b b .,a It should be clear exactly what primitive is represented by each node of this d-tree without the identifiers. 45 Definition 4.3 A circular primitive system PS= ,B= T ,C= i ,D=<—-},{(S is the closest point to the origin in quadrant 1)}> . If F has primitive representation: s a / then S,(S,a),a,(a,b),b,(b,d),d,(d,c),c,(c,S),S is a closed walk in the corresponding d-tree so P34 is circular. Definition 4.4 A noncircular primitive system is one which is not cir— cular. (i.e. No patterns can exist in the appropriate universe that can produce a corresponding d-tree containing a closed walk.) Example 4.5 The universe is the subset [0,w) of one-dimensional space. Psl- <8,{S-',a-—>},{(S is the point of the figure closest to 0) }> . Then P81 is noncircular. Example 4.6 Universe - 3-dimensional space (in the area where x,y,z:9). (x,y,z) (staz) (X:Y:z) P52- . P-{ S - or or . (x,y.z+W) (XJ'Wfl) (x-WJJ) (X,YW,Z) (x,y+w,z+w) (x9Ysz) (x:Y:z+w) (x+w,y,z) b = (x,y.2) (x+W.y.2) c a: (x,y.2T 46 _. (m:}’:z+w) (x.y.z+W) e1 ’xw.y+w.z) X.y+W.2) D={(S is the closest permissible line segment of the pattern to (0.0.0). i.e. the sum of the distances from the endpoints is the least)»(a:ba°»s all share the same appropriate w),(the points of exit for a,b,c are the sides where the arrows point, points of entry are any other side)}. P then, is noncircular. 32’ Example 4.7 Ps3- 9 " 6 " " " ‘EIZEI; twice " ‘EES " " a6 " " 49 If k=7 we could apply Sa1 then a6 therefore S+a7+S(S) n n n n u — n n n n 8 a3a4 a7S a8 :1 n n n —— n — n n u u 9 814313 863 a9 n u n n _—— n __ n n n u 1° a14““13 1“93 a10 Therefore there cannot be a noncircular Ps' Q.E.D. Definition 4.6 Given pattern, f, with corresponding d-tree, t; we say that u-tree, u, is a correspondipg u-tree to f if u is the naturally corresponding u-tree to t. In Example 4.3 a corresponding u-tree for will be: pattern F2 b u a J b a Practically speaking we may observe that the corresponding u-tree is a theoretical construction only; it allows us to construct an accept- ing u-tree automaton, M, which will accept corresponding u-trees (thus accepting patterns) iff they are in L(M). we will never need to actually construct a u-tree when attempting to recognize a pattern: we can find the pseudoterm representation directly from the d-tree. Definition 4.7 A set of patterns, T, is definable if there is a regular u-tree grammar, G, and a primitive system, P8, such that given pattern t with y the corresponding u-tree for t, then tET iff y(L(G). Definition 4.7 does not mean that L(G) cannot contain u-trees which do not correspond to a figure in definable set T, in fact L(G) will often 50 contain such u—trees. These u—trees, however, cannot correspond to any physically possible pattern. See Example 4.10 for a case where this occurs. Note that every finite set of patterns {F1,F2,---,Fn} will be definable since we could first define the u-tree productions, figure by figure, as So+1:i, one level trees, then define the primitive system to have the figures themselves, and S, as primitive types. (Fi cor- responds to P1.) This misses the spirit of what we mean by primitive type but it does not violate our definitions. 4.4 THE PATTERN RECOGNITION PROCEDURE Theorem 4.2 Given definable set, T, there is a deterministic u-tree automaton, M, such that for pattern t with y the corresponding u-tree for t, then ttT iff y€L(M). £222: Follows immediately from Theorem 3.1. At this point we can describe our procedure for determining if a given pattern is in a given definable pattern set using an apprOpriate primitive system and u—tree grammar. Pattern Recognition Procedure Step(l) Given a pattern find the primitives which taken together constitute it. Step(2) Build a d-tree with labels taken from the set of primitives. Step(3) Present the pseudoterm corresponding to the d-tree to a previously constructed UTA for acceptance or rejection. It seems appropriate to ask what parts of this procedure are completely defined and thus algorithms. Given a definable set of patterns, step (2) under most conditions (see Theorem 4.4 and related text) will be an algorithm. With a given UTA, M, step (3) is an 51 algorithm defined by M. Step (1), however, in many cases is not com— pletely defined. It is not the purpose of this thesis to describe meth— ods of imposing primitive types, so we will just comment that probably a mathematical template matching involving "closest fit" could be used, but certainly a variety of methods might prove best for different applications. Of course, once a computer program has been written to find the primitives, the program itself will define a procedure. Definition 4.8 Primitive type c has unique size means that any primi- tive (c,x) in a representation of a pattern must use the identical c with no variation in any of its measurements. Lemma 4.1 Given u-trees T1- , T28 with root (T1)= root(T2)-r then T not a sub—u—tree of T2 eBnode(a)16 Uan2 such that 1 {(a,x)l(a.X)£V1}¥{(a.X)l(a.X)£V2}- Proof Given the conditions of the lemma assume to the contrary that for all aGU1r\U2,{(a,x)I(a,x)fVi}-{(a,x)|(a,x)éVZ}. In a u-tree every node can be reached on a path from the root. We first establish the claim that (given the assumption) yeU iff yeU This is shown by induction 1 2' on the length of path r,---,y. (a) We know rGUanz, and by the assumption {(r,x)l(r,x)( V1}= {(r,x)|(r,x)6V2} for all eri such that x can be reached from r by a path of length 1 thus x6 also. U3-1 (b) Assume claim true for all paths of length less than n. Now for ny1 reached on a path of length n we will have path r,(r,x1),x1,°°-,xn_1, (xn_1,xn),xn-y. Path r,(r,x1),x1,-'-,xn_2,(xn_2,xn_1),xn_l then is a 1We refer to a node by naming its label. 52 path of length n-l in T , therefore, by our induction hypothesis, 1 xn-lGUB-i also. Since (xn_1,y) is in V1=) , by our assumption, that (xn_1,y) is in V3_1$y603_i. So the claim is established and we have U1=U2 therefore U1 2. and (p1,p2)£V2 we must have (p1,p2)£Vi. Therefore gu Furthermore for pltU1 pztU1 and criterion (a) for definition of sub-u-tree (see definitions 3.9 and 3.10) is satisfied. Ul=U2=)for all aCUl, atUln Uzéfor a£U1 and erl (a,x)( Vl iff (a,x)(Vz, by our assumption,=) criterion (b) for definition of sub—u—tree is satisfied=)T1 is a sub-u-tree of T2® , so the assumption is false and the lemma is proved. Q.E.D. Theorem 4.3 Given primitive system Ps-* with noncircular P and with the restriction that each peP has unique size, then the cor— responding d-tree to a given pattern, f, will be unique (to the extent that any two d—trees representing f will have identical naturally cor- responding u—trees). Proof Assume d—trees d1 and d2 representing f have corresponding u—trees T1 and T2 respectively with TlfiTz. This implies that (WOLOG say) T1 is not a sub-u-tree of T2. Assume Tl- (U1,V1> and T2- (U2V2> . T1 and T2 must have the same root, S, this implies f]node(a)€Ulf\U2 such that {(a,x)I(a,x)(V1}#{(a,x)[(a,x)cvz}, by Lemma 4.1, but we can see that since size(a) is unique it must have the same point of exit (in relation to f) in both the representation for T1 and that for T2. Furthermore because of noncircularity this point can never have been reached previously during d-tree construction. Therefore every (a,x) in V1 (meaning that point of entry for x coincides with point of exit for a) must also be in V3“:l so that {(a,x) I (a,x)(V1}={ (a,x) I (a,x)cvz} ® 53 contradicting the lemma. So the assumption is false and TIBTZ. Q.E.D. There are other possible restrictions one might want to impose, rather than the unique size restriction of the theorem, in order that a This is to certify that the thesis entitled Unordered Tree Automata: Machines That Can Classify Patterns presented by Kenneth Leroy Williams has been accepted towards fulfillment of the requirements for Pho Dc degree in Computer seimce We 4:54,? Major profs Date August 7, 1973 0-7639 ue d-trees. In gener- to represent patterns priate restriction can not true for circular ge class of sets of systems. The ob- constructing d-trees que d-trees. This is - / > stem let PS6 \S,P,D . z—‘} angth of x 8 c), Constructing the d-trees on a breadth first basis will give either 53 contradicting the lemma. So the assumption is false and T1=T2. Q.E.D. There are other possible restrictions one might want to impose, rather than the unique size restriction of the theorem, in order that a given noncircular primitive system would give unique d—trees. In gener— al if a noncircular primitive system will suffice to represent patterns of interest for a given application, then an appropriate restriction can be found to insure uniqueness. This, however, is not true for circular primitive systems and unfortunately there is a large class of sets of patterns that cannot be represented by noncircular systems. The ob- vious method, that of applying primitive types and constructing d—trees on a breadth first basis, will not always give unique d-trees. This is illustrated by the following example. Example 4.8 In quadrant l of an X/Y coordinate system let Ps6. . P={s=.,x- m,zl- -—9,zz=<—'} D={(S is the point closest to the origin),(cord length of x - c), (length(zl)-length(22)-l/2 c)}. Imposing primitives gives either x (.Q Q b 52121 ’()Szlzz ,(c) Szzzl’(d) z Constructing the d-trees on a breadth first basis will give either 54 21 from (a) or x 21. from (b) so there is no 1 22 unique d-tree. We can however, in this case, define a regular u-tree grammar G- which will generate exactly those u-trees (with d-trees generated on a breadth—first basis) corresponding to the pat- terns in the set: The grammar will have B-{S,x,zl,22},T-{S-} and S S P 8 { i .__€E>,x ' z1 , i""€;;>xI///f\\\\bzl , z z 1 2 094 x N g.» (’54 K N I'-‘ U .21 22 21 21 9 ;>--j35>x 21 z‘——i35>x z1 } 1 1 21 22 This grammar will generate many u-trees which are not corresponding u—trees for patterns in F, for example: 55 C) but every u-tree of this type will represent a physically impossible figure so there will be no harm done by our tree acceptor accepting d-trees we can't get anyway. If we try to draw a pattern corresponding x to tree t we get: t::::::::€::;:::>~ S Reversing ourselves and generating a corresponding d-tree (breadth- first) for this pattern, however, we get 3 21 x 21 4 21 4 x z A? 56 and since this is 225 the same u—tree as t we can see that t was not the corresponding u-tree for any real pattern. t must violate our definition of how the corresponding d-tree should be formed with every possible primitive branch. Under certain conditions we can get unique d-trees from circular primitive systems if we are willing to do some exterior checking while building them. In the following theorem we assume one condition which is considered to be understood. Unique Application Condition we assume that there is a procedure for applying a primitive type so that if a given location of the pattern serves as the point of entry for the resulting primitive then the same primitive type will always be applied in the same way at the same location, so that exactly the same part of the pattern will be represented by it and the same location (in relation to the pattern) will be the point of exit. This seems a reasonable assumption since presumably we would implement this sort of scheme on a digital computer where the same machine instructions would always be executed in the same manner. Theorem 4.4 Given a primitive system (possibly circular), under the Unique Application Condition, one can always construct a unique d-tree for a given pattern if a total ordering (<,=,>) can be defined on the unique primitive identifiers with ash iff a-b. 2522; Given primitive system Ps-T we will prove the theorem by developing an algorithm which builds the d-tree as it imposes primitive types to give primitives. At each step there will be precisely one primitive which may be expanded (i.e. its successors 57 found) thus the algorithm works by first taking an enumeration of the elements of P, say p1,p2,-°-,pn. (P must be finite, by definition.) In general p1 will be expanded before pJ if i , (given above conditions) on a partially breadth first basis. (1) Apply primitive type 8-p1, giving primitive (p1,x1). Root of D+pi:x1. G+{p1:x1}. U+{p1:x1}. (2) H+{pi:x x:€G with i-min{klpk:xr£G}}. (H must be nonempty.) (3) F+{p1:xk|p1:XjCH=§xk§xj}. (F must contain exactly one element, say f, because of the total ordering.) G+G—F. (4) E+{pk:xj|pk:x is a result of expanding at f and each x >x1 for J J all previously used x1}. U+ULJE. V4VLJ{(f,w)|w E}. G+G£JE. If G-¢ then exit. Else go to step (2). The algorithm clearly must generate a unique d-tree. Q.E.D. 58 Definition 4.9 we say p is a projection from u-tree ulcuTA1 to u-tree uzfuTA2 if digraph u2 is formed by relabeling the nodes of digraph ul with some relabeling function p':A1+A2. For sets of u-trees, U1 and is a projection of U if U contains exactly the set of U2, we say U 1 2 2 projections of u-trees in U1. Theorem 4.5 Pattern set F definable implies that EJPB,H and G with H a projection of the set of derivation trees generated by context-free grammar G, (we denote the set of projected derivation trees as Hd(G)), and P8 is a primitive system such that given pattern f, with y the corresponding u—tree for f, then fGT iff yéfld(G). 25933: F definable$3 regular u-tree gramar G' such that L(G') satisfies the conditions for Hd(G) in the theorem statement, by the definition of definable, :) Edeterministic tree automaton M with L(M)= L(G) so L(M) satisfies the requirements of Hd(G) in the theorem state— ment, by Corollary 3.1,=§ 3context-free grammar G with Hd(G)=L(M) so “d(G) satisfies the conditions of the theorem, by Theorem 2.1. Q.E.D. 4.5 SOME EXAMPLES Many infinite sets of patterns where each pattern is of finite size will be definable. In these cases we usually find that the pat- terns can each be broken down into connected subpatterns where the sub- patterns are each definable and where any number of such connections may be made, giving larger and larger, but still finite patterns in the set. Recall Example 4.8. Such cases illustrate the advantages of the recursive manner in which regular u-tree grammars can generate u-trees. 59 Example 4.92 It is desired to be able to recognize the set of houses, T ' { a : etc. } We can see that each house may be represented using only the primitive types: a-(—- , bf- T , c- /, d- ‘\ , S(the start point)--, with the convention that S will represent the point in the lower right hand corner of a pattern. A specific house in the set may be represented as; a,17 a,16 d,l3 d,15 a,1l a,12 A c,8 a,9 b,7 b,6 b,5 13.14 E k 3,1 8,4 8.3 8.2 The numbers associated with each primitive serve to uniquely identify each specific application of a primitive type. we now construct the representative d-tree for the pattern: This problem was 8180 used as an example in Fu and Bhargava(20). A similar but more limited problem was first given by Shaw(38) and also appeared in Fu and Swain(18). 60 The d-tree is made by using each primitive label as the label for a node of the d-tree. We place arc (a:xi,b:x ) in the d-tree if primitive b:x .1 can be directly reached from primitive a:xj, always traveling with the flow of the arrows. Now that the d-tree is constructed we just present .1 the corresponding term S(a(a(a(b(c))b(a(c)d))b(a(a(c)d)d(a)))b(a(a(a(c)- d)d(a))d(a(a)))) to our previously constructed UTA for acceptance. To see that a UTA can be constructed that will accept terms only for d-trees corresponding to elements of the pattern set we observe that the follow— ing grammar will generate the corresponding u-trees. we can then use the procedure as indicated by Theorem 3.1 to construct the UTA. Let G- <{a, b, c, d, S ’81’82’83 },o,P ,{S }> over (fa, b ,,c,d S},o'> >with «A A 61 Another example using very different primitive types will be instructive. Example 4.10 Let the pattern set, T, be all isocoles right triangles with vertical and horizontal sides, constructed of grid blocks. T - { I ,’ ’ [ etc. } It The primitive types are: S= , T= . S is the closest block to the lower left hand corner of a pattern. Points of exit for S and for T are the sides where arrows point, points of entrance are the other sides. Define the u-tree grammar to be G-‘<{S,T},0,P,F> E S S T where P { o -_€EE::d///f\\\\I , z E: d////\\\\bT } 10 The d-tree representing 8 9 5 6 7 will be 62 G can generate many u-trees that do not represent patterns in T, but these will not be the corresponding u-trees for physically possible patterns anyway, so no harm is done. Consider the derivation: which produces a u-tree that does not correspond to an element of T. This u-tree, however, does not correspond to any possible pattern. Any attempt to construct such a pattern, while realizing that all possible connections must be shown in the u-tree illustrates this fact. As mentioned earlier, one of the best features of this automatic method for pattern recognition is in the way it can recursively accept pseudoterms corresponding to recursively defined pattern sets.3 As a last example we outline a possible use involving nonrecursive patterns. This example also illustrates the capability of the method for pattern classification as well as recognition. This method is probably not the most practical for printed characters; a vast amount of research has been done on this problem. Example 4.11 we want to recognize printed English capital letters. In order to keep the example small we will only work with three letters, C, G and O, and we will assume that these are written in an idea form for analysis. we will use primitive types: s--, a! R.\\ , b-/, c-\ , d= /, e-=—-), f-T, g-é—,h-l with start point 8 taken as the closest point to the origin in quadrant 1 of an 31n fact this recursive capability is one of the main advantages of any syntactic system. X/Y coordinate system. Since we system we will use the procedure unique d-trees. The enumeration above, so "a" is expanded before etc. Given ideal character C we primitive system as: f which has d-tree mm Representing G: Representing 0: 63 have a clearly circular primitive outlined in Theorem 4.4 to ensure of primitive types will be as given "b" which is expanded before "c" would then represent it with this e { c; 64 We define our u-tree grammar over'(T-{s,a,b,c,d,e,f,g,h},o> as G- with B-{C,B,O}UZ and 8 31 C a G P={E——>f ’ °‘>£ ’ e b. b b 4 at) .J cl s a c °—> O f e } b b e f In order to define an accepting u-tree automaton we first need to find an equivalent expansive u-tree grammar over <%,o> . To get this we work through each production in P, introducing new nonterminals where necessary, to make expansive productions, giving G- with Bl-{a1,b b b }LJB and 1! 2! 39b4’c19c29c3’c4fieloez’e3tel.985’f19fafig3 Pl"g‘>.l7\c;““‘>:l ' 59:11 ’ :1__>b1,:1_>c:1, c.1—> §,c.2—>e:1, E’1 0 m N m O———O 00" N 00‘ U °0 H > u U 34—21 - 34—21, ‘34->"1 H a} The deterministic u-tree automaton we construct will end in state C if the input pattern is C, state C if G and state 0 if 0, so it will classify the input. It will not end in any one of these final states for any other input. Using the construction process outlined following Theorem 2.5 (modified for u-trees) we define M= (Bl-z’ts’ta’tb’tc’td’ te,tf,tg,th,{C,G,O}>». The t functions will be defined by: t8 (ac C 1°)2' t $1(ac 3)=G ts(a1cM4) ta(f1)-a1 tb(e1)-b1 tb‘AH’z tb(§;3;3=b3 tb(f4)=b4 tC(Akcl 66 te(e2)‘cz tC(83)‘°3 tc(e4)=°4 te(°1)"e1 te(b2)-e2 te(b3)-e3 te(b3)’e4 te(x)=eS t£(b1)'f1 tf(A)-=f4 tg(A)-g3 To see how M functions we will use Definition 3.16 to trace through an acceptance of pseudoterm s(a(f(b(e(c))))c(e(b(ge)))) which represents the corresponding d-tree to an ideal input pattern, a. 0(8(a(f(b(e(c))))c(e(b(8e)))))' t3(p(a(f(b(e(c)))))p(c(e(b(ge)))))- t8(ta(p(f(b>cc(te(p))))= t8(ta(tf(tb(p(e(c)))))tc(te(tb(p(s)p(e)))))= ts(ta(tf(tb(te(p(c)))))tc(te(tb(t8(x)te(x)))))- t8(ta(tf(tb(te(tc(x)))))tc(te(tb(g3e5))))= t8(ta(tf(tb(te(c1))))tc(te(b3)))- ts(ta(tf(tb(e1)))tc(e3))- t8(ta(tf(b1))c3)- ts one can construct reduced, expansive regular tree grammar G'- <3,o',P,{Z'}> over with L(G')-L(G), by Theorem 2.3. One can then construct phrase structure grammar G"- such that P' will have Xo+x(X1--°Xn) for each 1An equivalent result was proved in Brainerd(7). 7O «9 9 in P. L(G") will then be exactly X1 X2 Xn the set of pseudoterms representing elements of L(G')=L(G). Each of the productions in P' will have a single nonterminal in B-A as its LHS (since G' is expansive) thus G" is a context-free grammar, therefore one can construct a nondeterministic PDA which will accept exactly L(G") which is exactly those pseudoterms representing elements of L(G). Q.E.D. Corollary 5.1 Given any regular u-tree grammar, M, one can construct a nondeterministic PDA that will accept exactly those pseudoterms re- presenting elements of L(M). Eggg§_ Given M, consider regular tree grammar G that simulates M, from Theorem 3.4. Then apply Theorem 5.1 to G. Q.E.D. Corollary 5.2 Given any regular u-tree automaton, M, one can construct a nondeterministic PDA that will accept exactly L(M). figgggf Follows from Theorem 3.2 and Corollary 5.1. So we can always construct a nondeterministic PDA to take the place of a u-tree automaton. Nondeterministic PDAs, however, are themselves difficult to implement; there might be cases where it would be easier to make a deterministic u-tree automaton than a nondeterministic PDA. We would like to be able to use a deterministic PDA. This would involve an easy implementation or simulation. Theorem 5.2 Given a context-free phrase structure grammar, G, there is a deterministic PDA which will accept a string, t, iff t is a pseudoterm representation of a parse tree generated by G. 71 Proof Let Ca V v P,yd> , with P={(pl): A. +wl : a N T jl (92): Aji+w2 (pn): Aj +wn} n 2 with IVNI-m, Z VNUVT’ A315 VN for l:i_<_n. Let G'- VN"VT"P"GQ> where VN'={oiI1:i:m, ai£2} so the nonterm— inals VN' will be a set of new symbols. Let VT'=£U{),(}.3 Each production pi' in P' will be obtained from the corresponding production pi in P by the following procedure: (1) Replace each occurrence of any nonterminal y (pi by the 1 corresponding new nonterminal a giving new production x 3' 1° i:aj+w , let p1 be oj+yj(w ). Now grammar G' will generate exactly the set of pseudoterm repre- (2) For each production x sentations of parse trees generated by G. P' will have no equal RHSs since every RHS is composed of both the LHS and the RHS (transformed by step 1) of a production in P and the same production will not occur twice in P. Furthermore when parsing a string in G' it is always clear exactly when a substitution should be made; it will be immediately after reaching symbol ")" as input is scanned from left to right. The sequence to be substituted for will be exactly yk(w') for yk§Vh, w'((VTLJVN')*. So G' is LR(O) hence a deterministic PDA can be constructed to accept exactly L(G')4 and L(G') is exactly the pseudoterms representing parse trees generated by G. Q.E.D. 2IVNI indicates the number of elements in VN. 3We assume ),(££. 4See Hopcroft and Ullman(22). 72 Example 5.1 Given context—free grammar G-1 with P - {3+oso S+lBl B+x} 'g '. '3 II H H H V I Let G . To generate P we will have after step (1): aS+OaSO a +lo31 S aB+x After step (2) we get: ' ‘ P {oS+S(OaSO) oS+S(laBl) uB+B(x)} and G' will generate the pseudoterms representing derivation trees produced by G. Theorem 5.3 Given regular tree grammar, G, in expansive form with no equal right hand sides in its tree productions, there is a deterministic PDA which will accept exactly the set of pseudoterms representing elements of L(G). mg Given G- over we can construct context- free phrase structure grammar G'- such that for each X2 > 1 2 n X0+x(X1'--Xn). We can then make the same observations about parsing in P, P' will contain 9' as those made for the G' in Theorem 5.2. Q.E.D. 73 Corollary 5.3 Given regular u-tree grammar, G, in expansive form with no equal right hand sides in its u-tree productions, there is a deter- ministic PDA which will accept exactly the set of pseudoterms represent- ing elements of G. £322: Given G consider regular tree grammar G' that simulates C (see Theorem 3.4). G' will still be reduced, expansive and contain no equal RHSs. Then apply Theorem 5.3 to G'. Q.E.D. We now state the strongest result of this section. Theorem 5.4 Given any regular tree grammar, G, one can construct a deterministic PDA that will accept exactly those pseudoterms represent- ing elements of L(G). 2523: We will outline the construction for the deterministic PDA, M', so that it will be clear that M' will accept set L, the correct set of pseudoterms.5 we assume G is defined over ranked alphabet . Let M8 <3,t1,t2,°-°,tm,s> be a deterministic tree automaton which accepts L by final state. The existence of machine M is established by Theorem 2.5. We define M'- with Fck and F con- taining a matching element ks for each 863 so F-{kSIsGS}. M' will end in state k8 and accept an input iff M would end in state a and accept the same input (there is a single exception where M"will end in a failure state, k if M would have failed anyway). It is, of course, fail’ understood that if M' has no applicible move function defined for a (state,input,stack) triple then it halts and the present state is the final state. Note that the pushdown stack will show symbol # for empty 5Here we use the Hapcroft and Ullman(22) (continued on next page - ) 74 5(continued) formalization for deterministic pushdown automata, namely a deterministic PDA M is a system where (1) (2) (3) (4) (5) (6) (7) K is a finite set of states. X is the finite input alphabet. F is the finite pushdown alphabet. qocK is the initial state. #GF is the start symbol which initially appears on the pushdown store. FQK is the set of final states. 6 is a mapping from KX(ZL){X})XF (X is null input) to finite subsets of KXF* such that (a) 6(q,a,z) contains at most one element. (b) 6(q,l,z) contains at most one element. (c) If 6(q,l,z) is not empty, then 6(q,a,z) is empty for all an. If 6(q,a,z)=(p,y) we adopt the convention that the rightmost symbol of Y will be placed highest on the store and the leftmost symbol lowest on the store. We say M accepts by final state if an input is accepted iff M halts in a final state. 75 In order to define 6 we first divide the various t function/input pairs defined by M into distinct sets. T0={t Jml:j (mm Tl={tj (w)|tj(w)£M, |w|=l} Tr={tj(w)|tj(w)£M, |w|=r, r is the maximum length of any w such that tj (w)6 M} Note that each of T0,T1,---,Tr will be finite. (In the following move function definitions any input or stack string containing ")" or "(" will be indicated with quotation marks around it. Input on stack strings not containing these symbols will not be enclosed in quotation marks.) §££P.l. M' will begin reading an input (left to right) from its input tape and will immediately push the input onto the stack, so we define 6(k0,x,#)=(ko,#x) for all xGA. Whenever it reads an "(" it pushes it down and enters state k so we ( define 6(k0,"(",x)=(k(,"x(") for all xeA. We then define 6(k(,y,"(")=(kl,"(y") for each yeA.6 Note that neither "(" or ")" is a valid input symbol to follow "(" for an element of L. We also define 6(k1,y,x)=(ki+l,xy) for each yéA, for each xeBLJA, for liiir-l, 6M' as defined here will not adequately handle the special case where there are acceptable single-node trees. It is easy to change Step 1 slightly if these also are to be accepted. 76 5(k1,"(",x)=(k ,"x(") for each xEBL/A, for l:i:r-l,¢ ( 6(ki,")",x)=(k) i,x) for each xeB\JA, for lfiijr. 9 The total effect of 6 as defined on states k),k0,k1,°--,kr,k),l, ---,k will be to read the initial input, left to right, until k . ).2 the first ")" is read. The input is pushed directly onto the stack and ).r M' will end in state k) j for "(w)" being the last j+2 input symbols 9 with lwl=j and ),(¢w. we will use the pairs in T to determine the next ).3 0 states as M' pops the stack down to the last "(". M' will now simulate Step 2_ In state k M as M would apply tx(l) for x on the frontier of a tree. M' keeps a record of what is replaced by what state it ends in. We define “km ).j.X1’ with xl=tx(x) if [tx(A)]eTO ,l,x)=(k l) for all j such that Tj#¢, for each xfBL/A Xl=x otherwise. 6( A,x)=(k ,A) for all j such that jig and Tj¥¢. k . ).j.X1 ),j,x1,x2 for each chLJA with X2=tx(l) if [tx(l)](T0 X2=x otherwise. 5(k ,X,x)=(k ,A) for all j such that )’j’xl’...’xj-l )9jaxla.°°9xj Tj#¢, for all X163 with ljijj-l, for each chLIA with Xj=tx(l) if [tx()\)](TO Xj=x otherwise. The 6 functions taken where [tx(A)]£T0 will simulate M as it makes no substitution in a string of states for a state which is itself a substitution. Note that we are only defining a finite number of 6 functions and states k since BLJA is finite and we are only considering ),j’ooo state encodements for input strings of length less than or equal to j for ljjfir. After M' applies these 6 functions defined above, it will be in state k , and the stack (bottom to top) must contain (for a )SJSXI’...’Xj valid input string): "#wl(w2°°-wn(". Step 2_ We define a( ’A’n(n)=(k( X ,A) for all je{i|i;l, Ti#¢}, k . )9j9X19'°°:X 939X19°' 3 j J for all strings Xloo-Xj from B of length j, in order to pop the top "(". We define 6( ,A,x)=(kY,Y,A) for [tx(XJXj_l---X1)](T with k(9j9x1)'°°:x j tx(Xij_l---X1)1Y in M, for all j€{i|i:l, Ti#¢}. These are the functions (and they will be functions since M was de— terministic) that actually determine what substitutions to make to simulate the functions in M, except those in To (which are already made). So M' will end in state kY,Y whenever a substitution from M, resulting in Y, could be made. §£gpmi It is now necessary to determine if the stack is empty. We define 6(kY’Y,A,#)=(kY,#) for each kY,Y defined in step 3 and 6(kY’Y,A,x)=(kY,,xY) for each xéf, for each kY,Y defined in step 3. §p§p_§_ M' ends in state kY or kY, whenever a substitution (except those indicated by To) has been made, followed by a move function from step 4. If there is no more input on the input tape and state kY is reached (indicating the stack was empty) then kY is the final state reached and if kYéF the input is accepted. M' must check for more input in state kY: if there is any, M' can move directly to a fail state. We define 78 6(kY,x,#)=(kfail,#) for all er. If state kY, is reached M' has made a substitution and must continue. If there is more input it is necessary to determine exactly how many symbols have been placed on the stack since the last "(", in order to k know in what state k --,kr M' should now be in. We define the 1’ 2” next group of states and move functions to accomplish this. M' also must know what the next input is. At this point the top of the stack must contain an element of B, in fact the element Y when in state ky,. Define (a) 6(kY,,x,Y)-(kx’Y,A) for each xéA, for each kY as defined in Step 4 (b) 6(kY,,"(",Y)=(k(,"Y(") for each kY' as defined in Step 4. (Note that k( and its associated move functions were defined in Step 1.) (c) 6(kY,,")",Y)=(kP’Y,A) for each kY, as defined in Step 4. Step Q_ We first define 6 for case(c) from Step 5. Define 6(k1,,xl,A,x2)'-(kp’X x ,A) for each leB, for each x2¢B\JA, l 2 6(kP’x1’x2,A,x3)=(kp’x1.x2,x3,x) for each X163, for x2,x3€B£JA, 6 k = A ( P’x1’°°°’xj_1’x’xj) (kP,X1,°°°,xj’ ) for each Xl(B, for all x1£B\JA with Ziijj for ljjjr, n n g n ,,, n 6(kP,X1,'°',x ,l, ( ) (k),j’ (x x2X1 ) for each XIGB, for all J xiéBUA with Ziifj for all j such that T #6. 1 These 6 functions will leave M' in state k) j with everything ’ necessary replaced on the stack. Step Z_ For case(a) in Step 5 we define 79 6(k ,A,x )=(k ,A) for each x ,x ,x (BL/A, x1,x2 3 . xl,x2,x3 l 2 3 6(kx]-’.H’X."1,A,xj =(kxl.'”.x ,A) for all xicBUA with 1:1:j J for ljjjr, 6 k ,A," " = k. " x °°-x " for all ' 3 ch that T.# , (x1»""xj ()(J,(j 1) 311 Jo and for all xithJA with ljijj. Move functions for state kj were defined in Step 1 so in all three cases generated by Step 5 M' wil enter states whose move functions are'previously defined. We have defined deterministic PDA M' so that it will clearly simulate deterministic tree automaton M and thus accept exactly those pseudoterms representing elements of L(G). Q.E.D. Corollaries 5.4 and 5.5 follow Theorem 5.4 in exactly the same manner that Corollaries 5.1 and 5.2 follow Theorem 5.1. Corollapy 5.4 Given any regular u-tree grammar, G, one can construct a deterministic PDA that will accept exactly those pseudoterms re- presenting elements of L(G). Corollary 5.5 Given any regular u-tree automaton, M, one can con- struct a deterministic PDA that will accept exactly L(M). An example will illustrate the development of a deterministic PDA to accept L(G) for a regular tree grammar. Example 5.2 G- <{S,T,x,y},o',P,{S-}> over . P={°S—>s/\T’ §_>’g’ £—>y°} G is a reduced, expansive, regular tree grammar. An accepting tree automaton constructed as in the proof of Theorem 5.4 will be: M8 with: tx(ST)=S, tx(A)=S, and ty(A)=T. 80 .Dur accepting deterministic PDA will be: M'=~ - The T1 sets used to determine some of the functions in M' will be: TO={tx(A).ty(A)} T2={tx(ST)} The value of r will be 2. We now define K and 6. Generated by Step 1: 1. 6(k0,x,#)=(k0,#x) * 2. 6(k0,y,#)=(ko,#y) 3. 6(k0,"(",x)=(k(,"x(") * 4. 6(k09"(",Y)=(k(:"},(n) 5. 6(k(,x,"(")-(kl,"x(") it 6. 6-(k1."y<") 7. 6(k1,x,x)=(k2,xx) 8. 6(k1,x,y)=(k2,yx) 9. 6(kl,x,S)=(k2,Sx) 10. 6(kl,x,T)=(k2,Tx) 11. 6(k1,y,x)=(k2,xy) * 12. 6(k1.y.y)=(k2.yy) 13. 6(k1,y,S)=(k2,Sy) 14. 6(k1,y,T)=(k2,Ty) 15. 6(k1,"(",x)'(k(,"X(") * 16. 6(k1."(".y)=(k(."y(") 17. 6(k1,"(",S)-(k(,"S(") 18. 6(k1."<".r)=(k(."r<") 19. 20. 21. 22. 23. 24. 25. 26. 5(k1.")".X)=(k)’1.X) 6<1<1.")".y)=(k),1.y> 6(k1,")",S)-(k)’1,S) 6(k1,")",T)=(k)’1,T) 6(k2,")",x)=(k)’2,x) 6(k2.")".y>=(k),2.y> 6(k2.")",S)-(k)’2.s> 6(k2,")",T)=(k)’2,T) Generated by Step 2: 27. 28. 29. 30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 6(k)’1’A’x)=(k)’l’S, A) d(k)’1949Y)=(k)’1’T)A) 6( ,A,S)‘(k k),1 ),1,s’ 5(k)’1,A,T)=(k)’1,T, 6(k)’2,x,x)=(k),2,s, 6(k),2.A.y)=(k)’2’T. 5(k)’2,A.S)'(k)’2’S. 6(k),2,A,T)-(k),2’T, 6(k)’2’s,x,x)=(k),2’ 6(k)’2’s.x,y)=(k)’2, 6(k)’2’S,A,S)-(k),2’ 5( ,A,T)=(k k),2,s ).2. 6(k)’2’T,A,x)=(k)’2, 6(k)’2,T,A,y)-(k)’2’ 6(k),2,T,A,S)=(k),2’ ‘5“‘>.2,T’*’T)"“).2. A) A) 1) A) A) 1) s,s’*) S’T,A) S’S,A) S,T’A) T,S’A) T’T.A) T’S,A) T,T’A) 81 82 Generated by Step 3: 43' dk),2,s,s’*’"(n)=(k(.2.s,s”*) 44. 1) * KR), 2, T ’W 2’ T, S: 45. A) “k ),2,s, T’ M’"(") (k(, 2, s, T’ 46' “32,131" ’"(N)=(k(,2,T,T’A) 47x dk(’2’T’S,A,x)=(kS,S,A) * Generated by Step 4: 48. gk 8 S’ X,#)= (kS ,#) * 49. (KkS’S,A,S)=(kS.,SS) 50. «i S’S,X,T)=(ks,,TS) 51. 60‘s s’ A,x)= (k8,,xS) 52. 5°‘s,s’ A.y)=(kS.,yS) 53. 6(kS’S,A,")")=(kS,,")S") 54. 6(k8 S’ A,"(")= (k8,,"(S") . * Generated by Step 5: 55. 6(ks,x,#)=(kfail,#) : 56. 5(ks,y,#)=(kfa 11’#) z 57. 5(ks,"(",#)=(kfa11,#) I 58. 5(kS,")",#)=(kfail,#) : 59. 6(ks,,x,S)=(kx S,A) 60- 5(ksuy,3)‘(kys.1\) * 61. 5(k "(u S)= (k( "(S") H)", s)== (k S" 62. 6(k S', PJS’) Generated by Step 6: 63. 6 (k A .X)=(k P’s, P,S,X’A) 6 . = 4 6(kp,s’A’Y) (kP,S,y,A) 65. 66. 67. 68. 69. 70. 71. 72. 73. 74. 75. 76. 77. 78. dckp,s.x,8)=(kp,s 3.x) 6 = (kP’S’ADT) (kP,S,T,A) 6(kP,T’ AD)<)=(1(1)’119’){3A) 6(kP,T,A,y)=(k A) P.T.y’ 6(k A,S)=(k P,T’ P,T,S’A) 6(kP,T’*’T>=(kP,T,T’A) 6=(k),2."= 5(kp,s,s’A’"(">= 5(kP,S,T,A."(")=(k),2,"(TS") GCRP’T’X.A."(")=(k)’2."(xT") 6(kP,T’y.A,"(")=(k)’2."(yT") 6(kP’T,S.A."(")=(k)’2."(ST") 6(kP,T,T’)""(")=(k) ’29"(TT") Generated by Step 7: 79. 80. 81. 82. 83. 84. 85. 86. 87. 88. 89. 6 (kS,X’ 4:" (")=(k2 9" (xsn) 6 (ks,y9 )‘9 "(")=(k29 "(yS") 6= 6(kS,T,A."(">==(k2."(xT") 5(kT,y, A,"(")=(k2,"(yT") 6(kT,S.A."<")= 6 6==(k2."(Ty") 91. 6(k y These 94 move functions were generated by directly following the rules given in the construction; we will see that many of them are not necessary. Using M' as developed we show the recognition procedure for x(x(xy)y) which is the pseudoterm representation for: Stack contents are shown in order from bottom to top. Stack 12212 State Initially # x(x(xy)y) k0 Apply rule: 1 #x (x(xy)y) k0 3 #x( :(xY)Y) k( + 5 #x(x (xy)Y) . k1 15 #x(x( iy)y) k( 5 #x(x(x ;)y) 151 ll #x(x(x(xy ;y) k2 24 #x(x(xy ;) k),2 32 #x(x(x ;) k),2,T 39 #x(x( ;) k),2,T,S 44 #x(x ;) k s + (929T, 8S Stack Input State Apply rule: 47 #x( y) kS S + ’ 54 #x(S y) k c + s 60 I k m 1 y.8 93 #x(Sy ) k2 + 24 #x(Sy k)’2 32 #x(S k),2,T 41 #x( k),2,T,S 44 #x k(,2,T,S 47 # kS,S 48 # kS Since there are no applicible rules for (kS,A,#) M' is finished. kS is a final state so the input is accepted as it should be. Input x(x(xy)y) is an exhaustive test of necessary move functions for this example; any move function not used here was not necessary (with the exceptions noted below). The ones used are indicated with an *. Only fifteen move functions were used so our deterministic PDA could be greatly simplified by not including any of the others except those which immediately take M' from an accepting final state on any input. These are indicated with an I beside them. They must be main- tained or the machine could end in an accepting final state with more input still to come, but with no move function defined. Even with these included, M' only requires 19 states. 86 5.2 LANGUAGE PROPERTIES 9E_THE FRONTIERS 92 U-TREE LANGUAGES Definition 5.1 The frontier f §_u-tree rgpresentation is simply the frontier of the tree used in the representation. Definition 5.2 The frontier _f‘§ u-tree is the set of frontiers of representations for the u-tree. Example 5.3 U—tree T = y z The frontier of this u-tree representation is xyz. The frontier of T is {xyz,yzx,xzy,zyx}. Definition 5.3 The string language, denoted LS(G), of a regular u-tree grammar, G, is the union of all the frontiers of u-trees in L(G). For every u-tree grammar we have defined a language of strings. We use the following notation as we investigate this class of languages: 0 = the class of regular sets. A the class of context-free sets (i.e. the context-free languages). F the class of string languages generated by regular u-tree grammars. Lanai-.1 99’1"- Ppppf kLet B={01}. Then B is regular (every finite set of strings is regular). Assume Berg: 3a regular u-tree grammar, G, such that LS(G)=B, § we can apply a sequence of u-tree productions from G giving a u-tree representation, A, with string frontier 013 Esome node, say a, in A with two successors, the left eventually leading to 0 and the right to 1. Since A is only a representation of a u-tree‘? 87 filanother representation, A', such that A' can also be generated by the same sequence of u-tree productions but a in A' will have its left and right successors reversed=§lO is the string frontier of A'? lO€LS(G) ® therefore BtI‘. Q.E.D. Jim-Lea 18$?- M egr and egnsagr. REID—32:2 I‘CA. g£99§_ Let A£F=9j3a.regular tree grammar, G, with frontier(G)=A, by Theorem 3.4, abja deterministic tree automaton, M, such that L(M)= L(G), by Theorem 2.5,=?A is a projection of a context-free language, by Theorem 2.1,£9A£A (shown in (22)) therefore Pg A. Lemma 5.2 shows that the inclusion must be proper. Q.E.D. Azalea Fie- steel Claim(l) Regular tree grammar Gl= ’over {S,A,B,a,b} with P1 ' S S S S S S S S { o -—> , o -€> , o ~€> , ° “" , 6 a B S a B b A S b A A A B B A A B B 0 9A: 0 '—"> 9 0 '6 a O -—> } b A A a a B B b will produce frontier F={w|w consists of anuequal number of a's and b's}. Hopcroft and Ullman(22) present a context-free phrase structure grammar that produces F. This grammar has start symbol 8 and productions: 88 S+aBS S+aB S+bAS S+bA A+bAA Aaa B+aBB B+b Grammar G1 has been formed so that it will produce, as its tree language, exactly the derivation trees generated by the Hopcroft and Ullman grammar, therefore the set of frontiers produced by C1 must be exactly F. Claim(2) Given regular u-tree grammar G2= <(S,A,B,a,b},02',P2,{S°}> over {S,A,B,a,b} with P =“(the same set of productions as P but we 2 1 now regard them as u-tree productions, not tree productions} then LS(G2)= F. This is true since G2 can produce every tree (which serves as a u—tree representation) that G can, so FQ LS(GZ). Furthermore each 1 tree (a u-tree representation).it produces must contain the same symbols on its frontier as‘a tree in L(G), except the symbols may be in a dif- ferent order. Any string in F, however, even with a permuted order, is still an element of F. Therefore L8(G2)=F. Now that Claim(2) is extablished we consider the language F=LS(G2). It is easy to see (and well known) that F is not a regular set, but F is the frontier language of a regular u-tree grammar, G So rgpe. 2. Q.E.D. 89 Lemma 5.5 0()F#¢. Proof Let D={0n|n:l}. D69 and clearly D‘T. Q.E.D. Theorem 5.5 The results of Lemmas 5.1, 5.2, 5.3, 5.4 and 5.5 can be summarized in the following Venn diagram of relationships. We now investigate some of the closure properties of class F. Theorem 5.6 F is closed under union. a II Proof Let A,BEP. Let MA be defined over <£A’°A> = B " with LS (MA) A. Let “B be defined over (23.03) with LS(MB)=B. We will define a new grammar M such that LS(M)-AL}B, but in order to avoid interaction between productions in PA and those in PB we must first relabel some of the symbols. Let c-CAllcn. For each symbol CiGC pick a unique new symbol 1: ." ditcAFJCB' Let D {dilc16C}. For all d define 0(d1) 0 (c1). At 1) each occurrence of c in P and in P replace c i B B i H 8 '8 H . . . sets PB and PB . Let PB PB U{di -€> c1|c1§C(1£B} We now have a c. v- I- n v I new machine MB (CB CBUD’OB OB U{O(di)}’PB ,I‘B> defined over <%B,oB>‘with LS(MB')=LS(MB). Pick a new symbol stB'UCA. Let M= (CAUCB'U{S},0A"UOB'U{O'(S)‘ with d1, call the new 0},PAUPB' U{s- éwlwéFAUPB'},{s-}> be defined over (ZAU £3,0AUOB> . 90 Now, M will begin with start tree so then immediately use a production producing a start tree in PA or PB'. Either way it will produce a string in A\JB and all such strings will appear. Q.E.D. .ngp§_§;§_rf is not closed under intersection. ‘Ppppg Let X={c+w or wc+lw consists of an equal number of a's and b's}. Claim: XGP. To see this consider regular u-tree grammar G= >defined over <38,A,B,C,D,A,b,c},o> with I) U C C D D P={ 0 9 05 9 “a 9 D , S c D c4 S S S S ga/k’aafl'soa/k’sca ’ ‘b a B S a b A S A A B B A A B B o-—-) ’0—9 ’0—9 , 0-9 } D A A a a B b So, (see the proof of Lema 5.4) L8(G)-X$Xel‘ and our claim is established. + + , , Let Y={a wgor wa Iw consists of an equal number of b s and c s}. In a manner Similar to that above we can show that Yer. We now proceed to show that xrxytr. For notational purposes we let: al-{c+w in set X} aZ-{wc+ in set X} Bl={a+w in set Y} 82={wa+ in set Y} 91 Then 01f\31=¢ a1f\82={cnbnan|nzp} ozf\81={anbncn|nzp} azf\82=¢~ NOW X “Y=(01U 02) A (BlU 82)=(a1/\ 81)U(aln 82) U (a2 “81) U012 082) $0 X/\Y={cnbnan|nzp}\J{anbncn|nzp} which is not even context-free (see Theorem 4.7, Hopcroft and Ullman(22)) therefore is not in P. Q.E.D. Lemma 5.7 P is not closed under complementation. Proof Assume F is closed under complementation. We also know it is closed under union, by Theorem 5.5. Now Llf\L2¥EIijL; therefore closure under union and complementation imply closure under inter- section @) . Q.E.D. Theorem 5.7 F is not a Boolean Algebra. Proof Follows from Lemma 5.6 or 5.7. CHAPTER 6 CHARACTERIZATIONS FOR TREE GRAMMARS In this chapter a characterization for the classification of tree grammars along the same lines as that for phrase structure grammars is proposed. The work is intended to be exploratory; it provides a begin- ning towards a more general theory of tree grammars. 6.1 BASIS FOR CHARACTERIZATION Theorem 6.1 The strings of labels encountered on the root-to-frontier paths in the trees of a language generated by an RTG will be a regular set. Ppppf There will be a reduced, expansive RTG, G, that will generate any such tree language. Suppose G' over . We de- fine regular phrase structure G"= where the product- T ions in P' will be defined as follows: For in.P, P' will contain: Xo+xxl xonxz X , ° For 0 x in P, P will contain: . ° 3’ ° X +xX X +x. 0 n 0 So 6' will generate the strings of labels on root-to—frontier paths 92 93 generated by G and G' is regular, therefore these strings form a regular set. Q.E.D. The next theorem shows that for any phrase structure grammar, regardless of the type (3, 2, l or 0) productions, if each production is only applied to the right hand end of sentential forms, then only a regular set will be produced. Theorem 6.2 Given phrase structure grammar, G. If the stipulation is made that for each production, p, in G with r symbols in its LHS, p can only be applied to the right hand r symbols of a sentential form generated by G, then L(G) is a regular set. 3523f Given grammar G- VN,VT,P,S> we construct regular tree gram— mar G'=- over (V100) . For each production x1x2°--xn+y1y2---ym in P we let "1 y1 x2 y2 “—€EE> be in P'. G' is a regular tree grammar and it generates root-to-frontier paths in the same way that G generates strings. So {root-to-frontier paths generated by G'}-L(G), there- fore, by Theorem 6.1, L(G) is regular. Q.E.D. The following classification of regular (type 3), context-free (type 2), context-sensitive (type 1) and recursively enumerable (type 0) tree grammars seems apprOpriate. We will say a tree grammar is type i, for 05153, if every production in the grammar, when applied to a tree sentential form, has the effect on the strings formed by root-to-frontier paths of applying a type i phrase structure grammar 94 production. As illustrated by Theorem 6.1, the formalization of regular (type 3) tree grammars fits the classification perfectly. When we write a formalization for type i grammars, with i<3, there is an added complication. Type 3 tree grammar productions are always applied at the end of a root-to—frontier path, at a place where there can be no sub-trees of the nodes involved which are themselves not involved in the production. When applying, say, a type 2 production it can, in general, be applied to a node in the interior of a root- 2 to-frontier path, a node which may have several subtrees. The problem is also present in type 1 and type 0 tree productions, so each of these definitions must account for any subtrees which might be involved. We want our definition of type i, for i<3, tree grammars to be a generalization of the type i phrase structure grammar in the sense that given a phrase structure sentential form, say AbbbCDex we may regard this as a tree with exactly one corresponding root-to-frontier path. The sentential form for the tree in this case will be: A T b" 8?) XA D When a context-free tree production, say ° 3 is 95 applied, it should have the same effect as applying D+Dd in the cor- responding string which would leave AbbbCDdex. The subtree with root e . D ? one level below D (i.e. L ) must be maintained one level below I x 6 giving: e i.“ x l a F In general, subtrees whose roots occur one level below the node 1 being replaced by a tree production must be retained on pg least some of the root-to-frontier paths generated by applying a tree production. If they were not retained it would be equivalent to applying a produc— tion to a string sentential form generated by a context-free phrase structure grammar and eliminating everything to the right of the replacing symbols; this is clearly intolerable in any context-free grammar. With these considerations in mind we are ready to define context-free tree grammars. 6.2 CONTEXT-FREE TREE GRAMMARS W.C. Rounds in (34) has proposed a good definition for context- free tree grammars. In order to discuss his definition we first in- formally introduce that class of tree transducers that was originally called generalized sequential machines (GSMs) by Thatcher(42) and Rounds(35) and has lately been termed top-down tree transducers and investigated by Baker(3), Englefreit(15), Ogden and Rounds(30) and Perrault(3l) among others. We may think of a GSM as a device which 96 serves as a mapping of trees over ranked alphabet, A, into trees over ranked alphabet, B, i.e. GSM: TA+T£. A GSM is a finite-state device; its move function is a function from (old state,input) pairs (where each input has a certain number, the rank of the input, of arguments) into (new state,output) pairs with assignments of each of the input arguments into the output. The inputs will be nodes of trees, the arguments will be the subtrees of the input nodes, the output will be trees. A GSM works on a tOp-down basis, starting at the root of a tree in a given start state. We present an example from Rounds(34) to illustrate this informal discussion. In the following example and throughout the rest of this chapter we use the symbol X X1 X2 X.n to indicate node x and the n distinct subtrees with roots one level below x. X --°,Xn are dummy arguments used to indicate the subtrees. 1’ Example 6.1 Both input and output alphabets are 2. Let 22={p}, £1={o,r}, 20-{#,A}. is an element of T2. If a GSM starts in start state q and has a transition function 97 defined, then the result on t will be: P Other transition functions will then be applied on each of the (state, input pairs (q,p), (q,o) and (q',o). Rounds uses a modification of the GSM definition as his definition of context—free tree grammar. Context-free tree grammars will map trees over a ranked alphabet into trees over the same alphabet. He regards (state,input) pairs as being nonterminals. He regards the transition function as actually defining a production. He liberalizes these prod- uctions so that nonterminal (state,input) pairs do not have to be on 98 the frontier of the RHS of a production, rather they may be anywhere on the RHS but only one (or zero) per branch is permitted. Each sub- tree must appear below at least one of the new nodes. This definition seems more restrictive than desired in the sense that it only allows productions to be applied from the top to the bot- tom of the tree, i.e. from left to right on the root-to—frontier paths. This restriction is not important-however, since we know that for every phrase structure context-free derivation there is an equivalent left to right context-free derivation. Other definitions for context-free tree grammars are certainly possible. They should maintain the idea of applying context-free prod- uctions on root-to-frontier paths during a derivation. Other non- equivalent definitions, however, seem to be more restrictive in ways that restrict the power of productions and yet add nothing. Example 6.2 Context-free tree productions can produce frontier {anbncnl nil} which is a context-sensitive language. Given productions: S S 3 3d d ”9AA 9 9 a b X X C l 2 X1 X2 Y Y S S w—>(>\ /\_>. x a b c X X 1 2 X1 x2 99 Derivations will be of the form: S Ei::>d d :A x A d d a b c It is clear that context-free tree grammars, as just defined, are more general than regular tree grammars, yet every regular tree grammar (in expansive form) will be a context-free tree grammar. Furthermore, as illustrated by the previous example, the addition of context-free productions makes the grammars more powerful in terms of the frontiers they can produce.1 It seems natural to ask whether the type 2 tree 1It was shown in Chapter 2 that the frontiers of languages generat- ed by regular tree grammars are exactly the context-free langu- ages, but the frontier generated in Example 6.2 was context- sensitive. 100 grammars may have exactly the type 1 string languages as their front- iers. The answer is no, the frontiers will be a prOper subclass of the context—sensitive languages. To establish this fact we need a bit of background. Indexed grammars were defined by A.V. Aho in (1). He showed that the class of languages generated by indexed grammars prOperly contains the class of context-free languages and is properly contained in the class of context-sensitive languages. We now restate a theorem from Rounds(34). Theorem 6.3 (Rounds) For every context-free tree grammar, G, one may effectively find an indexed grammar, G', such that the frontier generated by G‘is exactly L(G'). Thus, although not every language generated by indexed grammars is the frontier generated by a context-free tree grammar, the converse is true and the class of these frontier languages must be properly contained in the context-sensitive languages. 6.3 CONTEXT-SENSITIVE TREE gm we now consider the question of what is an appropriate definition for context-sensitive tree grammars. We should include those that can apply context-sensitive string substitutions on the root-to-frontier paths in sentential form trees. Since there are several (probably equivalent) ways of defining these tree grammars we will again only describe them, rather than formally define them. Given a string, say x °--xn, we must be able to apply a context- 1x2 sensitive type substitution to it. If this string is thought of as a root-to-frontier path 101 n we can apply, say xjxj+l...xj+r + xkxk+l...xk+r+i for 129’ Thus productions of the form x x j k x 3+1 $ x1t+1 xj +r L l xk+r+i must be allowed. Also, some method of reassigning subtrees must be allowed. It is important, however, that no subtrees be assigned to nodes at a higher level than their original level; this would have the effect of short- ening some root-to-frontier paths, i.e. applying distinctly non-context- sensitive productions. So, the following description of possible context—sensitive tree productions seems appropriate. We use superscripts on dummy auguments to indicate what node they originally are subtrees of, and subscripts to indicate which subtree. Productions may be of the form: (see next page) lOZ Each of the subtrees represented by {x;|1315;, lfiJEPr} must be duplicated at least once as a subtree of some yk. Furthermore, the subtree represented by X1 j to prevent shortening any root-to-frontier path. The shortening of must be duplicated on a yk with kii in order strings can only occur with recursively enumerable productions, not context-sensitive. Note that every contextsfree production will also be allowed as a context-sensitive production. Example 6.3 The following context-sensitive tree grammar was con- structed by first considering the context-sensitive phrase structure grammar with productions: P' = { S+aBSc S+ch Ba+aB Bc+bc Bb+bb } n n These productions will give {anb c Inzl}. We construct tree grammar G- over . 103 The productions in P will essentially model those in P', when applied on root-to-frontier paths. P={(a)¢,s_>1 A sample derivation is: (continued next page) 104 a Q a Q B l a d» 4 (C),( B ./ a a o a 3 . b 4 b _ 4 C 4 c 3 c 3 3 In general the frontier of the trees produced will be {ln2n3n4nlnil}. Context-sensitive (and recursively-enumerable) tree grammar concepts may be applicible to more general graph grammars. This is discussed in Chapter 7. One might ask what class of languages will be generated as the 105 frontiers of context—sensitive tree productions. Although the complete answer is not known at this time it is easy to see that at least all the context-sensitive languages can be generated. If we have a context-sensitive phrase structure grammar which produces sentential form ABchBbcA we may represent this as the following tree: A B 2? c E c X B E b ._ c A If the context-sensitive grammar has production Bb+AAAbc we may include corresponding tree production: Productions of this type will eventually produce frontier 106 alaZ-°°an for a similar string a °°°an generated by the phrase 1°‘2 structure grammar. 6.4 RECURSIVELY ENUMERABLE TREE GRAMMARS The definition of type 0 tree grammars should be the same as that for type 1 except that here it will be legitimate for subtrees to be assigned (by a production application) to a node at a lower level (nearer the root) than where they occurred before the production was applied. Example 6.4 We examine an interesting recursively enumerable grammar, one that can completely model a Turing machine.2 We assume the Turing machine to be modeled is in the following form: Turing machine T V ace Axgnx‘lx Ct. AA on: Io In the starting configuration the read head is on the leftmost input tape cell. The input, or starting tape configuration, is XO,X1,'°°,Xn. Symbols to the left of XO are null input symbols, as are those to the right of X“. We assume T starts in state so. The output/move function for T will be stated as a partial function with finite domain of the form: 6:SXI + SXOXM with: 2The idea for modeling a Turing machine with a tree grammar was suggested to me in a personal conversation with W.C. Rounds. 107 S = states of T I - read symbols (the input symbols) 0 - output symbols M = {L,N,R} L means the read head will move to the left one cell on the tape, R means it will move to the right one cell and N means no move. If no move function is defined for a (state,input) pair, T halts. Since for every nondeterministic Turing machine there is an equivalent deterministic one, we assume WOLOG that T is deterministic. The start tree for our grammar, G, will be based on a starting configuration for T. This will be: The alphabet for G will be IL/OtJS\»[s_1}. Productions in G will simulate 6 in the following manner: may be thought of as T in state 31 with the read head on cell containing y2 with yl in the cell to the left and y3 in the cell to the right. C (after applying one extra production 108 to get started) will simulate T by allowing exactly one production to be applied at each point where T would have a 6 function applied. If T halts on input Xo°°-Xn, G will have no further productions possible to apply- 8 3 X1 x1+1 If we have tree produced by G with no further X1-1 xi+2 X1 Xh l A applicible productions then the final tape configuration for T will be given by X1X2°°°X1_1X1X1+1°-°Xm with the read head on cell containing X If T does not halt on input Xo-HXn then G will never cease to i+1' have productions to apply, and after applying the (i+l)th production the tape configuration for T after its (i)th move function can be read by the procedure described above. New to define the productions in C. First the set of productions to "get started". Define: 8-11 8 p x ._€EE> for all x in I. : x A l A x1 x1 Exactly one production from PA will be applied at the beginning of each derivation in C. Now we define the productions that simulate 6: i J PB: x .__E;> x1 x1 x2 81 Sj x ._—;E;>y x1 x2 x1 81 8 z x -———E;> x1 x1 X2 It is easy to see that a production from P For all 6(si,x)=(sj,y,N) in T. For all 6(si,x)=(s in T. j.y.R) Fox all z in OtJl, For all 6(Siax)=(s in T. j.y.L) B can only be applied if an equivalent move function in T would be applied. Note that in productions in PB we are taking advantage of the fact that recursively enumerable tree productions allow the shortening of root-to-frontier paths. The frontier languages generated by type 0 tree grammars are ex- actly the type 0 languages. By altering our Turing machine simulation to let the terminal symbols hang one level below place holder nodes, as was discussed for contextbsensitive tree productions, we can generate any type 0 language generated by a Turing machine. CHAPTER 7 SUMMARY AND RECOMMENDATIONS 7. 1 SUMMARY The study of multidimensional automata, specifically tree automata theory, holds a great deal of promise for automatic pattern recognition. The first use of tree automata for this purpose was shown by Fu and Bhargava in (20). This thesis uses concepts from graph theory to pro- vide significant extensions of (20) in several directions. It also shows a number of theoretical properties of tree grammars and automata. It provides another step towards developing automatic methods for manipulating graphs. Chapter 1 presents a general introduction to the pattern recognition problem and to the study of multidimensional automata. A brief survey of applications of multidimensional automata is included. Chapter 2 summarizes the pertinent work of other researchers in tree automata theory. The rest of this thesisdraws heavily on the theoretical bases provided here. Most of this theory was developed by Brainerd and Thatcher. In Chapter 3 the graph based structure for u-trees and d-trees is developed. Regular u-tree grammars and u-tree automata are defined and their relationships are shown. The various properties of these systems are investigated. 110 111 In Chapter 4 we show how these grammars and automata can be effectively used to provide an automatic (once the primitives are found) method for pattern recognition. Many of the terms and concepts inherent in any syntactic pattern recognition scheme are formalized here for the first time. Important differences between circular and noncircular primitive systems are noted. A partial characterization for sets of patterns recognizable by this system is given. Several examples and special cases are worked. L Chapter 5 deals with the question of simplifying the machines to classify patterns. It is found that a deterministic pushdown automaton will suffice rather than a u-tree automaton. This makes the pattern recognition scheme a more practical method since deterministic pushdown automata are easy to simulate using digital computers. Also in this chapter.we explore the properties of the frontier languages generated by regular u—tree grammars. In Chapter 6 a basis for providing a characterization for various types of tree generating grammars is prOposed. Regular, context-free, context-sensitive and recursively enumerable grammars are described using this characterization and the frontier languages generated are investigated. 7.2 SUGGESTIONS FOR FUTURE WORK First we discuss a very general problem. This thesis has develop- ed a graph generation and classification scheme for certain kinds of acyclic rooted digraphs. Some of the same methods (particularly the generative ones) should be applicible to graphs in general. Consider, in particular, the recursively enumerable tree grammars described in 112 Chapter 6. What would be the result of applying generative procedures of this form to unrooted graphs (with appropriate conventions as to what paths might replace root-to-frontier paths)? How about applying context— free productions? The automata defined here depend heavily on the abil— ity to conveniently represent tree—like structures in linear form and on their being rooted. What kinds of automata might one define that do not use these properties? The investigation of questions along these lines seems worthwile. Another problem that arises with any application involving formally defined grammars is the question of grammatical inference: how can an appropriate grammar be developed from the observation of sentences in a language? In the present case the problem is twofold: given a pattern set of interest we need to find an appropriate primitive system to represent the patterns as well as an appropriate regular u-tree grammar using the primitive types. What sort of procedures might one carry out in order to define these? It is possible that u-tree grammars and u-tree automata may be good formalizations to represent the growth and manipulation of and/or decision trees and search trees of the type shown in Nilsson(29). The idea of unorderdness is inherent in these trees so perhaps something like this would be fruitful. Some relaxing of the u-tree grammar definition might be necessary in order to allow for an infinite number of node labels in such applications. In the various theorems and results presented in Chapter 4 we have partially characterized the sets of patterns that can be recognized by the pattern recognition method outlined. If this method proves practical for pattern recognition applications perhaps a more complete 113 ' characterization can be developed. One of the practical considerations that has hindered the appli- cation of syntactic techniques to real pattern recognition problems is the noise and imperfect data that arises with real data. Ellis in (14) has proposed several models for probabilistic tree automata. Perhaps a useful probabilistic model could be constructed for u-tree grammars and automata so that small imperfections in data could be dealt with. Also as a practical consideration the deterministic pushdown autom- aton construction outlined in Theorem 5.4 produces a very large PDA. Many of its move function rules are redundant and/or unnecessary. It may be possible to refine it somewhat, i.e. to simplify the move function. The context-free, context-sensitive and recursively enumerable tree grammars described in Chapter 6 immediately give rise to the question of what types of automata might be appropriate to recognize the tree langu- ages generated. Just as regular tree grammars and pseudoautomata can deal with the same sets of trees there are probably distinct automata models that provide acceptors for each of the classes of tree languages generated by these grammars. The frontier languages generated by each of these types of tree grammars is shown in the following table: .IXEE.2£ Tree Grammar Type Frontier Language Generated Regular Context-free Context-free Indexed Context-sensitive ? (at least context-sensitive) Recursively enumerable Recursively enumerable It would be interesting to discover exactly what class of frontier languages can be generated by the context-sensitive tree grammars. BIBLIOGRAPHY BIBLIOGRAPHY (l) Aho, A. V.. I'Indexed Grammars - an Extension of Context-Free Gram- ars", I_§___EE Conference Record, Smosium 9_n Switching___ and A__uto- mata Theo , pp. 21.-31,1937. (2) Aho, A.V. and J. D. Ullman, "Translations on a Context-Free Granar', Conference Record _o_f First LC! Smosium 2g Theon_ of Cfluting. pp. 93.112. 1939.- (3) Baker,B.S.. “Tree Transductions and Families of Tree Languages: Abstract", to be included in the author's Ph.D. dissertation, Harvard University (in preparation). (a) mum, M. and C. Hewitt, ”Automata on a 2-Dimensional Tape”, IEEE Conference Record 9_f Ei hth Annual W__ on M_ and Automata Theog, pp. 155-1 0, 137. (5) Brainerd, W.S.. “The Minimilization of Tree Automata“, Information and Control, Vol. 13, pp. 4815-491, 1968. (6) Brainerd ,W.S., ”Semi-Thus Systems and Representations of Trees“, IEEE Conference Record of Tenth Annual Smsium on Stitching an___<_i_ Au___t_.____omata Theog, pp. ‘2561'2'Eu, I%9. (7) Brainerd,W.S., “Tree Generating Regular Systems“, Information _a_n_d C__9_____ntrol, Vol. 14, pp. 217-231, 1969. (8) Buchi,J.R. and c.c. Elgot, “Decision Problems of Weak Second-Order Arithmetics and Finite Automata", Abstract 553-112, N_9_____tices Amer. Ma__t_h__. S_o_g., Vol. 5, p. 8%, 1958. (9) Buchi, J.R., Weak Second-Order Arithmetic and Finite Automata“, University of Michigan. Logic of Computers Group Technical Report, September 1959: Z.Ma Math. Logik Grundlaggn Math., Vol. 6. pp. 66-92. 1960. (10) Buttleman, H.W.. '01 Generalized Finite Automata and Unrestricted Generative Grammars“, Proceedings of Third Annual A_C__M S 9_____sium g_n Theog__ of Cmuting, pp. 35-77. I9fi. (ll) ChomskgéN" "Aspects of the Theory of Syntax“, The M.I.T. Press, 1 114 115 (12) Doner.J.E., "Decidability of the Weak Second-Order Theory of Two Successors', Abstract 6515-1468, Notices Amer. Math. Soc., Vol. 12. p. 819, 1965; erratum, ma". ""'Vo1. 13".' 'p". 53". 1563. (13) E1got,C .C., ”Decision Problems of Finite Automaton Design and Re- lated Arithmetics', University of Michigan, Dept. of Math- enatics and Logic of Computers Group Technical Report, June 1959: Trans. Amer. Math. _S_o_c_., Vol. 98, pp. 21-51, 1961. (11+) Ellis,C.A., “Probabilistic Tree Automata“, Information _a_n9_ Control, Vol. 19, pp. 1401-1516, 1971. (15) F.ngelfriet,J.. 'Both and Tepdown Tree Transformations—A Com- parison", Tech. Report, Technische Rogeschool Twente, Neth- erlands, July, 1971. (16) Fischer,H.J.. "No Characterizations of the Context-Sensitive Languages', IEEE Conference Record _o_i‘_ Ei hth Annual Smosium 3g Switching-5E. Tum—T “rt—cog, pp. 17352136 (17) Fu,K.S., “m Syntactic Pattern Recognition and Stochastic languages", Purdue University School of Electrical kgineering Tech. Re- port No. 71.21, January, 1971. (18) m.x.s. and P.H. Swain, “a: Syntactic Pattern Recognition“, Soft- ware Eh ineer , Vol. 2, J .T. Tou editor, New York, Academic Press, pp. 15%83, 1971. (19) N,K.8. and T. Huang, “Stochastic Granars and Languages“, Lug. Journal 21; C ter 9231 Info. Sciences, Vol. 1, No. 2, pp. 133359. June. I975. (20) Fu,K.S. and B.l(. margava, 'Tree Systems For Syntactic Pattern Recognition", Presented at the Computer Image Processing and Recognition Symposium. University of Missouri - Columbia, August zit-26, 1972. (21) Harary,F., “Graph Theory", Addison—Wesley Publishing Company, 1969. (22) Hapcroi‘t,J.E. and J.D. Ullman, "Formal Languages and Their Relation to Automate", Addison-Wesley Publishing Company, 1969. (23) Knuth,D.E.. ”Fundamental Algorithms, Vol. 1, The Art of Computer Programming“, Addison-Wesley Publishing Company, 1968. (21+) Levy.L.S., “Tree Adjunct, Parenthesis and Distributed Adjunct Brams', Theo 21; Machines and Com tations, edited by z. Kohavi and A. Pet, Academic Press, I971. (25) (26) (27) (28) (29) (30) (31) (32) (33) (34) (35) (36) (37) (38) 116 Levy,L.S. and A.K. Joshi, "Some Results in Tree Automata", Proceed- ings g£_Third Annual ACM Symposium 93_Theory of Computing, pp. 78-83, 1971. Martin,D.F. and S.H. Vere, "0n Syntax-Directed Transduction and Tree Transducers", Second Annual ACM §Xfl£9§i£flz92.The°rX g£_ Computing, pp. 129-135, 1970. Meyers,W.J., "Linear Representation of Tree Structure", Proceed— ings 2f Third Annual ACM Symposium gn_Iheory 9: Computing, 1971. Miller,W.F. and A.C. Shaw, "Linguistic Methods in Picture Process- ing — A Survey", AFIPS Conference Proceedings, Vol. 33, Part 1, FJCC, 1968. Nilsson,N.J., "Problem - Solving Methods in Artificial Intelligence", McGraw-Hill Computer Science Series, 1971. Ogden,W.F. and W.C. Rounds, "Compositions of n Tree Transducers", Proceedings 9: Fourth Annual ACM Symposium 9g Theory 9£,Com- puting, pp. 198-206, 1972. Perrault,C.R., "Tree Stack Transducers and Finite State Target Languages", University of Michigan Dept. of Computer and Communication Sciences Tech. Report No. M-26, March, 1973. Peters,P.S. and R.W. Ritchie, "Context-Sensitive Immediate Consti- tuent Analysis: Context-Free Languages Revisited", Proceedings ACM Symposium 22_Theory 9f Computing, pp. 109-116, 1969. Pfaltz,J.L. and A. Rosenfeld, "Web Grammars", Proceedings Joint International Conference 93 Artificial Intelligence, Wash- ington, D.C., pp. 609-619, 1969. Rounds,W.C., "Context-Free Grammars on Trees", ACM Symposium on. Theory g£_Computing, pp. 143-148, 1969. Rounds,W.C., "Mappings and Grammars on Trees", Mathematical Systems Theory, Vol. 4, No. 3, pp. 257-287, 1970. Rounds,W.C., "Tree-Oriented Proofs of Some Theorems on Context— Free and Indexed Languages", Second Annual ACM Symposium 93 Theory gf_Computing, pp. 109-116, 1970. Savitch,W.J., "Maze Recognizing Automata", Proceedings gf Fourth Annual ACM Symposium 92 Theory 9§_Computing, pp. 151—156, 1972. Shaw,A., "A Formal Picture Description Scheme as 3 Basis for Pic- ture Processing Systems", Information and Control, Vol. 14, pp. 9-52, 1969. A . [P .J a A.” '1 .‘m A‘l 117 (39) Thatcher,J.W.. “Characterizing Derivation Trees of Context-Fae Grammars through a Generalization of Finite Automata Theory" , Journal at C ter an} Sntem Sciences, Vol. 1, No. it, pp. (40) Thatcher,J.W., I'Generaliaed Finite Automata Theory with an Appli- cation to a Decision Problem or Second-Order Logic', Mathematics mt... Theog, Vol. 2, No. 1, pp. 57-81, 1968. (41) Thatcher,J .11., 'Transfomations and Translations from the Point of View of Generalized Finite Automata Theory', Conference ms. 191‘. m 9.". M 2!. W. pp. 139-555. 1969. (#2) Thatcher,J .W. . 'Generalisedz Sequential Machine Maps”, ournel of Cmter 2:3 Snten Sciences, Vol. 4, No. 4, pp. 3%9-537. E70- (93) Thatcher.J.W.. “Tree Automata - An Informal Survey', Unpublished P‘P‘re 197°e 1 8 8563 L“ “7 WI!“ Nm Ono E WWII \ 1