‘1 —— _— _— _— s _— —_ _— —— .—— —_ —— .— —— _— 104 663 THS Ill'llllllllllllll This is to certify that the dissertation entitled INDEPENDENT SETS IN (r,s)—TREES presented by Junghee Cho has been accepted towards fulfillment of the requirements for PhoDo degree in Mathematics 22x7 . 07M C] Major professor Date July 15, 1992 MS U is an Affirmative Action/[q uul Opportunity Institution 0-12771 LIBRARY M'CMQan States L University PLACE IN RETURN BOX to remove thin checkout from your record. TO AVOID FINES retum on or before date due. DATE DUE DATE DUE ' DATE DUE I i L j l MSU Is An Affirmative Action/Equal Opportunity Institution cmwotodtnoms-M :_._._..—_._...__-_._-_...r~._-_._r_-_... _._J.'_ .- '_ _ —.:;.-t.ur .. INDEPENDENT SETS IN (7*, s)-TREES By J unghee Cho A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY Department of Mathematics 1992 y ABSTRACT INDEPENDENT SETS IN (r,s)-TREES o / '5; BY J unghee Cho An (r,s)-tree is a connected, acyclic, bipartite graph with r light and 3 dark vertices. In this thesis, three variable, exponential generating functions are used to find exact values of the expected value p(r,r) of the vertex independence number flo(T) of (r,r)-trees T for 1' up to 19. Also the probabilistic method is applied to find bounds for 30(T) and for the edge independence number fl1(T) for almost all (n, n)-trees. These results compare favorably with corresponding bounds for random bipartite graphs. ACKNOWLEDGMENTS I would like to express my gratitude to Professor Edgar M. Palmer, my dissertation advisor, for his patience and constant encouragement, and to Professor Bruce Sagan, for his many helpful suggestions. I also thank Professors Jeanne W. Kerr, Wei-Eihn Kuan and Mary J. Winter. ii Contents LIST OF TABLES . ii LIST OF FIGURES iii INTRODUCTION 1 1 ENUMERATION OF INDEPENDENT SETS IN (r,s)-TREES 4 1.1 Generating Functions ........................... 4 1.2 Functional Relations ........................... 8 1.3 Recurrence Relations and Numerical Values .............. 10 1.4 Expectation of the Vertex Independence Number ............ 16 2 ASYMPTOTIC BEHAVIOR 23 2.1 The probabilistic method ......................... 23 2.2 Vertex independence number for almost all (n,n)-trees ......... 26 2.3 Edge independence number for almost all (n,n)-trees ......... 32 2.4 Comparison with random bipartite graphs ............... 35 BIBLIOGRAPHY 40 iii List of Tables 1.1 1.2 1.3 1.4 Number of trees of different types .................... 14 Expectation of vertex independence number ........ O ...... 17 Expected independence number by proportion when r = s ...... 21 Total of the vertex independence numbers ............... 22 iv List of Figures 1.1 A diagram for (7,2)-trees ......................... 15 INTRODUCTION Graphical enumeration, especially the enumeration of trees is an important topic in graph theory with a long history as well as many applications to various fields, especially chemistry and computer science. Caley first derived a formula for the number of labeled trees [C89]. Since that time the literature has grown substantially. There were 162 references in Moon’s definitive monograph Counting Labelled Trees [M70] and we suspect that the current number of articles published on this subject is at least 1,000. Included here are papers by Moon and Meir which focus on the independence numbers of trees of various types. In [MeM73] they derived a formula for the expected vertex independence number of a random labeled tree. They were also able to determine the asymptotic value of the expectation. They continued to explore this problem for recursive trees, planted plane trees and tri-valent trees in [MeM75] and found exact and asymptotic formulas for the expectation in each of these cases. This thesis is also concerned with the exact and asymptotic behavior of the independence number of trees. Here are the few graph theory definitions that we require. A graph G consists of a finite nonempty set V of vertices and a set E of edges which are unordered pairs of distinct vertices. The cardinality of V is called the order of C while the cardinality of E is called the size of 6'. An edge e joining the vertices of u and v is denoted by no and the vertex u and v are said to be adjacent. The degree of a vertex v is the number of vertices adjacent to v. A walk in a graph G is a sequence of vertices 101,102, ...,wm such that w,- is adjacent to w,-+1 for i = 1 to m — 1. A path is a walk in which no vertices are repeated. A cycle is a walk with at least 3 different vertices that has no repeated vertices except the first and last. A graph is connected if every pair of vertices is joined by a path. A bipartite graph is a graph with two kinds of vertices, namely light or dark, and every edge in the graph has one light and one dark vertex. A tree is a connected acyclic graph. A rooted tree has one vertex, called the root, which is distinguished from the others. A subset S of vertices of a graph is called independent if no two vertices of S are adjacent. The vertex independence number of a graph G is the number ,Bo(G) of vertices in any largest independent subset of vertices of G. A subset E of edges of a graph is called independent if no two edges of E are adjacent. The edge independence number of a graph G is the number [31(G) of edges in any largest independent subset of edges of G. An (r,s)-tree is a connected, acyclic, bipartite graph with r light and 3 dark vertices. Let F(r,s) denote the set of all (r,s)—trees with r light vertices labeled 1, ..., r and 3 dark vertices labeled 1, ..., s. We give I‘(r,s) the uniform probability distribution. Let u(r, 3) denote the expected value of 50(T) for (r, s)-trees T in I‘(r, 5). Our first objective in this thesis is to determine p(r,s) for small values of r and s. The main tool here is an exponential generating function in three variables. Fol- lowing Moon and Meir [MeM73], we make use of two special types of rooted trees, but our sample space is more complicated because of the two sets of vertices under consideration. Certain operations on these generating functions lead to functional equations from which recurrence relations can be derived. From these, in turn, exact values can be computed. Secondly, we wish to estimate the asymptotic behavior of u(r,s). For this task we use the probabilistic method, which originated in the work of Erdbs [Er47], and has been well documented in several books, such as [3085], [Pa85] and [AlSE92]. To implement the method, we follow the technique of [Pa92], which employed the matrix-tree theorem to estimate £1 for random superpositions of (n, n)-trees. We are able to establish bounds on both ,80 and [31 for almost all (n,n)-trees. Finally these results are compared with a similar treatment of random bipartite graphs. Chapter 1 ENUMERATION OF INDEPENDENT SETS IN (r,s)-TREES 1.1 Generating Functions Our aim is to determine the expected value of the independence number of (r, 3)- trees. As is customary we begin with rooted trees. Let I" (r,s) be the set of all (r, s)-trees obtained by rooting each tree in I‘(r, s) at a vertex. Thus 11"(73 3)| = INT, 8)|(" + 8)- Let ht,” denote the number of rooted trees T in I"(r,s) such that 30(T) = k. The associated exponential generating function (egf) is co r+s xr ya H = H(x,y,z) = Z (Z hk,,.,,z")-—'-T. r+s_>_1 k=1 1‘. S. Austin[A60] and Scoins [862] found that the cardinality of the sample space F(r, s) which is given by |F(r, s)| = r"ls"‘l. From this it follows that r+s Z hkm, = r“‘s"1(r + s), k=1 4 which is the number of rooted (r,s)-trees. Hence on setting 2 = 1 in the egf for H we have °° ._ .. x'y’ h(:r,y)=H(-r,y,1)=r+X;1r ‘8 l("Hm-g- Ordinarily it is easy to obtain functional relations for generating functions that enumerate rooted trees with special properties. But this case is complicated by the fact that we are dealing not only with several variables but also there are two types of rooted trees with a given independence number. These two types are described in a lemma of Meir and Moon [MeM73] that holds for any rooted tree. Let T denote a tree that is rooted at some vertex v. If every set of ,Bo(T) independent vertices of T contains v we say T is of type I . On the other hand, if at least one set of flo(T) independent vertices of T does not contain 12, we say T is of type II . If we remove the root 1: of T we obtain a (possibly empty) collection of rooted trees U1, ..., U,- whose roots were originally joined to v. These are called the branches of T at v and the following lemma relates the independence number of T to the independence number of its branches. Lemma 1.1.1 Let T be a type I tree rooted at v. Then each of the rooted trees U1,...,Uj is a type II tree, and no") = 1+ from». (1.1) i=1 Proof. Suppose T is a type I tree rooted at 1). Let W be a maximum independent set of vertices of T. Since T is a type I tree, u must belong to W. Therefore the roots, u1,...,u,- of the trees U1,...,U,-, respectively, are not in W. We assert that for each i = l to j, W H V(U,) is a maximum independent set in the branch Ug. It follows that each U,- is a type II tree. Suppose our assertion is false for some i. Then there is a set S of independent vertices in U,- such that |S| > IW n V(U.-)I. There are two cases to consider. (a) u, ¢ 5'. Then we form the set W' = (W \ (W n V(U,-)) U S, of independent vertices in T. Clearly IW’I > IW], a contradiction. (b) u,- E 5'. Here we form the set W' = (W \ ((W 0 V(U,~)) U {0}» U 5, which is independent in T. But |lV’| Z |VV| and W’ does not include 1). So this contradicts the assumption that T is a type I tree and also establishes the formula (1.1) for the vertex independence number of a type I tree. D The case for type II trees is covered by next lemma whose proof is similar to the one above. Lemma 1.1.2 Suppose T is a type II tree rooted at v. Then at least one of the rooted trees U1, ..., U,- is a type I tree and hence 30(T) = ifl0(Ui)- (1.2) i=1 Proof. Assume that we have a tree T rooted at v, in which each branch U,- rooted at u,- for i = 1 to j is a type II tree. Let W be a maximum independent set of vertices of T and suppose the root 2) is not in W. But since U1 is a type II tree, it has an independent set .S' of vertices, at least as large as W (1 V(U1) which does not include ul. Hence we can form the set W’ = (W\ (W 0 WW» U 5, 6 which is an independence set of vertices of T, at least as large as W and does not include u]. We can treat the other neighbors of v in a similar fashion and arrive at a maximum independent set W”, which does not contain any neighbor of 2). Hence W” U {u} is a larger independent set than W, a contradiction. Cl Now the lemma can be applied to express the exponential generating function H (x, y, z) in terms of generating functions for the two types of trees. Let 9k,” and fkm, denote the numbers of rooted (r,s)-trees with independence number k of type I and type II, respectively. Define the generating functions for the two types by 00 r+s CC G: G(x, y,z = Z: (ngnknz )—!--y—!, r+s_>_1 k=1 and 00 r+s (tr 3/8 F: FCC y,z = Z (ka,r,ksz) )r—!—!- r+s_>_1 k=1 Obviously hk,r,s = gk,r,s + flea-,3, and H = G + F. We require notation for all of the different kinds of rooted trees. Let H L and H D denote the generating functions for (r,s)—trees rooted at a light and a dark vertex respectively. Similarly for G and F, we can define GL, GD and FL, FD. Then H=HL+HD, G=GL+GD, F=FL+FD and HL = 0;, + FL, (1.3) HD=GD+FD. (1.4) 1.2 Functional Relations Now we establish the functional equations that relate these generating functions. Lemma 1.2.1 The generating functions GL, GD, FL, FD, HL, and HD for the vari- ous types of rooted (r,s)-trees, satisfy the following relations CL = zxeFD (15) Go = 2316”" (16) FL=x(eGD —1)eFD (17) F0 = 31(66" -1)6FL (1 8) and HL = zxeFD + x(eGD -—1)eFD (1.9) HD = zyeFL + y(eGL —1)eFL (1.10) Proof. Suppose T is a type I tree rooted at a light vertex, then by Lemma 1.1.1, each of the rooted branches U1, ..., U ,- obtained by removing the root of T is a type II tree which is rooted at a dark vertex. The generating function for the families of those type II trees, is Fig/j! forj = 0, l, Therefore the generating function, GL for T is; 00 CL = 2.1: E: Fig/j! = zxeFD. “=0 The factor x is present to account for the root of T and the factor z is present because of equation (1.1) of Lemma 1.1.1. Suppose T is a type II tree rooted at a light vertex, then by Lemma 1.1.2, at least one of the rooted branches U1, ..., U,- must be a type I tree which is rooted at a dark vertex. Therefore the generating function F L for T has a factor of 8 gab/.71: CGD — 1, i=1 the generating function for non-empty families of type I trees . There may or may not be some type II trees among the branches U1, ..., U ,- and the generating function for (possibly empty) families of type II trees rooted at dark vertices is eFD, as before. The observations imply that FL = x(eGD —1)eFD. The factor x is present to account for the root of T, but because of equation (1.2) of Lemma 1.1.2, the factor 2 is not included here. Equations (1.6) and (1.8) follow from (1.5) and (1.7) respectively by symmetry and equations (1.9) and (1.10) follow from (1.5) through (1.8) combined with (1.3) and (1.4). D The formulas in this lemma can be used to calculate both the number of rooted trees of the various kinds with r + 3 vertices as well as the sum of the independence numbers of these trees. To handle the former task, we set 2 = 1 in the various egf’s introduced earlier and we also simplify the notation as follows. Let g=g(r.y)=G(ar,y.1), f=f(r,y)=F(-r,y,1), h=h(x.y)=H(x,y,1) and similarly 9L = 91.0133!) = GLOW/,1), 90 = 90(1331) = 0003,31, 1), with the same notation for fL, f1) and hL, hp. Furthermore, it follows from (1.9), (1.10) and the equation H = HL + Hp that h = xehD + yeh". This completes the treatment of the relevant functional equations for the several types of rooted trees. 1.3 Recurrence Relations and Numerical Values First we introduce notation for the coefficients of the simplified egf’s of the previ- ous section: 00 xr ya fL = Z fL(r,s)F?9 r+szi ' ' oo xr ya 90 = 2 gD(r,s)-7j;fa r+s>1 ' ' 00 (I), ya f0 = Z fD(r,s)Fya r+821 ' ' — 00 xr y: — 00 xr ye hL —- 4:; hL("’)r—!§I’ hD — 2 1100.033, 1‘ s_1 r+s_>_1 so that r+s Z ht,” = hum) + hD(r,s)- k=l The following lemma provides recurrence relations for the coefficients of the gen- erating functions above . Lemma 1.3.1 Let r and s be non negative integers. Then 9L(1,0)=1, (1.11) 914,“) = 0, for r = 0,1 and 3 > 0 (1.12) and forr >1, r—i r 3 guns) = z ( i ) ( - )gL(i,j)fD(r-i.s—j)a (1-13) r —l J where the sum runs over all i, j such that 0 < i < r and 0 S j S 3. Also fL(1.0) = 0 and fur”) = r for r = 0,1 and s > 0, (1.14) 10 and forr >1, r — i r s fL(T.-9) = Z r _ 1 ( i ) ( j ) [fL(l,j)(gD(T-f,8-j) + fD(r—i,s-j)) + gL(i’j)gD(r-iv’"j)]’ (1.15) where the sum runs over all i, j such that 0 < i < r and 0 S j S s. Proof. The initial values in (1.11) and (1.12) are easily found by considering the tree diagrams. On setting 2 = 1 in equation (1.5), we find g, = are”, (1.16) and on differentiating this equation with respect to x we have 8.9L _ In fDafD Ec— — 6 +1“: 03' (1'17) On solving (1.16) for em and substituting the result in (1.17), the latter becomes 69L _ 0ft) Finally, using the definitions of g1, and f1), we can compare coefficients on both sides of (1.18) to obtain (1.13). Again, the initial values in (1.14) can be found easily from the tree diagrams. Next we verify (1.15). On setting 2 = 1 in equation (1.7), we find f1. = 2(e” —1)e’D, (1.19) and on differentiating this equation with respect to x we obtain afL_ 9 f In 90690 —a-$——.(eD—1)eD+xe e 0x a; (1.20) + x(e9” —1)efD 8 11 which leads to aft, _ 690 6ID 690 3-67-11— fL$('5;+-5';)+9L$-5?- (1-21) By comparing coeffients of both sides of (1.21) we obtain (1.15). D There is also a combinatorial proof of this theorem that was observed by B.Sagan, involving an extra root. For example to prove (1.13), consider a type I , (r,s)-tree T with a root and an extra root w both in light vertices. Let U be the branch of T which contains w. Next we deform T into two parts, T \ U and U, then identify them as follows: :r .__. (T\U,U). (1.22) For each fixed extra root of T, gum) is the number of trees in the left side of (1.22) and there are (r — 1) ways to label the extra root. Therefore for r > 1, (r — 1)gL(,,,) is the number of trees T in the left side of (1.22). On the other hand, T \ U is a type I tree by Lemma 1.1.1 and counted by 9L(i.j) for some i, j up to labels and U is a type II tree counted by (r — i)fD(,_,-',_,-) up to labels. Of course ( : ) ( j ) accounts for labeling. Hence the right side of ( 1.22) is counted by E ( : ) ( 8' )gL(i.j)(r _ z.)fD(r-i,s—j)- (1.23) O 1, (r - 1)fL(,.,,) is the number of trees T in the left side of (1.22). But this time T \ U can be either type I or type II. If T \ U is a type I tree, then U must be a type I to make T a type II. And (r — i)gL(,-,,-)gp(,_,-,,-j) will count those trees up to labels. On the other hand if T\ U is a type II tree, then U can be either a type I or a type 12 II. And (r — i){fL(;,j)(gD(r—.',.—j) + fD(r_.',,_j))} will be the number of these trees up to labels. Hence (1.15) follows. Observe that Lemma 1.3.1 serves to calculate values of 9D(r.a) and fD(,,,) because of the symmetric relation between light and dark. Specifically gD(r,s) = gL(s,r) and fD(r,s) = fL(s,r)' Now we can find the number of trees of different types for small values of r and s. The computations begin with the boundary values in (1.11), (1.12) and (1.14). We have used the lemma to count trees for all r and s S 19. We include in Table 1.1 the dataforr+s=9andr+s=10. The last column, 22:: him,” is obtained by summing all prior columns, i.e. r+s Z him-,3 = 91.0,.) + goo-,3) + fL(r,s) + fD(r,.)- k=1 Notice the symmetry of the data in this table. For example, 9L(7.2)a which is the number of (r,s)-trees of type I with 7 light and 2 dark vertices rooted at a light vertex, is 3122, which is the same as 90(2.7)- We can also illustrate the numbers for (7,2)-trees with tree diagrams. Consider a (7,2)-tree T. We can see easily each maximum independent vertex set contains either none or exactly one dark vertex and its cardinality (the vertex independence number) is 7 for all (7,2)-trees. Also there exists exactly one light vertex adjacent to both dark ones for all those trees. Next, besides the unique light vertex which we just mentioned, we have i ( ‘3 l = 26 i=0 many ways to partition the remaining 6 light vertices into two sets where the vertices of each set are joined to one of the dark vertices as in Figure 1.1. The letters inside the circles indicate the number of light vertices in each set. 13 Table 1.1: Number of trees of different types gum) 9mm) f L0,.) . f on...) Iii hi,” 8 0 0 1 9 3122 0 14 896 4032 48690 36 3798 26208 78732 97640 6180 62360 121820 288000 6180 97640 121820 62360 288000 36 48690 26208 3798 78732 0 3122 896 14 4032 0 8 1 0 9 9 0 0 1 10 8176 0 16 2048 10240 240156 42 9891 107121 357210 1044120 15912 282984 868824 2211840 382600 382600 1570525 1570525 3906250 15912 1044120 868824 282984 2211840 42 240156 107121 9891 357210 0 8176 2048 16 10240 0 9 1 0 10 (D Figure 1.1: A diagram for (7,2)-trees Since there are 7 ways to label the unique light vertex and 26 ways to partition the remaining 6 vertices, there are 7 x 26 labeled (7, 2)-trees. Hence there are 7 x 7 x 26 (7,2)-trees rooted at a light vertex, i.e. 9L(7,2) + fL(7,2) = 7 X 7 X 26 = 3136. Next we consider how many of them will be of type II which are counted by fL(7.2)- Since they are the ones containing one dark vertex in their maximum independent vertex set, we can easily see them as the case of i = 0 or 6 in the above diagram. Hence we have fL(r,2) = 7 X 2 many of them, where 7 is the number of ways to label the unique light vertex and 2 is the number of ways to label the dark vertices. Therefore 9L(7,2) = 3136 - fL(7.2) = 3122. Next 900,2) = 0 is immediate and similarly we can figure fD(7,2) = 7 x 26 x 2 = 896, where the last factor 2 is the number of ways to root the tree at a dark vertex. And by adding those numbers we obtain the last column 4032. 15 1.4 Expectation of the Vertex Independence Number As defined in the introduction, u(r,s) is the expected value of the vertex in- dependence number with uniform probability distribution assigned to I‘(r,s). The computation of u(3,3) is illustrated in Table 1.2. The four isomorphism types of (3,3)-trees T are shown together with the number of ways, l(T), to label each one. The vertex independence number flo(T) is also provided. Hence we have u(3,3) =(9 x4+36x3+18 x3+18 x3)/81. The table also displays the ratio of the expected value to the order, e.g. ,u(r, r) / 2r. Of course, this ratio must be at least .5, and our computational results will indicate that it is not ever much bigger than .56- - - Our aim is to develop recurrence relations for u(r,s). Since the independence number of a tree T is constant for all rooted versions of T, we can express u(r, s) in terms of rooted trees as follows: #(7‘, S) = 250(T)/(T"‘8"1(T + 3)), where the sum is over all rooted (r, s)—trees T. Hence ' p(r,8)r" 1s' 1( r(+)s =Zflo(T) (1.24) This motivates the choice of the generating function M (x, y) which delivers the sum of the independence numbers for rooted (r, s)-trees. We define OO xrys .M(:r,y) = Z p(r, s)r’ 1 3r l(r+ s)—3—!. (1.25) r+le Now our task is to express AI (x, y) in terms of the generating functions of the previous sections. The result is given in the following theorem. 16 Table 1.2: Expectation of vertex independence number T 1(T) #(r, 1‘) #(r,r)/21‘ o___. 1 1 .5 E 36 252/8l=3.111111 252/486=.518518 % 18 E 18 l7 Theorem 1.4.1 The generating function Ill{x,y) for the independence numbers of rooted (r,s)-trees is (1+ ho)gz. + (l + hz.)go M(x.y)=1_ thD (1.26) Proof. It follows from our definitions that 2 160(T) = Z khk,r,sa I: where the sum on the left is over all rooted (r, s)-trees. Hence p(r,s)r’ 13' 1( ()1‘+S =Zkhknrs Therefore 111(x, y) = (0H/82)z._.1. If we differentiate both sides of equation (1.9) with respect to z and simplify using (1.4), then we have aHL/Bz = xeFD + erDBHD/az + (z —1)xeFDBFD/3z. (1.27) The companion equation for trees rooted at a dark vertex is obtained from (1.24) by substituting D for L and y for x; aHp/az = yeF‘ + yeH‘BHL/Bz + (z —1)yeFL0FL/Bz. (1.28) Next we set 2 = 1 in both equations (1.24) and (1.25) . Recall that we obtained f1, from FL(x, y, 2) by setting z = 1, i.e.: fL = fL(1‘,y) = FL($,3/,1), and f0 = fD($,y) = FD(x,y,1), etc. 18 Then we arrive at the following system of equations for the two partials (BHL/az),=1 and (BHD/Bz),=1 : { (6HL/6z),=1 = xefD + xehD(8HD/62)z=1 (BHD/Bz)z=1 = yefL + yehL(8HL/32)z=1 After solving the system, we use the equation H = HL + HD to find TCfD + yefz. + $y(ehp+fz. + ehL-rfn) (on/oz).=1 = 1 _ W, (1.29) The following equations are obtained from (1.5) through (1.8) by setting 2 = l : {M = ref” 91) = ye" fL = xegb+fn _ 336]” ID = y69L+IL _ yeIL, Then the result of substitution in the right side of (1.26) is __ (1+ thgL + (1 + htlgn (an/3.2).:1 _ 1_ thD . which is the required formula. El Equation (1.23) in Theorem 1.4.1. can be written as MUM!) — M($a ythhD = 9L + 90 + (1091. + thD, (130) which leads to MOB?!) = 9L + 91) + hDgL + thD + M($,y)thD- (1.31) By comparing the coefficients of both sides in ( 1.28) we have the recursive relation of the following corollary. 19 an" Corollary 1.4.1 The expected value of the vertex independence number for the la- beled (r, s)-trees, u(r, s), has the following recurrence relation: 1 r s #(r, S) = rs—ISf-l(r + S)[gL(r’9)+gD(ri‘)+z ( i ) ( j ) (hD(i,j)gL(r—i,s—j)+hL(i,j)gD(f-i,s—j)) iSr '5‘ 7' S I . . .'_ .,'_ , . :+J’+J"=a where in the second sum, we have i < r andj < 3. Corollary 1.4.1 was used to compute the expected value u(r,r) for r = 1 to 19. The ratio [1(r, r) / 2r of the expected value to the order of these trees is displayed in Table 1.3. Note that the values for r = 1, 2 and 3 agree with the entries in Table 1.2 which were found by inspecting the tree diagrams. We also verified the value of u(4, 4) / 8 in Table 1.3 by inspecting all the relevant tree diagrams. Notice also that the sequence of the ratios increases slowly and u(19, 19) / 38 is approximately .5615. 20 Table 1.3: Expected independence number by proportion when r = s r u(r, r) / 2r 1 0.50000000000000000000 2 0.50000000000000000000 3 0.51851851851851851852 4 0.53173828125000000000 5 0.54026240000000000000 6 0.54582540493382614439 7 0.54959402326760519619 8 0.55225856471315637464 9 0.55422138607749021473 10 0.55572092017749310500 11 0.55690257221809530113 12 0.55785805356475197043 13 0.55864727324199846080 14 0.55931072818637855904 15 0.55987668338524781944 16 0.56036544760401056140 17 0.56079199438507640447 18 0.56116761656041338555 19 0.56150100107406312305 21 Table 1.4: Total of the vertex independence numbers r + s r 3 Bren”) 50(T) 4 1 3 3 2 2 8 8 1 7 7 2 6 1152 3 5 10140 4 4 17424 12 1 11 11 2 10 51200 3 9 4782996 4 8 67159318 5 7 265688360 6 6 396047700 16 1 15 15 2 14 1605632 3 13 1167575916 4 12 86974860336 5 11 1573585770680 6 10 10124304409740 7 9 28269280359936 8 8 38861741660224 22 Chapter 2 ASYMPTOTIC BEHAVIOR 2.1 The probabilistic method Our aim in this chapter is to study the asymptotic behavior of the expected vertex independence number of (n, n)-trees. The difficulty of this type of problem has been noted before. For example, Bender ([Be74], p.512) has observed: “Practically nothing is known about asymptotics for recursions in two variables even when a generating function is available. Techniques for obtaining asymptotics from bivariate generating functions would be quite useful. Some results have been obtained for a small class of generating functions (Bender [2]), but these are often hard to apply.” Nevertheless, we are able to establish bounds on the independence number for most (n, n)-trees. The tools we use include the matrix-tree theorem and the probabilistic method. The matrix-tree theorem originated in the work of Kirkhoff (see Moon [M70] p.42) and relates the number of spanning trees of a labeled graph to the adjacency matrix. In particular, if G is a labeled graph of order n with vertex set V = {'01, ...,vn}, then the adjacency matrix A(G) is the n by 17. matrix whose i,j entry is 1 or 0 23 according as vertices v,- and v,- are adjacent or not. The matrix D(G) has diagonal deg v1, ..., deg on, while the off-diagonal entries are all zero. Theorem 2.1.1 (Matrix-Tree Theorem) The number of spanning trees in any labeled graph G is the value of any cofactor of D(G)—A{G). This theorem has been applied to many important classes of graphs for which convenient formulas have been found. For example, when applied to the complete graph Kn, it yields Cayley’s formula, nn'z, for the number of labeled trees [C89]. We will make use of the theorem for a particularly simple but important family. Consider a graph, denoted G(l/'1,V2,V3,V4), whose vertex set V is partitioned into four non empty sets: The edge set of G (V1, V2, V3, V4) consists of all edges joining vertices of V1 to vertices of V2, vertices of V2 to vertices of V3, and vertices of V3 to vertices of V4. Corollary 2.1.1 The number of spanning trees of G(V1, V2, V3, V4) is (721 + "QM-1012 + "4)"3-1nilni‘a where n,- = |V,-| for i =1 to 4. This formula can be established in several ways. We obtained it by applying row and column operations to calculate a cofactor of the required matrix. Our colleague, Greg Buzzard, verified these calculations using Mathematica. It can also be realized as a corollary of a very broad result of Knuth [Kn68] on generalized Priifer codes (see also Moon [M70] p.10). The probabilistic method was first applied to graphs by Erdbs [Er47], who pio- neered its use with so many innovations that it may be more proper to call it the 24 Erdb's method. Here we sketch only the portion of the method that we require. For background on probability theory for a discrete sample space, one can consult the appendix of [Pa85]. Theorem 2.1.2 (Markov’s Inequality) Let X Z 0 be a random variable and let t > 0. Then P(X 2 t) g E(tX). (2.1) On setting t = 1 on inequality (2.1), we have P(X 2 1) g E(X). (2.2) If X is non negative and integer valued, we also have P(X = 0) + P(X Z 1) = 1. (2.3) In our applications, the sample space always consists of graphs with at least n vertices and the random variable X counts certain types of subgraphs. It follows from (2.2) and (2.3) that if E(X) —-> 0 as n —) 00, then P(X21)—+0 and P(X=0)—)1 and we say “ almost all graphs have no such subgraph” . Suppose the sample space consists of (n,n)-trees and X counts sets of vertices which are undesirable, i.e. “ bad sets” . If we show E (X) —-) 0, then we say “almost all trees have no bad sets” . 25 2.2 Vertex independence number for almost all (n,n)- trees Our aim is to determine bounds for 30 for almost all (n,n)-trees in I‘(n,n). Of course, we always have the lower bound nSflO, for all trees in I‘(n, n). To find an upper bound we require some notation and a few more definitions. Fix k > n and let Xk be the random variable on I‘(n, n) which counts the number of independent vertex sets of order k. Each such k-set consists of i light and j dark vertices with 01ift<(n+an—l)/2. (2.24) Next we consider f3 "ff-”+1 of (2.19): n + an — )n—t( t )t—an+l (2.25) t —t —am+1_ f3 f4 (n+an—t—1 t+1 __ 1 n+an—t-1 1 —t n + an — l — 2t an-l _(1+n+an—t—l) (1+?) (1+ (n+an-t)t) . (2.26) Observe that (1 + 1 )"+a"-‘-1(1+1)-*> 1 if n + an - t — 1 > t. (2.27) n + an — t — 1 t This follows from the fact that 1 f(x)=(1+ 5):” 1228) increases to e as x —> 00. It is also clear that for the last factor of (2.26), we have °’" ft — 2. 2. (1+ (n+an—t)t) >11 <(n+cm 1)/ (29) Hence we find that a¢+1/Clt > 1 III < (Tl + an —1)/2, (2.30) i.e. the series increases all the way to the very last term, which is ato, where to: [(n+an+1)/2]. (2.31) Now the sum in (2.14) is bounded by the product of the last term and the length of the sum. Hence we have E(Xk) = 0(l)n"'°’"+l(n — om — 1)(110, (2.32) 29 where the factor (n — an — 1) is contributed by the length of the sum. On the other hand, “to = 0(1)a(n+an)/2 (2'33) and a bit of algebra shows that “(n+anl/2 = 0(1)n’("’°"+2)(2"°(1 - a)2“'/(l + a)”°)”- (2-34) Substituting (2.33) and (2.34) in the equation (2.32), we find that the upper bound for the expectation takes the following simple form: E(Xk) = O(1)(21"°'(1— a)2°‘/(1 + a)1+°')". (2.35) Since we want the right side of (2.35) to approach zero as n —2 00, we just need to solve the inequality 21"°’(1 — 01)“ < (1 + a)1+°'. (2.36) A simple numerical calculation shows that it is sufficient to choose a = .27974. (2.37) These observations are summerized in the following theorem. Theorem 2.2.1 For almost all (n,n)-trees T, the vertex independence number 50(T) satisfies the inequality: n S flo(T) S (1.27974)n. (2.38) We now show that the upper bound in (2.38) cannot be significantly improved using the probabilistic method. To do this we will show that E(Xk)—+ ooasn—ioo, 30 for a slightly smaller value of k = (1 + a)n. An argument similar to the one which produced (2.17) also shows that for some constant co > 0, E(Xk) 2 con""""+1 2 at, (2.39) an 00 as n —> 00 for k = (1.2797)n. (2.46) This shows that the upper bound in the theorem cannot be reduced substantially. Our exact calculations in Chapter 1 indicate that the expected value of do is about (.5615---)2n = (1.1230---)n. The latter number is in the middle of the interval described in Theorem 2.2.1. We suspect that the value of 30 for almost all (n, n)-trees is even more closely concentrated about the asymptotic value of the mean than the interval of Theorem 2.2.1. 2.3 Edge independence number for almost all (n, n)-trees Recall that the edge independence number of a tree T is denoted by 61(T). Our aim in this section is to find bounds for ,61(T) for almost all (n, n)-trees T in I‘(n, n). This was a problem left unsolved in [Pa92] and [Sch92]. The following theorem shows the close relationship of fig and £1 for trees. Theorem 2.3.1 For any tree T of order n, flo(T) + 51(T)= n. (2.47) This result follows quickly from theorems of Gallai[Ga59] and K6nig[K631] (see also [H69] pp.95-96). It can also be derived directly by an algorithmic approach that produces the required independent sets. Now this theorem and Theorem 2.2.1 can be combined to provide bounds for 61 which we state in the next theorem. 32 Theorem 2.3.2 For almost all {n,n)-trees T, the edge independence number 31(T) satisfies the inequality: (.72026)n 5 MT) 3 n. (2.48) There is another approach which could lead to an improvement in the lower bound in (2.48). It stems from an idea in [Pa92] for finding matchings in superpositions of trees. This alternative makes use of the following slight generalization of Hall’s theorem [Ha35]. Let G be a graph with vertex set V. For any subset S of V we define NC;(S) to be the set of vertices v in V \ S such that v is adjacent to some vertex of 5. Theorem 2.3.3 Let G be a bipartite graph with partite sets V1 and V2 and let d be a positive integer. Then [31(6) 2 [VI]- d (2.49) if and only iffor all subsets S ofVl, lNc(5)| 2 ISI - d- (13-50) Proof. Construct a bipartite graph G; from G by adding d new vertices to V; and all possible edges between V1 and the d new vertices. Then Hall’s theorem is applied to Gd. D As usual, our sample space in this section is I‘(n, n) with the uniform probability distribution. Each tree T in I‘(n,n) is a bipartite graph in which V1 is the set of n light vertices and V2 is the set of n dark vertices. A subset S of V1 is bad if [NT(S)[ < [S] — d. (2.51) By Theorem 2.3.3, if there are no bad sets in T, then MT) 2 n — d. (2.52) 33 Let X4 be the random variable which counts the number of bad sets. We want to determine d as a function of n so that E(Xd) —> 0 as n —+ 00. (2.53) Then we can say “ almost surely, there are no bad sets” . Hence for almost all (n, n)- trees T in F(n.,n) 310‘) _>_ n — d- (2.54) Naturally we will try to make (I as small as possible. We set d = on, (2.55) where o is a constant between 0 and 1 . Suppose S is a bad set of order k in a tree T. Since T is connected, we must have d+1. (n + d)/2. Hence E(Xd) is bounded by the product of the largest term, which occurs at k = (n + d)/2, and the length of the sum. Futher computation yields E( ’d) = O(1)(21‘°‘(1 — a)2°/(1+ a)1+°’)". (2.59) This bound is seen, at once, to be identical to that of formula (2.35) in the previous section. On further examination of the formula (2.12) for E(Xk) and (2.58) for E(Xd), we see that these are virtually identical. One can be obtained from the other by an appropriate change in the names of the variables. Hence no improvement in Theorem 2.3.2 is possible from this approach. 2.4 Comparison with random bipartite graphs In the previous two sections, we discussed bounds of flu and Bl for almost all (n, n)- trees in I‘(n, n). Following Bollobas (see [B085] p.52) we let Q{K(n, n); p} denote the probability space of all bipartite labeled graphs with partite sets V1 and V2, |V1|= |V2|= 71. in which each V1 — V2 edge is selected with probability p. More specifically for each graph G with AI edges, the probability assigned to G is P(G) = pM(1- p)"2‘M- (2.60) Now we seek bounds of fig and )81 for almost all bipartite graphs in g {K (n,n); p}. We preserve most of the approach and notation from the previous sections. Fix It > n and let X;c be the random variable that counts independent sets of 35 order k. Then the expectation of X), is E(Xk) = Z ( 7: ) ( n- ) (1 -p)‘j, (2-61) .7 where the sum is over all i, j with 0 0 as n —+ 00. Then we can say fl0(T)Sk=(1+a)n, (~' [\9 O) or V for almost all bipartite graphs G in Q{K(n,n); p}. As before in section 2.2, symmetry allows us to eliminate the variable j in formula (2.61) for E(Xk). We find r n n 1(1—4') - E(A.)522(,)(,,_,)(1—p) . (2.66) where the sum is over all i with on ( 11) Next we use the fact that (n — 2)(6(1-+-o)—51’)n2 < e-2(6(1+a)—62)n. (272) n Then the equation (2.68) will become E(Xk) = 0(1)n’1 Z D(a,6)", (2.73) an<6n<(n+an+l)/2 where D(o', 6) = 1/e2(5(1+°)‘62)66(1 — 6)1'6(6 — a)6'°(l + a — 6)1+°"6. (2.74) To make the right side of (2.73) approach zero, we need to find an a such that D(a,6) < 1, (2.75) for all a<6S(1+a)/2. 37 A straightforward calculation shows that D(.5,6) < 1 for all .5 < 6 S .75. Hence if On the other hand, E(Xk) Z clnzn‘Hat, where Cl is a positive constant and t: (1+ a)n/2. a(1+a)n/2 Z C2n_2n-2Ana where C; is a positive constant and A __ 4 — e(1+")2/2(1+ a)1+°(1— a)1"" And we find D(.49,6) > 1 for some 6 near.7. For example, D(.49, .7) = 1.0258. Which implies that the estimate a=.5 cannot be improved significantly by this method. These observations are summerized in the following theorem. 38 (2.76) (2.77) (2.78) (2.79) (2.83) (2.84) Theorem 2.4.1 For almost all bipartite graphs G, the vertex independence number 30(G) satisfies the inequality: n S 60(G) _<_ (1.5)n. (2.85) As expected, Theorems 2.2.1 and 2.4.1 give similar bounds on the independence number. But the upper bound in 2.4.1 is slightly higher than that of 2.2.1. This might be accounted for by the fact that the random graph model is less restricted. Hence there are more opportunities for large independent sets. 39 Bibliography [AISE92] [Au60] [Be74] [B085] [C89] [Er-47] [Ga59] [Ha35] [H69] [Kn68] N.Alon, J.H. Spencer and P. Erdés, The Probabilistic llIethod, Wiley- Interscience, New York (1992). T.L. Austin, The enumeration of point labelled chromatic graphs and trees, Canad. J. hIath. 12 (1960) 535—545. E.A. Bender, Asymptotic methods in enumeration, SIAM Rev. 16 (1974) 485-515. B. Bollobas, Random Graphs, Academic Press, London (1985). A. Cayley, A theorem on trees, Quart. J. .Math. 23 (1889) 376-378. P. Erdbs, Some remarks on the theory of graphs, Bull. Amer. Math. Soc. 53 (1947) 292-294. T. Gallai, Uber extreme Punkt- und Kantenmengen, Ann. Univ. Sci. Budapest, Eo'tvo's Sect. AIath. 2 (1959) 133-138. P. Hall, On representatives of subsets, J. London Math. Soc. 10 (1935) 26-30. F. Harary, Graph Theory, Addison-Wesley, Reading (1969). D. E. Knuth, Another enumeration of trees, Canad. J. Math. 20 (1968) 1077-1086. 40 [K631] [M70] [MeM73] [MeM75] [Pa85] [Pa92] [Sch92] [Sc62] D. Kbnig, Graphen und Matrizen, Math. Fiz. Lapok. 38 (1931) 116-119. J. W. Moon, Counting Labelled Trees, Canad. Math. Congress, Mon- treal (1970). A. Meir and J. W. Moon, The expected node-independence number of random trees, Nederl. Akad. Wetensch. Proc. Ser. A 76 = Indag. Math.35 (1973) 335-341. A. Meir and J. W. Moon, The expected node-independence number of various types of trees, Recent Advances in Graph Theory, Academia Praha (1975) 351-363. E. M. Palmer, Graphical Evolution, Wiley-Interscience, New York (1985). E. M. Palmer, Matchings in random superpositions of bipartite trees, J. Comput. Appl. AIath. 41 (1992). E. Schmutz, Matchings in Superpositions of (n,n)-Bipartite Trees, preprint. H.I. Scoins, The number of trees with nodes of alternate parity, Math. Proc. Cambridge Philos. Soc. 58 (1962) 12-16. 41