LIBRARY Michigan State University .h” This is to lcertifg that the thesis entitled The Distribution of the Number of Components of a. Random Mapping Function presented by Jay Ernest Folloert has been accepted towards fulfillment of the requirements for P110 Do degree in l‘lathematica I/ Major professofi Date Jul-I 251 1955 0-169 1 PLACE IN RETURN BOX to remove this checkout from your record. TO AVOID FINES return on or before due due. DATE DUE DATE DUE DATE DUE VER ITY MSU Is An Affirmative Action/Equal Opportunity Institution ammo MICHTGAN STATE UNIVERSITV Pa. THE DISTRIBUTION OF THE NUMBER OF COMPONENTS OF A RANDOM MAPPING FUNCTION By JAY ERNEST FOLKERT A T"?sls Fla—u.) Submitted to the School for Advanced Graduate Studies of Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Mathematics 1955 AC KI'EOL‘JLELXH‘; BIN? TS The author wishes to express his sincere thanks to Dr. Leo Katz who supervised the work involved in the solution of the problem. his advice, encourage- ment and patience have been most important throughout the author's entire program of study. The writer deeply appreciates the financial support of the Office of Naval Research. THE DISTRIBUTION OF THE NUMBER OF COMPONENTS OF A RANDOM MAPPING FUNCTION By Jay Ernest Folkert AN ABSTRACT Submitted to the School for Advanced Graduate Studies of Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Mathematics Year 1955 a?“ Approved l JAY ERNEST FOLKERT ABSTRACT From the collection, F, of functions, f, which map the set, Jr1_, of N elements into j_l_ , the general subclass, Gitit’ is selected. This class is composed of functions such thf-‘t for each fe Gfril the point x1€fl has r1 images in.Jrl_, i = l, 2, ...., N. A probability space is constructed by taking Girl} and attaching equal probability to each ferrfi . A random map-rung of E into n relative to Obj} is defined as corresponding to the selection of an f from Girl} with uniform probability. A subsetcbof 171. is a component of the function if and only if it is a minimal, non - null subset such that f (CHCD and f‘1 (0)Cu9. Every mapping function, f‘G{r1§ decomposes.jfl into a number of disjoint components. I Therefore, to each f in GiTii there corresponds a number, c, which is the number of com'r‘onents induced in. [1 by this f. If f is selected at random from Gifii then c is a random variable. The problem is to find the probability distribution of c. The method used is one in which auxiliary sums, §/.,I/4= l, 2, ...., N are defined. These sums are accounted for by two approaches. By the first spproach a formula in terms of N and r1 is derived for computation of S/“vua 1. 2, ...., N . By the second approach, S“ is found in terms 2 JAY EhhHSW FOLKEFT AmSTnACT J. _ of the probability, P of exactly 0 components, 0 = l, 2, C’ ...., N. A matrix solution of this expression for PC in terms of Sn is given. The result isihe exact probability distribution of the number of components. Particular cases of general mapping are considered as specializations of the mapping under G{T1}. These are the subclass, Gr, of functions under rhioh each element of.l—1_ has the same number, r, of images and the subclass, 01, of functions under which each element of 1—1_ has only one image. Hollow mapping in the sense that no point is permitted to map into itself is also considered. For this case, the subclasses, Hlfil’ Hr, and H1 are defined in a manner parallel to the general case and the exact probability distribution is found for each. Numerical examples are included to illustrate each of the cases considered. The amount of computation involved in these suggests the need for an a,proximation to the exact distribution. Therefore a binomial approximation is developed. The results of the exact and approximate dis- tributions for these examples are included in tabular form for reasons of comparison. 3. 5. TABLE OF CONTENTS INTRODUCTION 1:1. Basic considerations . . . 1:2. Statement of the problem and the of attack . . . . . . . . 1:3. Previous work . . . . . . . THE GENERAL CASE . . . . . . . . 2:1. The calcula.tion formula for the auxiliary sum, Sn. . . 2:2. The exact probability distribution the number of components . 2:3. A numerical example . . . . PARTICULAR CASES OF GENERAL MAPPING 3:1. The class Gr . . . . 3:2. A numerical example . . . . . . 3:3. The class G1 . . . . . . . . . . 3:4. A numerical example. . . . . . . THE HOI-‘I-Jow CASE 0 o o o o e o e o e e 0 4:1 Preliminaries. . . . . . . . . . :2 The auxiliary sums and the exact probability distribution of the number or componeIItSo e e o o o e #:3. A numerical example. . . . . PARTICULAR CASES OI HOLLOW MAPPING. 5:1. The class Hr . . . . 5:2 A numerical example. 5 3. The class H1 . . . . 54 . A numerical example. 0 Page \l-C'NH 14 19 22 22 23 2b, 27 32 32 TABLE or CONTENTS (Continued) Page 6. A BINOMIAL APPROXIMATION OF THE DISTRIBUTION OF THE NUHBE R OF CON} ONENTS . . . . . . . . . . 42 6:1. Introductory considerations . . . . #2 6: 2. Derivation of an approximate probability diStribUtion oeooeeoeoeeee “'2 6:3. Numerical examples. . . . . . . . . . . . 47 7 O SUT'II'LABY . . . . Q o Q Q 0 O o O O O O O O O O O 5 1 REFERENCES, , LIST OF TABLES Page TABLE 1. Stirling Numbers of the First Kind — s2 . 19 TABLE 2. Comparison of the Exact and Approximate Distributions of the Number of Com- ponents in the General Case . . . . . . 49 TABLE 35 Comparison of the Exact and Approximate Distributions of the Number of Com- ponents in the Hollow Case . . . . . . 50 OOOOO IIIIII 1. INTRODUCTION lzl. Basic considerations. This thesis is concerned with the collection, F, of transformations or functions, f, on the points of a set, Jr1_ , of N elements into the set Jr1_, The most general kind of a function takes a point of_[TL into an arbitrary number of points of the set. Under such a function the point x, maps into r1 points of Jr1_, x3 maps into r3 points of.j_l_, etc. A subset d>of J71. is a component of the transformation or function if and only if it is a minimal, non-null subset such that points of¢D map only into points of d3 and every point which maps into a point of u? is itself a point of ob . Symbolically, the subset d’is a component of the function if it is a minimal, non-null subset 3 f(W)CI-Dand f".1 (“WC-“9. It is essential that «D be minimal . Every mapping function, feF, decomposes E into a number of disjoint components. For example, if f were a cyclic permutation taking x1 into x1+l, mod N, there'would be only one component. On the other hand, if f were the identity transformation, there would be N components. Consequently, depending on the function involved, there could be 1, 2, ..., or N components induced in j_l_ by a function, f. 2 1:2. Statement of the nroblemiand,the,nlan of attack. A suitably chosen subset 370$ the finite collection, F, of functions is considered. A probability space is constructed by takingfi'and attaching equal probability to each fey. A random mapping of [L into E relative to 31s defined as correSponding to the selection of an f 8 9' with uniform probability. To each f£t37there corresponds a number, c, which is the number of components induced in.Jr1; by this function. if f is selected at random from 57, then 0 is a random variable. The probability distribution of c for various choices of.9'is the core of this thesis. There are two main parts in this dissertation. Part I is the general case in which a point may map into itself. For any fixed set,{r1} , i = l, 2, ...., N, of values, there exists a subclass, GfrflCF, of functions such that for each f£G{r1}the points x1, x3, ...., xN offl have r1, r3, ...., rN images respectively. Of Special interest is the case where r1 = r for all i. (O 4 rsN). This introduces the subclass, crcF, of functions under which each point of 11 has the same number, r, of images. Moreover, the special case where r = 1 yields the subclass G1 of functions under which each point has only one image. Part II is the hollow case and deals with the sub- class H(rfiCFof completely nonidentical functiuns under which no point is permitted to map into itself. As in the general case, the Special cases Hr and H1 are considered. 3 These are actually subsets of the corresponding G sets and‘ are defined in a manner parallel to the subsets Gfrfifir and G1. Starting with the general case, the procedure is to consider the subset Erto be the subset Gir1}with r1, r2, ...., rN images in.Jrl_ for the points x1, x2, ...., xN respectively for each fa G{r1}° Using matrix representation, which will be described in section 2:1, calculation formulas for auxiliary sums, S/i,’/¢= l, 2, ..., N, are found in terms of N and the number of images, r1. This is followed by a derivation of the same auxiliary sums by a different approach. By this second method é“ is found in terms of the probability, Pc’ that exactly c components are induced in 171_ by f chosen at random from G{Ti}' Then, by matrix algebra, the last derivation is solved for PC. The result is that the exact probability distribution of the number of components under the random mapping “(r1} is found in terms of é/s. Since the é/5,//‘= l, 2, ...., N, can be calculated by the first set of formulas, the distribution of c can be computed. The results for the collection, G{T13 of functions are easily adapted to give similar results for the classes GI. and G1 by making prOper changes in the values of r1 in the formulas. The hollow case is treated in a manner parallel to the general case in as much asthe subsets Hfrgpfir and H1 are considered in turn. Moreover, the results of the ,‘ Ll. general case can be adapted to this case by making prOper changes in the limits of summation and proper substitutions in the formulas for the auxiliary sums to account for the fact that no point may map into itself. The numerical examples which are included for each of the above cases indicate that considerable computation is involved in finding the exact distribution. The examples which are presented are relatively trivial. In order to minimize the work involved in more complicated cases a binomial approximation to each distribution is derived and illustrated by numerical examples. 1:3, Previous work. If 3718 restricted to the class of permutations of N elements, components become cycles. Under a permutation any point of the finite set.l—1_ maps into a second, the second maps into a third, etc. until at some stage a point maps into the first member of the sequence. At this stage a cycle is formed. Gontcharoff [I] has given several moment generating functions for the number of cycles of the elements of a permutation group of N elements. His results of the asymptotic behavior of the distribution of the number of components are repeated in different forms by others. Feller [2] reports that if c is the number of cycles formed, the distribution of c is asymptotically normal with mean equal to log N and standard )1/2 deviation equal to (log N . More recently Greenwood [3] 5 showed by different methods that for large N, the mean of the number of cycles is approximately equal to log N + C, where C is Euler's constant, and that the variance is approximately equal to log N + C 4F; . The case where 31s the class of single - valued mappings on N elements was considered in part by MetrOpolis and Ulam [N] . (This is the class which is called G1 in section 3:3.) They defined the study of a random function as the study of the probability distribution over the .j_lgj_l' sample points of f(x) on.Jr1_ into j71_. Using the word tree instead of component, they proposed the question of the expected number of components under random mapping. They suggested that the answer is no doubt of the order of log N and base their surmise on the result which was pre- viously obtained for cycles. Kruskal [75] describes the structure of a component as consisting of a cycle together with a number of trees rooted in the elements of the cycle. This is consistent with the definition used in this thesis. He, too, considered the class of single - valued mappings and in particular dealt with the question proposed by Metropolis and Ulam. He showed that the eXpected number, E, of components under a random mapping function is N n: (1‘1) E =2 (N-fllmlfl‘ - m=l 6 Moreover, for large N this reduces to the following asymptotic result, (1:2) E.aul/2 (log 2N + C), where Clis Euler's constant. This dissertation will present the entire distribution for multiple - valued functions as well as single - valued functions. Thus its results extend well beyond those given by Kruskal. r.‘ m-‘ "*0" v- 2 (‘1 Cu. ILE C-LJ'LVLSA-‘ULIJ 01594:; If no restriction is placed on the mapping of g 2 into $ 2 , it is conceivable that a point could map int' itself. he shall refer to this type of mapping as the general case. The collection, F, of all functions on g 2 into g 2 is admittedly broad for it does not specify the number of images of the different points of S 2 . There— fore the subclass, G{r- , is considered. This class consists 13 of those functions, f, under which each point x.£:§ 2 1 has r1 images in S 2 , i = l, a, ...., h. It is assumed :- I . 1‘ "r h ., that; r1, 1'2, 00., r1; are mom}. Further, for eacn f‘GIri‘ the mapping of § 2 into g 2 is unique. Equal probability is a“signed to each f in G . Since to each f in Gir‘ there {Pi} 1} corresponds a number, c, which is the n*.ber of components induced in § 2 by this f, c becomes a random variable if f is selected at random from G{r- . 1} 2:1. The calculation formgla for the auxiliary sumikgpc. In order to begin the derivation of this formula it is necessary to introduce some notation. (2:1) Let .3: {i} , i = 1, 2, ...., N be the set of indices 0 the points Xian . (232) Let («1, a2, 0000, aft) be a/U-part partition offiinto subsets such that: (a) a3 is non ~ empty for each j.§}z (b) a. . a = 0 for 3 f k J k (C) a1+a3+oooo+ayu=§o (2:3) Letcfl'aJ be that subset of elements of j_2_ with indices belonging to a1, 3 = l, 2, "'°L/* . (2zb) Let E(a1’ Q3, ....’ §/‘) be the event when.1_2_ is divided into subsets in accordance with (2:2) and (2:3),«J(1L3 (3 = 1, 2, ....,/¢) has the properties: (a) f( (90,3)C «22,3 (b) f-1 (waj)c. ”JG J It is emphasized that none of the subsets ¢%_, 1 was, °qu.need be minimal subsets having the properties (a) and (b) of (2:4). therefore, E(a1, a3, ...., 9") is not equivalent with decomposition of f 2 into components. (2:5) Let p(a1’ 02, ...., 99;) be the probability of the occurrence of E(a,, a2, 97‘) on the condition that f is a random selection frmn(?¥£f 2: L S = ( 6) et /“ 2::EEEE:T p(a1, a2, ...., €72), (a1) 0.2, 0900, a/) /é¢= l, 2, ...., N, where the sum is over all possible choices of (G1, 02, ...., g”) in accordance with (2:2) for a fixed value of/ . 4 9 21 (2:7) Let kJ = the number of indices in a J = l, j, 2, °'°°b/b . This implies that there are k1 elements in u) , k3 elements in a? etc. (1.1 a3, Consistent with (2:2) kJ :> 0 and k1 + k3 + .... + = N 13.. The indicated summation in (2:6) can now be considered in two parts. By theorem 3, section b, chapter 2 of Feller'l2i] , for any fixed set of values (k1, k2, ...., k/a ), there N are ways in which the N indices of .J} k1, k2, 0.00, y can be divided into/u.groups of which the first contains k1 indices, the second contains k3 indices, etc. Here N is the multinomial coefficient and k1, kg, 0000, 1y equals N: . However, (k1, k2, ...., k/‘) k1: kg: 0000 1%: can take on different values and still satisfy the conditions of (2:7). For each choice of (k1, k3, ...., k}. ) the different choices of (a1, a2, ...., d/i ) must be considered. For convenience, therefore, it is possible to write (2:6) as a double sum, 2 EM (2:6') 5 = " , /‘ ' )p(a’1, 0-2, 000a ) (k1, k2, coco, 1a“) (a1, a2, 0000, 6/“ /" (k) The symbol means that for a fixed set (11, a2. 0.0., Gy‘ of values (k1, k2, ...., k/‘) the sum is taken over all 10 possible choices of (a1, a; ...., a”) in accordance with (2:2). To find a workable expression for §u, it is necessary to evaluate p(a1, a2, . In order to do this, ...., 0y) f each flfdpulis represented by an N by N matrix, Bf = (big) where bij is either zero or unity. This representation is based on an approach suggested for sociometric data by Forsyth and. Katz [6] and by Katz [7] . It was used in a similar way by Katz and Powell [3] . Each of the rows and columns of Bf is representative of a point of,J—),. The elements of each row denote the mapping of the point of.J_2_ which this row represents. For example, the 1_§h row of Bf describes the mapping of x1 En . If biq is unity, then x1 maps, under f, into xq En . If biq is zero, then x1 does not map into xq. Since the functions f3 ($.13 have been assigned equal probability, the ri ones N r1 WayS, 1 = l, 2, .000, N. in row x1 ay with edual probability appear in this row in any of0(a1,aa,...,% ) 1:13: rij = N (2:12' ) Sf k1+k2+oooo+ //‘ = l, 2, 000., N. Formula (2:12') provides the means for computing k}‘ 3/. , /:= 1, 2, ...., N, from the initial conditions (i.e./ with N given and r1, 1 = 1, 2, ...., N known, the auxiliary sums can be calculated). The values of S,.will be used to find the exact probability distribution. 14 2:2. The exact probability distribution of the o co )0 e . In order to find the exact distri- bution of the number of components, it is necessary to obtain an expression involving the probability that an ftC-(rfl selected at random induces exactly c components in 1-2, . This is done by finding a formula for 5/“ ,/4= 1, 2, ..., N by a different method. Before deriving this expression for §/" we prove: LEMMA 1. If M (c,/t) the number of ways in 7 which c components can be distributed into/u subsets withinone of themibeinaeembty and if [t c are Stirlingnnunbers of the second kind 1. 9 then J” o “I = ' (2.13) :1 (°’/‘) /. c P3 OF. The distribution of one or more components into a subset implies that all the elements of the com- ponents are elements of the su-set and that the total number of elements in the comoonents equals the number in the subset. No regard is given to arrangement of components within the subset. WhitworthIlQ] in proposition.XXII proves that the number of ways in which c different things can be distributed m m n m n Jordon [9] as: J = A X, , where A x = n m. 0 x: Stirling numbers of the second kind are defined by (X+xfl)n-£H(X+m-l)n + m (In-l) (x+m-2)n-oooooooo 2} 15 into/u parcels (without blank lots) is 0! times the co- 0 in the eXpansion of (eX - 17“. He describes efficient of x "different things" as those which, for purposes of the problem, are not identical and he defines "parcels" as an unarranged class. Therefore, it is consistent to allow "things" to be considered "components" and "parcels" to be "subsets". Thus M (c,/) equals c: multiplied by the coefficient of X0 in the eXpansion of (ex - lY“. Jordan [9] in section 71, formula (5) shows that no ' /‘ (2:14) (eX - 1)’“ = g 4:: Jr: xc . C a“ Therefore, J” (2:13) M (c,/«-) =/4-! c- and the proof of lemma 1 is complete. It is now possible to prove THEOREM 2. If PC is the probability of exactly ,fi 0 components and if c are Stirli q numbers of the second kind, the value of_;;uisusiven_by: N ,fi (2:15) 8/ =fl: CZ P024 ,/‘= l, 2, 000, No ‘7‘ PROOF. In accordance with the hypothesis of lemma 1, the event E(a1 a2 ’ 9 0..., a ) can be described by a con- /‘ dition which is equivalent to those given in (2:4). (2:16) = the event that when.IjL E ((11, a2, 0000’ cf) is divided intol/csubsets, (‘Ja1, “9G2, ..., in accordance with (2:2) and (2:3), it is true that the c components induced in.jrl_ by f can be distributed into these’;£subsets with none of them being empty for all c if“ §, is the sum of the probabilities of the occurrence of 3(01, a2, ..., 9y~)' If there are a fixed number of components, each of the ways of distributing these components into the/ subsets of n constitutes an occurrence of E(Q1, “a, ..., 9/‘). Consequently, N (2:17) 8/.= €130 . M (c,/l-¢) , /u-= l, 2, ..., N. C Using lemma 1 with (2:17), theorem 2 is proved. The next step in deriving the exact probability distribution is to solve (2:15) for PC. THEOREM 3. If §7¢&7u= l. 2~er n... N is given by (2:12{) and_if (2:18) w = 45' f #0 and if %#f are Stirlinx_numbers of the first kind3°, then 2 Stirling numbers of the first kind are defined by m Jordan [SJ as Sn = [4; Dm (X)njx = 0’ where 1351 (x)n is the H1 m.§h derivative of: x(x - l) (x - 2) ....... (x - n + l). l7 0 (2:19) Pc = ‘22: w s , c = 1, 2, ....., N. PROOF. Using (2:18), (2:15) becomes N /‘ (2:20) w/. = f P54: ,/= 1, 2, ...., N. C =/l In matrix notation (2:20) is written: (2:20') W = PJ, where (2:21) w (VII W3 W3 000000 WN) ’ (2:22) P (P1P2P3 .0000. PN) and O O‘COOOOOOO O O 0.0.0.000 $3 I 0 ‘ . 300...... 1: ,1ng ’14 Since J11 == 0 for all/ >n , J18 a triangular matrix. Moreover, J: = l for all 11. Therefore, [J]: l , J13 non-singular and has an inverse,‘{-1 , which is also a triangular F; ”H (2:23) J = (J ”unis” 1 N sinus; matrix of the same form. From (2:20') , (2:2e) P = w.£“ . .1 /‘ 18 Using Jordan [9] , it will be shown that: O 0 00...... Sa 0 0.0.0... 2 -1 2 a (2:25),,9 = 5 s3 s3 , 3 3 SN SN c where é“ are Stirling numbers of the first kind. Jordan shows n 1 :EEE m m (2:26) 1,, S = S . 1 = m i n m wherecsn' is the Kronecker delta. This implies (2:27) 3 = I and s = J“ . Now (2:24) becomes (2:24') P = w S 3 which in non - matrix notation is (2:19) and theorem 3 is proved. Formula (2:19) is analogous to the formula (3.1) for the combination of events of chapter # of Feller [2i] The method of inclusion and exclusion used by Feller is different and does not work in the proof of (2:19). Katz [11;] employed the method used in this thesis. It involves setting ~1l1fl‘ll. 19 up a system of equations involving the desired quantity. (In our case the desired quantity was PC.) Then this system is solved so that this desired quantity is expressed ex- plicitly in terms of other calculable quantities. A short table of values of Stirling numbers of the first kind from Jordan [9;] is included in table 1 so they can be used in the computation of numerical examples. TABLE 1 STIRLING NUMBERS OF THE FIRST KIND ~ s2 V ...J N 3 h 5 6 7 8 / 1 1 0 0 0 0 0 0 o 2 ~1 1 0 0 0 0 0 0 3 2 .3 i 0 0 0 0 0 'u ~6 11 ~6 1 0 0 0 0 5 24 -50 35 ~10 1 0 0 0 6 ~120 27a ~225 85 ~15 1 0 0 7 720 ~176u 162u ~735 175 ~21 1 0 8 ~5040 13068 ~13132 6769 ~1960 322 ~28 1 These numbers are related by a recurrence formula: -1 3;“ = s": ~/.§E with 5% = 8) and 8/2 = 0. Thus, the table may be extended indefinitely. 2:3 A numerical example. Considequdi defined by I- 20 N = 6 and r1 = (1, 1, 2, 2, 3, 3). The values of the auxiliary sums by (2:12') are as follows. If/l=l, k1=N and Sl=l. Since (if) ; 0 when r>k, 5" "‘ (3)7573 )2 (“(136%) + 2[@Z:@ +~(:)(:)(:)(:)(:)‘ (3(1)?) with” s. = x 3 26121:) +6 Midi)“ + 4:1th 5., S and S6 all vanish because it is impossible to choose 5 k the k's so that a116,?) > O in a given product. Using (2:18) and (2:19) with table 1, the exact probability distribution of the number of components is: 11567.385 1.62 x 10 52.312 6 1.62 x 10 303 1.62 x 10 6 6 I? ll' .967,522 .032,291 .000,187 21 3:1, The class G . A s ecial case of G of r p {1“ 1) some importance in applications is the class Gr' Under each 1‘ EGr, each point of _(_)_ has the same number, r, of images. Any situation where each element under consideration maps into the same number of elements is of this type. While this case was implicitly covered in chapter 2, it is inter- esting to note what effect this particular kind of mapping will have on the computation formulas of the probability distribution. Since Gr is a special. case of G{P1}’ the results of chapter 2 apply and can be modified to fit the situation that r1 = r. The modification of the formula for §fl., /u = l, 2, ...., N, is threefold. First, with r1 = r the subscripts on r can be removed. Second, the factOrs obtained from B11, 1 =.l, ""b/‘ , are all equal and the double product of (2:12') can be changed to a single product with exponents. Third, for a fixed set of values, (R1, R2, ...., g»), the k1, is, ...., kg choices of elements from JT)_ to form the subsets “gals“, u) ) will all give the same as, 0..., a” value for p(a a ) because the number of images 1: a2, 0000, of each point of (') is the same. Therefore, the indicated (k) summation, ' :EEE can be replaced by the ((1.1, a2, 0..., $4) m‘. k2: .... g 1% . 1113 I.€S‘L11L is f'dCtC'I‘q COROLLARY 1. If h is the number of points in ( 2 and r is the number of images of each point, then it falls s from theorem 1 that (7 g (1, it kl+ k2* ... + kp.= N H (3:1) 9” s /: l, 2’ 00..., lg. RLXARK 1. For computational purposes (3:1) k1,};2, ..., I?) can be written as: 000, §¢§>O S f: [Tl—“(ran kl* k2+ 00. + y: I: 7) (r) k: __ ,, i=lfki J ‘L ’/“- l, a, 000., 1:, Where IVA—\r) = I¢(:‘T - 1) 0.000. (N ’ r + 1) (3:1') Since the expression for PC in (2:19) does not depend directly on the value of r but rather on the values of s/(, f‘ 3 l, 2. .0000 N’ the DrObabilitv éiStribUtiOn for this case can be computed as for the class G{+.}. '1 3:2. A numerical example. Consider Gr to be defined by N = 6 and.1'= 2. The values of the auxiliary sums by (3:1') are 24 1 _ - (12 H(2) + 2 (30) :——-32{ é64) ) (3’ 3 W()} 9375 (2,6 2, 2) (2)2 = 2 U) | (D II 3 (30)6 253,125 k Since it is impossible to choose the k's so that all( j)>0 r in a given product, 84, S5, and S6 all vanish. The exact probability distribution as computed by (2:19) is: 757.595 759.375 P1 .997 , 656 (lo .002,343 2 759,375 ..3;___. 3 759,375 ’11 II II' .000,001 Pu = P5 = P6 = 0 3:3. The class, G1. The class of functions, f, under which each point of J?)_ has only one image in.Jr)_ is called G1. This is the class of mappings considered by Metropolis and Ulam [4.] and Kruskal [5.]. Since G 16: Gr , the results contained in (3:1) apply in this case. However, because of prior interest in this class of functions, it is appropriate to show explicitly the eXpressions which apply to this particular case. 25 Since each point of.j—l_ has only one image in _r1_, r = l. The result which follows from (3:1) is given in COROLLARY 2. If N is the number of elements in.Jr1_ and each point in 1—1_ has only one image, then N (3:2) S/l. = j— Z 61, k2, coo, k k1, k2, coo, 5g>0 /‘ k1 + k2 + v99 +’¥ys= N ’ff 1‘1 _ , /= 1, 2’ .00., N. i=1(k1) Recently tables of the binomial probability dis- tribution [l2;] have been published. An.alternative form of (3:2) for which these tables are useful is given in THEOREM b. If S , A/¢= 1, 2, ...., N, is given by (3:2) and if b (k; n, p) =(fi) k qn'"k is the binomial probability distributioni_then k k3 . 1 . (3:3) 8% = b(k1; N, fi)b(k3; NI "' k1,—--—— k1, k2) 000: 5p:.0 N-k1 k1 “I" k; + ... + k/‘ = N k _ f 1 oooooooob k . N "’ k ”...-k ‘— (:u_1’ 1 d /“_2) N _ k1"'°‘5u-2 7 /‘=1, 2, .0000, No 26 N T 1. N: PdOOb‘ C K k =k k k ' 1’ 2, 000, I“ 1. 2o .oo /flo N“k1 ): #4.... (N“k L-°‘°’—Kfl:2): k1. NEN- -k1)o .kao (N-k1.k3)y. -.TN‘k1;.og-k )1 k ‘ :(SVXNLC E)..n('k;1:-....-k/9 - 1:1) ki ‘63—) (J-Y' .033», With k1, kg, 0..., 1;.>O 811d kflzlq-kl ..kz - 0.0 -k fl'i’ From these facts, (3:2) can be written as a "telescoping product", €;)@ kN-kl N—kl o 3 _ (3.2 ) é“ _ k1,k3,ooo,/u:ON k1+k2+ ...+k q‘kl akzN I\I_.1{1m.1{2 N—kl-ka . N“ N-k1 k .. ...... N-k1-...-%“-2 Efl-l ’:r- f 1 %fl_1 N-kl‘oo.“%‘_2 N-k ...-k — N-k1-...-k - 1 ,fi 1 I“ 1 4y“: 1, 2, oooo N0 N‘kl“. "—kfi"2 By the hypothesis of theorem 4, (3:3) follows immediately from (3:2') and the theorem is proved. 27 Again (2:19) can be used to compute the exact probability distribution because this formula is not directly dependent on the value of r but rather on.the value of @h which must be computed by either (3:2) or 3:3). 3:4, A numerical example. Consider G1 to be defined by N = 8 and r = l. The values of the auxiliary sums as computed by (3:3) are: 31 = 1 b(l;8,1/8) + b(2;8,1/u) + b(3;8,3/8) + b(u;8,1/2 ) + b(5;8,5/8) + b(6;8,3/u) + b(7;8,7/8). I N Since a) = (gfig) , it follows from the hypothesis of N k N-k N N-k k theorem 1+ that (k (g) (13:3) =( Xi) (K) . N N -k N N and as a result, S 2 (48) b(k;N,§ ) = b(N~k;N, Egg ) and 32 becomes: 52 = 2b(1;8,1/8) + 2b(2;8,1/4) + 2b(3;8,3/8) + b(u;8,1/2) . Using tables of the binomial distribution [12] for the values which it contains and computation by means of the hypothesis of theorem 4 for other values: s2 = .785,391,8 + .622,924,8 + .563,263,8 + .273,437,6 m N ll 2.245,018,o b(1;8,1/8) [éb(1;7,1/7> + 2b(2;7,2/7) + 2b(3;7,3/7i] + b(2;8,1/u) [sb(1;6,1/6) + 2b(2;6,1/3) + b(3;6,1/2{] + b(3;8,3/8) [2b(1;5,1/5) + 2b(2;5,2/5)] + b(u;8,l/2) [eb(1;u,l/u) + b(2;u,l/zfl + b(5;8,5/8) [}b(1;3,1/3{]+ b(6;8,3/u)b(1;2,l/2) U) K» II 28 ~792,514,8 + ~552,7“9,6 + .425,376,8 + 333,252,1 + 2. .250,339,4 + .155,731,2 509,963,9 b(l;8,l/8)b(l;7,1/7)[;b(l;6,l/6) + 2b(2;6,1/3) + + + + + + + b(3;6,1/2)]+ b(1;8,l/8)b(2;7,2/7) Eb(1;5,1/5) 2b(2;5,2/5iI+ b(1;8,1/8)b(3;7,3/7)[éb(1;u,1/u) b(2;4,1/2fl+ b(1;8,1/8)b(u;7,”/7)[213(1;3,1/3J b(1;8,1/8)b(5;7,5/7)b(1;2,1/2) b(2;8,1/u)b(1;6,1/6)[%b(1;5,1/5) + 2b(2;5,2/5i] b(2;8,l/4)b(2;6,1/3) 2b(1;u,1/u) + b(2;u,l/2) b(2;8,1/4)b(3;6,1/2)2b(l;3,l/3) b<2;8,1/4)b(u;6,2/3)b(1;2,l/2) b(3;8,3/8)b(l;5,1/5)[%b(l;4,1/4) + b(2;h,l/2i] b(3;8,3/8)b(2;5,2/5) 2b(1;3,1/3) b(3;8,3/8)b(3;5,3/5>b(1;2,1/2) b(4;8,1/2)b(1;4,l/4) 2b(1;3,1/3) r b(u;8,1/2)b(2;4,1/2)b(1;2,l b(5;8,5/8)b(l;3,1/3)b(l;2,1/2) /2) .276,37u,8 + .189,056,3 + .1uo,59o,7 + .102,539,1 + +0 + l. .102,539,1 + .062,584,8 + .189,056,3 124,969,5 + .086,517,3 + .086,517,3 + .1u0,59o,7 .ou8,666,o + .051,269,6 + 062,5au,9 715,120,9 29 55 = b(l;8,l/8)b(l;7,l/7) {b(1;6,1/6)Eb(1;5,1/5) + 2b(2;5,2/5£]+b(2;6,1/3)[%b(1;4,1/4) + b(2;#,l/2i] + b(3;6,l/2)2b(l;3,1/3»b(4;6,2/3)b(l;2,l/2i? + b(l;8,l/8)'D(2;7,2/7) {b(l;5,l/5) [2b(1;1+,1/4) + b(2;4,1/2_)]+ b(2;5,2/5)2b(l;3,l/3) + b(3;5,3/5)b(l;2,1/2i}+ b(1;8,1/8)b(3;7,3/7) w{£(1;4,1/u)2b(1;3,1/3) + b(2;4,1/2)b(1;2,1/2) + b(1;3,1/3)b(l;2,1/2)} + b<2;a,l/u)b(1;6,1/6){b(1;5,1/5)I_’2b(1;z+,l/u) + b(2;l+,1/2)] + b(2;5,2/5) [2b(1;3,1/3 + b(l;2,l/2)] + b(2;8,1/4)b(2;6,1/3)‘{£(1;4,1/4)2b(1;3,1/3) + b(2;u,l/2)b(l;2,1/2i} + b(2;8,1/u)b(3;6,1/2)b(1;3,1/3)b(1;2,1/2) + b(3;8,3/8)b(1;5,1/5) {B(1;4,1/4)2b(1;3,1/3) + b(2;4,l/2)b(l;2,l/2i} + b(3;8,3/8)b(2;5,2/5)b(l;3,l/3)b(l;2,l/2) + b(u;8,1/2)b(1;u,l/u)b(1;3,1/3)b(1;2,1/2) - .094,528,2 + .062,48u,7 + .043,258,7 + .025,63u,8 +.062,u84,7 +.o38,u52,1 + .021,629,3 + .043,258,7 + .021,629,3 + .025,63u,8 + .062,484,7 + .038,452,1 + .021,629,3 + .038,452,1 + .019,226,0 + .021,629,3 + .ou3,258,7 + .021,629,3 + .021,629,3 + .025,63u,8 .753,020,9 b(l;8,1/8)b(l;7,1/7)b(l;6,l/6) b(1;5,1/5)l%b(1;u,11h) + b(2;4,1/2i]+ b(2;5,2/5)[§b(1;3,1/3) + b(l;2,1/ 2i] + b(1;8,1/8)b(1;7,1/7)b(2;6,1/3){%(1;4,1/u)2b(1;31/5) - . . . .‘ . u . , . l r ( 4 d P I n I . r . l' I ' I . I . n 1 i . J . p . n . g. . , C ’ ' t p Q , . g , : . . ‘ f t 1‘ ‘ t ‘ ’ . p . 4 \ ( ‘ ’ < < ’ ‘ C ‘ é ‘ (' p ( F. 4 . t ' 4 I f f 4 { ' ' ' ‘ - " f * '2 I ‘ t o t . . : 'i . 7 t: I a -. . - (I r“ I . ‘ ' v . ,- ‘ f ~ 5 . 5 ’ , _ , t l! , ‘v 30 + b(2;4,1/2)b(1;2,1/2) + b(l;8,1/8)b(l;7,l/7)b(3;6,1/2)b(l;3,l/3)b(l;2,l/2) + b(1;8,1/8)b(2;7,2/7)b(1;5,1/5)b(1;u,1/u)2b(133,1/3) + b<1;8,1/8)b(2;7,2/7)b(1;5,1/5)b(2;4,1/2)b(1;2,1/2) + b(1;8,1/8)b(2;7,2/7)b(2;5,2/5)b(1;3,1/3)b(1;2,1/2) + b(1;8,1/8)b(3;7,3/7)b(1;u,1/u)b(1;3,1/3)b(1;2,1/2) + H2;8,1/4)b(1;6,l/6)b(1;5,1/5)b(1;u,1/t+)2b(1;3,1/3) + b(2;8,1/u)b(1;6,1/6)b(1;5,1/5)b(2;4,1/2)b(1;2,1/2) + b(2;8,1/4)b(l;6,l/6)b(2;552/5)b(l;3,l/3)b(l;2,1/2) + b(3;8,3/8)b(1;5,1/5)b(1;u,1/4)b(1;3,l/3)b(1;2,l/2) .O6l,283,1 + .028,839,1 + .01o,814,7 + .019,226,1 + .009,613,0 + .009,613,0 + .01o,81u,7 + .019,226,1 + .OO9,613,0 + .009,613,0 + ,009,6l3,0 + .OlO,814,7 .209,093,3 b(1;8,1/8)b(1;7,1/7>b(1;6,1/6)b(1;5,1/5) b(1;u,1/u) 2b(1;3,1/3) + b(2;1+,1/2)b(1;2,l/2)}+ b(1;8,1/8) b(1;7,l/7)b(1;6,1/6)b(2;5,2/5)b(1;3,1/3)b(1;2,1/2) + b(1;8,1/8)b(1;7,l/7)b(2;6,1/3)b(1;4,1/4) b(1;3,l/3)b(1;2,l/2) + b(l;8,l/8)b(2;7,2/7) b(1;5,1/5)b(1;u,1/u)b(1;3,1/3)b(1;2,1/2) + b<2;8,1/u)b(1;6,1/6)b(1;5,1/5)b(1;4,1/u) b(l;3,1/3)b(1;2,1/2) .01u,419,6 + .004,806,5 + .004,806,5 + .oou,806,5 + .Oou,806,5 = .033,645,6 31 88 = b(l;8,1/8)b(l;7,l/7)b(l;6,1/6)b(l;5,l/5) b(1;4,1/u)b(1;3,1/3)b(1;2,1/2) $8 = .002,u03,3 Using these results together with table 1 and (2:19), the exact probability distribution is: P1 = .405,628 P2 = .408,440 P3 = .153,895 P4 = ,028,893 P5 = .002,970 P6 = .000,169 P7 = ,ooo,005 P8 = .ooo,ooo,06 4. THE HOLLOW CASE 4:;. Preliminarie . If the mapping of J71. into ,J_l_ is restricted so that no point is permitted to map into itself, the mapping is called hollow. The subclass of functions which represents this type is called Hfl‘fi' Analogous to GFE3’ HF3}18 composed of functions, f, under which each xi£fl has r1, 1 = l, 2, ...., N, images in D. . Hollow mapping is of Special interest in the field of social psychology. In sociometric tests an individual chooses the individuals in a group with whom he wishes to be associ- ated. In some cases a variety in the number of choices made by an individual is permitted. In other cases, all individuals must make an equal number of choices. In still other instances only the prime choice is made. If each of the N individuals making the choices is considered to correspond to a point x1 and if his choices for associates correspond to the ri images of x1, 1 = l, 2, ...., N, then a hollow mapping situation exists provided no in- dividual is permitted to choose to be associated with himself. The number of choices permitted in different in- stances gives rise to different subsets of hollow mappings. 33 The situation where there is a variety in the number of answers by different individuals in the group is covered by the class,EQ}i}, If Hr and H1 are subsets of functions defined for the hollow case as Gr and G1 were for the general case, then.Hr covers the situation where each individual in the group chooses the same number of associates and H1 covers the case where each individual makes only the one best choice. Hr and H1 are considered in chapter 5. “:2. The auxiliary sums and the exact probability distribution of the number of components. Since HF$CC¥'fi’ the results of chapter 2 can be adapted to the hollow case. To make the event, E(a possible under hollow 1’ a2, 0000’ 9‘), mapping there must be at least two elements in the subset waj, J = l, 2, 00.0%. This mews the k3, J = l, 2’ 9.9,,/, f must be greater than one. The matrix representation, Bf = (bij) f is different in that bjj a O for all j = l, 2, ...., N. Consequently, the required form of Bf, which is equivalent to the event, E(al given in (2:8), is modified ’ C12, ..., 07a) in that the main diagonal elements of the principal minors, B11, 1 = 1, 2, ....,/AC, are all zeros. Therefore, the r1J ones in the j.§h row of B11, 1 = l, 2, ....5/4 and J = l, 2, ...., k1 may appear only in any of the remaining (k1 - 1) positions if Bf is to have the form equivalent to (2:8). At the same time if no restrictions are placed on 34 the form of Bf, the ones in any row could appear in any of (N - 1) positions. Since k >1, j = l, 2, ....f, i there could be at most[-1-:-]subsets formed from the index set,'~‘9, where [-12] is the largest integer in the quotient Eg- . Moreover, there could be at most[§»components induced in [L by a function, f. Using the above facts, formulas for the auxiliary sums, §,.,‘/:= l, 2, ...., [g , and for the probability,Pc, of exactly 0 components, 0 = l, 2, ...., [-2-], must be modified in order to make them valid for the hollow case. The theorems which were proved for the general case are now listed with proper modifications for the hollow case. They are numbered with primes to show the correspondence between the cases. THEOREM 1'. If N is_th£ number of elements Ln ( 2 and rv is the number of images of the point x1381 L, D21, 2, 0000, N, and if (a , a2, 0000,},) is defined by (2: 2) then the value of wggis‘piver by (k) l :jEE;:T:k EE::) (Lnl) S = TN N;l)kl,k2, ’00., ”>1 (0.1,C12, 0000,07: "‘ kl+k2 +oooo+kfik=N ”Ii k1 (kl-l) [N] = l, 2, .00., -- 0 i=1 3.11 “3 '/ 2 o.l 35 THEOREM 2' . If Pc is the probability of exactly ,1“ ‘ 0 components and.J: are btirling numbers of the second kind,_then P] (4:2) 3/. s/u: 2 PC %fi,/= 1, 2, ...., [1%]. c ‘7‘ _ S,“ THEOREM 3.0 If ‘9‘ = 7 9 fl: 1, 2, 0.00, g ’ given by (4:1), and if §2 are Stirling numbers of th Consider thgdefined by N = 6 and r1 = (1, l, 2, 2, 3,‘3). The values of the auxiliary sums are, by (4:1): 52 75%)? 5)2{2®?3 .2)(3)2 = 18 52 = .000,072 x105 83’ 84’ 55’ and 36 all vanish. By (4:3), the exact probability distribution for this example is: 399,964 .ooo,036 P u = P ' 5-P6=O 36 5. PARTICULAR cases 0:»- HOLLOW MAPPING 5:1. The class Hr. A Special case of Hfri} for the hollow case is the class Hr which is defined as Gr was for the general case. (That is each point of Jrl_ has the same number, r, of images in Jrl_.) Although this class was mentioned in section 4:1 and implicitly considered in section 4:2, for completeness, the exact formulas for computation are shown. The corollaries are numbered with primes to Show the parallelism between the results for Gr in section 3:1 and those for Hr' The reasons for the modifications are covered in section 3:1. COROLLARY 1'. Ifi N is the numbez_9fi points in JC1_ and r is the number of images of each point, then it follows from theorem 1' that k - (5:1) 8 = ____:_L__.___ g él,k2,oooo,)1-l(ir )ki’ /‘ -1 N k1. k2......5.>1 N ) + + + -'::I (1" kl k2 0000 k/ IN fl... 1, 2, ......g..[] REMARK 1'. For purposes of computation (5:1) can be written 38 N (5‘1') 5/“: E Cc k k) [N-:)(r)]N 1 2’ ’x- kl, k2,...0’1¥>1 kl+k2+. O 0+lcfl= N fifikklflrflki, fi= 1,2,....,[.12i] . Since the formula for Pc in (4:3) does not depend directly on r but rather on the values ofb },/4= 1, 2, “HI-g], the exact probability distribution can be computed by us ing, this formula. 5:2 A numerical exampl . Consider Hr defined by = 6 and r = 2. The values of the auxiliary sums by (531') are: S]. = l .. 1 ( 6) 6 _ 20 _ s _ (2) _ — .000 020 2 (20)6 3) 3 16-5. ’ $3 = O The resulting exact probability distribution by use of (4:3) is: P1 = .999,990 P2 = .OO0,0lO 133:0 5:3. The class H1. As mentioned in section 4:1, 39 the subclass H1 is defined for the hollow case as G1 Was for the general case. To complete the parallelism between the hollow and the general case, the formulas which apply specifically to random mapping under functions from H1 are given as they were for the class Hp. COROLLARY 2'. If N is the nun-her of: am ( ) and each point in ( ) has only one image, then , k (5.2) s, = ——L—— 2 H (kl-l) 1, N l’k o... = (N—l) 2’ '5“ 1 1 kl ,k2,0005‘>1 + kl +k2+ ... §,=N /u=1, 2, ....,[§] In order to make the tables of the binomial probability distribution [12;] useful an alternative form of (5:2) is presented in THEOREM 4'. If a, f4=l, 2, ....igjis given l WJMLLE b(k;n.p) =(:)p R q n'k is the binomial distribution, then (5:3) s/.=_..._4‘_)_N ‘ b(l;N, 1‘17?) _'l (N. .I. k1,k2,...,19‘>1 <(N.-l k +k +ooo+1§.-’ ) k l l 2 E); 1‘ l - -1 2- ’N-k -...-k . - , béz’N KMWHH°b 1‘,“ -1’ 1 ,a-z » N—kl-...-I. -2 ’ /&~2 /= 1, 2, ...., [—E—J. .- o I ; 96A s. 1‘ 4O PROOF: ( N Iq N-kl l -kl‘-o 00-1§‘_2 k k g to... 3 2’. .. ’1? k1 k2 k kl’kz, 00..., kf>l Therefore, (5:2) csna be written as a "telescoping product", . . _ (N_ )N (kl-1) klNl(-k 17m.) (5.2 ) S/¢ "’ (N—I1)N£>l{(ll:]) N.../4 ...) kl’kZ’OCOk ’ kl+k2+ o o o+k j; N‘l N-k -k -:l( N"{l-k2-/‘ +2) 1 2 (N_l:kl-f+ N-kl-fi +1 . I O C O .\ N-k -..-k - " ~1 -21”... "' l .1 .... (N-k:2"°15“-9218-l—1 2X N:(l k-ll /“ k/‘_l N“kl- ...-5"- "'1{ 1M" [-2-2 /= 1W2,oooo[] Using the notati n of the hypothesis of theorem 4', (5:3) follows and the theorem is proved. The formula which is used to compute the exact distribution after 8/, /u= 1, 2, ..., g1has been com— puted by (5:2) or (5:3) is (4:3) because it does not depend on the value of r but on the value of §/‘° _5:4 A numerical example. Consider H1 defined by = 8 and r = l. The values of the auxiliary sums as 41 found by use of (5:3) are By S = 1 l 8 S2 = (is—)5 213(2;8,1/6) + 2b(3;8.1/3> + b(4;8,1/2)_] 7 52 = .39o,607,u tn ll 3 ($[2b(238,1/5)b(2;6,1/4) + b(2;8,l/5)b(3;6,1/2) + 2b(3;8,2/5)b(2;5,l/3) + b(4;8,3/5)b(2;4,l/2i] S3 = .036,35#,O Sn=Eg§[b(2;8,l/4)b(2;6,l/3)b(2;4,1/2i] 5L,r = .ooo,u37,1 S5, S6’ S7, and SB all vanish. (4:3), the exact distribution is: P1 = .816,705 P2 = .177,327 P = .005,950 P4 = .000,018 P5 = P6= P7 = P8 = O 6. A BINOMIAL APPROXIMATION or THE DISTRIBUTION OF THE NUMBER OF COMPONENTS 6: I roductor 0 ns de~r o . Since the exact distribution is known, an approximate distribution is useful only if it is more easily computed. The above numerical examples, although restricted to relatively trivial oases show that considerable work is involved in the computation of the exact distribution. Therefore, an approximation is worthy of investigation. Because the distribution of the number of components is discrete and since it is conceivable, in some appli- cations, that the set JF)_ will be composed of a relatively small number of elements, it seems feasible to use a distribution of the discrete type for the approximation. A binomial approximation is therefore found and the results have proved to be rather good. d st 0 . The binomial distribution has only two parameters, N and p. With N fixed, only p needs to be estimated. Since the mean of the binomial distribution equals N times p, an approximation of p could be obtained #3 by equating the expected value of the random variable, 0 , which is the number of components in.Jr)_ , with the expected value of c if it were distributed as the binomial distribution. However, the expected value of the number, 0 , of components is not easily obtained in terms of one or more of the S/.,/¢= l, 2, ...., N. Kruskal [5:] gave the result fer the Special case where the mapping was single- valued. Since this seems to be quite difficult in the general case, it is convenient to consider a related variable, 110"l , where n is a positive integer. For such a variable the following theorem is proved. THEOREM 5; If c is the number of components in ( ) , n is a positive integer and Sgthe auxiliary sums given by (2:12,)J then the expected value of no"1 is £128}; by n 1‘. (6:1) E(n°"9 = de) if. ’/£=l -l "‘ PROOF: Consider the arbitrary quantity, A, defined a 3 S- n -1 S A‘ ét-) :2: n 2 ’“ From (2:15): s/.=/c: 3:,“ P3 3. 41+ n N ,4 Therefore, A = 2 6:1 Omit)! 2 P, :J; . #=1 .17" " c -1 ' )b lhe coefficient of PC in A = ,4: 1 -1 ,(fl-l). ’dc' By changing the notation slightly, the coefficient of c [1- Pc in A = i- :7: ng'a)’4 . f‘l But Jordan [9] , section 58, formula 2, shows n0 = E J (n) , where (n) = n(n-l)....(n-/4+1) = n“? /=1 c / /~ Therefore, the coefficient of PC in.A = no-1; and by the definition of the expected value, A = E (no-l) and (6.1) follows. This completes the proof of theorm 5. Using (6:1) the results for a few integers are: (a) E (Ic‘l) = 1 s (b) E (2°‘l) = 1 +‘"§ ( ) I ( 0’1 — + ° + E3 0 s 3 ) - 1 oz 3 -1 Since E(lc ) gives a trivial result and since E (20.1) is obviously the simplest eXpected value to compute, it is convenient to use this to find a binomial approximation of the distribution of the number of components. 45 In order to find the parameter p which will determine the binomial distribution for fixed N, it is necessary to prove the following theorem. THEOREM 6. If 0-1 s ' do variable distributed binomially with parameters (Nul) and p, then (6:2) HIP-1) = [i + (n-l)p] N-l PROOF: By hypothesis: b(o-l; N-l, p) p0*l qN-C ‘ N s c-l 5‘1 "1 c-l N-c Therefore, b (n ) = n P q ’ :1 0-1 This can be rewritten: -1 n _ - N— . r- 2. (nc 1) = E (np)° l q °,-- (q + np)‘ 1 Since q = 1 - p, (6:2) follows and the proof of theorem 6 is complete. I An estimate of p which will determine the binomial approximation of the distribution of the number of com- ponents is now possible by equating the two values of E(2°'l) obtained from theorems 5 and 6. This is not an estimate in the statistical sense. No sampling is involved. It is simply an approximation which results 0‘1 if c is the from equating the eXpected value of 2 number of components in ( ) with the expected value of 20-1 if (c-l) has binomial distribution with parameters (N-l) and p. #6 Using the results in (6:1) and (6:2) with n = 2 and equating the expected values of 20-1, the result for the estimate of p is: l 52 Nil (6:3) p = l +‘5— - 1. Using (6:3) and denoting the approximate probability of 0 components by Qc , the formula is (6:4) QC = pc‘l qN"C , c = l, 2, ...., N. 0-1 It is noted that this binomial approximation of the distribution of the number of components can be determined by finding only the value of 82. For the general case formula (2:12'), (3:1') or (3:3) can be used, depending on the values of r. After 82 has been found, p is found by (6:3) and the approximate distribution is found by (6:4). For the hollow case, the same procedure may be used and the same formulas apply with the exception that N is replaced by [g] throughout. For completeness the formulas are listed with primes as they apply to the hollow case. 1 S _ -1 (6:35) p = 1 +-§ 2 -l . In connection with the exact distribution discussed previously six numerical examples were presented to illustrate each of the classes of random mapping functions. These same examples are now presented so that the approximate distribution of the number of components can be compared with the exact distribution. Accordingly, an estimate of p is found for each example and the approximate probabilities are shown together with the exact probabilities, (found earlier) for the general and hollow cases in tables 2 and 3 reSpcctively. If GM} is defined by N = 6 and. r = (l, 1, 2, 2, 3, 3), then, from section 2:3, 32 = .065,705 and by (6:3) p = .OO6,486. The values of Qc computed by (6:4) are shown in the first section of table 2. If Gr is defined by N = 6 and r = 2, then,from 44 9375 The values of Q0 computed by (6:4) are given in the second section 3:2, 82 = and, by (6:4), p = .OOO,467. section of table 2. If G1 is defined by N = 8 and r = 1, from section 3:4, 32 = 2.245,018 and therefore p = .113,506,6. Again the values of Qc are given in the third section of table 2. 48 Ifh?33 is defined by N = 6 and r = (1, 1, 2, 2, 3, 3), then 52 = .ooo,072 by section #:3. Using (6:3'), p = .OO0,018. The values of Qc are computed by (6:4') and given in the first section of table 3. If Hr is defined by N = 6 and r = 2, by section 5:2, 52 = .ooo,020 and by (6:3'), p = .ooo,005. The values of QC as computed by (6:4') are given in the second section of table 3. If H1 is defined by N = 8 and r = 1, by section 5:4, 52 = .39o,607,4 and by (6:3'), p = .061,270,5. The values of QC are given in the third section of table 3. The agreement between the tabulated values is reasonably good. In the hollow case, there is virtually no difference between the exact and approximate values when N: 6, r1 = (1, l, 2, 2, 3, 3) and when N = 6, r = 2. In the general case, agreement is to at least the third decimal place for these examples. When N = 8 and r = 1, there is variation but there is definite agreement in the pattern of the distributions. For larger values of N the computation of the exact distribution is cumbersome. Thus, comparison becomes difficult. Katz [l3g] has shown that the exact probability of one component when N = 20 and r = l is .264,68. The approximate probability given by (6:4) for this set of values is: Q1 = .295,227. This indicates very little change in accuracy in the probability of one component for larger values of N. 1+9 80. 000 . W 80. ooo. i ...l ...... a ...... l m 30.08. 80.80. I... ...... I... II N. Hafiooo. meiooo. ...... ...... II ...I c 30:50. 03.30. ....I I... ....I ...... m 3648. mmmnmo. ...I ....I 30.08. I... s 319:. www.mma. «8.08. 80.08. 1* 338. $180. ,m amc.mmm. oss.mos. Hmm.~oo. msm.moo. smm.amo. Hmm.mno. m mmm.ome. www.moe. sec.smm. cmc.smm. mwm.scm. mmm.scm. a be on be om so om o ans .muz mus .cuz R.m.~.m.a.duas.cuz mayo dimmzinww HEB 2H gram? OASOO mo mmmmabz mmB .mo WZQHBDmHmBmHQ MBgvmommmd Q34... Bqurfis H39 and ZOmHmdeOO N Baa. 50 .II. ..I. III. a .II. om~.ooo. n mao.ooo. s mam.oao. ome.moo. n 000.000. .11. 000.000. .11: m ssm.aca. smm.ssa. oao.ooo. oao.ooo. emo.ooo. cmo.ooo. N amm.e~m. mos.cam. ome.mmm. 0mm.mmm. scm.mmm. scm.mmm. a co cm so am . ed as H n n .m n z N u A .0 I z Am.m.N.N.H.HV n «h .w I z mmawo 309903 mmB 2H mafia/53,50 m3 mHmMEDZ MIR. .mo WZOHBDmHmBmHQ megfiumommm< Q24 80% mme m0 wHOme