LIBRARY
Michigan State
University

.h”

This is to lcertifg that the

thesis entitled

The Distribution of the Number of Components
of a. Random Mapping Function

presented by

Jay Ernest Folloert

has been accepted towards fulfillment
of the requirements for

P110 Do degree in l‘lathematica

I/ Major professoﬁ

Date Jul-I 251 1955

0-169

1 PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES return on or before due due.

DATE DUE DATE DUE DATE DUE
VER ITY

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

MSU Is An Afﬁrmative Action/Equal Opportunity Institution
ammo

 

 

 

 

 

MICHTGAN STATE UNIVERSITV

Pa.

THE DISTRIBUTION OF THE NUMBER OF COMPONENTS

OF A RANDOM MAPPING FUNCTION

By
JAY ERNEST FOLKERT

A T"?sls
Fla—u.)

Submitted to the School for Advanced Graduate
Studies of Michigan State University in
partial fulfillment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

Department of Mathematics

1955

AC KI'EOL‘JLELXH‘; BIN? TS

The author wishes to express his sincere thanks to
Dr. Leo Katz who supervised the work involved in

the solution of the problem. his advice, encourage-
ment and patience have been most important throughout
the author's entire program of study.

The writer deeply appreciates the financial

support of the Office of Naval Research.

THE DISTRIBUTION OF THE NUMBER OF COMPONENTS

OF A RANDOM MAPPING FUNCTION

By
Jay Ernest Folkert

AN ABSTRACT

Submitted to the School for Advanced Graduate
Studies of Michigan State University in
partial fulfillment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY
Department of Mathematics

Year 1955

a?“

 

Approved

l
JAY ERNEST FOLKERT ABSTRACT

From the collection, F, of functions, f, which
map the set, Jr1_, of N elements into j_l_ , the general
subclass, Gitit’ is selected. This class is composed of
functions such thf-‘t for each fe Gfril the point x1€ﬂ
has r1 images in.Jrl_, i = l, 2, ...., N. A probability
space is constructed by taking Girl} and attaching equal
probability to each ferrﬁ . A random map-rung of E
into n relative to Obj} is defined as corresponding to
the selection of an f from Girl} with uniform probability.

A subsetcbof 171. is a component of the function if
and only if it is a minimal, non - null subset such that
f (CHCD and f‘1 (0)Cu9. Every mapping function, f‘G{r1§
decomposes.jfl into a number of disjoint components.

I Therefore, to each f in GiTii there corresponds a number, c,
which is the number of com'r‘onents induced in. [1 by this

f. If f is selected at random from Gifii then c is a random
variable. The problem is to find the probability distribution
of c.

The method used is one in which auxiliary sums,
§/.,I/4= l, 2, ...., N are defined. These sums are accounted
for by two approaches. By the first spproach a formula in
terms of N and r1 is derived for computation of S/“vua 1.

2, ...., N . By the second approach, S“ is found in terms

2
JAY EhhHSW FOLKEFT AmSTnACT

J. _

of the probability, P of exactly 0 components, 0 = l, 2,

C’
...., N. A matrix solution of this expression for PC in
terms of Sn is given. The result isihe exact probability
distribution of the number of components.

Particular cases of general mapping are considered
as specializations of the mapping under G{T1}. These are
the subclass, Gr, of functions under rhioh each element
of.l—1_ has the same number, r, of images and the subclass,
01, of functions under which each element of 1—1_ has only
one image.

Hollow mapping in the sense that no point is permitted
to map into itself is also considered. For this case, the
subclasses, Hlfil’ Hr, and H1 are defined in a manner parallel
to the general case and the exact probability distribution
is found for each.

Numerical examples are included to illustrate each
of the cases considered. The amount of computation involved
in these suggests the need for an a,proximation to the
exact distribution. Therefore a binomial approximation is
developed. The results of the exact and approximate dis-
tributions for these examples are included in tabular

form for reasons of comparison.

3.

5.

TABLE OF CONTENTS

INTRODUCTION

1:1. Basic considerations . . .

1:2. Statement of the problem and the

of attack . . . . . . . .
1:3. Previous work . . . . . . .

THE GENERAL CASE . . . . . . . .

2:1. The calcula.tion formula for the

auxiliary sum, Sn. . .

2:2. The exact probability distribution

the number of components .
2:3. A numerical example . . . .

PARTICULAR CASES OF GENERAL MAPPING

3:1. The class Gr . . . .

3:2. A numerical example . . . . . .
3:3. The class G1 . . . . . . . . . .
3:4. A numerical example. . . . . . .
THE HOI-‘I-Jow CASE 0 o o o o e o e o e e 0
4:1 Preliminaries. . . . . . . . . .
:2 The auxiliary sums and the exact

probability distribution of the number

or componeIItSo e e o o o e
#:3. A numerical example. . . . .

PARTICULAR CASES OI HOLLOW MAPPING.

5:1. The class Hr . . . .
5:2 A numerical example.
5 3. The class H1 . . . .
54 . A numerical example.

0

Page

\l-C'NH

14
19

22
22
23
2b,
27
32

32

TABLE or CONTENTS (Continued)

Page

6. A BINOMIAL APPROXIMATION OF THE DISTRIBUTION
OF THE NUHBE R OF CON} ONENTS . . . . . . . . . . 42

6:1. Introductory considerations . . . . #2

6: 2. Derivation of an approximate probability
diStribUtion oeooeeoeoeeee “'2

6:3. Numerical examples. . . . . . . . . . . . 47

7 O SUT'II'LABY . . . . Q o Q Q 0 O o O O O O O O O O O 5 1
REFERENCES, ,

LIST OF TABLES

Page
TABLE 1. Stirling Numbers of the First Kind — s2 . 19

TABLE 2. Comparison of the Exact and Approximate
Distributions of the Number of Com-
ponents in the General Case . . . . . . 49

TABLE 35 Comparison of the Exact and Approximate
Distributions of the Number of Com-
ponents in the Hollow Case . . . . . . 50

OOOOO

IIIIII

 

1. INTRODUCTION

lzl. Basic considerations. This thesis is concerned

with the collection, F, of transformations or functions, f,

on the points of a set, Jr1_ , of N elements into the set Jr1_,
The most general kind of a function takes a point of_[TL

into an arbitrary number of points of the set. Under such

a function the point x, maps into r1 points of Jr1_, x3

maps into r3 points of.j_l_, etc.

A subset d>of J71. is a component of the transformation
or function if and only if it is a minimal, non-null subset
such that points of¢D map only into points of d3 and every
point which maps into a point of u? is itself a point of ob .
Symbolically, the subset d’is a component of the function
if it is a minimal, non-null subset 3 f(W)CI-Dand
f".1 (“WC-“9. It is essential that «D be minimal .

Every mapping function, feF, decomposes E into
a number of disjoint components. For example, if f were
a cyclic permutation taking x1 into x1+l, mod N, there'would
be only one component. On the other hand, if f were the
identity transformation, there would be N components.
Consequently, depending on the function involved, there
could be 1, 2, ..., or N components induced in j_l_ by a

function, f.

2

1:2. Statement of the nroblemiand,the,nlan of attack.
A suitably chosen subset 370$ the finite collection, F,
of functions is considered. A probability space is constructed
by takingﬁ'and attaching equal probability to each fey.
A random mapping of [L into E relative to 31s defined
as correSponding to the selection of an f 8 9' with uniform
probability. To each f£t37there corresponds a number, c, which
is the number of components induced in.Jr1; by this function.
if f is selected at random from 57, then 0 is a random
variable. The probability distribution of c for various
choices of.9'is the core of this thesis.

There are two main parts in this dissertation. Part I
is the general case in which a point may map into itself.
For any fixed set,{r1} , i = l, 2, ...., N, of values, there
exists a subclass, GfrﬂCF, of functions such that for each
f£G{r1}the points x1, x3, ...., xN ofﬂ have r1, r3, ....,
rN images respectively. Of Special interest is the case where
r1 = r for all i. (O 4 rsN). This introduces the subclass,
crcF, of functions under which each point of 11 has the
same number, r, of images. Moreover, the special case where
r = 1 yields the subclass G1 of functions under which each
point has only one image.

Part II is the hollow case and deals with the sub-
class H(rﬁCFof completely nonidentical functiuns under
which no point is permitted to map into itself. As in

the general case, the Special cases Hr and H1 are considered.

3

These are actually subsets of the corresponding G sets and‘
are defined in a manner parallel to the subsets Gfrﬁﬁr and G1.

Starting with the general case, the procedure is to
consider the subset Erto be the subset Gir1}with r1, r2, ....,
rN images in.Jrl_ for the points x1, x2, ...., xN respectively
for each fa G{r1}° Using matrix representation, which will
be described in section 2:1, calculation formulas for
auxiliary sums, S/i,’/¢= l, 2, ..., N, are found in terms
of N and the number of images, r1. This is followed by a
derivation of the same auxiliary sums by a different approach.
By this second method é“ is found in terms of the probability,
Pc’ that exactly c components are induced in 171_ by f
chosen at random from G{Ti}' Then, by matrix algebra, the
last derivation is solved for PC. The result is that the
exact probability distribution of the number of components
under the random mapping “(r1} is found in terms of é/s.
Since the é/5,//‘= l, 2, ...., N, can be calculated by
the first set of formulas, the distribution of c can be
computed.

The results for the collection, G{T13 of functions
are easily adapted to give similar results for the classes
GI. and G1 by making prOper changes in the values of r1 in
the formulas.

The hollow case is treated in a manner parallel to
the general case in as much asthe subsets Hfrgpﬁr and H1

are considered in turn. Moreover, the results of the

,‘

 

Ll.
general case can be adapted to this case by making prOper
changes in the limits of summation and proper substitutions
in the formulas for the auxiliary sums to account for the
fact that no point may map into itself.

The numerical examples which are included for each
of the above cases indicate that considerable computation
is involved in finding the exact distribution. The examples
which are presented are relatively trivial. In order to
minimize the work involved in more complicated cases a
binomial approximation to each distribution is derived and

illustrated by numerical examples.

1:3, Previous work. If 3718 restricted to the class

of permutations of N elements, components become cycles.
Under a permutation any point of the finite set.l—1_ maps
into a second, the second maps into a third, etc. until

at some stage a point maps into the first member of the
sequence. At this stage a cycle is formed. Gontcharoff [I]
has given several moment generating functions for the

number of cycles of the elements of a permutation group

of N elements. His results of the asymptotic behavior

of the distribution of the number of components are repeated
in different forms by others. Feller [2] reports that if c
is the number of cycles formed, the distribution of c is
asymptotically normal with mean equal to log N and standard
)1/2

deviation equal to (log N . More recently Greenwood [3]

5

showed by different methods that for large N, the mean of
the number of cycles is approximately equal to log N + C,
where C is Euler's constant, and that the variance is
approximately equal to log N + C 4F; .

The case where 31s the class of single - valued
mappings on N elements was considered in part by MetrOpolis
and Ulam [N] . (This is the class which is called G1 in
section 3:3.) They defined the study of a random function
as the study of the probability distribution over the
.j_lgj_l' sample points of f(x) on.Jr1_ into j71_. Using
the word tree instead of component, they proposed the question
of the expected number of components under random mapping.
They suggested that the answer is no doubt of the order of
log N and base their surmise on the result which was pre-
viously obtained for cycles.

Kruskal [75] describes the structure of a component
as consisting of a cycle together with a number of trees
rooted in the elements of the cycle. This is consistent
with the definition used in this thesis. He, too, considered
the class of single - valued mappings and in particular
dealt with the question proposed by Metropolis and Ulam.

He showed that the eXpected number, E, of components under

a random mapping function is

N n:
(1‘1) E =2 (N-ﬂlmlﬂ‘ -

m=l

 

6
Moreover, for large N this reduces to the following asymptotic
result,
(1:2) E.aul/2 (log 2N + C), where Clis Euler's constant.
This dissertation will present the entire distribution

for multiple - valued functions as well as single - valued

functions. Thus its results extend well beyond those given

by Kruskal.

r.‘ m-‘ "*0" v- 2 (‘1
Cu. ILE C-LJ'LVLSA-‘ULIJ 01594:;

If no restriction is placed on the mapping of g 2
into $ 2 , it is conceivable that a point could map int'
itself. he shall refer to this type of mapping as the
general case. The collection, F, of all functions on g 2
into g 2 is admittedly broad for it does not specify the
number of images of the different points of S 2 . There—
fore the subclass, G{r- , is considered. This class consists
13
of those functions, f, under which each point x.£:§ 2
1
has r1 images in S 2 , i = l, a, ...., h. It is assumed
:- I . 1‘ "r h .,
that; r1, 1'2, 00., r1; are mom}. Further, for eacn f‘GIri‘
the mapping of § 2 into g 2 is unique. Equal probability
is a“signed to each f in G . Since to each f in Gir‘ there
{Pi} 1}
corresponds a number, c, which is the n*.ber of components
induced in § 2 by this f, c becomes a random variable if

f is selected at random from G{r- .
1}

2:1. The calculation formgla for the auxiliary
sumikgpc. In order to begin the derivation of this formula
it is necessary to introduce some notation.

(2:1) Let .3: {i} , i = 1, 2, ...., N be the set of
indices 0 the points Xian .

(232) Let («1, a2, 0000, aft) be a/U-part partition

ofﬁinto subsets such that:
(a) a3 is non ~ empty for each j.§}z
(b) a. . a = 0 for 3 f k
J k
(C) a1+a3+oooo+ayu=§o
(2:3) Letcﬂ'aJ be that subset of elements of j_2_
with indices belonging to a1, 3 = l, 2, "'°L/* .
(2zb) Let E(a1’ Q3, ....’ §/‘) be the event when.1_2_
is divided into subsets in accordance with (2:2)
and (2:3),«J(1L3 (3 = 1, 2, ....,/¢) has the

properties:

(a) f( (90,3)C «22,3

(b) f-1 (waj)c. ”JG
J

It is emphasized that none of the subsets ¢%_,
1

was,

°qu.need be minimal subsets having the
properties (a) and (b) of (2:4). therefore, E(a1, a3, ...., 9")
is not equivalent with decomposition of f 2 into

components.
(2:5) Let p(a1’ 02, ...., 99;) be the probability of

the occurrence of E(a,, a2, 97‘) on the

condition that f is a random selection frmn(?¥£f

2: L S =
( 6) et /“ 2::EEEE:T p(a1, a2, ...., €72),

(a1) 0.2, 0900, a/)
/é¢= l, 2, ...., N, where the sum is over all possible choices
of (G1, 02, ...., g”) in accordance with (2:2) for a fixed

value of/ .

4 9

21

 

(2:7) Let kJ = the number of indices in a J = l,

j,
2, °'°°b/b . This implies that there are k1

elements in u) , k3 elements in a? etc.

(1.1 a3,
Consistent with (2:2) kJ :> 0 and k1 + k3 + .... +

= N

13..
The indicated summation in (2:6) can now be considered

in two parts. By theorem 3, section b, chapter 2 of Feller'l2i] ,

for any fixed set of values (k1, k2, ...., k/a ), there

N
are ways in which the N indices of .J}
k1, k2, 0.00, y

can be divided into/u.groups of which the first contains

k1 indices, the second contains k3 indices, etc. Here

N

is the multinomial coefficient and
k1, kg, 0000, 1y

 

equals N: . However, (k1, k2, ...., k/‘)

k1: kg: 0000 1%:

can take on different values and still satisfy the conditions
of (2:7). For each choice of (k1, k3, ...., k}. ) the
different choices of (a1, a2, ...., d/i ) must be considered.
For convenience, therefore, it is possible to write (2:6)

as a double sum,

2 EM
(2:6') 5 = " ,
/‘ ' )p(a’1, 0-2, 000a )

(k1, k2, coco, 1a“) (a1, a2, 0000, 6/“ /"
(k)

The symbol means that for a fixed set

(11, a2. 0.0., Gy‘

of values (k1, k2, ...., k/‘) the sum is taken over all

10
possible choices of (a1, a; ...., a”) in accordance
with (2:2).
To find a workable expression for §u, it is necessary

to evaluate p(a1, a2, . In order to do this,

...., 0y) f
each flfdpulis represented by an N by N matrix, Bf = (big)
where bij is either zero or unity. This representation
is based on an approach suggested for sociometric data by
Forsyth and. Katz [6] and by Katz [7] . It was used in

a similar way by Katz and Powell [3] . Each of the rows
and columns of Bf is representative of a point of,J—),.
The elements of each row denote the mapping of the point
of.J_2_ which this row represents. For example, the 1_§h
row of Bf describes the mapping of x1 En . If biq
is unity, then x1 maps, under f, into xq En . If

biq is zero, then x1 does not map into xq. Since the functions
f3 ($.13 have been assigned equal probability, the ri ones

N

r1 WayS, 1 = l, 2, .000, N.

in row x1 ay with edual probability appear in this row
in any of<f

The event that Bf has the form

811

 

 

(2:8)

(Ii
l

 

 

11

is equivalent to the event, Eal’ ea, In (2:8)

0..., a”).
Ostands for an array of zeros, B“ is a k1 by k3
principal minor of Bf, j = l, 2, ""b"' This form is
equivalent to E(a,, as, ...., g“) because the k1 rows of
B11 can represent the elements of¢Ra1, those of B23

can represent the elements ofcd

a3, etc. Since unity can

appear only in the positions indicated in (2:8), the

properties, f(:»aj)<::aa3 and f"1 (¢”aj)C:“aJ, j = l, 2, ....,/h-

are obviously satisfied.
r1 . . . .
ihe procedure for the evaluation of P(a1, a3, ...., q”)

is as follows. For a fixed set of values (k1, k2, ...., 5,)
N
which satisfy (2:7), one of the k1, k3, ...., gs choices

of elements from ( 2 to represent the subsets “31, ‘Daz,

Log” is made. By interchanging rows and correSponding
columns,Bf is arranged that the rows and columns representing
the k1 points oftéil appear in the first k1 positions of

Bf, those representing the k; points ofwa2 next, etc.

For purposes of identification the rows of Bf are then labeled

from top to bottom: x11, x12, .... Xlk ; X21, X22,

X2 °
00., ’
1 k3

...............; xkl, §“2’ ...,, x,%». The correSponding

number of ones in these rows is denoted by: r11, r12, ....,

rlk;r2 , I‘a , coco, Pap; 000000; I" , l" , .000, I" o
1 3 t A
1 3 1 ’“3

The following auxiliary notation is now introduced:

= probability that the ri ones in the lib

J 3

row of B11 are in the k1 columns of B11 as indicated

1

in (2:8), 1 = l, 2, "'°b/“ and J = l, 2, ...., k1.
In row xi‘j , the rij ones could appear in any of the
N positions in the row if no restrictions were placed on
Bf. Therefore there are @12ways in which they could be
arranged in this row. Howeier, if these r1J ones are to
appear only in the ki columns of Bf to satisfy (2:8), than

k
there are only (r13) ways in which they could be arranged.

The result is

(2:10) p13 = (N ,
I'
13

In order to have Bf take form (2:8), the events for

 

i = l, 2, ....7#- and J = l, 2, ...., k1.

which the probabilities are given in (2:10) must all occur
at the same time. Since these events are all independent,
the probability of their simultaneous occurrence is the

product of these probabilities.

Therefore, (k1)
r13

(2:11) p(a,1, (13, 0000, a/o)=1’ﬁl ‘IT N
= l J: l r13)

For a fixed set of values of (k1, k2, ...., k/.)

the above procedure would be the same for each of the

la
(
k1, kg, 0..., k ChOlceS Of elements from g 2 t0 fOI'ID

the subsets ”an “3&2, won . Since (2:11) is

.00, ﬂ
3 general result, it is the expression which can be used
in (2:6') to give

(k) (k1)
(2:12) S“ = :EEk :EEE 2TT’% T.2?$l_ :
)i= 1 J- =1 N
r13

(k1, k3, ..oo, ) ((11, a”"’/‘)

/‘= l, 2, ...., N.

The above derivation together with the limits on
(k1, k3, ...., k%‘) which are given in (2:7) and the fact
that the factors of the denominators of (2:12) come from
each of the N rows of Bf constitutes a proof of
THEOREM 1. If Nels the number of elements in_(fl_,
if ru' is the number of images of the (point xv eﬂ ,
v = 1, 2, ...., N and if (on as, ...., (175)18

defined by (2:2)4_the alue of the auxiliary sum,

371;}:jen mg 37k) ﬁk (‘ 1}

k1,ka,...,5k>0(a1,aa,...,% ) 1:13: rij
= N

 

(2:12' ) Sf
k1+k2+oooo+

//‘ = l, 2, 000., N.

Formula (2:12') provides the means for computing

k}‘

3/. , /:= 1, 2, ...., N, from the initial conditions (i.e./
with N given and r1, 1 = 1, 2, ...., N known, the auxiliary
sums can be calculated). The values of S,.will be used to find

the exact probability distribution.

14

2:2. The exact probability distribution of the

o co )0 e . In order to find the exact distri-
bution of the number of components, it is necessary to obtain
an expression involving the probability that an ftC-(rﬂ
selected at random induces exactly c components in 1-2, .
This is done by finding a formula for 5/“ ,/4= 1, 2, ..., N
by a different method.

Before deriving this expression for §/" we prove:

LEMMA 1. If M (c,/t) the number of ways in
7

 

which c components can be distributed into/u
subsets withinone of themibeinaeembty and if

[t
c are Stirlingnnunbers of the second kind

1.

9

then
J”

o “I = '

(2.13) :1 (°’/‘) /. c
P3 OF. The distribution of one or more components

into a subset implies that all the elements of the com-
ponents are elements of the su-set and that the total
number of elements in the comoonents equals the number in
the subset. No regard is given to arrangement of components
within the subset.

WhitworthIlQ] in proposition.XXII proves that the

number of ways in which c different things can be distributed

 

m m n m n
Jordon [9] as: J = A X, , where A x =
n m. 0

x:

Stirling numbers of the second kind are defined by

 

(X+xﬂ)n-£H(X+m-l)n + m (In-l) (x+m-2)n-oooooooo
2}

15
into/u parcels (without blank lots) is 0! times the co-

0 in the eXpansion of (eX - 17“. He describes

efficient of x
"different things" as those which, for purposes of the
problem, are not identical and he defines "parcels" as an
unarranged class. Therefore, it is consistent to allow
"things" to be considered "components" and "parcels" to be
"subsets". Thus M (c,/) equals c: multiplied by the

coefficient of X0 in the eXpansion of (ex - lY“.

Jordan [9] in section 71, formula (5) shows that

no ' /‘
(2:14) (eX - 1)’“ = g 4:: Jr: xc .
C a“

 

Therefore,
J”
(2:13) M (c,/«-) =/4-! c- and the proof of lemma 1
is complete.
It is now possible to prove

THEOREM 2. If PC is the probability of exactly

 

,ﬁ
0 components and if c are Stirli q numbers of

 

the second kind, the value of_;;uisusiven_by:

N
,ﬁ
(2:15) 8/ =ﬂ: CZ P024 ,/‘= l, 2, 000, No
‘7‘

PROOF. In accordance with the hypothesis of lemma 1,

the event E(a1 a2
’ 9

0...,

a ) can be described by a con-
/‘
dition which is equivalent to those given in (2:4).

(2:16) = the event that when.IjL

E
((11, a2, 0000’ cf)

is divided intol/csubsets, (‘Ja1, “9G2, ...,

in accordance with (2:2) and (2:3), it is true

that the c components induced in.jrl_ by f can

be distributed into these’;£subsets with none

of them being empty for all c if“

§, is the sum of the probabilities of the occurrence
of 3(01, a2, ..., 9y~)' If there are a fixed number of
components, each of the ways of distributing these components
into the/ subsets of n constitutes an occurrence of
E(Q1, “a, ..., 9/‘). Consequently,
N

(2:17) 8/.= €130 . M (c,/l-¢) , /u-= l, 2, ..., N.

C

Using lemma 1 with (2:17), theorem 2 is proved.
The next step in deriving the exact probability
distribution is to solve (2:15) for PC.
THEOREM 3. If §7¢&7u= l. 2~er n... N is given

by (2:12{) and_if

 

 

(2:18) w = 45'
f #0
and if %#f are Stirlinx_numbers of the first kind3°,
then
2

Stirling numbers of the first kind are defined by
m
Jordan [SJ as Sn = [4; Dm (X)njx = 0’ where 1351 (x)n is the

H1

m.§h derivative of: x(x - l) (x - 2) ....... (x - n + l).

l7

0
(2:19) Pc = ‘22: w s , c = 1, 2, ....., N.

PROOF. Using (2:18), (2:15) becomes

N /‘
(2:20) w/. = f P54: ,/= 1, 2, ...., N.
C =/l

In matrix notation (2:20) is written:
(2:20') W = PJ, where

(2:21) w

(VII W3 W3 000000 WN) ’

(2:22) P

(P1P2P3 .0000. PN) and

O O‘COOOOOOO

O
O 0.0.0.000
$3
I 0 ‘
. 300......

1: ,1ng

’14
Since J11 == 0 for all/ >n , J18 a triangular matrix.

Moreover, J: = l for all 11. Therefore, [J]: l , J13

non-singular and has an inverse,‘{-1 , which is also a triangular

F;

”H

(2:23) J =

(J

 

”unis”

 

1
N

sinus;

matrix of the same form. From (2:20') ,

(2:2e) P = w.£“ .

.1

/‘

 

18
Using Jordan [9] , it will be shown that:

O 0 00......

Sa 0 0.0.0...
2
-1 2 a
(2:25),,9 = 5 s3 s3 ,
3 3
SN SN

 

c
where é“ are Stirling numbers of the first kind. Jordan

shows n 1
:EEE m m
(2:26) 1,, S = S .
1 = m i n
m
wherecsn' is the Kronecker delta. This implies
(2:27) 3 = I and s = J“ .
Now (2:24) becomes
(2:24') P = w S

3

which in non - matrix notation is (2:19) and theorem 3 is
proved.

Formula (2:19) is analogous to the formula (3.1)
for the combination of events of chapter # of Feller [2i]
The method of inclusion and exclusion used by Feller is
different and does not work in the proof of (2:19). Katz [11;]

employed the method used in this thesis. It involves setting

 

 

~1l1fl‘ll.

19

up a system of equations involving the desired quantity.
(In our case the desired quantity was PC.) Then this system
is solved so that this desired quantity is expressed ex-
plicitly in terms of other calculable quantities.

A short table of values of Stirling numbers of the
first kind from Jordan [9;] is included in table 1 so they
can be used in the computation of numerical examples.

TABLE 1

STIRLING NUMBERS OF THE FIRST KIND ~ s2

 

V
...J
N

 

3 h 5 6 7 8

/
1 1 0 0 0 0 0 0 o
2 ~1 1 0 0 0 0 0 0
3 2 .3 i 0 0 0 0 0
'u ~6 11 ~6 1 0 0 0 0
5 24 -50 35 ~10 1 0 0 0
6 ~120 27a ~225 85 ~15 1 0 0
7 720 ~176u 162u ~735 175 ~21 1 0
8 ~5040 13068 ~13132 6769 ~1960 322 ~28 1

 

These numbers are related by a recurrence formula:
-1
3;“ = s": ~/.§E with 5% = 8) and 8/2 = 0. Thus, the

table may be extended indefinitely.

2:3 A numerical example. Considequdi defined by

 

I-

20
N = 6 and r1 = (1, 1, 2, 2, 3, 3). The values of the
auxiliary sums by (2:12') are as follows.

If/l=l, k1=N and Sl=l.

Since (if) ; 0 when r>k,

 

 

 

 

5" "‘ (3)7573 )2 (“(136%) + 2[@Z:@
+~(:)(:)(:)(:)(:)‘ (3(1)?) with”

s. = x 3 26121:) +6 Midi)“
+ 4:1th

5., S and S6 all vanish because it is impossible to choose

5

k
the k's so that a116,?) > O in a given product.

Using (2:18) and (2:19) with table 1, the exact

probability distribution of the number of components is:

11567.385
1.62 x 10

52.312

6

 

1.62 x 10

303
1.62 x 10

6

6

I?

ll'

.967,522

.032,291

.000,187

21

3:1, The class G . A s ecial case of G of
r p {1“ 1)

some importance in applications is the class Gr' Under
each 1‘ EGr, each point of _(_)_ has the same number, r, of
images. Any situation where each element under consideration
maps into the same number of elements is of this type. While
this case was implicitly covered in chapter 2, it is inter-
esting to note what effect this particular kind of mapping
will have on the computation formulas of the probability
distribution.

Since Gr is a special. case of G{P1}’ the results of
chapter 2 apply and can be modified to fit the situation
that r1 = r. The modification of the formula for §ﬂ.,
/u = l, 2, ...., N, is threefold. First, with r1 = r the
subscripts on r can be removed. Second, the factOrs obtained
from B11, 1 =.l, ""b/‘ , are all equal and the double
product of (2:12') can be changed to a single product with
exponents. Third, for a fixed set of values, (R1, R2, ...., g»),
the k1, is, ...., kg choices of elements from JT)_ to form
the subsets “gals“, u) ) will all give the same

as, 0..., a”

value for p(a a ) because the number of images
1:

a2, 0000,
of each point of (') is the same. Therefore, the indicated

(k)
summation, ' :EEE can be replaced by the

((1.1, a2, 0..., $4)

 

 

m‘.

k2: .... g 1% . 1113 I.€S‘L11L is

 

f'dCtC'I‘q

COROLLARY 1. If h is the number of points in

 

( 2 and r is the number of images of each point,

 

then it falls s from theorem 1 that

(7 g (1, it

kl+ k2* ... + kp.= N

H

 

(3:1) 9” s

/: l, 2’ 00..., lg.

RLXARK 1. For computational purposes (3:1)

k1,};2, ..., I?)

can be written as:
000, §¢§>O

S
f: [Tl—“(ran
kl* k2+ 00. + y: I:

7) (r) k: __ ,,
i=lfki J ‘L ’/“- l, a, 000., 1:,

Where IVA—\r) = I¢(:‘T - 1) 0.000. (N ’ r + 1)

(3:1')

Since the expression for PC in (2:19) does not depend
directly on the value of r but rather on the values of
s/(, f‘ 3 l, 2. .0000 N’ the DrObabilitv éiStribUtiOn for

this case can be computed as for the class G{+.}.
'1

3:2. A numerical example. Consider Gr to be defined
by N = 6 and.1'= 2. The values of the auxiliary sums by

(3:1') are

24

1 _
- (12 H(2) +

2 (30) :——-32{ é64) ) (3’ 3 W()} 9375

(2,6 2, 2) (2)2 = 2

U)
|

(D
II

3 (30)6 253,125
k
Since it is impossible to choose the k's so that all( j)>0
r

in a given product, 84, S5, and S6 all vanish.

The exact probability distribution as computed by

(2:19) is:
757.595

759.375

P1

.997 , 656

(lo

.002,343
2 759,375

..3;___.
3 759,375

’11
II
II'

.000,001

Pu = P5 = P6 = 0

3:3. The class, G1. The class of functions, f,

 

under which each point of J?)_ has only one image in.Jr)_

is called G1. This is the class of mappings considered by
Metropolis and Ulam [4.] and Kruskal [5.]. Since G 16: Gr ,
the results contained in (3:1) apply in this case. However,
because of prior interest in this class of functions, it is

appropriate to show explicitly the eXpressions which apply

to this particular case.

25
Since each point of.j—l_ has only one image in
_r1_, r = l. The result which follows from (3:1) is given in
COROLLARY 2. If N is the number of elements
in.Jr1_ and each point in 1—1_ has only one

image, then

N
(3:2) S/l. = j— Z 61, k2, coo, k

k1, k2, coo, 5g>0 /‘
k1 + k2 + v99 +’¥ys= N

’ff 1‘1 _
, /= 1, 2’ .00., N.

i=1(k1)

Recently tables of the binomial probability dis-
tribution [l2;] have been published. An.alternative form
of (3:2) for which these tables are useful is given in

THEOREM b. If S , A/¢= 1, 2, ...., N, is given

 

by (3:2) and if b (k; n, p) =(ﬁ) k qn'"k is the

binomial probability distributioni_then

 

k k3
. 1 .
(3:3) 8% = b(k1; N, ﬁ)b(k3; NI "' k1,—--——
k1, k2) 000: 5p:.0 N-k1
k1 “I" k; + ... + k/‘ = N
k _
f 1
oooooooob k . N "’ k ”...-k ‘—
(:u_1’ 1 d /“_2) N _ k1"'°‘5u-2 7

/‘=1, 2, .0000, No

 

26
N
T 1. N:
PdOOb‘ C K k =k k k '
1’ 2, 000, I“ 1. 2o .oo /ﬂo

N“k1 ): #4.... (N“k L-°‘°’—Kﬂ:2):
k1. NEN- -k1)o .kao (N-k1.k3)y. -.TN‘k1;.og-k )1 k ‘

:(SVXNLC E)..n('k;1:-....-k/9 -
1:1) ki ‘63—) (J-Y' .033», With

k1, kg, 0..., 1;.>O 811d kﬂzlq-kl ..kz - 0.0 -k

 

 

 

 

ﬂ'i’
From these facts, (3:2) can be written as a
"telescoping product",

€;)@ kN-kl N—kl
o 3 _
(3.2 ) é“ _
k1,k3,ooo,/u:ON
k1+k2+ ...+k
q‘kl akzN I\I_.1{1m.1{2 N—kl-ka .
N“ N-k1
k ..
...... N-k1-...-%“-2 Eﬂ-l ’:r- f 1
%ﬂ_1 N-kl‘oo.“%‘_2
N-k ...-k — N-k1-...-k -
1 ,ﬁ 1 I“ 1 4y“: 1, 2, oooo N0
N‘kl“. "—kﬁ"2

By the hypothesis of theorem 4, (3:3) follows immediately

from (3:2') and the theorem is proved.

27
Again (2:19) can be used to compute the exact
probability distribution because this formula is not directly
dependent on the value of r but rather on.the value of @h

which must be computed by either (3:2) or 3:3).

3:4, A numerical example. Consider G1 to be defined

by N = 8 and r = l. The values of the auxiliary sums as
computed by (3:3) are:

31 = 1
b(l;8,1/8) + b(2;8,1/u) + b(3;8,3/8) + b(u;8,1/2 )
+ b(5;8,5/8) + b(6;8,3/u) + b(7;8,7/8). I

N
Since a) = (gﬁg) , it follows from the hypothesis of

N k N-k N N-k k
theorem 1+ that (k (g) (13:3) =( Xi) (K)
. N N -k N N

and as a result,

S

2

(48) b(k;N,§ ) = b(N~k;N, Egg ) and 32 becomes:

52 = 2b(1;8,1/8) + 2b(2;8,1/4) + 2b(3;8,3/8) + b(u;8,1/2) .
Using tables of the binomial distribution [12] for the

values which it contains and computation by means of the

hypothesis of theorem 4 for other values:

s2 = .785,391,8 + .622,924,8 + .563,263,8 + .273,437,6

m
N
ll

2.245,018,o

b(1;8,1/8) [éb(1;7,1/7> + 2b(2;7,2/7) + 2b(3;7,3/7i]
+ b(2;8,1/u) [sb(1;6,1/6) + 2b(2;6,1/3) + b(3;6,1/2{]
+ b(3;8,3/8) [2b(1;5,1/5) + 2b(2;5,2/5)]

+ b(u;8,l/2) [eb(1;u,l/u) + b(2;u,l/zﬂ

+ b(5;8,5/8) [}b(1;3,1/3{]+ b(6;8,3/u)b(1;2,l/2)

U)
K»
II

28

~792,514,8 + ~552,7“9,6 + .425,376,8 + 333,252,1

+

2.

.250,339,4 + .155,731,2
509,963,9

b(l;8,l/8)b(l;7,1/7)[;b(l;6,l/6) + 2b(2;6,1/3)

+

+

+

+

+

+

+

b(3;6,1/2)]+ b(1;8,l/8)b(2;7,2/7) Eb(1;5,1/5)
2b(2;5,2/5iI+ b(1;8,1/8)b(3;7,3/7)[éb(1;u,1/u)
b(2;4,1/2ﬂ+ b(1;8,1/8)b(u;7,”/7)[213(1;3,1/3J
b(1;8,1/8)b(5;7,5/7)b(1;2,1/2)
b(2;8,1/u)b(1;6,1/6)[%b(1;5,1/5) + 2b(2;5,2/5i]
b(2;8,l/4)b(2;6,1/3) 2b(1;u,1/u) + b(2;u,l/2)
b(2;8,1/4)b(3;6,1/2)2b(l;3,l/3)
b<2;8,1/4)b(u;6,2/3)b(1;2,l/2)
b(3;8,3/8)b(l;5,1/5)[%b(l;4,1/4) + b(2;h,l/2i]
b(3;8,3/8)b(2;5,2/5) 2b(1;3,1/3)
b(3;8,3/8)b(3;5,3/5>b(1;2,1/2)
b(4;8,1/2)b(1;4,l/4) 2b(1;3,1/3) r
b(u;8,1/2)b(2;4,1/2)b(1;2,l
b(5;8,5/8)b(l;3,1/3)b(l;2,1/2)

/2)

.276,37u,8 + .189,056,3 + .1uo,59o,7 + .102,539,1

+

+0

+

l.

.102,539,1 + .062,584,8 + .189,056,3
124,969,5 + .086,517,3 + .086,517,3 + .1u0,59o,7
.ou8,666,o + .051,269,6 + 062,5au,9

715,120,9

29

55 = b(l;8,l/8)b(l;7,l/7) {b(1;6,1/6)Eb(1;5,1/5)

+ 2b(2;5,2/5£]+b(2;6,1/3)[%b(1;4,1/4) + b(2;#,l/2i]
+ b(3;6,l/2)2b(l;3,1/3»b(4;6,2/3)b(l;2,l/2i?

+ b(l;8,l/8)'D(2;7,2/7) {b(l;5,l/5) [2b(1;1+,1/4)

+ b(2;4,1/2_)]+ b(2;5,2/5)2b(l;3,l/3)

+ b(3;5,3/5)b(l;2,1/2i}+ b(1;8,1/8)b(3;7,3/7)
w{£(1;4,1/u)2b(1;3,1/3) + b(2;4,1/2)b(1;2,1/2)

+ b(1;3,1/3)b(l;2,1/2)}

+ b<2;a,l/u)b(1;6,1/6){b(1;5,1/5)I_’2b(1;z+,l/u)

+ b(2;l+,1/2)] + b(2;5,2/5) [2b(1;3,1/3 + b(l;2,l/2)]
+ b(2;8,1/4)b(2;6,1/3)‘{£(1;4,1/4)2b(1;3,1/3)

+ b(2;u,l/2)b(l;2,1/2i}

+ b(2;8,1/u)b(3;6,1/2)b(1;3,1/3)b(1;2,1/2)

+ b(3;8,3/8)b(1;5,1/5) {B(1;4,1/4)2b(1;3,1/3)

+ b(2;4,l/2)b(l;2,l/2i}

+ b(3;8,3/8)b(2;5,2/5)b(l;3,l/3)b(l;2,l/2)

+ b(u;8,1/2)b(1;u,l/u)b(1;3,1/3)b(1;2,1/2)

- .094,528,2 + .062,48u,7 + .043,258,7 + .025,63u,8

+.062,u84,7 +.o38,u52,1 + .021,629,3 + .043,258,7

+ .021,629,3 + .025,63u,8 + .062,484,7 + .038,452,1

+ .021,629,3 + .038,452,1 + .019,226,0 + .021,629,3

+ .ou3,258,7 + .021,629,3 + .021,629,3 + .025,63u,8
.753,020,9

b(l;8,1/8)b(l;7,1/7)b(l;6,l/6) b(1;5,1/5)l%b(1;u,11h)
+ b(2;4,1/2i]+ b(2;5,2/5)[§b(1;3,1/3) + b(l;2,1/ 2i]

+ b(1;8,1/8)b(1;7,1/7)b(2;6,1/3){%(1;4,1/u)2b(1;31/5)

- . . .
.‘
. u . , . l r
( 4 d P I n I
. r . l' I
' I . I . n 1
i . J . p . n
. g. . ,
C ’ ' t p Q ,
. g , : . .

‘ f t 1‘ ‘ t ‘ ’
. p . 4
\
( ‘ ’ < < ’ ‘ C
‘ é ‘ (' p ( F. 4
. t ' 4 I f f 4 {
' ' ' ‘ - " f * '2 I
‘ t o t . . :
'i .
7 t: I a -. . - (I
r“ I . ‘ ' v .
,-
‘ f ~ 5 . 5 ’ ,
_ , t l! , ‘v

30
+ b(2;4,1/2)b(1;2,1/2)

+ b(l;8,1/8)b(l;7,l/7)b(3;6,1/2)b(l;3,l/3)b(l;2,l/2)
+ b(1;8,1/8)b(2;7,2/7)b(1;5,1/5)b(1;u,1/u)2b(133,1/3)
+ b<1;8,1/8)b(2;7,2/7)b(1;5,1/5)b(2;4,1/2)b(1;2,1/2)
+ b(1;8,1/8)b(2;7,2/7)b(2;5,2/5)b(1;3,1/3)b(1;2,1/2)
+ b(1;8,1/8)b(3;7,3/7)b(1;u,1/u)b(1;3,1/3)b(1;2,1/2)
+ H2;8,1/4)b(1;6,l/6)b(1;5,1/5)b(1;u,1/t+)2b(1;3,1/3)
+ b(2;8,1/u)b(1;6,1/6)b(1;5,1/5)b(2;4,1/2)b(1;2,1/2)
+ b(2;8,1/4)b(l;6,l/6)b(2;552/5)b(l;3,l/3)b(l;2,1/2)
+ b(3;8,3/8)b(1;5,1/5)b(1;u,1/4)b(1;3,l/3)b(1;2,l/2)
.O6l,283,1 + .028,839,1 + .01o,814,7 + .019,226,1

+ .009,613,0 + .009,613,0 + .01o,81u,7 + .019,226,1
+ .OO9,613,0 + .009,613,0 + ,009,6l3,0 + .OlO,814,7
.209,093,3

b(1;8,1/8)b(1;7,1/7>b(1;6,1/6)b(1;5,1/5) b(1;u,1/u)
2b(1;3,1/3) + b(2;1+,1/2)b(1;2,l/2)}+ b(1;8,1/8)
b(1;7,l/7)b(1;6,1/6)b(2;5,2/5)b(1;3,1/3)b(1;2,1/2)

+ b(1;8,1/8)b(1;7,l/7)b(2;6,1/3)b(1;4,1/4)
b(1;3,l/3)b(1;2,l/2) + b(l;8,l/8)b(2;7,2/7)
b(1;5,1/5)b(1;u,1/u)b(1;3,1/3)b(1;2,1/2)

+ b<2;8,1/u)b(1;6,1/6)b(1;5,1/5)b(1;4,1/u)
b(l;3,1/3)b(1;2,1/2)

.01u,419,6 + .004,806,5 + .004,806,5 + .oou,806,5

+ .Oou,806,5

= .033,645,6

31

88 = b(l;8,1/8)b(l;7,l/7)b(l;6,1/6)b(l;5,l/5)
b(1;4,1/u)b(1;3,1/3)b(1;2,1/2)
$8 = .002,u03,3

Using these results together with table 1 and (2:19),
the exact probability distribution is:
P1 = .405,628
P2 = .408,440
P3 = .153,895
P4 = ,028,893
P5 = .002,970
P6 = .000,169
P7 = ,ooo,005
P8 = .ooo,ooo,06

4. THE HOLLOW CASE

4:;. Preliminarie . If the mapping of J71. into
,J_l_ is restricted so that no point is permitted to map

into itself, the mapping is called hollow. The subclass
of functions which represents this type is called Hﬂ‘ﬁ'
Analogous to GFE3’ HF3}18 composed of functions, f, under
which each xi£ﬂ has r1, 1 = l, 2, ...., N, images in
D. .

Hollow mapping is of Special interest in the field of
social psychology. In sociometric tests an individual chooses
the individuals in a group with whom he wishes to be associ-
ated. In some cases a variety in the number of choices
made by an individual is permitted. In other cases, all
individuals must make an equal number of choices. In still
other instances only the prime choice is made. If each
of the N individuals making the choices is considered to
correspond to a point x1 and if his choices for associates
correspond to the ri images of x1, 1 = l, 2, ...., N,
then a hollow mapping situation exists provided no in-
dividual is permitted to choose to be associated with
himself.

The number of choices permitted in different in-

stances gives rise to different subsets of hollow mappings.

33

The situation where there is a variety in the number of
answers by different individuals in the group is covered

by the class,EQ}i}, If Hr and H1 are subsets of functions
defined for the hollow case as Gr and G1 were for the
general case, then.Hr covers the situation where each
individual in the group chooses the same number of associates
and H1 covers the case where each individual makes only the

one best choice. Hr and H1 are considered in chapter 5.

“:2. The auxiliary sums and the exact probability
distribution of the number of components. Since HF$CC¥'ﬁ’
the results of chapter 2 can be adapted to the hollow case.

To make the event, E(a possible under hollow
1’

a2, 0000’ 9‘),
mapping there must be at least two elements in the subset
waj, J = l, 2, 00.0%. This mews the k3, J = l, 2’ 9.9,,/,

f
must be greater than one. The matrix representation, Bf = (bij)

f
is different in that bjj a O for all j = l, 2, ...., N.
Consequently, the required form of Bf, which is equivalent

to the event, E(al given in (2:8), is modified
’

C12, ..., 07a)
in that the main diagonal elements of the principal minors,
B11, 1 = 1, 2, ....,/AC, are all zeros. Therefore, the
r1J ones in the j.§h row of B11, 1 = l, 2, ....5/4 and
J = l, 2, ...., k1 may appear only in any of the remaining
(k1 - 1) positions if Bf is to have the form equivalent to

(2:8). At the same time if no restrictions are placed on

34
the form of Bf, the ones in any row could appear in any
of (N - 1) positions. Since k >1, j = l, 2, ....f,

i
there could be at most[-1-:-]subsets formed from the index set,'~‘9,

where [-12] is the largest integer in the quotient Eg- .
Moreover, there could be at most[§»components induced in
[L by a function, f.

Using the above facts, formulas for the auxiliary
sums, §,.,‘/:= l, 2, ...., [g , and for the probability,Pc,
of exactly 0 components, 0 = l, 2, ...., [-2-], must be modified
in order to make them valid for the hollow case. The theorems
which were proved for the general case are now listed
with proper modifications for the hollow case. They are
numbered with primes to show the correspondence between the
cases.

THEOREM 1'. If N is_th£ number of elements Ln

( 2 and rv is the number of images of the point

 

x1381 L, D21, 2, 0000, N, and if (a , a2, 0000,},)

 

 

is defined by (2: 2) then the value of wggis‘piver by

(k)
l :jEE;:T:k EE::)
(Lnl) S = TN
N;l)kl,k2, ’00., ”>1 (0.1,C12, 0000,07:
"‘ kl+k2 +oooo+kﬁk=N

”Ii k1 (kl-l) [N]
= l, 2, .00., -- 0
i=1 3.11 “3 '/ 2

 

o.l

35
THEOREM 2' . If Pc is the probability of exactly

 

,1“ ‘
0 components and.J: are btirling numbers of the

 

second kind,_then
P]
(4:2) 3/. s/u: 2 PC %ﬁ,/= 1, 2, ...., [1%].
c ‘7‘

_ S,“
THEOREM 3.0 If ‘9‘ = 7 9 ﬂ: 1, 2, 0.00, g ’

 

given by (4:1), and if §2 are Stirling numbers

 

of th

 

Consider thgdefined by
N = 6 and r1 = (1, l, 2, 2, 3,‘3). The values of the

auxiliary sums are, by (4:1):

52 75%)? 5)2{2®?3 .2)(3)2 = 18

52 = .000,072

 

 

x105

83’ 84’ 55’ and 36 all vanish.

By (4:3), the exact probability distribution for this

example is:

399,964

.ooo,036

P
u = P '
5-P6=O

36

5. PARTICULAR cases 0:»- HOLLOW MAPPING

5:1. The class Hr. A Special case of Hfri} for

 

the hollow case is the class Hr which is defined as Gr
was for the general case. (That is each point of Jrl_ has
the same number, r, of images in Jrl_.) Although this
class was mentioned in section 4:1 and implicitly considered
in section 4:2, for completeness, the exact formulas for
computation are shown. The corollaries are numbered with
primes to Show the parallelism between the results for
Gr in section 3:1 and those for Hr' The reasons for the
modifications are covered in section 3:1.
COROLLARY 1'. Iﬁ N is the numbez_9ﬁ points in
JC1_ and r is the number of images of each point,

then it follows from theorem 1' that

k -
(5:1) 8 = ____:_L__.___ g él,k2,oooo,)1-l(ir )ki’
/‘ -1 N k1. k2......5.>1

N )
+ + + -'::I
(1" kl k2 0000 k/ IN

 

ﬂ... 1, 2, ......g..[]

REMARK 1'. For purposes of computation (5:1)

can be written

38

N
(5‘1') 5/“: E Cc k k)
[N-:)(r)]N 1 2’ ’x-

kl, k2,...0’1¥>1

kl+k2+. O 0+lcﬂ= N

ﬁﬁkklﬂrﬂki, ﬁ= 1,2,....,[.12i] .

Since the formula for Pc in (4:3) does not depend

 

directly on r but rather on the values ofb },/4= 1, 2, “HI-g],
the exact probability distribution can be computed by

us ing, this formula.

5:2 A numerical exampl . Consider Hr defined by

= 6 and r = 2. The values of the auxiliary sums by

(531') are:

 

S]. = l

.. 1 ( 6) 6 _ 20 _
s _ (2) _ — .000 020
2 (20)6 3) 3 16-5. ’
$3 = O

The resulting exact probability distribution by
use of (4:3) is:

P1 = .999,990
P2 = .OO0,0lO
133:0

5:3. The class H1. As mentioned in section 4:1,

 

39

the subclass H1 is defined for the hollow case as G1 Was
for the general case. To complete the parallelism between
the hollow and the general case, the formulas which apply
specifically to random mapping under functions from H1
are given as they were for the class Hp.
COROLLARY 2'. If N is the nun-her of: am
( ) and each point in ( ) has only one image, then
, k
(5.2) s, = ——L—— 2 H (kl-l) 1,
N l’k o... =
(N—l) 2’ '5“ 1 1
kl ,k2,0005‘>1
+
kl +k2+ ... §,=N

/u=1, 2, ....,[§]

In order to make the tables of the binomial
probability distribution [12;] useful an alternative form
of (5:2) is presented in

THEOREM 4'. If a, f4=l, 2, ....igjis given

 

l
WJMLLE b(k;n.p) =(:)p R q n'k is

the binomial distribution, then

(5:3) s/.=_..._4‘_)_N ‘ b(l;N, 1‘17?)

_'l
(N. .I. k1,k2,...,19‘>1
<(N.-l
k +k +ooo+1§.-’ )

 

 

k l l 2 E); 1‘ l
- -1 2- ’N-k -...-k . - ,
béz’N KMWHH°b 1‘,“ -1’ 1 ,a-z » N—kl-...-I. -2 ’

/&~2
/= 1, 2, ...., [—E—J.

.-
o
I
;
96A
s.

1‘

 

4O

PROOF:
( N Iq N-kl l -kl‘-o 00-1§‘_2
k k g to...
3 2’. .. ’1? k1 k2 k

kl’kz, 00..., kf>l

Therefore, (5:2) csna be written as a "telescoping

product",

. . _ (N_ )N (kl-1) klNl(-k 17m.)
(5.2 ) S/¢ "’ (N—I1)N£>l{(ll:]) N.../4 ...)

kl’kZ’OCOk ’

kl+k2+ o o o+k j; N‘l
N-k -k
-:l( N"{l-k2-/‘ +2) 1 2
(N_l:kl-f+ N-kl-ﬁ +1 . I O C O

.\
N-k -..-k
- " ~1 -21”... "' l .1
.... (N-k:2"°15“-9218-l—1 2X N:(l k-ll /“
k/‘_l N“kl- ...-5"- "'1{ 1M" [-2-2
/= 1W2,oooo[]

Using the notati n of the hypothesis of theorem 4',

 

 

 

(5:3) follows and the theorem is proved.

The formula which is used to compute the exact
distribution after 8/, /u= 1, 2, ..., g1has been com—
puted by (5:2) or (5:3) is (4:3) because it does not depend

on the value of r but on the value of §/‘°

_5:4 A numerical example. Consider H1 defined by

= 8 and r = l. The values of the auxiliary sums as

41

found by use of (5:3) are

By

S = 1
l
8
S2 = (is—)5 213(2;8,1/6) + 2b(3;8.1/3> + b(4;8,1/2)_]
7
52 = .39o,607,u

tn
ll

3 ($[2b(238,1/5)b(2;6,1/4) + b(2;8,l/5)b(3;6,1/2)

+ 2b(3;8,2/5)b(2;5,l/3) + b(4;8,3/5)b(2;4,l/2i]

S3 = .036,35#,O

Sn=Eg§[b(2;8,l/4)b(2;6,l/3)b(2;4,1/2i]

5L,r = .ooo,u37,1

S5, S6’ S7, and SB all vanish.
(4:3), the exact distribution is:
P1 = .816,705

P2 = .177,327

P = .005,950

P4 = .000,018

P5 = P6= P7 = P8 = O

6. A BINOMIAL APPROXIMATION or THE
DISTRIBUTION OF THE NUMBER

OF COMPONENTS

6: I roductor 0 ns de~r o . Since the exact
distribution is known, an approximate distribution is useful
only if it is more easily computed. The above numerical
examples, although restricted to relatively trivial oases
show that considerable work is involved in the computation
of the exact distribution. Therefore, an approximation is
worthy of investigation.

Because the distribution of the number of components
is discrete and since it is conceivable, in some appli-
cations, that the set JF)_ will be composed of a relatively
small number of elements, it seems feasible to use a
distribution of the discrete type for the approximation.

A binomial approximation is therefore found and the results

have proved to be rather good.

 

d st 0 . The binomial distribution has only two
parameters, N and p. With N fixed, only p needs to be
estimated. Since the mean of the binomial distribution

equals N times p, an approximation of p could be obtained

#3
by equating the expected value of the random variable, 0 ,
which is the number of components in.Jr)_ , with the
expected value of c if it were distributed as the binomial
distribution. However, the expected value of the number, 0 ,
of components is not easily obtained in terms of one or
more of the S/.,/¢= l, 2, ...., N. Kruskal [5:] gave the
result fer the Special case where the mapping was single-
valued. Since this seems to be quite difficult in the
general case, it is convenient to consider a related

variable, 110"l

, where n is a positive integer. For such
a variable the following theorem is proved.

THEOREM 5; If c is the number of components in ( ) ,

 

n is a positive integer and Sgthe auxiliary sums

given by (2:12,)J then the expected value of no"1

is £128}; by

n 1‘.
(6:1) E(n°"9 = de) if.
’/£=l -l "‘

PROOF: Consider the arbitrary quantity, A, defined

a 3

S- n -1 S
A‘ ét-) :2:

n

2 ’“
From (2:15): s/.=/c: 3:,“ P3 3.

41+

n N
,4
Therefore, A = 2 6:1 Omit)! 2 P, :J; .
#=1 .17" "

c
-1 ' )b
lhe coefficient of PC in A = ,4: 1 -1 ,(ﬂ-l). ’dc'

By changing the notation slightly, the coefficient of

c
[1-
Pc in A = i- :7: ng'a)’4 .
f‘l
But Jordan [9] , section 58, formula 2, shows
n0 = E J (n) , where (n) = n(n-l)....(n-/4+1) = n“?
/=1 c / /~

Therefore, the coefficient of PC in.A = no-1; and by the

definition of the expected value, A = E (no-l) and (6.1)
follows. This completes the proof of theorm 5.

Using (6:1) the results for a few integers are:

(a) E (Ic‘l) = 1

s
(b) E (2°‘l) = 1 +‘"§

( ) I ( 0’1 — + ° + E3
0 s 3 ) - 1 oz 3

-1
Since E(lc ) gives a trivial result and since
E (20.1) is obviously the simplest eXpected value to
compute, it is convenient to use this to find a binomial

approximation of the distribution of the number of components.

45
In order to find the parameter p which will
determine the binomial distribution for fixed N, it is
necessary to prove the following theorem.
THEOREM 6. If 0-1 s ' do variable distributed
binomially with parameters (Nul) and p, then

 

(6:2) HIP-1) = [i + (n-l)p] N-l

PROOF: By hypothesis: b(o-l; N-l, p) p0*l qN-C

‘ N
s c-l 5‘1 "1 c-l N-c
Therefore, b (n ) = n P q ’
:1 0-1

This can be rewritten:

-1
n _ - N— . r-
2. (nc 1) = E (np)° l q °,-- (q + np)‘ 1

Since q = 1 - p, (6:2) follows and the proof of theorem 6
is complete.

I An estimate of p which will determine the binomial
approximation of the distribution of the number of com-
ponents is now possible by equating the two values of
E(2°'l) obtained from theorems 5 and 6. This is not an
estimate in the statistical sense. No sampling is
involved. It is simply an approximation which results

0‘1 if c is the

from equating the eXpected value of 2
number of components in ( ) with the expected value of
20-1 if (c-l) has binomial distribution with parameters

(N-l) and p.

#6
Using the results in (6:1) and (6:2) with n = 2 and
equating the expected values of 20-1, the result for the

estimate of p is:

l
52 Nil
(6:3) p = l +‘5— - 1.

Using (6:3) and denoting the approximate probability

of 0 components by Qc , the formula is

(6:4) QC = pc‘l qN"C , c = l, 2, ...., N.
0-1

It is noted that this binomial approximation of the
distribution of the number of components can be determined
by finding only the value of 82. For the general case formula
(2:12'), (3:1') or (3:3) can be used, depending on the
values of r. After 82 has been found, p is found by (6:3)
and the approximate distribution is found by (6:4).

For the hollow case, the same procedure may be used
and the same formulas apply with the exception that N is
replaced by [g] throughout. For completeness the formulas

are listed with primes as they apply to the hollow case.

1

S _ -1
(6:35) p = 1 +-§ 2 -l

 

. In connection with the
exact distribution discussed previously six numerical
examples were presented to illustrate each of the classes of
random mapping functions. These same examples are now presented
so that the approximate distribution of the number of
components can be compared with the exact distribution.
Accordingly, an estimate of p is found for each example
and the approximate probabilities are shown together with
the exact probabilities, (found earlier) for the general
and hollow cases in tables 2 and 3 reSpcctively.

If GM} is defined by N = 6 and. r = (l, 1, 2, 2, 3, 3),
then, from section 2:3, 32 = .065,705 and by (6:3)
p = .OO6,486. The values of Qc computed by (6:4) are
shown in the first section of table 2.

If Gr is defined by N = 6 and r = 2, then,from
44
9375

The values of Q0 computed by (6:4) are given in the second

section 3:2, 82 = and, by (6:4), p = .OOO,467.

 

section of table 2.
If G1 is defined by N = 8 and r = 1, from section
3:4, 32 = 2.245,018 and therefore p = .113,506,6. Again

the values of Qc are given in the third section of table 2.

48

Ifh?33 is defined by N = 6 and r = (1, 1, 2, 2, 3, 3),
then 52 = .ooo,072 by section #:3. Using (6:3'), p = .OO0,018.
The values of Qc are computed by (6:4') and given in the
first section of table 3.

If Hr is defined by N = 6 and r = 2, by section
5:2, 52 = .ooo,020 and by (6:3'), p = .ooo,005. The values
of QC as computed by (6:4') are given in the second section
of table 3.

If H1 is defined by N = 8 and r = 1, by section
5:4, 52 = .39o,607,4 and by (6:3'), p = .061,270,5. The
values of QC are given in the third section of table 3.

The agreement between the tabulated values is
reasonably good. In the hollow case, there is virtually
no difference between the exact and approximate values when
N: 6, r1 = (1, l, 2, 2, 3, 3) and when N = 6, r = 2. In
the general case, agreement is to at least the third decimal
place for these examples. When N = 8 and r = 1, there is
variation but there is definite agreement in the pattern
of the distributions.

For larger values of N the computation of the exact
distribution is cumbersome. Thus, comparison becomes
difficult. Katz [l3g] has shown that the exact probability
of one component when N = 20 and r = l is .264,68. The
approximate probability given by (6:4) for this set of values
is: Q1 = .295,227. This indicates very little change in
accuracy in the probability of one component for larger values

of N.

1+9

 

 

 

 

 

 

 

80. 000 . W 80. ooo. i ...l ...... a ...... l m
30.08. 80.80. I... ...... I... II N.
Haﬁooo. meiooo. ...... ...... II ...I c
30:50. 03.30. ....I I... ....I ...... m
3648. mmmnmo. ...I ....I 30.08. I... s
319:. www.mma. «8.08. 80.08. 1* 338. $180. ,m
amc.mmm. oss.mos. Hmm.~oo. msm.moo. smm.amo. Hmm.mno. m
mmm.ome. www.moe. sec.smm. cmc.smm. mwm.scm. mmm.scm. a
be on be om so om o
ans .muz mus .cuz R.m.~.m.a.duas.cuz

 

 

 

 

 

 

 

mayo dimmzinww

HEB 2H gram?

OASOO mo mmmmabz mmB

.mo WZQHBDmHmBmHQ MBgvmommmd Q34... Bqurﬁs H39 and ZOmHmdeOO

N Baa.

50

 

 

 

 

 

 

.II. ..I. III. a .II.
om~.ooo. n mao.ooo. s
mam.oao. ome.moo. n 000.000. .11. 000.000. .11: m
ssm.aca. smm.ssa. oao.ooo. oao.ooo. emo.ooo. cmo.ooo. N
amm.e~m. mos.cam. ome.mmm. 0mm.mmm. scm.mmm. scm.mmm. a
co cm so am . ed as

H n n .m n z N u A .0 I z Am.m.N.N.H.HV n «h .w I z

 

 

 

 

 

 

 

mmawo 309903 mmB 2H maﬁa/53,50 m3 mHmMEDZ MIR.
.mo WZOHBDmHmBmHQ megﬁumommm< Q24 80% mme m0 wHOme<mHaOU

m mamas

7. SUMMARY

After Kruskal [5;] had solved the problem of the
expected number of components for a single-valued random
mapping function, the question of the probability distribution
of the number of components was a logical next step.

Moreover, the question of what would happen if the mapping
were multiple-valued seemed worthy of consideration. By

a method somewhat analogous to that used by Feller [23 for
the combination of events but more nearly like that used by
Katz [11:] , the exact probability distribution of the
number of components of a multiple-valued random mapping
function was found. Results for particular mappings which
restricted in various ways the number of images of each
point became special cases of the general solution.

Hollow mapping in the sense that no point was permitted
to map into itself was considered because of the interest
in this type in application to social situations. The
probability distribution for this case follOwed very readily
from the general solution.

Numerical examples,vhich were included as illustrations,
revealed that the amount of computation increases enormously
with increase in N. The binomial approximation, which was
presented, does minimize the work but sacrifices some of the

accuracy.

52

Rubin and Sitgreaves [l4;] , in a paper made available
after the main part of the problem which is considered in
this thesis was completed, showed some results which are
related to the problem. Dealing only with single-valued
functions (corresponding to the class G1 in the thesis)
they have found the distribution of the number of components
by a completely different method. Thus, for single-valued
functions their result overlaps the result for the class
of functions, G1, presented here. They also considered
other topics dealing with size and composition of components.
Questions concerning the size and composition of components
formed under multiple-valued random mapping functions remain
unanswered. I

It is hoped that the results obtained here will be
useful in applications. In social situations, divisions
into groups are bound to occur. Whether these divisions
follow essentially the theoretical distribution or whether
they deviate significantly so that they must be accounted
for on the basis of age, prejudice, etc. rather than on'
chance remains as part of the problem of the application

of the results presented in this thesis.

2.

3.

9.

10.

11.

'Y

w.

N.

M.

[75
0

REFERENCES

Gontcharoff, "Du domaine d'analyse combinatore,"
Bulletin de l' Academie des Sciences U,R4SJS.
Serie Mathematidue, Vol. 8, No. l.(l944), pp. #5-48.

Feller, ' ‘ - ° ’ vs
llmLﬁunudsmddehs. John Wiley and Sons, 1950.

 

E. Greenwood "The number of cycles" associated with
the elements of a permuta .tion group" . The American
Nethenatic.l Hon-hi , v61. 60, No. 6 (1953) pp. 407-u09.

   

Metropolis and S. Ulam, "A property of randomness

of an arithmetical function", The A er can

nethedchCdl Monthly, Vol. 60, No. 4 (1953) pp. 252-253.
D. Kruskal, "The expected number of components

under a random mapping function", The Amer%can
Mathematical Honﬁle, Vol. 61, NO. 6 1954

PP. 392-397.

Forsyth and L. Katz, "A matrix approach to the analysis
of sociometric data: preliminary report", §2§igm§521,
V01. IX No. a (19u6) pp. 3uo_347.

Katz, “On the metric analysis of sociometric data" ,
Sociometzy, Vol X, No. 3 (19h?) pp. 233-240.

Katz and J. H. Powell, "The number of locally
restricted directed graphs", P ocee o t
American mathematical Society, Vol. 5, No. 4 195a)

pp. 621-626.

Jordan, Calc lus 0 Fl te D e es, Chelsea
Publishing 00., l9u7.

A. Whitworth, Choice egd.Cbenge, G. E. Stechert
and Co., 1925.

Katz, "The distribution of the number of isolates
in a Group", Institute of Statistics, University
of North Carolina, himeo. Series 36, (1950).

12.

U.

54

S. Department of Commerce, Tables of the Binomial

Propapility Distribution, National Bureau of
Standards, AN56, U. S. Gov't. Printing Office

(1950).

Katz, ”Probability of indecomposability of a random
mapping function", The A r ” ., 1 cal

St tistics (in press .

Rubin and R. Sitgreaves, "Probability distributions
related to random transformations of a finite set",
A pl d h t ema cs d S at st cs Laboratory,
Stanford UniverSity, Technical Report No. 19A (1954).

  

   

RIES

l

MICHIGAN STATE UNIVERSITY L

3 1293 03502 67

"Wail
4 3