LIBRARY, Michigan State University

This is to certify that the dissertation entitled

OPTIMIZING THE COST OF RELATIONAL JOIN QUERIES

presented by

Farshad Fotouhi

has been accepted towards fulfillment of the requirements for the Ph.D. degree in Computer Science.

Major professor
4-27-88

OPTIMIZING THE COST OF RELATIONAL JOIN QUERIES

By

Farshad Fotouhi

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science

1988

ABSTRACT

OPTIMIZING THE COST OF RELATIONAL JOIN QUERIES

By Farshad Fotouhi

The join operation is one of the most time consuming operations in a relational database system because it requires a large amount of cross referencing between tuples of different relations. Therefore, the efficiency of the join operation has a deterministic effect on the system performance. The best method for performing join depends, in general, on the access methods available, the parameters of the relations involved, and the context in which the query is presented. The objective of this thesis is to determine optimal strategies for performing join for a given set of constraints and assumptions. Here, the theoretical lower bound on the number of disk I/Os is achieved. Both join-only queries and queries involving restrictions, projections and join are considered.

Here, the existing join algorithms are classified into Relation-Scan, Index-Scan, and Hybrid classes. This classification is based on the availability and use of indices on the join attribute values. This research will show that each class of algorithms performs best for a range of parameter values. It is shown that the relation-scan class of algorithms performs best when all or most of the tuples of the joining relations participate in the join. For the index-scan class of algorithms, several graph models are proposed in order to show that the optimization problem for these algorithms is NP-hard. Therefore, a heuristic algorithm with linear time complexity is given. For the hybrid class, several algorithms which are based on preprocessing a new auxiliary data structure, called the Partial-Relations, are proposed. It is shown that for a wide range of parameter values the proposed algorithms perform better than the best available algorithms of the other two classes.

To my parents, Mahmmoud Fotouhi and Azarmidokht Shazad

ACKNOWLEDGEMENTS

I wish to express my appreciation to Professor Sakti Pramanik for his valuable assistance and continual support in every phase of the preparation of this work. I also wish to thank my committee members, including Professor Erickson, for their helpful comments on the contents of this work. Professor Esfahanian provided many useful suggestions and participated in many of the discussions which helped to develop some of the theoretical results in Chapter 3 of this document. Finally, I want to thank my family, without whose love and inspiration this would not have been possible.

TABLE OF CONTENTS

List of Tables
List of Figures
1. Introduction and Problem Statement
1.1. Relational Database Systems
1.2. Operations on a Database
1.3. Access Methods
1.3.1. Indexing
1.3.2. Hashing
1.4. Motivation
1.5. Problem Statement
2. A Classification Scheme for Join Algorithms
2.1. Introduction
2.2. Relation-Scan Class of Algorithms
2.2.1. Nested-Loop Join Algorithm
2.2.2. Sort-Merge Join Algorithm
2.2.3. Hash-Based Join Algorithms
2.2.3.1. Simple-Hash Join Algorithm
2.2.3.2. GRACE-Hash Join Algorithm
2.2.3.3. Hybrid-Hash Join Algorithm
2.2.3.4. Join Algorithms Based on Fragmentation Technique
2.3. Index-Scan Class of Algorithms
2.4. Hybrid Class of Algorithms
3. Optimal Access Sequence for the Index-Scan Class of Join Algorithms
3.1. Introduction
3.2. The Graph Models
3.2.1. Page Connectivity Model
3.2.2. Block Connectivity Model
3.2.3. Tuple Connectivity Model
3.3. The OBAS-problem and the OPAST-problem are NP-hard
3.4. Complexity of Computing the Least Upper Bound on the Buffer Size
3.5. A Heuristic Algorithm for the Page Connectivity Model
3.6. Determining an Optimal Page Access Sequence
3.7. Cost Models and Performance Comparisons
3.7.1. Cost of the P Algorithm
3.7.2. Cost of the Fragmentation Technique
3.7.3. Cost of the Sort-Merge Algorithm
3.7.4. Cost Comparisons
4. Join and Semijoin Algorithms Based on the Partial-Relation Scheme
4.1. Introduction
4.2. An Access Path Based on a Partial-Relation Scheme
4.2.1. Description of Partial-Relation Schemes
4.2.2. Partial-Relation Scheme vs. Indexing and Hashing
4.3. Algorithms Based on the Partial-Relation Schemes
4.3.1. Algorithm PRJ of the Index-Scan Class
4.3.2. Algorithm PRS of the Hybrid Class
4.3.3. Cost Model
4.3.3.1. Cost Expression for Algorithm PRJ
4.3.3.2. Cost Expression for Algorithm PRS
4.4. Performance Evaluation
4.4.1. Comparisons with the Sort-Merge Join Algorithm
4.4.1.1. Interpretation of Results
4.4.2. Comparisons with the Hybrid-Hash Algorithm for Join-Only Queries
5. Conclusions and Future Study
5.1. Conclusions
5.2. Future Study
5.2.1. M-Way Join Operation
5.2.2. Recursive Queries
Appendices
A. Cost Comparisons of Creating an Index vs. Creating a PR
B. Computing the Number of Unique Attribute Values in a PR
C. I/O Cost of the Sort-Merge Algorithm Under Condition A
D. Execution Time of the Hybrid-Hash, SM, PRS and MPRS Algorithms
References

LIST OF TABLES

3.1. The Difference Between the Graph Models
3.2. The OPAS for the Graph of Figure 3.2
3.3. The OBAS for the Graph of Figure 3.3
3.4. The OPAST for the Graph of Figure 3.4
3.5. Value of ξ_i for the Graph of Figure 3.6
3.6. Trace of the Heuristic Algorithm for the Graph of Figure 3.2
A.1. Cost Comparisons of Creating a B-tree vs. a PR

LIST OF FIGURES

1.1. A Simplified Relational Database for Keeping Inventories
1.2. Effect of Projecting on P# of Relation SUPPLY
1.3. An Example of a Selection Operation
1.4. An Example of a Natural Join
3.1. An Example of a Page Connectivity Graph
3.2. A Page Connectivity Graph
3.3. An Example of a Block Connectivity Graph H(B,E)
3.4. An Example of a Tuple Connectivity Graph
3.5. The Cases for when the Buffer Size for A'_1 A'_2 ... A'_n is Computed
3.6. A Graph G(V,E)
3.7. A Possible Cut of the Graph of Figure 3.2
3.8. Average Buffer Size for the Heuristic Algorithm
3.9. Trace of Algorithm A for the Graph of Figure 3.2
3.10. Average Buffer Size for the Heuristic and the A Algorithms
3.11. Comparisons of the Sort-Merge, Fragmentation, and P Algorithms
4.1. An Example of a Partial-Relation
4.2. Number of Characters Selected vs. Fraction of False Tuples
4.3. Performance of Algorithm PRS when PRs Contain False Tuples
4.4. Comparisons Under Condition A for Different Restriction Factors
4.5. Comparisons Under Condition A
4.6. Comparisons Under Condition B
4.7. Comparisons Under Condition C
4.8. Comparisons Under Condition B for Different PR Sizes
4.9. Comparisons of Algorithms PRS, MPRS, SM and Hybrid-Hash

CHAPTER 1

INTRODUCTION AND PROBLEM STATEMENT

1.1. Relational Database Systems

As mass storage prices fall and computers are used extensively throughout businesses and other organizations, vast amounts of valuable information are piling up in electronic repositories. Finding ways to organize this information, and to allow convenient access to it, is steadily becoming more important. The emergence of advanced concepts based on artificial intelligence will make databases and their management even more critical.

Software developers have come up with a variety of database management systems, which accept, organize, store, and retrieve information quickly and efficiently. A database management system (DBMS) consists of a collection of interrelated data and a set of programs to access that data. The collection of related data which contains information about an enterprise is usually referred to as the database.

In order to describe the structure of a database, a data model is created and is administered by a database administrator. A data model is a collection of conceptual tools for describing data, data relationships, data semantics, and data constraints [Kort86]. Database designers can choose among several data models, of which three currently reign as the most popular [Date75]. These models are the network data model, the hierarchical data model and the relational data model. In this research, only the relational data model [Codd70], which is the data model for relational database systems, is considered. A relational database consists of a collection of relations, each of which is assigned a unique name. In this model, data are represented as a table, with each horizontal row representing a tuple or a record and each vertical column representing one of the attributes, or fields, of the tuple. In other words, a relation of arity n (i.e., a relation with n attributes) is a subset of tuples from the cartesian product of the domains D_1 × D_2 × ... × D_n. The cartesian product of two relations R and S, written R × S, of arity k_1 and k_2, respectively, is the set of (k_1 + k_2)-tuples whose first k_1 components are from a tuple in R and last k_2 components are from a tuple in S.

Figure 1.1 shows a relational model of a simplified inventory database. This example was first popularized by Codd [Codd70]. Individual rows in the PART relation correspond to individual types of parts in the inventory. Because rows are required to be nonredundant within a relation, each row can be identified uniquely by a subset of its attributes. For example, in Figure 1.1a S# can be used to identify a row uniquely, while S# and P# must be used together to identify a row of the SUPPLY relation. Such a set of attributes is called the key of the relation. If a subset of the attributes in one relation is the key of the second relation, then such a key is called a foreign key. For example, in Figure 1.1, S# and P# are the keys of the SUPPLIER and PART relations, respectively. Thus, S# and P# in the SUPPLY relation are called foreign keys.

S#   SN        SL
S1   ABC Co.   MI
S2   XY Co.    NY
S3   US Co.    NY

(a) SUPPLIER Relation

P#   PD
P1   CAM
P2   GM

(b) PART Relation

S#   P#   Q
S1   P1   5
S1   P2   6
S3   P1   6

(c) SUPPLY Relation

Figure 1.1. A Simplified Relational Database for Keeping Inventories.
1.2. Operations on a Database

Modeling data in a relational structure lends itself to the development of high-level query languages. Several styles of relational query languages have been developed. One of these is relational algebra. Relational algebra consists of a set of operators which can be applied to relations to create a new relation. These operators can be composed to form expressions. A typical set of relational operators is projection, selection, join, union, intersection and difference. Here, both join-only queries† and queries involving selections, projections and join are considered.

† A query is a statement requesting the retrieval of information.

A projection operation is the elimination of certain fields of all tuples. In many cases, a number of the fields contain information which is irrelevant for the query and these fields are deleted. This may have the effect, however, of producing multiple, identical tuples, since many tuples may be identical in a number of attributes. The relational database model assumes that all the tuples are unique. Thus, in theory, duplicates must be eliminated. Figure 1.2 shows the effect of projecting on P# of the relation SUPPLY. Note that the result is a relation (duplicates of P1 have been removed) with only one attribute.

P#
P1
P2

Figure 1.2. Effect of Projecting on P# of Relation SUPPLY.

The selection operation, sometimes referred to as restriction, is a specification which eliminates some of the tuples of the relation. The specification is a Boolean expression defined over the columns of the relation. The expression may consist of arithmetic comparison operations (i.e., =, ≠, <, >, ≤, ≥) on the value of one or more columns. The result of selection is a new relation with only the rows from the original relation which satisfy the selection condition. Figure 1.3 gives an example of the selection operation on the SUPPLIER relation where SL = "NY".

S#   SN       SL
S2   XY Co.   NY
S3   US Co.   NY

Figure 1.3. An Example of a Selection Operation.

The join operation is one of the most important operations in query processing. Join is a binary operation that allows us to combine the selection and cartesian product into one operation. The θ-join of two relations R and S on columns i and j is denoted by R ⋈_{iθj} S, which is shorthand for σ_{i θ (r+j)}(R × S), where r is the arity of R and the subscripts refer to column positions of the product. Here σ represents the selection and θ is an arithmetic comparison operator. If both R and S have columns that are named the same, then the equijoin of R and S on their common attributes is called natural join and it is denoted by R ⋈ S. Figure 1.4 gives an example of a natural join between the two relations SUPPLY and SUPPLIER of Figure 1.1. As shown in the example, this operation allows a user to navigate through the database. In this research the natural join of two relations on a single domain is considered, and from now on wherever the term join is used we mean natural join.

S#   P#   Q   SN        SL
S1   P1   5   ABC Co.   MI
S1   P2   6   ABC Co.   MI
S3   P1   6   US Co.    NY

Figure 1.4. An Example of a Natural Join.
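To make the three operations concrete, the following sketch replays the examples of Figures 1.2 through 1.4 in Python. It is illustrative only: relations are modeled as lists of dictionaries, and the function names are not part of the thesis.

```python
SUPPLIER = [{"S#": "S1", "SN": "ABC Co.", "SL": "MI"},
            {"S#": "S2", "SN": "XY Co.",  "SL": "NY"},
            {"S#": "S3", "SN": "US Co.",  "SL": "NY"}]

SUPPLY = [{"S#": "S1", "P#": "P1", "Q": 5},
          {"S#": "S1", "P#": "P2", "Q": 6},
          {"S#": "S3", "P#": "P1", "Q": 6}]

def project(relation, attrs):
    # Projection: keep only attrs and eliminate duplicate tuples.
    seen, result = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

def select(relation, predicate):
    # Selection (restriction): keep the tuples satisfying the predicate.
    return [t for t in relation if predicate(t)]

def natural_join(r, s):
    # Natural join: equijoin on all attribute names common to r and s.
    common = set(r[0]) & set(s[0])
    return [{**tr, **ts} for tr in r for ts in s
            if all(tr[a] == ts[a] for a in common)]

print(project(SUPPLY, ["P#"]))                      # Figure 1.2
print(select(SUPPLIER, lambda t: t["SL"] == "NY"))  # Figure 1.3
print(natural_join(SUPPLY, SUPPLIER))               # Figure 1.4
```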
1.3. Access Methods

Many queries require only a small number of the tuples to be retrieved from a relation. Therefore, the system is not efficient if it must access all the tuples of the relation to check for the tuples which satisfy the user's request. Ideally, the system should be able to locate those tuples directly. In order to allow this form of access, an additional data structure associated with relations was designed. Here two such structures, namely indexing and hashing, are considered.

1.3.1. Indexing

An index on attribute A of relation R permits rapid access to a single tuple that has the desired value in that attribute. An index consists of pairs whose first component is a value from attribute A and whose second component is the Tuple Identifier (TID) of a tuple having that value. A TID is assumed to give direct access to the tuple, so that at most one page is accessed if a tuple is referenced using a TID. An index is stored in a special way to provide rapid access to it. The model which is widely used in most database systems is the B-tree [Baye72] or a variant of it [Knut73, Come79]. A B-tree index takes the form of a balanced tree in which the path from the root to any leaf of the tree is the same length. Each node in the tree has between ⌈n/2⌉ and n children, where n is fixed for a particular tree. Leaf pages contain (key, TID) pairs in sorted order, and the higher level pages contain pairs that consist of the largest key of the lower-level page with a pointer to that page. These pairs are also sorted. The cost of a search operation is proportional to log N, where N is the number of tuples in the file [Baye72].

The efficiency of an index in query processing depends on whether the relation is clustered or unclustered with respect to the index. Suppose an index I is used to extract the tuples of relation R. If each data page of R is accessed at most once, then R is clustered with respect to the index I. The index I may be termed a clustering index with respect to R. On the other hand, if the data pages of R are referenced in a random, approximately uniformly distributed manner, then R is unclustered with respect to I.

1.3.2. Hashing

One disadvantage of indexing schemes is that an index structure must be traversed in order to locate a datum. Hashing avoids the traversing of an index structure and it gives the best average access time to a single tuple. However, indexing is more efficient for range queries. The basic idea in hashing is to divide the tuples of a relation among buckets, each consisting of one or more blocks of storage. The address of a tuple is then determined by computing a function, called the hash function, on the search-key value of the desired tuple. Conventional hashing functions such as those discussed by Lum [Lum71, Lum73], Knuth [Knut73], Severance [Seve74] and Knott [Knot75] are useful for static files. However, when the file size is dynamic, either a large amount of secondary storage must be pre-allocated or long overflow chains and frequent rehashing must be tolerated. Starting in the mid-70s several new hashing techniques such as linear hashing [Litw80, Lars85], virtual hashing [Litw76], extendible hashing [Fagi79, Lome83], and dynamic hashing [Lars78] were proposed. In these techniques rehashing is avoided by allowing the storage space to dynamically adjust to the number of tuples actually stored. Most of these techniques, such as extendible hashing, do not allow overflow to occur; thus, the expected number of accesses in these techniques is minimized. Here, wherever the term hashing is used, I mean a conventional hashing function.
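As a small illustration of the bucket idea described above, the sketch below implements a conventional (static) hash file in Python. The bucket count, the choice of S# as the search key, and all names are assumptions made for the example.

```python
NUM_BUCKETS = 8  # fixed in advance: this is what makes the scheme static

def h(key):
    # The hash function: map a search-key value to a bucket address.
    return hash(key) % NUM_BUCKETS

class HashFile:
    def __init__(self):
        # Each bucket stands for one or more blocks of storage.
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, t):
        self.buckets[h(t["S#"])].append(t)

    def lookup(self, key):
        # Exactly one bucket is examined; no index structure is traversed.
        return [t for t in self.buckets[h(key)] if t["S#"] == key]

f = HashFile()
f.insert({"S#": "S1", "SN": "ABC Co.", "SL": "MI"})
print(f.lookup("S1"))
```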
1.4. Motivation

The rapid technological advances of the recent decade have made possible the use of database management systems over a wide range of applications. Among the most significant problems in computer science today is the issue of efficient implementation of database management systems to support an array of applications. Certain types of database operations are not widely performed today because of the high cost. One such operation is the join.

The join operation is one of the most important and time-consuming operations in a relational database system. Because it allows the user to navigate through the database, it is an important operation. In addition, many of the techniques used for performing join can be used for other relational operators such as a cross product and aggregate functions. Join is a time-consuming operation because it requires a large amount of cross referencing between tuples of different relations. Therefore, the efficiency of the join operation has a deterministic effect on the system performance.

Because of its importance, the join operation has been a subject of intensive study in the development of relational database systems, and different approaches have been proposed. The objective of this thesis is not to give a join algorithm but to develop principles by which a join can be implemented. The best method depends on the access methods available, the parameters of the relations involved, and the context in which the query is presented.

1.5. Problem Statement

The primary goal of this work is to determine optimal strategies for performing join for a given set of constraints and assumptions. Both join-only queries and queries involving restrictions, projections and join are considered. The optimal strategy requires the least cost among all other strategies. The cost is measured based on the number of disk I/Os when a small size main memory buffer is assumed. When a large size main memory buffer is available, the cost is measured based on the number of disk I/Os as well as the amount of CPU usage.

In this thesis, the existing join algorithms are classified into Relation-Scan, Index-Scan, and Hybrid classes. This classification is based on the availability and use of indices on the join attribute values. This research will show that each class of algorithms is best for a range of parameter values. There has been intensive study of the relation-scan class of algorithms. Therefore, this research concentrates on the last two classes of algorithms. Using graph models for the index-scan class of algorithms I will show that the optimization problem for these algorithms is NP-hard. For the hybrid class, several algorithms which are based on preprocessing a new auxiliary data structure, called the Partial-Relations, are proposed here. The performance of these algorithms will be compared with that of the existing ones. This research will show that for a wide range of parameter values the proposed algorithms perform better than the relation-scan class of algorithms.

The organization of this thesis follows. Chapter 2 presents a classification of various join algorithms proposed in the literature. In chapter 3 several graph models are presented to analyze the performance of the join operation in a paging environment. The graph models are based on the block or page connectivity of the joining relations. In chapter 4 several join algorithms based on the Partial-Relation Scheme are presented. The performances of these algorithms are compared with some of the existing algorithms under various conditions. Chapter 5 contains the conclusions and suggestions for further work.
CHAPTER 2

A CLASSIFICATION SCHEME FOR JOIN ALGORITHMS

2.1. Introduction

The availability of inexpensive large main memories coupled with the demand for faster response time is bringing a new perspective to database technology. Designers of database systems have assumed until recently that databases reside on disk during transaction processing. Thus, substantial performance gains can be achieved by letting a large portion of, or the entire database, reside in main memory. So far two approaches have been proposed for using large amounts of main memory in database systems. The first approach uses very large buffers to improve conventional disk access methods by storing part of the database or part of the indices. DeWitt et al. [DeWi84], Shapiro [Shap86], and Elhardt et al. [Elha84] have taken this approach in their work. The second approach, the memory-resident database approach, is to store the entire database in main memory. Ammann et al. [Amma], Lehman and Carey [Lehm86a, Lehm86b], and Leland and Roome [Lela85] take this approach.

Here a classification of join algorithms under the assumption that the database is disk resident and a small or large main memory buffer is available (first approach) is considered. The second approach (i.e., memory-resident database) is not considered because it requires a redesign of the database management systems. That is, the algorithms and the data structures for query processing, concurrency control and recovery must all be restructured to stress the efficient use of CPU cycles and memory rather than disk accesses and disk storage.

The existing join algorithms may be categorized into three classes. The classification is based on the availability and use of the indices on the join attribute values. As shown in later chapters each class of algorithms is best for a range of parameter values. The first class contains those algorithms which perform join by accessing all the tuples of the joining relations. This class of algorithms is referred to as relation-scan algorithms. This thesis shows that the relation-scan class of algorithms performs better than the algorithms of the other two classes when most of the tuples of the joining relations participate in the join. The second class of algorithms uses indices to access only those tuples of the joining relations which participate in the join. Therefore, as shown in this thesis, the algorithms of this class perform better than the algorithms of the other two classes when a small number of tuples of the joining relations participate in the join. These algorithms are referred to as index-scan algorithms. The third class of algorithms uses the semijoin technique to perform join. The semijoin technique has been widely used in distributed databases and in multiprocessor database machines in order to reduce the communication costs. This class of algorithms is referred to as the hybrid class of algorithms. The next sections describe some of the algorithms proposed for each class.

2.2. Relation-Scan Class of Algorithms

Relation-scan algorithms perform join on all the tuples of the joining relations without any attempt to reduce the size of the relations. Therefore, when only a few tuples of the joining relations participate in the join, we show that the algorithms of this class are more costly than the algorithms of the index-scan and the hybrid classes. This situation occurs if a restriction or another join operation is applied to at least one of the joining relations before the join is performed. In this section some of the proposed algorithms in this class are considered.

2.2.1. Nested-Loop Join Algorithm

The simplest of all join algorithms in terms of implementation is the nested-loop algorithm [Gotl75, Blas77, Kort86]. The nested-loop join algorithm for two relations R and S is as follows:

a) Read R or as much as possible of R into the buffer.
b) For each record in S check if it has the same join values as any record of R stored in the buffer. If they match, join the two records and output the result.
c) Repeat this process for all parts of R.

We see that the complexity of this algorithm is in the order of O(|R| × |S|), where |R| and |S| represent the number of data pages in relations R and S, respectively. However, the nested-loop can be an efficient join algorithm if at least one of the joining relations is small enough to fit in the main memory buffer. This algorithm also takes advantage of different relation sizes, in contrast to the sort-merge join algorithm which is described later.
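The following sketch shows steps (a) through (c) in Python, under the simplifying assumption that each relation is given as a list of pages and each page as a list of tuples; the buffer size and the join attribute name A are illustrative.

```python
def nested_loop_join(R, S, buffer_pages, A="A"):
    # R and S are lists of pages; each page is a list of tuples (dicts).
    result = []
    for start in range(0, len(R), buffer_pages):
        # (a) Read as much of R as fits in the buffer.
        chunk = [r for page in R[start:start + buffer_pages] for r in page]
        # (b) Scan S once against the buffered part of R.
        for page in S:
            for s in page:
                for r in chunk:
                    if r[A] == s[A]:
                        result.append({**r, **s})
        # (c) The outer loop repeats this for all parts of R.
    return result
```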
Since this algorithm has a high degree of parallelism, it has been implemented in multiprocessor database machines [Bitt83, Vald84]. A database machine is a device which implements the operations of the database by means of hardware. The nested-loop algorithm when implemented on a multiprocessor machine with P processors proceeds in the following manner: Distribute the smaller relation (called external) among the P processors. Next, the other relation (called internal) is broadcast page by page to these processors. Each processor joins each page in the memory of the external relation with the entire internal relation. If the external relation cannot fit into the P processors' local memories, the same operation must be performed for each part of the external relation.

2.2.2. Sort-Merge Join Algorithm

Another widely used join algorithm is the sort-merge technique [Blas77, DeWi84]. This algorithm requires initial sorting and multiway merging of both joining relations on their join attributes, and then merging the two sorted relations. The sort process itself can be optimized (see, for instance, Knuth [Knut73]). Initial sorting takes several passes over the relations and requires a considerable amount of CPU time. Sorting does not take advantage of different relation sizes. A detailed description of this algorithm follows.

Assume that both joining relations R and S are sorted in ascending order with respect to their join attribute A. Let us also assume that attribute A has no duplicate values in relation S. In order to join the two relations, two pointers r and s are used over relations R and S, respectively. Each pointer identifies a tuple in a relation. Both pointers are initially positioned on the first tuple of their relations. The merging of the two sorted relations starts by reading a page from each relation and comparing the join attribute values. If r.A = s.A, then r concatenated with s is output, and r is advanced to the next tuple. (r.A represents the value of attribute A in the tuple pointed at by the pointer r.) If r.A > s.A, s is advanced; otherwise, r is advanced. The process is iterated until the end of one of the relations. In the case discussed above there is no looping, and the join I/O cost is exactly |R| + |S|. However, the I/O cost of m-way merge-sorting the two relations prior to merging them is O(|R| × log_{m-1}|R| + |S| × log_{m-1}|S|). Therefore, the complexity of this algorithm is in the order of O(n × log n), where n is the number of the data pages to be sorted.
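A sketch of the merge phase just described follows, assuming both inputs are already sorted on the join attribute A and that A has no duplicate values in S (the case analyzed above). For simplicity the relations are given as flat lists of tuples rather than pages.

```python
def merge_join(R, S, A="A"):
    result = []
    r, s = 0, 0                    # the two pointers over R and S
    while r < len(R) and s < len(S):
        if R[r][A] == S[s][A]:
            result.append({**R[r], **S[s]})  # output r concatenated with s
            r += 1                           # advance r; A is unique in S
        elif R[r][A] > S[s][A]:
            s += 1
        else:
            r += 1
    return result
```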
If attribute A has duplicate values in relation S, then the algorithm can be modified by adding another pointer s1 on S, positioned on the first tuple of the current run of tuples having the same value for attribute A. Suppose that r is to be advanced; let t represent the new tuple identified by the pointer. If t.A = s.A, the pointer s is set to the position of the s1 pointer. In this way a loop occurs on the run of equal values of attribute A in S. Note that if the join attribute of relation S can have replicated values, then the cost of merging scans is not guaranteed to be linear.

2.2.3. Hash-Based Join Algorithms

The main idea for hash-based algorithms is to partition the joining relations into smaller subrelations where the nested-loop algorithm or any other algorithm can be employed. This is an example of a divide and conquer technique. Bratbergsengen [Brat84] presented join algorithms based on hashing. The performances of the proposed algorithms are compared against the nested-loop and the sort-merge. He has shown that for small relations the nested-loop method is preferable to sort-merge and hashing algorithms. However, for large buffers, partitioning the relations based on hashing is always better than the sort-merge join algorithm. DeWitt et al. [DeWi84] and Shapiro [Shap86] have also proposed three hash-based join algorithms, namely the simple-hash, GRACE-hash and the hybrid-hash. They analyzed and compared the performances of these algorithms with the sort-merge algorithm. Their results were the same as Bratbergsengen's [Brat84]. They have also shown that the hybrid-hash join algorithm performs better than the simple-hash and GRACE-hash over a wide range of parameter values. In the next three sections each of these algorithms and a recent hashing method proposed by Sacco [Sacc86a] will be described briefly.

2.2.3.1. Simple-Hash Join Algorithm

Let us assume that the hash table for the smaller of the two relations R and S fits in the main memory. Then the simple-hash join algorithm proceeds as follows:

1) Use a hashing function h to build a hash table for R in memory. Assume that R is smaller than S. Scan S and for each tuple t of S, repeat the following steps.
2) Compute the hash value of the tuple t using the hashing function h.
3) Use the hashed value for t obtained in step 2 to search the hash table of R for a match. In case of a match output the result.

If the hash table for R will not fit in memory, the simple-hash join algorithm proceeds as follows: Fill the memory with a hash table for part of R. Scan S against that hash table and if there is a match, output the resulting tuple. Build a hash table for another part of R in the memory and scan the remainder of S against it. Repeat this process until a hash table for all parts of relation R has been created.
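A sketch of the in-memory case (steps 1 through 3 above) follows; the dictionary plays the role of the hash table for R, and the attribute name A is an assumption of the example.

```python
def simple_hash_join(R, S, A="A"):
    # Step 1: build a hash table for the smaller relation R in memory.
    table = {}
    for r in R:
        table.setdefault(r[A], []).append(r)
    # Steps 2-3: hash each tuple of S and probe the table for a match.
    result = []
    for s in S:
        for r in table.get(s[A], []):
            result.append({**r, **s})   # match found: output the result
    return result
```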
2.2.3.2. GRACE-Hash Join Algorithm

The GRACE-hash join algorithm has been implemented on the GRACE relational database machine [Kits83, Moto83, Kits84]. The GRACE architecture is one of the candidate relational database machines for the Japanese fifth-generation computer project sponsored by ICOT†. The GRACE database machine is organized for join-intensive applications. It utilizes the concepts of associative disks with filtering, hash-based data partitioning, and pipeline sort-merge in its processing modules.

† Institute for New Generation Computer Technology.

The basic premise of the architecture relies on the fact that the tuples of relations can be distributed into hash buckets determined by the range of values of the attribute to be hashed, which is the join attribute. The join complexity would then be simplified from the O(|R| × |S|) complexity of the brute force nested-loop join algorithm if we hashed the join attributes of these relations so that

|R| = Σ_{i=1}^{s} n_i  and  |S| = Σ_{i=1}^{s} m_i,

where n_i and m_i are the sizes of the ith buckets of the respective relations and s is the number of buckets in each relation. This is because we would compare only those tuples of relations which occupy compatible buckets (i.e., for i ≠ j, we do not process n_i with m_j). Accordingly, the overall complexity would be

O(Σ_{i=1}^{s} n_i × m_i).

The GRACE-hash join algorithm is executed in two phases as outlined in DeWitt et al. [DeWi84] and Shapiro [Shap86]. In the first phase the relations R and S are partitioned into |M| sets, where |M| is the number of pages in the main memory. The algorithm uses one page of memory as an output buffer for each of the |M| sets in the partition of R and S. The partitioning of the relations is done by using a hashing function. In the second phase the join is performed using a hardware sorter to execute a sort-merge algorithm on each pair of sets in the partition. To provide a fair comparison between different algorithms, DeWitt et al. [DeWi84] have used the hashing technique to perform join during the second phase.

2.2.3.3. Hybrid-Hash Join Algorithm

In this algorithm a large main memory buffer is used to minimize disk I/O. In contrast, the GRACE-hash join algorithm uses all the main memory as a buffer to partition the relations. The hybrid-hash algorithm [DeWi84, Shap86], however, uses only as many pages (B) as are necessary to partition R into sets that can fit in the memory. The rest of the memory is used for a hash table that is processed at the same time that R and S are being partitioned. The algorithm proceeds as follows:

1) Assign the ith output buffer page to partition R_i, for i = 1,...,B. Scan R and hash each tuple of it using a hashing function h. If the hashed tuple belongs to R_0, place it in memory in the hash table for R_0. Otherwise, place it in its appropriate output buffer page. When this step is finished, we have a hash table for R_0 in memory, and R_1, R_2,...,R_B are on the disk. Note that the hash table for R_0 has |M| - B pages and R_1,...,R_B are of equal size.

2) Assign the ith output buffer page to set S_i, for i = 1,...,B. Scan S, and use the hashing function h to hash every tuple of S. If the hashed tuple belongs to S_0, search the hash table for R_0 in memory for a match. If there is a match, output the result tuple; otherwise ignore the tuple. On the other hand, if the hashed tuple does not belong to S_0, then place the tuple in the appropriate output buffer page, S_i for some i > 0. Now R_1,...,R_B and S_1,...,S_B are on disk. Repeat steps (3) and (4) for i = 1,...,B.

3) Read R_i and build a hash table for it in the memory.

4) Scan the partition S_i and hash each tuple of it. Search the hash table of R_i, which is in memory, for a match. If there is a match, output the result. Otherwise, toss the S tuple.
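The sketch below compresses steps 1 through 4 into Python. Partition 0 is joined on the fly while R and S are being partitioned; the remaining B partition pairs, modeled here as in-memory lists standing in for disk files, are joined afterwards. B, the partitioning function and the attribute name A are illustrative assumptions.

```python
def hybrid_hash_join(R, S, B, A="A"):
    def part(t):
        # h: bucket 0 is processed in memory; buckets 1..B go to "disk".
        return hash(t[A]) % (B + 1)

    result, table = [], {}
    R_parts = [[] for _ in range(B)]        # R_1,...,R_B ("on disk")
    for r in R:                             # step 1
        i = part(r)
        if i == 0:
            table.setdefault(r[A], []).append(r)   # hash table for R_0
        else:
            R_parts[i - 1].append(r)
    S_parts = [[] for _ in range(B)]        # S_1,...,S_B ("on disk")
    for s in S:                             # step 2
        i = part(s)
        if i == 0:
            result += [{**r, **s} for r in table.get(s[A], [])]
        else:
            S_parts[i - 1].append(s)
    for R_i, S_i in zip(R_parts, S_parts):  # steps 3 and 4, for i = 1,...,B
        table = {}
        for r in R_i:
            table.setdefault(r[A], []).append(r)
        for s in S_i:
            result += [{**r, **s} for r in table.get(s[A], [])]
    return result
```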
As mentioned earlier, DeWitt et al. [DeWi84] and Shapiro [Shap86] have shown that the hybrid-hash and simple-hash outperform the sort-merge and the GRACE-hash join algorithms under the assumption that a large main memory buffer is available. Their comparisons were based on the execution time of the algorithms (i.e., I/O time and CPU usage).

2.2.3.4. Join Algorithms Based on Fragmentation Technique

Fragmentation [Sacc86a] joins two relations by recursively partitioning two unordered relations into fragments until each fragment of the smaller relation is no larger than |M| - 1. The basic idea of fragmentation is to partition the joining relations R and S into n disjoint fragments in such a way that

1) for 1 ≤ i ≤ n, the ith fragments of both relations contain tuples whose join attribute values are drawn from a set T_i of values (which identifies a bucket), and
2) for any given i, j (i ≠ j), S_i and S_j are disjoint.

The partitioning phase logically partitions the join attribute range into (|M| - 1)^j disjoint buckets by using a hashing function. Each bucket identifies a pair of fragments, one for each relation. At each partitioning step a bucket is partitioned into |M| - 1 subbuckets. The problem of joining two relations then reduces to the problem of joining (|M| - 1)^j independent pairs of fragments of the two relations. Since each fragment of the smaller relation fits in the buffer, the final join phase has a linear cost. It has been shown in Sacco [Sacc86a] that the cost of fragmentation is O((|R| + |S|) × log_{|M|-1}|S|), with the assumption that S is a smaller relation than R.

Sacco [Sacc86a] has compared the performance of the fragmentation technique against the sort-merge join algorithm for various relation sizes. He has shown that if at least one of the joining relations (i.e., the larger relation) is already sorted then the sort-merge is better than the fragmentation. However, if the larger relation or both relations need to be sorted then the fragmentation technique outperforms the sort-merge.

2.3. Index-Scan Class of Algorithms

The algorithms of this class attempt to reduce the size of the joining relations before performing the join. The reduction is done by using the indices on the join attributes. Therefore, these types of algorithms require the existence of the indices on either one or both relations on their join attributes in order to be operational. If indices exist on both relations then the join is first performed on the indices, and a set of pointer pairs to the tuples (tuple ids) that will be joined is obtained. A pointer consists of a page number and an offset within the page. Using the pointer pairs, the selected pages of the relations are fetched and the appropriate tuples within these pages are then joined. When only the index on one of the two relations, say S, exists, then the algorithm proceeds as follows. Repeat the steps below for all the tuples of the smaller relation R:

1) Scan relation R and for each tuple of R fetch the join attribute value, r.
2) Scan the index on the join attribute of relation S, with r as the search key. If r exists then fetch the corresponding tuples in S and join them with the tuple of R.

For algorithms of this class Fotouhi and Pramanik [Foto88] have shown that if more than 50% of the tuples of the joining relations participate in the join, then the selected pages of the relations may be reaccessed several times.
Reaccessing of the pages can be more costly than accessing the entire tuples of the joining relations as in the algorithms of the first class. Therefore, the algorithms of this class are good only when a prior selection or join is performed on the joining relations.

Blasgen and Eswaran [Blas76, Blas77] have studied the properties of various join algorithms in considerable detail. In order to be operational some of their proposed algorithms require the existence of indices on the join attributes. They have considered queries which contain the join operation as well as selections and projections. They examined the cases where various indices were present, where the restriction had various effects, and for various sizes of the relations. The cost of the algorithms is computed in terms of disk operations, and it is shown that many different algorithms are best under particular sets of conditions. Goodman [Good80] has also proposed a few modifications of these algorithms. It has been shown in [Blas77] that in the absence of indices, the sort-merge join algorithm performs better than the other join algorithms. However, if clustering indices on the join attributes exist, then it is more cost effective to use the indices to perform the join.

This research developed several graph models for some of the index-scan algorithms to determine an ordered page or block access sequence which requires the smallest size main memory buffer. The graph is formed from the page or block connectivity of the joining relations. In these models the join-participating pages of the relations are accessed only once. I will show that the problem of determining an ordered list of pages or blocks which requires a minimum size buffer is NP-hard. Therefore, a heuristic algorithm which determines an ordered page access sequence requiring a near optimal buffer size is presented. The graph models are discussed in the next chapter.

There have also been join algorithms which use a precompiled join index [Miss82, Vald85]. A precompiled join index is an index which has been constructed on the domain of the join attributes. Therefore, a join of two relations, R and S, in these algorithms is done only by scanning the index and finding which tuple of R is joined with a tuple or a set of tuples of S. This is a very fast method. However, its maintenance cost is very high and this type of index may be very large. Such join algorithms are not considered here.

2.4. Hybrid Class of Algorithms

For these types of algorithms the join of relations is replaced by the join of their semijoins [Ullm82]. The semijoin of relations R and S, written R ⋉ S, is π_R(R ⋈ S), where π denotes the projection operation. Note that, in general, R ⋉ S ≠ S ⋉ R. The expression R ⋉ S selects those tuples of R that participate in the join of R and S. To perform join using semijoin, the algorithm proceeds as follows:

1) Scan relation R and save its join attribute values in a data structure A.
2) Scan relation S and compare the join value of each tuple with the ones in A. In case of a match save the tuple in a temporary file S' and store its join value in a data structure B.
3) Scan relation R again and compare the join value of each tuple with the ones in B. In case of a match save the tuple in a temporary file R'.
4) Join the two files R' and S' using any of the existing algorithms proposed for the first class of join algorithms.
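The four steps above translate directly into the following sketch. The data structures A and B of the algorithm become Python sets; step 4 reuses a hash join as its relation-scan subroutine. The attribute name is an assumption of the example.

```python
def semijoin_based_join(R, S, A="A"):
    values_R = {r[A] for r in R}                  # step 1: data structure A
    S_prime = [s for s in S if s[A] in values_R]  # step 2: temporary file S'
    values_S = {s[A] for s in S_prime}            #         data structure B
    R_prime = [r for r in R if r[A] in values_S]  # step 3: temporary file R'
    # Step 4: join the reduced files R' and S' (here with a hash join).
    table = {}
    for s in S_prime:
        table.setdefault(s[A], []).append(s)
    return [{**r, **s} for r in R_prime for s in table.get(r[A], [])]
```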
In this thesis, several semijoin algorithms which are based on preprocessing the Partial-Relation scheme are presented. A Partial-Relation scheme is a projection of the join and restriction attributes of the queries. Furthermore, the values of an attribute in a Partial-Relation scheme are obtained by applying a transformation function to the original values. The objective of this transformation function is to reduce the size of the attribute values. Partial-Relations are suitable for processing join-only queries as well as queries involving selections, projections and joins. The Partial-Relation scheme is described in chapter 4.

Babb [Babb79] has implemented semijoin operations using boolean arrays. The idea is to hash the join attribute and then use the result as an address into the boolean array. The presence of a marked bit in the array means that matching tuples exist. The value of the boolean arrays is the elimination of most of the data not needed in the result.
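A sketch of the boolean-array idea follows: the join attribute value is hashed and the result used as an address into a bit array, so a marked bit signals that a matching tuple may exist. Hash collisions can mark a bit for more than one value, so the filter can pass tuples that the final join later discards; the array size and names below are assumptions.

```python
ARRAY_SIZE = 1 << 16

def build_boolean_array(R, A="A"):
    bits = [False] * ARRAY_SIZE
    for r in R:
        bits[hash(r[A]) % ARRAY_SIZE] = True   # mark the hashed address
    return bits

def filter_with_array(S, bits, A="A"):
    # Eliminates most of the tuples of S not needed in the result.
    return [s for s in S if bits[hash(s[A]) % ARRAY_SIZE]]
```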
Specialized hardware has also been proposed by Pramanik [Pram86c] to perform the semijoin operation. In order to support join as well as semijoin operations, the method is improved and adapted for multiprocessor database machines [Vald84, Shul84, Rich87].

Many specialized architectures have been proposed for high performance relational database management. These architectures include logic-on-disk machines [Smit75, Schu79, Su79], VLSI-based special purpose processors [Kits83, Shib84], and loosely- and tightly-coupled multiprocessor architectures [DeWi79, Gard81, Hsia83, Tera83, DeWi86]. In this thesis, only join algorithms for the von Neumann architecture are considered. This is because most of the algorithms proposed in the past as well as those discussed here may be modified for efficient implementation on various database machine architectures [Good80, Bitt83, Shul84].

CHAPTER 3

OPTIMAL ACCESS SEQUENCE FOR THE INDEX-SCAN CLASS OF JOIN ALGORITHMS

3.1. Introduction

In order to optimize the performance of the index-scan class of algorithms in a paging environment, this research developed two graph models, namely the block connectivity and the page connectivity models. The term paging environment means that the unit of buffer size is a page. These models will help to show that the index-scan class of algorithms performs better than the other two classes of algorithms when the join factor† is low. The performance measurement is based on the number of pages of main memory which are needed to guarantee one access per block or page. The block connectivity model assumes that a block is a unit of secondary storage access, and the page connectivity model assumes a page is a unit of secondary storage access.

† The join factor for a relation R is defined to be the ratio of the number of tuples of R participating in the join operation over the number of tuples in R [Blas77].

Pramanik and Ittner [Pram85b] have considered a similar problem; however, their model is based on saving only the tuples within pages that participate in the join. They have shown that the problem of determining the least upper bound on the size of the buffer, when only the joining tuples are saved, is NP-hard. The least upper bound on the buffer size implies that there is at least one page access sequence that needs this much buffer, but no page access sequence needs any more than this. Merrett, et al. [Merr81] have considered the page scheduling which requires the least page-swapping counts. They have shown that this problem can be represented as a special case of the Hamiltonian path problem. Thus, it is shown to be NP-complete. Two sufficient conditions for the existence of the optimal solutions were shown, which are based on the Hamiltonian path condition and the Euler path condition. Using these conditions, Merrett, et al. have presented heuristic procedures for near optimum solutions. Sacco and Schkolnick [Sacc86b] have developed a general model of buffer management, called the hot set model, which can be used to minimize the page fault rate for different buffer sizes.

Here, a model in which the entire page is saved in the buffer is considered, as in Merrett, et al. [Merr81], to avoid reaccesses, thus guaranteeing only one access per page. The objective here is to determine an ordered list of blocks (or pages) which requires the smallest size buffer among all possible ordered lists of blocks (or pages). We refer to an ordered list of blocks (or pages) as a block (or page) access sequence. The following assumptions are made in the analysis:

1) A join of only two relations is considered. Note that an m-way join operation (m > 2) can be decomposed into a series of binary joins.
2) The database is disk resident. Therefore, the selected pages of the relations need to be saved in the main memory buffer before being processed.
3) A data page may contain tuples of more than one relation.
4) A block may contain the data pages of more than one relation.

The remainder of this chapter is organized as follows: in Section 3.2 the graph models are presented. In Section 3.3 the problem of determining a block access sequence which requires the least amount of buffer will be shown to be NP-hard. In Section 3.4 the problem of computing the least upper bound on the main memory buffer size for a given set of joining pages will be considered and will be shown to be an NP-hard problem. In Section 3.5 a heuristic procedure is given. In Section 3.6 the performance of the heuristic is compared against the optimal page access sequence. Finally, in Section 3.7 the performances of the fragmentation technique, the sort-merge join algorithm and an algorithm based on the graph model are compared.

3.2. The Graph Models

The index-scan algorithms use the indices on the join-participating attributes of the relations to determine a list of pointer pairs to the tuples that are to be joined. Thus, from each pointer pair one can determine the corresponding block pairs or page pairs that need to be accessed. Here, three types of graph models are developed from these pairs. These graph models are the page connectivity, the block connectivity and the tuple connectivity [Pram85b]. These models differ from each other by the unit of secondary storage access and the unit of buffer storage. Table 3.1 summarizes the differences between the three models.

Table 3.1. The Difference Between the Graph Models.

Graph Model                 Unit of Disk Access    Unit of Buffer Storage
Page Connectivity Model     Page                   Page
Block Connectivity Model    Block                  Page
Tuple Connectivity Model    Page                   Tuple

3.2.1. Page Connectivity Model

For this model the assumption is that a pointer consists of a page number and an offset within the page. Here, the connectivity of all the pages in a join operation is represented by the page connectivity graph. Two nodes a and e of the page connectivity graph have an edge (a,e) between them if a tuple in the page corresponding to a joins with a tuple in the page corresponding to e. Figure 3.1 gives an example of a page connectivity graph.

Figure 3.1. An Example of a Page Connectivity Graph.

A component of a graph G is defined to be the maximal connected subgraph of G. For example, in Figure 3.1 there are three components G1, G2 and G3. Components are disjoint; that is, only the tuples within one component may need to be joined, independent of the tuples in other components.
As a result, in order to avoid reaccessing the pages of the joining relations one needs to save, at most, only the pages of a single component in the buffer. It has been shown by Pramanik and Ittner [Pram85b] that as the selectivity factor† increases, the number of components in the page connectivity graph approaches one very quickly (i.e., the graph becomes a connected graph). Note that this does not imply that as the selectivity factor increases the page connectivity graph becomes a complete graph. A graph is said to be complete only if every node has an edge with every other node.

† The selectivity factor is defined to be the ratio of the number of tuples in the result of a join to the sum of the tuples in the joining relations.

Figure 3.2. A Page Connectivity Graph.

Using the page connectivity graph concept, the problem of determining a page access sequence (PAS) which requires the least amount of buffer can be defined as follows:

Given: A page connectivity graph G(V,E), and a positive integer K ≤ |V|, where V is a non-empty set of nodes, E is the set of edges of the graph G, and |V| represents the number of elements in the set V.

Question: Does there exist a page access sequence which requires a buffer of size K or less?

The page access sequence which requires the least amount of buffer (i.e., the smallest K value) is called the Optimal Page Access Sequence and is denoted by OPAS. Here, the problem of determining such an access sequence is referred to as the OPAS-problem. For example, for the page connectivity graph of Figure 3.2 there are 5! possible page access sequences. The idea here is to look for the page access sequence which requires the least amount of buffer. If the nodes of this graph are fetched in the order abced then the required buffer size is 5. On the other hand, if the pages are fetched in the order acdbe then the required buffer size is 3, as shown in Table 3.2.

Table 3.2. The OPAS for the Graph of Figure 3.2.

Content of the Buffer    Size of the Buffer
a                        1
a,c                      2
a,c,d                    3
c,d,b                    3
d,b,e                    3

This is the smallest buffer size one can expect for this page connectivity graph. Thus, the OPAS for this graph is acdbe. In Section 3.4 I show that the problem of computing the least upper bound for K is NP-hard. In Section 3.5 a heuristic procedure with O(n²) time complexity is given. The performance of this heuristic procedure is compared against the optimal page access sequence, which has the complexity of O(n!), where n is the number of data pages in the page connectivity graph.
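The buffer requirement of a page access sequence can be simulated directly from the page connectivity graph, as the sketch below does. Figure 3.2 itself is not recoverable from this copy, so the edge set used here is a hypothetical one, chosen to be consistent with the frontier sets reported for this graph in Section 3.4; under it the sequence acdbe indeed needs a 3-page buffer, as in Table 3.2.

```python
def buffer_demand(edges, sequence):
    # A fetched page stays buffered until every join involving it is done.
    pending = {frozenset(e) for e in edges}     # joins still to be performed
    live, needed = set(), 0
    for page in sequence:
        live.add(page)                          # fetch the page
        needed = max(needed, len(live))         # buffer occupied at this step
        pending -= {e for e in pending if e <= live}   # join what is possible
        live = {p for p in live if any(p in e for e in pending)}
    return needed

EDGES = [("a", "c"), ("a", "d"), ("c", "b"), ("d", "b"), ("d", "e")]  # hypothetical
print(buffer_demand(EDGES, "acdbe"))   # 3
print(buffer_demand(EDGES, "abced"))   # 4 with this edge set (the text's Figure 3.2 yields 5)
```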
3.2.2. Block Connectivity Model

For this model the assumption is that a pointer consists of a block number and an offset within the block. Therefore, the unit of secondary storage access is a block and only the appropriate pages of a block are saved in the main memory. Two blocks are said to be connected if a data page in one is joined with a data page in another. The connectivity of all blocks in a join operation is represented by a block connectivity graph. Figure 3.3 gives an example of a block connectivity graph. The circles in the figure represent the data pages and the rectangles represent blocks. Note that if one page per block is assumed, then the block connectivity graph becomes a page connectivity graph.

Figure 3.3. An Example of a Block Connectivity Graph H(B,E).

The idea here is to look for a Block Access Sequence (BAS) which requires the least number of pages to be saved in the buffer. Such a sequence is referred to as an Optimal Block Access Sequence (OBAS). The problem of determining such a block access sequence is referred to as the OBAS-problem. For example, in Figure 3.3 the block access sequence ABCDE requires 4 pages to be saved in the buffer while the sequence CADEB requires only 2 pages to be saved. An OBAS for this graph is CADEB. Table 3.3 shows the size of the buffer as blocks are fetched in the sequence CADEB.

Section 3.3 shows that the OBAS-problem is NP-hard. In Section 3.4 the problem of computing the least upper bound on the main memory buffer size for the block connectivity model is also shown to be NP-hard.

Table 3.3. The OBAS for the Graph of Figure 3.3.

Block Accessed    Content of the Buffer    Size of the Buffer
C                 c1                       1
A                 a2, a3                   2
D                 a3                       1
E                 e2                       1
B                 --                       0

3.2.3. Tuple Connectivity Model

The tuple connectivity model was first presented in [Pram85b]. For this model, the assumption is that the unit of secondary storage access is a page as in the page connectivity model. However, the size of the buffer is determined in terms of the number of tuples. The problem associated with this model is how to determine, for a given tuple connectivity graph, an ordered list of pages (PAST) which requires the minimum number of tuples to be saved in the buffer. Such a sequence is denoted as OPAST and the problem of determining such a sequence is referred to as the OPAST-problem. A tuple connectivity graph is an edge-weighted graph, where the weights represent the number of tuples of one page to be joined with the corresponding tuples in another page. This implies that the relations have no duplicate values on the joining domains. Figure 3.4 gives an example of a tuple connectivity graph. The OPAST for this graph is ecdba, which requires a buffer storage of only 5 tuples as shown in Table 3.4.

When the join domains have duplicates, the tuple connectivity graph becomes a directed graph. The arc weights represent the number of tuples from the tail page to be joined with tuples in the head page.

Table 3.4. The OPAST for the Graph of Figure 3.4.

Page Accessed    # of Tuples in the Buffer
e                5
c                5
d                3
b                4
a                0

Figure 3.4. An Example of a Tuple Connectivity Graph.

As mentioned earlier, the problem of computing the least upper bound on the buffer size for this model (regardless of duplicate values on the joining domains) is NP-hard [Pram85b]. The next section shows that the OPAST-problem is also NP-hard.

3.3. The OBAS-problem and the OPAST-problem are NP-hard

In this section I show that a special case of the OBAS-problem is NP-hard. Therefore, the general OBAS-problem is NP-hard. This result is then used to show that the OPAST-problem is also NP-hard.

The special case of the OBAS-problem is defined as the problem of determining the OBAS for a block connectivity graph such that a page within a block of the graph may be joined with at most one other page of another block and no other page in any other block. Here, this problem is referred to as the OBAS1-problem. Figure 3.3 shows an example of such a block connectivity graph. Note that for this block connectivity graph, the degree of a node is the same as the number of pages in the block corresponding to that node. The next theorem shows that the OBAS1-problem is NP-hard.

Theorem 1. The OBAS1-problem is NP-hard.

Proof: I present a polynomial time reduction of a known NP-complete problem, minimum cut linear arrangement, to the OBAS1-problem.
The minimum cut linear arrangement problem is defined as follows: Given a graph G(V,E) and a positive integer K, one has to decide if there exists a one-to-one function f: V → {1,2,...,|V|} such that for all i, 1 ≤ i < |V|,

ξ_i = |{(u,v) ∈ E : f(u) ≤ i < f(v)}| ≤ K,

that is, at most K edges cross the cut between the first i nodes of the arrangement and the remaining nodes.

Given an instance G(V,E) of this problem, construct a block connectivity graph H(B,E) as follows: each node of G becomes a block of H whose label l(·), the number of pages in the block, equals the degree of that node, and H has the same edge set as G (i.e., B ≡ V). Let A'_1 A'_2 ... A'_n be a block access sequence of H and let γ_i, for i > 1, be the number of pages that need to be saved in the buffer when block A'_1 is in the buffer and the blocks A'_2, A'_3,...,A'_i are fetched. Note that this is the required buffer size prior to fetching block A'_{i+1}. Also note that when a block is accessed, all the pages of that block are brought into the buffer, and only those pages of the block that are needed for future joins are kept in the buffer. Suppose two consecutive blocks A'_i and A'_j of a block access sequence are accessed; then the number of pages that need to be saved from these two blocks is

l(A'_i) + l(A'_j) - Y_ij - Y_ji,    (1)

where Y_ij = Y_ji = 1 if (A'_i, A'_j) ∈ E, and otherwise Y_ij = Y_ji = 0. The first two terms in the above formula represent the total number of pages in the blocks A'_i and A'_j, respectively, and the last two terms represent the total number of pages that need to be joined and removed from the buffer. In general, when A'_1 is in the buffer and the next i-1, for i > 1, blocks of the access sequence A'_1 A'_2 ... A'_n are fetched, the value of γ_i can be represented as follows:

γ_i = Σ_{j=1}^{i} l(A'_j) - Σ_{j=1}^{i} Σ_{k=1}^{i} Y_jk,    (2)

where Y_jk = 1 if (A'_j, A'_k) ∈ E, and otherwise Y_jk = 0. Since the label on a node of H represents the degree of the node, one can replace l(A'_j) in (2) with p_j, the degree of the node of G placed at position j. Also, since the edge set E is the same for both graphs G and H, and B ≡ V, then Y_jk = X_jk, the corresponding indicator defined on G. Therefore, γ_i = ξ_i ≤ K for i = 2,...,n. This implies that the required buffer size for the block access sequence A'_1 A'_2 ... A'_n is equal to K.

Now I show that if the block access sequence A'_1 A'_2 ... A'_n is an optimal sequence, then the required buffer size for this sequence is also equal to K. If the buffer size for the sequence A'_1 A'_2 ... A'_n is considered, one of the following three cases may occur:

Case 1) l(A'_1) ≥ 1, l(A'_2) ≥ 1, and (A'_1, A'_2) ∉ E. For this case (refer to Figure 3.5a), when block A'_2 is accessed the size of the buffer is ξ_2 = γ_2 = l(A'_1) + l(A'_2). Therefore, l(A'_1) < ξ_i = γ_i ≤ K for i > 1. This implies that the required buffer size for the sequence A'_1 A'_2 ... A'_n is K.

Case 2) l(A'_1) ≥ 1, l(A'_2) > 1, and (A'_1, A'_2) ∈ E. For this case (refer to Figure 3.5b), when block A'_2 is accessed the size of the buffer is ξ_2 = γ_2 = l(A'_1) + l(A'_2) - 2. Since l(A'_2) has at least a value of 2, the size of the buffer for the sequence is equal to K, where K ≥ ξ_2 = γ_2 ≥ l(A'_1).

Case 3) l(A'_1) ≥ 1, l(A'_2) = 1, and (A'_1, A'_2) ∈ E. For this case (refer to Figure 3.5c), when block A'_2 is accessed the size of the buffer is γ_2 = l(A'_1) + l(A'_2) - 2 = l(A'_1) - 1, which is less than l(A'_1). If l(A'_1) ≤ K then K is the buffer size for the sequence. However, when l(A'_1) > K then the buffer size for the sequence is l(A'_1). But for this case I show that there exists another block access sequence which requires a buffer of size less than l(A'_1); this implies that A'_1 A'_2 ... A'_n is not an optimal sequence. We obtain a block access sequence which requires less buffer than the sequence A'_1 A'_2 ... A'_n by exchanging only the ordering of the nodes A'_1 and A'_2 in the sequence. For this new sequence, A'_2 A'_1 ... A'_n, the required buffer size is l(A'_1) - 1 (i.e., K = γ_2 = ξ_2 = l(A'_1) - 1).

Therefore, one can determine the sequence in G(V,E) which gives the minimum cut (i.e., the minimum K value) by determining the block access sequence which requires the minimum size buffer in H(B,E). □

Figure 3.5. The Cases for when the Buffer Size for A'_1 A'_2 ... A'_n is Computed.

Following is an example of the minimum cut linear arrangement problem for a given graph and the construction of the block connectivity graph from it. Figure 3.3 shows the block connectivity graph, H(B,E), constructed from the graph, G(V,E), of Figure 3.6. In the graph of Figure 3.6, for K = 4, the minimum cut linear arrangement can be defined by the function f as follows: f(A') = 1, f(B') = 2, f(C') = 3, f(D') = 4, and f(E') = 5. Table 3.5 shows the value of ξ_i for i = 2, 3, 4.

Table 3.5. Value of ξ_i for the Graph of Figure 3.6.

i    ξ_i
2    |{(A',C'), (A',D'), (A',E'), (B',E')}| = 4
3    |{(A',D'), (A',E'), (B',E')}| = 3
4    |{(A',E'), (B',E')}| = 2

Given any ordered sequence of nodes in graph G one can replace every node in the sequence with its corresponding node of graph H. For example, the sequence A'B'C'D'E' of the graph G can be represented by the block access sequence ABCDE in graph H. Note that the K value for the sequence in G is the same as the buffer size for the corresponding sequence in H. For the sequence A'C'D'E'B' in G the value of K is equal to 2. However, the corresponding block access sequence ACDEB requires a buffer of size 3. Therefore, as mentioned in the proof of Theorem 1 (case 3), by exchanging the ordering of A and C in the above sequence we obtain the sequence CADEB, which requires a buffer of size K = 2.

Figure 3.6. A Graph G(V,E).
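The example can be checked mechanically. The edge set of G in Figure 3.6 is recoverable from Table 3.5, and the sketch below recomputes the cut values ξ_i for the arrangement f(A') = 1, ..., f(E') = 5.

```python
EDGES = {("A'", "C'"), ("A'", "D'"), ("A'", "E'"), ("B'", "E'")}

def cut_values(arrangement, edges):
    # xi_i: edges crossing between the first i nodes and the remaining ones.
    for i in range(1, len(arrangement)):
        head = set(arrangement[:i])
        yield i, sum(1 for (u, v) in edges if (u in head) != (v in head))

f_order = ["A'", "B'", "C'", "D'", "E'"]
for i, xi in cut_values(f_order, EDGES):
    print(i, xi)        # i = 2, 3, 4 give 4, 3, 2, matching Table 3.5
```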
Figure 3.5. The Cases for when the Buffer Size for A'_1 A'_2 ... A'_n is Computed. (Panels (a), (b) and (c) correspond to Cases 1, 2 and 3.)

The following is an example of the minimum cut linear arrangement problem for a given graph and the construction of the block connectivity graph from it. Figure 3.3 shows the block connectivity graph, H(B,E), constructed from the graph, G(V,E), of Figure 3.6. In the graph of Figure 3.6, for K = 4, the minimum cut linear arrangement can be defined by the function f as follows: f(A') = 1, f(B') = 2, f(C') = 3, f(D') = 4, and f(E') = 5. Table 3.5 shows the value of xi_i for i = 2, 3, 4.

Table 3.5. Value of xi_i for the Graph of Figure 3.6.

    i    xi_i
    2    |{(A',C'), (A',D'), (A',E'), (B',E')}| = 4
    3    |{(A',D'), (A',E'), (B',E')}| = 3
    4    |{(A',E'), (B',E')}| = 2

Given any ordered sequence of nodes in graph G one can replace every node in the sequence with its corresponding node of graph H. For example, the sequence A'B'C'D'E' of the graph G can be represented by the block access sequence ABCDE in graph H. Note that the K value for the sequence in G is the same as the buffer size for the corresponding sequence in H. For the sequence A'C'D'E'B' in G the value of K is equal to 2. However, the corresponding block access sequence ACDEB requires a buffer of size 3. Therefore, as mentioned in the proof of Theorem 1 (Case 3), by exchanging the ordering of A and C in the above sequence we obtain the sequence CADEB, which requires a buffer of size K = 2.

Figure 3.6. A Graph G(V,E).
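The cut values xi_i of a given arrangement can be checked directly. A small sketch reproducing Table 3.5; the edge list of Figure 3.6 is read off from the table's cut sets:

    # Cut values of a linear arrangement f, as in Table 3.5.
    E = [('A', 'C'), ('A', 'D'), ('A', 'E'), ('B', 'E')]   # edges of Figure 3.6
    f = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5}

    def cut_values(E, f):
        # xi_i = number of edges (u, v) with f(u) <= i < f(v)
        return {i: sum(1 for u, v in E
                       if min(f[u], f[v]) <= i < max(f[u], f[v]))
                for i in range(1, len(f))}

    print(cut_values(E, f))   # {1: 3, 2: 4, 3: 3, 4: 2}: every cut is <= K = 4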
In the next theorem I use the result obtained in Theorem 1 to show that a special case of the OPAST-problem, call it the OPAST1-problem, is also NP-hard. Therefore, the general OPAST-problem is NP-hard. The OPAST1-problem is defined as the problem of determining the OPAST for a tuple connectivity graph such that there is only one tuple in each page of the tuple connectivity graph that needs to be joined with a tuple in another page. Therefore, the weights on the edges of the corresponding tuple connectivity graph are all 1's. For example, consider the graph of Figure 3.6. An edge (A',C') in this graph represents the need for joining a tuple in A' with a tuple in C'. Note that the weights on the edges are all 1's and are not shown in the figure. Here, the case where the values of the joining attributes are unique is considered. Therefore, when page A' in Figure 3.6 joins with pages C', D' and E', there are three tuples with unique join values in A'. Now I show that the OPAST1-problem is NP-hard.

Theorem 2. The OPAST1-problem is NP-hard.

Proof: I present a polynomial time reduction of the OBAS1-problem to the OPAST1-problem. Note that here when I refer to a block connectivity graph I mean the graph which was considered for the OBAS1-problem. Let a block with p join-participating pages in a block connectivity model represent a data page with p join-participating tuples in a tuple connectivity model. The join of two blocks in a block connectivity model then represents the join of two tuples in their corresponding pages of the tuple connectivity model. Therefore, given a block connectivity graph, one can use the above transformation to construct a tuple connectivity graph with the weights of all edges assumed to be 1. Now given a block access sequence in the block connectivity graph, it is easy to see that the required buffer size, in terms of the number of tuples that need to be saved, is the same for the corresponding page access sequence in the tuple connectivity graph. Therefore, one can determine the optimal block access sequence in a block connectivity graph by determining an optimal page access sequence in the corresponding tuple connectivity graph when the edges have weights of 1. □

3.4. Complexity of Computing the Least Upper Bound on the Buffer Size

In this section I show that the problem of computing the least upper bound for the block connectivity model, denoted by LUBB, is NP-hard. Let us assume that the size of a block in a block connectivity graph is one. Then the block connectivity graph is the same as a page connectivity graph. Therefore, the page connectivity model is a special case of the block connectivity model. To show that the LUBB problem is NP-hard, I will first show that the problem of determining the least upper bound for the page connectivity model, denoted by LUBP, is NP-hard. The LUBP problem is defined as follows:

Given: A page connectivity graph G(V,E) and a positive integer K <= |V|.
Question: Does there exist a page access sequence which requires a buffer of size >= K?

Note that by a page access sequence I mean a sequence where a page is accessed only once. I will show that the LUBP problem is NP-hard. Before proving this, the following definitions and propositions are needed.

A cut of the graph G(V,E) is defined to be two disjoint subsets of nodes X and Y, represented by (X,Y), such that X union Y = V. Figure 3.7 shows a possible cut of the page connectivity graph of Figure 3.2, where X = {b,e} and Y = {a,c,d}.

Figure 3.7. A Possible Cut of the Graph of Figure 3.2.

The frontier nodes for a subset of the cut, say X, are defined to be the set {x : (x,y) in E, where x in X and y in Y}. I define the non-frontier nodes for a subset X of the cut to be the nodes of X all of whose adjacent nodes are also in X. Let F_X and NF_X represent the sets of frontier and non-frontier nodes in the subset X of the cut, respectively. Note that F_X union NF_X = X. In Figure 3.7, F_X = {b,e}, NF_X is empty, F_Y = {d,c}, and NF_Y = {a}. A prefix of a page access sequence is any number of leading nodes of the sequence. A possible page access sequence for the page connectivity graph of Figure 3.2 is abcde, and a prefix for this sequence is abc, which has a length of 3.

Proposition 1. Let (X,Y) be a cut of the page connectivity graph G(V,E) and X,Y be a page access sequence which has the nodes in subset X of the cut as a prefix (regardless of ordering) followed by the nodes in subset Y of the cut (regardless of ordering). Then the X,Y page access sequence requires a buffer of size at least |F_X| + 1.

Proof: After fetching the pages of the subset X of the cut, regardless of the ordering, the only pages remaining in the buffer are the pages in the set F_X. In order to remove any one of these pages from the buffer one needs at least a node from the set F_Y. Therefore, the smallest buffer size needed is |F_X| + 1. □

For example, in Figure 3.7, let Y,X be the page access sequence cadbe. When all the nodes in the subset Y of the cut (i.e., c, a, and d) are fetched, the only nodes remaining in the buffer are the nodes in the set F_Y, which are c and d. In order to remove either of the nodes c or d from the buffer one needs at least one node from the set F_X. Therefore, the buffer size is at least 3.
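These definitions translate directly into code. A sketch computing the frontier and non-frontier sets for the cut of Figure 3.7; the graph literal again follows Figure 3.2 as implied by Table 3.6:

    # Frontier and non-frontier nodes of a cut (X, Y).
    G = {'a': {'c', 'd'}, 'b': {'c', 'd', 'e'}, 'c': {'a', 'b', 'd'},
         'd': {'a', 'b', 'c', 'e'}, 'e': {'b', 'd'}}

    def frontier(G, A, B):
        # nodes of A with at least one adjacent node across the cut in B
        return {x for x in A if G[x] & B}

    X, Y = {'b', 'e'}, {'a', 'c', 'd'}
    print(frontier(G, X, Y), X - frontier(G, X, Y))   # F_X = {b,e}, NF_X empty
    print(frontier(G, Y, X), Y - frontier(G, Y, X))   # F_Y = {c,d}, NF_Y = {a}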
Proposition 2. Let a_1 a_2 ... a_n be a page access sequence of the page connectivity graph G(V,E) with n nodes. Also, let S_i, for 1 <= i <= n, be the required buffer size for a prefix of length i of the page access sequence. Then, for every S_i, 1 <= i <= n, there exists a cut (X,Y) of G.

Proof: I need to show that there exists a cut (X,Y) of G such that S_i is the required buffer size when all the nodes of the subset X of the cut are fetched. The cut can be constructed as follows: place the nodes in a prefix of length i of the page access sequence in the subset X of the cut and the remaining nodes of the sequence in the subset Y of the cut. According to Proposition 1 the required buffer size when all the nodes of the subset X of the cut are fetched is |F_X| + 1. □

Corollary. There is a one-to-one relationship between a cut of a page connectivity graph and the buffer size required for a prefix of a page access sequence.

Proposition 3. For a page connectivity graph G(V,E) and a cut (X,Y) of G, when a node in NF_Y (or NF_X) is placed into X (or Y), the size of F_X (or F_Y) increases by 1.

Proof: Here I show that by placing a node from the set NF_Y into X, the size of F_X increases by 1. Let us assume that there is an edge (a,b) in E such that a is in NF_Y and b is in Y. Then placing the node a into the set X causes a to become a frontier node of X. Therefore, |F_X| increases by 1. □

For example, in Figure 3.7, by placing node a of the subset Y into the subset X of the cut, the size of the set F_X increases by 1. In the next two propositions I show that the maximum |F_X| gives the least upper bound on the size of the buffer. For example, in Figure 3.7, if the two nodes a and c from the subset Y are placed in subset X of the cut, then F_X has the maximum size. Therefore, the least upper bound on the size of the buffer for the page connectivity graph of Figure 3.2 is |F_X| + 1 = 4 + 1 = 5.

Proposition 4. Let (X,Y) be a cut of the page connectivity graph G(V,E). Then F_X union F_Y = V when |F_X| is maximum.

Proof: Our assumption is that |F_X| is maximum. I will show the following:

1) NF_Y is empty. If NF_Y were nonempty, then the placement of some of the nodes in NF_Y into the subset X of the cut would increase |F_X| by at least one, which is a contradiction.

2) NF_X is empty. If NF_X were nonempty, then placing all of the nodes in NF_X into the subset Y of the cut may cause one of the following two events to occur:

a) |F_Y| increases and NF_Y is empty. For this case one might have |F_Y| > |F_X|, which requires switching the labeling of the two subsets of the cut.

b) |F_Y| increases and NF_Y is nonempty. In this case I place the nodes in NF_Y into the cut subset X. According to Proposition 3, this increases the size of F_X by at least one, which is a contradiction.

Note that by placing the pages in the set NF_X into the set Y, the size of F_X cannot be decreased. □

Proposition 5. Let G(V,E) be a page connectivity graph. Then the least upper bound on the size of the buffer is max(|F_X|) + 1, where max gives the maximum value of |F_X| over all cuts of G.

Proof: This follows directly from Proposition 1. □

The following theorem shows that the LUBP problem is NP-hard.

Theorem 3. The problem LUBP is NP-hard.

Proof: I present a polynomial time reduction of a known NP-complete problem, the dominating set problem, to the LUBP problem. The dominating set problem is formulated as follows: Given a graph G(V,E) and a positive integer K <= |V|, one has to decide if there exists a dominating set of size K or less for G, i.e., a subset V' of V with |V'| <= K such that for all u in V - V' there is a v in V' for which (u,v) is in E. K is called the domination number of the graph G if K is minimum among all possible dominating sets for G. The dominating set problem is known to be NP-complete [Gare79]. Suppose that the nodes of the graph G were data pages of the page connectivity graph and the edges in graph G were the edges in the page connectivity graph. Then one can determine the upper bound on the buffer size for the page connectivity graph by determining the domination number of the graph. □
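Proposition 5 suggests a direct, exponential-time check for small graphs: enumerate all cuts and take the largest frontier. A brute-force sketch only; the hardness results above are precisely why no efficient version should be expected:

    from itertools import combinations

    # Least upper bound on the buffer size: max |F_X| + 1 over all cuts.
    G = {'a': {'c', 'd'}, 'b': {'c', 'd', 'e'}, 'c': {'a', 'b', 'd'},
         'd': {'a', 'b', 'c', 'e'}, 'e': {'b', 'd'}}

    def least_upper_bound(G):
        nodes = set(G)
        best = 0
        for r in range(1, len(nodes)):
            for subset in combinations(sorted(nodes), r):
                X = set(subset)
                Y = nodes - X
                F_X = {x for x in X if G[x] & Y}   # frontier of X
                best = max(best, len(F_X))
        return best + 1

    print(least_upper_bound(G))   # 5, matching the example above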
Theorem 4. The problem LUBB is NP-hard.

Proof: Since the problem of computing the least upper bound for the page connectivity model is NP-hard, and the page connectivity model is a special case of the block connectivity model, the LUBB problem is NP-hard. □

3.5. A Heuristic Algorithm for the Page Connectivity Model

Since the problem of determining the OPAS for a given page connectivity graph is a complex combinatorial optimization problem, a heuristic procedure with O(n^2) time complexity is presented here, where n is the number of nodes in the page connectivity graph. This heuristic algorithm determines a page access sequence which requires a near optimal buffer size. This algorithm is efficient when the join factor is high. For small problem sizes, I have compared the performance of the heuristic algorithm with a branch and bound algorithm presented in the next section. The comparison shows that the average buffer size for the heuristic method is very close to optimal (i.e., the buffer size differs from the optimal buffer size by only 2 or 3 pages).

Some definitions are needed before I present the heuristic method. The resident-degree of a node v, written delta_res(v), is defined to be the number of adjacent nodes of v that are in the buffer. A node u is said to be adjacent to node v if there is an edge between them. The nonresident-degree of a node v, written delta_nonres(v), is defined to be the number of the adjacent nodes of v that are not in the buffer. Therefore, the degree of a node v is delta(v) = delta_res(v) + delta_nonres(v). Now the heuristic algorithm is described; an executable sketch of it follows this discussion.

Step 1. Initialize the buffer to empty, and set delta_res(v) = 0 and delta_nonres(v) = delta(v) for all v in V.
Step 2. Add to the buffer a node with the smallest degree. Update the resident-degree and nonresident-degree of its adjacent nodes.
Step 3. Add to the buffer a node with the largest delta_res. If there is more than one such node, choose the one with the smallest delta_nonres. Update the resident-degree and nonresident-degree of the nodes adjacent to the node added to the buffer.
Step 4. Perform the join among the nodes (data pages) in the buffer.
Step 5. Remove from the buffer the nodes having all of their adjacent nodes in the buffer.
Step 6. If the buffer is empty then STOP; otherwise go to Step 3.

Note that Steps 2 and 3 each require one scan of the list of nodes. Therefore, the complexity of this algorithm is in the order of O(n^2), where n is the number of data pages in the graph. Table 3.6 gives the trace of this algorithm for the page connectivity graph of Figure 3.2. Note that the page access sequence is ebdca.

I have applied the above heuristic algorithm to random page connectivity graphs of n nodes and e edges. For each value of n, 20 random graphs with e edges were created. The buffer size for each graph was then computed. Figure 3.8 shows the average buffer size as a function of the edge factor for n = 50, 100 and 200. The edge factor for a graph with n nodes is defined to be the ratio between the number of edges in the graph (i.e., e) and the total possible number of edges between the n nodes, which is n(n-1)/2. In this figure the assumption is that the edge factor can have values up to 1 (i.e., a complete page connectivity graph). However, there are cases in which the value of the edge factor is much less than one. For example, if n pages of relation R need to be joined with m pages of relation S, with the assumption that these pages are disjoint, then the maximum value for the edge factor would be m*n / ((m+n)(m+n-1)/2) < 1.
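A compact sketch of this heuristic follows. The graph literal encodes Figure 3.2; tie-breaking in Steps 2 and 3, which the description leaves open, is resolved lexicographically here, and the nonresident-degree is taken over the not-yet-fetched neighbours:

    # O(n^2) heuristic for a near optimal page access sequence.
    def heuristic_sequence(G):
        remaining, buffer, order, peak = set(G), set(), [], 0
        # Step 2: start with a node of smallest degree
        v = min(remaining, key=lambda u: (len(G[u]), u))
        while True:
            remaining.discard(v)
            buffer.add(v)
            order.append(v)
            peak = max(peak, len(buffer))          # Step 4: joins happen here
            # Step 5: evict pages whose joins are all done
            buffer = {u for u in buffer if G[u] & remaining}
            if not remaining:                      # Step 6
                break
            # Step 3: largest resident degree; ties by smallest nonresident degree
            v = max(remaining,
                    key=lambda u: (len(G[u] & buffer), -len(G[u] & remaining), u))
        return order, peak

    G = {'a': {'c', 'd'}, 'b': {'c', 'd', 'e'}, 'c': {'a', 'b', 'd'},
         'd': {'a', 'b', 'c', 'e'}, 'e': {'b', 'd'}}
    print(heuristic_sequence(G))   # (['a', 'c', 'd', 'b', 'e'], 3)

With a different tie-break in Step 2 the procedure produces the sequence ebdca of Table 3.6 instead; both sequences need a buffer of 3 pages on this graph.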
Table 3.6. Trace of the Heuristic Algorithm for the Graph of Figure 3.2.

    Content of the Buffer    Pages to be Joined    Size of the Buffer
    e                        --                    1
    e,b                      (e,b)                 2
    b,d                      (e,d), (b,d)          2
    b,d,c                    (b,c), (c,d)          3
    d,c,a                    (d,a), (c,a)          3

Figure 3.8. Average Buffer Size for the Heuristic Algorithm. (Average buffer size as a function of the edge factor, for n = 50, 100 and 200.)

3.6. Determining an Optimal Page Access Sequence

In this section I first give a branch and bound algorithm to determine the optimal page access sequence. The complexity of this algorithm is in the order of O(n!). I then compare the performance of the heuristic procedure, given in the previous section, with that of the branch and bound algorithm. A tree organization is used to facilitate the search for an optimal sequence. The branch and bound algorithm chooses the smallest buffer size among all possible paths which start from the root and terminate at leaves. A path of length m from node v to node w in a graph G is an edge sequence from v to w of length m in which the edges are distinct. The complexity of this algorithm, call it algorithm A, is O(n!), where n is the number of nodes in the page connectivity graph. Before I give the algorithm some definitions are needed.

Given the page connectivity graph G(V,E) with n nodes, and an ordered list of nodes v_1, ..., v_n, let alpha_i = {v_k : (v_k, v_p) in E for 1 <= k <= i-1 and i <= p <= n} for 2 <= i <= n-1. Then the size of the buffer for a prefix of length i is xi_i = |alpha_i| + 1, where |x| represents the size of the set x. The size of the buffer for a given page access sequence can now be defined to be max{xi_i : i = 2, ..., n-1}. Let the future buffer size for a prefix of length i of a given page access sequence be |alpha'_i| + 1, where alpha'_i = {v_k : (v_k, v_p) in E for 1 <= k <= i-1 and i+1 <= p <= n} for 2 <= i <= n-1. Note that to determine if a node v belongs to the set alpha_i one needs to scan n-i nodes. However, to determine if a node belongs to the set alpha'_i one needs to scan only n-i-1 nodes. Thus, by using alpha'_i instead of alpha_i in determining the OPAS one can reduce the expansion of the search tree by one level. In the algorithm given below, I first find an ordered list of nodes with the smallest future buffer size and then determine its buffer size. Now I give the algorithm; a sketch of an equivalent search appears after the steps.

Step 1. Consider all the nodes of the graph and generate their children. Determine the future buffer size for all the paths from the roots to their corresponding leaves.
Step 2. Find the path from a root to a leaf with the smallest future buffer size and let B be the last node in the path. If there is more than one such path, choose the longest.
Step 3. Generate the children of the node B, and determine the future buffer size for the new paths generated.
Step 4. If B has only one child then go to Step 5; otherwise go to Step 2.
Step 5. Compute the buffer size for the path starting from the root and ending at the child of B, and then STOP.
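Algorithm A expands an explicit search tree over prefixes. The sketch below is a reformulation rather than the tree search verbatim: it exploits the fact that the pages retained after fetching a set S of pages depend only on S and not on the internal order, so a Dijkstra-style scan over subsets (still exponential in the worst case) also yields an optimal sequence.

    import heapq

    def optimal_sequence(G):
        nodes = frozenset(G)

        def retained(S):
            # pages of S still awaiting a join partner outside S
            return sum(1 for v in S if G[v] - S)

        heap, best, tick = [(0, 0, frozenset(), [])], {frozenset(): 0}, 0
        while heap:
            cost, _, S, order = heapq.heappop(heap)
            if S == nodes:
                return order, cost
            if cost > best.get(S, float('inf')):
                continue
            for v in nodes - S:
                c = max(cost, retained(S) + 1)   # buffer while fetching v
                T = S | {v}
                if c < best.get(T, float('inf')):
                    best[T] = c
                    tick += 1
                    heapq.heappush(heap, (c, tick, T, order + [v]))

    G = {'a': {'c', 'd'}, 'b': {'c', 'd', 'e'}, 'c': {'a', 'b', 'd'},
         'd': {'a', 'b', 'c', 'e'}, 'e': {'b', 'd'}}
    print(optimal_sequence(G))   # an optimal sequence with buffer size 3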
Figure 3.9 gives the trace of algorithm A for the page connectivity graph of Figure 3.2. The number on a node of the graph represents the future buffer size for a path starting at the root and ending at that node. The dotted edges in the graph represent the optimal page access sequence, which is the sequence acdbe. The size of the buffer for this sequence is 3. As mentioned earlier, this algorithm is practical only for small problems. I have compared the average buffer size required for random graphs of 5 up to 10 nodes for both the heuristic method and algorithm A. Figure 3.10 shows the result of this comparison. As seen from the figure, the heuristic algorithm determines a page access sequence of the random graph which requires an average buffer size very close to the optimal.

Figure 3.9. Trace of Algorithm A for the Graph of Figure 3.2.

Figure 3.10. Average Buffer Size for the Heuristic and the A Algorithms. (Average buffer size as a function of the number of nodes, 5 to 10.)

3.7. Cost Models and Performance Comparisons

In this section the performances of the fragmentation technique [Sacc86a], the sort-merge join algorithm [Blas77] and an algorithm based on the page connectivity model are compared. This last algorithm, call it algorithm P, uses the heuristic method given in Section 3.5 to determine the near optimal access sequence. The following notations are used to derive the cost of each algorithm:

    R_i    Relation i
    N_i    Number of tuples in relation i
    E_i    Average number of tuples from R_i in a data page
    L      Average number of (key value, pointer) pairs in a leaf page of an index
    M      Number of pages of available main memory buffer space
    P_i    Join factor for relation i

Now I briefly describe each algorithm and compute its I/O cost.

3.7.1. Cost of the P Algorithm

For two relations R_1 and R_2, this algorithm proceeds as follows:

1) Scan the indices on the join attributes of both relations in order to obtain a set of pointer pairs. The cost for this step is (N_1 + N_2)/L.

2) Use the heuristic algorithm given in Section 3.5 to determine the near optimal page access sequence. Our assumption here is that enough memory is available to access the desired pages without any reaccesses. Therefore, the cost for this step is zero.

3) Fetch the pages according to the sequence obtained in step 2 and perform the join. The cost for this step is tau(P_1, N_1, E_1) + tau(P_2, N_2, E_2), where tau(p,n,e) is the expected number of pages which must be accessed in order to fetch pn tuples of a relation containing n tuples distributed evenly, e per page. tau(p,n,e) has been derived in [Good80] and is given below:

    tau(p,n,e) = n/e                             if 1 - e/n <= p <= 1
    tau(p,n,e) = (n/e)(1 - ((n-e)/n)^(pn))       if p < 1 - e/n

3.7.2. Cost of the Fragmentation Technique

Fragmentation computes the natural join of two relations by recursively partitioning two unordered relations into fragments of size less than or equal to M-1 pages. The partitioning is done using a hashing function. The problem of performing a join between two relations R_1 and R_2 is then reduced to the problem of joining the corresponding fragments of each relation. The cost of partitioning R_i into M-1 fragments is 2H_i, where H_i = N_i/E_i, for i = 1,2. In order for each fragment of the smaller relation, say R_2, to fit in the buffer, partitioning can be recursively applied to each fragment produced by the previous phase. Therefore, phase k produces (M-1)^k fragments, and the size of the kth fragment is H_2/(M-1)^k. The total number of partitioning phases is thus ceil(log_{M-1} H_2) - 1, and the total cost of partitioning relation R_i is 2H_i(ceil(log_{M-1} H_2) - 1). The total cost of the fragmentation join is:

    2H_1(log_{M-1} H_2) + 2H_2(log_{M-1} H_2) - H_1 - H_2
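The cost expressions of Sections 3.7.1 and 3.7.2 in executable form. The second branch of tau follows the reconstruction above and may differ in detail from the exact expression of [Good80]; the parameter values in the example are hypothetical:

    import math

    def tau(p, n, e):
        # expected pages accessed to fetch p*n of n tuples stored e per page
        if p >= 1 - e / n:
            return n / e
        return (n / e) * (1 - ((n - e) / n) ** (p * n))

    def p_algorithm_cost(N1, E1, P1, N2, E2, P2, L):
        # step 1: index scans; step 2 is free; step 3: fetch the data pages
        return (N1 + N2) / L + tau(P1, N1, E1) + tau(P2, N2, E2)

    def fragmentation_cost(H1, H2, M):
        k = math.ceil(math.log(H2, M - 1))   # ceil(log_{M-1} H2)
        return 2 * H1 * k + 2 * H2 * k - H1 - H2

    print(p_algorithm_cost(N1=10000, E1=10, P1=0.1, N2=10000, E2=10, P2=0.1, L=100))
    print(fragmentation_cost(H1=1000, H2=1000, M=50))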
3.7.3. Cost of the Sort-Merge Algorithm

This algorithm requires the initial sorting and multiway merging of both joining relations on their join attributes, and finally the merging of the two sorted relations. The cost for this algorithm is:

    H_1 + 2H_1(log_{M-1} H_1) + H_2 + 2H_2(log_{M-1} H_2)

3.7.4. Cost Comparisons

Figure 3.11 shows the cost of the three algorithms as a function of chi for two different join factors. chi is defined to be the ratio of the buffer size to the number of pages of the joining relations, M/(H_1 + H_2). For join factors of 0.5, algorithm P and the fragmentation technique perform almost the same, because for algorithm P almost all the pages of the joining relations have to be fetched.

Figure 3.11. Comparisons of the Sort-Merge, Fragmentation, and P Algorithms. (Number of disk accesses as a function of chi; (a) join factors P_1 = P_2 = 0.1, (b) join factors P_1 = P_2 = 0.5.)

CHAPTER 4
JOIN AND SEMIJOIN ALGORITHMS BASED ON THE PARTIAL-RELATION SCHEME

4.1. Introduction

An index on an attribute of a relation permits rapid access to a single tuple that has a desired value in that attribute. However, for multi-attribute queries involving join and restrictions several indices may need to be accessed. Hashing is another method for fast access to the desired tuples of a relation. In addition, this technique has been used for multi-attribute search [Burk76, Ullm82] and for join-only queries [DeWi84, Shap86]. However, one may need to scan several hash tables in order to process a query involving restrictions and join. In [Tana79] a new access method called a Relational Inverted Structure (RIS) has been proposed, which is a combination of indexing and hashing techniques. An RIS maintains the cross references of the joining attributes as well as the hash values of the remaining attributes of the relations. However, an RIS may be needed for every set of attributes corresponding to the same domain. Furthermore, no extensive analysis has been done on their performance. File compression techniques have also been proposed for partial match retrieval [Sack83]. These techniques are not suitable for the join operation because the join values have to be compressed uniformly across the files. Thus, coding of join attribute values requires global statistics, which is more difficult to apply for file compression. Here we propose a new access path to a relation, called a Partial-Relation scheme, or PR in short, in order to speed up the relational join operations.

A Partial-Relation scheme is a projection of the relation on the join and restriction attributes of the queries [Pram86a, Foto88]. Further, the values of an attribute in a Partial-Relation scheme are obtained by applying a transformation function to the original values. The objective of this transformation function is to reduce the size of the attribute values. As a result, the amount of storage required by a Partial-Relation is also reduced. Known transformation functions such as hashing and/or compression techniques may be used to reduce the attribute values. Note that a Partial-Relation is created based on the statistics of the queries. As I will show, the PRs facilitate the implementation of queries involving restrictions and join. Two join algorithms are presented, for the index-scan and the hybrid classes. These algorithms preprocess the Partial-Relations first and then join the selected tuples of the relations.
The performances of these algorithms are compared with the sort-merge and the hybrid-hash join algorithms (the two best known algorithms of the relation-scan class). The analysis is based on the cost of accesses to the secondary storage and the CPU usage. It is shown that for a wide range of restriction factors† and/or join factors the proposed algorithms perform better than the sort-merge and hybrid-hash join algorithms. We have considered join-only queries and queries involving restrictions, projections and join.

The organization of this chapter is as follows: Section 4.2 describes the Partial-Relation schemes. In Sections 4.3 and 4.4 the join and the semijoin algorithms based on Partial-Relation schemes are presented. The performances of these algorithms are compared with those of the sort-merge and the hybrid-hash.

† The restriction factor for a relation is defined to be the ratio between the number of tuples in the relation that satisfy the restrictions and the size of the relation.

4.2. An Access Path Based on a Partial-Relation Scheme

In this section I first describe the Partial-Relation scheme and then discuss the advantages and disadvantages of using PRs in performing join and restrictions over using multiple secondary indices or hashing techniques.

4.2.1. Description of Partial-Relation Schemes

Let R(A_1, A_2, ..., A_n) be a relation scheme with n attributes, A_1, A_2, ..., A_n, such that the first k attributes of R are participating in the joins or restrictions. Let |R| be the total number of tuples in R, and //R// be the amount of storage required by R. A Partial-Relation scheme for R, or PR(R), is defined to be a relation scheme with k+1 attributes where the first k attributes are those of R and the (k+1)th attribute is an access attribute, or AA, which points to the corresponding tuple in R. The attribute values in PR(R) are derived by applying a Value-Reducer function which, for example, can select a few characters out of all the characters of an attribute value. Here, a set of Value-Reducer functions, one for each attribute of the PR, is defined. Figure 4.1 gives an example of a Partial-Relation scheme for the EMPLOYEE relation.

Relation EMPLOYEE

    rec.#  ename       address               telephone  dept         manager
    1      Buckner,J.  162 Haslett, E.Lan    332-3860   Pharmacy     Brown,D.
    2      Buckman,J.  299 Cleveland, E.Lan  227-1759   Shoe Repair  Smith,A.
    3      Buckman,P.  160 Kalamazoo, E.Lan  335-2343   Photography  Clare,R.
    4      Buckley,J.  452 Garfield, E.Lan   353-1759   Pharmacy     Brown,D.
    5      Blerth,H.   684 Cedar, E.Lan      350-1775   Photography  Clare,R.

Value-Reducer functions:
    F(ename) = H(manager) = the first three characters of the attribute value
    G(dept) = the first two characters of the attribute value

PR(EMPLOYEE)

    ename  dept  manager  AA
    Buc    Ph    Bro      1
    Buc    Sh    Smi      2
    Buc    Ph    Cla      3
    Buc    Ph    Bro      4
    Ble    Ph    Cla      5

Figure 4.1. An Example of a Partial-Relation.

The problem of determining an optimal set of attributes for a PR is similar to the secondary index selection problem [Come78], which is known to be NP-complete. Low cost heuristics for near optimal solutions have been given in [Schk75]. These algorithms are based on the statistical properties of the queries. To add a new attribute to an existing PR one may scan the original relation and then create a new PR by applying the Value-Reducer functions to the values of the new set of attributes. I show in Appendix A that the cost of creating an index is more than the cost of creating a PR.
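A sketch of PR construction with truncating Value-Reducer functions, mirroring Figure 4.1; the two-tuple relation here is an abbreviated stand-in, and real reducers could also hash or compress the values:

    # Build a Partial-Relation from a relation and per-attribute reducers.
    def make_pr(relation, reducers):
        pr = []
        for aa, t in enumerate(relation, start=1):
            row = {attr: f(t[attr]) for attr, f in reducers.items()}
            row['AA'] = aa   # access attribute: points back to the tuple
            pr.append(row)
        return pr

    employee = [
        {'ename': 'Buckner,J.', 'dept': 'Pharmacy',    'manager': 'Brown,D.'},
        {'ename': 'Buckman,J.', 'dept': 'Shoe Repair', 'manager': 'Smith,A.'},
    ]
    reducers = {'ename':   lambda v: v[:3],   # F: first three characters
                'dept':    lambda v: v[:2],   # G: first two characters
                'manager': lambda v: v[:3]}   # H: first three characters
    print(make_pr(employee, reducers))
    # [{'ename': 'Buc', 'dept': 'Ph', 'manager': 'Bro', 'AA': 1}, ...]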
Processing a query using PRs is done in two phases. In the first phase, one performs restriction, and join or semijoin operations, on the PRs. This produces a set of pointers to the resulting tuples of the original relations. In the second phase, the final join and restriction operations are performed on these resulting tuples. The PRs contain partial values and as a result operations on them may produce a few false tuples†. Therefore, at the end of the first phase a superset of the tuples to which the join operation should be applied is obtained. The cost of processing a query is the sum of the cost to preprocess the PRs and the cost to process the resulting tuples. The cost of preprocessing the PRs depends on //PR//. In order to minimize the overall cost a set of Value-Reducer functions is defined which (a) minimizes //PR// and, at the same time, (b) decreases the number of false tuples. In general, these two objectives conflict with each other. By defining an appropriate set of Value-Reducer functions one may be able to avoid the problem of condition (b). However, this may not satisfy condition (a). Note that allowing false tuples in a PR may be advantageous in reducing the size of the PR. As we will show in Section 4.4.1, for a small percentage of false tuples (up to 40%) the performance of the proposed algorithms remains almost the same if the restriction factor is low.

† False tuples are tuples which seem to satisfy the query when viewed in the reduced space, but actually are not qualified tuples [Lum70].

The number of false tuples depends on the ratio between the number of unique attribute values in a PR and in the original relation. An analytical model is developed to compute this ratio, M, for N randomly generated unique attribute values. The ratio is given by

    M = (A^X / N) (1 - C(A^L - A^(L-X), N) / C(A^L, N))

where C(n,r) denotes the binomial coefficient, A is the alphabet size, and X characters of each value of length L are selected by a Value-Reducer function. The derivation of this expression is given in Appendix B. To validate the analytical result, we have computed the percentage of false tuples (i.e., 1 - M) for 1000 randomly selected persons' names of up to 20 characters. I have also computed the percentage of false tuples for 600 randomly selected telephone numbers of 7-digit length. The results of these experiments as well as those of the analytical model are given in Figure 4.2. The graph of the figure shows that the percentage of false tuples approaches 0 rapidly with an increasing value of X. Thus, by choosing only a few characters of the attribute values one can guarantee few false tuples. The graph also shows that the percentage of false tuples for the experimental data is less than that of the analytical model. This is because the adjacent characters of the experimental data are statistically dependent. The effect of a set of Value-Reducer functions on //PR// is also analyzed in [Pram85a, Pram86b].

Figure 4.2. Number of Characters Selected vs. Fraction of False Tuples. (Curves: A = experimental, alphabet size 26, N = 1000; B = analytical, alphabet size 26, N = 1000; C = experimental, alphabet size 10, N = 600; D = analytical, alphabet size 10, N = 600.)
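The expression for M evaluates directly; a small sketch, where the incremental product avoids forming the huge binomial coefficients explicitly:

    # Expected ratio M of unique reduced values to original unique values:
    # alphabet size A, value length L, X characters kept, N unique values.
    def unique_ratio(A, L, X, N):
        D, K, G = A ** L, A ** (L - X), A ** X
        if D - K < N:
            p_missed = 0.0      # every reduced value is certain to occur
        else:
            p_missed = 1.0      # probability a fixed reduced value is absent
            for i in range(N):
                p_missed *= (D - K - i) / (D - i)
        return G * (1 - p_missed) / N

    for X in range(1, 7):       # false-tuple fraction 1 - M, as in Figure 4.2
        print(X, round(1 - unique_ratio(A=26, L=20, X=X, N=1000), 3))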
4.2.2. Partial-Relation Scheme vs. Indexing and Hashing

As mentioned earlier, the primary function of the PRs is not to perform an efficient restriction on them, as it is in conventional indexing, but to perform an efficient join. Therefore, for simple restriction queries with a very low restriction factor it is advantageous to use conventional indexing or hashing techniques. This is because PRs are scanned sequentially even for a low restriction factor. In general, PRs are suitable for applications where complex queries are processed. To process a query of this type, one needs to scan only one PR per relation. However, in methods using secondary indices, one may need to scan several indices and perform an intersection to obtain the set of keys or pointers to the relation.

Lum [Lum70] proposed a compound indexing scheme in order to retrieve all the tuples satisfying a partial-match query in one access. A compound indexing scheme consists of a set of secondary index files with a common set of index fields in each file, but each file is ordered on different index fields. Therefore, the implementation of compound indexing requires a lot of storage. On the other hand, if a serial filing method is used, only one index file is created for all the attributes to be indexed. General retrieval through this index file, however, may require a large number of accesses because the index is scanned sequentially. Note that a PR is a special form of a serial filing method. However, the size of a PR is much smaller than that of an index because each value in a PR is reduced by a Value-Reducer function. Therefore, sequential scanning of a PR is not as costly as sequential scanning of an index in a serial filing method. At the same time, because of the reduction, false tuples may result in processing a PR.

To process join-only queries with low join factors it is more cost effective to preprocess the PRs than to use the hashing techniques suggested in [DeWi84, Shap86]. This is because in hash-based join methods the original relations are scanned several times, while preprocessing PRs requires at most one scan of the original relations. Further, the preprocessing also reduces the cost of the second phase considerably because the sizes of the relations are reduced due to the low join factors. On the other hand, for very high join factors preprocessing PRs is not as effective as the hash-based join algorithms. Since PRs are centralized resources, the degree of inter- and intra-query concurrency is decreased as compared with methods using multiple secondary indices. However, the cost of updating a single PR is much less than that of maintaining several indices. The overall storage requirements for a database will increase when PRs are used. However, the size of a PR is much less than that of the corresponding indices because of the duplicate sets of tuple pointers which appear in each index. Further, the storage utilization for a PR is close to 100% (a PR does not have to be ordered), whereas in a B-tree the average storage utilization is about 69%.

4.3. Algorithms Based on the Partial-Relation Schemes

In this section two algorithms for a relational equijoin operation between two relations, R and S, are presented. The first algorithm, PRJ, is based on joining PRs, and the second algorithm, PRS, performs a semijoin between the two PRs. The performances of the above algorithms are analyzed using the following form of the query:

    PROJECT(JOIN(RESTRICT(R), RESTRICT(S)))

4.3.1. Algorithm PRJ of the Index-Scan Class

This algorithm first applies the restrictions on the PRs and then joins the selected tuples of the two PRs. The result of this join is a set of AA-value pairs. Using these AA-values, the tuples of the original relations are fetched and joined. The details of this algorithm are as follows (a sketch follows the steps):

a) Apply restrictions and projections on PR(R) by scanning it sequentially. The projection is done only on the join and the access attributes. Store the result in a file R'. Perform similar operations on PR(S) and store the result in a file S'.

b) Join R' and S' and obtain a set of AA-value pairs which point to the joining tuples of R and S. The joining of R' and S' is done by first sorting them and then merging the two sorted files.

c) Fetch the tuples of R and S using these AA-values. Then perform the final restriction, join and projections on these tuples.
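A sketch of PRJ over in-memory lists; files, on-disk sorting, and the final projection are elided, relations are lists of dicts, the AA-values index them 1-based as in Figure 4.1, and step (b) is written as a nested scan for brevity where the text merges two sorted files:

    # Algorithm PRJ: join the restricted PRs, then fetch by AA-values.
    def prj(pr_r, pr_s, R, S, restrict_r, restrict_s, attr):
        # (a) restrict each PR; project on the join attribute and AA
        r1 = sorted((t[attr], t['AA']) for t in pr_r if restrict_r(t))
        s1 = sorted((t[attr], t['AA']) for t in pr_s if restrict_s(t))
        # (b) join the two sorted files on the reduced values -> AA pairs
        pairs = [(ra, sa) for rv, ra in r1 for sv, sa in s1 if rv == sv]
        # (c) fetch the original tuples; the final join on the full values
        # discards the false tuples introduced by the reduction
        return [(R[ra - 1], S[sa - 1]) for ra, sa in pairs
                if R[ra - 1][attr] == S[sa - 1][attr]]

    R = [{'a': 'apple'}, {'a': 'cherry'}]
    S = [{'a': 'cherry'}, {'a': 'grape'}]
    pr = lambda rel: [{'a': t['a'][:2], 'AA': i} for i, t in enumerate(rel, 1)]
    print(prj(pr(R), pr(S), R, S, lambda t: True, lambda t: True, 'a'))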
4.3.2. Algorithm PRS of the Hybrid Class

This algorithm uses the semijoin and sort-merge techniques to process the above query. The semijoin operation is first performed on the PRs in order to reduce the size of the relations. The reduced relations are then joined using the sort-merge technique. For the semijoin operation a hashing function as well as Boolean arrays are used. The PRS algorithm is good for join-only queries. Also, as explained in Section 4.4.1, for the general queries given above, the PRS algorithm performs better than the PRJ algorithm for high restriction or join factors. This is because the PRS algorithm reduces the relations by means of the semijoin operation and not by using AA-values. On the other hand, the PRJ algorithm uses the AA-value pairs and, therefore, for high restriction factors the number of accesses may be more than the number of pages in the relations. The details of the PRS algorithm are as follows (a sketch follows the steps):

a) Apply the restrictions on the PRs and perform the semijoin on them. Store the result of this semijoin in a bit array.

b) Scan relation R, and for each tuple hash its join attribute value and probe the bit array for a match. In case of a match store the subtuple in a temporary file R'. Create S' from S in the same way as R'. Note that R' and S' contain the entire length of the key values rather than partial key and AA-values, as for the previous algorithm.

c) Join R' and S' by first sorting them on their join attribute values and then merging the two sorted files.
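A sketch of PRS in the same style; a Python set of hash buckets stands in for the Boolean bit array, the reducer is applied to the full values when the relations are scanned so that they hash like the PR entries, and the bit-array size is an illustrative constant:

    # Algorithm PRS: semijoin the PRs through a bit array, then sort-merge.
    def prs(R, S, pr_r, pr_s, restrict_r, restrict_s, reduce_, attr, bits=1 << 16):
        # (a) restrict both PRs; the semijoin survivors set bits
        b_r = {hash(t[attr]) % bits for t in pr_r if restrict_r(t)}
        b_s = {hash(t[attr]) % bits for t in pr_s if restrict_s(t)}
        match = b_r & b_s
        # (b) scan each relation and probe the bit array with the reduced
        # join value; keep matching tuples in sorted temporary files
        r1 = sorted((t for t in R if hash(reduce_(t[attr])) % bits in match),
                    key=lambda t: t[attr])
        s1 = sorted((t for t in S if hash(reduce_(t[attr])) % bits in match),
                    key=lambda t: t[attr])
        # (c) merge the two sorted files on the full join values
        return [(r, s) for r in r1 for s in s1 if r[attr] == s[attr]]

    R = [{'a': 'apple'}, {'a': 'cherry'}]
    S = [{'a': 'cherry'}, {'a': 'grape'}]
    red = lambda v: v[:2]
    pr = lambda rel: [{'a': red(t['a'])} for t in rel]
    print(prs(R, S, pr(R), pr(S), lambda t: True, lambda t: True, red, 'a'))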
Finally ‘81 the cost under condition C is 2 FNi/EfiUiNi/Eiai'lozu-r(JiNi/mM -1)Ei)))- II 82 APPENDIX D Appendix D Execution Time of the Hybrid-Hash, SM, PRS and MPRS Algorithms Herewegivethecostofthehybrid-hash,SM.PRSandMPRSjoinalgorithms basedonthenumberofdiskIlOsandthecpuusage. Thefollowingpar'ameterawhich aregivenin[DeWi84],areusedforexpressingthecostoftheabovealgorithms: comp timetocomparetwokeyvalues hash timetohashakey move timetomoveatuple swap timetoswaptwotuples [0359 time for sequential IIO operation 10m timeforrandomIIOoperation F universal ”fudge" factor. This factor is used for expressing for example, the exu'astoragethatarelationneedtoholditshashtableinthebufl'er. a For the hybrid-hash the cost of joining two relations R and S is 2’15, where ill 111: (IR I+IS ”£0339. Thecostofreadingtherelations. ll'2= (//R //+//S/I)xllaslr. The cost for partitioning the relations. h3= (IIR//+//S//)x(1-q)xmove. The cost of moving the tuples to the output buffers. ”to! ‘3 IR 1 ‘ In: (R I+ IS I)x(l-q)xlom. The cost of writing from output buffers. Its: (l/R//+//S//)x(l-q)xllaslr. The cost for building hash tables for R and finding where to probe for S. llg= l/S/lexcomp. The cost for moving tuples to hash tables for S. 83 34 In: //R//xmove. The cost for moving tuples to hash table for R. M: (IR I+IS I)x(1-q)time10359. The cost for reading the partitions into the memory. 6 The cost of the SM join algorithm is 2a, where i=1 s1= (IR l+ IS I)xlOsg-Q. The cost of accessing the relations. s2: (//R l/log2 yp-gH/S/Ilogz-L—goxkomp +swap ). The cost of inserting tuples into priority queue to form initial runs. MR and MS specify the number of tuples from Rand S, respectively, that can fit in main memory at once. S3= (IR I+IS I)>