.\_ ‘ 1‘," ‘v HE. ‘ Inn V 'x 4-. w I I ‘ ""«s~ Viv- ”nun.” . u ”.1" \ va a??? ‘ 1;;‘7‘0‘7" ‘. u ‘ . “~ In _\, .V, x 1 Iv 5-"in ( ' - ,M‘u ‘1 \‘ 3mm!" ’1- l f ‘1 w' . 31" ' ' 1 4 ’ "V ‘ . .V‘Y'Sv. ‘If-‘fllfi‘l'f. l' r w ‘ Alli! 1 H x \I, ”1'..- ("A I if; ,I . H ‘fwyfé l ‘ f ‘w‘. 1 S (44‘s't‘J-XU4‘MI’.‘ ‘ . 7‘ #3914014: ' I 7 ‘l I ' -,. u l . dual,” ‘ H' .s’l 2t :2‘Ijt‘,’t ‘" a, 4‘“: ‘ E W Lib. mid] w». ‘ pxr‘.‘ . a'fg’A‘Q' ;. , MM" 44‘ ' w- i" . ._ . , ,‘(Q‘N‘Kb' ‘ '4 ‘4'} ,r v l «’14 ‘ $741. , 4» II 2 , J l ( :r‘ 3‘3: . ”(Hr ’5’! 'Y' {My {WAY I u" n Gill‘v‘}! ' 4 § ( E . "I ' ‘ v‘: fly} 3)} ‘ V» .v .gv“: ‘ 0”}qu I.“ {(435 ‘ d l‘ 4 1 . ‘ n ; {5.}??? 5: ' '3',“ fin???“ J , . ‘ M“ l“’ r: .... "um. ‘ N’fl'fl .a 14%: ‘aw 7AA," ~ K." '1'; u "\ J‘ ~ I.“~\‘ \ a‘. V ‘I " ‘ . ‘ x , h 3:9533‘2’!‘ ‘1‘ 4h V}; v . v ; i . ‘6‘ 5:.“ ’3‘ ‘, vx‘ l‘q» ~ 1?in .mva‘v ,.v i» i V ' ‘s 2‘». I; . 4V; 4 v" 0". no! ‘ . \.;. 31’» ‘I. ‘4 ‘ 1“ ‘4 Ma: 1'" .1 {.1}, y ,I , 43‘s“ .,. 31,1!44- '; J. \V,‘ . r‘. a .‘ ‘ .. 4 .4-, A1). ’ I , :- w“ “(Ma 15‘1”" “"5?” ‘ ‘ 'Tt‘qf."~;‘,¢‘u, (,Hu‘lp‘. l.,“l(‘l . ' ,l (K; “I. "act”; '9' Q6,“ \ A345. 34} 0 7“,"; 1- y’k‘ (Fig: "' ‘1‘. 4‘ FHA]. ". fla- 1‘ l.- J [Willilllfl/lllllllllHill!!!llIll/IllH/ll/lll 93 00552 6797 This is to certify that the dissertation entitled DATA DISTRIBUTION STRATEGIES FOR PARALLEL DATABASE ACCESSES presented by Myoungho Kim has been accepted towards fulfillment of the requirements for #degree in om u er ience MM Major professor Date ML MSU is an Affirmative Action/Equal Opportunity Institution 0—12771 LIBRARY Michigan State University A UM,...n._ 4‘_4 MSU LIBRARIES m RETURNING MATERIALS: Place in book drop to remove this checkout from your record. FINES will be charged if book is returned after the date stamped below. DATA DISTRIBUTION STRATEGIES FOR PARALLEL DATABASE ACCESSES By Myoungho Kim A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Computer Science 1989 ‘- , If -' , Z» ”N i 57;: ABSTRACT DATA DISTRIBUTION STRATEGIES FOR PARALLEL DATABASE ACCESSES By Myoungho Kim With the advent of commercially available general purpose multiprocessing sys- tems, the need for developing appropriate information processing systems are increas- ingly recognized. Since many database applications require a large number of data accesses with relatively less computation, exploiting parallel data accesses is important to improve performance in parallel processing database systems. In this thesis we investigate data distribution strategies for parallel processing of database systems. The primary objective is to maximize throughput and minimize response time through con- current data accesses. We propose database processing models as a general framework, and then present data distribution strategies for three common types of database applica— tions. Two of these applications are on multikey hash files and the third application is on B-tree accesses. First, we present data distribution strategies for partial match queries. The main contribution here is the development of new data distribution methods called Fieldwise eXclusive-or (FX) distribution methods to achieve optimal file distribution. An algebraic property of exclusive-or Operation along with field transfor- mation techniques are fundamental to these data distribution methods. We show that the proposed data distribution methods perform better than the others proposed in the past. We also present efficient data construction methods based on the usage of multi- key hash directory. Second, optimal distribution for parallel processing of multiattri- bute range queries is investigated. Here, we show that for various types of multiattribute range queries there are inherent limitations in achieving optimal distribu- tion. We extend FX distribution methods to achieve optimal distribution for many use- ful multiattribute range queries. For both partial match queries and multiattribute range queries, sufficient conditions for optimal distribution by the proposed distribution methods are given. Finally, we present node partitioning schemes for B-tree type indexes. The objective here is to develop a new parallel processing scheme for B-tree type indexes stored in parallel disks. We show that parallel processing of the proposed partitioned node B-trees performs better than parallel processing of conventional B- trees. This work presents a new basis on which parallel processing systems for other database applications can be designed. To my parents Han Cheol Kim and Kye Soon Lee iv ACKNOWLEDGEMENTS I wish to express my appreciation to my advisor, Dr. Sakti Pramanik, for his gui- dance over the years. Dr. Pramanik provided valuable comments and constant support in every phase of the preparation of this work. I would also like to thank Dr. Lionel M. Ni and Dr. Abdol Esfahanian for their many valuable suggestions and comments. I also wish to express my appreciation to Dr. Dorian Feldman for his encouragement and sup- port. Without their intellectual advice and inspiration, this dissertation would not have been possible. I would like to thank all the people who helped me during my stay at Michigan State University. Finally, I would like to thank my parents, my wife and my son for their constant support, patience, and love. TABLE OF CONTENTS List of Tables ............................................................................................................ ix List of Figures .......................................................................................................... x Chapter 1. Introduction .......................................................................................... 1 1.1. Database Machines ......................................................................................... 1 1.2. Main Memory Databases ................................................................................ 3 1.3. Concurrent Dynamic Search Structures ......................................................... 3 1.4. Classification of Database Queries ................................................................. 4 1.5. Problem Statement .......................................................................................... 5 1.6. Thesis Overview ............................................................................................. 6 Chapter 2. Database Processing Models for Parallel Processing ........................ 8 2.1. General Parallel Processing Model for Database Systems ............................. 8 2.2. Two Stage Parallel Processing Model for Database Systems ........................ 10 Chapter 3. Parallel Processing Strategies for Partial Match Queries ................ 14 3.1. Introduction .................................................................................................... 14 3.2. Definitions and Terminology .......................................................................... 16 3.3. Basic FX Distribution Method ....................................................................... 19 3.4. Extended FX Distribution Methods ................................................................ 24 3.4.1. Field Transformation Functions for Partial Match Queries .................... 25 3.4.2. I and U Field Transformation Functions ................................................. 27 3.4.3. I and I Ux Field Transformation Functions .............................................. 29 vi 3.4.4. U and I Ux Field Transformation Functions ............................................. 35 3.4.5.1U1, IU2, . . . , [Ux Field Transformation Functions .............................. 40 3.4.6.1, U and I Ux Field Transformation Functions ......................................... 46 3.5. Performance Comparison with Other Distribution Methods .......................... 53 3.5.1. Probability of Strict Optimality ............................................................... 54 3.5.2. Average Response Time .......................................................................... 55 3.6. Data Construction Methods ............................................................................ 62 3.6.1. Data Construction Based on Real Global Directory ................................ 63 3.6.2. Data Construction Based on Virtual Global Directory ............................ 66 Chapter 4. Optimal Data Distribution for Multiattribute Range Queries .................................................................................................................................... 68 4.1. Introduction .................................................................................................... 68 4.2. Definitions and Terminology .......................................................................... 69 4.3. Limitations of Perfect Optimal Distribution ................................................... 73 4.3.1. Type (0 - 2) Range Queries ..................................................................... 74 4.3.2. Type (0 - 1) Range Queries ..................................................................... 78 4.3.3. Type 0 Range Queries ............................................................................. 82 4.4. Optimal Data Distribution Methods for Range Queries ................................. 82 4.4.1. Optimal Distribution by Basic FX Distribution Methods ....................... 83 4.4.2. Field Transformation Functions for Range Queries ................................ 86 4.4.3.1 and UR Field Transformation Functions ............................................... 90 4.4.4.1 and UM Field Transformation Functions .............................................. 92 4.4.5. UR and UM Field Transformation Functions .......................................... 95 4.4.6.1, UR and UM Field Transformation Functions ....................................... 98 Chapter 5. Node Partitioning Schemes for B-trees .............................................. 100 5.1. Introduction .................................................................................................... 100 5.2. The PNB-tree .................................................................................................. 101 5.2.1. The PNB-tree Operations ........................................................................ 103 5.2.2. Parallel Disks Configuration for PNB-trees ............................................ 104 vii 5.3. Motivations of PNB-trees ............................................................................... 106 5.3.1. Compressed Height .................................................................................. 106 5.3.2. Reduced Frequency of Tree Restructuring .............................................. 107 5.4. Performance Comparison ............................................................................... 107 5.4.1. Performance Models ............................................................................ 108 5.4.2. Average Data Retrieval Time .............................................................. 109 5.4.3. Threshold Points .................................................................................. 114 5.4.4. Update Performance ............................................................................ 115 5.5. Other Strategies for The PNB-tree Organization ......................................... 115 Chapter 6. Conclusions ........................................................................................... 118 List of References ..................................................................................................... 121 viii LIST OF TABLES Table 3.1. Response Time for F 1=F2=F3=F4=2, F 5=F 6% and M = 16 ............... 59 Table 3.2. Response Time for F 1=F 2=F 3:2, F 4=F 5=F 6:4 and M = 32 ............... 59 Table 3.3. Response Time for F 1=F 2=F3=F4=F 5=F 6:8 and M = 32 ................... 59 Table 3.4. Response Time for F 1=F 2=F3=F4=F 5=F 6:8 and M = 64 ................... 60 Table 3.5. Response Time for F1=2, F 2=F 3:4, F 4=F 5=F 6:8 and M = 128 ....................... . ..................................................................................................... 60 Table 3.6. Response Time for F 1=F2=F3=F4=4, F 5=F 6:8 and M = 256 ............. 60 Table 3.7. Response Time for F1=F2=F3=4, F4=F5=F6=8 and M = 512 ............. 61 Table 3.8. Response Time for F1=F2=F3=8, F4=F5=F6=16 and M = 512 ........... 61 ix Figure 2.1. Figure 2.2. * Figure 2.3. Figure 3.1. Figure 3.2. Figure 3.3. Figure 3.4. Figure 3.5. Figure 3.6. Figure 3.7. Figure 3.8. Figure 3.9. Figure 4.1. Figure 4.2. Figure 4.3. LIST OF FIGURES General Parallel Processing Model ........................................................ Two Stage Parallel Processing Model .................................................... Two Level Implementation of H1 for The Two Stage Model ............... Basic FX Distribution ............................................................................ FX Distribution with I And U Transformation ...................................... FX Distribution with I And I U 1 Transformation ................................... FX Distribution with I And I U 2 Transformation ................................... FX Distribution with U And I U 1 Transformation ................................ FX Distribution with U And I U 2 Transformation ................................. FX Distribution with I, U And I U 2 Transformation .............................. Strict Optimality When for Any Fields r and s, F ,F s 2 M .................... Strict Optimality When for Any Fields r, s and t, F ,F SF, 2 M ............. Explicit Representation of Multikey Hash Functions ............................ Example Bucket Distributions ............................................................... Nonexistence of Perfect Optimal Distribution for Type (0 - 2) Range Queries ............................................................................................................ Figure 4.4. Figure 4.5. Bucket Distribution When F 1 = 4, F 2 = 4 and M = 8 ........................... Nonexistence of Perfect Optimal Distribution for Type (O - 1) Range Queries ............................................................................................................ Figure 4.6. Figure 4.7. Figure 4.8. Figure 4.9. Figure 5.1. Basic FX Distribution When F 1 = F 2 = 4 and M = 4. ......................... FX Distribution with I And UR Transformation .................................... FX Distribution with I And UM Transformation ................................... FX Distribution with UR And UM Transformation ............................... Partitioned Node B-tree .......................................................................... 9 11 13 19 29 35 36 39 4O 52 56 57 70 72 75 77 80 83 91 96 98 103 Figure 5.2. Figure 5.3. Figure 5.4. Figure 5.5. Figure 5.6. Figure 5.7. Figure 5.8. PNB-tree After Insertion of Key Value 57 ............................................ 105 PNB-tree After Deletion of Key Value 71 ............................................. 105 Data Response Time with Various Data Request Rates ........................ 111 Data Response Time with Various Disk Speeds .................................... 113 Threshold Points ..................................................................................... 114 Tree Restructuring Cost ......................................................................... 116 Insertion Time ........................................................................................ 116 xi CHAPTER 1 INTRODUCTION There will be many applications for large databases that cannot be performed in an acceptable response time by current database systems. Since one of the most significant ways of improving performance is through parallelism, the importance of parallel pro- cessing in database systems has been increasingly recognized. Parallel processing in database systems can increase throughput and minimize response time through con— current data accesses as well as processing data in parallel. However, parallel process- ing by itself does not necessarily lead to high performance. Some of the reasons are attributed to overhead due to interprocessor communication, remote memory accesses and data access conflicts. For parallel database operations external I/O also causes a serious performance bottleneck [Bor83]. Stone [Sto87] shows that parallel query algo- rithms in a multiprocessor system may perform poorly than efficient serial algorithms on a single processor system. The advantage of indexing to reduce external I/O traffic was also emphasized in that paper. There have been numerous works on improving perfor— mance of database systems by specialized software/hardware techniques. These include database machines, main memory resident databases and. concurrent dynamic search Sti'llCtllI’CS. 1.1. Database Machines Many proposals have been made in the past for machine architectures for efficient parallel processing of database operations. Early designs such as CASSM [Su79] and RAP [Ozk75] exploit the logic per track idea which was first proposed in [81070] as an alternative to pure associative memories. They are SIMD architectures and focus on 1 parallel scanning of data on secondary storage devices. DIRECT [Dew79] is a logical extension of the SIMD type associative processors which supports both intra-query and inter-query concurrency. It assigns the number of processors dynamically to a query and so it allows users to share the processing power simultaneously. These machines can be characterized by functional replication approach because processors (or memories) are basically similar to each other and capable of performing the same set of functions in parallel. On the other hand, database machines such as RDBM [chh83] and SiDBM [Le185] have used functional specialization approach which means that dif- ferent functional units are used for different classes of database operations. Some database machines such as DBMAC M583] and Delta [Kak85] use an attri- bute based data model which employs vertical fragmentation concept. They use a verti- cal fragmentation of a relation to minimize the cost of processing and transferring unnecessary attributes from disks. Other database machines which use some form of vertical fragmentation are DBC [Ban79] and a database computer proposed in [Tan83]. Due to the similarity between relational algebra tree and data flow graph, the data driven computation concept has been proposed as an effective operational method for relational database machines [Bor82]. GAMMA, proposed in [Dew86], exploits dataflow query processing techniques. In GAMMA, queries are compiled into a tree of operators, and each operator is executed by one or more operator processes. With the exception of a few control messages at the time a query is initiated, execution of an operator is self-scheduling, i.e., when the process terminates execution, data flows between the processes without centralized control. There are also other proposals of database machines using data flow concepts [Bic83, Gli83]. Performance of various database machines are discussed in [Haw82, Hi186]. The results show that the performance improvement depends on the query type as well as the database machine architecture. l.2. Main Memory Databases Since inexpensive large main memory is becoming available, and many database applications are I/O bound, some work has been done on main memory database sys- tems to minimize data access time. ESP, presented in [GarL84], consists of multiple processors, but it operates in SISD mode. ESP uses single data stream with all proces- sors operating in lock step. The main focus of this machine is to reduce the global memory access time. SiDBM presented in [Le185], on the other hand, has an MIMD architecture in which the processors are functionally specialized. Both machines use shared memory with common bus architecture. Storage structure for main memory databases has been studied in the past. Dewitt et al. discussed the access time of AVL trees and B-trees for various memory residency factors of databases [Dew84]. They have also shown that hash based join algorithms are very effective for large main memory database systems. Lehman et al. give perfor- mance results of main memory databases using various types of access methods [Leh86]. In that paper they have also proposed a T tree as an alternative to other tree structures. The T tree is a balanced binary tree with many elements per node. It exploits binary search nature of AVL trees, and good update and storage characteristics of B-trees. Several other issues of main memory databases have also been investigated in [Amm85]. 1.3. Concurrent Dynamic Search Structures There have been several proposals on concurrent dynamic search structures to sup- port parallel processing. Many of them are based on tree structures such as B-trees and binary search trees. For concurrent B-tree operations, the entire subtree of the highest affected node is exclusively locked in [Sam76]. Bayer et al. present the algorithms which Only lock out other writers until the actual modifications must be performed [Bay77]. This approach distributes exclusive locks mostly in lower sections of the B- tree and hence it increases the concurrency. Lehman et al. also described algorithms for concurrent B-tree operations which use only a small number of locks for each tree mani- pulating process [Leh81]. Concurrent search and insertion algorithms for AVL trees and 2-3 trees are given in [E1180-a, E1180-b]. There are also several other work on con- current manipulation of binary search trees [Kun80, Man84]. 1.4 Classification of Database Queries Parallel processing strategy which is appropriate for one database application may not be appropriate for another. For example, the granularity of parallel processing which is suitable for one application may not be so for others. Thus, it may be neces- sary to develop different parallel processing strategies for various types of applications. We classify database queries based on their parallel processing characteristics, as fol- lows : (A1) Single query with multiple hits (A2) Single query with a single hit (A3) Single complex query (A4) Multiple queries accessing the same relation (A5) Multiple queries accessing different relations Examples of queries of type A1 are partial match queries and range queries. Here, intra-query parallel processing is advantageous because a single query requiring many data records can be processed in parallel [Kim88, Pra88-b, Pra88-d]. Rosenau et al. have also applied this type of parallel processing for projection operation on a relation [R0587]. It is rather difficult to exploit parallel processing of type A2 queries because apply- ing parallelism for this type of query may require finer granularity which may result in lower throughput of the system. On the other hand, parallel processing may achieve the lower bound on access time for these types of queries when appropriate software and hardware architecture is used. Achieving and guaranteeing this lower bound are impor- tant for many real-time critical applications. Parallel processing strategies for this type of applications can be found in [Pra86, Pra87, Pra88-a, Pra88-c]. Queries of type A3 include join functions, sorting of files, and complex qualifications. Several database machines which use functionally distributed architec- tures have been proposed for this type of queries Hxl85, Sch83]. Intra-query parallel- ism is also advantageous for this type of applications. For A4 and A5 type queries, many independent queries can be processed in paral- lel. Transaction processing is an example for these types. The throughput of the system for these types of applications can be improved by maximizing concurrency among the queries. Multidirectory hashing proposed in [Cev88] has been shown to be effective for certain applications of these types. 1.5. Problem Statement The primary objective of this work is to investigate data distribution strategies for various types of database applications in parallel processing systems. In this thesis we will mainly focus on the data distribution strategies for A1 and A2 type queries because these queries are in fact the most basic types, and data distribution strategies for other types of applications may be developed based on the strategies used in A1 and A2 type queries. For queries of type A1, we will investigate optimal data distribution for multi- key search queries such as partial match retrieval and multiattribute range queries. Since multikey search queries access a set of records, appropriate data distribution is very important to facilitate parallel data accesses. Though much work has been done in the past on designing efficient file structures for this type of applications, data distribu- tion to enhance concurrency has not been considered much. We will investigate optimal data distribution and appropriate data construction for this type of applications. For type A2 queries, we will investigate parallel processing Of tree type indexes for external databases. Since the B-tree is a common storage structure for this type Of applications, node partitioning schemes to exploit parallel data accesses tO B-trees will be investigated. 1.5. Thesis Overview The remainder of this thesis is organized as follows. In chapter 2 we describe high level abstraction Of database processing for parallel processing systems. The objective Of this abstraction is to define a framework which can be used as a basis for more specific implementation. In chapter 3 we present Optimal data distribution for partial match retrieval type queries. The main focus Of this chapter is the development Of new data distribution methods, called Fieldwise eXclusive-or (FX) distribution methods, for maximizing data access concurrency. FX distribution methods use bitwise exclusive-or Operation on the field values which are computed by multikey hashing. We show several useful charac- teristics Of exclusive-or Operation for Optimal file distribution. Field transformation techniques are presented to extend the scope Of Optimality in FX distribution methods. We give sufficient conditions for Optimal distribution by the proposed distribution methods and show that these methods are Optimal for most partial match queries. We describe the performance improvement by FX distribution methods over Other methods proposed earlier. Efficient data construction methods for FX distribution approach are also discussed. In chapter 4 we investigate file distribution problems for parallel processing of multiattribute range queries. Optimal data distribution methods as well as the existence and nonexistence Of perfect Optimality are investigated. It will be shown that there are inherent limitations tO achieve Optimal distribution for various types Of range queries. We give sufficient conditions for the nonexistence Of perfect Optimal distribution for certain types Of range queries. We extend the FX distribution methods for several use- ful multiattribute range queries. Sufficient conditions for Optimal distribution for mul- tiattribute range queries will be described. It will be shown that the proposed data dis— tribution methods are Optimal for a large class Of multiattribute range queries. In chapter 5 we investigate the performance of various parallel processing methods for B-trees. The main focus Of this chapter is a node partitioning scheme for B-trees to exploit parallel data accesses. The proposed B-tree node partitioning scheme is based on synchronized disks. Parallel processing Of partitioned node B-trees on asynchronous disks are also discussed. We also show the performance improvement Of partitioned node B-Uees over conventional B-trees. Chapter 6 contains concluding remarks. CHAPTER 2 DATABASE PROCESSING MODELS FOR PARALLEL PROCESSING In this chapter we describe high level abstraction of database processing for paral- lel processing systems. The Objective of this abstraction is to define a framework which can be the basis of more specific implementation. 2.1. General Parallel Processing Model for Database Systems We propose an abstract model for parallel processing of database systems. This is shown in Figure 2.1. The model is based on distributing data and access structures to enhance concurrency. In the figure, Q5,S represent a set Of parallel access nodes. These can be main memory modules or disks, depending on the parallel processing environment. Ai’s denote a set of access structures. As shown in the figure, there are three mappings that are important for concurrent processing. They are (1) storage allo- cation for data (2) storage allocation for access structures and (3) key to access structure mapping. Storage allocation for data, and storage allocation for access structures determine the amount of physical access concurrency among the nodes. In practice, these issues should be considered based on the architectural types Of parallel processing systems. In other words, a number of factors such as the interconnection structure between the pro- cessors and the access nodes may contribute to actual system performance. In this thesis we assume that the parallel access nodes are logically single shared memory. DATA Storage Allocation Key-tO-Access Structure for Data Mapping Storage Allocation for Access Structures A: Figure 2.1. General Parallel Processing Model In the parallel processing system, the speed mismatch between computation and data access time becomes a more serious problem than in the uniprocessor system due to data access conflict if data are not evenly distributed over access nodes. Generally, database applications require many data accesses with relatively less computation, and so degradation of system performanc due to inappropriate storage allocation scheme may be more significant in database applications than in others. In addition to physical memory (or device) access conflict, there are other sources of conflict for concurrent accessing of data. This means that even though two data are stored in different access nodes, access concurrency can still be reduced. One such source is the lock contention because the data in the same locking entity cannot be accessed concurrently when that entity is locked. It has been shown in [Cev88] that lock conflict causes more serious overhead than the physical memory access conflict in certain cases. This is because each resolution for a lock conflict takes at least a lock duration time which is perhaps much longer than one remote memory access time. 10 Each lock duration time may be reduced by using finer locking granularity. However, lock conflict for critical shared variables, if exist, cannot be avoided. Unfortunately, most access structures have critical parts which frequently need to be locked, e.g., root node in the tree type index and directory size dependent variables in dynamic hashing. Based on the above Observation, in order to achieve high concurrency multiple independent access structures need to be constructed for each relation to reduce the amount Of sharing for critical shared variables. By multiple access structures for each relation, we mean that a set of records pertaining to one relation is partitioned horizon- tally and each subset Of partition contributes to each independent access structure. Thus, we need an appropriate mapping from a set of keys to a set of access structures. Key to access structure mapping determines logical access concurrency while storage allocation scheme for data and access structure determines physical access concurrency. For many applications we can simplify the general model such that only a group of data which are stored in the same access node, say Qi, contributes to a unique subset Of access structures, say A j to A, which are also constructed in the same access node. By this approach storage allocation strategy for data and access structures can be treated integratedly, and the complexity of processing models can also be reduced. Two stage parallel processing model shown in Figure 2.2 represents this idea. In the two stage pro- cessing model the first stage corresponds to data storage allocation and the second stage corresponds to key to access structure mapping. The two stage model provides a more systematic design procedure for developing parallel database systems. 2.2. TWO Stage Parallel Processing Model for Database Systems The basic idea Of this model is to partition data mapping into two stages. As shown in the figure, the first stage, H1, is called Data Distribution algorithm and the second stage, H2, is called Data Construction algorithm. ll QM—l Figure 2.2. Two Stage Parallel Processing Model The data distribution algorithm, which is the same as data storage allocation in the general model, determines how the data is appropriately distributed to the parallel access nodes so that maximum concurrency is achieved between the access nodes. The data construction algorithm, on the other hand, determines the appropriate data structure to minimize the access time. It receives data from the data distribution algorithm and then create local access structures such as hashed or indexed files. In general, the following strategies can be employed for data distribution : (B 1) Declustering based on query’s data reference pattern (B2) Random distribution (B3) Objective specific declustering (B4) Clustering based on data reference pattern In method B 1, the data distribution technique takes advantage of the data reference pattern of a query. For example, if a query references numerous records, the strategy may be to distribute the data so that these records are stored uniformly among the nodes. 12 This approach is useful for A1 type applications discussed in chapter 1. In method B2, records are randomly distributed between the nodes. This method is simple, but may not guarantee a good distribution. In the objective specific method, records are allo- cated to optimize certain objective functions. For example, in [Pra86] Pramanik et al. propose a data distribution technique to construct multiple directories for a single rela- tion, where a record is allocated to the node which has the smallest directory size. It has been shown in that paper that this approach gives the minimum total directory size. However, declustering of data may not be always beneficial. For example, if the com- munication cost between access nodes is large, clustering may give better performance than declustering. SO, B4 type strategies may depend on the interconnection network. Data distribution algorithms can be a functional mapping which depends only on data values. It maps a set of data values into a set of nodes. For example, if node addresses are determined by hashed values of input data, the distribution algorithm is a mapping which is independent of time or other system parameters. On the other hand, the distribution algorithm need not be functional. For example, random distribution may map the same record to different nodes at different time. Since data accesses are content-based for most database applications, it is advantageous to make a data distribu- tion algorithm functional depending only on data values. Let D be a set of data and ZM = [0, 1, . . . , M-l} be a set of parallel access nodes. Let a data distribution algorithm be a function from D to ZM. Since actual data are usu- ally unevenly distributed in the data domain, data distribution algorithms are commonly designed based on the hashed values of data which are evenly distributed in hashed address space. Thus, we define data distribution algorithm H1 as a composition of two functions, H 1(1) and H 1(2), such that H 1‘” is a mapping from D to T and H 1(2) is a mapping from T to ZM, where T is the set of hashed values. Figure 2.3 shows two level implementation of H1 for the abstract model in Figure 2.2. This model can be thought of as one class of the two stage model. (2) H1 Q0 H10) DATA - T QM— Figure 2.3. Two Level Implementation Of H1 for The Two Stage Model Let H2 be a hash-based data construction algorithm and LD be a set of entries in all local directories generated by H2 for a given file system. If there exists one-to-One correspondence between T and LD, T is called a real global directory. Otherwise, it is called a virtual global directory. When T is a real global directory, the set of all the local directories can be thought of as a partition of T. H 1(1) is usually static because the use of dynamic hashing for H 1(1) will cause significant overhead due to intemode data movement. However, static hashing scheme for H 1(1) may result in very sparse local directories or long overflow chains. These problems can be avoided by using a virtual global directory, where the actual local directory is determined by H2. When T is a real global directory, the ratio lTl/IDI directly affects the data retrieval time as well as storage utilization. On the other hand, it is more flexible that T is used as a virtual global directory. The comparison Of these two approaches will be described in more detail in chapter 3. Functional distribution, and real/virtual global directory con- cepts are used for parallel processing of partial match queries and range queries presented in chapter 3 and 4. CHAPTER 3 PARALLEL PROCESSING STRATEGIES FOR PARTIAL MATCH QUERIES 3.1. Introduction Many information processing systems involve the application which accesses the records in a file having the values of a specific set Of attributes in common. A file is a collection of records each of which is defined as an ordered n-tuple (r0, . . . , r,,_1) of n values which are the keys or attributes of the records. When a query is allowed to specify conditions on more than one attributes, the search performed for this query is called multikey searching, or associative searching. Multikey searching is used in queries such as partial match retrieval and multiattribute range queries. A partial match query is a query where some of the attributes are specified and the rest of them are unspecified. For example, query q = [Age = *, Department = "mathematics", State = "Ohio"] is a partial match query, where * denotes don’t care condition. Since each attri- bute in a file may or may not be specified in a partial match query, a set of records need to be retrieved. When a file is constructed on parallel devices, it is important to store these records to maximize concurrency. In this chapter we investigate parallel process- ing strategies for partial match queries. Optimal data distribution methods and appropriate data construction methods are described. It has been shown in [Rot74] that multikey hashing is effective for partial match retrieval type applications. Multikey hash function, H, for a file consisting of 11 fields is an ordered n functions