NEW DIRECTIONS IN MACHINE SCHEDULING

By

Patchrawat Uthaisombut

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science and Engineering

2000

ABSTRACT

NEW DIRECTIONS IN MACHINE SCHEDULING

By Patchrawat Uthaisombut

We explored several new directions in machine scheduling, including bicriteria scheduling, extra-resource analysis, a new model of preemption, the k-client problem, and AND/OR linear programming.

First, we considered the problem of nonpreemptively scheduling jobs that arrive over time on one machine to simultaneously minimize both the makespan and the average completion time. Previous research focused on proving the existence of schedules that simultaneously approximate both criteria. We showed that optimal bicriteria schedules can be constructed in polynomial time. By optimal, we mean that, for some input instances, no bicriteria schedule achieves a better bound for both criteria.

Second, we applied extra-resource analysis to the load-balancing problem. Extra-resource analysis is a generalization of competitive analysis. It has been used to distinguish good from bad on-line algorithms in settings where competitive analysis cannot. We used extra-resource analysis to derive a new type of result, namely a qualitative divergence between on-line and off-line load-balancing algorithms.

Third, we introduced a new model of preemption, the "preempt-decay" model. In this model, when a job is preempted, the work done on that job gradually decays and has to be processed again. This model is a generalization of both the preempt-repeat and the preempt-resume models. We compared the optimal solutions among the three models in the one-machine environment.

Fourth, we studied the k-client problem, which combines the ideas of a multi-threaded environment and location-dependent processing times. In a multi-threaded environment, there are multiple chains of jobs. Jobs in each chain must be processed in order. In an environment where the processing time is location-dependent, the time required to process a job depends on the distance between the job and the server. When the underlying metric space in the k-client problem is a line, the problem becomes the multi-threaded disk scheduling problem. We analyzed two on-line greedy algorithms for the k-client problem, derived lower bounds in the line and the clique metric spaces, and considered the special case when k = 2.
Finally, we introduced AND/OR linear programming, which is a generalization of ordinary linear programming. We devised an optimization scheme for AND/OR linear programs that has a small running time in practice. We outlined the application of AND/OR linear programs to the problem of finding a lower bound instance for an approximation algorithm. We demonstrated the technique on the problem of nonpreemptively scheduling jobs that arrive over time on one machine to minimize the total completion time.

To my parents

ACKNOWLEDGEMENTS

Many people contributed in many ways to make my dissertation possible. I take this opportunity to express my gratitude to them.

First, I wish to express my deepest appreciation to Dr. Eric Torng, my advisor. Dr. Torng showed me the essence and the glory of theoretical computer science when I took his class. During my Ph.D. program, he always gave me ideas and new interesting problems to work on. I thank him for his encouragement, guidance, and support.

I would like to thank Dr. Abdol Esfahanian, Dr. Matt Mutka, and Dr. Edgar Palmer, my committee members and also my teachers. I learned many things in and outside their classes. I also thank them for their encouragement.

I would like to thank my parents, Adun and Walaitip Uthaisombut, for their love, their support throughout my life, and their encouragement to pursue higher education.

I wish to thank my wife, Rujida (Leepipattanawit) Uthaisombut, for her love, support, and belief in me. Her presence gave me the courage to go forward.

I would like to thank Dr. Robert Rasche and Mrs. Dorothy Rasche for their kindness throughout my stay at Michigan State University.

I would like to thank Mrs. Linda Moore for her help on all kinds of administrative work.

Finally, I would like to thank all my friends for their help, their company, and the many interesting discussions.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
1 Introduction
1.1 Introduction
1.2 Traditional Scheduling Problems
1.2.1 Job Data
1.2.2 Machine Environment
1.2.3 Job Characteristics
1.2.4 Optimality Criteria
1.2.5 Examples of Scheduling Problems
1.3 Approximation Algorithms and On-Line Algorithms
1.3.1 Approximation Algorithms
1.3.2 On-Line Algorithms
1.3.3 Comparison Between Approximation Algorithms and On-Line Algorithms
1.4 New Directions for Scheduling Problems
1.4.1 Multiple Objectives
1.4.2 Extra-Resource Analysis
1.4.3 New Models of Preemption
1.4.4 Multi-Thread Environment
1.4.5 Location-Dependent Processing Time
1.4.6 AND/OR Linear Programming
1.5 Organization of the Dissertation
2 Bicriteria Scheduling
2.1 Introduction
2.2 Lower Bound of the Quality of Bicriteria Schedules
2.3 Constructing Bicriteria Schedules in Polynomial Time
2.4 Open Problems in Bicriteria Scheduling
3 Applying Extra-Resource Analysis to Load Balancing
3.1 Introduction
3.1.1 Problem Definitions
3.1.2 Related Results
3.1.3 Summary of Results
3.2 On-Line Results
3.3 Off-Line Upper Bounds
3.4 Lower Bounds on LPT
3.5 Open Problems
4 Preempt-Decay Scheduling
4.1 Introduction
4.2 NP-Hardness Result
4.3 Comparison of the Optimal Flow Time Among the Three Models
4.3.1 Preempt-Repeat and Preempt-Resume Models
4.3.2 Preempt-Repeat and Preempt-Decay Models
4.3.3 Preempt-Decay and Preempt-Resume Models
4.4 Approximability and Inapproximability in the Three Models
4.5 Open Problems in Preempt-Decay Scheduling
5 The k-Client Problem
5.1 Introduction
5.2 Upper Bounds
5.2.1 Upper Bounds for the Total Distance Cost Function
5.2.2 Upper Bounds for the Average Completion Time Cost Function
5.3 General Lower Bounds
5.3.1 Definition of Adversary Strategy A(N, k)
5.3.2 Properties of the Adversary Strategy
5.3.3 General Lower Bound for the Total Distance Cost Function
5.3.4 General Lower Bound for the Average Completion Time Cost Function
5.3.5 General Lower Bound on the Line
5.4 General Lower Bounds when k = 2
5.4.1 General Lower Bound on the Clique when k = 2
5.4.2 General Lower Bound on the Line when k = 2
5.5 Open Problems on the k-Client Problem
6 Minimizing Total Completion Time on One Machine
6.1 Introduction
6.2 SRPT-Subsequence Algorithms
6.2.1 Definition of α-Schedules and the BEST-α Algorithm
6.2.2 Definition of SRPT-Subsequence Algorithms
6.3 Lower Bound of Non-Preemptive SRPT-Subsequence Algorithms
6.4 Summary
7 AND/OR Linear Programs and Scheduling Problems
7.1 Introduction
7.2 Lower Bound Problems
7.3 Linear and Integer Programming
7.3.1 Previous Use of Linear and Integer Programs in Scheduling Problems
7.4 AND/OR Linear Programming
7.4.1 Expressions of the Underlying Objective Functions
7.4.2 Modeling Lower Bound Problems as AND/OR LPs
7.4.3 Solving an AND/OR Linear Program
7.5 Applying AND/OR LP to a Scheduling Problem
7.5.1 Program Formulation
7.5.2 Expressions of the Objective Functions
7.5.3 Optimizing AND/OR Linear Programs
7.6 Discussion
A Bit Summation Inequalities

LIST OF TABLES

3.1 The lower bounds computed by the linear program.
4.1 Summary of results on preempt-decay scheduling.
5.1 Summary of results for the k-client problem.
LIST OF FIGURES

1.1 Example of a schedule in the preempt-decay model.
3.1 Job classification schemes.
3.2 Case 2.1 of Theorem 3.3.2.
3.3 Case 2.2 of Theorem 3.3.2.
3.4 An instance with m = 8 and k = 2 showing tightness of Theorem 3.3.2.
3.5 The lower bound instance with m = 8 and k = 2 obtained from the linear program.
3.6 The plot of the lower bounds computed by the linear program.
3.7 Summary of upper and lower bounds for LPT with extra resources.
4.1 Summary of results on preempt-decay scheduling.
4.2 A reduction from 3-PARTITION to a preempt-decay scheduling problem.
4.3 Lower bound instance between the preempt-repeat and preempt-resume models.
4.4 Lower bound instance between the preempt-repeat and preempt-decay models.
4.5 Lower bound instance between the preempt-decay and preempt-resume models.
4.6 Examples of domination between decay functions.
4.7 Lower bound instance for SRPT in the preempt-decay model.
5.1 Lower bound instance for SDF and SEQ with the total distance cost function.
5.2 Lower bound instance for SDF and SEQ with the average completion time cost function.
5.3 An example A(1, 8) lower bound instance and its cover graph.
5.4 Lower bound instance for the average completion time cost function and k = 2.
6.1 Examples of non-preemptive SRPT-subsequence schedules.
6.2 Lower bound instance for non-preemptive SRPT-subsequence algorithms.
7.1 Construction of the interval inclusion graph and the linear subprogram.
7.2 Non-preemptive SRPT-subsequence schedules.
7.3 The first example of optimizing an AND/OR linear program.
7.4 The second example of optimizing an AND/OR linear program.
7.5 The third example of optimizing an AND/OR linear program.

Chapter 1

Introduction

1.1 Introduction

Scheduling is the problem of allocating resources over time to perform a collection of tasks. Practical scheduling problems arise in a variety of situations. A problem could involve jobs in a manufacturing plant, programs to be run in a computer system, bank customers waiting for service in front of tellers' windows, or airplanes waiting for clearance to land or take off at an airport. Regardless of the application, there is a fundamental similarity among these problems. The vital elements in scheduling problems are resources, tasks, and objectives. Resources are typically characterized in terms of their qualitative and quantitative capacities, so that in a scheduling problem, each resource is described by its type and amount. Each task is typically described in terms of such information as its resource requirement, its duration, the time at which it may be started, and the time at which it is due. In addition, sometimes a collection of tasks is described in terms of precedence constraints that exist among the tasks. The objectives are some measure of the goodness of solutions to a scheduling problem. Common goals in scheduling are high resource utilization, quick response to demands, close conformance to deadlines, and fairness.
1.2 Traditional Scheduling Problems

In this section, we describe a classification scheme for scheduling problems developed by Graham, Lawler, Lenstra, and Rinnooy Kan [38]. Suppose that m machines Mi (i = 1, ..., m) have to process n jobs Jj (j = 1, ..., n). A schedule is an allocation of one or more time intervals on one or more machines to each job. A schedule is feasible if, at any time, there is at most one job on each machine and each job is run on at most one machine, and if, in addition, it meets a number of specific requirements concerning the machine environment and the job characteristics. A schedule is optimal if it minimizes (or maximizes) a given optimality criterion. A scheduling problem type can be specified using a three-field classification α|β|γ composed of the machine environment, the job characteristics, and the optimality criterion.

1.2.1 Job Data

First of all, the following data is specified for each job Jj:

- a processing requirement pj, in the case of single-operation models, or a collection of processing requirements pij, in the case of multi-operation models;
- a release date rj, at which Jj becomes available for processing;
- a non-decreasing real cost function fj, measuring the cost fj(t) incurred if Jj is completed at time t;
- a due date dj, after which Jj is considered late, and which may be used in defining fj;
- a weight wj, which represents the importance of Jj relative to other jobs, and may be used in defining fj.

1.2.2 Machine Environment

The first field α = α1α2 specifies the machine environment. Let ∘ denote the empty symbol. If α1 ∈ {∘, P, Q, R}, each job Jj consists of a single operation which can be processed on any machine Mi. Let pij denote the time to process Jj on Mi.

- α1 = ∘: single machine; there is only one machine, and p1j = pj for all Jj.
- α1 = P: identical parallel machines; there are multiple identical machines, and pij = pj for all Mi and Jj.
- α1 = Q: uniform machines; there are multiple machines with different speeds. Each machine Mi has a speed si, and pij = pj/si for all Mi and Jj.
- α1 = R: unrelated machines; there are multiple machines with different job-dependent speeds. Machine Mi runs job Jj with job-dependent speed sij, and pij = pj/sij for all Mi and Jj.

If α1 ∈ {J, F, O}, each job Jj consists of a set of operations {O1j, O2j, ..., Omjj}.

- α1 = J: job shop; the operations of each job Jj form a chain (O1j, O2j, ..., Omjj) which must be processed in that order, and Oij must be processed on a given machine μij, requiring pij time units.
- α1 = F: flow shop; a special case of job shop where mj = m and μij = Mi for all jobs Jj.
- α1 = O: open shop; similar to flow shop except that the operations of any job Jj can be processed in any order.

If α2 is a positive integer, then m is a constant, equal to α2; it is specified as part of the problem type. If α2 = ∘, then m is a variable; the value of m is specified as part of the problem instance. Note that α1 = ∘ if and only if α2 = 1.

1.2.3 Job Characteristics

The second field β ⊆ {pmtn, rj, βprec} indicates certain job characteristics.

- If pmtn is present, then preemptions are allowed; the processing of any job may be interrupted at no cost and resumed at a later time. Otherwise, no preemptions are allowed; once a job is started on a machine Mi, the job occupies the machine until it is completed.
- If rj is present, then each job may have a different release date. Otherwise, all jobs arrive at time 0.
- If a precedence constraint βprec is present, then there is a precedence relation ≺ among the jobs; that is, if Jj ≺ Jk, then Jj must be completed before Jk can be started. If βprec = chain, then ≺ forms chains. If βprec = tree, then ≺ forms a tree. If βprec = prec, then ≺ is an arbitrary partial order. If βprec is not present, then jobs can be processed in any order.

The field β may contain some other job characteristics. They should be self-explanatory. It should be noted that there are many more types of precedence relations than those shown above.

1.2.4 Optimality Criteria

The third field γ specifies the optimality criterion or the objective, the value we wish to optimize. Given a feasible schedule, we can compute for each Jj:

- the completion time Cj, the time at which the processing of job Jj is completed;
- the flow time Fj = Cj − rj, the amount of time job Jj spends in the system;
- the lateness Lj = Cj − dj, the amount of time by which the completion time of job Jj exceeds its due date; lateness can be negative if job Jj finishes earlier than its due date;
- the tardiness Tj = max{Lj, 0}, the lateness of job Jj if it fails to meet its due date, or zero otherwise;
- the unit penalty Uj, where Uj = 0 if Cj ≤ dj and Uj = 1 otherwise, a unit penalty incurred if job Jj fails to meet its due date.

The cost fj for each job Jj usually takes on one of the variables described above or the product of the weight wj with one of the variables. The optimality criterion can be any function of the costs fj, j = 1, ..., n. Common optimality criteria are usually of the form Σj fj or fmax = maxj fj. Examples of common optimality criteria are the total weighted completion time ΣwjCj, the maximum completion time or makespan Cmax, and the maximum lateness Lmax.

1.2.5 Examples of Scheduling Problems

We give a few examples of the three-field classification of scheduling problems. The problem 1|rj|ΣwjCj is the problem of minimizing the total weighted completion time on one machine subject to non-trivial release dates. The problem P3|pmtn, prec|Lmax is the problem of minimizing the maximum lateness on three identical parallel machines subject to general precedence constraints, allowing preemption.
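To make the definitions above concrete, here is a minimal Python sketch that computes the criteria of Section 1.2.4 for a made-up single-machine schedule; all job data are hypothetical, and running the jobs back to back in the given order is just one feasible schedule.

```python
jobs = [
    # (release r_j, processing p_j, due date d_j, weight w_j) -- all made up
    (0, 3, 4, 2),
    (1, 2, 9, 1),
    (2, 4, 7, 3),
]

t, C = 0, []                      # run the jobs back to back in the given order
for r, p, d, w in jobs:
    t = max(t, r) + p
    C.append(t)                   # completion time C_j

F = [c - r for c, (r, p, d, w) in zip(C, jobs)]       # flow time F_j = C_j - r_j
L = [c - d for c, (r, p, d, w) in zip(C, jobs)]       # lateness L_j = C_j - d_j
T = [max(l, 0) for l in L]                            # tardiness T_j
U = [int(c > d) for c, (r, p, d, w) in zip(C, jobs)]  # unit penalty U_j

print("sum w_j C_j =", sum(w * c for c, (r, p, d, w) in zip(C, jobs)))  # 38
print("C_max =", max(C), "L_max =", max(L), "sum U_j =", sum(U))        # 9 2 1
```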
1.3 Approximation Algorithms and On-Line Algorithms

In this section, we describe approximation algorithms, on-line algorithms, and common methods for evaluating their performance, and we compare and contrast approximation and on-line algorithms.

First, we define some notation. The supremum of a non-empty set A of real numbers, denoted sup A, is the smallest real number u such that u ≥ x for each x ∈ A. The infimum of A, denoted inf A, is the greatest real number l such that l ≤ x for each x ∈ A. If A is finite, then both sup A and inf A belong to A. In this case, they are often called the maximum and minimum of A and are denoted by max A and min A, respectively.

1.3.1 Approximation Algorithms

Many problems are NP-hard. It is unlikely that there exist efficient algorithms for solving NP-hard problems optimally. In this case, we should focus our effort on finding efficient approximation algorithms. An approximation algorithm for a problem Π is an algorithm that, for any input instance I in Π, always gives a feasible solution for I. We usually measure the performance of an approximation algorithm by its approximation ratio. Suppose we are considering a minimization problem. Let A(I) be the cost of the solution to input instance I produced by an algorithm A. The performance guarantee or the approximation ratio of an approximation algorithm A, denoted R_A, is defined as

R_A = sup_I A(I)/OPT(I), (1.1)

where OPT is the optimal algorithm and the supremum is taken over all possible input instances of the problem. An algorithm A for a minimization problem is an α-approximation algorithm if α ≥ R_A. For a maximization problem, the approximation ratio of an approximation algorithm A is defined as

R_A = inf_I A(I)/OPT(I). (1.2)

An algorithm A for a maximization problem is an α-approximation algorithm if α ≤ R_A. Note that the performance guarantee of an algorithm is always greater than or equal to 1 for minimization problems and always smaller than or equal to 1 for maximization problems. In both cases, the performance guarantee is 1 for optimal algorithms.

1.3.2 On-Line Algorithms

In many real-life situations, we do not know all information about the input instance in advance. Yet we have to produce a partial solution to the partial input we have. For example, in an operating system, jobs arrive over time, and we only know of their existence when they arrive. Yet we have to construct a schedule over time without knowledge of the future. To model this, on-line algorithms were introduced. An algorithm is off-line if it has all the knowledge about the input instance before it makes any decision. An algorithm is on-line if it must construct a partial solution to the currently known partial input without knowledge of the future. In general, on-line algorithms cannot produce an optimal solution.

The performance of an on-line algorithm is usually evaluated by the competitive analysis technique introduced by Sleator and Tarjan [77]. The competitiveness or the competitive ratio of an on-line algorithm A for a minimization problem, denoted c_A, is defined as

c_A = sup_I A(I)/OPT(I), (1.3)

where OPT is the optimal off-line algorithm and the supremum is taken over all possible input instances of the problem. An algorithm A for a minimization problem is c-competitive if c ≥ c_A. The competitive ratio of an on-line algorithm A for a maximization problem is defined as

c_A = inf_I A(I)/OPT(I). (1.4)

An algorithm A for a maximization problem is c-competitive if c ≤ c_A.

1.3.3 Comparison Between Approximation Algorithms and On-Line Algorithms

Both approximation algorithms and on-line algorithms typically give suboptimal solutions because they have some form of handicap. Approximation algorithms have limited computational power: they are allowed only polynomial time to produce a solution. In contrast, on-line algorithms have limited knowledge of the future: at any time, they must produce a partial solution knowing only the information up to the current time.

Both approximation analysis and competitive analysis compare the performance of the algorithm of interest with that of the best off-line algorithm. Therefore, we can interpret the approximation ratio (competitive ratio) of an algorithm as an absolute performance measure. Alternatively, we can interpret the approximation ratio (competitive ratio) of an algorithm as a relative performance measure: by comparing the approximation ratios (competitive ratios) of different algorithms, we can differentiate the good ones from the bad ones.
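As an illustration of the ratio in equation (1.1), the following sketch empirically estimates the approximation ratio of a simple greedy rule for 1|rj|ΣCj (run the shortest available job). The greedy rule and all instance data are our own illustrative choices, not algorithms discussed here, and sampling finitely many instances only yields a lower estimate of the supremum.

```python
import itertools, random

def total_completion(order, jobs):
    t, total = 0, 0
    for j in order:
        r, p = jobs[j]
        t = max(t, r) + p
        total += t
    return total

def greedy(jobs):
    n, done, t, total = len(jobs), set(), 0, 0
    while len(done) < n:
        avail = [j for j in range(n) if j not in done and jobs[j][0] <= t]
        if not avail:                 # nothing released yet: idle until a release
            t = min(jobs[j][0] for j in range(n) if j not in done)
            continue
        j = min(avail, key=lambda j: jobs[j][1])   # shortest available job
        t += jobs[j][1]
        total += t
        done.add(j)
    return total

worst = 0.0
random.seed(1)
for _ in range(2000):
    jobs = [(random.randint(0, 8), random.randint(1, 8)) for _ in range(5)]
    opt = min(total_completion(p, jobs) for p in itertools.permutations(range(5)))
    worst = max(worst, greedy(jobs) / opt)
print("observed ratio >=", worst)
```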
1.4 New Directions for Scheduling Problems

In this section, we describe some recent research directions in the field of scheduling. We discuss their motivations and some general results in the literature.

1.4.1 Multiple Objectives

In the long history of scheduling theory, numerous algorithms have been designed to optimize many kinds of optimality criteria in many scheduling models. Typically, each criterion has been studied separately. Few studies considered multiple criteria together. Understanding the interaction among multiple criteria is important because, in real life, decision makers for scheduling problems usually have to consider multiple criteria before arriving at a decision [67]. Decision makers can gain useful insights from the trade-offs involving multiple criteria.

There were some bicriteria results in the literature which were byproducts of work on single-criterion problems. For the problem of scheduling jobs on parallel identical machines, Graham showed that any list scheduling algorithm will produce a schedule with makespan at most twice the optimal makespan [36]. For the weighted completion time objective in the same model, Kawaguchi and Kyan showed that a list scheduling algorithm which orders jobs by non-increasing ratio of weight to processing time is a (√2 + 1)/2-approximation algorithm [51]. If all weights are equal, this algorithm is optimal for the average completion time [23]. Therefore, this algorithm performs well for both the makespan and the weighted completion time objectives.

One approach that sets out to explicitly study bicriteria scheduling is characterizing the set of "pareto-optimal" schedules. A set of schedules is pareto-optimal if there does not exist a schedule which is simultaneously better, in both criteria, than any of the schedules in the set. Many results on finding pareto-optimal sets are due to Van Wassenhove and Gelders [85], Nelson, Sarin, and Daniels [65], Garey, Tarjan, and Wilfong [32], McCormick and Pinedo [61], Hoogeveen [43, 42], and Hoogeveen and Van de Velde [44].

A second approach to studying a bicriteria scheduling problem is setting a constraint on the value of one criterion and optimizing the other criterion subject to the constraint. Smith studied the minimization of the average completion time on one machine subject to minimal maximum lateness [78]. Shmoys and Tardos studied the minimization of the average completion time on unrelated machines subject to the constraint that the makespan must be at most twice the optimal [76]. Hurkens and Coster showed that there exist instances of the problem of scheduling jobs on unrelated machines such that all optimal average completion time schedules have a makespan of Ω(log n) times optimal [45].

A third approach to studying a bicriteria scheduling problem is constructing a schedule that tries to optimize both criteria simultaneously. Chakrabarti et al. introduced a general technique for constructing algorithms that simultaneously optimize both the makespan and the average weighted completion time [18]. Wang studied the single machine scheduling problem with release dates minimizing the makespan and the average weighted completion time [84]. Stein and Wein showed that for a very general class of scheduling problems, there exist schedules which are simultaneously at most 1.88-approximations for both the makespan and the average weighted completion time [80]. Recently, Rasala extended this work and proved the existence of bicriteria schedules for several pairs of common scheduling criteria [70]. Her results apply to a very general class of scheduling problems. More literature on multiple criteria scheduling can be found in a survey paper by Nagar, Haddock, and Heragu [64].
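The notion of a pareto-optimal set can be made concrete by brute force on a tiny instance. The sketch below enumerates all nonpreemptive orders of three made-up jobs on one machine and keeps the (Cmax, ΣCj) pairs that no other order dominates; it is purely illustrative and is not one of the cited methods.

```python
from itertools import permutations

jobs = [(0, 5), (1, 1), (2, 1)]       # made-up (release r_j, processing p_j)

def evaluate(order):
    t, total = 0, 0
    for j in order:
        r, p = jobs[j]
        t = max(t, r) + p             # start each job as early as feasible
        total += t
    return (t, total)                 # (C_max, sum of C_j)

points = {evaluate(order) for order in permutations(range(len(jobs)))}
pareto = [q for q in points
          if not any(p != q and p[0] <= q[0] and p[1] <= q[1] for p in points)]
print(sorted(pareto))                 # [(7, 18), (8, 13)]: neither dominates
```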
Stein and Wein [80] introduced the following notation for bicriteria optimization problems. Suppose we have criteria (A, B); then a schedule S for an instance I is an (α, β)-approximation schedule, or an (α, β)-schedule, if the objective value of S for criterion A is at most α times the optimal value for criterion A and simultaneously the objective value of S for criterion B is at most β times the optimal value for criterion B. Similarly, an (α, β)-approximation algorithm is an algorithm that always returns an (α, β)-schedule.

In general, a schedule which is optimal in one criterion is not optimal in another criterion. Thus, the first step in studying bicriteria scheduling problems is to establish the values of α and β for which bicriteria (α, β)-schedules exist. The second step is to design algorithms to find such schedules.

1.4.2 Extra-Resource Analysis

The goal of competitive analysis is to evaluate the performance of on-line algorithms. The competitive ratio of an on-line algorithm can be interpreted as an absolute performance measure, or as a relative performance measure when compared to the competitive ratios of other on-line algorithms. However, in some scheduling problems, competitive analysis fails to offer useful information: the competitive ratios of good algorithms are extremely large, or the competitive ratios of good and bad algorithms are the same. For example, for the problem of on-line non-clairvoyant scheduling on a single machine to minimize the average response time, Motwani, Phillips, and Torng showed that the deterministic competitive ratio is Ω(n^(1/3)) and the randomized competitive ratio is Ω(log n) [63].

Kalyanasundaram and Pruhs were the first to explicitly use extra-resource analysis of on-line algorithms [48], with the goal of identifying good on-line algorithms in settings where traditional competitive analysis fails to offer useful information. Extra-resource analysis is a relaxed notion of competitive analysis in which the on-line algorithm has more resources than the optimal off-line algorithm to which it is compared. Extra-resource analysis has been used to argue that certain on-line algorithms are good choices for solving specific problems, as they perform well with respect to the optimal off-line algorithm when given extra resources [47, 69, 12, 49, 54, 57, 3, 27]. In scheduling problems, the extra resources provided to on-line algorithms could be faster machines, extra machines, or a combination of both.

For the problem of on-line non-clairvoyant scheduling on a single machine to minimize the average response time discussed above, Kalyanasundaram and Pruhs showed that there exists an on-line algorithm with a performance ratio of 1 + 1/ε if it has a machine with speed 1 + ε while the optimal off-line algorithm has a machine with speed 1, where 0 < ε ≤ 1. Also, the same bound still holds if the on-line algorithm is equipped with a unit-speed machine and an ε-speed machine, instead of a (1 + ε)-speed machine. This provides a practical way to recover the loss of system performance due to the on-line nature of the problem: by providing the on-line algorithm with either a faster machine or an extra machine, the on-line algorithm can be constant competitive against the optimal off-line algorithm.

Extra-resource analysis is related to bicriteria competitive analysis. By thinking of the amount of resources used by algorithms as a parameter to be optimized, extra-resource analysis of a single-criterion problem can be thought of as a special case of bicriteria competitive analysis of a bicriteria problem.
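The following sketch illustrates the flavor of extra-resource analysis by simulation. It compares SRPT, which is optimal for total flow time on one machine, at speed 1 against processor sharing at speed 1 + ε; processor sharing is only a simple non-clairvoyant stand-in of our own choosing, not necessarily the algorithm of [48], so the printed comparison is illustrative and is not a verification of the 1 + 1/ε bound.

```python
import heapq, random

def srpt_total_flow(jobs, speed):
    """Total flow time of preemptive SRPT on one machine of the given speed."""
    pending = sorted(jobs)                     # (release, processing)
    alive, t, i, total, n = [], 0.0, 0, 0.0, len(jobs)
    while i < n or alive:
        if not alive:                          # idle until the next arrival
            t = max(t, pending[i][0])
        while i < n and pending[i][0] <= t:
            heapq.heappush(alive, (pending[i][1], pending[i][0]))
            i += 1
        rem, rel = heapq.heappop(alive)        # shortest remaining time
        nxt = pending[i][0] if i < n else float("inf")
        if t + rem / speed <= nxt:             # finishes before the next arrival
            t += rem / speed
            total += t - rel
        else:                                  # run until the arrival, re-decide
            heapq.heappush(alive, (rem - (nxt - t) * speed, rel))
            t = nxt
    return total

def ps_total_flow(jobs, speed):
    """Total flow time of processor sharing (round-robin in the limit)."""
    pending = sorted(jobs)
    alive, t, i, total, n = [], 0.0, 0, 0.0, len(jobs)
    while i < n or alive:
        if not alive:
            t = max(t, pending[i][0])
        while i < n and pending[i][0] <= t:
            alive.append([pending[i][1], pending[i][0]])
            i += 1
        rate = speed / len(alive)              # each alive job gets an equal share
        nxt = pending[i][0] if i < n else float("inf")
        dt = min(min(r for r, _ in alive) / rate, nxt - t)
        t += dt
        for job in alive:
            job[0] -= dt * rate
        total += sum(t - rel for r, rel in alive if r <= 1e-9)
        alive = [job for job in alive if job[0] > 1e-9]
    return total

random.seed(7)
jobs = [(random.uniform(0, 50), random.uniform(1, 10)) for _ in range(40)]
base = srpt_total_flow(jobs, 1.0)              # optimal total flow time, speed 1
for eps in (0.25, 0.5, 1.0):
    aug = ps_total_flow(jobs, 1.0 + eps)       # non-clairvoyant, faster machine
    print(f"eps={eps}: ratio {aug / base:.2f} vs 1 + 1/eps = {1 + 1 / eps:.2f}")
```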
1.4.3 New Models of Preemption

Traditional scheduling problems can be categorized into two models with respect to preemptions. In the no-preemption model, no preemptions are allowed; after a job is started, it must be executed to completion. An example of a scheduling problem in this model is the car rental problem. After a customer takes off with a car, the car cannot be recalled and rented to a second customer. The car can be rented to the second customer only when the first customer returns it. In the preempt-resume model, the execution of any job may be interrupted any number of times at no cost; preempted jobs resume execution from the point at which they were last preempted. An example for this model is the scheduling of processes in a time-sharing operating system.

Now let us consider a relaxation of the no-preemption model. In the preempt-repeat model, the execution of any job may be interrupted any number of times, but the work done on that job is completely lost. When a preempted job restarts, it restarts from the beginning. Any no-preemption schedule is itself a preempt-repeat schedule. Any preempt-repeat schedule can be converted into a no-preemption schedule by eliminating all preempted executions in the preempt-repeat schedule. Clearly, the elimination does not affect the completion of any job. For off-line problems, the no-preemption and the preempt-repeat models are equivalent because off-line algorithms have all the information about the input instance up front and can perform any conversion before producing the output.

However, in the on-line setting, the no-preemption and the preempt-repeat models are different. An on-line algorithm has more flexibility in the preempt-repeat model than it has in the no-preemption model. An on-line algorithm in the preempt-repeat model can decide to schedule a job and later decide to preempt that job to run another newly arrived job. In the no-preemption model, this type of action is prohibited. Scheduling a sound recording studio is an example for the preempt-repeat model. A recording of a song could be interrupted; however, the entire song must then be recorded again.

Each of the preemption models described above realistically captures many real-life situations. However, there are still many practical situations to which none of these models accurately applies. Consider a scenario in a metal casting factory where a piece of metal needs to be heated in a furnace for 60 minutes before it can be used. However, after the metal has been heated for 40 minutes, there is another more urgent job: another piece of metal needs to be heated for 10 minutes. Since the furnace cannot accommodate both pieces of metal, the first piece has to be taken out before it is finished heating. After the second piece of metal is finished heating, the first piece of metal is inserted back into the furnace. The crucial point here is that during the time the metal is outside of the furnace, it cools down. Suppose the rate at which it cools down is one-half the rate at which the furnace can heat up the metal. Then it takes 10/2 = 5 minutes to reheat the metal to the temperature it had just before it was taken out of the furnace. After that, it needs another 20 minutes to heat up to the desired temperature.
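The arithmetic in the furnace example can be checked directly. The rates below come from the example (heating at unit rate, cooling at half that rate); the sketch is only a restatement of the calculation.

```python
heat_rate = 1.0   # minutes of heating progress gained per minute in the furnace
cool_rate = 0.5   # progress lost per minute outside the furnace (half the rate)

progress = 40 * heat_rate     # the first piece is heated for 40 minutes
progress -= 10 * cool_rate    # then sits outside for 10 minutes: loses 5 minutes
print(60 - progress)          # 25.0 minutes left: 5 of reheating + 20 of new work
```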
To capture time-dependent losses after preemptions, we propose a new model of preemption called the preempt-decay model. To facilitate the discussion, we define the following. The effective remaining processing time of job j at time t, denoted pj(t), is defined as follows: if job j starts or resumes its execution at time t, and there is no preemption while it is running, it will take exactly pj(t) time units to complete; that is, it will complete at time t + pj(t). The effective completed processing time cj(t) of job j at time t is defined as cj(t) = pj − pj(t). For example, pj(0) = pj and cj(0) = 0. If job j runs continuously from time t1 to time t2, then cj(t2) = cj(t1) + (t2 − t1).
Jobs become available one at a time and the next job will become available only after the current job is scheduled. The load of a machine is the sum of the processing time of the jobs on that machine. The objective is to schedule all jobs and minimize the makespan, the maximum load on all machine. A note on availability of a job. In the schedule itself, time has no meaning. When a job becomes available for the algorithm to schedule, the job can be scheduled anywhere in the schedule. 18 Multi-Threaded Scheduling Problems Significant work has been done in on-line scheduling problems. However, most of this work abstracts away the existence of multiple threads. In most previous studies of on-line algorithms, researchers have assumed that the system or algorithm must c0pe with a single request sequence; in particular, at any given time, there is at most one outstanding request in the system and future requests will not arrive until the current request is serviced (in some problems, the system is allowed to choose to not service the current request). Because of this assumption, the underlying problem addressed by most previous work in on—line algorithms has been deciding which system resource(s), if any, should be allocated to service the current request. Two of the many examples of interesting problems which fall into this single request sequence model are the paging problem [77, 60, 30] and its generalizations, the k-server and generic task system problems [60, 56, 14]. While the single request sequence model captures many important problems, there are many others which do not fall into this category, such as some operat- ing system scheduling problems [48, 28, 29, 63] and some real-time scheduling prob- lems [53, 11, 26]. In a typical problem, there is a single system resource such as a processor and, at any given time, there are multiple requests in the system waiting to be serviced. As a result, the underlying problem is deciding which current request should the system service rather than which system resource to use. Note that in many cases, there are multiple resources and multiple requests so the system needs to decide which requests to service as well as which system resources to use. 19 While this multiple request model captures the scheduling aspect of many sys- tems, it loses the thread-based or transaction-based nature of the single request model where the arrival of requests is dependent on whether or not the system processed previous requests. For example, in practice, for relatively long stretches of time, there is a fixed number of users on an operating system making requests to the disk. Fur- thermore, each user or client generates a sequence of requests for service where each request is only generated after the previous request of the client has been serviced. This thread-based client model also describes database systems where users perform transactions which constitute a series of atomic operations in the database. In a scheduling problem in a thread-based environment, each job is composed of a chain of operations Each operation has a processing time, the amount of time the operation requires to run on the machine. The first request from each job is released when the job is released. Other requests are released when the previous request from the same job runs to completion. Traditional scheduling problems assume that requests are independent and that each of them has a fixed release time. 
Examples of threads are the operating system scheduling problem and the disk scheduling problem. In those problems, for relatively long stretches of time, there is a fixed number of clients utilizing the system. Each client generates a sequence of requests for service where each request is only generated after the previous request of that client has been serviced. In order to capture (i) the multiple-client nature and (ii) the thread-based client nature of problems such as operating system scheduling and disk scheduling, we study the k-client problem.

1.4.5 Location-Dependent Processing Time

In the most general form of traditional scheduling, the processing time of a job depends on both the job and the machine. This model is not rich enough to capture many practical problems. For example, in a disk scheduling problem, there is a disk, a server, and requests. The requests appear at different locations on the disk. The server can move around on the disk. To service a request, the server must move from its current position to the location of the request. The cost of servicing a request is largely dependent on the distance the server has to move. The distance the server has to move to service a request depends on how the server has serviced previous requests and thus varies from algorithm to algorithm. There is a large body of work analyzing disk scheduling algorithms [22, 81, 66, 21, 33, 75, 86, 5].

1.4.6 AND/OR Linear Programming

Previously, linear programming and integer programming have been used in the design and analysis of scheduling algorithms. We introduce AND/OR linear programs, which are a generalization of ordinary linear programs. AND/OR linear programs can be used to find hard instances of scheduling problems against a fixed approximation algorithm.

1.5 Organization of the Dissertation

The rest of the dissertation is organized as follows. In Chapter 2, our results on bicriteria single machine scheduling minimizing the makespan and the total completion time are presented. In Chapter 3, we present a new application of extra-resource analysis for the load-balancing problem. In Chapter 4, we present our results on single machine preempt-decay scheduling; we study the relationship between the optimal total flow times in the preempt-resume, preempt-repeat, and preempt-decay models. In Chapter 5, we present our on-line results on the k-client problem; the ideas of a multi-threaded environment and location-dependent processing time are used to model this problem. In Chapter 6, we study the problem of nonpreemptively scheduling jobs that arrive over time on one machine minimizing the total completion time; we show that a class of approximation algorithms cannot guarantee an approximation ratio better than e/(e − 1) ≈ 1.58. In Chapter 7, we introduce AND/OR linear programs, a generalization of ordinary linear programs, and use them to find hard instances of the problem studied in Chapter 6. Each of the chapters above contains an introduction to the specific settings of the problems studied, more specific related previous work, the details of the results, and related interesting open problems.

Chapter 2

Bicriteria Scheduling

2.1 Introduction

We study a bicriteria scheduling problem in a single machine environment with release times. The two criteria we want to minimize are the makespan, the maximum completion time of any job, and the average completion time.
We denote this problem by 1|rj|(Cmax, Cavg). Stein and Wein proved that for any instance and any 0 < α < 1, there exists a (1 + α, β(α))-approximation schedule, that is, a schedule with makespan at most 1 + α times the optimal makespan and average completion time at most β(α) times the optimal average completion time, for a bound β(α) that improves as α grows [80]. In fact, their results apply to a very large class of scheduling problems. They also showed that there exist instances for which there is no (x, y)-schedule with both x and y simultaneously smaller than (√5 + 1)/2 ≈ 1.618. Later, Aslam, Rasala, Stein, and Young [6] improved this trade-off, obtaining a (1 + α, β)-approximation with a smaller value of β for each 0 < α < 1. Their results also apply to a large class of scheduling problems.

A natural open problem is to close the gap between the upper bound and the lower bound on the quality of bicriteria approximation schedules. Another natural question is the time complexity of constructing such schedules. We answer these two questions for the single machine environment where jobs arrive over time (1|rj|(Cmax, Cavg)). Our main results are the following. First, we show that the upper bound found by Aslam et al. [6] is best possible by constructing a family of matching lower bound instances. This is shown in Section 2.2. Second, we show that this bound can be achieved by a deterministic polynomial-time algorithm. This is shown in Section 2.3. Our algorithm, called BEST-β, is simply a slightly generalized version of the BEST-α algorithm of Chekuri, Motwani, Natarajan, and Stein [19]. Interestingly, BEST-α is an e/(e − 1)-approximation algorithm intended for the unicriterion problem of minimizing the average completion time on one machine with release times (1|rj|Cavg). Based on our preliminary results, Rasala [70] analyzed the bicriteria performance guarantee of BEST-β for several combinations of criteria in the single machine environment with release times. She also analyzed our lower bound instances in these settings.

2.2 Lower Bound of the Quality of Bicriteria Schedules

In the proof of the following theorem, we use Dirac's delta function δ(), which we approximate as follows: δΔ(t) = 1/Δ if 0 ≤ t ≤ Δ, and δΔ(t) = 0 otherwise.

3.1.1 Problem Definitions

For the load balancing problem, we compare an algorithm that is given extra machines with the optimal off-line algorithm: an algorithm A is an (m, k)-machine ρ-approximation algorithm if, for every instance, the makespan of the schedule A produces on m + k machines is at most ρ times the optimal makespan on m machines. We say that an algorithm A is (m, k)-optimal if A is an (m, k)-machine 1-approximation algorithm. We study two algorithms in particular. The List-Scheduling algorithm (LS) runs the next job in the list on the machine with the smallest load. The Longest-Processing-Time algorithm (LPT) runs the longest unscheduled job on the machine with the smallest load. LPT is an off-line algorithm while LS is an on-line algorithm.
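A minimal sketch of the two algorithms, assuming the (m, k)-machine setting above; the job sizes are made up, and the code is illustrative rather than part of the analysis.

```python
def list_schedule_makespan(sizes, machines):
    loads = [0] * machines
    for p in sizes:                        # LS: next job in list order
        loads[loads.index(min(loads))] += p
    return max(loads)

def lpt_makespan(sizes, machines):
    # LPT is list scheduling on the jobs sorted in non-increasing order
    return list_schedule_makespan(sorted(sizes, reverse=True), machines)

jobs = [2, 7, 3, 8, 5, 4, 6]               # made-up job sizes
print("LS :", list_schedule_makespan(jobs, 3))   # 14
print("LPT:", lpt_makespan(jobs, 3))             # 13 (optimal here is 12)
```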
3.1.2 Related Results

The load balancing problem is NP-hard [31]. Graham showed that LS is a (2 − 1/m)-approximation algorithm [36] and LPT is a (4/3 − 1/(3m))-approximation algorithm [37] for any m ≥ 1. Hochbaum and Shmoys devised a polynomial-time approximation scheme (PTAS) for this problem [41]. For the on-line setting, Albers introduced a deterministic on-line 1.923-approximation algorithm, and she proved a deterministic on-line lower bound of 1.852 [2]. Recently, the lower bound was improved to 1.85358 by Gormley, Reingold, Torng, and Westbrook [35].

Kalyanasundaram and Pruhs were the first to explicitly use extra-resource analysis of on-line algorithms [48], with the goal of identifying good on-line algorithms in settings where traditional competitive analysis fails to do so. The extra-resource analysis technique is a relaxed notion of competitive analysis in which the on-line algorithm has more resources than the optimal off-line algorithm to which it is compared. Following this paper, several more studies have used the extra-resource analysis technique in a wide variety of settings [47, 69, 13, 49, 54, 57, 27, 3].

Azar, Epstein, and van Stee have independently and in parallel considered the problem of load balancing with extra resources [8]. In contrast to our work, they only consider on-line algorithms. Their main result is an on-line load balancing algorithm with an approximation ratio which decays exponentially in (m + k)/m, as well as a nearly matching lower bound. They also considered other problem features such as allowing a job to be scheduled on multiple machines, temporary tasks, and related and unrelated processors. We only consider identical machines and permanent tasks which must be scheduled on a single machine.

Another approach for identifying good on-line algorithms when competitive analysis fails to yield useful results is to restrict the set of legal input instances in some form. This has been proposed and used in a variety of settings [15, 55, 16, 17]. In contrast, extra-resource analysis compares the performance of on-line/off-line approximation algorithms to that of the optimal off-line algorithm on all possible input instances; however, the approximation algorithms are given more resources than the optimal off-line algorithm.

Our problem is also related to the well-studied and NP-hard bin packing problem [20, 31]. In the bin packing problem, we have to pack items into a minimum number of bins of known size. An item can be placed into a bin only if it fits in the available space. Johnson et al. proved that two simple on-line bin packing algorithms, First-Fit and Best-Fit, are (17/10 m + 2)-approximation algorithms [46]. Richey introduced an on-line algorithm Harmonic+1 which has an asymptotic worst-case ratio smaller than 1.5888 [71]. Karp and Karmarkar devised an asymptotic fully polynomial-time approximation scheme (asymptotic FPTAS) for off-line bin packing [50].

Dell'Olmo et al. studied off-line bin packing with extendable bins [25]. In this problem, we have to pack items into a fixed number of bins of known size. However, the size of each bin can be extended if needed. The "final" size of a bin is the maximum of the original bin size and the total size of the items in that bin. The goal is to minimize the sum of the final sizes of all bins. In contrast, the goal of load balancing is to minimize the maximum load on any machine.
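For concreteness, here is a minimal sketch of the First-Fit rule mentioned above, with made-up integer item sizes.

```python
def first_fit(items, bin_size):
    bins = []                              # residual capacity of each open bin
    for x in items:
        for i, free in enumerate(bins):
            if x <= free:                  # place x in the first bin it fits
                bins[i] -= x
                break
        else:
            bins.append(bin_size - x)      # no bin fits: open a new one
    return len(bins)

print(first_fit([5, 7, 5, 2, 4, 2, 5, 1], bin_size=10))   # 4 bins
```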
In a way similar to the work by Azar and Regev, we compare the makespan (bin stretch) of the schedules produced by approximation algorithms and the optimal off-line algorithm. The property of an algorithm being (m, k)-optimal for the load balancing prob- m+k m lem with extra machines is similar to being an -approximation algorithm for the bin packing problem. Both types of algorithms can pack items into m + k bins when the optimal off-line algorithm needs m bins. Load balancing algorithms know the number of bins used, but they do not know the optimal makespan. In contrast, bin packing algorithms know the bin size but do not know the number of bins used by the optimal algorithm. Thus, an (m, k)-Optimal load balancing algorithm can be thought of as an algorithm for solving a bin packing problem with unknown bin size but with a specified number of bins. 33 3.1.3 Summary of Results The extra-resource analysis technique has been used to derive insight into the behavior of on—line and Off-line algorithms. Previously, these insights have been used to identify good on-line algorithms in settings where traditional competitive analysis fails to do so. In this work, we show that these insights can also be used to derive a qualitative divergence between off-line and on-line load balancing algorithms. In Section 3.2, we give on-line results. We begin by extending Graham’s results [36] on the performance of £8 for the load balancing problem to the case when £8 has m + 1: machines while OPT has m machines. We show that £8 is a (m, k)-machine (1 + 2 H1) -approximation algorithm. We give a lower bound instance to show that this result is tight. The lower bound for [.8 can be generalized to apply to all on-line algorithms. In particular, we prove that no on-line algorithm is (m, k)-optimal for any k. It should be noted that although no on-line algorithm is (m, k)-Optimal for any It, the makespan of the on-line schedule could be asymptotically close to the optimal makespan as can be seen from the performance guarantee of £8. In Section 3.3, we give Off-line upper bound results. By extending Graham’s result [37] on £PT, we show that [.PT is an (m, k)-machine (max{ 4$n++311,1})- approximation algorithm. This implies that £PT is (m, k)-optimal when k _>_ mg—l that is, if .CPT has at least $21 machines and OPT has m machines, then [.PT will produce a schedule with a makespan no larger than that of OPT. Next, we refine this result to show that .CPT is (m, k)-optimal when k 2 mall‘ The proof is based on a seemingly trivial but important fact that the sum of the 34 processing times of jobs on each machine in the Optimal schedule is no larger than the Optimal makespan. To exploit this fact, new ideas are required which we now briefly summarize. The main idea is to not classify jobs by their absolute sizes, but to classify each job in a schedule by (1) the number of jobs on the same machine and (2) the order of its size among jobs on the same machine. The classification allows us to establish some crucial relationships between the optimal schedule and the EPT schedule. These relationships enable us to identify a most heavily loaded machine y in the [.PT schedule and some machine z in the optimal schedule such that the ith largest job on machine y is no larger than the ith largest job on machine 2. Therefore, the makespan of the £PT schedule is no larger than the makespan of the optimal schedule. 
Results in Sections 3.2 and 3.3 imply a divergence between off-line and on- line algorithms because simple off-line algorithms such as LPT guarantees (m,k)- Optimality with only a few extra machines whereas no on-line algorithm can guarantee (m, k)-optimality with any number of extra machines. This result also underscores the value of sorting before performing list scheduling. Namely, if the list is sorted in non-increasing order (LPT), then we can achieve (m, k)-optimality with k = -'"—;—1. If, however, the list is not sorted in non-increasing order, (m, k)-optimality through list scheduling cannot be guaranteed for any value of 1:. Results in Section 3.2 also imply a difference between load balancing and m+k_ m bin packing. Consider an (m, k)-optimal load balancing algorithm and an approximation bin packing algorithm. Both algorithms can pack all items into m + k bins without overpacking any bin whereas the optimal algorithm needs m bins. How- 35 ever, each of them knows different pieces of information. A load balancing algorithm knows how many bins is needed by the optimal algorithm, but it does not know the Optimal makespan (bin size). A bin packing algorithm knows the bin size but it does not know how many bins is needed by the optimal algorithm. The performance achievable by on-line algorithms in each Of the problems are quite different. N O on-line load balancing algorithm can guarantee not to overpack the bins. In contrast, simple on-line bin packing algorithms such as First-Fit and Best-Fit need only %m + 2 bins to pack all items [46]. This result indicates that the optimal makespan (bin size) is a more important piece Of information than the Optimal number of bins. In Section 3.4, we describe a procedure to compute a good, though not neces- sarily optimal, lower bound of the performance of £PT for any m and k. 3.2 On—Line Results We begin this section by proving the following lemma which is an extension of Gra- ham’s analysis for £8 [36] and £PT [37] to the case when list scheduling algorithms have It extra machines. Note that the lemma applies to any list scheduling algorithm. Lemma 3.2.1. For any instance I and any It _>_ 0, ifjob l is the last job to finish in the £8 schedule, k — 1 £5 < 711 01,7- m + . Proof. All machines must be busy up to time 3,55 when job l is scheduled, otherwise job l could have been started earlier. The starting time s,“ of job l is upper bounded by the average load on m + k machines just before job l is scheduled. Thus, 36 sf5(m + k) S air—kflzyzlpi) — p1) = m—L—k zyzlp, — 571-ka The optimal makespan on m machines is lower bounded by the average load on m machines after all jobs are scheduled. Therefore, 0337(m) Z #(Z'lzl pi). cam + k, I) = 8.“ +121 1 " 1 _____. i—— + m+k§p m+k]?l p: s (—m )08£T(m.1)+(———m+k_l)pt m+k Theorem 3.2.1. For m 2 1, k 2 0, and any job list I, cs < m‘1 or?" I c...(m+k.I) _ (1+m+k)0....(m, ). and this is tight. Proof. Suppose job I is the last job to finish in the £8 schedule. By Lemma 3.2.1 OO’P’T and the fact that no job has a processing time larger than max , C£i(m+k.l) < (—’—”—) 03;T(m,l>+(31’1°—‘—1) Canon _ m+k m+k _ m-l O’PT — (1+ m+k) Cmam (m,I). The tightness of this result can be seen by considering an instance that begins with (m — 1)(m + k) jobs of size 1 and ends with a job of size m + k. £8 would schedule m — 1 jobs of size 1 on each of the m + k machines and schedule the job of size m + k on one of the machines. Thus, its makespan would be 2m + k — 1. 
The optimal schedule would schedule m+k jobs of size 1 on each of m−1 machines and schedule the job of size m+k on the remaining machine. Thus, its makespan would be m+k. $\Box$

The lower bound for LS can be generalized to apply to all on-line algorithms.

Theorem 3.2.2. No on-line algorithm is (m, k)-optimal for $m \ge 2$ and $k \ge 0$.

Proof. The idea of the proof is that the adversary keeps generating new jobs until the on-line algorithm schedules a new job on a non-empty machine. Fix $m \ge 2$, $k \ge 0$, and an on-line algorithm A. The sizes of the jobs generated by the adversary are as follows. The first job has size 4, and the size of each subsequent job is equal to the size of all the previous jobs combined; that is, $p_1 = 4$ and $p_j = 2^j$ for $j \ge 2$. If the adversary has generated l jobs, then an optimal schedule for these l jobs is to schedule job l by itself and to schedule the other jobs arbitrarily on the other machines. In fact, the adversary can schedule jobs 1 through l−1 on the same machine. The optimal makespan for these l jobs is $2^l$, which is the size of the latest job generated by the adversary and is also equal to the sum of the sizes of jobs 1 through l−1.

Suppose job l is the first job that the on-line algorithm schedules on a non-empty machine. Since the on-line algorithm has only m+k machines, $l \le m+k+1$. The adversary generates no more jobs after job l. From the argument above, the optimal makespan is $2^l$, the size of job l. Since the on-line algorithm schedules job l on a non-empty machine, the on-line makespan is strictly greater than $2^l$. Thus, the result follows. $\Box$
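The adversary of Theorem 3.2.2 is easy to mechanize. The sketch below is an illustration only, not the dissertation's construction; the callback interface `online_place` is a hypothetical name for whatever placement rule the on-line algorithm uses. Whatever the rule, the returned on-line makespan is strictly larger than the optimal one.

```python
def doubling_adversary(online_place, m, k):
    """Play the adversary of Theorem 3.2.2 against an on-line algorithm.
    `online_place(sizes, loads)` must return the machine index (0-based)
    on which the newest job (the last entry of `sizes`) is scheduled.
    Returns (online_makespan, optimal_makespan)."""
    loads = [0] * (m + k)
    sizes = []
    j = 0
    while True:
        j += 1
        p = 4 if j == 1 else 2 ** j     # each job equals the sum of all earlier jobs
        sizes.append(p)
        i = online_place(list(sizes), list(loads))
        stop = loads[i] > 0             # first placement on a non-empty machine:
        loads[i] += p                   # the adversary generates no more jobs
        if stop:
            break
    # OPT schedules the last job alone; jobs 1..j-1 fit on one other machine.
    return max(loads), p

# Example: even the "least-loaded machine" rule loses once all machines are busy.
on, opt = doubling_adversary(lambda s, l: l.index(min(l)), m=2, k=1)
assert on > opt
```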
3.3 Off-Line Upper Bounds

We begin this section by extending Graham's analysis for LPT [37] to the case when LPT has k extra machines.

Theorem 3.3.1. For $m \ge 1$, $k \ge 0$, and any job list I,
$$C_{\max}^{LPT}(m+k, I) \le \max\left\{\frac{4m+k-1}{3m+3k},\, 1\right\} C_{\max}^{OPT}(m, I).$$
In particular, LPT is (m, k)-optimal when $k \ge \frac{m-1}{2}$.

Proof. The proof is very similar to the case where both LPT and OPT have m machines [37]. Suppose job l is the last job to finish in the LPT schedule; thus, $C_{\max}^{LPT} = s_l^{LPT} + p_l$. We can assume that job l is the last job to start in the LPT schedule. If this is false, we can remove all jobs i such that $s_i^{LPT} > s_l^{LPT}$. This does not change the makespan of the LPT schedule because these jobs must run on machines other than the one job l runs on; moreover, this cannot increase the optimal makespan. Since LPT schedules job l last, job l is the smallest job, i.e., $p_l = p_{\min}$. There are two cases.

Case 1: $p_{\min} \le \frac{1}{3} C_{\max}^{OPT}$. Since l is the last job to finish in the LPT schedule, by Lemma 3.2.1,
$$C_{\max}^{LPT} \le \frac{m}{m+k}\, C_{\max}^{OPT} + \frac{m+k-1}{m+k} \cdot \frac{1}{3}\, C_{\max}^{OPT} = \frac{4m+k-1}{3m+3k}\, C_{\max}^{OPT}.$$

Case 2: $p_{\min} > \frac{1}{3} C_{\max}^{OPT}$. Thus $p_i > \frac{1}{3} C_{\max}^{OPT}$ for all i, and in the optimal schedule there are at most 2 jobs on each machine. If $n \le m$, then the optimal algorithm schedules one job per machine. If $m+1 \le n \le 2m$, we claim that, for j = 1, ..., m, the optimal algorithm schedules job j on the same machine as job 2m+1−j if $2m+1-j \le n$, and schedules job j by itself otherwise. The optimality can be proven by an interchange argument. Furthermore, LPT produces a similar schedule on m+k machines. If $n \le m+k$, then LPT schedules one job per machine. If $m+k+1 \le n \le 2(m+k)$, then for j = 1, ..., m+k, LPT schedules job j on the same machine as job 2(m+k)+1−j if $2(m+k)+1-j \le n$, and schedules job j by itself otherwise. Thus, $C_{\max}^{LPT} \le C_{\max}^{OPT}$.

From Cases 1 and 2, $C_{\max}^{LPT} \le \max\{\frac{4m+k-1}{3m+3k}, 1\}\, C_{\max}^{OPT}(m, I)$, and $\frac{4m+k-1}{3m+3k} \le 1$ precisely when $k \ge \frac{m-1}{2}$. $\Box$

Theorem 3.3.2. For $m \ge 1$, $k \ge \frac{m-1}{3}$, and any job list I, $C_{\max}^{LPT}(m+k, I) \le C_{\max}^{OPT}(m, I)$.

Proof. Suppose, for contradiction, that the theorem is false, and consider a smallest counterexample I: among all counterexamples, I uses the fewest base machines m and, among those, the fewest jobs n. We will now show that this counterexample cannot exist, and thus the theorem follows.

Suppose that the jobs in I are labeled according to the order in which they are scheduled by LPT; thus, $p_1 \ge p_2 \ge \cdots \ge p_n$. We first observe that job n is the only job in LPT's schedule that finishes later than the last job in OPT's schedule; that is, $C_{\max}^{LPT}(m+k, I) = C_n^{LPT}(m+k, I) > C_{\max}^{OPT}(m, I)$ and $C_i^{LPT}(m+k, I) \le C_{\max}^{OPT}(m, I)$ for i = 1, ..., n−1. Otherwise, there exists a job i with $1 \le i \le n-1$ such that $C_i^{LPT}(m+k, I) > C_{\max}^{OPT}(m, I)$. Create an instance I' from I by removing jobs i+1 through n from I. This does not affect the completion time of job i in the LPT schedule because jobs i+1 through n are scheduled after job i is; moreover, this cannot increase the optimal makespan. Thus, $C_{\max}^{LPT}(m+k, I') \ge C_i^{LPT}(m+k, I') = C_i^{LPT}(m+k, I) > C_{\max}^{OPT}(m, I) \ge C_{\max}^{OPT}(m, I')$, and I' has fewer jobs than I. This contradicts our assumption that I is a smallest counterexample.

We next observe that OPT cannot produce a schedule where some machine has only one job i. Suppose this is not the case. Create an instance I' from I by setting $p_i$ to $C_{\max}^{OPT}(m, I)$; this only possibly increases $p_i$. Clearly, $C_{\max}^{OPT}(m, I') = C_{\max}^{OPT}(m, I)$. Furthermore, $C_{\max}^{LPT}(m+k, I') \ge C_{\max}^{LPT}(m+k, I) > C_{\max}^{OPT}(m, I) = C_{\max}^{OPT}(m, I')$. Thus, I' is also a smallest counterexample. We can assume that, in the schedule created by LPT for instance I', job i is on a machine by itself. Create an instance I'' from I' by removing job i. Clearly, $C_{\max}^{OPT}(m-1, I'') \le C_{\max}^{OPT}(m, I')$. Furthermore, $C_{\max}^{LPT}(m-1+k, I'') = C_{\max}^{LPT}(m+k, I') > C_{\max}^{OPT}(m, I') \ge C_{\max}^{OPT}(m-1, I'')$. Thus, I'' is a counterexample to the theorem statement with m−1 base machines and k extra machines. This contradicts the minimality of m.

We now observe that the schedule produced by LPT just before job n arrives also cannot have any machine with only one job i. Suppose this is not the case. Assign job n to the machine with only job i. Since OPT has no machine with only one job, job i must be scheduled with some other job j in the schedule produced by OPT. Since job n is the smallest job in the input instance, $p_n \le p_j$. This implies that $C_{\max}^{LPT}(m+k, I) = C_n^{LPT}(m+k, I) \le p_i + p_n \le p_i + p_j \le C_{\max}^{OPT}(m, I)$.

We now proceed with the remainder of the proof. There are 2 cases based on the size of $p_n$. In both cases, we show that $C_{\max}^{LPT}(m+k, I) \le C_{\max}^{OPT}(m, I)$, which contradicts the assumption that I is a counterexample.

Case 1: $p_n \le \frac{1}{4} C_{\max}^{OPT}$. Job n is the last to finish in the LPT schedule, and $p_n = p_{\min} \le \frac{1}{4} C_{\max}^{OPT}$. Thus, from Lemma 3.2.1,
$$C_{\max}^{LPT} \le \frac{m}{m+k}\, C_{\max}^{OPT} + \frac{m+k-1}{m+k} \cdot \frac{1}{4}\, C_{\max}^{OPT} = \frac{5m+k-1}{4m+4k}\, C_{\max}^{OPT} \le C_{\max}^{OPT}$$
because $m \le 3k+1$.

Case 2: $p_n > \frac{1}{4} C_{\max}^{OPT}$. Before we continue, we provide some definitions.

- Let φ be an optimal schedule on m machines for instance I.
- Let σ be the schedule produced by LPT on m+k machines for instance I.
- Let π be the partial schedule produced by LPT on m+k machines for instance I when exactly the first n−1 jobs are scheduled.
- Let H and K be the sets of machines in schedule φ with exactly 2 and 3 jobs, respectively.
- Let E and F be the sets of machines in schedule π with exactly 2 and 3 jobs, respectively.

Since $p_i \ge p_{\min} > \frac{1}{4} C_{\max}^{OPT}$ for i = 1, ..., n and $C_i^{\phi} \le C_{\max}^{OPT}$, there are at most 3 jobs per machine in schedule φ. Similarly, there are at most 3 jobs per machine in schedule π because $C_i^{\pi} \le C_{\max}^{OPT}$ for i = 1, ..., n−1. From the earlier observations, there are at least two jobs per machine in both schedule φ and schedule π.
For the remainder of this proof, we introduce the following notation. When describing a machine, we will use a distinct letter such as u or v. When describing jobs, we will use two different notations. First, we will still call the last job scheduled by LPT job n, and we will still refer to its length as $p_n$. However, we will also refer to jobs by how they are scheduled by the optimal algorithm. Specifically, if u is one of the m machines of the optimal schedule, $u_i$ is the ith largest job scheduled on machine u, with ties broken arbitrarily. Also, we will use $p(u_i)$ to denote the processing time of job $u_i$.

We now classify jobs according to (1) the number of jobs on the machine on which they are scheduled in φ and (2) the rank of their size among the jobs on the same machine in φ.

- Let $H_i = \{u_i \mid u \in H\}$ for i = 1, 2.
- Let $H_{12} = H_1 \cup H_2$.
- Definitions of K with subscripts are analogous to the two definitions above.

We also classify jobs in a second manner with respect to the schedule π produced by LPT.

- Let $E_i = \{u_j \mid \text{job } u_j \text{ is the ith job on some machine } v \text{ in schedule } \pi \text{ and } v \in E\}$ for i = 1, 2.
- Let $E_{12} = E_1 \cup E_2$.
- Definitions of F with subscripts are analogous to the two definitions above.

Figure 3.1 illustrates the two job classification schemes just described. Note that jobs in the same class may have different sizes. To further clarify these definitions, consider the following statements, which are implied by $u_i \in K_{23}$.

1. Either $u_i = u_2$ or $u_i = u_3$.
2. Job $u_i$ is on machine u in schedule φ.
3. $u \in K$.
4. There are 3 jobs on machine u in schedule φ, namely $u_1$, $u_2$, and $u_3$.
5. $u_1 \in K_1$, $u_2 \in K_2$, $u_3 \in K_3$.
6. $p(u_1) \ge p(u_2) \ge p(u_3)$.
7. $p(u_1) + p(u_2) + p(u_3) \le C_{\max}^{OPT}$.

Figure 3.1: Job classification schemes.

From earlier arguments, there are either 2 or 3 jobs per machine in schedule φ and in schedule π. Therefore, we have the following equalities.

$$n = 2|E| + 3|F| + 1 \qquad (3.1)$$
$$n = 2|H| + 3|K| \qquad (3.2)$$
$$|H| + |K| = m \qquad (3.3)$$
$$|E| + |F| = m + k \qquad (3.4)$$
$$2|H| + 3|K| = 2|E| + 3|F| + 1 \qquad \text{from (3.1) and (3.2)} \qquad (3.5)$$
$$4(|H| + |K|) \le 3(|E| + |F|) + 1 \qquad \text{from (3.3), (3.4), and } m \le 3k+1 \qquad (3.6)$$

We show that there exists a machine y in π and a machine z in φ such that, if job n is scheduled on y, there is a pairing of the jobs from the two machines such that the jobs from y are no larger than the jobs from z. We consider two cases.

Case 2.1: $E_1 \cap K_{23} \ne \emptyset$. This case is shown pictorially in Figure 3.2. The relation $E_1 \cap K_{23} \ne \emptyset$ means that there exists a job $v_i$ such that $v_i \in E_1$ and $v_i \in K_{23}$. Let job $u_j$ be the job in $E_2$ on the same machine as $v_i$ in schedule π. By the definitions of $E_1$ and $E_2$, we have that $p(u_j) \le p(v_i)$. In the optimal schedule, job $v_i$ is scheduled on the same machine with exactly two other jobs, and job $v_i$ is not the largest job on the machine. Thus, $p(u_j) \le p(v_i) \le p(v_2) \le p(v_1)$. Finally, since job n is the smallest job, $p_n \le p(v_3)$. Combining these facts, it follows that job n can be scheduled on the same machine as $v_i$ and $u_j$ in the LPT schedule so that the total load on that machine is no larger than the optimal makespan. Algebraically, $p(u_j) \le p(v_i) \le p(v_2) \le p(v_1)$ and $p_n \le p(v_3)$; hence
$$C_{\max}^{LPT} = C_n^{\sigma} \le p(v_i) + p(u_j) + p_n \le p(v_1) + p(v_2) + p(v_3) \le C_{\max}^{OPT}.$$

Figure 3.2: Case 2.1 of Theorem 3.3.2: $E_1 \cap K_{23} \ne \emptyset$.

Case 2.2: $E_1 \cap K_{23} = \emptyset$. First, we establish some relations between schedule φ and schedule π.
$$E_1 \subseteq K_{23}^c = H_{12} \cup K_1 \qquad \text{because } E_1 \cap K_{23} = \emptyset \qquad (3.7)$$
$$|E| \le 2|H| + |K| \qquad \text{from (3.7)} \qquad (3.8)$$
$$|E| \ge 2|H| + |K| \qquad \text{from (3.5) and (3.6)} \qquad (3.9)$$
$$E_1 = H_{12} \cup K_1 \qquad \text{from (3.7), (3.8), and (3.9)} \qquad (3.10)$$
$$E_2 \cup F_{123} \cup \{n\} = E_1^c = (H_{12} \cup K_1)^c = K_{23} \qquad \text{from (3.10)} \qquad (3.11)$$

Equalities (3.10) and (3.11) are the critical equalities we need; they are shown graphically in Figure 3.3.

Figure 3.3: Case 2.2 of Theorem 3.3.2: $K_{23} = E_2 \cup F_{123} \cup \{n\}$ and $H_{12} \cup K_1 = E_1$.

Set $E_2$ is not empty because, from equalities (3.10) and (3.3), $|E_2| = |E| = 2|H| + |K| \ge m \ge 1$. Let $u_j$ be the first job in $E_2$ to be scheduled by LPT. Suppose $v_i$ is the job in $E_1$ which is on the same machine as $u_j$ in schedule π. From relation (3.11), $u_j \in K_{23}$. Thus, there are 3 jobs on machine u in φ. If we can show that $p_n \le p(u_3)$, $p(u_j) \le p(u_2)$, and $p(v_i) \le p(u_1)$, then
$$C_{\max}^{LPT} = C_n^{\sigma} \le p(v_i) + p(u_j) + p_n \le p(u_1) + p(u_2) + p(u_3) \le C_{\max}^{OPT}.$$
Obviously, $p_n \le p(u_3)$ because job n is the smallest job. Since $u_j \in K_{23}$, job $u_j$ is either $u_2$ or $u_3$; in either case, $p(u_j) \le p(u_2)$. We now prove $p(v_i) \le p(u_1)$. By definition, job $u_1$ is in $K_1$, and $u_1$ is on the same machine as $u_j$ in schedule φ. From relation (3.10), $u_1 \in E_1$. Since $u_j$ is the first job to be scheduled in $E_2$, and LPT always schedules the next job on the least loaded machine, the fact that $u_j$ is scheduled on the same machine as $v_i$ implies that $v_i$ is a smallest job in $E_1$. In particular, $p(v_i) \le p(u_1)$, and this completes Case 2.2.

Since this is the last possibility to consider, we have proven that there are no counterexamples to the theorem, and thus the theorem is true. $\Box$

The result of the previous theorem is tight in 2 aspects. First, with $\frac{m-1}{3}$ extra machines, LPT does not always produce a schedule with makespan strictly smaller than that of the optimal algorithm. In fact, no algorithm can guarantee this, because an input instance can contain a job whose length is equal to the optimal makespan. Second, as will be shown in the next theorem, if LPT has fewer than $\frac{m-1}{3}$ extra machines, there exist instances for which LPT produces a schedule with makespan strictly larger than that of the optimal algorithm.

Theorem 3.3.3. For any pair of non-negative integers m and k such that $k = \frac{m-2}{3}$, there exist instances such that
$$C_{\max}^{LPT}(m+k) \ge \frac{8k+7}{8k+6}\, C_{\max}^{OPT}(m) > C_{\max}^{OPT}(m).$$

Proof. For $k \ge 0$, the instance $I_k$ with 8k+5 jobs is defined as follows:

- 2k+2 jobs of size 4k+3
- 2 jobs of size 2k+2+i, for i = 2k, ..., 1 (a total of 4k jobs)
- 2k+3 jobs of size 2k+2

Figure 3.4 illustrates an instance with k = 2 and m = 3k+2 = 8, its optimal schedule on m machines, and its LPT schedule on m+k machines. Each job in the figure is labeled with its size. The optimal schedule on m = 3k+2 machines can be described as follows:

- k+1 machines with jobs of sizes 4k+3, 4k+3
- 2 machines with jobs of sizes 4k+2, 2k+2, 2k+2
- 2 machines with jobs of sizes 4k+2−i, 2k+2+i, 2k+2, for i = 1, ..., k−1
- 1 machine with jobs of sizes 3k+2, 3k+2, 2k+2

All machines in the optimal schedule have load 8k+6. The schedule produced by LPT on m+k = 4k+2 machines can be described as follows:

- 1 machine with jobs of sizes 4k+3, 2k+2, 2k+2
- 2k+1 machines with jobs of sizes 4k+3, 2k+2
- 2 machines with jobs of sizes 4k+3−i, 2k+2+i, for i = 1, ..., k

The machine with 3 jobs in the LPT schedule has load 8k+7. All other machines have load 6k+5. Note that LPT could schedule the last job (of size 2k+2) on any machine. $\Box$
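A quick way to sanity-check the construction is to run LPT on $I_k$ directly. The snippet below is illustrative only (it reuses the hypothetical `lpt_schedule` helper from the earlier sketch); it builds $I_k$ and confirms the makespan of 8k+7 on m+k machines, against the optimal load of 8k+6 on m machines.

```python
def instance_Ik(k):
    """The 8k+5 jobs of instance I_k from Theorem 3.3.3."""
    jobs = [4 * k + 3] * (2 * k + 2)            # 2k+2 large jobs
    for i in range(2 * k, 0, -1):               # 2 jobs of size 2k+2+i each
        jobs += [2 * k + 2 + i] * 2
    jobs += [2 * k + 2] * (2 * k + 3)           # 2k+3 small jobs
    return jobs

for k in range(5):
    m = 3 * k + 2
    # LPT on m+k machines reaches 8k+7, while OPT packs m machines to 8k+6.
    assert lpt_schedule(instance_Ik(k), m + k) == 8 * k + 7
```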
Figure 3.4: An instance with m = 8 and k = 2 showing the tightness of Theorem 3.3.2.

3.4 Lower Bounds on LPT

In this section, we describe a procedure to compute a good, though not necessarily optimal, lower bound on the performance of LPT for any m and k. Given m and k, we use a linear program that assumes that the optimal schedule and the LPT schedule have the specific structures shown in Figure 3.5, where jobs are ordered by non-increasing size. Note that each job in the figure is labeled with its job id. The variables of the linear program are the job sizes and the starting time of the last job in the LPT schedule. This procedure considers only a very restricted set of possible instances; the best lower bound instance from this restricted set can be obtained by optimizing the linear program. We describe our assumptions in detail next.

We assume that in the LPT schedule there are 2 jobs on all machines except for one machine, on which there are 3 jobs. Suppose $p_1 \ge p_2 \ge \cdots \ge p_n$. We assume that in the LPT schedule, job i is paired with job n−i, and job n is the last job to finish. We assume that the optimal makespan is no larger than 1. Finally, we assume that there are either 2 or 3 jobs on each machine in the optimal schedule, and that the optimal schedule has the structure illustrated in Figure 3.5.

The linear program can be described algebraically as follows. Let $m_2$ and $m_3$ be the numbers of machines in the optimal schedule with 2 and 3 jobs, respectively. From the above assumptions, $n = 2(m+k)+1$, $n = 2m_2 + 3m_3$, and $m = m_2 + m_3$. Thus, $m_2 = m - 2k - 1$ and $m_3 = 2k + 1$. We denote the starting time of job n in the LPT schedule by s. Given $m \ge 1$ and $0 \le k \le (m-1)/2$, we construct the linear program LB(m, k) as follows.

maximize $s + p_n$
subject to
$p_i \ge p_{i+1}$, for i = 1, ..., n−1
$s \le p_i + p_{n-i}$, for i = 1, ..., (n−1)/2
$p_i + p_j \le 1$, for all (i, j) ∈ H
$p_i + p_j + p_l \le 1$, for all (i, j, l) ∈ K

where
$$H = \{(i,\ 2m_2+1-i) \mid i = 1, \ldots, m_2\}$$
$$K = \{(2m_2+i,\ 2m_2+2m_3+2-i,\ 2m_2+2m_3+1+i) \mid i = 1, \ldots, m_3-1\} \cup \{(2m_2+m_3,\ 2m_2+m_3+1,\ 2m_2+m_3+2)\}.$$
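LB(m, k) is small enough to hand to an off-the-shelf LP solver. The following sketch is not the procedure used in the dissertation; it assumes NumPy and SciPy are available, adds non-negativity bounds on the variables, and builds the constraint matrix exactly as written above before maximizing $s + p_n$.

```python
import numpy as np
from scipy.optimize import linprog

def lpt_lower_bound(m, k):
    """Solve LB(m, k): variables are p_1 >= ... >= p_n and s, with the
    optimal makespan normalized to 1; returns the lower bound s + p_n."""
    n = 2 * (m + k) + 1
    m2, m3 = m - 2 * k - 1, 2 * k + 1
    A, b = [], []                            # rows of A x <= b, x = (p_1..p_n, s)

    def row(entries):
        r = np.zeros(n + 1)
        for idx, coeff in entries:           # idx is 0-based into x
            r[idx] = coeff
        return r

    for i in range(n - 1):                   # p_{i+1} <= p_i
        A.append(row([(i, -1), (i + 1, 1)])); b.append(0)
    for i in range(1, (n - 1) // 2 + 1):     # s <= p_i + p_{n-i}
        A.append(row([(n, 1), (i - 1, -1), (n - i - 1, -1)])); b.append(0)
    for i in range(1, m2 + 1):               # pairs H: machines with 2 jobs in OPT
        A.append(row([(i - 1, 1), (2 * m2 - i, 1)])); b.append(1)
    K = [(2 * m2 + i, 2 * m2 + 2 * m3 + 2 - i, 2 * m2 + 2 * m3 + 1 + i)
         for i in range(1, m3)]
    K.append((2 * m2 + m3, 2 * m2 + m3 + 1, 2 * m2 + m3 + 2))
    for (i, j, l) in K:                      # triples K: machines with 3 jobs in OPT
        A.append(row([(i - 1, 1), (j - 1, 1), (l - 1, 1)])); b.append(1)

    c = np.zeros(n + 1)
    c[n - 1] = c[n] = -1.0                   # maximize s + p_n
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b), bounds=(0, None))
    return -res.fun

print(lpt_lower_bound(2, 0))                 # 7/6 ~ 1.1667, matching Table 3.1
```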
Notice the difference between the layouts of the jobs in the optimal schedules in Figures 3.4 and 3.5. In Theorem 3.3.3, we chose to use the layout in Figure 3.4 rather than the layout in Figure 3.5 because it is easier to describe.

Table 3.1 shows the computed lower bounds for m = 1, ..., 45 and k = 0, ..., 10. Figure 3.6 shows the plot of this table for m = 1, ..., 60 and k = 0, ..., 20. Figure 3.7 shows the plot of the lower bound when m = 100; the points in Figure 3.7 are the lower bounds obtained from our linear program when m = 100. The upper curve is the performance guarantee of LPT from Theorem 3.3.1; it is described by $y = \max\{\frac{4m+k-1}{3m+3k}, 1\}$, where y is the performance guarantee. It is important to note that the lower bounds found when $m \ge 1$ and k = 0 or $k \ge \frac{m-1}{3}$ match the upper bounds given by Graham [37] and Theorem 3.3.2, respectively. The lower curve is the performance guarantee of LPT when it is given m machines with speed $\frac{m+k}{m}$ instead of m+k machines with unit speed. When an algorithm is given machines with speed s, it can guarantee a makespan which is $\frac{1}{s}$ times the guaranteed makespan using machines with unit speed. The lower curve is described by $y = \frac{4m-1}{3m} \cdot \frac{m}{m+k} = \frac{4m-1}{3m+3k}$. Intuitively, faster machines should be utilized more efficiently than extra machines.

In Figure 3.7, the lower bounds obtained from the linear program mostly lie well above the lower curve. However, there is one point which lies below the lower curve. It is possible that this point is a counterexample to our result, but more likely we have not found an optimal extra-machine lower bound for this particular combination of m and k.

Figure 3.5: The lower bound instance with m = 8 and k = 2 obtained from the linear program.

Figure 3.6: The plot of the lower bounds computed by the linear program.

m\k   0        1      2      3      4      5      6      7      8      9      10
1     1        -      -      -      -      -      -      -      -      -      -
2     7/6      -      -      -      -      -      -      -      -      -      -
3     11/9     1      -      -      -      -      -      -      -      -      -
4     5/4      1      -      -      -      -      -      -      -      -      -
5     19/15    15/14  1      -      -      -      -      -      -      -      -
6     23/18    10/9   1      -      -      -      -      -      -      -      -
7     9/7      7/6    1      1      -      -      -      -      -      -      -
8     31/24    7/6    23/22  1      -      -      -      -      -      -      -
9     35/27    27/23  15/14  1      1      -      -      -      -      -      -
10    13/10    25/21  11/10  1      1      -      -      -      -      -      -
11    43/33    11/9   10/9   31/30  1      1      -      -      -      -      -
12    47/36    11/9   7/6    19/18  1      1      -      -      -      -      -
13    17/13    11/9   7/6    15/14  1      1      1      -      -      -      -
14    55/42    70/57  7/6    11/10  39/38  1      1      -      -      -      -
15    59/45    5/4    7/6    53/48  23/22  1      1      1      -      -      -
16    21/16    5/4    27/23  10/9   19/18  1      1      1      -      -      -
17    67/51    5/4    19/16  7/6    15/14  47/46  1      1      1      -      -
18    71/54    5/4    25/21  7/6    11/10  31/30  1      1      1      -      -
19    25/19    19/15  11/9   7/6    11/10  19/18  1      1      1      1      -
20    79/60    19/15  11/9   7/6    10/9   19/18  55/54  1      1      1      -
21    83/63    19/15  11/9   7/6    10/9   15/14  31/30  1      1      1      1
22    29/22    19/15  11/9   7/6    7/6    11/10  23/22  1      1      1      1
23    91/69    23/18  11/9   27/23  7/6    11/10  19/18  63/62  1      1      1
24    95/72    23/18  27/22  19/16  7/6    76/69  19/18  39/38  1      1      1
25    33/25    23/18  70/57  19/16  7/6    10/9   15/14  31/30  1      1      1
26    103/78   23/18  5/4    25/21  7/6    10/9   11/10  19/18  71/70  1      1
27    107/81   9/7    5/4    11/9   7/6    7/6    11/10  19/18  43/42  1      1
28    37/28    9/7    5/4    11/9   7/6    7/6    11/10  19/18  31/30  1      1
29    115/87   9/7    5/4    11/9   7/6    7/6    53/48  15/14  23/22  79/78  1
30    119/90   9/7    5/4    11/9   27/23  7/6    10/9   11/10  19/18  47/46  1
31    41/31    31/24  5/4    11/9   19/16  7/6    10/9   11/10  19/18  31/30  1
32    127/96   31/24  5/4    11/9   19/16  7/6    7/6    11/10  19/18  31/30  87/86
33    131/99   31/24  19/15  11/9   25/21  7/6    7/6    11/10  15/14  19/18  55/54
34    45/34    31/24  19/15  27/22  25/21  7/6    7/6    10/9   11/10  19/18  39/38
35    139/105  35/27  19/15  27/22  11/9   7/6    7/6    10/9   11/10  19/18  31/30
36    143/108  35/27  19/15  70/57  11/9   7/6    7/6    10/9   11/10  19/18  23/22
37    49/37    35/27  19/15  5/4    11/9   27/23  7/6    7/6    11/10  15/14  19/18
38    151/114  35/27  19/15  5/4    11/9   19/16  7/6    7/6    53/48  11/10  19/18
39    155/117  13/10  19/15  5/4    11/9   19/16  7/6    7/6    10/9   11/10  19/18
40    53/40    13/10  23/18  5/4    11/9   19/16  7/6    7/6    10/9   11/10  19/18
41    163/123  13/10  23/18  5/4    11/9   25/21  7/6    7/6    10/9   11/10  15/14
42    167/126  13/10  23/18  5/4    11/9   25/21  7/6    7/6    7/6    11/10  11/10
43    57/43    43/33  23/18  5/4    11/9   11/9   7/6    7/6    7/6    53/48  11/10
44    175/132  43/33  23/18  5/4    27/22  11/9   27/23  7/6    7/6    10/9   11/10
45    179/135  43/33  23/18  5/4    27/22  11/9   19/16  7/6    7/6    10/9   11/10

Table 3.1: The lower bounds computed by the linear program.

Figure 3.7: Summary of upper and lower bounds for LPT with extra resources.

3.5 Open Problems

Extra-resource analysis is a promising analysis technique which can be used to derive greater insight into the behavior of both on-line and off-line algorithms. Previously, these insights have been used to identify good on-line algorithms in settings where traditional competitive analysis fails to do so. In this work, we showed that these insights can also be used to derive a divergence between off-line and on-line load balancing algorithms.
We believe that future work should, in part, search for even more applications of the extra-resource analysis technique.

Chapter 4

Preempt-Decay Scheduling

4.1 Introduction

In this chapter, we study the single machine preempt-decay scheduling problem with release dates minimizing the total flow time. We denote this problem by $1|r_j, decay|\sum F_j$. We study its relation to the preempt-resume and preempt-repeat counterparts ($1|r_j, pmtn|\sum F_j$ and $1|r_j|\sum F_j$). A previous result along this line is the work by Kellerer, Tautenhahn, and Woeginger [52]. They showed that the optimal total flow time in the preempt-repeat model is at most $O(\sqrt{n})$ times that of the preempt-resume model, and there exist instances for which this is tight.

In Section 4.2, we show that the preempt-decay scheduling problem on one machine minimizing total flow time is NP-hard.

Since the preempt-decay model is a middle ground between two extremes, the preempt-resume and the preempt-repeat models, we need to determine whether the preempt-decay model is significantly different from the other two. In Section 4.3, we show that the optimal total flow time in the preempt-decay model and that in the preempt-resume model can be a factor of $\Theta(\sqrt{n})$ apart. Likewise, the total flow time of the optimal schedule in the preempt-repeat model can be a factor of $\Theta(\sqrt{n})$ from that of the preempt-decay model. To distinguish optimal algorithms in the three models, we use POPT, $DOPT_f$, and NOPT to denote an optimal algorithm in the preempt-resume model, the preempt-decay model where f is the decay function, and the preempt-repeat model, respectively. The results of Section 4.3 are summarized in Table 4.1 and pictorially in Figure 4.1.

In Section 4.4, we analyze the performance of the Shortest Remaining Processing Time (SRPT) algorithm in the preempt-decay model. The SRPT algorithm always runs a job with the smallest remaining processing time, and it is an optimal algorithm for the preempt-resume model [9]. Our analysis shows that SRPT performs poorly in the preempt-decay model; there exist instances for which the total flow time of the SRPT schedule is $\Omega(n)$ times the optimal total flow time in the preempt-decay model. This result is also shown in Table 4.1 and pictorially in Figure 4.1. In Section 4.5, we discuss some open problems in preempt-decay scheduling; some of these are also indicated in Table 4.1 and Figure 4.1.

4.2 NP-Hardness Result

First, we mention an NP-hardness result from the literature.

Theorem 4.2.1. [59] The one machine scheduling problem with release dates minimizing the total flow time in the preempt-repeat model is NP-hard.

Using a similar reduction from 3-PARTITION, we prove that the analogous problem in the preempt-decay model is NP-hard.

Ratio                         | Lower Bound                                        | Upper Bound
NOPT/POPT                     | Ω(√n) [52]                                         | O(√n) [52]
NOPT/DOPT_f                   | Ω(√n)                                              | O(√n)
DOPT_f/POPT                   | Ω(√n), f(t) = ut, u ≥ 1; Ω(u√n), f(t) = ut, u ≤ 1  | O(√n)
SRPT_f/DOPT_f                 | Ω(n), f(t) = ut, u ≥ 1; Ω(un), f(t) = ut, u ≤ 1    | open
approximation of DOPT_f       | open                                               | open

Table 4.1: Summary of results on preempt-decay scheduling.
Figure 4.1: Summary of results on preempt-decay scheduling. (The figure plots, per instance, the ratio of the optimal preempt-repeat cost to the optimal preempt-decay and preempt-resume costs, which is $\Theta(\sqrt{n})$, and the ratio of the SRPT cost to the optimal preempt-decay cost, which is $\Omega(n)$ for $f(t) = ut$ with $u \ge 1$ and $\Omega(un)$ with $u \le 1$; the best ratio achievable by an approximation algorithm in the preempt-decay model is open.)

3-PARTITION:
Instance: A bound $B \in \mathbb{Z}^+$ and a set $A = \{a_1, \ldots, a_{3m}\}$ of 3m integers such that $B/4 < a_j < B/2$ for all $a_j \in A$ and such that $\sum_{a_j \in A} a_j = mB$.
Question: Can A be partitioned into m disjoint sets $A_1, A_2, \ldots, A_m$ such that, for $1 \le i \le m$, $\sum_{a \in A_i} a = B$ (note that each $A_i$ must therefore contain exactly three elements from A)?

Theorem 4.2.2. The one machine scheduling problem with release dates minimizing the total flow time in the preempt-decay model is NP-hard.

Proof sketch. The proof is very similar to the proof of Theorem 4.2.1. The reduction is from the 3-PARTITION problem. The idea is to translate the elements of set A into jobs that are released at time 0 and to release streams of small jobs so that there is a slot of size B between consecutive streams. After the last stream of small jobs, release a stream of large jobs. See Figure 4.2 for an illustration. The optimal solution is to schedule 3 jobs in each slot according to the solution of the 3-PARTITION problem and to schedule all other jobs as soon as they arrive. Solutions with this structure are the only optimal solutions. If there is no solution for the 3-PARTITION problem, then we cannot schedule 3 jobs in all slots. If a stream of small jobs is delayed, this causes a large increase in the total flow time because there are so many of them. Similarly, if the stream of large jobs is delayed, this causes a large increase in the total flow time. If a job is run after the stream of large jobs, its flow time is very large. Observe that preemption does not help either: if a job is preempted by either a stream of small or large jobs, it completely decays before it resumes execution. $\Box$

Figure 4.2: A reduction from 3-PARTITION to a preempt-decay scheduling problem.

4.3 Comparison of the Optimal Flow Time Among the Three Models

In this section, we compare the optimal flow time among the preempt-resume model, the preempt-decay model, and the preempt-repeat model to determine whether the differences among the three models are significant. To distinguish optimal algorithms in the three models, we use POPT, $DOPT_f$, and NOPT to denote an optimal algorithm in the preempt-resume model, the preempt-decay model where f is the decay function, and the preempt-repeat model, respectively. Note that the Shortest Remaining Processing Time (SRPT) algorithm, which always runs a job with the smallest remaining processing time, is an optimal algorithm for the preempt-resume model [9].

Fact 4.3.1. Suppose f and g are decay functions. If $f(t) \le g(t)$ for all $t \ge 0$, then for any instance I, $DOPT_f(I) \le DOPT_g(I)$.

Fact 4.3.2. For any instance I and any decay function f, $POPT(I) \le DOPT_f(I) \le NOPT(I)$.

4.3.1 Preempt-Repeat and Preempt-Resume Models

The result in this section is due to Kellerer, Tautenhahn, and Woeginger [52]. We restate their result here for completeness.

Theorem 4.3.1.
For the total flow time cost metric, $NOPT(I_n)/POPT(I_n) = O(\sqrt{n})$, where $I_n$ is any input instance with n jobs, and there exists an input instance where this is tight. [52]

Proof. We omit the proof of the upper bound. The following lower bound instance is adapted from the one given in [52]:
$$r_i = \begin{cases} in & \text{for } i = 1, \ldots, n-1 \\ 0 & \text{for } i = n \end{cases} \qquad p_i = \begin{cases} 1 & \text{for } i = 1, \ldots, n-1 \\ n\sqrt{n} & \text{for } i = n \end{cases}$$

See Figure 4.3 for illustrations of the lower bound instance, the optimal schedule in the preempt-resume model, and possible schedules in the preempt-repeat model. The optimal schedule in the preempt-resume model can be described as follows. Jobs $J_1, \ldots, J_{n-1}$ are run as soon as they arrive, and they run to completion without interruption. Job $J_n$ is run as soon as it arrives, but it is always preempted by other jobs; in particular, job $J_n$ is preempted by jobs $J_1, \ldots, J_{\sqrt{n}}$. The optimal total flow time in the preempt-resume model is $O(n\sqrt{n})$, which is primarily the flow time of job $J_n$.

Figure 4.3: Lower bound instance between the preempt-repeat and preempt-resume models.

Consider a schedule in the preempt-repeat model. Without loss of generality, we assume that jobs $J_1, \ldots, J_{n-1}$ are run in that order. If job $J_n$ is run before job $J_{n-\sqrt{n}}$, it causes $\sqrt{n}$ other jobs to have an average flow time of $\Omega(n\sqrt{n})$, giving a total flow time of at least $\Omega(n^2)$. If job $J_n$ is run after job $J_{n-\sqrt{n}}$, its flow time, which is a lower bound on the total flow time, is at least $\Omega(n^2)$. Therefore, the ratio of the optimal total flow time in the preempt-repeat model to that in the preempt-resume model is $\Omega(\sqrt{n})$ for this input instance. $\Box$

4.3.2 Preempt-Repeat and Preempt-Decay Models

Corollary 4.3.1. For the total flow time cost metric, $NOPT(I_n)/DOPT_f(I_n) = O(\sqrt{n})$, where $I_n$ is any input instance with n jobs and f is any decay function, and there exists an input instance where this is tight.

Proof. The upper bound follows from Theorem 4.3.1 and Fact 4.3.2:
$$\frac{NOPT(I_n)}{DOPT_f(I_n)} \le \frac{NOPT(I_n)}{POPT(I_n)} = O(\sqrt{n}).$$

Next, we show that an instance similar to that of Kellerer et al. [52] achieves a lower bound of $\Omega(\sqrt{n})$ between the optimal costs in the preempt-repeat model and the preempt-decay model. The lower bound instance is defined as follows:
$$r_i = \begin{cases} Din & \text{for } i = 1, \ldots, n-1 \\ 0 & \text{for } i = n \end{cases} \qquad p_i = \begin{cases} 1 & \text{for } i = 1, \ldots, n-1 \\ Dn\sqrt{n} & \text{for } i = n \end{cases}$$
where $D = f(1) + 1$ and f is the fixed decay function of the preempt-decay model.

See Figure 4.4 for illustrations of the lower bound instance, the optimal schedule in the preempt-decay model, and possible schedules in the preempt-repeat model. The optimal schedule in the preempt-decay model can be described as follows. Jobs $J_1, \ldots, J_{n-1}$ are run as soon as they arrive, and they run to completion without interruption. Job $J_n$ is run as soon as it arrives, but it is always preempted by other jobs; in particular, job $J_n$ is preempted by jobs $J_1, \ldots, J_{\sqrt{n}}$. Thus,
$$DOPT_f(I_n) = O(Dn\sqrt{n}) = O((f(1)+1)\, n\sqrt{n}).$$
This is primarily the flow time of job $J_n$.
Note that even though job $J_n$ has to make up for the amount of processing that decayed while it waited for jobs $J_1, \ldots, J_{\sqrt{n}}$ to finish, job $J_n$ can finish before job $J_{\sqrt{n}+1}$ arrives.

Figure 4.4: Lower bound instance between the preempt-repeat and preempt-decay models.

Consider a schedule in the preempt-repeat model. Without loss of generality, we assume that jobs $J_1, \ldots, J_{n-1}$ are run in that order. If job $J_n$ is run before job $J_{n-\sqrt{n}}$, it causes $\sqrt{n}$ other jobs to have an average flow time of $\Omega(Dn\sqrt{n})$, giving a total flow time of at least $\Omega(Dn^2)$. If job $J_n$ is run after job $J_{n-\sqrt{n}}$, its flow time, which is a lower bound on the total flow time, is at least $\Omega(Dn^2)$. Thus, $NOPT(I_n) = \Omega(Dn^2) = \Omega((f(1)+1)\, n^2)$. Therefore, the ratio of the optimal total flow time in the preempt-repeat model to that in the preempt-decay model for this input instance is
$$\frac{NOPT(I_n)}{DOPT_f(I_n)} = \frac{\Omega(Dn^2)}{O(Dn\sqrt{n})} = \Omega(\sqrt{n}).$$
Note that the result applies as long as $f(t) \ge 0$. $\Box$

4.3.3 Preempt-Decay and Preempt-Resume Models

Corollary 4.3.2. For the total flow time cost metric, any decay function $f(t) = ut$ where u is a constant, and any input instance $I_n$ with n jobs, $DOPT_f(I_n)/POPT(I_n) = O(\sqrt{n})$; moreover, there exist input instances achieving a lower bound of $\Omega(\sqrt{n})$ when $u \ge 1$ and $\Omega(u\sqrt{n})$ when $u \le 1$.

Proof. Since for any instance $I_n$ with n jobs and any decay function f, $DOPT_f(I_n) \le NOPT(I_n)$, and from Theorem 4.3.1, $NOPT(I_n)/POPT(I_n) = O(\sqrt{n})$, we have $DOPT_f(I_n)/POPT(I_n) = O(\sqrt{n})$.

Next, we show that there exist input instances that achieve a lower bound of $\Omega(\sqrt{n})$ when $u \ge 1$ and $\Omega(u\sqrt{n})$ when $u \le 1$. Suppose $n = x + \sqrt{x}$; thus, when n is large, $n = x + \sqrt{x} \approx x$. A lower bound instance is defined as follows:
$$r_i = \begin{cases} (i-1)(u+1) & i = 1, \ldots, x \\ (i-x-1)(u+1)\sqrt{x} & i = x+1, \ldots, x+\sqrt{x} \end{cases} \qquad p_i = \begin{cases} 1 & i = 1, \ldots, x \\ u\sqrt{x} & i = x+1, \ldots, x+\sqrt{x} \end{cases}$$

See Figure 4.5 for an illustration. Tick marks in the figure indicate the completion times of big jobs.

Figure 4.5: Lower bound instance between the preempt-decay and preempt-resume models.

Notice that the total gap time of the $\sqrt{x}$ gaps between $\sqrt{x}+1$ consecutive small jobs in the input instance is exactly the length of a big job. Thus, in the preempt-resume model, for every $\sqrt{x}$ small jobs completed, one big job can be completed by running it in the gaps. The total flow time of such a schedule is $(u+1)x + x = (u+2)x$. Note that if $u \le 1$, this schedule is not optimal; an optimal schedule can be obtained by using the SRPT algorithm.

Now consider the preempt-decay model. The length of the gap between two consecutive small jobs in the input instance is exactly u times the length of a small job. Suppose we try to run big jobs in the gaps as we do in the preempt-resume model. When a big job is run in a gap, u units of work are completed. However, while the big job is suspended for 1 time unit because of a preemption by a small job, $f(1) = u$ units of the big job decay. In other words, the work done on a big job in a gap entirely decays when the big job is preempted by the small job following the gap. Thus, we can conclude that big jobs are never preempted in any reasonable schedule for this instance in the preempt-decay model.

The optimal solution in the preempt-decay model has the following form. For some integer i with $0 \le i \le \sqrt{x}$, the first i big jobs are run as soon as they arrive, and all the small jobs are run as soon as they arrive except when one of the first i big jobs is running.
All the remaining $\sqrt{x} - i$ big jobs are run after the last small job. We will show that, regardless of the value of i, the total flow time of these schedules is bounded from below by $(\frac{u}{2}+1)x\sqrt{x}$. Each of the first i big jobs causes $\sqrt{x}$ small jobs to have an average flow time of at least $(\frac{u}{2}+1)\sqrt{x}$, giving a total flow time of at least $i(\frac{u}{2}+1)x$. The last $\sqrt{x} - i$ big jobs finish after time $(u+1)x$, giving a total flow time of at least $(\sqrt{x}-i)(u+1)x$. The total flow time from both contributions is
$$i\left(\frac{u}{2}+1\right)x + (\sqrt{x}-i)(u+1)x = (u+1)x\sqrt{x} - \frac{u}{2}\, ix \ge (u+1)x\sqrt{x} - \frac{u}{2}\, x\sqrt{x} = \frac{1}{2}(u+2)x\sqrt{x} \quad \text{for } 1 \le i \le \sqrt{x}.$$
Thus, the ratio between the optimal costs in the preempt-decay and preempt-resume models is at least $\sqrt{x}/2 = \Omega(\sqrt{n})$ for this input instance. $\Box$

Corollary 4.3.2 can be generalized to a larger class of decay functions. We need the following notation. A function f dominates another function f' in the interval $[t_1, t_2]$ if $f(t) \ge f'(t)$ for $t_1 \le t \le t_2$. See Figure 4.6 for examples; in both examples, f dominates f' in the interval [0, t'].

Figure 4.6: Examples of domination between decay functions.

Corollary 4.3.3. For the total flow time cost metric, any decay function f which dominates some linear function $f'(t) = ut$, $u > 0$, in the interval [0, t'] for some $t' > 0$, and any input instance $I_n$ with n jobs, $DOPT_f(I_n)/POPT(I_n) = O(\sqrt{n})$, and there exists an input instance where this is tight.

Proof. The upper bound proof is the same as that of Corollary 4.3.2. To prove the lower bound, beginning with the lower bound instance given in Corollary 4.3.2, we linearly scale the release times and the processing times of all jobs so that no job's processing time is greater than $f'(t')$. From the fact that jobs decay faster with f than with f', and the argument in Corollary 4.3.2, it can be concluded that there should be no preemptions in any optimal schedule in the preempt-decay model. The rest of the argument is the same as in Corollary 4.3.2, and the lower bound follows. $\Box$

4.4 Approximability and Inapproximability in the Three Models

In this section, we restate some approximability and inapproximability results in the preempt-repeat model from the literature. Next, we show that the SRPT algorithm, which is optimal in the preempt-resume model, performs poorly in the preempt-decay model. The following results are due to Kellerer, Tautenhahn, and Woeginger [52].

Theorem 4.4.1. There exists an $O(n^{3/2} \log n)$-time $O(\sqrt{n})$-approximation algorithm for the minimum total flow time scheduling problem in the preempt-repeat model, and there exist instances where this bound is tight. [52]

Theorem 4.4.2. Unless P = NP, no polynomial time approximation algorithm for the minimum total flow time scheduling problem in the preempt-repeat model can have an approximation ratio of $O(n^{1/2-\varepsilon})$ for any $\varepsilon > 0$. [52]

Next, we show that the SRPT algorithm performs poorly in the preempt-decay model. We need the following notation. For any real number x, let $[x]_1 = \lceil x \rceil - 1$, and let $\{x\}_1 = x - [x]_1$. For example, $[2.71]_1 = 2$, $\{2.71\}_1 = 0.71$, $[3]_1 = 2$, and $\{3\}_1 = 1$. Note that $0 < \{x\}_1 \le 1$ and $[x]_1 + \{x\}_1 = x$ for any real number x.
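As a small aid, the $[x]_1$ and $\{x\}_1$ notation is straightforward to mirror in code; the following one-liners are purely illustrative and reproduce the examples above.

```python
import math

def floor1(x):
    """[x]_1 = ceil(x) - 1: the largest integer strictly less than x."""
    return math.ceil(x) - 1

def frac1(x):
    """{x}_1 = x - [x]_1, so that 0 < {x}_1 <= 1 always holds."""
    return x - floor1(x)

assert floor1(2.71) == 2 and abs(frac1(2.71) - 0.71) < 1e-12
assert floor1(3) == 2 and frac1(3) == 1
```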
Theorem 4.4.3. For the decay function $f(t) = ut = t/v$, where u is a constant and $v = 1/u$, there exist instances $I_n$ with n jobs such that
$$\frac{SRPT_f(I_n)}{DOPT_f(I_n)} \ge \frac{n}{3(v+4)} = \begin{cases} \Omega(n) & \text{for } u \ge 1, \\ \Omega(un) & \text{for } u \le 1. \end{cases}$$

Proof. We prove the theorem by constructing a class of instances that achieves the stated lower bound. Let $\varepsilon > 0$ be a small positive real number. Let $v' = v + \varepsilon$ and $\tilde{v} = [v']_1$. Suppose $n = c \cdot (\tilde{v} + 3)$ for some positive integer c. For convenience, jobs are labeled $J_{i,j}$, where $i = 1, \ldots, c$ and $j = 1, \ldots, \tilde{v}+3$. We will also refer to clusters of jobs: cluster i is composed of jobs $J_{i,1}, \ldots, J_{i,\tilde{v}+3}$ for $1 \le i \le c$. A lower bound instance can be described as follows; note that the $r_i$ are just dummy variables.

$$r_i = (i-1)(v+1)(v'+1) \quad \text{for } i = 1, \ldots, c+1$$
$$r_{i,j} = \begin{cases} r_i + j(v+1) - v & \text{for } i = 1, \ldots, c \text{ and } j = 1, \ldots, \tilde{v}+1 \\ r_i + [v'+1]_1(v+1) + \{v'+1\}_1 & \text{for } i = 1, \ldots, c \text{ and } j = \tilde{v}+2 \\ r_i & \text{for } i = 1, \ldots, c \text{ and } j = \tilde{v}+3 \end{cases}$$
$$p_{i,j} = \begin{cases} v & \text{for } i = 1, \ldots, c \text{ and } j = 1, \ldots, \tilde{v}+1 \\ v\{v'+1\}_1 & \text{for } i = 1, \ldots, c \text{ and } j = \tilde{v}+2 \\ v'+1 & \text{for } i = 1, \ldots, c \text{ and } j = \tilde{v}+3 \end{cases}$$

See the illustration in Figure 4.7. Shaded blocks in the top part of the figure are shown in detail in the middle and bottom parts, which show a cluster of jobs when $u \ge 1$ and $u \le 1$, respectively. Shaded job pieces signify job pieces that are not run to completion.

Figure 4.7: Lower bound instance for SRPT in the preempt-decay model.

There are c clusters of $\tilde{v}+3$ jobs. The total processing time of the jobs in each cluster is $v[v'+1]_1 + v\{v'+1\}_1 + (v'+1) = (v'+1)(v+1)$. Each cluster is also released $(v'+1)(v+1)$ time units apart. Thus, there is enough room to run all jobs in a cluster before the jobs in the next cluster arrive. This can be done by running the jobs in each cluster nonpreemptively and contiguously in the order of their release times. In fact, the whole optimal schedule for this instance is obtained by running all jobs nonpreemptively and contiguously in the order of their release times. The completion times and the flow times of jobs in the optimal schedule are as follows (each for $i = 1, \ldots, c$):

$$C_{i,j}^{DOPT_f} = \begin{cases} r_i + (v'+1) + jv & \text{for } j = 1, \ldots, \tilde{v}+1 \\ r_{i+1} & \text{for } j = \tilde{v}+2 \\ r_i + (v'+1) & \text{for } j = \tilde{v}+3 \end{cases} \qquad F_{i,j}^{DOPT_f} = \begin{cases} v'+1+v-j & \text{for } j = 1, \ldots, \tilde{v}+1 \\ v\{v'+1\}_1 & \text{for } j = \tilde{v}+2 \\ v'+1 & \text{for } j = \tilde{v}+3 \end{cases}$$

Thus, the optimal total flow time is
$$DOPT_f(I) = c\left((v'+1) + v\{v'+1\}_1 + \sum_{j=1}^{[v'+1]_1} (v'+1+v-j)\right) = c\left((v'+1) + v\{v'+1\}_1 + v'[v'+1]_1 + [v'+1]_1 + v[v'+1]_1 - \tfrac{1}{2}[v'+1]_1 [v'+2]_1\right) \le c(v'+1)\left(1 + v + \tfrac{1}{2}[v'+1]_1\right).$$

In the SRPT schedule, all jobs except the first job of each cluster (jobs $J_{i,\tilde{v}+3}$ for $i = 1, \ldots, c$) are run as soon as they arrive, and they run to completion without interruption. In contrast, the first job of each cluster never completes its execution before time $r_{c+1}$. This can be explained as follows. Consider the input instance with the jobs $J_{i,\tilde{v}+3}$, $1 \le i \le c$, removed. Notice that the length of the gap in front of any job is exactly $1/v$ times the length of that job. For example, job $J_{1,1}$ has length v, and the gap in front of it has length 1. For another example, job $J_{1,\tilde{v}+2}$ has length $v\{v'+1\}_1$, and the gap in front of it has length $\{v'+1\}_1$.
Consider jobs $J_{1,\tilde{v}+3}$ and $J_{1,1}$. Job $J_{1,\tilde{v}+3}$ is run from time 0 to time 1. At time 1, job $J_{1,1}$ arrives. Since $J_{1,1}$ has a shorter remaining time than $J_{1,\tilde{v}+3}$, job $J_{1,\tilde{v}+3}$ is preempted. At time $1+v$, job $J_{1,1}$ completes. By the argument above, the work done on job $J_{1,\tilde{v}+3}$ has by then entirely decayed, and job $J_{1,\tilde{v}+3}$ has to start anew. By repeating this argument, the jobs $J_{i,\tilde{v}+3}$ for $i = 1, \ldots, c$ never complete their execution until all other jobs have completed. After all other jobs have completed (at time $r_{c+1}$), SRPT runs the jobs $J_{i,\tilde{v}+3}$, $1 \le i \le c$, nonpreemptively in an arbitrary order. The flow times of jobs in the SRPT schedule are as follows:
$$F_{i,j}^{SRPT} = p_{i,j} = \begin{cases} v & \text{for } i = 1, \ldots, c \text{ and } j = 1, \ldots, \tilde{v}+1 \\ v\{v'+1\}_1 & \text{for } i = 1, \ldots, c \text{ and } j = \tilde{v}+2 \end{cases}$$
and
$$\sum_{i=1}^{c} F_{i,\tilde{v}+3}^{SRPT} = \sum_{i=1}^{c} \left((c-i+1)(v+1)(v'+1) + i(v'+1)\right) = (v+2)(v'+1)\,\frac{c(c+1)}{2}.$$
Thus,
$$SRPT_f(I) = c\left(v[v'+1]_1 + v\{v'+1\}_1\right) + (v+2)(v'+1)\,\frac{c(c+1)}{2} = cv(v'+1) + (v+2)(v'+1)\,\frac{c(c+1)}{2} = c(v'+1)\left(v + \frac{c+1}{2}(v+2)\right).$$

Thus, the ratio of the total flow time of the SRPT schedule to that of the optimal schedule for this input instance is
$$\frac{SRPT_f(I)}{DOPT_f(I)} \ge \frac{c(v'+1)\left(v + \frac{c+1}{2}(v+2)\right)}{c(v'+1)\left(1 + v + \frac{1}{2}[v'+1]_1\right)} \ge \frac{(c+1)(v+2)}{2 + 2v + [v'+1]_1} \ge \frac{n(v+2)}{(3v+4)(v+4)} \ge \frac{n}{3(v+4)} = \frac{nu}{3(4u+1)},$$
which is $\Omega(n)$ for $u \ge 1$ and $\Omega(un)$ for $u \le 1$. $\Box$

4.5 Open Problems in Preempt-Decay Scheduling

The results in Section 4.3 show that the optimal flow times in the preempt-resume model and in the preempt-decay and preempt-repeat models can be a factor of $\Theta(\sqrt{n})$ apart. Corollary 4.3.1 tells us that simply using an optimal algorithm for the preempt-repeat model to approximate a preempt-decay schedule can result in a factor of $\Omega(\sqrt{n})$ above the optimal flow time. Also, Theorem 4.4.3 tells us that simply using the SRPT algorithm, an optimal algorithm for the preempt-resume model, to approximate a preempt-decay schedule can result in a factor of $\Omega(n)$ above the optimal flow time. These cost differences are significant: the preempt-decay model cannot be approximated well by either of the two existing models. Therefore, this problem deserves further study. A logical next step is to find a polynomial-time approximation algorithm with a performance guarantee better than $O(\sqrt{n})$, or to prove that none exists (unless P = NP).

To the best of our knowledge, there are no previous studies of the preempt-decay model, and we study only one particular setting. The preempt-decay model can be applied to many other settings, such as problems with due dates or with multiple machines. Examples are $1|r_j, decay|C_{\max}$, $1|r_j, decay|\sum(1-U_j)$, and $P|r_j, decay|\sum F_j$.

Chapter 5

The k-Client Problem

5.1 Introduction

In the basic k-client problem, there is one server, k clients, and a metric space (e.g., a plane or a line). Each client generates a sequence of requests in the metric space, and a request is serviced the moment the server, which moves at constant speed, reaches the location of the request; that is, we assume zero processing time for all requests. We define a move of the server to be a non-zero-distance movement that takes it from one request to a second request with no intervening requests.

An input instance I of the basic k-client problem has clients 1 through k, where client i for $1 \le i \le k$ has $n_i$ requests. We define $n_{\max}(I) = \max_{1 \le i \le k} n_i$ and $n(I) = \sum_{1 \le i \le k} n_i$. We represent I by the set of requests $\{r_{i,j}\}$, where $r_{i,j}$ is the jth request of client i for $1 \le i \le k$ and $0 \le j \le n_i$ (for notational convenience, we assume that each client has a dummy request $r_{i,0}$ for $1 \le i \le k$ located at the initial server position).
At any time, each client has at most one request in the system; more specifically, request $r_{i,j+1}$ arrives exactly when $r_{i,j}$ has been serviced, for $1 \le i \le k$ and $0 \le j \le n_i - 1$. Thus, for $1 \le i \le k$, $r_{i,0}$, the dummy request of each client, is serviced at time 0, and $r_{i,1}$, the first real request of each client, arrives at time 0. Because of our zero-processing-time assumption, we can assume without loss of generality that consecutive requests of a client are not located at the same point.

An algorithm A solves a specific input instance I by computing a schedule A(I) of server movement. We evaluate the quality of A(I) using two cost functions. We first consider a server-oriented cost function which measures the "length" of A(I): the total distance cost function measures the total distance moved by the server. The total distance cost function is used in the k-server problem. We next consider a client-oriented cost function which focuses on the quality of service provided to each client: the average completion time cost function measures the average completion time of the clients. For any algorithm A and any input instance I, we use $A^D(I)$ to denote the total distance moved in schedule A(I) and $A^{ACT}(I)$ to denote the average completion time of all clients in schedule A(I). For any cost function CF, the competitive ratio $c_A^{CF}$ of an on-line algorithm A is defined as
$$c_A^{CF} = \sup_I \frac{A^{CF}(I)}{OPT^{CF}(I)},$$
where OPT denotes the optimal off-line algorithm. When the cost function is not ambiguous, we drop the CF superscript. That is, we abuse notation by using A(I) to represent both the schedule produced by A and its cost under the given cost function, and we use $c_A$ to represent A's competitive ratio under the given cost function.

To simplify later proofs, we define the following notation to represent some commonly used features of an input instance I. Let $I_i \subseteq I$ denote the input instance consisting only of the requests of client i of I. Let $D_i(I) = OPT^D(I_i)$; that is, the minimum distance the server must move to service only client i. Let $D(I) = \sum_{i=1}^{k} D_i(I)$ and $D_{\max}(I) = \max_{1 \le i \le k} D_i(I)$. Let $d_{i,j}(I) = \delta(r_{i,j}, r_{i,j+1})(I)$ denote the distance between the jth and (j+1)st requests of client i of I. Clearly, $D_i(I) = \sum_{j=0}^{n_i - 1} d_{i,j}(I)$. We will typically omit I when the input instance is not ambiguous.

One of the goals of this work is to study the performance of several commonly used disk scheduling algorithms in the context of multithreaded systems. In particular, we wish to determine how the performance of these algorithms is related to the number of threads in the system. There are two main types of algorithms: greedy algorithms, which seek to optimize some short-term objective, and "fair" algorithms, which seek to ensure that all threads receive service in a reasonably timely fashion.

We first define two commonly studied greedy algorithms. The Shortest Distance First (SDF) algorithm moves the server to the nearest request. In disk scheduling, this algorithm is often referred to as Shortest Seek Time First. The Sequential (SEQ) algorithm services all the requests of one client, then services all the requests of a second client, and so on. Note that SEQ will service requests from other clients if it happens to pass over them while servicing the current client.

Our upper bound results apply to any metric space. For lower bound proofs, we consider two primary metric spaces: line(∞) and $K_i$. The first metric space, line(∞), represents an unbounded continuous line. The $K_i$ metric space is a clique with i nodes in which each node is at distance 1 from every other node.
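To make the SDF rule concrete, here is a minimal sketch (not from the dissertation) of SDF on the line metric under the request semantics just described; the function name, the tie-breaking rule, and the example positions are assumptions for illustration only.

```python
def sdf_total_distance(clients, start=0.0):
    """Shortest Distance First (SDF) on the line: repeatedly move the server
    to the nearest pending request.  clients[i] lists the positions of client
    i's requests; request j+1 arrives the moment request j is serviced.
    Returns the total distance moved (the A^D cost)."""
    nxt = [0] * len(clients)                 # index of each client's pending request
    pos, dist = start, 0.0
    pending = lambda i: nxt[i] < len(clients[i])
    while any(pending(i) for i in range(len(clients))):
        i = min((i for i in range(len(clients)) if pending(i)),
                key=lambda i: abs(clients[i][nxt[i]] - pos))   # ties: lowest index
        dist += abs(clients[i][nxt[i]] - pos)
        pos = clients[i][nxt[i]]
        for j in range(len(clients)):        # service every pending request here
            if pending(j) and clients[j][nxt[j]] == pos:
                nxt[j] += 1
    return dist

# Two clients on the line: SDF drifts left, then pays to come back.
print(sdf_total_distance([[-1, -2, -3], [1]]))   # 7, versus the optimal 5
```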
Surprisingly, nearly all of our results are identical for these two different cost functions. The summary is shown in Table 5.1. In Section 5.2, we prove an upper bound of 2k−1 for the average completion time cost function; we show that the SDF and SEQ algorithms are (2k−1)-competitive in any metric space for the average completion time cost function, and this bound is tight for any metric space containing line(∞). In Section 5.3, we prove lower bounds of $\frac{\lg k}{2} + 1$ for the total distance and average completion time cost functions when the metric space is $K_\infty$. In Section 5.4.1, we prove lower bounds of 9/5 for both the total distance and average completion time cost functions when k = 2 and $K_\infty$ is the metric space. In Section 5.4.2, we show that when k = 2, the lower bound for the average completion time cost function improves to 3 for any metric space containing line(∞). In Section 5.5, we discuss some open questions regarding the k-client problem.

k clients:
Cost Function            | Upper Bound | Lower Bound (K_∞) | Lower Bound (line(∞))
total distance           | 2k−1 [4]    | (lg k)/2 + 1      | (lg k)/2 + 1 [4]
average completion time  | 2k−1        | (lg k)/2 + 1      | (lg k)/2 + 1 [4]

2 clients:
Cost Function            | Upper Bound | Lower Bound (K_∞) | Lower Bound (line(∞))
total distance           | 3 [4]       | 9/5               | 25/9 [4]
average completion time  | 3           | 9/5               | 3

Table 5.1: Summary of results for the k-client problem.

5.2 Upper Bounds

In this section, we prove that the greedy SDF and SEQ algorithms are (2k−1)-competitive for the average completion time cost function in any metric space. We also show that this bound is tight for any metric space containing line(∞). We begin by proving a basic fact about the optimal off-line algorithm.

Fact 5.2.1. For all input instances I, $OPT^{ACT}(I) \ge \frac{D(I)}{k}$.

Proof. Consider any input instance I with k clients. For $1 \le i \le k$, OPT clearly cannot complete client i before time $D_i(I)$, the time it takes to complete client i if client i is the only client. Therefore, $OPT^{ACT}(I)$, the average completion time incurred by OPT, is lower bounded by $\frac{1}{k}\sum_{i=1}^{k} D_i(I) = \frac{D(I)}{k}$. $\Box$

5.2.1 Upper Bounds for the Total Distance Cost Function

For completeness, we include some upper bound results from [4].

Theorem 5.2.1. [4] For the total distance cost function, $c_{SDF} = 2k - 1$.

Theorem 5.2.2. [4] For the total distance cost function, $c_{SEQ} = 2k - 1$.

The lower bound instance against both SDF and SEQ is illustrated in Figure 5.1 [4]. The optimal solution for this instance is to move the server to the far right to point k−1 and then to the far left to point $-n_1$, resulting in a cost of $n_1 + (2k-2)$. Assuming ties are broken to SDF's detriment, SDF will move left to service all of the requests of client 1, then return to the right and service all of the requests of client 2, and so on, resulting in a total cost of $(2k-1)n_1 + O(k^2)$. For any $\varepsilon > 0$, there exists an $n_1$ such that the competitive ratio exceeds $2k-1-\varepsilon$. Note that this input instance can easily be adjusted to eliminate ties.

Figure 5.1: Lower bound instance for SDF and SEQ with the total distance cost function. (Client 1's requests lie at −1, −2, ..., $-n_1$; the requests of clients 2 through k are clustered near positions 1 through k−1 to the right of the origin.)

5.2.2 Upper Bounds for the Average Completion Time Cost Function

Theorem 5.2.3. For the average completion time cost function, $c_{SDF}^{ACT} = 2k - 1$.
Without loss of generality, we assume the clients are labeled according to the order that SD}- will service their last requests. That is, the completion time of client i is no greater than the completion time Of client j in SDf(I) for 1 5 i g j g k. We bound SDfACTU) by bounding the cost Of each move Of SDf. Every move begins from a request rid for 1 S i S k and 0 S j S n,-. If a move starts from request rm- where j < n,, i.e. rm- is not a terminal request of client i, then this move adds a cost of at most did- to each unfinished client’s completion time. This is true because either the server moves to request rig-+1, or it moves to a closer request from another client. Clearly such a move adds a cost of at most digs to the average completion time. The total cost of moves starting from nonterminal requests of all clients is at most I: ng-l k ZstFZDz-w i=1 j=0 i=1 If a move starts from request rm“ the terminal request Of client i, then the cost of the move can be upper bounded by (L?) (D,- + minKJ-Sk Dj). This is true for the following reasons. Clients 1 to i have finished. There are only (It — i) unfinished clients. Consider the unfinished client j > i such that Dj is minimized. Consider the distance between request rm, and the current request Of client j. This distance is no more than D,- + DJ- because in the worst case, the server must move from rm, back tO 77,0 = 73.0 and then to the current request of client j. Since SDF moves to the closest request, the distance it moves in this case can be no larger than D,- +Dj. Thus the completion time of the at most (k — i) unfinished clients is increased by at most D,- + DJ- and the average completion time is increased by at most (I?) (D, + Dj). 83 The total cost Of moves starting from terminal requests Of all clients is at most k-i k-i D- . ' . <: . __1 2(16 )(Dz‘l'igllenkDJ) _ Z (( k )DI'I'Z k) 199: ls lsjsk The cost of moves of SD? starting from nonterminal requests is at most D. The cost of moves of SD? starting from terminal requests is at most (k?) D. The combined cost is at most (&k‘—l) D. Thus, from Fact 5.2.1, SD?ACT(I) g (2k — 1)(9’PTACTU). Note this upper bound is true for any metric space. This bound is tight in any metric space that includes line(oo). More specif- ically, c313,: > 2k — 1 — c for all e > 0 on line(oo). The lower bound instance is illustrated in Figure 5.2. Note that client 1 has several requests. All other clients have exactly one request. Assuming that ties are broken to SD? ’5 detriment, SD? will first move to the left and service all requests of client 1. It will then service the requests of the remaining clients. The completion time Of client 1 will be n1. The completion time of client j will be 2m + j — 1. The average completion time is %((2k — 1)n1 + 008)). The optimal solution is to move all the way to the right first. The Optimal average completion time is %(n1 + O(k)). For any 6 > 0, there exists an n1 such that the competitive ratio will exceed 2k - 1 — 6. Note that this input 84 instance can easily be adjusted to eliminate ties. El 7‘1,n1 '" 7‘1,2 7‘1,1 7‘2,1 73,1 Tic—1,1 Tim I l l L m 1 1 l l I I I I \U I I _____ —n1 -—3 -2 —1 0 l 2 k - 2 k — 1 Figure 5.2: Lower bound instance for SD? and SEQ with the average completion time cost function. Theorem 5.2.4. For the average completion time cost function, c559 = 2k — 1. Proof. We bound SE QACTU ) by bounding the completion time Of each client of I. For 1 g i S k, the completion time Of client i is at most D,- + 2;:12Dj. 
This bound assumes a worst-case scenario in which the server must service client 1, return to its initial position, service client 2, return to its initial position, and so on, before finally servicing client i. Thus, we obtain the following upper bound on $SEQ^{ACT}(I)$:
$$SEQ^{ACT}(I) \le \frac{1}{k} \sum_{i=1}^{k} \left(D_i + 2\sum_{j=1}^{i-1} D_j\right) = \frac{1}{k} \sum_{i=1}^{k} \left(2(k-i)+1\right) D_i \le \frac{2k-1}{k}\, D.$$
From Fact 5.2.1 we can conclude that $SEQ^{ACT}(I) \le (2k-1)\, OPT^{ACT}(I)$. Again, this upper bound holds for any metric space [4], and the bound is tight in any metric space that includes line(∞), as can be seen from the same lower bound input instance used for SDF (Figure 5.2). $\Box$

5.3 General Lower Bounds

We now prove that no on-line algorithm has a competitive ratio better than $\frac{\lg k}{2} + 1$ for either the total distance cost function or the average completion time cost function when k is a power of 2. These lower bounds drop to $\frac{\lg k}{2}$ when k is not a power of 2. These results apply in clique metric spaces.

This section is organized as follows. In Section 5.3.1, we define a restricted adversary strategy that is used in the proofs of both lower bounds. We then prove some basic properties about this adversary in Section 5.3.2. We prove the actual lower bound results in Sections 5.3.3 and 5.3.4. In Section 5.3.5, we state some results from [4] which translate these lower bounds to line metric spaces.

5.3.1 Definition of Adversary Strategy A(N, k)

The k-client problem can be described as a game between the adversary and the on-line algorithm as follows. The adversary begins the game by generating a single request for each of the k clients in the metric space. The on-line algorithm then responds by moving the server to a location where at least one request resides and servicing all requests at that position. Consider any client whose request has just been serviced. The adversary must respond by either generating another request for this client or informing the on-line algorithm that this client has no more requests. The game continues in this fashion until the on-line algorithm has been informed that all clients have no more requests.

We define an adversary strategy A(N, k) that is parameterized by two integers N and k, both of which must be at least 1. It utilizes a clique with Nk+1 nodes, partitioned into N subcliques of size k plus an extra node which is the initial server position. We number the subcliques from 1 to N. On a first reading, it is helpful to focus on the adversary strategy A(1, k), which uses only subclique 1.

Before we can define adversary A(N, k), we first define the following notation and concepts. For the remainder of this section, when we say "at any time during the game" or "at the end of the game", it is understood that the game is taking place between adversary strategy A(N, k) and an arbitrary on-line algorithm A.

Definition 5.3.1. At any time during the game, let $\sigma_i$ be the sequence of positions occupied by the requests of client i in the clique of size Nk. Define $\sigma_i(h)$ to be the sequence of positions occupied by the requests of client i in subclique h. Define I(h) to be the restricted input instance in which client i has only the sequence of requests $\sigma_i(h)$, for $1 \le i \le k$.

Definition 5.3.2. At any time during the game, for $1 \le i \le k$ and $1 \le h \le N$, client i has h-length n if $\sigma_i(h)$ currently has length n. Note that the h-length of a client may increase over time as the adversary continues to generate requests for client i in subclique h.

Definition 5.3.3.
Definition 5.3.3. At any time during the game, client $i$ subsumes client $j$ in subclique $h$ if $\sigma_j(h)$ is a proper subsequence of $\sigma_i(h)$.

Definition 5.3.4. Let $\phi \subseteq \{1, \ldots, k\}$. Then $I(h)_\phi$ is the restricted input instance consisting only of the clients in $\phi$ and their requests that occur in subclique $h$.

Fact 5.3.1. At any time during the game, $OPT(I(h)_\phi) = D_i(I(h))$ if $i \in \phi$ and, for all $j \in \phi$, $i$ subsumes $j$ in subclique $h$.

Proof. The cost of servicing all of the requests of client $i$ in subclique $h$ is $D_i(I(h))$. In order to service client $i$ in subclique $h$, the server must visit all the locations in $\sigma_i(h)$ in order. It follows that the server will also visit all of the locations of $\sigma_j(h)$ in order for any client $j$ subsumed by $i$ in subclique $h$. $\Box$

Definition 5.3.5. At any time during the game, client $i$ covers client $j$ in subclique $h$ if client $i$ subsumes client $j$ in subclique $h$ and there is no client $l$ that is simultaneously subsumed by client $i$ in subclique $h$ and subsumes client $j$ in subclique $h$. A client $j$ is uncovered in subclique $h$ if there is no client $i$ such that $i$ covers $j$ in subclique $h$.

We now define the notion of a client $i$ being dead/alive/critically alive in subclique $h$.

Definition 5.3.6. At any time during the game, for $1 \le i \le k$ and $1 \le h \le N$:
- Let $l_i$ be the most recent request of client $i$ that has appeared. Since the game begins with the adversary generating one request for each client, $l_i$ must be defined.
- Let $c_i$ be the subclique in which $l_i$ appears.
- Client $i$ is dead if $l_i$ has been serviced, and the adversary has made known to the on-line algorithm that client $i$ has no more requests.
- Client $i$ is alive if it is not dead. An alive client lives in subclique $h$ if $c_i = h$. Note that if $c_i = h$, it is not always the case that client $i$ lives in subclique $h$ because client $i$ may be dead.
- Client $i$ is critically alive if it is alive, request $l_i$ has just been serviced, and the adversary has not yet responded.

Definition 5.3.7. At any time during the game, client $i$ is graftable in subclique $h$ if it has been planted in subclique $h$, it is uncovered in subclique $h$, and it no longer lives in subclique $h$.

Definition 5.3.8. At any time during the game, if we focus on a single subclique $h$, our adversary is restricted to making only the following three types of moves.
- The adversary can plant client $i$ in subclique $h$ by placing the next request of client $i$ on a node in subclique $h$ that has not yet been the location of any other request.
- The adversary can terminate a critically alive client $i$ which lives in subclique $h$ by informing the on-line algorithm that client $i$ has no more requests in subclique $h$. Note, the adversary may or may not choose to plant the next request of client $i$ in subclique $h+1$, for $1 \le h \le N-1$.
- The adversary can concatenate a graftable client $j$ in subclique $h$ with h-length $n$ to a critically alive client $i$ in subclique $h$ with the same h-length by making the $(n+l)$-th request of client $i$ in subclique $h$ be on the same node as the $l$-th request of client $j$ in subclique $h$, for $l = 1, \ldots, n$. Note that $j$ is no longer graftable in subclique $h$.
Definition 5.3.9. Adversary strategy A(N, k) is defined as follows.
1. The adversary begins the game by planting clients 1 through $k$ in subclique 1.
2. In each later turn, the adversary responds only if there is a critically alive client $i$. If there is such a client $i$, let $h$ be the subclique client $i$ lives in, and let $n$ be the h-length of client $i$. The adversary responds as follows.
(a) If there is a graftable client $j$ in subclique $h$ with h-length $n$, the adversary concatenates client $j$ to client $i$.
(b) If there is no graftable client $j$ in subclique $h$ with h-length $n$, A(N, k) responds as follows.
i. The adversary terminates client $i$ in subclique $h$ (making client $i$ graftable in subclique $h$).
ii. If A has serviced at most $N - 1$ requests, the adversary plants client $i$ into subclique $h + 1$. Otherwise, client $i$ is dead.

5.3.2 Properties of the Adversary Strategy

We now prove some properties about the adversary strategy A(N, k).

Lemma 5.3.1. For $1 \le i \le k$, $1 \le h \le N$, and at any time during the game, the h-length of client $i$ is 0 or is $2^l$ for some integer $l \ge 0$.

Proof. We first observe that the adversary can only increase the h-length of a client using the planting or concatenation operations. After planting, client $i$ has h-length $1 = 2^0$. Assuming the lemma holds before a concatenation occurs, it must also hold after the concatenation since concatenation can only occur between two clients of equal h-length. $\Box$

Lemma 5.3.2. For $1 \le i \le k$, $1 \le h \le N$, and at any time during the game, client $i$ is covered by at most one other client in subclique $h$.

Proof. We first observe that when a client is planted in subclique $h$, it is not covered since its single request is placed on an otherwise unoccupied node $v(i)$ of subclique $h$. Afterwards, no other client will contain request $v(i)$ unless client $i$ is concatenated to the end of some other client $j$. Thus, the only client which can cover client $i$ at this time is client $j$, and client $j$ may not exist. Assuming such a client $j$ does exist, client $i$ is no longer graftable, which means that the only clients which can contain node $v(i)$ are those which subsequently subsume client $j$. However, by the definition of cover, these clients will not cover client $i$, and the result follows. $\Box$

To help understand the adversary strategy, it is useful to consider the graphical representation of the covers relation in each of the subcliques.

Definition 5.3.10. For $1 \le h \le N$ and any time during the game, the cover graph $H_h$ of subclique $h$ is a directed graph with $k$ nodes labeled 1 through $k$. An arc exists from node $i$ to node $j$ in $H_h$ if client $i$ covers client $j$ in subclique $h$.

In the general case, the $N$ cover graphs may not help convey the structure of the adversary strategy and the input instance, but for A(N, k), the $N$ cover graphs do help. In particular, our adversary A(N, k) will construct an input instance such that at any time during the game, each $H_h$ will be a collection of directed binomial trees; furthermore, at the end of the game, each $H_h$ will be a directed binomial heap [83, 24].

Corollary 5.3.1. For $1 \le h \le N$ and any time during the game, the cover graph $H_h$ generated by adversary A(N, k) will consist of a set of directed trees.

Proof. This follows immediately from Lemma 5.3.2. $\Box$

Definition 5.3.11. At any time during the game, let $T$ be a tree in $H_h$. Define $C(T)$ to be the set of clients which have nodes in tree $T$, and define the root client $r(T)$ to be the client corresponding to the root node of $T$.

Lemma 5.3.3. For $1 \le h \le N$, $1 \le i \le k$, and any time during the game, the following statements are true.
1. $H_h$ consists of a collection of directed binomial trees [83, 24].
2. If client $i$ has h-length $2^l > 0$ for some integer $l$, then the maximal subtree in $H_h$ rooted at node $i$ is a directed binomial tree with height $l$ containing $2^l$ nodes.
3. For any tree $T \in H_h$, there is at most 1 client in $C(T)$ which lives in subclique $h$.
Furthermore, if there is such a client, it is the root client $r(T)$.
4. If client $i$ is in $C(T)$ and client $j$ is in $C(T')$ where $T \ne T'$ are both trees in $H_h$, then $\sigma_i(h)$ and $\sigma_j(h)$ occupy disjoint sets of nodes in subclique $h$.

Proof. Most of these properties follow trivially from the definition of our adversary, the three operations our adversary can perform, the k-client game, and the definition of binomial trees. The key observation is that when client $j$ is concatenated to client $i$ in subclique $h$, the only change to $H_h$ is the addition of an arc from node $i$ to node $j$. $\Box$

Corollary 5.3.2. At any time during the game, the on-line algorithm A can service at most one request per turn.

Proof. The proof follows from properties 3 and 4 of Lemma 5.3.3. $\Box$

Lemma 5.3.4. For $1 \le h \le N$, $1 \le i \le k$, and any time during the game, there is at most 1 graftable client with h-length $2^j$ in subclique $h$ for $j \ge 0$.

Proof. Fix $j$ and $h$. Initially, there are no graftable clients in subclique $h$, so there are no graftable clients with h-length $2^j$ in subclique $h$. The termination step of the adversary is the only place where the number of graftable clients in subclique $h$ can increase. However, for the termination step to be executed, the number of graftable clients with h-length $2^j$ must be 0. Otherwise, a concatenation would have been performed instead. Furthermore, after the execution of this step, the number of graftable clients with h-length $2^j$ is exactly 1. Therefore, the result follows. $\Box$

Definition 5.3.12. Let $B(n)$ be the set of 1-valued bits of the binary representation of $n$, where bit 0 is the least significant bit.

For example, $B(15) = B(1111_2) = \{3, 2, 1, 0\}$ and $B(16) = B(10000_2) = \{4\}$. Note that
$$\left\lfloor n/2^b \right\rfloor \bmod 2 = \begin{cases} 0 & \text{if } b \notin B(n) \\ 1 & \text{if } b \in B(n). \end{cases}$$
For example, $\lfloor 1000100_2/2^2 \rfloor \bmod 2 = 1$ since $B(1000100_2) = \{6, 2\} \ni 2$, and $\lfloor 1101111_2/2^4 \rfloor \bmod 2 = 0$ since $B(1101111_2) = \{6, 5, 3, 2, 1, 0\} \not\ni 4$.
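The following lines (ours, not from the dissertation) check Definition 5.3.12 and the mod-2 observation on the two examples just given.

# A small check (ours) of Definition 5.3.12: B(n) is the set of 1-bits of n.
def B(n):
    return {b for b in range(n.bit_length()) if (n >> b) & 1}

assert B(0b1111) == {3, 2, 1, 0} and B(0b10000) == {4}
# floor(n / 2^b) mod 2 is 1 exactly when b is in B(n):
assert (0b1000100 // 2**2) % 2 == 1 and (0b1101111 // 2**4) % 2 == 0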
Lemma 5.3.5. For any subclique $h$ and any $i \ge 1$, if there are exactly $i$ clients with h-length greater than 0 at the end of the game, then
1. there is exactly 1 uncovered client with h-length $2^b$ in subclique $h$ for each $b \in B(i)$, and
2. there are no uncovered clients with h-length $2^b$ in subclique $h$ for $b \notin B(i)$;
that is, the underlying undirected graph of $H_h$ is a binomial heap [83, 24] with $i$ nodes.

Proof. Fix $i$ and $h$. Let $x_b$ be the number of alive clients with h-length $2^b$ created by the adversary during the course of its operation. Let $y_b$ be the number of uncovered clients with h-length $2^b$ left when the adversary terminates its operation. From the operation of the adversary, in particular steps (2.a) and (2.b), it follows that for any nonnegative integer $b$,
$$x_b = \lfloor i/2^b \rfloor \quad \text{and} \quad y_b = \begin{cases} 0 & \text{if } b \notin B(i) \\ 1 & \text{if } b \in B(i). \end{cases}$$
Thus, the result follows. $\Box$

Lemma 5.3.6. For any subclique $h$ and $1 \le i \le k$, if there are exactly $i$ clients with h-length greater than 0 at the end of the game, then the number of requests in subclique $h$ generated by A(N, k) is
$$\sum_{b \in B(i)} 2^{b-1}(b+2) \;\; \begin{cases} = \frac{i}{2}\lg i + i & \text{if } i \text{ is a power of 2} \\ \ge \frac{i}{2}\lg i & \text{otherwise.} \end{cases}$$

Proof. From Lemma 5.3.3, for $1 \le h \le N$, we can determine the number of requests generated in subclique $h$ at the end of the game by summing the number of nodes in the binomial subtree rooted at each node in the final cover graph $H_h$. From Lemma 5.3.5, $H_h$ will contain exactly $|B(i)|$ disjoint binomial trees, one of size $2^b$ for each $b \in B(i)$. Fix $b \in B(i)$. Applying well-known properties of binomial trees, the binomial tree of size $2^b$ in $H_h$ will contain $2^{b-1}/2^j$ nodes which are the roots of binomial subtrees of size $2^j$, for $j = 0, \ldots, b-1$. Thus, there are a total of $2^{b-1}(b+2)$ requests represented by this binomial tree. Therefore, the number of requests represented by all binomial trees in $H_h$ is $\sum_{b \in B(i)} 2^{b-1}(b+2)$. If $i$ is a power of 2, then $i = 2^g$ for some integer $g \ge 0$. Thus, there are $2^{g-1}(g+2)$ requests, which is $\frac{i}{2}\lg i + i$. If $i$ is not a power of 2, then from Lemma A.2, the number of requests is lower bounded by $\frac{i}{2}\lg i$. $\Box$

We will be particularly interested in the input instances generated by adversary A(1, k) where $k$ is a power of 2. These input instances have the following characteristics. The initial request of each of the $k$ clients is on a different node, no client has more than one request on any node, and there are no requests on node 0. There is a single client with $k$ requests, and for each $i$, $0 \le i \le (\lg k) - 1$, there are $k/2^{i+1}$ clients with $2^i$ requests. Figure 5.3 shows a possible input instance and its corresponding cover graph for $k = 8$. Note, all edges in the clique graph of size 9 have been removed. Also, the nodes have been displayed in a linear fashion to emphasize the structure of the clients.

Figure 5.3: An example A(1, 8) lower bound instance and its cover graph. (The instance has one client with 8 requests, one with 4, two with 2, and four with 1.)

5.3.3 General Lower Bound for the Total Distance Cost Function

Theorem 5.3.1. For the total distance cost function and any on-line algorithm A,
$$c^{TD}_{A} \ge \begin{cases} \frac{\lg k}{2} + 1 & \text{if } k \text{ is a power of 2} \\ \frac{\lg k}{2} & \text{otherwise} \end{cases}$$
on the $K_{k+1}$ metric space.

Proof. We use adversary A(1, k) to prove this theorem. In particular, we use $K_{k+1}$, which contains only one subclique. In this case, the adversary description can be simplified by making clause (2.b) simply terminate the client with no option to plant the client in the next subclique. We first bound the on-line cost incurred by A. From Corollary 5.3.2, the cost incurred by A will be the number of requests in the input instance. From Lemma 5.3.6, there are a total of $\frac{k}{2}\lg k + k$ requests if $k$ is a power of 2. If $k$ is not a power of 2, the number of requests is lower bounded by $\frac{k}{2}\lg k$. We now show the optimal off-line cost is $k$. At the end of the game, $H_1$ consists of a collection of disjoint binomial trees. Let $T$ be a tree in $H_1$ at the end of the game. The root client $r(T)$ subsumes all clients in $C(T)$. Thus, from Fact 5.3.1, the optimal off-line algorithm can service all requests of all clients in $C(T)$ by servicing the requests of $r(T)$ in order. If the optimal off-line algorithm does this for each tree in $H_1$, it will service all requests by visiting each node of subclique 1 exactly once for a total cost of $k$. No algorithm can do better than this since there are $k$ distinct positions containing requests, so the off-line cost follows. Dividing the on-line cost by the off-line cost, we get the final result. $\Box$
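To see Lemma 5.3.6 and Theorem 5.3.1 in action, the following sketch (ours, not from the dissertation) simulates adversary A(1, k), abstracting the game down to the plant/terminate/concatenate moves, and counts the requests generated. Since, by Corollary 5.3.2, the on-line algorithm services one request per turn, this count is exactly the on-line cost.

# A simulation (ours) of adversary A(1, k): every critically alive client of
# h-length n is either grafted onto by a waiting client of equal h-length,
# or terminated and left graftable. Lemma 5.3.4 guarantees at most one
# graftable client per h-length, so a 0/1 table suffices.

def requests_generated(k):
    alive = [1] * k          # h-lengths of the k clients planted in subclique 1
    graftable = {}           # h-length -> 0 or 1 graftable client
    requests = k             # the k initial plantings
    while alive:
        n = alive.pop()      # a critically alive client of h-length n
        if graftable.get(n, 0) == 1:
            graftable[n] = 0         # concatenate: replay the grafted
            requests += n            #   client's n requests ...
            alive.append(2 * n)      #   ... doubling the h-length
        else:
            graftable[n] = 1         # terminate: client becomes graftable
    return requests

for g in range(1, 7):
    k = 2 ** g
    assert requests_generated(k) == (k // 2) * g + k   # (k/2) lg k + k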
5.3.4 General Lower Bound for the Average Completion Time Cost Function

Theorem 5.3.2. For the average completion time cost function, any on-line algorithm A, and any integer $N \ge 1$,
$$c^{ACT}_{A} \ge \begin{cases} \left(\frac{\lg k}{2} + 1\right)\frac{2N + k - 1}{2N + 2k^2 - 2} & \text{if } k \text{ is a power of 2} \\ \left(\frac{\lg k}{2}\right)\frac{2N + k - 1}{2N + 2k^2 - 2} & \text{otherwise} \end{cases}$$
on the $K_{Nk+1}$ metric space.

Proof. Adversary A(1, k) used in Section 5.3.3 was concerned with forcing the on-line algorithm A to traverse nodes as many times as possible while still allowing the optimal off-line algorithm to service all requests by traversing each node only once. This adversary strategy is not appropriate for the average completion time cost function because this strategy allows A to complete many clients in a short amount of time. We need an adversary strategy which
- forces the on-line algorithm A to traverse the nodes many times while the off-line algorithm can service all requests by traversing each node only once, and
- prevents the on-line algorithm A from completing any client too quickly.

Thus, we use A(N, k), which may terminate a client in a subclique but keeps the client alive by planting it in the next subclique. We first bound the on-line cost incurred by A. From Corollary 5.3.2, A services at most one request when it visits a node. From step (2.b.ii), after the first $N - 1$ requests have been serviced, no client has been completed. Servicing the first $N - 1$ requests requires at least $N - 1$ time steps, and this contributes $N - 1$ to the average completion time. After the first $N - 1$ requests are serviced, each client has at least one more request. The cost of servicing these requests contributes at least $\frac{1}{k}\sum_{i=1}^{k} i = \frac{k+1}{2}$ to the average completion time. Thus, the cost incurred by A is lower bounded by $N - 1 + \frac{k+1}{2}$.

Next, we compute an upper bound on the optimal cost. We will calculate the average completion time incurred by the obvious algorithm which services the subcliques in order. Clearly this is an upper bound on the cost of the optimal solution. In an analogous fashion to the previous section, there exists a server path which visits each occupied node of a subclique exactly once and services all the requests on that subclique. The server simply services the requests of the root of each tree in $H_h$ in order. Thus, if $H_h$ has $i$ nodes, then the time required to service all requests in subclique $h$ is at most $i$. During this time, only these $i$ clients can still have requests waiting to be serviced, so the average completion time increases by at most $\frac{i^2}{k}$. Thus, we can upper bound the cost of the optimal algorithm as follows:
$$OPT(I) \le \frac{1}{k}\sum_{i=1}^{k} i^2 p_i \quad (5.1)$$
where $p_i$ is the number of cover graphs with exactly $i$ clients with h-length greater than 0 for $1 \le h \le N$.

We will upper bound inequality (5.1) by upper bounding $p_i$ for $i = 1, \ldots, k$. First, we give an upper bound on the total number of requests. Consider the time just after the last planting operation. At most $N - 1$ requests have been serviced by the on-line algorithm A at that time. Each client never changes subclique after that. Hence, each client has at most $k$ unserviced requests. Therefore, there are at most $k^2$ unserviced requests. Thus, there are at most $N - 1 + k^2$ requests. Let $R_i$ be the total number of requests in a subclique with $i$ clients. The value of $R_i$ is given in Lemma 5.3.6. Since there are at most $N - 1 + k^2$ requests in $I$, this implies $\sum_{i=1}^{k} R_i p_i \le N - 1 + k^2$. Thus $OPT(I)$ is upper bounded by the maximization of
$$\frac{1}{k}\sum_{i=1}^{k} i^2 p_i \quad \text{subject to} \quad \sum_{i=1}^{k} R_i p_i \le N - 1 + k^2.$$
Since $i^2$ grows faster than $\frac{i}{2}\lg i + i$, and, from Lemma A.2, $\frac{i}{2}\lg i + i$ grows at least as fast as $R_i$, our upper bound on $OPT(I)$ is maximized when $p_i = 0$ for $i = 1, \ldots, k-1$ and
$$p_k = \frac{N - 1 + k^2}{R_k}.$$
Thus,
$$OPT(I) \le \frac{1}{k}\sum_{i=1}^{k} i^2 p_i = \frac{k^2}{k}\, p_k = k\,\frac{N - 1 + k^2}{R_k} \le \begin{cases} \frac{N - 1 + k^2}{\frac{\lg k}{2} + 1} & \text{if } k \text{ is a power of 2} \\ \frac{N - 1 + k^2}{\frac{\lg k}{2}} & \text{otherwise.} \end{cases}$$
Dividing the on-line cost by the off-line cost, we get the final result. $\Box$

Corollary 5.3.3. For the average completion time cost function, any on-line algorithm A, and any integer $N \ge 2$, it is the case that on the $K_{N+k-1}$ metric space,
$$c^{ACT}_{A} \ge \begin{cases} \left(\frac{\lg k}{2} + 1\right)\frac{2N + k - 1}{2N + 2k^2 - 2} & \text{if } k \text{ is a power of 2} \\ \left(\frac{\lg k}{2}\right)\frac{2N + k - 1}{2N + 2k^2 - 2} & \text{otherwise.} \end{cases}$$

Proof. Nodes are assigned to the subcliques on demand. A new node is needed when the adversary plants a client.
The adversary plants $k$ clients in the initial step. The only other place the adversary plants a client is in step (2.b.ii). The adversary can plant at most 1 client after each request is serviced. The adversary plants no more clients once A has serviced more than $N - 1$ requests. Therefore, the adversary does at most $k + N - 1$ plantings. Hence, $k + N - 1$ nodes are needed for placing requests. Since $N \ge 2$, we have $k + N - 1 \ge k + 1$. Thus, other than the $k$ nodes on which the adversary makes its $k$ initial plantings, there is 1 other node that can be used as the initial position of the server. $\Box$

Corollary 5.3.4. For the average completion time cost function, any on-line algorithm A, and any $\epsilon > 0$, there exists an integer $M$ such that, on the $K_M$ metric space,
$$c^{ACT}_{A} \ge \begin{cases} \frac{\lg k}{2} + 1 - \epsilon & \text{if } k \text{ is a power of 2} \\ \frac{\lg k}{2} - \epsilon & \text{otherwise.} \end{cases}$$

Proof. From Corollary 5.3.3, by choosing $N$ large enough and setting $M = N + k - 1$, the result follows. $\Box$

5.3.5 General Lower Bound on the Line

It was shown in [4] that lower bound results on cliques can be transformed into lower bound results on lines. In particular, any lower bound result on $K_l$ using a "finite adversary strategy" can be extended to hold on line($2l - 1$), where a finite adversary strategy is one in which the maximum number of requests ever generated by the adversary is upper bounded by some known constant.

Theorem 5.3.3. [4] Suppose we have a lower bound of $c$ for either the total distance or average completion time cost functions on the $K_l$ metric space for $l \ge 1$ as a result of a finite adversary strategy. Then this implies that for any $\epsilon > 0$ there exists a $c - \epsilon$ lower bound for the same cost function on the line($2l - 1$) metric space as the result of another finite adversary strategy.

Corollary 5.3.5. [4] For the total distance cost function, any on-line algorithm A, and any $\epsilon > 0$,
$$c^{TD}_{A} \ge \begin{cases} \frac{\lg k}{2} + 1 - \epsilon & \text{if } k \text{ is a power of 2} \\ \frac{\lg k}{2} - \epsilon & \text{otherwise} \end{cases}$$
on the line($2k - 1$) metric space.

Proof. This follows from Theorems 5.3.1 and 5.3.3. $\Box$

Corollary 5.3.6. [4] For the average completion time cost function, any on-line algorithm A, any integer $N \ge 2$, and any $\epsilon > 0$,
$$c^{ACT}_{A} \ge \begin{cases} \left(\frac{\lg k}{2} + 1\right)\frac{2N + k - 1}{2N + 2k^2 - 2} - \epsilon & \text{if } k \text{ is a power of 2} \\ \left(\frac{\lg k}{2}\right)\frac{2N + k - 1}{2N + 2k^2 - 2} - \epsilon & \text{otherwise} \end{cases}$$
on the line($2N + 2k - 3$) metric space.

Proof. This follows from Corollary 5.3.3 and Theorem 5.3.3. $\Box$

Corollary 5.3.7. [4] For the average completion time cost function, any on-line algorithm A, and any $\epsilon' > 0$, there exists an integer $N'$ such that
$$c^{ACT}_{A} \ge \begin{cases} \frac{\lg k}{2} + 1 - \epsilon' & \text{if } k \text{ is a power of 2} \\ \frac{\lg k}{2} - \epsilon' & \text{otherwise} \end{cases}$$
on the line($N'$) metric space.

Proof. From Corollary 5.3.6, by choosing $\epsilon < \epsilon'$, choosing $N$ large enough, and setting $N' = 2N + 2k - 3$, the result follows. $\Box$

5.4 General Lower Bounds when k = 2

5.4.1 General Lower Bound on the Clique when k = 2

In this section, we improve our general lower bound of $\frac{\lg k}{2} + 1$ to $\frac{9}{5}$ for the case where $k = 2$ on $K_\infty$ for both the total distance and average completion time cost functions.

Theorem 5.4.1. For the total distance cost metric, no on-line algorithm is $(\frac{9}{5} - \epsilon)$-competitive for the 2-client problem on the $K_\infty$ metric space for all $\epsilon > 0$.

Proof. Label the vertices with positive integers. Let $D$ be a large integer. The adversary operates in rounds. At the beginning of round 1, one client is called the leader, and its initial request is on node $D + 1$. The other client is called the follower, and its initial request is on node 1. Subsequent requests are generated according to the following rule.
If the current requests of the two clients are on different nodes, then when the request, say on node $x$, of either of the clients is serviced, the next request of that client is on node $x + 1$. If the current requests of the two clients are on the same node, say on node $x$, then after these two requests are serviced, the new round begins. In the new round, the leader becomes the follower, and its next request is on node 1. The follower becomes the leader, and its next request is on node $x + 1$.

Suppose the adversary stops just after round $n$. The optimal solution is to service the client that is the follower in round $n$, picking up requests of the other client along the way. We show that any on-line algorithm which hopes to be better than 2-competitive on this form of input sequence can be described by an infinite monotonically increasing sequence of finite numbers $\alpha_i$ for $i \ge 1$, where $\alpha_1 = (x_1 - D)/D$, $\alpha_i = x_i/D$ for $i \ge 2$, and $x_i$ is the node on which both clients place their last requests in round $i$.

Suppose $\alpha_1$ is unbounded; that is, at any time, if the current requests of the follower and the leader are on nodes $x$ and $y$ respectively, then $x < y$. In this case, the adversary will choose the input instance to be one where the final request of both clients is on node $xD$ for a large integer $x$. The on-line cost will be $xD + xD - D$ while the off-line cost will be $xD$, giving a competitive ratio of $2 - \frac{1}{x}$. Therefore, $\alpha_1$ must be bounded to have a ratio better than 2. Clearly, this argument generalizes, which means that $\alpha_i$ must be bounded for all $i \ge 1$.

Definition 5.4.1. For $n \ge 1$, let $o_n$ be the sum of the $\alpha_i$ where $i$ is an odd number between 1 and $n$ inclusively.

Definition 5.4.2. For $n \ge 1$, let $e_n$ be the sum of the $\alpha_i$ where $i$ is an even number between 1 and $n$ inclusively.

Definition 5.4.3. Let $\lambda_0 = 0$. For $n \ge 1$, let $\lambda_n = \alpha_n - \lambda_{n-1}$.

Definition 5.4.4. For $n \ge 1$, let
$$\gamma_n = \begin{cases} o_n & n \text{ is odd} \\ e_n & n \text{ is even.} \end{cases}$$

Fact 5.4.1. For $n \ge 0$, $\lambda_n \ge 0$. This is true because $(\alpha_n)$ is a non-decreasing sequence.

Fact 5.4.2. For $n \ge 1$,
$$\lambda_n = \begin{cases} o_n - e_n = o_n - e_{n-1} & n \text{ is odd} \\ e_n - o_n = e_n - o_{n-1} & n \text{ is even.} \end{cases}$$

Fact 5.4.3. For $n \ge 3$, $\gamma_n = \gamma_{n-2} + \alpha_n$. This is true by Definitions 5.4.4, 5.4.1, and 5.4.2.

Fact 5.4.4. For $n \ge 2$, $\gamma_n = \gamma_{n-1} + \lambda_n$. This is true by Definition 5.4.4 and Fact 5.4.2.

The cost of the on-line algorithm is, for $n \ge 2$,
$$A(I_n) = (\gamma_n + \gamma_{n-1} + \alpha_n)D - n = (\gamma_n + \gamma_{n-1} + \lambda_{n-1} + \lambda_n)D - n.$$
The optimal cost is, for $n \ge 2$,
$$OPT(I_n) = (\gamma_{n-1} + \alpha_n)D = (\gamma_{n-1} + \lambda_{n-1} + \lambda_n)D.$$
Therefore, if the adversary stops just after round $n$, then the ratio of the cost of the on-line algorithm to that of the adversary is
$$\frac{(\gamma_{n-1} + \gamma_n + \lambda_{n-1} + \lambda_n)D - n}{(\gamma_{n-1} + \lambda_{n-1} + \lambda_n)D}.$$
If $D$ is sufficiently large, the ratio is arbitrarily close to
$$\frac{\gamma_{n-1} + \gamma_n + \lambda_{n-1} + \lambda_n}{\gamma_{n-1} + \lambda_{n-1} + \lambda_n} = \frac{2\gamma_{n-1} + \lambda_{n-1} + 2\lambda_n}{\gamma_{n-1} + \lambda_{n-1} + \lambda_n} = 2 - \frac{\lambda_{n-1}}{\gamma_{n-1} + \lambda_{n-1} + \lambda_n}. \quad (5.2)$$
We can write expression (5.2) as $2 - c_n$ where
$$c_n = \frac{\lambda_{n-1}}{\gamma_{n-1} + \lambda_{n-1} + \lambda_n}, \quad n \ge 2. \quad (5.3)$$
We now use the property that $\lambda_i$ is finite for all $i \ge 1$ to define two important quantities of any sequence of finite values $(\lambda_i)$. First, we define $X((\lambda_i)) = \inf_{n \ge 2} c_n$ where $c_n$ is defined in equation (5.3). Note that $0 < X((\lambda_i)) < 1$. For the remainder of this proof, we will use $X$ to denote $X((\lambda_i))$. Next we define $R((\lambda_i)) = \inf_{n \ge 2} \frac{\lambda_n}{\gamma_{n-1}}$. For the remainder of this proof, we will use $R$ to denote $R((\lambda_i))$. To derive the best possible lower bound on the competitive ratio of any on-line algorithm against this adversary, we need to find the set of values $(\lambda_i)$ that minimizes $2 - X$. We will show that the maximum possible value of $X$ is $1/5$, which is achieved when $\frac{\lambda_n}{\gamma_{n-1}} = 1$ for all $n \ge 2$; that is, when $R = 1$. We first observe that
$$\forall n \ge 2 \quad \lambda_n \le \frac{1 - X}{X}\lambda_{n-1} - \gamma_{n-1}. \quad (5.4)$$
This follows from equation (5.3) and the definition of $X$. We next observe that
$$\forall n \ge 3 \quad \lambda_n \ge R(R+1)\gamma_{n-2}. \quad (5.5)$$
This can be derived as follows, given the definition of $R$ and Fact 5.4.4:
$$\lambda_n \ge R\gamma_{n-1} = R(\lambda_{n-1} + \gamma_{n-2}) \ge R(R\gamma_{n-2} + \gamma_{n-2}) = R(R+1)\gamma_{n-2}.$$
From inequalities (5.4) and (5.5), we obtain
$$\forall n \ge 3 \quad R(R+1)\gamma_{n-2} \le \frac{1 - X}{X}\lambda_{n-1} - \gamma_{n-1}$$
which, using Fact 5.4.4 to replace $\gamma_{n-1}$ by $\gamma_{n-2} + \lambda_{n-1}$ and shifting the index, can be rewritten as
$$\forall n \ge 2 \quad (1 - 2X)\frac{\lambda_n}{\gamma_{n-1}} \ge X(R^2 + R + 1). \quad (5.6)$$
Combining inequality (5.6) and the definition of $R$ as the infimum of $\frac{\lambda_n}{\gamma_{n-1}}$ for $n \ge 2$, we obtain $(1 - 2X)R \ge X(R^2 + R + 1)$, which can be rewritten as
$$XR^2 + (3X - 1)R + X \le 0. \quad (5.7)$$
Since $R$ must be a real number, we apply the quadratic formula and the constraint $0 < X < 1$ to find the maximum value of $X$ that results in $R$ being a real number:
$$(3X - 1)^2 - 4 \cdot X \cdot X \ge 0.$$
The maximum value is $X = 1/5$, attained when $R = 1$. Therefore, we achieve the desired lower bound of $2 - 1/5 = 9/5$ on the competitive ratio of any on-line algorithm for the 2-client problem on $K_\infty$. $\Box$
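The following quick check (ours, not from the dissertation) confirms the computation behind the choice $R = 1$: taking $\lambda_n = \gamma_{n-1}$ for all $n \ge 2$ makes $c_n$ in equation (5.3) equal $1/5$ for every $n \ge 3$, so the ratio $2 - c_n$ tends to $2 - 1/5 = 9/5$.

# Exact-arithmetic check (ours) that R = 1 yields c_n = 1/5 for n >= 3.
from fractions import Fraction

lam = [Fraction(0), Fraction(1)]     # lambda_0 = 0; scale so lambda_1 = alpha_1 = 1
gamma = [Fraction(0), Fraction(1)]   # gamma_0 = 0, gamma_1 = o_1 = alpha_1

for n in range(2, 13):
    lam.append(gamma[n - 1])                 # enforce lambda_n / gamma_{n-1} = 1
    gamma.append(gamma[n - 1] + lam[n])      # Fact 5.4.4
    c_n = lam[n - 1] / (gamma[n - 1] + lam[n - 1] + lam[n])   # equation (5.3)
    if n >= 3:
        assert c_n == Fraction(1, 5)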
Theorem 5.4.2. For the average completion time cost metric, no on-line algorithm is $(\frac{9}{5} - \epsilon)$-competitive for the 2-client problem on the $K_\infty$ metric space for all $\epsilon > 0$.

Proof. The proof is the same as that of Theorem 5.4.1 because both clients complete at the same time in both the optimal off-line schedule and the on-line schedule. $\Box$

5.4.2 General Lower Bound on the Line when k = 2

In this section, we improve our general lower bound of $\frac{\lg k}{2} + 1$ to 3 for the average completion time cost function for the case when $k = 2$ and the metric space is line($\infty$). For completeness, we state a lower bound of $\frac{9}{4}$ for the total distance cost function for the case when $k = 2$ and the metric space is line($\infty$) [4].

Theorem 5.4.3. For the total distance cost function, no on-line algorithm is $(\frac{9}{4} - \epsilon)$-competitive for the 2-client problem on line($\infty$) for all $\epsilon > 0$ [4].

Theorem 5.4.4. For the average completion time cost function, no on-line algorithm is $(3 - \epsilon)$-competitive for the 2-client problem on line($\infty$) for all $\epsilon > 0$.

Proof. We consider an adversary that constructs inputs of the following form. The server is initially located at the origin. The first request from client A is located one unit to the left of the origin. All the remaining requests from A will each be $\delta$ to the left of the previous one, where $\delta$ tends to 0. The requests from client B will be arranged to the right of the origin in a similar fashion. Without loss of generality, we assume that the on-line algorithm initially moves the server to the left to service client A. Any on-line algorithm that hopes to be better than 3-competitive against this adversary can be described by two finite positive numbers $\alpha$ and $\beta$. The first parameter, $\alpha$, is the maximum distance the server is willing to move to the left before it changes direction for the first time. The second parameter, $\beta$, is the maximum distance the server is willing to move to the right before it changes direction for the second time. Suppose $\alpha$ is unbounded; that is, the on-line algorithm will move the server to the left until there are no more requests to the left. In this case, the adversary will choose the input instance to be one where the final request of client A is at position $-x$ for a large value of $x$ and client B will have only 1 request at position 1.
The on-line cost will be $\frac{1}{2}\left(x + (2x + 1)\right) = \frac{1}{2}(3x + 1)$ while the off-line cost will be $\frac{1}{2}\left(1 + (2 + x)\right) = \frac{1}{2}(x + 3)$, giving a competitive ratio of $3 - \frac{8}{x+3}$. Therefore, $\alpha$ must be bounded to have a competitive ratio better than 3. We can show that $\beta$ is bounded with a similar argument.

Next, we show that the best an on-line algorithm can do is to make $\alpha = \beta$, in which case the on-line algorithm will be 3-competitive at best. For any given $\alpha$ and $\beta$, define the instance $I$ as follows. Requests from client A appear progressively from position $-1$ to position $-\alpha - \delta$. Requests from client B appear progressively from position 1 to position $\beta + \delta$. Figure 5.4 illustrates both input instance $I$ as well as how the on-line algorithm behaves on $I$. Clearly, the on-line algorithm will move left to position $-\alpha$, move right to position $\beta$, move left to position $-\alpha - \delta$ (completing client A), and then move right to position $\beta + \delta$ (completing client B). Thus, the average completion time incurred by the on-line algorithm on input instance $I$ is $\frac{1}{2}\left((3\alpha + 2\beta + \delta) + (4\alpha + 3\beta + 3\delta)\right) = \frac{1}{2}(7\alpha + 5\beta + 4\delta)$. Meanwhile, OPT services the shorter client first, which means $OPT_{ACT}(I) = \min\left(\frac{1}{2}\left((\alpha + \delta) + (2\alpha + \beta + 3\delta)\right),\ \frac{1}{2}\left((\beta + \delta) + (2\beta + \alpha + 3\delta)\right)\right) = \frac{1}{2}\min(3\alpha + \beta + 4\delta,\ \alpha + 3\beta + 4\delta)$. Since we can make $\delta$ arbitrarily small, this implies a lower bound of
$$\frac{7\alpha + 5\beta}{\min(3\alpha + \beta,\ \alpha + 3\beta)}$$
on the competitive ratio of the on-line algorithm. We now show that this lower bound is minimized to be 3 when $\alpha = \beta$.

Figure 5.4: Lower bound instance for the average completion time cost function and k = 2. (Requests $a_1, a_2, \ldots$ of client A lie at positions $-1$ down to $-\alpha - \delta$; requests $b_1, b_2, \ldots$ of client B lie at positions 1 up to $\beta + \delta$; the server starts at the origin.)

There are 2 cases to consider.
Case 1: $\alpha \le \beta$. In this case, $OPT_{ACT}(I) = \frac{1}{2}(3\alpha + \beta)$, which means that the lower bound on the competitive ratio of the on-line algorithm is $\frac{7\alpha + 5\beta}{3\alpha + \beta}$, which is strictly larger than 3 when $\alpha < \beta$ and which equals 3 when $\alpha = \beta$.
Case 2: $\beta \le \alpha$. In this case, $OPT_{ACT}(I) = \frac{1}{2}(\alpha + 3\beta)$, which means that the lower bound on the competitive ratio of the on-line algorithm is $\frac{7\alpha + 5\beta}{\alpha + 3\beta}$, which is strictly larger than 3 when $\beta < \alpha$ and which equals 3 when $\alpha = \beta$.
Thus the result follows. $\Box$
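A small numeric check (ours, not from the dissertation) of the case analysis: over a grid of positive rational values of $\alpha$ and $\beta$, the bound $(7\alpha + 5\beta)/\min(3\alpha + \beta, \alpha + 3\beta)$ attains its minimum exactly at $\alpha = \beta$, where it equals 3.

# Exact-arithmetic grid check (ours) of the bound above.
from fractions import Fraction
from itertools import product

vals = [Fraction(i, 4) for i in range(1, 41)]
ratios = {(a, b): (7*a + 5*b) / min(3*a + b, a + 3*b)
          for a, b in product(vals, vals)}
assert min(ratios.values()) == 3 and ratios[(Fraction(1), Fraction(1))] == 3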
5.5 Open Problems on the k-Client Problem

Obvious open problems are closing the gap between the lower bound of $\frac{\lg k}{2} + 1$ and the upper bound of $2k - 1$ for the total distance and average completion time cost functions. For the total distance cost function and when $k = 2$, there is still a gap between the lower bound of $\frac{9}{4}$ and the upper bound of 3.

In [4], the k-client l-server problem is considered. In this problem, there are $l$ servers instead of only a single server. This model is a natural generalization of both the k-client problem and the classic l-server problem. Some preliminary results on this problem can be found in [4]. The question of how to schedule multiple servers efficiently in this model remains open.

Chapter 6

Minimizing Total Completion Time on One Machine

6.1 Introduction

In this chapter, we consider the problem of non-preemptive scheduling with release dates on one machine to minimize total completion time ($1|r_j|\sum_j C_j$). Many approximation algorithms for this problem and similar problems use the idea of "$\alpha$-schedules" [40, 73, 39, 34, 74, 19]. Previously, the best approximation algorithm was based on this idea. This algorithm is an $\frac{e}{e-1}$-approximation algorithm called BEST-$\alpha$ proposed by Chekuri, Motwani, Natarajan, and Stein [19] ($\frac{e}{e-1} \approx 1.58$). However, it was not known whether the bound of $e/(e-1)$ for BEST-$\alpha$ was tight. We generalize the idea of "$\alpha$-schedules" and characterize a class of approximation algorithms which includes BEST-$\alpha$. We show that no algorithm in this class has an approximation ratio better than $e/(e-1)$. This implies that the performance guarantee of BEST-$\alpha$ proven by Chekuri et al. [19] is tight. To find a better approximation algorithm, new ideas are needed. Afrati et al. introduced some new ideas beyond "$\alpha$-schedules" and devised a polynomial time approximation scheme (PTAS) for this problem [1].

6.2 SRPT-Subsequence Algorithms

6.2.1 Definition of $\alpha$-Schedules and the BEST-$\alpha$ Algorithm

For any algorithm A and any instance $I$, let $A(I)$ denote the schedule for $I$ given by A. When there is no ambiguity, we also use $A(I)$ to denote $C_{A(I)}$, the total completion time of the schedule produced by A. The Shortest Remaining Processing Time algorithm (SRPT) is an optimal algorithm for the preemptive version of the problem we consider in this chapter. SRPT always schedules a job with the smallest remaining processing time. For any instance $I$ and $\alpha \in [0, 1]$, let $C_j^{SRPT}(I, \alpha)$ be the time at which an $\alpha$-fraction of $J_j$ is completed in schedule SRPT($I$). In the SRPT schedule of an input instance, a preemption can occur only when a job is running on the machine and another job has just arrived. Therefore, there are at most $n - 1$ preemptions in the SRPT schedule. When a preemption occurs, an $\alpha$-fraction of the preempted job is completed for some value of $\alpha$ in the interval $(0, 1)$. Since there are at most $n - 1$ preemptions in the SRPT schedule, there are at most $n - 1$ distinct values of $\alpha$ at which the preemptions occur. These $n - 1$ critical values of $\alpha$ subdivide the interval $[0, 1]$ into at most $n$ sub-intervals. An $\alpha$-schedule is a non-preemptive schedule obtained by list scheduling jobs in order of increasing $C_j^{SRPT}(I, \alpha)$ for some $\alpha$ in the interval $[0, 1]$. Different values of $\alpha$ chosen in the same sub-interval lead to the same ordering of the $C_j^{SRPT}(I, \alpha)$, which implies the same $\alpha$-schedule. Since there are at most $n$ sub-intervals, there are at most $n$ distinct $\alpha$-schedules. For each instance $I$, BEST-$\alpha$ chooses an $\alpha$-schedule which gives the smallest total completion time. BEST-$\alpha$ was shown to be an $e/(e-1)$-approximation algorithm for $1|r_j|\sum C_j$ [19].
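The following self-contained sketch (ours, not the authors' implementation; all function names are our own) makes these definitions concrete: it computes the preemptive SRPT schedule, reads off $C_j^{SRPT}(I, \alpha)$, builds the $\alpha$-schedule by list scheduling, and takes the best over a grid of $\alpha$ values. A faithful BEST-$\alpha$ would instead enumerate the at most $n$ critical values of $\alpha$; the grid is a simplification for brevity.

# A sketch (ours) of alpha-schedules and BEST-alpha for 1|r_j|sum C_j.
# Jobs are (release, processing) pairs.
import heapq

def srpt_pieces(jobs):
    """Preemptive SRPT; pieces[j] = list of (start, work_done_before, length)."""
    n = len(jobs)
    order = sorted(range(n), key=lambda j: jobs[j][0])
    rem = [p for _, p in jobs]
    pieces = [[] for _ in range(n)]
    ready, t, i = [], 0.0, 0
    while i < n or ready:
        if not ready:
            t = max(t, jobs[order[i]][0])     # idle until the next release
        while i < n and jobs[order[i]][0] <= t:
            j = order[i]; heapq.heappush(ready, (rem[j], j)); i += 1
        r, j = heapq.heappop(ready)           # smallest remaining time runs
        nxt = jobs[order[i]][0] if i < n else float("inf")
        run = min(r, nxt - t)                 # run until done or next release
        pieces[j].append((t, jobs[j][1] - r, run))
        rem[j] = r - run
        t += run
        if rem[j] > 0:
            heapq.heappush(ready, (rem[j], j))
    return pieces

def alpha_completion(pieces_j, p, alpha):
    """C_j^SRPT(I, alpha): the time when an alpha-fraction of the job is done."""
    target = alpha * p
    for start, before, length in pieces_j:
        if before + length >= target:
            return start + (target - before)

def alpha_schedule_cost(jobs, pieces, alpha):
    order = sorted(range(len(jobs)),
                   key=lambda j: alpha_completion(pieces[j], jobs[j][1], alpha))
    t = total = 0.0
    for j in order:                           # nonpreemptive list scheduling
        t = max(t, jobs[j][0]) + jobs[j][1]
        total += t
    return total

def best_alpha_cost(jobs):
    pieces = srpt_pieces(jobs)
    return min(alpha_schedule_cost(jobs, pieces, a / 100) for a in range(1, 100))

jobs = [(0.0, 3.0), (1.0, 1.0), (2.0, 1.0)]   # toy instance
print(best_alpha_cost(jobs))                  # -> 11.0 for this instance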
6.2.2 Definition of SRPT-Subsequence Algorithms

We define the $A(I)$-sequence to be the sequence of job identities of job pieces in schedule $A(I)$. A schedule $S$ for $I$ is called an $A(I)$-subsequence schedule if the $S$-sequence is a subsequence of the $A(I)$-sequence. For any algorithms A1 and A2, if for every instance $I$ the A1($I$)-sequence is an A2($I$)-subsequence, then we call A1 an A2-subsequence algorithm. See Figure 6.1 for an example. The figure shows 3 schedules for instance $I$ with corresponding sequences on the right. $S_1$ is the SRPT schedule. Since 23451 is a subsequence of 123214151 while 13452 is not, $S_2$ is a non-preemptive SRPT-subsequence schedule while $S_3$ is not. In this chapter, we will focus on non-preemptive SRPT-subsequence algorithms. Note that BEST-$\alpha$ belongs to this class.

Figure 6.1: Examples of non-preemptive SRPT-subsequence schedules. ($S_1$, the SRPT schedule, has sequence 123214151; $S_2$ has sequence 23451; $S_3$ has sequence 13452.)

6.3 Lower Bound of Non-Preemptive SRPT-Subsequence Algorithms

Our main result is that no non-preemptive SRPT-subsequence algorithm, including BEST-$\alpha$, has an approximation ratio better than $\frac{e}{e-1}$. We need the following notation. For any positive integer $n$, let $n_e$ denote $\lceil n/e \rceil$, and let
$$H_n = \sum_{k=1}^{n} \frac{1}{k}, \qquad \hat{H}_n = H_n - H_{n_e} = \sum_{k=n_e+1}^{n} \frac{1}{k}.$$
Note that $\hat{H}_n = H_n - H_{n_e} \approx \ln n - \ln\frac{n}{e} = \ln e = 1$. Next, we describe the lower bound input instances.

Definition 6.3.1. For any integer $n \ge 3$ and $\epsilon > 0$, we define the instance $I_n(\epsilon)$ with $n$ jobs as follows:
$$r_i = \begin{cases} 0 & \text{for } i = 1 \\ 1 + (n + 1 - i)\epsilon & \text{for } i = 2, \ldots, n_e \\ H_n - H_{i-1} + (n - i)\epsilon = \sum_{k=i}^{n}\frac{1}{k} + (n - i)\epsilon & \text{for } i = n_e + 1, \ldots, n \end{cases}$$
$$p_i = \begin{cases} 1 + \epsilon & \text{for } i = 1 \\ \epsilon & \text{for } i = 2, \ldots, n. \end{cases}$$

See Figure 6.2 for an example of an instance from Definition 6.3.1. The figure shows (1) the instance $I_9(\epsilon)$ where $\epsilon \to 0^+$, (2) the SRPT schedule, (3) all possible non-preemptive SRPT-subsequence schedules, and (4) the optimal non-preemptive schedule for $I_9(\epsilon)$. Note that $\hat{H}_9 < 1$. We will only be interested in instances $I_n(\epsilon)$ such that $\hat{H}_n < 1$ and $0 < \epsilon < 1/n$. Among these instances, we will show that as $n$ tends to infinity and $\epsilon$ tends to 0, the total completion time of any non-preemptive SRPT-subsequence schedule will tend to $e/(e-1)$ times the optimal total completion time.

The intuitive idea behind the instances from Definition 6.3.1 is the following. Consider any non-preemptive SRPT-subsequence schedule. No matter where the large job is scheduled in the non-preemptive SRPT-subsequence schedule, the total completion time remains constant. Note, the latest the large job can be scheduled is between jobs $J_{n_e}$ and $J_{n_e+1}$, which means it will delay at least about one-third of the small jobs no matter where it is run. On the other hand, in the optimal non-preemptive schedule, the large job is run last. Each of the small jobs is run at the earliest possible moment. Only the large job incurs a large delay, which is negligible when there is a large number of jobs.
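For concreteness, the following helper (ours; it follows Definition 6.3.1 as reconstructed above, so the exact constants reflect our reading of the definition) builds $I_n(\epsilon)$ as (release, processing) pairs. Its output can be fed to an SRPT simulator such as the earlier srpt_pieces sketch to inspect the schedule and the critical values of $\alpha$.

# A generator (ours) for the lower bound instance I_n(eps) of Definition 6.3.1.
from math import ceil, e

def harmonic_tail(i, n):
    """H_n - H_{i-1} = 1/i + 1/(i+1) + ... + 1/n."""
    return sum(1.0 / k for k in range(i, n + 1))

def lower_bound_instance(n, eps):
    n_e = ceil(n / e)
    jobs = [(0.0, 1.0 + eps)]                          # the single large job
    for i in range(2, n_e + 1):
        jobs.append((1.0 + (n + 1 - i) * eps, eps))    # released just after time 1
    for i in range(n_e + 1, n + 1):
        jobs.append((harmonic_tail(i, n) + (n - i) * eps, eps))  # harmonic arrivals
    return jobs

print(lower_bound_instance(9, 1e-4))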
In [82], we proved the following theorem.

Theorem 6.3.1. For any non-preemptive SRPT-subsequence algorithm A and for all $\delta > 0$, there exist an integer $n$ and $\epsilon > 0$ such that
$$\frac{A(I_n(\epsilon))}{SRPT(I_n(\epsilon))} \ge \frac{A(I_n(\epsilon))}{OPT(I_n(\epsilon))} \ge \frac{e}{e-1} - \delta.$$

However, we will omit the proof of Theorem 6.3.1 because it is rather lengthy and complicated. Instead, we borrow a technique used in the proof of Theorem 2.2.1 in Chapter 2 and prove a similar theorem. Again, we will need Dirac's delta function $\delta()$, which is defined as follows:
$$\delta_\Delta(t) = \begin{cases} 1/\Delta & \text{if } 0 \le t \le \Delta \\ 0 & \text{otherwise.} \end{cases}$$

During its execution on an input instance $I$, an algorithm A evaluates if-then-else statements whose conditions are boolean combinations of comparisons of the input parameters, for example ((a > c and b > c) or a + b < c), where a, b, and c are input parameters. Each if-then-else statement executed will lead to a branch result. The underlying combinatorial computation of A for $I$ is the sequence of branch results made during the execution of A on input $I$. Upon termination, A outputs its solution for $I$. We assume that A also outputs the objective value of the solution along with the solution itself. The objective value is a function of the input parameters. We will call this function the underlying combinatorial objective function of A for $I$. In general, different underlying combinatorial computations will have different underlying combinatorial objective functions.

Note that two distinct input instances may have the same underlying combinatorial computation; that is, the sequences of branch results made by A are the same for both instances. In this case, we say that the two input instances are combinatorially equivalent with respect to A. They will also have the same underlying combinatorial objective function from which the objective values of their solutions are calculated.

For any positive integer $n$, let $\Pi_n$ be the problem $\Pi$ with the additional constraint that all input instances have size $n$. Let $c_1, c_2, \ldots, c_N$ be all possible underlying combinatorial computations of A for $\Pi_n$. Note that if A runs in polynomial time, $N$ could be exponential in $n$. For $1 \le i \le N$, let $b_{i,1}, b_{i,2}, \ldots, b_{i,k_i}$ be the sequence of boolean expressions that must hold so that when A executes, $c_i$ is the resulting computation. These boolean expressions are either the if-clauses or the negations of the if-clauses of the if-then-else statements encountered during the execution of A. Note that if A has time complexity $O(f(n))$, then $k_i = O(f(n))$. For $1 \le i \le N$, let $o_i$ be the underlying combinatorial objective function corresponding to $c_i$. Note that $o_i$ is a function of the input parameters. For any input instance $I$ in $\Pi_n$, we can represent $A(I)$ as follows:

If $b_{1,1}$ and $b_{1,2}$ and $\ldots$ and $b_{1,k_1}$ hold, then $A(I) = o_1$;
else if $b_{2,1}$ and $b_{2,2}$ and $\ldots$ and $b_{2,k_2}$ hold, then $A(I) = o_2$;
$\ldots$
else if $b_{N,1}$ and $b_{N,2}$ and $\ldots$ and $b_{N,k_N}$ hold, then $A(I) = o_N$.

Given an instance $I$ of $\Pi_n$, $I$ will fall into exactly one of the $N$ cases.
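As a toy illustration (ours, not from the dissertation), consider a two-branch "algorithm" over inputs (a, b, c): instances that induce the same branch sequence are combinatorially equivalent and share the same (linear) underlying combinatorial objective function.

# A two-branch toy (ours) whose executions induce underlying combinatorial
# computations over the input parameters (a, b, c).
def A(a, b, c):
    if (a > c and b > c) or a + b < c:    # branch expression b_{1,1}
        return a + b                       # objective o_1: linear in the inputs
    else:                                  # negation of b_{1,1}
        return c                           # objective o_2

# (1, 2, 5) and (2, 3, 9) follow the same branch sequence, hence they are
# combinatorially equivalent with respect to A and share objective o_1 = a + b.
assert A(1, 2, 5) == 3 and A(2, 3, 9) == 5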
Until now we have been omitting the clauses “OPT(I) is the Optimal Objective value for I” and “A(I) is the objective value of the solution produced by A for I ”. The values of OPT(I) and A(I) as functions of I can be derived as explained in Section 7.4.1. Let bu- and o,- be subexpressions of A(I) as explained in Section 7.4.1. Let 5;, and o; be corresponding subexpressions Of OPT(I). By plugging the expressions of OPT(I) and A(I) into M P3(n), we obtain the following mathematical program MP4(n). 128 max A subject to I: |II=n O = 1 and ( (0:0'1 and b'l,1 and b'l,2 and and b'LkII) or 0:0' and b' and b' and” and b’ I) or 2 2,1 2,2 2,1c, (Ozo’N, and b'NI,1 and b’N,,2 and and b’NI’kgw) ) and ( (A201 and bm and b1; and and bu“) or (A=02 and bu and b2; and and by”) or (AzoN and bN’l and bmg and and bN’kN) ) For n 2 1, M P4(n) is an AND/ OR linear program. The Optimal solution Of M P4(n) is the best lower bound instance with size n against A. The approximation ratio of A is the maximum of M P4(n) for all n 2 1. max MP4(n) 1121 129 7 .4.3 Solving an AND/ OR Linear Program Consider an AND/ OR linear program with C as its constraint. We can assume that C is in disjunctive normal form (OR’s of AND’s), i.e., C = C, V C2 V V C), where C, = (1,31 /\ dig /\ /\ d”, for 2 = 1, ..., h, and d,,,- E {61, ...,em} for i = 1, ...,h andj = 1, ...,l,. Then each clause C, together with the Objective function Of the AND/OR linear program defines an ordinary linear program. In theory, we can optimize an AN D / OR linear program by simply optimizing the ordinary linear subprogram corresponding to each C,- and choosing the best solution. However, this method potentially has exponential running time. This problem stems from two reasons. Firstly, there could be exponentially many linear subpro- grams tO Optimize. As shown in Section 7.4.1, there could be exponentially many clauses in the expressions of A(I) and OPT(I) in disjunctive normal form. Secondly, each clause C, corresponding to a linear program could have exponential size. That is because the underlying problem II is NP-hard. There is no known Optimal algorithm for II which runs in polynomial time (if it does indeed exist). Since the known Optimal algorithm runs in exponential time, as shown in Section 7 .4.1, each clause could have exponential size. To evade the second problem, we could use a super-Optimal algorithm SOPT which runs in polynomial time instead of OPT. A super—Optimal algorithm is an algorithm that always produces a solution whose Objective value is better than or 130 equal to the Optimal solution with a relaxation that the solution produced may or may not be feasible. Since our goal is to find a lower bound for A, the value Of the lower bound found using SOPT is a valid lower bound for A. However, the value of the lower bound found using SOPT may not be as tight as OPT. For example, the problem of nonpreemptively scheduling jobs on one machine tO minimize the total completion time (1|r,-| 2C1) is NP-hard. There is no known Optimal algorithm for this problem which runs in polynomial time. A solution with preemptions will not be a feasible solution for this problem. However, preemptive solutions are super-optimal and can be constructed in polynomial-time. To evade the first problem of exponential running time, we will use a local search scheme instead of the global one. This scheme does not guarantee to find a global Optimum, but it guarantees to find a local Optimum. Before we describe the scheme, we point out some Observations. 
Suppose an AND/OR linear program MP4($n$) derived in Section 7.4.1 is given. Then the following tasks can be performed.
- Given an instance $I$ of the underlying problem, we can determine all underlying computations of OPT (or SOPT) and A for $I$. Note that multiple computations are the result of relaxing the inequalities "<" and ">" to "$\le$" and "$\ge$". Equivalently, in terms of the AND/OR linear program, given a solution of the AND/OR linear program, we can find all ordinary linear subprograms whose feasible regions contain the given solution.
- We can easily find some initial ordinary linear subprogram of the AND/OR linear program. This can be done by choosing a random input instance of the underlying problem and applying the previous observation.

The general procedure for finding a local optimum of an AND/OR linear program can be described as follows:
1. Choose an initial linear subprogram.
2. Compute an optimal solution of the linear subprogram.
3. Choose a new linear subprogram whose feasible region contains the solution from step 2.
4. Repeat steps 2 and 3 until all linear subprograms whose feasible regions contain the current solution have been considered and the solution cannot be improved.

This procedure does not need a complete boolean expression for the constraints of the AND/OR linear program up front. We only need to construct the required linear subprograms as we proceed. The procedure will eventually terminate because, for a fixed AND/OR linear program, there are only finitely many linear subprograms to consider.
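The following schematic rendering (ours) of the four steps uses assumed interfaces rather than real library calls: solve_lp optimizes one ordinary linear subprogram (in practice via an LP library such as scipy.optimize.linprog) and returns a solution and its objective value, and subprograms_containing enumerates the linear subprograms whose relaxed feasible regions contain a given point.

# A sketch (ours) of the local search over the ordinary linear subprograms of
# an AND/OR linear program; solve_lp and subprograms_containing are assumed
# interfaces supplied by the caller, not real library functions.

def local_search(initial_subprogram, solve_lp, subprograms_containing):
    current = initial_subprogram                  # step 1: initial subprogram
    best_x, best_val = solve_lp(current)          # step 2: optimize it
    seen = {current}
    improved = True
    while improved:                               # step 4: loop until stable
        improved = False
        for sub in subprograms_containing(best_x):    # step 3: neighbors
            if sub in seen:
                continue
            seen.add(sub)
            x, val = solve_lp(sub)                # step 2 on the new subprogram
            if val > best_val:                    # maximizing the lower bound
                best_x, best_val, improved = x, val, True
                break
    return best_x, best_val                       # a local optimum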
We do not know a theoretical upper bound on the number of iterations of this procedure other than the obvious one, the number of all linear subprograms. However, in our examples in the next section, we need only a few iterations before the procedure terminates.

Next, we discuss the optimality of the solution found when the procedure terminates. After we obtain a solution in step 2, we find a new linear subprogram in step 3. The key observation is that the feasible region of the new linear subprogram not only contains the solution from step 2, it also potentially contains a better solution. Thus, by solving the new linear subprogram, we are guaranteed to find a solution that is as good as or better than the previous solution. By repeating steps 2 and 3, we find increasingly better solutions for the AND/OR linear program. The procedure will eventually find a local optimum because otherwise it would not terminate. However, the procedure cannot guarantee to find a global optimum. The final solution found by the procedure depends on the initial solution chosen in step 1 and the selection of the new linear subprogram in each iteration of step 3. Note that if the solution space has only one local optimum, it is also the global optimum.

Although this procedure can be fully automated, a partially manual execution is more productive. It is too time-consuming to implement some of the steps in the procedure. Moreover, they are probably executed only a few times. Specifically, choosing an initial linear subprogram in step 1 and choosing a new linear subprogram in step 3 should be done manually. From our example in the next section, we only need to execute the procedure for a few iterations. The task of constructing linear subprograms from the execution traces of the optimal algorithm and the approximation algorithm could be automated or done manually depending on the problem. In our example, this task is automated. Computing an optimal solution of a linear subprogram should be automated because there are software libraries for this task and they are easily obtained. In addition, viewing the solutions of linear subprograms in some appropriate graphical form together with their numerical values can help us gain intuition about the problem faster and better than reading the numerical values alone. Thus, developing a program for showing a graphical representation of the solution should be considered.

7.5 Applying AND/OR LP to a Scheduling Problem

In this section, we illustrate our technique using the problem of nonpreemptively scheduling jobs which arrive over time on one machine to minimize the total completion time ($1|r_j|\sum C_j$). This problem was considered in Chapter 6, where a class of algorithms called the non-preemptive SRPT-subsequence algorithms was studied. In Chapter 6, we proved that no algorithm in this class has an approximation ratio better than $e/(e-1) \approx 1.58$ against a class of lower bound instances. In this section, we will show how we came up with these input instances using the technique of transforming a lower bound problem into an AND/OR linear programming problem.

7.5.1 Program Formulation

Let SS be the class of all non-preemptive SRPT-subsequence algorithms defined in Chapter 6 with the additional property that they do not insert unnecessary idle time in their schedules. Let SS($I$) be the set of all non-preemptive SRPT-subsequence schedules with no unnecessary idle time. We want to find an instance $I$ such that $A(I)/OPT(I)$ is as large as possible for all A in SS. This can be represented as the following mathematical program:
$$\max_I \min_{A \in SS} \frac{A(I)}{OPT(I)}.$$
Given an instance $I$, any non-preemptive SRPT-subsequence algorithm will produce an SRPT-subsequence schedule for $I$. Thus, given an instance $I$, to consider all non-preemptive SRPT-subsequence "algorithms", we only need to consider all non-preemptive SRPT-subsequence "schedules". Therefore, our AND/OR linear program becomes
$$\max_I \min_{S \in SS(I)} \frac{S}{OPT(I)}.$$
We can eliminate the "min" by replacing $S$ with a new variable $R$ and adding a new constraint asserting that $R$ is no larger than the minimum $S$ in SS($I$); in other words, $R$ is no larger than any $S$ in SS($I$). Thus, our AND/OR linear program becomes
$$\max_I \frac{R}{OPT(I)} \quad \text{subject to} \quad R \le S \quad \forall S \in SS(I).$$
By using SRPT as a super-optimal algorithm and by following the transformation in Section 7.4.2, we obtain the following mathematical program: $\max_{n} MP(n)$, where MP($n$) is defined as
$$\max_{I : |I| = n} R \quad \text{subject to} \quad SRPT(I) = 1 \quad (7.5)$$
$$\qquad\qquad\qquad\qquad\quad R \le S \quad \forall S \in SS(I). \quad (7.6)$$
In the next section, we will derive the expression of $SRPT(I)$ and of all schedules $S$ in SS($I$). Note that an input instance $I$ with $n$ jobs of the underlying problem $1|r_j|\sum C_j$ consists of the release time $r_j$ and the processing time $p_j$ of each job $j$ for $j = 1, \ldots, n$. Thus, instances of the underlying problem consist only of real-valued parameters.

7.5.2 Expressions of the Objective Functions

In this section, we derive the expression of $SRPT(I)$ and of all schedules $S$ in SS($I$). For $SRPT(I)$, we need to derive the condition under which an underlying combinatorial computation of SRPT prevails over other computations, and the expression of the corresponding underlying combinatorial objective function. For each $S$ in SS($I$), we only need to derive its expression. A schedule prevails when it has the minimum total completion time among all schedules in SS($I$). This condition has already been enforced in MP($n$).

First, we give some intuition about the structure of the SRPT schedule. Given an instance $I$, we can assume that $r_i = s_i^{SRPT}$ for all $i$, where $s_i^{SRPT}$ is the starting time of job $i$ in the SRPT schedule. This is because SRPT cannot take advantage of having $r_i$ smaller than $s_i^{SRPT}$; that is, SRPT cannot produce a schedule with a smaller total completion time even though $r_i$ is smaller than $s_i^{SRPT}$. In contrast, having smaller $r_i$ might allow non-preemptive SRPT-subsequence algorithms to produce a schedule with a smaller total completion time.

We next consider how jobs overlap in the SRPT schedule. Let $v_i$ be the interval $[s_i^{SRPT}, C_i^{SRPT}]$. For any pair of jobs $i$ and $j$, it is the case that either $v_i$ and $v_j$ are disjoint or one of them includes the other. To show this, suppose there exist two intervals $v_i$ and $v_j$ that overlap, i.e., $s_i^{SRPT} < s_j^{SRPT} < C_i^{SRPT} < C_j^{SRPT}$. Since $s_i^{SRPT} < s_j^{SRPT} < C_i^{SRPT}$, job $j$ is run while job $i$ is ready; that is, job $j$ has a higher priority than job $i$. However, job $j$ completes after job $i$ because $C_i^{SRPT} < C_j^{SRPT}$. This is a contradiction.
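The laminar structure just proved is easy to test empirically. The following self-contained check (ours, not from the dissertation) simulates SRPT on a random instance and asserts that the intervals $v_i = [s_i^{SRPT}, C_i^{SRPT}]$ are pairwise disjoint or nested.

# A check (ours) that SRPT intervals are laminar (disjoint or nested).
import heapq, random

def srpt_intervals(jobs):
    n = len(jobs)
    order = sorted(range(n), key=lambda j: jobs[j][0])
    rem = [p for _, p in jobs]
    start = [None] * n; comp = [None] * n
    ready, t, i = [], 0.0, 0
    while i < n or ready:
        if not ready:
            t = max(t, jobs[order[i]][0])      # idle until the next release
        while i < n and jobs[order[i]][0] <= t:
            j = order[i]; heapq.heappush(ready, (rem[j], j)); i += 1
        r, j = heapq.heappop(ready)            # smallest remaining time runs
        if start[j] is None:
            start[j] = t
        nxt = jobs[order[i]][0] if i < n else float("inf")
        run = min(r, nxt - t)
        rem[j] -= run; t += run
        if rem[j] > 0:
            heapq.heappush(ready, (rem[j], j))
        else:
            comp[j] = t
    return list(zip(start, comp))

random.seed(1)
jobs = [(random.randint(0, 20), random.randint(1, 9)) for _ in range(12)]
for (s1, c1) in srpt_intervals(jobs):
    for (s2, c2) in srpt_intervals(jobs):
        # each pair of intervals is disjoint or one contains the other
        assert (c1 <= s2 or c2 <= s1 or
                (s1 <= s2 and c2 <= c1) or (s2 <= s1 and c1 <= c2))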
The decisions SRPT has to make during its computation are (1) whether to preempt the running job when a new job arrives and (2) which job to run when a job is completed. The underlying combinatorial computation of SRPT is the sequence of starts and stops of jobs on the machine. The sequence of these events can be determined from, and, in fact, is equivalent to, the sequence of job pieces in the SRPT schedule. They are also equivalent to the "order of inclusion and succession" among the intervals $v_i$ in the SRPT schedule.

To help visualize the order of inclusion and succession, we introduce the "interval inclusion graph". Roughly, the interval inclusion graph $G(I)$ of an instance $I$ can be constructed from the SRPT schedule of $I$ as follows. First, replace each job piece in the SRPT schedule by a node. Make sure that the nodes still line up on a straight line. Second, for each pair of successive nodes of the same job, add an edge connecting them. The edge drawn should look like the upper half of a circle. Note that $G(I)$ is not exactly a graph because it retains information about the ordering of job pieces in the SRPT schedule.

In what follows, given an SRPT schedule of a fixed instance $I$ with $n$ jobs, we inductively construct (1) the interval inclusion graph $G(I)$ and (2) the constraints of the linear subprogram LP($I$, $n$) of the AND/OR linear program MP($n$). During the construction of $G(I)$, we will maintain an ordered list of "clusters", which are non-empty sets of jobs. Their definitions will be given later. One of the jobs in each cluster will be referred to as the dominating job. Suppose $U_j$ is a cluster with job $j$ as its dominating job. The interval $v_j$ will include the intervals of all jobs in $U_j$. Each cluster $U_j$ will have the following parameters:
- $r(U_j)$ is the release time of cluster $U_j$ and is defined as $r(U_j) = r_j$.
- $p(U_j)$ is the processing time of the dominating job in cluster $U_j$, i.e., $p(U_j) = p_j$.
- $q(U_j)$ is the processing time of all jobs in cluster $U_j$, i.e., $q(U_j) = \sum_{i \in U_j} p_i$.
- $C(U_j)$ is the completion time of cluster $U_j$ and is defined as $C(U_j) = C_j$.

Relabel jobs so that $C_1^{SRPT} \le C_2^{SRPT} \le \cdots \le C_n^{SRPT}$. For the convenience of the discussion, we assume that there is a dummy job 0 with $r_0 = 0$ and $p_0 = 0$.
Initially, cluster $U_0$ is the only cluster in $G(I)$, and job 0 is its only member. Thus, $r(U_0) = r_0 = 0$, $q(U_0) = p(U_0) = p_0 = 0$, and $C(U_0) = r(U_0) + q(U_0) = 0$.

Suppose jobs $1, \ldots, l$ have already been added into $G(I)$, where $0 \le l \le n - 1$. Suppose further that there are $k$ clusters in $G(I)$ at this time. We add job $l+1$ to $G(I)$ as follows. Relabel clusters in $G(I)$ so that $C(U_i) \le r(U_{i+1})$ for $i = 1, \ldots, k-1$. Let $i_1$ be the maximum index such that $C(U_{i_1}) \le r_{l+1}$. Note that $C(U_k) \le C_{l+1}^{SRPT}$ since jobs are ordered by non-decreasing $C_i^{SRPT}$. Job $l+1$ has $k - i_1 + 1$ pieces in the SRPT schedule, one piece just after cluster $U_j$ for $j = i_1, \ldots, k$. Let $p_{l+1}^{(j)}$ denote the length of the piece after cluster $U_j$, for $j = i_1, \ldots, k$. Some of the pieces possibly have zero length. The total length of job $l+1$ is $p_{l+1}$, which is defined as the sum of the lengths of all of its pieces.

In $G(I)$, add a node $p_{l+1}^{(j)}$ to the right of cluster $U_j$ for $j = i_1, \ldots, k$. These nodes represent the pieces of job $l+1$ in the SRPT schedule. Add an edge between nodes $p_{l+1}^{(j)}$ and $p_{l+1}^{(j+1)}$ for $j = i_1, \ldots, k-1$. Replace clusters $U_{i_1}, \ldots, U_k$ in the list of clusters by the cluster $U_{l+1}$ whose members consist of job $l+1$ and the jobs in clusters $U_{i_1}, \ldots, U_k$. Let job $l+1$ be the dominating job of $U_{l+1}$. The following constraints are added into LP($I$, $n$):
$$r(U_{l+1}) = r_{l+1} \quad (7.7)$$
$$p(U_{l+1}) = p_{l+1} = p_{l+1}^{(i_1)} + \cdots + p_{l+1}^{(k)} \quad (7.8)$$
$$q(U_{l+1}) = \left(q(U_{i_1}) + \cdots + q(U_k)\right) + p_{l+1} \quad (7.9)$$
$$C(U_{l+1}) = C_{l+1}^{SRPT} = r(U_{l+1}) + q(U_{l+1}) \quad (7.10)$$
$$r(U_{l+1}) \ge C(U_{i_1}) \quad (7.11)$$
$$r(U_{l+1}) \le C(U_{i_1+1}) \quad \text{if } i_1 + 1 \le k \quad (7.12)$$
$$p_{l+1}^{(j)} \ge 0 \quad \text{for } j = i_1, \ldots, k \quad (7.13)$$
$$p(U_j) \le p_{l+1}^{(j)} + \cdots + p_{l+1}^{(k)} \quad \text{for } j = i_1 + 1, \ldots, k \quad (7.14)$$
Equalities (7.7) to (7.10) simply equate the parameters of cluster $U_{l+1}$ with their values according to their definitions. Constraints (7.11) and (7.12) enforce the fact that $i_1$ is the maximum index such that $C(U_{i_1}) \le r_{l+1}$. Constraint (7.13) enforces that all pieces of job $l+1$ must have non-negative size. For $j = i_1 + 1, \ldots, k$, the first job of cluster $U_j$ preempts job $l+1$. Inequality (7.14) ensures that the SRPT rule is being followed; that is, when the first job of cluster $U_j$ arrives and preempts job $l+1$, the remaining processing time of job $l+1$ is larger than or equal to that of the first job of cluster $U_j$.

Figure 7.1 illustrates the construction of the interval inclusion graph and the constraints of the corresponding linear subprogram. The figure includes (1) an input instance, (2) the SRPT schedule of the instance, (3) the total completion time of the SRPT schedule, (4) the interval $v_i$ of each job $i$ in the SRPT schedule, (5) each step of the construction of the interval inclusion graph, and (6) the constraints of the corresponding linear subprogram created in each step. There are 5 jobs: a, b, c, d, and e. In the SRPT schedule, $C_a \le C_b \le C_c \le C_d \le C_e$. Interval $v_b$ succeeds interval $v_a$. Interval $v_c$ includes intervals $v_a$ and $v_b$. Interval $v_d$ succeeds interval $v_c$. Interval $v_e$ includes intervals $v_c$ and $v_d$. In this figure, the length of the $i$-th piece of job a is represented as $a_i$. Furthermore, the clusters' parameters in the constraints are replaced by their values (as functions of the job parameters). The total completion time of the SRPT schedule is $C_a + C_b + C_c + C_d + C_e$.

Next, we determine the expression of all $S$ in SS($I$). Let us continue with the same example in Figure 7.1. To construct a non-preemptive SRPT-subsequence schedule, there are 3 choices to place job e and 3 choices to place job c. Thus, there are $3 \times 3 = 9$ non-preemptive SRPT-subsequence schedules for $I$. They are listed in Figure 7.2.
The formula for the total completion time of each schedule is given to the left of each schedule.

We have described the construction of the interval inclusion graph and the linear subprogram from the order of inclusion and succession of job intervals. Note that all three of these (the order of inclusion and succession, the interval inclusion graph, and the linear subprogram) are equivalent. So, for the rest of this section, we treat the interval inclusion graph as the specification of the linear subprogram.

Figure 7.1: Construction of the interval inclusion graph and the linear subprogram.

Figure 7.2: Non-preemptive SRPT-subsequence schedules.

7.5.3 Optimizing AND/OR Linear Programs

Now, let us apply the procedure described in the previous section to identify good lower bound input instances for the scheduling problem we are considering. We especially focus on step 3 of the procedure.

Example 1

Figure 7.3 illustrates the execution of the procedure. There are 6 graphs in the figure. The first graph is the initial interval inclusion graph, which is the same example used in Figures 7.1 and 7.2. After optimizing the linear subprogram corresponding to the first graph, the solution is shown in the second graph. In the second graph, a job piece with size 0 is represented as a thick "dot", and a job piece with non-zero size is represented as a thick "line"; the length of each thick line is proportional to the size of the corresponding job piece. Observe that piece $a_1$, the first piece of job $a$, has zero length. Removing $a_1$ from the schedule essentially does not change the total completion time of the SRPT schedule. The third graph is the new interval inclusion graph obtained. In the third graph, the release time of job $a$ is between the completion time of job piece $b_3$ and job $c$. Furthermore, there are now only 6 possible non-preemptive SRPT-subsequence schedules. Note that the other job pieces with zero length cannot be removed from the second graph, because removing them would change the total completion time of the SRPT schedule as well as those of the non-preemptive SRPT-subsequence schedules.

The optimal solution of the linear subprogram corresponding to the third graph is shown in the fourth graph. As in the previous step, piece $a_2$ has zero length and can be removed. The fifth graph is the new interval inclusion graph obtained. The optimal solution of the linear subprogram corresponding to the fifth graph is shown in the sixth graph. At this point, we cannot modify the interval inclusion graph any further. Thus, we have found a local optimum. This instance gives us a lower bound of 1.370.
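Before turning to the next example, the optimization scheme can be seen in miniature. The following self-contained sketch (our illustration, assuming scipy is available; not code from the dissertation) solves a toy AND/OR program for a hypothetical two-job instance: one job of processing time $p$ released at time 0 and one zero-length job released at time $r$, normalized so that the SRPT total completion time $p + r$ equals 1. Each OR branch fixes which non-preemptive schedule prevails and is then an ordinary linear program; optimizing the AND/OR program means taking the best branch.

    from scipy.optimize import linprog

    # Variables: x = [p, r].  SRPT total = p + r = 1 (normalization).
    # Schedule "long job first" has total 2p; "short job first" has 2r + p.
    branches = [
        # Branch 1: 2p <= 2r + p (long-first prevails); maximize 2p.
        {"c": [-2.0, 0.0], "A_ub": [[1.0, -2.0]], "b_ub": [0.0]},
        # Branch 2: 2r + p <= 2p (short-first prevails); maximize 2r + p.
        {"c": [-1.0, -2.0], "A_ub": [[-1.0, 2.0]], "b_ub": [0.0]},
    ]
    best = 0.0
    for br in branches:
        res = linprog(br["c"], A_ub=br["A_ub"], b_ub=br["b_ub"],
                      A_eq=[[1.0, 1.0]], b_eq=[1.0],
                      bounds=[(0, None), (0, None)])
        if res.success:
            best = max(best, -res.fun)   # linprog minimizes
    print(best)   # 4/3, attained at p = 2/3, r = 1/3

Both branches peak at $p = 2/3$, $r = 1/3$, giving a lower-bound ratio of $4/3$ for this toy family; the programs in the examples above differ only in having many more variables and many more OR branches.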
Example 2

Let us look at another example, shown in Figure 7.4. The first graph in the figure is the initial interval inclusion graph. The second graph is the solution obtained from optimizing the linear subprogram corresponding to the first graph. Observe that all job pieces in front of piece $b_1$ have zero size; therefore, all of them can be removed without affecting the total completion time of the SRPT schedule. Job pieces $c_1$, $c_2$, $c_3$, and $c_4$ can also be removed. The other job pieces with zero size cannot be removed, because otherwise the total completion times of the SRPT schedule and of the non-preemptive SRPT-subsequence schedules would be affected. The third graph is the new interval inclusion graph obtained. The optimal solution of the corresponding linear subprogram of the third graph is shown in the fourth graph. We cannot modify the interval inclusion graph any further. Thus, a local optimum is reached. This instance gives us a lower bound of 1.403.

Example 3

Figure 7.5 is our third and final example. The first graph is the initial interval inclusion graph.

Figure 7.3: The first example of optimizing an AND/OR linear program.

Figure 7.4: The second example of optimizing an AND/OR linear program.

We progress similarly. The second graph is the solution from optimizing the first graph. However, there is something different from before. The third graph is obtained by removing job pieces $a_1$, $a_2$, $c_1$, and $c_2$. In this graph, among others, job $b$ includes jobs $u$, $v$, and $c$. However, there are no pieces of job $b$ between $u$ and $v$, or between $v$ and $c$. This does not follow our construction of the interval inclusion graph. The fourth graph is obtained by adding 2 pieces of job $b$ between jobs $u$, $v$, and $c$. There is a side effect of adding these pieces: there will be more non-preemptive SRPT-subsequence schedules, which are potentially better than the existing ones. However, since our goal is to gain intuition, not to prove anything, this modification is acceptable. Moreover, for this particular instance, it turns out that the new non-preemptive SRPT-subsequence schedules associated with the two new pieces of job $b$ do not have a better total completion time than the existing ones. We then proceed as before until a local optimum is reached.

Structure of the Lower Bound Instance Obtained

By inspecting the final solutions in all 3 examples, the structure of (locally) optimal lower bound instances is now evident. There is one long job released at time 0, together with a number of zero-length jobs. Some are released so that they preempt the long job; some are released when the long job has just finished. With a closer look at the optimal solutions obtained and with some more analysis, we found the lower bound instances as defined in Chapter 6, and that concludes this section.

Figure 7.5: The third example of optimizing an AND/OR linear program.
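Readers who wish to experiment with this structure can do so directly. The sketch below hard-codes a family in the shape just described (one job of processing time 1 released at time 0, plus zero-length jobs at chosen release times) and computes the resulting lower-bound ratio. The parameterization is ours, for illustration only; the rigorous family and its analysis are the subject of Chapter 6.

    def ratio_for(release_times):
        """Lower-bound ratio for one long job (length 1, released at 0)
        plus zero-length jobs released at the given times, assumed sorted
        and lying in (0, 1]: (best non-preemptive total completion time)
        divided by (SRPT total completion time)."""
        m = len(release_times)
        # SRPT: each zero-length job completes on release; the long job,
        # preempted only for zero time, completes at time 1.
        srpt_total = 1.0 + sum(release_times)
        best = float("inf")
        # Candidates: run the first i short jobs at their release times,
        # start the long job right after, run the rest when it finishes.
        for i in range(m + 1):
            start = release_times[i - 1] if i > 0 else 0.0
            finish = start + 1.0
            total = finish                        # the long job
            total += sum(release_times[:i])       # short jobs run before it
            total += (m - i) * finish             # short jobs run after it
            best = min(best, total)
        return best / srpt_total

    print(ratio_for([0.5]))   # 1.333..., matching the toy program above

Searching over longer release-time vectors is one way to explore stronger instances of this shape.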
7.6 Discussion

In the literature, linear programming algorithms have been used as subroutines in polynomial-time approximation algorithms. In contrast, we use AND/OR linear programs to find hard instances of a scheduling problem. An interesting question is how well our technique can be applied to other problems. Another interesting question is whether we can apply this idea to help design and analyze approximation algorithms.

APPENDIX

Appendix A

Bit Summation Inequalities

Definition A.1. For $n \ge 1$, let $B(n)$ be the set of non-negative integers such that $\sum_{b \in B(n)} 2^b = n$.

Lemma A.1. For $n \ge 1$, $\sum_{b \in B(n)} 2^b (b' - b) < n$, where $b' = \max B(n)$.

Proof. Let $s_n = |B(n)|$, and let $b_1 < b_2 < \cdots < b_{s_n}$ be the elements of $B(n)$.

$\sum_{b \in B(n)} 2^b (b' - b) = \sum_{1 \le i \le s_n} 2^{b_i} (b_{s_n} - b_i)$

$= \sum_{1 \le i \le s_n - 1} 2^{b_i} (b_{s_n} - b_i)$

$= \sum_{1 \le i \le s_n - 1} 2^{b_i} \sum_{i+1 \le j \le s_n} (b_j - b_{j-1})$

$= \sum_{1 \le i \le s_n - 1} \; \sum_{i+1 \le j \le s_n} 2^{b_i} (b_j - b_{j-1})$

$= \sum_{2 \le j \le s_n} \; \sum_{1 \le i \le j-1} 2^{b_i} (b_j - b_{j-1})$