17158.3 L \ 0 t, _ LIBRARY 10 Michigan State University This is to certify that the thesis entitled INTER-PARTITION NETWORKING FOR OVERLAY MULTICAST presented by SAULEH EETEMADI has been accepted towards fulfillment of the requirements for the Master of degree in Electrical and Computer Science Engineering /" 7 9 '-~ 4 l ./ a :;.-" ,J' ‘5' hi. fig/:1) K. // / III-z / v; ,A» 2 -47 A. 5;;- / 'I " /‘"Major Professor’s Signature ”/M‘i‘fi‘z‘f [/3 Pi’vf“) f" Date MSU is an Affirmative Action/Equal Opportunity Institution PLACE IN RETURN BOX to remove this checkout from your record. To AVOID FINES return on or before date due. MAY BE RECALLED with earlier due date if requested. DATE DUE DATE DUE DATE DUE JAN 0 6 2003.4 2/05 p:/ClRC/Date0ue.indd-p.1 INTER-PARTITION NETWORKING FOR OVERLAY MULTICAST By Sauleh Eetemadi A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE Department of Electrical and Computer Engineering 2005 ABSTRACT INTER-PARTITION NETWORKING FOR OVERLAY MULTICAST By Sauleh Eetemadi A network partition is formed when a set of network nodes is divided into non- overlapping subsets of nodes. Hence, each member of a network partition, known here as an island, consists of network nodes that do not belong to any other island. Within each island, one or more nodes, known here as reflectors, are responsible for communicating all content that is either originating from the island or destined toward that island. Inter- partition networking addresses the problem of communications among the islands of a network partition through their reflectors. Inter-partition networking and its related optimization problems have not been a focus for researchers in the context of overlay networks. Overlay networks bring a new dimension to the problem. In overlay networks, any node within a partition is a potential candidate for inter-partition communication. In this case, the key problem is which nodes should be selected as reflectors in order to achieve the best possible networking performance. In this work, we focus on providing multicast connectivity across the Internet by inter-partition networking. We prove that the inter-partition optimization problem (under certain constraints) is NP—hard. We propose an algorithm to solve this problem in polynomial time by relaxing one of the constraints. Finally, under another contribution of this work, we estimate the optimum solution by deriving the probability distribution function of the inter-partition network capacity, both analytically and numerically. We show that our estimate becomes more accurate as the number of nodes increase. To the one who will establish justice, freedom and peace To all who died for it To my parents and my dear wife iii ACKNOWLEDGMENTS My feeble expression fails to describe my thankfulness to the most praiseworthy. I, therefore, use His words: "And say: ‘All the praises and thanks be to Allah.” The Quran (17:11]). In addition, I would like to extend the deepest gratitude to my advisor, Professor Hayder Radha, for supporting and mentoring me from my first day at Michigan State to date. My research, in general, and this thesis, in particular, would not have been possible without his relentless contribution and innovation. I continue to be humbled and inspired by his immeasurable intellect both as a researcher and an instructor. I am indebted to my family, especially my parents, for their continuous support and motivation throughout this effort. Words cannot express my appreciation for the selfless outlook and affectionate prayers of my parents and my wife. I thank them for always putting my interest ahead of their personal aspirations. I would like to thank Shirish, Ali and Usman for being wonderful colleagues and even better friends. I am also grateful to my colleagues in Sharif University, Bashir, Vahab and Mohammad Taghi for helping me with identifying problems, thinking through the problems, and proving theorems. I’m also grateful to Dr. Rabiee at Sharif University for supporting me to pursue my graduate studies and my undergraduate advisor, Dr. Jalili, for providing me the opportunity to work at his lab and helping me to refine my future research directions. iv TABLE OF CONENTS CHAPTER 1 INTRODUCTION 1 CHAPTER 2 BACKGROUND ' s 2.1 [P MULTICAST ................................................................................................................ 5 2.1. I [P Multicast Addressing ............................................................................................ 6 2.1.2 Multicasting on a single physical network ................................................................ 8 2.1.3 Multicasting between network segments ................................................................... 8 2.2 GRAPH THEORY .............................................................................................................. 9 2.3 ALGORITHM COMPLEXITY ............................................................................................ 10 2. 3. 1 Complexity Classes ................................................................................................. l 1 2.3.2 Decision Problems vs. Optimization Problems ....................................................... 12 CHAPTER 3 REFLECTOR BASED CONTENT DISTRIBUTION 14 3.1 MOTIVATION ................................................................................................................ 14 3.1.1 Alternatives to IP multicast ..................................................................................... 15 3. l. l. l Unicast/Multicast Reflector ......................................................................................... 15 3. 1.1.2 Application level multicast .......................................................................................... 16 3.1.1.3 Fair exchange in P2P networks: Bittorrent ................................................................... 16 3.2 HYBRID SOLUTION '10 MULTICAST ............................................................................... 16 3.3 PROBLEM DEFINITION .................................................................................................. 18 3.4 SINGLE REFLECTOR SOLUTION ..................................................................................... 18 3.4.1 Design ..................................................................................................................... 20 3.4.2 Performance Analysis ............................................................................................. 25 3.5 Two REFLECTOR ENHANCEMENT ................................................................................ 28 3.5.1 Design ..................................................................................................................... 31 3.5.2 Performance Analysis ............................................................................................. 3] 3.6 MULTIPLE REFLECI‘OR EXTENSION .............................................................................. 32 3. 6. I Design ..................................................................................................................... 33 3.6.2 Performance Analysis ............................................................................................. 34 CHAPTER 4 REFLECTOR OPTIMIZATION 35 4.1 OPTIMIZATION CONSTRAINTS ....................................................................................... 36 4.1.1 Number of Reflectors per Multicast Island ............................................................. 36 4.1.2 Paths vs. Flows ....................................................................................................... 37 4.1.3 Minimum Congestion vs. Maximum Throughput .................................................... 38 4.2 PROBLEM CLASSIFICATION ........................................................................................... 39 4.3 RELATED WORK ........................................................................................................... 42 4.4 ONE REFLECTOR PER MULTICAST ISLAND .................................................................... 44 4.4.1 Choosing Reflectors to Maximize Multicast Tree Capacity .................................... 45 4.4.2 Considering Several Multicast Trees ...................................................................... 47 4.4.3 Choosing Reflectors to Maximize Network Throughput ......................................... 51 4.5 RELAXING ONE REFLECTOR PER MULTICAST ISLAND CONSTRAINT ............................. 54 4. 5. 1 Problem Definition .................................................................................................. 55 4. 5. 2 Optimization Algorithm ........................................................................................... 55 4.5.2.1 Maximum Capacity Multicast Tree ............................................................................. 55 4.5.2.2 Maximum Capacity Multicast Tree for Reflectors ........................................................ 56 4.5.2.2.] Algorithm: Phase One ......................................................................................... 56 4.5.2.2.2 Algorithm: Phase Two ........................................................................................ 57 4.5.3 Complexity of the Algorithm ................................................................................... 58 4.6 THE CAPACITY OF THE MAXIMUM MULTICAST TREE ................................................... 59 4.7 IMPLEMENTATION ISSUES ............................................................................................. 65 4.8 SIMULATION RESULTS .................................................................................................. 66 CHAPTER 5 CONCLUSION AND FUTURE WORK 70 APPENDIX A 72 REFERENCE 74 vi LIST OF TABLES TABLE I - THE REGISTRAR SERVER PROTOCOL .............................................................................................. 24 TABLE 2 - COMPLEXITY 0F PHASE ONE OF THE ALGORITHM .......................................................................... 58 vii LIST OF FIGURES FIGURE 1 - HOW MOST THEORETICAL COMPUTER SCIENTISTS VIEW THE RELATIONSHIPS AMONG F, NP AND NPC [25] .............................................................................................................................................. 12 FIGURE 2 - ROLE OF THE HYBRID SOLUTION .................................................................................................. 17 FIGURE 3 - ONLY ONE REFLECTOR ARCHITECTURE (MC USER: MULTICAST USER) ....................................... 19 FIGURE 4 - UN ICAST LISTENER DESIGN ......................................................................................................... 21 FIGURE 5 - MULTICAST LISTENER DESIGN ..................................................................................................... 22 FIGURE 6 - REGISTRAR SERVER DESIGN ........................................................................................................ 23 FIGURE 7 - TABLE MAINTAINER DESIGN ....................................................................................................... 25 FIGURE 8 - PERFORMANCE ANALYSIS FOR THE SINGLE REFLECTOR ARCHITECTURE: THE REFLECTOR LOAD m TERMS OF NUMBER OF TRANSMISSIONS As THE NUMBER OF USERS INCREASE INSIDE OR OUTSIDE THE MAIN MULTICAST ENABLED NETWORK ................................................................................................. 26 FIGURE 9 - PERFORMANCE ANALYSIS FOR THE SINGLE REFLECTOR ARCHITECTURE: THE UPIDAD BANDWIDTH USAGE FROM REFLECTOR TO MULTICAST DlSCONNECTED USERS VERSUS THE DOWNLOAD BANDWIDTH USAGE FROM MULTICAST DlSCONNECTED USERS TO THE REFLECTOR. .................................................. 27 FIGURE 10 - Two REFLECTORS ARCHITECTURE ............................................................................................. 28 FIGURE 1 1 - ONE SEPARATE REFLECTOR WITH ONE HIDDEN REFLECTOR ........................................................ 29 FIGURE 12 - NO SEPARATE REFLECTORS ........................................................................................................ 30 FIGURE 13 - PERFORMANCE OF TWO REFLECTOR ARCHITECTURE VERSUS ONE REFLECTOR ARCHITECTURE.. 31 FIGURE 14 - MULTIPLE REFLECTORS, ONE PER EACH MULTICAST ISLAND: THE USERS CIRCLED IN DO’ITED LINES ARE THE CHOSEN REFLECTORS AND THE TREE FORMED BY THE THICK LINES, BETWEEN THE REFLECTORS, Is THE MULTICAST TREE ..................................................................... .33 FIGURE 15 — P2P NETWORK OF REFLECTORS .................................................................................................. 35 FIGURE 16 - PROBLEM CLASSIFICATION .......................... . ............................................................................. 40 FIGURE 17 - GRAPH REPRESENTATION OF P2P NETWORK OF REFLECTORS ....................... - .44 FIGURE 18 - A SINGLE MULTICAST TREE WITH DIFFERENT FLOWS STARTING AT DIFFERENT NODES: (A), (B) AND (C) HAVE THE SAME MULTICAST TREE ........................................................................................... 45 FIGURE 19 - SEVERAL MULTICAST TREES OVER REFLECTORS ........................................................................ 48 FIGURE 20 - CREATING GRAPH R FROM GRAPH G": IN THIS EXAMPLE WE USE K=3, THAT MEANS WE ARE LOOKING FOR A CLIQUE OF SIZE 3. ........................................................................................................ 53 FIGURE 21 - ONE REFLECTOR PER INTER-ISLAND LINK ................................................................................... 54 Viii FIGURE 22 - PDF FUNCTION FOR DIFFERENT GRAPH SIZES ............................................................................. 62 FIGURE 23 - THE MAXIMUM CAPACITY OF THE MULTICAST TREE, As THE NUMBER OF NODES INCREASE ....... 63 FIGURE 24 - THE LOGARITHMIC GROWTH RATE OF THE CAPACITY OF THE MAXIMUM- CAPACITY MULTICAST TREE AS THE NUMBER OF NODES IN THE GRAPH INCREASE .................................................................... 64 FIGURE 25 - THE PDF, fC (x) , EVALUATED AT THE MEAN FOR DIFFERENT GRAPH SIZES ............................. 64 FIGURE 26 - EFFECTS OF THE NUMBER OF MULTICAST ISLANDS ON THE OPTIMUM MULTICAST TREE CAPACITY ............................................................................................................................................................. 67 FIGURE 27 - EFFECTS OF THE SIZE OF MULTICAST ISLANDS ON THE OPTIMUM MULTICAST TREE CAPACITY 68 FIGURE 23 - THE BEHAVIOR OF OPTIMUM MULTICAST TREE CAPACITY FOR SEVERAL MULTICAST GROUPS... 69 ix Chapter 1 Introduction Common network protocols that are employed over the Internet are designed to operate between hosts, independent of their physical or network location. In other words, in general, two hosts communicate with each other in the same fashion, whether they are (geographically) next to each other, or across the Internet. While toady’s networking protocols are not aware of geographical, organizational and political borders, the nature of these communications are different in terms of bandwidth, security and energy consumption requirements. Once a network protocol becomes aware of different borders and partitions, a set of optimization problems arise, that is, how to employ these information to maximize security, or minimize resource (e.g., bandwidth or energy) consumption. These optimization problems are especially interesting when more than two users/hosts are involved in a communication protocol. There are other set of factors, probably more technical in nature, that could divide (or partition) 3 set of communicating nodes over the Internet. Some nodes, for example, have broadband access while others have dial-up connections. Therefore, in general, technical and/or non-technical aspects may create different partitions of communicating nodes. This leads to the notion of a network partition which is formed when a set of network nodes is divided into non-overlapping subsets of nodes. Researchers have been addressing a variety of network partition problems for several decades. A relatively trivial example arises when considering the well known IPv6 versus IPv4 problem: some nodes may be able to communicate using IPv6 while others only support IPv4. Solution to this and many other network partition problems have been investigated and addressed 1 thoroughly. However, a unifying and generic treatment for network partition problems has not been presented. The objective of this thesis is two folds: (1) Provide a rather generic framework for network partition problems; in particular, we focus on inter- partition networking as explained below. (2) Motivate the generic network-partition framework by focusing on a particular problem in inter-partition networking for overlay multicast. Before proceeding, it is important to highlight few definitions. Each member of a network partition, known here as an island, consists of network nodes that do not belong to any other island. Within each island, one or more nodes, known here as reflectors, are responsible for communicating all content that is either originating from the island or destined toward that island. Therefore, the islands of a network partition communicate through their reflectors. Inter-partition networking addresses the problem of communications among the islands of a network partition (i.e., among non-overlapping subsets of network nodes) through their reflectors. Although inter-partition networking has applications in several Internet applications and services, sensor—networks, secure communication and mobile networking, we motivate inter-partition networking by focusing on maximizing the available bandwidth between a group of users, based on multicast connectivity information. We make the distinction between multicast enabled networks, multicast islands, and multicast disconnected users, since unicast traffic can never achieve the efficiency and performance of IP multicast. Expectations for multimedia support on the Internet have grown faster than the network providers’ bandwidth limits. Internet users expect services such as Internet TV, video on demand and high quality video conferencing. Although there are hardware limits to the quality of service that Internet users can experience, increasing the efficiency of one-to-many and many-to-many protocols can enable some of these services, and increase the quality of service for others. The remainder of this thesis, including a brief list its contributions, is outlined below. Chapter 2 provides background information, that is used in the later chapters , on IP multicast, graph theory and algorithm complexity In Chapter 3, we present a simple single-reflector1 based approach to provide multicast connectivity across the Internet. In this case, the load increase on the reflector Side is proportional to the square of the number of multicast disconnected users. We enhance this model by adding a second reflector. In this case, the load increase on the reflector side decreases to a linear function of the number of multicast disconnected users. The two reflector model is only capable of connecting two multicast disconnected networks (multicast islands). Next, we extend the two reflector model to a multi-reflector model to be cable of connecting multiple multicast islands. This model will eventually lead to a multicast enabled Internet, but through a multi-reflector architecture. In this architecture reflectors form an overlay network. The performance of the multi-reflector model can not be studied without identifying and constraining the model. In Chapter 4 we claim that, if the reflectors are constraints to only one reflector per island, findin g an optimum inter-partition distribution ' A reflector iS a network entity that is designed to provide multicast connectivity to multicast disconnected “861' S. tree is NP-hard. Then, we prove that if the number of multicast groups (each multicast group forms an inter-partition distribution tree) is high, selecting the reflectors which can form the optimum distribution tree is NP-hard. Next, we relax the one reflector per multicast island constraint, to one reflector per inter-island communication. We develop an algorithm with complexity 0(K2E(X )2) which selects the best reflectors and forms the best multicast tree among them (K is the number of multicast islands, and E (X ) is the average number of users per multicast island). Finally, after finding the optimum multicast tree, we estimate the capacity of the optimum multicast tree using both analytical and numerical analysis. We used a graph With random edge capacities, drawn from a Gaussian distribution. We observe that the capacity of the optimum multicast tree has a growth rate of Log(K) as the number of nodes in the graph, K, increases. Moreover, we Show that our estimate of the capacity of the Optimum multicast tree becomes more accurate as the number of nodes in the graph increases. Chapter 2 Background In this chapter we provide necessary background information on IP multicast, graph theory and algorithms complexity. 2.1 IP Multicast IP multicast ([27]-[34]) is one form of networking that is employed over the Internet and other IP-based packet networks. In general, a single packet is classified into different traffic types depending on its destination: 0 Unicast: A packet destined for one computer. Most of the Internet traffic is unicast. For example, a HTTP2 request sent from a browser to a web server. 0 Broadcast: A packet destined for all computers within a certain scope. This traffic type is normally used for local discoveries and announcements such as Windows name service. It’s important to differentiate between MAC3 layer broadcast and IP layer broadcast. An example of MAC layer 2 Hyper Text Transport Protocol 3 Medium Access Control: This network sub-layer is responsible for controlling access to the physical transmission medium on a local area network. The functionality of this layer is built into the network adaptor, and contains a unique 48 bit address which is used for communication within the local area network. broadcast is an ARP4 request, which does not travel beyond any router, while [Player broadcast, may travel beyond a number of routers. o Anycast: A packet destined for any computer out of a number of possible (or candidate) computers. This traffic type is normally used for load balancing, Where a client connects to a server through an anycast address. A local router at the server Side decides which one of the servers will receive this packet. 0 Multicast: A packet destined for more than one computer. Similar to the broadcast case, it’s important to differentiate between MAC layer multicast and IP Layer multicast. MAC layer multicast is a single packet transmission over the physical link, which doesn’t travel beyond routers. On the other hand, in IP layer multicast, the sender sends one packet destined for a group IP address. The routers may need to replicate this packet and send it along different paths to get the data to all members of that multicast group. [P layer multicast exploits MAC layer multicast, whenever it’s available. 2.1.1 IP Multicast Addressing IP multicast traffic uses class D IP addresses to communicate. This class of IP addresses range from 224.0.0.0 through 239.255.255.255 [21]. For each IP address in this range, there is a set of zero or more hosts, who listen for traffic coming for that IP ‘ Address Resolution Protocol is a protocol for dynamically mapping IP addresses to MAC addresses. multicast group. These set of hosts are called a host group. A host that sends packets to a group may not be a member of that group. It might not even know about the current members of the group. Some class D IP addresses are reserved by IANA5 for permanent host groups such as the followings: o 224.0.0.0: Reserved base address. 0 224.0.0.l: All systems on this subnet. 0 2240.02: All routers on this subnet. 0 2240.05: All OSPF6 routes on this subnet. These reserved multicast addresses can be obtained from mcast.net using a command such as the following: C:\> nslookup dhcp-agents.mcast.net Name: dhcp-agents.mcast.net Address: 224.0.0.12 On the other hand, to find the name and purpose of a reserved IP multicast address, the information can be obtained from 224.in-addr.arpa using a command such as the following: C:\> nslookup -querytype=ptr 11.0.0.224.in—addr.arpa ll.0.0.224.in-addr.arpa name = MOBILE-AGENTS.MCAST.NET 5 Internet Assigned Number Authority (IAN A) is an organization responsible for assigning IP addresses on the Internet. 6 Open Shortest Path First routing protocol 2.1.2 Multicasting on a single physical network Multicasting a packet on a single physical network is simple. On the sender side, if a packet’s destination IP address is in class D, the device driver automatically maps that IP address to a 48 bit multicast MAC address through a static mapping. Notice that unlike unicast, a static mapping is used instead of ARP for converting multicast IP addresses to multicast MAC addresses. On the receive Side the network application lets the device driver know about the multicast IP address it’s interested in, by setting a multicast option on its network socket7. The device driver constructs the associated MAC address and passes a copy of the packets destined at that MAC address, to all applications subscribed to that multicast group. To avoid any conflict between IP multicast mapped MAC addresses and normal MAC addresses, IANA has reserved a range of MAC addresses from OlOOSEOOOOOO8 to 01005E7FFFFF for MAC layer multicast addresses. 2.1.3 Multicasting between network segments IP multicast is not limited to a single physical network. Different protocols have been developed to multicast across network segments. There are two main issues when multicasting within multiple network segments. First, determining multicast participants: This is a mechanism to determine Whether to forward a packet to a particular network segment or not. This mechanism is designed in the IGMP9 protocol. The second issue is determining the scope of multicasting. In general, multicast traffic can be distributed 7 A socket is a programming interface for inter-process and network communications. 8 The MAC address is represented in hexadecimal numbers. 9 Internet GroUp Management Protocol throughout the Internet. The scope of multicasting determines how far the traffic should travel. The distance is measured by the number of hops, and is specified by packet ’I'I‘L'O. Using this technique, one can search for the closest server providing some service, by sending probe packets with initial TTL set to one, and increasing the T'I‘L after an unsuccessful probe. IP multicast is highly efficient and commonly used within a single physical network, but in many cases it’s avoided for multicasting among multiple network segments. This issue is further investigated in Chapter 3. 2.2 Graph Theory Graph theory is one of the major tools for studying the behavior of computer networks. Here, graph theory related notations used in the following chapters are explained. A graph is a mathematical representation for a relationship between two entities. Specifically, a graph G(V,E) consists of a non empty finite set of vertices, V, and a finite set of edges, E, consisting of pairs of distinct vertices from V ([23]). If V = {v,,v2,...,vn} is a set of n vertices, E = {e,,e2,...,em} is a set of m edges each correspond to two distinct members of V, v,- and V]. Different notations are used for representing the relation between edges and vertices: e1. 6 E is represented as {vpvj} , vivj or ij , if e,- connects the vertex v,- to vertex Vj. The size of graph G is defined as the cardinality of the vertex set, V. A pair of ‘0 Time To Live: This is a field in IP header which determines the maximum number of hops a packet could travel. vertices v,- and Vj are adjacent on ek, if ek corresponds to v,- v,-, In this case, v,- and v,- are adjacent on ek, and ek is incident with v,- and v,-. To model computer networks as a graph, nodes are represented as vertices and the links between nodes are represented by edges. We extend the graph model by assigning capacities to edges and we denote this graph by G(V, E, C) , Where C is a function which takes an edge as the input, C(ei), and give the corresponding capacity of the edge as the output, q. We use C(e,) and c, interchangeably to denote the capacity of edge e,~. 2.3 Algorithm Complexity The performance of an algorithm depends on its input. The runtime of an algorithm can be analyzed in the best-case, worst—case and average-case, depending on its input ([2]). In this work, we concentrate on the worst-case running time, that is, the longest running time of an algorithm for any input of size n. The order of growth of the running time of algorithms is used for comparison between algorithms. In this thesis we compare the complexity of algorithms by the asymptotic upper bound of the worst-case running time of algorithms. The asymptotic upper bound for a function g(n), 0(g(n)) (pronounced “big-oh of g of n” or sometimes just “oh of g of n”), is defined as follows: 0(g (n)) ={f(n) : 3c > 0,n0, such that Vn 2 no, 0 S f(n) S cg (n)} The followings are examples of using the O-notation: c an2 +bn+c€ 0(n2) 0 n6 0(n2) 10 o lOOOOOn e 0(n) 0 n2 E 0(n) 0 any polynomial p(n) e 0(e") - Log (n) E 0(any polynomial p(n)) o e” E 0(any polynomial p(n)) 2.3.1 Complexity Classes Different complexity classes have been defined to identify and compare the complexity of problems. The main complexity classes used in this work are: P”, NP”, NPCl3 and NP-hard. Here, we provide intuitive and informal definitions for these complexity classes. The class P consists of decision problems that are solvable in polynomial time. More specifically, problems that can be solved in 0(n" ) , where n is the input size of the problem and k is some constant. The class NP consists of decision problems that are verifiable in polynomial time. In other words, if one claims a solution S to satisfy a NP problem, this claim can be verified in polynomial time. Therefore, any problem in P is automatically in NP as well, P g NP. However the open question is whether P is a proper subset of NP [2] (Figure l). ” Polinomial '2 Non-deterministic Polynomial time '3 NP-Complete ll NP Problems P Problems NP Complete Figure l - How most theoretical computer scientists view the relationships among P, NP and NPC [25] The class NPC consists of decision problems that belong to NP and are at least as hard as any problem in NP. A problem Q is as hard as a problem R, if R is reducible to Q. Problem R is reducible to problem Q if R can be transformed to R’ in polynomial time and a solution S satisfies Q, if an only if it satisfies R’. The class NP-hard, consists of problems that are at least as hard as any problem in NP but do not necessarily belong to NP. Therefore, a problem is NPC if it is NP and NP- hard. Notice an NP-hard problem is not necessarily a decision problem. 2.3.2 Decision Problems vs. Optimization Problems Many problems of interest are optimization problems, where there are many possible solutions, each have a value associated with them, and we are interested in the solution with the highest value. Although the NPC class is limited to decision problems, Optimization problems can be expressed as a decision problem as well. For example a minimization problem can be changed to a decision problem by saying: is there a solution 12 with a value, as good as k. Then by doing a binary search over the range of values, the minimization problem can be solved. An example of an NP-hard problem is the traveling salesman problem [26]: If a sales man, starting from his home city, it to Visit exactly once each city on a given list and then return home, what is the minimum length of his travel? This NP-hard problem can be translated to an NPC problem as follows: If a sales man, starting from his home city, is to visit exactly one each city on a given list and then return home, is it possible for him to finish his journey in k days“? If the length of the salesman’s trip is expressed in terms of days, solving the NPC problem multiple times would lead us to the solution of the NP-hard problem. In fact, if the maximum number of days is N, the NPC problem should be used no more than Log2(N) times, using a binary search, to solve the NP-hard optimization problem. Establishing a problem to be NPC or NP-hard provides good evidence for its intractability. AS an engineer, one would then spend time on developing an approximation algorithm or solving a tractable special case, rather than searching for a fast algorithm that solves the problem exactly [2]. '4 Assuming the salesman stays in each city for zero days. 13 Chapter 3 Reflector Based Content Distribution Content delivery to a group of users has been the focus of research for many years. There are several approaches to group content delivery including: IP multicast ([27]-[34]), application layer multicast ([7], [9], [10], [17] and [18]) and fair exchange p2p networks ([36]) such as Bittorrent ([35]). IP multicast is the first effort to provide group communication services. Deployment of IP multicast in the Internet is still behind expectations, and therefore other approaches have been developed to provide reliable group communication services. 3.1 Motivation Today many routers in the Internet are not multicast enabled. There are both marketing and technical reasons [14], [15]: 0 Many protocols such as monitoring and congestion control protocols are not yet fully addressed and finalized. o Multicast breaks the traditional pricing model where the sender pays for his/her local bandwidth consumption (up to the first router). In the case of multicast, the bandwidth usage gets multiplied by a number greater than or equal to one at each intermediate router. - The "egg and chicken" problem: There are no good applications (no customer demand) operating on top of IP multicast. On the other hand, a 14 stable 1P multicast platform is not in place to develop and deploy a good multicast application (to attract customers). I? multicast lack of monitoring and congestion control protocols and the pricing problem have turned H3 multicast into an error prone and problematic protocol. Even though many of these issues have been resolved, and IP multicast can be deployed as a robust and stable service, because of the "egg and chicken" problem and it's bad reputation Internet service providers and in general network administrators tend to have this service disabled. 3.1.1 Alternatives to IP multicast Many alternatives to IP multicast are developed to address its problems and provide group communication services. 3.1.1.1 Unicast/Multicast Reflector In this solution, if one or more routers between the multicast source and a host are multicast disabled, the host contacts a reflector through unicast. The reflector is capable of receiving the desired multicast traffic, and tunnels that traffic to the host throughout a unicast tunnel. UMTP15 [16] and Mtunnel [17] are examples of such an effort. These examples consider at most two reflectors, one on the multicast disabled host Side, and another on the multicast source side. This solution overcomes the "egg and chicken" problem but introduces another class of marketing problems. The reflector is paying a high price for replicating the '5 UDP Multicast Tunneling Protocol 15 content for each multicast disabled host. On the other hand, if the reflector becomes the main distributor for a wide range of multicast disabled hosts, content delivery is no more efficient. This solution, although not solving all the technical and marketing problems, is one practical way for enabling IP multicast. 3.1.1.2 Application level multicast This solution solves some of the marketing and technical problems by relying on end systems [7]. In this approach the group of users that are interested in the same content form an overlay network. This overlay network emulates IP multicast by assigning some users to be routers and others to be leaves. This takes the pricing problem to the user level where it no more relies on the network infrastructure and service providers. It also introduces another class of technical problems such as group addressing. 3.1.1.3 Fair exchange in P2P networks: Bittorrent Bittorrent-like solutions ([35], [36] and [37]) rely on distributing chunks of data to different peers and let them distribute it among themselves. Although this is a great solution for offline file distribution, it is clear that due to the huge delay and receiving chunks of data out of order, it’s not a solution for real time applications such as Internet TV, Video conferencing or chat. 3.2 Hybrid Solution to Multicast Internet multicast connectivity is the result of each and every router’s decision about Enabling/Disabling multicast traffic on different network interfaces. ISPS have 16 been very fortunate due to the “egg and chicken” problem. Although ISPS are selling broadband Internet connections over DSL and Cable, but there are no good bandwidth consuming applications with a reasonable cost to the users and providers“, and therefore the bandwidth they are paying for is not fully utilized. We believe that there is a need for a group communication protocol which can take advantage of pure IP multicast when available, and application layer multicast where [P multicast is not available. We call such a tool a hybrid solution for group communication protocol. With such a tool, the routers can make the decision for their users to use IP multicast versus application level multicast. In other words, given such a tool the level of multicast connectivity in the Internet will be determined by routers. This leads to the formation of multicast islands. A multicast island is a set of hosts that are connected Via IP multicast-enabled routers. Consequently, under this hybrid solution, routers determine the diameter of multicast islands over the Internet. Figure 2 visualizes Pure IP Multicast the role of this tool. Application Level Multicast Reliability Performance and Multicast Connectivity Diameter of Multicast Islands Figure 2 - Role of the Hybrid Solution ‘6 For example, TV channels don’t have live broadcast over the Internet for free, because with unicast the bandwidth consumption is too high and so is the cost. On the other hand, users who are already paying for their Internet broadband connection can’t find bandwidth consuming applications such as video on demand, cable TV, etc., at a reasonable cost. 17 Application level multicast and pure IP multicast are two extremes to this hybrid solution. On one extreme, there is high performance, high multicast connectivity and there is only one multicast island covering the whole Internet. On the other extreme, the number of multicast islands is as many as the number of nodes that are trying to be part of a group communication session over the Internet. In this later case, the diameter of each multicast island is zero. The proposed hybrid solution enables routers to adjust the number and size (diameter) of multicast islands. 3.3 Problem Definition AS explained in earlier sections, IP multicast is not delivering a group content distribution protocols over the Internet. On the other hand, IP multicast is enabled over all local area networks, and even beyond that, within many university campuses, companies and organizations. The idea is to take advantage of IP multicast where available, and provide alternatives where not available. The problem is how to identify IP multicast disconnectivity, and how to provide an alternative. Different solutions are studied to investigate the advantages and disadvantages of each solution. 3.4 Single Reflector Solution In this solution, it is assumed that a main multicast enabled network like Intemet2 or MBone exists, and there are limited number of multicast disconnected users interested in participating in multicast groups in the main multicast enabled network. A multicast disconnected user is a user/computer which has Internet connectivity, but is not able to send or receive multicast traffic to/from the main multicast enabled network. A network entity called Reflector is designed to provide multicast connectivity to these multicast 18 disconnected users. The reflector is a computer within the main multicast enabled network. This computer’s IP address is given to multicast-disconnected users, by announcing this address on a web Site, or sending messages on Internet message boards or mailing lists. Multicast disconnected users ask the reflector for accessing a particular multicast group. The reflector subscribes to that IP multicast group, and forwards all the data it receives for that IP multicast group to the multicast disconnected users who asked for it through a unicast connection (Figure 3). Multicast Disabled Router MC User Figure 3 - Only one reflector architecture (MC User: Multicast User) This solution, although quite simple, is the most commonly used solution to overcome 1P multicast disconnectivity ([16], [19] and [38]). This is mainly due to minimal required configuration and excellent quality, given high bandwidth availability. However, if the network is congested or the number of multicast disconnected users is high this solution is not optimal, and the quality decreases. Here, the upload and download bandwidth usage for M multicast disconnected users and N users on the main multicast enabled network is derived. This derivation is done for a Video conferencing scenario where each user multicasts a stream to all other 19 users. In such a scenario, the reflector forwards packets received from a main multicast enabled network, through IP multicast, to multicast disconnected clients, and forwards packets received from each multicast disabled client, through unicast, to other multicast disabled clients (through unicast) and to users on the multicast enabled network, through IP multicast. The important difference between sending data on multicast enabled network and to multicast disconnected users is the number of transmissions required to send the data to N users. This number is one for multicast enabled networks, independent of N, but it is N for multicast disconnected users. So, for a single packet received from a multicast disconnected network, the reflector has to make M-l transmissions to multicast disconnected users through unicast, and one transmission to users in the main multicast enabled network, through multicast. On the other hand, for a Single packet received from a user on the main multicast enabled network, the reflector has to make M transmissions to multicast disconnected users. Therefore, for a conferencing scenario where all users send one stream simultaneously, the reflector receives M +N streams, and needs to make M transmissions on the multicast enabled network, through multicast, and M (M + N — 1) transmissions to multicast disconnected users, through unicast. 3.4.1 Design As explained earlier, the reflector residing on the main multicast enabled network is responsible for forwarding streams from the main multicast enabled network to multicast disconnected users and vice versa. Since these two tasks should be performed simultaneously, one thread17 is assigned to perform each task. Figure 4 is the design '7 A single computer application can have multiple threads each performing different tasks Simultaneously. 20 diagram for the unicast listener. As explained later, once a packet arrives at the reflector from the unicast side (step 1)”, from a multicast disconnected user, the reflector looks up the user’s IP address, and finds the multicast group which the user is subscribed to (step 2). Then, it finds all other multicast disconnected users that are a member of that multicast group (step 3). Finally, it sends the packet to all multicast disconnected users (step 4,6), which are a member of that multicast group, through unicast, and sends the packet on the main multicast enabled network (step 5,7), through multicast. < MC Client (IP > < MC Client (IP >0 0 O < MC Client (IP > Address) Address) Address) OOO Unicast Listener / > 6 4: For all lp ln GMemlPe 2: (NP-9P) I 32 GMemlPs - SendTo(Ip,°P. data) m adtet(8|P. data) < Paeket(memlP.glP data) Lookup(slP) Lookup(glP. 9P) lienth. (GrouplP, GroupPort). JoinTime, ConflrmNumber Paeket(glP.gP.data) ( 5: SendTo(glP.gP.data) C slP NOT FOUND >{ Drop Packet I Multicast Group Figure 4 - Unicast Listener Design ‘8 The step number refers to the numbers in the related figure. 21 The other primary task of the reflector is to forward traffic from the main multicast enabled network to multicast disconnected users. The design diagram of this process is shown in Figure 5. In this case, once a packet arrives at the reflector from the main multicast enabled network (step 1), if the reflector is not the originator of that packet19 ,it looks up all the multicast disconnected users who are subscribed to that multicast group (step 2), and sends the packet to all those members (step 3). MC Client (IP MC Client (IP 0 O 0 MC Client (IP Addmws) Addmns) 3: For all ip ln GMemIPs SendTo(ip.gP. data) glP.gP NOT FOUND Multicast Listener on ) (gIP. 9") > LookuplglP, 9P) CIlenth, (GrouplP, GroupPort), Joinfime, ConflnnNumber) 1: PadIeUBIPJPsngPnata) \ O: For all (gIP,gP) Invoke a new listener on glP,gP , -- Multicast Group (OIPSP) ' L°°kup(') Figure 5 - Multicast Listener Design '9 If the reflector is the originator of a packet which arrives at the reflector from the multicast enabled network, this packet is a forwarded packet from one of the multicast disconnected clients to the multicast network, and therefore should not be sent back to the multicast disconnected users. 22 Multicast disconnected clients should let the reflector know about the multicast group they want to subscribe to. So, in addition to the two primary tasks for the reflector, the reflector should maintain a server which accepts requests from multicast disconnected users, and keeps track of when they join or leave a multicast group, to create listeners on the IP multicast network and time out inactive users. MC Client (IP Address) 9'”) a» ve(glP.gP.Confinn Num) <1lnlell’. Aeoept(Confi I Aocept(Confi Registrar Server __/ Request . Already \_ ® Leave Type? J°'" Exist? Yes Y88 Delete(slP. (gIP, 9P), ConfirmNum) lnsertlslP. «TIP. 9?). Gimme. rand) _ Clienth. (GrouplP, GloupPort). \____. " JolnTime, ConfirmNumber V _/ Terminate \4 Connection ‘ Figure 6 - Registrar Server Design 23 Figure 6 shows the design of the registrar server. The registrar server runs on a separate thread, and accepts join and leave requests from multicast disconnected users for multicast groups. A join or leave request message contains the multicast disconnected user’s IP address, the multicast group and port to subscribe, a request type and a confirmation number. Further details of this protocol are explained in Table 1. Table l - The registrar server protocol Client Status in Table Client Request Server Response to Client Table Action (CIP, GIP, GP) Msg(Join,GIP,GP,N) Insert Msg(Confirm,GIP,GP,RndConfirmNum) entry NOT FOUND (CIP, GIP. GP) Msg(Join,GIP,GP,N) None Msg(ReConfirm,GIP,GP,FoundConfirmNurn) entry FOUND Entry Found Msg(Leave,GIP,GP,ConfirmNum) Delete Msg(Confirm,GIP,GPfoundConfirmNum) ConfirmN um match Entry Found Msg(Leave,GIP,GP,ConfirmNum) ConfirmNum None Msg(ConfirmNumMismatchError,0,0,0) mismatch Msg(Leave,GIP,GP,ConfirmNum) Entry Not Found None Msg(LeaveWithoutJoinError,0,0,0) Msg(Leave,GIP,GP,ConfirmNum) Lookup Error None Msg(UnknownError,0,0,0) Finally, the forth and last thread of this design is the Table Maintainer. This thread wakes up every second and cleans up inactive users from the table of multicast disconnected clients. This is because some users might not send the leave request due to application error, system crash or network connection failure. This thread is also responsible for creating multicast listeners for each multicast group. The design of this thread is shown in Figure 7. 24 MC Listener on MC Listener on MC Listener on (W. 9P) 0 O 0 (W. 9P) (gIP. 9P) Create Listener 0: For all (glP.gP) . ClienIIP. (GrouplP, GroupPort). lnvokeanewUDP— ,p p: - L k . . lieteneronglP.gP (9 .9 ) UnIQ( 00 UP( )) JornTlme. ContirmNumber A Close Listeneron (elP.eP) 1: For dl . - A ._ . I = ' Clienth,(glP.gP) 2 (C lenth,glP.gP) Lookup( ) A Wait One Second Delete(Clienth.glP.gP) Yes (CurTlme - LaetActivityTlme) > Timeout NO Yes Figure 7 - Table Maintainer Design 3.4.2 Performance Analysis Although this solution is commonly used, due to simplicity and minimal configuration, it doesn’t use the available bandwidth efficiently. There are two bottlenecks which could degrade the performance of this solution. 0 The reflector’s processing power: Since the reflector has to replicate a Single packet and send it out multiple times, even if enough bandwidth is available, it might not be able to handle all multicast disconnected users. 25 0 The reflector’s upload bandwidth: As explained earlier, the number of packets sent to multicast disconnected users is much higher than the number of packets it receives. Therefore, as the number of multicast disconnected users increases, the upload bandwidth of the reflector could reach its maximum capacity and fail to handle anymore users. Figure 8 shows as the number of users in the main multicast enabled network increase, the total number of transmissions that the reflector should send increases linearly. But, as the number of multicast disconnected users increase, the total number of transmissions increases proportional to the square of the number of multicast disconnected users. RIM" “" “is it + X axis is the number of user on the multicast enabled network 0 X axis is the number of multicast disconnected users sans...“ TEETH? \ i a ‘g is I Figure 8- Performance analysis for the single reflector architecture. The reflector load in terns of number of transmissions as the number of users increase inside or outside the main multicast enabled network 26 In Figure 9 the asymmetry of upload and download bandwidth usage is shown. This simulation shows that if the download bandwidth required is x, upload bandwidth required is of order x2. 737.47. 1‘}: I 3‘ ¢ i 3‘ iv ‘ “i2 . ._ ” £330 I Q A i #43; " ‘ A “i I ' ”I 1.. + The upload bandwdittl usage from the reflector to al multicast disconnected users 0 The download bandwidth usage from all multicast disconnected users to the reflector + + l .g,‘ :g 3, ~ J: '. < J P IBai‘waldth Usage; 2? fig 5 T" g 1314' é.‘ . _, _ a!" 1., V 'a-'. r it: u a; iii-fir I *7 l 3 *¢ 1 d” \ Figure 9 - Performance analysis for the single reflector architecture: The upload bandwidth usage from reflector to multicast disconnected users versus the download bandwidth usage from multicast disconnected users to the reflector. Although this solution overloads the reflector, as the number of multicast disconnected users increase, the performance is acceptable for small number of multicast disconnected users. To address the lack of performance for large number of multicast disconnected users, we develop a multiple reflector platform. But first, we introduce the two reflector scenario, to provide more insight into the problem. 27 3.5 Two Reflector Enhancement The two reflector solution provides multicast connectivity between two multicast islandszo. Each one of the two reflectors collects the multicast traffic from each side and sends it to the other reflector. The other reflector forwards the traffic into its local multicast island, through IP multicast. It’s important to notice that the traffic between the two reflectors is going over a single unicast connection (Figure 10). GI Memb’3r MC User Figure 10 - Two reflectors architecture To avoid data looping between multicast islands, once a reflector receives a packet, it checks for the sender IP address. If the sender IP address is the same as the reflector’s IP address, it won’t send the packet to the other reflector. This solution has two advantages over the single reflector scenario. First, the bandwidth between the reflectors is used optimally. That is, if there are M nodes in one multicast island and N nodes in the other, in a Video conferencing scenario, the total 2° A multicast island is a network with a set of routers, where none of the edge routers support I? multicast (are not multicast enabled). 28 number of streams going between reflectors in both directions is M +N . The second advantage is the fact that the load of reflectors is reduced to a linear function of M and N, and moreover the load is balanced between the two reflectors (assuming almost equal number of users in each multicast island). The disadvantage of this solution is the need for setting up and configuring a dedicated machine in each multicast island to serve as reflectors. To overcome the need for a dedicated machine in each multicast island, we consider choosing one of the users of the multicast island to be the reflector, and we call that to be a hidden reflector. As the single reflector case, we consider a main multicast island and a dedicated server on that island (Figure 11). Multicast Disabled Router MC User MC Usert #Reflectod MC Ueer Figure 11 - One separate reflector with one hidden reflector A common problem arises when an originator of IP multicast traffic is assigned as a reflector. This problem is referred to as the loop problem among the developers in this field. The problem comes from the fact that the reflector can not distinguish between the traffic that is forwarded from another reflector into the multicast group, and the traffic that is forwarded to the multicast group as a member of that multicast group. This 29 problem could be solved by changing the default behavior of IP multicast applications, so they can work on a computer that is acting as a reflector. But, to be compatible and transparent to already developed multicast applications, we decide to assign a particular port number to be used as the source port number of packets forwarded from another reflector into the network. This way, assuming all the applications to be using UDP (which is the most common transport layer protocol over IP multicast) as their transport layer protocol, the traffic forwarded from another reflector can be distinguished from traffic generated from the reflector but for the purpose of participating in the multicast group. We further extend this idea to both reflectors. In this case, one user is selected from each multicast island to serve as a reflector (Figure 12). A natural question that rises in such a case is which users to select as reflectors in each multicast island. This question is addressed thoroughly in the next chapter. Enabled Ne w E i ‘ Multleest‘Enabled Natwork . _ .- (lntemetZ) UnicastConnectio‘ Multicast Disabled Router MC User MC User MC User 0 Reflector 1 Figure 12 - No separate reflectors 30 3.5.1 Design We use the Single reflector design with a special configuration and some changes. In this case, there are two reflectors. Each reflector is also acting as a multicast disconnected user for the other reflector. The only addition to the previous design is that each reflector should subscribe to the other reflector for a multicast group G, once it receives a subscribe request from its own multicast island for group G, if it is not already subscribed to G with the other reflector. 3.5.2 Performance Analysis The following diagram, Figure 13, is the performance comparison of the two reflector solution versus the Single reflector solution ~ 4- WW 1‘ 700 h. r1". Q‘Ius + ’5‘ i ’5’ . 3., + The Single reflector architecture _ 13' ,. O The two reflector architecture 2 600" + E . t 2 + g 500- + D . + 8 ‘ * ' + g 400- - .2 + E + E 300~ + _ E + 31' . + o .. g 200- + ~ + .o 5 ++ 5 100 ++ . «I + . *‘ ++ 06 ° oooooooooooo '— smanmoooOQOOOOQOOOOOOOOOOOa: _ 16 "55 20?‘ 25 3O 35 4o . Ei'NumbeLOfgusem not in thernain multicastisland “5...“ Figure 13 - Performance of two reflector architecture versus one reflector architecture 31 . In this performance comparison, we assume that all the nodes not in the main multicast island are within another multicast island. Although this is an extreme case, and rarely happens, this is the only comparison we can perform since, if all nodes not in the main multicast island are not within another multicast island, the two reflector solution, doesn’t have a solution to offer. We extend the notion of having two hidden reflectors (we will call them just reflector from now on) to multiple reflectors and study the behavior of that solution. 3.6 Multiple Reflector Extension The multiple reflector solution follows the same logic as the two reflector solution. The main difference is a new problem of how to connect these reflectors, and how to distribute content. This problem together with the problem of which reflectors to choose are investigated in the next chapter. Here, we assume each reflector knows about the best reflector to connect to, and can obtains its IP address. In this problem, we assume an undirected tree to be formed. We call this undirected tree, a multicast tree (Figure 14). In this multicast tree, each reflector knows about its neighbor reflectors. So, once the reflector receives a request for subscribing to a multicast group, the reflector itself will generate the same request, and register with all its neighbors for the same multicast group. 32 “his; iii? & ...,. “‘3“, l" A O . [a . - . The Core Internet . - . .394. -I 51- .3.5: as: - ,MWWE gt»: ' -g,'. ‘i .‘o t' 0%55’ ‘ Figure 14 - Multiple reflectors, one per each multicast island: The users circled in dotted lines are the chosen reflectors and the tree formed by the thick lines, between the reflectors, is the multicast tree. 3.6.1 Design The design of the multiple reflectors is based on the single reflector design, where each reflector serves as a reflector for its own multicast island, and acts as a multicast disconnected user for other reflectors. Further design issues like how to choose reflectors, and how to connect them is thoroughly investigated in the next chapter. 33 3.6.2 Performance Analysis Since there are several constraints for this problem and some parts of the solution are developed in the next chapter, the performance analysis of this solution is studied in the next chapter. 34 Chapter 4 Reflector Optimization The reflector based solution to group content delivery forms an overlay network between reflectors. Each reflector is responsible for providing multicast connectivity to its multicastisland21 (Figure 15). a {"3 8 . g B a .. . .. ulticast Island A E Multicast Island 8 {'83. 1a... ; N a .. The Core lntemet '. .4. .....O a .“...o & '._,°--o....,.. ....,.' ThePZPNetwork ‘, of Reflectors 3» ll 3 a Multicast Island C ‘7‘ a Municpdigand E I 0 o o o Maticast Islmd O O . 0 Figure 15 - P2P network of reflectors Individual nodes in multicast islands have different levels of connectivity with other multicast islands. So the choice of the reflector(s) in each island impacts the overall content delivery performance. The problem is which set of nodes to choose in the overall network to provide the best content delivery performance/quality to all multicast islands. 2‘ Multicast Isolated Networks 35 In general, more than one multicast group may be formed among the multicast islands. One multicast group, for example, may be used to distribute a certain type of content (e.g., real-time presentation) while another multicast group may be used to distribute another type of content (e. g., video). Since a multicast tree is formed between reflectors for each multicast group, one reflector may be a member of different multicast trees. As the number of multicast groups increases, the reflector needs to take advantage of its available bandwidth to different multicast islands. This may lead the network of reflectors into a maximum throughput problem as discussed below. 4.1 Optimization Constraints This problem can be formulated by different objective functions depending on implementation constraints and service policies. 4.1.1 Number of Reflectors per Multicast Island The number of reflectors per multicast island is an important implementation constraint. Having multiple reflectors per multicast island increases implementation complexity and protocol overheads. In a simple case, only one reflector exists in each multicast island. In this case, one of the nodes in the multicast island becomes a reflector and provides multicast connectivity to all nodes. In general, multiple nodes can serve each multicast island as reflectors. In this case, each sender (an inside node or an outside reflector) is assigned to one reflector. In other words, each reflector is responsible for delivering part of the outgoing traffic to other multicast islands, and delivering part of the incoming traffic to nodes in the multicast island. Although having multiple reflectors increases flexibility and potentially improves the performance, it dramatically increases 36 the level of complexity and overhead. The goal of this work is to solve the problem for one reflector per island. Meanwhile other options are considered for potential improvements. 4.1.2 Paths vs. Flows In general, traffic generated by a single source can be split into different streams and take on different paths. In this case, the source traffic is considered a flow (i.e., being relayed and networked between the source and destination through multiple paths over the underlying network). Although this might improve the overall performance, it introduces many complexities. Multimedia is the primary bandwidth-consuming traffic for the applications that are considered here. So, if any traffic is going to be split to improve the overall performance/quality, it would be multimedia traffic. Packets arriving out of order is a natural consequence of having a stream take on different paths. This effect is very undesirable for multimedia applications and degrades their quality”. So, using flows instead of paths may result in a degraded quality instead of actually improving the quality. The focus of this work is constraining streams to take on paths (not flows), although allowing flows may improve the quality and may introduce more implementation complexity. 22 It is feasible to employ synchronization fields, time stamps, and packet numbering schemes that could assemble a multimedia flow (i.e., coming over multiple paths) to its original packet order; however, this approach increases complexity and end-to-end delay (due to potentially excessive buffering). Aside of the re-ordering problem, supporting flow routing over networks is a rather complex framework that is not supported by existing practical networks. 37 4.1.3 Minimum Congestion vs. Maximum Throughput Network related optimization problems are traditionally defined as a Minimum Congestion or Maximum Throughput problem. The Minimum Congestion problem is defined as follows [3]: Given a grath = (V,E), and a set of pairs of vertices (s1,tl),(sz,t2),...,(sk,tk), find a path between each pair s, andti. The congestion on an edge is the number of paths that use the edge. The problem is to find a set of paths that minimizes the maximum congestion. The Maximum Throughput problem is defined as follows [4]: Given a graph G = (V,E) , and a set of pairs of vertices, (sl,t,),(sz,t2),...,(sk,tk) , find a path between each pair s, andti. Each edge (u,v)e E has a capacity c(u,v)20. A throughput f (u,v) is associated with each edge. For all edges f (u,v)e E , we require f (u,v)Sc(u,v). The throughput of each edge for a given commodity i is f (u,v,i) where f (u,v) = 2 f (u,v, i). The total throughput for each pair of vertices (swti) is defined as f, = 2 f (u,v,i) . The problem is to maximize the minimum 1“,. u.v€V These two optimization problems fit different types of applications. The desired bandwidth between each pair of nodes is one of the inputs for the minimum congestion problem. This objective function is suitable for applications that use a constant rate that has to be delivered. The minimum congestion problem finds the optimum paths to route the traffic. On the other hand, the maximum throughput problem finds the set of paths 38 that maximize the throughput. This implies that the application should be able to transmit at different rates. Multimedia applications are an example of applications capable of transmitting data at different rates to provide different levels of quality. Furthermore, multimedia streams are the primary bandwidth consuming traffic of the system under consideration in this thesis. Consequently, the maximum throughput approach to this optimization problem better fits our system. Therefore, we focus on the maximum throughput approach to this problem to provide the maximum level of quality for the different multimedia content supported by the system under consideration. 4.2 Problem Classification Given the constraints explained in the previous section and other ones to be highlighted below, several inter-partition networking problems can be defined. Overall, these problems are classified as multiple source/multiple sink problems, which can be further divided into sub-problems (depending on the constraints imposed) as shown Figure 16. 39 Reflectors for Content Maximum Throughput! Minimum Congestion Single Commodity (One Multi' Commodity (Multiple Multicast Group) Multicast Groups) Figure 16 - Problem Classification Here, the terms used in Figure 16 are defined. 0 Multiple Sources/Multiple Sinks: This term refers to a general class of problems, where there are multiple sources and sinks in a graph. The solution for this problem is to optimally route flows from all sources to all sinks. 0 No Intermediate Nodes: This term refers to a sub-class of multiple source/multiple sinks problem, where all nodes in the graph are sinks, and some are sources. This class of problems is a representative of many 40 overlay or peer-to-peer networking problems where every node is either a source, sink, or both. Partitioned Graph: This term refers to a sub-class of multiple source/multiple sinks problem, where the graph is partitioned. The solution to this problem will route flows from different sources to sinks with respect to partitions. For example, one could define a problem in this sub-class as the following: Optimally route flows from sources to a maximum of one sink per partition. Using Reflectors for Content Distribution: This term refers to a sub-class of partitioned graphs multiple source/multiple sinks problem, where each partition is represented by a set of reflectors. The goal in these problems is to distribute content to at least one representative/reflector per partition. The motivation for this sub—class of problems, as explained earlier, is choosing reflectors from multicast isolated networks (multicast islands), and distributing data between these reflectors. Later, the reflectors distribute the content in their own multicast island. This sub-class of problems adds another complexity to the multiple sources/multiple sinks problem, that is, the selection of reflectors. Since different nodes have different network connections, the selection of reflectors changes the overall performance. This class of problems represents the general focus of this thesis. One Reflector per Inter-Partition/Island Edge: This is a special case of the previous problem, using reflectors for content distribution. In this case, 41 only one reflector is allowed per inter-island edge. That is, if a content is to be sent from one island to another, only one reflector from each side is allowed for this task. 0 One Reflector per Partition/Island: This is a sub-class of previous class of problems, one reflector per inter-partition/island edge. In this case, the constraint is even tighter. Only one reflector per island for communicating with any other island is allowed. Some sub-classes of problems are labeled in Figure 16 and will be referred to, later on, in this chapter. 4.3 Related Work The overall optimization problem described in the previous section is divided into two sub-problems. The first sub-problem is selecting the set of reflectors that can potentially form the best inter-island network. The second sub-problem is to form the best possible network between reflectors and route the traffic optimally. Although the first sub-problem has not been studied thoroughly, the second sub-problem has been widely investigated. In particular, the latter optimization problem has been investigated for several applications with different constraints. Although the focus of this work is on the first sub-problem, the latter is included in the overall optimization problem classification. 0 Multiple Reflectors per Multicast Island: Having multiple reflectors per island would simply let each island take advantage of more than one node’s Internet connection. In this case, all nodes within a multicast island can be reflectors. So, the problem is how to optimally route the traffic between sources and sinks. One needs to 42 distinguish between finding an unsplittable flow (path) and a flow (multiple paths each carrying a fraction of the flow) between sources and sinks. This kind of optimization problem can be defined in two ways using two different objective functions: a) Minimum Congestion, b) Maximum throughput. Different problems can be defined by choosing different constraints. Some of these problems have been investigated by researchers: 0 Related Problem, RPl: Given a graph G(V,E) and a set of pairs (si,t,.), identify the routes for the flows between these pairs to minimize congestion. This problem is proven to be NP-hard by [1], and [3] has introduced an 0(log(n)) approximation. 0 Related Problem, RP2: Given a graph G(V,E) and a set of pairs (s,,t,.), identify the routes for the flows between these pairs to maximize throughput. This problem is proven to be NP-hard by [1], and [4] has proved approximation of this problem to be hard. In other words, for the best polynomial-time approximation of this problem, the error is at least 0(5). 0 Single Reflector per Multicast Island: Allowing only one reflector per island, naturally brings up the question of which node to pick as the reflector. In this case, the problem is choosing a set of reflectors (one in each island) that have the maximum level of connectivity with each other. Although the selection of reflectors depends on how they are going to be connected and what is the objective function, it’s still a different problem and has not been a focus for researchers in this area. 43 4.4 One Reflector per Multicast Island Given the above constraints, one can define different optimization problems with different constraints. We are investigating this problem in an overlay network sense. This implies that there is no access to intermediate nodes and routers that are part of the underlying network connecting the set of chosen reflectors. Hence, we are not interested in finding the optimal direct paths/flows between reflectors. On the other hand, due to implementation complexity, we consider one reflector per multicast island. Therefore, the following definition fits our application the best. First we model the problem as a weighted undirected graph (Figure 17). Figure 17 - Graph representation of P2P network of reflectors In this model, each multicast island is represented as a sub-graph. The nodes of each sub-graph are called terminal nodes. All terminal nodes are connected to each other 44 through the Internet. In general, each terminal node has a different Internet connection. Therefore, there is a bandwidth (capacity) associated with the connection between each pair of terminal nodes (may not be a direct connection). 4.4.1 Choosing Reflectors to Maximize Multicast Tree Capacity First, the multicast tree and its capacity are defined. Here, a multicast tree is an undirected graph without a cycle. The capacity of a multicast tree is defined as the minimum capacity of all edges present in the multicast tree. This definition is based on the assumption that contents distributed in multicast trees are delivered to all nodes at the same rate. So the capacity is limited to the minimum capacity of all edges present in the II'CC. (ll (b) (0) Figure 18 - A single multicast tree with different flows starting at different nodes: (a), (b) and (c) have the same multicast tree. Notice that flows starting from different nodes in a multicast tree are valid flows for a multicast tree, but they all have the same capacity. So, in our problem definition, 45 different flows are not distinguished from each other. An example is shown in Figure 18. In each of the scenarios shown in the example, one of the nodes is serving as the source of a multicast tree. Note, however, that the underlying multicast tree is the same for the different distribution scenarios. The problem is choosing one reflector from each multicast island to maximize the capacity” of the multicast tree. Once reflectors are identified, the capacity of the best multicast tree among them can be obtained by a maximum spanning tree algorithm as described in section 4.5.2.1. So, given any set of reflectors, the resulting multicast tree is defined as the output of the maximum spanning tree algorithm. The following is a formal definition of the problem. Problem P1: Given a graph G=(V,E,C), a partition P={l/1,V2,...,Vk}, k where V=UV,. and V,.[]Vj =, VlSiSjSk,i¢j, and each vertex is i=1 fully connected to all vertices not in the same partition, select one vertex in each partition to maximize the capacity of the resulting multicast tree. In other words, maximize the minimum capacity of all edges involved in the resulting multicast tree. Claim: P1 is NP-hard. A formal proof is not available, but the K-Minimum—Spanning Tree is likely to be reduced to this problem. Alternatively, the complexity of the more general case, several multicast trees, is investigated in the following section. 23 We denote capacity as C in the problem definitions 46 4.4.2 Considering Several Multicast Trees In practice, there are several multicast groups operating at the same time. Each multicast group may have a different set of participants in each island and forms a different multicast tree. Therefore, one needs to optimize the choice of reflectors with respect to multiple multicast trees. The problem of choosing a reflector in each multicast island to maximize the capacity of multiple multicast trees is defined as follows: Problem P2: Given a graph G=(V,E,C), a partition P={V,,V2,...,Vk}, k where V=UVi and V,nv,.=, VISiSjSk,i¢j, where each i=1 M multicast group is represented as G,, a subset of V, V =UG, , and each t=l vertex is fully connected to all vertices not in the same partition, select one vertex in each partition to maximize sum of all multicast tree capacities. In other words, maximize the sum of minimum capacity of all edges involved in each multicast tree. As the number of multicast trees increase, the possibility of having an edge, not in any multicast tree reduces. So, if the number of multicast trees is large enough, with a very high probability, all edges are covered by at least one multicast tree. Figure 19 helps to visualize this concept. For the sake of this discussion, the capacities are eliminated from the edges, to simplify the figure. In this figure there are seven reflectors, each representing its island. It can easily be seen that after constructing three multicast trees almost all edges are covered with at least one edge from a multicast tree. 47 a 5. & Multicast Island A . ‘1 ‘ .l . ’b' > o o" t o .C’ “E ' E Q 8 9. 9—. a: :i a. w .0 .’.— ‘Is‘.-.-. ‘ Q. \ I. \ : ‘.‘. “s - The Core Internet}. . \ ‘~.-'h'. . . .. . - ..... - . - .. . a, ..... .\ ..... .- I: ' ‘ . 4 o ‘ __.-.,.__ __.:'I3 a...-.. '2‘.- O I . In" M '\ ~ . ~.. Multicast Island E ‘ Multicast Tree 2 Multicast Tree 3 Multicast Tree 1 T ' Figure 19 - Several multicast trees over reflectors If the number of islands and number of multicast groups are large enough, the single multicast tree problem evolves to a maximum network throughput problem. The maximum network throughput problem is defined as follows: Problem P3: Given a graph G = (V,E,C), and a partition P= {V,,V2,... V,}, k where V=UV,. and Vl. 0Vj =, V15i$j$k,i¢j, and each vertex is r'=l fully connected to all vertices not in the same partition, select one vertex in each partition to maximize the throughput of the resulting sub-graph. In other words, maximize sum of all edge capacities involved in the resulting sub- graph. 48 Theorem: If the number of multicast groups is large compared to the number of multicast islands, the solution to P2 asymptotically satisfies P3. Proof: To prove this theorem, we define the following symbols and functions: C(i. j) Capacity of edge ij f (i . j. 8) Flow on edge ij for multicast group g f (g) Flow associated with multicast group g F0. j) Flow on edge ij T1? The multicast tree for group g, for which f ( g) is maximized. T The set of all multicast trees: {T8 : g = 1, 2,...M} S (i, j) The set of all multicast trees for which edge ij is a member of them: {T8 :TgeT,(i,j)eTg} The capacity of a multicast tree is defined as the capacity of the minimum edge in the multicast tree. If more than one multicast tree is using the same edge, the capacity is equally shared between them. a C(k9l) o e —— , eT f(i.j,g)= Migllfihl)" (I!) g (1) The flow on each edge is sum of all flows sharing the same edge. .. M .. . Ck,l F(z.1)=Z_:,f(t.J.g)= X Mmm (2) 1530.1) (k.l)eT The probability of an edge being covered when constructing a random tree of size K-I in a complete graph is the following: 49 Pr(an edge bCing covered by a tree of size K — 1) = m =_12<_ The probability of an edge not being covered while constructing a random tree of . . 2 . K . . . srze K-I, rs l—E. So, after constructing —2- trees of srze K—I, the probability of and edge still not being covered by any tree is the following: K Pr(an edge not covered by % trees) =[1_72{-)2 z e'1 z 0.367879 2 Therefore, if £2- trees are constructed, the probability of an edge not being K covered is e" which approaches zero as K increases. Therefore, if the number of multicast groups GVI) is large compared to the number of multicast islands (K), we can state the following: M>>K M 1%ng = E)———>1 (3) Problem P2 maximizes f (i, j,g) for all g, and therefore maximizes F (i, j) for any ij that is a member of a multicast tree. Since M is large compared to K, the union of all multicast trees cover all edges in the resulting sub-graph between chosen reflectors (eq. 3). Therefore, the solution to P2 maximizes F (i, j) for all 17’s. The throughput of a graph is defined as follows: Throughput(G) = 2 F (i, j) (4) (ichEk 50 Problem P2 will choose a set of reflectors which have the maximum network throughput among themselves (eq. 4). Therefore, the solution to P2 asymptotically satisfies P3. We will now study P3. 4.4.3 Choosing Reflectors to Maximize Network Throughput The problem is choosing one reflector from each multicast island to maximize the throughput of the network (Problem P3). Claim: P3 is NP-Hard. First, to give some insights into the problem and its complexity, the complexity of an exhaustive search is derived. In order to find the complexity of an exhaustive search, the number of possible solutions should be counted. Assuming the number of nodes (possible reflectors) in each multicast island to be M .- , the number of possible selections K of reflectors is HllMill, where K is the number of multicast islands. Then, for each i=1 selection the objective function, network throughput, should be calculated, which has the complexity of 0(“E||). Therefore the total complexity of an exhaustive search is K 0(||E||H||M,||). To further simplify this derivation, we consider the average number of '=l nodes per multicast islands to be E("M||). So, the total complexity is simplified to 0("E" E ("M H)" ). This complexity is exponential and helps to understand why this problem could be NP-Hard. The formal proof is provided as follows. Proof: To prove a problem is NP-hard, we need to show an NP-complete problem is reducible to this problem. To do so, we define another problem, P4, that is reducible to 51 P3. Then we show that an NP-complete problem, Clique, is reducible to P4. Therefore, since the NP-complete problem, Clique, is reducible to P4, and P4 is reducible to P3, P3 is NP-hard. Problem P4: Given a graph G'=(V,E), a partition P={V1,V2,...,Vk}, Ir where V=UV,. and V, flVj =, VISiSjSk,i¢ j, select one vertex in i=1 each partition to maximize the throughput of the resulting sub-graph. In other words, maximize the total number of edges in the resulting sub-graph. Corollary 1: P4 is reducible to P3 in polynomial time. This can be easily shown by setting capacity 1 for all edges in G ' , and adding edges with capacity zero for edges present in G but not present in 6'. So, any solution to P3 is applicable to P4, and therefore, P3 is reducible to P1 in polynomial time. Corollary 2: The NP-complete problem, Clique, is reducible to P4. We show a polynomial time mapping reduction from Clique to P4. First, we present a formal definition for the Clique problem. Clique Problem: Given a graph G" = (V,E) , is there a fully connected sub- graph of size K? Given graph G " and size K from the Clique problem definition, construct a new graph R by taking the following steps. 0 For each vertex i=1, 2, N in G", create K vertices named i1, i2, ..., ik. 0 For all edges (i,j) in E(G"), i gt j, add an edge from ip to jm for all p,m=1,2,...,K and p¢m. 52 (a) Graph R built from graph G" for K=3 Figure 20 - Creating graph R from graph G": In this example we use K=3, that means we are looking for a clique of size 3. We use graph R with partitions Vq ={iq,Vi =1, 2,...,K} for q =1, 2,...,N (N is size of G") as an input to P4. P4 finds a sub-graph, R,, with one vertex from each partition such that the total number of edges in R, is maximized. Each vertex in G " has a representative in all K partitions. If Rs has more than one representative from each vertex in G”, then Rs is not fully connected since, by definition, representatives from the same nodes are not connected. If R, is a fully connected graph (verifiable in 0(K2) , then the 53 answer to the Clique problem is Yes. If R, is not fully connected, or G " does not have K connected vertices (verifiable in 0(E) ), the answer to the Clique problem is No. The Clique problem is NP-complete and reducible to P4 (Corollary 2), and P4 is reducible to P3 (Corollary 1), therefore, P3 is NP-hard. That means P3 is, at least, as hard as the Clique problem. 4.5 Relaxing one Reflector per Multicast Island Constraint Choosing one reflector per island to optimize the overall performance is proved to be hard. So, we now relax the “one reflector per island” constraint to solve this optimization problem with a reasonable complexity as the number of multicast islands and groups increases. To minimize the implementation complexity and protocol overhead while maintaining a low complexity, we allow a maximum of one reflector in each multicast island per neighbor islands in the multicast tree (Figure 21). Figure 21 - One reflector per inter-island link 54 4.5.1 Problem Definition After relaxing the “one reflector per multicast island” constraint to “one reflector per inter-island link” constraint, the formal problem definition for a single multicast group changes to the following. Problem P5: Given a graph G=(V,E,C), a partition P={V,,V2,...,Vk}, k where V=UV, and v,nvj=<1>, VISiSjSk,i¢j, and each vertex is i=1 fully connected to all vertices not in the same partition, build a maximum capacity multicast tree between all multicast islands. Each edge on the resulting tree will have one reflector on each end. The reflectors on each end should be chosen to maximize the capacity of the resulting multicast tree. We assume very high network capacity between internal nodes of all islands compared to inter-island edge capacities. So, once more than one reflector is chosen in one island, there are no capacity limits for the network between them. This is because the traffic inside each island is going over IP Multicast that is highly optimized especially when going over Local Area Networks. 4.5.2 Optimization Algorithm We extend the maximum spanning tree algorithm to fit the reflector optimization problem, and prove this greedy algorithm will give the optimum solution. 4.5.2.1 Maximum Capacity Multicast Tree First we introduce the maximum capacity multicast tree algorithm for a graph G(V,E, C). 55 1 Sort all edges of G in a non-decreasing order by their capacities. 2 While G is a connected graph do 3 Delete the edge with the lowest capacity 4 Return G This algorithm returns the multicast tree, T, with the highest capacity. The capacity of T is determined by the edge with the lowest capacity in the tree. A tree with N vertices has exactly N-l edges. If there is another tree, T], with a higher capacity, then it must have at least one different edge. Since T contains N-l edges with the highest capacities, T1 must have at least one edge with a lower capacity to be different from T. So, the capacity of T1 is smaller than T, and this is a contradiction. Therefore, this algorithm returns the tree with the highest capacity. In other words, it maximizes the minimum capacity of all edges in the tree. 4.5.2.2 Maximum Capacity Multicast Tree for Reflectors The algorithm for finding the maximum capacity multicast tree between reflectors has two phases. 1. Finding the edges with maximum capacity between each two multicast islands and vertices incident on those edges. 2. Finding the maximum capacity multicast tree considering each multicast island as a vertex, and edges found in phase one as the edges between them. 4.5.2.2.] Algorithm: Phase One The first phase of the algorithm should find the best inter-island edges, and the vertices incident on them. The algorithm will consider one island at a time. For each island Vs, it finds the maximum edge between that island and another island. Once the 56 edge e, with maximum capacity is selected, the two vertices, vs 6 VS (from the source island) and v7. e V, (from the destination island), incident on this edge are selected as potential reflectors. This procedure is repeated K-I times (K is the number of islands) until the maximum capacity edges between V5 and all other islands are specified. 1 R 6 <1) 2 for all partitions VS 6 P do 3 for all partitions VT 6 P do 4 A ={e, :e, is adjacent on V; 6 VS and v, e VT} 5 Add F indMax(A ) to R 6 Return R 4.5.2.2.2 Algorithm: Phase Two For the second phase of this algorithm, the maximum capacity multicast tree algorithm is used as described in section 4.5.2.1. In this case, the vertices are the partitions, and the edges are the ones found in the first phase. Since all edges selected in phase one are maximum, and the maximum capacity multicast tree is built out of these edges, therefore, the resulting multicast tree has the maximum capacity among all possible multicast trees. Notice that the number of all possible multicast trees is an exponential value. As described in next section, this algorithm finds the optimum multicast tree in polynomial time. 57 4.5.3 Complexity of the Algorithm The algorithm proposed in previous section has a polynomial time complexity as described in Table 2. Random variable X, represents the number of vertices in each island. We assume Xi’s are i.i.d.24 random variables, and have an expected value of E(X) Table 2 - Complexity of phase one of the algorithm Line Complexity Comment Number 2 Finding the maximum edge 4-5 0(E(X ) ) between two islands 2 Finding the maximum edges 3-5 0(E(X) (K —l)) between one island and all others. 2-5 0(E (X )2 M) Repeating line 3-5 for all islands. 1-5 0(E (X )2 K 2) The total complexity The second phase of this algorithm has the same complexity as the algorithm for maximum capacity multicast tree described in 4.5.2.1. This algorithm has the complexity 0(E(X )2) to find the maximum edge between two islands, and 0(E(X )2 K 2) to repeat that for all islands. Therefore, the total run time of the second phase of this algorithm is 0(E(X)2K2). 2" Independently Identically Distributed 58 Given the complexity analysis of phase one, 0(E(X)2K2), and complexity analysis of phase two, 0(K2), the total algorithm has the complexity 0(K2E(X)2). Notice there are E ( X )K possible selections of reflectors, and for each specified selection of reflectors there are K K ’2 possible selections of multicast trees. So, the total number of possible multicast trees is K K ”E ( X )K . Therefore, this is a very efficient algorithm since the complexity of the algorithm is very low comparing to an exhaustive search. 4.6 The Capacity of the Maximum Multicast Tree In this section, we study the capacity of the maximum capacity multicast tree. First, we define a graph with random capacities, where the capacities are drawn from a Gaussian distribution with mean, m and standard deviation, 0'. We are interested in estimating the random variable C, the capacity of the maximum multicast tree in a graph with such random capacities. The probability of having a multicast tree with at least maximum capacity, x, is equal to the probability of having at least K —1 edges out of m edges to have capacity of x or more, given those edges form a spanning tree. FC(x) =1—Pr(C?. x) =1— Pr(K —1 or more edges have capacity of greater than or equal to x | Those edges cover all nodes in the graph) We assume independence between the capacities of the selected edges in the graph, and whether these edges connect all nodes in the graph. This is a reasonable assumption since the capacities of the edges are drawn from a Gaussian random process, independent of their location in the graph. So, the probability of K-I edges or more having the capacity of more than x is independent of whether those edges connect all 59 nodes in the graph. Therefore, since the conditional probability is independent of its condition, the cumulative distribution function of x is as follows. FC (x) = 1- Pr(K —1 or more edges have capacity of greater than or equal to x) The probability of having K-l or more edges having capacity of greater than or equal to x is obtained by summing over the probabilities of having K(K—l) K —1,K,K +1,..., edges having capacity of x or more and K —1 K — 2 l )( ),...,2,1,0 edges having capacity of less than x. The formula for this cumulative distribution function is similar to the cumulative distribution function for a binomial distribution: XE” K(K — 1) x I (rang)? ' F (x) = 1- 2 l— e 2 dx C i=K—l i LIV 2750' K(K-l)_i x 1 (Jr-039 2 e 2 dx i, J 27m The probability density function is derived by taking the derivative of the cumulative distribution function with respect to x. 60 fc (x) = ——aFS (X) x a m le ‘1) x _(x-m)2 i =-—— l— i 2 l- l e 202 dx ax i=K-l i ..., 271'0' 2 K(K-1)_l. x (x-m) 2 l ’ 2 e 2" dx [_.‘l: JZno ] (K+l)(K—2) F N(m.a’) (x)) 2 T (l— F~(m.a) (I‘ll-i F~(m,a) (x)?l l4i_ Km —l)f”l"'“’) ()0) Where F ”("5 0)(x) and me a) (x) are the CDF and PDF of a Gaussian random variable with mean, m, and standard deviation, 0': _(x—m)2 X l __ FN(m'a)(x)= [ e 20" dx m (Jr—m)2 202 1 fNW‘x’z‘r—zzz‘ie The summation in the PDF is eliminated by using the HyperGeometric function, using Mathematica ([20]). The convergence of the HyperGeometric functions, appear in fc (x), after eliminating the summation, is studied in Appendix A. Since there are no close form solutions for the PDF function, we use numerical analysis to study the behavior of the capacity of the multicast tree. Figure 2 is the plot of the PDF function for graphs of different sizes. 61 f( Cl‘lax ) 12- 10p Y Y I I U ' Figure 22 - PDF function for different graph sizes As Figure 23 shows, the mean/maximum of fC (x) increases and its variance decreases as the size of the graph increases. This means, as the size of a graph, with edge capacities drawn from a Gaussian distribution, increases, the estimate of the capacity of the maximum capacity multicast tree becomes more accurate. Since fC (x) is not well formed, we were not able to derive the mean or mode, in terms of m,0' andK , for this distribution. So, we study the behavior of the mean capacity of the maximum multicast tree numerically. Figure 23 is the average capacity of the maximum-capacity multicast tree for a graph with capacities drawn from a Gaussian distribution, N(1,0.3). 62 r .0 15.- ........ 0‘... a... 0.. 0.. 0.. 1.4~ .0 a o a c a a _ a a 1.3- . '- o a a a 1.2- a c L I 1 4L 4 l k i- 10 20 30 40 to Figure 23 - The maximum capacity of the multicast tree, as the number of nodes increase We observe the growth rate for the capacity of the maximum-capacity multicast tree is proportional to Log(K) (K is the number of nodes in a graph). We verify this by fitting a logarithmic curve over the available data set for K=4,5,...,50. Figure 24 shows the data set and the fitted curve. The fitted curve for N (1,0.3) is the following. CM“, = O.l32379L0g(K) +1.01326 Notice that as the number of nodes in the graph increases, the variance of fC (x) decreases, and therefore the mean value becomes a better estimate for the capacity of the maximum-capacity multicast tree. Figure 25 shows the PDF function, fc (x) , evaluated at the mean value of the PDF. 63 Cflax 10 20 30 40 it Figure 24 - The logarithmic growth rate of the capacity of the maximum- capacity multicast tree as the number of nodes in the graph increase f(El: Chax D zol 15:- e 10?- g 7.5 - ‘ TV f1 J i . A A . i A a ‘ ~ 10 20 30 4o Figure 25 - The PDF, fc (x) , evaluated at the mean for different graph sizes 64 With the analytical and numerical analysis provided in this section, we conclude that given a graph with edge capacities drawn from a Gaussian distribution with a known mean and variance, we can estimate the capacity of the maximum-capacity multicast tree. Moreover, this estimates becomes more accurate as the size of the graph increases. 4.7 Implementation Issues There are different implementation issues with the optimization algorithm. The most important implementation issue is finding and forming multicast islands. There are no distinct boundaries for multicast networks. Even if one establishes these boundaries, they will change over time. This is mostly because routers are not consistent in supporting IP multicast traffic. Therefore, the first issue is how to form and maintain multicast islands. Here, we develop a simple ping protocol to establish multicast islands. A multicast island is established and determined by its members. Once a user decides to join a multicast group, after joining the IP multicast group, it will send a ping request to the multicast group. All members of the multicast group are responsible for responding to ping requests within a reasonable time frame. Each response, as well, is sent to the IP multicast group. So, once a ping request is initiated by any user, all members of that multicast group have an updated list of members of that particular multicast group. The user who sent the initial ping request is the initial reflector of that multicast island. Using this ping protocol, all multicast islands are formed, but none of them know about each other. 65 Traditionally, peer-to-peer and overlay networks know about other peers in the network by contacting a central server. In this case, the initial reflector of each group in a multicast island registers the multicast island with a central server. Once a reflector registers, the central server updates all reflectors with the new multicast island’s and its reflector’s information. The reflectors then contact each other and get a list of their members. The reflector within each island will update the multicast island, by sending other islands’ and participants’ information. Bandwidth estimation is the next step required to find the optimum multicast tree. Once the bandwidth estimation is done, the best links between islands and associated reflectors are identified, and the multicast tree is identified accordingly. 4.8 Simulation Results The efficiency of the proposed algorithm is simulated in this section. Matlab is the primary tool for these simulations ([5]). In addition, the MatLog toolbox for Matlab is used for solving some optimization sub problems like finding a Minimum Spanning Tree ([6]). First, the capacity of the optimum multicast tree is studied as the number of multicast islands increases (Figure 26). In this setup, a Gaussian distribution with an average of four and variance of two is used for the number of nodes per multicast island. A Gaussian distribution with an average of 64KBps and a standard deviation of 2MBps is used for link capacities. These link capacities are rounded to 64KBps steps to be more realistic. The optimum tree capacity is obtained by the algorithm proposed in section 4.5.2. 66 2200 r i n .5 1800 i ‘ ‘ l 1600 1400- . - - Multicast Tree Capacity (KBps) 1200- - 1000 r 4 l 1 50 100 150 200 timber of unicast Bistros Figure 26 - Effects of the number of multicast klands on the optimum multicast tree capacity Simulation results show that as the number of multicast groups increase, the capacity of the optimum multicast tree increases as well. This is, because as the number of multicast islands increases, the number of possible multicast trees increases with power two. That means the number of possible multicast trees grows faster than the number of multicast islands. 80, the probability of finding a better multicast tree increases. Next, we investigate the effects of island size on optimum multicast tree capacity. The setup is the same as previous; except for the number of multicast islands that has a normal distribution with an average of 10 nodes and a variance of 9. 67 2200 I l I 2000 I 1800 l J I 1600 1400 - Multicast Tree Capacity (KBps) 1200 ‘ 1ooo . - . . 50 100 150 200 Average Milicast Island Size Figure 27 - Effects of the size of multicast islands on the optimum multicast tree capacity As of the previous case, as the number of nodes in each island increases, the inter- island connections take on higher capacities and therefore, the capacity of the optimum multicast tree increases. The optimum capacity behavior is now investigated for several multicast groups. In this setup, the average number of islands is 30, and the average island size is 5. The capacity of edges involved in an optimum multicast tree is updated with the capacity of this tree, and the algorithm is repeated until no multicast tree with a nonzero capacity could be allocated (Figure 28). 68 1600 . . . g 1400- - g 1200- - E 1000- - g g- 800- - E‘ 600- - 5- 400- - O 5 200- . o m r L o 50 100 150 200 The Nth Multicast Group Figure 28 - The behavior of optimum multicast tree capacity for several multicast groups As the number of multicast groups increase, the capacity of the optimum multicast tree drops down exponentially. This is an undesirable effect, but this algorithm is designed to maximize the capacity of the multicast trees for individual groups and not for several multicast groups together. On the other hand, since there is a significant capacity gap between the first couple of multicast groups, it means different multicast groups can carry different types of traffic, which need different capacities. For example, the first multicast group can be used for video, the second multicast group for audio, the third for text chat, and so on. 69 Chapter 5 Conclusion and Future Work In this thesis, we formulated a generic framework for inter-partition networking. Although this framework is applicable to many areas in networking, our focus throughout the thesis has been the application of inter-partition networking to multicast connectivity throughout the Internet. First, we present a simple single-reflector based approach to provide multicast connectivity across the Internet. The load increase on the reflector side is proportional to the square of the number of multicast disconnected users. We enhance this model by adding a second reflector. The load increase on the reflector side comes down to a linear function of the number of multicast disconnected users. The two reflector model is only capable of connecting two multicast disconnected networks (multicast islands). Next, we extend the two reflector model to a multi-reflector model that is capable of connecting multiple multicast islands. This will eventually lead to a multicast enabled Internet, but through a multi-reflector architecture. In this architecture reflectors form an overlay network. The performance of the multi-reflector model can not be studied without identifying and constraining the model. We claim that, if the reflectors are constraints to only one reflector per island, finding an optimum multicast tree (a multicast tree that the capacity of its minimum capacity edge is maximized) is NP-hard. Then, we proved that if the number of multicast groups (each multicast group forms a multicast tree) is high, selecting the reflectors which can form the optimum trees is NP-hard. Next, we relaxed the one reflector per multicast island constraint, to one reflector per inter-island 7O constraint. We developed an algorithm with complexity 0(K2E(X )2) which selects the best reflectors and forms the best multicast tree among them. Finally, after finding the optimum multicast tree, we estimated the capacity of the optimum multicast tree using both analytical and numerical analysis. We used a graph with random edge capacities, drawn from a Gaussian distribution. We observed that the capacity of the optimum multicast tree has a growth rate of Log(K) as the number of nodes in the graph, K, increase. Moreover, our estimate of the capacity of the optimum multicast tree becomes more accurate as the number of nodes in the graph increase. Future work of this thesis includes, but is not limited to the followings. 0 Developing a detailed protocol for the multi-reflector architecture. 0 A formal proof for NP-hardness of problem P1. 0 Deriving the estimate for the capacity of the optimum multicast tree in terms of the graph size, mean and variance of the Gaussian i.i.d. random process. 0 Estimating the capacity of the optimum multicast tree for other edge capacity distributions like exponential, uniform and heavy tailed distributions. - Studying the effects of using a hybrid solution to group content delivery on multicast islands and the evolution of these islands using game theoretic techniques. 0 Extending the concept of inter-partition networking to other areas such as security and sensor networks. 71 AppendixA Eliminating the summations in expression (6) for fC (x) using the HyperGeometric function is done by Mathematica [20]. The following is the derived formula. fc(x)= KO'JI—Z'F(K ——3——K+4]K! 2 -K2+K—1 _(m-x)2 x 1 _(x-m)2 K_2 x 1 _(Jr-m)2 2 2 2 e 23’ (K—1)1-IJ——2;_—e 20’ dx 1+ [We 20’ dx 7rd _, Ira ( x 1 _(x—m)2 \ —— 202 dx—l 2 _ 2 J" e x _(x-m) HyperG l, K +3K 2,K,"° 27:0 2 -2(K-2) I l e 202 dx-l 2 x 1 _(x'm) _. 27:0 I e 203 dx+l _.. 27:0 1 x 1 (x—m)2 \fl KK 3 l 2 e 202 dx-l HyperG 2,:-—(2;-)-,K+1,:° no (rm): ’ I l e 203 dx+l 4, 27m j“ Where HyperG (a,b,c,x) = HyperGeometric(a,b,c, x) 31;, abrb+l),. 2(a). (b). x~ l!c 2!c(c+l) ,4, (C), n! =1+ 72 The HyperGeometric function, HyperGeometric(a,b,c,x), converges for |x|<1 or if x = i1 and c > a + b ([22]). The following HyperGeometric functions appear in fc (x) : ( x l _(x—m)2 \ K2+3K 2 l 2 e 202 dx—l o HyperG 1, 2 .K.'°° 72'0' (“M2 I l e 2"2 dx+1 K .5. 272'0 ) ( x 1 _(x—m)2 ) K K 3 l 2 e 20 dx—l ' HyperG 2.— (2— )-K+l-"’° w (My I l e 202 dx+1 \ .... 27m x l _(x-Ir?2 e 2" dx—l . . . lJZrto . In both HyperGeometric functions x rs equal to x (PM), . Since, I 21 e- 202 dx+l _, no x _(x—m)2 I ‘f21__ e 202 dx is always between 0 and l, x is always between -1 and 1. Therefore, ..., no the HyperGeometric functions in fC (x) are convergent. 73 [ll [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Reference Gary, Michel R., W. H., Computers and Intractability, Freeman and company, 1983. Pp. 187-285. T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, Introduction to Algorithms, 2'“d Edition, MIT Press/McGraw-Hill, 2003, pp 966-1003. P. Raghavan and C. Thompson, “Randomized rounding: A technique for provably good algorithms and algorithmic proofs,” Combinatorica, 7:365--374, 1987. Venkatesan Guruswami , Sanjeev Khanna , Rajmohan Rajaraman , Bruce Shepherd , Mihalis Yannakakis, “Near-optimal hardness results and approximation algorithms for edge-disjoint paths and related problems,” Journal of Computer and System Sciences, v.67 n.3, p.473-496, November 2003. The Mathworks Homepage: http://wwwmathworkscom/ Matlog Homepage: http://www.ie.ncsu.edu/l_gy/matlog[ Y.-H. Chu, S. G. Rao, and H. Zhang, A Case for End System Multicast, in Proceedings of ACM SIGMETRICS, June 2000. J. Feigenbaum, C. Papadimitriou, and S. Shenker, “Sharing the cost of multicast transmissions,” Journal of Computer and System Sciences, 63(1), Aug. 2001. J. Brooks and M. J urczyk, “Creating edge-to-edge multicast overlay trees for real time video distribution ," Internet Computing '03 / International MultiConference in Computer Science , Vol.1, pp. 371-377, Las Vegas, Nevada, June 2003. Tsuen-Wan, Johnny Ngan, Dan S. Wallach, and Peter Druschel, “Incentives- Compatible Peer—to-Peer Multicast,” 2nd Workshop on Economics of Peer-to- Peer Systems (Cambridge, Massachusetts), June 2004.. John C.-I. Chuang and Marvin A. Sirbu, “Pricing Multicast Communication: A Cost-Based Approach,” Telecommunication Systems, 17(3):281-297, 2001. Micah Adler, Dan Rubenstein, “Pricing multicasting in more practical network models,” Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, p.981—990, January 06-08, 2002, San Francisco, California. J. Feigenbaum, C. H. Papadimitriou, S. Shenker, “Sharing the cost of multicast transactions,” STOC 2000. 74 [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] Ayman El-Sayed, Vincent Roca, and Laurent Mathy, “A survey of proposals for an alternative group communication service,” IEEE Network, special issue on Multicasting: an enableing technology, J anuary/February 2003. C. Diot, B. N. Levine, B. Lyles, H. Kassem, and D. Balensiefen, “Deployment Issues for the IP Multicast Service and Architecture,” IEEE Network Magazine, January/February 2000. R. Finlayson, “The UDP Multicast Tunneling Protocol”, work in progress, draft- finlayson-umtp-07.txt, Sept. 2002. P. Pames, K. Synnes, and D. Schefstrom, “Lightweight Application Level Multicast Tunneling Using Mtunnel,” Comp. Commun., vol. 21, no. 15, Apr. 1998,pp.1295—13 Yoid Project Homepage: http://www.icir.orglvoid/ . Microsoft Research ConferenceXP Project: http://www.conferencexp.net/ . Wolfran Research Inc.: http://wolframcom/ A. Rodriguez, J. Gatrell, J Karas, R. Peschke, T CP/IP Tutorial and Technical Overview, 7‘11 Edition, IBM RedBooks, August 2001, pp. 229-258. G. F. Simmons, Differential Equations With Applications and Historical Notes, 2“‘1 Edition, McGraw-Hill, 2003, pp. 199-203. F. Buckley, M. Lewinter, A Friendly Introduction to Graph Theory, Prentice Hall, 2003, pp. 47-64. R. Motwanti, P. Raghavan, Randomized Algorithms, Cambridge University Press, 1995, pp. 18-25. ‘ Complexity Classes: http://en.wikipediflrywiki/ImggeComplexitv classespng E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, D. B. Shmoys, The Traveling Salesman Problem, John Wiley & Sons Ltd., 1985. K. Savetz, N. Randall, Y. Lepage, "MBONE: Multicasting Tomorrow's Internet," Hungry Minds, Inc, March 1996. B. Williamson, "Developing IP Multicast Networks: The Definitive Guide to Designing and Deploying CISCO IP Multi- cast Networks," Cisco Press, January 2000. T. A. Maufer, "Deploying 1P Multicast in the Enterprise," Prentice Hall, December 1997. 75 [30] [31] [32] [33] [34] [35] [36] [37] [38] D. Waitzman, C. Partridge, S.E. Deering, "Distance Vector Multicast Routing Protocol," RFC 1075, Nov-01-1988. A. Freier, K. Marzullo, "Multicast Transport Protocol. S. Armstrong," RFC 1301, February 1992. B. Whetten, L. Vicisano, R. Kermode, M. Handley, S. Floyd, M. Luby, "Reliable Multicast Transport Building Blocks for One-to-Many Bulk-Data Transfer," RFC 3048, January 2001. B.N. Levine, J. Crowcroft, C. Diot, J.J. Gracia-Luna-Aceves, J. F. Kurose, "Consideration of Receiver Interest for IP Multicast Delivery, " INFOCOMM, 2000. H. W. Holbrook, D. R. Chriton, "IP Multicast Channels: EXPRESS Support for Large-scale Single-source Applications," SIGCOMM, 1999, pp. 65-78. B. Cohen, “Incentive Build Robustness in BitTorrent,” May 2003, http://wwwbittorrent.com/. L. Buttyan, J. P. Hubaux, "Toward a Formal Model of Fair Exchange - a Game Theoretic Approach," May 2000, Technical Report No. SSC/1999/039, Swiss Federal Institure of Technology. D. Qiu, R. Srikant, "Modeling and performance analysis of bittorrent-like peer-to- peer networks," In Proceedings of ACM SIGCOMM 2004. Access Grid Home Page: http://www.accessgrid.org/. 76 riliigiiigiii[in