This is to certify that the dissertation entitled

A FRAMEWORK FOR DISTRIBUTED WEB SERVICES

presented by Yew-Huey Liu has been accepted towards fulfillment of the requirements for the Ph.D. degree in Computer Science.

Major Professor
Date: Sept. 16, 1996

A FRAMEWORK FOR DISTRIBUTED WEB SERVICES

By

Yew-Huey Liu

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science

1996

ABSTRACT

A FRAMEWORK FOR DISTRIBUTED WEB SERVICES

By Yew-Huey Liu

In essence, the World Wide Web is a worldwide string of computer databases using a common information retrieval architecture. With the increasing popularity of the World Wide Web, more and more functions have been added to retrieve not only documents written in HTML (Hypertext Markup Language), but also those in other forms through the Common Gateway Interface (CGI), by constructing HTML documents dynamically.

One of the most exciting promises of the Web is the Digital Library. Large-scale Digital Libraries will make huge collections available to thousands of people over wide geographical distances using wide-area computer networks. These libraries will be some of the largest distributed systems ever built. Combining the Web with Digital Libraries will require solutions to the problems of efficiently generating HTML pages containing digital images from huge digital collections and indices, and of efficiently navigating through them. Dynamic construction of HTML documents for handling information such as digital libraries is slow and requires much more computer power. A significant performance bottleneck is the initialization and setup phase for a CGI process to gain access to the system containing the data.

In this thesis, we present the design and implementation of a framework for distributed Web services. Combining the connection manager daemon, the cache manager daemon, and a number of client processes (i.e., cliette processes), we address the performance issue of generating dynamic HTML pages with the existing CGI interface. In a joint study between IBM and the Florida Center for Library Automation (FCLA), this framework has been used as a gateway between the FCLA state-of-the-art Bibliography Search server and the IBM Visual Info system. Using the Extended Unified Trace Environment (UTE) tools for program visualization and performance analysis, we show that the framework for distributed Web services provides an efficient gateway solution to address CGI performance issues.

© Copyright 1996 by Yew-Huey Liu. All Rights Reserved.

To my husband Eric and my lovely daughters Michelle and Alicia

Acknowledgments

I would like to take this opportunity to express my appreciation to a number of people, without whom this dissertation could not have been completed. First, my sincerest thanks go to my advisor, Lionel M. Ni. He has taught me many things about computer science, and has been my mentor, my colleague, and my adviser since my enrollment at Michigan State.
He provided valuable guidance during my graduate study, and his positive influence on my personal and technical development will carry forward into my future endeavors. I am also grateful to Professors Abdol-Hossein Esfahanian, Philip K. McKinley, Matt W. Mutka, and Richard Brandenburg for their valuable advice and encouragement, and for serving as members of my dissertation committee.

The support provided by IBM through IBM's Graduate Work Study Program is highly appreciated. I would like to thank my former manager, Sigmund Handelman, for giving me the opportunity to pursue my degree. My sincere thanks go to my manager, Paul Dantzig. His constant encouragement and support have helped me finally finish this long journey.

This thesis could not have been written without the help and understanding of my family. My very special thanks go to my husband, C. Eric Wu, for his everlasting encouragement, support, patience, and love. I proudly share this accomplishment with him and my two lovely daughters, Michelle and Alicia. Last, but not least, I would like to thank my mother for her love, encouragement, and helping hands through the years.

Table of Contents

LIST OF TABLES
LIST OF FIGURES

1 Introduction
1.1 Background
1.2 Motivation
1.3 Related Work
1.4 Organization

2 Distributed Web Services
2.1 The HTTP Daemon
2.1.1 The HTTP protocol
2.1.2 The high-performance HTTP server
2.2 Software Architectures for Distributed Web Services
2.2.1 Connection Manager Daemon
2.2.2 Cliette processes
2.2.3 CGI processes
2.2.4 Cache Manager Daemon

3 Unified Trace Environment and Its Extension for Distributed Web Services
3.1 Unified Trace Environment
3.1.1 Distributed parallel systems
3.1.2 UTE trace generation and libraries
3.1.3 Tools and visualization
3.2 UTE Extensions for Distributed Web Services
3.2.1 Existing benchmarking tools and open issues
3.2.2 New trace events - IP_Send, IP_Recv
3.2.3 Dynamic trace generation
3.2.4 Multiple trace channels
3.2.5 Unique IDs for trace generation
3.2.6 Clock synchronization
3.2.7 On-line timing routines for run-time timing data and statistics
3.2.8 Enhancement to the utility command - ute2ups
3.2.9 Enhancement to the NUPSHOT program
3.2.10 User markers - secCGI and secCache

4 Performance Evaluation of the Framework
4.1 Prototype System Setup
4.1.1 Performance of the traditional design
4.1.2 Our framework solution
4.2 Design Considerations
4.2.1 Connection Manager
4.2.2 Number of cliette processes
4.2.3 Number of SP nodes for cliette processes
4.3 Performance Results
4.3.1 Workload
4.3.2 Influence of the number of cliette processes
4.3.3 Influence of the number of SP nodes
4.4 Observation

5 A Digital Library Using the Framework
5.1 Overall View of the FCLA Digital Library Web Services
5.2 The Internal Design of the Visual Info Cliette Process
5.2.1 Generating the HTML page when the value of DisplayMethod is 2
5.2.2 Generating the HTML page when the value of DisplayMethod is 1
5.3 The Internal Design of the Primary CGI Interface - CGIscript
5.4 The Internal Design of the CGI Interface - GetGif
5.5 Performance of Distributed Web Services

6 Digital Library Performance Analysis and Visualization
6.1 FCLA Digital Library Trace Environment Setup
6.2 Standard HTTP Setting Without Using CMD/Cache Support
6.3 Running on a Single Workstation
6.4 Running on a Single IBM SP2 Node
6.5 Running on a Single Workstation with Cache Manager
6.6 Running on a Single IBM SP Node with Cache Manager
6.7 Running on a Cluster of Workstations
6.8 Running on a Cluster of Workstations with Cache Manager
6.9 Running on Three IBM SP Nodes
6.10 Running on an IBM SP System with Cache Manager
6.11 Running on a Cluster of Four Workstations with One Workstation Dedicated to the HTTP Daemon
6.12 Observations and Lessons Learned

7 Conclusion and Future Research
7.1 Research Contributions
7.2 Directions for Future Research

APPENDICES

A Sample CMD Configuration File and CGI API Calls
A.1 Sample CMD Configuration File
A.2 General Purpose Request Block
A.3 Connection Manager Daemon API
A.4 Cliette Process API
A.4.1 The HTML Request Block
A.5 CGI Process API

B CMDadmin Manual Page and Its Usage Example
B.1 CMD Administration Command Manual Page
B.2 CMDadmin -b Command Example

C Sample Cache Manager Configuration File and Its API Calls
C.1 Sample Configuration File
C.2 Cache Manager API

D Ute2ups Output File

E One Set of the CGI Performance Trace Results

F Attributes for the FCLA_1 Index Class

List of Tables

3.1 A time partition table
3.2 A histogram of MPL events and user markers
4.1 Detailed initialization time in the Digital Library environment
4.2 Overhead in our framework, assuming 2,049 bytes per HTML page
4.3 Average Wait/Work time for 1 CGI process versus 1 cliette process
6.1 Elapsed secCGI time statistics (using Web standard components without CMD support)
6.2 Elapsed IP_Send and IP_Recv time statistics on a single workstation
6.3 Elapsed secCGI time statistics on a single workstation
6.4 Elapsed IP_Send and IP_Recv time statistics on a single IBM SP node
6.5 Elapsed secCGI time statistics on a single IBM SP node
6.6 Elapsed IP_Send and IP_Recv time statistics on a single workstation with Cache Manager support
6.7 Elapsed secCGI time statistics on a single workstation with Cache Manager support
6.8 Cache Manager activity statistics on a single workstation
6.9 Elapsed IP_Send and IP_Recv time statistics on an IBM SP node with Cache Manager support
6.10 Elapsed secCGI time statistics on a single IBM SP node with Cache Manager support
6.11 Cache Manager activities on a single IBM SP node
6.12 Elapsed IP_Send and IP_Recv time statistics on a cluster of workstations
6.13 Elapsed secCGI time statistics running on a cluster of workstations
6.14 Elapsed IP_Send and IP_Recv time statistics on a cluster of workstations with Cache Manager support
6.15 Elapsed secCGI time statistics on a cluster of workstations with Cache Manager support
6.16 Cache Manager activities on a cluster of workstations
6.17 Elapsed IP_Send and IP_Recv time statistics on three IBM SP nodes
6.18 Elapsed secCGI time statistics running on three IBM SP nodes
6.19 Elapsed IP_Send and IP_Recv time statistics on three IBM SP nodes with Cache Manager support
6.20 Elapsed secCGI time statistics on three IBM SP nodes with Cache Manager support
6.21 Cache Manager activities on an IBM SP system
6.22 Elapsed IP_Send and IP_Recv time statistics on a cluster of four workstations with one workstation dedicated to the Web server
6.23 Elapsed secCGI time statistics on a cluster of four workstations with one dedicated to the HTTP Daemon
6.24 Summary of performance results

List of Figures

2.1 How service is established between a cliette and a CGI process
2.2 Connection Manager Daemon being used by an HTTP (Web) server
2.3 State Transition Diagram for a cliette process
2.4 CGI usage of the cache to avoid cliette connections
2.5 Cliette usage of the cache to avoid database connections
2.6 Overview of Cache Manager
3.1 Unified Trace Environment for IBM SP systems
3.2 MPL_Send in the UTE/MPI trace library
3.3 A NUPSHOT visualization of matched sends and receives
3.4 NUPSHOT visualization: with and without states
3.5 Visualization of user markers
3.6 File browser for source code association
3.7 Unified Trace Environment for IBM SP systems
4.1 Traditional Web server design
4.2 Our framework solution
4.3 Elapsed time using traditional Web server design
4.4 Elapsed time using our framework solution
4.5 Assuming infinitely fast back-end server
4.6 Assuming cliette processes' Busy ratio is 20%
4.7 Assuming cliette processes' Busy ratio is 50%
4.8 Forty-eight processes versus 48 cliette processes
4.9 Ninety-six processes versus 96 cliette processes
5.1 FCLA Web services
5.2 Graphic representation of documentation stored in the Visual Info
5.3 Flowchart of FCLA cliette processes
5.4 A dynamic HTML page contains a page GIF image
5.5 A dynamic HTML page contains a pick list
5.6 Flowchart of nph-CGIscript process
5.7 Flowchart of GetGif process
6.1 One workstation/SP node without the Cache Manager
6.2 One workstation/SP node with the Cache Manager
6.3 Three workstations/SP nodes without the Cache Manager
6.4 Three workstations/SP nodes with the Cache Manager
6.5 Distributed Web services on a single workstation
6.6 Distributed Web services on a single IBM SP2 node
6.7 Distributed Web services on a single workstation with Cache Manager support
6.8 High-performance Web server on an IBM SP2 system with caching support
6.9 Distributed Web services on a cluster of three workstations
6.10 Distributed Web services on a cluster of three workstations with Cache Manager support
6.11 Distributed Web services on three IBM SP nodes
6.12 Distributed Web services on an IBM SP2 system with caching support
6.13 Distributed Web services on a cluster of four workstations

CHAPTER 1

Introduction

Computers have long been at center stage in providing high-speed computation power and transaction processing services. Mainframe computers dominated the 1960s and 1970s after IBM introduced the System/360 in 1964. Minicomputers later became popular as departmental systems after DEC introduced the VAX-11 in 1977. In these environments, users employed dumb terminals to instruct these host systems. Gradually, it became common to interconnect mainframes and minicomputers to create distributed systems over computer networks.

The 1980s brought PCs, workstations, and LANs. PCs were quickly linked to the large-scale distributed systems, but primarily to emulate the dumb terminal, so the local PC computing capability was not exploited. From their beginning, powerful workstations running engineering/scientific applications used LANs to share data among users. Little by little, users began installing LANs to interconnect a few PCs so they could share printers and exchange data files and messages. PC LAN programs and entirely new LAN operating systems were developed. They made LAN resources, such as server disk storage and printers, appear as if they were part of the user's PC. More significantly, performance of these network-attached devices could be made to appear almost as fast as resident devices! This was the birth of client/server computing. The breakthrough in technology that made client/server computing practical was the introduction of very high-speed LAN networks.

Today, Network-Centric Computing has been described as the next computing wave after host-centric and client/server computing. It represents a form of distributed computing in which the network of computing resources is viewed as the supplier of services. A fundamental trend for servers in Network-Centric Computing is to evolve from traditional database and transaction servers into information distribution and handling systems.
This trend is driven by the rapid growth of wide-area internetworks and the availability of inexpensive microprocessors, which fuel exponential growth in the workstation and personal computer markets. Lower costs in processing and storage have also stimulated the use of many varieties of information, including news sources, document images, video training materials, and financial services. This trend has accelerated because of the World Wide Web (WWW) [1] and the Internet. These networks themselves reflect changes in the economy: corporations need to be in closer electronic communication with their customers, suppliers, mobile workers, and contracting organizations, and with public sources of information and services.

1.1 Background

Internet applications such as file transfer (FTP), remote login (telnet), search (gopher, Veronica), and locate (finger) have been the traditional users of the Internet Protocol (IP). The emergence of the World Wide Web, a collection of distributed hypermedia server systems accessible through Web browsers at client systems, has accelerated the growth of Internet sites and usage. In fact, the term "World Wide Web" is now often used synonymously with the Internet. The World Wide Web is becoming the interface of choice for electronic commerce and for distributing information. Its key attributes are that the protocol is open and that it separates how information is presented from the information content. Many emerging applications, such as virtual shopping malls and information retrieval services, are enabled through the Web with the use of Web browsers.

The recent explosion of interest in the World Wide Web can be traced to the distribution of the CERN (European Laboratory for Particle Physics in Geneva, Switzerland) and NCSA (National Center for Supercomputing Applications) servers and Web client browsers. In particular, NCSA Mosaic, a graphical user interface based on distributed multimedia hypertext for Web browsing, has spawned several commercial variants and has made the Internet readily accessible to a much larger population.

Shortly after NCSA's Web server was established, it became clear that the volume of Web traffic would stress operating systems and network implementations in ways not originally envisioned by their designers. For example, the NCSA server receives 30 to 40 new Web requests per second [2] at peak times. Because the Hypertext Transfer Protocol (HTTP) [1, 3] is connectionless, each such request appears to the server as a separate network connection. Given the exponential growth of the Internet in recent years, and the Web in particular, it is increasingly difficult for organizations and information system staffs to properly anticipate future Web service needs, both in human resources and in hardware requirements. Not only were most implementations of the TCP/IP network protocol not designed to accept connections at this sustained rate; even conservative projections of request rate growth showed that no single processor system could serve all requests. Network statistics from Merit, the NSFNet backbone management group, show that Web traffic is the largest and by far the fastest growing segment of the Internet, and growing numbers of government and commercial groups are making hundreds of gigabytes of data available via Web servers.
At the same time, the Web servers at NCSA have experienced explosive growth in traffic, from 1 million requests per week in February 1994, to 2 million per week in June 1994, 3 million per week in September 1994, nearly 4 million per week in December 1994 [4], and even larger numbers in 1995.

To support continued growth, Web servers must manage a multi-gigabyte (in some instances a multi-terabyte) database of multimedia information while concurrently serving multiple request streams. This places demands on the servers' underlying operating systems and file systems that lie far outside today's normal operating regime. Simply put, Web servers must become more adaptive and intelligent. The first step on this path is understanding exact access patterns and responses. On the basis of this understanding, one can then develop more efficient and intelligent server and system file-caching and prefetching strategies.

A scalable Web service can grow with a seemingly endless increase in the number of user requests by adding more capacity, such as another Web server or more disk or memory, to the Web service [5]. Adding capacity to a scalable Web service should be easily managed; extensive reconfiguring of the entire Web service should not be required. The scalability of the current World Wide Web [6] is mostly accomplished through the distribution of files across a series of decentralized servers. This form of load distribution is both costly and resource intensive. The virtual server concept [5] was introduced to add scalability by dynamically rotating through a pool of Web servers. It uses a round-robin dynamic name server (DNS) to distribute Web requests across a cluster of identically configured Web servers. This type of design reduces single points of failure and increases availability.

A scalable Web server technique is used at the National Center for Supercomputing Applications [5]. The key elements are the use of the Andrew File System to provide a platform-independent distributed file system and Round-Robin Distributed Name service to allow any of several server platforms to respond to queries to the same URL. The result of this architecture is that any number of servers can be added to the available pool, dynamically increasing the load capacity of the virtual server.

The basic approach for a Web server (a.k.a. HTTP daemon or HTTPD) to retrieve an existing Hypertext Markup Language (HTML) [7] document is simply to get it from the documentation tree. With the growth rate of most Web sites, it is almost impossible to manage this hypertext documentation without the aid of a back-end server. Adding new information and deleting outdated information becomes an everyday challenge.

One of the most exciting promises of the Web is the Digital Library. The Web, while it probably contains more information than any single traditional library, is arguably not as useful as a traditional library because it lacks services such as organization and sophisticated search support. The Digital Library has been identified as a "National Challenge" in the Information Infrastructure Technology Application component of the U.S. High Performance Computing and Communications Committee (HPCC). Within the past decade, the number and kinds of digital information sources have proliferated. Computing system advances and the continuing networking and communications revolution have resulted in a remarkable expansion in the ability to generate, process, and disseminate digital information.
Together, these developments have made new forms of knowledge repositories and information delivery mechanisms feasible and economical. Combining the Web and the Digital Library efforts broadens the scope of the information retrieval system. Eventually, large-scale Digital Libraries will make huge collections available to millions of people over wide geographical distances using wide-area computer networks.

1.2 Motivation

As the Web becomes more popular, additional functions have been added to retrieve data more efficiently [8, 9]. In addition, data stored in other forms can be retrieved through the Common Gateway Interface (CGI) [10]. The CGI mechanism is a simple, general-purpose interface that is easy to use. When the CGI mechanism launches a script, the HTTP daemon forks a process and executes it, passing arguments in environment variables. This is a very low-tech interface, but it works on all Unix platforms and for every Web server. Any programming language can be used for the gateway script.

The CGI provides an interface to dynamically construct HTML documents for a Web server, so that data and documents need not be stored in the documentation tree. This allows the use of various tools such as relational databases to provide easy maintenance and manipulation of the documents and data. For example, by storing data in a relational database, one can generate query language scripts in CGI processes to retrieve data from the database and construct HTML documents "on the fly".

The World Wide Web is gaining popularity partially because it gives quick and easy access to a tremendous variety of information in remote locations. Users do not like to wait for their results. Typical users tend to avoid and/or complain about Web pages that take a long time to retrieve. That is, users care about Web latency. Perceived latency comes from several sources. For example, it might take a long time for a Web server to process a request, especially if it is overloaded with CGI processes. Common CGI scripts include those that perform searches on behalf of client requests. Web clients may also cause additional delay if they cannot quickly parse the retrieved data and display it for the users. Latency caused by client or server slowness can be solved simply by buying a faster computer, faster disks, more memory, or a combination of these.

Another cause for delay is that a CGI process may need to gain access to its back-end server upon receiving a Web request. The initialization and setup phase is usually a very time-consuming step, especially for simple requests. Unfortunately, this initialization step has to be repeated many times for many CGI requests, since each CGI process is created to serve just one Web request.

The realization of the dream of combining the Web with the Digital Library will require solutions to the problems of efficiently generating HTML pages containing digital images from huge digital collections and indexes, and of efficiently navigating through them. These huge digital collections and indexes cannot be stored statically as HTML pages, since many require constant updates, and doing so would make managing and creating Web pages in a timely fashion impossible. Various Digital Library projects have used different back-end servers to solve the problems of management and searching, such as the Harvest Information Discovery and Access System [11]. However, no good gateway exists between these back-end servers and the standard Web servers.
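The repeated initialization cost is easy to see in the shape of a typical gateway script. The following minimal sketch shows the traditional per-request CGI pattern; the back-end client library used here (backend_login, backend_query, backend_logout) is purely hypothetical, but the structure is the same for any stateful back-end: the expensive login and session setup is paid again on every single Web request, because the process exists only for the lifetime of one request.

    /* A minimal sketch of the traditional per-request CGI pattern.
     * The back-end calls below are hypothetical placeholders for a real
     * client library (e.g., a database or document-management API). */
    #include <stdio.h>
    #include <stdlib.h>

    extern void *backend_login(const char *user, const char *passwd); /* slow */
    extern char *backend_query(void *session, const char *query);
    extern void  backend_logout(void *session);

    int main(void)
    {
        /* The HTTP daemon passes the request through environment variables. */
        const char *query = getenv("QUERY_STRING");
        if (query == NULL)
            query = "";

        /* Initialization and session setup, paid again on every request. */
        void *session = backend_login("webuser", "webpasswd");
        if (session == NULL) {
            printf("Content-type: text/html\r\n\r\n<H1>Server busy</H1>\r\n");
            return 1;
        }

        /* The actual work is often much cheaper than the setup above. */
        char *result = backend_query(session, query);
        printf("Content-type: text/html\r\n\r\n");
        printf("<HTML><BODY><PRE>%s</PRE></BODY></HTML>\r\n",
               result ? result : "");

        /* The session is torn down; nothing is reused by the next request. */
        backend_logout(session);
        return 0;
    }

In the framework described next, this setup portion is performed once by a long-lived cliette process and amortized over many requests.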
This motivates us to design and implement a framework for distributed Web services. The framework includes the Connection Manager Daemon (CMD), the Cache Manager Daemon, and a set of gateway program APIs (Application Programming Interfaces). Using these APIs, a CGI process can talk to a prestarted back-end client process and get services immediately from the back-end server without spending time on initialization and session setup. These prestarted client processes (referred to as cliette processes) are virtual users who log on to various back-end server applications such as Visual Info [12] and DB2 [13]. Multiple cliette processes can be allocated for each back-end server to better serve Web requests. They can also be invoked on a remote machine to evenly distribute the Web server's load. The main function of the Connection Manager Daemon is to schedule these cliette processes to serve requests coming from various Web gateway processes. Cliette processes can be dynamically started or terminated by the Connection Manager Daemon according to the load of the Web server as it attempts to increase its scalability. By combining the load distribution of cliette processes with a round-robin dynamic name server, we can provide faster Web services with much greater flexibility and scalability.

Repeat requests from Web clients may be served quickly, saving system resources, if the retrieved information is stored and searched. This motivates us to develop a Cache Manager in the distributed framework. The Cache Manager manages information generated by cliette processes. It provides different threads for different types of cached information. The Cache Manager is independent of the Connection Manager Daemon. It stores cached information in memory or on disk, and is configurable to fit different server environments.

A Digital Library is a good vehicle to illustrate the set of gateway program APIs and to help us acquire first-hand experience. This motivates us to develop a Digital Library prototype using the distributed framework with a back-end Visual Info server on IBM SP2 systems. An IBM SP system is a general-purpose scalable parallel system based on the message-passing programming model. It provides a high-performance switch network for message passing and interprocess communication. Such scalable parallel systems are increasingly being used to address existing and emerging application areas that require performance levels significantly beyond what symmetric multiprocessors are capable of providing.

In addition to the distributed framework and a Digital Library prototype, we need a good, flexible tool to monitor Web performance and to better understand the communication patterns among the various components of the framework. Most benchmarking tools available today only provide information about the Web server httpd. While monitoring httpd is sufficient for a general-purpose Web server, it cannot provide any information about how our distributed framework performs. To understand the performance issues, we need tools to trace the framework with minimal overhead. This requirement prompts us to add a tracing facility to our interface.

Running Digital Library servers on scalable parallel systems such as the IBM Scalable Parallel (SP) system provides better expandability to satisfy future growth. The heart of an IBM SP system is a high-performance switch network, a low-latency, high-bandwidth network that binds together hundreds or thousands of IBM RS/6000 processors [14].
Since the high-performance switch network supports IP as well as other message-passing interfaces, an IBM SP system can be viewed as a cluster of RS/6000 workstations with fast IP connections that provide a migration path for Web services. It is a challenging task to develop performance tools for such a scalable parallel system. The trace facility should be able to generate both message-passing and system events (e.g., process dispatch and page fault) with minimal overhead and source code modification. Other issues, such as the clock synchronization problem [15] and support for client/server applications, need to be addressed, too. This motivates us to develop a Unified Trace Environment (UTE) for IBM SP systems and its extensions for the distributed framework.

1.3 Related Work

Surfing the Web has become more and more popular among the general public in recent years. As the number of requests for Web sites throughout the world continues to grow, the scalability of the server architecture, the efficiency of the HTTP protocol, and the effectiveness of caching strategies become increasingly critical research and implementation issues.

One of the most important features of the World Wide Web is that it delivers many kinds of information, including text, images, sound, and movies. Requests for nontextual material tend to be more resource-demanding. Nontextual information uses much more storage space, and requires much more bandwidth or time to transfer. The type of information transmitted is not especially important to the server (it is extremely relevant to the client, of course), but the amount of data to be served is very important. From the server's point of view, the large amounts of data in images, audio, and movies present similar problems.

Web servers have a simple view of the data they serve: the client usually names a file, and that file is delivered in its entirety from disk to the client through the network. The files are never written, and they are always read from beginning to end. The access patterns among different files are not necessarily predictable, because they depend on the needs of remote clients. Also, each request is a separate transaction, and there is minimal opportunity to guess the next request. Although requests from a single client might be correlated, the stream of requests from multiple clients to the server may be completely unpredictable.

Large files pose performance problems for Web servers because they may incur significant latency when read from disk and as they are transferred over the network, especially to bandwidth-limited parts of the Internet. Furthermore, large files will overfill I/O caches, reducing performance even for smaller files. Measuring I/O performance remains a complicated task because the problem space is large. I/O performance depends on the storage devices, buses, and architecture of the platform, as well as the design and implementation of the operating system and file system. The system-wide performance of any given system also depends on the pattern of I/O traffic. Usually, a system can be tuned to be effective for some traffic patterns, at the expense of additional cost for other access patterns.

Related work on performance issues of Web servers includes a study of NCSA Web server traffic in [2, 16]. Performance of the Web server on a specific platform (an HP 735 workstation) can be found in [17].
A comparison of the response time and throughput of Web servers on several UNIX systems, including the HP 735 (HP-UX 9.05), Sun IPC (SunOS 4.1.2), SGI Indy 2 (Irix 5.3), and Cray CS 6400 (a 10-SPARC multiprocessor running Solaris 2.3), was presented in [18]. These performance evaluations were all done prior to the development of generally recognized benchmarks for Web servers. They typically used a series of trials in which a pinger program emitted a series of HTTP requests (one of the test loads) to the server. The pinger program recorded the round-trip time of each request for performance measurement. The WebStone [19] benchmark has become the de facto standard for comparing Web servers. In addition, SPEC is producing the webperf [20] benchmark, which will likely be the future standard of comparison (see [21] for more information on the SPEC benchmark).

Several papers have been published on techniques for improving the performance of Web servers [22, 23, 24]. Pre-forking is one of the popular techniques to improve performance. For example, the Netscape Commerce Server uses multiple pre-forked processes to handle incoming requests. Instead of forking a new process for every HTTP request, a configurable number of processes that reside in memory are pre-forked and are waiting to fulfill HTTP requests. This improves system performance by eliminating the unnecessary overhead of creating and deleting processes to fulfill every HTTP request. Unfortunately, these pre-forked processes cannot perform CGI requests, since running a CGI program requires overlaying the original program image. The Web server still needs to fork a new process to perform any CGI request. Among these improvements, several findings are discussed in [21]. First, delivering large files is dominated by the network transfer time. Second, using CGI scripts incurs significant overhead in all cases, and Perl has more overhead than compiled C. Third, both multithreading and the dispatcher/pre-forking model perform significantly better than a Web server that forks for each request. Fourth, using inetd causes a significant performance loss.

Another alternative to improve CGI performance is to use a special-purpose API, such as Netscape's NSAPI [25]. It allows CGI programmers to rewrite CGI scripts in C so that they can be dynamically loaded into the Web server daemon. Dynamic loading is much faster than fork/exec: in a study done by Haynes & Company [26], the Netscape server with NSAPI outperformed the other Web servers using CGI. The downside is that CGI programs must be rewritten specifically for NSAPI and can only be loaded into a Netscape server, whereas CGI scripts are more portable and can be written in Perl or other interpreted languages. Moreover, the design of NSAPI does not address the problem of the initialization and setup time required to gain access to the back-end server.

Sun's Java [27, 28] is a simple, object-oriented language that operates on the user's computer. It is capable of making connections to server computers and acting as an interactive program. The server's TCP/IP resources are consumed while the user stays connected. In contrast, CGI programs start up, execute, and terminate at the server side in a traditional client/server computing environment; the system memory and resources are freed as soon as the page is downloaded. The two approaches can thus be complementary.
Digital Libraries can implement software "wrappers" (also known as "middleware") to allow diverse systems to interoperate. For example, the Stanford Digital Library project is creating a software "virtual bus" that seamlessly connects a variety of online services [29]. Digital Libraries may also use "software agents," autonomous programs capable of negotiating complex access methods and terms. For instance, the University of Michigan Digital Library project is exploring a complex architecture of collaborating agents capable of customized searching of many repositories [30]. These designs all center around the internal design of the Digital Library, and do not address the gateway between the standard Web server and the Digital Library.

A key to performance in any distributed system is effective caching: transparently replicating and locating information for faster access. The development of large-scale Digital Libraries will need effective information caching, not only for quick access, but also to conserve network bandwidth and reduce the load on servers. The state of caching for the WWW is examined in [31]. Web caching is usually a relatively simple "flat" scheme, consisting of a single cache between the client and the servers of the Web. If the document is not in the cache, it is fetched from the original source. The experimental Harvest system [11, 32] provides a hierarchy of caching servers, which can be accessed by Web browsers. The Harvest caching servers can store information from many sources, including the Web, and are integrated into the Harvest indexing and searching mechanisms. The caches communicate with each other and transfer data using their own protocol, which is more efficient and flexible than HTTP.

The framework for distributed Web services provides an efficient gateway solution to address the performance issue, especially for the Digital Library.

1.4 Organization

In this thesis, we describe our design and implementation of the Connection Manager Daemon and the Cache Manager interfaces, and show program visualization and performance analysis of their cliette processes to demonstrate the efficiency of our framework for distributed Web services. We have implemented a set of libraries and tools, the Extended Unified Trace Environment, which is built on top of the Unified Trace Environment [33] to generate trace events for the Connection Manager Daemon and cliette processes. Our original UTE, developed on IBM SP systems for scientific applications, facilitates the tracing of various events, such as message passing, process dispatch, page fault, and I/O.

The Extended UTE trace library adds more functions to analyze and visualize new events associated specifically with the Connection Manager Daemon interface and the Cache Manager interface. We also modify UTE to extend its scope to client/server applications. All events can be visualized, not only to understand the communication behavior of the Connection Manager Daemon, the Cache Manager, and cliette processes, but also to understand system responses and to pinpoint the bottlenecks of the application.

In Chapter 2 we explain the design and implementation of the framework for distributed Web services. The original UTE and its extensions for client/server applications are described in Chapter 3. We first examine the overhead and performance of our gateway design in Chapter 4. A Digital Library using the framework is presented in Chapter 5.
In Chapter 6, we show performance results and trace visualization of the distributed framework on multiple platforms for the Digital Library. Concluding remarks are given in Chapter 7.

CHAPTER 2

Distributed Web Services

The World Wide Web is a client-server system that integrates diverse types of information on the global Internet and on enterprise Internet Protocol (IP) networks. Clients and servers on the Web communicate using the HyperText Transfer Protocol (HTTP). The HTTP protocol is layered on the TCP/IP protocol, so that it runs on any IP network. The typical Web client is a browser: an interactive application, most often with a graphical user interface. A Web browser can display a number of built-in data types, including formatted documents, images, data entry forms, and hyperlinks leading from one document to another. Web servers, also called HTTP servers, send data (documents, images, etc.) to clients in response to their requests.

Information integration is the key to the power of the Web. The Web provides three distinct forms of integration.

The first form of integration is the Web's ability to link data provided by different servers. Each data item in the Web is addressed by a Uniform Resource Locator (URL). Web documents, expressed in HyperText Markup Language (HTML), can contain the URLs of other documents. Browsers typically display these references, called hyperlinks, as special regions called anchors. An anchor can be a section of highlighted text or an icon. When the user clicks on an anchor, the browser retrieves the document referenced by the underlying URL. The newly retrieved document can come from a server located across the globe from both the client and the server that provided the document containing the anchor.

The second form of integration is the Web's ability to provide clients with data from diverse sources via the Common Gateway Interface (CGI). The CGI interface provides a simple, easy-to-use method of executing programs from within a Web server. Web servers integrate diverse sources of data by allowing CGI programs to run in response to client requests; CGI programs perform general computations, including accepting data, communicating with other computers, and creating dynamic pages. In this way, for instance, a Web server can provide clients with data obtained by running transactions on a legacy mainframe system. In such a scenario, the Web server acts as a gateway, translating into the new standard for interactive information access (HTTP) from a previous one (3270 terminal protocols). The CGI has many advantages, including portability between server software, and a large base of public domain programs and development tools designed for its use. The biggest limitation of the CGI interface is its inability to share data and communication resources. When a CGI program is accessed by a client, a new copy of the CGI program is invoked for each remote client. If a program must access an external resource (such as an IPC pipe to another resource or a database to retrieve documents), it must continually close and reopen that resource.

The third form of integration is the Web's ability to encompass new types of data. The HTTP protocol borrows a design for extensible data typing and type negotiation from the Multipurpose Internet Mail Extensions (MIME) [34, 35] standard. Web browsers integrate diverse sources of data by supporting several Internet data access protocols in addition to HTTP.
For instance, a URL can specify the File Transfer Protocol (FTP) for retrieving data from a file server. Thus, a Web browser is an FTP client as well as an HTTP client. Many Web browsers also support the Gopher (document browsing and indexing) and NNTP (bulletin-board access) protocols. Browsers can also support new data types via helper applications that a user can add to the browser. This is how Web browsers deliver audio, video, and PostScript data to users today. The Web is prepared for whatever new data types become important in the future.

In this chapter, the framework for distributed Web services is presented to address performance issues associated with the second form of integration, namely the delivery of diverse data provided by CGI processes. Our design provides application programming interfaces so that users can generate dynamic HTML pages through the framework to request information from back-end resources without constantly closing and reopening those resources.

Several high-performance HTTP daemon designs have recently appeared to address the inefficiency of HTTP daemons in general when handling static HTML pages. To fully understand the features provided by the framework, we explain the basic HTTP protocol and discuss various high-performance HTTP daemon designs in Section 2.1. The detailed design and implementation of our framework for distributed Web services is discussed in Section 2.2.

2.1 The HTTP Daemon

2.1.1 The HTTP protocol

A URL for Web server content has the form

protocol://server/path

where

- protocol is the protocol to be used for retrieving the content. For Web server content the protocol is HTTP, S-HTTP (Secure HTTP), or HTTPS (HTTP with Secure Sockets Layer).

- server is the Internet host name, e.g., "www.ibm.com". The server part may include a port number, e.g., "www.ibm.com:8003". This allows one host to run multiple Web servers, each bound to a different port.

- path is a UNIX-style path name, e.g., "software/products/kidriffs.html".

So "http://www.ibm.com/software/products/kidriffs.html" is a possible URL.

The HTTP protocol is extremely simple. A typical interaction, in which a client retrieves some data given a URL, involves the following steps (a client-side sketch follows the list):

1. The client opens a TCP/IP connection to the server mentioned in the URL, using the default port 80 if the URL does not specify a port.

2. The client composes a request message containing the method (GET to retrieve data), the path, and other information, such as a list of data types that the browser knows how to handle. The client formats the request message as a series of name-value pairs, encoded using the long-established RFC 822 [36] conventions for Internet electronic mail headers.

3. The client sends the request message to the server over the TCP/IP connection. The server reads and interprets the request.

4. The server composes a response. The response begins with a status code line that summarizes the result: OK, Bad Request, Unauthorized, Not Found, and so forth. The server may include other information, such as the content type, encoded in RFC 822 format after the status code line. Finally, the server formats any data as a MIME message body.

5. The server sends the response message to the client over the TCP/IP connection. The client reads the response.

6. The server closes the connection; the interaction is complete.
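These six steps can be exercised with nothing more than the ordinary sockets interface. The following minimal client sketch (host name and path are the examples from above; error handling is abbreviated) performs steps 1 through 3, then reads the response of steps 4 and 5 until the server closes the connection in step 6:

    /* A minimal HTTP/1.0 client sketch: connect, send a GET request,
     * print the raw response (status line, headers, and MIME body). */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(void)
    {
        struct hostent *hp = gethostbyname("www.ibm.com");    /* step 1 */
        if (hp == NULL)
            return 1;

        int sock = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(80);                  /* default HTTP port */
        memcpy(&addr.sin_addr, hp->h_addr, hp->h_length);
        if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            return 1;

        /* Steps 2 and 3: method, path, RFC 822 style headers, blank line. */
        const char *req = "GET /software/products/kidriffs.html HTTP/1.0\r\n"
                          "Accept: text/html\r\n"
                          "\r\n";
        write(sock, req, strlen(req));

        /* Step 5 (client side): read until the server closes (step 6). */
        char buf[4096];
        ssize_t n;
        while ((n = read(sock, buf, sizeof(buf))) > 0)
            fwrite(buf, 1, (size_t)n, stdout);

        close(sock);
        return 0;
    }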
A browser makes several connections (requests) per typical Web page, because an HTML document may contain both text and graphical images. The document's text is stored within the document's HTML file, but the images are not: each image has its own URL embedded in the file. To display an HTML document, as the browser reaches each embedded image, it must perform another HTTP request to retrieve the image.

The size of a typical request message is relatively small, a few hundred bytes. But responses have a bimodal distribution. A typical HTML file is a few thousand bytes long, and can be transferred over a 14.4 kilobit per second dial-up link in a few seconds. But images are often much larger, occupying tens or hundreds of thousands of bytes, and audio or video data are typically even larger; it may take minutes to communicate these large data objects. Thus, a server that receives frequent requests for multimedia data must service many concurrent HTTP connections.

The HTTP protocol is designed for stateless servers, meaning that servers retain no information about clients between connections. Because an HTTP server is stateless, it can restart and clients will notice nothing more than a delay. This stateless design improves the user-perceived reliability of the Web.

2.1.2 The high-performance HTTP server

An HTTP daemon is a concurrent program: it generally has several client requests in progress at the same time. Any server that processed one client's request to completion before beginning to process the next client's request would be very inefficient, for two reasons:

- Request processing begins when the server accepts a connection and starts reading the client's request, and does not end until the server has written the final byte of the response back to the client. Thus, the time it takes the server to process a request depends upon the speed of the server (software, operating system, and hardware) and the complexity of the request (size of file retrieved, or amount of computing done and size of result produced by CGI programs), and also depends upon how quickly the client is able to send the request and receive the response. The client may be slow, or the client's network connection may be slow. In either case, dedicating the entire server to work on a single request would force the server to be idle while waiting for the client to send the next request packet or the next acknowledgment of a response packet. The faster the server, compared to its clients and to the network paths to its clients, the greater the performance benefit of concurrent processing.

- The server's processor, disk, and network interface can each do useful work at the same time. While the processor is parsing one request, the disk can be reading a file to satisfy a second request, and the network interface can be sending a packet in response to a third request. The more processors, disks, and network interfaces a server has, the greater the performance benefit of concurrent processing.

Most HTTP daemons achieve concurrent processing by simply forking a new process for each connection as it arrives. The process works on the request until completion, then terminates. The drawback of this forking HTTP server design is the large overhead per connection. Each process occupies considerable RAM and swap space, and creating and destroying a process consumes many processor cycles. The forking design results in a low-performance server.

More advanced HTTP servers that have appeared recently are based on a "pool of processes" or "pre-forking" design. The HTTP server creates a set of identical processes during initialization. As each connection arrives, an idle process is removed from the pool and assigned to handle the request. The process works on the request until completion, then returns to the pool.
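A minimal sketch of this pre-forking design is shown below. The listening port, pool size, and the handle_request() routine are placeholders; the point is that the fork() cost is paid POOL_SIZE times at start-up rather than once per connection:

    /* A minimal sketch of the "pool of processes" HTTP server design:
     * N workers are forked once at initialization and then take turns
     * accepting connections; no per-connection fork occurs. */
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define POOL_SIZE 16

    extern void handle_request(int conn);  /* parse request, send response */

    int main(void)
    {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(80);
        bind(listener, (struct sockaddr *)&addr, sizeof(addr));
        listen(listener, 128);

        /* Create the identical worker processes during initialization. */
        for (int i = 0; i < POOL_SIZE; i++) {
            if (fork() == 0) {
                for (;;) {                    /* each worker loops forever */
                    int conn = accept(listener, NULL, NULL);
                    if (conn < 0)
                        continue;
                    handle_request(conn);     /* work until completion */
                    close(conn);              /* then return to the pool */
                }
            }
        }
        for (;;)
            pause();        /* the parent only keeps the pool alive */
    }

Since every worker blocks in accept() on the same listening socket, the operating system hands each arriving connection to exactly one idle worker.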
This pool-of-processes HTTP server design gives better performance than the forking design, because it avoids the overhead of per-connection process creation and destruction. Yet the design is still limited to handling a moderate number of concurrent connections, largely because it makes inefficient use of memory. Each process still occupies a lot of RAM and swap space, and when these resources are used up, so is the server's connection capacity.

This inefficiency motivates designs using a multithreaded environment. Multithreading means that one process can work on many concurrent requests. Far fewer processes are needed, and so each process is busy a greater fraction of the time, making more efficient use of system resources. The Netscape Commerce Server is one variation of the multithreaded design. Note that whether the server is pre-forked or multithreaded, it creates a new process for each CGI program that needs to run; this process dies when the CGI program finishes. The definition of CGI programs requires the server to create a new process for each use.

2.2 Software Architectures for Distributed Web Services

Much of the data a Web provider wants to put out on the Web is managed by existing commercial applications and their supporting resource managers (such as transaction processing and database systems). The ability to view and optionally update these data and run these applications across public networks like the Internet can be of great benefit to enterprise Web providers. Web clients will be major points of entry for commercial applications. The ability to use the Web browser to execute transactions provides a powerful capability in the integration of the public presence system with the enterprise business management system. This can take several forms, including:

- Web browser access to the existing commercial application resource managers, by using a gateway at the HTTP server to provide access to those applications, including:
  - providing impedance matching between the stateless Web and stateful application servers;
  - converting application presentation methods to generate HTML;
  - handling a serial stream of requests from the Web and initiating multiple concurrent threads and processes.

- Using the Web browser for new and existing applications, and incorporating multimedia for use over the Web.

The CGI is a standard for interfacing these application resource managers with information servers, such as HTTP or Web servers. A plain HTML document that the Web daemon retrieves is static, which means it exists in a constant state: a text file that does not change. CGI programs, which are executed in real time, go beyond the static model of a client issuing one HTML request after another. Instead of passively reading server data content one pre-written screen at a time, the CGI specification allows the information provider to serve up different documents depending on the client's request. The CGI specification also allows the gateway program to create new documents on the fly, that is, at the time the client makes the request. For example, a current Table of Contents HTML document, listing all HTML documents in a directory, can easily be composed by a CGI program. CGI programming really expands the horizon of the Web.
The simple concept of passing data to a gateway program instantly opens up all sorts of options for a Web developer and changes the nature of the World Wide Web. Now a Web developer can enhance his or her content with applications that involve the end user in producing output.

However, using CGI has serious performance drawbacks. First, in the current HTTP server design, each CGI request received by the HTTP server forces the daemon to spawn a process to handle the CGI request. Forking a process is costly and imposes a heavy burden on the underlying operating system that affects the entire HTTP server's performance. As we pointed out in the previous section, using a pre-forked or a multithreaded HTTP server design still requires creating a new process for each CGI program. Second, to connect with the back-end server, each CGI process has to open the resource and close the resource when it finishes. Constantly closing and re-opening back-end resources not only wastes system resources, it also makes sharing the retrieved information impossible.

To provide a flexible solution instead of providing yet another HTTP server, we decided to implement a framework providing distributed Web services that can be used with any HTTP server design (including the high-performance HTTP server) through standard CGI interfaces. There are four different components in our design: the Connection Manager, cliette processes, the Cache Manager, and CGI processes. The Connection Manager manages cliette processes and listens to requests from CGI processes. Cliette processes send requests to and retrieve data from back-end servers on behalf of Web clients. The Cache Manager handles information constructed by cliette processes, and CGI processes are gateway processes between the HTTP server and cliette processes. We also provide application program interfaces (APIs) to help users write their own CGI and cliette programs to communicate with our daemon processes.

The main function of the Connection Manager Daemon is to schedule cliette processes to serve CGI requests. After setting up its well-known socket, the Connection Manager starts up a number of processes to serve as cliette processes. The number of cliette processes and their identities are defined in a configuration file. A cliette process can be created on a remote machine. It can be created on a different platform with a different operating system as long as the TCP/IP socket interface is available. Initialization steps of the cliette process include setting up its own socket for CGI script processes to talk to and informing the connection manager of its socket number and process ID. The connection manager keeps the returned values (i.e., socket number and process ID) in a queue. This queue is used by the connection manager to choose a free cliette process and to return the cliette's socket number to the requesting CGI process. After initialization, a cliette process opens up a connection with its back-end server using information passed from the connection manager. A cliette process stays connected to its back-end resources until the connection manager either reinitializes or terminates it.

Figure 2.1 shows the event sequence of serving Web requests through CGI processes. A CGI process forwards Web requests to a cliette process and receives results from a cliette process on behalf of Web clients. The shaded area indicates steps that are repeated for each request.

[Figure 2.1: How service is established between a cliette and a CGI process]

When a Web request comes in, a CGI process is started; it sends a request to the connection manager asking for a cliette process through the daemon's well-known socket. The daemon chooses a free cliette process and forwards the cliette's public socket number to the CGI process. The CGI process then forwards the request (i.e., the Uniform Resource Locator string) to the corresponding cliette process. If no cliette process is available, the CGI process goes into a wait state. In this case, the connection manager is responsible for waking up the waiting CGI process when a cliette process becomes available. The CGI process then forwards the request and goes to sleep until the response is ready. Two time-out values are used by a CGI process during this interaction: one is set when it starts waiting for a free cliette, and another is set when it starts waiting for the HTML document.
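From the CGI side, the handshake just described reduces to a few calls. The sketch below is hypothetical: the real interface is defined in Appendix A, and the names and time-out values used here (CMD_GetCliette, CMD_SendRequest, CMD_RecvHTML) are illustrative only:

    /* A hypothetical sketch of the CGI side of Figure 2.1; the actual
     * API is given in Appendix A.  All names here are illustrative. */
    #include <stdio.h>
    #include <stdlib.h>

    extern int   CMD_GetCliette(const char *cmd_host, int wait_timeout);
    extern int   CMD_SendRequest(int cliette_sock, const char *url);
    extern char *CMD_RecvHTML(int cliette_sock, int reply_timeout);

    int main(void)
    {
        const char *url = getenv("QUERY_STRING");   /* passed in by httpd */

        /* Ask the Connection Manager, via its well-known socket, for a
         * free cliette; the first time-out bounds the wait for one. */
        int cliette = CMD_GetCliette("cmd.host.example", 30 /* seconds */);
        if (cliette < 0) {
            printf("Content-type: text/html\r\n\r\n<H1>Service busy</H1>\r\n");
            return 1;
        }

        /* Forward the URL string to the assigned cliette, then sleep until
         * the dynamically constructed page arrives; the second time-out
         * bounds the wait for the HTML document. */
        CMD_SendRequest(cliette, url ? url : "");
        char *page = CMD_RecvHTML(cliette, 60 /* seconds */);

        printf("Content-type: text/html\r\n\r\n%s", page ? page : "");
        return 0;
    }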
When a Web request comes in, a CGI process is started, and it sends a request to the connection manager asking for a cliette process through the daemon's well-known socket. The daemon chooses a free cliette process and forwards the cliette's public socket number to the CGI process. The CGI process then forwards the request (i.e., the Uniform Resource Locator string) to the corresponding cliette process. If no cliette process is available, the CGI process goes into a wait state. In this case, the connection manager is responsible for waking up the waiting CGI process when a cliette process becomes available. The CGI process then forwards the request and goes to sleep until the response is ready. Two time-out values are used by a CGI process during the interaction between the CGI and cliette processes. The CGI process sets one time-out value when it starts waiting for a free cliette. After receiving a free cliette, the CGI process sets another time-out value when it starts waiting for the HTML document. Note that the connection manager and CGI processes can be executed on different machines or nodes.

In addition to requesting services from its back-end server, a cliette process is responsible for constructing HTML documents dynamically and returning them to the corresponding CGI process. Figure 2.2 shows a snapshot of a running system.

Figure 2.2: Connection Manager Daemon being used by an HTTP (Web) server

The Cache Manager is designed to further improve CGI performance by allowing cliette processes or CGI processes to share information. If dynamic HTML pages are cached, caching may prevent a cliette process from requesting the same information and constructing the same HTML page over and over.

We will discuss each component in detail in the following sections. In Chapter 5, we will demonstrate the framework for distributed Web services used in the Digital Library environment. A detailed explanation of the Connection Manager configuration file format and its API can be found in Appendix A. The Cache Manager configuration file format and its API are given in Appendix C.

2.2.1 Connection Manager Daemon

The Connection Manager Daemon reads a start-up configuration file (the default filename is /etc/CMDaemon.conf) for information about starting a cliette process. Depending on the hostname of the cliette process, the system routine exec() or rexec() is used: exec() is used to start a local cliette process, while rexec() is used to start a remote cliette process. The Connection Manager Daemon maintains a base queue to keep information about newly created cliette processes. As soon as the Connection Manager receives confirmation from a successfully initialized cliette process, it adds this new cliette process into a second queue, the AvailQueue. The Connection Manager Daemon manages cliette processes in a first-in first-out fashion. It does not distinguish a local cliette from a remote cliette when assigning CGI requests. A third queue, the UnAvail queue, is used to keep a list of busy cliette processes.
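One plausible data layout for these queues is sketched below; the structure and field names are invented for illustration and are not taken from the dissertation.

    /* One descriptor per cliette, filled in when the cliette reports
     * its socket number and process ID back to the Connection Manager. */
    typedef struct cliette {
        int id;                 /* cliette identity from the config file   */
        int pid;                /* process ID returned at initialization   */
        int sock;               /* public socket number handed out to CGIs */
        struct cliette *next;   /* link within one of the three queues     */
    } cliette_t;

    cliette_t *base_queue;      /* every cliette started from the config   */
    cliette_t *avail_queue;     /* initialized, idle cliettes (FIFO order) */
    cliette_t *unavail_queue;   /* cliettes currently serving a request    */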
The Connection Manager Daemon uses a well-known socket port for interaction with CGI and cliette processes. This well-known socket is used:

• for a cliette process to initiate the first connection after being started by the connection manager;

• for a CGI script process to request a cliette process;

• for a system administrator to issue commands, such as starting a new cliette process, stopping a cliette process, or debugging a cliette process.

In addition to listening to the well-known socket port, the Connection Manager Daemon also listens to several private socket ports between itself and cliette processes. These socket ports remain open as long as cliette processes are active. This open-socket connection supports the following activities:

• The Connection Manager sends maintenance requests to cliette processes.

• The Connection Manager gets an interrupt when a cliette process dies. In this case, the Connection Manager Daemon can restart the cliette process without user intervention.

• A cliette process announces that it is free to serve more requests.

• A cliette process announces that it has performed certain maintenance steps.

• Dynamic trace generation and termination. This will be discussed in more detail in Section 3.2.3.

A system utility, CMDadmin, is provided for various maintenance commands. A UNIX-style manual page for CMDadmin is included in Appendix B.

2.2.2 Cliette processes

Cliette processes can be dynamically created on machines/nodes either locally or remotely, depending on current system loads. A remote cliette is created by the Connection Manager using rexec() on the remote host, which is defined in the configuration file. A cliette process requests a dedicated socket connection with the connection manager after successful initialization. If a cliette process resides on a host outside a firewall in the open Internet environment and the Connection Manager resides inside the firewall, the connection between the Connection Manager and the cliette is made through the firewall host using the SOCKS [37] service. In this case, the cliette process will not initiate the socket connection with the Connection Manager, because the firewall will block any connection request from outside. Instead, the Connection Manager initiates the connection with the remote cliette process through the cliette's pre-defined socket port. If the Connection Manager fails to make the connection with a remote cliette process, no automatic retry will be made. However, a system administrator can use the CMDadmin utility to force the Connection Manager to retry the connection request at a later time.

During the cliette initialization phase, a dedicated connection is made between the cliette process and the Connection Manager, and the information listed in the configuration file is passed to the cliette process as environment variables through the dedicated socket. A cliette process can retrieve these environment variables using the getenv() subroutine call. The environment variables passed to the cliette process are used by the cliette's login subroutine to open the connection with a back-end server such as Visual Info or DB2. The following example shows an entry in a configuration file for a cliette process A; the environment variables available to the cliette are CLIETTE_NAME, CLIETTE_PASSWD, and CLIETTE_EXEC_PATH, whose uses are self-explanatory.

    cliette:{CLIETTE_NAME=userA:CLIETTE_PASSWD=passwdA:CLIETTE_EXEC_PATH=/etc/xxx_cliette}
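Inside the cliette, the login routine can pick these variables up with ordinary getenv() calls. The sketch below is illustrative only: connect_backend() is a hypothetical stand-in for the product-specific login call (Visual Info, DB2, and so on) that a real cliette would make.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical back-end login; a real cliette would call the
     * Visual Info or DB2 client library here. */
    extern int connect_backend(const char *user, const char *passwd);

    int cliette_login(void)
    {
        const char *name   = getenv("CLIETTE_NAME");
        const char *passwd = getenv("CLIETTE_PASSWD");

        if (name == NULL || passwd == NULL) {
            fprintf(stderr, "cliette: credentials not received\n");
            return -1;
        }
        /* Open the long-lived back-end connection; it stays up until
         * the Connection Manager reinitializes or stops this cliette. */
        return connect_backend(name, passwd);
    }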
After initialization, a cliette process listens to the Connection Manager through its dedicated socket connection for commands. The following list shows the possible commands from the Connection Manager and the corresponding actions.

Cliette_dojob: The Connection Manager Daemon is telling the cliette process to get ready for a new CGI process.

Cliette_reinit: Cliette re-initialization. If the cliette is currently serving a CGI request, the cliette ignores the command and returns an error to the Connection Manager.

Cliette_stop: Cliette termination. Again, the cliette ignores the command and returns an error to the daemon if it is currently serving a CGI request.

Cliette_ayt: An "Are you there?" request. The Connection Manager expects a response from the cliette within a time-out period if the cliette is still active.

Cliette_kill: Kill a hanging cliette process. This request is useful to stop a runaway cliette.

Cliette_debug/debugend: The cliette process starts/stops the debugging procedure.

Cliette_traceon/traceoff: The cliette process turns tracing on/off.

An active cliette can be in one of the following states:

• Cliette_STARTUP: The cliette process is being initialized.

• Cliette_ZOMBIE: The cliette process has encountered an error and is waiting for the Connection Manager daemon's action.

• Cliette_AVAIL: The cliette process is available for service.

• Cliette_BUSY: The cliette process is currently serving a request.

Figure 2.3 shows the state transition diagram of a cliette process. A cliette is in its Cliette_STARTUP state while it is being initialized by the Connection Manager. Subsequently, it enters the Cliette_AVAIL state. Upon receiving a request from a CGI process, the Connection Manager finds a free cliette and tells the CGI process what the cliette's public socket is. The Connection Manager sends a Cliette_dojob request, along with the CGI process's process ID, to the selected cliette process. The cliette process changes state to Cliette_BUSY when a Cliette_dojob request is received from the Connection Manager. The request tells the cliette process to wait for a URL string coming from a CGI process whose ID is specified in the request block. The cliette process then sets up a time-out value while waiting for the CGI process. If the time expires before the cliette process receives any connection request on its public socket port, the cliette process changes its status back to Cliette_AVAIL and ignores any connection requests coming from its public socket port. The cliette process will likewise ignore a URL request and return to the Cliette_AVAIL state if the connection request is from an unexpected CGI process. A cliette process will not leave the Cliette_BUSY state after it has received the CGI request until the request is served, unless it is terminated by the Connection Manager Daemon's Cliette_kill command.

Figure 2.3: State transition diagram for a cliette process

Each cliette process needs to provide a login routine and a logout routine for service initialization and termination. The login subroutine is called automatically by the initialization routine InitSelf() or when the cliette process receives a Cliette_reinit command from the connection manager. The logout subroutine is called when the cliette receives a Cliette_stop request from the connection manager to terminate its service.
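Taken together, the command handling amounts to a dispatch loop on the dedicated socket. The following sketch is schematic: the helper routines are hypothetical stand-ins for the framework's real protocol code, the actual wire encoding of commands is not shown, and the refusal of Cliette_reinit/Cliette_stop while busy is omitted for brevity.

    /* Schematic cliette command loop; all helpers are hypothetical. */
    enum cmd { CLIETTE_DOJOB, CLIETTE_REINIT, CLIETTE_STOP,
               CLIETTE_AYT, CLIETTE_TRACEON, CLIETTE_TRACEOFF };

    extern enum cmd read_command(int sock);  /* blocks on dedicated socket */
    extern void serve_cgi(void);             /* wait for URL, build HTML   */
    extern void announce_free(int sock);     /* report Cliette_AVAIL again */
    extern void send_ack(int sock);
    extern void trace_on(void), trace_off(void);
    extern int  cliette_login(void);
    extern void cliette_logout(void);

    void cliette_loop(int cmd_sock)
    {
        for (;;) {
            switch (read_command(cmd_sock)) {
            case CLIETTE_DOJOB:           /* a CGI process was assigned     */
                serve_cgi();              /* Cliette_BUSY while serving     */
                announce_free(cmd_sock);  /* back to Cliette_AVAIL          */
                break;
            case CLIETTE_REINIT:          /* re-run login (refused if busy) */
                cliette_logout();
                cliette_login();
                break;
            case CLIETTE_STOP:            /* orderly termination            */
                cliette_logout();
                return;
            case CLIETTE_AYT:             /* "are you there?" heartbeat     */
                send_ack(cmd_sock);
                break;
            case CLIETTE_TRACEON:  trace_on();  break;
            case CLIETTE_TRACEOFF: trace_off(); break;
            }
        }
    }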
The first message, an identification packet sent by the requesting CGI process to the cliette process, contains the CGI process ID, a subset of the CGI environment variables, and some URL-related information (such as the size of the entire URL string). These environment variables and URL-related information are set by the receiving cliette process dynamically using the putenv() subroutine, and these values are cleared once the cliette process finishes serving the current CGI request.

In our design, additional Web services can be added by simply adding a machine/node to run additional cliette processes, provided the application back-end server can keep up with the requests. This significantly improves the scalability and flexibility of the Web services.

2.2.3 CGI processes

An HTTP server provides dynamic HTML document generation through CGI processes. Upon receiving a URL string containing a CGI request from a Web client, the HTTP server forks a child process to run the CGI script. The newly created CGI script inherits a number of environment variables [6], which indicate how the HTTP server is set up and how to communicate with the Web client. The CGI process terminates after serving the URL request. To communicate with the cliette processes efficiently, we designed a set of application programming interfaces (APIs) for a CGI process to use:

• GetCliette(): to request a free cliette from the Connection Manager Daemon.

• ConnectCliette(): to make a connection to the assigned cliette and send the cliette process its identification packet.

• PutURL(): to forward its URL string or other information to the cliette process. This subroutine is used to respond to the GetURL() issued by the cliette.

• WaitForHTML(): to wait until the dynamically generated HTML document is ready and then start receiving it from the cliette process.

When a CGI process is ready to request a cliette process, it uses the GetCliette() subroutine to ask the Connection Manager for a free cliette process. The subroutine returns the cliette's public socket address and the cliette's machine name. If no free cliette process is available, the CGI process is forced to wait. The CGI process can also tell the Connection Manager how long it intends to wait for a free cliette. If the time-out occurs before a cliette process becomes available, GetCliette() returns a -1 value to the CGI process. If a free cliette process is available, the CGI process uses ConnectCliette() to connect to the public socket port of the free cliette. After the connection is made, the CGI process forwards an identification packet to the cliette process. This identification packet also contains a subset of the CGI environment variables, including request_method, content_type, content_length, script_name, path_info, path_translated, query_string, remote_host, remote_addr, server_name, and server_port. These environment variables are set by the cliette process dynamically for each CGI request. The cliette process then sends an acknowledgment packet and issues GetURL() to get the URL string. After receiving the acknowledgment packet along with the request for the URL string, the CGI process issues PutURL() to send the URL string to the cliette process. Note that the CGI process is passive: it will not send any URL string until the cliette requests it. The CGI process then issues WaitForHTML() and starts reading the dynamically generated HTML document, which is generated and then sent by the cliette using SendHTML().
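With these routines, the body of a typical CGI gateway program reduces to a few calls. The prototypes below are our assumptions (the dissertation names the routines but does not reproduce their signatures), and the buffer sizes are arbitrary.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Assumed prototypes for the framework's CGI-side API (Appendix A). */
    extern int GetCliette(char *host, int *port, int timeout);
    extern int ConnectCliette(const char *host, int port);
    extern int PutURL(int sock, const char *url, int len);
    extern int WaitForHTML(int sock, char *buf, int buflen);

    int main(void)
    {
        char host[256], buf[8192];
        int  port, sock, n;
        const char *url = getenv("QUERY_STRING");  /* standard CGI variable */

        if (url == NULL)
            url = "";
        if (GetCliette(host, &port, 30) < 0)  /* wait up to 30 s for a    */
            return 1;                         /* cliette; -1 on time-out  */
        sock = ConnectCliette(host, port);    /* sends the ID packet      */
        PutURL(sock, url, strlen(url));       /* answers the GetURL()     */

        /* Relay the dynamically generated HTML back to the Web client. */
        while ((n = WaitForHTML(sock, buf, sizeof(buf))) > 0)
            fwrite(buf, 1, n, stdout);
        return 0;
    }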
Both the PutURL()/GetURL() and WaitForHTML()/SendHTML() pairs are implemented in a way similar to stream file I/O; i.e., the reader waits until the writer puts something in the channel. Both the reader and the writer can specify how much information they intend to read or write. The CGI process sends the requested information back to the Web client as soon as it receives the dynamically generated HTML document. Depending on their design, some Web clients may start processing the HTML information before the entire document is received.

In the current design of the framework for distributed Web services, these CGI processes are individual processes forked by the HTTP server. Each Web CGI request creates a new CGI process. Although we wish to make CGI processes as small as possible, the forking of a CGI process is still costly. Because our goal is to provide a generic CGI interface without modifying existing Web servers, we have little control over how CGI processes are created. Two approaches may be combined to reduce the impact of forking a new CGI process. One is to use a high-performance HTTP server to reduce the overhead associated with retrieving static HTML pages; the other is to place the HTTP server, Connection Manager Daemon, and cliette processes on different machines/nodes to distribute the load. Newer approaches, such as Netscape's NSAPI, could also be used to improve CGI performance. They present no conflict with our design.

2.2.4 Cache Manager Daemon

A Cache Manager interface permits flexible caching policies to further streamline connections by reducing the need to contact even the cliette for data that have been recently fetched. A Cache Manager listens on its own well-known port, which is used by both CGI processes and cliette processes. The Cache Manager is multithreaded and capable of managing multiple caches in a single process. Each cache is configurable to use disk, memory, or both to store cached data. Each cache is also configurable to use passive (client-controlled) management policies, aggressive (cache manager-controlled) policies, or a combination of both. If a cache manager has been configured, the CGI process may choose to look in the local cache, possibly eliminating the need to contact a cliette at all. The cliette may also use the cache to avoid back-end server requests or to store previously returned information. Figure 2.4 and Figure 2.5 demonstrate these sequences. A typical cache could contain information that is used to form a dynamic HTML page. For example, it could contain a complete list of database query results while only the first X items are shown to the client. If a Web client requests the next X items, the CGI process could retrieve them from the cache instead of issuing another SQL query.

The Cache Manager API provides a set of primitive functions usable by other processes to provide caching of data. Very little policy is implemented directly in the cache manager, and what is implemented can be overridden by the processes that access it. That is, each type of process may use the Cache Manager API to implement a policy appropriate to the application. Because the Cache Manager runs as an independent process, it provides a common cache usable by multiple processes on multiple machines. Different applications may use the same Cache Manager, but they should be aware of each other to avoid key conflicts.

The cache API uses a socket interface to the Cache Manager to resolve requests. The Cache Manager listens on a well-known port, defined in /etc/services as the service "ibm-cachemgr". If no port is given in /etc/services, the port specified in the configuration file is used.
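This lookup order maps directly onto the standard getservbyname() call, with a fallback to the configuration file. A minimal sketch, in which config_port() is a hypothetical routine that reads the Cache Manager configuration file:

    #include <netdb.h>
    #include <netinet/in.h>

    extern int config_port(void);  /* hypothetical: port from config file */

    /* Resolve the Cache Manager port: /etc/services first, then the
     * configuration file (either may be overridden by -p at startup). */
    int cachemgr_port(void)
    {
        struct servent *se = getservbyname("ibm-cachemgr", "tcp");
        if (se != NULL)
            return ntohs(se->s_port);
        return config_port();
    }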
Both port values may be overridden by using the -p command line parameter when starting the Cache Manager. Note that if /etc/services is not used to establish the port, any process attempting to use the Cache Manager must be informed of the correct port to use. If the Cache Manager is used in conjunction with the Connection Manager, cliettes are automatically informed of the port by the Connection Manager. CGI processes, however, may need to be started with the correct port as a parameter if /etc/services is not used to define the port.

The Cache Manager runs as a multithreaded process that manages one or more cache objects. Each cache object may be configured to enforce differing policies. That is, a Cache Manager manages multiple, independent caches. Each cache is identified with a unique character string assigned by the configuration file during initialization. The cache may be configured to keep cached data in memory, on disk, or both. Each data object consists of a key (token) and data pair. Tokens are mapped into the file system when caching to disk. The policy under which data are purged is configurable:

• Purge items older than some threshold on a regular basis.

• Purge items only when requested by the client.

Purging of old items always occurs to make room for new items when the cache capacity is exceeded.

Figure 2.6 shows two Cache Managers configured for different purposes. The CGI Cache Manager is configured as an "in-memory" cache because it is known that the data it caches are always small, that the data change approximately every 20 to 30 minutes, and that slightly out-of-date results are acceptable. The cost of contacting the cliette and retrieving new results is potentially expensive. This cache is configured to maintain items for a maximum of 20 minutes. When a CGI is spawned, if the item is in the cache, it can be returned without the need to contact a cliette. If the item is not found, the cliette is contacted and the item is sent both to the Cache Manager and to the client. The cliette Cache Manager is used to cache large images retrieved from a library database. These data are static, changing rarely if at all. The cache is configured to maintain its data on disk, purging items only if capacity is exceeded or if the cliette requests it. If capacity is never exceeded, the cliette alone determines whether a cache entry is stale by the date returned when the cache is queried.

Figure 2.4: CGI usage of the cache to avoid cliette connections

Figure 2.5: Cliette usage of the cache to avoid database connections

Figure 2.6: Overview of Cache Manager

CHAPTER 3

Unified Trace Environment and Its Extension for Distributed Web Services

3.1 Unified Trace Environment

In this section, we describe the design and implementation of the Unified Trace Environment (UTE), which will be used as the base to capture CMD and cliette events. Parallel programs differ from sequential programs in a significant way: whereas one can often predict the behavior of a sequential program by understanding the algorithm employed, the behavior of parallel programs is notoriously difficult to predict. Even more than sequential programs, parallel programs are subject to "performance bugs," in which the program computes the correct answer, but more slowly than anticipated.
What is needed, then, is instrumentation to collect data that leads to an understanding of the program's behavior with minimal overhead. UTE was developed on IBM Scalable Parallel systems for tracing message passing parallel applications. It provides trace libraries, utilities, and visualization tools for application programmers to understand not only the communication patterns of the application, but also the system responses to the user program. We first describe the problems of trace analysis for distributed parallel systems in Section 3.1.1. Two libraries, UTE/MPI and UTE/MPL, were developed for MPI (the standard Message Passing Interface) and MPL (the Message Passing Library) applications, respectively; both are described in Section 3.1.2. In Section 3.1.3 we discuss UTE tools for analyzing and visualizing trace events. Using these tools, we are able to pinpoint the source code (if compiled with -g) corresponding to each message passing and user event, and to interleave system events such as process dispatch with message passing/user events in the same time-space diagram.

3.1.1 Distributed parallel systems

Distributed parallel processing is a way to increase system computing power beyond the limit of current uniprocessor technology. Distributed parallel systems promise higher computing power than sequential or vector computers, and are more scalable than shared-memory multiprocessors. On the other hand, programming such a system based on the message passing programming model is much more complex than writing sequential programs. To take advantage of the underlying hardware, understanding the communication behavior and load balancing issues of parallel programs is extremely critical.

One common way of monitoring the behavior of a program is to generate trace events while executing the program. The events generated can then be used for trace-driven analysis [38], program visualization [39, 40], and debugging [41]. In a distributed parallel system, an ideal trace facility should be able to generate user-controllable message passing and system events with minimal overhead and source code modification. If the trace overhead is large, the timestamp associated with each event may be altered significantly, and the statistics and data obtained in performance analysis may be meaningless.

Most user-level trace systems for message passing systems require source code modification to generate message passing events. More advanced tools such as the Paradyn system require no source code modification, because the code for performance instrumentation is inserted into an application program during execution, at the expense of substantial overhead caused by instrumentation daemons.

The capability to collect system events is as important as the capability to collect message passing events. System and I/O events such as process dispatch and page faults reveal crucial information on system responses to user applications. In addition, a trace facility should be easily expandable to trace activities from other software layers, such as parallel I/O file systems and high-level parallel languages, so that the same trace facility may be used to trace multiple software events.

One of the most serious problems in trace analysis for distributed parallel systems is the clock synchronization problem [15]. In a distributed system, each processor (or node) has its own local memory and local clock, and processors communicate with one another by exchanging messages.
In such a system, trace records are generated by multiple processors, and it is often the case that separate streams are produced independently in multiple nodes. The logical order of events may not be guaranteed in the trace due to discrepancies among local clocks. As a result, many trace facilities in distributed systems are forced to do additional work to ensure consistent timestamps, at the expense of increased trace overhead. For example, Lamport [15] developed a distributed algorithm for resource-sharing systems that extends the partial ordering to a consistent total ordering of all events by creating additional messages among processors: a form of barrier synchronization. Since barrier synchronization may take a long time, activating such a trace generation facility may have an adverse impact on the total elapsed time and other timing-sensitive program behavior, and ultimately alter the very program behavior to be analyzed.

UTE, a Unified Trace Environment for IBM SP systems, has been developed to attack all of the above problems. The user-level UTE trace libraries require only re-linking to generate message passing and system events. If application source code is available, additional user markers can be inserted into the source code. This allows a user to generate message passing events with minimum overhead, and to have the choice of marking specific portions of the program, such as various phases, loops, and routines, for performance analysis and visualization.

3.1.2 UTE trace generation and libraries

The main parallel programming model supported by IBM SP systems is message passing. A set of tasks, each executing in its own address space, communicates via calls to message passing libraries. This allows parallel applications to exploit the performance characteristics of the communication hardware. The IBM SP multicomputers connect hundreds of RISC System/6000 processors via a communication network called the High-Performance Switch [14], or simply the "Switch." In each Switch element is a counter called the Absolute Time Counter (ATC). The primary function of the ATC is to enable the Switch to synchronously cycle between its two primary operation modes, called the run mode (for normal data transfer) and the service mode (for servicing the network). The ATC in each element is synchronized within one clock cycle (25 ns) of one or more of its immediate neighbors' ATCs. The ATCs facilitate a closely synchronized, non-drifting global time reference available to all the processor nodes, and thus simplify the well-known clock synchronization problem encountered in distributed systems.

In the presence of a global clock provided by the ATC facility, the clock synchronization problem could be completely avoided if all events used the global clock instead of the local clock. However, this approach is infeasible, as it would require changes to the AIX tracing facility for generating system events. In addition, our experience shows that it is much more expensive to access the global clock than the local one. This is because the local clock register resides inside the processor and can be accessed in tens of nanoseconds, while the global clock register is on the adapter, and several microseconds, including software overhead, are required to access it. We have monitored the drift of the system clocks in an IBM SP1 machine over 3 months. The maximum drift observed was 40 msec/hour. Hence, just cutting a clock adjustment trace event at the beginning of the program execution is not sufficient.
As a result, we access the ATC in the switch adapter periodically in each node to collect global clock events. Each global event contains a global timestamp as well as a local timestamp. The periodic access (once every 400 msec) is implemented through a piggyback function invoked when a low-level communication timer fires, to minimize trace overhead. An alternative implementation, without a piggyback function, uses a local timer and a signal handler for SIGALRM. These global clock events are then used to guarantee that the maximum drift between two timestamps can be adjusted to an amount well below the message passing latency.

UTE trace generation

The AIX trace facility, as part of the IBM AIX operating system, is capable of capturing a sequential flow of time-stamped events to provide a fine or coarse level of detail on system and user activities. The AIX operating system is instrumented to provide general visibility of system events. Possible system events include process dispatch, page faults, system calls, and I/O events such as read and write. Built on top of the AIX trace facility, the UTE trace libraries instrument message passing routines to provide detailed information on message passing activities. The choice to build the UTE libraries on the AIX trace facility provides a unified and easily expandable trace environment for performance analysis. Without such a unified trace environment, multiple trace facilities would be required to trace the various software layers, such as MPI, MPL, PIOFS (a parallel file system), and HPF (High Performance Fortran). That would not only make trace generation more intrusive, but also make performance analysis tedious and difficult.

The UTE trace libraries inherit efficient trace data collection, so that system performance and flow are minimally altered by activating trace generation. For example, the trace facility pins the data collection buffer in main memory to reduce trace overhead, and the size of the data collection buffer can be specified by the user at the time trace generation is activated. This avoids tracing side effects such as page faults, which would ultimately lead to nondeterministic overhead in the tracing itself. The cost of cutting a trace record is broken into two parts: the cost of testing whether the event is enabled and then calling the trace buffer insertion routine, and the cost of the trace buffer insertion routine itself. If a typical trace record has 3 words of data in addition to a one-word event header (a so-called hookword that identifies the event type and record length) and a one-word timestamp, the average cost of cutting a trace record is around 110 machine instructions. Thus, the trace generation facility is efficient and adds only a few μsec to the elapsed time for each trace event.

In UTE, trace generation is controlled by an environment variable, TRACEOPT, which defines trace options such as the system and message passing events the user is interested in, the size of the data collection buffer pinned in main memory, and the file name prefix for trace files. This allows a user to selectively enable generation of events (system or message passing events) at execution time. If the environment variable is not defined, the application will run without generating any trace events.
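The run-without-tracing default falls out of a single getenv() test at library initialization. A minimal sketch, assuming a hypothetical parse routine (the actual TRACEOPT syntax is not reproduced here):

    #include <stdlib.h>

    static int trace_enabled;  /* zero: cut_event() calls become no-ops */

    /* Called once at library initialization (e.g., from MPI_Init). */
    void ute_trace_init(void)
    {
        const char *opt = getenv("TRACEOPT");
        if (opt == NULL)
            return;  /* variable undefined: run without any trace events */
        trace_enabled = 1;
        /* A parse_traceopt(opt) routine would select the event classes,
         * set the pinned buffer size, and record the file name prefix. */
    }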
If a user is only interested in message passing and process dispatch events, other system events, such as page fault and I/O events, will not be generated as long as the user does not explicitly ask for them in the environment variable TRACEOPT.

There are two major message passing APIs supported on IBM SP systems: the IBM Message Passing Library (MPL) [42, 43] and the Message Passing Interface (MPI) [44, 45]. MPL was first developed for IBM SP systems as the primary message passing API. Later, the MPI standard was developed jointly by national laboratories, universities, and computer companies to leverage application development costs across multiple distributed parallel computer platforms. Figure 3.1 illustrates the UTE framework. In addition to these two UTE libraries, hooks have been inserted in MPLp (EUI-H) [46], PIOFS [47], Vesta [48, 49], and HPF [50] using the same framework, thus making it possible to generate message passing events along with system activities, parallel I/O, and high-level parallel language events. UTE supports on-line merging. However, in most cases we collect trace events in the nodes where an application is executing and merge them afterwards, due to the limited LAN bandwidth and the volume of trace events. The merged trace stream is then fed into analysis tools, or converted into formats suitable for visualization.

Figure 3.1: Unified Trace Environment for IBM SP systems

UTE/MPI trace library

To facilitate the building of program instrumentation, MPI provides a profiling interface in which all of the MPI-defined functions may be accessed with a name shift. That is, all of the MPI functions that normally start with the prefix MPI_ are also accessible with the prefix PMPI_. Thus, the profiling interface provides a simple mechanism to "wrap" original MPI functions with any code (e.g., tracing, graphics, printfs, etc.) and export them as official MPI functions. Typically this can be achieved by instructing the linker to support each MPI function also under the name shift. Providing such a general mechanism has several advantages for building profiling libraries:

1. The overhead of generating traces is only present in the profiling library and is not part of the base communication library.

2. Different tracing and profiling facilities can be utilized with the same base communication library.

3. The profiling library can be partial, e.g., only certain functions may be "wrapped."

4. Application code does not have to be changed.

A no-op routine, MPI_Pcontrol(), is also provided in the MPI library for the purpose of enabling and disabling profiling in an MPI profiling library. Thus, we use this MPI profiling interface to build the UTE/MPI trace library on top of the AIX trace facility for IBM SP systems. We capture the begin and end events for each MPI routine along with its arguments and return value. Figure 3.2 illustrates how the UTE/MPI trace library is written. Note that the same approach is used to construct the UTE/MPL trace library.

    #define ev_send_start (0x10)
    #define ev_send_end   (0x11)

    int MPI_Send(void *buf, int cnt, MPI_Datatype type,
                 int dst, int tag, MPI_Comm comm)
    {
        int rc;
        cut_event(ev_send_start, cnt, type, dst, tag, comm);
        rc = PMPI_Send(buf, cnt, type, dst, tag, comm);
        cut_event(ev_send_end);
        return rc;
    }

Figure 3.2: MPI_Send in the UTE/MPI trace library
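From the application's point of view, selective tracing through this interface looks as follows. MPI_Pcontrol() is part of the MPI standard; the convention that level 1 enables and level 0 disables profiling is the common one, although the standard leaves the meaning of the levels to the profiling library.

    #include <mpi.h>

    extern void solver_phase(void);  /* an application routine of interest */

    void traced_region(void)
    {
        MPI_Pcontrol(1);   /* profiling library turns tracing on  */
        solver_phase();    /* only this phase is traced in detail */
        MPI_Pcontrol(0);   /* tracing off for the rest of the run */
    }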
The profiling interface as defined has certain drawbacks. Without access to the MPI internal data structures, it can be difficult to trace all functions efficiently. For instance, for visualization, ranks (i.e., node IDs) are most likely to be displayed as global ranks rather than the local ranks specified in an argument list for a specific communicator. This local-to-global information is readily available in the MPI internal data structures but, if not accessible, must be obtained through a series of MPI function invocations, ultimately increasing the tracing overhead. To reduce overhead, we dump the global rank list for each communicator when it is created, thus providing an easy way to convert from local rank to global rank.

By default, tracing is turned on by the UTE/MPI trace library when the call to MPI_Init() is encountered, and is terminated at the exit of the application. Additional UTE routines are provided through the use of MPI_Pcontrol() to turn tracing on or off at any time. Thus, a user can trace only part of an application in detail while other parts of the application are not traced.

UTE/MPL trace library

Similar to the UTE/MPI trace library, we capture the begin and end events for each MPL routine along with its arguments and return value. The MPL message passing library does not provide name shifting as the MPI library does. However, it does have hooks for collecting trace data for the VT visualization tool. Therefore, we replace all VT trace routines with UTE trace routines to take advantage of the existing trace collection hooks and to generate AIX trace events. The UTE/MPL trace library starts tracing right before the application begins to run, and terminates tracing at the exit of the application. Additional UTE library routines are provided so that users may turn tracing on or off at any time.

The global clock is accessed in both the UTE/MPI and UTE/MPL trace libraries once every 400 msec. In the UTE/MPI trace library, this is implemented through a piggyback function invoked when the low-level communication timer fires. In the UTE/MPL trace library, it is implemented through a local timer and a signal handler for SIGALRM. Our experience shows that both approaches work equally well.

3.1.3 Tools and visualization

A utility, utemerge, is used to merge multiple trace streams based on global timestamps. The merged trace stream is then passed to other tools for trace listing, performance analysis, or visualization. Another utility, lsute, is used to list and analyze UTE/AIX trace files. With no option set, the lsute utility lists each event, including node ID, timestamp, event name, and associated data words. The tool can also generate a histogram for MPI or MPL routines to report the number of times each routine is called, and the total and average elapsed times for each routine called in the application. Since each node in an IBM SP system may be shared by other processes, information on how the total elapsed time was partitioned may be very useful. For the main process, the utility shows both the time when the CPU is running it and the time when the main process is in its compute mode (i.e., not running in any MPI routine). Table 3.1 shows an example of a time partition table for a set of four trace files.
Table 3.1: A time partition table

    Node               0        1        2        3
    Main pid           15076    15183    18901    11172
    Elapsed time       27.155   27.696   27.799   27.702
    Other processes    0.689    0.210    0.127    0.111
    Idle time          0.293    0.259    0.310    0.282
    Main process       26.171   27.226   27.360   27.308
    Compute time       14.649   14.726   14.609   14.559

The analysis of parallel program tracing typically involves matching events in one stream with related events in another stream. For example, in message passing systems it is important to provide users with run-time data such as the observed message passing time. Detailed descriptions of the analysis techniques can be found in [46].

Table 3.2 shows a histogram of all MPL events and user markers of a two-node program. User markers can be inserted in pairs anywhere in a program to collect information about various phases, loops, and routines. The visualization of these events can be found in Figure 3.5. It can be seen in Table 3.2 that it takes little time to execute some MPL routines, such as MP_Task_query and MP_Environ. For example, the total elapsed time for executing an MP_Task_query and generating trace events can be as little as 3.5 μsec. This shows that the trace overhead is indeed very small.

Table 3.2: A histogram of MPL events and user markers

    Event (count)         Node   Total_time    Count   Average
    S_Phase (20)          0      0.453592064   10      0.045359206
                          1      0.468927034   10      0.046892703
    MP_Brecv (20)         0      0.472896768   10      0.047289677
                          1      0.457345280   10      0.045734528
    MP_Bsend (20)         0      0.302807808   10      0.030280781
                          1      0.301400832   10      0.030140083
    Init_Phase (2)        0      0.121678080    1      0.121678080
                          1      0.121463296    1      0.121463296
    MP_Sync (2)           0      0.000151808    1      0.000151808
                          1      0.008803072    1      0.008803072
    MP_Task_query (2)     0      0.000003840    1      0.000003840
                          1      0.000003584    1      0.000003584
    MP_Environ (2)        0      0.000004608    1      0.000004608
                          1      0.000005120    1      0.000005120

The purpose of program visualization systems is to gain insight into the dynamic behavior of programs. UTE provides multiple conversion utilities to convert a merged trace file into formats suitable for visualization, including the SDDF format [51] for Pablo and the ALOG format for UPSHOT/NUPSHOT [52, 53]. For visualization, we are interested in several aspects of the parallel application trace that has been captured by the UTE tracing library.
Displaying user markers along with source code association provides a simple way to extend the tool for better understanding the structure and/or dynamics of the application. We chose NUPSHOT, a public domain visualization tool developed at Argonne Na- tional Laboratory, and modified it to suit our needs. NUPSHOT provides a graphical interface to display timelines of process state information. The trace information is pro- vided in either ALOG or PICL format, two popular trace file formats. Along with other conversion tools, we developed a conversion tool, called ute2ups, that transforms the UTE output into the ALOG file format. Providing such a transformation tool allows one to easily port to other visualization systems without changing other UTE analysis tools. Figure 3.3 shows a snapshot of a N UPSHOT visualization for a three-node program in which a l-MByte message is circulating among all nodes. Matched sends and receives can be displayed by arrows, from the begin event of a send (such as MPI_Send) to the end event of a corresponding receive (such as MPLRecv). Because process interference events (e.g., context switches, etc.) are captured by UTE, the conversion tool simply registers these as special state events, thus not requiring 54 changes for NUPSHOT. A small program with small circulating messages was run, and process dispatch events were traced along with message passing events. Figure 3.4 shows two different views of the same MPI events with and without the state events. An state indicates a period of time stolen by other processes, including the idle process. It can be seen in Figure 3.4 that a big chunk of time was stolen by other processes, especially at node 0. For instance, shortly after all nodes were synchronized at time 0.1215 sec, all other nodes had to wait (in MPI.Recv state) because node 0 had a context switch and was running something else. Note that the idle process may be dispatched if the application is waiting for the completion of an I/O operation, such as page fault. Because both MP1 and MPL message passing libraries are shared libraries and usually loaded at run time, page faults may occur and result in dispatching the idle process. User markers, if used in pairs, can be analyzed and visualized in UTE. They provide an easy way to mark various phases, loops, and routines in application programs. Fig- ure 3.5 shows an MPL program visualization, including two user states, Ini t_Phase and S_Phase. In addition to process dispatch events, other system events such as system calls and I/O activities can be captured as well. Thus, the framework provides an environment not only for end users but also for system software developers to calculate path lengths, understand system behaviors, and eliminate program bottlenecks. To capture source code to process state associations, we do the following. When a trace event is generated, we store the link register from the execution stack of the event generation procedure, such as the profiling MPI call and the routine to generate user markers, in the event itself. The link register holds the address to branch to after the subroutine is completed. This is the instruction immediately after the subroutine invocation in the application program. Although this requires an extra function call and the extension of each event by an extra word, the overhead is negligible in terms of execution time. We extended the ALOG file format to hold an optional instruction address with each event. 
NUPSHOT itself had to be extended as well, to store this instruction address in its internal state database.

Figure 3.4: NUPSHOT visualization: with and without states

Figure 3.5: Visualization of user markers

Figure 3.6: File browser for source code association

In case the application has been compiled with debug information enabled (i.e., with the -g option), line information is available in the executable. Therefore, NUPSHOT was extended with a module that loads the executable and obtains the line information. When a process state is graphically selected and instruction address information is available for this state, this module is queried with the address and returns (similar to the operation of a debugger) the source filename and the line number associated with that address. This information is then provided to a file browser, which highlights the event-generating location in the application's source code. Figure 3.6 shows the file browser that is presented by clicking on an event. As long as the code is compiled with -g, the feature of source code association is available for all message passing events and user markers. Therefore, a user can easily visualize the most time-consuming states on screen, and find out where (which line) in the source code the responsibility for them lies by clicking on the state area.

3.2 UTE Extensions for Distributed Web Services

UTE was originally developed for scientific applications. A scientific application runs on a number of processors (or nodes), which communicate through messages to jointly solve a problem. UTE relies upon the AIX Parallel Operating Environment (POE) to assign a unique node ID to each node. Trace generation in UTE typically starts when the application begins to run, and stops when the application exits. Thus it is able to capture all message passing events along with user markers and system activities.

In addition to scientific applications, many emerging applications follow the client/server model. In a client/server computing environment, a client requests an operation that another program, the server, provides. Upon receiving a client request, the server performs the requested service and returns any result. A client interface specifies the individual services or operations supported by the server. Clients can only request services that conform to the client interface provided by the given server. A client/server application is very different from a scientific application, in that a server may be idle (while ready for client requests) for a long time between incoming requests. For example, a Web server may be very active during the prime shift and close to idle at night. Obviously, tracing the entire session of the Web server would cause unnecessary events to be generated, and terminating the Web server merely to collect trace events does not make sense in a real-world application. Thus, a trace facility that does not rely on POE and is capable of dynamic trace generation is needed to trace distributed client/server applications.

3.2.1 Existing benchmarking tools and open issues

Several approaches have been developed for performance measurement of Web servers.
Information stored in the access log may include the document being requested, the size of the requested document, the time it was requested, and the Internet address from which it was requested. The information stored in the access log is then analyzed for performance. The WebStone [ 19], a Web server benchmark, was developed in an attempt to bet- ter understand the performance characteristics of Web services. In particular, it allows performance measurement of the server in terms of the average and maximum response time, average and maximum connect time, data throughput rate, number of pages retrieved, and number of files retrieved. It was developed by Silicon Graphics, and is available on the SGI Web server. It is the generally accepted industry standard for measuring Web server performance. WebStone runs exclusively for clients (i.e., the Web browser), makes all measurements from the point of view of the clients, and is independent of the server software. WebStone is suitable for testing the performance of any and all Web servers, regardless of architecture, and all combinations of Web server, operating system, network operating system, and hardware. Each WebStone client workstation is able to launch a number of children (called Webchildren), depending on how the system load is configured. Each of the Webchildren simulates a Web client and requests information from the server based on a configured file load. A program called WebMaster controlled the starting and stopping of WebStone and the collection of data at the end of each test run. It ran on one of the workstations but used no network or processing resources while the test was running. In addition, system monitoring, such as “vmstate” and “netstat” traces [4], were also developed to store CPU, VM, and network usage information. These trace events are kept for a long period and consume much of disk space. However, the lack of efficient analysis and visualization tools limits the scope of these tools. It is difficult for a human being to inspect these trace events efficiently, and allowing the Web server to use these traces and to adjust the server performance would be even more difficult. The Webperf benchmark is a product of SPEC (Standard Performance Evaluation Committee), a nonprofit organization that develops standard benchmarks and publishes official results [54]. The Webperf benchmark is similar to the SGI WebStone in style 59 and intent, but was developed completely independently. Webperf is based on the SPEC LADDIS benchmark for NFS file servers [55] and has a Web browser interface that was adopted from the SATAN (Security Administrator’s Tool for Analyzing Networks) security tool [56]. The developers of Webperf sought to keep the best features of WebStone while improving its portability, applicability, and validity. Like the WebStone, the Webperf is a "black box" test, generating a workload with one or more client processes on one or more workstations. The response time and throughput are measured by the clients, and the results automatically summarized into a standard report. The Webperf can be configured to use different workloads, and has a Web browser interface. These benchmarking tools provide mechanisms to examine and compare the perfor- mance of Web servers as they work today; they are a firm foundation for evaluating Web server performance. However, important open issues not yet addressed by any of the benchmarks remain. 
One of the issues is the lack of techniques to measure the performance of dynamic documents and scripts. The Web is rapidly evolving from the retrieval of static files to more interactive applications such as image maps, database queries, and Java appliets. These requests will make new and different demands on servers. The existing benchmark tools do not handle these kinds of requests conveniently, if at all. Furthermore, these types of workloads have yet to defined. The framework for distributed Web services provides a gateway API into the back-end server, but measuring the Web performance does not provide enough performance information for our entire Web service. The lack of appropriate monitoring tools had prompted us to extend the UTE trace library to support tracing in the distributed framework for Web services. To support this new client/server computing model, we modify UTE in multiple ways. 3.2.2 New trace events - IP_Send, IP_Recv To support communications through firewall or proxy servers, we implement communica- tion connections through sockets and use SOCKS interface to go through firewalls. This 60 works well in a single workstation, a cluster of workstations, and an IBM SP system using IP through the High-Performance Switch. The connection manager can automatically detect the existence of the High-Performance Switch and take advantage of it. The same socket send/receive interface is thus used regardless of the platform. Since our connection manager is built on the UNIX socket library, we need trace events for UNIX socket send/receive operations. Thus, we add two new events - IP_Send and IP_Recv. With these two new events, we can trace interactions between the Connection Manager Daemon and the cliette process, the CGI process and the Connection Manager Daemon, and the CGI process and the cliette process. The Connection Manger Daemon when running on an IBM SPx machine, detects the existence of the High-Performance Switch automatically and instructs cliette processes to use the High-Performance Switch to take advantage of the speed the switch provides. The CGI process could be instructed to use the High-Performance Switch by setting up an environment variable - CMD-HOST. Communication through IP is used by our Web server design. Send and receive operation through UNIX socket ports are used exclusively for communication among cliette processes, CGI processes, and the Connection Manager Daemon. These operations are captured in the trace file as begin and end events for IP_Send and IP-ReCV. 3.2.3 Dynamic trace generation A dynamic tracing interface, which allows a Web service administrator to turn on/off trace whenever necessary, is provided through the use of the CMDadmin utility. Unlike many other projects in which performance analysis and trace generation is often an afterthought process, the Connection Manager Daemon has a built-in interface to accept trace requests coming from CMDadmin. Upon receiving a Trace.start request, the Connection Manager Daemon calls the TraceCliette routine, which asks each active cliette process to turn on its trace. An ac- knowledgment will be sent back to the Connection Manager Daemon after the cliette 61 process turns on its trace. If a cliette process is busy serving a CGI request, a flag is posted on the cliette queue. When a busy cliette process returns to the clietteAVA/L state, this flag is checked and corresponding trace action will be performed. 
3.2.3 Dynamic trace generation

A dynamic tracing interface, which allows a Web service administrator to turn tracing on or off whenever necessary, is provided through the use of the CMDadmin utility. Unlike many other projects, in which performance analysis and trace generation are often an afterthought, the Connection Manager Daemon has a built-in interface to accept trace requests coming from CMDadmin.

Upon receiving a Trace_start request, the Connection Manager Daemon calls the TraceCliette routine, which asks each active cliette process to turn on its trace. An acknowledgment is sent back to the Connection Manager Daemon after the cliette process turns on its trace. If a cliette process is busy serving a CGI request, a flag is posted on the cliette queue. When the busy cliette process returns to the Cliette_AVAIL state, this flag is checked and the corresponding trace action is performed. This delay ensures that cliette trace events are generated on a CGI-request basis, and prevents trace initialization while a CGI request is being served. The Connection Manager Daemon turns on its own trace only after it has received confirmation messages from all cliette processes. Figure 3.7 shows how the trace request flows. If a cliette process fails to turn on its trace, a tracing error is sent back to the Connection Manager Daemon. In this case, the Connection Manager Daemon may abort trace generation by sending a Trace_stop command to those cliette processes whose traces are already on.

Figure 3.7: Flow of a trace-on request between CMDadmin, the Connection Manager, and the cliette processes

One possible option is to turn on some cliette processes' traces while others run without tracing. This allows a system administrator to monitor problematic cliette processes. The tracing of the Connection Manager Daemon is always on as long as some cliette processes are being traced. Thus, dynamic trace generation of selected cliettes provides a way of measuring and debugging interactions between the Connection Manager Daemon and a newly developed cliette. Trace generation is terminated when the traced process exits or when the system administrator issues a stop-tracing request using the CMDadmin utility.

3.2.4 Multiple trace channels

UTE was originally developed under the control of the AIX Parallel Operating Environment (POE), which dispatches jobs to various SP nodes. The Connection Manager Daemon has task ID zero, and assigns a unique, positive task ID to each cliette process. A trace file is generated for each traced process, with the unique task ID as the file name extension. This saves one word per trace record in raw trace files, because a trace record does not need a field to indicate which task it was generated from. Special precautions are taken to avoid the accidental overwriting of existing trace files by repeated requests to turn on tracing. Each trace on/off request pair generates a set of trace files, and multiple trace on/off requests can be issued during the entire course of the distributed Web services. After trace files are generated, UTE utilities are used to merge and analyze the trace events.

CGI processes are child processes of the Web server, and they reside on the same machine as the Web server. A cliette process can also be on the same machine/node as the Connection Manager Daemon. Even a threaded Web server, such as the Netscape Commerce Server, has its pre-started CGI threads on the same machine/node as the server process. The trace facility must therefore be able to collect events from multiple processes on the same workstation. We therefore added the ability to support multiple trace channels in the UTE+ trace library. This allows the Connection Manager Daemon and its cliette processes to run on the same node, and also allows the capturing of trace events of more than one cliette on each node. An available trace channel is chosen when tracing is turned on for each process. The process using trace channel zero (the primary channel) in each node is capable of generating system events as well as message passing events.
3.2.5 Unique IDs for trace generation

Each traced process needs a unique ID to generate a unique trace file. Previous UTE trace libraries required POE on an SP machine to schedule tasks and assign a unique ID to each collaborating process in scientific applications. In our distributed Web services design, each cliette process has a unique cliette ID. This ID is used mainly by the Connection Manager Daemon to control individual cliette processes. It is also used as a token to distinguish a CGI/cliette pair. A CGI process, after receiving the assigned cliette ID from the Connection Manager Daemon, passes this ID to the waiting cliette process. If this ID does not match the receiving cliette's ID, the request is denied. This prevents tampering with the cliette process without permission from the Connection Manager Daemon. If tracing of CGI processes is needed, the Connection Manager Daemon is responsible for assigning each CGI process a unique ID. To prevent duplicate IDs from being used, the Connection Manager Daemon is the only component that can assign IDs. Because the distributed Web service does not require POE, it is portable across multiple platforms and environments.

3.2.6 Clock synchronization

As discussed in Section 3.1.3, a common time reference makes it easier to merge multiple trace files collected on multiple nodes. In a cluster of workstations, clocks on different systems will drift apart over time if not periodically synchronized. The drift of a clock is the frequency error of the clock relative to a reference clock. Oscillator manufacturers quote frequency errors typically on the order of 1 part per million (1 usec/sec), which represents a drift rate of 1 usec per second, or 3.6 msec per hour. Clock synchronization in a cluster of workstations can be achieved either by manual adjustment or by time-synchronizing daemons such as the Network Time Protocol (NTP) [57] or the timed daemon of 4.3BSD UNIX [58]. Time-synchronizing daemons are specialized software for distributing and receiving time over local area networks. A time server, or a hierarchy of time servers, periodically distributes the current time to client nodes, which can then adjust their clocks accordingly. The NTP daemon [59] keeps clocks synchronized to within 1 to 3 msec of each other, and timed to within 5 msec.

3.2.7 On-line timing routines for run-time timing data and statistics

The CM Daemon uses a FIFO queue to choose the first available cliette process to serve an incoming CGI request. To balance the load of each machine/node, other factors need to be considered. First, a cliette process on a lightly loaded node should be chosen before a cliette process on a heavily loaded node. Second, a cliette process running on a powerful machine/node should be chosen before one running on a low-end machine. Other factors also affect a cliette's performance, such as system memory, system swap space, and disk space. Various history statistics, such as paging statistics and process activity, also help make the decision. We provide on-line timing routines in the UTE+ trace library to collect run-time timing data and statistics. These on-line timing routines provide valuable run-time information for performance steering. They are especially useful because the distributed Web services may run on multiple platforms, including a single-node workstation, a cluster of workstations, and an IBM SP2 machine.
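The text does not give a selection formula, but a load-aware ranking of available cliettes could look like the following sketch, which folds the listed factors into a single score (the weights and field names are ours for illustration; the actual CM Daemon uses a plain FIFO queue):

    /* Run-time statistics for the node hosting a cliette process, as
       collected by the on-line timing routines described above. */
    typedef struct {
        double cpu_load;      /* recent load average of the node */
        double node_speed;    /* relative CPU speed of the machine/node */
        double free_mem_mb;   /* available system memory, in Mbytes */
        double paging_rate;   /* recent paging activity */
    } cliette_stats_t;

    /* Lower score = better candidate for the next CGI request.
       The weights are arbitrary placeholders for tuning. */
    double cliette_score(const cliette_stats_t *s)
    {
        return s->cpu_load / s->node_speed
             + 0.01  * s->paging_rate
             - 0.001 * s->free_mem_mb;
    }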
3.2.8 Enhancement to the utility command - ute2ups

During a tracing period, various CGI processes are invoked to request dynamically generated HTML documents from cliette processes. Tracing all these CGI processes would waste system resources, such as CPU time and disk space. But without the trace data from these CGI processes, many IP_Send/IP_Recv events could not be paired; in other words, an IP_Send event from a cliette process to a CGI process would not have a matching IP_Recv event in the final trace files. This causes problems for the original ute2ups utility. We modified ute2ups to pair only events with positive task IDs, while keeping statistical information for all the unpaired ones. With this change, we can assign task ID -1 to each untraced CGI process when recording an IP_Send or IP_Recv event, to avoid confusion when pairing events. In Appendix D, we show a listing of ute2ups results for both Connection Manager tracing and cliette process tracing.

3.2.9 Enhancement to the NUPSHOT program

NUPSHOT is modified to display additional information in the dynamic pop-up information window, such as the message size for IP_Send and IP_Recv states, and the process name and ID for other processes.

3.2.10 User markers - serv_CGI and serv_Cache

Although there are various types of actions performed by a single cliette process, the most important task is serving a CGI request. By knowing the time taken to serve a CGI request, a system administrator can tune the system. Furthermore, this information can be fed back in real time to the Connection Manager Daemon for dynamic load balancing. A pair of phase markers, b_serv_CGI and e_serv_CGI, is used to mark the beginning and end of a cliette serving a CGI request. This user marker is built into the API library; when users compile their cliette program with the -DUTE_TRACE flag, the marker is automatically included. Another useful user marker is serv_Cache. It is used by the cache manager to indicate the beginning and end of serving a cache item. Depending on how many threads of control a cache manager has, it can generate a user marker for each of its cache threads. In Chapter 6, we discuss how this Cache Manager marker is used by two different threads to generate two different user markers, PickCache and GifCache.
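A plausible shape for such compile-time markers is sketched below; ute_marker() is a hypothetical name for the marker-recording routine in the API library:

    /* When the cliette program is compiled with -DUTE_TRACE, the phase
       markers expand to trace calls; otherwise they compile away. */
    #ifdef UTE_TRACE
    extern void ute_marker(const char *name);      /* hypothetical API entry */
    #  define B_SERV_CGI()  ute_marker("b_serv_CGI")  /* begin serving a CGI request */
    #  define E_SERV_CGI()  ute_marker("e_serv_CGI")  /* end serving a CGI request   */
    #else
    #  define B_SERV_CGI()  ((void) 0)
    #  define E_SERV_CGI()  ((void) 0)
    #endif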
CHAPTER 4

Performance Evaluation of the Framework

In this chapter, we evaluate the performance of the proposed framework. First, we describe the prototype system setup for performance measurement. As a basis for comparison, the performance of the traditional design is measured. We then show that the proposed framework is scalable by examining the influence of the number of CGI requests, the number of cliette processes, and the number of servers on the system.

4.1 Prototype System Setup

In a traditional design, each CGI process needs to establish a connection with its back-end server. As pointed out in Chapter 2, each CGI process needs to perform an initialization/negotiation step before it actually forwards its requests to the back-end server. The time can be significant if there are many CGI processes. If each CGI process performs a relatively simple task, the time required for initialization is significant, wastes system resources, and becomes the bottleneck. And if each of these CGI processes needs to perform some complicated job when constructing HTML pages, it not only adds load to the system running the Web server, but also adds load to the back-end server, resulting in slow response times for all CGI processes. Figure 4.1 shows a high-level block diagram of the traditional method, in which each CGI process establishes its own connection with the back-end server. It also shows that when using multiple HTTP servers to allow more CGI processes, there is no way for the HTTP servers to evenly distribute the CGI connections to the back-end server without knowing the load of the system in advance.

Figure 4.1: Traditional Web server design

In Figure 4.2, we show the high-level diagram using our framework design. In this example, there are at most 8 CGI processes that can be served by cliette processes simultaneously, while others wait for a free cliette. But since these 8 cliette processes are evenly connected to two back-end servers, each can perform a CGI request in a reasonably short time and return to serve the next CGI request. Cliette processes can be created on a remote system to distribute the system load. Even the Connection Manager Daemon can be duplicated on a different machine to manage another set of cliette processes. There can be any number and any type of back-end server, from a database management system to a document management system. Different types of cliette processes can be managed by a single Connection Manager. In Chapter 5, we demonstrate the building of the Digital Library using the IBM Visual Info product as our back-end server.

Figure 4.2: Our framework solution

4.1.1 Performance of the traditional design

Figure 4.3 illustrates the time required for a Web server to access a back-end server when using the traditional design. It details the time required for each step to complete a CGI request.

Figure 4.3: Elapsed time using traditional Web server design (total time = fork CGI process time + initialization/negotiation time + issue requests to back-end server time + wait for results from back-end server time + construct HTML page time + return to Web browser time)

In the traditional design, each CGI has to perform the negotiation/initialization before actually sending requests to be processed by the back-end server. In addition, each CGI needs to log out and terminate its connection with the back-end server when it finishes. For our Digital Library design (to be discussed in Chapter 5), the initialization/negotiation time needed for each CGI process to establish the connection with the back-end server is detailed in Table 4.1, including the network delay. These data were gathered while running 6 CGI processes concurrently on an IBM SP node (SP2 Thin-Node model 390 with 128M memory). The back-end servers are IBM Visual Info servers.
    Steps                     Minimum time (sec)   Maximum time (sec)
    Login                     0.045                0.088
    Session setup             0.125                0.210
    Access Index class        0.452                1.009
    Access Attribute class    0.278                0.404
    Access Linkage class      0.139                0.250
    Setup Cache               0.009                0.012
    Total                     1.048                1.973

Table 4.1: Detailed initialization time in the Digital Library environment

The main operations in this initialization/negotiation phase are login ID/password verification; setting up the session ID and handler; initializing and setting up the connection with the Cache Manager; and accessing and arranging the corresponding index, attribute, and linkage classes. These steps generate about 10 to 15 query statements. The CGI process in the traditional design cannot be reused for any subsequent CGI request. The HTTP server forks a child process when a CGI request is received. After the standard output of this child process is redirected to the open socket port to the HTTP server process, the child process image is overwritten by the CGI script. The only way this CGI process can forward a dynamic HTML page is to write it to standard output, which is then redirected to the HTTP server for return to the Web client. Because this CGI process has no concept of the HTTP server and no other connection with the HTTP server, it cannot be reused. Adding more HTTP servers on an extra machine/node cannot solve this problem efficiently, as illustrated in Figure 4.1.

Figure 4.4: Elapsed time using our framework solution (total time = cliette issues request to back-end server time + cliette waits for results from back-end server time + cliette constructs HTML page time + CGI returns HTML page to Web browser time)

4.1.2 Our framework solution

In our design, the negotiation/initialization is done only once, at the start-up of the cliette process (as shown in Figure 4.4). The differences between our design and the traditional one are highlighted in the shaded areas. These cliette processes stay connected and are ready to forward requests to back-end servers. The added overhead when using our framework is the negotiation/connection time between the CGI process and the cliette process. If this time is considerably less than the initialization time in the traditional design, using our framework reduces the overall time. Our framework overhead measures, on average, around 15 msec (Table 4.2), from when the CGI process asks the Connection Manager for a free cliette process until the connection between the CGI and the free cliette has been established. Instead of spending at least 1 sec for each CGI process during initialization, we add only about 15 msec of overhead. Table 4.2 presents the overhead of our framework, as indicated by the shaded areas in Figure 4.4. The final HTML page is assumed to be 2,048 bytes, which is the size of one network send packet.

    Steps                      Minimum time (msec)   Average time (msec)   Maximum time (msec)
    Between CGI and CMD        7.4                   10.5                  32.2
    Between CGI and Cliette    3.8                   5.2                   16.6

Table 4.2: Overhead in our framework, assuming 2,048 bytes per HTML page

As mentioned in Chapter 1, the forking time in either case could be eliminated by using a dynamic load library, such as Netscape's NSAPI or Microsoft's ISAPI. Because this architecture is not yet used by the general public, we use the standard HTTP server, which forks to start a CGI process, in our environment.
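As a back-of-the-envelope check (our arithmetic from Tables 4.1 and 4.2, not a separately measured result), the per-request saving from replacing the per-CGI initialization with the framework handshake is roughly

    \begin{align*}
    t_{\text{saved}} &\approx t_{\text{init}} - (t_{\text{CGI-CMD}} + t_{\text{CGI-cliette}}) \\
                     &\approx 1048~\text{ms} - (10.5 + 5.2)~\text{ms} \approx 1032~\text{ms (minimum initialization)} \\
                     &\approx 1973~\text{ms} - 15.7~\text{ms} \approx 1957~\text{ms (maximum initialization)},
    \end{align*}

that is, the avoided initialization outweighs the added overhead by roughly a factor of 65 to 125 per request.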
4.2 Design Considerations

To achieve scalability, we had to design our framework with minimum overhead. There are several components that determine how the system should be designed to achieve scalability. We will discuss each component in detail in this section.

4.2.1 Connection Manager

There are several potential drawbacks to our framework design. The Connection Manager Daemon is a potential bottleneck; it must respond to every CGI request. The entire server cannot proceed any faster than the Connection Manager Daemon. For this reason, the Connection Manager code must be carefully written to be as fast and robust as possible. In this section, we examine the influence of this potential bottleneck. Instead of using centralized control, where one Connection Manager manages local and remote cliette processes, multiple Connection Manager Daemons could be used, each managing only local cliette processes, to reduce this potential bottleneck. To balance the number of CGI requests across multiple Connection Managers, however, the Connection Managers must constantly exchange load information, which adds load to each Connection Manager and complicates its design.

4.2.2 Number of cliette processes

The number of cliette processes must be managed. While cliette processes wait for requests, they use up memory and operating system resources. For this reason, it is wise not to create more cliette processes than necessary. On the other hand, if all the cliette processes are busy and more CGI requests arrive, the Connection Manager must either reject the overflow requests, queue them up until a cliette process is free, or start more cliette processes. Which of these three choices is best depends on the server and the situation, although rejecting a connection should be avoided if possible. Starting more cliette processes allows the system to adjust dynamically to the number of CGI requests, but creating and starting the extra cliette processes slows the system at the worst possible time: peak load. Worse, the extra cliette processes may soon become idle if the load drops, and will continue to hang around unless action is taken to retire them. Queuing the requests is better than rejecting them, but if the wait is more than a few seconds, many clients will think the server has crashed and will close the connection anyway. Keeping track of all the pending requests may not be easy to do efficiently, either; the last thing a busy Web system should do at the peak of activity is a lot of bookkeeping. The best strategy, then, is to have enough cliette processes to meet all but the very highest loads, while not being an excessive burden when the system is less busy. The best number of cliette processes for a given server must be determined through experience.
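The three options can be summarized in a short dispatch sketch (ours, for illustration; the queue bound and spawn threshold are assumptions, and the cliette pool is reduced to counters):

    #include <stdio.h>

    #define MAX_QUEUED  64   /* assumed bound on queued CGI requests   */
    #define SPAWN_DEPTH 32   /* assumed backlog that triggers spawning */

    static int free_cliettes = 6;   /* cliettes currently available */
    static int queued        = 0;   /* CGI requests waiting         */

    /* Returns 0 if the request was served or queued, -1 if rejected. */
    int dispatch_cgi_request(void)
    {
        if (free_cliettes > 0) {        /* normal path: hand to a cliette */
            free_cliettes--;
            return 0;
        }
        if (queued < MAX_QUEUED) {      /* queue rather than reject */
            queued++;
            if (queued > SPAWN_DEPTH)   /* start more cliettes, at a cost */
                printf("starting an extra cliette at peak load\n");
            return 0;
        }
        return -1;                      /* reject the overflow request */
    }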
In this section, we examine the average CGI response time versus the number of cliette processes, assuming that the average CGI request costs the back-end server from 500 msec to 1 sec to process. In addition, two different cliette busy rates are used. Normally, a cliette process may not be busy all the time. It may stay idle while waiting for results from the back-end server in order to determine what the next query should be. While one cliette process is idle, it releases the processor and allows other cliette processes to proceed. If a cliette spends more time idling, the system can handle more cliette processes while maintaining reasonable response time. We use two busy rates, 50% and 20%, in our demonstration. These rates are only rough measurements; they do not include the network communication overhead between the cliette processes and the CGI process, or between the cliette process and the back-end server.

4.2.3 Number of SP nodes for cliette processes

Once a node is overloaded with cliette processes, adding cliette processes will only add load to the system, which results in longer CGI response times. At that point, the only solution is to scale the system up from one node to two or more nodes. Our framework design is flexible; it can be set to automatically start up cliette processes on a remote node if the number of waiting CGI requests reaches a predefined number. Because cliette processes are dispatched to serve CGI requests from the free queue in a first-come, first-served fashion, each cliette is kept equally busy. In addition to automatically starting new or remote cliette processes to serve more CGI requests, the system administrator can use a maintenance command to start new cliette processes. The number of cliette processes per node is defined in the Connection Manager configuration file. Distribution of CGI requests to either a local or a remote cliette process is maintained by the centralized Connection Manager.

4.3 Performance Results

A user browsing the Web does not like to wait for an HTML page; users tend to abandon their requests if it takes too long for a page to show up. The performance of a Web server is judged by how fast an HTML page can be returned to the Web client, given that the back-end server performance is not affected by the number of cliette processes connected to it. The same measurement applies to our performance monitoring. The less time it takes a cliette process to generate a dynamic HTML page, the better our framework is. The faster a CGI process can receive and return this dynamic HTML page, the better the overall Web performance is. We use the average CGI request time as the performance metric, assuming each cliette process takes an average of 0.8 sec (as explained in Section 4.3.1) to get a response from the back-end server. Our extended UTE trace library and tools are used to capture and analyze event traces. Customized UTE markers are used to mark the beginning and end of CGI requests; this set of markers helps us identify the total elapsed time for each CGI request.

4.3.1 Workload

We use a modified HTTP server to fork the initial CGI processes in our environment. After the initial stage, in which CGI processes are forked consecutively, these CGI processes remain active and issue CGI requests continuously while we gather the data. The motivation is to control the number of busy cliette processes at all times, to guarantee that all the available cliette processes are busy if the number of CGI requests is equal to or greater than the number of cliette processes. There are two reasons why we choose this model. First, forking a process takes time. For example, if forking a process takes 20 msec, there will be only one busy cliette even with continuous forking, because it takes only 15 msec for a CGI process to finish its requests. We would not be able to see all the cliette processes in the busy state, especially when measuring the overhead of our framework. Second, there is a limit on how many child processes can be outstanding for a single process. If we chose to fork one child process every X msec, we would eventually reach this limit, depending on how fast a cliette process serves a CGI request under the current load.
Table 4.3 lists the minimum average times when using a single CGI process with one cliette process. The work stage is implemented by using a loop to keep the CPU busy; the wait stage is implemented by using the usleep() system call.

    Percentage of   Average time in     Average time in     Average total time to
    Busy time       Work state (sec)    Wait state (sec)    complete one CGI (sec)
    50% Busy        0.39                0.40                0.822
    20% Busy        0.125               0.63                0.773

Table 4.3: Average Wait/Work time for 1 CGI process versus 1 cliette process

In some cases, we assume that the back-end server is infinitely fast. This means the cliette process responds with the final HTML page as soon as it receives the CGI request, which allows us to examine the overhead of our framework. In addition to different numbers of cliette processes (ranging from 1 to 32), we also use different numbers of CGI processes, ranging from 1 to 32.

4.3.2 Influence of the number of cliette processes

In Figure 4.5, we assume that the back-end server is infinitely fast. We use only one SP node for this setting to show the impact of the number of cliette processes on a single node. In the next section, we will scale the number of cliette processes up to at most 16 SP nodes. Figure 4.5 shows that when the number of busy cliettes exceeds a certain number, the average CGI response time begins to increase. This occurs because the cliette processes begin to overload the system resources, especially when they stay busy all the time. If cliette processes are not busy all the time, the system resources can be shared among cliette processes, and more cliette processes can be run on the same system without overloading it. We use a 20% busy ratio and a 50% busy ratio to present the threshold of the system.

Figure 4.5: Assuming infinitely fast back-end server (average time per CGI request versus number of cliette processes per SP node, for 1 to 32 concurrent CGI processes)

Figures 4.6 and 4.7 clearly show that the rate of increase of the average CGI response time depends on how busy the cliette processes are. In Figure 4.7, we show that the best average CGI response time when serving 32 concurrent CGI processes on a single SP node is about 11 sec. In the next section, we will show how to scale the system up to lower the average CGI response time.

4.3.3 Influence of the number of SP nodes

Once a node is overloaded with cliette processes, adding cliette processes will only add load to the system, which results in longer CGI response times, as shown in Figures 4.6 and 4.7. When the cliette process busy ratio is 20%, more cliette processes can respond to equal numbers of CGI processes without additional noticeable load on the system; the threshold can be up to 32 cliettes per node. When the cliette process busy ratio is 50%, adding more cliette processes to handle equal numbers or more CGI processes only lengthens the average CGI response time.

Figures 4.6 and 4.7: Average CGI response time per request versus number of cliette processes (50% and 20% busy ratios)
Figure 5.6: Flowchart of the nph-CGIscript process (the request is routed by cache type, Picklist or PageItem; cached information is returned directly to the Web client, and for page images the FCLA_GifCache is consulted, with a miss forwarded to the cliette process)

Figure 5.7: Flowchart of the GetGif process (if the image information is in the cache, return it to the Web client; otherwise contact the Visual Info cliette for the image using an IMAGE_ONLY request, wait for the image from the cliette, and return it to the Web client)

5.4 The Internal Design of the CGI interface - GetGif

The main purpose of the CGI process GetGif is to retrieve the actual page image from the GifCache cache thread, convert it to an image format acceptable to the Web browser, and add the MIME header information. If there is a cache miss, this CGI process requests the file through the cliette process. Figure 5.7 shows a simple flowchart of the GetGif process.
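A minimal sketch of this control flow follows; all helper names are ours, standing in for the actual cache and cliette interfaces, which are not listed in the text:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical external interfaces (not the actual API): */
    extern char *gifcache_lookup(const char *item_id, size_t *len);
    extern char *cliette_request_image(const char *item_id, size_t *len);

    /* GetGif control flow per Figure 5.7: serve the page image from the
       GifCache thread on a hit; on a miss, fetch it through a cliette
       using an IMAGE_ONLY request, then return it to the Web client. */
    int get_gif(const char *item_id)
    {
        char  *image;
        size_t len;

        /* The MIME header must precede the image body. */
        printf("Content-type: image/gif\r\n\r\n");

        image = gifcache_lookup(item_id, &len);
        if (image == NULL) {                              /* cache miss */
            image = cliette_request_image(item_id, &len); /* IMAGE_ONLY */
            if (image == NULL)
                return -1;                                /* back-end failure */
        }
        fwrite(image, 1, len, stdout);   /* stdout is redirected to the client */
        free(image);
        return 0;
    }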
5.5 Performance of Distributed Web Services

The performance of client-server systems such as the Web depends on many factors: the client platform, the client software, the network, network protocols, the server software, and the server platform. Because many different clients interoperate on the Web, and there are many different types of platforms and networks in use, it would be difficult to characterize the entire Web. A trace generation facility may help a system administrator of the distributed Web services understand its cliettes' access patterns. Our extended UTE tracing tools, especially the dynamic tracing facility, provide us with the ability to trace and visualize the performance. The dynamic tracing facility allows a Web service administrator to turn tracing on or off whenever necessary, through the CMDadmin utility. Upon receiving a Trace_start request, the Connection Manager Daemon calls the TraceCliette() routine, which asks each active cliette process to turn on its trace. An acknowledgment is sent back to the Connection Manager Daemon after the cliette process turns on its trace. If a cliette process is busy serving a CGI request, a flag is posted on the cliette queue. When the busy cliette process returns to the cliette_AVAIL state, this flag is checked and the corresponding trace action is performed. This guarantees that cliette trace events are generated on a CGI-request basis and prevents trace initialization while the cliette process is serving a CGI request. The Connection Manager Daemon turns on its own trace only after it has received confirmation messages from all cliette processes. If a cliette process fails to turn on its trace, an error indicating the failure is sent back to the Connection Manager Daemon. In this case, the Connection Manager Daemon may abort trace generation by sending a Trace_stop command to those cliette processes whose traces are already on. Another useful option is to turn on some cliette processes' traces while others run without tracing. This allows the system administrator to monitor problematic cliette processes. The tracing of the Connection Manager Daemon is always on as long as some cliette processes are being traced. Thus, dynamic trace generation of selected cliettes may be a powerful choice for debugging interactions between the Connection Manager Daemon and a newly added cliette process. Trace generation is terminated when the traced process exits or when the system administrator issues a stop-tracing request using the CMDadmin utility. The Connection Manager Daemon has a task ID of zero, and assigns a unique, positive task ID to each cliette process. A trace file is generated for each traced process, with the unique task ID as the file name extension. This saves one word per trace record in raw trace files, because a trace record does not need a field to indicate which task generated it. Special precautions are taken to avoid the accidental overwriting of existing trace files by repeated requests to turn on tracing. Each trace on/off request pair generates a set of trace files, and multiple trace on/off requests can be issued during the entire course of the distributed Web services. After the traces are successfully generated, UTE utilities are used to merge and analyze the trace events. To achieve higher performance, our distributed Web services allow users to use the Cache Manager selectively. If the Cache Manager is not used, the second CGI process, GetGif, is not called. Instead, the GIF file is saved as a normal file and can be retrieved by the Web client. Also, a complete picklist is displayed whenever a picklist is requested, instead of only the first 12 items. If the Cache Manager is used, its task ID is assigned by the Connection Manager at startup time.

CHAPTER 6

Digital Library Performance Analysis and Visualization

In this chapter, we present performance tracing results of our distributed Web server used in the FCLA Digital Library project. Our extended UTE trace library and tools are used to gather and analyze trace results. We try to keep the gateway impact to a minimum based on the results shown in Chapter 4. The flexibility of our Connection Manager design and the scalability of our Web services enable us to fully utilize the IBM SP system to gather trace data for different communication environments.

6.1 FCLA Digital Library Trace Environment Setup

The basic trace setting for our Digital Library uses a fully functional cliette process to conduct our tracing. As described in Chapter 5, this cliette process maintains a permanent connection with its back-end server, the IBM Visual Info system. To provide a reference point, we designed an experiment using all standard components without our gateway; its performance is discussed in Section 6.2. To analyze the performance and to compare possible configurations, we used a single workstation, a cluster of IBM RS/6000 workstations, and an 8-node IBM SP system to gather trace information. The SP system used in our tracing provides both the high-speed network and the token ring/Ethernet connections. All the nodes used are IBM SP2 Thin-Nodes (model 390) with 128M memory each. We used at most 6 cliette processes, because the back-end Visual Info Library server has only five concurrent processes to handle Visual Info requests. Also, based on the performance results shown in Chapter 4, the overall system performance begins to suffer when the total number of cliette processes exceeds a certain number. We run our tracing in three basic settings. The first is running the Web services all on one workstation or one SP node, with or without Cache Manager support. Figure 6.1 diagrams the setting of one workstation without the Cache Manager; Figure 6.2 diagrams one workstation with Cache Manager support.

Figure 6.1: One workstation/SP node without the Cache Manager
Figure 6.2: One workstation/SP node with the Cache Manager

The second basic setting is to run the Web server, Connection Manager Daemon, Cache Manager Daemon, and two cliette processes on one workstation (or one SP node), while the remaining four cliette processes run on two workstations (or two SP nodes). Figure 6.3 shows the setting using three workstations/nodes without Cache Manager support; Figure 6.4 shows the same setting with Cache Manager support. The third setting is to run the Web server on one workstation (or one SP node), while distributing six cliette processes evenly on three SP nodes. For each of the settings, we make comparisons with and without the Cache Manager. When running the tracing on SP nodes, only the high-speed switch is used for communication between servers.

Figure 6.3: Three workstations/SP nodes without the Cache Manager

Figure 6.4: Three workstations/SP nodes with the Cache Manager

In addition to providing information about the advantage of using our distributed Web services for back-end server support, these various settings allow us to compare the advantage of using the Cache Manager and the trade-off of using the IBM SP system with the high-speed switch. The processes to be monitored and traced are the Connection Manager Daemon, the Cache Manager, and six cliette processes, supporting up to six simultaneous requests. The number of cliettes is chosen to match the performance of the back-end server. Although CGI activities are not captured in these settings (to reduce the total number of trace events), send and receive operations are indeed captured in a cliette process's trace file for messages sent to or received from its CGI processes. There are a total of 180 CGI requests during each of our test runs. To closely simulate an actual Web server environment, each CGI request begins with a randomly chosen URL string from a pool of URL strings. These CGI requests are generated automatically by a control program (sketched below). A request is queued until it times out if there is no free cliette available at the time of the request. In addition to the serv_CGI trace marker provided by our gateway API, we add user trace markers to trace Cache Manager activities. Two extra user markers, PickCache and GifCache, are used to signal the beginning and end of the Cache Manager serving Picklist cache information and Gif page cache information. Our back-end service contains two basic components: the Library server and the Object server. In a Digital Library environment, the Library server is viewed as a library catalog containing indexes to various collections. The Object server is the actual book shelf holding books, journals, and newspapers. Some operations, such as requesting the content listing of a journal, can be served by the Library server alone. Getting an actual page image requires the Library server to coordinate with the Object server; the Object server then responds to the requesting client directly with the page image. Both servers use the IBM DB2/6000 database server to maintain their contents.
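The driver logic of such a control program might look like the following sketch; the pool file name, the invoked CGI binary, and the pacing are our assumptions, not details given in the text:

    /* Control program sketch: repeatedly pick a random URL from a pool
       and fork a CGI request, keeping the cliette pool saturated. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define POOL_SIZE  6453   /* size of the URL pool used in the tests */
    #define N_REQUESTS 180    /* CGI requests per test run              */

    int main(void)
    {
        static char pool[POOL_SIZE][256];
        int i, n = 0;
        FILE *fp = fopen("url_pool.txt", "r");   /* assumed pool file */

        if (fp == NULL)
            return 1;
        while (n < POOL_SIZE && fgets(pool[n], sizeof pool[n], fp)) {
            pool[n][strcspn(pool[n], "\n")] = '\0';
            n++;
        }
        fclose(fp);
        if (n == 0)
            return 1;

        srand((unsigned) time(NULL));
        for (i = 0; i < N_REQUESTS; i++) {
            const char *url = pool[rand() % n];
            if (fork() == 0) {
                /* child: run the CGI with the chosen URL string */
                execl("./nph-CGIscript", "nph-CGIscript", url, (char *) 0);
                _exit(1);
            }
        }
        while (wait(NULL) > 0)
            ;   /* reap all request processes */
        return 0;
    }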
A Visual Info system can have more than one Object server connected to one Library server. In our testing, we use only one Library server and one Object server. Each server resides either on an RS/6000 workstation or on one IBM SP2 node. Communication between servers can be either through the token ring or through the high-performance switch. The sample journal we used for our testing contains 500 articles. There are 6,453 different URL strings; among them are 5,111 URL strings that generate HTML pages with GIF images. These CGI processes are generated by a control program instead of a Web server process. Using a control program, we can generate CGI processes fast enough to saturate the pool of cliette processes. Thus, we can trace the average availability of each cliette process and the failure rate of these CGI requests.

6.2 Standard HTTP Setting Without Using CMD/Cache Support

To provide a reference point, we set up an experiment using all standard components. A complex CGI program is created to retrieve data directly from the Visual Info system and convert it to HTML format before sending the data back to the Web client. After receiving the URL string, a CGI process parses the incoming message and retrieves the Visual Info item ID. The CGI process then tries to log in to the Visual Info system. Upon successful login, the CGI process starts issuing several queries to get information on the index class FCLA_1. Another set of search queries is issued to get information for the item. Depending on the type of item being retrieved, another set of queries may be required to retrieve information for building navigation links. If the item contains a page image, a query is issued to retrieve the actual image. The CGI process eventually constructs an HTML page and passes it back to the Web client. The CGI program, which is quite complex, is about 1.5 Mbytes in size. Because information related to the item ID varies in size, the CGI process needs to issue several memory allocation subroutine calls. We measured on a system with a normal load of six simultaneous CGI requests; it takes about 3 to 5 sec from the time the Web server receives the request until the CGI process is ready to process the URL string. To capture the serv_CGI elapsed time, we begin tracing as soon as the process starts. We collected about 200 such requests. The average access time is about 28 sec when there are six simultaneous CGI requests; with only one active CGI request, the average access time is about 19 sec. In addition to the above setup, we also modified our cliette process to do a login whenever a CGI request is received. This setting uses the dynamic trace start/stop facility provided by the UTE+ trace package. Table 6.1 shows the elapsed serv_CGI times; we use six cliette processes to do the performance tracing. From Table 6.1, we calculate that the average is around 41 sec. This is higher than using the CGI process alone. Even if we add the time used to start a CGI process in the previous setting, this setting still shows a 6 to 8 sec higher access time. Several factors contribute to the higher number. First, the IP send and receive times contribute to the elapsed serv_CGI time. Second, in addition to six cliette processes, six CGI processes are running simultaneously as well. It also takes time to request a cliette process. This setting does not reflect the real performance of not using CMD/Cache services; it merely gives us some indication of the overhead that our CMD/Cache design might bring.
    Cliette ID   Total (sec)   Calls   Average (sec)
    Cliette 1    1114.312      38      29.324
    Cliette 2    1131.712      32      35.366
    Cliette 3    1217.125      32      38.035
    Cliette 4    1097.844      22      49.902
    Cliette 5    1187.430      30      39.581
    Cliette 6    1141.656      24      47.569
    Total                      168

Table 6.1: Elapsed serv_CGI time statistics (using standard Web components without CMD support)

6.3 Running on a Single Workstation

One workstation is used to run the Web server, including the HTTP Daemon, the Connection Manager Daemon, and six cliette processes. Communication between the Web server and the back-end server is established through a 16-Mbit/s token-ring network. A message is sent using a blocking send() and received using a nonblocking select() followed by a blocking recv() through an Internet stream socket. We use a pair of IP_Send (begin and end) events to indicate the elapsed time of a send(), and a pair of IP_Recv (begin and end) events to show the elapsed time of a recv(). Because CGI processes are not traced, send and receive operations for messages between a cliette and its CGI processes are captured only in the cliette's trace. Messages of two different sizes, 148 bytes and 2,032 bytes, are observed. Exchanging URL strings and HTML documents between a cliette process and its CGI process is done using the large messages (2,032 bytes); the small messages (148 bytes) are used for other purposes such as control, acknowledgment, or administration. Figure 6.5 shows the tracing of the interprocess communication traffic of the Connection Manager and cliette processes; it also shows the elapsed time distributions for IP_Send, IP_Recv, and serv_CGI events. The CMD has ID number 0, and cliettes have ID numbers 1 to 6. From the elapsed time distribution of serv_CGI events, we show that most CGI requests are served within 10 sec, while some are scattered up to 90 sec, depending on the Visual Info operation, the network/system load, and the size of the retrieved page images. As we mentioned in Chapter 5, the retrieved page images have to be written to disk for the Web client to read them. These page images also must be transmitted from the remote Object server to local cliette processes. Depending on the network load and the system load, the time for retrieving and writing a page image varies.

Figure 6.5: Distributed Web services on a single workstation

Tables 6.2 and 6.3 show the total elapsed time, number of calls, and average elapsed time of three different NUPSHOT states: IP_Send, IP_Recv, and serv_CGI. A "NUPSHOT state" is a period of time bounded by two events: a begin event and an end event.

    Task Type    IP_Send                                 IP_Recv
                 Total (sec)   Calls   Average (msec)    Total (sec)   Calls   Average (msec)
    CMD          0.308         218     1.413             0.139         355     0.391
    Cliette 1    10.198        218     46.782            0.131         195     0.673
    Cliette 2    11.319        237     47.759            0.211         211     1.004
    Cliette 3    12.601        290     43.454            0.195         259     0.754
    Cliette 4    8.591         200     42.959            0.134         179     0.753
    Cliette 5    12.688        254     49.955            0.239         227     1.055
    Cliette 6    14.017        301     46.569            0.524         268     1.955
    Total                      1,718                                   1,694

Table 6.2: Elapsed IP_Send and IP_Recv time statistics on a single workstation
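The receive pattern described above can be sketched as follows; it also explains why the traced IP_Recv intervals are much shorter than the IP_Send intervals, since recv() is issued only once select() reports that data has arrived (a sketch, not the actual source):

    #include <sys/types.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    /* Poll the socket with select(), then do a blocking recv() only when
       data is known to be ready, so the recv() itself returns quickly. */
    ssize_t poll_then_recv(int sock, void *buf, size_t len, long usec)
    {
        fd_set rfds;
        struct timeval tv;

        FD_ZERO(&rfds);
        FD_SET(sock, &rfds);
        tv.tv_sec  = usec / 1000000;
        tv.tv_usec = usec % 1000000;

        if (select(sock + 1, &rfds, NULL, NULL, &tv) <= 0)
            return 0;                       /* nothing has arrived yet */
        return recv(sock, buf, len, 0);     /* data is ready: no long wait */
    }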
    Cliette ID   Total (sec)   Calls   Average (sec)
    Cliette 1    532.751       24      22.197
    Cliette 2    586.476       26      22.556
    Cliette 3    534.141       32      16.691
    Cliette 4    535.874       22      24.357
    Cliette 5    558.111       28      19.932
    Cliette 6    534.113       33      16.185
    Total                      165

Table 6.3: Elapsed serv_CGI time statistics on a single workstation

The average IP_Send time is considerably higher than the average receiving time. This is because recv() is called only when the message has already arrived. The Connection Manager Daemon has a lower average sending time because it handles only small messages. Note that there are only 165 serv_CGI states in the trace (Table 6.3). This means 15 CGI requests (180 - 165 = 15) failed to receive service due to time-out. The average serv_CGI elapsed time is about 20 sec, about 8 sec less than that in the all-standard-component experiment.

6.4 Running on a Single IBM SP2 Node

To take advantage of the high-performance switch on the IBM SP2 machine, we allocate one SP2 node for our Web services in this setting. The back-end servers, the Library server and the Object server, are also running on the same SP2 machine but on two different nodes. Communication between the Web servers and the back-end servers is through the high-performance switch. By allocating one dedicated SP node for the Connection Manager Daemon, we can schedule CGI requests to their cliette processes in a timely fashion. Tables 6.4 and 6.5 show the total elapsed time, number of calls, and average elapsed time of three different NUPSHOT states: IP_Send, IP_Recv, and serv_CGI.

    Task Type    IP_Send                                 IP_Recv
                 Total (sec)   Calls   Average (msec)    Total (sec)   Calls   Average (msec)
    CMD          0.076         189     0.403             0.036         367     0.100
    Cliette 1    0.157         272     0.580             0.045         242     0.189
    Cliette 2    0.134         271     0.495             0.047         241     0.197
    Cliette 3    0.443         272     1.629             0.019         242     0.079
    Cliette 4    0.200         272     0.736             0.052         242     0.216
    Cliette 5    0.347         265     1.311             0.024         236     0.105
    Cliette 6    0.243         280     0.871             0.046         249     0.188
    Total                      1,821                                   1,819

Table 6.4: Elapsed IP_Send and IP_Recv time statistics on a single IBM SP node

From the results shown in Tables 6.2 and 6.4, we can see that the average elapsed time for IP_Send improves from roughly 0.04 sec on a single workstation to around 1 msec on an IBM SP system. The average elapsed time for IP_Recv improves from roughly 1 msec
Dray WW 11""9" ‘0 51"?th hISlUWMIS VP'lIUllY Print ' Pnnl Figure 6.6: Distributed Web services on a single IBM SP2 nodc Thu: 1114600000: WHIFQPC m: we 110 Cl iette ID Total No. of Average (secs) calls (secs) Cliette 1 309.950 30 10.331 Cliette 2 311.666 30 10.388 Cliette 3 308.229 30 10.274 Cliette 4 307.765 30 10.258 Cliette 5 309.830 29 10.683 Cliette 6 309.945 31 9.998 Total 180 Table 6.5: Elapsed seerGI time statistics on a single IBM SP node to 0.2 msec. The average elapsed time for seerGI also improves from roughly 20 sec to 10 sec. By comparing Figures 6.5 and 6.6, we find that most of the seerGI operations are centered around 10 sec. This indicates the advantage of using the hi gh-performance switch. 6.5 Running on a Single Workstation with Cache Manager By examining the HTTP daemon log file, we find there are always some HTML pages being accessed frequently. These pages could be the Welcome page, the Int roduct ion page, or the News of the Day page. For example, in our Digital Library environment, a dynamic HTML page containing a professor’s recent class notes is accessed more often than older class notes. In the previous example, we find that on average it takes about 10 sec to request a page image, construct the HTML page, and pass it back to the requesting client browser process. To improve the throughput of heavily accessed pages, we add the Cache Manager support in this setting. There are two cache, the PickCache and the Gi f Cache, maintained by the Cache Manager. Figure 6.7 shows the trace visualization of the interprocesses communication traffic (IP_Send and IP_Recv) between the Connection Manager Daemon and cliette processes, along with elapsed time distributions for IP-Send, IP_Recv, and seerGI. Our Cache Manager uses two cache threads to maintain two different cache types. Due to the limitation lll lblrlrle lllsplav [mm Ptllrllur: 1915/8819 (In seconds) .9 0 RWIWEW H910 fill: Semi : lP_ Harv - seerlil lillillllllll Curstlr' 0 003910 : 01 states: : 0t states: llltl'.‘° 100'. i 111 llth: , 011mg, : ul lllllili 50 75 A ‘37 Retro: 10 I'll Resue to ill Resrze 10 ill hull Pttlll 0 00 0.2 0.3 0.l — 0W, ill 001 0.03 ll 0:: 0.0 _ Italy but llutlnn lb strrlrll out llrtt 5019 nl lllSlogr-‘ltllll Dray left button 10 stretch 001 lell sulv 01105109200 lirsyl Ural; lell buttrln tn stmlr. ll out left slur. rlt lllslngnm llfipl'lv Dray mlrlrlte button to sllde hlslmram lllsplav. Dng mdvitr button in slide histogram insptrv Draq numtle button to slide human 1 twin. Ul'dl] right button 10 such ll uul light side 0t llhlfltfllll Drall right button to stretch out right Still: of llstognlm (ll Ural] nlllll lmllurl lb stltlttl ulll nglll site 01 hiluqulll ltlslllzlv Dray anv lllllltlil to strut: ll luslnmms ventrallv [hag dlIV button to stretch histogram mm in [Iraq any button 10 slmlrtl lllstmlram: YPerRI 0090 Figure 6.7: Distributed Web services on a single workstation with Cache manager support __ Tine: 11:45 Dunno WHIFIZPC m WNZAC 112 of NUPSHOT when displaying nested tracing event, Figure 6.7 only shows the interaction between the connection manager daemon and cliette processes. For example, the nested tracing event can happen when the begin marker of a Gi fCache(P i ckCache) is recorded before the end of a PickCache(Gi fCache) marker, Table 6.6 shows the total elapsed time, number of calls, and average elapsed time of two different N UPSHOT states - I P-Send and IP_Recv. The statistics for the seerGI state are shown in Table 6.7. Task Type IP_Send IP_Recv Total No. of Average Total No. 
    Task Type    IP_Send                                 IP_Recv
                 Total (sec)   Calls   Average (msec)    Total (sec)   Calls   Average (msec)
    CMD          0.168         153     1.103             0.066         205     0.325
    Cliette 1    7.598         168     45.229            0.141         151     0.940
    Cliette 2    7.128         184     38.743            0.125         165     0.759
    Cliette 3    6.945         166     41.839            0.165         149     1.112
    Cliette 4    6.640         186     35.701            0.211         167     1.267
    Cliette 5    7.920         145     54.626            0.091         129     0.707
    Total                      1,002                                   966

Table 6.6: Elapsed IP_Send and IP_Recv time statistics on a single workstation with Cache Manager support

    Cliette ID   Total (sec)   Calls   Average (sec)
    Cliette 1    778.840       19      40.991
    Cliette 2    737.366       21      35.112
    Cliette 3    758.605       19      39.926
    Cliette 4    717.627       21      34.172
    Cliette 5    783.304       16      48.956
    Total                      96

Table 6.7: Elapsed serv_CGI time statistics on a single workstation with Cache Manager support

Due to the limited number of available trace channels, we invoked five cliette processes instead of six in this setting. This results in a longer average serv_CGI response time (an increase from 22 sec to around 40 sec). Another contribution to the longer response time is the fact that each cliette has to access a cache item before it requests anything from the back-end server; it must also update the cache item if it is missing from the cache. Note that the Web client receives its information before the cliette process updates the missing cache item. Thus, the longer serv_CGI response time does not necessarily indicate a longer response time for the Web client; it shows how long a cliette process takes to finish a CGI request and be ready for the next one. Table 6.8 shows the average response time of a cache thread from a CacheOpen call until the corresponding CacheClose call. These calls can be either Cache_Read or Cache_Write requests. The total number of cache accesses is the combined total of requests coming from either cliette or CGI processes. From Table 6.7, we know there are 96 serv_CGI requests from 180 CGI processes. This means 84 CGI requests are served by the Cache Manager instead of cliette processes. For a cache hit, a CGI process must send requests to both the PickCache and GifCache Cache Manager threads to construct the dynamic HTML page. This cache access, with an average return time of 2.51 (1.30 + 1.21) sec, indicates that accessing the Cache Manager is about 8 to 10 times faster than accessing the back-end servers. This cache access time also contributes to the longer average serv_CGI elapsed time shown in Table 6.7. Note that the serv_CGI trace marker is set when the cliette process begins to listen for a request coming from a particular CGI process assigned by the Connection Manager Daemon; the connection setup time between the cliette process and the CGI process is therefore included in the time marked by serv_CGI. The PickCache and GifCache trace markers do not include the original connection setup time between the requester and the Cache Manager.

    Cache Operation   Total (sec)   Calls   Average (sec)
    PickCache         474.682       364     1.304
    GifCache          359.783       297     1.211

Table 6.8: Cache Manager activity statistics on a single workstation

It is interesting to see in Figure 6.7 that there are two clusters of serv_CGI elapsed times: a cliette process may encounter a cache hit before it actually accesses the back-end servers.
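Based on the call names mentioned above (CacheOpen, Cache_Read, Cache_Write, CacheClose), a cliette-side cache access might look like this sketch; the signatures are our guesses, not the documented API:

    /* Hypothetical signatures for the Cache Manager interface: */
    extern int  CacheOpen(const char *cache_name, const char *key);
    extern long Cache_Read(int handle, void *buf, long len);
    extern long Cache_Write(int handle, const void *buf, long len);
    extern void CacheClose(int handle);

    /* Read an item from the PickCache thread.  The open..close span is
       what the Table 6.8 averages measure. */
    long read_picklist(const char *key, char *buf, long len)
    {
        int  h = CacheOpen("PickCache", key);
        long n;

        if (h < 0)
            return -1;            /* cache thread unavailable */
        n = Cache_Read(h, buf, len);
        CacheClose(h);
        return n;                 /* n <= 0 means a miss: the cliette then
                                     queries the back-end server and later
                                     updates the cache with Cache_Write() */
    }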
6.6 Running on a Single IBM SP Node with Cache Manager

In addition to using either the high-performance switch or the Cache Manager alone, we combine both in this test setting. Figure 6.8 shows the screen dump of the NUPSHOT results. Tables 6.9 and 6.10 show the results of all the trace markers.

    Task Type    IP_Send                                 IP_Recv
                 Total (sec)   Calls   Average (msec)    Total (sec)   Calls   Average (msec)
    CMD          0.113         157     0.719             0.020         211     0.096
    Cliette 1    0.264         140     1.887             0.018         125     0.146
    Cliette 2    0.331         172     1.925             0.023         153     0.151
    Cliette 3    0.493         217     2.275             0.028         193     0.149
    Cliette 4    0.412         187     2.206             0.021         166     0.131
    Cliette 5    0.379         199     1.907             0.029         177     0.165
    Total                      971                                     940

Table 6.9: Elapsed IP_Send and IP_Recv time statistics on an IBM SP node with Cache Manager support

There are 93 requests out of 180 actually served by the cliette processes. Comparing Tables 6.5 and 6.10, we can determine that the average time spent for a CGI request increases by about 3 sec; as in Section 6.5, this is due to the use of the Cache Manager. Comparing Tables 6.8 and 6.11, the average time for each cache access is about the same, because all the cliette processes and the Cache Manager Daemon are on the same machine/node in both settings. Among the 93 requests, the longest is less than 30 sec (Figure 6.8), compared to about 50 sec in Figure 6.7. Both settings use the Cache Manager, but one runs on a single IBM SP node and the other on a single workstation.

Figure 6.8: High-performance Web server on an IBM SP2 system with caching support

    Cliette ID   Total (sec)   Calls   Average (sec)
    Cliette 1    274.134       15      18.275
    Cliette 2    259.282       19      13.646
    Cliette 3    263.222       24      10.967
    Cliette 4    260.298       21      12.395
    Cliette 5    263.388       22      11.972
    Total                      93

Table 6.10: Elapsed serv_CGI time statistics on a single IBM SP node with Cache Manager support

    Cache Operation   Total (sec)   Calls   Average (sec)
    PickCache         491.912       365     1.347
    GifCache          555.710       301     1.846

Table 6.11: Cache Manager activities on a single IBM SP node

6.7 Running on a Cluster of Workstations

In this setting, we use three RS/6000 model 390 workstations to run our Web services. The HTTP Daemon runs on the same node as the CMD and the first two cliette processes. Communication between remote cliette processes and the CMD is established through a 16-Mbit/s token-ring network. Table 6.12 shows that the average elapsed time for IP_Recv in the last four cliettes is greater than that in the first two cliettes, because the communications between the Connection Manager and the last four cliettes go through the local area network. Table 6.13 shows that the average serv_CGI time is around 10 sec, a 200% improvement over that in Table 6.3. The success rate also increases from 165 to 178 out of the total of 180 requests. Figure 6.9 shows the NUPSHOT result.

Figure 6.9: Distributed Web services on a cluster of three workstations

    Task Type    IP_Send                                 IP_Recv
                 Total (sec)   Calls   Average (msec)    Total (sec)   Calls   Average (msec)
    CMD          0.038         191     0.201             0.028         367     0.077
    Cliette 1    0.489         289     1.694             0.023         257     0.091
    Cliette 2    0.290         244     1.188             0.033         217     0.153
    Cliette 3    0.032         244     0.135             0.144         217     0.668
    Cliette 4    0.036         263     0.138             0.137         234     0.587
    Cliette 5    0.037         281     0.133             0.147         250     0.526
    Cliette 6    0.038         289     0.133             0.159         257     0.622

Table 6.12: Elapsed IP_Send and IP_Recv time statistics on a cluster of workstations
    Cliette ID   Total (sec)   Calls   Average (sec)
    Cliette 1    320.325       32      10.010
    Cliette 2    317.928       27      11.775
    Cliette 3    318.441       27      11.794
    Cliette 4    323.710       29      11.162
    Cliette 5    322.085       31      10.389
    Cliette 6    316.773       32      9.899
    Total                      178

Table 6.13: Elapsed serv_CGI time statistics running on a cluster of workstations

6.8 Running on a Cluster of Workstations with Cache Manager

In addition to having a cluster of workstations, we also use the Cache Manager to improve the access rate for frequently accessed pages. Figure 6.10 shows the tracing of the Connection Manager and cliette processes. Because of the limitations of NUPSHOT, we cannot show the tracing of the Cache Manager Daemon. Tables 6.14 and 6.15 show the total elapsed time, number of calls, and average elapsed time of three different NUPSHOT states: IP_Send, IP_Recv, and serv_CGI. The total and average times of the two new user markers, PickCache and GifCache, are shown in Table 6.16.

    Task Type    IP_Send                                 IP_Recv
                 Total (sec)   Calls   Average (msec)    Total (sec)   Calls   Average (msec)
    CMD          0.092         203     0.454             0.021         105     0.206
    Cliette 1    0.018         159     0.114             0.314         176     1.784
    Cliette 2    0.020         162     0.129             0.362         182     1.990
    Cliette 3    0.061         101     0.604             0.013         112     0.119
    Cliette 4    0.059         89      0.664             0.011         100     0.113
    Cliette 5    0.075         108     0.696             0.012         115     0.110
    Cliette 6    0.070         115     0.612             0.014         128     0.112
    Total                      937                                     918

Table 6.14: Elapsed IP_Send and IP_Recv time statistics on a cluster of workstations with Cache Manager support

From Table 6.14, we find that the first two cliettes have much better response times than the remaining four cliette processes. The main reason is that those four cliette processes are on a different node than the Cache Manager Daemon; accessing the Cache Manager for either reading or writing goes through the local area network and adds to the total response time for a single CGI request. A total of 91 CGI requests out of 180 were actually served by cliette processes. The average time for either IP_Send or IP_Recv is about the same as in Table 6.12.

Figure 6.10: Distributed Web services on a cluster of three workstations with Cache Manager support
Cliette ID    Total (sec)   No. of calls   Average (sec)
Cliette 1       184.099          20            9.204
Cliette 2       179.513          20            8.975
Cliette 3       178.169          13           13.705
Cliette 4       190.737          11           17.339
Cliette 5       172.499          13           13.269
Cliette 6       189.344          14           13.524
Total                            91

Table 6.15: Elapsed seerGI time statistics on a cluster of workstations with Cache Manager support

Cache Operation   Total (sec)   No. of calls   Average (sec)
PickCache           529.728         356           1.488
GifCache            536.029         296           1.811

Table 6.16: Cache Manager activities on a cluster of workstations

6.9 Running on Three IBM SP Nodes

In this setting, we run our scalable Web services on three IBM SP2 nodes using the high-performance switch. Figure 6.11 shows the tracing of the interprocess communication traffic (IP_Send and IP_Recv) between the Connection Manager Daemon and the cliette processes. Tables 6.17 and 6.18 show the total elapsed time, number of calls, and average elapsed time of three different NUPSHOT states: IP_Send, IP_Recv, and seerGI. Comparing Tables 6.12 and 6.17, we see that the average IP_Send and IP_Recv times have improved slightly; recall from Section 6.7 that we use three SP nodes to simulate a cluster of workstations. Because these SP nodes are tightly coupled and communicate through a private Ethernet network, their network performance is expected to be much better than that of a cluster of workstations; the same expectation applies to the average seerGI request time.

Figure 6.11: Distributed Web services on three IBM SP nodes

              IP_Send                                   IP_Recv
Task Type     Total (sec)  No. of calls  Avg (msec)     Total (sec)  No. of calls  Avg (msec)
CMD              0.035         367         0.095           0.039         186         0.212
Cliette 1        0.049         249         0.198           0.101         280         0.361
Cliette 2        0.040         234         0.172           0.200         263         0.763
Cliette 3        0.044         233         0.189           0.027         262         0.105
Cliette 4        0.045         241         0.187           0.028         271         0.103
Cliette 5        0.044         241         0.186           0.031         271         0.115
Cliette 6        0.049         249         0.200           0.032         280         0.114
Total                         1814                                      1813

Table 6.17: Elapsed IP_Send and IP_Recv time statistics on three IBM SP nodes

Cliette ID    Total (sec)   No. of calls   Average (sec)
Cliette 1       308.529          31            9.952
Cliette 2       312.231          29           10.766
Cliette 3       310.899          29           10.720
Cliette 4       310.549          30           10.351
Cliette 5       307.281          30           10.242
Cliette 6       310.165          31           10.005
Total                           180

Table 6.18: Elapsed seerGI time statistics running on three IBM SP nodes
Yet there is not much difference in the average seerGI elapsed time between cliettes 1 and 2 and among cliettes 3 through 6, compared to the noticeable difference in Table 6.13. The distribution of seerGI elapsed times in Figure 6.11 shows that the elapsed time for seerGI is centered around 10 sec, with the highest being 15 sec. This indicates that using the high-performance switch gives a more compact elapsed-time distribution and provides Web browser clients with a more predictable response time. It is also the reason that no CGI requests fail: a cliette process can respond to a CGI request within 15 sec, which is our CGI processes' timeout value.

6.10 Running on an IBM SP System with Cache Manager

This section describes how we run the Web services on three IBM SP2 nodes. The httpd (our control program), the Connection Manager Daemon, the Cache Manager, and the first two cliette processes all run on the first node. The remaining two nodes run the other four cliette processes, two on each node. Communication between the remote cliette processes and the daemons is through the high-performance switch network. Each SP2 node has the same CPU model (RS/6000 model 390) as the workstations used in Section 6.7.

Figure 6.12 shows the trace visualization of the interprocess communication traffic (IP_Send and IP_Recv) between the Connection Manager Daemon and the cliette processes, along with the elapsed-time distributions for IP_Send, IP_Recv, and seerGI. Table 6.19 shows the total elapsed time, number of calls, and average elapsed time of the two NUPSHOT states IP_Send and IP_Recv. From Table 6.20, we find that the average time a cliette spends serving a CGI request increases slightly for the last four cliettes. This is due to the need to check with the Cache Manager, which is on the same node as the first two cliette processes. On the other hand, the total number of seerGI requests decreases dramatically to 97, indicating that many of the requests have been satisfied by the Cache Manager. The total and average times of the two major Cache Manager operations, PickCache and GIFCache, are shown in Table 6.21. The total number of cache accesses is the combined total of requests coming from both cliette and CGI processes.

Figure 6.12: Distributed Web services on an IBM SP2 system with caching support

              IP_Send                                   IP_Recv
Task Type     Total (sec)  No. of calls  Avg (msec)     Total (sec)  No. of calls  Avg (msec)
CMD              0.019         104         0.187           0.021         201         0.107
Cliette 1        0.420         181         2.321           0.020         161         0.129
Cliette 2        0.289         163         1.776           0.012         145         0.084
Cliette 3        0.013         129         0.108           0.023         115         0.200
Cliette 4        0.012         119         0.106           0.019         106         0.183
Cliette 5        0.015         146         0.104           0.030         130         0.233
Cliette 6        0.014         147         0.097           0.028         131         0.219
Total                          989                                       989

Table 6.19: Elapsed IP_Send and IP_Recv time statistics on three IBM SP nodes with Cache Manager support

Cliette ID    Total (sec)   No. of calls   Average (sec)
Cliette 1       170.291          20            8.514
Cliette 2       171.333          18            9.518
Cliette 3       184.673          14           13.190
Cliette 4       172.214          13           13.247
Cliette 5       185.187          16           11.574
Cliette 6       174.216          16           10.888
Total                            97

Table 6.20: Elapsed seerGI time statistics on three IBM SP nodes with Cache Manager support
The average elapsed time for cache accesses (1.74 and 2.04 sec for PickCache and GIFCache, respectively) indicates that accessing the Cache Manager is roughly four to five times faster than accessing the back-end server. From Tables 6.18 and 6.20, we find that the average time a cliette spends serving a CGI request increases slightly for the last four cliettes. This is due to the need to access the Cache Manager, which is on the same node as the first two cliette processes. But the difference is not as large as in Table 6.15 because, in the current setting, the cliette processes talk to the Cache Manager Daemon through the high-performance switch.

Cache Operation   Total (sec)    No. of calls   Average (sec)
PickCache          654.315907        374           1.749508
GifCache           631.876162        309           2.044907

Table 6.21: Cache Manager activities on an IBM SP system

6.11 Running on a Cluster of Four Workstations with One Workstation Dedicated to the HTTP Daemon

In this example, we dedicate one workstation to the HTTP daemon (our control program) and all of its dynamically created CGI processes. Six cliette processes run on the remaining three workstations, two on each workstation. Figure 6.13 shows the tracing of the interprocess communication traffic (IP_Send and IP_Recv) between the Connection Manager Daemon and the cliette processes. Tables 6.22 and 6.23 show the total elapsed time, number of calls, and average elapsed time of three different NUPSHOT states: IP_Send, IP_Recv, and seerGI.

In this setting, we find that the first two cliette processes have a slight performance improvement over those shown in Table 6.13. Because these nodes are capable of handling a large load, the improvement does not show clearly when the HTTP daemon is moved to a dedicated workstation. Also, in this setting we use the Ethernet connection for all connections, including those from the CGI processes to the cliette processes, which contributes some delay to the cliette response time.

re Path_Name}
# where
#   hostname -- optionally specifies the name of the host on which to
#               start the cliette. If not specified, the cliette is
#               started on the same host as the Connection Manager
#               Daemon.
#   id       -- is the userid the cliette runs under. If not specified
#               it is the same as the Connection Manager Daemon.
#   pw       -- is the password for --id-- if the Connection Manager
#               Daemon needs a password to log in as --id--
#
# For example:
#############################################

-cliette={CMD_EXEC_PATH=/etc/cli}

The optional keyword CMD_CACHE_MANAGER=Service_type is used to associate a cliette with a specific cache manager. The hostname and port of that cache manager are passed to the cliette in the environment variables CMD_CACHE_HOSTNAME and CMD_CACHE_PORT when it is started, if CMD_CACHE_MANAGER is specified. Cliette environment variables may be initialized by adding them to the cliette definition statement. Every "name"="value" pair in this statement is added to the environment.
In the above example, all of the environment variables CMD_EXEC_PATH, CMD_NAME, CMD_PASSWD, CMD_CACHE_MANAGER, and CLIETTE_DBNAME are placed in the environment of the cliette.

3) Cache definition statements

Specify

%service=Service_type Initial_Number Max_Number
-cache={CMD_EXEC_PATH=Path_Name; CMD_PARAMETERS=cache-startup-parameters}

The Initial_Number may be 0 or 1, indicating whether this cache manager is started during initialization. The Max_Number must be 1 for each cache statement. Both CMD_EXEC_PATH and CMD_PARAMETERS are required. CMD_EXEC_PATH specifies the full pathname of the cache manager executable and is used as described in the cliette definition statements.

CMD_STARTUP_PARAMETERS=/usr/cmd/cache.cfg 7175;
# CMD_STARTUP_PARAMETERS, if specified, specifies the command line
# parameters for starting the cache manager. These are:
#   config-file cache-mgr-port
# where
#   config-file    -- is the name of the cache manager configuration
#                     file to use.
#   cache-mgr-port -- optionally specifies the port number the cache
#                     manager should use if not specified in
#                     /etc/services
# Example:

# start and fully configure a local cache manager at initialization time
%service=LocalCacheManager 1 1
-cache={CMD_EXEC_PATH=/etc/www/cache_manager;
        CMD_PARAMETERS=/etc/www/cache.cfg 7175};

# start a remote cache manager at initialization time. Use all defaults.
%service=RemoteCacheManager 1 1
-cache={CMD_EXEC_PATH=/etc/VI_cache_manager;
        CMD_PARAMETERS=/etc/www/cache_manager/config 7176}

# start 4 cliettes at initialization time and allow a maximum of
# 6 cliettes
%service=VI 4 6
-cliette={CMD_NAME=user1;CMD_PASSWD=4C4fe868381597cc;
          CMD_EXEC_PATH=/etc/VI_cliette; CMD_CACHE_MANAGER=LocalCacheManager}
-cliette={CMD_NAME=user2;CMD_PASSWD=d25d89056eea41e4;
          CMD_EXEC_PATH=/etc/VI_cliette; CMD_CACHE_MANAGER=LocalCacheManager}
-cliette={CMD_NAME=user3;CMD_PASSWD=b2cc52cc08900f38;
          CMD_EXEC_PATH=/etc/VI_cliette; CMD_CACHE_MANAGER=LocalCacheManager}
-cliette={CMD_NAME=user4;CMD_PASSWD=ddaee0b0ea8c98a4;
          CMD_EXEC_PATH=/etc/VI_cliette; CMD_CACHE_MANAGER=LocalCacheManager}
-cliette={CMD_NAME=user5;CMD_PASSWD=d18caf6e6fb94ba2;
          CMD_EXEC_PATH=/etc/VI_cliette; CMD_CACHE_MANAGER=LocalCacheManager}
-cliette={CMD_NAME=user6;CMD_PASSWD=8bad271cd9bf623c;
          CMD_EXEC_PATH=/etc/VI_cliette; CMD_CACHE_MANAGER=LocalCacheManager}

# start no cliette at initialization time and allow a maximum
# of 2 cliettes. No definition string is required by the cliette
# program for initialization. These cliettes will not cache anything.
%service=XXX 0 2
-cliette={CMD_EXEC_PATH=/etc/XXX_cliette}
-cliette={CMD_EXEC_PATH=/etc/XXX_cliette}

# start a cliette at the remote host viobj.xxx.edu
%service=RemoteVI 1 1
-cliette={CMD_EXEC_PATH=/etc/VI_cliette; CMD_NAME=userF;
          CMD_PASSWD=6e8ea7c6a5c2971e; CMD_CACHE_MANAGER=RemoteCacheManager}

The number of cliette definition statements should be the same as the maximum number of cliette processes defined in the service type statement. Otherwise, no cliette process will be created. The Initial_Number in the service type statement defines how many cliette processes should be created at initialization time. If more cliette processes are needed, the system administrator uses the CMDadmin command to create cliette processes until the maximum number is reached. The system administrator can use kill -1 CMDaemon_process_id or CMDadmin -i to force the CM Daemon to re-initialize and re-read the configuration file. Re-reading the configuration file will cause all the cliette and cache processes to terminate.
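To picture the re-initialization path, the following is a minimal, hypothetical sketch of how a daemon can catch kill -1 (SIGHUP) and defer the configuration re-read to its main loop; it is not taken from the CM Daemon's actual source, and the helpers terminate_all_children() and read_config_file() are illustrative stand-ins.

#include <signal.h>

static volatile sig_atomic_t reinit_requested = 0;

/* Stand-in helpers: the real daemon would stop every cliette and cache
 * process and then parse /etc/CMDaemon.conf again. */
static void terminate_all_children(void) { /* ... */ }
static void read_config_file(void)       { /* ... */ }

/* kill -1 (SIGHUP) only records the request; the actual work happens
 * in the main loop, outside signal context. */
static void hup_handler(int sig)
{
    (void)sig;
    reinit_requested = 1;
}

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = hup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGHUP, &sa, NULL);

    for (;;) {
        if (reinit_requested) {
            reinit_requested = 0;
            terminate_all_children();   /* all cliette/cache processes end */
            read_config_file();         /* then the configuration is re-read */
        }
        /* ... accept and dispatch connection requests ... */
    }
}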
All values specified in the cliette definition statement are added to the cliette's environment. Some of these values are reserved and have specific uses. They are prefixed with "CMD_":

CMD_EXEC_PATH
CMD_NAME
CMD_PASSWD
CMD_CACHE_MANAGER
CMD_STARTUP_PARAMETERS

All others are ignored by the Connection Manager but are placed into the environment of the cliette. The CM Daemon can also connect to a remote cliette. The remote cliette is started by the local CM Daemon using rexec on the host machine specified in the configuration file. If the remote cliette process resides on a host in the open internet environment while the CM Daemon resides inside the firewall, the connection between the CM Daemon and the cliette is made through the firewall host using the SOCKS service. The CM Daemon will start cache managers if so configured. The Initial_Number should be 0 if the cache manager is not to be started during initialization, and 1 otherwise. The Max_Number for cache managers must be specified as "1". The "Service_Type" statement is used to satisfy requests from cliettes and CGI processes for the location of a specific cache manager.

A.2 General Purpose Request Block

The following data structures define the two types of network packets passed through the socket connection.

/* First type of network package: used to transfer requests */
struct HTTPRequest {
    int client;                        /* Possible type of Client_Type */
    union {
        int un1_http_service;          /* what kind of service */
        int un1_http_total_size;       /* size of the HTML */
        int un1_cgi_timeout;           /* from cliette/daemon to CGI */
        int un1_missing_sequence;      /* request missing sequence */
    } http_un1;
    int request;                       /* what kind of request */
    int sender_pid;                    /* who sent this request */
    union {
        int un2_cgi_pid;               /* used only when the Daemon sends a request */
                                       /* to a Cliette about the incoming CGI process */
        int un2_cliette_uid;           /* Cliette pid for administration uses */
    } http_un2;
    struct CliettePortInfo port_info;  /* cliette listening port info */
    int http_sender_state;
    union {
        struct Cliette_Queue un4_qe_contents;
        struct Cliette_Def un4_qe_defcontents;
        char un4_http_service_name[50];
    } http_un4;
};

/* Possible type of Client_Type */
#define CMDaemon    0
#define Client_CGI  1
#define Client_ADM  2
#define Cliette     3

/* Possible type of Request */
#define Connect_to_Cliette  1   /* CGI -> Daemon */
#define Cliette_initdone    2   /* Cliette finished init (Cliette->Daemon) */
#define Cliette_free        3   /* Cliette finished 1 job (Cliette->Daemon) */
#define Cliette_finish      4   /* Cliette terminates, in response to */
                                /* Cliette_stop (Cliette->Daemon) */
#define Cliette_done        5   /* Cliette finished req (Cliette->Daemon) */
#define Cliette_OK          6   /* Cliette response to Cliette_ayt (Cliette->Daemon) */
#define Cliette_reinit      7   /* re-initialization (Daemon->Cliette) */
#define Cliette_stop        8   /* terminate a cliette (Daemon->Cliette) */
#define Cliette_debug       9   /* start debugging mode (Daemon->Cliette) */
#define Cliette_debugend   10   /* end debugging mode (Daemon->Cliette) */
#define Cliette_ayt        11   /* are you there (Daemon->Cliette) */
#define Cliette_dojob      12   /* tell Cliette its CGI client process id; */
                                /* Cliette prepares for incoming CGI req (Daemon->Cliette) */
#define CGI_HTMLready      13   /* HTML is ready (Cliette -> CGI) */
#define CGI_settimeout     14   /* Daemon or Cliette asks CGI to change */
                                /* its default timeout value (Cliette -> CGI) */
#define Cliette_init       15   /* init a new cliette (Admin->Daemon) */
#define Cliette_list       16   /* list cliette status (Admin->Daemon) */
#define Cliette_kill       17   /* kill a hanging cliette (Admin->Daemon) */
#define Daemon_terminate   18   /* terminate the daemon (Admin->Daemon) */
#define Daemon_reinit      19   /* Daemon re-reads the configuration file */
#define Packet_ACK         20   /* used for general acknowledgement */
#define Packet_Resend      21   /* CGI asks Cliette to resend the last pkg */
#define Get_URL_String     22   /* cliette requests URL string from CGI */
#define URL_String_End     23   /* Cliette does not want any more URL */
#define Cache_initdone     24   /* Cache Mgr finished init (Cache->Daemon) */

/* Second type of network package: mainly for transferring URL and HTML */
#define DATASIZE 2020
#define HTMLSIZE DATASIZE
#define URLSIZE  DATASIZE

struct DATARequest {
    int data_size;
    int data_sequence_number;
    int end_block;
    char data_string[DATASIZE];
};
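As an illustration of how the two packet types work together, here is a hedged sketch, not taken from the actual source, of a CGI process announcing itself to the daemon with an HTTPRequest and then streaming a long URL in DATASIZE-byte DATARequest blocks; send_packet() is an assumed transport helper, and the exact hand-off between daemon and cliette follows the APIs in the next sections.

#include <string.h>
#include <unistd.h>

/* Assumed helper: write one packet to a connected socket. */
extern int send_packet(int sock, const void *pkt, int len);

void announce_and_send_url(int daemon_sock, int cliette_sock, const char *url)
{
    /* 1. Ask the Connection Manager Daemon for a cliette. */
    struct HTTPRequest req;
    memset(&req, 0, sizeof(req));
    req.client     = Client_CGI;         /* who is talking */
    req.request    = Connect_to_Cliette; /* CGI -> Daemon */
    req.sender_pid = getpid();           /* lets the daemon route the reply */
    send_packet(daemon_sock, &req, sizeof(req));

    /* 2. Stream the URL to the assigned cliette in fixed-size blocks;
     *    end_block marks the final one.  A real implementation might send
     *    only data_size payload bytes rather than the whole struct. */
    const char *p = url;
    int left = (int)strlen(url) + 1;     /* include the terminating NUL */
    int seq = 0;
    while (left > 0) {
        struct DATARequest blk;
        int n = (left > DATASIZE) ? DATASIZE : left;
        blk.data_size            = n;
        blk.data_sequence_number = seq++;
        blk.end_block            = (left == n);
        memcpy(blk.data_string, p, (size_t)n);
        send_packet(cliette_sock, &blk, sizeof(blk));
        p    += n;
        left -= n;
    }
}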
A.3 Connection Manager Daemon API

Function Call    Argument                  Return Value                 Action
InitCliette      struct Cliette_Def *      Process ID                   Initialize a cliette process.
StopCliette      struct Cliette_Def *      0: success; != 0: fail       Terminate a running cliette process.
KillCliette      struct Cliette_Queue *    struct Cliette_Def *;        Kill a hanging cliette process.
                                           NULL: fail
ReInitCliette    struct Cliette_Queue *    0: success; != 0: fail       Force the cliette to run its
                                                                        initialization step.
DebugCliette     struct Cliette_Queue *,   0: success; != 0: fail       Force a cliette process to generate
                 int, char *                                            debugging information.
GetNextCliette   struct HTTPRequest *      struct Cliette_Queue *;      Get the next available cliette to
                                           NULL: no free cliette        serve a request.
AreYouThere      struct Cliette_Queue *,   0: success; != 0: fail       Check if a cliette process is still
                 Time_out value                                         running. The Daemon waits for Time_out.
FindCliette      CGI socket number,        0: successful; -1: fail;     Find a free cliette process for the
                 struct Cliette_Queue *    1: resource unavailable      CGI process.

The structures Cliette_Def and Cliette_Queue are defined as follows:

struct Cliette_Def {
    char def_string[MAXCLIETTEDEF];
    char remote_host[MAXHOSTLEN];
    char remote_user[9];
    char remote_passwd[9];
    struct in_addr remote_addr;
    int remote_addr_length;
    int port_no;
    int unique_id;
    int Cliette_state;
    unsigned char adm_wait;
    char exec_path[MAXPATHLEN];
    struct Cliette_Queue *qe;    /* qe points to the corresponding queue  */
                                 /* element. If qe equals NULL, this      */
                                 /* definition is free to be used.        */
    struct Cliette_Def *next_def;
};

struct Cliette_Queue {
    struct Cliette_Def *cliette_defptr;
    int Cliette_uniqueid;
    int Cliette_processid;
    int serving_cgi_pid;
    int socket_num;
    int public_socket;
    int Cliette_state;           /* redundant info */
    int Cliette_type;
    unsigned char stop_byte;
    struct Cliette_Queue *next;
    union {
        struct Cliette_Queue *un2_next_avail;
        struct Cliette_Queue *un2_next_unavail;
    } Qun2;
};
#define next_avail   Qun2.un2_next_avail
#define next_unavail Qun2.un2_next_unavail
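To show how these calls compose, here is a minimal, hypothetical dispatch step assembled only from the API table above, not from the daemon's actual source; the C prototypes and the five-second liveness timeout are assumptions derived from the table.

/* Prototypes as implied by the A.3 table; exact types are assumed. */
extern struct Cliette_Queue *GetNextCliette(struct HTTPRequest *req);
extern int InitCliette(struct Cliette_Def *def);
extern int AreYouThere(struct Cliette_Queue *qe, int timeout);
extern struct Cliette_Def *KillCliette(struct Cliette_Queue *qe);

void dispatch_cgi_request(struct HTTPRequest *req, struct Cliette_Def *def)
{
    struct Cliette_Queue *qe = GetNextCliette(req);

    if (qe == NULL) {
        /* No free cliette: try to grow the pool, then retry once. */
        if (InitCliette(def) <= 0)
            return;                 /* could not start a new cliette */
        qe = GetNextCliette(req);
        if (qe == NULL)
            return;                 /* still nothing available */
    }

    /* Make sure the chosen cliette is alive before handing it work. */
    if (AreYouThere(qe, 5) != 0) {
        KillCliette(qe);            /* reap the hanging process */
        return;
    }

    /* ... forward the CGI request to the cliette over its socket ... */
}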
A.4 Cliette Process API

This section defines the cliette process API with both the CM Daemon and the CGI process.

Function Call   Argument                    Return Value                   Action
InitSelf        int service_type,           daemon socket number           Initialization step. The return value
                char *hostname,                                            contains the socket number to the
                char *defstring                                            daemon.
WaitForJob      struct HTTPRequest *        cliette socket:                Wait for the Daemon to assign a job
                                            > 0: socket to CGI;            to me. The return value is the
                                            = 0: request from Daemon;      connected socket to the CGI.
                                            < 0: socket error
DoAdmin         struct HTTPRequest *                                       Run an administration function.
IamFree                                                                    The cliette is free to serve more
                                                                           requests.
GetURL          pointer to URL string,      size of the URL string         Get the URL string from the CGI
                size of the URL string,     received                       process.
                cgi_socket
SendHTML        pointer to HTML,            size of the HTML being sent    Give the HTML to the CGI.
                size of the HTML,
                cgi_socket
TellCGI         cgi socket number,          0: successful;                 Tell the CGI to process the result
                total html size             != 0: CGI had died             of a URL search.
URLend                                                                     The cliette does not need any more
                                                                           URLs.

A.4.1 The HTML Request Block

This request block is used to pass the HTML document generated by the cliette in response to the URL request.

#define DATASIZE 2020
#define HTMLSIZE DATASIZE

struct HTMLblock {
    int data_size;               /* current block size      */
    int data_sequence_number;    /* current sequence number */
    char html_data[DATASIZE];    /* data                    */
};
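Putting the table together, the following is a hedged skeleton of a cliette's main loop; build_html_for() is a hypothetical stand-in for the real back-end search, the C prototypes are assumptions derived from the table, and the ordering of TellCGI before SendHTML is inferred from the total-size argument.

/* Prototypes as implied by the A.4 table; exact types are assumed. */
extern int  InitSelf(int service_type, char *hostname, char *defstring);
extern int  WaitForJob(struct HTTPRequest *req);
extern int  GetURL(char *url, int size, int cgi_socket);
extern int  SendHTML(char *html, int size, int cgi_socket);
extern int  TellCGI(int cgi_socket, int total_html_size);
extern void IamFree(void);

/* Hypothetical back-end helper: fills html and returns its length. */
extern int build_html_for(const char *url, char *html, int maxlen);

void cliette_main(int service_type, char *hostname, char *defstring)
{
    static char url[URLSIZE];    /* URLSIZE and HTMLSIZE from A.2 */
    static char html[HTMLSIZE];
    struct HTTPRequest req;

    int daemon_sock = InitSelf(service_type, hostname, defstring);
    if (daemon_sock < 0)
        return;

    for (;;) {
        int cgi_sock = WaitForJob(&req);
        if (cgi_sock < 0)
            break;               /* socket error */
        if (cgi_sock == 0)
            continue;            /* administrative request from the daemon */

        int n = GetURL(url, sizeof(url), cgi_sock);
        if (n > 0) {
            int len = build_html_for(url, html, sizeof(html));
            TellCGI(cgi_sock, len);          /* announce the total HTML size */
            SendHTML(html, len, cgi_sock);   /* then stream the document */
        }
        IamFree();               /* ready for the next request */
    }
}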
A.5 CGI Process API

This section defines the CGI process API with the Connection Manager Daemon and the cliette processes.

Function Call    Argument                     Return Value                   Action
GetCliette       char *servtype,              > 0: cliette socket no.;       Request a cliette.
                 Timeout value,               = 0: timeout;
                 struct CliettePortInfo *     < 0: Daemon died
GetCache         cache manager port number,   0: successful; -1: fail;       Find the cache manager for the
                 hostname                     1: resource unavailable        CGI process.
ConnectCliette   struct CliettePortInfo *                                    Make the initial connection to
                                                                             the cliette.
PutURL           cliette socket number        = sizeof URL: sent;            Give the URL to the cliette.
                                              < sizeof URL: fail
WaitForHTML      cliette socket number        0: successful;                 Wait for the cliette and output
                                              < 0: cliette failed            the HTML to stdout.
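As a quick illustration of the request path, here is a hypothetical CGI-side sequence built only from this table. The "VI" service name comes from the sample configuration file, the 15-second timeout is the CGI timeout quoted in Chapter 6, the C prototypes are assumed, and PutURL's URL argument, which the table elides, is left implicit.

/* Prototypes as implied by the A.5 table; exact types are assumed. */
extern int  GetCliette(char *servtype, int timeout,
                       struct CliettePortInfo *info);
extern void ConnectCliette(struct CliettePortInfo *info);
extern int  PutURL(int cliette_socket);
extern int  WaitForHTML(int cliette_socket);

void serve_one_request(void)
{
    struct CliettePortInfo info;

    /* Ask the CM Daemon for a free cliette of the "VI" service type. */
    int sk = GetCliette("VI", 15, &info);
    if (sk <= 0)
        return;                /* = 0: timeout, < 0: daemon died */

    ConnectCliette(&info);     /* initial connection to the cliette */

    /* Hand the request URL to the cliette; the URL itself travels with
     * the call in a way the table does not spell out. */
    if (PutURL(sk) <= 0)
        return;

    WaitForHTML(sk);           /* the cliette's HTML is copied to stdout */
}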
Appendix B

CMDadmin Manual Page and Its Usage Example

B.1 CMD Administration Command Manual Page

NAME
  CMDadmin - system administrator command to control the Connection
  Manager Daemon process and the various Cliette processes.

SYNOPSIS
  CMDadmin [-a Cliette_id] [-b] [-c type_of_service] [-d Cliette_id]
           [-e Cliette_id] [-i] [-k Cliette_id] [-r Cliette_id]
           [-s cliette_id/type_of_service] [-u cliette_id/type_of_service]
           [-v Cliette_id/type_of_service] [-t] [-x]

DESCRIPTION
  CMDadmin is used by the system administrator to maintain the Connection
  Manager Daemon and the Cliette processes. The cliette id used in the
  command line is the cliette's unique id number, which is different from
  its actual process id.

OPTIONS
  -a  Check if the cliette process is functional.

  -b  Build the connection manager daemon configuration file. This file
      is used by the connection manager daemon to start up cliette
      processes. If no configuration file name is given at the command
      line, the system will prompt you for the filename. If a file with
      the same filename already exists, the system will give you the
      option to save, overwrite, quit, or append. The save option will
      rename the original file and respond with "saving the original
      file as .....". It is the user's responsibility to maintain the
      backup files. The overwrite option will overwrite the original
      file with the new information. The quit option allows the user to
      stop. The append option allows the user to append additional
      cliette information to the configuration file.

  -c  Create a new type-of-service cliette process.

  -d  Start debugging the cliette process.

  -e  End debugging the cliette process with id equal to cliette id.

  -i  Force the connection manager daemon to run its re-initialization
      process. All the running cliette processes are terminated before
      the re-initialization process.

  -k  Kill a hanging cliette process.

  -r  Re-initialize a cliette process.

  -s  Terminate a cliette process, or all the cliettes with type equal
      to type of service.

  -u  Terminate all the cliette processes and the connection manager
      daemon.

  -t  Start UTE tracing.

  -x  Stop UTE tracing.

  -v  View the status of the cliettes of type equal to type of service,
      or of the cliette with id equal to cliette id.

B.2 CMDadmin -b Command Example

We show here how to use the CMDadmin command to build a configuration file.

Enter configuration file (/etc/CMDaemon.conf):
Enter the Cliette service type (such as DB2 or VI): VI
Enter the number of Cliette to start initially : 1
Enter the maximum number of Cliette allowed : 4

**** Please enter information for Cliette #1 ***
Remote host for cliette (Enter for local) :
CLIETTE program name (includes complete path): /usr/sys/CMD/vi_cli
Enter CLIETTE Login NAME (Enter for none): vi_user1
Enter Cliette password (Enter for NONE):
Enter Cliette password again for verification:
Enter extra Cliette global environment variable
Empty string to stop
Enter extra Cliette info (TAG=VALUE): VILIB=libserv1
Enter extra Cliette info (TAG=VALUE): VIOBJ=objserv1
Enter extra Cliette info (TAG=VALUE):

**** Please enter information for Cliette #2 ***
Remote host for cliette (Enter for local) : tivoli.watson.ibm.com
Enter remote machine login name: cmd
Enter remote login passwd (Enter for NONE):
Enter passwd again for verification :
CLIETTE program name (includes complete path): /usr/sys/CMD/vi_cli
Enter CLIETTE Login NAME (Enter for none): vi_user2
Enter Cliette password (Enter for NONE):
Enter Cliette password again for verification:
Enter extra Cliette global environment variable
Empty string to stop
Enter extra Cliette info (TAG=VALUE): VILIB=libserv1
Enter extra Cliette info (TAG=VALUE): VIOBJ=objserv1
Enter extra Cliette info (TAG=VALUE):

**** Please enter information for Cliette #3 ***
Remote host for cliette (Enter for local) :
CLIETTE program name (includes complete path): /usr/sys/CMD/vi_cli
Enter CLIETTE Login NAME (Enter for none): vi_user3
Enter Cliette password (Enter for NONE):
Enter Cliette password again for verification:
Enter extra Cliette global environment variable
Empty string to stop
Enter extra Cliette info (TAG=VALUE): VILIB=libserv1
Enter extra Cliette info (TAG=VALUE): VIOBJ=objserv1
Enter extra Cliette info (TAG=VALUE):

**** Please enter information for Cliette #4 ***
Remote host for cliette (Enter for local) :
CLIETTE program name (includes complete path): /usr/sys/CMD/vi_cli
Enter CLIETTE Login NAME (Enter for none): vi_user4
Enter Cliette password (Enter for NONE):
Enter Cliette password again for verification:
Enter extra Cliette global environment variable
Empty string to stop
Enter extra Cliette info (TAG=VALUE): VILIB=libserv1
Enter extra Cliette info (TAG=VALUE): VIOBJ=objserv1
Enter extra Cliette info (TAG=VALUE):

More Cliette type (y/n) ? n

The following is the configuration file.

# COMPONENT_NAME: Connection Manager Daemon startup file
%service=VI 1 4
-cliette={CMD_EXEC_PATH=/usr/sys/CMD/vi_cli;
          CMD_NAME=vi_user1;CMD_PASSWD=edae8ac3cdf88f2b;
          VILIB=libserv1; VIOBJ=objserv1}
-cliette={CMD_EXEC_PATH=/usr/sys/CMD/vi_cli;
          CMD_NAME=vi_user2;CMD_PASSWD=d5c8a9a6cb5e4b1e;
          VILIB=libserv1; VIOBJ=objserv1}
-cliette={CMD_EXEC_PATH=/usr/sys/CMD/vi_cli;
          CMD_NAME=vi_user3;CMD_PASSWD=cd46fc62bda5f9c3;
          VILIB=libserv1; VIOBJ=objserv1}
-cliette={CMD_EXEC_PATH=/usr/sys/CMD/vi_cli;
          CMD_NAME=vi_user4;CMD_PASSWD=0034bcb2713619f5;
          VILIB=libserv1; VIOBJ=objserv1}

Appendix C

Sample Cache Manager Configuration File and Its API Calls

C.1 Sample Configuration File

cache-manager {
    logging = on
    logfile = all_logs/cache.log
    port = 7175
    wrap-log = yes
    log-size = 64000
    connection-timeout = 300S
}

cache0 {
    root = /usr/www/cache0       # root for cache files
    caching = on                 # enable caching
    file-cache = 100MB
    memory-cache = 1000KB
    expiration = 60M
    check-expiration = 60S
    datum-memory-limit = 2KB
    datum-disk-limit = 4KB
}

cache1 : cache0 {
    root = /usr/www/cache1
    fs-size = 100MB
    mem-size = 0
}

C.2 Cache Manager API

The API and the cache manager communicate using a CacheToken structure. Depending on the operation, the programmer supplies information to, or receives information from, the cache manager via the CacheToken. The CacheToken structure and its supporting types are as follows:

typedef struct _CacheHandle {
    char * cache_host;
    int    port;
    char * cache_id;
    int    socket;
} CacheHandle;

enum CacheDisp {
    CacheRO,
    CacheWO,
    CacheNone
};

enum CacheRc {
    CacheNo,
    CacheModified,
    CacheFound,
    CacheExpired,
    CacheLocked
};

//
// CacheToken:
//   function - C version of the cache token as used in the API
//
typedef struct _CacheToken {
    void *data;                  // the data
    int len;                     // length of data
    int datum_len;               // length of the datum (-1 for N/A)
    time_t creation;             // when data was last written (open R/W)
    time_t expiration;           // expiration date
    time_t last_access;          // when item was last opened for read
    CacheHandle * connection;    // connection fd to daemon
    enum CacheRc return_code;    // return code from last operation
    enum CacheDisp disp;         // indicates whether datum RO, RW, NONE (not open),
                                 // or NULL (does not exist)
} CacheToken;

CacheInit routine

Syntax
    CacheHandle * CacheInit(char *cache_machine, int cache_port,
                            char *cache_service);

Description
    Initialize a connection to the cache manager.

Parameters
    cache_machine  This is the name of the machine running the cache
                   manager.
    cache_port     This is the connection port the cache manager is
                   listening on.
    cache_service  This is the name of the cache service to connect to.
                   This corresponds to the "cache-id" in the specified
                   cache.

Return value
    The CacheInit routine returns a CacheHandle which is used in
    subsequent cache operations. If the cache manager cannot be
    contacted, NULL is returned.

CacheClose routine

Syntax
    void CacheClose(CacheHandle * ch);

Description
    Close the connection to the cache manager.

Parameters
    ch  This is a CacheHandle returned by CacheInit.

Return value
    Nothing is returned.

CacheMakeToken

Syntax
    CacheToken * CacheMakeToken(void *data, int len);

Description
    Initialize a cache token.

Parameters
    data  This is a pointer to the data comprising the token. The data
          is understood to be an array of arbitrary bytes.
    len   This is the length of the token.

Return value
    A CacheToken is allocated from free storage, initialized, and
    returned. It must be freed eventually with CacheFreeToken. The
    application may modify the datum_len, expiration, and disp fields as
    appropriate. If not modified by the programmer, the expiration field
    is set by the cache manager to its default, and disp is set to RO.
CacheOpenData

Syntax
    int CacheOpenData(CacheToken *token, CacheHandle * cache_manager);

Description
    Look up an entry in the cache, and get RO or WO access to it.

Parameters
    token          This is a pointer to a CacheToken which has been
                   initialized by CacheMakeToken(). The "disp" field must
                   be set appropriately. If RW is specified, the
                   "datum_len" must also be set to the length of the
                   data. If "expiration" is not set, the cache manager
                   will use its default expiration; otherwise the
                   "expiration" in the token takes precedence.
    cache_manager  This is a handle to a cache manager which has been
                   initialized by CacheInit.

Return value
    A single integer return code is returned:
    CacheFound     This is returned if the data is in the cache and is
                   valid.
    CacheModified  This is returned if the data is in the cache but is
                   marked expired.
    CacheNo        This is returned if the data is not in the cache, or
                   the data cannot be accessed for some reason.

Usage Notes
    Set the "disp" field to CacheRO for read access or CacheWO to create
    or replace an entry. If CacheWO is specified and the item already
    exists, the item is discarded and re-allocated for fresh creation.
    The CacheHandle is placed in the token by this call, so the token may
    be used without a handle from this point on until it is closed. If
    the object is found in the cache, the token is updated to reflect the
    correct datum_len, creation, expiration, and last_access values.

CacheCloseData

Syntax
    void CacheCloseData(CacheToken *token);

Description
    Close the cache item.

Parameters
    token  This is a pointer to a CacheToken which has been initialized
           by CacheOpenData() specifying RO mode.

Return value
    None.

CacheRead

Syntax
    int CacheRead(CacheToken *token, void *buffer, int len);

Description
    Read data from the cache.

Parameters
    token   This is a pointer to a CacheToken which has been initialized
            by CacheOpenData() specifying RO mode.
    buffer  This is a buffer to receive the data.
    len     This is the number of bytes to read.

Return value
    The number of bytes actually read is returned. On read errors -1 is
    returned and errno reflects the cause of the error.
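To make the read path concrete, the following minimal sketch strings the calls above together. It is illustrative rather than taken from the implementation: the port and cache-id come from the sample configuration in C.1, "localhost" and the key bytes are assumed values, and the void CacheFreeToken(CacheToken *) signature is an assumption since the routine is only named, not specified, above.

#include <string.h>

/* Assumed signature: the text above only names CacheFreeToken. */
extern void CacheFreeToken(CacheToken *token);

int fetch_from_cache(const char *key, char *buf, int buflen)
{
    CacheHandle *ch = CacheInit("localhost", 7175, "cache0");
    if (ch == NULL)
        return -1;                        /* cache manager unreachable */

    CacheToken *tok = CacheMakeToken((void *)key, (int)strlen(key));
    tok->disp = CacheRO;                  /* read-only access */

    int n = -1;
    if (CacheOpenData(tok, ch) == CacheFound) {
        n = CacheRead(tok, buf, buflen);  /* bytes actually read, or -1 */
        CacheCloseData(tok);
    }
    CacheFreeToken(tok);
    CacheClose(ch);
    return n;
}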
CacheReadToFile

Syntax
    int CacheReadToFile(CacheToken *token, char *file);

Description
    Read data from the cache into the named file.

Parameters
    token  This is a pointer to a CacheToken which has been initialized
           by CacheOpenData() specifying RO mode.
    file   This is the name of the file to receive the data.

Return value
    The entire cache item is transferred to the named file. If the file
    already exists it is overwritten.

CacheWrite

Syntax
    int CacheWrite(CacheToken *token, void *buffer, int len);

Description
    Write data to the cache.

Parameters
    token   This is a pointer to a CacheToken which has been initialized
            by CacheOpenData() specifying RW mode and the total length of
            the data object.
    buffer  This is a buffer from which to send the data.
    len     This is the number of bytes to write.

Return value
    The number of bytes actually written is returned. On write errors -1
    is returned and errno reflects the cause of the error.

Usage Notes
    When a datum is opened with CacheOpenData, a seek pointer is set for
    the data object in the cache manager. This pointer is incremented
    with each read or write command and can be reset only by closing and
    reopening the item. The cache manager will accept data only up to the
    number of bytes specified in datum_len in the token when the item was
    opened.

CacheWriteFromFile

Syntax
    int CacheWriteFromFile(CacheToken *token, char *file);

Description
    Write data to the cache from a file.

Parameters
    token  This is a pointer to a CacheToken which has been initialized
           by CacheOpenData() specifying RW mode.
    file   This is the name of the file to be copied to the cache.

Return value
    The entire file is copied to the cache. Note that you must have first
    opened the cache entry in CacheRW mode, passing the correct length of
    the file in the open call. This is required to permit the cache to
    correctly allocate space (in memory or on disk).

CachePurge

Syntax
    int CachePurge(CacheToken *token)

Description
    Purge data from the cache.

Parameters
    token  This is a pointer to a CacheToken which has been initialized
           by CacheOpenData() specifying RW mode.

Return value
    If the item is purged, the value TRUE (1) is returned. Otherwise
    FALSE (0) is returned.

CacheClear

Note: is this call wise? There are probably security concerns.

Syntax
    int CacheClear(CacheHandle * handle);

Description
    Request the cache manager to invalidate all its data.

Parameters
    handle  This is a CacheHandle initialized by a call to CacheInit.

Return value
    If all caches are cleared, the value TRUE (1) is returned. Otherwise
    FALSE (0) is returned.

Usage Notes
    The operation is performed only if ALL entries can be invalidated;
    otherwise none of the entries are invalidated.

CacheSetParameters

This is a "futures" call.
Note: There are probably security concerns for this too.
Note: Details of this call are not clear at this point and will be
determined by experimentation.

Syntax
    int CacheSetParameters(CacheHandle * handle, ... );

Description
    Reset cache parameters without stopping and restarting the cache
    manager.

Parameters
    handle  This is a CacheHandle initialized by a call to CacheInit.
    ...     These are whatever we decide later.

Return value
    None.
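To round out the API, here is the corresponding write-path sketch under the same caveats as the read example: whether a successful write open returns something other than CacheNo is an assumption, as is the CacheFreeToken signature.

#include <string.h>

/* Assumed signature, as in the read example. */
extern void CacheFreeToken(CacheToken *token);

int store_in_cache(CacheHandle *ch, const char *key,
                   const void *data, int len)
{
    CacheToken *tok = CacheMakeToken((void *)key, (int)strlen(key));
    tok->disp      = CacheWO;   /* create or replace the entry */
    tok->datum_len = len;       /* must be set before the open so the */
                                /* cache can allocate space */

    if (CacheOpenData(tok, ch) == CacheNo) {   /* assumed failure code */
        CacheFreeToken(tok);
        return -1;
    }
    int written = CacheWrite(tok, (void *)data, len);
    CacheCloseData(tok);
    CacheFreeToken(tok);
    return written;             /* bytes written, or -1 on error */
}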
Appendix D

Ute2ups Output File

00000 0FE 0FE 0FE 0FE 0FE 0FE 2032 0.
000000000000000 0FD 0FD 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FD
00000000 0FE 0FE 0FE 000000 000000000000
13 13 13. .387514112 13 13. 13. 13. 13. 13. 13. 13. 13. 13. .581643008
13 13. 13. 13. 17. 17. .403901440 17 17.
.387242240 .387314176 387451136 388979712 389090560 389742592 389754368
390113536 390174208 390174208 581337088 581447424 581971456 582043648
582094592 403258880 403339008 403959808
b_IP_Recv e_IP_Recv b_IP_Recv e_IP_Recv b_IP_Recv e_IP_Recv
Define_Marker b_seerGI b_IP_Send e_IP_Send e_IP_Send b_IP_Recv e_IP_Recv
b_IP_Send e_IP_Send b_IP_Send e_IP_Send
NNNNNN 138 @10001594 2 12 NNNNNN 2
GCLOCK 287007.686391450 ADJAMT
b_IP_Send e_IP_Send b_IP_Recv
0 148 0 148 -1 148 12 148 -1 2032 1296389192 seerGI 12 148 148 12 148 12
2032 12 2032 12 148 12 148 12 148 12 148 12 148 12 148
0. 00000000000000000000 0FE 0FE 0FE 0FE 0FE 0FD 00000000
0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FE 0FD 00000000
000000 000000000000
17. 17. 17. 17. 17. 19. 19. .590217472 19 19. 19. 19. 19. 19. 19. 19.
19. 19. .788507648 19
404042496 404656384 412805120 412879360 412956672 589584384 589664512
590289920 590373888 590631680 595549696 595615744 595690752 595762944
602458624 602458624
e_IP_Recv b_IP_Send e_IP_Send b_IP_Recv e_IP_Recv
NNNN 12 12 12 12 12
GCLOCK 287009.872619575 ADJAMT
b_IP_Send e_IP_Send b_IP_Recv e_IP_Recv b_IP_Send e_IP_Send
b_IP_Recv e_IP_Recv b_IP_Send e_IP_Send e_IP_Send
NNNNNNNNNN 2 12 12 12 12 12 12 12 12 12 18 12
GCLOCK 287010.071621925 ADJAMT
148 2032 2032 148 148 148 148 148 148 2032 2032 148 148 148 148

Appendix E

One Set of the CGI Performance Trace Results

This appendix shows an example of the CGI performance trace results. There are 32 cliette processes and 32 concurrent CGI processes. Trace markers prefixed with CLIonly- gather the communication overhead with the cliette processes. Trace markers prefixed with TotalCGI- gather the total overhead of the CGI requests. The results are generated using our extended UTE tools.

Markers       Total elapsed   # calls   Average per     Maximum     Minimum
              time (msecs)              call (msecs)    (msecs)     (msecs)
CLIonly-1         1.180          30       0.039365      0.100808    0.009687
CLIonly-2         1.406          30       0.046889      0.458509    0.004284
CLIonly-3         0.866          30       0.028874      0.131889    0.004897
CLIonly-4         2.264          30       0.075496      0.527412    0.005591
CLIonly-5         2.005          30       0.066835      0.353570    0.006677
CLIonly-6         1.400          30       0.046676      0.278117    0.007001
CLIonly-7         1.597          30       0.053241      0.392853    0.009950
CLIonly-8         1.113          30       0.037119      0.320832    0.006632
CLIonly-9         1.351          30       0.045048      0.237793    0.004432
CLIonly-10        1.207          30       0.040258      0.437840    0.004048
CLIonly-11        1.245          30       0.041502      0.176489    0.004313
CLIonly-12        0.581          30       0.019368      0.084759    0.004211
CLIonly-13        1.344          30       0.044829      0.156143    0.005445
CLIonly-14        2.272          30       0.075742      1.070509    0.004054
CLIonly-15        1.075          30       0.035842      0.139207    0.005370
CLIonly-16        1.223          30       0.040797      0.375009    0.005303
CLIonly-17        0.620          30       0.020693      0.101845    0.004476
CLIonly-18        0.806          30       0.026884      0.439822    0.003896
CLIonly-19        1.383          30       0.046115      0.167071    0.004154
CLIonly-20        1.729          30       0.057660      0.225426    0.006752
CLIonly-21        0.469          30       0.015662      0.229808    0.003834
CLIonly-22        1.032          30       0.034413      0.189010    0.004010
CLIonly-23        1.044          30       0.034821      0.163891    0.003958
CLIonly-24        1.081          30       0.036049      0.447261    0.004008
CLIonly-25        0.557          30       0.018577      0.081385    0.003886
CLIonly-26        0.667          30       0.022239      0.084272    0.004105
CLIonly-27        0.973          30       0.032434      0.208985    0.005090
CLIonly-28        0.981          30       0.032710      0.124501    0.005154
CLIonly-29        0.899          30       0.029980      0.109072    0.004023
CLIonly-30        0.932          30       0.031093      0.089812    0.004395
CLIonly-31        0.601          30       0.020049      0.217402    0.004091
CLIonly-32        0.958          30       0.031934      0.205561    0.003927
Markers       Total elapsed   # calls   Average per     Maximum     Minimum
              time (msecs)              call (msecs)    (msecs)     (msecs)
TotalCGI-1        3.619          30       0.120643      0.983301    0.022279
TotalCGI-2        3.393          30       0.113101      1.096251    0.012936
TotalCGI-3        1.843          30       0.061445      0.217051    0.018176
TotalCGI-4        4.774          30       0.159149      0.724803    0.014992
TotalCGI-5        4.217          30       0.140570      0.475607    0.020310
TotalCGI-6        2.796          30       0.093209      0.302459    0.016272
TotalCGI-7        3.965          30       0.132191      0.534225    0.022115
TotalCGI-8        7.944          30       0.264831      5.902418    0.015512
TotalCGI-9        4.347          30       0.144915      1.158108    0.016727
TotalCGI-10       8.813          30       0.293794      5.909424    0.012247
TotalCGI-11       8.251          30       0.275064      5.647499    0.013061
TotalCGI-12       7.217          30       0.240593      5.635181    0.014370
TotalCGI-13       3.046          30       0.101542      0.331563    0.017238
TotalCGI-14       3.464          30       0.115482      1.148947    0.013400
TotalCGI-15       2.645          30       0.088167      0.344756    0.015360
TotalCGI-16       2.814          30       0.093818      0.425163    0.019374
TotalCGI-17       8.447          30       0.281591      5.932014    0.016651
TotalCGI-18       7.119          30       0.237332      5.634085    0.011395
TotalCGI-19       3.345          30       0.111525      0.311600    0.015581
TotalCGI-20       3.439          30       0.114641      0.320407    0.024020
TotalCGI-21       7.424          30       0.247479      5.657050    0.011613
TotalCGI-22       2.861          30       0.095373      0.520424    0.015211
TotalCGI-23       8.027          30       0.267586      5.540569    0.012145
TotalCGI-24       7.988          30       0.266293      5.527653    0.011663
TotalCGI-25       7.535          30       0.251168      5.932544    0.011602
TotalCGI-26       1.733          30       0.057781      0.150115    0.012808
TotalCGI-27       2.510          30       0.083676      0.597024    0.014737
TotalCGI-28       2.362          30       0.078741      0.369441    0.022032
TotalCGI-29       2.404          30       0.080159      0.269753    0.021162
TotalCGI-30       2.197          30       0.073245      0.203909    0.013885
TotalCGI-31       6.868          30       0.228959      5.663116    0.012985
TotalCGI-32       1.771          30       0.059039      0.239831    0.011593

Appendix F

Attributes for the FCLA_1 Index Class

• CategoryName: Contains things like Journal, Issue, Article, Class Notes, etc. This field is used to describe how the information has been categorized.

• CategoryID: Could be either a name or some identifying aspect within the CategoryName. This could be things like the Journal Name, Issue Number, Article number/name, Course Number, etc. This attribute is used to identify the information within a category.

• ExtKey: This is a unique key used to identify a MARC record in the LUIS database.

• Identifier: This is some identifier that is outside of VI but may be unique in the context of the data. It will be used to construct an HTML title within an HTML document.

• URL: This is some hyperlink that is used to redirect the browser either back to the LUIS catalog, to some other Web server, or somewhere else within the VI catalog.

• NextSibling: This is the VI ITEM ID of the next logical ITEM that can be (or is) associated with this specific ITEM or FOLDER.

• PrevSibling: This is the VI ITEM ID of the previous logical ITEM that can be (or is) associated with this specific ITEM or FOLDER.

• DiscoveryMethod: This is an enumerated numeric attribute defining the type of display to be used for the ITEMS or FOLDERS that are contained by this FOLDER.

• NxtSibAnchor: This is a character field that specifies what to place as the anchor for the URL used to request the next sibling.

• PrvSibAnchor: This is a character field that specifies what to place as the anchor for the URL used to request the previous sibling.

• RemoteDeliveryServer: This is a human-readable character field that contains the protocol, address, and port of a remote server.

• TimeDuration: An integer value specifying an amount of time in minutes that a URL is valid. This field is used with the timestamp sent in the HTTP request to validate the current URL. ZERO indicates an indefinite duration.

• FirstSibAnchor: This is a character field that specifies what to place as the anchor for the URL used to request the first sibling.

• LastSibAnchor: This is a character field that specifies what to place as the anchor for the URL used to request the last sibling.

• ParentAnchor: This is a character field that specifies what to place as the anchor for the URL used to request the parent folder.

• SequenceNumber: This is the sequence number for this ITEM ID or FOLDER as it pertains to the FOLDER that contains this ITEM or FOLDER.

• SequenceTotal: This is the total number of contained ITEMS or FOLDERS as it pertains to the FOLDER that contains this ITEM or FOLDER.

• IntKey: The unique Visual-Info item ID, which is created every time the item is entered into the Visual-Info database.
This information is also saved in the LUIS 174 database to link the MARC record with Visual-Info information. This ITEM ID remained unchanged if a rebuild/restore is done on the Visual-Info database to avoid rebuilding the LUIS database because of the changes in the IntKey field.. Bibliography [1] T. Bemers-Lee, “The HTTP protocols as implemented in w3,” tech. rep., URL = http: //www.w3.org/ pub/ www/ doc/ http.txt, January 1992. [2] T. Kwan, R. McGrath, and D. Reed, “NCSA’s world wide web server: Design and performance,” IEEE Computer, vol. 28, November, 1995. [3] T. Bemers-Lee, R. Fielding, and H. Frystyk, “Hypertext transfer protocol - HTTP/ 1.0,” tech. rep., Internet Draft — http:// www.w3.org /pub [WWW /Protocols, 1995. [4] R. McGrath, “What we do and don’t know about the load on the NCSA WWW server,” September, 1994. [5] E. D. Katz, M. Butler, and R. McGrath, “A scalable HTTP server: The NCSA proto- type,” Proc. 1994 World Wide Web Conference, 1994. [6] A. Ford, Spinning The Web - How to Provide Information on the Internet. VNR Communications Library, 1995. [7] T. BernerS-Lee, “Hypertext markup language (HTML),” tech. rep., URL = ftp: //www.w3.org/ pub/ www/ doc/ html-spec.ps, March 1993. [8] S. Lewontin, “The DCE web toolkit: Enhancing www protocols with lower-layer services,” Third International World Wide Web Conference, 1995. [9] S. Lewontin and M. E. Zurko, “The DCE web: Providing authorization and other distributed services to the world wide web,” WWW Conference ’94, 1994. 175 176 [10] NCSA, “The common gateway interface,” tech. rep., http:// hoohoo. ncs.uiuc.edu/ docs-1.4l, 1994. [11] C. Bowman, P. Danzig, D. Hardy, U. Manber, M. Schwartz, and D. Wessels, “Har- vest: A scalable, customizable discovery and access system,” tech. rep., Dept. of CS, University of Colorado - Boulder, March, 1995. [12] IBM, ImagePlus Visuallnfo - An integrated desktop solution for document manage- ment and beyond. IBM Publication 6221-4011-00, 1995. [13] IBM, D32 Information and Concepts for common server. IBM Publication S20H- 4664-00, 1995. [14] C. Stunkel, D. Shea, B. Abali, M. Atkins, C. Bender, D. Grice, P. Hochschild, D. Joseph, B. Nathanson, R. Swetz, R. Stucke, T. Tsao, and P. Varker, “The SP2 high-performance switch,” IBM Systems Journal, vol. 34, no. 2, 1995. [15] L. Lamport, “Time, clocks, and the ordering of events in a distributed system,” Com- munications of ACM, vol. 21, no. 7, July 1978. [16] T. Kwan, D. Reed, and R. McGrath, “User access patterns to NCSA’s world wide web server,” 1995. [17] R. McGrath, “Performance of several HTTP daemons on an HP 735 workstation,” tech. rep., http://www. ncsa. uiuc. edu/InformationServers/ Performance/ V1.4/ re- porthtrnl, April, 1995. [18] R. McGrath, “Performance of several web server platforms,” tech. rep., httpzllwww. ncsa. uiuc. edu/InformationServers/ Performance/ Platforms/report.html, January 22, 1996. [19] G. Trent and M. Sake, “WebStone: The first generation in HTTP server benchmark- ing,” WWW Conference ’95, 1995. [20] “Webperf: The next generation in web server benchmarking,” 1996. 177 [21] R. McGrath, “Measuring the performance of HTTP daemons,” tech. rep., httpzllwww. ncsa. uiuc. edu/InformationServers/ Performance] Benchmarking/ bench.htrnl, 1996. [22] R. B. Denny, “WebSite performance analysis,” tech. rep., The Web Developer’s Vir- tual Library — http: //www. Stars. com/, 1995. [23] S. E. Spero, “Analysis of HTTP performance problems,” WWW Conference ’ 94, 1994. [24] V. N. Padmanabhan and J. C. 