Monitoring in Grid

This paper aims at providing a multi-platform grid monitoring service. It can monitor resources such as CPU speed and utilization, memory usage, disk usage, and network bandwidth in a real-time manner. Monitoring data is extracted from Ganglia and NWS tools then stored and transmitted in XML form and then used for displaying.


2007 IEEE Asia-Pacific Services Computing Conference

Implementation of Monitoring and Information Service Using Ganglia and NWS for Grid Resource Brokers*

Chao-Tung Yang† Tsui-Ting Chen Sung-Yi Chen


High-Performance Computing Laboratory
Department of Computer Science and Information Engineering
Tunghai University, Taichung, 40704, Taiwan
[email protected]

* This work is supported in part by National Science Council, Taiwan (ROC), under grant no. NSC 96-2221-E-029-019-MY3 and NSC 95-2218-E-007-025.
† Corresponding author.

0-7695-3051-6/07 $25.00 © 2007 IEEE
DOI 10.1109/APSCC.2007.74

Abstract

Grid computing is increasingly used by organizations to achieve high-performance computing and heterogeneous resource sharing. These Grids may span several administrative domains via the Internet. As a result, it may be difficult to monitor, control, and manage those machines and resources. This paper aims at providing a multi-platform Grid monitoring service which can monitor resources such as CPU speed and utilization, memory usage, disk usage, and network bandwidth in a real-time manner. Monitoring data is extracted from the Ganglia and NWS tools, stored and transmitted in XML form, and then used for display. All the information is displayed using real-time graphs.

1. Introduction

Grid computing can be defined as coordinated resource sharing and problem solving in dynamic, multi-institutional collaborations [1, 2, 3, 5, 7, 11, 12, 19, 21]. Grid computing involves sharing heterogeneous resources, based on different platforms, hardware/software, computer architectures, and computer languages, which are located in different places and belong to different administrative domains, over a network using open standards.

As more Grids are deployed worldwide, the number of multi-institutional collaborations is rapidly growing. However, for Grid computing to realize its full potential, Grid participants must be able to use one another's resources. In the Grid, the characterization and monitoring of resources [1, 7, 12, 19], services, and computations are very challenging due to the considerable diversity, large numbers, dynamic behavior, and geographical distribution of the entities. Hence, information services are a vital part of any Grid software infrastructure.

Typical monitoring and discovery use cases include providing data so that resource brokers can locate computing elements appropriate for a job, and streaming data to an application [7, 12, 19]. MDS of the Globus Toolkit provides a nice information management tool, but it is not capable of providing the rich set of all the requisite information by itself. In this paper, we concentrate on building monitoring and information services focused on providing comprehensive details about the resource information on a grid [8, 9, 10, 13, 14, 15]. These services can provide us with execution details of any grid job or grid CPU (load, architecture, etc.) of interest.

Ganglia [16] can monitor cluster resources and enhance MDS information with comprehensive cluster-level status information. Ganglia is flexible and open source, coexists easily with MDS and other information providers, and is well tested and widely used. For the collection of historic information we chose Ganglia [16]. The Ganglia monitoring system collects a set of system properties at regular intervals and stores them in a round-robin database. It is also possible to monitor additional properties by providing a script that extracts these properties for Ganglia.

Ganglia provides a PHP Web front-end for administrators to view cluster or Grid status information in real time. The default information includes metrics such as processor load, memory usage, network (bytes input/output), and disk utilization. For more information related to the network of a grid, the
Network Weather Service (NWS) [18] is used to obtain monitoring information. NWS, though not targeted at Beowulf clusters, is a distributed system that periodically monitors and dynamically forecasts the performance that various network and computational resources can deliver over a given time interval. However, for all practical purposes, the administrator needs a Web front-end that offers more flexible and variable operations and network-bandwidth-related information.

As a result, it may be difficult to monitor, control, and manage those machines and resources. This paper aims at providing a multi-platform Grid monitoring service which can monitor resources such as CPU speed and utilization, memory usage, disk usage, and network bandwidth in a real-time manner. Monitoring data is extracted from the Ganglia and NWS tools, stored and transmitted in XML form, and then used for display. All the information is displayed using real-time graphs.

2. Background

2.1. Machine information provider

Ganglia [16] is an open source project that grew out of the University of California, Berkeley's Millennium initiative. It is a scalable distributed system for monitoring the status of nodes (processor collections) in wide-area systems based on Grids or clusters. It adopts a hierarchical, tree-like communication structure among its components in order to accommodate information from large arbitrary collections of multiple Grids or clusters. The information collected by the Ganglia monitor includes hardware and system information, such as processor type, CPU load, memory usage, disk usage, operating system information, and other static/dynamic scheduler-specific details.

2.2. Network information provider

NWS (Network Weather Service) [18] is a distributed system that detects network status by periodically monitoring and dynamically forecasting it over a given time interval. The service operates a distributed set of performance sensors (network monitors, CPU monitors, etc.) from which it gathers system condition information. It then uses numerical models to generate forecasts of what the conditions will be for a given time period. The system includes sensors for end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage, and available non-paged memory. The sensor interface, however, allows new internal sensors to be configured into the system.

3. System design

3.1. Resource broker

Previous work implemented a Resource Broker for computational Grids. The Resource Broker discovers and evaluates Grid resources, and makes job submission decisions by comparing the requirements of a job with Grid resources. The system architecture of the Resource Broker and the relations among its components are shown in Figure 1. Each rectangle represents a unique component of our system. Users can easily make use of the Resource Broker through a common Grid portal [9, 13, 15, 19, 21]. The primary task of the Resource Broker is to compare user requests with the resource information provided by the Information Service. After an appropriate job assignment scheme is chosen, Grid resources are assigned and the Scheduler is responsible for submitting the job. The results are collected and returned to the Resource Broker. Then, the Resource Broker records the results of execution in the database of the Information Center through the Agent of the Information Service. The user can query the results from the Grid portal.

Figure 1. System architecture of resource broker

3.2. Software stack diagram
The software stack of the system consists of three layers, built bottom-up: the bottom layer, the middle layer, and the top layer. Each layer is described below.

Bottom Layer: The principal part of this layer is composed of Nodes; every Node in the Grid should be built with the software stack shown in Figure 2. This layer contains two main blocks. The first is the Information Provider, which gathers machine information of Nodes, such as the number of processors/cores, the processor load, the free/total memory size, and the disk usage; Ganglia serves as the Machine Information Provider in this system. Because the essence of a Grid is connecting Nodes over the Internet, network information among Nodes, such as bandwidth and latency, is essential; NWS serves as the Network Information Provider. The second block is the Grid Middleware, used to join Grid Nodes together; MPICH-G2 [4], which is compatible with GT, is required for running parallel applications on the Grid.

Figure 2. The software stack of all Nodes

Middle Layer: The main component of this layer is the Site; its software stack is shown in Figure 3. Each Site consists of several Nodes that are located in the same place or connected to the same switch/hub, and the Nodes in a Site should be connected to each other via the Internet. Moreover, each Site is usually built as a cluster in which each Node has a real IP address, and the first Node of a Site is called its head Node. The construction of this layer is related to the domain-based network information model that will be described later.

Top Layer: The core of this layer consists of two blocks, the Resource Broker and the Monitoring Service, as shown in Figure 3. The Monitoring Service provides a Web front-end for users to observe variations during the progress of jobs. In addition, users can specify the duration to observe for particular Nodes or for several particular links in a domain; the service was developed based on the Ganglia and NWS tools.

Figure 3. The software stack of all Sites and the Service

4. System implementation

4.1. Information Aggregator

The main phases of the Resource Broker are Resource Discovery, Application Modeling, Information Gathering, System Selection, and Job Execution [14]. The Information Gathering phase aggregates machine and network information so that the Resource Broker can make a suitable match between jobs and resources. For this purpose, this work devises two services called Information Service and Monitoring Service. Information Service gathers the machine and network information and stores it in a database, and Monitoring Service provides a Web front-end page for users to observe variations during job execution. Figure 4 depicts the architectures of Information Service and Monitoring Service, and their relation to the Resource Broker.

The primary purpose of Information Service is to collect related resource information (processors, memory, disk, and network bandwidth) of all machines in the Grid and provide the analyzed information. Its components and their relations are described as follows:
• Agent: It is the primary component of Information Service and its contact window. Either the Scheduler of the Resource Broker or the Controller of Monitoring Service may need real-time or estimated information about machines. For example, assume that users ask the Resource Broker for the list of machines with low CPU load. First, the Resource Broker sends a Request to Agent. After Agent receives the Request, it uses Getter and Setter to get the required information, which is returned to Agent. Then, Agent sends it to the Resource Broker. After the task is finished, the Resource Broker delivers the related execution data to Agent, including the number of used CPUs,
execution time, disk space usage, memory usage, task requirements, etc. These historical data are stored in Message Center. Predictor can then analyze these data and report a suggested machine list to Agent.
• Getter and Setter: Information collection and data access can occur at any time, so database operations are frequent. In order to unify information access and reduce redundant program development, Getter and Setter are designed and placed at the front end of Message Center to control access to it.
• Message Center: This component is mainly used to store native information from the Grid, including CPU Load, Memory Free, Disk Usage, Network Information, etc. In addition, observation data of Job Execution and Prediction data analyzed by Predictor are included.
• Gather: The MDS service of Globus can collect resource information such as CPU speed, number of CPUs, CPU load, memory size, available memory, disk space usage, and network interface information. The NWS tool is used to collect the current network bandwidth. Then, the Getter and Setter component stores that information in Message Center for future use.
• Predictor: This component has two functions. One is to periodically get native information from Message Center. By modular design, different types of native information are handled by different prediction models, and the predictions are stored in Message Center for future use. The other is to accept Requests from Agent to predict and return the required results, which increases system flexibility and supports more applications.

The goal of Monitoring Service is to acquire the information maintained by Information Service and present it in graphical form. The tasks and relations of its components are described as follows.
• Controller: It is the main component of Monitoring Service, and its primary task is to control the behavior of Monitoring Service, including Grid Node configuration and parameter setting. Controller periodically gets native information about Nodes from the Agent of Information Service. Then, it sends parameters and data to Drawer for illustration.
• Drawer: It receives parameters and data from Controller and draws the figures. Then, Displayer presents the figures. The drawing functions need to be flexible, so that appropriate figures are drawn according to the information types.
• Displayer: This component provides a query mechanism for users to observe historical data of Grid Nodes. Therefore, a Web interface must be provided for convenient use. This component will be integrated into the Portal so that users can query conveniently.

Figure 4. Architecture of Information Service and Monitoring Service

Ganglia is a scalable distributed monitoring system for monitoring the status of hosts in clusters or Grids. It provides a PHP Web front-end for administrators to view cluster or Grid status information in real time. The default information includes metrics such as processor load, memory usage, network (bytes input/output), and disk utilization. For all practical purposes, the administrator needs more flexible and variable operations than this Web front-end provides. For this purpose, this work developed a system that satisfies the above needs and is compatible with Ganglia. The main steps are listed in the following:

1. Dump the contents of an RRD file [17, 20] to XML format: The following shell script dumps the contents of each RRD file in the current directory to XML format.

    #!/bin/sh
    # Use a plain glob; backticks around *.rrd would try to execute it.
    for i in *.rrd
    do
        rrdtool dump "${i}" > "${i}.xml"
    done

2. Convert the XML output of an RRD file to JRobin RrdDb format: RrdDb is a class of JRobin, and it provides a constructor used to create a new RRD from an XML dump. The method is listed below.

    // The JRobin classes used in these listings come from
    // org.jrobin.core and org.jrobin.graph.
    public static void xml2JRrd(String name)
            throws IOException, RrdException {
        String xml = name + ".xml";
        String jrrd = name + ".jrrd";
        RrdDb db = new RrdDb(jrrd, xml);
        db.close();
    }

3. Render the graph of JRobin RrdDb by RrdGraphDef: The following code shows an example of rendering an image from a JRobin
RrdDb that contains processor load information. The output graph of CPU load is shown in Figure 5.

    public static void cpuReport(String rrd_dir, String hostname,
            long start_time, long end_time, String img_file)
            throws IOException, RrdException {

        String rd = rrd_dir; // rrd file dir

        /* start of RrdGraphDef */
        RrdGraphDef def = new RrdGraphDef();

        /* definition of graph */
        def.setMaxValue(100);
        def.setMinValue(0);
        def.setRigid(true);
        def.setVerticalLabel("Percent");
        def.setTimeSpan(start_time, end_time);
        def.setTitle(hostname + " CPU last " +
                getRange(end_time - start_time));
        def.setAntiAliasing(true);
        def.setFilename(img_file);
        def.setWidth(WIDTH);
        def.setHeight(HEIGHT);
        def.setLazy(LAZY);

        /* definition of datasource */
        def.datasource("cpu_user",
                rd + "/cpu_user.rrd.jrrd", "sum", "AVERAGE");
        def.datasource("cpu_nice",
                rd + "/cpu_nice.rrd.jrrd", "sum", "AVERAGE");
        def.datasource("cpu_system",
                rd + "/cpu_system.rrd.jrrd", "sum", "AVERAGE");
        def.datasource("cpu_wio",
                rd + "/cpu_wio.rrd.jrrd", "sum", "AVERAGE");
        def.datasource("cpu_idle",
                rd + "/cpu_idle.rrd.jrrd", "sum", "AVERAGE");

        def.area("cpu_user", CPU_USER, "User CPU");
        def.gprint("cpu_user", "AVERAGE", " avg: %6.2f\\l");
        def.stack("cpu_nice", CPU_NICE, "Nice CPU");
        def.gprint("cpu_nice", "AVERAGE", " avg: %6.2f\\l");
        def.stack("cpu_system", CPU_SYSTEM, "System CPU");
        def.gprint("cpu_system", "AVERAGE", "avg: %6.2f\\l");
        def.stack("cpu_wio", CPU_WIO, "Wait CPU");
        def.gprint("cpu_wio", "AVERAGE", " avg: %6.2f\\l");
        def.stack("cpu_idle", CPU_IDLE, "Idle CPU");
        def.gprint("cpu_idle", "AVERAGE", " avg: %6.2f\\l");
        def.comment("- " + new Date() + " - \\r");

        /* do render graph */
        RrdGraph rrdGraph = new RrdGraph(def);
    }

Figure 5. A CPU load visual graph of Node gamma2

Furthermore, this work developed a system that satisfies the above needs and is compatible with NWS for extracting network bandwidth. The main steps of rendering a network information graph with JRobin are listed in the following:

1. Create a JRobin RrdDb for a domain: Each JRobin RrdDb file is responsible for one domain, and each file is created by the constructor that builds a new RRD object from a definition.

    public void addRrd(String domainname, String[] links)
            throws IOException, RrdException {
        String jrrd = domainname + ".jrrd";
        String head = null;
        String tail = null;

        RrdDef def = new RrdDef(this.nwsrrds_root + "/" + jrrd);
        def.setStep(this.step);

        // one bandwidth (_b) and one latency (_l) datasource per link
        for (int i = 0; i < links.length; i++) {
            head = links[i].split(" ")[0];
            tail = links[i].split(" ")[1];
            def.addDatasource(head + "." + tail + "_b", "GAUGE",
                    600, 0.0, Double.NaN);
            def.addDatasource(head + "." + tail + "_l", "GAUGE",
                    600, 0.0, Double.NaN);
        }

        def.addArchive("MIN", 0.5, 1, 603);
        def.addArchive("MIN", 0.5, 6, 603);
        def.addArchive("MIN", 0.5, 24, 603);
        def.addArchive("MIN", 0.5, 288, 800);

        def.addArchive("AVERAGE", 0.5, 1, 603);
        def.addArchive("AVERAGE", 0.5, 6, 603);
        def.addArchive("AVERAGE", 0.5, 24, 603);
        def.addArchive("AVERAGE", 0.5, 288, 800);

        def.addArchive("MAX", 0.5, 1, 603);
        def.addArchive("MAX", 0.5, 6, 603);
        def.addArchive("MAX", 0.5, 24, 603);
        def.addArchive("MAX", 0.5, 288, 800);

        RrdDb db = new RrdDb(def);
        db.close(); // release the file handle once the RRD is created
    }

2. Query measurements from NWS and update the JRobin RrdDb file: The detailed code is listed as follows.

    public void updateDB(String[] members, NwsJRrd jrrd)
            throws IOException, RrdException {
        // links in a domain
        String[] links = this.getLinks(members);
        // valid members in a domain
        String[] mbrs = this.updateMembers(members);

        // query string
        String[] q = this.getFilename(mbrs);
        String[] result = null;
        String[] ss = null;
        long now = Util.getTime();
        long step = 0L;
        long stamp = 0L;
        double value = 0.0;
        Vector<Double> rec = new Vector<Double>();

        /* Query Bandwidth Info. from NWS */
        for (int i = 0; i < q.length; i++) {
            if (!q[i].contains("null")) {
                result = memory.retrieve(q[i], 1);
                // test for null before reading the length
                if (result != null && result.length != 0) {
                    ss = result[0].trim().split("\\s+");
                    stamp = Long.parseLong(ss[0]);
                    step = now - stamp;
                    // a measurement older than this.period counts as 0
                    value = (step < this.period)
                            ? Double.parseDouble(ss[1]) : 0.0;
                } else {
                    value = 0.0;
                }
            } else {
                value = 0.0;
            }
            rec.add(value);
        }

        /* Update RrdDB */
        RrdDb db = jrrd.getDb();
        Sample sample = db.createSample(now);
        int dscount = db.getRrdDef().getDsCount();

        for (int i = 0; i < dscount; i++) {
            sample.setValue(i, rec.get(i));
        }
        sample.update();
        db.close();
    }

3. Render the graph of JRobin RrdDb by RrdGraphDef: This is an example of rendering a graph from a JRobin RrdDb that contains network information about a domain. An example diagram of network bandwidth is shown in Figure 6.

    public static void netReport(String rrd_file, String domain,
            long start_time, long end_time, String img_file)
            throws IOException, RrdException {

        /* start of RrdGraphDef */
        RrdGraphDef def = new RrdGraphDef();

        /* definition of graph */
        def.setMinValue(0);
        def.setRigid(true);
        def.setVerticalLabel("(Mbit/second)");
        def.setTimeSpan(start_time, end_time);
        def.setTitle(domain.replace(".", "-") + " Network last "
                + getRange(end_time - start_time));
        def.setAntiAliasing(true);
        def.setFilename(img_file);
        def.setWidth(WIDTH);
        def.setHeight(HEIGHT);

        /* definition of datasource */
        RrdDb db = new RrdDb(rrd_file, true);

        // longest bandwidth datasource name, used to align the legend
        int maxLength = -1;
        for (int i = 0; i < db.getDsCount(); i++) {
            Datasource ds = db.getDatasource(i);
            String dsn = ds.getDsName();
            if (dsn.endsWith("_b")) {
                if (dsn.length() > maxLength) {
                    maxLength = dsn.length();
                }
            }
        }

        // one line plus min/avg/max legend per bandwidth datasource
        for (int i = 0; i < db.getDsCount(); i++) {
            Datasource ds = db.getDatasource(i);
            String dsn = ds.getDsName();
            if (dsn.endsWith("_b")) {
                String space = "";
                for (int j = 0; j < (maxLength - dsn.length()); j++) {
                    space += " ";
                }
                String sn = dsn.replace("_b", "").trim();
                def.datasource(sn, rrd_file, dsn, "AVERAGE");
                def.line(sn, getRandomColor(), sn.replace(".", "~"));
                def.gprint(sn, "MIN", space + "min: %6.2f");
                def.gprint(sn, "AVERAGE", "avg: %6.2f");
                def.gprint(sn, "MAX", "max: %6.2f\\l");
            }
        }
        db.close();
        def.comment("- " + new Date() + " -\\r");

        /* do rendering graph */
        RrdGraph rrdGraph = new RrdGraph(def);
    }
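The freshness rule applied in updateDB (a measurement counts only if its timestamp is recent enough, otherwise 0 is recorded) can be isolated and checked without NWS or JRobin. The following is a minimal sketch under stated assumptions, not the authors' implementation: the class and method names are hypothetical, each record is assumed to be a string of the form "timestamp value" with an integer timestamp in seconds, and the 600-second window is chosen only for illustration because the paper does not give the value of this.period.

```java
import java.util.ArrayList;
import java.util.List;

public class NwsSampleFilter {

    /**
     * Parses one record of the assumed form "<timestamp> <value>" and
     * returns the value, or 0.0 when the record is missing, malformed,
     * or at least maxAgeSeconds old.
     */
    public static double freshValue(String record, long nowSeconds,
                                    long maxAgeSeconds) {
        if (record == null) {
            return 0.0;
        }
        String[] fields = record.trim().split("\\s+");
        if (fields.length < 2) {
            return 0.0;
        }
        try {
            long stamp = Long.parseLong(fields[0]);
            double value = Double.parseDouble(fields[1]);
            return (nowSeconds - stamp < maxAgeSeconds) ? value : 0.0;
        } catch (NumberFormatException e) {
            return 0.0;
        }
    }

    /** Applies freshValue to every record, preserving order. */
    public static List<Double> freshValues(String[] records, long nowSeconds,
                                           long maxAgeSeconds) {
        List<Double> out = new ArrayList<Double>();
        for (String record : records) {
            out.add(freshValue(record, nowSeconds, maxAgeSeconds));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical records: one fresh, one stale, one missing, one malformed.
        String[] records = {"1000 94.12", "400 88.50", null, "garbage"};
        System.out.println(freshValues(records, 1030L, 600L));
        // prints [94.12, 0.0, 0.0, 0.0]
    }
}
```

Keeping the rule in a pure function like this makes the null, malformed, and stale cases easy to test separately from the database update.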
Figure 6. A monthly network graph of beta-domain

5. Experimental results

The previous work reduced the number of bandwidth measurements among Grid Nodes, but it lacks network information between two non-head Nodes located in different Sites. For example, the bandwidth between Nodes A2 and B3 is not measured in the previous model.

To address this need, this work enhances the previous model by adding a switching mechanism. We call the result the dynamic domain-based network information model, which is shown in Figure 7. The principal improvement is switching the head Node to the next free Node of a Site. For example, when Node A1 is busy, the head Node of Site A becomes the next free Node, A2, which conducts the bandwidth measurement between itself and Nodes B3, C2, and D4, provided they are the free Nodes of their respective Sites. There are three obvious advantages in using this model.
• First, the number of bandwidth measurements is still reduced, just as in the previous static model.
• Second, the bandwidth between two arbitrary Nodes in two different Sites can be obtained easily.
• Finally, the bandwidth measurements are real values instead of estimated values. This makes the Resource Broker useful for scheduling jobs that span multiple Sites.

Figure 7. The enhanced design of previous work

This subsection describes the experimental results of the network information model (NIM) and the dynamic network information model (DNIM). Two clusters, eta and beta, are used in this experiment. We transfer a 5 GB file from eta1 to beta1 between the 20th and 30th timestamps, and the bandwidth between eta2 and beta1 is observed every 60 seconds. Figure 8 shows that the bandwidth of the connection from eta2 to beta1 obtained by NIM is a smooth curve, which cannot reflect the actual situation, whereas DNIM captures the variation of the link. Figure 9 shows the unstable fluctuation of the error rate of NIM, which gives the broker an unreliable information reference and can cause it to make wrong decisions.

6. Conclusions

This paper is intended to help users make better use of the available grid resources. It examines the use of information services in a grid and discusses the use of the Ganglia toolkit for monitoring, to enhance the information services already present in the Globus environment. Our grid resource brokerage system discovers and evaluates grid resources, and makes informed job submission decisions by matching a job's requirements with an appropriate grid resource to meet budget and deadline requirements.
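The head-Node switching rule of the dynamic model in Section 5 (use the next free Node of a Site when the current head Node is busy) can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names are hypothetical, Nodes are identified by plain strings in Site order, and how a Node is judged "busy" is left outside the sketch because the paper does not specify it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class HeadNodeSelector {

    /**
     * Returns the first Node of a Site, in Site order, that is not busy.
     * The result is empty when every Node of the Site is busy.
     */
    public static Optional<String> pickHead(List<String> siteNodes,
                                            Set<String> busyNodes) {
        for (String node : siteNodes) {
            if (!busyNodes.contains(node)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Site A from the example in Section 5: A1 is busy, so A2 takes over.
        List<String> siteA = Arrays.asList("A1", "A2", "A3");
        Set<String> busy = new HashSet<String>(Arrays.asList("A1"));
        System.out.println(pickHead(siteA, busy)); // prints Optional[A2]
    }
}
```

Returning an Optional makes the all-busy case explicit, so a caller could fall back to the last known measurement or skip the Site for that round.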
[Figure 8 plot: "DNIM vs. NIM"; series eta2 -> beta1 and eta2 -> beta1 (NIM); y-axis Mb/s, x-axis Time (1 minute/unit)]

Figure 8. DNIM shows a better performance than NIM

[Figure 9 plot: "Error Rate of NIM (eta2 -> beta1)"; series Error Rate; y-axis %, x-axis Time (minute/unit)]

Figure 9. Error rate of NIM, reaching as high as 428.37%

References

[1] K. Czajkowski, S. Fitzgerald, I. Foster, and C. Kesselman, “Grid Information Services for Distributed Resource Sharing,” Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing, IEEE press, 2001.
[2] I. Foster and C. Kesselman, “The Grid 2: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 2nd edition, 2003.
[3] I. Foster, “The Grid: A New Infrastructure for 21st Century Science,” Physics Today, 2002, vol. 55, no. 2, pp. 42-47.
[4] I. Foster and N. Karonis, “A Grid-Enabled MPI: Message Passing in Heterogeneous Distributed Computing Systems,” Proceedings of 1998 Supercomputing Conference, 1998.
[5] I. Foster and C. Kesselman, “Globus: A Metacomputing Infrastructure Toolkit,” International Journal of Supercomputer Applications, 1997, vol. 11, no. 2, pp. 115-128.
[6] V. Laszewski, I. Foster, J. Gawor, and P. Lane, “A Java commodity grid kit,” Concurrency and Computation: Practice and Experience, 2001, vol. 13, pp. 645-662.
[7] H. Le, P. Coddington, and A.L. Wendelborn, “A Data-Aware Resource Broker for Data Grids,” IFIP International Conference on Network and Parallel Computing (NPC’2004), LNCS, 3222 Springer-Verlag, Oct. 2004.
[8] C.T. Yang, C.L. Lai, K.C. Li, C.H. Hsu, and W.C. Chu, “On Utilization of the Grid Computing Technology for Video Conversion and 3D Rendering,” Parallel and Distributed Processing and Applications: Third International Symposium, ISPA 2005, Lecture Notes in Computer Science, vol. 3758, pp. 442-453, Springer-Verlag, Nov. 2005.
[9] C.T. Yang, P.C. Shih, and K.C. Li, “A High-Performance Computational Resource Broker for Grid Computing Environments,” Proceedings of the International Conference on AINA’05, vol. 2, pp. 333-336, Taipei, Taiwan, March 2005.
[10] C.T. Yang, K.C. Li, W.C. Chiang, and P.C. Shih, “Design and Implementation of TIGER Grid: an Integrated Metropolitan-Scale Grid Environment,” Proceedings of the 6th IEEE International Conference on PDCAT’05, pp. 518-520, Dec. 2005.
[11] J. Nabrzyski, J.M. Schopf, and J. Weglarz, Grid Resource Management, Kluwer Academic Publishers, 2005.
[12] S.M. Park and J.H. Kim, “Chameleon: A Resource Scheduler in a Data Grid Environment,” Proceedings of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 258-265, May 2003.
[13] C.T. Yang, C.L. Lai, P.C. Shih, and K.C. Li, “A Resource Broker for Computing Nodes Selection in Grid Environments,” Grid and Cooperative Computing - GCC 2004: 3rd International Conference, Lecture Notes in Computer Science, Springer-Verlag, vol. 3251, pp. 931-934, Oct. 2004.
[14] C.T. Yang, P.C. Shih, S.Y. Chen, and W.C. Shih, “An Efficient Network Information Modeling using NWS for Grid Computing Environments,” Grid and Cooperative Computing - GCC 2005: 4th International Conference, Lecture Notes in Computer Science, vol. 3795, pp. 287-299, Springer-Verlag, Nov. 2005.
[15] C.T. Yang, C.F. Lin, and S.Y. Chen, “A Workflow-based Computational Resource Broker with Information Monitoring in Grids,” Proceedings of the 5th International Conference on Grid and Cooperative Computing (GCC 2006), IEEE CS Press, pp. 199-206, China, Oct. 2006.
[16] Ganglia, http://ganglia.sourceforge.net/
[17] JRobin, http://www.jrobin.org/
[18] Network Weather Service, http://nws.cs.ucsb.edu/ewiki/
[19] TIGER, http://gamma2.hpc.csie.thu.edu.tw/ganglia/
[20] Tomcat, http://tomcat.apache.org/
[21] UniGrid, http://140.114.91.31/ganglia/