SG 246650
Grid Computing
Products and Services
Grid computing functionality
Luis Ferreira
Mariano Batista
Sebastien Fibra
Chin Yau Lee
Carlos Alexandre Queiroz Silva
Joao Almeida
Fabiano Lucchese
Nam Keung
ibm.com/redbooks
International Technical Support Organization
August 2005
SG24-6650-00
Note: Before using this information and the product it supports, read the information in
“Notices” on page xiii.
This edition applies to the capabilities of the IBM, ISV, and open source products used to build a
grid computing solution.
Figures
Tables
Notices
Trademarks
Preface
The team that wrote this redbook
Become a published author
Comments welcome
Part 1. Introduction
Chapter 1. Introduction
1.1 Introduction
1.1.1 Grid computing as an evolution of distributed computing
1.1.2 Grid computing models
1.1.3 Grid and its relationship with other IT components
1.2 Categories and functions
1.2.1 Categorization of components
1.2.2 Functionality on grid computing
1.2.3 Grid computing products and players
8.3.4 Engineering Design
8.3.5 Geophysical Processing and Analysis
8.3.6 Information Accessibility
8.3.7 IT Optimization
8.3.8 Research Collaboration
8.3.9 Grid Clash Analysis
8.3.10 Summary
8.4 Services approach to Grid Projects
Glossary
Index
Figures
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area.
Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product, program, or service that
does not infringe any IBM intellectual property right may be used instead. However, it is the user's
responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such provisions
are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES
THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm
the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on
the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the
sample programs are written. These examples have not been thoroughly tested under all conditions. IBM,
therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy,
modify, and distribute these sample programs in any form without payment to IBM for the purposes of
developing, using, marketing, or distributing application programs conforming to IBM's application
programming interfaces.
EJB, Java, JavaBeans, Jini, JDBC, JSP, J2EE, Solaris, Sun, Sun Microsystems, and all Java-based
trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United
States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, and service names may be trademarks or service marks of others.
Mariano Batista is a Certified IT Architect. He has been working for IBM for
more than 11 years. He is usually called to do pre-sales activities in multi-brand
projects that have strong integration and complex technologies. Mariano got his
certification as ITA in Middleware and Application Services in December 2002
and was invited to be part of the ITA Certification Board at the same time. In the
last six years, he has worked for Sales and Distribution as a Solution IT Architect
doing pre-sales activities for multi-industry clients, mainly in Argentina, with
coverage on SSA (Spanish South America) and Latin America. He has been part
of the on demand pre-sales IT Architects team since 2003 and the e-business
Solutions Technical Sales unit since 2001 in the Americas. His current
responsibilities include a broad range of IT solutions and technologies; grid
computing is one of them. Mariano represents IBM in internal and external
conferences, events, and articles published in newspapers and magazines in
Argentina related to grid computing. Before becoming a formal IT architect,
Mariano was part of the Industry Systems Business Unit for Financial Services
Sector in Argentina. He was responsible for important engagements in
e-business solutions for Financial Services Sector industries. He was part of the
team that implemented the first Interactive Financial Services project in Latin
America, and he was the IT Architect and technical leader for the second phase
of that project. He started at IBM in 1993 as a C/C++ developer in the Advanced
Solutions Unit of IBM Argentina. He also provided AIX® and system
management support for that unit. Mariano received his Bachelor of Computer
Sciences degree with honors at the Universidad del Salvador in Argentina in
1994.
Chin Yau Lee works as an Advisory Technical Specialist in grid computing for
IBM ASEAN/South Asia. He holds an Honours Degree in Computing and
Information System from the University of Staffordshire. He has been using Linux
since 1996 and had a few years of experience as a UNIX and Linux engineer
before joining IBM. His areas of expertise include High Performance Linux and
UNIX, UNIX Systems Administration, High Availability solutions, Internet based
solutions, and grid computing architectures, which he has been actively working
on for the last four years. He is also an IBM Certified Advanced Technical Expert
on AIX, Sun™ Certified System/Network Administrator and Red Hat Certified
Engineer. He co-authored the redbook Deploying Linux on IBM eServer pSeries
Clusters, SG24-7014.
Nam Keung is a senior technical consultant for IBM in Austin, Texas. He has
worked in the area of AIX ISDN communications, AIX SOM/DSOM development,
AIX multimedia development, NT clustering technology, and Java™ performance.
His current assignment involves helping IBM business partners and solution
providers in their efforts to port and deploy applications to the pSeries platform.
He also consults in performance tuning and other educational needs for the
pSeries platform.
Acknowledgements
Thanks to the following people for their contributions:
Joanne Luedtke, Lupe Brown, Cheryl Pecchia, Arzu Gucer, Chris Blatchley,
Wade Wallace, Ella Buslovich
International Technical Support Organization, IBM
Tony White
Worldwide Grid Computing Technical Sales Business Unit Executive, IBM
Ronald Watkins
Worldwide Grid Computing Business Development Executive, Public Sector, IBM
Matthew P Haynos
Program Director, Grid Computing Strategy, IBM Somers
Matthew Friedman
Worldwide Grid Marketing Executive
David Chisholm
Manager Grid/VE, Linux, and Digital Media, IBM Americas
Stephen Gordon
Industry Solutions Executive, IBM Beaverton
Al Hamid
Executive IT Architect and STSM, Grid/OSS Worldwide Leader, BCS
Peter Nielsen
Linux & Grid EBO, IBM Global Services
Chris Reech, Viktors Berstis, Richard Strysniewicz, Jeff Mausolf, Dennis Spexet
IBM Global Services / e-Technology Center, Grid Computing Initiative
Nina Wilner
Grid Computing - IT Technical Architect LifeSciences, IBM Austin
Bill Bovermann
World Community Grid™ Executive, BCS Seattle
Susan Malaika
STSM (Web, XML, Grid, DB2®, Data), IBM Silicon Valley Laboratory
Allen Luniewski
Information Management On Demand Strategy, IBM Almaden
Chris Dawson
IT Architect, Grid Community of Practice Lead, IBM Bethesda
Alexei Chirokikh
Consulting IT Architect, IBM Rochester
Robert Vrablik
Grid Computing Strategy and Planning, IBM Poughkeepsie
Barbara DeLibero
Worldwide Sales Executive - Grid, IBM Waltham
Judith Adolino
WW Linux and Grid Services EBO, IBM Global Services Boca Raton
Joe Zhou
Grid Computing, XML and Relational conversion, IBM Southbury
Dikran S Meliksetian
STSM - High Performance On Demand Solutions, IBM Southbury
Takanori Seki
Distinguished Engineer - Grid Computing Business, Asia Pacific, IBM Japan
Tomoari Yasuda
Grid Specialist, IBM Japan
Yann Guerin
EMEA Grid Computing TSM, IBM France
Herve Sabrie
Manager, EMEA Design Center & Solution Centers, IBM France
Ruth Harada
Alliances Manager, IBM Brazil
Katia Pessanha
Universities Alliances Manager, IBM Brazil
Srikrishnan Sundararajan
IBM India Software Labs
Clive Harris
Senior Architect, IBM UK
Cliff Vars
Information Integration Technology Solutions, IBM Austin
Ellen Stokes
STSM, Grid Computing, IBM Austin
Pete Henry
Client Executive, ISV & Developer Relations IBM
Elton Minetto
Universidade Comunitária Regional de Chapecó, Brazil
Karen Miller
University of Wisconsin - Madison
Michael Humphrey
Altair Engineering, Inc.
Your efforts will help increase product acceptance and client satisfaction. As a
bonus, you'll develop a network of contacts in IBM development labs, and
increase your productivity and marketability.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
Part 1. Introduction
Chapter 1. Introduction
This chapter discusses the following:
Concepts of grid computing
Terminology
Classification of the grid disciplines that will be the basis of the analysis of the
products and scenarios
Much has happened since the initial concept and its first academic
implementations. Over the following years, grid was used in laboratories and
gained maturity as the concept became a model. That model started a new way
of thinking in the IT community and helped meet the requirements of end users and
of the consumers of the IT industry. Terms like server grid, desktop grid,
data/information grid, and others appeared.
The grid paradigm inspired many technical people beyond the places where the
concept was born. It was integrated into products of both emerging and well
established IT enterprises. The paradigm opened the way to real pieces of
technology that could be used for commercial purposes.
With grid computing, we can observe a very interesting evolution: What started in
papers and in projects led by universities became a conceptual architecture that
can be used for commercial purposes to enrich a user’s experience and to help
meet the requirements of the enterprises that consume IT.
...over a network...
The network is the physical structure that interconnects the distributed
resources. A network generally only provides the pipes to connect some
component with other components. However, in order to create comprehensive
and interesting communications, it is necessary to adopt some protocols and
open standards.
...with secure access...
The implementation of virtual dynamic organizations requires secure
mechanisms to ensure protection of all shared elements. If a user requires CPU
time to run a certain program, it should not jeopardize any grid component.
The integrity of the information is needed to ensure that the information that is
transferred in the network is not modified after delivery. Security has to enforce
privacy as well, to make sure that only those who have the right to see the
information will be allowed to.
Grids must also implement policies to satisfy the requirements of the consumers
and to use the capabilities of the donors without exceeding the levels that they
stated. Coordination requires the implementation of policies, the registration of
the capabilities of the donors, some metering abilities, and some intelligence to
optimize the usage of the whole grid. It should plan for potential cases, such as a
machine turned off during non-working hours or days. If the grid has premium
users, it has to give a higher priority to their requests.
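The coordination duties described above (registering donor capabilities, metering usage, respecting donor limits and availability, and prioritizing premium users) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Donor` and `Coordinator` names, the idea of expressing a donor limit as a CPU share, and the first-fit placement rule do not come from any product discussed in this redbook.

```python
from dataclasses import dataclass
import heapq
import itertools


@dataclass
class Donor:
    """A registered resource donor (hypothetical model)."""
    name: str
    max_cpu_share: float         # limit stated by the donor, 0.0 to 1.0
    available_hours: range       # hours of the day when the machine is on
    used_cpu_share: float = 0.0  # metered usage so far

    def can_accept(self, cpu_share: float, hour: int) -> bool:
        within_hours = hour in self.available_hours
        within_limit = self.used_cpu_share + cpu_share <= self.max_cpu_share + 1e-9
        return within_hours and within_limit


class Coordinator:
    """Queues requests and places them on donors without exceeding their limits."""

    def __init__(self, donors):
        self.donors = list(donors)
        self._queue = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per priority

    def submit(self, user, cpu_share, premium=False):
        priority = 0 if premium else 1   # premium requests are served first
        heapq.heappush(self._queue, (priority, next(self._order), user, cpu_share))

    def dispatch(self, hour):
        """Place queued requests; return (user, donor name) pairs for those placed."""
        placed, waiting = [], []
        while self._queue:
            item = heapq.heappop(self._queue)
            _, _, user, cpu_share = item
            donor = next((d for d in self.donors if d.can_accept(cpu_share, hour)), None)
            if donor is None:
                waiting.append(item)     # keep the request for a later attempt
                continue
            donor.used_cpu_share += cpu_share  # metering
            placed.append((user, donor.name))
        for item in waiting:
            heapq.heappush(self._queue, item)
        return placed
```

In this sketch a request that cannot be placed, for example because the only free desktop is switched off at night, simply stays in the queue until a later call to `dispatch`.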
There are also some fundamental grid models based on the type of basic
services provided. Resources can basically be computing power, provided by
servers or individual computers, data storage capacity, provided by information
and data repositories, or network bandwidth, provided by networked
infrastructures. Of course, a grid can be conceived to handle several types of
resources. It is also possible to implement higher-level, value-added services based on
one or more of these models. These grids expand the principles of the basic
models, leveraging and combining them in innovative ways to provide more
advanced services.
Computational grid
A computational grid is an infrastructure that allows resources to donate
computing power to the grid whenever the workload demands. This infrastructure
is suitable for applications that demand:
As much processing power as possible, or additional processing power during
certain periods of time, when a single machine cannot provide it or can provide
it only at too high a cost.
It is usually associated with resource scavenging on desktop machines and
underused servers.
Such a grid takes advantage of the idle resources in the virtual organization. Just
think about the cycles of CPU that are unused when a typical user browses on
the Internet, reads an e-mail, or creates office documents like presentations or
word processing files. Additionally, such a grid would allow the infrastructure and
the client application to be reused for more than one project at the same time.
job to the grid, it would use idle resources to fulfill the job, and it would return
results faster than if the user had used only his own computer.
In this kind of grid, it is more complex to guarantee the time needed to get a
result. For this reason, it is primarily intended for applications that can afford to
deliver results with no critical time commitment. Although it is called a desktop
grid, some distinct coordination points, typically deployed on server machines,
play an important role in this solution; scheduling, directory services, dispatching,
disassembly, and assembly mechanisms should be provided. Another important
characteristic of this grid, one that its adopters are likely to appreciate, is the
provision of a toolkit or framework to create applications.
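The disassembly, dispatching, and assembly mechanisms mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a thread pool stands in for the desktops of the grid, the job (a sum of squares) is a placeholder workload, and the function names are hypothetical rather than part of any desktop grid toolkit.

```python
from concurrent.futures import ThreadPoolExecutor


def disassemble(numbers, unit_size):
    """Split the job input into independent work units."""
    return [numbers[i:i + unit_size] for i in range(0, len(numbers), unit_size)]


def run_work_unit(unit):
    """Work done on one (simulated) desktop: sum of squares of its unit."""
    return sum(n * n for n in unit)


def assemble(partials):
    """Combine the partial results into the final job result."""
    return sum(partials)


def run_job(numbers, unit_size=1000, workers=4):
    units = disassemble(numbers, unit_size)
    # The thread pool plays the role of the dispatcher and the idle desktops.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(run_work_unit, units))
    return assemble(partials)
```

Because the work units are independent, the order in which the simulated desktops finish does not affect the assembled result, which is the property that makes this kind of application suitable for a desktop grid.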
Information Infrastructure
Information is usually defined as "meaningful data". Meaningful data is often
associated with the unit of information that means something useful to the end
user. The vision of different repositories of information as though they were a
single one illustrates the idea of information integration. An information grid is
intended to integrate different sources of information in a comprehensive way. It
allows applications and users to see a single database that hides the complexity
of accessing multiple databases. The information spread across them is
published by the grid as though it were really centralized.
This grid should implement connectors to the final databases, which surely are
heterogeneous and geographically distributed. It is desirable as well to have
replication and caching mechanisms that make the management and usage of
the IT infrastructure more efficient.
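As an illustration of the connector idea, the following Python sketch presents two independent databases with different schemas as a single logical customer list. It assumes in-memory SQLite databases as stand-ins for the real, geographically distributed repositories; the `InformationGrid` class, the table names, and the column names are all hypothetical.

```python
import sqlite3


def make_source(rows, table, columns):
    """Create an in-memory database that stands in for one remote repository."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return conn


class InformationGrid:
    """Publishes several repositories as though they were one."""

    def __init__(self):
        self._connectors = []  # (connection, query returning (name, country))

    def register(self, conn, query):
        # Each connector hides the schema of its repository behind a query
        # that returns rows in the common (name, country) shape.
        self._connectors.append((conn, query))

    def customers(self):
        result = []
        for conn, query in self._connectors:
            result.extend(conn.execute(query).fetchall())
        return sorted(result)
```

A consumer calls `customers()` and never learns how many repositories exist or how each one names its columns. Replication and caching, which the text also mentions, are left out of this sketch.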
Data is not always meaningful to the end user. However, it is the unit of work of
operating systems and hardware devices. Data in this context means blocks of
data or files.
Network grid
In a typical corporate network, computers are very often permanently connected
while using only a portion of the available bandwidth. Every machine, server or
desktop, has underused network bandwidth, which can be considered an
idle resource. When a given user or machine requires more resources from the
network, a bottleneck is reached.
Examples
In a file transfer using FTP, the server allows no more than two or three
simultaneous connections from the same requester. The traffic may be limited by
the server itself just to avoid the situation where a given user gets a great portion
of the server’s bandwidth. This is necessary because there could be other
requesters asking the same server for its resources. The server has to
implement some way to ensure balanced competition among its requesters.
This is fair from the server point of view, but from the requester standpoint, things
may be seen a bit differently. If the desktop tries to download a file from an FTP
server that limits the bandwidth just in case there are other requesters competing
for the same resource, some idle network resources will appear on the requester
side. Imagine a desktop connected through a 100 Mbps connection to a server
that perhaps provides one or more 1 Gbps connections. If the limit is set
to 1 Mbps per connection, the desktop will theoretically have 99 Mbps free, ready
to use. The server has to leave some free network resources to satisfy other
users’ requirements. But the desktop could use its free resources. We know that
there could be additional factors to consider, like WAN bandwidth limitations,
routers, and so on. But, just to keep it simple, we are showing this example in a
LAN environment, or in an environment without other physical restrictions than
the bandwidth of the nodes.
In this example, the desktop could connect to other servers in the same
infrastructure to simultaneously download another portion of the file. This way,
the first server would not modify its policies to attend users’ requests, the second
server could provide more data to the desktop, and the desktop would more
effectively use its own network interface.
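The arithmetic of this example and the idea of fetching different portions of a file from several servers at once can be sketched as follows. This is a simulation under stated assumptions: each replica is an in-memory byte string rather than a real FTP server, `fetch_range` is a hypothetical stand-in for a ranged transfer, and threads model the simultaneous connections.

```python
from concurrent.futures import ThreadPoolExecutor


def idle_bandwidth_mbps(link_mbps, per_connection_cap_mbps, connections=1):
    """Bandwidth left unused on the requester side, ignoring other limits."""
    return max(link_mbps - per_connection_cap_mbps * connections, 0)


def split_ranges(size, parts):
    """Divide [0, size) into contiguous byte ranges, one per server."""
    base, extra = divmod(size, parts)
    ranges, start = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def fetch_range(replica, start, end):
    """Stand-in for a ranged download from one server holding a full copy."""
    return replica[start:end]


def download(replicas):
    """Fetch one portion of the file from each replica in parallel and reassemble it."""
    size = len(replicas[0])
    jobs = [(rep, s, e) for rep, (s, e) in zip(replicas, split_ranges(size, len(replicas)))]
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        chunks = pool.map(lambda args: fetch_range(*args), jobs)
        return b"".join(chunks)  # map() yields results in submission order
```

With the figures from the text, `idle_bandwidth_mbps(100, 1)` gives the 99 Mbps that the desktop leaves unused when it talks to a single capped server.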
From the server side, this requires the duplication of the pool of files on each
server. Perhaps the servers are geographically distributed, in a WAN
environment. In this case, it is obvious that storage is cheaper than network.
Therefore, it is possible to have the files downloaded from several servers.
This type of grid is sometimes called a “distributed peer to peer” grid, network grid,
communication grid, or “grid delivery”. We can see some resemblance to the well
known peer-to-peer (P2P) programs. However, a network grid has some more
efficient mechanisms to provide services. For example, it can provide a
centralized directory of files, or the directory itself can be part of the client side.
IBM has implemented an internal network grid called IBM downloadGrid with
services similar to the one described above, which allowed IBM to reduce the
transatlantic network traffic.
Multipurpose grid
The multipurpose grid will perhaps be the most common implementation in the future
of grid computing. The infrastructure of this grid should be adaptive enough to
provide any of the grid models. It could be implemented as well as a meta-grid
with abilities to route the requests to the grid that supports the right model to fulfill
them.
A deep computer infrastructure can be part of a grid, and a grid can be the
mechanism to provide deep computing resources. Grid provides something that
a typical deep computing cluster does not: access to heterogeneous resources.
integrable and must have autonomic capabilities. Grid computing leverages on
demand because it is a fundamental component to achieve the highest degree of
virtualization, one of the key factors of the on demand operating environment.
Grid is also leveraged when its implementation complies with open standards; it
is integrated into the business processes of the organization, and its components
have autonomic abilities.
Applications
Content Management
Billing and Metering
Orchestration and Provisioning
Task Scheduling
Workload Management
Systems Management
Software licensing
Job Scheduling
Structured Data Virtualization
File and Block Data Virtualization
Grid Middleware
Application
This layer describes an environment for developed applications to take
advantage of the whole set of features provided by the grid. Applications can
make use of all layers in the stack, through classes, APIs, frameworks, toolkits, or
Software Development Kits (SDKs).
The application layer is also associated with the applications developed for a
specific industry.
Content management
Content management provides another degree of virtualization to data and
information within a grid. It is related to the handling of digital media, like video,
audio, images, or streams. These are not structured data, and need to be treated
in a different way. It usually requires specific features, such as searching an
image database for the picture most similar to the one given as input.
Task scheduling
Task scheduling provides an environment to run small pieces of execution that,
combined, create the unit of work that the requester needs to execute. The
decomposition of this unit of work into smaller pieces can consist of parallel,
short running tasks or of tasks that have to be run in a specific order within an
automated workflow.
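The two cases described above, parallel tasks and tasks that must run in a given order, can both be expressed with a dependency graph. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) to group tasks into batches: tasks within a batch may run in parallel, and batches run one after another. The function name and the sample task names are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies):
    """Group tasks into batches that respect the workflow order.

    dependencies maps each task to the set of tasks it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()                            # raises CycleError on circular workflows
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())      # tasks whose predecessors are all done
        batches.append(ready)
        sorter.done(*ready)                     # mark the batch as finished
    return batches
```

For a workflow in which `merge` waits for two rendering tasks that both wait for `split`, the function returns three batches, and the middle batch holds the two tasks that can be dispatched simultaneously.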
Workload management
Workload management is a mechanism designed to balance the workload
among different resources. It is defined during the setup and used at run time.
The workload management can be dynamic depending on the real time workload
of the involved components and the policies applied to them. This layer may be
associated with a Service Level Agreement (SLA). In fact, if an SLA exists, the
workload management layer should be aware of it in order to provide the right
power to the right resource at the right time.
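A very small sketch of dynamic workload balancing is shown below: each work item goes to the resource that currently carries the least load. It is an illustration only. The cost figures are assumed to be known in advance, the `balance` function is hypothetical, and SLA awareness, which the text calls for, is not modeled.

```python
import heapq


def balance(resources, work_items):
    """Assign each work item to the currently least-loaded resource.

    resources maps a resource name to its current load;
    work_items is a list of (item name, estimated cost) pairs.
    """
    heap = [(load, name) for name, load in resources.items()]
    heapq.heapify(heap)
    assignment = {}
    for item, cost in work_items:
        load, name = heapq.heappop(heap)             # least-loaded resource right now
        assignment[item] = name
        heapq.heappush(heap, (load + cost, name))    # account for the new work
    return assignment
```

An SLA-aware variant could, for instance, weight the load of each resource by the service level promised to the requester before choosing.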
Systems management
Several of the activities behind the scenes are related to management and
administration. Almost all layers in the grid stack have a strong relationship with
management. The middleware has to be configured and enabled. Security has to
be set up and security logs have to be sent to the system management. Block,
file, and structured data virtualization have to be configured and adapted to
optimize their utilization. License and workload management are obvious
management activities. Provisioning, orchestration, billing, metering, and the rest
of the layers in the grid stack have to be configured and managed as well. The
Tivoli Suite is an example of management infrastructure. It is intended for a
traditional IT environment, but can clearly be integrated into a grid.
Software licensing
License management is a technology that allows you to control the licenses that
are in use for a certain period of time. It provides elements to determine which
machine is the most suitable candidate for a task, based on the software installed
on it.
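The following sketch illustrates the two roles described above: tracking how many licenses are in use and choosing a machine based on the software installed on it. The `Machine` and `LicensePool` classes and the first-match selection rule are assumptions made for this example, not the behavior of any specific license manager.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    installed: frozenset  # products installed on this machine


class LicensePool:
    """Counts the free license seats per product."""

    def __init__(self, seats):
        self._free = dict(seats)

    def checkout(self, product):
        if self._free.get(product, 0) <= 0:
            return False
        self._free[product] -= 1
        return True

    def release(self, product):
        self._free[product] = self._free.get(product, 0) + 1


def pick_machine(machines, required, pool):
    """Return the first machine that has the required software, if licenses are free."""
    for machine in machines:
        if required <= machine.installed:       # all required products are installed
            taken = []
            for product in sorted(required):
                if pool.checkout(product):
                    taken.append(product)
                else:
                    for p in taken:             # roll back a partial checkout
                        pool.release(p)
                    return None
            return machine
    return None
```

When the task finishes, the caller is expected to release each product so that the seats become available to other tasks.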
Job scheduling
Job scheduling is used to optimize the execution of jobs on a grid. In this
context, a job can be understood as the computational work needed to deliver a
meaningful result within the context of an application. Its execution is governed
by policies that rule the basic scheduling and prioritizing of jobs to be run.
Structured data virtualization
This is a higher level of data virtualization. A grid can create a unified vision of
the different repositories of information. The consumers of this service should
see a single database that in fact is composed of aggregations and associations
of other databases.
File and block data virtualization layer
File and block data virtualization mechanisms are related to the lowest layers of
data services provided by the grid. They can live at the hardware level,
associated with blocks of data managed by storage devices.
In the block data virtualization case, the actor is the hardware itself and its
drivers. The components create the unified vision at the operating system level,
which, in turn, will create a file system on top of the block data grid. File data
virtualization is provided by file system implementations.
Grid middleware
The grid middleware provides the integration among the grid components. It is
the keystone software that provides basic services to grid components. These
basic services are integration mechanisms based on standards, description, and
creation of services. The middleware also includes the elements that allow the
grid to interact with its own components.
[Figure: chart of grid disciplines: Web Services, Core/Hosting, Workload Virtualization, Information Virtualization, Other Disciplines]
As an example, you will find in Figure 1-3 on page 18 the product chart for the
WebSphere Information Integrator, which illustrates our understanding of the
support it provides for the functions specified in this redbook. The chart is to be
read as follows:
Information virtualization
3: The product was developed and designed to completely cover the
discipline. WebSphere Information Integrator is data federation software
that provides structured access to heterogeneous types of data.
[Figure: "Functionality on Grid Computing" chart: Web Services, Core/Hosting, Workload Virtualization, Information Virtualization, Other Disciplines]
Workload virtualization
Scheduling, the main matter of workload virtualization, is the ability to execute a
unit of work at the most favorable moment under certain policies. Scheduling can
be defined at a single machine level (within a grid) and at grid level (over several
computation pools); this is what meta-scheduling is about. The unit of work and
terminology are defined according to the type of application environment. It is
normally possible to schedule jobs, tasks, or services. Just to simplify matters,
we can generically consider it as an “activity”.
A scheduler’s main role is to ensure that some activity will be carried out over a
certain period of time. The scheduler has to consider the workload of the
resources that are candidates to run some activity. It would not make sense to
submit a substantial job to a machine that is already heavily loaded, while other
machines just sit in the grid with idle resources. If the machine accepts that
workload, it should delay the execution of its current and new activities with a
lower priority.
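The two scheduling levels described in this section can be sketched as a pair of functions: a local choice inside one computation pool and a grid-level (meta-scheduling) choice across pools. The load figures, the 0.8 threshold, and the rule of preferring the pool with the lowest average load are assumptions chosen for illustration.

```python
def least_loaded(machines):
    """Local scheduling: pick the least-loaded machine (name -> load from 0.0 to 1.0)."""
    return min(machines, key=machines.get)


def meta_schedule(pools, max_load=0.8):
    """Grid-level scheduling: pick a pool, then a machine inside it.

    Machines at or above max_load are skipped so that an activity is not
    sent to a heavily loaded machine while others sit idle.
    Returns (pool name, machine name), or None if nothing is suitable.
    """
    best = None
    for pool_name, machines in pools.items():
        candidates = {m: load for m, load in machines.items() if load < max_load}
        if not candidates:
            continue
        average = sum(machines.values()) / len(machines)
        if best is None or average < best[0]:
            best = (average, pool_name, least_loaded(candidates))
    return None if best is None else (best[1], best[2])
```

Returning `None` lets the caller keep the activity queued until a machine frees up, rather than overloading one.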
Information virtualization
Information virtualization encompasses the different types of data infrastructure
for grid and the different types of products available to implement them, in order
to enable the unified vision of different repositories of data. The virtualization can
be done at the block data, file, or information levels.
The block data virtualization provides the unified access directly from hardware
or from a driver of a given hardware that runs on a given operating system. The
consumer of this data is usually the operating system itself, and the unit of work
is a raw block of data in a given storage device. It allows operating systems to
see physical resources that are not in the same device as a single logical volume
or file system. Then the operating system can emulate a regular file system on
top of the services provided by the lower layers. These components create a
logical low level structure on top of the physical structure of the storage and
publish it to the operating systems.
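To make the block level concrete, the sketch below concatenates several storage devices into one logical volume and translates a logical block number into a device and a physical block. It is a simplified model under stated assumptions: devices are plain Python lists of blocks, the mapping is a simple concatenation, and the `LogicalVolume` class is hypothetical.

```python
class LogicalVolume:
    """Presents several devices as one contiguous range of blocks."""

    def __init__(self, devices):
        self._devices = devices  # list of devices, each a list of blocks (bytes)

    def __len__(self):
        return sum(len(device) for device in self._devices)

    def locate(self, logical_block):
        """Translate a logical block number into (device index, physical block)."""
        if logical_block < 0:
            raise IndexError("logical block out of range")
        offset = logical_block
        for index, device in enumerate(self._devices):
            if offset < len(device):
                return index, offset
            offset -= len(device)
        raise IndexError("logical block out of range")

    def read(self, logical_block):
        device, block = self.locate(logical_block)
        return self._devices[device][block]

    def write(self, logical_block, data):
        device, block = self.locate(logical_block)
        self._devices[device][block] = data
```

An operating system sitting on top of such a volume only sees block numbers from zero to `len(volume) - 1` and can build a regular file system on them, unaware that the blocks live on different devices.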
The file virtualization level is associated with file systems. In this case, the unit of
work is a file. A file can be meaningful by itself, like a spreadsheet or a word
processing document; however, applications usually need a combination of files
to map the data into meaningful information.
The information layer is the highest level of the information virtualization
discipline. It is also known as the structured data layer and it provides the ability
to integrate information from heterogeneous resources with heterogeneous
formats by accessing them using heterogeneous protocols and publishing them
to the applications in a homogeneous way, thereby enabling the applications, or
consumers of this information, to have a unified vision of the information
repository.
Other disciplines
Examples of other disciplines are security, provisioning, billing, and metering.
Explanations about these disciplines can be found in Chapter 5, “Other
disciplines” on page 85.
The products presented in this redbook are the main products suitable for a grid
computing environment to date. We will not discuss the displayed products as
grid products, or grid-enabled products, but as products with useful and strong
capabilities that are qualified to be part of a grid computing solution. Some of
these products are IBM products, but others are developed by
Independent Software Vendors (ISVs) or open source communities. These ISVs,
also presented in Chapter 7, “ISV suites” on page 113, can be integrated through
IBM professional services as well.
IBM has established key business relationships with leading middleware ISVs to
provide clients with the most robust grid solutions in the industry. Each plays an
important role in IBM Grid Computing solutions. More information can be found
at the following Web site:
http://www.ibm.com/grid/grid_partners/index.shtml
Part 2. Functionality and products
The Grid Standards and Toolkits section focuses on low-to-middle level software
toolkits that support the building of grids, but are not meant to be usable right “out
of the box”. Their purpose is to make grid oriented programming easier by
providing developers with APIs they can call from within their own code, in order
to implement grid features.
The Grid Application Environments section aims at describing what products are
eligible for grid application deployment, according to grid standards, such as
Open Grid Services Architecture (OGSA).
To better understand how OGSA and WSRF combine to become a grid standard,
one has to consider what steps are required to design a technology:
1. Defining the technology: What is the technology all about? What need is it
expected to fulfill? What are its limitations? What is needed to build this
technology?
2. Specifying the technology: Specifications for the components, definition of
properties and attributes for the objects, and detailed description of the
communication protocols and interfaces involved.
3. Implementation: A reference implementation to support the specifications and
to demonstrate both the feasibility and the relevance of the technology.
As far as grid technologies are concerned, OGSA addresses the first issue
(definition), WSRF addresses the second one (specifications), and the Globus
Toolkit 4.0, for example, is a reference implementation of WSRF.
The best way to look at OGSA is to consider it as a layered model, just like the
OSI reference model (see Figure 2-1 on page 26).
[Figure 2-1. OGSA-enabled services shown: Security, Workflow, Database, File Systems, Directory, Messaging]
Note: WSRF aims to address the concerns that the Grid community has
expressed about the former OGSA specification, namely OGSI. Its main purpose
is to achieve the convergence of Grid services and Web services. It was
announced by the Globus Alliance and IBM, in conjunction with HP, in January
2004. In March 2004, the WSRF specifications were submitted to OASIS.
More to the point, WSRF introduces both a design pattern to specify how to use
Web services to access “stateful” components, and a message-based
publish-subscribe mechanism for Web services.
From a Grid perspective, this means all components of a Grid architecture can
be described via WSRF as stateful resources accessible through one or more
Web services, whether they are physical entities (processor, disk drive,
communication link, and so on) or logical constructs (authorization, job,
subscription, and so on), real or virtual, static or dynamic (through lifetime
management).
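The idea of a stateful resource reached through a stateless service can be sketched as follows. This is only an illustration of the pattern under our own assumptions: the ResourceHome class, its method names, and the use of a generated key are hypothetical and do not reproduce the WSRF specifications or any Globus Toolkit API.

```python
import time
import uuid


class ResourceHome:
    """Holds stateful resources; the service itself keeps no per-client state."""

    def __init__(self):
        self._resources = {}

    def create(self, properties, lifetime_seconds):
        key = str(uuid.uuid4())
        self._resources[key] = {
            "properties": dict(properties),
            "expires_at": time.time() + lifetime_seconds,
        }
        return key  # the client passes this key back on every later call

    def get_property(self, key, name, now=None):
        now = time.time() if now is None else now
        resource = self._resources.get(key)
        if resource is None or resource["expires_at"] <= now:
            self._resources.pop(key, None)  # expired resources are destroyed
            raise KeyError("unknown or expired resource: " + key)
        return resource["properties"][name]


# A job modeled as a resource with a 60-second lifetime (hypothetical values).
home = ResourceHome()
job_key = home.create({"status": "queued"}, lifetime_seconds=60)
print(home.get_property(job_key, "status"))  # queued
```

The key plays the role of the reference a client holds to the resource, and the expiry time stands in for lifetime management.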
Figure 2-2 gives an overview of the Web Services based Grid Application Model.
[Figure 2-2. Layers shown: Application-Specific Services, Grid Services]
Note: The Globus Toolkit is open source software developed by the Globus
Alliance. More information can be found at the following Web site:
http://www.globus.org/
The Toolkit relies on the open standard protocols described above, and provides
APIs to allocate and manage shared resources in a secure framework. It
provides the core services necessary to address the Grid Computing challenges
introduced earlier.
In addition, IBM Grid Toolbox includes an embedded relational database used for:
Reliable File Transfer service journaling, which is required for recoverability
Backing store for Policy services
Backing store for Service Group service
Finally, IBM Grid Toolbox embeds OpenJMS for the JMS based notification
framework.
[Figure. Components shown: User Browser, HTTP, Java Servlet, Java Web Service, Certificate Authority, Java CoG Kit, Globus Grid Resource]
The Controller manages and controls all interactions between the user and the
application. Usually, the Controller is a servlet that receives an HTTP request
when the user clicks on a link in the browser. Then, the Controller passes the
input parameters to the Model that does the work. This Grid Registration service
is made of several Controller servlets that implement methods specific to each
user action.
The Model encapsulates the business logic, rules, and data, and does the
business processing, usually implemented by Java classes, Java Beans, or
Enterprise Java Beans (EJBs). The Grid Registration service can use a session
EJB™ to store and fetch data from the database.
Called by the Controller, the View part uses the results of the business process
and constructs the response to be presented to the user, usually implemented
through Java Server Pages (JSPs). JSPs build dynamic HTML code and send it
as an HTTP response to the user's browser.
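The division of work between Controller, Model, and View described above can be sketched in a few lines. The redbook describes a Java implementation (servlets, EJBs, and JSPs); the Python below is only a language-neutral illustration, and the class and function names are hypothetical.

```python
import html


class RegistrationModel:
    """Model: business logic and data (a dict stands in for the database)."""

    def __init__(self):
        self._users = {}

    def register(self, user_name):
        if not user_name:
            raise ValueError("user name is required")
        self._users[user_name] = {"registered": True}
        return user_name


def render_view(user_name):
    """View: builds the response presented to the user."""
    return "<html><body>Registered " + html.escape(user_name) + "</body></html>"


def registration_controller(request_params, model):
    """Controller: receives the request, calls the Model, then the View."""
    user_name = model.register(request_params.get("user", ""))
    return render_view(user_name)


model = RegistrationModel()
print(registration_controller({"user": "alice"}, model))
```

Keeping the three roles separate means that the presentation can change without touching the registration logic, and vice versa.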
Here, WebSphere Application Server interacts with the Globus Toolkit through
the Globus Java Commodity Grid (CoG) Kit, which provides a set of APIs to
handle Globus resources; it is especially useful here to create certificates signed
by the Certificate Authority, and exchange information with the grid resource
manager in order to submit a job.
Of course, these are only a few hints about what can be created using
WebSphere Application Server as the hosting keystone; further developments
could extend the reach of the application to other products, in order for it to
include job scheduling, resource provisioning, and other grid features.
Once the WAS-XD engine is running, it finds those applications that can be
placed or split in different application servers. Based on policies, it assigns
Information and the path to this information can be personalized according to the
user and type of user who is accessing it. Personalization services also take into
account the kind of device that is being used to access the portal, and their policy
can be modelled as a matrix of users and devices. Devices can be typical
browsers, low end workstation browsers, public Web browsers (which impacts
the information’s confidentiality), cell phones and PDAs using WAP, pagers,
interactive voice response systems (IVRs), scripts automatically retrieving
content, and so on. Defining how the information is accessed and formatted
depending on the device is crucial to enable a user to have different profiles and
preferences based on the device that is being used.
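A policy modeled as a matrix of users and devices could look like the following sketch. The user types, device names, and profile fields are invented for illustration and are not WebSphere Portal Server settings.

```python
# Hypothetical policy matrix: (user type, device) -> presentation profile.
POLICY = {
    ("employee", "desktop_browser"): {"format": "html", "show_confidential": True},
    ("employee", "public_browser"): {"format": "html", "show_confidential": False},
    ("employee", "wap_phone"): {"format": "wml", "show_confidential": False},
}

# Safe fallback for any combination that the matrix does not list.
DEFAULT_PROFILE = {"format": "text", "show_confidential": False}


def profile_for(user_type, device):
    """Look up the profile for a user type and device, with a safe default."""
    return POLICY.get((user_type, device), DEFAULT_PROFILE)


print(profile_for("employee", "public_browser")["show_confidential"])  # False
```

The same user gets a different profile on a public Web browser than on a desktop browser, which reflects the confidentiality concern mentioned above.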
Additional services provided to enhance the portal are, for example, tools to
search topics or certain content in information managed by the portal. Portal
Document Manager is a portlet that allows users to share, view, and organize
files. On the other hand, Productivity Components allow users to create, edit,
convert, and view documents like word processing files and spreadsheets.
WebSphere Portal Server also includes collaborative tools, like Instant
Messaging or People Finder Portlet, which bring advanced navigation
capabilities over a corporate directory. My Lotus® Workspace (QuickPlace®)
portlet manages workplaces, which are virtual desktops managed by the portal.
WebConferencing Portlet provides a suite to manage online meetings. Users can
schedule or join meetings directly from the portlet, with a few mouse clicks.
Finally, WebSphere Portal Server can provide external search capabilities in
DB2, Lotus Notes®, and Lotus Domino® databases, HTML and text documents,
and external Web search engines.
Note: More information about Lotus Notes, Lotus Domino, and Lotus
Workspace can be found at:
http://www.lotus.com/
WebSphere Portal Server can implement high level services to track specific
actions taken by the users. This may be particularly useful to prevent abusive
usage of the grid, like repetitive high priority jobs whose results are not retrieved
immediately. It is vital to know how relevant a job submitted to the grid is, so that
more important jobs can have precedence if need be. As a result, grid schedulers
and workload managers implement sophisticated policies to assign job priorities
and book resources in advance, and dynamically adapt their policies. This is an
example of how sophisticated but typical portal characteristics can improve the
user experience and the grid efficiency.
In some cases, portals are necessary rather than just desirable for a grid. For
example, in an IBM Grid Toolbox or Globus Toolkit grid, all grid nodes must have
the software installed to be registered to the grid and submit jobs. A way to plug
these machines into an IBM Grid Toolbox or Globus Toolkit grid is to build
“submission clients” that can run on the desktops and connect them to the grid.
Another more scalable and easy to administer solution would be to create a Web
portal that connects the user with the grid. It would be necessary to build a portlet
that allows you to submit jobs, to schedule their execution and inquire about their
status, to book CPU time in advance, and so on.
All the features described above are typical qualities of the Web Services Core
and Hosting ecosystem from the user interaction standpoint. WebSphere Portal
Server, IBM Grid Toolbox, and the WebSphere Application Server family provide
all the tools necessary to implement them and integrate people, besides
applications, with the grid.
WebSphere MQ Messaging
WebSphere MQ Messaging (formerly MQSeries®) provides any-to-any
connectivity from desktop to mainframe, through business quality messaging,
supporting several platforms. MQ supplies a reliable messaging system and APIs
for several languages on the supported platforms. By using MQ, an application
can interact with another, whatever protocols and operating systems are used by
either one. The developer only needs to create the interface to implement the
syntax and semantics of the message. MQ provides the connectivity, and the
application brings the logic to create and interpret messages.
MQ also has Web services connectors that can use SOAP and
XML to receive and transfer messages. Moreover, if an application that exposes
an MQ interface to the rest of the infrastructure needs to connect to a grid, it may be
relevant to use MQ capabilities as an entry point to the grid, for example, by
implementing a grid gateway that sits waiting for MQ messages on one side and
sends grid messages on the other side. This would bring a substantial benefit to
the architecture by making it unnecessary to modify the interfaces of the
applications that submit jobs to the grid, while guaranteeing the reliability of the
information transmitted as messages.
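The gateway idea can be sketched as a loop that reads messages on one side and submits grid jobs on the other. In this sketch a standard Python queue stands in for the message queue, and submit_to_grid is a placeholder; neither represents the WebSphere MQ API nor a real grid submission interface, and the JSON message layout is an assumption.

```python
import json
import queue


def submit_to_grid(job):
    """Placeholder for a real grid submission call; returns a made-up job ID."""
    return "grid-job-" + job["name"]


def run_gateway(inbound, submit=submit_to_grid):
    """Drain the inbound queue, turning each message into a grid job submission."""
    job_ids = []
    while True:
        try:
            message = inbound.get_nowait()
        except queue.Empty:
            break
        job = json.loads(message)  # message syntax is agreed between applications
        job_ids.append(submit(job))
    return job_ids


inbound = queue.Queue()  # stands in for a message queue
inbound.put(json.dumps({"name": "render", "args": ["frame1"]}))
print(run_gateway(inbound))  # ['grid-job-render']
```

The submitting application only ever sees the queue, so its interface does not change when the work is redirected to a grid.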
In a different context, a grid can have several entry points: standard Web
services, a portal, and ad-hoc connectors, which may be implemented with MQ.
For example, some platforms do not support Web services, or applications can
be developed in a programming language that was not designed to easily create
SOAP content. In this case, it is still possible to leverage the applications and to
protect existing investments while enabling the building of a grid: instead of
The grid environment can access the ESB's services instead of accessing
individual service providers with ad-hoc mechanisms. In fact, if an ESB already
exists in an organization, and the services to be provided by the grid are
accessible through the ESB, then the grid should access these services through
the ESB.
WBI-MQ can be used in a grid environment as the ESB of a corporate grid. The
grid environment can access ESB's services instead of accessing the service
providers with ad-hoc mechanisms. In fact, if an ESB already exists in an
organization, and the services to be provided by the grid are accessible through
the ESB, then the grid should access these services through the ESB. A grid can
also be a service provider of the ESB of an organization.
Uses WAS-MQ internally and externally. | Uses WAS-MQ internally and supports MQ externally wrapped in JMS messages.
Provides basic connectors out of the box while others have to be added. | Provides several connectors out of the box.
Most of the connections with providers should be treated as MQ messages. | Usually connections with providers can be treated in the original protocols.
2.4 Conclusion
In this chapter, we briefly described the standards and products associated with
the building of basic grid computing infrastructures. Some basic products for the
Web Services Core/Hosting discipline are part of the WebSphere family. Most of
their capabilities are well known from the traditional IT perspective and can be
integrated with a variety of other products (for example, IBM Grid Toolbox
includes WebSphere Application Server Express Edition).
Job schedulers
Job schedulers are able to submit, control, and monitor the workload of jobs
submitted in a network of computers. The submitted jobs are run based on
priorities and scheduling algorithms that are typically defined by the submitter.
Some of the most popular job schedulers are IBM LoadLeveler®, Platform LSF,
Portable Batch System (PBS), and Condor.
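Priority-based dispatching can be illustrated with a small priority queue. This is a generic sketch and not the algorithm of any of the schedulers named above; the JobQueue class and the job names are hypothetical.

```python
import heapq
import itertools


class JobQueue:
    """Runs higher-priority jobs first; equal priorities run in submission order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves order

    def submit(self, name, priority):
        # heapq is a min-heap, so the priority is negated to pop the highest first
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def next_job(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


jobs = JobQueue()
jobs.submit("report", priority=1)
jobs.submit("payroll", priority=5)
jobs.submit("backup", priority=1)
print(jobs.next_job(), jobs.next_job(), jobs.next_job())  # payroll report backup
```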
Figure 3-1 on page 43 points out the focus of the chapter and where job
schedulers are typically used.
[Figure 3-1. Components shown: User, Job Submit, Meta Scheduler (CSF), Schedulers (IBM LoadLeveler, Platform's LSF, OpenPBS, and so on), Application, Data access]
Knowing the inner details of an application, and therefore knowing that this
application can be “gridified”, leads us to choose the best fit for our scheduling
requirements. Such requirements should also take into account operational
issues like the ones addressed in the following questions.
What are the user requirements?
User requirements are the most critical factors here; some users’ jobs may run
for days or maybe weeks before they complete. Users might want the
scheduler to provide checkpoint features so that a job can resume without
restarting from the beginning. Status reporting and monitoring might be the
area on which the user wants to focus.
Additional user requirements relate to workload scheduling activities,
priority algorithms and policies, automatic rescheduling, load balancing,
performance, and capacity.
Tivoli Workload Scheduler for Virtualized Data Centers V8.2 extends the
capabilities of IBM Tivoli Workload Scheduler to workload optimization and grid
computing. It provides an enterprise-level scheduling solution that helps you
perform cross-platform, cross-domain, and cross-enterprise scheduling by
integrating business applications across grid, mainframe, and traditional
distributed environments. It also provides fault tolerance, scalability, and
outstanding workload performance. Workload Scheduler for Virtualized Data
Centers V8.2 provides the following features:
Centralized control: All scheduling objects are defined and stored on a master
scheduling controller.
Delegation of work: Scheduling duties are sent from the master to other
workstations in the network.
Fault tolerance: In the event of a loss of network connectivity with the master,
other workstations continue to run jobs.
Remote administration: Manipulation of workload can be done remotely using
the Job Scheduling Console.
Automatic driver
Workload Scheduler provides leading-edge solutions to problems in production
workload management. It can automate, plan, and control the processing of an
enterprise’s entire production workload. It functions as an “automatic driver” for
the production workload, maximizing the throughput of work and optimizing
resource usage. When Workload Scheduler interfaces with other system
management products, it actually takes part in a more comprehensive integrated
automation and systems management platform.
IBM Tivoli Workload Scheduler can manage the complete flow of work through
an enterprise’s entire operation, on both local and remote systems.
Important: Linux machines and AIX 5L™ machines can now exist in the same
LoadLeveler cluster. For more information, refer to the following Web site:
http://publib.boulder.ibm.com/infocenter/clresctr/index.jsp?topic=/com.ibm.cluster.loadl.doc/doc_updates/ll3_2.1update.html
The scheduler manages both serial and parallel jobs over the grid nodes. In this
particular case, this distributed environment consists of a pool of machines or
servers, often referred to as a LoadLeveler cluster. Machines in the pool may be
of three basic types: desktop workstations available for batch jobs (usually when
not in use by their owner), dedicated servers, and parallel machines.
The allocation of jobs depends on the availability of resources within the cluster
and on a set of rules defined by the LoadLeveler administrator. A user submits a
job using a job command file, and the LoadLeveler scheduler attempts to find
resources to satisfy the requirements of the job.
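The matching step, finding machines whose resources satisfy the requirements of a job, can be sketched as follows. The resource names, the machine inventory, and the find_machines helper are hypothetical; this is not LoadLeveler code and it does not use job command file syntax.

```python
def find_machines(job_requirements, machines):
    """Return the names of machines whose free resources meet every requirement."""
    matches = []
    for name, free in machines.items():
        if all(free.get(res, 0) >= needed for res, needed in job_requirements.items()):
            matches.append(name)
    return sorted(matches)


# Hypothetical pool: a desktop workstation, a dedicated server, a parallel machine.
machines = {
    "ws01": {"cpus": 1, "memory_mb": 512},
    "srv01": {"cpus": 4, "memory_mb": 8192},
    "par01": {"cpus": 32, "memory_mb": 65536},
}
job = {"cpus": 4, "memory_mb": 4096}
print(find_machines(job, machines))  # ['par01', 'srv01']
```

An administrator's rules would then decide which of the matching machines actually receives the job.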
Submitting Machines
• Submit, query or cancel jobs
• May be outside the LoadLeveler cluster
Scheduler
• Receives jobs submitted by users
• Manages each job to completion & maintains information
• Asks Central Manager to find appropriate machine(s)
Central Manager
• Examines each job’s requirements & determines
most appropriate machine(s) to run the job
• Central Manager is not a single point of failure
Executing Machines
• Run the jobs
[Figure. GridServer components shown: Client, Director, Broker, Manager, Engine Daemon, Engine Instances]
Engines and Grid Clients log in to the Director and are authenticated; the
Director then routes Engines and Grid Clients to available Brokers. Grid Clients
submit requests through the Broker. Engines receive work requests from the
Broker, and in most cases, exchange data directly with the Engine processes.
This allows the system to be highly scalable. Since the Brokers manage all work
requests, load balancing is optimal and resilience is built into the system. Brokers
manage Engines and Grid Clients, and schedule work via lightweight messages.
Note: Platform LSF is part of the Platform LSF family of products. For more
information, refer to the following Web site:
http://www.platform.com/products/LSFfamily/
Note: Platform LSF MultiCluster is part of the Platform LSF family of products.
For more information, refer to the following Web site:
http://www.platform.com/products/LSFfamily/
Note: For more information about Platform Symphony, refer to the following
Web site:
http://www.platform.com/Products/Platform.Symphony/
PBS Professional is most often deployed across a set of server systems within
an HPC datacenter environment. PBS Professional is a capable solution for large
complex heterogeneous HPC datacenter deployments. PBS Professional can
also be deployed across networks of workstations or personal computers. In
cycle harvesting mode, PBS Professional provides significantly higher hardware
utilization rates by placing batch jobs onto desktop systems when they are not
being utilized for interactive work. PBS Professional is widely deployed
throughout a variety of market segments.
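Cycle harvesting can be illustrated with a short sketch that places batch jobs only on desktops that have been idle long enough. The idle times, the 15-minute threshold, and the harvest helper are assumptions made for illustration and do not describe how PBS Professional detects idleness.

```python
def harvest(desktops, pending_jobs, idle_threshold_s=900):
    """Assign pending batch jobs to desktops idle for at least idle_threshold_s."""
    idle = [name for name, idle_s in sorted(desktops.items())
            if idle_s >= idle_threshold_s]
    # zip stops at the shorter list, so surplus jobs simply stay pending
    return dict(zip(pending_jobs, idle))


# Hypothetical desktops with seconds since the last keyboard or mouse input.
desktops = {"pc-1": 30, "pc-2": 3600, "pc-3": 1200}
print(harvest(desktops, ["job-a", "job-b", "job-c"]))
# {'job-a': 'pc-2', 'job-b': 'pc-3'}
```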
3.4.10 OpenPBS
OpenPBS is a simple workload management solution intended for deployment
on small clusters comprised of dedicated homogeneous server nodes. OpenPBS
is distributed in source code form and users will need to compile the software for
each platform.
The end-user license agreement associated with OpenPBS is not a GPL license
agreement. The OpenPBS license agreement allows any end user to utilize the
software without any costs, but prohibits any entity from redistributing OpenPBS
in conjunction with any commercial product (hardware, software, or service) that
is not free of charge.
Altair does not offer professional software support for OpenPBS. However, there
is an extremely large community of OpenPBS users worldwide and some level of
support is facilitated through this open source community. The
http://www.openpbs.org Web site presents a product comparison between
OpenPBS and PBS Professional.
3.4.11 Condor
Condor, commonly used by universities and research and development
companies, is a freely available, open source software package that makes
use of dedicated or nondedicated resources.
Figure 3-4 on page 57 is a very high level representation of how Condor works.
[Figure 3-4. Components shown: Submit Machine, Execution Machine, Checkpoint Data]
Note: For more information about CSF, please refer to the following Web sites:
http://www.platform.com/Products/Platform.Globus.Toolkit/Product.Information/Features.Benefits.htm
http://sourceforge.net/projects/gcsf/
Platform Computing has contributed the CSF open source metascheduler to the
Globus Toolkit. CSF classes can be extended to provide more domain specific
community schedulers and support many different kinds of grid deployment
models. Examples of grid level scheduling algorithms include scheduling across
multiple clusters within a virtual organization, co-scheduling across multiple
resource managers, scheduling based on SLAs, and economic scheduling
models. By making use of the open source CSF, grid scheduling implementations
ensure that they interact with resource managers using standard interfaces
without needing to know all of the underlying details of the specification or having
to implement the protocol themselves.
Besides using Maui, starvation can also be resolved using a preemptive job
scheduler, but this is not an easy task, as this implies a rather precise
performance prediction schema.
Current environment
The researchers are currently running their application on a dedicated
multi-processor machine, which is insufficient to achieve effective results on the
jobs. Also, the jobs they are running often “hog” the computing resources in the
lab, which is a shared facility. However, there are a number of student computer
labs that are idle at night.
The proposed Desktop grid environment, shown in Figure 3-5 on page 60, uses
the United Devices grid framework for scheduling, monitoring, and job
coordination across all the desktops.
[Figure 3-5. Components shown: Lab Server (United Devices), OpenMP, Job, Job Split, desktop Agents, Results]
Current situation
In the current environment, the clients have the following infrastructure that they
use as a file server, printer server, workstations, and others.
Figure 3-6 Current environment
[Labels shown: Researchers, Telnet/SSH, Manual Query, Job Submission]
Components of grid
The grid consists of a number of different components that are suitable for a wide
range of specific needs, but in this case, we first propose the addition of
scheduling software, possibly OpenPBS.
Also, placing all the various machines under the control of the scheduler
automates the task of job dispatching, saving time and optimizing the use of
hardware.
Administration
By placing the machines under the administration of the scheduler, the selection
of servers/workstations to perform a certain task is now left to the scheduler.
Users will also not be required to have access to all the servers to run the jobs.
The scheduler will be running the job on the user’s behalf.
Also, OpenPBS has the capability for file staging, providing users with the ability
to specify any files that need to be copied onto the execution host before the job
runs, and any that need to be moved out after the job completes. The job will be
scheduled to run only after the required files have been successfully transferred.
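The file staging sequence (copy inputs in, run only if that succeeds, move outputs back) can be sketched as follows. This is a generic illustration using local temporary directories; the run_with_staging helper and the file names are hypothetical, and the code does not use OpenPBS directives or commands.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(stage_in, stage_out, work_dir, job):
    """Copy inputs to work_dir, run the job, then move the outputs back."""
    work_dir = Path(work_dir)
    for src in stage_in:
        shutil.copy(src, work_dir / Path(src).name)  # raises if a file is missing
    job(work_dir)  # reached only after every input was transferred
    for name, dest in stage_out.items():
        shutil.move(str(work_dir / name), dest)


def word_count_job(work_dir):
    """Toy job: count the words in input.txt and write the count to result.txt."""
    text = (work_dir / "input.txt").read_text()
    (work_dir / "result.txt").write_text(str(len(text.split())))


# Two temporary directories stand in for the user's host and the execution host.
with tempfile.TemporaryDirectory() as home, tempfile.TemporaryDirectory() as exec_host:
    source = Path(home) / "input.txt"
    source.write_text("three small words")
    result = Path(home) / "result.txt"
    run_with_staging([source], {"result.txt": str(result)}, exec_host, word_count_job)
    staged_result = result.read_text()

print(staged_result)  # 3
```

Because the copy raises an error when an input is missing, the job function is never called in that case, which mirrors the rule that a job runs only after its files have been transferred.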
[Figure. Components shown: OpenPBS Server, Job Scheduler, Researchers/SSH]
A Web Portal will give the users the capability to submit jobs and manage the
environment via Web services. Using this service, the users effectively have an
interface to select the type of jobs, the platform, and the priority of the jobs. For
more information about these products, see Chapter 2, “Web services
core/hosting” on page 23.
The virtualization process can take place at different levels of abstraction and,
as such, data is treated in this context as a multiple-layer concept. The next
section describes the data layers that we consider in this chapter and presents
basic products that provide data virtualization functionality.
[Figure. Data layers shown: Information, File System, Block Data]
The layer between computers and storage devices hides the complexity of storage
and helps to increase utilization. It also helps to reduce management costs by
centralizing the management of the multiple storage devices or
subsystems.
Storage Devices
A number of solutions have been created to handle the problem of remote data
access in homogeneous and heterogeneous environments. Network file
systems, for example, are a convenient solution for sharing data across computer
nodes; a typical example is the NFS protocol. Such file systems make it
possible to access remote files in a networked environment as though they
were locally available.
[Figure. Labels shown: File Systems, Information Virtualization, Structured Content, DBMS, Flat File Management, File System]
The examples described above are not mutually exclusive and can be combined
if the solution requires it.
Products overview
Figure 4-5 provides an overview of some Information virtualization products in
each layer.
At the File System Layer GPFS, NFS Version 4, SAN File System, Network
Attached Storage (NAS), and Andrew File System (AFS®) provide an abstraction
At the block data layer, Tivoli Storage Resource Manager and IBM SAN Volume
Controller provide an abstraction of storage devices. This abstraction permits a
level of indirection between the storage devices seen by computers and the
storage devices exported by storage systems.
Figure 4-6 Block data layer
Clients can use SAN capabilities to reduce the complexity that is associated with
managing large amounts of (probably) heterogeneous storage devices.
Furthermore, it provides them with greater flexibility in allocating storage space.
Note: SAN Volume Controller is part of the IBM TotalStorage Open Software
Family. The TotalStorage Open Software family is a set of comprehensive and
flexible storage software solutions that can help enterprises address their
storage management challenges today and is designed to help enterprises
take steps toward evolving to an On Demand storage environment. For more
information, please refer to the following Web site:
http://www.ibm.com/servers/storage/software/virtualization/svc/
With SAN VC, users can allocate their storage to applications based on
application requirements instead of constraints in subsystem connectivity and
functional capabilities. Besides this function, it also provides a centralized
storage management capability.
Note: IBM Tivoli Storage Resource Manager is part of the IBM TotalStorage
Open Software Family. The TotalStorage Open Software family is a set of
comprehensive and flexible storage software solutions that can help
enterprises address their storage management challenges today and is
designed to help enterprises take steps toward evolving to an On Demand
storage environment. For more information, please refer to the following Web
site:
http://www.ibm.com/software/tivoli/products/storage-resource-mgr/
[Figure. File System layer]
Note: The NFS version 4 protocol is a further revision of the NFS protocol
version 2 (RFC1094) and 3 (RFC1813). For more information, please refer to
the following Web sites:
http://www.ietf.org/rfc/rfc3530.txt
http://www.nfsv4.org/
Note: The IBM GPFS, originally developed for SP clusters, can today provide
fast data access from all nodes in a cluster of IBM UNIX servers. For more
information, please refer to the following Web sites:
http://www.ibm.com/servers/eserver/clusters/software/gpfs.html
http://www.ibm.com/servers/eserver/pseries/software/sp/gpfs.html
Files created in GPFS can be accessed from every node that runs GPFS code.
For those nodes not running GPFS, files will still be accessible by using NFS. A
parallel file system not only offers performance advantages by eliminating the
limitation of a single server for file services, but also offers a great deal of
flexibility. With a parallel file system, since all nodes can “see” the file system, it is
easier to move applications from one node to another. GPFS also has the ability
to define data groups that are located under the same “single point of failure”
unit. This allows the data access path to be automatically routed if any of the data
Sharing the same file system among several nodes has the benefit of increasing
the maximum I/O bandwidth that otherwise would be limited by the maximum
local I/O bandwidth of the single server.
SAN FS
SAN FS (“Tivoli SAN FS” on page 189) provides file system virtualization that
can help reduce complexity by treating storage resources as a single, common
pool. The design of SAN FS is also meant to support a heterogeneous
environment.
Note: SAN FS is part of the IBM TotalStorage Open Software Family. The
TotalStorage Open Software family is a set of comprehensive and flexible
storage software solutions that can help enterprises address their storage
management challenges today and is designed to help enterprises take steps
toward evolving to an on demand storage environment. For more information,
please refer to the following Web site:
http://www.ibm.com/servers/storage/software/virtualization/sfs/
The meta-information about the files, such as file location, security, and the
locking mechanism, is managed by SAN FS. In many traditional file
systems, the metadata resides within individual servers, which can create
limitations in sharing and accessing data across servers or across file systems.
By managing the metadata on the storage network using a metadata server,
instead of within individual application servers, the design of the SAN File
System helps move intelligence out of individual servers onto the storage
network, so that it can be available to any application server in the network.
Some of the relevant features are:
Provides a central point for data and storage management, with a single file
system with consistent policies, management, and monitoring capabilities.
Storage pools can be designed to represent differentiated classes of service
for different business areas: a critical storage pool can have the data stored in
the most expensive storage while the staging data can be kept in an older
disk.
SAN FS allows you to manage physical storage as a pool, with options to
transparently add, remove, and change the application servers.
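Assigning files to storage pools by class of service can be sketched with a small rule list. The rules, pool names, and file attributes below are hypothetical and do not use the SAN FS policy language.

```python
# Hypothetical placement rules, checked in order; the first match wins.
RULES = [
    (lambda f: f["department"] == "finance", "critical_pool"),
    (lambda f: f["name"].endswith(".tmp"), "staging_pool"),
]
DEFAULT_POOL = "default_pool"


def pool_for(file_info):
    """Return the storage pool for a file according to the placement rules."""
    for matches, pool in RULES:
        if matches(file_info):
            return pool
    return DEFAULT_POOL


print(pool_for({"name": "ledger.db", "department": "finance"}))  # critical_pool
print(pool_for({"name": "scratch.tmp", "department": "research"}))  # staging_pool
```

Critical data lands in the pool backed by the most expensive storage, while staging data goes to a pool that could sit on older disks.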
[Figure. Information layer]
Federation of data in this sense means that end users and applications are able
to access any data, in any format, on any heterogeneous platform. It also allows
end users and applications to query and manipulate different types of data,
organized in a structured database format, all at the same time.
Note: For more information about the DB2 Product Family, please refer to:
http://www.ibm.com/software/data/db2/
A DPF partition consists of its own data, indexes, configuration files, and
transaction logs. A partitioned database is a database with two or more
partitions. Tables can then be located in one or more database partitions.
Processors associated with each database partition are used to satisfy table
requests. Data retrieval and update requests are decomposed automatically into
sub-requests, and executed in parallel among the applicable database partitions.
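The decomposition of a request into sub-requests that run in parallel on each partition can be sketched as follows. This is a minimal illustration, not DB2 code: the partition count, the modulo distribution, and all function names are assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 4  # assumed partition count for this example


def partition_for(key):
    """Map a row key to a partition (a simple modulo stands in for a real distribution scheme)."""
    return key % NUM_PARTITIONS


def load(rows):
    """Distribute (key, value) rows across the partitions."""
    partitions = [[] for _ in range(NUM_PARTITIONS)]
    for key, value in rows:
        partitions[partition_for(key)].append((key, value))
    return partitions


def partial_sum(partition):
    """Sub-request: each partition aggregates only its own rows."""
    return sum(value for _, value in partition)


def total(partitions):
    """Decompose the request, run the sub-requests in parallel, and merge the results."""
    with ThreadPoolExecutor(max_workers=NUM_PARTITIONS) as pool:
        return sum(pool.map(partial_sum, partitions))
```

The caller issues a single request (total) and never addresses an individual partition, which mirrors the automatic decomposition described above.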
Note: A client who has purchased and installed DB2 Universal Database
Enterprise Server Edition (ESE) can create a partitioned instance by using the
database partitioning feature (DPF). More information about DPF can be
found at:
http://www.ibm.com/developerworks/db2/library/techarticle/dm-0504mcarthur/
http://publib.boulder.ibm.com/infocenter/db2help/index.jsp?topic=/com.ibm.db2.udb.concepts.doc/concepts/about_dpf.htm
Note: DB2 Content Manager is part of the DB2 Content Management family of
products, which helps manage, share, integrate, and deliver critical business
information on demand. For more information, please refer to the following
Web site:
http://www.ibm.com/software/data/cm/
Note: DB2 Information Integrator for Content is part of the DB2 family of
products. DB2 Information Integrator for Content provides a strategic
framework to help clients speed new application deployment and control IT
costs. It provides a framework that includes data and content federation,
replication, event publishing, and enterprise search. For more information,
please refer to the following Web site:
http://www.ibm.com/software/data/eip/
At the management level, the CIO wants to evaluate the use of a virtualization
technology to improve the research time and achieve higher resource utilization.
Information grids support heterogeneous platforms and provide data sharing and
large-scale collaboration. Figure 4-9 illustrates the current infrastructure that
“xyz” Corporation has.
[Figure 4-9: current infrastructure of “xyz” Corporation: Facility A, Facility B, and Facility C, holding databases (DB), files, and folders, connected by a network]
Challenges
Engineers working on similar projects are located in distinct sites, but the
research repository data is only available to the main team.
Access to data today is performed on a case-by-case basis, and data sharing
between different research centers is low.
The file sharing protocols are different: The UNIX and Linux machines
access shared files using Network File System (NFS) and the Microsoft
based machines access shared files using the Server Message Block (SMB)
protocol.
Having the above scenario in mind, IT engineers assigned by the CIO will work
together to come up with a solution and implement it as a pilot data grid project to
share information between two of “xyz” Corporation’s branches. An information
grid in this case will enable engineers and other departments to share and view
data regardless of where it resides. It will also provide access to
heterogeneous files, databases, and storage systems, and share files through
multiple protocols.
Current environment
There are multiple databases deployed throughout the corporation.
There are also documents in text files and images.
Users have to go through each database to search for files and information,
and there is no universal data repository of the information.
WebSphere®
Information Integration
Provisioning
Provisioning handles the coordination and allocation of managed resources,
such as servers or storage within an on demand environment.
These resources are provisioned dynamically as they are required to
satisfy the needs of a business application. Provisioning is done by allocating
managed resources from a “resource pool”. This gives the IT organization the
flexibility to dynamically respond to the needs of the business. Resources can be
dynamically allocated from the pool, then de-allocated and returned to the pool.
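The allocate and return cycle of a resource pool can be sketched in a few lines. This is an illustrative model only and does not reflect the interface of any provisioning product: the ResourcePool class and its method names are hypothetical.

```python
class ResourcePool:
    """Hypothetical pool of managed servers shared by several applications."""

    def __init__(self, servers):
        self._free = list(servers)   # servers currently available
        self._allocated = {}         # server -> application that is using it

    def allocate(self, application):
        """Hand a free server to an application, or return None if the pool is empty."""
        if not self._free:
            return None
        server = self._free.pop()
        self._allocated[server] = application
        return server

    def release(self, server):
        """De-allocate a server and return it to the pool."""
        if self._allocated.pop(server, None) is not None:
            self._free.append(server)

    @property
    def free_count(self):
        return len(self._free)
```

A typical sequence is `server = pool.allocate("web")` when demand rises and `pool.release(server)` once the extra capacity is no longer needed.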
Orchestration
Orchestration extends the provisioning model by providing a sensing, capacity
planning, and responsive software application to dynamically meet service level
agreements. Because of this sensing and responsive software architecture,
orchestration enables dynamic and automated allocation of server capacity on
demand. Intelligent Orchestrator, by dynamically allocating capacity to
applications that require it, improves utilization of underutilized computing
systems. Orchestrator reallocates existing server capacity to support
unpredictable application demands. It improves application service levels without
investing in additional capacity. It increases the value of IT server investments by
making server capacity available to all applications under its management. Using
Intelligent Orchestrator, a data center can accurately predict impending capacity
requirements, and facilitate dynamic infrastructure reallocation of server and
network resources. Orchestration initiates the activities necessary to
automatically maintain server availability and meet service level agreements. It
provides the why, where, and when of a complete solution.
Why
By monitoring the applications under its control, an orchestrator senses
degrading performance and determines the actions that have to be taken.
Where
Because solutions are monitored closely, an orchestrator determines where a
server needs to be deployed and automatically instructs the provisioning function
to deploy a server to the proper cluster and configure all the network attributes to
keep the solution within adequate service levels.
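The sense-and-respond behavior described above can be sketched as a single decision function. This is not how IBM Tivoli Intelligent Orchestrator is implemented: the function name, the threshold parameter, and in particular the assumption that response time scales inversely with the number of servers are simplifications made for the example.

```python
import math
from statistics import mean


def servers_to_add(response_times_ms, threshold_ms, servers, free_servers):
    """Decide how many servers the provisioning function should deploy.

    response_times_ms: recent response times observed for the application
    threshold_ms:      response time at which the orchestrator reacts (hypothetical)
    servers:           servers currently in the application cluster
    free_servers:      servers still available in the resource pool
    """
    if not response_times_ms:
        return 0
    observed = mean(response_times_ms)
    if observed < threshold_ms:
        return 0  # "why": no degradation sensed, so nothing to do
    # Assumption: response time is inversely proportional to cluster size.
    needed = math.ceil(servers * observed / threshold_ms) - servers
    # "where": never request more than the pool can supply.
    return max(0, min(needed, free_servers))
```

In a real deployment the returned number would be passed to the provisioning function, which deploys and configures the servers.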
The IBM Tivoli Orchestration solution makes it possible for the staff to monitor the
response times that clients are actually getting from the business applications. If
the system senses that it is about to break that minimum required service level, it
will direct IBM Tivoli Provisioning Manager to provision more IT resources to
support the application that those clients are using. IBM Tivoli Intelligent
Note: IBM Tivoli Intelligent Orchestrator extends the benefits of the IBM Tivoli
Provisioning Manager. For more information, please refer to the following Web
site:
http://www.ibm.com/software/tivoli/products/intell-orch/
When the current Web server pool is not enough to respond to clients’ requests
in an adequate amount of time, the orchestration mechanism takes over and
searches for underutilized resources at the company’s other sites. Under manual
processing, these new machines will be added one by one, demanding many
operators and much time. Using the Provisioning Manager, the deployment of
new machines could be automated, demanding less time and no operator for the
reallocation.
The operations described above are managed and carried out by IBM Tivoli
Provisioning and IBM Tivoli Orchestration. Tivoli Provisioning is responsible for
allocating new machines automatically, while orchestration provides a dynamic,
on demand environment for applying the company’s best practices.
5.3 Security
This section discusses security and its relationship with grid computing. In a grid,
as in any distributed computing system, security plays a major role in ensuring the
quality of service delivered. We can divide security into several subjects to
analyze how a grid can satisfy basic security requirements:
Authentication The grid has to “be aware” of the identity of the users that
interact with it, either providing or using resources.
Authorization The grid has to restrict access to its resources to the
users that are eligible to access them.
Integrity Data exchanged among grid nodes should not be subject
to tampering.
Privacy The grid must provide facilities for assuring the privacy of
the information exchanged between nodes when
required.
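The integrity requirement can be illustrated with a message authentication code. This is a minimal sketch that assumes the two nodes already share a secret key; it is not the mechanism used by any particular grid toolkit, and the function names are chosen for the example.

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag that the sending node attaches to the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, tag: str) -> bool:
    """Check on the receiving node that the message was not altered in transit."""
    return hmac.compare_digest(sign(message, key), tag)
```

If a single byte of the message changes between nodes, verify returns False and the receiver can discard the data.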
Other desirable mechanisms that a grid could implement are related to the
trusted relationships that have to be established among different virtual
organizations that belong to the same grid at a certain time. Globus Toolkit and
On the software side, there are some alternatives: probably the most traditional is
an SSL-based virtual private network (VPN). This consists of an encrypted,
secure, and mutually trusted connection, which establishes a transparent tunnel
of communication between two nodes that belong to different network contexts.
In this sense, a VPN may be used to join two real organizations into one virtual
organization for the implementation of a single grid.
Additional security capabilities that are useful in a grid are mutual
authentication among different organizations and delegation of authentication,
which makes processing more efficient when a computational activity involves a
number of distributed resources.
Note: Tivoli Access Manager for e-business is part of the IBM Tivoli Access
Manager family of products. For more information, please refer to the following
Web site:
http://www.ibm.com/software/tivoli/products/access-mgr-e-bus/
Note: Tivoli Access Manager for Business Integration is part of the IBM Tivoli
Access Manager family of products. For more information, please refer to the
following Web site:
http://www.ibm.com/software/tivoli/products/access-mgr-bus-integration/
Note: Tivoli Access Manager for Operating Systems is part of the IBM Tivoli
Access Manager family of products. For more information, please refer to the
following Web site:
http://www.ibm.com/software/tivoli/products/access-mgr-operating-sys/
Mutual authentication occurs when two parties each hold a certificate signed
by their respective certification authority (CA), and each party also trusts the
signature of the other party’s CA on the other party’s certificate. This mechanism
allows authentication to cross the boundaries of an organization.
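Mutual authentication between two parties can be sketched with the TLS support in the Python standard library. This only illustrates the concept and is not GSI or Globus Toolkit code: the function names are hypothetical, and the certificate, key, and CA file paths are placeholders that a deployment would have to supply.

```python
import ssl


def make_server_context(cert_file=None, key_file=None, peer_ca_file=None):
    """Server side: present our certificate and require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients that present no valid certificate
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if peer_ca_file:
        ctx.load_verify_locations(cafile=peer_ca_file)  # trust the other organization's CA
    return ctx


def make_client_context(cert_file=None, key_file=None, peer_ca_file=None):
    """Client side: verify the server and present our own certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verifies the server certificate by default
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if peer_ca_file:
        ctx.load_verify_locations(cafile=peer_ca_file)
    return ctx
```

Each side loads its own certificate and the CA certificate of the other organization, so a connection succeeds only when both certificates validate.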
Further explanation of these processes and GSI can be found in the redbook
Introduction to Grid Computing with Globus, SG24-6895 (at Redbooks Web site)
and at the Globus Alliance Web site:
http://www.redbooks.ibm.com/
http://www.globus.org
The EIM Domain controller is based on an LDAP server. In that server, EIM stores
the identifiers, associations, and registries for the different systems. The
registries contain a user name and password for a given identity in a given system.
EIM can help in a grid environment by providing unified identities to users that
access heterogeneous systems with different user names and passwords.
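The relationship between identifiers, associations, and registries can be sketched as a simple lookup table. This is a conceptual model only and not the EIM API: the IdentityMap class, its methods, and the sample names are hypothetical.

```python
class IdentityMap:
    """Hypothetical store that maps one enterprise identifier to per-system user names."""

    def __init__(self):
        # (identifier, registry) -> user name known to that registry
        self._associations = {}

    def associate(self, identifier, registry, local_user):
        """Record which local user name an identifier has in a given registry."""
        self._associations[(identifier, registry)] = local_user

    def lookup(self, identifier, registry):
        """Return the local user name, or None if no association exists."""
        return self._associations.get((identifier, registry))
```

For example, after `m.associate("John Smith", "aix-node-1", "jsmith")`, a grid service that needs to run work on aix-node-1 can resolve the local account with `m.lookup("John Smith", "aix-node-1")`.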
5.3.3 Conclusions
In summary, there is not yet a wide range of products
specifically designed for security in grids. However, most traditional IT products
may be integrated into a grid system. That is because a grid can be seen as a
new way of using traditional IT components. Therefore, the ability to integrate
traditional IT components with components specifically designed for grid
depends on those products’ capabilities.
A grid surely will require sophisticated security mechanisms due to its intrinsic
complexity and distribution of resources. For that reason, we recommend having
a security framework or a similar security infrastructure, based on open
standards, to unify the security policies and ease the definition and management
of complex rules. Toolkits like Globus Toolkit and IBM Grid Toolbox require digital
certificates, but they usually provide only services to applications, and the grid
has resources other than applications. At this point, a security framework and
other mechanisms are vital for the health of the grid.
Most of the time, Directory Services are interrelated with security disciplines and
products (please refer to 5.3, “Security” on page 92).
It supports Web services, connectors to LDAP, JDBC™, JMS, XML, and JNDI,
among others, and includes the IBM Tivoli Directory Server, which is the IBM
directory of resources for mid-range platforms.
Note: Tivoli Directory Integrator (TDI) is one of the products of the IBM identity
management solution. For more information, please refer to the following Web
site:
http://www.ibm.com/software/tivoli/products/directory-integrator
5.4.2 Conclusions
Directory Services are used in a grid environment to make it easier to search for
and discover resources placed in different directories or repositories, by
providing a unified view of the directories to the grid environment or to the user
that requests certain services from the grid.
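The unified view described above can be sketched as one search function that spans several directories. This is an illustration only: the data layout, the attribute names, and the unified_search function are assumptions and do not correspond to an LDAP or Tivoli interface.

```python
def unified_search(directories, predicate):
    """Search several directories and return the matches, tagged with their source.

    directories: dict mapping a directory name to a list of entry dicts
    predicate:   function that returns True for the entries the caller wants
    """
    results = []
    for name, entries in directories.items():
        for entry in entries:
            if predicate(entry):
                results.append({"directory": name, **entry})
    return results
```

A caller looking for nodes with at least eight processors would pass `lambda e: e["cpus"] >= 8` and receive matches from every site without knowing how many directories exist.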
5.5 Licensing
In most organizations, a major portion of the investment in information
technology infrastructure is dedicated to software licensing. For this reason, it is
not surprising that most companies are looking for a better way to manage their
software licenses. This leads us to a new term: Software Asset Management.
This is the process of managing software assets, which includes budget planning
and software life cycle activities such as upgrades, among others.
Site license
Under a site license, the company typically pays a single amount for
unlimited use of the software: the company can add servers
without paying new fees, or a nominal fee may be charged based
on the number of new servers. This is commonly used in data centers where the
software is extensively used, and helps build an agile environment, as users
can run their applications anywhere. On the other hand, this is generally a very
expensive solution.
Floating license
In the floating license model, a certain number of licenses is shared
within a set of nodes. Most often, a license manager server controls
and keeps track of the licenses distributed. An application requests an available
license from the license manager when it starts and returns it when it
completes. However, if the licenses are insufficient, additional application
instances are unable to execute. This model is very effective and very commonly
used in grid applications (if the application supports it), allowing growth and
scalability.
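The floating license model maps naturally onto a counting semaphore. The sketch below is a simplified, single-process illustration and not the interface of any license manager product: the LicenseManager class and its checkout and checkin methods are hypothetical.

```python
import threading


class LicenseManager:
    """Hypothetical license server that shares a fixed number of licenses."""

    def __init__(self, total_licenses):
        self._licenses = threading.BoundedSemaphore(total_licenses)

    def checkout(self):
        """Try to obtain a license; return False immediately if none is free."""
        return self._licenses.acquire(blocking=False)

    def checkin(self):
        """Return a license when the application instance completes."""
        self._licenses.release()
```

An application instance would call checkout() at startup, refuse to run if it returns False, and call checkin() on completion so that another node can use the license.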
Note: License Manager is part of the IBM Tivoli software family to automate
the software licensing and utilization, and provides advanced reporting and
inventory capabilities. For more information, please refer to the following Web
site:
http://www.ibm.com/software/tivoli/products/license-mgr/
5.5.2 Conclusion
As more software vendors start to adopt grid into their software, the licensing
strategy may expand. Major vendors have software licensing models catered
towards grid environments. In extreme cases, a license can be managed by
provisioning techniques. In this case, the software is installed on the server in
which the execution is to be performed and, after the job is completed, the
application is removed.
Some of the commonly used methods to accomplish billing and metering today
include:
Scheduler accounting logs
Licensing management
Licensing management
Licensing management is one of the easiest methods to monitor the usage of
software, manually or through software such as IBM Tivoli License Manager.
Usage can be tracked based on the application installation rather than the
application’s executions. This may sound ineffective, but it provides the quickest
method to deploy billing and metering. Products such as IBM Tivoli Orchestration
and IBM Tivoli Provisioning Manager have some features for monitoring the
usage, which can be used for billing and metering as well.
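The other method listed above, scheduler accounting logs, can be sketched as a simple aggregation of usage records into charges. The record layout of (user, CPU seconds) and the flat hourly rate are assumptions made for the example; real schedulers write their own accounting formats.

```python
from collections import defaultdict


def bill(records, rate_per_cpu_hour):
    """Turn accounting records into a charge per user.

    records:           iterable of (user, cpu_seconds) pairs (hypothetical format)
    rate_per_cpu_hour: price charged for one CPU hour
    """
    usage = defaultdict(int)
    for user, cpu_seconds in records:
        usage[user] += cpu_seconds
    return {
        user: round(seconds / 3600 * rate_per_cpu_hour, 2)
        for user, seconds in usage.items()
    }
```

Unlike installation-based tracking, this approach charges for actual executions, at the cost of having to collect and parse the scheduler's records.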
5.6.1 Conclusion
As utility-like services get more popular, there will be a greater demand for billing
and metering software with better defined control. This is especially true in light
of electronic utility computing, also known as “e-utility computing”, in which
someone can use the computing power through simple and dynamic access, on
an on-demand basis, and be billed at a later stage.
Part 3 Bundles
[Figure: IBM Virtualization Engine™ components: Systems Services, Systems Technologies, and Operating Systems]
System services
System services’ products and technologies are higher level components in the
stack of infrastructure layers. This set of components is intended to create the
environment to run applications, manage the complexity of the technology, and
create a unified repository of storage resources. There are two main groups of
elements: those related to servers and those related to storage.
Operating systems
VE features work on the operating system level, as they provide the ability to run
multiple operating systems on a single box. These characteristics are available in
pSeries, iSeries™ through LPAR and DLPAR technologies, and zSeries®, which
allows LPAR and VM images to run different operating systems. xSeries
operating systems can be virtualized through VMware software
(http://www.vmware.com/) that can be integrated into a VE infrastructure.
Examples of the operating systems that can use these features are z/OS®,
i5/OS™, AIX 5L, Linux, and Windows.
There are two suites that make up Virtualization Engine systems services:
IBM Virtualization Engine Suite for Servers
IBM Virtualization Engine Suite for Storage
The main differentiator of VE is that it provides services for the whole stack of
infrastructure levels, creating an integrated environment to run and manage
applications, and to provide, virtualize, and manage resources.
EWLM can also work jointly with Tivoli Intelligent Orchestrator (TIO). TIO can
deploy the EWLM agent in the resources that it is allocating through provisioning
and orchestration. Those resources are allocated in a EWLM domain, so EWLM
6.3 Conclusion
As mentioned before in this redbook, there are several traditional IT components
that can fit into a grid. Moreover, that grid can be improved and leveraged by
these traditional IT components, although they were not specifically designed for
grid. Examples of these traditional, non-grid technologies include
provisioning, Virtual I/O, Virtual LAN, micro-partitioning,
monitoring, and so on. Several of these technologies provide services to the grid
computing environment’s underlying layers.
See 3.4.4, “Platform LSF (Load Sharing Facility)” on page 50, 3.4.5, “Platform
LSF MultiCluster” on page 51, 3.4.6, “Platform LSF HPC (High-Performance
Computing)” on page 51, and 3.4.7, “Platform Symphony” on page 52 for more
information about this topic.
Platform offers two different types of bundles, including a full set of grid and
distributed computing products and a family bundle of products.
7.3 DataSynapse
DataSynapse's GridServer software provides open standards compliant,
pre-manufactured services across various grid computing disciplines:
scheduling, workload management, data virtualization, and grid middleware.
DataSynapse enables organizations to create "Service Grids" that support
stateful, persistent, transactional processing of high concurrency and traditional
parallel execution workloads.
The United Devices Grid MP platform family of products can operate grids of any
size as well as aggregate resources on a network (including clusters, servers,
desktops, and laptops) to create a high-performance compute grid for powering a
wide range of applications. United Devices serves clients in a variety of
industries, including life sciences, manufacturing, government, industrial
engineering, and others.
Note: IBM and business partners work together with clients to leverage grid
computing solutions in their existing IT infrastructure. It is important to mention
that grid computing is not an out-of-the-box product: building a grid is the
result of an assessment and architecture development carried out in a
client-tailored fashion.
Important: The following Web site provides a search capability to access all
IBM announcement letters, part numbers, sales manuals, press releases,
Redbooks, white papers, and catalogs:
http://www.ibm.com/common/ssi
Grid Innovation Workshop      Typically a three- to five-day workshop. Its main goal is to provide a
                              grid computing design overview that can be used as a basis for
                              follow-up engagements.
Grid Strategy Engagement      Engagement service that defines the vision for a client to leverage
                              innovative grid computing solutions. It also identifies how grid
                              computing technologies can impact an organization’s existing
                              enterprise architecture and business model.
Grid Training and Education   Training and education offering available to complement and leverage
                              the skills of those involved in the grid computing implementation
                              roadmap.
The graphic in Figure 8-1 on page 124 shows an example roadmap for grid
computing implementation services.
[Figure 8-1: example roadmap in which each project, from Project 2 through Project n, goes through Grid Design Services and Grid Implementation Services]
Description
This service is designed with the objective of identifying high-level business
goals, objectives, and critical success factors. Typically a workshop that lasts
from three to five days, it is designed to identify potential stakeholders and
understand their goals, needs, and expectations. This gives you a chance to identify
and prioritize actionable pilot opportunities that are aligned with the business
goals, objectives, and critical success factors. One of the objectives of this
service is to develop a grid pilot plan that includes process identification and
implementation timelines and that provides awareness regarding the benefits and
technology of grid.
This workshop also lets you define and provide an introduction to grid computing
concepts, evolution, and adoption frameworks. It also provides a high-level
introduction to grid architecture and technologies, and introduces potential
The main goal of this service is to have every stakeholder (IT representatives,
product representatives, and decision makers) and IBM sitting at the same table,
in order to be able to successfully outline an engagement for the steps of the grid
implementation and to be able to successfully deploy the strategy that is outlined
during this workshop.
Description
This service provides the initial steps necessary to evaluate a client's current IT
environment and gives the client the information necessary to alter their overall
IT strategy based on their business needs. This service is also key to typical
follow-on engagements, where a more detailed grid design is developed and
implemented.
Using this service will allow the client to arrive at a logical and defensible plan for
using grid technologies to support and enhance the client business plans.
Positioned at the front of the full life cycle of grid services, it is followed by 8.2.3,
“Grid Design Services” on page 126 and 8.2.4, “Grid Software Implementation
Services” on page 127. It also provides a thorough review of all aspects of the
client IT environment and is also a way of linking the client business strategy with
a complete IT strategy, including the client IT and business organization: systems
management, security, applications development, organizational change, testing,
and business continuity.
Description
This service begins with the grid solution architecture overview and is narrowed
to two progressively more detailed levels of design granularity:
1. First level of design: The grid macro design provides a product-free design that
may also include the design of processes and a high-level organizational
structure.
2. Second level of design: The grid micro design is the physical or node-level
design that provides a more detailed product mapping overview of the grid
architecture.
If processes are included in the scope of the design, the micro design will also
include process workflow and a detailed design of the organization that will
implement the processes. The intent of the grid micro design is to produce a
design in enough detail to implement the grid solution.
These design phases provide the steps necessary to design a grid from the
logical architecture level, down to a detailed design where vendor products are
Grid Design Services addresses the following major areas of focus for a
successful grid design:
Engagement and project management
Understanding the enterprise business environment and objectives as well as
the current IT environment and plans
Documenting grid application and infrastructure requirements
Providing architectures describing grid applications, middleware,
infrastructure, and security
Designing macro level organization and technology solutions
Development of physical models for applications, grid middleware, grid
infrastructure, and grid management tools
Conducting static testing
Training and user support definition
High level implementation planning
Positioned at the mid-point of the life cycle of grid services, following 8.2.2, “Grid
Strategy Engagement” on page 125 and followed by 8.2.4, “Grid Software
Implementation Services” on page 127, Grid Design Services provides a way of
linking the grid infrastructure to other aspects of the enterprise IT and business
organization at key design decision points, like systems management, security,
applications development, organizational change, testing, and business
continuity.
Description
The phases of the service provide the steps necessary to implement a grid
environment, from the initial project planning, through piloting the environment,
and completing with the migration into operations.
Using this service allows a client to implement previously planned application
modifications and install a grid infrastructure.
Positioned at the end of the life cycle of grid services following 8.2.2, “Grid
Strategy Engagement” on page 125, and 8.2.3, “Grid Design Services” on
page 126, Grid Software Implementation Services provides the tools to
implement the grid environment achieved from the planning in previous services.
Analytics Acceleration Reduce time and effort required to analyze data; accelerate
product development, client insight applications, and
accelerate time-to-results
Design Collaboration Leverage idle hardware, speed data capture, and virtualize
disparate systems
Engineering Design Reduce time to market, leverage idle hardware, speed data
capture, and virtualize disparate systems
Description
This is an assessment offering. Its main goals are to predict the business value of
the investment in a grid environment, validate the proposed grid model,
determine the optimal grid blueprint, and monitor IT level indicators of financial
value to efficiently manage the infrastructure.
IBM Grid Value at Work assists in developing a grid deployment roadmap. The
roadmap includes components for total cost of ownership analysis, grid capacity
planning, and business value estimation. It provides the flexibility to focus on one
or more of these components to determine the appropriate investment levels for
grid computing.
Additionally, the IBM Grid Value at Work service helps identify the grid’s most
appropriate uses by analyzing the comprehensive business value of the
implementation across all deployment stages and quantifying the following
information:
Business value of improving the speed, accuracy, frequency, or throughput of
the analysis of results for business analytics
IT infrastructure cost savings and potential benefits from faster time to market
or improved design quality for research, engineering, and design
Cost savings from virtualization of the IT infrastructure and value derived from
exploiting existing, underutilized resources for IT infrastructure optimization
Cost savings from integrated information access and shared usage of IT
resources across departments and organizations
With Grid Value at Work, companies can establish strong business cases to
justify investment in a grid and help clients meet their business needs in a
more competitive world.
Description
Working with partners in its industry, IBM has built a grid offering that can deal
with the most demanding applications today. Clients can now reach a new level
of analysis with much more accurate results and yet with less investment in
hardware resources.
This offering is particularly optimized for compute-intensive tasks, mainly for the
financial sector. Its main characteristic is that it intends to use every kind of
hardware resource that the infrastructure provides: servers, clusters,
mainframes, and desktops.
Figure 8-2 on page 132 provides an overview of a typical architecture for the
Analytics Acceleration offering.
[Figure 8-2: typical Analytics Acceleration architecture, showing Avaki / DataSynapse, resource pools, unstructured data, and structured data]
Description
Most automobile and aerospace manufacturers use a leading edge design tool
that has many complex features, but some partners may not want to purchase
and maintain multiple design tools for different projects. Using another tool might
require additional testing and analysis by design partners and the original
equipment manufacturer, adding time and cost to the project. Also, most
The Engineering Design offering aims to reduce time to market, leverage idle
hardware, speed data capture, and virtualize disparate systems.
Description
Engineering Design uses the latest grid technologies, including the Open Grid
Services Architecture (OGSA), which merges the open protocols used for grid
computing with the protocols used for Web services. It also provides the latest
software from leading grid software companies, such as Platform Computing,
and the latest open source grid technologies from The Globus Alliance.
Figure 8-3 on page 134 provides an overview of a typical architecture for the
Engineering Design offering.
Figure 8-4 on page 136 provides an overview of a typical architecture for the
Geophysical Processing and Analysis offering.
Description
IBM has scalable tools and systems to help life sciences organizations and
governments benefit from a grid implementation. The offering can achieve a
seamless integration of data and resources and leverage existing IT
environments to provide centralized data access from multiple data sources.
8.3.7 IT Optimization
In a global economy, businesses increasingly rely on IT infrastructures that drive
business strategy, evolution, and success. Streamlining these components and
the supporting organizational processes is an essential yet complex undertaking,
which can be accomplished through grid computing.
Description
Grid computing can enable consolidation of applications, networks, and other
data and can help reduce tension between existing systems and politics between
departments. This new approach can help organizations share resources and
improve resource management. Grid computing can and should be used to
optimize existing traditional IT environments, leveraging management,
monitoring, and automated provisioning to build an on demand computing
environment. Using the tools and benefits provided by grid computing, clients
can raise their existing IT environment to a more productive level, increasing
the return on the investment in these IT environments.
Description
Grid computing can help these organizations manage large-scale data sharing.
This new approach helps maximize the use of existing data resources and
makes both structured and unstructured data available across a department,
organization, or consortium. In addition, grid computing can help researchers
secure data access and optimize storage; plus, IBM has scalable tools and
systems to help research organizations and universities benefit from a grid
The Grid Clash Analysis offering helps to optimize and accelerate clash analysis.
Description
This offering is designed to increase the productivity of existing clash
analysis environments. Typically, clash analysis is a very time-consuming part of
the design process for products such as cars, aircraft, and other equipment. Using
grid, IBM can provide a noticeable improvement to this process, providing
increasing advantages to its clients.
8.3.10 Summary
Figure 8-3 on page 140 presents the availability of the above offerings by
industry. Some of the offerings are customized for specific industries and
others are in fact cross-industry.
[Figure 8-3: Availability of the offerings by industry. Recoverable labels: Engineering Design, Research Collaboration, Clash Analysis; legend: Offering Available.]
For a complete list of the grid computing offerings, refer to the following Web site:
http://www.ibm.com/grid/solutions/index.shtml
These solutions and offerings should focus on the business need, take a
solution-independent approach, and aim to deliver the best business value
throughout the project’s life.
Part 5 Appendixes
Globus Alliance
The Globus Alliance conducts research and development to create fundamental
technologies for grid computing. The alliance is formed by a group of sponsors
and collaborators from around the world. The core team is based at the Argonne
National Laboratory and other worldwide institutions (see
http://www.globus.org/about/team.html for more information).
The Globus Toolkit is being developed by the Globus Alliance and many others
all over the world. For more information, please refer to the following Web site:
http://www.globus.org/
W3C
The World Wide Web Consortium (W3C) is an international consortium that
develops specifications and guidelines for Web technologies. For more
information, please refer to the following Web site:
http://www.w3.org/
TeraGrid
TeraGrid is an initiative to build and deploy the world's largest distributed
infrastructure for open scientific research. It combines three programs of the
National Science Foundation (NSF) Terascale initiative: Terascale Computing
System (TCS®), Distributed Terascale Facility (DTF), and Extensible Terascale
Facility (ETF). For more information, please refer to the following Web sites:
http://www.teragrid.org/
http://www.nsf.gov/
But beyond business concerns, grid computing also provides a way for
volunteers to donate unused processing time on their computers, to solve large
and complex problems for the benefit of humanity. Discovering treatments for
killer viruses, delivering more accurate forecasts of looming natural disasters,
and helping researchers identify cures for widespread diseases are only a few
of the ways individual computers could be used in this scenario.
Being firmly convinced that grid computing is poised for philanthropic initiatives
around the world, IBM is committed to providing the technical infrastructure
required to support a philanthropic world-scale grid: the World Community Grid.
The vision of this initiative is to create the world's largest public computing grid to
tackle carefully selected humanitarian projects. The name was chosen to
represent the endeavor's reach (worldwide), volunteer and scientific collaboration
(community), and technical underpinnings (grid computing).
The motivation behind the World Community Grid is leadership. IBM and a group
of leading philanthropic foundations will provide this leadership, crucial expertise,
resources, and guidance, and the philanthropic missions of these foundations will
be strengthened by the power of grid computing.
The official Web site of the World Community Grid can be found at:
http://www.worldcommunitygrid.org/
Technical features
To match expectations and provide a high quality of service, the World
Community Grid has the following properties:
Desktop grid model
Easy to volunteer
High-performance
Reliable
Scalable
Secure
Entertaining
Easy to volunteer
Previous large-audience PC grid projects have established that most volunteers
are driven, first, by the sense that they are contributing their resources to a good
cause and, second, by the ease, both functional and financial, of contributing the
PC's power.
High-performance
Considering the complexity of the problems to be submitted by the research
community, and the huge processing power their resolution implies, the World
Community Grid must be designed to:
Store very large amounts of data
Avoid bottlenecks and minimize latency time
Handle over one million contributing computers from launch
Reliable
The World Community Grid must be reliable in terms of:
Availability of the platform
Arrangements must be made to avoid any interruption of service (server
redundancy, 24x7 technical support).
Dependability of the returned results
Computing tasks should be submitted a number of times, to several
computers, and returned results compared with each other to make sure that
faulty computations are discarded.
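The redundancy check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the device identifiers, and the quorum of two matching results are hypothetical and are not taken from the World Community Grid software.

```python
from collections import Counter


def validate_results(results, quorum=2):
    """Return the agreed result for one task, or None if there is no quorum.

    results: list of (device_id, result) pairs returned by different computers.
    quorum:  hypothetical minimum number of identical results required.
    """
    if not results:
        return None
    # Count how many computers returned each distinct result.
    counts = Counter(result for _, result in results)
    value, votes = counts.most_common(1)[0]
    return value if votes >= quorum else None


def faulty_devices(results, agreed):
    """List the devices whose result differs from the agreed one."""
    return [device for device, result in results if result != agreed]
```

A task sent to three computers is accepted when at least two of them return the same value; the computation that disagrees is discarded, and the task can be resubmitted if no quorum is reached.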
Scalable
As a large part of the 650 million PCs worldwide is expected to participate in the
future, the World Community Grid must anticipate a far larger number of
contributors. As a corollary, such a computing power increase implies that the
World Community Grid will have the ability to handle several grid-enabled
research projects, simultaneously deployed on the same platform.
Secure
Of course, since the World Community Grid infrastructure is accessed from the
Internet, drastic additional security measures must be considered to prevent any
server from being compromised (firewall and other security devices).
Entertaining
Finally, the World Community Grid should provide potential contributors with
various collateral features to encourage participation, and to back up
communication efforts aimed at advertising the World Community Grid:
A few measures for the participants to evaluate their contribution:
– Total CPU time contributed
– Number of tasks completed
– An arbitrary system of points, based on the computing power of the device
and on the time spent computing results
A user-friendly interface for the agent, including a task execution progress bar
and the measures described above; the Agent should also come with a
screen saver displaying the computer’s World Community Grid-related
activity.
A Web site (http://www.worldcommunitygrid.org) to showcase the World
Community Grid and to:
– Spread the word about the World Community Grid and about the research
projects currently using it
– Register contributors and make the Agent available for download
– Provide additional entertaining features, some of them collaborative, such
as team creation, forums, global statistics, and so on
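The three contribution measures listed above (total CPU time, number of tasks, and points) could be tracked per participant as in the following sketch. The text only says that the point system is arbitrary and depends on the computing power of the device and the time spent computing, so the formula used here (CPU seconds multiplied by a relative power factor) and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """Running totals for one participant (hypothetical structure)."""
    cpu_seconds: float = 0.0
    tasks_completed: int = 0
    points: float = 0.0

    def record_task(self, cpu_seconds, device_power):
        """Add one completed task.

        device_power is an assumed relative benchmark score of the device
        (1.0 for a reference PC); points grow with both power and time.
        """
        self.cpu_seconds += cpu_seconds
        self.tasks_completed += 1
        self.points += cpu_seconds * device_power
```

With this sketch, one hour of computing on a device rated 1.5 earns more points than one hour on the reference device, which is the behavior the text describes.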
Funding
The World Community Grid will initially be sponsored by IBM and various
foundations invited to join the effort, to cover the following activities:
Infrastructure hosting with IBM Global Services, including all necessary
hardware and software
The development and ongoing maintenance of the World Community Grid
Web site
Application preparation and loading of the individual research projects onto
the grid
Ongoing support of the World Community Grid community through forums,
online content, and e-mail
Public relations and marketing activities to attract World Community Grid
volunteers and research projects
Technical resources
These resources include hardware, software, and qualified people to operate
them, as developed in the previous section.
IBM's current guidelines for donations will serve as a starting point in determining
the organizations that are eligible to apply for the computational power.
[Figure: World Community Grid architecture. Recoverable labels: Internet Connected World Community Grid Volunteers (Desktop PC, Laptop computer, Engineering Workstation); Data; World Community Grid Server Infrastructure (Grid Servers, Application Servers, Database Servers).]
To handle the heavy load expected once several million people are aware of the
World Community Grid and start to use it frequently, the handling of static and
dynamic pages was split between two groups of dedicated servers. The static
pages, served by the front-end Web servers, contain information about the World
Community Grid and the research projects running on it, FAQs, troubleshooting
tips, legal notices, and so on. The dynamic pages are served by the back-end
application servers and deal mainly with user-oriented services, such as global,
per-project, and individual statistics, member authentication, forums, and so on.
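As a rough illustration of this split, the following Python sketch routes a request path to one of the two server groups. The section names and the group labels are hypothetical; the actual URL layout of the World Community Grid Web site is not described in this redbook.

```python
# Hypothetical site sections that need per-user or computed content.
DYNAMIC_SECTIONS = {"stats", "login", "member", "forums"}


def server_group(path):
    """Pick the server group that should answer a request path.

    Dynamic, user-oriented sections go to the back-end application servers;
    everything else (project information, FAQs, legal notices, and so on)
    is treated as static and served by the front-end Web servers.
    """
    section = path.strip("/").split("/")[0]
    if section in DYNAMIC_SECTIONS:
        return "application-servers"
    return "web-servers"
```

Defaulting to the static group keeps the application servers reserved for the requests that really need them.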
For security reasons, the grid software running on these servers will also keep
track of the submitted tasks so that only computing resources identified as valid
devices registered through the agent interface are allowed to return a result.
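A minimal sketch of this bookkeeping is shown below. It assumes that every device has an identifier obtained at registration and that the server remembers which devices were handed each task. The class and method names are hypothetical and do not reflect the grid software actually used.

```python
class TaskTracker:
    """Tracks submitted tasks so only valid devices may return results."""

    def __init__(self):
        self.registered = set()   # device ids registered through the agent
        self.assignments = {}     # task id -> set of device ids given the task

    def register_device(self, device_id):
        self.registered.add(device_id)

    def assign(self, task_id, device_id):
        """Record that a registered device was given a task."""
        if device_id not in self.registered:
            raise PermissionError("unregistered device: " + device_id)
        self.assignments.setdefault(task_id, set()).add(device_id)

    def accept_result(self, task_id, device_id):
        """Return True only for a registered device that holds this task."""
        pending = self.assignments.get(task_id, set())
        allowed = device_id in self.registered and device_id in pending
        if allowed:
            pending.discard(device_id)  # each assignment yields one result
        return allowed
```

A result from an unknown device, or a second result for the same assignment, is rejected.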
In order to balance the load and minimize the risks of server failure, and because
this data is accessed from distinct servers, the decision was made to distribute
the data over two platforms, according to their type.
Products used
In order to complete the design of the World Community Grid, hardware and
software products have to be chosen for each architectural component.
For reasons of cost and maintenance, the selected servers were IBM eServer
xSeries, with Linux as the operating system.
For the reasons listed above, these servers should be IBM eServer xSeries running
Linux as well.
MSS offers flexible and scalable storage networking solutions for midrange
server environments. Storage capacity "on demand," backup/restore, and
management services are available as fee-based services. Hosted storage
resources receive the benefits of Storage Area Network (SAN) and Network
Attached Storage (NAS) technologies as part of an integrated solution.
In terms of software, the grid infrastructure product that would be part of the
World Community Grid had to demonstrate sizeable client counts and satisfy as
many of the following criteria as possible:
General considerations
– Good level of maturity, especially on open Internet usage
Project-related considerations
– Ability to run multiple applications concurrently, depending on the
contributor preferences
– Contributed resource evaluation system included (to keep track of the
points)
– Collaborative features provided (teams)
Agent features
– Non-intrusive technology
– Easy installation and configuration of software
– Low disk space, memory, and bandwidth required
– Ability to communicate through a firewall
– Screensaver
– Automatic update of the client software
Security issues
– Protection against spoofing and man-in-the-middle attacks
– Communication secured by mutual handshake and encryption
More information
To conclude this presentation of the World Community Grid, we thought that it
would be relevant to consider some aspects of the project from a contributor
standpoint, in particular the visible part of the World Community Grid as it is
today.
www.worldcommunitygrid.org
Below are a few screen captures of the actual World Community Grid Web
site; this is solely intended as a glimpse of what kind of information should be
provided to encourage potential participants, and to illustrate some of the
previously discussed features.
The World Community Grid home page (Figure B-7 on page 164) gives quick
access to the main sections of the Web site, and underlines the message the
World Community Grid is conveying: “You can help change the world”.
The World Community Grid Web site provides visitors with global in-depth
statistics on the community contribution. These stats can be consolidated by
various criteria, in particular by projects (see Figure B-8 on page 165).
The World Community Grid Web site has a member area, for registered
contributors to measure their individual contribution and manage their devices.
In terms of measures, the goal is to enlist two million computers in the World
Community Grid in the first year, running up to six applications, increasing that to
eight million computers in the third year with more applications.
Products covered in this redbook support some, all, or none of the functions
described in 1.2.2, “Functionality on grid computing” on page 16. The figures
presented in this appendix illustrate our understanding of the functions each
product can provide in a grid environment. They are not a prescription, just a
big picture of each product.
The width of each bar (0 to 3) reflects our perception of the importance of the
product in a grid computing environment. Figures are to be read as follows:
0: The product has no features of this discipline.
1: The product has some features of this discipline.
2: The product has most of the features of this discipline.
3: The product was developed and designed to completely cover the
discipline.
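For readers who want to record these ratings programmatically, the scale can be modeled as a small enumeration, as in the sketch below. The discipline names come from the figures in this appendix. The type name, the helper function, and any ratings passed to it are hypothetical and are not values taken from this redbook.

```python
from enum import IntEnum


class Coverage(IntEnum):
    """The 0 to 3 scale used in the figures of this appendix."""
    NONE = 0      # no features of this discipline
    SOME = 1      # some features of this discipline
    MOST = 2      # most of the features of this discipline
    COMPLETE = 3  # designed to completely cover the discipline


DISCIPLINES = (
    "Web Services Core/Hosting",
    "Workload Virtualization",
    "Information Virtualization",
    "Other Disciplines",
)


def strongest_disciplines(ratings):
    """Return the disciplines in which a product has its highest rating."""
    best = max(ratings.values())
    return [d for d in DISCIPLINES if ratings.get(d) == best]
```

Because the enumeration is integer-based, ratings can be compared and sorted directly.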
Important: The following Web site provides a search capability to access all
IBM announcement letters, part numbers, sales manual, press releases,
Redbooks, white papers, and catalogs:
http://www.ibm.com/common/ssi
WebSphere MQ IBM
Workload virtualization
Workload virtualization products presented in this appendix are listed in
Table C-2.
LoadLeveler IBM
Information virtualization
Information virtualization products presented in this appendix are listed in
Table C-3.
Products for security that are presented in this appendix are listed in Table C-5.
The products for monitoring that are presented in this appendix are listed in
Table C-6.
The products for licensing that are presented in this appendix are listed in
Table C-7.
Figure C-1 and Table C-8 on page 174 present the importance of DB2 Content
Manager in the grid computing environment.
[Figure C-1: DB2 Content Manager rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
GPFS
IBM General Parallel File System (GPFS) is a file system designed to have high
performance and high availability. Originally developed for RS/6000 SP clusters,
it has been ported to Linux and is currently used as one of the first choices in grid
solutions for distributed data storage.
Figure C-2 on page 175 and Table C-9 on page 175 present the importance of
GPFS in the grid computing environment.
[Figure C-2: GPFS rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-3 and Table C-10 on page 177 present the importance of the IBM
LoadLeveler in the grid computing environment.
[Figure C-3: IBM LoadLeveler rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-4 on page 178 and Table C-11 on page 178 present the importance of
Tivoli Access Manager in the grid computing environment.
[Figure C-4: Tivoli Access Manager rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-5 and Table C-12 present the importance of Tivoli Enterprise Console
in the grid computing environment.
[Figure C-5: Tivoli Enterprise Console rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-6 and Table C-13 on page 181 present the importance of Tivoli
Intelligent Orchestrator in the grid computing environment.
[Figure C-6: Tivoli Intelligent Orchestrator rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-7 on page 182 and Table C-14 on page 182 present the importance of
Tivoli License Manager in the grid computing environment.
[Figure C-7: Tivoli License Manager rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-8 and Table C-15 on page 184 present the importance of Tivoli
Management Framework in the grid computing environment.
[Figure C-8: Tivoli Management Framework rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Tivoli Monitoring
IBM Tivoli Monitoring provides the means to monitor several IT infrastructure
resources. It detects bottlenecks and potential problems. It also provides
automated best practices out of the box that proactively monitor critical hardware
and software, including middleware, applications, and databases.
Figure C-9 on page 185 and Table C-16 on page 185 present the importance of
Tivoli Monitoring in the grid computing environment.
[Figure C-9: Tivoli Monitoring rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-10 and Table C-17 present the importance of Tivoli NetView in the grid
computing environment.
[Figure C-10: Tivoli NetView rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-11 on page 188 and Table C-18 on page 188 present the importance of
Tivoli Provisioning Manager in the grid computing environment.
[Figure C-11: Tivoli Provisioning Manager rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-12 and Table C-19 on page 190 present the importance of Tivoli SAN
FS in the grid computing environment.
[Figure C-12: Tivoli SAN FS rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-13 on page 191 and Table C-20 on page 191 present the importance of
Tivoli Service Level Advisor in the grid computing environment.
[Figure C-13: Tivoli Service Level Advisor rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-14 and Table C-21 on page 193 present the importance of Tivoli
Workload Scheduler in the grid computing environment.
[Figure C-14: Tivoli Workload Scheduler rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-15 on page 194 and Table C-22 on page 194 present the importance of
TotalStorage SAN Volume Controller in the grid computing environment.
[Figure C-15: TotalStorage SAN Volume Controller rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-16 and Table C-23 on page 196 present the importance of
Virtualization Engine in the grid computing environment.
[Figure C-16: Virtualization Engine rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Web Services Core/Hosting: Has the IBM Grid Toolbox and is complemented by the
WebSphere family of products.
Information virtualization: Implements some functionality through IBM Grid
Toolbox and can be complemented by WebSphere Information Integrator.
Figure C-17 on page 197 and Table C-24 on page 197 present the importance of
WebSphere Application Server in the grid computing environment.
[Figure C-17: WebSphere Application Server rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Other Disciplines: Flexible management and security model with policies and
authentication.
Figure C-18 and Table C-25 present the importance of WebSphere Extended
Deployment in the grid computing environment.
[Figure: WebSphere Extended Deployment rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-19 on page 200 and Table C-26 on page 200 present the importance of
WebSphere Information Integrator in the grid computing environment.
[Figure C-19: WebSphere Information Integrator rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-20 and Table C-27 on page 202 present the importance of WebSphere
MQ in the grid computing environment.
[Figure C-20: WebSphere MQ rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
Figure C-21 on page 203 and Table C-28 on page 203 present the importance of
WebSphere Portal Server in the grid computing environment.
[Figure C-21: WebSphere Portal Server rated on Web Services Core/Hosting, Workload Virtualization, Information Virtualization, and Other Disciplines]
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this redbook.
IBM Redbooks
For information about ordering these publications, see “How to get IBM Redbooks”
on page 216. Note that some of the documents referenced here may be available
in softcopy only.
Fundamentals of Grid Computing, REDP-3613
Globus Toolkit 3.0 Quick Start, REDP-3697
Grid Computing with the IBM Grid Toolbox, SG24-6332
Grid Computing on Research and Education, SG24-6649
Grid Services Programming and Application Enablement, SG24-6100
Introduction to Grid Computing with Globus, SG24-6895
Other publications
These publications are also relevant as further information sources:
Abbas, Grid Computing: A Practical Guide to Technology and Applications,
Charles River Media, 2003, ISBN 1584502762
Berman, et al (editors), Grid Computing: Making The Global Infrastructure a
Reality, Wiley, John & Sons, Incorporated, 2003, ISBN 0470853190
Cockburn, Writing Effective Use Cases, Pearson Education, 2000, ISBN
0201702258
Foster, et al (editors), The Grid 2: Blueprint for a New Computing
Infrastructure, Morgan Kaufmann Publishers, 2003, ISBN 1558609334
“GPFS: A Shared-Disk File System for Large Computing Clusters” by Frank
Schmuck, Roger Haskin in Proceedings of the Conference on File and
Storage Technologies, 2002
Joseph, et al, Grid Computing, Prentice Hall Professional Technical
Reference, 2003, ISBN 0131456601
Glossary
API. Application Programming Interface.
CIFS. Common Internet File System, the protocol formerly known as Server
Message Block (SMB) and, before that, as the BAF protocol (after its original
creator, Dr. Barry Feigenbaum). CIFS is a protocol for file and device sharing
across a network.
EJB. Enterprise Java Beans, a set of JavaBeans™, currently being developed by
Sun to support development of distributed applications.
FTP. File Transfer Protocol, a standard Internet protocol to exchange files on
top of TCP/IP.
GCC. GNU Compiler Collection, which currently contains front ends for C, C++,
Objective-C, Fortran, Java, and Ada, as well as libraries for these languages
(libstdc++, libgcj, and so on).
Globus MDS. Monitoring and Discovery System.
Globus MJS. Managed Job Service.
Globus RFT. Reliable File Transfer.
Globus UHE. User Hosting Environment.
GSI. The Grid Security Infrastructure contains components to secure your grid
network.
HPC. High Performance Computing consists of computing applications on
(parallel) supercomputers and computer clusters.
HTC. High Throughput Computing.
HTML. Hypertext Markup Language, which is a document type consisting of text
and tags.
LDAP. Lightweight Directory Access Protocol. IETF term for a TCP/IP-based
network protocol that enables access to a Directory System Agent.
MMJFS. Master Managed Job Factory Service, which is a job manager that provides
an interface for requesting and using remote resources for the jobs.
MVC. Model View Controller.
NAS. Network Attached Storage. NAS is storage in a form that is readily
accessible on a network, as opposed to direct-attached storage.
ODBC. Open Database Connectivity. Open standard developed by Microsoft aimed at
providing a consistent interface for relational databases, regardless of which
database engine is used (for example, SQL Server, Oracle, or MySQL).
OGSA. The Open Grid Services Architecture is a standard that sets the basis for
communication in grids across virtual organizations. OGSA marries open standards
and grid computing protocols with Web Services, bringing together the ability to
share computing resources with the ability to provide application
interoperability over the Internet.
OGSI. Open Grid Services Infrastructure. A minimal set of grid services and
supporting infrastructure on top of which OGSA systems and applications can be
built or extended.
P2G. Peer to Group.
Peer to Peer. A network that does not rely on dedicated servers for
communication but instead mostly uses direct connections between clients
(peers). A pure peer-to-peer network does not have the notion of clients or
servers, but only equal peer nodes that simultaneously function as both
"clients" and "servers" to the other nodes on the network.
SDK. Software Development Kit. A programming package that enables a programmer
to develop applications for a specific platform.
SMB. Server Message Block. A file sharing protocol used in DOS, OS/2®, and early
versions of Windows.
SOA. Service-Oriented Architecture is an architecture that represents software
functionality as discoverable services on the network.
SOAP. Simple Object Access Protocol. A method invented by Microsoft to use RPC
over the Internet via HTTP calls.
Back cover
Grid computing functionality
Products for grid computing
Services and solutions
This IBM Redbook is part of a series of documents related to grid computing that
IBM is presenting to the IT community to enrich it and all its players: clients,
industry leaders, emerging enterprises, universities, and producers of
technology. It is mainly oriented to IT architects or those who have the
responsibility of analyzing the capabilities of the product used to build a grid
solution.
We describe the functions of grid computing and the categorizations of the
components within it. IBM, ISVs, and open source products will be explained.
Some of these products are not specific for grid, but they can be used in a grid
computing environment. We hope that this redbook helps you select functions and
products and shows you how grid can fit into your IT picture as new products and
services for grid are introduced.
Part 1 introduces the concept of grid computing and provides the terminology the
readers will use in this redbook.
Part 2 explores the functionality associated with certain products and product
families. It also presents some examples of the utilization of these products in
the grid computing environment.
Part 3 presents grid computing product families that can be used to build a
complete grid solution. These bundles, or suites, often implement most of the
core grid disciplines.
INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION
BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE
IBM Redbooks are developed by the IBM International Technical Support
Organization. Experts from IBM, Customers and Partners from around the world
create timely technical information based on realistic scenarios. Specific
recommendations are provided to help you implement IT solutions more effectively
in your environment.