Grid Computing: An Introduction
Outline
Introduction to Grid Computing
Methods of Grid computing
Grid Middleware
Grid Architecture
Grid Applications
Related topics on Grid
Grid Computing
Grid computing is a form of distributed computing whereby a "super and
virtual computer" is composed of a cluster of networked, loosely coupled
computers, acting in concert to perform very large tasks.
Benefits
Exploit underutilized resources
Resource load balancing
Virtualize resources across an enterprise
Data Grids, Compute Grids
Enable collaboration for virtual organizations
Who can use grid computing
Governments and international organizations
The military
Businesses
Grid Applications
Data and computationally intensive applications:
This technology has been applied to computationally intensive scientific,
mathematical, and academic problems such as drug discovery, economic
forecasting, and seismic analysis, and to back-office data processing in
support of e-commerce.
A chemist may utilize hundreds of processors to screen thousands of
compounds per hour.
Teams of engineers worldwide pool resources to analyze terabytes of
structural data.
Meteorologists seek to visualize and analyze petabytes of climate data
with enormous computational demands.
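The compound-screening example can be sketched as a parallel map over independent tasks. This is a minimal illustration, not a grid implementation: local threads stand in for grid processors, and `score_compound`, `screen`, and the scoring formula are hypothetical names and placeholders introduced here.

```python
from concurrent.futures import ThreadPoolExecutor


def score_compound(compound_id):
    """Hypothetical stand-in for an expensive screening calculation (0 to 100)."""
    return compound_id * 37 % 101


def screen(compound_ids, workers=8, threshold=90):
    """Score compounds concurrently; return the ids scoring at or above threshold."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs
        scores = list(pool.map(score_compound, compound_ids))
    return [cid for cid, score in zip(compound_ids, scores) if score >= threshold]


# Screen 1000 placeholder compound ids using 8 workers
hits = screen(list(range(1000)))
```

Because each compound is scored independently, adding workers raises throughput without changing the result, which is why this kind of workload suits a grid.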
Resource sharing
Computers, storage, sensors, networks, …
Sharing always conditional: issues of trust, policy, negotiation,
payment, …
Coordinated problem solving
distributed data analysis, computation, collaboration, …
Grid Computing Applications
One of the most tantalizing applications of radio
astronomy is the observation of radio signals as
part of the Search for Extraterrestrial Intelligence
(SETI).
The vast amount of computing capacity required
for SETI radio signal processing has led to a
unique grid computing concept that has now been
expanded to many applications.
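The idea of spreading signal processing over many machines can be sketched as splitting the data into independent work units, analyzing each unit separately, and merging the results. The function names, the sample values, and the "strongest sample" analysis below are illustrative assumptions, not the actual SETI processing.

```python
def make_work_units(samples, unit_size):
    """Split a recorded signal into fixed-size chunks that can be processed independently."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyze(unit):
    """Hypothetical per-unit analysis: report the strongest sample in the unit."""
    return max(unit)


def merge(results):
    """Combine the per-unit results into one overall answer."""
    return max(results)


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]   # placeholder data
units = make_work_units(signal, 4)         # each unit could go to a different machine
peak = merge(analyze(u) for u in units)
```

Here the units are processed in a loop on one machine; on a grid, each call to `analyze` would run on whichever participating computer is idle.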
Grid Topologies
• Intragrid
– Local grid within an organisation
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organisations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the internet
– Trust based on certification
TYPES OF GRID
• Computational Grid
• Scavenging Grid
• Data Grid
Computational Grid
Distributed Supercomputing
High-Throughput Computing
On-Demand Computing
Data-Intensive Computing
Collaborative Computing
Logistical Networking
Distributed Supercomputing
On-Demand Computing
Uses grid capabilities to meet short-term
requirements for resources that are not
locally accessible.
Models real-time computing demands.
Collaborative Computing
Concerned primarily with enabling and
enhancing human-to-human interactions.
Applications are often structured in terms of a
virtual shared space.
Data-Intensive Computing
The focus is on synthesizing new information
from data that is maintained in geographically
distributed repositories, digital libraries, and
databases.
within VO environment.
Layered Grid Architecture
(By Analogy to Internet Architecture)
[Figure: layered Grid architecture shown alongside the Internet protocol architecture]
Application layer
Collective layer (“Coordinating multiple resources”): ubiquitous infrastructure services, app-specific distributed services
Slides for Grid Computing: Techniques and Applications by Barry Wilkinson, Chapman & Hall/CRC press, © 2009.
Chapter 1, pp 19-28. For educational use only. All rights reserved. Aug 24, 2009 1-2.38
Grid computing infrastructure
(middleware) software
Primary objective:
Grid computing infrastructure software
Key aspects include:
Secure envelope over all transactions
Single sign-on - being able to access all available
resources and run jobs without having to supply
additional passwords or account information.
Data management tools
Information services providing characteristics of
resources and their status (including dynamic load)
APIs and services that enable applications
themselves to take advantage of Grid platform
Convenient user interface
Globus Project
Open source software toolkit developed for
Grid computing.
One of the most influential projects
Roots in the I-WAY experiment.
Work started in 1996.
Four versions developed to present time.
Reference implementations of Grid computing
standards.
De facto standard for Grid computing.
Globus
A “toolkit” of services and packages for
creating the basic grid computing
infrastructure
Higher level tools added to this infrastructure
Version 4 is web-services based
Some non-web services code exists from
earlier versions (legacy) or where not
appropriate (for efficiency, etc.).
Some Globus toolkit versions
(approximate time line)
Security
Distributed resources must be protected from unauthorized
access.
GSI (Grid Security Infrastructure) -- Globus components for
creating the security envelope.
Requires each user to be authenticated (their identity proved).
Uses public key cryptography (basis of Internet security)
Each user must possess a so-called (digital) certificate, signed
by a trusted certificate authority.
Users will also need to be able to give their authority to Grid
components to act on their behalf.
Users generally will also need accounts on the resources they
intend to use (authorization).
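The role of a certificate authority's signature can be illustrated with textbook RSA on toy numbers. This is only a sketch of the public-key idea, assuming a tiny hypothetical key and an invented certificate string; it is not GSI, not X.509, and far too weak for real use.

```python
import hashlib

# Toy RSA key for a hypothetical certificate authority (illustration only)
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, known only to the CA


def digest(data):
    """Hash the certificate body and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def ca_sign(cert_body):
    """The CA signs the digest with its private exponent."""
    return pow(digest(cert_body), D, N)


def verify(cert_body, signature):
    """Anyone holding the CA's public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest(cert_body)


cert = b"subject=abw;issuer=ExampleCA"   # invented certificate contents
sig = ca_sign(cert)
ok = verify(cert, sig)
```

A resource that trusts the CA needs only the public values `N` and `E` to confirm that a user's certificate is genuine; the private exponent never leaves the CA.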
Resource Discovery
Basic Globus component called MDS (Monitoring
and Discovery System).
Users might access MDS to discover status of
compute resources. In practice, users often
know what resources are there but not dynamic
load.
MDS might be used by other Grid components
such as schedulers.
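How a scheduler might use an information service can be sketched with a small in-memory registry. Everything here is hypothetical (the `Registry` class, the `ResourceInfo` fields, the node names); it does not reproduce the MDS interface, only the idea of publishing resource status and querying it by load.

```python
from dataclasses import dataclass


@dataclass
class ResourceInfo:
    name: str
    cpus: int
    load: float   # fraction of capacity in use, 0.0 to 1.0


class Registry:
    """Hypothetical information service holding the latest status of each resource."""

    def __init__(self):
        self._resources = {}

    def publish(self, info):
        """A resource reports (or updates) its characteristics and current load."""
        self._resources[info.name] = info

    def least_loaded(self, min_cpus=1):
        """Return the least-loaded resource with enough CPUs, or None if none qualifies."""
        candidates = [r for r in self._resources.values() if r.cpus >= min_cpus]
        return min(candidates, key=lambda r: r.load, default=None)


registry = Registry()
registry.publish(ResourceInfo("node-a", cpus=8, load=0.75))
registry.publish(ResourceInfo("node-b", cpus=16, load=0.20))
registry.publish(ResourceInfo("node-c", cpus=4, load=0.05))

choice = registry.least_loaded(min_cpus=8)   # node-c is idle but too small
```

The static part (names, CPU counts) is what users usually already know; the dynamic `load` value is what makes a discovery service useful to a scheduler.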
Executing a Job
Next, the user would typically want to submit a job.
Command-line interface
Grid computing environments mostly Linux-based and originally
accessed through a command line.
Once you have established your security credentials, to run a
simple job you might issue GRAM command:
GridFTP command to transfer files
globus-url-copy \
gsiftp://www.coitgrid02.uncc.edu/~abw/prog1out \
file:///home/abw/
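A script that issues such transfers might assemble the command as an argument list first. The helper below is a hypothetical sketch: it only builds the source-then-destination form shown above, accepts only the two URL schemes that appear there, and does not run anything.

```python
def build_copy_command(source_url, dest_url):
    """Build the argument list for a source-to-destination globus-url-copy call."""
    # Assumption for this sketch: accept only the schemes used in the slide example
    allowed = ("gsiftp://", "file://")
    for url in (source_url, dest_url):
        if not url.startswith(allowed):
            raise ValueError("unsupported URL in this sketch: " + url)
    return ["globus-url-copy", source_url, dest_url]


command = build_copy_command(
    "gsiftp://www.coitgrid02.uncc.edu/~abw/prog1out",
    "file:///home/abw/",
)
```

The resulting list could be handed to `subprocess.run(command, check=True)` on a machine where the Globus client tools and valid credentials are present.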
Fig. 1.7
Before users can log on, they need a user name and
password for portal.
They must have user “credentials” and accounts on
the resources they wish to access.
In UNC–Charlotte course portal, PURSe (Portal-
based User Registration Service) portlet used to
facilitate setup procedures.
Reached by selecting “Register” tab.
User enters required information (name, email
address, institution, etc.) which is forwarded to Grid
system administrator to set up accounts and
credentials.
PURSe
registration
portlet
Fig. 1.8
Registration activities
Proxies
To use many services, you are required to have a
proxy certificate (a proxy).
Proxies are part of Grid security infrastructure,
discussed later in course.
Proxy is an electronic document that enables
resources to be accessed on user’s behalf.
Very convenient to use a credential management
service called MyProxy to hold proxies.
Usually, GridSphere automatically obtains a proxy
from the MyProxy server for you when you log in.
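The idea of a proxy as a short-lived credential acting on the user's behalf can be sketched as follows. The `Credential` record, the 12-hour lifetime, and the names are assumptions made for illustration; real proxies are signed certificates, and the signature check is omitted here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: str
    not_after: datetime   # expiry time


def make_proxy(user_cred, now, lifetime=timedelta(hours=12)):
    """Derive a short-lived proxy that never outlives the user's own credential."""
    expiry = min(now + lifetime, user_cred.not_after)
    return Credential(subject=user_cred.subject + "/proxy",
                      issuer=user_cred.subject,
                      not_after=expiry)


def is_usable(proxy, user_cred, now):
    """A resource accepts the proxy only if the user issued it and it has not expired."""
    return proxy.issuer == user_cred.subject and now < proxy.not_after


now = datetime(2009, 8, 24, 9, 0)
user = Credential("abw", "ExampleCA", datetime(2010, 1, 1))
proxy = make_proxy(user, now)
```

The short lifetime limits the damage if a proxy is stolen, which is why it is acceptable to leave one with a service that acts for the user.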
Proxy management tab
File management tab
Batch job submission tab
CONCLUSION
Grid computing introduces a new concept to IT
infrastructures because it supports distributed
computing over a network of heterogeneous
resources and is enabled by open standards.