Introduction To Grid Computing

Grid computing involves pooling distributed computing resources like processors, storage, and bandwidth to solve large-scale computing problems. It allows sharing of these resources across organizational boundaries. Key aspects of grid computing include resource sharing, coordinated problem solving in virtual organizations, and dynamic and flexible resource allocation. Standards like Open Grid Services Architecture provide common interfaces and services to enable grid applications and resource sharing.
Introduction to Grid Computing

 The term Grid comes from an analogy to the
Electric Grid.
– Pervasive access to power.
– Similarly, Grid will provide pervasive, consistent, and
inexpensive access to advanced computational
resources.
 Grid computing is all about achieving greater
performance and throughput by pooling resources
on a local, national, or international level.
Scalable Computing
[Figure: performance + QoS (vertical axis) grows as computing scales across administrative barriers (individual, group, department, campus, state, national, globe), from personal device through SMPs or supercomputers, local cluster, enterprise cluster/grid, and global grid to inter-planet grid.]
GRID Computing

 Grids are about large-scale resource sharing.
– Spanning administrative boundaries.
 Central processors, storage, network bandwidth, databases,
applications, sensors and so on
 Problem solving in dynamic, multi-institutional environment.
 Organizing geographically distributed computing resources
– So that they can be flexibly and dynamically allocated and
accessed

 Providing such capabilities where sharing is highly
controlled, with clear definitions of exactly what is shared, who
is allowed to share, and the conditions under which sharing
occurs.
Elements of Grid Computing

 Resource sharing
– Computers, data, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation,
payment, …
 Coordinated problem solving
– Beyond client-server: distributed data analysis, computation,
collaboration, …
 Dynamic, multi-institutional virtual organizations
– Community overlays on classic org structures
– Large or small, static or dynamic
Virtual Organizations
 A set of individuals and/or institutions defined by a set of
sharing rules
 The sharing is highly controlled, with resource providers
and consumers defining clearly and carefully just what is
shared
An example: the set of application service providers, storage
service providers, cycle providers and consultants engaged
by a car manufacturer to plan for a new factory
Another example: industrial consortium building a new aircraft
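The idea of a VO as a set of sharing rules can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, the `max_cpu_hours` condition, and the member and resource names are all hypothetical and do not come from any grid toolkit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharingRule:
    resource: str        # what is shared
    member: str          # who is allowed to share it
    max_cpu_hours: int   # condition under which sharing occurs (hypothetical)


@dataclass
class VirtualOrganization:
    name: str
    rules: list = field(default_factory=list)

    def may_use(self, member: str, resource: str, cpu_hours: int) -> bool:
        """Allow access only if some rule covers this member, resource, and amount."""
        return any(
            r.member == member
            and r.resource == resource
            and cpu_hours <= r.max_cpu_hours
            for r in self.rules
        )


# Hypothetical VO loosely modelled on the car-manufacturer example.
vo = VirtualOrganization("factory-planning")
vo.rules.append(SharingRule("cycle-provider-cluster", "consultant-a", 100))
```

A request is granted only when an explicit rule covers it, which reflects the point that sharing is always conditional.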
More Formal Definition of Grids

 A grid is a system that:
– Coordinates resource sharing in a de-centralized manner (i.e.,
different VOs).
– Uses standard, open, general purpose protocols and interfaces.
– Delivers non-trivial qualities of service.
 Guaranteed bandwidth for application.
 Guaranteed CPU cycles.
 Guaranteed latency.
Computational Grid Applications

 Biomedical research
 Industrial research
 Engineering research
 Studies in Physics and Chemistry
Science Today is a Team Sport!!

I. Foster
eScience
eScience [n]: Large-scale science carried out
through distributed collaborations—often
leveraging access to large-scale data &
computing

I. Foster
TeraGrid is an important project developed by
the National Science Foundation (NSF).

Slide obtained from B. Wilkinson, http://sol.cs.wcu.edu/~abw/CS493F04/


TeraGrid

Slide obtained from B. Wilkinson, http://sol.cs.wcu.edu/~abw/CS493F04/


UK e-Science Grid

Slide obtained from B. Wilkinson, http://sol.cs.wcu.edu/~abw/CS493F04/


Applications

 National Virtual Observatory
– Astronomical surveys produce terabytes of data.
– Data sets will cover sky in different wave bands (x-rays,
optical, infrared, radio).
– Challenge is to make this accessible to general
research community.
 Heterogeneous data producers and consumers.
– Resources in this Grid are data sets rather than
compute engines.
High-Energy Physics

 Large-scale collaborations for CERN’s Large Hadron
Collider.
 Involves 4000 physicists, 150 institutions, in more than 30
countries.
 Data sets now at petabyte level. Predicted to generate data
at the exabyte level in this decade.
 Challenges:
– Providing rapid access to subsets of data.
– Secure access to distributed computing and data handling
resources.
 Essentially, provide a distributed collaborative
infrastructure that will allow physicists from around the globe
to effectively analyze results from their home institutions.
Online Access to
Scientific Instruments
[Figure: Advanced Photon Source pipeline: real-time collection, tomographic reconstruction, archival storage, and wide-area dissemination to desktop & VR clients with shared controls.]
DOE X-ray grand challenge: ANL, USC/ISI, NIST, U.Chicago
NSF Network for Earthquake Engineering Simulation
(NEES)
Transform our ability to carry out research vital to reducing
vulnerability to catastrophic earthquakes

I. Foster
NEES

 network of 15 large-scale, experimental sites
 advanced tools such as shake tables, centrifuges
that simulate earthquake effects, unique
laboratories, a tsunami wave basin and field-
testing equipment.
 linked to a centralized data pool and earthquake
simulation software, bridged together by the high-
speed Internet2.
 allows off-site researchers to interact in real time with any
of the networked sites.
 Securely store, organize, and share data within a
standardized framework in a central location.
 Remotely observe and participate in experiments
through the use of synchronized real-time data
and video.
 Collaborate with colleagues to facilitate the
planning, performance, analysis, and publication
of research experiments.
 Conduct hybrid simulations that combine the
results of multiple distributed experiments and link
physical experiments with computer simulations.
DOE Earth System Grid
Goal: address technical obstacles to the sharing & analysis of
high-volume data from advanced earth system models

www.earthsystemgrid.org I. Foster
Earth System Grid I. Foster
 High-resolution, long-duration simulations performed with
advanced DOE climate models produce tens of petabytes
of output.
 This output is made available to global change impacts
researchers nationwide, at national laboratories,
universities, other research laboratories, and other
institutions.
 a virtual collaborative environment that links distributed
centers, users, models, and data.
 provides scientists with virtual proximity to the distributed
data and resources that they require to perform their
research.
Let's Play Virtual Organization!

 The members of this class represent a VO within the
university.
 The resources of the VO include:
– The laptops, workstations, and printers belonging to the individuals
of the VO (that’s you guys1!).
– Does this bring up any issues worth concerning yourself about?

1. I do not join virtual organizations


 Want to tightly control who may use these resources and
how they may be used. Thus need security.
 Security:
– Want to tightly control who may use these resources and how they
may be used.
 How about Larry and Sarah wanting to use your printer at
the same time (which happens to be 3:30 AM). Is this a
problem?
– Might want to have a scheduler, which in this case need not be
more sophisticated than turning off the printer.
 What if David forgot Dan’s IP address and cannot gain
access to his laptop? How could this be addressed
(assuming you want it addressed)?
– You could provide an information service that could tell David how
to find the laptop.
 You would also have to deal with allocating multiple
resources to a user, e.g., a laptop to write a paper and a
printer to print it out. Thus need a resource manager.
 Also need a way to monitor your application executing in
your VO Grid.
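Two of the services from this classroom exercise, the information service and the scheduler, can be sketched as below. This is a toy illustration, not how any grid middleware implements them: the class names, the first-come-first-served policy, and the address `192.0.2.17` (taken from a documentation-only IP range) are assumptions made for the example.

```python
class InformationService:
    """Maps resource names to network addresses so users need not remember them."""

    def __init__(self):
        self._directory = {}

    def register(self, name, address):
        self._directory[name] = address

    def lookup(self, name):
        return self._directory.get(name)  # None if the resource is unknown


class Scheduler:
    """Grants one user at a time exclusive use of a resource, first come first served."""

    def __init__(self):
        self._holder = {}    # resource -> user currently holding it
        self._waiting = {}   # resource -> queue of waiting users

    def request(self, resource, user):
        if resource not in self._holder:
            self._holder[resource] = user
            return True
        self._waiting.setdefault(resource, []).append(user)
        return False

    def release(self, resource):
        """Hand the resource to the next waiting user; return the new holder, if any."""
        queue = self._waiting.get(resource, [])
        if queue:
            self._holder[resource] = queue.pop(0)
        else:
            self._holder.pop(resource, None)
        return self._holder.get(resource)


info = InformationService()
info.register("dan-laptop", "192.0.2.17")

sched = Scheduler()
first = sched.request("printer", "Larry")    # Larry gets the printer
second = sched.request("printer", "Sarah")   # Sarah has to wait
next_user = sched.release("printer")         # Larry finishes, Sarah is next
```

David would call `info.lookup("dan-laptop")` instead of remembering the address, and Sarah's print job waits until Larry releases the printer.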
Grid Computing Software
Infrastructure
Open Grid Services Architecture

 Developed by the Global Grid Forum to define a
common, standard, and open architecture for
Grid-based applications.
– Provides a standard approach to all services on the Grid.
 VO Management Service.
 Resource discovery and management service:
 Job management service.
 Security services.
 Data management services.
 Built on top of and extends the Web Services
architecture, protocols, and interfaces.
A stateless Web Service invocation

Figure 1.11. A stateful Web Service invocation

Relationship between OGSA, WSRF, and
Web Services
WSRF

 Web Services Resource Framework
– a specification developed by OASIS.
– WSRF specifies how to make Web Services stateful.
– joint effort by the Grid and Web Services communities.
– WSRF provides the stateful services that OGSA needs.
– OGSA is the architecture; WSRF is the infrastructure on
which that architecture is built.
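The difference between a stateless and a stateful invocation can be illustrated with a small sketch. It is not the WSRF API: `AccumulatorService` and its method names are hypothetical. It only mirrors the general idea of keeping state in separately identified resources that each call names by key.

```python
import itertools


def add_stateless(a, b):
    """Stateless operation: the result depends only on the arguments."""
    return a + b


class AccumulatorService:
    """Stateful pattern: state lives in resources that each call names by key."""

    def __init__(self):
        self._resources = {}
        self._ids = itertools.count(1)

    def create_resource(self):
        key = next(self._ids)
        self._resources[key] = 0
        return key  # the client keeps this key and sends it with later calls

    def add(self, key, value):
        self._resources[key] += value
        return self._resources[key]

    def destroy_resource(self, key):
        del self._resources[key]


svc = AccumulatorService()
k1 = svc.create_resource()
k2 = svc.create_resource()
svc.add(k1, 5)
total = svc.add(k1, 3)   # remembers the earlier 5 for resource k1
other = svc.add(k2, 1)   # resource k2 has its own independent state
```

Calling `add_stateless(5, 3)` twice gives the same answer both times, whereas repeated `svc.add(k1, ...)` calls build on what the resource already holds.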
Standards Bodies
The primary standards-setting body is1:
 Global Grid Forum (GGF)
– Started in 1998
– Meets three times a year, GGF1, GGF2, GGF3 …
– More than 40 organizations involved and growing …

Others:
 W3C (World Wide Web Consortium)
– Working on standardization of web-related technologies such as XML
– See http://www.w3.org
 OASIS (Organization for the Advancement of Structured
Information Standards)
 IETF, DMTF

1 “The Grid Core Technologies” by M. Li and M. Baker, 2005, page 4.


Standards in the Web Services
World
 XML introduced (ratified) in 1998
 SOAP ratified in 2000
 Web services developed
 Subsequently, standards have been and are
continuing to be developed:
– WSDL
– WS-* where * refers to names of one of many standards
Standards in the grid computing
world
 Open Grid Services Architecture (OGSA)

 First announced at GGF4 in Feb 2002

 OGSA does not give details of
implementation.
Globus Project

 Open source software toolkit developed for grid
computing.
 Roots in I-way experiment.
 Work started in 1996.
 Four versions developed to present time.
 Reference implementations of grid computing
standards.
 De facto standard for grid computing.
Globus Version 4
 A “toolkit” of services and packages for creating
the basic grid computing infrastructure
 Higher level tools added to this infrastructure
 Version 4 is web-services based
 Some non-web services code exists from earlier
versions (legacy) or where not appropriate (for
efficiency, etc.).
Layered diagram of OGSA, GT4, WSRF, and Web Services
 Each part comprises a set of web services
and/or non-web service components.

 Some built upon earlier versions of Globus.


Globus Open Source Grid Software
[Figure: GT4 components grouped by area. The diagram labels each component with the toolkit version it comes from (GT2, GT3, GT4) and marks it as a web services or non-WS component.]
Security: Pre-WS Authentication Authorization; WS Authentication Authorization; Community Authorization Service; Delegation Service; Credential Management
Data Management: GridFTP; Reliable File Transfer; Replica Location Service; OGSA-DAI [Tech Preview]
Execution Management: Grid Resource Allocation Mgmt (Pre-WS GRAM); Grid Resource Allocation Mgmt (WS GRAM); Community Scheduler Framework [contribution]
Information Services: Monitoring & Discovery System (MDS2); Monitoring & Discovery System (MDS4)
Common Runtime: C Common Libraries; XIO; Java WS Core; C WS Core; Python WS Core [contribution]
I Foster
Another view of GT4 Components
[Figure: client and server view of GT4. CLIENT: your Java, C, and Python clients. Clients and services interoperate through WS-I-compliant SOAP messaging and use X.509 credentials for common authentication. SERVER: Java services in Apache Axis plus GT libraries and handlers; Python hosting (pyGlobus WS Core) with GT libraries; C services using GT C WS Core libraries and handlers. Components shown include GRAM, RFT, Delegation, Index, Trigger, Archiver, CAS, OGSA-DAI, GTCP, RLS, GridFTP, MyProxy, SimpleCA, Pre-WS GRAM, Pre-WS MDS, and your own Java, Python, and C services.]
I Foster
GT Core
 Provides the ability to create services
running inside the GT 4 container.
Java WS Core
[Figure: GT4 component diagram, repeated from "Globus Open Source Grid Software".]
GT4 Web Services Core
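[Figure: user applications sit on top of the GT4 container, which hosts custom Web services, custom WSRF Web services, and GT4 WSRF Web services, together with a registry and administration. These build on WS-Addressing, WSRF, and WS-Notification, which in turn build on WSDL, SOAP, and WS-Security.]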
I Foster
Execution Management

Key component

GRAM (Grid Resource Allocation Manager)

 For submitting executable jobs
 May interface to a local job scheduler
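The relationship between a job submission service and a local job scheduler can be sketched as an adapter pattern. This is a hypothetical illustration and not the GRAM interface: `JobManager`, `LocalSchedulerAdapter`, and `ForkAdapter` are invented names, and the only "scheduler" shown simply runs the job as a child process.

```python
import subprocess
import sys
from abc import ABC, abstractmethod


class LocalSchedulerAdapter(ABC):
    """Interface the job manager uses to hand a job to whatever runs it locally."""

    @abstractmethod
    def submit(self, argv):
        ...


class ForkAdapter(LocalSchedulerAdapter):
    """Runs the job directly as a child process, with no queueing."""

    def submit(self, argv):
        done = subprocess.run(argv, capture_output=True, text=True)
        return done.returncode, done.stdout


class JobManager:
    """Accepts executable jobs and forwards them to the configured adapter."""

    def __init__(self, adapter):
        self._adapter = adapter

    def submit_job(self, argv):
        return self._adapter.submit(argv)


manager = JobManager(ForkAdapter())
code, output = manager.submit_job(
    [sys.executable, "-c", "print('hello from the job')"]
)
```

Swapping in a different adapter, for example one that writes a batch script for a site's queueing system, would leave the client-facing `submit_job` call unchanged.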
GRAM (Grid Resource Allocation Manager)
[Figure: GT4 component diagram, repeated from "Globus Open Source Grid Software".]
GT4 GRAM Structure:
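[Figure: GRAM structure with Sun Grid Engine, showing service host(s) and compute element(s). A client sends job functions and delegation requests to the GT4 Java container, which runs the GRAM services, the Delegation service, and RFT File Transfer. GRAM services reach the compute element through sudo and a GRAM adapter, which drives the local scheduler for local job control of the user job. RFT sends transfer requests and FTP control to GridFTP servers, which move FTP data to and from remote storage element(s). GridFTP and RFT are the data management components.]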
I Foster
Security Components
Addresses the security requirements of grid
computing. Three important factors are:

 Authorization
– Process of deciding whether a particular identity can
access a particular resource
 Authentication
– Process of deciding whether a particular identity is
who it claims to be (applies to humans and systems)
 Delegation (somewhat specific to grid computing)
– Process of giving authority to another identity
(usually a computer/process) to act on your behalf.
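The three factors can be told apart with a toy sketch. Everything here is an assumption made for illustration: the shared HMAC key, the token format, and the names `alice`, `job-runner`, and `dataset-42` are hypothetical, and the GT4 diagrams in this deck indicate X.509 credentials rather than anything like this.

```python
import hashlib
import hmac
import time

SECRET = b"demo-only-key"  # hypothetical shared secret, for illustration only


def sign(message: str) -> str:
    return hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()


def authenticate(identity: str, signature: str) -> bool:
    """Authentication: is this identity who it claims to be?"""
    return hmac.compare_digest(sign(identity), signature)


ACL = {"dataset-42": {"alice"}}  # hypothetical access control list


def authorize(identity: str, resource: str) -> bool:
    """Authorization: may this identity access this resource?"""
    return identity in ACL.get(resource, set())


def delegate(owner: str, delegate_to: str, lifetime_s: int, now=None) -> str:
    """Delegation: owner issues a short-lived token so another identity can act for it."""
    issued = now if now is not None else time.time()
    body = f"{owner}|{delegate_to}|{int(issued + lifetime_s)}"
    return body + "|" + sign(body)


def acting_for(token: str, presenter: str, now=None):
    """Return the owner if the token is genuine, unexpired, and held by the named delegate."""
    owner, delegate_to, expires, sig = token.split("|")
    body = f"{owner}|{delegate_to}|{expires}"
    current = now if now is not None else time.time()
    genuine = hmac.compare_digest(sign(body), sig)
    if genuine and presenter == delegate_to and current < int(expires):
        return owner
    return None


token = delegate("alice", "job-runner", lifetime_s=60, now=1000)
```

A process presenting `token` as `job-runner` before it expires is treated as acting for `alice`, so `authorize(acting_for(token, "job-runner", now=1030), "dataset-42")` succeeds without `alice` being present.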
Security continued
 Security aspects complicated by the fact
that virtual organization members and
resources can be in different administrative
domains.
Security
[Figure: GT4 component diagram, repeated from "Globus Open Source Grid Software".]
GT4 Data Management
 Move large data to/from nodes
 Replicate data for performance & reliability
 Locate data of interest
 Provide access to different data sources
– File systems, parallel file systems, hierarchical
storage (GridFTP)
– Databases (OGSA DAI)
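Replicating data and locating it again can be sketched as a catalog that maps one logical name to several physical copies. This is a minimal illustration, not the interface of any Globus replica service: `ReplicaCatalog`, its methods, and the `example.org` URLs are hypothetical.

```python
class ReplicaCatalog:
    """Maps a logical file name to the physical locations that hold a copy."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, physical_url):
        self._replicas.setdefault(logical_name, []).append(physical_url)

    def locate(self, logical_name):
        """Return every known copy; an empty list if the name is unknown."""
        return list(self._replicas.get(logical_name, []))

    def pick(self, logical_name, is_reachable):
        """Return the first copy the caller can reach, for reliability."""
        for url in self._replicas.get(logical_name, []):
            if is_reachable(url):
                return url
        return None


catalog = ReplicaCatalog()
catalog.register("run7.dat", "gsiftp://site-a.example.org/data/run7.dat")
catalog.register("run7.dat", "gsiftp://site-b.example.org/mirror/run7.dat")

# Pretend site-a is down: the caller falls back to the copy at site-b.
chosen = catalog.pick("run7.dat", lambda url: "site-a" not in url)
```

A real selection policy might prefer the nearest or least loaded copy; the reachability callback stands in for that decision here.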
GridFTP and Reliable File Transfer
[Figure: GT4 component diagram, repeated from "Globus Open Source Grid Software".]
GridFTP
 Built on FTP using separation of data and
control channels
 Provides features for
– Large data transfers
– Secure transfers
– Fast transfers
– Reliable transfers
– Third party transfers
 Not a web service
– RFT (Reliable File Transfer) service provides a WS-level
interface
Parallel transfers and striping
 Using multiple (virtual) connections for transfer
– Same external network
– Speed improvement possible, but limited by network
card
 Striping
– a version of parallel transfers that can use separate
hardware interfaces
– Implemented in GT 4.
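The core of a parallel transfer is dividing the file into byte ranges and moving each range over its own connection. The sketch below imitates this in memory with threads. It is not GridFTP: the function names are hypothetical, and a real transfer would send each block over a separate network connection (or, for striping, a separate host or interface).

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Divide [0, size) into up to `streams` contiguous (offset, length) blocks."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source: bytes, streams: int) -> bytes:
    """Copy `source` block by block, one worker per block, writing each at its offset."""
    dest = bytearray(len(source))

    def move(block):
        offset, length = block
        dest[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(move, split_ranges(len(source), streams)))
    return bytes(dest)


data = bytes(range(256)) * 4
copied = parallel_copy(data, streams=4)
```

Because every block carries its own offset, blocks may arrive in any order and the destination still reassembles the file correctly.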
Monitoring and Discovery
[Figure: GT4 component diagram, repeated from "Globus Open Source Grid Software".]
Monitoring and Discovery
 WSRF provides common mechanisms for
monitoring and discovering a service:
 GT4 “aggregator” services within MDS:
– MDS-Index: collects state information from
registered resources and makes it available
as XML document
– MDS-Trigger: passes this information to an
executable
– MDS-Archive: archives state information
(awaiting implementation)
 Every GT 4 service is discoverable
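The division of labour between an index and a trigger can be sketched as follows. This is an illustration only and not the MDS4 interface: the `Index` class, the XML element names, and the `free_cpus` property are hypothetical, and a Python callable stands in for the executable that a trigger would run.

```python
import xml.etree.ElementTree as ET


class Index:
    """Collects state from registered resources and publishes it as one XML document."""

    def __init__(self):
        self._state = {}

    def register(self, resource, **properties):
        self._state[resource] = properties

    def to_xml(self):
        root = ET.Element("index")
        for name, props in sorted(self._state.items()):
            entry = ET.SubElement(root, "resource", name=name)
            for key, value in sorted(props.items()):
                ET.SubElement(entry, key).text = str(value)
        return ET.tostring(root, encoding="unicode")


def run_trigger(index_xml, condition, action):
    """Pass each matching index entry to an action (a stand-in for an executable)."""
    fired = []
    for entry in ET.fromstring(index_xml):
        if condition(entry):
            fired.append(action(entry.get("name")))
    return fired


index = Index()
index.register("node1", free_cpus=0)
index.register("node2", free_cpus=8)

alerts = run_trigger(
    index.to_xml(),
    condition=lambda e: int(e.findtext("free_cpus")) == 0,
    action=lambda name: f"alert: {name} is full",
)
```

The trigger reads only the published XML document, so it needs no direct access to the resources themselves.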
