Notes
UNIT I
INTRODUCTION
1. Evolution of Distributed computing
2. Scalable computing over the Internet
3. Technologies for network based systems
4. Clusters of cooperative computers
5. Grid computing Infrastructures
6. Cloud computing
7. Service oriented architecture
8. Introduction to Grid Architecture and standards
9. Elements of Grid
10. Overview of Grid Architecture.
UNIT-I
Introduction
Evolutionary Trend in Distributed Computing
This chapter mainly assesses the evolutionary changes in machine architecture, operating system
platforms, network connectivity, and application workloads.
HPC systems emphasize raw speed performance. The speed of HPC systems has been improved
by demand from the scientific, engineering, and manufacturing communities. However, the
majority of computer users run their applications on desktop computers or large servers when
they conduct Internet searches and market-driven computing tasks.
HTC systems pay more attention to high-flux computing. The main applications of high-flux
computing are Internet searches and web services. Throughput is defined as the number of tasks
completed per unit of time. HTC not only improves batch processing speed, but also addresses
problems of cost, energy savings, security, and reliability.
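The throughput metric above is easy to compute directly; a minimal sketch (the task counts and time window are hypothetical):

```python
# Throughput in an HTC system: number of tasks completed per unit of time.
# The figures below are illustrative, not measurements from a real system.

def throughput(tasks_completed, elapsed_hours):
    """Return tasks completed per hour."""
    return tasks_completed / elapsed_hours

print(throughput(1_200_000, 24))  # 50000.0 tasks per hour
```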
Computing paradigms
A) Centralized computing
B) Parallel computing
All processors are either tightly coupled with centralized shared memory or loosely
coupled with distributed memory. Interprocessor communication is accomplished through
shared memory or via message passing. A system capable of parallel computing is known as a
parallel computer; programs that run on a parallel computer are called parallel programs. The
process of writing parallel programs is referred to as parallel programming.
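A minimal sketch of a parallel program in the shared-memory style: several workers operate on parts of the same problem concurrently. (For CPU-bound work in Python a process pool would normally be used; a thread pool keeps this sketch simple and portable.)

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

# Divide the work among four workers that share the same address space.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```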
C) Distributed computing
A distributed system consists of multiple autonomous computers, each having its own
private memory, communicating through a computer network. Information exchange is
accomplished by message passing. A computer program that runs in a distributed system is
known as a distributed program. The process of writing distributed programs is known as
distributed programming.
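The message-passing style described above can be sketched with two "nodes" exchanging messages over TCP. Here both nodes run as threads on one machine purely for illustration; in a real distributed system they would be separate computers on a network.

```python
import socket
import threading

def server(listener):
    """One node: receive a message and reply with an acknowledgement."""
    conn, _ = listener.accept()
    msg = conn.recv(1024)
    conn.sendall(b"ack:" + msg)
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=server, args=(listener,), daemon=True).start()

# The other node: send a message and wait for the reply.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"hello")
reply = client.recv(1024)
client.close()
print(reply)  # b'ack:hello'
```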
D) Cloud computing
An Internet cloud of resources can be either a centralized or a distributed computing
system. The cloud applies parallel or distributed computing, or both. Clouds can be built with
physical or virtualized resources over large data centers.
E) Ubiquitous computing
It refers to computing with pervasive devices at any place and time using wired or
wireless communication.
A) Efficiency
B) Dependability
It measures the reliability and self-management from the chip to the system and
application levels.
The purpose is to provide high-throughput services with quality of service (QoS), even
under failure conditions.
C) Adaptation in programming models
This measures the ability to support large job requests over massive data sets and virtualized
cloud resources under various workload and service models.
a) Bit-level parallelism (BLP): converts bit-serial processing to word-level processing.
b) Instruction-level parallelism (ILP): the processor executes multiple instructions
simultaneously.
c) Data-level parallelism (DLP): achieved through SIMD (single instruction, multiple data). It
requires more hardware support and compiler assistance.
d) Task-level parallelism (TLP): introduced with multicore processors and chip
multiprocessors (CMPs). It is far from mature, due to the difficulty of programming and
compiling code for efficient execution on multicore CMPs.
e) Job-level parallelism (JLP): arises from the move from parallel processing to distributed
processing. Coarse-grain parallelism is built on top of fine-grain parallelism.
Utility computing
Internet of Things
The IoT refers to the networked interconnection of everyday objects, tools, and devices. These
things can be large or small, and they vary with respect to time and place. The idea is to tag
every object using RFID, sensor, or other electronic technology. The IoT needs to be designed to
track many static or moving objects simultaneously. It demands universal addressability of all
the objects. To reduce the complexity of identification, search, and storage, a threshold can be
set to filter out fine-grain objects.
Three communication patterns exist:
a) Human to Human (H2H)
b) Human to Thing (H2T)
c) Thing to Thing (T2T)
The concept behind these communication patterns is to connect things intelligently at any time
and any place. However, the IoT is still in its infancy stage of development.
A cyber-physical system (CPS) is the result of interaction between computational processes and
the physical world. A CPS integrates cyber (heterogeneous, asynchronous) objects with physical
(concurrent, information-dense) objects.
It merges the 3C technologies:
a) Computation
b) Communication
c) Control
into an intelligent closed feedback loop between the physical world and the information world.
It emphasizes the exploration of virtual-reality applications in the physical world.
ii) Multi-Threading
Consider five independent threads of instructions issued to four pipelined data paths in five
different processors:
a) Four-issue superscalar processor
b) Fine-grain multithreaded processor
c) Coarse-grain multithreaded processor
d) Dual-core CMP
e) Simultaneous multithreaded (SMT) processor
The superscalar processor is single-threaded with four functional units. Each of the three
multithreaded processors is four-way multithreaded over four functional data paths. The
dual-core processor assumes two processing cores, each a single-threaded two-way superscalar
processor.
In the superscalar processor, the same thread is executed in all issue slots. The fine-grain
processor switches to instructions from a different thread in each cycle. The coarse-grain
processor executes many instructions from the same thread for several cycles before switching
to another thread. SMT allows simultaneous scheduling of instructions from different threads in
the same cycle. A blank field indicates an empty issue slot with no instruction.
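The scheduling policies above can be contrasted with a toy simulation (an illustrative sketch only, not a real pipeline model). Assumptions: two threads A and B always have a ready instruction, and a coarse-grain switch happens every three cycles.

```python
# Toy issue-slot simulation of three multithreading policies.
def schedule(policy, cycles=6, threads=("A", "B")):
    slots = []
    for cycle in range(cycles):
        if policy == "fine":       # switch to a different thread every cycle
            slots.append(threads[cycle % len(threads)])
        elif policy == "coarse":   # run one thread for several cycles, then switch
            slots.append(threads[(cycle // 3) % len(threads)])
        elif policy == "smt":      # issue from all threads in the same cycle
            slots.append("".join(threads))
    return slots

print(schedule("fine"))    # ['A', 'B', 'A', 'B', 'A', 'B']
print(schedule("coarse"))  # ['A', 'A', 'A', 'B', 'B', 'B']
print(schedule("smt"))     # ['AB', 'AB', 'AB', 'AB', 'AB', 'AB']
```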
b) GPU Computing
A GPU is a graphics coprocessor or accelerator mounted on a computer's graphics card or
video card. A GPU offloads the CPU from tedious graphics tasks in video editing applications.
These chips can process a minimum of 10 million polygons per second.
i) Working of GPU
NVIDIA GPUs have been upgraded to 128 cores on a single chip. Each core on a GPU can
handle eight threads of instructions. This translates to having up to 1,024 threads executed
concurrently on a single GPU.
The CPU is optimized for cache latency, while the GPU is optimized to deliver high throughput
with explicit management of on-chip memory.
GPUs are designed to handle large numbers of floating-point operations in parallel. The GPU
offloads the CPU from data-intensive calculations.
1. Memory Technology
The capacity increase of disk arrays will be even greater. Faster processor speeds and larger
memory capacities result in a wider gap between processors and memory; the memory wall may
become an even worse problem limiting CPU performance.
The rapid growth of flash memory and solid-state drives (SSDs) also impacts the future of HPC
and HTC systems. Power increases linearly with respect to clock frequency, so the clock rate
cannot be increased indefinitely. SSDs are still too expensive to replace stable disk arrays.
The nodes in small clusters are mostly interconnected by an Ethernet switch or a LAN. A LAN
is typically used to connect client hosts to big servers. A SAN (storage area network) connects
servers to network storage such as disk arrays. NAS (network-attached storage) connects client
hosts directly to disk arrays.
d) Virtual Machines and Virtualization Middleware
A hypervisor may be bare-metal (running directly on the hardware) or hosted (running on top of
a host operating system).
A user application running on its dedicated OS can be bundled together as a virtual
appliance that can be ported to any hardware platform.
These four system classes may involve millions of computers as participating nodes. These
machines work collectively, cooperatively, or collaboratively at various levels:
a) Clusters
b) Computational or data grids
c) P2P networks
P2P architecture offers a distributed model of networked systems. In P2P, every node acts
as both a client and a server, providing part of the system resources.
Peer machines are client computers connected over the Internet. All client machines act
autonomously to join or leave the system freely. There is no master-slave relationship and
no central database.
Only the participating peers form the physical network at any time. The physical network
is simply an ad hoc network formed at various Internet domains randomly using TCP/IP
protocols.
Files are distributed among the participating peers. Based on communication, peer IDs form
an overlay network at the logical level. This overlay is a virtual network formed by mapping
each physical machine to an ID.
When a new peer joins the system, its peer ID is added as a node in the overlay network.
When a peer leaves the system, its peer ID is removed from the overlay network.
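The join/leave behavior of the overlay can be sketched as a mapping from peer IDs to neighbor sets. This is a simplified illustration (real overlays such as DHTs use structured routing, which is not modeled here); the peer names are hypothetical.

```python
# A logical overlay network: peer ID -> set of neighboring peer IDs.
overlay = {}

def join(peer, neighbors=()):
    """Add a peer to the overlay and link it to its neighbors."""
    overlay[peer] = set(neighbors)
    for n in neighbors:
        overlay.setdefault(n, set()).add(peer)

def leave(peer):
    """Remove a peer and every link pointing to it."""
    for n in overlay.pop(peer, set()):
        overlay[n].discard(peer)

join("peer-1")
join("peer-2", ["peer-1"])
join("peer-3", ["peer-1", "peer-2"])
leave("peer-2")                 # peer-2 leaves; its links disappear
print(sorted(overlay))          # ['peer-1', 'peer-3']
print(overlay["peer-1"])        # {'peer-3'}
```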
d) Internet clouds
Cloud Computing
A cloud is a pool of virtualized computer resources. A cloud can host a variety of different
workloads, including batch-style backend jobs and interactive, user-facing applications.
Cloud computing applies a virtualized platform with elastic resources on demand by
provisioning hardware, software, and data sets dynamically. The idea is to virtualize resources
from data centers to form an Internet cloud for paid users to run their applications.
1. Infrastructure as a Service
This model puts together the infrastructure demanded by users. The user can deploy and run
multiple VMs running guest operating systems for specific applications. The user does not
manage or control the underlying cloud infrastructure, but can specify when to request and
release the needed resources.
2. Platform as a Service
This model enables the user to deploy user-built applications onto a virtualized cloud
platform. It includes middleware, databases, development tools, and runtime support. This
platform includes both hardware and software integrated with specific programming interfaces.
The user is freed from managing the cloud infrastructure.
3. Software as a Service
This refers to browser-initiated application software delivered to paid cloud customers. It
applies to business processes, industry applications, and ERP. On the customer side, there is no
upfront investment in servers; on the provider side, costs are kept low compared with
conventional hosting of user applications.
Deployment modes
a) Private
b) Public
c) Managed
d) Hybrid cloud
1. Desirable locations in areas with protected space and higher energy efficiency.
2. Sharing of load capacity among large pool of users.
3. Separation of infrastructure maintenance duties from domain specific application
development.
4. Cost reduction
5. Cloud computing programming and application development
6. Service and discovery
7. Privacy, security, copyright and reliability
8. Service agreements, business models and pricing.
In grids and web services, the Java and CORBA architectures build on traditional OSI layers
that provide the base networking abstraction. On top of this are base software environments
such as .NET or Apache Axis for web services and the Java Virtual Machine for Java.
The entity interfaces correspond to WSDL, Java method, and CORBA interface definition
language (IDL) specifications. These interfaces are linked with customized, high-level
communication systems. The communication systems support features including RPC, fault
recovery, and specialized routing. These communications are often built on message-oriented
middleware infrastructure such as WebSphere MQ or the Java Message Service (JMS), which
provide rich functionality and support virtualization.
Security is a critical capability that either uses or re-implements the concepts of IPsec and
secure sockets. Entity communication is supported by higher-level services.
A full web service specifies all aspects of the service and its environment. This specification
is carried in SOAP communication messages. The hosting environment then becomes a kind of
universal distributed operating system with fully distributed capability carried by SOAP
messages.
The REST approach adopts a minimalist style and delegates most of the difficult problems
to the application software. It places minimal information in the message header; the message
body carries all the needed information. REST is appropriate for rapidly evolving technologies
and environments.
In CORBA and Java, the distributed entities are linked with RPC, and the simplest way to
build composite applications is to view the entities as objects. For Java, this takes the form of
writing a Java program with method calls replaced by Remote Method Invocation (RMI), while
CORBA supports a similar model with a syntax reflecting the C++ style of object interfaces.
A grid system applies static resources, while a cloud emphasizes elastic resources. One can build
a grid out of multiple clouds; such a grid can do a better job than a pure cloud because it
explicitly supports negotiated resource allocation.
Objective Questions and Answers
S.No. Objective Questions
1. A Computer capable of Parallel Computing is ________________
a) DLP b) TLP
c) BLP d) ILP
6. ___________ are formed for distributed file sharing and content delivery
application.
a) GPU b) CPU
c) VPU d) AG
8. ______________ a network connection of everyday objects.
a) BLP b)DLP
c) ILP d)TLP
10. _______________ computing with pervasive device at any place and any time
a) Ubiquitous b) Distributed
c) Parallel d) Cloud
11. HTC systems pays more attention to ___________
a) WAN b) PAN
c) LAN d) MAN
15. _____________ consists of homogeneous nodes with distributed control.
a) Clusters b) Servers
c) Clients d) Gateways
2 Marks Questions and Answers
2. Define GPU.
Ans: A GPU is a graphics coprocessor or accelerator on a computer’s graphics card or video
card. A GPU offloads the CPU from tedious graphics tasks in video editing applications. The
GPU chips can process a minimum of 10 million polygons per second. GPUs have a throughput
architecture that exploits massive parallelism by executing many concurrent threads.
Ans:
Distributed computing: Each processor has its own private memory (distributed memory);
information is exchanged by passing messages between the processors.
Parallel computing: All processors may have access to a shared memory to exchange
information between processors.
Ans: Distributed computing is a field of computer science that studies distributed systems. A
distributed system is a model in which components located on networked computers
communicate and coordinate their actions by passing messages. The components interact with
each other in order to achieve a common goal. Three significant characteristics of distributed
systems are: concurrency of components, lack of a global clock, and independent failure of
components. Examples of distributed systems vary from SOA-based systems to massively
multiplayer online games to peer-to-peer applications.
A computer program that runs in a distributed system is called a distributed program, and
distributed programming is the process of writing such programs. There are many alternatives for
the message passing mechanism, including pure HTTP, RPC-like connectors and message
queues.
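Of the message-passing mechanisms listed above, the message-queue style can be sketched with Python's standard library: a producer puts messages on a queue, a consumer takes them off, and a sentinel value signals shutdown. The message contents are hypothetical.

```python
import queue
import threading

mailbox = queue.Queue()
results = []

def consumer():
    """Process messages until the None sentinel arrives."""
    while True:
        msg = mailbox.get()
        if msg is None:          # sentinel: no more messages
            break
        results.append(msg.upper())

worker = threading.Thread(target=consumer)
worker.start()
for text in ["ping", "pong"]:    # producer side
    mailbox.put(text)
mailbox.put(None)
worker.join()
print(results)  # ['PING', 'PONG']
```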
HISTORY
The use of concurrent processes that communicate by message-passing has its roots in operating
system architectures studied in the 1960s. The first widespread distributed systems were
local-area networks such as Ethernet, which was invented in the 1970s.
ARPANET, the predecessor of the Internet, was introduced in the late 1960s, and ARPANET
e-mail was invented in the early 1970s. E-mail became the most successful application of
ARPANET, and it is probably the earliest example of a large-scale distributed application. In
addition to ARPANET, and its successor, the Internet, other early worldwide computer networks
included Usenet and FidoNet from the 1980s, both of which were used to support distributed
discussion systems.
The study of distributed computing became its own branch of computer science in the late 1970s
and early 1980s. The first conference in the field, Symposium on Principles of Distributed
Computing (PODC), dates back to 1982, and its European counterpart International Symposium
on Distributed Computing (DISC) was first held in 1985.
Both HPC and HTC systems are adopted by consumer and high-end web-scale computing
and information services.
In HTC systems, peer-to-peer networks are formed for distributed file sharing and content
delivery applications.
P2P systems are built on many client machines; cloud computing and web service platforms
focus on HTC applications.
Computing paradigms
The three main computing paradigms are:
1) Web 2.0 services
2) Internet clouds
3) Internet of Things (IoT)
Ans: In grids and web services, the Java and CORBA architectures build on traditional OSI
layers that provide the base networking abstraction. On top of this are base software
environments such as .NET or Apache Axis for web services and the Java Virtual Machine for
Java.
The entity interfaces correspond to WSDL, Java method, and CORBA interface definition
language (IDL) specifications. These interfaces are linked with customized, high-level
communication systems. The communication systems support features including RPC, fault
recovery, and specialized routing. These communications are often built on message-oriented
middleware infrastructure such as WebSphere MQ or the Java Message Service (JMS), which
provide rich functionality and support virtualization.
Security is a critical capability that either uses or re-implements the concepts of IPsec and
secure sockets. Entity communication is supported by higher-level services.
A full web service specifies all aspects of the service and its environment. This specification
is carried in SOAP communication messages. The hosting environment then becomes a kind of
universal distributed operating system with fully distributed capability carried by SOAP
messages.
The REST approach adopts a minimalist style and delegates most of the difficult problems to
the application software. It places minimal information in the message header; the message
body carries all the needed information. REST is appropriate for rapidly evolving technologies
and environments.
In CORBA and Java, the distributed entities are linked with RPC, and the simplest way to
build composite applications is to view the entities as objects. For Java, this takes the form of
writing a Java program with method calls replaced by Remote Method Invocation (RMI), while
CORBA supports a similar model with a syntax reflecting the C++ style of object interfaces.
SYLLABUS
UNIT-II
GRID SERVICES
It defines how different components interact with each other in a grid environment.
It is a set of standards defining the way in which information is shared among diverse
components of large, heterogeneous grid systems.
A grid system is a scalable WAN that supports resource sharing and distribution.
Architecture of OGSA
A web service is software available online that can interact with other software using XML.
OGSA consists of an Open Grid Services Infrastructure (OGSI) sub-layer, which specifies grid
services and provides a consistent way to interact with them.
Consists of 5 interfaces:
This layer is mainly classified into three service categories. They are:
3. Policy Services: Provide framework for creation, administration & management of policies
for system operation.
These services support data virtualization and provide mechanisms for access to distributed
resources such as databases and files.
4) Application Layer
This layer comprises the applications that use the grid-architected services.
Grid computing allows networked resources to be combined and used, offering great benefits
to organizations.
The OGSI specification defines grid services and builds upon web services. OGSI creates an
extension model for the Web Services Description Language (WSDL), called GWSDL (Grid
WSDL), to support interface inheritance and service data for expressing state information.
Lifecycle
State management
Service Groups
Factory
Notification
Handle Map
Grid applications are either computation-intensive or data-intensive. For data-intensive
applications, the grid system must be specially designed to discover, transfer, and manipulate
massive data sets.
A common data access method is caching, which is often applied to enhance data efficiency
in a grid environment.
The replication strategies determine when and where to create a replica of data.
Strategies of Replication
Static Dynamic
a) Static Method
The locations and number of replicas are determined in advance and will not be modified.
Replication operations require little overhead. However, static strategies cannot adapt to
changes in demand, bandwidth, and storage availability.
b) Dynamic Method
Dynamic strategies can adjust the locations and number of data replicas according to changes
in conditions. However, frequent data-moving operations can result in much more overhead
than static strategies.
Optimization may be determined based on whether the data replica is being created, deleted,
or moved. The most common replication strategies include preserving locality, minimizing
update costs, and maximizing profits.
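A minimal sketch of a dynamic replication policy: replicate a data set to a site once that site's access count passes a threshold. The threshold, site names, and access counts are all hypothetical; a real strategy would also weigh storage and update costs.

```python
THRESHOLD = 3  # hypothetical access-count threshold for creating a replica

def update_replicas(access_counts, replicas, threshold=THRESHOLD):
    """Add a replica at every site whose demand reaches the threshold."""
    for site, count in access_counts.items():
        if count >= threshold:
            replicas.add(site)
    return replicas

replicas = {"root"}                                  # master copy location
accesses = {"site-A": 5, "site-B": 1, "site-C": 4}   # observed demand per site
update_replicas(accesses, replicas)
print(sorted(replicas))  # ['root', 'site-A', 'site-C']
```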
The Grid Data Access Models consists of four access models for organizing a data grid. They are
1. Monadic method
2. Hierarchical model
3. Federation model
4. Hybrid model
1. Monadic Method
This is a centralized data repository model. All data is saved in a central data repository.
When users want to access some data, they have to submit a request directly to the central
repository.
Disadvantages:
For a larger grid this model is not efficient in terms of performance and reliability.
Data replication is permitted in this model only when fault tolerance is demanded.
2. Hierarchical Model
This model is suitable for building a large data grid which has only one large data access
directory. Data is first transferred from the source to regional centers; then some data in a
regional center is transferred to third-level centers. After being forwarded several times,
specific data objects are accessed directly by users.
PKI security services are easier to implement in this hierarchical data access model.
3. Federation Model
This model is better suited for designing a data grid with multiple sources of data supplies.
The data is shared, and the data items are owned and controlled by their original owners.
Only authenticated users are authorized to request data from any data source.
Disadvantage:
This mesh model costs the most when the number of grid institutions becomes very large.
4. Hybrid Model
This model combines the best features of the hierarchical and mesh models.
Traditional data transfer technology such as FTP applies for networks with lower bandwidth.
Higher bandwidth is exploited by high-speed data transfer tools such as GridFTP, developed
with the Globus library.
The cost of hybrid model can be traded off between the two extreme models of hierarchical and
mesh connected grids.
GridFTP opens multiple data streams for passing subdivided segments of a file simultaneously.
Although the speed of each stream is the same as in sequential streaming, the total time to
move the data across all streams can be significantly reduced compared to an FTP transfer.
In striped data delivery, a data object is partitioned into a number of sections, and each section
is placed on an individual site in a data grid. When a user requests this piece of data, a data
stream is created for each site, and all the sections of the data object are transferred
simultaneously.
OGSA Services
OGSA services fall into seven broad areas, defined in terms of capabilities frequently required in
a grid scenario.
Figure shows the OGSA architecture. These services are summarized as follows:
1. Infrastructure Services
These services refer to a set of common functionalities, such as naming, typically required by
higher-level services.
2. Execution Management Services
These services are concerned with issues such as starting and managing tasks, including
placement, provisioning, and life-cycle management. Tasks may range from simple jobs to
complex workflows or composite services.
3. Data Services
These services provide functionality to move data to where it is needed, maintain replicated
copies, run queries and updates, and transform data into new formats. They must handle issues
such as data consistency, persistency, and integrity.
An OGSA data service is a web service that implements one or more of the base data interfaces
to enable access to, and management of, data resources in a distributed environment.
4. Resource Management Services
These services provide management capabilities for grid resources: management of the
resources themselves, management of the resources as grid components, and management of
the OGSA infrastructure. For example, resources can be monitored, reserved, deployed, and
configured as needed to meet application QoS requirements.
Resource management also requires an information model (semantics) and a data model
(representation) of the grid resources and services.
5. Security Services
The three base interfaces, Data Access, Data Factory, and Data Management, define basic
operations for representing, accessing, creating, and managing data.
6. Information Services
These services provide efficient production of, and access to, information about the grid and
its constituent resources.
The term “information” refers to dynamic data or events used for status monitoring; relatively
static data used for discovery; and any data that is logged.
Troubleshooting is just one of the possible uses for information provided by these services.
7. Self-Management Services
These services support service-level attainment for a set of services (or resources), with as
much automation as possible, to reduce the costs and complexity of managing the system.
These services are essential in addressing the increasing complexity of owning and operating an
IT infrastructure.
Motivation
1. Facilitate use and management of resources and data across distributed, heterogeneous
environments
Functionality Requirements
• Basic functions: includes discovery and brokering, virtual organizations, data sharing,
monitoring and policy
• Data sharing
• Deployment
• Monitoring policies
Requirements – Security
• Encryption
– Data encryption while transmission from one location to another
• Certification
• Provisioning
• Resource virtualization
• Processor scavenging
• Load balancing
• Advanced reservation
– Billing services based on metering of time and use of resources ( e.g. skynet)
• Fault tolerance
• Disaster recovery
• Self healing
OGSA aims at addressing standardization (for interoperability) by defining the basic framework
of a grid application structure. Several mechanisms are employed in the standards formulation
of grid computing.
The objectives of OGSA are to:
Manage resources across distributed heterogeneous platforms.
Support QoS-oriented Service Level Agreements (SLAs). The topology of grids is often
complex; the interactions between grid resources are almost invariably dynamic.
Provide a common base for autonomic management. A grid can contain a plethora of resources,
along with an abundance of combinations of resources.
MPICH-G2: Grid-enabled message passing (Message Passing Interface)
_ CoG Kits, GridPort: Portal construction, based on N-tier architectures
_ Condor-G: workflow management
_ Legion: object models for grid computing
_ Cactus: Grid-aware numerical solver framework
Portals
_ N-tier architectures enabling thin clients, with middle tiers using grid functions
_ Thin clients = web browsers
_ Middle tier = e.g., Java Server Pages, with Java CoG Kit, GPDK, GridPort utilities
_ Bottom tier = various grid resources
_ Numerous applications and projects, e.g.,
_ Unicore, Gateway, Discover, Mississippi Computational Web Portal, NPACI Grid
Port, Lattice Portal, Nimrod-G, Cactus, NASA IPG Launchpad, Grid Resource
Broker
High-Throughput Computing and Condor
_ High-throughput computing
_ Processor cycles/day (week, month, year?) under nonideal circumstances
_ "How many times can I run simulation X in a month using all available machines?"
_ Condor converts collections of distributively owned workstations and dedicated clusters
into a distributed high-throughput computing facility
_ Emphasis on policy management and reliability
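The high-throughput idea above can be illustrated with a toy matchmaking loop that assigns queued simulation runs to whichever workstations are available. This is a sketch of the concept only; it is not Condor's actual matchmaking algorithm, and the task and machine names are invented.

```python
# Toy illustration of high-throughput computing: opportunistically matching
# queued tasks to available machines, Condor-style. Not real Condor logic.
def schedule(tasks, machines):
    """Greedy matchmaking: each available machine takes the next queued task,
    round after round, until the queue is drained."""
    completed = []
    queue = list(tasks)
    while queue:
        for m in machines:
            if not queue:
                break
            completed.append((queue.pop(0), m))
    return completed

# Six simulation runs spread across three scavenged workstations: throughput
# (tasks completed per unit time) rises with machine count, not CPU speed.
runs = schedule([f"sim-{i}" for i in range(6)], ["ws1", "ws2", "ws3"])
```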
Object-Based Approaches
_ Grid-enabled CORBA
_ NASA Lewis, Rutgers, ANL, others
_ CORBA wrappers for grid protocols
_ Some initial successes
_ Legion
_ University of Virginia
_ Object models for grid components (e.g., "vault" = storage, "host" = computer)
Cactus: Modular, portable framework for parallel, multidimensional simulations
Construct codes by linking
_ Small core: management services
_ Selected modules: Numerical methods, grids and domain decomps, visualization and
steering, etc.
_ Custom linking/configuration tools
_ Developed for astrophysics, but not astrophysics specific
There are two fundamental requirements for describing Web services based on OGSI:
1. The ability to describe interface inheritance—a basic concept with most of the distributed
object systems.
2. The ability to describe additional information elements with the interface definitions.
This section provides a more detailed view of OGSI, based on the OGSI specification itself. For
a more comprehensive description of these concepts, the reader should consult the specification.
OGSI defines a component model that extends WSDL and XML schema definition to
incorporate the concepts of:
_ Stateful Web services
_ Extension of Web services interfaces
_ Asynchronous notification of state change
_ References to instances of services
_ Collections of service instances
_ Service state data that augment the constraint capabilities of XML schema definition
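The concepts in the list above can be sketched in plain Python: a stateful, addressable service instance that notifies subscribers when its state data changes. Class and method names are illustrative assumptions, not part of any OGSI binding, and notification is done synchronously here for brevity where OGSI allows asynchronous delivery.

```python
# Hedged sketch of OGSI concepts in plain Python, not an OGSI implementation:
# a stateful service instance with a reference (handle), service state data,
# and notification of state change to subscribers.
class GridServiceInstance:
    def __init__(self, handle: str):
        self.handle = handle        # reference to this service instance
        self.service_data = {}      # state data augmenting the interface
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callback invoked on every state change."""
        self._subscribers.append(callback)

    def set_data(self, name, value):
        """Update state data and notify all subscribers of the change."""
        self.service_data[name] = value
        for notify in self._subscribers:
            notify(self.handle, name, value)
```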
Setting the Context
GGF calls OGSI the "base for OGSA." Specifically, there is a relationship between OGSI and
distributed object systems and also a relationship between OGSI and the existing (and evolving)
Web services framework
Relationship to Distributed Object Systems
A given grid service implementation is an addressable and potentially stateful instance that
implements one or more interfaces described by WSDL portTypes. Grid service factories can be
used to create instances implementing a given set of portType(s).
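The factory pattern described above can be sketched as follows. This is an illustrative stand-in, not a real grid service factory; the names and the dictionary representation of an instance are assumptions made for the example.

```python
# Illustrative sketch of a grid service factory: it creates addressable,
# potentially stateful instances implementing a given set of portTypes.
# Names and data shapes are invented for illustration.
import itertools

class GridServiceFactory:
    _ids = itertools.count(1)   # shared counter for unique handles

    def __init__(self, port_types):
        self.port_types = tuple(port_types)

    def create_service(self):
        """Create a new instance implementing this factory's portTypes."""
        handle = f"service-{next(self._ids)}"
        return {
            "handle": handle,               # addressable reference
            "portTypes": self.port_types,   # interfaces it implements
            "serviceData": {},              # per-instance state
        }

factory = GridServiceFactory(["GridService", "NotificationSource"])
svc = factory.create_service()
```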
Client-Side Programming Patterns
Another important issue is
how OGSI interfaces are likely to be invoked from client applications. OGSI exploits an
important component of the Web services framework: the use of WSDL to describe multiple
protocol bindings, encoding styles, messaging styles (RPC versus document oriented), and so on,
for a given Web service.
ServiceData
OGSI defines a new portType child element named serviceData, used to define serviceData
elements, or SDEs, associated with that portType. These serviceData element definitions are
referred to as serviceData declarations, or SDDs.
serviceDataValues.
Each service instance is associated with a collection of serviceData elements: those serviceData
elements defined within the various portTypes that form the service's interface, and also,
potentially, additional serviceData elements added dynamically to the instance.
SDE Aggregation within a portType Interface Hierarchy
WSDL 1.2 has introduced the notion of multiple portType extension, and one can model that
construct within the GWSDL namespace. A portType can extend zero or more other portTypes
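The multiple-extension idea can be mirrored with Python multiple inheritance: a portType (modeled as a class) extends zero or more other portTypes and aggregates their operations and serviceData declarations. This is purely an analogy, not GWSDL itself, and all names are invented.

```python
# Analogy only: portType extension modeled as Python class inheritance.
# A derived portType aggregates operations (and, in OGSI, SDEs) from all
# the portTypes it extends. Names are invented for illustration.
class GridService:
    """Base portType that every grid service extends."""
    def find_service_data(self):
        return "query SDEs"

class NotificationSource:
    """A second, independent portType."""
    def subscribe(self):
        return "subscribed"

class CounterService(GridService, NotificationSource):
    """Extends two portTypes, inheriting operations from both."""
    def increment(self):
        return "incremented"
```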
Dynamic serviceData Elements
Although many serviceData elements are most naturally defined in a service’s interface
definition, situations can arise in which it is useful to add or move serviceData elements
dynamically to or from an instance.
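A minimal sketch of this dynamic behavior: serviceData held in a mutable collection, so an instance can gain or lose elements at runtime alongside its statically declared ones. The container class and the element names are assumptions made for the example.

```python
# Sketch of dynamic serviceData (illustrative, not an OGSI API): SDEs kept
# in a mutable collection so elements can be added to or removed from an
# instance at runtime, alongside the statically declared ones.
class ServiceDataSet:
    def __init__(self, static_sdes):
        self._sdes = dict(static_sdes)   # declared in the interface (SDDs)

    def add(self, name, value):
        """Dynamically add an SDE to this instance."""
        self._sdes[name] = value

    def remove(self, name):
        """Dynamically remove an SDE from this instance."""
        self._sdes.pop(name, None)

    def query(self, name):
        return self._sdes.get(name)

sd = ServiceDataSet({"interface": "GridService"})
sd.add("currentLoad", 0.42)   # appears only on this instance, at runtime
```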
3. WSDL Stands for ______________
9. Which model best suits the applications with a data grid with multiple sources of
data supplies?
1. Define – OGSA.
Ans: Open Grid Services Architecture (OGSA) is a set of standards defining the way in which
information is shared among diverse components of large, heterogeneous grid systems. In this
context, a grid system is a scalable wide area network (WAN) that supports resource sharing and
distribution. OGSA is a trademark of the Open Grid Forum.
2. Define – OGSI.
Ans: The Open Grid Services Infrastructure (OGSI) was published by the Global Grid Forum
(GGF) as a proposed recommendation in June 2003. It was intended to provide an
infrastructure layer for the Open Grid Services Architecture (OGSA). OGSI takes the
statelessness issues (along with others) into account by essentially extending Web services to
accommodate grid computing resources that are both transient and stateful.
6. Define GRAM.
Ans: GRAM provides resource allocation, process creation, monitoring, and management
services. The most common use of GRAM is the remote job submission and control facility.
GRAM simplifies the use of remote systems.
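The GRAM usage pattern described in the answer above (submit a remote job, monitor it, control it) can be sketched with an invented in-memory stand-in. This is NOT the Globus GRAM API; every class, method, and job-state name here is an assumption for illustration only.

```python
# Hedged sketch of the GRAM usage pattern: job submission returns a handle
# that is then used for monitoring and control. Invented stand-in, not the
# Globus API; state names are assumptions.
class ToyGram:
    def __init__(self):
        self._jobs = {}

    def submit(self, executable, args):
        """Submit a job; returns a handle for monitoring and control."""
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"exe": executable, "args": args,
                              "state": "PENDING"}
        return job_id

    def status(self, job_id):
        """Monitoring: report the job's current state."""
        return self._jobs[job_id]["state"]

    def cancel(self, job_id):
        """Management: cancel a previously submitted job."""
        self._jobs[job_id]["state"] = "CANCELLED"

gram = ToyGram()
jid = gram.submit("/bin/simulate", ["--steps", "100"])
```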