
SYLLABUS

UNIT I
INTRODUCTION
1. Evolution of Distributed computing
2. Scalable computing over the Internet
3. Technologies for network based systems
4. Clusters of cooperative computers
5. Grid computing Infrastructures
6. Cloud computing
7. Service oriented architecture
8. Introduction to Grid Architecture and standards
9. Elements of Grid
10. Overview of Grid Architecture.
UNIT-I
Introduction
Evolutionary Trend in Distributed Computing

The chapter mainly assesses the evolutionary changes in machine architecture, operating system platform, network connectivity, and application workload.

 To solve large computational problems, centralized, parallel, and distributed computing systems are used.
 Scalable computing uses multiple computers to solve large-scale problems over the Internet. Supercomputer sites and large data centers must provide high-performance computing services to a huge number of Internet users concurrently.
 Because of this high demand, high performance computing (HPC) alone is no longer optimal for measuring system performance.
 High Throughput Computing (HTC) systems are built with parallel and distributed computing technologies.
 The purpose is to advance network-based computing and web services with the emerging new technologies.

Massive Parallel Processors (MPP)


 HPC and HTC systems are both adapted for consumer and high-end web-scale computing and information services.
 In HTC systems, Peer-to-Peer (P2P) networks are formed for distributed file sharing and content delivery applications.
 P2P systems are built on many client machines, while cloud computing and web service platforms are more focused on HTC applications.

High Performance Computing (HPC)

HPC systems emphasize raw speed performance. The speed of HPC systems has improved, driven by demands from the scientific, engineering, and manufacturing communities. However, the majority of computer users are using desktop computers or large servers when they conduct Internet searches and market-driven computing tasks.

High Throughput Computing (HTC)

HTC systems pay more attention to high-flux computing. The main applications of high-flux computing are Internet searches and web services. Throughput is defined as the number of tasks completed per unit of time. HTC not only improves batch processing speed, but also addresses problems of cost, energy savings, security, and reliability.
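
A minimal sketch of how job throughput could be measured for a batch of independent tasks; the task function, worker count, and job count are illustrative, not from the text:

import time
from concurrent.futures import ThreadPoolExecutor

def job(n):
    # Placeholder workload; a real HTC job would be an independent batch task.
    return sum(i * i for i in range(n))

start = time.time()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(job, [100_000] * 64))
elapsed = time.time() - start

# Throughput = number of tasks completed per unit of time.
print(f"{len(results)} tasks in {elapsed:.2f} s -> {len(results) / elapsed:.1f} tasks/s")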

Computing paradigms

The three main computing paradigms are:


1) Web 2.0 services
2) Internet clouds
3) Internet of Things (IOT)

 With the introduction of SOA, Web 2.0 services became available.
 Advances in virtualization enable the growth of Internet clouds.
 The growth of Radio Frequency Identification (RFID), sensors, and GPS has triggered the development of the IoT.

A) Centralized computing

This is a computing paradigm by which all computer resources, such as processors, memory, and storage, are fully shared and tightly coupled within one integrated OS.
Eg:- Data centers and supercomputers are centralized systems, but they are used in parallel, distributed, and cloud computing applications.

B) Parallel computing

All processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory. Inter-processor communication is accomplished through shared memory or via message passing. A system capable of parallel computing is known as a parallel computer. Programs running in a parallel computer are called parallel programs. The process of writing parallel programs is referred to as parallel programming.
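
A minimal sketch of parallel programming on one machine using Python's multiprocessing module; the worker function and input size are illustrative only:

from multiprocessing import Pool

def square(x):
    # Each worker process computes its share of the problem in parallel.
    return x * x

if __name__ == "__main__":
    with Pool(processes=4) as pool:            # four cooperating cores/processors
        results = pool.map(square, range(16))  # the work is split across workers
    print(results)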

C) Distributed computing
A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network. Information exchange is accomplished by message passing. A computer program that runs in a distributed system is known as a distributed program. The process of writing distributed programs is known as distributed programming.
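
A minimal sketch of the message-passing idea. In a real distributed system the two ends would be autonomous computers communicating over a network; here two local processes connected by a pipe stand in for them, and the message contents are illustrative:

from multiprocessing import Process, Pipe

def worker(conn):
    # The "remote" node receives a request message, computes, and replies.
    request = conn.recv()
    conn.send({"result": sum(request["numbers"])})
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send({"numbers": [1, 2, 3, 4]})   # information exchange via messages
    print(parent_end.recv())                     # {'result': 10}
    p.join()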

D) Cloud computing
An Internet cloud of resources can be either a centralized or a distributed computing system. The cloud applies parallel or distributed computing, or both. Clouds can be built with physical or virtualized resources over large data centers.

E) Ubiquitous computing
It refers to computing with pervasive devices at any place and time using wired or wireless communication.

F) Internet of Things (IOT)


It is a networked connection of everyday objects. It is supported by Internet clouds to achieve ubiquitous computing.

Design objectives of HTC & HPC systems

A) Efficiency

 It measures the utilization rate of resources in an execution model by exploiting massive parallelism in HPC.
 For HTC, efficiency is related to job throughput, data access, storage, and power efficiency.

B) Dependability

 It measures the reliability and self-management from the chip to the system and application levels.
 The purpose is to provide high-throughput services with quality of service (QoS) even under failure conditions.
C) Adaptation in Programming Models

It measures the ability to support large job requests over massive data sets and virtualized cloud resources under various workload and service models.

D) Flexibility in application Deployment


It measures the ability of distributed systems to run well in both HPC and HTC applications.
Scalable computing trends and Parallelism

Degrees of parallelism

a) Bit-level parallelism (BLP): converts bit-serial processing to word-level processing.
b) Instruction-level parallelism (ILP): the processor executes multiple instructions simultaneously.
c) Data-level parallelism (DLP): achieved through SIMD (single instruction, multiple data); it requires more hardware support and compiler assistance (see the sketch after this list).
d) Task-level parallelism (TLP): introduced with multi-core processors and chip multiprocessors (CMPs). It is far from ideal due to the difficulty of programming and compiling code for efficient execution on multi-core CMPs.
e) Job-level parallelism (JLP): arises from the move from parallel processing to distributed processing. Coarse-grain parallelism is built on top of fine-grain parallelism.
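
A small illustration of data-level parallelism: the same operation is applied to many data elements at once. NumPy dispatches the vectorized form to compiled, SIMD-capable code, in contrast with the element-by-element Python loop; the array size is illustrative:

import numpy as np

data = np.arange(1_000_000, dtype=np.float64)

# Scalar, element-by-element processing (no DLP exploited at the Python level).
loop_result = [x * 2.0 for x in data[:10]]    # only a slice, to keep the loop short

# Data-level parallel form: one vector operation over the whole array.
vector_result = data * 2.0
print(loop_result[:3], vector_result[:3])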

Application HPC & HTC systems

A) Science and Engineering: Earthquake prediction, global warming.
B) Business, Education, Services: Telecommunication, content delivery, e-commerce.
C) Industry and Health care: Banking, stock exchange, hospital automation.
D) Internet & web services Government Application: Data centers, cyber security, online-tax
return processing, social networking.
E) Mission-critical applications: Military command and control, intelligent systems.

Trend towards utility computing

Utility computing

It focuses on a business model in which customers receive computing resources from a paid service provider. All grid and cloud platforms are regarded as utility service providers.

HTC is applied mainly in business applications and HPC in scientific applications.

Internet of Things
The IoT refers to the networked interconnection of everyday objects, tools, and devices. These things can be large or small, and they vary with respect to time and place. The idea is to tag every object using RFID, sensor, or related electronic technology. The IoT must be designed to track many static or moving objects simultaneously. It demands universal addressability of all the objects, and it reduces the complexity of identification, search, and storage by setting a threshold to filter out fine-grain objects.
Three communication patterns exist:
a) Human to Human (H2H)
b) Human to Thing (H2T)
c) Thing to Thing (T2T)

The concept behind these communication patterns is to connect things at any time and any place intelligently. However, the IoT is still in an early stage of development.

Cyber Physical System (CPS)

A CPS is the result of the interaction between computational processes and the physical world. A CPS integrates cyber (heterogeneous, asynchronous) objects with physical (concurrent, information-dense) objects.
It merges the 3C technologies:
a) Computation
b) Communication
c) Control
into an intelligent closed feedback loop between the physical world and the information world.
It emphasizes the exploration of virtual reality (VR) applications in the physical world.

Technologies for Network Based System


a) Multi core CPUs and Multithreading
b) GPU computing
c) Memory, storage and wide area networks
d) Virtual machine and virtualization Middleware
e) Data center virtualization for cloud computing

a) Multi core CPUs and Multi threading


 The growth of component and network technologies is crucial to the development of HTC and HPC systems.
 The processor speed is measured in millions of instructions per second (MIPS). The network bandwidth is measured in Mbps or Gbps.
 Each core is essentially a processor with its own private cache (L1 cache). Multiple cores are housed in the same chip with an L2 cache that is shared by all cores.
 Multi-core and multithreaded CPUs are used in high-end processors. Each core can also be multithreaded.
i) Multi core CPUs & many -core GPU Architecture
The number of cores per CPU may increase in the future, but the CPU has reached its limit in extending and exploiting massive DLP (Data Level Parallelism) due to the aforementioned memory wall problem, so many-core GPUs were introduced. Both IA-32 and IA-64 instruction set architectures are built into commercial CPUs. GPUs have been applied in large clusters to build supercomputers and MPPs.

ii) Multi-Threading
Five independent threads of instructions are dispatched to four pipelined data paths in five different processor categories:
a) Four-issue superscalar processor
b) Fine-grain multithreaded processor
c) Coarse-grain multithreaded processor
d) Dual-core CMP
e) Simultaneous multithreaded (SMT) processor

The superscalar processor is single-threaded with four functional units. Each of the three multithreaded processors is four-way multithreaded over the four functional data paths. The dual-core processor has two processing cores, each a single-threaded two-way superscalar processor.

The superscalar processor executes instructions from the same thread. Fine-grain multithreading switches the execution of instructions from a different thread each cycle. Coarse-grain multithreading executes many instructions from the same thread for several cycles before switching to another thread. SMT allows simultaneous scheduling of instructions from different threads in the same cycle. A blank field indicates that no instruction is issued.
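
Hardware multithreading happens inside the processor, but the idea of interleaving several independent instruction streams can be sketched at the software level with OS threads; the workload and thread count below are illustrative:

import threading

def stream(name, n):
    # Each OS thread represents an independent instruction stream to be scheduled.
    total = sum(range(n))
    print(f"{name} finished with total {total}")

threads = [threading.Thread(target=stream, args=(f"thread-{i}", 100_000))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
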
b) GPU Computing

A GPU is a graphics coprocessor or accelerator mounted on a computer's graphics card or video card. A GPU offloads the CPU from tedious graphics tasks in video editing applications. These chips can process a minimum of 10 million polygons per second.

i) Working of GPU

The NVIDIA GPU has been upgraded to 128 cores on a single chip. Each core on a GPU can handle eight threads of instructions. This translates to having up to 1,024 threads (128 cores x 8 threads) executed concurrently on a single GPU.
The CPU is optimized for low latency using caches, while the GPU is optimized to deliver high throughput with explicit management of on-chip memory.
GPUs are designed to handle large numbers of floating-point operations in parallel. The GPU offloads the CPU from all data-intensive calculations.

ii) GPU programming Model


The CPU is a conventional multi-core processor with limited parallelism to exploit. The GPU has a many-core architecture with hundreds of simple processing cores organized as multiprocessors. Each core can have one or more threads. The CPU instructs the GPU to perform massive data processing; the CPU's floating-point kernel computation role is largely offloaded to the many-core GPU.
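
A hedged sketch of this CPU-instructs-GPU offload model using the CuPy library, assuming a CUDA-capable GPU with CuPy installed; the matrix sizes are illustrative:

import numpy as np
import cupy as cp            # NumPy-like API whose operations execute on the GPU

# CPU side: prepare the input data in host memory.
a_host = np.random.rand(4096, 4096).astype(np.float32)
b_host = np.random.rand(4096, 4096).astype(np.float32)

# Offload: copy to GPU memory and launch the massively parallel computation there.
a_dev, b_dev = cp.asarray(a_host), cp.asarray(b_host)
c_dev = a_dev @ b_dev        # thousands of GPU threads cooperate on this product

# Copy the result back for further latency-oriented processing on the CPU.
c_host = cp.asnumpy(c_dev)
print(c_host.shape)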

c) Memory, Storage and Wide Area Networking

1. Memory Technology

The capacity increase of disk arrays will be even greater. Faster processor speeds and larger memory capacities result in a wider gap between processors and memory. The memory wall may become an even worse problem limiting CPU performance.

2. Disks and Storage Technology

The rapid growth of flash memory and Solid State Drives (SSDs) also impacts the future of HPC and HTC systems. Power increases linearly with clock frequency, so the clock rate cannot be increased indefinitely. SSDs are still too expensive to replace stable disk arrays completely.

3. System Area Interconnect

The nodes in small clusters are mostly interconnected by an Ethernet switch or a LAN. A LAN is used to connect client hosts to big servers. A SAN (Storage Area Network) connects servers to network storage such as disk arrays. NAS (Network Attached Storage) connects client hosts directly to disk arrays.
d) Virtual Machines and Virtualization Middleware

Virtual machines offer novel solutions to underutilized resources, application inflexibility, software manageability, and security in existing physical machines.
A VM can be provisioned for any hardware system. The VM is built with virtual resources managed by a guest OS to run a specific application. Between the VMs and the host platform, one needs to deploy a middleware layer called a Virtual Machine Monitor (VMM).
VM1 VM2
----------------------------- Middleware (VMM)
Host platform

a) Bare-metal VM hypervisor: the hypervisor handles the hardware directly.
b) Hosted VM hypervisor: the host OS need not be modified.

Hypervisors are therefore classified as bare-metal or hosted.

The user application running on its dedicated OS could be bundled together as a Virtual
appliance that can be ported to any hardware platform.

Low Level Virtual Machine Operations:-


a) The VMs can be multiplexed between hardware machines
b) The VMs can be suspended and stored in stable storage
c) A suspended VM can be resumed (or) provisioned to a new hardware platform
d) VM can be migrated from one hardware platform to another
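
A hedged sketch of how these low-level operations look through the libvirt Python bindings, assuming a local QEMU/KVM hypervisor and an existing VM named "vm1"; the connection URIs and domain name are illustrative:

import libvirt

conn = libvirt.open("qemu:///system")        # connect to the local hypervisor
dom = conn.lookupByName("vm1")               # an existing, running VM

dom.suspend()                                # pause the VM (state kept in memory)
dom.resume()                                 # resume it on the same host

# Live-migrate the VM to another hardware platform (destination host is illustrative).
dest = libvirt.open("qemu+ssh://other-host/system")
dom.migrate(dest, libvirt.VIR_MIGRATE_LIVE, None, None, 0)
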
e) Data Center Virtualization for Cloud Computing

Data center design emphasizes the performance/price ratio over speed performance alone. A large data center is built with thousands of servers. A typical cost breakdown:

Servers + disks    -- 30%
Chiller            -- 33%
UPS                -- 18%
A/C                -- 9%
Power distribution -- 7%
Total (approx.)    -- 97%

The cloud computing is enabled in four areas. They are:

a) Hardware Virtualization and multi core chips.


b) Utility and grid computing
c) SOA, Web 2.0
d) Autonomic computing and data center automation

System Models for Distributed and Cloud Computing


(OR)
Massive System Classification
The massive systems are classified into four groups. They are
a) Clusters
b) P2P networks
c) Computing grids
d) Internet Clouds

These four system classes may involve millions of computers as participating nodes. These machines work collectively, cooperatively, or collaboratively at various levels.

a) Clusters

 A computing cluster consists of interconnected stand alone computers which work


cooperatively as a single integrated computing resource.
 A cluster of servers is interconnected by a high-bandwidth SAN or LAN with shared I/O devices and disk arrays. The cluster acts as a single computer attached to the Internet via a Virtual Private Network (VPN) gateway. The gateway IP address locates the cluster.
 The system image of a computer is decided by the way the OS manages the shared cluster resources. Most clusters have loosely coupled node computers. All resources of a server node are managed by its own OS.
 An SSI (Single System Image) is an illusion created by software or hardware that presents a collection of resources as one integrated, powerful resource. It makes the cluster appear like a single machine to the user.
 A cluster consists of homogeneous nodes with distributed control. Most clusters run on the Linux OS. The nodes are interconnected by a high-bandwidth network, and special cluster middleware support is needed to create an SSI or high availability (HA).

Cluster Design Issues


1. A cluster-wide OS for complete resource sharing is not yet available.
2. Without middleware, cluster nodes cannot work together effectively.

b) Computational Grids

The computational grid offers an infrastructure that couples computers, software/hardware, special instruments, people, and sensors together. The computers used in a grid include workstations, servers, clusters, and supercomputers. PDAs and laptops can be used as access devices to a grid system.
The grid is built across various IP broadband networks, including LANs and WANs already used by enterprises over the Internet. The grid integrates computing, communication, content, and transactions as rented services.
At the server end, the grid is a network; at the client end, there are wired or wireless terminal devices.

Grids are classified into computational (or data) grids and P2P grids.

c) P2P Network (Peer to Peer)

 P2P architecture offers a distributed model of networked systems. In P2P, every node acts as both a client and a server, providing part of the system resources.
 Peer machines are client computers connected over the Internet. All client machines act autonomously to join or leave the system freely. There is no master-slave relationship and no central database.
 Only the participating peers form the physical network at any time. The physical network is simply an ad hoc network formed at various Internet domains randomly, using TCP/IP protocols.
 Files are distributed among the participating peers. Based on communication, the peer IDs form an overlay network at the logical level. This overlay is a virtual network formed by mapping each physical machine to an ID.
 When a new peer joins the system, its peer ID is added as a node in the overlay network. When a peer leaves the system, its peer ID is removed from the overlay network.

P2P overlay networks are classified as unstructured or structured.

 An unstructured overlay network is characterized by a random graph. There is no fixed route to send messages or files among the nodes, so flooding is applied to send a query. This results in heavy network traffic and nondeterministic search results.
 A structured overlay network follows a certain topology and rules for inserting and removing nodes. Routing mechanisms are developed and used, as sketched below.
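
A minimal sketch of how a structured overlay can assign peer IDs and route a key to the responsible node using consistent hashing (a Chord-like idea); the peer names and key are illustrative:

import hashlib

def ring_id(name, bits=16):
    # Map a peer name (or a file key) onto a 2**bits identifier ring.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

peers = sorted(ring_id(p) for p in ["peer-a", "peer-b", "peer-c", "peer-d"])

def responsible_peer(key):
    # The key is stored on the first peer whose ID is >= the key's ID (wrapping around).
    kid = ring_id(key)
    return next((p for p in peers if p >= kid), peers[0])

print(responsible_peer("music/song.mp3"))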

P2P applications include distributed file sharing of digital content (music, videos, etc.), collaboration tools such as Skype and MSN, distributed P2P computing, and other P2P platforms such as .NET and JXTA.

ISSUES

1. Too many hardware models and architectures exist.
2. Incompatibility exists between software and the OS.
3. Different network connections and protocols are used.

d) Internet clouds

Cloud Computing

A cloud is a pool of virtualized computer resources. A cloud can host a variety of different workloads, including batch-style backend jobs and interactive, user-facing applications. Cloud computing applies a virtual platform with elastic resources on demand by provisioning hardware, software, and data sets dynamically. Resources from data centers are virtualized to form an Internet cloud for paid users to run their applications.

Cloud Service Offerings

a) Infrastructure as a service (IaaS)


b) Platform as a Service (PaaS)
c) Software as a Service (SaaS)

1. Infrastructure as a Service

This model puts together the infrastructure demanded by users. Users can deploy and run multiple VMs running guest OSes for specific applications. The user does not manage or control the underlying cloud infrastructure, but can specify when to request and release the needed resources, as sketched below.
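
A hedged sketch of requesting and releasing IaaS resources with the AWS boto3 SDK, assuming AWS credentials are already configured; the region, AMI ID, and instance type are placeholders:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request a VM; the provider manages the underlying infrastructure.
reservation = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder guest OS image
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = reservation["Instances"][0]["InstanceId"]

# Release the resource when it is no longer needed.
ec2.terminate_instances(InstanceIds=[instance_id])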

2. Platform as a Service

This model enables the user to deploy user-built applications onto a virtualized cloud platform. It includes middleware, databases, development tools, and runtime support. The platform includes both hardware and software integrated with specific programming interfaces. The user is freed from managing the cloud infrastructure.

3. Software as a Service

SaaS refers to browser-initiated application software offered to paying cloud customers. It applies to business processes, industry applications, and ERP. On the customer side, there is no upfront investment in servers. On the provider side, costs are low compared with conventional hosting of user applications.

Deployment modes
a) Private
b) Public
c) Managed &
d) Hybrid cloud

Advantages of cloud over Internet

1. Desired location in areas with protected space and higher energy efficiency.
2. Sharing of peak-load capacity among a large pool of users.
3. Separation of infrastructure maintenance duties from domain specific application
development.
4. Cost reduction
5. Cloud computing programming and application development
6. Service and discovery
7. Privacy, security, copyright and reliability
8. Service agreements, business models and pricing.

Service Oriented Architecture (SOA)

In grids and web services, Java, and CORBA, the architecture is built on the traditional OSI layers that provide the base networking abstractions. On top of this is a base software environment, such as .NET or Apache Axis for web services, or the Java Virtual Machine for Java.
The entity interfaces correspond to WSDL, Java methods, and the CORBA Interface Definition Language (IDL). These interfaces are linked with customized, high-level communication systems. The communication system supports features including RPC, fault recovery, and specialized routing. These communications are built on message-oriented middleware infrastructure such as WebSphere MQ or the Java Message Service (JMS), which provides rich functionality and supports virtualization.

Security is a critical capability that either uses or re-implements the concepts of IPsec and secure sockets. Entity communication is supported by higher-level services.
Entity interfaces are linked with a customized communication system, which in turn is built on message-oriented middleware.

a) Web Services and Others

 A web service specifies all aspects of the service and its environment. The specification is carried in SOAP communication messages. The hosting environment then becomes a universal distributed operating system with fully distributed capability carried by the SOAP messages.
 The REST approach adopts simple, universal principles and delegates most of the difficult problems to application software. It keeps minimal information in the message header; the message body carries all the information. It is appropriate for rapidly evolving technologies and environments.
 In CORBA and Java, the distributed entities are linked with RPCs, and the simplest way to build composite applications is to view the entities as objects. For Java, this means writing a Java program with method calls replaced by Remote Method Invocation (RMI), while CORBA supports a similar model with syntax reflecting the C++ style of object interfaces.
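
A small illustration of the REST style, where a plain HTTP verb is applied to a resource URL and the response body carries the data; the URL is a placeholder, not a real service:

import requests

# REST: minimal header information; the resource URL and body carry the data.
response = requests.get("https://api.example.com/orders/42", timeout=10)
order = response.json()
print(order)

# An RPC/RMI-style design would instead expose a remote method, e.g.
# order_service.get_order(42), hiding the remoteness behind a generated stub.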

Differences between Grid & Cloud

A grid system applies static resources, while a cloud emphasizes elastic resources. A grid can even be built out of multiple clouds. In this respect a grid is better than a cloud because it explicitly supports negotiated resource allocation.
Objective Questions and Answers
S.No. Objective Questions
1. A Computer capable of Parallel Computing is ________________

a) Parallel Computer b) distributed Computer

c) Cloud Computer d) Centralized Computer

2. Computational grids are also called as _____________

a) data grids b) internet grids

c) Service grids d) centralized grids


3. RFID Stands for __________

a) Radio Frequency Identification b) Radio Frequency Identifier

c) Radio Frequency Identity d) Radio Frequency Instruction


4. HTC Stands for _____________

a) High Throughput Computing b) High Task Computing

c) High Term Computing d) High Technology Computing


5. __________ converts bit-serial Processing to word-level Processing.

a) DLP b) TLP

c) BLP d) ILP
6. ___________ are formed for distributed file sharing and content delivery
application.

a) P2P Networks b) Distributed Networks


c) Parallel Networks d) Centralized Networks

7. _____________ is a graphics coprocessor.

a) GPU b) CPU

c) VPU d) AG
8. ______________ is a network connection of everyday objects.

a) IOT b) ITT c) IFFT d) IT


9. In ________________ the processor executes instructions simultaneously

a) BLP b)DLP

c) ILP d)TLP
10. _______________ refers to computing with pervasive devices at any place and any time

a) Ubiquitous b) Distributed

c) Parallel d) Cloud
11. HTC systems pay more attention to ___________

a) High Throughput Computing b) High Flux Computing

c) High Performance Computing d) High Field Computing


12. All processors are either tightly coupled with centralized shared memory in _______

a) Distributed Computing b) Parallel Computing

c) Cloud Computing d) Centralized Computing


13. The process of writing distributed programs is known as ______________

a) distributed programming b) parallel programming

c) cloud programming d) network programming


14. __________ is used to connect client hosts to big servers

a) WAN b) PAN
c) LAN d) MAN
15. _____________ consists of homogeneous nodes with distributed control.

a) Clusters b) Servers

c) Clients d) Gateways
2 Marks Questions and Answers

1. Define multi core CPU.


Ans: Advanced CPUs or microprocessor chips assume a multi-core architecture with dual, quad,
six, or more processing cores. These processors exploit parallelism at ILP and TLP levels. CPU
has reached its limit in terms of exploiting massive DLP due to the aforementioned memory wall
problem.

2. Define GPU.
Ans: A GPU is a graphics coprocessor or accelerator on a computer’s graphics card or video
card. A GPU offloads the CPU from tedious graphics tasks in video editing applications. The
GPU chips can process a minimum of 10 million polygons per second. GPUs have a throughput architecture that exploits massive parallelism by executing many concurrent threads.

3. What is mean by Virtualization Middleware?


Ans: Virtualization software, or middleware, acts as a bridge between the application layer
and the resource layer. This layer provides resource broker, communication service, task
analyzer, task scheduler, security access, reliability control, and information service capabilities.
Between the VMs and the host platform, one needs to deploy a middleware layer called a virtual
machine monitor (VMM).

4. Difference between distributed and parallel computing.

Ans:

Distributed computing: Each processor has its own private memory (distributed memory); information is exchanged by passing messages between the processors. It is loosely coupled. An important goal and challenge of distributed systems is location transparency.

Parallel computing: All processors may have access to a shared memory to exchange information between processors. It is tightly coupled. Large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel").

5. List the design issues in clusters.


Ans:
 A cluster-wide OS for complete resource sharing is not available yet
 Without this middleware, cluster nodes cannot work together effectively to achieve
cooperative computing
 The software environments and applications must rely on the middleware to achieve high
performance

6. Define peer-to-peer network.


Ans: The P2P architecture offers a distributed model of networked systems. Every node acts as
both a client and a server, providing part of the system resources. Peer machines are simply
client computers connected to the Internet. All client machines act autonomously to join or leave
the system freely. This implies that no master-slave relationship exists among the peers. No
central coordination or central database is needed.

7. What is mean by service oriented architecture?


Ans: In grids/web services, Java, and CORBA, an entity is, respectively, a service, a Java object,
and a CORBA distributed object in a variety of languages. These architectures build on the
traditional seven Open Systems Interconnection (OSI) layers that provide the base networking
abstractions. On top of this we have a base software environment, which would be .NET or
Apache Axis for web services, the Java Virtual Machine for Java, and a broker network for
CORBA.

8. What is mean by SaaS?


Ans: Software as a Service refers to browser-initiated application software offered to thousands of paid customers. The SaaS model applies to business processes and industry applications: customer relationship management (CRM), Enterprise Resource Planning (ERP), Human Resources (HR), and collaborative applications.

9. What is mean by IaaS?


Ans: The Infrastructure as a Service model puts together the infrastructure demanded by the user, namely servers, storage, networks, and the data center fabric. The user can deploy and run multiple VMs running guest OSes for specific applications.

10. What is PaaS?


Ans: The Platform as a Service model enables the user to deploy user built applications onto a
virtualized cloud platform. It includes middleware, database, development tools and some
runtime support such as web2.0 and java. It includes both hardware and software integrated with
specific programming interface.

11. Define private cloud.


Ans: The private cloud is built within the domain of an intranet owned by a single organization.
Therefore, they are client owned and managed. Their access is limited to the owning clients and
their partners. Their deployment was not meant to sell capacity over the Internet through publicly
accessible interfaces. Private clouds give local users a flexible and agile private infrastructure to
run service workloads within their administrative domains.

12. Define public cloud.


Ans: A public cloud is built over the Internet, which can be accessed by any user who has paid
for the service. Public clouds are owned by service providers. They are accessed by subscription.
Many companies have built public clouds, namely Google App Engine, Amazon AWS,
Microsoft Azure, IBM Blue Cloud, and Salesforce Force.com. These are commercial providers
that offer a publicly accessible remote interface for creating and managing VM instances within
their proprietary infrastructure.

13. Define hybrid cloud.


Ans: A hybrid cloud is built with both public and private clouds. Private clouds can also support a hybrid cloud model by supplementing local infrastructure with computing capacity from an external public cloud. For example, the Research Compute Cloud (RC2) is a private cloud built by IBM.

14. List the essential characteristics of cloud computing


Ans:
1. On-demand capabilities
2. Broad network access
3. Resource pooling
4. Rapid elasticity
5. Measured service

15. List the design objectives of cloud computing.


Ans:
 Shifting Computing from Desktops to Datacenters
 Service Provisioning and Cloud Economics
 Scalability in Performance
 Data Privacy Protection.
 High Quality of Cloud Services.
 New Standards and Interfaces

16. How does cloud computing provides on-demand functionality?


Ans: Cloud computing is a metaphor used for internet. It provides on-demand access to
virtualized IT resources that can be shared by others or subscribed by you. It provides an easy
way to provide configurable resources by taking it from a shared pool. The pool consists of
networks, servers, storage, applications and services.

17. What is Grid Computing?


Ans: Grid computing is a processor architecture that combines computer resources from various
domains to reach a main objective. In grid computing, the computers on the network can work
on a task together, thus functioning as a supercomputer.

18. What is QOS?


Ans: QoS refers to the ability of a grid computing system to provide the quality of service required by the end-user community. The QoS provided by the grid includes performance, availability, management aspects, business value, and flexibility in pricing.

19. What are the features of data grids?


Ans: The ability to integrate multiple distributed, heterogeneous, and independently managed data sources; the ability to provide data caching and/or replication mechanisms to minimize network traffic; and the ability to provide the necessary data discovery mechanisms, which allow the user to find data based on characteristics of the data.

20. Define – Cloud Computing.


Ans: Cloud computing, often referred to as simply “the cloud,” is the delivery of on-demand
computing resources—everything from applications to data centers—over the Internet on a pay
for- use basis. Storing and accessing data and programs over the Internet instead of your
computer's hard drive.

21. What are the facilities provided by virtual organization?


Ans: The formation of virtual task forces, or groups, to solve specific problems associated with the virtual organization, and the dynamic provisioning and management capabilities of the resources required to meet the SLAs.

22. What are the properties of Cloud Computing?


Ans: There are six key properties of cloud computing: Cloud computing is
 user-centric
 task-centric
 powerful
 accessible
 intelligent
 programmable

23. Sketch the architecture of Cloud.


Ans:
24. What are the types of Cloud service development?
Ans:
 Software as a Service
 Platform as a Service
 Web Services
 On-Demand Computing

25. Define – Distributed Computing.


Ans: Distributed computing is a field of computer science that studies distributed systems.
A distributed system is a software system in which components located on networked
computers communicate and coordinate their actions by passing messages. The components
interact with each other in order to achieve a common goal.

Descriptive Questions and Answers

1. Describe about Evolution of Distributed computing.

Ans: Distributed computing is a field of computer science that studies distributed systems. A
distributed system is a model in which components located on networked computers
communicate and coordinate their actions by passing messages The components interact with
each other in order to achieve a common goal. Three significant characteristics of distributed
systems are: concurrency of components, lack of a global clock, and independent failure of
components. Examples of distributed systems vary from SOA-based systems to massively
multiplayer online games to peer-to-peer applications.
A computer program that runs in a distributed system is called a distributed program, and
distributed programming is the process of writing such programs. There are many alternatives for
the message passing mechanism, including pure HTTP, RPC-like connectors and message
queues.
HISTORY
The use of concurrent processes that communicate by message-passing has its roots in operating
system architectures studied in the 1960s. The first widespread distributed systems were local-
area networks such as Ethernet, which was invented in the 1970s.
ARPANET, the predecessor of the Internet, was introduced in the late 1960s, and ARPANET e-
mail was invented in the early 1970s. E-mail became the most successful application of
ARPANET, and it is probably the earliest example of a large-scale distributed application. In
addition to ARPANET, and its successor, the Internet, other early worldwide computer networks
included Usenet and FidoNet from the 1980s, both of which were used to support distributed
discussion systems.
The study of distributed computing became its own branch of computer science in the late 1970s
and early 1980s. The first conference in the field, Symposium on Principles of Distributed
Computing (PODC), dates back to 1982, and its European counterpart International Symposium
on Distributed Computing (DISC) was first held in 1985.

2. Explain in detail about Scalable computing over the Internet?


Ans:
 To solve large computational problems, centralized, parallel, and distributed computing systems are used.
 Scalable computing uses multiple computers to solve large-scale problems over the Internet. Supercomputer sites and large data centers must provide high-performance computing services to a huge number of Internet users concurrently.
 Because of this high demand, high performance computing (HPC) alone is no longer optimal for measuring system performance.
 High Throughput Computing (HTC) systems are built with parallel and distributed computing technologies.
 The purpose is to advance network-based computing and web services with the emerging new technologies.
Massive Parallel Processors (MPP)

 HPC and HTC systems are both adapted for consumer and high-end web-scale computing and information services.
 In HTC systems, Peer-to-Peer (P2P) networks are formed for distributed file sharing and content delivery applications.
 P2P systems are built on many client machines, while cloud computing and web service platforms are more focused on HTC applications.

High Performance Computing (HPC)

HPC systems emphasize raw speed performance. The speed of HPC systems has improved, driven by demands from the scientific, engineering, and manufacturing communities. However, the majority of computer users are using desktop computers or large servers when they conduct Internet searches and market-driven computing tasks.

High Throughput Computing (HTC)

HTC systems pay more attention to high-flux computing. The main applications of high-flux computing are Internet searches and web services. Throughput is defined as the number of tasks completed per unit of time. HTC not only improves batch processing speed, but also addresses problems of cost, energy savings, security, and reliability.

Computing paradigms
The Three main computing paradigms
1) Web 2.0 services
2) Internet clouds
3) Internet of Things (IOT)

 With the introduction of SOA, Web 2.0 services became available.
 Advances in virtualization enable the growth of Internet clouds.
 The growth of Radio Frequency Identification (RFID), sensors, and GPS has triggered the development of the IoT.

A) Centralized computing

This is a computing paradigm by which all computer resources, such as processors, memory, and storage, are fully shared and tightly coupled within one integrated OS.
Eg:- Data centers and supercomputers are centralized systems, but they are used in parallel, distributed, and cloud computing applications.

B) Parallel computing

All processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory. Inter-processor communication is accomplished through shared memory or via message passing. A system capable of parallel computing is known as a parallel computer. Programs running in a parallel computer are called parallel programs. The process of writing parallel programs is referred to as parallel programming.

C) Distributed computing
A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network. Information exchange is accomplished by message passing. A computer program that runs in a distributed system is known as a distributed program. The process of writing distributed programs is known as distributed programming.

D) Cloud computing
An Internet cloud of resources can be either a centralized or a distributed computing system. The cloud applies parallel or distributed computing, or both. Clouds can be built with physical or virtualized resources over large data centers.

E) Ubiquitous computing
It refers to computing with pervasive devices at any place and time using wired or wireless communication.

F) Internet of Things (IOT)


It is a networked connection of everyday objects. It is supported by Internet clouds to achieve ubiquitous computing.
Design objectives of HTC & HPC systems

A) Efficiency

 It measures the utilization rate of resources in an execution model by exploiting massive parallelism in HPC.
 For HTC, efficiency is related to job throughput, data access, storage, and power efficiency.

B) Dependability

 It measures the reliability and self-management from the chip to the system and application levels.
 The purpose is to provide high-throughput services with quality of service (QoS) even under failure conditions.

C) Adaptation in Programming Models


It measures the ability to support large job requests over massive data sets and virtualized cloud resources under various workload and service models.

D) Flexibility in application Deployment


It measures the ability of distributed systems to run well in both HPC and HTC applications.

Degrees of parallelism

a) Bit-level parallelism (BLP): converts bit-serial processing to word-level processing.
b) Instruction-level parallelism (ILP): the processor executes multiple instructions simultaneously.
c) Data-level parallelism (DLP): achieved through SIMD (single instruction, multiple data); it requires more hardware support and compiler assistance.
d) Task-level parallelism (TLP): introduced with multi-core processors and chip multiprocessors (CMPs). It is far from ideal due to the difficulty of programming and compiling code for efficient execution on multi-core CMPs.
e) Job-level parallelism (JLP): arises from the move from parallel processing to distributed processing. Coarse-grain parallelism is built on top of fine-grain parallelism.

Application HPC & HTC systems

A) Science and Engineering: Earthquake prediction, global warming.
B) Business, Education, Services: Telecommunication, content delivery, e-commerce.
C) Industry and Health care: Banking, stock exchange, hospital automation.
D) Internet & web services Government Application: Data centers, cyber security, online-tax
return processing, social networking.
E) Mission-critical applications: Military command and control, intelligent systems.

Trend towards utility computing

Utility computing

It focuses on a business model in which customers receive computing resources from a paid service provider. All grid and cloud platforms are regarded as utility service providers.

HTC is applied mainly in business applications and HPC in scientific applications.

Internet of Things

The IoT refers to the networked interconnection of everyday objects, tools, and devices. These things can be large or small, and they vary with respect to time and place. The idea is to tag every object using RFID, sensor, or related electronic technology. The IoT must be designed to track many static or moving objects simultaneously. It demands universal addressability of all the objects, and it reduces the complexity of identification, search, and storage by setting a threshold to filter out fine-grain objects.
Three communication patterns exist:
a) Human to Human (H2H)
b) Human to Thing (H2T)
c) Thing to Thing (T2T)

The concept behind these communication patterns is to connect things at any time and any place intelligently. However, the IoT is still in an early stage of development.

Cyber Physical System (CPS)

A CPS is the result of the interaction between computational processes and the physical world. A CPS integrates cyber (heterogeneous, asynchronous) objects with physical (concurrent, information-dense) objects.
It merges the 3C technologies:
a) Computation
b) Communication
c) Control
into an intelligent closed feedback loop between the physical world and the information world.
It emphasizes the exploration of virtual reality (VR) applications in the physical world.

3. Explain in detail about Multicore CPUs and Multithreading Technologies


Ans: a) Multi core CPUs and Multi threading
 The growth of component and network technologies is crucial to the development of HTC and HPC systems.
 The processor speed is measured in millions of instructions per second (MIPS). The network bandwidth is measured in Mbps or Gbps.
 Each core is essentially a processor with its own private cache (L1 cache). Multiple cores are housed in the same chip with an L2 cache that is shared by all cores.
 Multi-core and multithreaded CPUs are used in high-end processors. Each core can also be multithreaded.

i) Multi core CPUs & many -core GPU Architecture


The number of cores per CPU may increase in the future, but the CPU has reached its limit in extending and exploiting massive DLP (Data Level Parallelism) due to the aforementioned memory wall problem, so many-core GPUs were introduced. Both IA-32 and IA-64 instruction set architectures are built into commercial CPUs. GPUs have been applied in large clusters to build supercomputers and MPPs.

ii) Multi-Threading
Five independent threads of instructions are dispatched to four pipelined data paths in five different processor categories:
a) Four-issue superscalar processor
b) Fine-grain multithreaded processor
c) Coarse-grain multithreaded processor
d) Dual-core CMP
e) Simultaneous multithreaded (SMT) processor
The superscalar processor is single-threaded with four functional units. Each of the three multithreaded processors is four-way multithreaded over the four functional data paths. The dual-core processor has two processing cores, each a single-threaded two-way superscalar processor.

The superscalar processor executes instructions from the same thread. Fine-grain multithreading switches the execution of instructions from a different thread each cycle. Coarse-grain multithreading executes many instructions from the same thread for several cycles before switching to another thread. SMT allows simultaneous scheduling of instructions from different threads in the same cycle. A blank field indicates that no instruction is issued.

4. Explain in detail about GPU Computing


Ans: A GPU is a graphics coprocessor or accelerator mounted on a computer's graphics card or video card. A GPU offloads the CPU from tedious graphics tasks in video editing applications. These chips can process a minimum of 10 million polygons per second.

i) Working of GPU

The NVIDIA GPU has been upgraded to 128 cores on a single chip. Each core on a GPU can handle eight threads of instructions. This translates to having up to 1,024 threads (128 cores x 8 threads) executed concurrently on a single GPU.
The CPU is optimized for low latency using caches, while the GPU is optimized to deliver high throughput with explicit management of on-chip memory.
GPUs are designed to handle large numbers of floating-point operations in parallel. The GPU offloads the CPU from all data-intensive calculations.

ii) GPU programming Model


The CPU is a conventional multi-core processor with limited parallelism to exploit. The GPU has a many-core architecture with hundreds of simple processing cores organized as multiprocessors. Each core can have one or more threads. The CPU instructs the GPU to perform massive data processing; the CPU's floating-point kernel computation role is largely offloaded to the many-core GPU.

5. Describe about Virtual Machines and Virtualization Middleware


Ans: Virtual machines offer novel solutions to underutilized resources, application inflexibility, software manageability, and security in existing physical machines.
A VM can be provisioned for any hardware system. The VM is built with virtual resources managed by a guest OS to run a specific application. Between the VMs and the host platform, one needs to deploy a middleware layer called a Virtual Machine Monitor (VMM).
VM1 VM2
----------------------------- Middleware (VMM)
Host platform

a) Bare-metal VM hypervisor: the hypervisor handles the hardware directly.
b) Hosted VM hypervisor: the host OS need not be modified.

Hypervisors are therefore classified as bare-metal or hosted.

The user application running on its dedicated OS could be bundled together as a Virtual
appliance that can be ported to any hardware platform.

Low Level Virtual Machine Operations:-


a) The VMs can be multiplexed between hardware machines
b) The VMs can be suspended and stored in stable storage
c) A suspended VM can be resumed (or) provisioned to a new hardware platform
d) VM can be migrated from one hardware platform to another
6. Explain in detail about clusters of cooperative computers
 Ans: A computing cluster consists of interconnected stand-alone computers which work cooperatively as a single integrated computing resource.
 A cluster of servers is interconnected by a high-bandwidth SAN or LAN with shared I/O devices and disk arrays. The cluster acts as a single computer attached to the Internet via a Virtual Private Network (VPN) gateway. The gateway IP address locates the cluster.
 The system image of a computer is decided by the way the OS manages the shared cluster resources. Most clusters have loosely coupled node computers. All resources of a server node are managed by its own OS.
 An SSI (Single System Image) is an illusion created by software or hardware that presents a collection of resources as one integrated, powerful resource. It makes the cluster appear like a single machine to the user.
 A cluster consists of homogeneous nodes with distributed control. Most clusters run on the Linux OS. The nodes are interconnected by a high-bandwidth network, and special cluster middleware support is needed to create an SSI or high availability (HA).
Cluster Design Issues

1. A cluster-wide OS for complete resource sharing is not yet available.
2. Without middleware, cluster nodes cannot work together effectively.

7. Explain in detail about service oriented architecture

Ans: In grids and web services, Java, and CORBA, the architecture is built on the traditional OSI layers that provide the base networking abstractions. On top of this is a base software environment, such as .NET or Apache Axis for web services, or the Java Virtual Machine for Java.
The entity interfaces correspond to WSDL, Java methods, and the CORBA Interface Definition Language (IDL). These interfaces are linked with customized, high-level communication systems. The communication system supports features including RPC, fault recovery, and specialized routing. These communications are built on message-oriented middleware infrastructure such as WebSphere MQ or the Java Message Service (JMS), which provides rich functionality and supports virtualization.
Security is a critical capability that either uses or re-implements the concepts of IPsec and secure sockets. Entity communication is supported by higher-level services.

Entity interfaces are linked with a customized communication system, which in turn is built on message-oriented middleware.

a) Web Services and Others

 A web service specifies all aspects of the service and its environment. The specification is carried in SOAP communication messages. The hosting environment then becomes a universal distributed operating system with fully distributed capability carried by the SOAP messages.
 The REST approach adopts simple, universal principles and delegates most of the difficult problems to application software. It keeps minimal information in the message header; the message body carries all the information. It is appropriate for rapidly evolving technologies and environments.
 In CORBA and Java, the distributed entities are linked with RPCs, and the simplest way to build composite applications is to view the entities as objects. For Java, this means writing a Java program with method calls replaced by Remote Method Invocation (RMI), while CORBA supports a similar model with syntax reflecting the C++ style of object interfaces.
SYLLABUS
UNIT-II
GRID SERVICES

1. Introduction to Open Grid Services Architecture (OGSA)


2. Motivation
3. Functionality Requirements
4. Practical & Detailed view of OGSA/OGSI
5. Data intensive grid service models
6. OGSA services
UNIT-II
Open Grid Service Architecture (OGSA)

It defines how different components interact with each other in a grid environment.

It is a set of standards defining the way in which information is shared among diverse
components of large, heterogeneous grid systems.

A grid system is a scalable wide area network (WAN) that supports resource sharing and distribution.

Architecture of OGSA

The architecture of OGSA consists of four layers. They are:

1. Physical and Logical Resources Layer

2. Web Service Layer

3. OGSA Architected Grid Services Layer

4. Grid Applications Layer

a) Physical and logical Resources layer

Servers, storage, networks are the physical resources.


Database Managers, workflow managers are logical resources.

Logical resources manage physical resources.

Both logical and physical resources are OGSA enabled services.

b) Web services Layer

A web service is software available online that can interact with other software using XML.

This layer contains the Open Grid Services Infrastructure (OGSI) sub-layer, which specifies grid services and provides a consistent way to interact with them.

It also extends web service capabilities.

It consists of five interfaces:

1. Factory: provides a way to create new grid services (a small illustrative sketch follows this list)

2. Life Cycle: manages grid service life cycles

3. State Management: manages grid service states

4. Service Groups: collections of indexed grid services

5. Notification: manages notification between services and resources
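
A minimal, hedged sketch (in Python, purely for illustration) of the Factory and Lifecycle ideas behind these interfaces: a factory creates grid service instances with unique handles, a registry plays the role of a service group, and destroy() stands in for lifecycle termination. The class and method names are illustrative, not part of the OGSI specification:

import uuid

class GridService:
    def __init__(self):
        self.handle = str(uuid.uuid4())    # unique grid service handle
        self.state = "created"             # state management (simplified)

    def destroy(self):                     # lifecycle: explicit termination
        self.state = "destroyed"

class GridServiceFactory:
    def __init__(self):
        self.registry = {}                 # service group: indexed grid services

    def create_service(self):              # factory: creates new grid services
        service = GridService()
        self.registry[service.handle] = service
        return service

factory = GridServiceFactory()
service = factory.create_service()
service.destroy()
print(service.handle, service.state)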

c) OGSA Architected Services Layer

This layer is mainly classified into three service categories. They are:

1. Grid Core Services

2. Grid Program Execution Services

3. Grid Data Services

1) Grid Core Services:


It is composed of four main types of services:

1. Service Management: assists in installation, maintenance, and troubleshooting tasks in a grid system.

2. Service Communication: includes functions that allow grid services to communicate with one another.

3. Policy Services: provide a framework for the creation, administration, and management of policies for system operation.

4. Security Services: provide authentication and authorization mechanisms to ensure that systems interoperate securely.

2) Grid Program Execution Services

It supports unique grid systems in high performance computing, collaboration, parallelism.

It also supports virtualization of resource processing.

3) Grid Data Services

These services support data virtualization and provide mechanisms for access to distributed resources such as databases and files.

4) Application Layer

This layer comprises the applications that use the grid architected services.

Grid Computing allows networked resources to be combined and used. Grid Computing offers
great benefit to organizations.

Open Grid Services Infrastructure (OGSI)


OGSI provides a detailed and formal technical specification of grid services.

It also defines, in a prescribed way, how grid services work.

GT3 includes a complete implementation of OGSI.

Other implementations are

a) OGSI::Lite (Perl)

b) the UNICORE OGSA demonstrator

OGSI specification defines grid services and builds upon web services.

OGSI creates an extension model for the Web Services Description Language (WSDL), called GWSDL (Grid WSDL), to support interface inheritance and service data for expressing state information.

The Components of OGSI are

 Lifecycle
 State management
 Service Groups
 Factory
 Notification
 Handle Map

Data Intensive Grid Services Models

The grid applications are normally grouped into two categories:

 Computation-intensive
 Data intensive

Data-intensive applications deal with massive amounts of data.

The grid system must be specially designed to discover, transfer, and manipulate these massive data sets.

Transferring the massive data sets is a time consuming task.

A data access method known as caching is often applied to enhance data efficiency in a grid environment.

The replication strategies determine when and where to create a replica of data.
Replication strategies fall into two classes: static and dynamic.

a) Static Method

The locations and number of replicas are determined in advance and will not be modified.

Replication operations require little overhead, but static strategies cannot adapt to changes in demand, bandwidth, and storage availability.

Optimization is required to determine the location and number of data replicas.

b) Dynamic Method

Dynamic strategies can adjust the locations and number of data replicas according to changes in conditions.

Frequent data-moving operations can result in much more overhead than static strategies.

Optimization may be determined based on whether the data replica is being created, deleted, or moved.

The most common replication objectives include preserving locality, minimizing update costs, and maximizing profits.

Grid Data Access Models:

The Grid Data Access Models consists of four access models for organizing a data grid. They are

1. Monadic method

2. Hierarchical model

3. Federation model

4. Hybrid model

1. Monadic Method
This is a centralized data repository model.

All data is saved in a central data repository. When users want to access some data, they have to submit requests directly to the central repository.

No data is replicated for preserving data locality.

Disadvantages:

For a larger grid this model is not efficient in terms of performance and reliability.

Data replication is permitted in this model only when fault tolerance is demanded.

2. Hierarchical Model
It is suitable for building a large data grid which has only one large data access directory

Data may be transferred from the source to a second-level regional center.

Then some data in the regional center is transferred to third-level centers.

After being forwarded several times, specific data objects are accessed directly by users.

Higher level data center has a wider coverage area.

PKI security services are easier to implement in this hierarchical data access model.

3. Federation Model

It is suited for designing a data grid with multiple sources of data supplies.

It is also known as a mesh model

The data is shared, but the data items are owned and controlled by their original owners.

Only authenticated users are authorized to request data from any data source.

Disadvantage:
This mesh model costs the most when the number of grid institutions becomes very large.

4. Hybrid Model

This model combines the best features of the hierarchical and mesh models.

Traditional data transfer technology such as FTP applies for networks with lower bandwidth.

Higher bandwidths are exploited by high-speed data transfer tools such as GridFTP, developed with
the Globus library.

The cost of hybrid model can be traded off between the two extreme models of hierarchical and
mesh connected grids.

Parallel versus Striped Data Transfers

Parallel data transfer

It opens multiple data streams for passing subdivided segments of a file simultaneously.
Although the speed of each stream is the same as in sequential streaming, the total time to move data
in all streams can be significantly reduced compared to an FTP transfer.

Striped data transfer

The data object is partitioned into a number of sections, and each section is placed at an
individual site in the data grid.

When a user requests this piece of data, a data stream is created for each site, and all the sections
of the data object are transferred simultaneously.
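
The principle behind parallel (and striped) transfer can be sketched with ordinary Java threads: the file is divided into byte ranges, each range is fetched on its own stream, and the pieces are reassembled in order. This is only an illustration of the idea; GridFTP realizes it with its own protocol, and fetchSegment() below is a placeholder for a real range request.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of parallel data transfer: fetch byte ranges of one file on several
// streams at once and reassemble them in their original order.
public class ParallelTransfer {

    static byte[] fetchSegment(String source, long offset, int length) {
        // Placeholder: a real implementation would open a stream to 'source'
        // and read 'length' bytes starting at 'offset'.
        return new byte[length];
    }

    public static byte[] download(String source, long fileSize, int streams) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(streams);
        int segment = (int) Math.ceil((double) fileSize / streams);
        List<Future<byte[]>> parts = new ArrayList<>();

        for (int i = 0; i < streams; i++) {
            long offset = (long) i * segment;
            int length = (int) Math.max(0, Math.min(segment, fileSize - offset));
            parts.add(pool.submit(() -> fetchSegment(source, offset, length)));
        }

        byte[] result = new byte[(int) fileSize];
        int pos = 0;
        for (Future<byte[]> part : parts) {      // reassemble segments in order
            byte[] data = part.get();
            System.arraycopy(data, 0, result, pos, data.length);
            pos += data.length;
        }
        pool.shutdown();
        return result;
    }
}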
OGSA Services

OGSA services fall into seven broad areas, defined in terms of capabilities frequently required in
a grid scenario.

Figure shows the OGSA architecture. These services are summarized as follows:

1. Infrastructure Services

It refers to a set of common functionalities, such as naming, typically required by higher level
services.

2. Execution Management Services

These services are concerned with issues such as starting and managing tasks, including placement,
provisioning, and life-cycle management.

Tasks may range from simple jobs to complex workflows or composite services.

3. Data Management Services

These services provide functionality to move data to where it is needed, maintain replicated copies, run
queries and updates, and transform data into new formats.

These services must handle issues such as data consistency, persistency, and integrity.

An OGSA data service is a web service that implements one or more of the base data interfaces
to enable access to, and management of, data resources in a distributed environment. The three base
interfaces, Data Access, Data Factory, and Data Management, define basic operations for representing,
accessing, creating, and managing data.
4. Resource Management Services

These services provide management capabilities for grid resources: management of the resources themselves,
management of the resources as grid components, and management of the OGSA infrastructure.
For example, resources can be monitored, reserved, deployed, and configured as needed to meet
application QoS requirements.
It also requires an information model (semantics) and data model (representation) of the grid
resources and services.

5. Security Services

These services facilitate the enforcement of security-related policies within a (virtual) organization
and support safe resource sharing. Authentication, authorization, and integrity assurance are
essential functionalities provided by these services.


6. Information Services

These services provide efficient production of, and access to, information about the grid and its constituent
resources.

The term “information” refers to dynamic data or events used for status monitoring; relatively
static data used for discovery; and any data that is logged.

Troubleshooting is just one of the possible uses for information provided by these services.
7. Self-Management Services

These services support service-level attainment for a set of services (or resources), with as much
automation as possible, to reduce the costs and complexity of managing the system.

These services are essential in addressing the increasing complexity of owning and operating an
IT infrastructure.

Motivation

1. Facilitate use and management of resources and data across distributed, heterogeneous
environments

2. Deliver seamless, nontrivial QoS

3. Define open, published interfaces in order to provide interoperability of diverse resources

4. Exploit industry-standard integration technologies

5. Develop standards that achieve interoperability

6. Integrate, virtualize, and manage services and resources in a distributed, heterogeneous environment

7. Deliver functionality as loosely coupled, interacting services aligned with industry-accepted web service standards.

Functionality Requirements

• Basic functions: include discovery and brokering, virtual organizations, data sharing,
monitoring, and policy

• Security functions: include multiple security infrastructures, authentication, authorization, accounting, and secure instantiation of new services

• Resource management functions: include advance reservation, notification/messaging, scheduling, load balancing, logging, disaster recovery, workflow management, fault tolerance, and self-healing capabilities

Basic functionality requirements

• Discovery and brokering

– Mechanisms to discover and allocate services, data and resources

– Service brokers check the availability of software and hardware

– Service brokers identify codes and platforms for execution

• Metering and accounting

– Metering to record usage and duration (both devices and licenses)

– Accounting audits the usage based on metering

• Data sharing

– Manages data archives, data in cache.

– Ensure data consistency for indexing, metadata.

• Deployment

– Deploy data in the hosting environment for execution

– Deploy application executables for running apps

• Virtual organisations (VO)

– Create a VO in the grid to provide resources to customer’s job

– Negotiate with other grids, if required, to share data or resources

• Monitoring policies

– Two levels of monitoring

• Low level – Resource management

• High Level – Business process management

– Users can monitor their job

– Cross organizational view of resources

– Failover management

Requirements – Security

• Authentication, Authorization and Accounting

– To enter and use the apps/data in the grid

• Encryption
– Data encryption while transmission from one location to another

• Multilevel security infrastructure

– In a distributed environment, a single password (single sign-on) with multilevel security is required

• Perimeter security solutions

– Providing security on either side of the firewall; security outside the firewall is called perimeter security

• Application and network level firewalls

– Two levels of firewall

• Certification

– Certification from a trusted party

Requirements – Resource management

• Provisioning

– CPU, storage, network, applications, sensor instruments, and licenses require proper provisioning (allocation); provisioning must be uniform and consistent

• Resource virtualization

– Dynamic provisioning of these resources virtually (transparently, without the customer's knowledge)

• Resource usage optimization

– All resources to be effectively used

• Management and monitoring

– Constant monitoring of tasks and conflict-resolution management when required

• Processor scavenging

– Harvesting idle processor cycles so that otherwise unused processing power is made available to tasks

• Load balancing

– Dynamic allocation of jobs to various resources to optimize load

• Advanced reservation

– Application execution to happen on reserved resources


• Pricing

– Billing services based on metering of time and use of resources (e.g., skynet)

• Fault tolerance

– Load distribution, failover mechanisms, disaster recovery, and replication ensure fault tolerance

• Disaster recovery

– Remote data backup centre ensures this

• Self healing

– Application-level/database-level rollback facilities and heartbeats between servers/storage devices, etc., ensure this

• Legacy application management

– Extra care is needed for legacy systems (no modification possible / no source code available)

• Administration and aggregation of services

– Multiple services are involved, so administration of these services is required.

Practical view of OGSA/OGSI

OGSA aims at addressing standardization (for interoperability) by defining the basic framework
of a grid application structure. Some of the mechanisms employed in the standards formulation
of grid computing are listed below.
The objectives of OGSA are to:
• Manage resources across distributed heterogeneous platforms.
• Support QoS-oriented Service Level Agreements (SLAs). The topology of grids is often complex; the interactions between/among grid resources are almost invariably dynamic.
• Provide a common base for autonomic management. A grid can contain a plethora of resources, along with an abundance of combinations of those resources.
Some of these mechanisms and tools are:
• MPICH-G2: grid-enabled message passing (Message Passing Interface)
• CoG Kits, GridPort: portal construction, based on N-tier architectures
• Condor-G: workflow management
• Legion: object models for grid computing
• Cactus: grid-aware numerical solver framework

Portals
• N-tier architectures enabling thin clients, with middle tiers using grid functions
• Thin clients = web browsers
• Middle tier = e.g., Java Server Pages, with Java CoG Kit, GPDK, GridPort utilities
• Bottom tier = various grid resources
• Numerous applications and projects, e.g., Unicore, Gateway, Discover, Mississippi Computational Web Portal, NPACI GridPort, Lattice Portal, Nimrod-G, Cactus, NASA IPG Launchpad, Grid Resource Broker
High-Throughput Computing and Condor
• High-throughput computing: processor cycles per day (or week, month, year) under nonideal circumstances
• "How many times can I run simulation X in a month using all available machines?"
• Condor converts collections of distributively owned workstations and dedicated clusters into a distributed high-throughput computing facility
• Emphasis on policy management and reliability
Object-Based Approaches
• Grid-enabled CORBA (NASA Lewis, Rutgers, ANL, and others): CORBA wrappers for grid protocols, with some initial successes
• Legion (University of Virginia): object models for grid components (e.g., "vault" = storage, "host" = computer)

Cactus: a modular, portable framework for parallel, multidimensional simulations. Codes are constructed by linking:
• A small core providing management services
• Selected modules: numerical methods, grids and domain decompositions, visualization and steering, etc.
• Custom linking/configuration tools
Cactus was developed for astrophysics, but it is not astrophysics specific.
There are two fundamental requirements for describing Web services based on OGSI:
1. The ability to describe interface inheritance, a basic concept in most distributed object systems.
2. The ability to describe additional information elements with the interface definitions.

Detailed view of OGSA/OGSI

This section provides a more detailed view of OGSI, based on the OGSI specification itself. For a more
comprehensive description of these concepts, the reader should consult the specification.
OGSI defines a component model that extends WSDL and XML schema definition to incorporate the concepts of:
• Stateful Web services
• Extension of Web services interfaces
• Asynchronous notification of state change
• References to instances of services
• Collections of service instances
• Service state data that augment the constraint capabilities of XML schema definition
Setting the Context
GGF calls OGSI the "base for OGSA." Specifically, there is a relationship between OGSI and
distributed object systems, and also a relationship between OGSI and the existing (and evolving)
Web services framework.
Relationship to Distributed Object Systems
A given grid service implementation is an addressable and potentially stateful instance that
implements one or more interfaces described by WSDL portTypes. Grid service factories can be
used to create instances implementing a given set of portType(s).
Client-Side Programming Patterns
Another important issue is how OGSI interfaces are likely to be invoked from client applications.
OGSI exploits an important component of the Web services framework: the use of WSDL to describe
multiple protocol bindings, encoding styles, messaging styles (RPC versus document oriented), and
so on, for a given Web service.

Client Use of Grid Service Handles and References


A client gains access to a grid service instance through grid service handles and grid service
references. A grid service handle (GSH) can be thought of as a permanent network pointer to a
particular grid service instance.
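
A minimal sketch of this client-side pattern, using hypothetical interface and operation names rather than the real GT3 classes: the client keeps the permanent GSH, resolves it to a (possibly transient) grid service reference (GSR), and invokes the instance through that reference, re-resolving if the reference becomes stale.

// Hypothetical client-side pattern; the interfaces and the operation name are illustrative.
public class GshClientExample {

    interface HandleResolver {
        ServiceReference resolve(String gridServiceHandle);   // GSH -> current GSR
    }

    interface ServiceReference {
        Object invoke(String operation, Object... args);      // binding-specific invocation
    }

    // The GSH is a permanent, opaque name; the GSR carries the binding details
    // (endpoint, protocol) and may change over the instance's lifetime, so a
    // client typically re-resolves the GSH when an invocation fails.
    static Object callService(HandleResolver resolver, String gsh) {
        ServiceReference gsr = resolver.resolve(gsh);
        return gsr.invoke("getForecast", "Chennai");           // hypothetical operation
    }
}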
Relationship to Hosting Environment
OGSI does not dictate a particular service-provider-side implementation architecture. A variety
of approaches are possible, ranging from implementing the grid service instance directly as an
operating system process to a sophisticated server-side component model such as J2EE. In the
former case, most or even all support for standard grid service behaviors (invocation, lifetime
management, registration, etc.) must be provided by the service implementation itself.

The Grid Service


The purpose of the OGSI document is to specify the (standardized) interfaces and behaviors that
define a grid service
WSDL Extensions and Conventions
OGSI is based on Web services; in particular, it uses WSDL as the mechanism to describe the
public interfaces of grid services.
Service Data
The approach to stateful Web services introduced in OGSI identified the need for a common
mechanism to expose a service instance’s state data to service requestors for query, update, and
change notification.
Motivation and Comparison to JavaBean Properties
The OGSI specification introduces the serviceData concept to provide a flexible, properties-style
approach to accessing the state data of a Web service. The serviceData concept is similar to the
notion of a public instance variable or field in object-oriented programming languages such as
Java, Smalltalk, and C++.
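
The analogy can be made concrete: a JavaBean exposes state through named properties with query (getter) access and change notification, which is close in spirit to how serviceData exposes a grid service instance's state for query, update, and notification. A small illustrative sketch, in which the "load" property is purely hypothetical:

import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// A JavaBean whose "load" property plays the role a serviceData element plays
// for a grid service: queryable state plus notification of state change.
public class ComputeNodeBean {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private double load;                                  // analogous to an SDE such as "currentLoad"

    public double getLoad() {                             // query, like findServiceData
        return load;
    }

    public void setLoad(double newLoad) {                 // update
        double old = this.load;
        this.load = newLoad;
        changes.firePropertyChange("load", old, newLoad); // notification of the change
    }

    public void addLoadListener(PropertyChangeListener l) {  // subscribe, like a notification source
        changes.addPropertyChangeListener("load", l);
    }
}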
Extending portType with serviceData

OGSI defines a new portType child element named serviceData, used to define serviceData
elements, or SDEs, associated with that portType. These serviceData element definitions are
referred to as serviceData declarations, or SDDs.
serviceDataValues
Each service instance is associated with a collection of serviceData elements: those serviceData
elements defined within the various portTypes that form the service's interface and also,
potentially, additional serviceData elements added at runtime.
SDE Aggregation within a portType Interface Hierarchy
WSDL 1.2 has introduced the notion of multiple portType extension, and one can model that
construct within the GWSDL namespace. A portType can extend zero or more other portTypes
Dynamic serviceData Elements
Although many serviceData elements are most naturally defined in a service’s interface
definition, situations can arise in which it is useful to add or move serviceData elements
dynamically to or from an instance.

Objective Questions and Answers


S.No. Objective Questions
1. __________ is a standard for building Grids

a) OGSA b) OGSI

c) OGSE d) OGSD

2. ___________ is established through negotiation between the service requester and provider prior to service execution.

a) Service level agreement b) Service level attainment

c) Migration d) Connection
3. WSDL Stands for ______________

a) Web Service Definition Language b) Web Server Definition Language

c) Web Service Defined Language d) Web Server Defined Language


4. A ________ is a web service that provides a set of well-defined interfaces,
following specific conventions that are expressed using WSDL.

a) grid service b) web service

c) P2P d) internet service

5. __________ deals with issues such as task management, placement, provisioning, and life-cycle management of services.

a) Data Management Services b) Execution Management Services

c) Resource Management Services d) Security Services


6. ____________ provide functionality to move data to where it is requested.

a) Data Management Services b) Execution Management Services

c) Resource Management Services d) Security Services

7. _________________ automate management and reduce the costs and complexity of managing the system.

a) Data Management Services b) Execution Management Services

c) Resource Management Services d) Self-Management Services

8. Authentication, authorization, and integrity assurance are essential functionalities provided by _______________

a) Data Management Services b) Execution Management Services

c) Resource Management Services d) Security Services

9. Which model best suits the applications with a data grid with multiple sources of
data supplies?

a) Mesh Model b)Hierarchical Model

c) Monadic Model d) Federation Model


10. _________ resembles the mesh model
a) Bus Model b)Hierarchical Model

c) Monadic Model d) Federation Model

11. In which model is the data saved in a central data repository?

a) Bus Model b)Hierarchical Model

c) Monadic Model d) Federation Model


12. GSH stands for _________________

a) Grid Server Handler b) Grid Service Handler

c) Grid Service Host d) Grid Service Handle


13. ___________ port returns the grid service reference of the associated grid service handle.

a) Handle Map b) Registry

c) Service Group d) Service Group Entry


14. ______________ port finds the service data

a) Handle Map b) Registry

c) Service Group d) Grid Service


15. OGSI Stands for _______________

a) Open Grid Services Infrastructure b) Open Grid Services Internet

c) Open Grid Services Intranet d) Open Grid Services Information

2 Marks Questions and Answers

1. Define – OGSA.
Ans: Open Grid Services Architecture (OGSA) is a set of standards defining the way in which
information is shared among diverse components of large, heterogeneous grid systems. In this
context, a grid system is a scalable wide area network (WAN) that supports resource sharing and
distribution. OGSA is a trademark of the Open Grid Forum.
2. Define – OGSI.
Ans: The Open Grid Services Infrastructure (OGSI) was published by the Global Grid Forum
(GGF) as a proposed recommendation in June 2003. It was intended to provide an
infrastructure layer for the Open Grid Services Architecture (OGSA). OGSI takes the
statelessness issues (along with others) into account by essentially extending Web services to
accommodate grid computing resources that are both transient and stateful.

3. What are the major goals of OGSA?


Ans:
 Identify the use cases that can drive the OGSA platform components.
 Identify and define the core OGSA platform components.
 Define hosting and platform specific bindings.
 Define resource models and resource profiles with interoperable solutions.

4. What are the layers available in OGSA architectural organizations?


Ans:
 Native platform services and transport mechanisms.
 OGSA hosting environment.
 OGSA transport and security.
 OGSA infrastructure (OGSI).
 OGSA basic services (meta-OS and domain services)

5. What is meant by grid infrastructure?


Ans: Grid infrastructure is a complex combination of a number of capabilities and resources
identified for the specific problem and environment being addressed. It forms the core
foundations for successful grid applications.

6. Define GRAM.
Ans: GRAM provides resource allocation, process creation, monitoring, and management
services. The most common use of GRAM is the remote job submission and control facility.
GRAM simplifies the use of remote systems.

7. What are the different layers of grid architecture?


Ans:
 Fabric Layer: Interface to local resources
 Connectivity Layer: Manages Communications
 Collective Layer: Coordinating Multiple Resources
 Application Layer: User Defined Application.

8. List Grid Core Services?


Ans:
 Infrastructure Services
 Resource Management Services
 Security Services
 Information Services
 Self-Management Services
9. List OGSA Services?
Ans:
 Grid Core services
 Program execution services or Execution Management Services
 Grid data services or Data Management Services

10. Discuss briefly about Hybrid Model?


Ans:
 Combination of hierarchical and federation models
 Needs high bandwidth network
 Uses the GridFTP protocol
11. Draw hybrid model layout?
Ans:

12. Discuss briefly about Federation model?


Ans:
• Multiple databases
• Geographically distributed
• Known as “mesh data access model”
• Only authorized users at specific locations can access their respective databases.

13. Draw Federation Model?


Ans:

14. Discuss briefly about hierarchical model?


Ans:
• Data centers are organized into first, second, and lower levels
• First-level data is replicated to the second level, and so on
• Data in each level is accessed by its own grid users

15. Draw hierarchical model?


Ans:
16. Discuss briefly about monadic model?
Ans:
• Used when a centralized data repository is required
• All data is stored in the repository
• The repository (all data) is replicated within the grid only when fault tolerance is demanded
• To access data:
– The user submits a request to the central repository
– Permission is given based on prior registration
• Fault tolerance, performance, and reliability are very poor for larger grids.

17. Draw monadic model?


Ans:

18. List Grid data access models?


Ans:
 Monadic model
 Hierarchical model
 Federation model
 Hybrid model

19. Define Data Caching?


Ans: Caching is a data access optimization in which frequently accessed data is stored close to where it is needed, improving data access efficiency in a grid environment.

20. What is Data replication?


Ans:
 The same data is copied and stored at multiple grid locations
 Users access data from multiple locations in parallel, based on locality of reference (e.g., a Google search).

21. What are the benefits of data replication?


Ans:
 Data availability is improved
 One data storage becomes backup for another data storage.

22. Discuss Program Execution service?


Ans:
 Supports unique grid systems in high performance computing, collaboration, parallelism
and virtualization
 Support virtualization of resource processing
 Supports Distributed logging
23. List OGSA architectured services?
Ans:
 Grid Core Services
 Grid Program Execution Services
 Grid Data Services
 Additionally any domain specific services, only if required

24. Define Stateful?


Ans: Stateful means the computer or program keeps track of the state of interaction, usually by
setting values in a storage field designated for that purpose.

25. Define Stateless?


Ans: Stateless means there is no record of previous interactions and each interaction request has
to be handled based entirely on information that comes with it.
Descriptive Questions and Answers

1. Explain in detail about Open Grid Services Architecture


Ans: The OGSA is an open source grid service standard jointly developed by academia and the
IT industry under coordination of a working group in the Global Grid Forum (GGF). The
standard was specifically developed for the emerging grid and cloud service communities. The
OGSA is extended from web service concepts and technologies. The standard defines a common
framework that allows businesses to build grid platforms across enterprises and business
partners. The intent is to define the standards required for both open source and commercial
software to support a global grid infrastructure
OGSA Framework
The OGSA was built on two basic software technologies: the Globus Toolkit widely adopted as a
grid technology solution for scientific and technical computing, and web services (WS 2.0) as a
popular standards-based framework for business and network applications. The OGSA is
intended to support the creation, termination, management, and invocation of stateful, transient
grid services via standard interfaces and conventions
OGSA Interfaces
The OGSA is centered on grid services. These services demand special well-defined application
interfaces.
These interfaces provide resource discovery, dynamic service creation, lifetime management,
notification, and manageability. These properties have significant implications regarding how a
grid service is named, discovered, and managed.

Grid Service Handle


A GSH is a globally unique name that distinguishes a specific grid service instance from all
others. The status of a grid service instance could be that it exists now or that it will exist in the
future.
These instances carry no protocol or instance-specific addresses or supported protocol bindings.
Instead, these information items are encapsulated along with all other instance-specific
information. In order to interact with a specific service instance, a single abstraction is defined as
a GSR.
Grid Service Migration
This is a mechanism for creating new services and specifying assertions regarding the lifetime of
a service. The OGSA model defines a standard interface, known as a factory, to implement this
reference. This creates a requested grid service with a specified interface and returns the GSH
and initial GSR for the new service instance.
If the time period expires without having received a reaffirmed interest from a client, the service
instance can be terminated on its own and release the associated resources accordingly
OGSA Security Models
The grid works in a heterogeneous distributed environment, which is essentially open to the
general public. We must be able to detect intrusions or stop viruses from spreading by
implementing secure conversations, single logon, access control, and auditing for
nonrepudiation.
At the security policy and user levels, we want to apply a service or endpoint policy, resource
mapping rules, authorized access of critical resources, and privacy protection. At the Public Key
Infrastructure (PKI) service level, the OGSA demands security binding with the security protocol
stack and bridging of certificate authorities (CAs), use of multiple trusted intermediaries, and so
on.
2. Describe in detail about basic functionality requirements and System Properties Requirements
Ans: Basic functionality requirements
Discovery and brokering: Mechanisms are required for discovering and/or allocating services,
data, and resources with desired properties. For example, clients need to discover network
services before they are used, service brokers need to discover hardware and software
availability, and service brokers must identify codes and platforms suitable for execution
requested by the client
Metering and accounting: Applications and schemas for metering, auditing, and billing for IT
infrastructure and management use cases. The metering function records the usage and duration,
especially metering the usage of licenses. The auditing function audits usage and application
profiles on machines, and the billing function bills the user based on metering.
Data sharing: Data sharing and data management are common as well as important grid
applications. Mechanisms are required for accessing and managing data archives, for caching data
and managing its consistency, and for indexing and discovering data and metadata.
Deployment: Data is deployed to the hosting environment that will execute the job (or made
available in or via a high-performance infrastructure). Also, applications (executable) are
migrated to the computer that will execute them
Virtual organizations (VOs):The need to support collaborative VOs introduces a need for
mechanisms to support VO creation and management, including group membership services. For
the commercial data center use case, the grid creates a VO in a data center that provides IT
resources to the job upon the customer’s job request.
Monitoring: A global, cross-organizational view of resources and assets for project and fiscal
planning, troubleshooting, and other purposes. The users want to monitor their applications
running on the grid. Also, the resource or service owners need to surface certain states so that the
user of those resources or services may manage the usage using the state information
Policy: An error and event policy guides self-controlling management, including failover and
provisioning. It is important to be able to represent policy at multiple stages in hierarchical
systems, with the goal of automating the enforcement of policies that might otherwise be
implemented as organizational processes or managed manually
System Properties Requirements
Fault tolerance. Support is required for failover, load redistribution, and other techniques used to
achieve fault tolerance. Fault tolerance is particularly important for long running queries that can
potentially return large amounts of data, for dynamic scientific applications, and for commercial
data center applications.
Disaster recovery: Disaster recovery is a critical capability for complex distributed grid
infrastructures. For distributed systems, failure must be considered one of the natural behaviors
and disaster recovery mechanisms must be considered an essential component of the design.
Self-healing capabilities of resources, services and systems are required. Significant manual
effort should not be required to monitor, diagnose, and repair faults.
Legacy application management: Legacy applications are those that cannot be changed, but
they are too valuable to give up or too complex to rewrite. Grid infrastructure has to be built around
them so that they can continue to be used.
Administration: Be able to "codify" and "automate" the normal practices used to administer
the environment. The goal is that systems should be able to self-organize and self-describe to
manage low-level configuration details based on higher-level configurations and management
policies specified by administrators.
Agreement-based interaction: Some initiatives require agreement-based interactions capable of
specifying and enacting agreements between clients and servers (not necessarily human) and
then composing those agreements into higher-level end-user structures
Grouping/aggregation of services: The ability to instantiate (compose) services using some set
of existing services is a key requirement. There are two main types of composition techniques:
selection and aggregation. Selection involves choosing to use a particular service among many
services with the same operational interface.

3. Explain the following functionality requirements


Ans: a) Security requirements
Grids also introduce a rich set of security requirements; some of these requirements are:
Multiple security infrastructures: Distributed operation implies a need to interoperate with and
manage multiple security infrastructures. For example, for a commercial data center application,
isolation of customers in the same commercial data center is a crucial requirement; the grid
should provide not only access control but also performance isolation.
Perimeter security solutions: Many use cases require applications to be deployed on the other
side of firewalls from the intended user clients. Intergrid collaboration often requires crossing
institutional firewalls.
Authentication, Authorization, and Accounting: Obtaining application programs and
deploying them into a grid system may require authentication/authorization. In the commercial
data center use case, the commercial data center authenticates the customer and authorizes the
submitted request when the customer submits a job request.
Encryption: The IT infrastructure and management use case requires encrypting of the
communications, at least of the payload
Application and Network-Level Firewalls: This is a long-standing problem; it is made
particularly difficult by the many different policies one is dealing with and the particularly harsh
restrictions at international sites.
Certification: A trusted party certifies that a particular service has certain semantic behavior.
For example, a company could establish a policy of only using e-commerce services certified by
Yahoo
b) Resource Management Requirements
Resource management is another multilevel requirement, encompassing SLA negotiation,
provisioning, and scheduling for a variety of resource types and activities
Provisioning: Computer processors, applications, licenses, storage, networks, and instruments
are all grid resources that require provisioning. OGSA needs a framework that allows resource
provisioning to be done in a uniform, consistent manner.
Resource virtualization: Dynamic provisioning implies a need for resource virtualization
mechanisms that allow resources to be transitioned flexibly to different tasks as required; for
example, when bringing more Web servers online as demand exceeds a threshold.
Optimization of resource usage: while meeting cost targets (i.e., dealing with finite resources).
Mechanisms to manage conflicting demands from various organizations, groups, projects, and
users and implement a fair sharing of resources and access to the grid
Transport management: For applications that require some form of real-time scheduling, it can
be important to be able to schedule or provision bandwidth dynamically for data transfers or in
support of the other data sharing applications. In many (if not all) commercial applications,
reliable transport management is essential to obtain the end-to-end QoS required by the
application
Management and monitoring: Support for the management and monitoring of resource usage
and the detection of SLA or contract violations by all relevant parties. Also, conflict management
is necessary;
Processor scavenging: This is an important tool that allows an enterprise or VO to aggregate
computing power that would otherwise go to waste.
Scheduling of service tasks: Long recognized as an important capability for any information
processing system, scheduling becomes extremely important and difficult for distributed grid
systems.
Load balancing: In many applications, it is necessary to make sure deadlines are met
or resources are used uniformly. These are both forms of load balancing that must be made
possible by the underlying infrastructure
Advanced reservation: This functionality may be required in order to execute the application on
reserved resources.
Notification and messaging: Notification and messaging are critical in most dynamic scientific
problems.
Logging: It may be desirable to log processes such as obtaining/deploying application programs
because, for example, the information might be used for accounting. This functionality is
represented as "metering and accounting."
Workflow management: Many applications can be wrapped in scripts or processes that require
licenses and other resources from multiple sources. Applications coordinate using the file system
based on events
Pricing: Mechanisms for determining how to render appropriate bills to users of a grid.

4. Describe in detail about Practical view of OGSA/OGSI


Ans: OGSA aims at addressing standardization (for interoperability) by defining the basic framework
of a grid application structure. Some of the mechanisms employed in the standards formulation of
grid computing are listed below.
The objectives of OGSA are to:
• Manage resources across distributed heterogeneous platforms.
• Support QoS-oriented Service Level Agreements (SLAs). The topology of grids is often complex; the interactions between/among grid resources are almost invariably dynamic.
• Provide a common base for autonomic management. A grid can contain a plethora of resources, along with an abundance of combinations of those resources.
Some of these mechanisms and tools are:
• MPICH-G2: grid-enabled message passing (Message Passing Interface)
• CoG Kits, GridPort: portal construction, based on N-tier architectures
• Condor-G: workflow management
• Legion: object models for grid computing
• Cactus: grid-aware numerical solver framework

Portals
• N-tier architectures enabling thin clients, with middle tiers using grid functions
• Thin clients = web browsers
• Middle tier = e.g., Java Server Pages, with Java CoG Kit, GPDK, GridPort utilities
• Bottom tier = various grid resources
• Numerous applications and projects, e.g., Unicore, Gateway, Discover, Mississippi Computational Web Portal, NPACI GridPort, Lattice Portal, Nimrod-G, Cactus, NASA IPG Launchpad, Grid Resource Broker
High-Throughput Computing and Condor
• High-throughput computing: processor cycles per day (or week, month, year) under nonideal circumstances
• "How many times can I run simulation X in a month using all available machines?"
• Condor converts collections of distributively owned workstations and dedicated clusters into a distributed high-throughput computing facility
• Emphasis on policy management and reliability
Object-Based Approaches
• Grid-enabled CORBA (NASA Lewis, Rutgers, ANL, and others): CORBA wrappers for grid protocols, with some initial successes
• Legion (University of Virginia): object models for grid components (e.g., "vault" = storage, "host" = computer)

Cactus: a modular, portable framework for parallel, multidimensional simulations. Codes are constructed by linking:
• A small core providing management services
• Selected modules: numerical methods, grids and domain decompositions, visualization and steering, etc.
• Custom linking/configuration tools
Cactus was developed for astrophysics, but it is not astrophysics specific.
There are two fundamental requirements for describing Web services based on OGSI:
1. The ability to describe interface inheritance, a basic concept in most distributed object systems.
2. The ability to describe additional information elements with the interface definitions.

5. Explain in detail about Detailed view of OGSA/OGSI


This section provides a more detailed view of OGSI, based on the OGSI specification itself. For a more
comprehensive description of these concepts, the reader should consult the specification.
OGSI defines a component model that extends WSDL and XML schema definition to incorporate the concepts of:
• Stateful Web services
• Extension of Web services interfaces
• Asynchronous notification of state change
• References to instances of services
• Collections of service instances
• Service state data that augment the constraint capabilities of XML schema definition
Setting the Context
GGF calls OGSI the "base for OGSA." Specifically, there is a relationship between OGSI and
distributed object systems, and also a relationship between OGSI and the existing (and evolving) Web
services framework.
Relationship to Distributed Object Systems
A given grid service implementation is an addressable and potentially stateful instance that implements
one or more interfaces described by WSDL portTypes. Grid service factories can be used to create
instances implementing a given set of portType(s).
Client-Side Programming Patterns
Another important issue is how OGSI interfaces are likely to be invoked from client applications.
OGSI exploits an important component of the Web services framework: the use of WSDL to describe
multiple protocol bindings, encoding styles, messaging styles (RPC versus document oriented), and
so on, for a given Web service.

Client Use of Grid Service Handles and References


A client gains access to a grid service instance through grid service handles and grid service
references. A grid service handle (GSH) can be thought of as a permanent network pointer to a
particular grid service instance.

Relationship to Hosting Environment


OGSI does not dictate a particular service-provider-side implementation architecture. A variety
of approaches are possible, ranging from implementing the grid service instance directly as an
operating system process to a sophisticated server-side component model such as J2EE. In the
former case, most or even all support for standard grid service behaviors (invocation, lifetime
management, registration, etc.) must be provided by the service implementation itself.

The Grid Service


The purpose of the OGSI document is to specify the (standardized) interfaces and behaviors that
define a grid service
WSDL Extensions and Conventions
OGSI is based on Web services; in particular, it uses WSDL as the mechanism to describe the
public interfaces of grid services.
Service Data
The approach to stateful Web services introduced in OGSI identified the need for a common
mechanism to expose a service instance’s state data to service requestors for query, update, and
change notification.
Motivation and Comparison to JavaBean Properties
The OGSI specification introduces the serviceData concept to provide a flexible, properties-style
approach to accessing the state data of a Web service. The serviceData concept is similar to the
notion of a public instance variable or field in object-oriented programming languages such as
Java, Smalltalk, and C++.
Extending portType with serviceData
OGSI defines a new portType child element named serviceData, used to define serviceData
elements, or SDEs, associated with that portType. These serviceData element definitions are
referred to as serviceData declarations, or SDDs.
serviceDataValues
Each service instance is associated with a collection of serviceData elements: those serviceData
elements defined within the various portTypes that form the service's interface and also,
potentially, additional serviceData elements added at runtime.
SDE Aggregation within a portType Interface Hierarchy
WSDL 1.2 has introduced the notion of multiple portType extension, and one can model that
construct within the GWSDL namespace. A portType can extend zero or more other portTypes
Dynamic serviceData Elements
Although many serviceData elements are most naturally defined in a service’s interface
definition, situations can arise in which it is useful to add or move serviceData elements
dynamically to or from an instance.
6. Short notes on OGSA Service
a) Metering Service
Different grid deployments may integrate different services and resources and feature different
underlying economic motivations and models; however, regardless of these differences, it is a
quasiuniversal requirement that resource utilization can be monitored, whether for purposes of
cost allocation (i.e., charge back), capacity and trend analysis, dynamic provisioning, grid-service
pricing, fraud and intrusion detection, and/or billing.
A grid service may consume multiple resources and a resource may be shared by multiple
service instances. Ultimately, the sharing of underlying resources is managed by middleware and
operating systems.
A metering interface provides access to a standard description of such aggregated data (metering
serviceData). A key parameter is the time window over which measurements are aggregated. In
commercial Unix systems, measurements are aggregated at administrator-defined intervals
(chronological entry), usually daily, primarily for the purpose of accounting.
Several use cases require metering systems that support multitier, end-to-end flows involving
multiple services. An OGSA metering service must be able to meter the resource consumption of
configurable classes of these types of flows executing on widely distributed, loosely coupled
server, storage, and network resources. Configurable classes should support, for example, a
departmental charge-back scenario where incoming requests and their subsequent flows are
partitioned into account classes determined by the department providing the service.
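
In essence, a metering interface aggregates usage records over a configurable time window. The Java sketch below uses a hypothetical record format (it is not an OGSA-defined interface, and Java 16+ records are used for brevity) to accumulate per-account, per-resource consumption over such a window, which a charge-back or billing step could then consume.

import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of metering aggregation: usage records are grouped by account and
// resource and summed over a configurable window (e.g., a daily chronological entry).
public class MeteringService {

    public record UsageRecord(String account, String resource, double units, Instant when) {}

    private final List<UsageRecord> records = new ArrayList<>();

    public void meter(UsageRecord r) {               // called by instrumented services/resources
        records.add(r);
    }

    // Aggregate usage per (account, resource) within [start, end).
    public Map<String, Double> aggregate(Instant start, Instant end) {
        Map<String, Double> totals = new HashMap<>();
        for (UsageRecord r : records) {
            if (!r.when().isBefore(start) && r.when().isBefore(end)) {
                totals.merge(r.account() + "/" + r.resource(), r.units(), Double::sum);
            }
        }
        return totals;  // e.g., {"deptA/cpu-hours" -> 120.0, "deptA/storage-GB" -> 35.5}
    }
}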
b) Service Groups and Discovery Services
GSHs and GSRs together realize a two-level naming scheme, with HandleResolver services
mapping from handles to references; however, GSHs are not intended to contain semantic
information and indeed may be viewed for most purposes as opaque. Thus, other entities (both
humans and applications) need other means for discovering services with particular properties,
whether relating to interface, function, availability, location, or policy.
Attribute naming schemes associate various metadata with services and support retrieval via
queries on attribute values. A registry implementing such a scheme allows service providers to
publish the existence and properties of the services that they provide, so that service consumers
can discover them
A ServiceGroup is a collection of entries, where each entry is a grid service implementing the
ServiceGroupEntry interface. The ServiceGroup interface also extends the GridService interface.
It is also envisioned that many registries will inherit and implement the notificationSource
interface so as to facilitate client subscription to registry state changes.
Path naming or directory schemes (as used, for example, in file systems) represent an
alternative approach to attribute schemes for organizing services into a hierarchical name space
that can be navigated. The two approaches can be combined, as in LDAP.
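
Attribute-based discovery can be pictured as a registry that stores metadata attributes for each published service handle and answers queries over those attributes. The toy Java sketch below illustrates the idea only; it is not the ServiceGroup or registry interface itself.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy attribute-naming registry: providers publish a service handle with metadata
// attributes, and consumers discover services by querying on attribute values.
public class DiscoveryRegistry {

    private final Map<String, Map<String, String>> entries = new HashMap<>(); // GSH -> attributes

    public void publish(String gsh, Map<String, String> attributes) {
        entries.put(gsh, new HashMap<>(attributes));
    }

    public void unregister(String gsh) {
        entries.remove(gsh);
    }

    // Return all handles whose attributes contain every requested key/value pair.
    public List<String> find(Map<String, String> query) {
        return entries.entrySet().stream()
                .filter(e -> e.getValue().entrySet().containsAll(query.entrySet()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
// Example (hypothetical attributes): publish("gsh://hosts/render-7",
// Map.of("function", "rendering", "qos", "gold")); then find(Map.of("function", "rendering"))
// returns the matching handles.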
c) Rating Service
A rating interface needs to address two types of behaviors. Once the metered information is
available, it has to be translated into financial terms. That is, for each unit of usage, a price has to
be associated with it. This step is accomplished by the rating interfaces, which provide
operations that take the metered information and a rating package as input and output the usage
in terms of chargeable amounts.
For example, if a commercial UNIX system indicates that 10 hours of prime-time resource and 10 hours
of nonprime-time resource were consumed, and the rating package prices each prime-time hour at 2
dollars and each nonprime-time hour at 1 dollar, the rating service applies the pricing indicated in
the rating package and reports a chargeable amount of 30 dollars.
Furthermore, when a business service is developed, a rating service is used to aggregate the costs
of the components used to deliver the service, so that the service owner can determine the
pricing, terms, and conditions under which the service will be offered to subscribers.
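
The prime/nonprime example above reduces to a simple calculation: 10 hours at $2 plus 10 hours at $1 gives $30. A small Java sketch of that rating step, with the rating package represented as a hypothetical map from rate class to price per hour:

import java.util.Map;

// Sketch of a rating step: translate metered usage into a chargeable amount
// using a rating package (here simply a map of prices per hour).
public class RatingService {

    public static double rate(Map<String, Double> meteredHours, Map<String, Double> pricePerHour) {
        double total = 0.0;
        for (Map.Entry<String, Double> usage : meteredHours.entrySet()) {
            total += usage.getValue() * pricePerHour.getOrDefault(usage.getKey(), 0.0);
        }
        return total;
    }

    public static void main(String[] args) {
        // 10 prime-time hours at $2/h plus 10 nonprime-time hours at $1/h = $30
        Map<String, Double> usage = Map.of("prime", 10.0, "nonprime", 10.0);
        Map<String, Double> rates = Map.of("prime", 2.0, "nonprime", 1.0);
        System.out.println("Chargeable amount: $" + rate(usage, rates));  // prints 30.0
    }
}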
d) Other Data Services
A variety of higher-level data interfaces can and must be defined on top of the base
data interfaces, to address functions such as:
• Data access and movement
• Data replication and caching
• Data and schema mediation
• Metadata management and lookup
Data Replication. Data replication can be important as a means of meeting performance
objectives by allowing local computer resources to have access to local data. Although closely
related to caching (indeed, a "replica store" and a "cache" may differ only in their policies),
replicas may provide different interfaces.
Data Caching. In order to improve the performance of access to remote data items, caching services
will be employed. At a minimum, caching services for traditional flat-file data will be
employed. Caching of other data types, such as views on RDBMS data, streaming data, and
application binaries, is also envisioned. Key questions include:
• Consistency: Is the data in the cache the same as in the source? If not, what is the coherence window? Different applications have very different requirements.
• Cache invalidation protocols: How and when is cached data invalidated?
• Write through or write back: When are writes to the cache committed back to the original data source?
• Security: How will access control to cached items be handled? Will access control enforcement be delegated to the cache, or will access control be somehow enforced by the original data source?
• Integrity of cached data: Is the cached data kept in memory or on disk? How is it protected from unauthorized access? Is it encrypted?
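
One simple way to bound the coherence window mentioned above is time-based invalidation: cached entries expire after a configurable interval and are then re-fetched from the original data source. The Java sketch below illustrates only that policy (the TTL semantics are assumed for illustration); it deliberately ignores write-back, access control, and encryption, which the questions above also raise.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a cache with time-based invalidation: each entry is treated as
// consistent only within a fixed coherence window (ttlMillis) and is re-fetched
// from the original data source once it expires. Requires Java 16+ for records.
public class TtlCache<K, V> {

    private record Entry<T>(T value, long loadedAt) {}

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final Function<K, V> source;   // fetches from the original data source

    public TtlCache(long ttlMillis, Function<K, V> source) {
        this.ttlMillis = ttlMillis;
        this.source = source;
    }

    public V get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> e = cache.get(key);
        if (e == null || now - e.loadedAt() > ttlMillis) {   // coherence window has passed
            V fresh = source.apply(key);
            cache.put(key, new Entry<>(fresh, now));
            return fresh;
        }
        return e.value();
    }

    public void invalidate(K key) {        // hook for an explicit invalidation protocol
        cache.remove(key);
    }
}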
Schema Transformation. Schema transformation interfaces support the transformation of data
from one schema to another. For example, XML transformations as specified in XSLT.
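
For instance, an XSLT-based schema transformation can be carried out with the standard javax.xml.transform API; the stylesheet and file names below are placeholders, not part of any grid toolkit.

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.File;

// Minimal schema-transformation example: apply an XSLT stylesheet to an XML
// document so that the output conforms to the target schema.
public class SchemaTransform {
    public static void main(String[] args) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new File("source-to-target.xslt")));
        transformer.transform(new StreamSource(new File("input.xml")),
                              new StreamResult(new File("output.xml")));
    }
}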
