Unit-1 Grid Computing
OBJECTIVES:
The student should be made to:
Understand how Grid computing helps in solving large-scale scientific problems.
Gain knowledge of the concept of virtualization that is fundamental to cloud computing.
Understand the security issues in the grid and the cloud environment.
UNIT I INTRODUCTION (9)
Requirements - Practical & Detailed view of OGSA/OGSI - Data intensive grid service models -
Virtualization of CPU, Memory and I/O devices - Virtual clusters and Resource Management -
Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration -
Hadoop Framework - MapReduce, input splitting, map and reduce functions, specifying input and
output parameters, configuring and running a job - Design of Hadoop file system, HDFS
concepts, command line and Java interface, dataflow of File read & File write.
UNIT V SECURITY (9)
Trust models for Grid security environment - Authentication and Authorization methods -
Grid security infrastructure - Cloud Infrastructure security: network, host and application
level - aspects of data security, provider data and its security, Identity and access
management architecture, IAM practices in the cloud, SaaS, PaaS, IaaS availability in the cloud.
OUTCOMES:
Apply the security models in the grid and the cloud environment.
TEXT BOOK:
Distributed computing is a field of computer science that studies distributed systems.
A distributed system is a system whose components are located on different networked
computers, which communicate and coordinate their actions by passing messages.
The components interact with one another in order to achieve a common goal.
Evolution of Distributed computing
Over the past 60 years, computing technology has undergone a series of platform and
environment changes.
In this section, we assess evolutionary changes in machine architecture, operating system
platform, network connectivity, and application workload.
Instead of using a centralized computer to solve computational problems, a parallel and
distributed computing system uses multiple computers to solve large-scale problems over
the Internet.
Thus, distributed computing becomes data-intensive and network-centric. This section
identifies the applications of modern computer systems that practice parallel and
distributed computing.
These large-scale Internet applications have significantly enhanced the quality of life and
information services in society today.
The Age of Internet Computing
Billions of people use the Internet every day. As a result, supercomputer sites and large
data centers must provide high-performance computing services to huge numbers of
Internet users concurrently.
Because of this high demand, the Linpack Benchmark for high-performance computing
(HPC) applications is no longer optimal for measuring system performance.
The emergence of computing clouds instead demands high-throughput computing
(HTC) systems built with parallel and distributed computing technologies.
We have to upgrade data centers using fast servers, storage systems, and high-
bandwidth networks.
The purpose is to advance network-based computing and web services with the
emerging new technologies.
SCALABLE COMPUTING OVER THE INTERNET
The Platform Evolution
Computer technology has gone through five generations of development, with each generation
lasting from 10 to 20 years.
Successive generations overlapped for about 10 years. For instance, from 1950 to 1970, a handful of
mainframes, including the IBM 360 and CDC 6400, were built to satisfy the demands of large
businesses and government organizations.
From 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX Series became
popular among small businesses and on college campuses.
From 1970 to 1990, we saw widespread use of personal computers built with VLSI microprocessors.
From 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both
wired and wireless applications.
Since 1990, the use of both HPC and HTC systems hidden in clusters, grids, or Internet clouds has
proliferated.
These systems are employed by both consumers and high-end web-scale computing and information
services.
The general computing trend is to leverage shared web resources and massive amounts
of data over the Internet.
On the HPC side, supercomputers (massively parallel processors or MPPs) are
gradually being replaced by clusters of cooperative computers, out of a desire to share
computing resources.
The cluster is often a collection of homogeneous compute nodes that are physically
connected in close range to one another.
On the HTC side, peer-to-peer (P2P) networks are formed for distributed file sharing
and content delivery applications.
A P2P system is built over many client machines.
Peer machines are globally distributed in nature.
P2P, cloud computing, and web service platforms are more focused on HTC
applications than on HPC applications.
Clustering and P2P technologies lead to the development of computational grids or data
grids.
Fig. 1: Evolutionary trend toward parallel, distributed, and cloud computing with
clusters, MPPs, P2P networks, grids, clouds, web services, and the Internet of Things.
When the Internet was introduced in 1969, Leonard Kleinrock of UCLA declared: "As of now, computer
networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the
spread of computer utilities, which like present electric and telephone utilities, will service individual
homes and offices across the country."
Many people have redefined the term "computer" since that time.
In 1984, John Gage of Sun Microsystems created the slogan, "The network is the computer."
In 2008, David Patterson of UC Berkeley said, "The data center is the computer. There are dramatic
differences between developing software for millions to use as a service versus distributing software to run
on their PCs."
Recently, Rajkumar Buyya of Melbourne University simply said: "The cloud is the computer." This
book covers clusters, MPPs, P2P networks, grids, clouds, web services, social networks, and the IoT.
In fact, the differences among clusters, grids, P2P systems, and clouds may blur in the future.
Some people view clouds as grids or clusters with modest changes through virtualization.
Others feel the changes could be major, since clouds are anticipated to process huge data sets generated by
the traditional Internet, social networks, and the future IoT.
The distinctions and dependencies among all distributed and cloud systems models will become clearer
and more transparent.
Computing Paradigm Distinctions
The high-technology community has argued for many years about the precise definitions of
centralized computing, parallel computing, distributed computing, and cloud computing.
The field of parallel computing overlaps with distributed computing to a great extent, and cloud
computing overlaps with distributed, centralized, and parallel computing.
PARALLEL COMPUTING
In parallel computing, all processors are either tightly coupled with centralized
shared memory or loosely coupled with distributed memory.
Some authors refer to this discipline as parallel processing.
Interprocessor communication is accomplished through shared memory or via
message passing.
A computer system capable of parallel computing is commonly known as a parallel
computer. Programs running in a parallel computer are called parallel programs.
The process of writing parallel programs is often referred to as parallel programming.
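To make the shared-memory case concrete, the short Java sketch below is an illustration added to these notes (the class name ParallelSum and the problem size are arbitrary choices): several threads sum disjoint slices of a single array held in memory that all threads share.

    import java.util.*;
    import java.util.concurrent.*;
    import java.util.stream.LongStream;

    // Minimal shared-memory parallel program: worker threads sum disjoint
    // slices of one array that lives in memory shared by all threads.
    public class ParallelSum {
        public static void main(String[] args) throws Exception {
            long[] data = LongStream.rangeClosed(1, 10_000_000).toArray();
            int nThreads = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(nThreads);

            int chunk = (data.length + nThreads - 1) / nThreads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < nThreads; t++) {
                final int lo = t * chunk;
                final int hi = Math.min(lo + chunk, data.length);
                // Each task reads only its own slice, so no locking is needed.
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = lo; i < hi; i++) s += data[i];
                    return s;
                }));
            }

            long total = 0;
            for (Future<Long> p : parts) total += p.get();  // combine partial sums
            pool.shutdown();
            System.out.println("sum = " + total);           // expect 50000005000000
        }
    }

Because the slices do not overlap, the threads never contend for the same data; the partial results are combined through the futures returned by the thread pool.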
DISTRIBUTED COMPUTING
A distributed system consists of multiple autonomous computers, each having its own
private memory, communicating with one another through a computer network.
Information exchange in a distributed system is accomplished through message passing.
A computer program that runs in a distributed system is known as a distributed program.
The process of writing distributed programs is referred to as distributed programming.
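As a small illustration added to these notes (the class name MessagePassingDemo and port 5000 are arbitrary choices for the sketch), the Java program below shows two autonomous processes that share no memory and cooperate only by passing messages over a TCP socket.

    import java.io.*;
    import java.net.*;

    // Toy distributed program: two independent processes exchange
    // information only through messages sent over the network.
    public class MessagePassingDemo {
        public static void main(String[] args) throws Exception {
            if (args.length > 0 && args[0].equals("server")) {
                try (ServerSocket server = new ServerSocket(5000);
                     Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream()));
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true)) {
                    String request = in.readLine();        // receive a message
                    out.println("echo: " + request);       // reply with a message
                }
            } else {
                try (Socket peer = new Socket("localhost", 5000);
                     PrintWriter out = new PrintWriter(peer.getOutputStream(), true);
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream()))) {
                    out.println("hello from the client node");
                    System.out.println(in.readLine());     // print the server's reply
                }
            }
        }
    }

Start one instance with the argument "server", then run a second instance without arguments on the same machine; the client sends a message and prints the server's echoed reply.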
What is Cloud?
The term Cloud refers to a Network or the Internet. In other words, we can say that Cloud
is something which is present at a remote location.
Cloud can provide services over public and private networks, i.e., WAN, LAN or VPN.
Cloud Computing refers to manipulating, configuring, and accessing the hardware and
software resources remotely.
organization.
In fact, designers and programmers want to predict the technological capabilities of future
systems.
For instance, Jim Gray's paper, "Rules of Thumb in Data Engineering," is an excellent
example of how technology affects applications and vice versa.
In addition, Moore's law indicates that processor speed doubles every 18 months. Although
Moore's law has been proven valid over the last 30 years, it is difficult to say whether it will
continue to be true in the future.
Gilder's law indicates that network bandwidth has doubled each year in the past. Will that
trend continue in the future?
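To make these two rules of thumb concrete, the short Java sketch below (an added illustration using a hypothetical 10-year horizon) computes the growth factors they imply: roughly 2^(t/1.5) for processor speed under Moore's law and 2^t for bandwidth under Gilder's law.

    // Back-of-the-envelope growth implied by the two rules of thumb above.
    public class GrowthEstimate {
        public static void main(String[] args) {
            double years = 10.0;                          // hypothetical horizon
            double cpuGrowth = Math.pow(2, years / 1.5);  // Moore: doubles every 18 months
            double netGrowth = Math.pow(2, years);        // Gilder: doubles every year
            System.out.printf("Processor speed grows ~%.0fx in %.0f years%n", cpuGrowth, years);
            System.out.printf("Network bandwidth grows ~%.0fx in %.0f years%n", netGrowth, years);
        }
    }

Over 10 years this works out to roughly a 100-fold processor speedup versus a roughly 1,000-fold bandwidth increase.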
The tremendous price/performance ratio of commodity hardware, driven by the desktop
market, has also driven the adoption and use of commodity technologies in large-scale
computing.
INNOVATIVE APPLICATIONS
Both HPC and HTC systems desire transparency in many application aspects.