CCS Module 1 Notes

The document discusses the evolution and significance of cloud computing and security, focusing on distributed systems, high-performance computing (HPC), and high-throughput computing (HTC). It outlines the transition from centralized computing to parallel and distributed systems, emphasizing the need for scalable computing solutions to meet increasing demands. Additionally, it introduces various computing paradigms, including the Internet of Things (IoT), and highlights the importance of efficiency, reliability, and adaptability in future computing systems.

CLOUD COMPUTING & SECURITY (BIS613D)

Module 1- Distributed System Models and Enabling Technologies


Module 1 Syllabus: Scalable Computing Over the Internet; Technologies for Network-Based Systems; System Models for Distributed and Cloud Computing; Software Environments for Distributed Systems and Clouds; Performance, Security, and Energy Efficiency.

Handouts for Session 1: SCALABLE COMPUTING OVER THE INTERNET

Computing technology has undergone a series of platform and environment changes over the years, including changes in machine architecture, operating system platform, network connectivity, and application workload.
Instead of using a centralized computer to solve computational problems, a parallel and distributed computing system uses multiple computers to solve large-scale problems over the Internet. Distributed computing has therefore become data-intensive and network-centric.
These large-scale Internet applications have enhanced the quality of life and information services in
society today.
The Internet has become a critical tool for billions of users, necessitating the development of high-throughput computing (HTC) systems.
Because of this high demand, the Linpack Benchmark for HPC applications is no longer an optimal measure of system performance; the emergence of computing clouds instead demands HTC systems built with parallel and distributed computing technologies.
Data centers need to be upgraded with fast servers, storage systems, and high-bandwidth networks to advance network-based computing and web services.

Computer technology has evolved through five generations, each lasting 10 to 20 years:
- 1950 to 1970: mainframes such as the IBM 360 and CDC 6400, developed for large businesses and government organizations.
- 1960 to 1980: minicomputers such as the DEC PDP-11 and VAX series.
- 1970 to 1990: personal computers built with VLSI microprocessors.
- 1980 to 2000: portable computers and devices.


- Since 1990: HPC and HTC systems, hidden in clusters, grids, or Internet clouds, have grown in use, serving both consumers and high-end web-scale computing and information services. The trend is to leverage shared web resources and massive amounts of data over the Internet.
Figure 1.1 shows the evolution of HPC and HTC systems.

Figure 1.1: Evolution of HPC and HTC Systems


In HPC systems, supercomputers (massively parallel processors, or MPPs) are gradually being replaced by clusters of cooperative computers, while HTC systems form peer-to-peer networks for distributed file sharing and content delivery applications.

The cluster is a collection of homogeneous compute nodes that are physically connected in close range
to one another.
On the HTC side, peer-to-peer (P2P) networks are formed for distributed file sharing and content
delivery applications.
A P2P system is built over many client machines.
Peer machines are globally distributed in nature.
P2P, cloud computing, and web service platforms are more focused on HTC applications than on HPC
applications.
Clustering and P2P technologies lead to the development of computational grids or data grids.


High Performance Computing


HPC systems are valued chiefly for their raw speed. The speed of HPC systems has increased from Gflops in the early 1990s to Pflops in 2010, driven by demands from the scientific, engineering, and manufacturing communities.

For example,
the Top 500 most powerful computer systems in the world are ranked by their floating-point speed in Linpack benchmark results. However, supercomputer users represent less than 10% of all computer users. Today, the majority of users rely on desktop computers or large servers for Internet searches and market-driven computing tasks.

High-Throughput Computing
The development of market-oriented high-end computing systems is undergoing a strategic change
from an HPC paradigm to an HTC paradigm.
This HTC paradigm pays more attention to high-flux computing.
The main applications of high-flux computing are Internet searches and web services accessed by millions or more users simultaneously. The performance goal thus shifts to high throughput, measured as the number of tasks completed per unit of time. HTC technology must not only improve batch processing speed, but also address the acute problems of cost, energy savings, security, and reliability at many data and enterprise computing centers. Both HPC and HTC systems must therefore be designed to meet the demands of all computer users.
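The shift from raw speed to throughput can be made concrete with a small measurement. The following Python sketch (the job body, task count, and pool size are illustrative assumptions, not taken from these notes) runs a batch of independent jobs and reports tasks completed per second, the HTC performance metric described above.

import time
from concurrent.futures import ThreadPoolExecutor

def small_job(n: int) -> int:
    # Stand-in for one independent request, e.g., a single search query.
    return sum(i * i for i in range(n))

def measure_throughput(num_tasks: int = 200, workers: int = 8) -> float:
    # Throughput = number of tasks completed per unit of time.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(small_job, [20_000] * num_tasks))
    elapsed = time.perf_counter() - start
    return num_tasks / elapsed

if __name__ == "__main__":
    print(f"Throughput: {measure_throughput():.1f} tasks/second")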

Three New Computing Paradigms


As Figure 1.1 illustrates, Web 2.0 services became available with the introduction of service-oriented architecture (SOA).
Advances in virtualization make it possible to see the growth of Internet clouds as a new computing
paradigm.
The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor
technologies has triggered the development of the Internet of Things (IoT).
When the Internet was introduced in 1969, Leonard Kleinrock of UCLA declared: “As of now,
computer networks are still in their infancy, but as they grow up and become sophisticated, we will
probably see the spread of computer utilities, which like present electric and telephone utilities,
will service individual homes and offices across the country.” Many people have redefined the term


“computer” since that time. In 1984, John Gage of Sun Microsystems created the slogan, “The
network is the computer.” In 2008, David Patterson of UC Berkeley said, “The data center is the
computer.
There are dramatic differences between developing software for millions to use as a service
versus distributing software to run on their PCs.” Recently, Rajkumar Buyya of Melbourne University
simply said: “The cloud is the computer.”
Some people view clouds as grids or clusters with modest changes through virtualization.
Others feel the changes could be major, since clouds are anticipated to process huge data sets
generated by the traditional Internet, social networks, and the future IoT.

Computing Paradigm Distinctions


The high-technology community has argued for many years about the precise definitions of
centralized computing, parallel computing, distributed computing, and cloud computing.
In general, distributed computing is the opposite of centralized computing. The field of parallel
computing overlaps with distributed computing to a great extent, and cloud computing overlaps with
distributed, centralized, and parallel computing.
• Centralized computing: This is a computing paradigm by which all computer resources are
centralized in one physical system. All resources (processors, memory, and storage) are fully
shared and tightly coupled within one integrated OS. Many data centers and supercomputers are
centralized systems, but they are used in parallel, distributed, and cloud computing applications.

• Parallel computing (or parallel processing): In parallel computing, all processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory. Interprocessor communication is accomplished through shared memory or via message passing. A
computer system capable of parallel computing is commonly known as a parallel computer. Programs
running in a parallel computer are called parallel programs. The process of writing parallel programs is
often referred to as parallel programming.

• Distributed computing: This is a field of computer science/engineering that studies distributed systems. A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network. Information exchange in a distributed system is accomplished through message passing (a minimal message-passing sketch follows this list). A computer program that runs in a distributed system is known as a distributed program. The process of writing distributed programs is referred to as distributed programming.

• Cloud computing: An Internet cloud of resources can be either a centralized or a distributed computing system. The cloud applies parallel or distributed computing, or both. Clouds can be built with physical or virtualized resources over large data centers that are centralized or distributed.

• Concurrent computing or concurrent programming refers to the union of parallel computing and distributed computing, although biased practitioners may interpret them differently.

• Ubiquitous computing refers to computing with pervasive devices at any place and time using wired
or wireless communication.

• Internet of Things (IoT) is a networked connection of everyday objects including computers, sensors,
humans, etc. The IoT is supported by Internet clouds to achieve ubiquitous computing with any object
at any place and time.

• Internet computing is even broader and covers all computing paradigms over the Internet.
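The message-passing distinction above can be illustrated with a minimal sketch using Python's standard multiprocessing module: two autonomous processes, each with its own private memory, exchange information only through an explicit channel. The worker logic is an arbitrary example.

from multiprocessing import Process, Pipe

def worker(conn):
    # Runs in a separate process with its own private memory.
    numbers = conn.recv()       # receive a message from the other process
    conn.send(sum(numbers))     # reply with a result; no memory is shared
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send([1, 2, 3, 4])            # information exchange via message passing
    print("Result from worker:", parent_end.recv())
    p.join()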

Session 1 questions:
1. What is the difference between HPC and HTC?
2. Name computing paradigms.
3. What is cloud computing?
4. What is ubiquitous computing?
5. What is IoT?


Handouts for Session 2: SCALABLE COMPUTING OVER THE INTERNET (Contd.)

Distributed System Families


1. Grids:
Since the mid-1990s, technologies for building P2P networks and networks of
clusters have been consolidated to establish wide area computing infrastructures, known as
computational grids or data grids.
Grids emphasize resource sharing in hardware and software.

2. Clouds:
Internet clouds are the result of moving desktop computing to service-oriented computing using server
clusters and huge databases at data centers.
3. P2P Networks: Involve millions of client machines working together.

4. HPC and HTC systems emphasize parallelism and distributed computing.

Future HPC and HTC systems must be able to satisfy the huge demand for computing power in terms of throughput, efficiency, scalability, and reliability. System efficiency is determined by speed, programming, and energy factors (i.e., throughput per watt of energy consumed).

Meeting these goals requires satisfying the following design objectives:


• Efficiency measures the utilization rate of resources in an execution model by exploiting massive
parallelism in HPC. For HTC, efficiency is more closely related to job throughput, data access, storage,
and power efficiency.
• Dependability measures the reliability and self-management from the chip to the system and
application levels. The purpose is to provide high-throughput service with Quality of Service
(QoS) assurance, even under failure conditions.
• Adaptation in the programming model measures the ability to support billions of job requests
over massive data sets and virtualized cloud resources under various workload and service
models.
• Flexibility in application deployment measures the ability of distributed systems to run well in
both HPC (science and engineering) and HTC (business) applications.


Scalable Computing Trends and New Paradigms


- Predictable technology trends significantly influence computing applications.
- Designers and programmers aim to forecast the technological capabilities of future systems.
- Jim Gray’s paper, “Rules of Thumb in Data Engineering,” illustrates the reciprocal relationship
between technology and applications.
- Moore’s law states that processor speed doubles approximately every 18 months, although its future
validity remains uncertain.
- Gilder’s law observes that network bandwidth has historically doubled each year, raising questions
about its continuation.
- The surge in price/performance ratio of commodity hardware has been attributed to the rise of desktop,
notebook, and tablet markets.
- This market growth has propelled the incorporation of commodity technologies in large-scale
computing.
- The future of computing trends will be examined in later sections, focusing on distributed systems
emphasizing resource distribution and high degrees of parallelism (DoP).
- An overview of degrees of parallelism will precede the discussion on distributed computing
requirements.

Degrees of Parallelism
- Fifty years ago, computers were primarily designed using bit-serial processing due to the high cost and
bulkiness of hardware.
- Progression from 4-bit to 64-bit microprocessors led to the development of instruction-level parallelism
(ILP).
- ILP allows simultaneous execution of multiple instructions, utilizing techniques like pipelining,
superscalar computing, VLIW architectures, and multithreading.
- Effective ILP implementation requires branch prediction, dynamic scheduling, speculation, and compiler support.
- Data-level parallelism (DLP) emerged through SIMD and vector machines, necessitating significant
hardware and compiler support.
- The introduction of multicore processors has shifted focus to task-level parallelism (TLP), which
remains challenging due to programming complexities.
- Modern processors integrate bit-level parallelism (BLP), ILP, and DLP, but face hurdles with TLP on multicore chip multiprocessors (CMPs).


- Distributed processing raises the granularity to job-level parallelism (JLP); this coarse-grain parallelism is built on top of fine-grain parallel tasks (a short sketch contrasting data-level and task-level parallelism appears below).
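As a small, hedged illustration of the last two levels, the sketch below first applies one operation across a whole array (data-level parallelism, assuming NumPy is available for vectorized, SIMD-style execution) and then runs independent coarse-grain tasks in separate processes (task-level parallelism). Array sizes and the worker count are arbitrary.

import numpy as np
from concurrent.futures import ProcessPoolExecutor

def independent_task(seed: int) -> float:
    # One coarse-grain task: a self-contained computation with its own data.
    rng = np.random.default_rng(seed)
    return float(rng.random(100_000).sum())

if __name__ == "__main__":
    # Data-level parallelism: one operation applied to many data elements at once.
    a = np.arange(1_000_000, dtype=np.float64)
    b = np.arange(1_000_000, dtype=np.float64)
    c = a * b + 1.0                      # vectorized elementwise computation

    # Task-level parallelism: independent tasks scheduled onto separate cores.
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(independent_task, range(4)))

    print(c[:3], results)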

Innovative Applications
- HPC and HTC systems prioritize transparency in areas such as data access, resource allocation, process
location, concurrency, job replication, and failure recovery for users and system management.
- Key applications driving parallel and distributed systems development span various domains, including
science, engineering, business, education, health care, traffic control, Internet and web services, military,
and government.
Table 1.1 highlights the applications of High-Performance and High-Throughput Systems

Table 1.1 Applications of High-Performance and High-Throughput Systems


- Science and engineering: scientific simulations, genomic analysis; earthquake prediction, global warming, weather forecasting, etc.
- Business, education, services industry, and health care: telecommunication, content delivery, e-commerce; banking, stock exchanges, transaction processing; air traffic control, electric power grids, distance education; health care, hospital automation, telemedicine, etc.
- Internet and web services, and government applications: Internet search, data centers, decision-making systems; traffic monitoring, worm containment, cyber security; digital government, online tax return processing, social networking, etc.
- Mission-critical applications: military command and control, intelligent systems, crisis management, etc.

- Applications require computing economics, web-scale data collection, system reliability, and scalable
performance.
- In the banking and finance industry, distributed transaction processing is critical, with transactions
accounting for 90% of the market for reliable banking systems.
- Maintaining consistency of replicated transaction records is essential for real-time banking services.
- Challenges in these applications include lack of software support, network saturation, and security threats.


The Trend toward Utility Computing


Figure 1.2 identifies major computing paradigms to facilitate the study of distributed systems and
their applications.

Figure 1.2: The vision of computer utilities in modern distributed computing systems

These paradigms exhibit common traits:


- Ubiquity in daily life
- Focus on reliability and scalability
- Support for autonomic operations and dynamic discovery
- Composability with Quality of Service (QoS) and Service-Level Agreements (SLAs)
These characteristics contribute to the vision of computer utility.
Utility computing presents a model where customers obtain computing resources from paid service
providers, with all grid/cloud platforms functioning as utility service providers.
Cloud computing expands upon utility computing, enabling distributed applications on accessible
servers in edge networks.
Major technological challenges encompass various domains in computer science and engineering,
necessitating:
- New network-efficient processors
- Scalable memory and storage solutions
- Distributed operating systems


- Middleware for machine virtualization


- Innovative programming models
- Effective resource management
- Development of applications
These hardware and software components are essential for constructing distributed systems that
leverage massive parallelism across processing levels.

The Hype Cycle of New Technologies


- New computing and information technologies undergo a hype cycle with five stages: trigger, peak of
inflated expectations, disillusionment, enlightenment, and plateau of productivity.

Figure 1.3: Hype cycle for Emerging Technologies, 2010.


- Each stage has associated timelines for mainstream adoption indicated by symbols:
- Hollow circles signify adoption within 2 years.
- Gray circles indicate 2-5 years.
- Solid circles represent 5-10 years.
- Triangles are for over 10 years.
- Crossed circles represent technologies that will become obsolete before reaching the plateau.
- As of August 2010, consumer-generated media was in the disillusionment stage, expected to hit the plateau
in less than 2 years.


- Internet micropayment systems were anticipated to reach maturity in 2-5 years, while 3D printing was
projected to take 5-10 years.
- Mesh network sensors were expected to take more than 10 years to achieve mainstream adoption.
- Cloud technology had just crossed the peak stage, with an estimated 2-5 years to reach productivity.
- Broadband over power line technology was forecasted to become obsolete before leaving the
disillusionment stage.
- Several technologies, indicated by dark circles, were at the peak expectation stage and anticipated to take
5-10 years for success, including cloud computing, biometric authentication, interactive TV, speech
recognition, predictive analytics, and media tablets.

The Internet of Things and Cyber-Physical Systems


Internet of Things
- The traditional Internet connects machines and web pages; the Internet of Things (IoT) concept was introduced at MIT in 1999.
- IoT refers to the networked interconnection of everyday objects and devices through a wireless sensor
network.
- Objects, large or small, are tagged using RFID or similar technologies such as GPS, supported by the
vast address space of IPv6.
- Predictions suggest each person may interact with 1,000 to 5,000 objects, with the IoT designed to
track up to 100 trillion objects simultaneously.
- Universal addressability is essential, and filtering can help reduce identification complexity.
- The IoT encompasses human-to-human (H2H), human-to-thing (H2T), and thing-to-thing (T2T)
communication, connecting devices intelligently in various environments.
- The IoT is in its early stages, with prototypes being tested in limited areas.
- Cloud computing is expected to enhance interaction speed and efficiency among people and
devices.
- The vision includes smart cities and improved living standards globally, though realizing this dream
will take time.


Cyber-Physical Systems
- Cyber-physical system (CPS) results from the interaction between computational processes and the physical
world.
- It integrates "cyber" (heterogeneous and asynchronous) with "physical" (concurrent and information-dense)
objects.
- CPS combines the "3C" technologies: computation, communication, and control into an intelligent feedback
system linking the physical and information worlds.
- While the Internet of Things (IoT) focuses on networking among physical objects, CPS emphasizes virtual
reality (VR) applications in the physical realm.
- CPS has the potential to transform interactions with the physical world, similar to the Internet's impact on
the virtual world.

Session 2 questions:
1. What is degree of parallelism?
2. List few computing paradigms.
3. List few domains where HPC is used?
4. What is the advantage of cloud computing in IoT?
5. What is CPS?


Handouts for Session 3: TECHNOLOGIES FOR NETWORK-BASED SYSTEMS


-Explores hardware, software, and network technologies needed for designing distributed computing
systems and focuses on strategies for developing distributed operating systems that handle parallelism.

Multicore CPUs and Multithreading Technologies


The growth of component and network technologies over the past 30 years has led to the development of HPC and HTC systems.
In Figure 1.4, processor speed is measured in millions of instructions per second (MIPS) and network
bandwidth is measured in megabits per second (Mbps) or gigabits per second (Gbps).
The unit GE refers to 1 Gbps Ethernet bandwidth.

Advances in CPU Processors


Advanced CPUs and microprocessors use multicore designs with two, four, six, or more cores to
improve performance.
They use parallel processing at different levels.
Processor speeds have increased significantly, from 1 MIPS in 1978 to 22,000 MIPS in 2008 (following Moore's law).


However, clock speeds, which grew from 10 MHz to 4 GHz, have reached a limit due to heat and
power issues, with most CPUs staying below 5 GHz.
Modern CPUs use techniques such as superscalar architecture and speculative execution to boost
performance, while GPUs use many simple cores for parallel processing.

Figure 1.5: Schematic of a modern multicore CPU chip using a hierarchy of caches, where the L1 cache is private to each core, the on-chip L2 cache is shared, and the L3 cache or DRAM is off the chip.

Modern multicore CPUs and many-core GPUs can run multiple instruction threads at different levels. Figure 1.5 shows the architecture of a typical multicore processor. A multicore processor has multiple cores, each with its own private L1 cache, while all cores share a larger L2 cache. Future chips may have even more cores and a shared L3 cache. High-performance processors such as the Intel i7, Xeon, and AMD Opteron use this design. The Sun Niagara II has eight cores, each handling eight threads, allowing 64 threads in total. In 2011, the Intel Core i7 990x reached 159,000 MIPS, showing significant performance growth (as shown in Figure 1.4).

Multicore CPU and Many-Core GPU Architectures


In the future, multicore CPUs may grow from tens to hundreds of cores, but they face limits in
handling massive data processing due to memory constraints. This led to the rise of many-core GPUs
with hundreds of smaller cores.


Modern CPUs are built on the IA-32 and IA-64 instruction set architectures, and x86 processors are now used in both high-performance computing (HPC) and high-throughput computing (HTC) systems.
Many RISC processors have been replaced by x86 CPUs and GPUs in the top supercomputers, showing a trend toward x86 dominance.
Future processors may combine powerful CPU cores with energy-efficient GPU cores on the same
chip for better performance.

Multithreading Technology
Figure 1.6 shows the dispatch of five independent threads of instructions to four pipelined data paths (functional units) in each of the following five processor categories, from left to right: a four-issue superscalar processor, a fine-grain multithreaded processor, a coarse-grain multithreaded processor, a two-core CMP, and a simultaneous multithreaded (SMT) processor.

FIGURE 1.6: Five microarchitectures in modern CPU processors that exploit ILP and TLP, supported by multicore and multithreading technologies.

Superscalar processors execute instructions from a single thread at a time.

Fine-grain multithreading switches between threads every cycle.

Coarse-grain multithreading runs multiple instructions from one thread before switching.

Multicore processors (CMPs) run different threads on separate cores.

SMT processors execute instructions from multiple threads at the same time.

Blank spaces in the figure show unused execution slots, indicating inefficiencies. No processor fully
achieves maximum instruction or thread-level parallelism in every cycle, but each method improves
scheduling efficiency in different ways.

GPU Computing to Exascale and Beyond

GPU is a graphics processor found in a computer’s graphics or video card.

GPU helps the CPU by handling graphics tasks, such as video editing.
The first GPU, GeForce 256, was released by NVIDIA in 1999 and could process at least 10 million
polygons per second. Today, GPUs are in almost every computer, and some of their features have
been added to CPUs.

Unlike CPUs, which have only a few cores (e.g., the Xeon X5670 has six cores), modern GPUs have
hundreds of cores. GPUs use a throughput architecture, running many tasks at once but at a slower
speed, while CPUs focus on running single tasks very quickly. Recently, GPU clusters have gained
popularity over CPUs for parallel computing. The use of GPUs for general computing, called
GPGPU, is widely used in high-performance computing (HPC), with NVIDIA’s CUDA leading this
trend.

How GPUs Work

Early GPUs worked as coprocessors alongside the CPU, but modern NVIDIA GPUs have 128 cores
on a single chip, with each core handling 8 threads. This allows a single GPU to run up to 1,024
threads at the same time, achieving massive parallelism compared to the limited threads of a CPU.

While CPUs are designed for fast access to cached data, GPUs focus on high throughput by managing
memory efficiently.


Today, GPUs are not just for graphics and video processing—they also power supercomputers and
HPC systems by handling complex floating-point calculations in parallel. This reduces the workload
on CPUs for data-heavy tasks.

GPUs are now found in mobile phones, game consoles, embedded systems, PCs, and servers. High-
performance models such as NVIDIA CUDA Tesla and Fermi are used in GPU clusters for large-scale
parallel computing.

GPU Programming Model


Figure 1.7 shows the interaction between a CPU and GPU in performing parallel execution of floating-
point operations concurrently.
CPU is the conventional multicore processor with limited parallelism to exploit.
GPU has a many-core architecture that has hundreds of simple processing cores organized as
multiprocessors.
Each core can have one or more threads.
The CPU’s floating-point kernel computation role is largely offloaded to the many-core GPU.
The CPU instructs the GPU to perform massive data processing. The bandwidth must be matched
between the on-board main memory and the on-chip GPU memory. This process is carried out in
NVIDIA’s CUDA programming using the GeForce 8800 or Tesla and Fermi GPUs.

FIGURE 1.7: The use of a GPU along with a CPU for massively parallel execution in hundreds
or thousands of processing cores.
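The remark that bandwidth must be matched between the host memory and the on-chip GPU memory can be captured with a back-of-the-envelope offload model: sending a kernel to the GPU pays off only when the compute-time saving exceeds the host-to-device transfer time. The link bandwidth and flop rates below are illustrative assumptions, not figures from these notes.

def offload_worthwhile(bytes_moved: float, flops: float,
                       link_gbps: float = 8.0,       # assumed host<->GPU bandwidth, GB/s
                       cpu_gflops: float = 50.0,     # assumed sustained CPU rate
                       gpu_gflops: float = 500.0) -> bool:
    # Return True if GPU time plus transfer time beats CPU-only time.
    transfer_s = bytes_moved / (link_gbps * 1e9)
    cpu_s = flops / (cpu_gflops * 1e9)
    gpu_s = flops / (gpu_gflops * 1e9)
    return gpu_s + transfer_s < cpu_s

if __name__ == "__main__":
    # A compute-heavy kernel: 100 MB of data, 100 Gflop of work.
    print(offload_worthwhile(bytes_moved=100e6, flops=100e9))   # True: compute dominates
    # A data-heavy, light-compute kernel: 1 GB of data, 1 Gflop of work.
    print(offload_worthwhile(bytes_moved=1e9, flops=1e9))       # False: transfer dominates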


Example 1.1 The NVIDIA Fermi GPU Chip with 512 CUDA Cores
In November 2010, three of the world's five fastest supercomputers—Tianhe-1a, Nebulae, and
Tsubame—used a large number of GPU chips to speed up floating-point computations. One of the
GPUs, called the Fermi GPU, developed by NVIDIA, became important in this process. The Fermi GPU chip contains multiple streaming multiprocessors (SMs), giving the chip a total of up to 512 streaming processors (also known as CUDA cores). For example, the Tesla GPUs used in the Tianhe-1a had 448 CUDA cores.

The Fermi GPU is part of NVIDIA’s newer generation of GPUs that appeared in 2011. These GPUs
can be used in desktop workstations to accelerate complex calculations or in large-scale data centers.
Each CUDA core in the Fermi GPU contains an integer ALU (Arithmetic Logic Unit) and a Floating
Point Unit (FPU), both of which can operate at the same time for better performance.

Each Streaming Multiprocessor (SM) has 16 load/store units that help calculate source and destination
addresses for 16 threads each clock cycle. There are also special function units (SFUs) to handle
complex mathematical functions.

In terms of memory:
Each SM has a 64 KB L1 cache.
All SMs share a 768 KB unified L2 cache, which helps manage load, store, and texture operations.
The GPU also has 6 GB of off-chip DRAM for larger memory needs.
The SM schedules threads in groups called warps, with each warp having 32 threads. When fully used,
each SM can do up to 515 Gflops (billion floating-point operations per second) of double-precision
results. With 16 SMs, a single GPU can achieve a peak speed of 82.4 Tflops (teraflops).

In the future, GPUs with thousands of cores might help supercomputers reach Exascale computing—
Exaflops systems that can process 10^18 floating-point operations per second. However, achieving
this level of performance will face challenges like:
Energy and power consumption,
Memory and storage management,
Concurrency and locality, and
System resilience (ensuring systems can recover from failures).


This trend of building supercomputers with both CPUs and GPUs reflects the growing importance of
hybrid architectures to tackle complex computing tasks efficiently.

FIGURE 1.8: NVIDIA Fermi GPU built with 16 streaming multiprocessors (SMs) of 32 CUDA cores each; only one SM is shown in the figure.

Power Efficiency of the GPU


Power efficiency and massive parallelism are the main advantages of GPUs over CPUs, as noted by Bill Dally of Stanford University.
To run an exaflops system (one that can perform 10^18 floating-point operations per second), it is
estimated that 60 Gflops/watt per core will be needed. This means power limits what can be included
in both CPU and GPU chips.


A CPU chip uses 2 nJ per instruction, while a GPU chip uses 200 pJ per instruction, making the GPU
10 times more power-efficient than the CPU.
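A short calculation with the efficiency figures quoted in this section (0.8 and 5 Gflops/W per core measured around 2011, and the 60 Gflops/W Exascale target) shows why power dominates the discussion. Treating the per-core figures as if they held for a whole system is a simplification made only for illustration.

EXAFLOPS = 1e18  # 10**18 floating-point operations per second

def system_power_megawatts(gflops_per_watt: float) -> float:
    # Power needed to sustain an exaflops workload at the given efficiency.
    watts = EXAFLOPS / (gflops_per_watt * 1e9)
    return watts / 1e6

if __name__ == "__main__":
    print(f"CPU at 0.8 Gflops/W : {system_power_megawatts(0.8):,.0f} MW")   # ~1,250 MW
    print(f"GPU at 5 Gflops/W   : {system_power_megawatts(5.0):,.0f} MW")   # ~200 MW
    print(f"Target 60 Gflops/W  : {system_power_megawatts(60.0):,.1f} MW")  # ~16.7 MW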

CPUs are designed for fast access to small amounts of data (low latency), while GPUs are built to
handle large amounts of data at once (high throughput) by efficiently managing memory.

Figure 1.9 compares the CPU and GPU in their performance/power ratio measured in Gflops/ watt per
core.

FIGURE 1.9: The GPU performance (middle line, measured at 5 Gflops/W/core in 2011) compared with the lower CPU performance (lower line, measured at 0.8 Gflops/W/core in 2011) and the estimated 60 Gflops/W/core required for future Exascale (EF, upper curve) systems.

In 2010, GPUs achieved 5 Gflops/watt per core, while CPUs only reached less than 1 Gflop/watt
per core. This difference in power efficiency could limit the future growth of supercomputers.
However, GPUs might improve and close the gap with CPUs. Data movement is the main factor
driving power consumption, so it's important to optimize how data is stored and accessed,
making memory management more efficient for specific tasks. To improve GPU-based
supercomputers, we need smarter operating systems, runtime support, and compilers that are
aware of memory usage and data locality. The key challenges for future computing systems will
be power management and software optimization.


Memory, Storage, and Wide-Area Networking


Memory Technology
The upper curve in Figure 1.10 shows that DRAM chip capacity grew from 16 KB in 1976 to
64 GB in 2011, with memory capacity increasing about 4 times every 3 years.
Memory access time has not improved much, and the memory wall problem (slow memory
access compared to fast processors) is getting worse as processors get faster.
For hard drives, capacity grew from 260 MB in 1981 to 250 GB in 2004, and the Seagate
Barracuda XT reached 3 TB in 2011, growing about 10 times every 8 years.
While memory and storage capacities continue to increase, the gap between processor speed
and memory performance is widening. This growing gap may make the memory wall an even
bigger issue, limiting CPU performance in the future.
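The growth rates quoted above can be checked with a short calculation: from the capacity figures in the text, the sketch below derives the average doubling period for DRAM and disk and compares it with the roughly 18-month doubling of processor speed. (The memory wall itself concerns access time, which has improved far more slowly than either capacity or processor speed.)

import math

def doubling_period_years(start_value: float, end_value: float, years: float) -> float:
    # Average time to double, assuming steady exponential growth.
    doublings = math.log2(end_value / start_value)
    return years / doublings

if __name__ == "__main__":
    KB, MB, GB, TB = 2**10, 2**20, 2**30, 2**40
    # DRAM chip capacity: 16 KB (1976) -> 64 GB (2011), per Figure 1.10.
    print(f"DRAM capacity doubles every ~{doubling_period_years(16 * KB, 64 * GB, 35):.1f} years")
    # Disk capacity: 260 MB (1981) -> 3 TB (2011), per the text above.
    print(f"Disk capacity doubles every ~{doubling_period_years(260 * MB, 3 * TB, 30):.1f} years")
    # Processor speed under Moore's law doubles roughly every 1.5 years.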

Disks and Storage Technology

FIGURE 1.10: Improvement in memory and disk technologies over 33 years. The Seagate
Barracuda XT disk has a capacity of 3 TB in 2011.

Beyond 2011, disks and disk arrays have exceeded 3 TB in capacity.


The lower curve in Figure 1.10 shows disk storage growing by 7 orders of magnitude over 33
years. The rise of flash memory and solid-state drives (SSDs) also affects the future of HPC
(high-performance computing) and HTC (high-throughput computing). SSDs have a good
lifespan, handling 300,000 to 1 million write cycles per block, meaning they can last for
several years even with heavy usage. SSDs and flash memory will speed up many applications
significantly.
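The endurance figures above support a rough lifetime estimate. In the sketch below, the drive capacity and the (deliberately heavy) daily write volume are assumptions chosen for illustration, and perfect wear leveling is assumed.

def ssd_lifetime_years(capacity_gb: float, write_cycles_per_block: float,
                       daily_writes_gb: float) -> float:
    # Total data the drive can absorb before wear-out, spread over daily writes.
    total_writable_gb = capacity_gb * write_cycles_per_block
    return total_writable_gb / daily_writes_gb / 365.0

if __name__ == "__main__":
    # Hypothetical 256 GB drive, 300,000 cycles per block (the lower bound quoted
    # above), written very heavily at 500 GB per day:
    print(f"Estimated lifetime: {ssd_lifetime_years(256, 300_000, 500):,.0f} years")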

Power consumption, cooling, and packaging will eventually limit the growth of large systems. Power use increases with clock frequency and voltage, meaning clock rates cannot be increased endlessly. Lower-voltage chips are in high demand.
Jim Gray once said, “Tape units are dead, disks are tape, flashes are disks, and memory are
caches now,” predicting the future of storage technology. While SSDs were still too expensive
to replace traditional disk arrays in 2011, they are expected to become more common in the
future.

System-Area Interconnects
In small clusters, the nodes are usually connected by an Ethernet switch or a local area network
(LAN). As shown in Figure 1.11, a LAN is often used to connect client hosts to large servers. A
Storage Area Network (SAN) links servers to network storage like disk arrays, while Network
Attached Storage (NAS) connects client hosts directly to the disk arrays. These three types of
networks are commonly used in large clusters built with commercial network components. If
there's no need for large shared storage, a small cluster can be set up with a multiport Gigabit
Ethernet switch and copper cables to connect the machines. All of these network types are
commercially available.


FIGURE 1.11: Three interconnection networks for connecting servers, client hosts, and storage
devices; the LAN connects client hosts and servers, the SAN connects servers with disk arrays,
and the NAS connects clients with large storage systems in the network environment.

Wide-Area Networking
Ethernet bandwidth has grown rapidly, increasing from 10 Mbps in 1979 to 1 Gbps in 1999 and 40-100 Gbps in 2011. It was predicted that 1 Tbps network links would be available by 2013. In 2006, network links with various bandwidths—1,000 Gbps for international connections and 1 Gbps for copper desktop connections—were reported. Network performance was roughly doubling each year, which is faster than Moore's law for CPU speed, which doubles every 18 months. This suggests that in
the future, more computers will be used together in massively distributed systems. High-
bandwidth networking will enable the creation of these large systems. A 2010 report from IDC
predicted that InfiniBand and Ethernet will be the main interconnect options for high-
performance computing (HPC). Most data centers currently use Gigabit Ethernet as the main
connection for their server clusters.
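The two doubling rates mentioned above are easy to compound. The sketch below tabulates the growth factors over one decade, which is the arithmetic behind the claim that bandwidth growth favors ever larger distributed systems.

def growth_factor(years: float, doubling_period_years: float) -> float:
    # Compound growth when a quantity doubles once every fixed period.
    return 2.0 ** (years / doubling_period_years)

if __name__ == "__main__":
    decade = 10
    print(f"Network bandwidth (doubles yearly):  x{growth_factor(decade, 1.0):,.0f}")
    print(f"CPU speed (doubles every 18 months): x{growth_factor(decade, 1.5):,.0f}")
    # Roughly 1,024x versus 102x over ten years.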

Session 3 questions:
1. What is MIPS stand for?
2. Why does clock speed no longer increase in line with Moore's law?


3. How many cores does a modern multicore CPU typically have?


4. What is the difference between superscalar and multithreading processors?
5. What is the role of a GPU in high-performance computing?


Handouts for Session 4: TECHNOLOGIES FOR NETWORK-BASED SYSTEMS (Contd..)

Virtual Machines and Virtualization Middleware


A conventional computer has a single OS image. This offers a rigid architecture that tightly
couples application software to a specific hardware platform. Some software running well on
one machine may not be executable on another platform with a different instruction set under a
fixed OS. Virtual machines (VMs) offer novel solutions to underutilized resources, application
inflexibility, software manageability, and security concerns in existing physical machines.
Today, to build large clusters, grids, and clouds, we need to access large amounts of
computing, storage, and networking resources in a virtualized manner. We need to aggregate
those resources, and hopefully, offer a single system image. In particular, a cloud of
provisioned resources must rely on virtualization of processors, memory, and I/O facilities
dynamically. Figure 1.12 illustrates the architectures of three VM configurations.

FIGURE 1.12: Three VM architectures in (b), (c), and (d), compared with the traditional
physical machine shown in (a).

Virtual Machines
In Figure 1.12, the host machine has physical hardware, such as an x86 desktop running a Windows OS. A Virtual Machine (VM) can be set up on any hardware system. The VM uses virtual


resources managed by a guest OS to run a specific application. To connect the VM with the
host system, a middleware called a Virtual Machine Monitor (VMM) is needed.

Figure 1.12(b) shows a native VM using a hypervisor in privileged mode. For example, the hardware runs the x86 architecture with Windows, and the guest OS could be Linux running on the XEN hypervisor. This is called a bare-metal VM because the hypervisor manages the hardware directly.

Another setup is the host VM in Figure 1.12(c), where the VMM runs in nonprivileged mode.
The host OS doesn’t need changes. There is also a dual-mode VM in Figure 1.12(d), where
part of the VMM runs at the user level and part at the supervisor level. In this case, the host OS
might need some changes.

Multiple VMs can be run on the same hardware, making the system hardware-independent.
This means applications on different guest OSes can be bundled into a virtual appliance and
run on any hardware platform. The VM can even run on an OS different from the host
computer.

VM Primitive Operations
The VMM provides the VM abstraction to the guest OS. With full virtualization, the VMM
exports a VM abstraction identical to the physical machine so that a standard OS such as
Windows 2000 or Linux can run just as it would on the physical hardware. Low-level VMM operations, described by Mendel Rosenblum, are illustrated in Figure 1.13.


FIGURE 1.13: VM multiplexing, suspension, provision, and migration in a distributed computing environment.

• The VMs can be multiplexed between hardware machines, as shown in Figure 1.13(a).
• VM can be suspended and stored in stable storage, as shown in Figure 1.13(b).
•A suspended VM can be resumed or provisioned to a new hardware platform, as shown in
Figure 1.13(c).
•A VM can be migrated from one hardware platform to another, as shown in Figure 1.13(d).
VM operations allow a VM to run on any available hardware, making it easier to move
distributed applications across systems. This approach improves server resource use by
consolidating multiple functions on the same hardware, increasing system efficiency. It helps
reduce server sprawl by running systems as VMs, making the shared hardware more efficient.
According to VMware, this can boost server utilization from 5–15% to 60–80%.
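The four primitive operations can be pictured with a purely conceptual state machine. The class and method names below are invented for illustration and do not correspond to any real hypervisor or VMM API.

class VirtualMachine:
    # Toy model of VM primitive operations; not a real hypervisor interface.

    def __init__(self, name: str, host: str):
        self.name, self.host, self.state = name, host, "running"

    def suspend(self) -> None:
        # Figure 1.13(b): the VM's state is saved to stable storage.
        self.state = "suspended"

    def resume(self, host: str) -> None:
        # Figure 1.13(c): a suspended VM is resumed or provisioned to a new host.
        assert self.state == "suspended"
        self.host, self.state = host, "running"

    def migrate(self, new_host: str) -> None:
        # Figure 1.13(d): a running VM moves to another hardware platform.
        assert self.state == "running"
        self.host = new_host

if __name__ == "__main__":
    vm = VirtualMachine("app-server", host="hw1")
    vm.suspend()
    vm.resume(host="hw2")    # brought back up on different hardware
    vm.migrate("hw3")        # moved again while running
    print(vm.name, vm.host, vm.state)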

Virtual Infrastructures
In Figure 1.14, physical resources for compute, storage, and networking are mapped to the
applications in various VMs. This separates hardware from software. Virtual infrastructure
connects resources to applications and dynamically assigns system resources to specific tasks.


This leads to lower costs and higher efficiency. Example: Server consolidation using
virtualization.

FIGURE 1.14: Growth and cost breakdown of data centers over the years.

Data Center Virtualization for Cloud Computing


In this section, we discuss basic architecture and design considerations of data centers. Cloud
architecture is built with commodity hardware and network devices. Almost all cloud platforms
choose the popular x86 processors. Low-cost terabyte disks and Gigabit Ethernet are used to
build data centers. Data center design emphasizes the performance/price ratio over speed
performance alone. In other words, storage and energy efficiency are more important than sheer speed performance. Figure 1.14 shows the server growth and cost breakdown of data centers over the past 15 years. Worldwide, about 43 million servers were in use as of 2010. The cost of utilities exceeds the cost of hardware after three years.

Data Center Growth and Cost Breakdown


A large data center may have thousands of servers, while smaller ones typically have hundreds.
Building and maintaining data centers has become more expensive over the years. According


to a 2009 IDC report, 30% of the costs go toward purchasing IT equipment (such as servers and disks), 33% toward chillers, 18% toward UPS, 9% toward air conditioning, and 7% toward power distribution, lighting, and transformers. About 60% of the cost of running a data center goes to management and maintenance. While server purchase costs have not changed much, electricity and cooling costs have increased from 5% to 14% of the total over 15 years.

Low-Cost Design Philosophy


High-end switches or routers can be too expensive for building data centers, making high-
bandwidth networks less practical for cloud computing. With a fixed budget, using affordable
commodity switches and networks is more cost-effective. Similarly, commodity x86 servers
are preferred over expensive mainframes. The software layer manages network traffic, fault
tolerance, and scalability. Today, most cloud computing data centers use Ethernet as their main
network technology.

Convergence of Technologies


Cloud computing is driven by the merging of four key technologies: (1) hardware virtualization
and multi-core chips, (2) utility and grid computing, (3) SOA, Web 2.0, and mashups, and (4)
autonomic computing and data center automation. Hardware virtualization and multi-core
chips enable dynamic cloud configurations. Utility and grid computing form the foundation for
cloud systems. Advances in SOA, Web 2.0, and mashups are advancing cloud capabilities.
Additionally, autonomic computing and automated data center management are helping cloud
computing grow.

Cloud technology is especially important in managing and analyzing large amounts of data, or
the "data deluge." This data comes from sources like sensors, experiments, and the web.
Handling this requires high-performance tools for storage, processing, and analysis. Cloud
computing is a big part of this transformation, supporting the data-intensive nature of modern
science.

Cloud computing impacts e-science, which uses parallel computing and cloud technologies to
process large datasets. Cloud computing offers on-demand services at various levels
(infrastructure, platform, software). For instance, MapReduce is a programming model that
handles data parallelism and fault tolerance, key for processing scientific data.


Overall, cloud computing, multicore technologies, and data-intensive science are coming
together to shape the future of computing. They enable a pipeline where data turns into
valuable insights and, eventually, machine wisdom.

Session 4 questions:
1. What is the benefit of using virtual machines (VMs)?
2. What is host VM?
3. What is native VM?
4. What is VM migration?
5. What is meant by virtual infrastructure?

Handouts for Session 5: SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING


- Distributed and cloud computing systems consist of numerous autonomous computer nodes
connected through SANs, LANs, or WANs in a hierarchical structure.
- Modern networking technology allows a few LAN switches to link hundreds of machines into
a functioning cluster.
- WANs can further connect local clusters to create expansive networks, potentially comprising
millions of computers integrated with edge networks.
- These extensive systems are categorized as highly scalable, achieving web-scale connectivity
in both physical and logical dimensions.
- A classification of massive systems includes four primary groups: clusters, P2P networks,
computing grids, and Internet clouds using large data centers.
- Each group can encompass a range of participating nodes, from hundreds to millions,
functioning together at varying levels of cooperation.
- Technical and application characteristics of these system classes are detailed in Table 1.2.


- Clusters are widely utilized in supercomputing, with 417 of the Top 500 supercomputers
utilizing this architecture in 2009.
- Clusters serve as a foundation for developing large-scale grids and clouds.
- P2P networks are more suited for business applications, yet the content industry hesitated to
adopt P2P due to concerns over copyright protection.
- Many national grids constructed in the last decade have experienced underutilization due to
insufficient middleware or poorly developed applications.
- Cloud computing offers advantages like low cost and simplicity for both providers and users.

Clusters of Cooperative Computers


A computing cluster is formed by interconnected stand-alone computers. These computers
collaborate to function as a unified computing resource.
The clustered systems have effectively managed substantial workloads and large data sets.

Cluster Architecture
- Figure 1.15 illustrates the structure of a typical server cluster utilizing a low-latency, high-
bandwidth interconnection network (e.g., SAN or LAN).
- Clusters can be expanded with multiple levels of switches (e.g., Gigabit Ethernet, Myrinet,
InfiniBand) for larger configurations.
- Hierarchical construction through SAN, LAN, or WAN enables the creation of scalable
clusters with increased node count.
- The cluster connects to the Internet through a virtual private network (VPN) gateway, which
provides the gateway IP address for locating the cluster.
- The operating system (OS) determines the system image based on its management of shared
cluster resources, resulting in clusters with loosely coupled nodes.
- Each server node's resources are managed by its respective OS, leading to multiple system
images across different autonomous nodes.


FIGURE 1.15: A cluster of servers interconnected by a high-bandwidth SAN or LAN with shared I/O devices and disk arrays; the cluster acts as a single computer attached to the Internet.

Single-System Image
An ideal cluster should merge multiple system images into a single-system image (SSI), as Greg Pfister has argued.
Cluster designers desire a cluster operating system or some middleware to support SSI at various levels, including the sharing of CPUs, memory, and I/O across all cluster nodes.
An SSI is an illusion created by software or hardware that presents a collection of resources as
one integrated, powerful resource.
SSI makes the cluster appear like a single machine to the user.
A cluster with multiple system images is nothing but a collection of independent computers.
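The SSI illusion can be sketched as a toy middleware layer that presents several autonomous nodes as one job-submission target. The node names and the round-robin placement policy are hypothetical choices for illustration, not a real cluster middleware.

from itertools import cycle

class SingleSystemImage:
    # Toy middleware: users submit jobs to "one machine"; the nodes stay hidden.

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._placement = cycle(self._nodes)   # simple round-robin placement

    def submit(self, job: str) -> str:
        node = next(self._placement)
        # Real SSI middleware would also make data access, process location,
        # replication, and failure recovery transparent to the user.
        return f"job '{job}' dispatched to {node}"

if __name__ == "__main__":
    cluster = SingleSystemImage(["node01", "node02", "node03"])
    for j in ["simulate", "render", "index", "analyze"]:
        print(cluster.submit(j))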

Hardware, Software, and Middleware Support


- MPPs and clusters exploiting massive parallelism dominate the Top 500 list of HPC systems. Key components include computer nodes (PCs, workstations, servers, SMPs), communication software (PVM, MPI), and network interface cards.

- The majority of these clusters operate under the Linux operating system.
- Nodes are connected via high-bandwidth networks such as Gigabit Ethernet, Myrinet, and InfiniBand.
- Middleware is essential for creating an SSI or high availability (HA).
- Both sequential and parallel applications are supported, but parallel applications require specialized parallel environments.
- Distributed memory configurations may lead to the formation of distributed shared memory (DSM) for resource sharing.
- Achieving SSI features can be challenging and costly; many clusters therefore function as loosely coupled systems.
- Virtualization enables the dynamic creation of multiple virtual clusters based on user demand.

Major Cluster Design Issues


- A cluster-wide operating system for complete resource sharing is not yet available.
- Middleware or OS extensions have been developed in user space to achieve a single-system image (SSI) at selected functional levels.
- Without this middleware, cluster nodes cannot collaborate effectively for cooperative computing.
- Software environments and applications rely on middleware for achieving high performance.
Key benefits of clusters, as summarized in Table 1.3, include:
- Scalable performance
- Efficient message passing
- High system availability
- Seamless fault tolerance
- Cluster-wide job management


Grid Computing Infrastructures


- Over the last 30 years, users have transitioned from the Internet to advanced web and grid
computing services.
- Internet services such as Telnet facilitate connections between local and remote computers.
- Web services such as HTTP allow users to access remote web pages.
- Grid computing aims to enable simultaneous interaction among applications on distant
machines.
- Forbes Magazine projects the IT-based economy to grow from $1 trillion in 2001 to $20
trillion by 2015.
- The evolution of these computing services is significantly contributing to economic growth.

Computational Grids
- A computing grid integrates computers, software, middleware, instruments, and sensors,
similar to an electric utility power grid.
- Grids can span LAN, WAN, or the Internet, functioning at regional, national, or global levels.


- Organizations present grids as integrated computing resources and virtual platforms to support various needs.
- The primary components of a grid include workstations, servers, clusters, and
supercomputers, with personal devices accessing the grid.
- An example of a computational grid includes multiple resource sites from different
organizations, providing diverse computing capabilities.
- Access to the grid is facilitated through existing broadband networks utilized by enterprises or
organizations.
- Special instruments, such as radio telescopes, may be used for specific applications like
astronomical research.
- The grid operates as a network, offering integrated services for computing, communication,
content, and transactions.
- The user base, composed of enterprises and consumers, influences usage trends and service
characteristics.

Grid Families
- Grid technology necessitates advancements in distributed computing models,
software/middleware support, network protocols, and hardware infrastructures.
- National grid projects are being complemented by industrial platform developments from
companies such as IBM, Microsoft, Sun, HP, Dell, Cisco, and EMC.
- A surge in new grid service providers (GSPs) and applications is occurring, reflecting a trend
similar to the evolution of Internet and web services over the last twenty years.
- Grid systems are categorized into two main types: computational or data grids, and peer-to-
peer (P2P) grids, with the former primarily established at a national level.


FIGURE 1.16: Computational grid or data grid providing computing utility, data, and
information services through resource sharing and cooperation among participating
organizations.

Peer-to-Peer Network Families


- The client-server architecture is a common example of a distributed system, where clients
connect to a central server for various applications such as compute, email, file access, and
databases.
- P2P (peer-to-peer) architecture represents a distributed model where the network is client-
oriented rather than server-oriented.
- The section introduces P2P systems at the physical level, alongside overlay networks at the
logical level.


P2P Systems
- In a P2P system, each node functions as both a client and a server, contributing resources.
- Nodes, known as peer machines, connect autonomously to the Internet, joining or leaving the
system freely.
- There is no master-slave relationship; the network operates without central coordination or a
central database.
- No peer has a complete view of the entire P2P system; it is characterized by self-organization
and distributed control.
- The architecture comprises unrelated peers that form an ad hoc network using TCP/IP
protocols.
- The P2P network's physical structure is dynamic, changing in size and topology due to the
voluntary nature of membership.

Overlay Networks
Data items or files are distributed in the participating peers. Based on communication or file-
sharing needs, the peer IDs form an overlay network at the logical level. This overlay is a
virtual network formed by mapping each physical machine with its ID, logically, through a
virtual mapping as shown in Figure 1.17.

FIGURE 1.17: The structure of a P2P system by mapping a physical IP network to an overlay
network built with virtual links.

- A new peer adds its peer ID as a node in the overlay network, while an exiting peer’s ID is
automatically removed.
- The P2P overlay network defines logical connectivity among peers.

- There are two types of overlay networks:


Unstructured overlay networks:
- Characterized by a random graph with no fixed routing for message or file transmission.
- Utilizes flooding to query all nodes, leading to heavy network traffic and unpredictable search results (a small flooding sketch follows this list).
Structured overlay networks:
- Follows specific connectivity rules for node management.
- Employs developed routing mechanisms to optimize performance.
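Referring to the flooding query noted in the list above, here is a small sketch of a TTL-limited flood over an unstructured overlay; the overlay graph, file names, and TTL value are arbitrary examples.

def flood_search(overlay: dict, start: str, wanted: str, ttl: int = 2) -> set:
    # Breadth-first flood over logical links; returns the peers holding `wanted`.
    frontier, visited, hits = {start}, set(), set()
    for _ in range(ttl + 1):
        next_frontier = set()
        for peer in frontier:
            visited.add(peer)
            if wanted in overlay[peer]["files"]:
                hits.add(peer)
            next_frontier.update(overlay[peer]["links"])
        frontier = next_frontier - visited   # peers that saw the query do not reprocess it
    return hits

if __name__ == "__main__":
    # A tiny unstructured overlay: peer IDs, their logical links, and shared files.
    overlay = {
        "A": {"links": {"B", "C"}, "files": set()},
        "B": {"links": {"A", "D"}, "files": {"song.mp3"}},
        "C": {"links": {"A", "D"}, "files": set()},
        "D": {"links": {"B", "C"}, "files": {"song.mp3"}},
    }
    print(flood_search(overlay, start="A", wanted="song.mp3"))   # finds peers B and D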

P2P Application Families


- P2P networks are categorized into four groups based on applications:
Distributed File Sharing: Involves sharing digital content (e.g., music, videos) through
popular networks like Gnutella, Napster, and BitTorrent.
Collaboration Networks: Includes services like MSN, Skype, instant messaging, and
collaborative design tools.
Distributed P2P Computing: Features specific applications such as SETI@home, which
utilizes combined computing power from over 3 million host machines for various tasks.
Supportive P2P Platforms: Examples include JXTA and .NET, which facilitate naming,
discovery, communication, security, and resource aggregation in P2P applications.

P2P Computing Challenges


- P2P computing faces heterogeneity issues in hardware, software, and network requirements.
- Numerous hardware models complicate selection; incompatibility arises between software
and operating systems; diverse network connections add complexity.
- System scalability is crucial as workload increases, directly linked to performance and
bandwidth; P2P networks exhibit these qualities.
- Key design objectives for distributed P2P applications include data locality, network
proximity, and interoperability.
- Performance is influenced by routing efficiency, self-organization of peers, and addressing
fault tolerance, failure management, and load balancing.
- Trust is a concern among peers; security, privacy, and copyright violations hinder business
application of P2P technology.


- In P2P networks, clients contribute resources such as computing power, storage space, and
bandwidth, enhancing robustness through a distributed approach.
- Risks include management difficulties due to decentralization, potential security breaches,
and unreliable client systems.
- P2P networks are effective for a limited number of peers and are best suited for applications
with minimal security concerns and non-sensitive data.

Cloud Computing over the Internet


“Computational science is changing to be data-intensive. Supercomputers must be balanced
systems, not just CPU farms but also petascale I/O and networking arrays.”- Gordon Bell, Jim
Gray, and Alex Szalay
- Future data processing will involve sending computations to the data instead of transferring
data to workstations.
- This shift aligns with the trend of moving computing from desktops to data centers that offer
on-demand software, hardware, and data services.
- The growth of large data sets has accelerated the adoption of cloud computing.
- Cloud computing is defined by various entities, including IBM, which describes it as a pool
of virtualized resources capable of hosting diverse workloads.
- Clouds enable quick deployment and scaling of workloads through the rapid provisioning of
virtual or physical machines.
- They support self-recovering, scalable programming models to mitigate hardware or software
failures.


- Real-time monitoring of resource utilization is essential for reallocating resources as needed.

Internet Clouds
- Cloud computing utilizes a virtualized platform for on-demand resources.
- It dynamically provisions hardware, software, and data sets.
- The concept transitions desktop computing to a service-oriented environment with server
clusters and large databases in data centers.
- Key benefits include low cost and simplicity for users and providers.
- Machine virtualization plays a crucial role in achieving cost-effectiveness.
- The primary goal of cloud computing is to meet diverse user applications simultaneously.
- The cloud ecosystem must be designed to be secure, trustworthy, and dependable.

FIGURE 1.18: Virtualized resources from data centers to form an Internet cloud, provisioned
with hardware, software, storage, network, and services for paid users to run their applications.

The Cloud Landscape


- Distributed computing systems have traditionally been managed by independent
organizations, facing challenges like system maintenance, poor utilization, and rising
hardware/software upgrade costs.
- Cloud computing addresses these issues by offering on-demand computing resources.
Major cloud service models include:
Infrastructure as a Service (IaaS): Users can utilize servers, storage, and networks without
managing the underlying infrastructure, allowing them to deploy multiple virtual machines.


Platform as a Service (PaaS): This model allows users to deploy applications on a virtual
platform that includes integrated middleware, databases, development tools, and runtime
support, freeing them from infrastructure management.
Software as a Service (SaaS): Users access applications via a web browser, eliminating upfront
costs for servers and software licensing, with lower costs compared to traditional hosting.
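As a concrete, hedged illustration of the IaaS model above, the sketch below provisions a single virtual machine with the AWS boto3 SDK; the region, image ID, and instance size are placeholder assumptions, and PaaS or SaaS offerings would hide even this step behind the platform or the application.

# Minimal IaaS sketch: rent one VM; the provider manages the physical servers,
# storage, and network underneath. Values below are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-12345678",    # placeholder machine image
    InstanceType="t2.micro",   # small VM size
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])   # ID of the newly launched VM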

FIGURE 1.19: Three cloud service models in a cloud landscape of major providers.

- Internet clouds have four deployment modes: private, public, managed, and hybrid.
- Each mode presents varying levels of security and shared responsibilities between cloud
providers, consumers, and third-party software providers.
- Cloud benefits are well-supported by IT experts, industry leaders, and researchers.

Eight reasons to adopt cloud computing for enhanced Internet applications and web services
include:
1. Locations with secure spaces and greater energy efficiency.
2. Enhanced utilization via shared peak-load capacity among users.
3. Clear separation of infrastructure maintenance from application development.
4. Cost reductions compared to traditional computing.
5. Facilitated programming and application development.
6. Improved service and data discovery, along with content distribution.
7. Addressing privacy, security, copyright, and reliability concerns.
8. Clarity in service agreements, business models, and pricing structures.


Session 5 questions:
1. What is overlay network?
2. Name two types of overlay network.
3. List different categories of peer-to-peer (P2P) networks?
4. List the cloud service models.
5. What is PaaS?

Handouts for Session 6: SOFTWARE ENVIRONMENTS FOR DISTRIBUTED SYSTEMS AND CLOUDS


Popular software environments for using distributed and cloud computing systems are:

Service-Oriented Architecture (SOA)


- In grids/web services, Java, and CORBA, an entity is, respectively, a service, a Java object, or a CORBA distributed object, implemented in a variety of languages.
- These architectures are built upon the seven layers of the Open Systems Interconnection
(OSI) model, which provide foundational networking abstractions.
- A base software environment is established, utilizing .NET or Apache Axis for web services,
the Java Virtual Machine for Java, and a broker network for CORBA.
- Higher-level environments are developed to accommodate the unique characteristics of
distributed computing.
- This includes entity interfaces and communication, paralleling the top four OSI layers at the
entity level rather than the bit level.
- A layered architecture for distributed entities is depicted, highlighting its application in web
services and grid systems.

Layered Architecture for Web Services and Grids


- Entity interfaces are aligned with WSDL, Java methods, and CORBA IDL specifications in
distributed systems.
- Customized high-level communication systems utilized include SOAP, RMI, and IIOP.
- These communication systems offer features like Remote Procedure Call (RPC) patterns,
fault recovery, and specialized routing.
- Communication systems are often supported by message-oriented middleware, such as
WebSphere MQ and Java Message Service (JMS).
- Middleware provides extensive functionality and enables virtualization of routing, senders,
and recipients.
- The Web Services Reliable Messaging (WSRM) framework incorporates features that
resemble TCP fault tolerance, adapted for different messaging abstractions.
- Security measures in WSRM either utilize or replicate functionalities found in frameworks
like Internet Protocol Security (IPsec) and secure sockets.
- Higher-level services support entity communication, facilitating registries, metadata, and
management functions.


FIGURE 1.20: Layered architecture for web services and the grids.
- Registry, metadata, and discovery services include JNDI within the Java distributed object framework, along with the CORBA Trading Service, UDDI, LDAP, and ebXML.
- Management services encompass service state and lifetime support, with examples being
CORBA Life Cycle, persistent states, Enterprise JavaBeans models, and Jini’s lifetime model.
- The collection of language and interface terms represents entity-level capabilities.
- Distributed models offer performance benefits, utilizing multiple CPUs for improved
efficiency and fostering a clear separation of software functions, leading to better software
reuse and maintenance.
- The distributed model is anticipated to become the standard approach for software systems,
having evolved from earlier technologies like CORBA and Java to more modern methods such
as SOAP, XML, and REST.

Web Services and Tools


- Services offer advantages over distributed objects due to loose coupling and support for
heterogeneous implementations.
- The architecture for services can be based on web services or REST systems, each with
distinct approaches for building reliable and interoperable systems.


- Web services involve extensive specification of all service aspects, utilizing SOAP for
communication, creating a universal distributed environment. However, this approach faces
challenges in protocol agreement and efficient implementation.
- REST systems prioritize simplicity, requiring minimal information in headers and using
message bodies for necessary data, making them suitable for rapidly evolving technologies.
- While REST provides flexibility in design, the concepts in web services remain significant
for mature systems at higher levels of application architecture.
- REST can incorporate XML schemas but not those associated with SOAP, with "XML over
HTTP" being a common choice.
- In frameworks like CORBA and Java, distributed entities are linked through Remote
Procedure Calls (RPCs), with a composite application being constructed by viewing these
entities as objects.
- The term “grid” can denote a single service or a collection of services, illustrated by sensors
emitting data messages and grids/clouds representing multi-input and output service
collections.

The Evolution of SOA


- Service-oriented architecture (SOA) has developed to support various computing
environments, including grids, clouds, and interclouds.
- SOA utilizes a range of sensor devices (e.g., ZigBee, Bluetooth, WiFi) for data collection,
referred to as sensor services (SS).
- These sensors collect raw data, which interacts with computers, grids, and different cloud
services (compute, storage, filter, discovery).
- Filter services (denoted as fs) are employed to refine raw data, ensuring only relevant
information is delivered based on specific requests.
- A network of filter services constitutes a filter cloud, facilitating the management of collected
data.


FIGURE 1.21: The evolution of SOA: grids of clouds and grids, where “SS” refers to a sensor
service and “fs” to a filter or transforming service.

- SOA focuses on extracting useful data from vast amounts of raw data.
- Processing this data leads to valuable information and knowledge for everyday decisions.
- Wisdom or intelligence is derived from extensive knowledge bases, influencing decision-
making through biological and machine insights.
- Most distributed systems necessitate a web interface or portal for data utilization.
- Data from numerous sensors undergoes transformation via compute, storage, filter, and
discovery clouds before converging at a user-accessible portal.
- Example portals, such as OGFCE and HUBzero, utilize both web service (portlet) and Web
2.0 (gadget) technologies.
- Numerous distributed programming models are constructed upon these foundational
elements.


Grids versus Clouds


- The distinction between grids and clouds is becoming less clear, particularly in web services
and workflow technologies.
- Workflow technologies coordinate services for specific business processes, such as two-phase
transactions.
- Key topics include BPEL Web Service standards and notable workflow approaches like
Pegasus, Taverna, Kepler, Trident, and Swift.
- Grid systems utilize static resources, while cloud systems focus on elastic resources.
- Some researchers assert that the main difference between grids and clouds lies in dynamic
resource allocation via virtualization and autonomic computing.
- It is possible to create grids from multiple clouds, which may outperform standalone clouds
by supporting negotiated resource allocation.
- This leads to complex architectures, including cloud of clouds, grid of clouds, cloud of grids,
and inter-clouds as part of a service-oriented architecture (SOA).

Trends toward Distributed Operating Systems


- Computers in distributed systems are loosely coupled, leading to multiple system images due
to each node operating independently.
- A distributed operating system (OS) is preferable for effective resource sharing and
communication among nodes.
- Such systems often operate as closed systems, utilizing message passing and remote
procedure calls (RPCs) for inter-node communication.
- The implementation of a distributed OS is essential for enhancing the performance,
efficiency, and flexibility of distributed applications.

Distributed Operating Systems


- Tanenbaum identifies three approaches for distributing resource management in distributed
computer systems:
Network OS: Built over heterogeneous OS platforms; offers low transparency and acts
primarily as a distributed file system through file sharing.
Middleware: Enables limited resource sharing, exemplified by MOSIX/OS for clustered
systems.
Distributed OS: Aims for a higher degree of resource sharing and system transparency through a tightly integrated approach.


- Table 1.6 compares the functionalities of these three types of distributed operating systems.

The table contrasts three representative systems: Amoeba, DCE, and MOSIX2 (under development since 1977 and now aimed at HPC Linux and GPU clusters). Their key characteristics are:
Architecture:
- Amoeba employs a microkernel-based, location-transparent distributed OS structure and uses multiple servers to handle file, directory, replication, boot, and TCP/IP services.
- DCE serves as middleware for distributed applications, with support for RPC, security, and threads.
Core Support:
- Amoeba features a special microkernel for low-level process, memory, I/O, and communication management.
- DCE packages deliver file, time, directory, and security services, RPC, and authentication at the middleware or user-space level.
- MOSIX2 is compatible with Linux 2.6, offering extensions for multi-cluster and cloud environments with provisioned VMs.
Communication:
- Amoeba implements a network-layer FLIP protocol and RPC for both point-to-point and group communication.
- DCE RPC provides authenticated communication and other security services within user programs.
- MOSIX2 supports collective communication via PVM and MPI, alongside priority process control and queuing services.

Amoeba versus DCE


- Amoeba is a distributed OS developed academically at the Free University (Vrije Universiteit) in the Netherlands, whereas DCE is a middleware-based system for distributed computing environments.
- The Open Software Foundation (OSF) has promoted the adoption of DCE for distributed computing.
- Amoeba, DCE, and MOSIX2 remain essentially research prototypes; no widely successful commercial operating system products followed them.
- There is an increasing demand for web-based operating systems to effectively virtualize
resources in distributed environments, a largely unexploited area.
- A distributed OS should manage resources across multiple servers, in contrast to the centralized
approach of conventional OSs.
- Future designs may favor a lightweight microkernel approach or enhance existing OSs such as
DCE and UNIX.
- The aim is to minimize user involvement in resource management tasks.

MOSIX2 for Linux Clusters


- MOSIX2 is a distributed operating system functioning within a virtualization layer in a Linux
environment.
- It provides a partial single-system image for user applications.
- The system supports both sequential and parallel applications, facilitating process migration
and resource discovery among Linux nodes.
- MOSIX2 can effectively manage a Linux cluster or a grid comprising multiple clusters.
- It allows for flexible management and resource sharing among multiple cluster owners.
- The MOSIX-enabled grid has the potential to expand indefinitely, contingent on the trust
among cluster owners.
- Applications of MOSIX2 are being investigated for resource management in various settings,
including Linux clusters, GPU clusters, grids, and cloud environments utilizing virtual
machines.


Transparency in Programming Environments


Figure 1.22 shows the concept of a transparent computing infrastructure for future computing
platforms. The user data, applications, OS, and hardware are separated into four levels. Data is
owned by users, independent of the applications. The OS provides clear interfaces, standard
programming interfaces, or system calls to application programmers. In future cloud
infrastructure, the hardware will be separated by standard interfaces from the OS. Thus, users
will be able to choose from different OSes on top of the hardware devices they prefer to use.
To separate user data from specific application programs, users can enable cloud applications
as SaaS. Thus, users can switch among different services. The data will not be bound to
specific applications.

FIGURE 1.22: A transparent computing environment that separates the user data, application, OS, and
hardware in time and space – an ideal model for cloud computing.

Parallel and Distributed Programming Models


In this section, we will explore four programming models for distributed computing with
expected scalable performance and application flexibility. Table 1.7 summarizes three of these

models, along with some software tool sets developed in recent years. MPI is the most popular
programming model for message-passing systems.

Message-Passing Interface (MPI)


- MPI is a primary programming standard for developing parallel and concurrent programs in
distributed systems.
- It serves as a library of subprograms callable from C or FORTRAN for writing parallel
applications.
- The framework supports clusters, grid systems, and P2P systems with enhanced web services
and utility computing applications.
- Additionally, distributed programming can utilize low-level primitives like the Parallel
Virtual Machine (PVM).
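A minimal message-passing sketch is shown below using mpi4py, a Python binding for MPI (the standard itself is a C/FORTRAN library, so this binding is used here purely for illustration); run it with an MPI launcher such as mpiexec.

# Rank 0 sends a greeting to every other rank; run: mpiexec -n 4 python hello_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()    # this process's ID within the communicator
size = comm.Get_size()    # total number of cooperating processes

if rank == 0:
    for dest in range(1, size):
        comm.send("hello from rank 0", dest=dest, tag=0)
else:
    msg = comm.recv(source=0, tag=0)
    print("rank", rank, "received:", msg)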

MapReduce
- This model is predominantly used in web-scale search and cloud computing applications.
- Users define a Map function to create intermediate key/value pairs.
- A Reduce function is then applied to aggregate all intermediate values with the same key.
- MapReduce enables high scalability and parallelism at various job levels.
- It efficiently manages terabytes of data across numerous client machines.
- Hundreds of MapReduce programs can run concurrently, with thousands of jobs executed
daily on Google’s clusters.
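The sketch below is a toy, single-machine imitation of this model (real frameworks such as Google MapReduce or Hadoop run the same user-defined functions in parallel across many nodes): a Map function emits intermediate key/value pairs and a Reduce function aggregates all values sharing a key.

# Word count, the classic MapReduce example, simulated sequentially.
from collections import defaultdict

def map_fn(document):
    for word in document.split():          # emit (word, 1) pairs
        yield (word.lower(), 1)

def reduce_fn(key, values):
    return (key, sum(values))              # aggregate counts for one word

def mapreduce(documents):
    intermediate = defaultdict(list)
    for doc in documents:                  # "map" phase
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    return [reduce_fn(k, v) for k, v in intermediate.items()]   # "reduce" phase

print(mapreduce(["the cloud scales", "the grid and the cloud"]))
# [('the', 3), ('cloud', 2), ('scales', 1), ('grid', 1), ('and', 1)]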


Hadoop Library
- Hadoop is a software platform originally developed by a Yahoo! group.
- It enables users to write and run applications over large amounts of distributed data.
- Hadoop can be scaled to store and process petabytes of web data economically.
- It features an open-source version of MapReduce that reduces overhead in task management
and data communication.
- The platform is efficient, processing data with high parallelism across numerous commodity
nodes.
- It ensures reliability by automatically maintaining multiple data copies for task redeployment
during system failures.

Open Grid Services Architecture (OGSA)


- Large-scale distributed computing applications drive grid infrastructure development,
requiring high resource and data sharing.
- OGSA is introduced as a common standard for grid services for public use.
- Genesis II exemplifies the implementation of OGSA.
- Key features of Genesis II include:
- Distributed execution environment
- Public Key Infrastructure (PKI) services with a local certificate authority (CA)
- Trust management
- Security policies in grid computing

Globus Toolkits and Extensions


- Globus is a middleware library developed by U.S. Argonne National Laboratory and USC
Information Science Institute.
- The library implements OGSA standards for resource discovery, allocation, and security
enforcement in grid environments.
- It supports multisite mutual authentication using PKI certificates.
- The current version, GT 4, has been utilized since 2008.
- IBM has extended Globus for business applications.
Session 6 questions:
1. What are the key components of Service-Oriented Architecture (SOA) in distributed systems?
2. What are the main differences between grids and clouds?


3. What are the advantages and challenges of using web services over REST systems in distributed
environments?
4. What is MapReduce programming model?
5. What are the functions of the Open Grid Services Architecture (OGSA)?


Handouts for Session 7: PERFORMANCE, SECURITY, AND ENERGY EFFICIENCY


Performance Metrics and Scalability Analysis
Performance metrics are needed to measure various distributed systems.
Performance Metrics
Performance in distributed systems is influenced by many factors:
- System throughput is measured in MIPS (million instructions per second), Tflops (tera floating-point operations per second), or TPS (transactions per second).
- Other performance metrics include job response time and network latency.
- Preferred interconnection networks exhibit low latency and high bandwidth.
- System overhead includes OS boot time, compile time, I/O data rate, and runtime support.
- Additional performance metrics encompass QoS for Internet and web services, system
availability and dependability, and security resilience against network attacks.

Dimensions of Scalability
- Users require a distributed system that offers scalable performance and ensures backward
compatibility with existing hardware and software during upgrades.
- Overdesign of systems is often not cost-effective, and scaling can depend on various practical
factors.
-Dimensions of scalability in parallel and distributed systems include:
Size Scalability: Achieving increased performance by enlarging the machine (e.g., adding
processors, memory, or I/O). Different architectures vary in their ability to scale; IBM's S2
reached 512 processors in 1997, while BlueGene/L reached 65,000 in 2008.
Software Scalability: Involves OS or compiler upgrades, addition of libraries, and ensuring
compatibility with larger systems, which requires thorough testing and adjustments.
Application Scalability: Concerns aligning problem size with machine size to optimize
efficiency; users may increase problem size instead of machine size for better cost-
effectiveness.
Technology Scalability: Requires adaptability to new building technologies, considering
factors like time (generation scalability), space (packaging and energy), and heterogeneity
(compatibility of components from multiple vendors).


Scalability versus OS Image Count


Figure 1.23 shows the scalable performance estimated against the multiplicity of OS images in
distributed systems deployed up to 2010.
- Scalable performance allows systems to increase speed through additional processors,
memory, disk capacity, or I/O channels.
- The number of independent OS images in a system varies in clusters, grids, P2P networks, or
clouds.
- SMP (symmetric multiprocessor) servers operate with a single system image; by 2010, their
scalability was limited to a few hundred processors due to packaging and interconnect
constraints.
- NUMA (nonuniform memory access) architectures consist of SMP nodes with distributed
memory, capable of supporting thousands of processors and multiple operating systems.
- Clusters generally exhibit higher scalability than NUMA systems due to their configuration,
with the number of OS images tied to the active nodes.
- As of 2010, the largest cloud could scale to a few thousand virtual machines (VMs).
- Cluster systems typically have a significantly higher total number of processors or cores
compared to the number of OS images.
- Grid structures may encompass various node types, resulting in potentially hundreds or
thousands fewer OS images than processors.
- P2P networks can scale to millions of independent nodes, with performance dependent on
public network quality of service (QoS).
- The evaluation of low-speed P2P networks, Internet clouds, and computer clusters should be
conducted at the same networking level.


FIGURE 1.23: System scalability versus multiplicity of OS images based on 2010 technology.

Amdahl’s Law
- A program’s execution time on a uniprocessor is T minutes.
- A fraction α of the code is sequential (sequential bottleneck), while (1 − α) is parallelizable
across n processors.
- Total execution time formula: αT + (1 − α)T/n.
- First term: sequential execution time.
- Second term: parallel execution time.
- Assumes no system or communication overhead and excludes I/O or exception handling time
in speedup analysis.
Amdahl’s Law states that the speedup factor of using the n-processor system over the use of a
single processor is expressed by:
Speedup = S = T/[αT + (1 − α)T/n] = 1/[α + (1 − α)/n]
- Maximum speedup is only achieved if the sequential bottleneck (α) is eliminated or the code
is fully parallelizable.
- As the cluster size increases (n →∞), speedup (S) approaches 1/α, which is an upper bound
independent of n.
- The sequential bottleneck represents the non-parallelizable portion of the code.
- For instance, with α = 0.25, a maximum speedup of 4 can be achieved, regardless of the
number of processors.


- Amdahl’s law emphasizes minimizing the sequential bottleneck to maximize speedup.


- Increasing cluster size alone does not guarantee improved speedup if the bottleneck remains
significant.

Problem with Fixed Workload


- Amdahl's law assumes a fixed workload for both sequential and parallel execution of
programs.
- This scenario, termed fixed-workload speedup, implies system efficiency can be defined as E
= S/n = 1/[αn + 1 − α].
- System efficiency often remains low, particularly with large cluster sizes.
- For instance, in a 256-node cluster, the observed efficiency is E = 1/[0.25 × 256 + 0.75] =
1.5%.
- This low efficiency arises because only a few processors (e.g., 4) are actively utilized while
the remaining nodes remain idle.
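A short sketch reproducing these numbers (α = 0.25, n = 256) under the same no-overhead assumption:

# Amdahl's law: speedup and fixed-workload efficiency.
def amdahl_speedup(alpha, n):
    return 1.0 / (alpha + (1.0 - alpha) / n)          # S = 1/[alpha + (1 - alpha)/n]

def fixed_workload_efficiency(alpha, n):
    return amdahl_speedup(alpha, n) / n               # E = S/n = 1/[alpha*n + 1 - alpha]

alpha, n = 0.25, 256
print(round(amdahl_speedup(alpha, n), 2))             # 3.95, bounded above by 1/alpha = 4
print(round(fixed_workload_efficiency(alpha, n), 4))  # 0.0154, i.e. about 1.5%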

Gustafson’s Law
- To enhance efficiency in a large cluster, it is crucial to scale the problem size to align with the
cluster's capabilities.
- John Gustafson's 1988 speedup law, known as scaled-workload speedup, addresses this issue.
- The workload in a program is represented as W.
- In an n-processor system, users adjust the workload to W′ = αW + (1 − α)nW, where only the
parallelizable part is multiplied by n.
- The adjusted workload W′ reflects the sequential execution time on a single processor.
- The parallel execution time of the scaled workload on n processors establishes a definition for
scaled-workload speedup.

S′ = W′/W = [αW + (1 − α)nW ]/W = α + (1 − α)n


This speedup is known as Gustafson’s law. By fixing the parallel execution time at level W, the
following efficiency expression is obtained:
E′ = S′/n = α/n + (1 − α)
- The efficiency of a 256-node cluster can be improved to E′ = 0.25/256 + 0.75 = 0.751 for a
scaled workload.
- Amdahl’s law should be applied for fixed workloads.


- Gustafson’s law is applicable for solving scaled problems.
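A companion sketch for the scaled-workload case, reproducing the 0.751 efficiency figure for α = 0.25 and n = 256:

# Gustafson's law: scaled-workload speedup and efficiency.
def gustafson_speedup(alpha, n):
    return alpha + (1.0 - alpha) * n                  # S' = alpha + (1 - alpha)*n

def scaled_workload_efficiency(alpha, n):
    return gustafson_speedup(alpha, n) / n            # E' = alpha/n + (1 - alpha)

alpha, n = 0.25, 256
print(gustafson_speedup(alpha, n))                    # 192.25
print(round(scaled_workload_efficiency(alpha, n), 3)) # 0.751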

Fault Tolerance and System Availability


In addition to performance, system availability and application flexibility are two other
important design goals in a distributed computing system.

System Availability
- High availability (HA) is crucial in clusters, grids, P2P networks, and cloud systems.
- A system is considered highly available if it has a long mean time to failure (MTTF) and a
short mean time to repair (MTTR).
System Availability = MTTF / (MTTF + MTTR)
- Failures can occur in any hardware, software, or network components, with a single point of
failure significantly impacting system operations.
- To achieve high availability, systems should be designed without single points of failure,
incorporating:
- Hardware redundancy
- Increased component reliability
- Testability in design
- As distributed systems increase in size, overall availability may decrease due to the increased
likelihood of failures and challenges in isolating them.
- Symmetric Multiprocessing (SMP) and Massively Parallel Processing (MPP) systems exhibit
vulnerabilities with centralized resources under one operating system.
- Non-Uniform Memory Access (NUMA) architectures enhance availability through multiple
operating systems.
- Clusters are often designed for high availability with failover capabilities.
- Private clouds, formed from virtualized data centers, generally share availability
characteristics with their hosting clusters.
- Grids, structured as hierarchies of clusters, benefit from fault isolation, leading to higher
availability compared to other systems.
- However, as system size grows, availability tends to decrease across clusters, clouds, and
grids.
- P2P file-sharing networks, while having numerous client machines, operate independently
and generally have low availability, especially if multiple nodes fail simultaneously.
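A numeric illustration of the availability formula above, using assumed (hypothetical) MTTF and MTTR values:

# Availability = MTTF / (MTTF + MTTR)
def availability(mttf_hours, mttr_hours):
    return mttf_hours / (mttf_hours + mttr_hours)

# A node failing on average every 1,000 hours and repaired in 2 hours:
print(round(availability(1000, 2), 4))    # 0.998 -> about 99.8% availability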


FIGURE 1.24: Estimated system availability by system size of common configurations in 2010.
Session 7 questions:
1. How does Amdahl’s Law affect the performance and speedup of parallel systems?
2. How does Gustafson’s Law improve the scalability and efficiency of distributed systems compared
to Amdahl’s Law?
3. What do you mean by high availability of a system?
4. What is MPP?
5. How is availability enhanced in the NUMA architecture?


Handouts for Session 8: PERFORMANCE, SECURITY, AND ENERGY EFFICIENCY (Contd..)


Network Threats and Data Integrity
Threats to Systems and Networks
- Network viruses have led to widespread attacks affecting many users.
- These incidents have caused a worm epidemic, impacting routers and servers, resulting in
significant financial losses in business and government sectors.
- Various attack types can cause specific damages:
- Information leaks result in loss of confidentiality.
- User alterations, Trojan horses, and service spoofing attacks compromise data integrity.
- Denial of Service (DoS) attacks lead to operational loss and disrupted Internet connections.
- Lack of authentication and authorization enables unauthorized use of computing resources.
- Open resources, including data centers, P2P networks, and cloud infrastructures, are
increasingly vulnerable to attacks.
- Users must ensure the protection of clusters, grids, and cloud systems, or refrain from
utilizing or trusting them.
- Malicious intrusions can destroy valuable hosts and network resources.
- Internet anomalies in routers and distributed hosts may impede the adoption of public-
resource computing services.

Security Responsibilities
- Three primary security requirements: confidentiality, integrity, and availability.
- SaaS: Cloud provider handles all security aspects.
- PaaS: Provider ensures data integrity and availability; users manage confidentiality and
privacy.
- IaaS: Users are primarily responsible for security functions, with the provider focusing on
availability.


FIGURE 1.25: Various system attacks and network threats to cyberspace, resulting in four types of losses.

Copyright Protection
- Collusive piracy constitutes a major source of intellectual property violations in P2P
networks.
- Paid clients, referred to as colluders, illegally share copyrighted content with unpaid clients,
or pirates.
- Online piracy has restricted the use of open P2P networks for commercial content delivery.
- A proactive content poisoning scheme can be devised to combat collusion and piracy without
harming legitimate users.
- Timely detection of pirates is achieved through identity-based signatures and timestamped
tokens.

System Defense Technologies


Three generations of network defense technologies have evolved:
First Generation: Focused on preventing intrusions through access control policies, tokens, and
cryptographic systems, acknowledging that weak links can still be exploited.
Second Generation: Emphasized intrusion detection for timely remedial actions, utilizing tools
such as firewalls, intrusion detection systems (IDS), PKI services, and reputation systems.


Third Generation: Introduced intelligent responses to detected intrusions, enhancing reaction capabilities.

Data Protection Infrastructure


Security infrastructure is essential for the protection of web and cloud services.
Trust negotiation and reputation aggregation must be performed at the user level.
Application security measures are necessary for:
- Worm containment
- Intrusion detection against:
- Viruses
- Worms
- Distributed DoS (DDoS) attacks
Mechanisms to prevent online piracy and copyright violations should be implemented.
Security responsibilities vary across cloud service models:
- Cloud providers bear full responsibility for platform availability.
- IaaS users handle confidentiality issues, while IaaS providers are accountable for data
integrity.
- In PaaS and SaaS, both providers and users equally share responsibilities for data integrity
and confidentiality.

Energy Efficiency in Distributed Computing


- Conventional parallel and distributed computing systems prioritize high performance and
throughput, with a focus on performance reliability, including fault tolerance and security.
- New challenges have emerged, notably energy efficiency and resource outsourcing, which are
essential for the sustainability of large-scale computing systems.
- Integrated solutions are necessary for protecting data centers from energy consumption-
related problems, which impact financial, environmental, and system performance.
- For instance, the Earth Simulator and Petaflop systems have peak power usages of 12 and 100
megawatts, leading to hourly energy costs of $1,200 and $10,000, respectively, which can
exceed many operators' budgets.
- Additionally, cooling is a critical concern due to high temperatures that can adversely affect
electronic components and lead to their premature failure.


Energy Consumption of Unused Servers


- Operating a server farm incurs significant annual costs for hardware, software, operational
support, and energy.
- Companies must assess the utilization levels of their server farms to ensure resource
efficiency.
- Historically, approximately 15% of servers are idling daily, equating to around 4.7 million
nonproductive servers globally.
- Potential savings from powering down these idle servers amount to $3.8 billion in energy
costs and $24.7 billion in total operating expenses.
- This wasted energy contributes to approximately 11.8 million tons of carbon dioxide
emissions annually, akin to the pollution produced by 2.1 million cars.
- In the U.S. alone, this translates to 3.17 million tons of carbon dioxide, comparable to the
emissions of 580,678 vehicles.
- IT departments should prioritize analyzing their servers to identify unused and underutilized
resources.

Reducing Energy in Active Servers


In addition to identifying unused/underutilized servers for energy savings, it is also necessary
to apply appropriate techniques to decrease energy consumption in active distributed systems
with negligible influence on their performance. Power management issues in distributed
computing platforms can be categorized into four layers, namely the application layer, middleware layer, resource layer, and network layer, as shown in Figure 1.26.

FIGURE 1.26: Four operational layers of distributed computing systems.


Application Layer
- User applications in various fields, such as science, business, engineering, and finance,
typically aim to enhance system speed or quality.
- The introduction of energy-aware applications necessitates the development of complex
energy management systems that do not compromise performance.
- A crucial initial step is to investigate the interplay between performance and energy
consumption.
- An application’s energy consumption is significantly influenced by the number of instructions
required for execution and interactions with the storage unit (memory).
- The relationship between compute (instructions) and storage (transactions) affects the overall
completion time.

Middleware Layer
- The middleware layer serves as a connection between the application and resource layers.
- Key functionalities include resource brokering, communication services, task analysis, task
scheduling, security access, reliability control, and information services.
- It implements energy-efficient techniques, especially in task scheduling.
- The traditional focus of scheduling has been on reducing makespan (the total execution time
of tasks).
- Distributed computing systems require a revised cost function that considers both makespan
and energy consumption.
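One way to picture such a revised cost function is the hedged sketch below, where makespan and energy are first normalized and then blended with a tuning weight w (the weight and the numbers are illustrative assumptions, not a published scheduling policy):

# Energy-aware scheduling cost: blend normalized makespan and energy.
def schedule_cost(makespan_s, energy_j, max_makespan_s, max_energy_j, w=0.5):
    # w = 1.0 recovers classic makespan-only scheduling; smaller w favors energy.
    return w * (makespan_s / max_makespan_s) + (1.0 - w) * (energy_j / max_energy_j)

# Compare two candidate schedules for the same task set (hypothetical figures):
fast_but_hot  = schedule_cost(100.0, 5000.0, max_makespan_s=130.0, max_energy_j=5000.0)
slow_but_cool = schedule_cost(130.0, 3500.0, max_makespan_s=130.0, max_energy_j=5000.0)
print(round(fast_but_hot, 3), round(slow_but_cool, 3))   # 0.885 vs 0.85 -> energy-aware pick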

Resource Layer
- The resource layer encompasses a variety of computing nodes and storage units, managing
interactions with hardware and the operating system.
- It is responsible for controlling distributed resources within computing systems.
- Recent advancements focus on efficient power management for hardware and operating
systems, predominantly through hardware-centric methods for processors.
- Two prominent methods include:
Dynamic Power Management (DPM): Allows hardware, like CPUs, to transition from idle to
lower-power modes.
Dynamic Voltage-Frequency Scaling (DVFS): Achieves energy savings by adjusting
operating frequencies and supply voltage, directly influencing power consumption in CMOS
circuits.


Network Layer
- The network layer in distributed computing systems is responsible for routing packets and
enabling services for the resource layer.
- Key challenges for energy-efficient networks include:
- Developing comprehensive models reflecting time, space, and energy interactions within
networks.
- Creating new energy-efficient routing algorithms and protocols to counteract network
attacks.
- Data centers are increasingly vital for economic and social progress, akin to essential
infrastructures like power grids and transportation systems.
- Traditional data centers face high costs, complex resource management, poor usability,
security and reliability issues, and significant energy consumption.

DVFS Method for Energy Efficiency


The DVFS method exploits the slack time (idle time) typically incurred by inter-task dependences. The slack time associated with a task is used to execute the task at a lower voltage and frequency. The relationship between energy and voltage/frequency in CMOS circuits is given by:

E = Ceff f v² t,   with   f = K (v − vt)² / v

where v is the supply voltage, Ceff is the effective circuit switching capacitance, K is a technology-dependent factor, vt is the threshold voltage, and t is the execution time of the task under clock frequency f.

By reducing voltage and frequency, the device’s energy consumption can also be reduced.
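The sketch below evaluates this relation with hypothetical parameter values, only to show the trend: for a task of a fixed number of cycles, lowering the supply voltage (and hence the frequency) stretches the execution time t but still cuts the energy roughly with v².

# E = Ceff * f * v**2 * t, with f = K * (v - vt)**2 / v  (all values hypothetical).
def frequency(v, K, vt):
    return K * (v - vt) ** 2 / v

def task_energy(ceff, v, K, vt, cycles):
    f = frequency(v, K, vt)
    t = cycles / f                 # execution time grows as frequency drops
    return ceff * f * v ** 2 * t   # simplifies to ceff * v**2 * cycles

ceff, K, vt, cycles = 1e-9, 1e9, 0.3, 1e9
for v in (1.2, 1.0, 0.8):          # scale the supply voltage down during slack time
    print(v, round(task_energy(ceff, v, K, vt, cycles), 2))
# 1.2 1.44, 1.0 1.0, 0.8 0.64  -> energy (joules) falls as voltage is reduced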


Energy Efficiency in Distributed Power Management


Figure 1.27 illustrates the DVFS method. The technique shown on the right saves energy compared to the traditional practice shown on the left. The idea is to reduce the frequency and/or voltage during workload slack time. Because the transition latencies between lower-power modes are very small, energy is saved by switching between operational modes, although frequent switching between low-power modes can affect performance. Storage units must also interact with the computing nodes to balance overall power consumption; storage power consumption is rising rapidly, since storage needs grow by roughly 60 percent annually, making the situation even worse.

FIGURE 1.27: The DVFS technique (right) saves energy, compared to traditional practices
(left) by reducing the frequency or voltage during slack time.

Session 8 questions:
1. Name different security requirements in Cloud computing.
2. What is the difference between SaaS, PaaS, and IaaS?
3. What is denial of service (DoS) attacks?
4. What are the challenges posed by idle servers?
5. What are the four operational layers of distributed computing systems?


Question Bank
1. What is the difference between HPC and HTC?
2. List the different computing paradigms. Explain
3. What is cloud computing?
4. What is ubiquitous computing?
5. What is IoT?
6. What is degree of parallelism? Explain
7. List a few domains where HPC is used.
8. What is the advantage of cloud computing in IoT?
9. What is CPS? Explain
10. What is the difference between superscalar and multithreading processors?
11. What is the role of a GPU in high-performance computing?
12. What is the benefit of using virtual machines (VMs)?
13. What is a host VM?
14. What is a native VM?
15. What is VM migration?
16. List different types of VM. Explain with neat figure.
17. What is meant by virtual infrastructure?
18. List and Explain different cluster design issues.
19. What is an overlay network?
20. Name two types of overlay networks.
21. List different categories of peer-to-peer (P2P) networks.
22. List the cloud service models.
23. What is PaaS?
24. What are the key components of Service-Oriented Architecture (SOA) in distributed systems?
25. List and explain P2p application families.
26. What are the main differences between grids and clouds?
27. Explain with a neat figure layered architecture for web services and grids.
28. List and explain parallel and distributed programming models.
29. What is the MapReduce programming model?
30. What are the functions of the Open Grid Services Architecture (OGSA)?
31. How does Amdahl’s Law affect the performance and speedup of parallel systems?
32. How does Gustafson’s Law improve the scalability and efficiency of distributed systems
compared to Amdahl’s Law?