Cloud Computing Assignment
INSTITUTE OF TECHNOLOGY
SCHOOL OF COMPUTING
Individual assignment
Submitted to Dr Direslign.
Problem 1.1
Briefly define the following basic techniques and technologies that represent recent related
advances
in computer architecture, parallel processing, distributed computing, Internet technology, and
information services:
a. High-performance computing (HPC) system
b. High-throughput computing (HTC) system
c. Peer-to-peer (P2P) network
d. Computer cluster versus computational grid
e. Service-oriented architecture (SOA)
f. Pervasive computing versus Internet computing
g. Virtual machine versus virtual infrastructure
h. Public cloud versus private cloud
i. Radio-frequency identifier (RFID)
j. Global positioning system (GPS)
k. Sensor network
l. Internet of Things (IoT)
m. Cyber-physical system (CPS)
Solution
A ) HPC - High-performance computing (HPC) is the ability to process data and perform
complex calculations at high speeds. To put it into perspective, a laptop or desktop with a 3 GHz
processor can perform around 3 billion calculations per second. While that is much faster than
any human can achieve, it pales in comparison to HPC solutions that can perform quadrillions of
calculations per second.
B ) HTC - High-throughput computing (HTC) does not aim to optimize a single application but to
serve many users and applications at once. Many applications share the same computing
infrastructure, and the goal is to maximize the aggregate throughput of all of them over long
periods.
C ) P2P : In a P2P network, the "peers" are computer systems which are connected to each
other via the Internet. Files can be shared directly between systems on the network without the
need for a central server. In other words, each computer on a P2P network acts as both a
file server and a client.
The only requirements for a computer to join a peer-to-peer network are an Internet
connection and P2P software. Common P2P software programs include Kazaa,
Limewire, BearShare, Morpheus, and Acquisition. These programs connect to a P2P
network, such as "Gnutella," which allows the computer to access thousands of other
systems on the network.
D) CLUSTER COMPUTING
Nodes must be homogeneous, i.e., they should have the same type of hardware and operating
system.
Computers in a cluster are dedicated to the same work and perform no other task.
Computers are located close to each other.
Computers are connected by a high-speed local area network.
Computers are connected in a centralized network topology.
Scheduling is controlled by a central server.
Whole system has a centralized resource manager.
Whole system functions as a single system.
GRID COMPUTING
Nodes may have different operating systems and hardware; machines can be
homogeneous or heterogeneous.
Computers in a grid contribute their unused processing resources to the grid computing
network.
Computers may be located at a huge distance from one another.
Computers are connected using a low-speed bus or the Internet.
Computers are connected in a distributed or de-centralized network topology.
It may have servers, but mostly each node behaves independently.
Every node manages its resources independently.
Every node is autonomous, and anyone can opt out anytime.
E ) SOA: Service-Oriented Architecture (SOA) is a style of software design in which application
components provide services to other components through a communication protocol over a
network.
Its principles are independent of vendors and other technologies. In service oriented
architecture, a number of services communicate with each other, in one of two ways:
through passing data or through two or more services coordinating an activity.
F ) Pervasive (ubiquitous) computing embeds computation into everyday objects and
environments so that computing is available everywhere and at all times, often invisibly.
Internet computing, by contrast, refers to computing that relies on Internet connectivity
and web-based services and protocols.
G ) A virtual infrastructure lets you share the physical resources of multiple machines across
your entire infrastructure. A virtual machine lets you share the resources of a single physical
computer across multiple virtual machines for maximum efficiency.
H ) A private cloud hosting solution, also known as an internal or enterprise cloud, resides on a
company's intranet or in a hosted data center, where all of your data is protected behind a firewall.
This can be a great option for companies that already have expensive data centers, because they
can reuse their current infrastructure. However, the main drawback people see with a private cloud
is that all management, maintenance and updating of data centers is the responsibility of the
company. Over time, it's expected that your servers will need to be replaced, which can get very
expensive. On the other hand, private clouds offer an increased level of security and they share
very few, if any, resources with other organizations.
The main differentiator between public and private clouds is that you aren't responsible
for any of the management of a public cloud hosting solution.
Your data is stored in the provider's data center and the provider is responsible for the
management and maintenance of the data center. This type of cloud environment is
appealing to many companies because it reduces lead times in testing and deploying new
products. However, the drawback is that many companies feel security could be lacking
with a public cloud. Even though you don't control the security of a public cloud, all of
your data remains separate from others and security breaches of public clouds are rare.
I ) Radio-Frequency Identification (RFID) is the use of radio waves to read and capture
information stored on a tag attached to an object. A tag can be read from up to several feet away
and does not need to be within direct line-of-sight of the reader to be tracked.
J ) The Global Positioning System (GPS) is a navigation system using satellites, a receiver and
algorithms to synchronize location, velocity and time data for air, sea and land travel.
K ) A sensor network comprises a group of small, powered devices, and a wireless or wired
networked infrastructure. They record conditions in any number of environments including
industrial facilities, farms, and hospitals.
Problem 1.3
An increasing number of organizations in industry and business sectors adopt cloud systems.
Answer the following questions regarding cloud computing:
Solution A)
On-demand self-service: Whenever consumers require servers or resources, they can
provision them from anywhere, at any time. This can be done automatically, with the
help of automation software, without any human interaction.
Access from any device: You can access resources such as applications, desktops, or
servers over the Internet from any device, such as a laptop, tablet, or mobile phone.
Multitenancy: Consumers use shared resources that are abstracted by software to provide
the cloud service. With the multi-tenant model, resources are allocated or de-allocated
as required, on the basis of demand.
Elasticity: This is a very important aspect of the cloud. Resources can be
provisioned and released automatically, at any given point in time.
Cost transparency (measured service): The cloud service provider's software charges
you on a monthly or yearly basis, based on usage such as CPU hours, GB of storage,
or active user accounts.
Multiplatform: You can provision any server, with any OS and any configuration.
b. Discuss key enabling technologies in cloud computing systems.
Solution B )
Virtualization / VDC (virtual data center): resource abstraction of CPU, RAM,
storage, and network.
Virtual private cloud: The service provider can logically separate resources and
provide them to a particular organization.
Virtual network: In the cloud you get a virtual network that you can manage
logically while getting all the benefits of a physical network.
PBM (policy-based management): You can manage and get all the benefits of the underlying
hardware by using policies that meet the SLA and other requirements.
Solution C)
Resource sharing: Providers get maximum benefit by sharing resources and
hibernating the rest of the servers.
DPM (dynamic power management): With the help of software, vendors like Amazon,
Microsoft, and VMware can manage power for servers, and even racks or chassis, dynamically;
if utilization is low, virtual machines are migrated off under-utilized servers,
and those servers or racks are then hibernated.
Problem 1.5
Consider a multicore processor with four heterogeneous cores labeled A, B, C, and D. Assume cores
A and D have the same speed. Core B runs twice as fast as core A, and core C runs three times
faster than core A. Assume that all four cores start executing the following application at the same
time and no cache misses are encountered in all core operations. Suppose an application needs to
compute the square of each element of an array of 256 elements. Assume 1 unit time for core A or
D to compute the square of an element. Thus, core B takes 1/2 unit time and core C takes 1/3 unit
time to compute the square of an element. Given the following division of labor in the four cores:
Core A 32 elements,
Core B 128 elements,
Core C 64 elements,
Core D 32 elements
a ) Compute the total execution time (in time units) for using the four-core processor to compute
the squares of 256 elements in parallel. The four cores have different speeds. Some faster cores
finish the job and may become idle, while others are still busy computing until all squares are
computed.
b) Calculate the processor utilization rate, which is the total amount of time the cores are busy (not
idle) divided by the total execution time they are using all cores in the processor to execute the
above application.
Solution
We can normalize all times to core A's per-element time:
- Time B = Time A / 2
- Time C = Time A / 3
- Time D = Time A
a ) Total execution time = max(32/1, 128/2, 64/3, 32/1)
= max(32, 64, 21.3, 32)
= 64 time units.
b ) Utilization = total busy time / (number of cores × total execution time)
= (32/1 + 128/2 + 64/3 + 32/1) / (4 × 64)
= 149.3 time units / 256 time units
≈ 58.3% utilization for the job.
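The arithmetic above can be checked with a short script (a sketch; the relative core speeds and per-core element counts are taken directly from the problem statement):

```python
# Relative speeds (elements per unit time) and elements assigned per core.
speeds = {"A": 1, "B": 2, "C": 3, "D": 1}
work = {"A": 32, "B": 128, "C": 64, "D": 32}

# Busy time for each core = elements / speed.
busy = {core: work[core] / speeds[core] for core in speeds}

# The slowest core determines the total parallel execution time.
total_time = max(busy.values())
utilization = sum(busy.values()) / (len(speeds) * total_time)

print(total_time)                   # 64.0 time units
print(round(utilization * 100, 1))  # 58.3 (percent)
```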
Problem 1.7
Consider a program for multiplying two large-scale N × N matrices, where N is the matrix size.
The sequential multiply time on a single server is T1 = cN³ minutes, where c is a constant
determined by the server used. An MPI-code parallel program requires Tn = cN³/n + dN²/n^0.5
minutes to complete execution on an n-server cluster system, where d is a constant determined
by the MPI version used. Assume the program has a zero sequential bottleneck (α = 0). The
second term in Tn accounts for the total message-passing overhead experienced by n servers.
Answer the following questions for a given cluster configuration with n = 64 servers, c = 0.8,
and d = 0.1. Parts (a, b) have a fixed workload corresponding to the matrix size N = 15,000.
Parts (c, d) have a scaled workload associated with an enlarged matrix size N′ = n^(1/3) N = 64^(1/3)
× 15,000 = 4 × 15,000 = 60,000. Assume the same cluster configuration to process both
workloads. Thus, the system parameters n, c, and d stay unchanged. Running the scaled
workload, the overhead also increases with the enlarged matrix size N′.
a. Using Amdahl’s law, calculate the speedup of the n-server cluster over a single server.
b. What is the efficiency of the cluster system used in Part (a)?
c. Calculate the speedup in executing the scaled workload for an enlarged N′ × N′ matrix on the
same cluster configuration using Gustafson’s law.
d. Calculate the efficiency of running the scaled workload in Part (c) on the 64-processor cluster.
e. Compare the above speedup and efficiency results and comment on their implications.
Solution
A ) Speedup S is given by S = T/[αT + (1 – α)T/n] = 1/[α + (1 – α)/n], where T is the
total execution time and α is the fraction of the code that executes sequentially. In our question
we are given α = 0 and n = 64, so we can calculate the speedup by simply substituting
α and n into the equation: S = 1/[0 + (1 – 0)/64] = 1/(1/64) = 64.
B ) Efficiency E = S/n = 64/64 = 100%.
C) Given
c = 0.8, d = 0.1, N′ = n^(1/3) N = 60,000, n = 64
Speedup S′ = time taken by a single server / time taken by the cluster
= cN′³ / (cN′³/n + dN′²/n^0.5)
= 1.728 × 10^14 / (2.7 × 10^12 + 4.5 × 10^7) ≈ 64.
D ) Efficiency E’ = α / n + (1 – α), here we have α = 0, and n=64 so, E’ = 0/64 +(1-0) = 100%
E ) In both cases the speedup and the efficiency of the fixed workload and the scaled
workload are the same, i.e., efficiency E = 100% and speedup S = 64.
In the above two cases, since α = 0, we are able to get the maximum speedup and an efficiency
of 100% for both the fixed and the scaled workloads.
Therefore, if α were greater than zero, even a large cluster would not achieve the maximum speedup.
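As a quick numerical check (a sketch, not part of the original solution): plugging the given n, c, and d into T1 and Tn gives the actual speedup including the message-passing overhead term, confirming that the overhead barely dents the ideal speedup of 64 for either workload:

```python
# Parameters from the problem: n servers, constants c and d.
n, c, d = 64, 0.8, 0.1

def t_seq(N):
    # Sequential time on one server: T1 = c * N^3 minutes.
    return c * N**3

def t_par(N):
    # Parallel time on the n-server cluster: Tn = c*N^3/n + d*N^2/sqrt(n).
    return c * N**3 / n + d * N**2 / n**0.5

for N in (15_000, 60_000):  # fixed workload, then scaled workload N' = n^(1/3) * N
    S = t_seq(N) / t_par(N)  # actual speedup, overhead included
    E = S / n                # efficiency
    print(N, S, E)           # both speedups come out just below 64
```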
Problem 1.13
Characterize the following three cloud computing models:
Problem 1.15
Briefly explain the following terms associated with network threats or security defense in a
distributed computing system:
a. Denial of service ( DoS )
b. Trojan horse
c. Network worm
d. Service spoofing
e. Authorization
f. Authentication
g. Data integrity
h. Confidentiality
Solution
A ) A Denial-of-Service (DoS) attack is an attack meant to shut down a machine or network,
making it inaccessible to its intended users. DoS attacks accomplish this by flooding the target with
traffic, or by sending it information that triggers a crash.
B ) A Trojan horse is a type of malware that is downloaded onto a computer disguised as a
legitimate program. The attacker typically uses social engineering to hide malicious code
within legitimate-looking software in order to gain access to the user's system.
C ) Network worms are stand-alone malicious programs that can self-replicate and propagate
independently as soon as they have breached a system. Network worms do not require activation
or any human intervention to execute or spread their code.
D ) Spoofing is a type of scam in which a criminal disguises an email address, display name, phone
number, text message, or website URL to convince a target that they are interacting with a known,
trusted source.
E ) Authorization is the process of determining which resources and operations an
authenticated client is permitted to access.
F ) Authentication is the process of determining the identity of a client. Cloud Storage,
for example, uses OAuth 2.0 for API authentication and authorization.
G ) Data integrity is the accuracy, completeness, and quality of data as it’s maintained over time
and across formats.
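As a small illustration (not from the original text; the data values are hypothetical), a cryptographic checksum is one common way to verify that data has not been altered in storage or transit:

```python
import hashlib

def checksum(data: bytes) -> str:
    # A SHA-256 digest acts as a fingerprint of the data.
    return hashlib.sha256(data).hexdigest()

original = b"customer_id=42,balance=100.00"
stored_digest = checksum(original)

# Later, recompute the digest; any change to the data changes the digest.
assert checksum(original) == stored_digest                           # intact
assert checksum(b"customer_id=42,balance=999.00") != stored_digest   # tampered
```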
H ) Confidentiality in computing ensures data is secured and encrypted against risks such as
malicious insiders, network vulnerabilities, or any threat to hardware- or software-based
technology that could be compromised. A confidential cloud is a secure confidential-computing
environment formed within one or more public clouds.
Problem 1.17
Compare GPU and CPU chips in terms of their strengths and weaknesses. In particular, discuss the
trade-offs between power efficiency, programmability, and performance. Also compare various
MPP architectures in processor selection, performance target, efficiency, and packaging constraints.
Solution
Central Processing Unit (CPU):
The CPU is known as the brain of every embedded system. It comprises the arithmetic logic
unit (ALU), used to store information and perform calculations quickly, and the control
unit (CU), which performs instruction sequencing and branching. The CPU interacts with
other computer components, such as memory and input/output, to execute instructions.
Graphics Processing Unit (GPU):
The GPU was originally used to render images in computer games. A GPU is faster than a CPU
for parallel work and emphasizes high throughput. It is generally incorporated into a system
alongside the CPU, sharing RAM with it, which suits most computing tasks. It contains many
more ALU units than a CPU.
The basic difference between a CPU and a GPU is that a CPU emphasizes low latency, whereas
a GPU emphasizes high throughput.
GPUs are used in nearly every computer, and some GPU features have also been integrated
into certain CPUs. Traditional CPUs are structured with only a few cores.
For example, the Xeon X5670 CPU has 6 cores. However, a modern GPU chip can be
built with hundreds of processing cores.
Unlike CPUs, GPUs have a throughput architecture that exploits massive parallelism
by executing many concurrent threads slowly, instead of executing a single long
thread very quickly as a conventional microprocessor does.
Lately, parallel GPUs and GPU clusters have been attracting a lot of attention
compared with CPUs, which offer only limited parallelism.
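The latency-versus-throughput trade-off can be sketched numerically. The figures below are hypothetical, chosen only for illustration: a CPU-like chip with a few fast cores against a GPU-like chip with many slow cores.

```python
# Hypothetical chips: a "CPU" with 6 fast cores vs a "GPU" with 512 slow cores.
# speed_per_core = tasks completed per unit time by one core.

def throughput(cores: int, speed_per_core: float) -> float:
    # Aggregate tasks per unit time across all cores.
    return cores * speed_per_core

def latency(speed_per_core: float) -> float:
    # Time for ONE core to finish ONE task.
    return 1 / speed_per_core

print(throughput(6, 4.0), throughput(512, 0.5))  # 24.0 vs 256.0: GPU wins on throughput
print(latency(4.0), latency(0.5))                # 0.25 vs 2.0: CPU wins on latency
```

The point of the sketch: the many-slow-cores design finishes far more tasks per unit time overall, yet any single task takes longer than on a fast core, which matches the low-latency versus high-throughput distinction above.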