
MASTER OF COMPUTER

APPLICATIONS

MCAD22E3
CLOUD COMPUTING

Semester – III

DIRECTORATE OF DISTANCE EDUCATION
SRM Institute of Science and Technology,
Potheri, Chengalpattu District 603203, Tamil Nadu, India.
Phone: 044 – 27417040 / 41
Website: www.srmist.edu.in / Email: [email protected]

Course Writer(s): Dr. R. Jayashree

Information contained in this book has been obtained by its Author(s) from sources
believed to be reliable and is correct to the best of their knowledge. However, the
Publisher and the Author(s) shall in no event be liable for any errors, omissions or
damages arising out of this information, and specifically disclaim any implied
warranties of merchantability or fitness for any particular purpose.

DIRECTORATE OF DISTANCE EDUCATION
SRM Institute of Science and Technology,
Potheri, Chengalpattu District 603203, Tamil Nadu, India.
Phone: 044 – 27417040 / 41
Website: www.srmist.edu.in / Email: [email protected]

MCAD22E3
CLOUD COMPUTING
SEMESTER – III

Course Code: MCAD22E3
Course Title: CLOUD COMPUTING
L: 3   T: 0   P: 2   Total: 5   C: 4

INSTRUCTIONAL OBJECTIVES                                                Student Outcomes
At the end of this course the learner is expected:
1. To understand the concepts of Cloud Computing and learn about        a
   various public cloud services.
2. To explore Web Services and Service Oriented Architecture.           a
3. To learn about Cloud Management Products, Cloud Storage and          e, l, k
   Cloud Security.

UNIT - 1: Introduction to Distributed Systems


Introduction to Distributed Systems – Characteristics - Issues in Distributed
Systems - Distributed System Model - Request/Reply Protocols – RMI - Logical
Clocks and Causal Ordering of Events - RPC - Election Algorithm - Distributed
Mutual Exclusion - Distributed Deadlock Detection Algorithms

UNIT - 2: Introduction to Cloud Computing

Introduction to Cloud Computing - Evolution of Cloud Computing - Cloud


Characteristics- Elasticity in Cloud - On-demand Provisioning - NIST Cloud
Computing Reference Architecture - Architectural Design Challenges -
Deployment Models: Public, Private and Hybrid Clouds - Service Models: IaaS-
PaaS – SaaS - Benefits of Cloud Computing

UNIT - 3: Introduction to Web Service


Introduction to Web Service and Service Oriented Architecture - SOAP – REST –
Basics of Virtualization - Full and Para Virtualization - Implementation Levels of
Virtualization - Tools and Mechanisms - Virtualization of CPU - Memory – I/O
Devices - Desktop Virtualization - Server Virtualization

UNIT -4: Cloud Management Products

Resource Provisioning and Methods - Cloud Management Products - Cloud


Storage – Provisioning Cloud Storage - Managed and Unmanaged Cloud Storage -
Cloud Security Overview - Cloud Security Challenges - Architecture Design –
Virtual Machine Security - Data Security

UNIT - 5: Google App Engine and AWS

HDFS MapReduce - Google App Engine (GAE) - Programming Environment for
GAE - Architecture of GFS - Case Studies: Openstack, Heroku and Docker
Containers - Amazon EC2 – AWS - Microsoft Azure - Google Compute Engine

TEXT BOOKS
1. Andrew S. Tanenbaum, Maarten Van Steen, “Distributed Systems - Principles and Paradigms”, Second Edition,
Pearson, 2006.
2. Buyya R., Broberg J., Goscinski A., “Cloud Computing: Principles and Paradigms”, John Wiley & Sons, 2011.

REFERENCES:

1. Kai Hwang, Geoffrey C Fox, Jack G Dongarra, "Distributed and Cloud Computing, From Parallel Processing to
the Internet of Things", Morgan Kaufmann Publishers, 2012.
2. Mukesh Singhal, "Advanced Concepts In Operating Systems", McGraw Hill Series in Computer Science, 1994.
3. John W. Rittinghouse, James F. Ransome, "Cloud Computing: Implementation, Management, and Security", CRC Press, 2010.

CONTENTS

Module 1
Introduction to Distributed Systems – Characteristics - Issues in Distributed
Systems - Distributed System Model - Request/Reply Protocols – RMI

Module 2
Logical Clocks and Causal Ordering of Events – CAP Theorem - Election Algorithm
- Distributed Mutual Exclusion - Distributed Deadlock Detection Algorithms

Module 3
Introduction to Cloud Computing - Evolution of Cloud Computing - Cloud
Characteristics- Elasticity in Cloud - On-demand Provisioning - NIST Cloud
Computing Reference Architecture - Architectural Design Challenges

Module 4
Deployment Models: Public, Private and Hybrid Clouds - Service Models: IaaS-
PaaS – SaaS - Benefits of Cloud Computing - Disadvantages of cloud computing

Module 5
Introduction to Web Service and Service Oriented Architecture - SOAP and REST
– Basics of Virtualization - Full and Para Virtualization - Implementation Levels
of Virtualization

Module 6
Tools and Mechanisms - Virtualization of CPU - Memory – I/O Devices - Desktop
Virtualization - Server Virtualization

Module 7
Resource Provisioning and Methods - Cloud Management Products - Cloud
Storage – Provisioning Cloud Storage - Managed and Unmanaged Cloud Storage

Module 8
Cloud Security Overview - Cloud Security Challenges - Architecture Design –
Virtual Machine Security - Data Security

Module 9
HDFS MapReduce - Google App Engine (GAE) - Programming Environment for GAE

Module 10
Case Studies: Openstack, Heroku and Docker Containers - Amazon EC2 – AWS -
Microsoft Azure - Google Compute Engine

MCAD22E3 CLOUD COMPUTING

MODULE 1

1.1. Introduction to Distributed Systems

1.2. Characteristics

1.3. Issues in Distributed Systems

1.4. Distributed System Model

1.5. Request/Reply Protocols

1.6. RMI
1.1. Introduction to Distributed Systems

Distributed Systems

A distributed system contains multiple nodes that are physically separate but linked
together using the network. All the nodes in this system communicate with each other
and handle processes in tandem. Each of these nodes contains a small part of the
distributed operating system software.
A diagram to better explain the distributed system is −

Types of Distributed Systems

The nodes in the distributed systems can be arranged in the form of client/server
systems or peer to peer systems. Details about these are as follows –

a) Client/Server Systems
In client server systems, the client requests a resource and the server provides that
resource. A server may serve multiple clients at the same time while a client is in
contact with only one server. Both the client and server usually communicate via a
computer network and so they are a part of distributed systems.
b) Peer to Peer Systems
The peer to peer systems contain nodes that are equal participants in data sharing. All
the tasks are equally divided between all the nodes. The nodes interact with each other
as required and share resources. This is done with the help of a network.
Advantages of Distributed Systems

Some advantages of Distributed Systems are as follows −

● All the nodes in the distributed system are connected to each other. So nodes
can easily share data with other nodes.
● More nodes can easily be added to the distributed system i.e. it can be scaled
as required.
● Failure of one node does not lead to the failure of the entire distributed system.
Other nodes can still communicate with each other.
● Resources like printers can be shared with multiple nodes rather than being
restricted to just one.

Disadvantages of Distributed Systems

Some disadvantages of Distributed Systems are as follows −

● It is difficult to provide adequate security in distributed systems because the


nodes as well as the connections need to be secured.
● Some messages and data can be lost in the network while moving from one
node to another.
● The database connected to the distributed systems is quite complicated and
difficult to handle as compared to a single user system.
● Overloading may occur in the network if all the nodes of the distributed
system try to send data at once.

1.2. Characteristics

Key Characteristics of Distributed Systems

A distributed system is a system in which components are located on


different networked computers, which can communicate and coordinate their actions
by passing messages to one another. The components interact with one another in
order to achieve a common goal.

Key characteristics of distributed systems are


● Resource Sharing

Resource sharing means that the existing resources in a distributed system can be
accessed or remotely accessed across multiple computers in the system.

Computers in distributed systems share resources like hardware (disks and

printers), software (files, windows and data objects) and data.

Hardware resources are shared for reductions in cost and convenience. Data is shared
for consistency and exchange of information.

Resources are managed by a software module known as a resource manager. Every


resource has its own management policies and methods.

● Heterogeneity

In distributed systems components can have variety and differences in Networks,


Computer hardware, Operating systems, Programming languages and
implementations by different developers.

● Openness

Openness is concerned with extensions and improvements of distributed systems. The

distributed system must be open in terms of hardware and software. In order to
make a distributed system open:

1. A detailed and well-defined interface of components must be published.

2. The interfaces of components should be standardized.

3. New components must be easily integrated with existing components.

● Concurrency

Concurrency is a property of a system representing the fact that multiple activities are
executed at the same time. The concurrent execution of activities takes place in
different components running on multiple machines as part of a distributed system. In
addition, these activities may perform some kind of interactions among them.
Concurrency reduces the latency and increases the throughput of the distributed
system.

● Scalability

Scalability is mainly concerned about how the distributed system handles


the growth as the number of users for the system increases. Mostly we scale the
distributed system by adding more computers to the network. Components should not
need to be changed when we scale the system, and they should be designed in such a
way that they are scalable.

● Fault Tolerance

In a distributed system, hardware, software, or the network can fail. The system
must be designed in such a way that it remains available even after something
has failed.

● Transparency

Distributed systems should be perceived by users and application programmers as a


whole rather than as a collection of cooperating components. Transparency can be of
various types like access, location, concurrency, replication, etc.

1.3. Issues in Distributed Systems

Banker’s Algorithm in Operating System

The banker’s algorithm is a resource allocation and deadlock avoidance algorithm that
tests for safety by simulating the allocation for predetermined maximum possible
amounts of all resources, then makes an “s-state” check to test for possible activities,
before deciding whether allocation should be allowed to continue.

a) Why is Banker’s algorithm named so?

Banker’s algorithm is named so because it is used in the banking system to check whether

a loan can be sanctioned to a person or not.
Suppose there are n account holders in a bank and the total sum of their
money is S. If a person applies for a loan, then the bank first subtracts the loan amount
from the total money that the bank has, and only if the remaining amount is greater than S
is the loan sanctioned. It is done because if all the account holders come to
withdraw their money, the bank can easily do so.

In other words, the bank would never allocate its money in such a way that it can no
longer satisfy the needs of all its customers. The bank would try to be in safe state
always.
Following Data structures are used to implement the Banker’s Algorithm:

Let ‘n’ be the number of processes in the system and ‘m’ be the number of resources
types.

Available :
● It is a 1-d array of size ‘m’ indicating the number of available resources of each
type.
● Available[ j ] = k means there are ‘k’ instances of resource type Rj

Max :
● It is a 2-d array of size ‘n*m’ that defines the maximum demand of each process
in a system.
● Max[i, j ] = k means process Pi may request at most ‘k’ instances of resource
type Rj.

Allocation :
● It is a 2-d array of size ‘n*m’ that defines the number of resources of each type
currently allocated to each process.
● Allocation[i, j ] = k means process Pi is currently allocated ‘k’ instances of
resource type Rj

Need :
● It is a 2-d array of size ‘n*m’ that indicates the remaining resource need of each
process.
● Need [ i, j ] = k means process Pi currently needs ‘k’ instances of resource type Rj
for its execution.

● Need [ i, j ] = Max [ i, j ] – Allocation [ i, j ]

Allocationi specifies the resources currently allocated to process Pi and Needi specifies
the additional resources that process Pi may still request to complete its task.
Banker’s algorithm consists of Safety algorithm and Resource request algorithm
b)Safety Algorithm

The algorithm for finding out whether or not a system is in a safe state can be
described as follows:

1) Let Work and Finish be vectors of length ‘m’ and ‘n’ respectively.
Initialize: Work = Available
Finish[i] = false; for i=1, 2, 3, 4….n
2) Find an i such that both
a) Finish[i] = false
b) Needi <= Work
if no such i exists goto step (4)
3) Work = Work + Allocation[i]
Finish[i] = true
goto step (2)
4) if Finish [i] = true for all i
then the system is in a safe state

c)Resource-Request Algorithm

Let Requesti be the request array for process Pi. Requesti [j] = k means process
Pi wants k instances of resource type Rj. When a request for resources is made by
process Pi, the following actions are taken:

1) If Requesti <= Needi


Goto step (2) ; otherwise, raise an error condition, since the process has
exceeded its maximum claim.
2) If Requesti <= Available
Goto step (3); otherwise, Pi must wait, since the resources are not available.
3) Have the system pretend to have allocated the requested resources to process Pi by
modifying the state as follows:
Available = Available – Requesti
Allocationi = Allocationi + Requesti
Needi = Needi– Requesti
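
In code form, a minimal sketch of this resource-request check is given below. It assumes the same Available, Allocation and Need structures as the Java program later in this section; the class name ResourceRequest and the method name tryRequest are illustrative, and the final safety check is only indicated by a comment.

// A sketch of the resource-request step of the Banker's algorithm.
public class ResourceRequest {
    int m = 3;            // number of resource types
    int[] avail;          // Available vector
    int[][] need, alloc;  // Need and Allocation matrices

    // Returns true if process i's request can be provisionally granted.
    boolean tryRequest(int i, int[] request) {
        for (int j = 0; j < m; j++) {
            if (request[j] > need[i][j]) {
                // Step 1: the process has exceeded its maximum claim.
                throw new IllegalArgumentException("process exceeded its maximum claim");
            }
        }
        for (int j = 0; j < m; j++) {
            if (request[j] > avail[j]) {
                return false;   // Step 2: resources not available, Pi must wait
            }
        }
        // Step 3: pretend to allocate. A full implementation would now run the
        // safety algorithm and roll these updates back if the new state is unsafe.
        for (int j = 0; j < m; j++) {
            avail[j] -= request[j];
            alloc[i][j] += request[j];
            need[i][j] -= request[j];
        }
        return true;
    }

    public static void main(String[] args) {
        ResourceRequest rr = new ResourceRequest();
        rr.avail = new int[]{3, 3, 2};
        rr.alloc = new int[][]{{0, 1, 0}, {2, 0, 0}, {3, 0, 2}, {2, 1, 1}, {0, 0, 2}};
        rr.need  = new int[][]{{7, 4, 3}, {1, 2, 2}, {6, 0, 0}, {0, 1, 1}, {4, 3, 1}};
        // P1 requests (1, 0, 2): within its Need and within Available, so it is granted.
        System.out.println("Grant P1 (1,0,2)? " + rr.tryRequest(1, new int[]{1, 0, 2}));
    }
}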

d)Example:

Considering a system with five processes P0 through P4 and three resources of type A,
B, C. Resource type A has 10 instances, B has 5 instances and type C has 7 instances.
Suppose at time t0 the following snapshot of the system has been taken:

Process    Allocation     Max         Available
           A  B  C        A  B  C     A  B  C
P0         0  1  0        7  5  3     3  3  2
P1         2  0  0        3  2  2
P2         3  0  2        9  0  2
P3         2  1  1        2  2  2
P4         0  0  2        4  3  3

Question1. What will be the content of the Need matrix?

Need [i, j] = Max [i, j] – Allocation [i, j]

So, the content of the Need matrix is:

Process    Need
           A  B  C
P0         7  4  3
P1         1  2  2
P2         6  0  0
P3         0  1  1
P4         4  3  1
Question2. Is the system in a safe state? If Yes, then what is the safe sequence?

Applying the Safety algorithm on the given system, the processes can finish in the
order P1, P3, P4, P0, P2 (at each step the chosen process's Need is at most the current
Work vector), so the system is in a safe state and <P1, P3, P4, P0, P2> is a safe sequence.


Question3. What will happen if process P1 requests one additional instance of
resource type A and two instances of resource type C?

Here Request1 = (1, 0, 2). Since Request1 <= Need1 and Request1 <= Available, the system
pretends to grant it, giving Available = (2, 3, 0), Allocation1 = (3, 0, 2) and
Need1 = (0, 2, 0). We must then determine whether this new system state is safe. To do so,
we again execute the Safety algorithm on the above data structures, which again yields the
safe sequence <P1, P3, P4, P0, P2>.
Hence the new system state is safe, so we can immediately grant the request for
process P1.

e)Code for Banker’s Algorithm in Java

//Java Program for Bankers Algorithm


public class GfGBankers
{
int n = 5; // Number of processes
int m = 3; // Number of resources
int need[][] = new int[n][m];
int [][]max;
int [][]alloc;
int []avail;
int safeSequence[] = new int[n];
void initializeValues()
{
// P0, P1, P2, P3, P4 are the Process names here
// Allocation Matrix
alloc = new int[][] { { 0, 1, 0 }, //P0
{ 2, 0, 0 }, //P1
{ 3, 0, 2 }, //P2
{ 2, 1, 1 }, //P3
{ 0, 0, 2 } }; //P4
// MAX Matrix
max = new int[][] { { 7, 5, 3 }, //P0
{ 3, 2, 2 }, //P1
{ 9, 0, 2 }, //P2
{ 2, 2, 2 }, //P3
{ 4, 3, 3 } }; //P4

// Available Resources
avail = new int[] { 3, 3, 2 };
}
void isSafe()
{
int count=0;
//visited array to find the already allocated process
boolean visited[] = new boolean[n];
for (int i = 0;i< n; i++)
{
visited[i] = false;
}
//work array to store the copy of available resources
int work[] = new int[m];
for (int i = 0;i< m; i++)
{
work[i] = avail[i];
}

while (count<n)
{
boolean flag = false;
for (int i = 0;i< n; i++)
{
if (visited[i] == false)
{
int j;
for (j = 0;j< m; j++)
{
if (need[i][j] > work[j])
break;
}
if (j == m)
{
safeSequence[count++]=i;
visited[i]=true;
flag=true;

for (j = 0;j< m; j++)


{
work[j] = work[j]+alloc[i][j];
}
}
}
}
if (flag == false)
{
break;
}
}
if (count < n)
{
System.out.println("The System is UnSafe!");
}
else
{
//System.out.println("The given System is Safe");
System.out.println("Following is the SAFE Sequence");
for (int i = 0;i< n; i++)
{
System.out.print("P" + safeSequence[i]);
if (i != n-1)
System.out.print(" -> ");
}
}
}

void calculateNeed()
{
for (int i = 0;i< n; i++)
{
for (int j = 0;j< m; j++)
{
need[i][j] = max[i][j]-alloc[i][j];
}
}
}

public static void main(String[] args)


{
int i, j, k;
GfGBankers gfg = new GfGBankers();

gfg.initializeValues();
//Calculate the Need Matrix
gfg.calculateNeed();

// Check whether system is in safe state or not


gfg.isSafe();
}
}

f) Python3

# Banker's Algorithm

# Driver code:
if __name__ == "__main__":

    # P0, P1, P2, P3, P4 are the Process names here

    n = 5  # Number of processes
    m = 3  # Number of resources

    # Allocation Matrix
    alloc = [[0, 1, 0], [2, 0, 0],
             [3, 0, 2], [2, 1, 1], [0, 0, 2]]
    # MAX Matrix
    max = [[7, 5, 3], [3, 2, 2],
           [9, 0, 2], [2, 2, 2], [4, 3, 3]]
    avail = [3, 3, 2]  # Available Resources
    f = [0] * n
    ans = [0] * n
    ind = 0
    for k in range(n):
        f[k] = 0
    # Need[i][j] = Max[i][j] - Allocation[i][j]
    need = [[0 for i in range(m)] for i in range(n)]
    for i in range(n):
        for j in range(m):
            need[i][j] = max[i][j] - alloc[i][j]
    y = 0
    for k in range(5):
        for i in range(n):
            if (f[i] == 0):
                flag = 0
                for j in range(m):
                    if (need[i][j] > avail[j]):
                        flag = 1
                        break

                # If every need of process i can be met, add it to the safe
                # sequence and release its allocated resources.
                if (flag == 0):
                    ans[ind] = i
                    ind += 1
                    for y in range(m):
                        avail[y] += alloc[i][y]
                    f[i] = 1

    print("Following is the SAFE Sequence")

    for i in range(n - 1):
        print(" P", ans[i], " ->", sep="", end="")
    print(" P", ans[n - 1], sep="")
1.4. Distributed System Model
The Distributed System Models are as follows:

1. Architectural Models
2. Interaction Models
3. Fault Models

1. Architectural Models

The architectural model describes how responsibilities are distributed between system components

and how these components are placed.

a) Client-server model
The system is structured as a set of processes, called servers, that offer services to the
users, called clients.

● The client-server model is usually based on a simple request/reply protocol,


implemented with send/receive primitives or using remote procedure calls
(RPC) or remote method invocation (RMI):
● The client sends a request (invocation) message to the server asking for some
service;
● The server does the work and returns a result (e.g. the data requested) or an
error code if the work could not be performed.

A server can itself request services from other servers; thus, in this new relation,
the server itself acts like a client.

b) Peer-to-peer

All processes (objects) play similar role.

● Processes (objects) interact without particular distinction between clients and


servers.
● The pattern of communication depends on the particular application.

● A large number of data objects are shared; any individual computer holds only
a small part of the application database.
● Processing and communication loads for access to objects are distributed
across many computers and access links.
● This is the most general and flexible model.

● Peer-to-Peer tries to solve some of the above problems.

● It distributes shared resources widely, sharing computing and communication
loads.

c) Problems with peer-to-peer:

● High complexity due to the need to

o cleverly place individual objects,
o retrieve the objects, and
o maintain a potentially large number of replicas.
2. Interaction Model
The interaction model deals with time, i.e. process execution, message delivery,
clock drifts, etc.

a)Synchronous distributed systems

Main features:

● Lower and upper bounds on execution time of processes can be set.


● Transmitted messages are received within a known bounded time.
● Drift rates between local clocks have a known bound.

Important consequences:
1. In a synchronous distributed system there is a notion of global physical time
(with a known relative precision depending on the drift rate).
2. Only synchronous distributed systems have a predictable behavior in terms of
timing. Only such systems can be used for hard real-time applications.
3. In a synchronous distributed system it is possible and safe to use timeouts in
order to detect failures of a process or communication link.
4. It is difficult and costly to implement synchronous distributed systems.

b)Asynchronous distributed systems

● Many distributed systems (including those on the Internet) are asynchronous. -


No bound on process execution time (nothing can be assumed about speed,
load, and reliability of computers).
● No bound on message transmission delays (nothing can be assumed about
speed, load, and reliability of interconnections) - No bounds on drift rates
between local clocks.

Important consequences:

1. In an asynchronous distributed system there is no global physical time.


Reasoning can be only in terms of logical time.
2. Asynchronous distributed systems are unpredictable in terms of timing.
3. No timeouts can be used.
4. Asynchronous systems are widely and successfully used in practice.
5. In practice timeouts are used with asynchronous systems for failure detection.
6. However, additional measures have to be applied in order to avoid duplicated
messages, duplicated execution of operations, etc.
3. Fault Models

● Failures can occur both in processes and communication channels. The reason
can be both software and hardware faults.
● Fault models are needed in order to build systems with predictable behavior in
case of faults (systems which are fault tolerant).
● Such a system will function according to the predictions only as long as the
real faults behave as defined by the “fault model”.

1.5. Request/Reply Protocols

Request/Reply Communication

Queues are the key to connectionless communication. Each server is assigned an


Inter-Process Communication (IPC) message queue called a request queue and
each client is assigned a reply queue. Therefore, rather than establishing and
maintaining a connection with a server, a client application can send requests to the
server by putting those requests on the server's queue, and then check and retrieve
messages from the server by pulling messages from its own reply queue.
The request/reply model is used for both synchronous and asynchronous service
requests as described in the following topics.

a)Synchronous Messaging
In a synchronous call, a client sends a request to a server, which performs the
requested action while the client waits. The server then sends the reply to the
client, which receives the reply. This is known as Synchronous Request/Reply
Communication.

b) Asynchronous Messaging
In an asynchronous call, the client does not wait for a service request it has
submitted to finish before undertaking other tasks. Instead, after issuing a request,
the client performs additional tasks (which may include issuing more requests).
When a reply to the first request is available, the client retrieves it. This is known as
Asynchronous Request/Reply Communication.
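
As a rough illustration of this queue-based model, the sketch below stands in for the IPC request and reply queues with in-process java.util.concurrent queues; the class names RequestReplyDemo and Request are illustrative and not tied to any particular middleware. The first call blocks on the reply queue (synchronous), while the second does other work before collecting its reply (asynchronous).

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class RequestReplyDemo {

    // A request carries its payload plus the reply queue of the client that sent it.
    static class Request {
        final String payload;
        final BlockingQueue<String> replyQueue;
        Request(String payload, BlockingQueue<String> replyQueue) {
            this.payload = payload;
            this.replyQueue = replyQueue;
        }
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<Request> serverQueue = new LinkedBlockingQueue<>(); // server's request queue
        BlockingQueue<String> replyQueue = new LinkedBlockingQueue<>();   // this client's reply queue

        // Server: repeatedly take a request, do the work, put the result on the reply queue.
        Thread server = new Thread(() -> {
            try {
                while (true) {
                    Request req = serverQueue.take();
                    req.replyQueue.put("result of " + req.payload);
                }
            } catch (InterruptedException e) { /* shut down */ }
        });
        server.setDaemon(true);
        server.start();

        // Synchronous call: send the request, then block until the reply arrives.
        serverQueue.put(new Request("sync job", replyQueue));
        System.out.println("sync reply: " + replyQueue.take());

        // Asynchronous call: send the request, do other work, collect the reply later.
        serverQueue.put(new Request("async job", replyQueue));
        System.out.println("client does other work while the server processes the request");
        System.out.println("async reply: " + replyQueue.take());
    }
}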
1.6. RMI

RMI stands for Remote Method Invocation. It is a mechanism that allows an object
residing in one system (JVM) to access/invoke an object running on another JVM.
RMI is used to build distributed applications; it provides remote communication
between Java programs. It is provided in the package java.rmi.

a)Architecture of an RMI Application

In an RMI application, we write two programs, a server program (resides on the


server) and a client program (resides on the client).

● Inside the server program, a remote object is created and reference of that
object is made available for the client (using the registry).

● The client program requests the remote objects on the server and tries to
invoke its methods.
The following diagram shows the architecture of an RMI application.
Let us now discuss the components of this architecture.

● Transport Layer − This layer connects the client and the


server. It manages the existing connection and also
sets up new connections.
● Stub − A stub is a representation (proxy) of the remote
object at client. It resides in the client system; it acts
as a gateway for the client program.
● Skeleton − This is the object which resides on the server
side. The stub communicates with this skeleton to pass the request to the remote
object.

● RRL(Remote Reference Layer) − It is the layer which


manages the references made by the client to the
remote object.

b)Working of an RMI Application

The following points summarize how an RMI application


works −
● When the client makes a call to the remote object, it is received by the stub
which eventually passes this request to the RRL.

● When the client-side RRL receives the request, it invokes a method


called invoke() of the object remoteRef. It passes the request to the RRL on
the server side.
● The RRL on the server side passes the request to the Skeleton (proxy on the
server) which finally invokes the required object on the server.

● The result is passed all the way back to the client.

c)Marshalling and Unmarshalling

Whenever a client invokes a method that accepts parameters on a remote object, the
parameters are bundled into a message before being sent over the network. These
parameters may be of primitive type or objects. In case of primitive type, the
parameters are put together and a header is attached to it. In case the parameters are
objects, then they are serialized. This process is known as marshalling.
At the server side, the packed parameters are unbundled and then the required
method is invoked. This process is known as unmarshalling.
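
As a rough illustration of the idea, the sketch below marshals an object parameter into a byte stream with Java serialization and unmarshals it again; the parameter class Point is illustrative, and in real RMI this bundling is performed automatically by the stub and skeleton.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class MarshallingDemo {
    // Object parameters must be serializable to be marshalled into the request message.
    static class Point implements Serializable {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static void main(String[] args) throws Exception {
        // Marshalling: bundle the parameter into a byte stream (the "message").
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Point(3, 4));
        }

        // Unmarshalling: the receiver rebuilds the parameter from the byte stream.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Point p = (Point) in.readObject();
            System.out.println("unmarshalled: (" + p.x + ", " + p.y + ")");
        }
    }
}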

d)RMI Registry

RMI registry is a namespace in which all server objects are placed. Each time the
server creates an object, it registers this object with the RMI registry
(using the bind() or rebind() methods). These are registered using a unique name
known as the bind name.
To invoke a remote object, the client needs a reference of that object. At that time,
the client fetches the object from the registry using its bind name
(using lookup() method).
The following illustration explains the entire process –
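
In code form, a minimal sketch of this bind/lookup flow might look as follows; the interface name HelloService, the bind name "HelloService" and the registry port 1099 are illustrative, and error handling is trimmed for brevity.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// The remote interface: every remote method declares RemoteException.
interface HelloService extends Remote {
    String sayHello(String name) throws RemoteException;
}

// Server side: create the remote object and register it under a bind name.
class HelloServer implements HelloService {
    public String sayHello(String name) { return "Hello, " + name; }

    public static void main(String[] args) throws Exception {
        HelloService stub =
            (HelloService) UnicastRemoteObject.exportObject(new HelloServer(), 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("HelloService", stub);       // register with the RMI registry
        System.out.println("HelloService bound");
    }
}

// Client side: look up the remote object by its bind name and invoke it.
class HelloClient {
    public static void main(String[] args) throws Exception {
        Registry registry = LocateRegistry.getRegistry("localhost", 1099);
        HelloService service = (HelloService) registry.lookup("HelloService");
        System.out.println(service.sayHello("RMI"));  // remote method invocation
    }
}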
e)Goals of RMI

Following are the goals of RMI −

● To minimize the complexity of the application.

● To preserve type safety.

● Distributed garbage collection.

● Minimize the difference between working with local and remote objects.
MCAD22E3 CLOUD COMPUTING

MODULE 2

2.1. Logical Clocks and Causal Ordering of Events

2.2. CAP Theorem

2.3. Election Algorithm

2.4. Distributed Mutual Exclusion

2.5. Distributed Deadlock Detection Algorithms


2.1. Logical Clocks and Causal Ordering of
Events

Logical Clock in Distributed System

● Logical Clocks refer to implementing a protocol on all machines within your


distributed system, so that the machines are able to maintain consistent
ordering of events within some virtual timespan.

● A logical clock is a mechanism for capturing chronological and causal


relationships in a distributed system.

● Distributed systems may have no physically synchronous global clock, so a


logical clock allows global ordering on events from different processes in such
systems.

a) Example

When we go out, we plan in advance which place we will visit first, which second, and
so on. We do not go to the second place first and then to the first place; we always follow
the order that was planned beforehand. In a similar way, the operations on our PCs
should be carried out one by one in an organized way.

Suppose we have more than 10 PCs in a distributed system and every PC is doing its own
work; how do we make them work together? The solution to this is the LOGICAL CLOCK.

Method-1:

● One approach to ordering events across processes is to try to synchronize the clocks.

● This would mean that if one PC shows 2:00 pm then every PC should show exactly
the same time, which is not feasible; not every clock can be kept in sync.
Therefore we cannot follow this method.
Method-2:

Another approach is to assign Timestamps to events.

● Taking the example into consideration, this means that if we label the first place
as 1, the second place as 2, the third place as 3 and so on, then we always know that
the first place comes first, and so on. Similarly, if we give each
PC its own number, then the work can be organized so that the 1st PC
completes its process first, then the second, and so on. But timestamps will
only work as long as they obey causality.

b)Causality

Causality is fully based on the HAPPENED-BEFORE RELATIONSHIP.

● Taking a single PC, if two events A and B occur one after the other then TS(A) <
TS(B). If A has a timestamp of 1, then B should have a timestamp greater than 1; only
then does the happened-before relationship hold.

● Taking 2 PCs, with event A on P1 (PC.1) and event B on P2 (PC.2), the condition
is again TS(A) < TS(B). For example, suppose you send a message to someone at
2:00:00 pm and the other person receives it at 2:00:02 pm. Then it is obvious that
TS(sender) < TS(receiver).

c)Properties Derived from Happen Before Relationship


● Transitive Relation
If TS(A) < TS(B) and TS(B) < TS(C), then TS(A) < TS(C)

● Causally Ordered Relation


a -> b means that a occurs before b, and if there is any change in a it will
surely reflect on b.

● Concurrent Event
This means that not every process occurs one by one, some processes are made to
happen simultaneously i.e., A || B.

d)Causal ordering
Causal ordering is a vital tool for thinking about distributed systems. Once you
understand it, many other concepts become much simpler.
(i)The fundamental property of distributed systems:
Messages sent between machines may arrive zero or more times at any point after
they are sent
This is the sole reason that building distributed systems is hard.
For example, because of this property it is impossible for two computers
communicating over a network to agree on the exact time. You can send me a
message saying "it is now 10:00:00" but I don't know how long it took for that
message to arrive. We can send messages back and forth all day but we will never
know for sure that we are synchronized.
If we can't agree on the time then we can't always agree on what order things happen
in. Suppose I say "my user logged on at 10:00:00" and you say "my user logged on at
10:00:01". Maybe mine was first or maybe my clock is just fast relative to yours. The
only way to know for sure is if something connects those two events.
For example, if my user logged on and then sent your user an email and if you
received that email before your user logged on then we know for sure that mine was
first.
This concept is called causal ordering and is written like this:
A -> B (event A is causally ordered before event B)
Let's define it a little more formally. We model the world as follows: We have a
number of machines on which we observe a series of events. These events are either
specific to one machine (eg user input) or are communications between machines. We
define the causal ordering of these events by three rules:
If A and B happen on the same machine and A happens before B then A -> B

If I send you some message M and you receive it then (send M) -> (recv M)

If A -> B and B -> C then A -> C


We are used to thinking of ordering by time which is a total order - every pair of
events can be placed in some order. In contrast, causal ordering is only a partial
order - sometimes events happen with no possible causal relationship i.e. not (A -> B
or B -> A).
This diagram shows a nice way to picture these relationships.
On a single machine causal ordering is exactly the same as time ordering (actually, on
a multi-core machine the situation is more complicated, but let's forget about that for
now).
Between machines causal ordering is conveyed by messages. Since sending messages
is the only way for machines to affect each other this gives rise to a nice property:
If not(A -> B) then A cannot possibly have caused B
Since we don't have a single global time this is the only thing that allows us to reason
about causality in a distributed system. This is really important so let's say it again:
Communication bounds causality.
The lack of a total global order is not just an accidental property of computer systems,
it is a fundamental property of the laws of physics. I claimed that understanding
causal order makes many other concepts much simpler.

(ii)Clocks
Lamport clocks and Vector clocks are data-structures which efficiently approximate
the causal ordering and so can be used by programs to reason about causality.
If A -> B then LC_A < LC_B

If VC_A < VC_B then A -> B


Different types of vector clock trade-off compression vs accuracy by storing smaller
or larger portions of the causal history of an event.
⮚ Lamport clocks
• The algorithm follows some simple rules:
• A process increments its counter before each local event (e.g., message
sending event);
• When a process sends a message, it includes its counter value with the
message after executing step 1;
• On receiving a message, the counter of the recipient is updated, if necessary,
to the greater of its current counter and the timestamp in the received message.
The counter is then incremented by 1 before the message is considered
received
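A minimal sketch of these rules in Java follows; the class name LamportClock is illustrative and thread-safety is ignored for brevity.

public class LamportClock {
    private long counter = 0;

    // Rule 1: increment the counter before each local event.
    public long tick() {
        return ++counter;
    }

    // Rule 2: a send is a local event whose timestamp travels with the message.
    public long send() {
        return tick();
    }

    // Rule 3: on receive, take the max of the local counter and the message
    // timestamp, then increment once before the message counts as received.
    public long receive(long messageTimestamp) {
        counter = Math.max(counter, messageTimestamp);
        return ++counter;
    }

    public static void main(String[] args) {
        LamportClock p1 = new LamportClock();
        LamportClock p2 = new LamportClock();

        p1.tick();                   // local event on P1  -> LC(P1) = 1
        long ts = p1.send();         // P1 sends a message -> LC(P1) = 2
        long rc = p2.receive(ts);    // P2 receives it     -> LC(P2) = 3
        System.out.println("send timestamp = " + ts + ", receive timestamp = " + rc);
    }
}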
⮚ Vector clocks
• A vector clock is a data structure used for determining the partial ordering of
events in a distributed system and detecting causality violations.
• Just as in Lamport timestamps, inter-process messages contain the state of the
sending process's logical clock.
• A vector clock of a system of N processes is an array/vector of N logical
clocks, one clock per process; a local "largest possible values" copy of the
global clock-array is kept in each process.
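A minimal sketch of a vector clock follows; the class name VectorClock is illustrative. It also shows how the partial order is tested, so two unrelated events can be recognized as concurrent.

import java.util.Arrays;

public class VectorClock {
    private final int[] clock;
    private final int myId;

    public VectorClock(int n, int myId) {
        this.clock = new int[n];
        this.myId = myId;
    }

    // Local or send event: increment this process's own entry; a send attaches a copy.
    public int[] tick() {
        clock[myId]++;
        return clock.clone();
    }

    // Receive event: element-wise max with the incoming vector, then tick.
    public void receive(int[] incoming) {
        for (int i = 0; i < clock.length; i++) {
            clock[i] = Math.max(clock[i], incoming[i]);
        }
        clock[myId]++;
    }

    // A -> B iff VC_A <= VC_B element-wise and VC_A != VC_B; otherwise no causal order.
    public static boolean happenedBefore(int[] a, int[] b) {
        boolean strictlyLess = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strictlyLess = true;
        }
        return strictlyLess;
    }

    public static void main(String[] args) {
        VectorClock p0 = new VectorClock(2, 0);
        VectorClock p1 = new VectorClock(2, 1);

        int[] a = p0.tick();   // event A on P0: [1, 0]
        int[] b = p1.tick();   // event B on P1: [0, 1]
        System.out.println("A -> B? " + happenedBefore(a, b));   // false
        System.out.println("B -> A? " + happenedBefore(b, a));   // false
        System.out.println("A || B (concurrent): " + Arrays.toString(a) + " vs " + Arrays.toString(b));
    }
}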

(iii)Consistency
When mutable state is distributed over multiple machines each machine can receive
update events at different times and in different orders.
If the final state is dependent on the order of updates then the system must choose a
single serialisation of the events, imposing a global total order.
A distributed system is consistent exactly when the outside world can never observe
two different serialisations.

2.2. CAP Theorem


The CAP (Consistency-Availability-Partition tolerance) theorem also boils down to causality.
When a machine in a distributed system is asked to perform an action that depends on
its current state it must decide that state by choosing a serialisation of the events it has
seen. It has two options:

● Choose a serialisation of its current events immediately


● Wait until it is sure it has seen all concurrent events before choosing a
serialisation

The first choice risks violating consistency if some other machine makes the same
choice with a different set of events.
The second violates availability by waiting for every other machine that could
possibly have received a conflicting event before performing the requested action.
There is no need for an actual network partition to happen - the trade-off between
availability and consistency exists whenever communication between components is
not instant.
Ordering requires waiting
Even your hardware cannot escape this law. It provides the illusion of synchronous
access to memory at the cost of availability. If you want to write fast parallel programs
then you need to understand the messaging model used by the underlying hardware.
2.3. Election Algorithm

Election algorithm and distributed processing

a)Distributed Algorithm:

This is an algorithm that runs on a distributed system.

Distributed system is a collection of independent computers that do not share their


memory.

Each processor has its own memory and they communicate via communication
networks.

Communication in the network is implemented by a process on one machine

communicating with a process on another machine.

Many algorithms used in distributed system require a coordinator that performs


functions needed by other processes in the system.

Election algorithms are designed to choose a coordinator.

b)Election Algorithms:

Election algorithms choose a process from a group of processes to act as a coordinator.

If the coordinator process crashes for some reason, then a new coordinator is
elected on another processor.

Election algorithm basically determines where a new copy of coordinator should be


restarted.

Election algorithm assumes that every active process in the system has a unique
priority number.
The process with the highest priority will be chosen as the new coordinator. Hence, when a
coordinator fails, this algorithm elects the active process which has the highest priority
number. This number is then sent to every active process in the distributed system.
We have two election algorithms for two different configurations of distributed
system.

(i) The Bully Algorithm –


This algorithm applies to system where every process can send a message to every
other process in the system.

Algorithm – Suppose process P sends a message to the coordinator.


1. If the coordinator does not respond within a time interval T, then it is assumed
that the coordinator has failed.
2. Now process P sends an election message to every process with a higher priority
number.
3. It waits for responses, if no one responds for time interval T then process P elects
itself as a coordinator.
4. Then it sends a message to all lower priority number processes that it is elected as
their new coordinator.
5. However, if an answer is received within time T from any other process Q,
● (I) Process P again waits for time interval T’ to receive a message from
Q announcing that Q has been elected as coordinator.
● (II) If Q doesn’t respond within time interval T’ then it is assumed to have
failed and the algorithm is restarted.

Bully Election Algorithm


When any process notices that the coordinator is no longer responding to requests, it
initiates an ELECTION. The process holds an election as follows:

Example:

We start with 6 processes, all directly connected to each other. Process 6 is the
leader,
as it has the highest number.

Process 6 fails.
Process 3 notices that Process 6 does not respond. So it starts an
election, notifying those processes with ids greater than 3.

Both Process 4 and Process 5 respond, telling Process 3 that they'll take over from
here.
Process 4 sends election messages to both Process 5 and Process 6.

Only Process 5 answers and takes over the election.

Process 5 sends out only one election message to Process 6.


When Process 6 does not respond Process 5 declares itself the winner.

(ii). The Ring Algorithm –

This algorithm applies to systems organized as a ring (logically or physically). In this

algorithm we assume that the links between the processes are unidirectional and every
process can send messages only to the process on its right. The data structure that this
algorithm uses is the active list, a list that holds the priority numbers of all active
processes in the system.

Algorithm –
1. If process P1 detects a coordinator failure, it creates new active list which is
empty initially. It sends election message to its neighbour on right and adds
number 1 to its active list.
2. If process P2 receives the election message from the process on its left, it responds in 3 ways:
● (I) If the received message does not contain P2’s number in the active list, then P2
adds 2 to its active list and forwards the message.
● (II) If this is the first election message it has received or sent, P2 creates a new
active list with the numbers 1 and 2. It then sends election message 1 followed by
2.
● (III) If process P1 receives its own election message 1, then the active list for P1
now contains the numbers of all the active processes in the system. Process
P1 then picks the highest priority number from the list and elects it as the new
coordinator.

Token Ring Election Algorithm


When any process notices that the coordinator is not functioning, it builds an
ELECTION MESSAGE containing its own process number and sends the message to
its successor.

Example:
We start with 6 processes, connected in a logical ring. Process 6 is the leader, as it has
the highest number.

Process 6 fails

Process 3 notices that Process 6 does not respond. So it starts an election, sending a
message containing its id to the next node in the ring.
Process 5 passes the message on, adding its own id to the message.

Process 0 passes the message on, adding its own id to the message.

Process 1 passes the message on, adding its own id to the message.
Process 4 passes the message on, adding its own id to the message.

When Process 3 receives the message back, it knows the message has gone around the
ring, as its own id is in the list. Picking the highest id in the list, it starts the
coordinator message "5 is the leader" around the ring.
Process 5 passes on the coordinator message.

Process 0 passes on the coordinator message.

Process 1 passes on the coordinator message.

Process 4 passes on the coordinator message.


Process 3 receives the coordinator message, and stops it.

Bully Algorithm Code in Java:

import java.io.*;
import java.util.Scanner;

class Anele{
static int n;
static int pro[] = new int[100];
static int sta[] = new int[100];
static int co;

public static void main(String args[])throws IOException


{
System.out.println("Enter the number of process");
Scanner in = new Scanner(System.in);
n = in.nextInt();

int i,j,k,l,m;

for(i=0;i<n;i++)
{
System.out.println("For process "+(i+1)+":");
System.out.println("Status:");
sta[i]=in.nextInt();
System.out.println("Priority");
pro[i] = in.nextInt();
}

System.out.println("Which process will initiate election?");


int ele = in.nextInt();

elect(ele);
System.out.println("Final coordinator is "+co);
}

static void elect(int ele)


{
ele = ele-1;
co = ele+1;
for(int i=0;i<n;i++)
{
if(pro[ele]<pro[i])
{
System.out.println("Election message is sent from "+(ele+1)+" to "+(i+1));
if(sta[i]==1)
elect(i+1);
}
}
}
}

Output:
Enter the number of process
7
For process 1:
Status:
1
Priority
1
For process 2:
Status:
1
Priority
2
For process 3:
Status:
1
Priority
3
For process 4:
Status:
1
Priority
4
For process 5:
Status:
1
Priority
5
For process 6:
Status:
1
Priority
6
For process 7:
Status:
0
Priority
7
Which process will initiate election?
4
Election message is sent from 4 to 5
Election message is sent from 5 to 6
Election message is sent from 6 to 7
Election message is sent from 5 to 7
Election message is sent from 4 to 6
Election message is sent from 6 to 7
Election message is sent from 4 to 7
Final coordinator is 6

Ring Algorithm code in Java:

import java.util.Scanner;

class Process {
    public int id;
    public boolean active;

    public Process(int id) {
        this.id = id;
        active = true;
    }
}

public class Ring {
    int noOfProcesses;
    Process[] processes;
    Scanner sc;

    public Ring() {
        sc = new Scanner(System.in);
    }

    public void initialiseRing() {
        System.out.println("Enter no of processes");
        noOfProcesses = sc.nextInt();
        processes = new Process[noOfProcesses];
        for (int i = 0; i < processes.length; i++) {
            processes[i] = new Process(i);
        }
    }

    // Index of the highest-numbered active process.
    public int getMax() {
        int maxId = -99;
        int maxIdIndex = 0;
        for (int i = 0; i < processes.length; i++) {
            if (processes[i].active && processes[i].id > maxId) {
                maxId = processes[i].id;
                maxIdIndex = i;
            }
        }
        return maxIdIndex;
    }

    public void performElection() {
        System.out.println("Process no " + processes[getMax()].id + " fails");
        processes[getMax()].active = false;

        System.out.println("Election Initiated by");
        int initiatorProcess = sc.nextInt();

        // Pass the election message around the ring, skipping inactive processes.
        int prev = initiatorProcess;
        int next = (prev + 1) % noOfProcesses;
        while (true) {
            if (processes[next].active) {
                System.out.println("Process " + processes[prev].id + " pass Election("
                        + processes[prev].id + ") to " + processes[next].id);
                prev = next;
            }
            next = (next + 1) % noOfProcesses;
            if (next == initiatorProcess) {
                break;
            }
        }

        // The highest-numbered active process becomes the coordinator.
        System.out.println("Process " + processes[getMax()].id + " becomes coordinator");
        int coordinator = processes[getMax()].id;

        // Circulate the coordinator message around the ring.
        prev = coordinator;
        next = (prev + 1) % noOfProcesses;
        while (true) {
            if (processes[next].active) {
                System.out.println("Process " + processes[prev].id + " pass Coordinator("
                        + coordinator + ") message to process " + processes[next].id);
                prev = next;
            }
            next = (next + 1) % noOfProcesses;
            if (next == coordinator) {
                System.out.println("End Of Election");
                break;
            }
        }
    }

    public static void main(String[] arg) {
        Ring r = new Ring();
        r.initialiseRing();
        r.performElection();
    }
}

Sample output (5 processes, election initiated by process 0):

Enter no of processes
5
Process no 4 fails
Election Initiated by
0
Process 0 pass Election(0) to 1
Process 1 pass Election(1) to 2
Process 2 pass Election(2) to 3
Process 3 becomes coordinator
Process 3 pass Coordinator(3) message to process 0
Process 0 pass Coordinator(3) message to process 1
Process 1 pass Coordinator(3) message to process 2
End Of Election
2.4. Distributed Mutual Exclusion
Mutual exclusion in distributed system

Mutual exclusion is a concurrency control property which is introduced to prevent


race conditions. It is the requirement that a process cannot enter its critical section
while another concurrent process is currently present or executing in its critical
section, i.e. only one process is allowed to execute the critical section at any given
instant of time.

a)Mutual exclusion in single computer system Vs. distributed system:

In a single computer system, memory and other resources are shared between different
processes. The status of shared resources and the status of users is easily available in
the shared memory, so with the help of shared variables (for example, semaphores) the
mutual exclusion problem can be easily solved.

In distributed systems, we have neither shared memory nor a common physical clock,
and therefore we cannot solve the mutual exclusion problem using shared variables. To
eliminate the mutual exclusion problem in a distributed system, an approach based on
message passing is used.
A site in a distributed system does not have complete information about the state of the
system due to the lack of shared memory and a common physical clock.

b)Requirements of Mutual exclusion Algorithm:

● No Deadlock:

Two or more sites should not endlessly wait for a message that will never arrive.

● No Starvation:

Every site which wants to execute the critical section should get an opportunity to
execute it in finite time. No site should wait indefinitely to execute the critical
section while other sites are repeatedly executing the critical section.

● Fairness:

Each site should get a fair chance to execute the critical section. Requests to
execute the critical section must be executed in the order in which they are made,
i.e. critical section execution requests should be executed in the order of their
arrival in the system.
● Fault Tolerance:

In case of failure, it should be able to recognize it by itself in order to continue


functioning without any disruption.

c)Solution to distributed mutual exclusion:

As we know, shared variables or a local kernel cannot be used to implement mutual
exclusion in distributed systems. Message passing is a way to implement mutual
exclusion. Below are the three approaches based on message passing to implement
mutual exclusion in distributed systems:

1. Token Based Algorithm:


● A unique token is shared among all the sites.

● If a site possesses the unique token, it is allowed to enter its critical section
● This approach uses sequence number to order requests for the critical section.
● Each request for the critical section contains a sequence number. This sequence
number is used to distinguish old and current requests.
● This approach ensures mutual exclusion as the token is unique.

Example:
● Suzuki-Kasami’s Broadcast Algorithm
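
As a rough illustration of the token-based idea only (this is a simplified token-on-a-ring sketch, not the Suzuki-Kasami broadcast algorithm), the code below passes a single token object around a logical ring of in-process threads standing in for sites; a site may enter its critical section only while it holds the token, which is what guarantees mutual exclusion. All names are illustrative.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class TokenRingMutex {
    public static void main(String[] args) throws Exception {
        final int sites = 3;
        // One "incoming token" channel per site; the token itself is just a marker object.
        final List<BlockingQueue<Object>> channel = new ArrayList<>();
        for (int i = 0; i < sites; i++) channel.add(new ArrayBlockingQueue<>(1));

        for (int i = 0; i < sites; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    for (int round = 0; round < 2; round++) {
                        Object token = channel.get(id).take();        // wait for the unique token
                        System.out.println("Site " + id + " enters its critical section");
                        Thread.sleep(10);                             // critical section work
                        System.out.println("Site " + id + " leaves its critical section");
                        channel.get((id + 1) % sites).put(token);     // pass the token along the ring
                    }
                } catch (InterruptedException e) { /* stop */ }
            }).start();
        }

        channel.get(0).put(new Object());   // inject the single token at site 0
    }
}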

2. Non-token based approach:

● A site communicates with other sites in order to determine which sites should
execute critical section next. This requires exchange of two or more successive
round of messages among sites.
● This approach uses timestamps instead of sequence numbers to order requests
for the critical section.
● Whenever a site makes a request for the critical section, it gets a timestamp.
The timestamp is also used to resolve any conflict between critical section
requests.
● All algorithms which follow the non-token-based approach maintain a logical
clock. Logical clocks get updated according to Lamport’s scheme.

Example:
● Lamport's algorithm, Ricart–Agrawala algorithm

3. Quorum based approach

● Instead of requesting permission to execute the critical section from all other
sites, Each site requests only a subset of sites which is called a quorum.
● Any two quorums (subsets of sites) contain a common site.
● This common site is responsible for ensuring mutual exclusion.
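
As a small illustration of the intersection property, the sketch below forms two simple majority quorums over five illustrative sites and computes the site they share; with majority quorums, any two quorums must overlap, and the overlapping site is the one that arbitrates between conflicting requests. All names are illustrative.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class QuorumDemo {
    public static void main(String[] args) {
        // Two majority quorums (size 3) drawn from sites {1, 2, 3, 4, 5}.
        Set<Integer> quorumA = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> quorumB = new HashSet<>(Arrays.asList(3, 4, 5));

        // Their intersection can never be empty for majority quorums.
        Set<Integer> common = new HashSet<>(quorumA);
        common.retainAll(quorumB);

        System.out.println("Common site(s): " + common);   // [3]
    }
}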

2.5. Distributed Deadlock Detection


Algorithms
Deadlock detection in Distributed systems

In a distributed system deadlock can neither be prevented nor avoided as the system is
so vast that it is impossible to do so. Therefore, only deadlock detection can be
implemented. The techniques of deadlock detection in the distributed system require
the following:

● Progress –

The method should be able to detect all the deadlocks in the system.

● Safety –

The method should not detect false or phantom deadlocks.

There are three approaches to detect deadlocks in distributed systems. They are as
follows:

I. Centralized approach –

In the centralized approach, there is only one responsible resource to detect


deadlock. The advantage of this approach is that it is simple and easy to
implement, while the drawbacks include excessive workload at one node and single-
point failure (that is, the whole system is dependent on one node; if that node fails,
the whole system crashes), which in turn makes the system less reliable.

II. Distributed approach –

In the distributed approach different nodes work together to detect deadlocks. There is no

single point of failure, as the workload is equally divided among all
nodes. The speed of deadlock detection also increases.

III. Hierarchical approach –

This approach is the most advantageous. It is the combination of both centralized


and distributed approaches of deadlock detection in a distributed system. In this
approach, some selected nodes or clusters of nodes are responsible for deadlock
detection and these selected nodes are controlled by a single node.
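
Whichever of the three approaches is used, the detection step itself usually comes down to looking for a cycle in a wait-for graph built from the collected wait-for information. The sketch below shows that check; the class name WaitForGraph and the example edges are illustrative.

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WaitForGraph {
    private final Map<Integer, List<Integer>> waitsFor = new HashMap<>();

    // Record that process p is waiting for a resource held by process q.
    public void addEdge(int p, int q) {
        waitsFor.computeIfAbsent(p, k -> new ArrayList<>()).add(q);
    }

    // A deadlock exists iff the wait-for graph contains a cycle.
    public boolean hasDeadlock() {
        Set<Integer> visited = new HashSet<>();
        Set<Integer> onStack = new HashSet<>();
        for (Integer p : waitsFor.keySet()) {
            if (dfs(p, visited, onStack)) return true;
        }
        return false;
    }

    private boolean dfs(int p, Set<Integer> visited, Set<Integer> onStack) {
        if (onStack.contains(p)) return true;   // back edge found: a cycle, hence deadlock
        if (!visited.add(p)) return false;      // already fully explored
        onStack.add(p);
        for (int q : waitsFor.getOrDefault(p, Collections.emptyList())) {
            if (dfs(q, visited, onStack)) return true;
        }
        onStack.remove(p);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph wfg = new WaitForGraph();
        wfg.addEdge(1, 2);   // P1 waits for P2
        wfg.addEdge(2, 3);   // P2 waits for P3
        wfg.addEdge(3, 1);   // P3 waits for P1 -> cycle
        System.out.println("Deadlock detected: " + wfg.hasDeadlock());   // true
    }
}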

MCAD22E3 CLOUD COMPUTING

MODULE 3

3.1. Introduction to Cloud Computing

3.2. Evolution of Cloud Computing

3.3. Cloud Characteristics

3.4. On-demand Provisioning

3.5. NIST Cloud Computing Reference Architecture

3.6. Architectural Design Challenges


3.1. Introduction to Cloud Computing
Defining Cloud Computing

⮚ Cloud computing takes the technology, services, and applications that


are similar to those on the Internet and turns them into a self-service utility.
⮚ The use of the word “cloud” makes reference to the two essential
concepts:
1. Abstraction: Cloud computing abstracts the details of system
implementation from users and developers. Applications run on
physical systems that aren’t specified, data is stored in locations
that are unknown, administration of systems is outsourced to
others, and access by users is ubiquitous.
2. Virtualization: Cloud computing virtualizes systems by pooling and
sharing resources. Systems and storage can be provisioned as
needed from a centralized infrastructure, costs are assessed on a
metered basis, multi-tenancy is enabled, and resources are
scalable with agility.

A set of new technologies has come along that, along with the need for more
efficient and affordable computing, has enabled an on-demand system to
develop.

Clouds can come in many different types, and the services and applications that
run on clouds may or may not be delivered by a cloud service provider. These
different types and levels of cloud services mean that it is important to define
what type of cloud computing system you are working with.

Internet Vs Cloud computing:

The Internet offers abstraction, runs using the same set of protocols and
standards, and uses the same applications and operating systems. These same
characteristics are found in an intranet, an internal version of the Internet.
Cloud computing is an abstraction based on the notion of pooling physical
resources and presenting them as a virtual resource. It is a new model for
provisioning resources, for staging applications, and for platform-independent
user access to services.

Examples:
The cloud computing has changed the nature of commercial system
deployment, consider these three examples:
Google: In the last decade, Google has built a worldwide network of datacenters
to service its search engine. In doing so Google has captured a substantial portion
of the world’s advertising revenue. That revenue has enabled Google to offer free
software to users based on that infrastructure and has changed the market for
user-facing software. This is the classic Software as a Service case.

Azure Platform: By contrast, Microsoft is creating the Azure Platform. It


enables .NET Framework applications to run over the Internet as an alternate
platform for Microsoft developer software running on desktops.

Amazon Web Services: One of the most successful cloud-based businesses is


Amazon Web Services, which is an Infrastructure as a Service offering that lets
you rent virtual computers on Amazon’s own infrastructure.
These new capabilities enable applications to be written and deployed with
minimal expense and to be rapidly scaled and made available worldwide as
business conditions permit.

3.2. Evolution of Cloud Computing


Evolution of Cloud Computing

Cloud computing is all about renting computing services. This idea first came in the
1950s. In making cloud computing what it is today, five technologies played a vital
role. These are distributed systems and its peripherals, virtualization, web 2.0, service
orientation, and utility computing.
● Distributed Systems:

It is a composition of multiple independent systems but all of them are depicted


as a single entity to the users. The purpose of distributed systems is to share
resources and also use them effectively and efficiently. Distributed systems
possess characteristics such as scalability, concurrency, continuous availability,
heterogeneity, and independence of failures. But the main problem with this
system was that all the systems were required to be present at the same
geographical location. To solve this problem, distributed computing led to
three more types of computing: mainframe computing, cluster
computing, and grid computing.

● Mainframe computing:

Mainframes which first came into existence in 1951 are highly powerful and
reliable computing machines. These are responsible for handling large data such
as massive input-output operations. Even today these are used for bulk
processing tasks such as online transactions etc. These systems have almost no
downtime with high fault tolerance. After distributed computing, these increased
the processing capabilities of the system. But these were very expensive. To
reduce this cost, cluster computing came as an alternative to mainframe
technology.

● Cluster computing:

In 1980s, cluster computing came as an alternative to mainframe computing.


Each machine in the cluster was connected to each other by a network with high
bandwidth. These were way cheaper than those mainframe systems. These were
equally capable of high computations. Also, new nodes could easily be added to
the cluster if it was required. Thus, the problem of the cost was solved to some
extent but the problem related to geographical restrictions still pertained. To
solve this, the concept of grid computing was introduced.

● Grid computing:

In 1990s, the concept of grid computing was introduced. It means that different
systems were placed at entirely different geographical locations and these all
were connected via the internet. These systems belonged to different
organizations and thus the grid consisted of heterogeneous nodes. Although it
solved some problems, new problems emerged as the distance between the
nodes increased. The main problem encountered was the low availability of
high-bandwidth connectivity, along with other network-related issues. Cloud
computing, which addressed these issues, is therefore often referred to as the
"successor of grid computing".

● Virtualization:
It was introduced nearly 40 years back. It refers to the process of creating a
virtual layer over the hardware which allows the user to run multiple instances
simultaneously on the hardware. It is a key technology used in cloud computing.
It is the base on which major cloud computing services such as Amazon EC2,
VMware vCloud, etc work on. Hardware virtualization is still one of the most
common types of virtualization.

● Web 2.0:

It is the interface through which the cloud computing services interact with the
clients. It is because of Web 2.0 that we have interactive and dynamic web pages.
It also increases flexibility among web pages. Popular examples of web 2.0
include Google Maps, Facebook, Twitter, etc. Needless to say, social media is
possible only because of this technology. Web 2.0 gained major popularity in 2004.

● Service orientation:

It acts as a reference model for cloud computing. It supports low-cost, flexible,


and evolvable applications. Two important concepts were introduced in this
computing model. These were Quality of Service (QoS) which also includes the
SLA (Service Level Agreement) and Software as a Service (SaaS).

● Utility computing:

It is a computing model that defines service provisioning techniques for services


such as compute services along with other major services such as storage,
infrastructure, etc which are provisioned on a pay-per-use basis.

3.3. Cloud Characteristics

Examining the Characteristics of Cloud


Computing

It’s an evolutionary change that enables a revolutionary new approach to how


computing services are produced and consumed.

Paradigm shift
When you choose a cloud service provider, you are renting or leasing part of an
enormous infra-structure of datacenters, computers, storage, and networking
capacity.

Many of these datacenters are multi-million-dollar investments by the


companies that run them.

For example, there are some 20 datacenters in Amazon Web Services' cloud
and Google’s cloud includes perhaps some 35 datacenters worldwide.
Amazon.com’s infrastructure was built to support elastic demand so the system
could accommodate peak traffic on a busy shopping day such as "Cyber Monday",
which is the Monday after Thanksgiving in the United States when Internet
Christmas sales traditionally start. Because much of the capacity was idle,
Amazon.com first opened its network to partners and then as Amazon Web
Services to customers.
As these various datacenters grew in size, businesses have developed their
datacenters as “green-field” projects. Datacenters have been sited to do the
following:

o Have access to low cost power


o Leverage renewable power source
o Be near abundant water
o Be sited where high-speed network backbone connections can be
made
o Keep land costs modest and occupation unobtrusive
o Obtain tax breaks
o Optimize the overall system latency

These characteristics make cloud computing networks highly efficient and


capture enough margin to make utility computing profitable.
According to the research firm IDC, the following areas were the top five cloud
applications in use in 2010:

● Collaboration applications

● Web applications/Web serving

● Cloud backup

● Business applications

● Personal productivity applications

The last five years have seen a proliferation of services and productivity
applications delivered on-line as cloud computing applications.

For example, many people have used ChannelAdvisor.com for their auction
listings and sales management. That site recently expanded its service to include
a CRM connector to Salesforce.com. One of the largest call center operations
companies is a cloud-based service, Liveops.com.

The cost advantages of cloud computing have enabled new software vendors to
create productivity applications that they can make available to people at a
much smaller cost.

3.4. On-demand Provisioning

On-demand computing is an enterprise-level model of technology by which a


customer can purchase cloud services as and when needed.

For example, if a customer needs to utilize additional servers for the duration of
a project, they can do so and then drop back to the previous level after the
project is completed.
ODC make computing resources such as storage capacity, computational speed
and software applications available to users as needed for specific temporary
projects, known or unexpected workloads, routine work, or long-term
technological and computing requirements.
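
To make this concrete, the short Python sketch below uses the boto3 library to start an extra server for a temporary workload and release it when the project ends. This is only an illustration of the idea, not part of the course material; the AMI ID, instance type, and region shown are placeholder assumptions.

# Hypothetical on-demand provisioning sketch using boto3; IDs and region are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")   # region is an assumption

# Provision an extra server only for the duration of a project.
reservation = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",    # placeholder AMI ID
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = reservation["Instances"][0]["InstanceId"]
print("Provisioned on demand:", instance_id)

# ... temporary project work happens here ...

# Drop back to the previous level once the project is completed.
ec2.terminate_instances(InstanceIds=[instance_id])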

3.5. NIST Cloud Computing Reference


Architecture

The NIST model


The U.S. National Institute of Standards and Technology (NIST) has a set of
working definitions that separate cloud computing into service models and
deployment models. Those models and their relationship to essential
characteristics of cloud computing are shown in Figure 1.1.
The NIST model originally did not require a cloud to use virtualization to pool
resources, nor did it absolutely require that a cloud support multi-tenancy in the
earliest definitions of cloud computing. Multi-tenancy is the sharing of resources
among two or more clients. The latest version of the NIST definition does
require that cloud computing networks use virtualization and support multi-
tenancy.

FIGURE 1.1
The NIST cloud model doesn’t address a number of intermediary services such
as transaction or service brokers, provisioning, integration, and
interoperability services that form the basis for many cloud computing
discussions.
The Cloud Cube Model
The Open Group maintains an association called the Jericho Forum whose main
focus is how to protect cloud networks. The group has an interesting model that
attempts to categorize a cloud network based on four dimensional factors.

The type of cloud networks you use dramatically changes the notion of where the
boundary between the client’s network and the cloud begins and ends.
The four dimensions of the Cloud Cube Model are shown in Figure 1.2 and listed
here:
Physical location of the data:
Internal (I) / External (E) determines your organization's boundaries.

Ownership:
Proprietary (P) / Open (O) is a measure of not only the technology
ownership, but of interoperability, ease of data transfer, and degree of vendor
application lock-in.

Security boundary:
Perimeterised (Per) / De-perimeterised (D-p) is a measure of whether the
operation is inside or outside the security boundary or network firewall.

Sourcing: Insourced or Outsourced means whether the service is provided by


the customer or the service provider.
Taken together, the fourth dimension corresponds to two different states in the
eight possible cloud forms: Per (IP, IO, EP, EO) and D-p (IP, IO, EP, EO). The
sourcing dimension addresses the deliverer of the service. What the Cloud Cube
Model is meant to show is that the traditional notion of a network boundary
being the network’s firewall no longer applies in cloud computing.
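
As a purely illustrative aid (not part of the Jericho Forum material), the small Python snippet below enumerates the eight possible cloud forms by combining the location and ownership dimensions with the two security-boundary states.

# Illustrative enumeration of the Cloud Cube Model's eight forms.
from itertools import product

locations = ["I", "E"]          # Internal / External
ownership = ["P", "O"]          # Proprietary / Open
boundaries = ["Per", "D-p"]     # Perimeterised / De-perimeterised

for boundary, location, owner in product(boundaries, locations, ownership):
    print(f"{boundary} ({location}{owner})")   # e.g. "Per (IP)" ... "D-p (EO)"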

3.6. Architectural Design Challenges

Architectural Design Challenges

Cloud computing is used for enabling global access to mutual pools of resources such
as services, apps, data, servers, and computer networks. It is done on either a third-
party server located in a data center or a privately owned cloud. This makes data
access more reliable and efficient, with minimal administration effort.
Because cloud technology depends on the allocation of resources to attain consistency
and economy of scale, similar to a utility, it is also fairly cost-effective, making it the
choice for many small businesses and firms.
But there are also many challenges involved in cloud computing, and if you’re not
prepared to deal with them, you won’t realize the benefits. Here are six common
challenges you must consider before implementing cloud computing technology.

1. Cost
Cloud computing itself is affordable, but tuning the platform according to the
company’s needs can be expensive. Furthermore, the expense of transferring the data
to public clouds can prove to be a problem for short-lived and small-scale projects.
Companies can save some money on system maintenance, management, and
acquisitions. But they also have to invest in additional bandwidth, and the absence of
routine control in an infinitely scalable computing platform can increase costs.

2. Service Provider Reliability


The capacity and capability of a technical service provider are as important as price.
The service provider must be available when you need them. The main concern
should be the service provider's sustainability and reputation. Make sure you
understand the techniques by which a provider monitors its services and backs up
its reliability claims.

3. Downtime
Downtime is a significant shortcoming of cloud technology. No seller can promise a
platform that is free of possible downtime. Cloud technology makes small companies
reliant on their connectivity, so companies with an untrustworthy internet connection
probably want to think twice before adopting cloud computing.
4. Password Security
Diligent password management plays a vital role in cloud security. However, the
more people you have accessing your cloud account, the less secure it is. Anybody
aware of your passwords will be able to access the information you store there.
Businesses should employ multi-factor authentication and make sure that passwords
are protected and altered regularly, particularly when staff members leave. Access
rights related to passwords and usernames should only be allocated to those who
require them.

5. Data privacy
Sensitive and personal information that is kept in the cloud should be defined as being
for internal use only, not to be shared with third parties. Businesses must have a plan
to securely and efficiently manage the data they gather.

6. Vendor lock-in
Entering a cloud computing agreement is easier than leaving it. “Vendor lock-in”
happens when altering providers is either excessively expensive or just not possible. It
could be that the service is nonstandard or that there is no viable vendor substitute.
It comes down to buyer caution. Ensure the services you engage are standard and
portable to other providers, and above all, understand the requirements.
Cloud computing is a good solution for many businesses, but it’s important to know
what you’re getting into. Having plans to address these six prominent challenges first
will help ensure a successful experience.

MCAD22E3 CLOUD COMPUTING

MODULE 4

4.1. Deployment Models

4.2. Public, Private and Hybrid Clouds

4.3. Service Models

4.4. Benefits of cloud computing

4.5. Disadvantages of cloud computing


4.1. Deployment Models

Cloud Types
There are two distinct sets of models:
Deployment models: This refers to the location and management of the
cloud’s infrastructure.
Service models: This consists of the particular types of services that you can
access on a cloud computing platform.

Deployment models
A deployment model defines the purpose of the cloud and the nature of how the
cloud is located.
4.2. Public, Private and Hybrid Clouds
The NIST definition for the four deployment models is as follows:
Public cloud:
The public cloud infrastructure is available for public use, or alternatively for a large
industry group, and is owned by an organization selling cloud services.

Private cloud: The private cloud infrastructure is operated for the exclusive
use of an organization. The cloud may be managed by that organization or a
third party. Private clouds may be either on- or off-premises.

Hybrid cloud: A hybrid cloud combines multiple clouds (private, community, or
public) where those clouds retain their unique identities, but are bound
together as a unit. A hybrid cloud may offer standardized or proprietary access
to data and applications, as well as application portability.

Community cloud: A community cloud is one where the cloud has been
organized to serve a common function or purpose.
It may be for one organization or for several organizations, but they share
common concerns such as their mission, policies, security, regulatory
compliance needs, and so on. A community cloud may be managed by the
constituent organization(s) or by a third party.

Figure 1.3 shows the different locations that clouds can come in.
The United States Government, under the auspices of the General Services
Administration (GSA), launched a cloud computing portal called Apps.gov, as
shown in Figure 1.4, with the purpose of providing cloud services to federal
agencies. Described under the "U.S. Federal Cloud Computing Initiative"
(http://www.scribd.com/doc/17914883/US-Federal-Cloud-Computing-Initiative-FQ-GSA),
the goal of the initiative is to make large portions of the
federal government’s apparatus available under a cloud computing model. This
is a good example of a community cloud deployment, with the government being
the community.

Apps.gov is the U.S. government’s cloud computing system for its various
agencies.

Apps.gov is also making available connections to free media services from its
cloud, such as Twitter and YouTube. An example of this connection in practice
is the YouTube channel created by the White House for citizens’ outreach. You
can find the White House channel at http:// www.youtube.com/whitehouse
and the general U.S. Government YouTube channel at
http://www.youtube.com/usgovernment. You can see YouTube in action when
you visit WhiteHouse.gov and click the video link that usually appears on that
home page.

4.3. Service Models

Service models

o In the deployment model, different cloud types are an expression of the


manner in which infrastructure is deployed.
o You can think of the cloud as the boundary between where a client’s
network, management, and responsibilities end and the cloud service
provider’s begins.
o As cloud computing has developed, different vendors offer clouds that
have different services associated with them.
o The portfolio of services offered adds another set of definitions called
the service model.
o There are many different service models, all of which take the following
form:
XaaS, or “<Something> as a Service”

Three service types have been universally accepted:


Infrastructure as a Service:
IaaS provides virtual machines, virtual storage, virtual infra-structure, and other
hardware assets as resources that clients can provision.
Platform as a Service:
PaaS provides virtual machines, operating systems, applications,services,
development frameworks, transactions, and control structures.
Software as a Service:
SaaS is a complete operating environment with applications, management, and the
user interface.

4.4. Benefits of cloud computing


“The NIST Definition of Cloud Computing” by Peter Mell and Tim Grance (version
14, 10/7/2009) classified cloud computing into the three SPI service models
(SaaS, IaaS, and PaaS) and four cloud types (public, private, community, and
hybrid), and also assigns five essential characteristics that cloud computing systems
must offer:

On-demand self-service:
A client can provision computer resources without the need for interaction with
cloud service provider personnel.

Broad network access:


Access to resources in the cloud is available over the network using standard
methods in a manner that provides platform-independent access to clients of all
types.
This includes a mixture of heterogeneous operating systems, and thick and thin
platforms such as laptops, mobile phones, and PDAs.

Resource pooling:
A cloud service provider creates resources that are pooled together in a system
that supports multi-tenant usage.
Physical and virtual systems are dynamically allocated or reallocated as
needed. Intrinsic in this concept of pooling is the idea of abstraction that hides
the location of resources such as virtual machines, processing, memory,
storage, and network bandwidth and connectivity.

Rapid elasticity:
Resources can be rapidly and elastically provisioned.
The system can add resources by either scaling up systems (more powerful
computers) or scaling out systems (more computers of the same kind), and
scaling may be automatic or manual. From the standpoint of the client, cloud
computing resources should look limit-less and can be purchased at any time
and in any quantity.
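
As a hedged sketch of how such elasticity might be expressed in code (the threshold values and names below are invented for illustration and are not taken from the NIST definition), a simple scale-out rule could look like this:

# Illustrative auto-scaling rule; target utilisation and names are assumptions.
def desired_replicas(load_percent, replicas, target=60):
    """Scale out when average load exceeds the target, scale in when well below it."""
    if load_percent > target:
        return replicas + 1            # scale out: add more computers of the same kind
    if load_percent < target / 2 and replicas > 1:
        return replicas - 1            # scale in when demand drops
    return replicas

print(desired_replicas(85, replicas=4))   # -> 5
print(desired_replicas(20, replicas=4))   # -> 3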

Measured service:
The use of cloud system resources is measured, audited, and reported to the
customer based on a metered system.
A client can be charged based on a known metric such as amount of storage used,
number of transactions, network I/O (Input/Output) or bandwidth, amount of
processing power used, and so forth. A client is charged based on the level of
services provided.
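
A minimal sketch of such metered billing is shown below; the usage metrics and unit prices are invented purely for illustration and do not correspond to any real provider's rates.

# Hypothetical metered-billing calculation; unit prices are illustrative assumptions.
usage = {"storage_gb": 120, "transactions": 50_000, "network_gb": 35}
price = {"storage_gb": 0.02, "transactions": 0.0000004, "network_gb": 0.09}   # $ per unit

charge = sum(usage[metric] * price[metric] for metric in usage)
print(f"Monthly charge: ${charge:.2f}")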

Additional advantages:
Lower costs:
Because cloud networks operate at higher efficiencies and with greater
utilization, significant cost reductions are often encountered.
Ease of utilization:
Depending upon the type of service being offered, you may find that you do not
require hardware or software licenses to implement your service.
Quality of Service:
The Quality of Service (QoS) is something that you can obtain under contract
from your vendor.
Reliability:
The scale of cloud computing networks and their ability to provide load
balancing and failover makes them highly reliable, often much more reliable
than what you can achieve in a single organization.
Outsourced IT management:
A cloud computing deployment lets someone else manage your computing
infrastructure while you manage your business. In most instances, you achieve
considerable reductions in IT staffing costs.

Simplified maintenance and upgrade:


Because the system is centralized, you can easily apply patches and upgrades.
This means your users always have access to the latest software versions.
Low Barrier to Entry:
In particular, upfront capital expenditures are dramatically reduced. In
cloud computing, anyone can be a giant at any time.
Cloud computing is not a panacea, however. In many instances, cloud
computing doesn’t work well for particular applications.

4.5. Disadvantages of cloud computing

When you use an application or service in the cloud, you are using something
that isn’t necessarily as customizable as you might want.
All cloud computing applications suffer from the inherent latency that is
intrinsic in their WAN connectivity.

While cloud computing applications excel at large-scale processing tasks, if your


application needs large amounts of data transfer, cloud computing may not be
the best model for you.
Additionally, cloud computing is a stateless system, as is the Internet in general.
In order for communication to survive on a distributed system, it is necessarily
unidirectional in nature.

All the requests you use in HTTP: PUTs, GETs, and so on are requests to a
service provider. The service provider then sends a response. Although it may
seem that you are carrying on a conversation between client and provider, there
is an architectural disconnect between the two. That lack of state allows
messages to travel over different routes and for data to arrive out of sequence,
and many other characteristics allow the communication to succeed even when
the medium is faulty. Therefore, to impose transactional coherency upon the
system, additional overhead in the form of service brokers, transaction
managers, and other middleware must be added to the system. This can
introduce a very large performance hit into some applications.
If you had to pick a single area of concern in cloud computing, that area would
undoubtedly be privacy and security. When your data travels over and rests on
systems that are no longer under your control, you have increased risk due to
the interception and malfeasance of others.

Lab Exercise 1 : Use Google collaboration tools

Creating new files


Google Drive gives you access to a suite of tools that allows you
to create and edit a variety of files, including documents, spreadsheets,
and presentations. There are five types of files you can create on Google Drive:

● Documents: For composing letters, flyers, essays, and other text-


based files (similar to Microsoft Word documents)

● Spreadsheets: For storing and organizing information (similar to


Microsoft Excel workbooks)

● Presentations: For creating slideshows (similar to Microsoft


PowerPoint presentations)

● Forms: For collecting and organizing data

● Drawings: For creating simple vector graphics or diagrams

The process for creating new files is the same for all file types. Watch the video
below to learn more.

To create a new file:


1. From Google Drive, locate and select the New button, then choose
the type of file you want to create. In our example, we'll
select Google Docs to create a new document.
2. Your new file will appear in a new tab on your browser. Locate and
select Untitled document in the upper-left corner.

3. The Rename dialog box will appear. Type a name for your file, then
click OK.
4. Your file will be renamed. You can access the file at any time from
your Google Drive, where it will be saved automatically. Simply
double-click to open the file again.

You may notice that there is no Save button for your files. This is because Google
Drive uses autosave, which automatically and immediately saves your files as
you edit them.
How to run a Python script in the cloud?

1. Launch your cloud computer

Below is a quick step-by-step guide to starting a cloud computer, also called a cloud
instance.

1. Create an account at AWS

2. Log in to AWS

3. Choose EC2 from the Services menu

4. (Optional) In the top-right corner select the region that is closest to you.

5. Click Launch Instance

6. Choose Amazon Linux

7. Click Review and Launch and on the next page click Launch

8. You can select Create a new key pair in the pop-up window and give it any
name you want.

9. Click Launch Instance

10. Save the .pem file somewhere you’ll remember. This is your AWS key and we
will need it later

11. Click View Instances

After a short while you should see that a new instance is up and running. You can see
this on the AWS browser page under Instances.

2. Connect to the cloud computer

1. Open the Terminal on Mac or Linux

2. You first need to change the access permissions of your AWS key file:
$ chmod 400 my-key-pair.pem

3. We now ssh into the cloud computer. This will create a secure connection
between our laptop and the cloud
$ ssh -i /path/my-key-pair.pem ec2-user@public_dns_name
If all went well, you are now remotely logged in to your cloud computer.

3. Setup Python on the cloud

Your instance should come with Python 2.7 pre-installed. I will install Python 3
below. To see the Python-related packages available for install on your cloud
computer type:
$ yum list | grep python

yum is a package manager for Linux and grep is a search command applied to the
output of yum (this is what the | sign achieves). grep looks for anything with “python”
in the output of yum.

To install Python 3:
$ sudo yum install python35

The sudo prefix ensures that the above command is run with administrator privileges,
which is necessary to install stuff.

To install Python packages with pip (Python package manager), I also needed to run:
$ sudo yum install python35-devel
$ sudo yum install gcc

4. Set up a Python virtual environment

It is best practice to set up a Python virtual environment and install any new packages
there instead of installing packages in the Python root.

To create a virtual environment for Python 3


$ virtualenv -p python3 venv

You can choose a name other then “venv” for your virtual environment.

To activate the virtual environment


$ source venv/bin/activate

You can later deactivate the virtual environment by typing


$ deactivate

As an example, we can now install the Python package scrapy, a webscraping


framework, with pip install.
$ pip install scrapy
5. Run a Python script

Create a blank Python script:


$ vi test.py

Press i to enter insert mode and type the following code.


import time
for i in range(1000, -1, -1):
    print("Time remaining: {}".format(i))
    time.sleep(1)

To exit the vi editor, first press Esc to leave insert mode, then type :x and press Enter. This
will also save the file.

Run the script:


$ python test.py

6. Upload Python code from your own machine

If you would like to upload existing code from your machine, use the secure copy
command below.
$ scp -ri ~/documents/path/yourkey.pem ~/documents/path/example/ ec2-user@ec2-23-21-11-38.eu-central-1.compute.amazonaws.com:/home/ec2-user

Replace the above two files paths with the path to your AWS key and the folder
containing the Python code. You can find out the last parameter that is needed above
by right-clicking on the instance you have just started and selecting Connect.

The option -r is for recursive copy as we are copying a complete folder and option -
i stands for identity_file to provide the path to your AWS key.

If you are only uploading a single file, the -r option is not necessary.

7. Keep your Python script alive and running

To keep Python running even after you disconnect from the cloud instance we
install tmux.
$ sudo yum install tmux

Start tmux
$ tmux

Run the Python script inside tmux


$ python test.py
Press CTRL + B, then type :detach and press Enter. This will exit tmux, but the Python script will still be
running in the background. Disconnect from the cloud by typing:
$ ~.

Reconnect to the cloud.


$ ssh -i ~/documents/path/mykey.pem ec2-user@ec2-23-21-59-38.eu-central-1.compute.amazonaws.com

Remember to change the above parameters to the ones you are using.

Enter tmux and see the Python code at work:


$ tmux attach

At this point, you should see our Python script running.

To stop the script, press CTRL + B and type:


:kill-session

8. Stop the cloud computer

On the AWS browser page under Instances, right-click the instance you want to
terminate, select Instance State and select Terminate. This is also when billing stops.

Lab 2 :Explore public cloud services like Amazon, Google, Sales Force, Digital
Ocean etc
With the public cloud dominating the market, virtual hardware is shared by many
companies. The multi-tenant environment makes it easy to distribute
infrastructure costs across multiple users. Due to its cost benefits and payment model, the
public cloud is suitable for small and medium-sized businesses. In general, customer-facing
web applications that can receive unexpected traffic scale very well on the
public cloud.

While defining the cloud strategy for their business, enterprises can choose between a
public cloud, a private cloud or a hybrid cloud for efficient scaling. The choice
depends on several factors such as the type of business application, the costs involved,
the technical specifications, and other business requirements. In this section, we will
take a closer look at the public cloud and its benefits for businesses.

The public cloud is the most popular computer model. In this case, cloud service
providers use the Internet and make services such as infrastructure, storage, servers
etc. available for businesses. Third-party providers own and use shared physical
hardware and provide that to companies according to their needs. Amazon Elastic
Compute Cloud (EC2), IBM Blue Cloud, Google App Engine, Sun Cloud, Microsoft
Azure are some of the most popular public cloud providers.

Let us get a close look into the top 3 cloud vendors that are high in demand with the
IT sector.

● Amazon Web Services (AWS)


● Microsoft Azure
● Google Cloud Platform
Other cloud service providers besides the top 3 are DigitalOcean, IBM Cloud, Dell
Technologies/VMware, Cisco Systems, Salesforce, Oracle, etc.

Amazon Web Services:

● Popularly known as AWS, Amazon Web Services is the leading cloud service
provider with a 33% market share.

● AWS assists firms by providing quality services and supporting their
businesses. One can manage their business from a mobile phone or desktop. In
addition, the user can focus only on creating the code, without worrying about
other operational details.

Microsoft Azure:

● Back in 2017, Gartner called Azure a top leader in the Cloud Infrastructure as
a service space.
● Globally, 90% of Fortune 500 companies use Microsoft Azure to run their
business.
● Using the deeply integrated Azure cloud services, businesses can quickly
build, deploy and manage simple and complex systems with ease.
● Azure supports multiple programming languages, frameworks, operating
systems, information, and devices, allowing businesses to use the tools and
technologies they rely on.

Google Cloud Platform:

● With low cost, attractive features, and flexible compute
options, GCP is an attractive alternative to both AWS and Azure. It uses complete
encryption for all data and communication channels including traffic between
data centres.
● Some of the areas where Google Cloud competes fiercely with AWS include
instance configuration and payment models, privacy and traffic security, cost-effectiveness,
and machine learning.
● While all three cloud providers offer up to 75% discounts for a one- to three-
year commitment, Google Cloud additionally offers a sustained use discount of up
to 30% on each instance that runs for more than 25% of a month.
● Google Cloud offers several in-house APIs related to computer vision, natural
language processing, and translation. Machine learning engineers can create
models with the Cloud Machine Learning Engine, which builds on the
open-source TensorFlow framework.

Conclusion:
Focusing the IT team on projects that bring in more revenue, rather than working
furiously to manage on-premises systems, is a top priority for most IT
companies. With finite resources, companies are looking to adopt cloud models that
can cater to their multiple IT requirements. Cloud-native technologies empower
organizations to build and run scalable, microservices-based applications in
modern, dynamic environments.
MCAD22E3 CLOUD COMPUTING

MODULE 5

5.1. Introduction to Web Service and Service Oriented Architecture

5.2. SOAP and REST

5.3. Basics of Virtualization

5.4. Full and Para Virtualization

5.5. Implementation Levels of Virtualization


5.1. Introduction to Web Service and
Service Oriented Architecture
Service-Oriented Architecture

SOA is an architectural style for building software applications that use services
available in a network such as the web. It promotes loose coupling between software
components so that they can be reused. Applications in SOA are built based on
services. A service is an implementation of a well-defined business functionality, and
such services can then be consumed by clients in different applications or business
processes.

SOA allows for the reuse of existing assets where new services can be created from
an existing IT infrastructure of systems. In other words, it enables businesses to
leverage existing investments by allowing them to reuse existing applications, and
promises interoperability between heterogeneous applications and technologies. SOA
provides a level of flexibility that wasn't possible before in the sense that:

● Services are software components with well-defined interfaces that are


implementation-independent. An important aspect of SOA is the separation
of the service interface (the what) from its implementation (the how). Such
services are consumed by clients that are not concerned with how these
services will execute their requests.

● Services are self-contained (perform predetermined tasks) and loosely


coupled (for independence)

● Services can be dynamically discovered

● Composite services can be built from aggregates of other services

SOA uses the find-bind-execute paradigm as shown in Figure 1. In this paradigm,


service providers register their service in a public registry. This registry is used by
consumers to find services that match certain criteria. If the registry has such a
service, it provides the consumer with a contract and an endpoint address for that
service.
Figure 1: SOA's Find-Bind-Execute Paradigm
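
The toy Python sketch below illustrates the find-bind-execute idea with an in-memory registry; it is a simplified stand-in for the paradigm, not a real UDDI registry or SOA framework.

# Toy find-bind-execute sketch; the registry and service are illustrative only.
registry = {}

def publish(name, endpoint):
    """Service provider registers its service in the (in-memory) registry."""
    registry[name] = endpoint

def find(name):
    """Consumer looks up a service that matches its criteria."""
    return registry.get(name)

# Provider side: register a service implementation.
publish("quote-service", lambda symbol: {"symbol": symbol, "price": 42.0})

# Consumer side: find the service, bind to its endpoint, then execute it.
endpoint = find("quote-service")
if endpoint is not None:
    print(endpoint("SUNW"))            # {'symbol': 'SUNW', 'price': 42.0}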

SOA-based applications are distributed multi-tier applications that have presentation,


business logic, and persistence layers. Services are the building blocks of SOA
applications. While any functionality can be made into a service, the challenge is to
define a service interface that is at the right level of abstraction. Services should
provide coarse-grained functionality.

Realizing SOA with Web Services

Web services are software systems designed to support interoperable machine-to-


machine interaction over a network. This interoperability is gained through a set of
XML-based open standards, such as WSDL, SOAP, and UDDI. These standards
provide a common approach for defining, publishing, and using web services.

Sun's Java Web Services Developer Pack 1.5 (Java WSDP 1.5) and Java 2 Platform,
Enterprise Edition (J2EE) 1.4 can be used to develop state-of-the-art web services to
implement SOA. The J2EE 1.4 platform enables you to build and deploy web services
in your IT infrastructure on the application server platform. It provides the tools you
need to quickly build, test, and deploy web services and clients that interoperate with
other web services and clients running on Java-based or non-Java-based platforms. In
addition, it enables businesses to expose their existing J2EE applications as web
services. Servlets and Enterprise JavaBeans components (EJBs) can be exposed as
web services that can be accessed by Java-based or non-Java-based web service
clients. J2EE applications can act as web service clients themselves, and they can
communicate with other web services, regardless of how they are implemented.

Web Service APIs


The Java WSDP 1.5 and J2EE 1.4 platforms provide the Java APIs for XML (JAX)
that are shown in Table 1.
Table 1: Java APIs for XML (JAX) provided by J2EE 1.4

API: Java API for XML Processing (JAXP) 1.2
Description: This API lets you process XML documents by invoking a SAX or DOM parser in your application. JAXP 1.2 supports W3C XML Schema.

API: Java API for XML-based RPC (JAX-RPC) 1.1
Description: This is an API for building and deploying SOAP+WSDL web services clients and endpoints.

API: Java APIs for XML Registries (JAXR) 1.0.4
Description: This is a Java API for accessing different kinds of XML registries. It provides you with a single set of APIs to access a variety of XML registries, including UDDI and the ebXML Registry. You don't need to worry about the nitty-gritty details of each registry's information model.

API: SOAP with Attachments API for Java (SAAJ) 1.2
Description: This API lets you produce and consume messages conforming to the SOAP 1.1 specification and SOAP with Attachments note.

API: JSR 109: Web services for J2EE 1.0
Description: JSR 109 defines deployment requirements for web services clients and endpoints by leveraging the JAX-RPC programming model. In addition, it defines standard deployment descriptors using the XML Schema, thereby providing a uniform method of deploying web services onto application servers through a wide range of tools.

Note: JAX-RPC 1.1 and SAAJ 1.2 include support for the Web Services
Interoperability (WS-I) and the Web Services Interoperability Basic Profile (WSI-
BP), currently being developed by http://www.ws-i.org, which provides a set of
guidelines on how to develop interoperable web services.

With the APIs described in Table 1, you can focus on high-level programming tasks,
rather than low-level issues of XML and web services. In other words, you can start
developing and using Java WSDP 1.5 and J2EE 1.4 web services without knowing
much about XML and web services standards. You only need to deal with Java
semantics, such as method invocation and data types. The dirty work is done behind
the scenes, as discussed further in the next section.

Figure 2 illustrates how the JAXR and JAX-RPC APIs play a role in publishing,
discovering, and using web services and thus realizing SOA.

Figure 2: Web services Publish-Discover-Invoke model

Web Services Endpoints in J2EE 1.4


The J2EE 1.4 platform provides a standardized mechanism to expose servlets and
EJBs as web services. Such services are considered web service endpoints (or web
service ports), and can be described using WSDL and published in a UDDI registry so
that they can be discovered and used by web service clients.

Once a web service is discovered, the client makes a request to a web service. The
web service processes the request and sends the response back to the client. To get a
feeling for what happens behind the scenes, consider Figure 3, which shows how a
Java client communicates with a Java web service in the J2EE 1.4 platform. Note that
J2EE applications can use web services published by other providers, regardless of
how they are implemented. In the case of non-Java-based clients and services, the
figure would change slightly, however.

As mentioned earlier, all the details between the request and the response happen
behind the scenes. You only deal with typical Java programming language semantics,
such as Java method calls, Java data types, and so forth. You needn't worry about
mapping Java to XML and vice-versa, or constructing SOAP messages. All this low-
level work is done behind the scenes, allowing you to focus on the high-level issues.
Figure 3: A Java Client Calling a J2EE Web Service

Note: J2EE 1.4 and Java WSDP 1.5 support both RPC-based and document-oriented
web services. In other words, once a service is discovered, the client can invoke
remote procedure calls on the methods offered by the service, or send an XML
document to the web service to be processed.

Interoperability
Interoperability is the most important principle of SOA. This can be realized through
the use of web services, as one of the key benefits of web services is interoperability,
which allows different distributed web services to run on a variety of software
platforms and hardware architectures. The Java programming language is already a
champion when it comes to platform independence, and consequently the J2EE 1.4
and Java WSDP 1.5 platforms represent the ideal platforms for developing portable
and interoperable web services.

Interoperability and portability start with the standard specifications themselves. The
J2EE 1.4 and Java WSDP 1.5 platforms include the technologies that support SOAP,
WSDL, UDDI, and ebXML. This core set of specifications -- which are used to
describe, publish, enable discovery, and invoke web services -- are based on XML
and XML Schema. If you have been keeping up with these core specifications, you
know it's difficult to determine which products support which levels (or versions) of
the specifications. This task becomes harder when you want to ensure that your web
services are interoperable.

The Web Services Interoperability Organization (WS-I) is an open, industry


organization committed to promoting interoperability among web services based on
common, industry-accepted definitions and related XML standards support. WS-I
creates guidelines and tools to help developers build interoperable web services.
WS-I addresses the interoperability need through profiles. The first profile, WS-I
Basic Profile 1.0 (which includes XML Schema 1.0, SOAP 1.1, WSDL 1.1, and
UDDI 2.0), attempts to improve interoperability within its scope, which is bounded by
the specification referenced by it.

Since the J2EE 1.4 and Java WSDP 1.5 platforms adhere to the WS-I Basic Profile
1.0, they ensure not only that applications are portable across J2EE implementations,
but also that web services are interoperable with any web service implemented on any
other platform that conforms to WS-I standards such as .Net.
Challenges in Moving to SOA
SOA is usually realized through web services. Web services specifications may add to
the confusion of how to best utilize SOA to solve business problems. In order for a
smooth transition to SOA, managers and developers in organizations should know
that:

● SOA is an architectural style that has been around for years. Web services
are the preferred way to realize SOA.

● SOA is more than just deploying software. Organizations need to analyze


their design techniques and development methodology and
partner/customer/supplier relationship.

● Moving to SOA should be done incrementally and this requires a shift in


how we compose service-based applications while maximizing existing IT
investments.
Sun has recognized the challenges customers face in moving to SOA and has
developed an SOA Opportunity Assessment service offering that leverages years of
experience in delivering enabling technology solutions that met the unique needs of
each customer. Sun's SOA Opportunity Assessment provides customers with an
analysis of their organization's readiness to move to SOA, along with a set of best
practices developed to complement this service offering, and helps them identify
business-relevant opportunities for building their service-oriented applications using
architectural best practices and reusable design patterns. More information is available
on this and on additional Sun SOA service offerings.

In addition, Sun's Java BluePrints provide developers with guidelines, patterns, and
sample applications. Java BluePrints has a book on Designing Web Services with
J2EE 1.4, which is the authoritative guide to the best practices for designing and
integrating enterprise-level web services using J2EE 1.4. It provides the guidelines,
patterns, and real-world examples architects and developers need in order to shorten
the learning curve and start building robust, scalable, and portable solutions.

Java Business Integration


Enterprises have invested heavily in large-scale packaged application software such as
enterprise resource planning (ERP), supply chain management (SCM), customer
relationship management (CRM), and other systems to run their businesses. IT
managers are being asked to deliver the next generation of software applications that
will provide new functionality, while leveraging existing IT investments. The solution
to this is integration technology; the available integration technology solutions,
however, are proprietary and do not interoperate with each other. The advent of web
services and SOA offers potential for lower integration costs and greater flexibility.

JSR 208 Java Business Integration (JBI), is a specification for a standard that
describes plug-in technology for system software that enables a service-oriented
architecture for building integration server software. JBI adopts SOA to maximize the
decoupling between components, and create well-defined interoperation semantics
founded on standards-based messaging. JSR 208 describes the service provider
interfaces (SPIs) that service engines and bindings plug into, as well as the normalized
message service that they use to communicate with each other. It is important to note
that JSR 208 doesn't define the engines or tools themselves. JSR 208 has the
following business advantages:

● It is itself a service-oriented architecture that will be highly flexible, extensible, and


scalable.

● Service engines could be implemented in any language as long as they support the
SPI definition implemented by JSR 208 compliant systems.

● New engines can be added to the container by plugging them into the standard SPI
and defining the messages they will use to interact with the rest of the system.

● ISVs that specialize in one of these components could be able to plug special-
purpose engines into industry-standard integration solutions.

● Open interfaces will enable free and open competition around the implementation
of these engines. This means that customers will be free to choose the best solution
available, and their integration code can be migrated between implementations.
A JSR 208 example architecture is shown in Figure 4.

Figure 4: An Example Architecture Based on JSR 208

As you can see, JBI provides an environment in which plug-in components reside.
Interaction between the plug-in components is by means of message-based service
invocation. Services produced and consumed by plug-in components are modeled
using WSDL (version 2.0). A normalized message consists of two parts: the abstract
XML message, and message metadata (or message context data), which allows for
association of extra information with a particular message as it is processed by plug-in
and system components.

Project Shasta
Sun's Project Shasta, which is based on the JSR 208 architecture, aims to build a next-
generation integration solution. This project will be implemented on Sun's J2EE
application server and leverage J2EE services such as Java Message Service (JMS),
J2EE Connector Architecture (JCA), failover, and high availability. It will feature
many of the emerging standards in the web services (such as web service notification,
coordination, and transaction management) and integration space. The project will be
focused on web services and using them to enable the creation of service-oriented
architectures. Figure 5 depicts what a fully implemented product could look like.

Figure 5: An Example Architecture Based on JSR 208

Web Services and J2EE 1.4 for Enterprise Application Integration


Web services, which build on knowledge gained from other mature distributed
environments (such as CORBA and RMI), offer a standardized approach to
application-to-application communication and interoperability. They provide a way
for applications to expose their functionality over the web, regardless of the
application's programming language or platform. In other words, they allow
application developers to master and manage the heterogeneity of EIS.

Web services let developers reuse existing information assets by providing developers
with standard ways to access middle-tier and back-end services and integrate them
with other applications.

Since web services represent gateways to existing back-end servers, strong support for
back-end integration is required. This is where the J2EE platform comes into play.
The J2EE platform provides industry-standard APIs (such as the J2EE Connector
Architecture, the JDBC API, Java Message Service (JMS), among others) for
accessing legacy information systems. J2EE 1.4 (which supports web services)
provides an excellent mechanism to integrate legacy EIS and expose their
functionality as interoperable web services, thereby making legacy data available on
heterogeneous platform environments.

5.2. SOAP and REST


REST or SOAP in a cloud-native environment

Cloud-based API data models have not only enhanced the cloud experience, but also
provided a way for developers and administrators to integrate workloads into the
cloud using those APIs. For most enterprises, APIs let them share information across
various on-premises and cloud-based applications. They also play an important role in
integrating platform workloads more seamlessly. As cloud adoption continues to grow,
there is more demand for integration points between applications inside and outside of
the cloud environment. The rise of multicloud strategies, along with the need for enhanced
cross-cloud capabilities, has increased dependency on the cloud API environment.
But which approach is better and what support do you get in your cloud environment?

SOAP in a nutshell

SOAP (short for Simple Object Access Protocol), the older approach, had
industrywide support ranging from product companies such as IBM and Microsoft to
service implementers. It also came with a comprehensive yet complex set of
standards. The Microsoft team who designed SOAP made it extremely flexible, able
to communicate over private networks, across the internet, and in emails. It was
supported by several standards as well. The initial version of SOAP was part of a
specification that contained Universal Description, Discovery, and Integration
(UDDI) and Web Services Description Language (WSDL) as well.

SOAP essentially provides the envelope for sending the web services messages. The
architecture itself is designed to help the performance of various operations between
software programs. Communication between programs usually happens via XML
based requests and HTTP based responses. HTTP is the most commonly used protocol of
communication, but other protocols may be used as well.

A SOAP message contains parts such as ENVELOPE, HEADER, BODY, and FAULT.
The ENVELOPE object defines the start and end of the XML message request,
the HEADER contains any header elements to be processed by the server, and
the BODY contains the remaining XML object that constitutes the
request. The FAULT object is used for error handling.
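
To make the envelope structure concrete, the hedged Python sketch below builds a minimal SOAP 1.1 message and posts it with the requests library; the endpoint URL, namespace, and operation name are placeholders, not a real service.

# Hypothetical SOAP 1.1 request; the URL, namespace, and operation are placeholders.
import requests

envelope = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header/>
  <soap:Body>
    <GetPrice xmlns="http://example.com/stock">
      <Symbol>SUNW</Symbol>
    </GetPrice>
  </soap:Body>
</soap:Envelope>"""

response = requests.post(
    "https://example.com/stock-service",                    # placeholder endpoint
    data=envelope,
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
print(response.status_code)
print(response.text)    # a <soap:Fault> element would appear here on error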

REST

REST (Representational State Transfer) is usually referred to as an architectural style


rather than a protocol, which is used to build web services. REST architecture allows
the communication between two software programs, wherein one program can request
and manipulate resources from the other one. A REST request for accessing resources
on the target program uses HTTP verbs: GET, POST, PUT, and DELETE. These requests can
use data formats including XML, HTML, and JSON. JSON is most preferred as it is
the most compatible and easy to use. Most REST APIs are based on URIs (Uniform
Resource Identifiers) and are specific to the HTTP protocol.

REST is developer-friendly because its simpler style makes it easier to implement and
use than SOAP. REST is less verbose, and a smaller volume of data is sent when
communicating between two endpoints.
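
A hedged example of the REST style using Python's requests library is shown below; the base URI and JSON fields are placeholders for illustration, not a specific provider's API.

# Hypothetical REST calls against a placeholder URI.
import requests

base = "https://api.example.com/v1/instances"

# GET: read a resource (JSON is the most commonly used format).
print(requests.get(base + "/42").json())

# POST: create a new resource.
created = requests.post(base, json={"type": "t2.micro", "region": "us-east-1"})
print(created.status_code)

# PUT: update an existing resource; DELETE: remove it.
requests.put(base + "/42", json={"type": "t2.small"})
requests.delete(base + "/42")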

Why SOAP or REST?

While SOAP is like using an envelope that contains lots of processing information
inside it, REST can be considered a postcard that has a URI as its destination address, is
lightweight, and can be cached. REST is data-driven and is primarily used to access a
resource (URI) for certain data; SOAP is a protocol that is function-driven. REST
provides flexibility in choosing data format (plain text, HTML, XML, or JSON) while
SOAP only uses XML.

5.3. Basics of Virtualization

1. Full Virtualization :

Full Virtualization was introduced by IBM in the year 1966. It is the first software
solution of server virtualization and uses binary translation and direct approach
technique. In full virtualization, guest OS is completely isolated by the virtual
machine from the virtualization layer and hardware. Microsoft and Parallels
systems are examples of full virtualization.
2. Paravirtualization :

Paravirtualization is the category of CPU virtualization which uses hypercalls for


operations to handle instructions at compile time. In paravirtualization, guest OS
is not completely isolated but it is partially isolated by the virtual machine from
the virtualization layer and hardware. VMware and Xen are some examples of
paravirtualization.
5.4. Full and Para Virtualization

The differences between Full Virtualization and Paravirtualization are as follows:

1. In full virtualization, the virtual machine permits the execution of instructions, running an unmodified OS in an entirely isolated way. In paravirtualization, the virtual machine does not implement full isolation of the OS, but rather provides a different API which is utilized when the OS is subjected to alteration.

2. Full virtualization is less secure. Paravirtualization is more secure than full virtualization.

3. Full virtualization uses binary translation and a direct approach as a technique for operations. Paravirtualization uses hypercalls at compile time for operations.

4. Full virtualization is slower than paravirtualization in operation. Paravirtualization is faster in operation as compared to full virtualization.

5. Full virtualization is more portable and compatible. Paravirtualization is less portable and compatible.

6. Examples of full virtualization are Microsoft and Parallels systems. Examples of paravirtualization are VMware and Xen.

5.5. Implementation Levels of Virtualization

IMPLEMENTATION LEVELS OF VIRTUALIZATION IN CLOUD


COMPUTING

It is not simple to set up virtualization. Your computer runs on an operating system


that gets configured on some particular hardware. It is not feasible or easy to run a
different operating system using the same hardware.

To do this, you will need a hypervisor. Now, what is the role of the hypervisor? It is a
bridge between the hardware and the virtual operating system, which allows smooth
functioning.

Talking of the Implementation levels of virtualization in cloud computing, there are a


total of five levels that are commonly used. Let us now look closely at each of
these levels of virtualization implementation in cloud computing.

1.) Instruction Set Architecture Level (ISA)


ISA virtualization can work through ISA emulation. This is used to run many legacy
codes that were written for a different configuration of hardware. These codes run on
any virtual machine using the ISA. With this, a binary code that originally needed
some additional layers to run is now capable of running on the x86 machines. It can
also be tweaked to run on the x64 machine. With ISA, it is possible to make the
virtual machine hardware agnostic.

For the basic emulation, an interpreter is needed, which interprets the source code and
then converts it into a hardware format that can be read. This then allows processing.
This is one of the five implementation levels of virtualization in cloud computing.
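
As a purely illustrative sketch of the interpreter idea, the Python loop below decodes and executes a toy instruction set; the opcodes are invented for the example and bear no relation to x86 or any real ISA.

# Toy ISA interpreter loop; the instruction set is invented for illustration.
def emulate(program):
    registers = {"A": 0, "B": 0}
    for opcode, *operands in program:
        if opcode == "LOAD":            # LOAD reg, value
            reg, value = operands
            registers[reg] = value
        elif opcode == "ADD":           # ADD dst, src
            dst, src = operands
            registers[dst] += registers[src]
        elif opcode == "PRINT":         # PRINT reg
            print(registers[operands[0]])
    return registers

emulate([("LOAD", "A", 2), ("LOAD", "B", 40), ("ADD", "A", "B"), ("PRINT", "A")])   # prints 42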

2.) Hardware Abstraction Level (HAL)


True to its name HAL lets the virtualization perform at the level of the hardware. This
makes use of a hypervisor which is used for functioning. At this level, the virtual
machine is formed, and this manages the hardware using the process of virtualization.
It allows the virtualization of each of the hardware components, which could be the
input-output device, the memory, the processor, etc.

Multiple users can thus share the same hardware and run multiple
virtualization instances at the very same time. This is mostly used in cloud-based
infrastructure.

3.) Operating System Level


At the level of the operating system, the virtualization model is capable of creating a
layer that is abstract between the operating system and the application. This is an
isolated container that is on the operating system and the physical server, which
makes use of the software and hardware. Each of these then functions in the form of a
server.

When there are several users, and no one wants to share the hardware, then this is
where the virtualization level is used. Every user will get his virtual environment
using a virtual hardware resource that is dedicated. In this way, there is no question of
any conflict.

4.) Library Level


Working with the operating system directly can be cumbersome, and this is when
applications make use of APIs from libraries at the user level. These APIs are documented well, and
this is why the library virtualization level is preferred in these scenarios. API hooks
make it possible as it controls the link of communication from the application to the
system.

5.) Application Level


The application-level virtualization is used when there is a desire to virtualize only
one application and is the last of the implementation levels of virtualization in cloud
computing. One does not need to virtualize the entire environment of the platform.
This is generally used when you run virtual machines that use high-level languages.
The application will sit above the virtualization layer, which in turn sits on the
application program.

It lets the high-level language programs compiled to be used in the application level
of the virtual machine run seamlessly.

MCAD22E3 CLOUD COMPUTING

MODULE 6

6.1. Tools and Mechanisms

6.2. Virtualization of CPU

6.3. Memory

6.4. I/O Devices

6.5. Desktop Virtualization

6.6. Server Virtualization


6.1. Tools and Mechanisms

Virtualization Tools and Techniques

Virtualization is a technique in which the services required by the user run remotely in a


ubiquitous environment which gives scalable resources. Virtualization is being used
in cloud computing for load balancing and aggregation of cloud resources.
Virtualization provides higher hardware utilization. It is also being used for
partitioning of computational resources and hence supports sharing of resources.
Virtualization has different types such as Native virtualization, Full virtualization,
Operating system level virtualization and Para virtualization. Other than these there is
Resources virtualization, Desktop virtualization, Server virtualization, Data centres
virtualization and application virtualization.

The resources virtualization is implemented in different forms such as the Full


virtualization, Native virtualization, Para virtualization, Operating system (OS) layer
virtualization or Hosted virtualization. Virtual machines and Virtual machine
monitors (VMMs) have been developed to offer better energy efficient solutions to
the virtualization problems. Virtualization tools like OpenVz, Xen, VmWare etc. are
widely used in the computing industry
6.2. Virtualization of CPU

What is CPU Virtualization?


CPU virtualization runs programs and instructions through a virtual machine, giving the
user the feeling of working on a physical workstation. The virtual processor behaves like
a physical one: it executes the same instructions and produces the same output that a
physical machine would. CPU virtualization is not simple emulation, however; an
emulator reproduces a different hardware platform entirely in software, whereas CPU
virtualization shares the real processor among virtual machines while preserving that
physical-machine behavior. This offers great portability and lets a single physical
platform act as if it were several machines.

With CPU virtualization, each virtual machine acts as a physical machine and receives a
share of the host's processing resources in the form of virtual processors. When the
hosting services receive requests, the physical CPU is shared among the virtual
machines, so each virtual machine gets a share of a single CPU allocated to it, for
example a single processor acting as a dual-processor.

Types of CPU Virtualization


The various types of CPU virtualization available are as follows

1. Software-Based CPU Virtualization

This CPU virtualization is software-based: guest application code executes directly on
the processor, while guest privileged code is translated first and the translated code is
then executed on the processor. This translation is known as Binary Translation (BT).
The translated code is slightly larger than the original and usually runs more slowly. As
a result, guest programs with only a small privileged code component run very smoothly
and fast, whereas applications with a significant privileged code component, such as
frequent system calls, run at a slower rate in the virtual environment.

2. Hardware-Assisted CPU Virtualization

Certain processors provide hardware assistance to support CPU virtualization. Here, the
guest code uses a separate mode of execution known as guest mode, and the guest code
mainly runs in guest mode. The best part of hardware-assisted CPU virtualization is that
no binary translation is required, so system calls and trap-intensive workloads run close
to native speed. However, workloads that require frequent updates to page tables cause
many exits from guest mode to root mode, which eventually slows down the program's
performance and efficiency. (A quick host-side check for these hardware-assist
processor flags is sketched at the end of this list.)

3. Virtualization and Processor-Specific Behavior

Even though the CPU is virtualized, the virtual machine still detects the specific model
of the processor on which it runs. Processor models differ in the features they offer, and
the applications running in the virtual machine generally make use of those features. In
such cases, vMotion cannot be used to migrate virtual machines between hosts whose
processors expose different feature sets; Enhanced vMotion Compatibility (EVC)
handles this limitation.

4. Performance Implications of CPU Virtualization

CPU virtualization adds an amount of overhead that depends on the workload and the
type of virtualization used. An application is CPU-bound if it spends most of its time
executing instructions rather than waiting for external events; for such applications,
CPU virtualization overhead means additional instructions must be executed, which
consumes processing time and can result in an overall degradation in performance.
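
As a quick way to see whether a host processor exposes the hardware assists described under hardware-assisted CPU virtualization above, one can look for the Intel VT-x ("vmx") or AMD-V ("svm") CPU flags. The sketch below assumes a Linux host that exposes /proc/cpuinfo; the file path and flag names are the only assumptions.

# Check /proc/cpuinfo (Linux) for the CPU flags that indicate
# hardware-assisted virtualization support: "vmx" (Intel VT-x) or "svm" (AMD-V).
def has_hw_virtualization(cpuinfo_path="/proc/cpuinfo"):
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                flags = line.split(":", 1)[1].split()
                return "vmx" in flags or "svm" in flags
    return False

if __name__ == "__main__":
    print("Hardware-assisted CPU virtualization available:", has_hw_virtualization())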

6.3. Memory Virtualization


Virtual memory virtualization is similar to the virtual memory support provided by
modern operating systems. In a traditional execution environment, the operating
system maintains mappings of virtual memory to machine memory using page tables,
which is a one-stage mapping from virtual memory to machine memory. All modern
x86 CPUs include a memory management unit (MMU) and a translation lookaside
buffer (TLB) to optimize virtual memory performance. However, in a virtual
execution environment, virtual memory virtualization involves sharing the physical
system memory in RAM and dynamically allocating it to the physical memory of the
VMs.

That means a two-stage mapping process should be maintained by the guest OS and
the VMM, respectively: virtual memory to physical memory and physical memory to
machine memory. Furthermore, MMU virtualization should be supported, which is
transparent to the guest OS. The guest OS continues to control the mapping of virtual
addresses to the physical memory addresses of VMs. But the guest OS cannot directly
access the actual machine memory. The VMM is responsible for mapping the guest
physical memory to the actual machine memory. Figure 3.12 shows the two-level
memory mapping procedure.

Since each page table of the guest OSes has a separate page table in the VMM
corresponding to it, the VMM page table is called the shadow page table. Nested page
tables add another layer of indirection to virtual memory. The MMU already handles
virtual-to-physical translations as defined by the OS. Then the physical memory
addresses are translated to machine addresses using another set of page tables defined
by the hypervisor. Since modern operating systems maintain a set of page tables for
every process, the shadow page tables will get flooded. Consequently, the performance
overhead and cost of memory will be very high.

VMware uses shadow page tables to perform virtual-memory-to-machine-memory


address translation. Processors use TLB hardware to map the virtual memory directly
to the machine memory to avoid the two levels of translation on every access. When
the guest OS changes the virtual memory to a physical memory mapping, the VMM
updates the shadow page tables to enable a direct lookup. The AMD Barcelona
processor has featured hardware-assisted memory virtualization since 2007. It
provides hardware assistance to the two-stage address translation in a virtual
execution environment by using a technology called nested paging.

Extended Page Table by Intel for Memory Virtualization

Since the efficiency of the software shadow page table technique was too low, Intel
developed a hardware-based EPT technique to improve it, as illustrated in Figure
3.13. In addition, Intel offers a Virtual Processor ID (VPID) to improve use of the
TLB. Therefore, the performance of memory virtualization is greatly improved. In
Figure 3.13, the page tables of the guest OS and EPT are all four-level.

When a virtual address needs to be translated, the CPU will first look for the L4
page table pointed to by Guest CR3. Since the address in Guest CR3 is a physical
address in the guest OS, the CPU needs to convert the Guest CR3 GPA to the host
physical address (HPA) using EPT. In this procedure, the CPU will check the EPT
TLB to see if the translation is there. If there is no required translation in the EPT
TLB, the CPU will look for it in the EPT. If the CPU cannot find the translation in the
EPT, an EPT violation exception will be raised.
When the GPA of the L4 page table is obtained, the CPU will calculate the GPA of
the L3 page table by using the GVA and the content of the L4 page table. If the entry
corresponding to the GVA in the L4

page table is a page fault, the CPU will generate a page fault interrupt and will let the
guest OS kernel handle the interrupt. When the GPA of the L3 page table is obtained,
the CPU will look in the EPT to get the HPA of the L3 page table, as described
earlier. To get the HPA corresponding to a GVA, the CPU needs to look up the EPT
five times, and each lookup requires four memory accesses. Therefore,
there are 20 memory accesses in the worst case, which is still very slow. To overcome
this shortcoming, Intel increased the size of the EPT TLB to decrease the number of
memory accesses.
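
To make the two-stage mapping concrete, the sketch below models the guest page table and an EPT-like host table as plain Python dictionaries and translates a guest virtual page number to a host physical page number. The single-level tables and the page numbers are simplifications assumed purely for illustration; the real structures described above are four-level.

# Two-stage address translation, simplified to one level per stage:
#   guest virtual page  -> guest physical page (guest page table, managed by the guest OS)
#   guest physical page -> host physical page  (EPT/shadow mapping, managed by the VMM)
guest_page_table = {0x1: 0x7, 0x2: 0x9}   # GVA page -> GPA page
ept = {0x7: 0x42, 0x9: 0x51}              # GPA page -> HPA page

def translate(gva_page):
    gpa_page = guest_page_table.get(gva_page)
    if gpa_page is None:
        raise LookupError("guest page fault")   # handled by the guest OS kernel
    hpa_page = ept.get(gpa_page)
    if hpa_page is None:
        raise LookupError("EPT violation")      # handled by the hypervisor
    return hpa_page

print(hex(translate(0x1)))   # 0x42: GVA page 0x1 -> GPA page 0x7 -> HPA page 0x42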
6.4. I/O Virtualization
I/O virtualization involves managing the routing of I/O requests between virtual
devices and the shared physical hardware. At the time of this writing, there are three
ways to implement I/O virtualization: full device emulation, para-virtualization, and
direct I/O. Full device emulation is the first approach for I/O virtualization. Generally,
this approach emulates well-known, real-world devices.

All the functions of a device or bus infrastructure, such as device enumeration,


identification, interrupts, and DMA, are replicated in software. This software is
located in the VMM and acts as a virtual device. The I/O access requests of the guest
OS are trapped in the VMM which interacts with the I/O devices. The full device
emulation approach is shown in Figure 3.14.

A single hardware device can be shared by multiple VMs that run concurrently.
However, software emulation runs much slower than the hardware it emulates
[10,15]. The para-virtualization method of I/O virtualization is typically used in Xen.
It is also known as the split driver model consisting of a frontend driver and a backend
driver. The frontend driver is running in Domain U and the backend driver is running
in Domain 0. They interact with each other via a block of shared memory. The
frontend driver manages the I/O requests of the guest OSes and the backend driver is
responsible for managing the real I/O devices and multiplexing the I/O data of
different VMs. Although para-I/O-virtualization achieves better device performance
than full device emulation, it comes with a higher CPU overhead.

Direct I/O virtualization lets the VM access devices directly. It can achieve close-
to-native performance without high CPU costs. However, current direct I/O
virtualization implementations focus on networking for mainframes. There are a lot of
challenges for commodity hardware devices. For example, when a physical device is
reclaimed (required by workload migration) for later reassignment, it may have been
set to an arbitrary state (e.g., DMA to some arbitrary memory locations) that can
function incorrectly or even crash the whole system. Since software-based I/O
virtualization requires a very high overhead of device emulation, hardware-assisted
I/O virtualization is critical. Intel VT-d supports the remapping of I/O DMA transfers
and device-generated interrupts. The architecture of VT-d provides the flexibility to
support multiple usage models that may run unmodified, special-purpose,
or “virtualization-aware” guest OSes.

Another way to help I/O virtualization is via self-virtualized I/O (SV-IO) [47]. The
key idea of SV-IO is to harness the rich resources of a multicore processor. All tasks
associated with virtualizing an I/O device are encapsulated in SV-IO. It provides
virtual devices and an associated access API to VMs and a management API to the
VMM. SV-IO defines one virtual interface (VIF) for every kind of virtualized I/O
device, such as virtual network interfaces, virtual block devices (disk), virtual camera
devices, and others. The guest OS interacts with the VIFs via VIF device drivers.
Each VIF consists of two message queues. One is for outgoing messages to the
devices and the other is for incoming messages from the devices. In addition, each
VIF has a unique ID for identifying it in SV-IO.
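
As a rough illustration of the VIF abstraction described above, the Python sketch below models a virtual interface as a pair of message queues plus a unique ID. The class and method names are assumptions made for demonstration; they are not the actual SV-IO API.

from collections import deque
import itertools

_vif_ids = itertools.count(1)   # source of unique VIF IDs

class VirtualInterface:
    """A VIF in the spirit of SV-IO: two message queues and a unique ID."""
    def __init__(self, kind):
        self.vif_id = next(_vif_ids)
        self.kind = kind            # e.g. "network", "block", "camera"
        self.outgoing = deque()     # messages from the guest to the device
        self.incoming = deque()     # messages from the device to the guest

    def send(self, message):        # called by the guest-side VIF device driver
        self.outgoing.append(message)

    def receive(self):              # called by the guest-side VIF device driver
        return self.incoming.popleft() if self.incoming else None

vnic = VirtualInterface("network")
vnic.send(b"packet-bytes")
print(vnic.vif_id, len(vnic.outgoing))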

VMware Workstation for I/O Virtualization

The VMware Workstation runs as an application. It leverages the I/O device


support in guest OSes, host OSes, and VMM to implement I/O virtualization. The
application portion (VMApp) uses a driver loaded into the host operating system
(VMDriver) to establish the privileged VMM, which runs directly on the hardware. A
given physical processor is executed in either the host world or the VMM world, with
the VMDriver facilitating the transfer of control between the two worlds. The
VMware Workstation employs full device emulation to implement I/O virtualization.
Figure 3.15 shows the functional blocks used in sending and receiving packets via the
emulated virtual NIC.
The virtual NIC models an AMD Lance Am79C970A controller. The device driver
for a Lance controller in the guest OS initiates packet transmissions by reading and
writing a sequence of virtual I/O ports; each read or write switches back to the
VMApp to emulate the Lance port accesses. When the last OUT instruction of the
sequence is encountered, the Lance emulator calls a normal write() to the VMNet
driver. The VMNet driver then passes the packet onto the network via a host NIC and
then the VMApp switches back to the VMM. The switch raises a virtual interrupt to
notify the guest device driver that the packet was sent. Packet receives occur in
reverse.

6.5. Desktop Virtualization

Desktop virtualization is a method of simulating a user workstation so it can be


accessed from a remotely connected device. By abstracting the user desktop in this
way, organizations can allow users to work from virtually anywhere with a network
connection, using any desktop, laptop, tablet, or smartphone to access enterprise
resources without regard to the device or operating system employed by the remote
user.

Remote desktop virtualization is also a key component of digital workspaces. Virtual


desktop workloads run on desktop virtualization servers which typically execute
on virtual machines (VMs) either at on-premises data centers or in the public cloud.

Since the user device is basically a display, keyboard, and mouse, a lost or stolen
device presents a reduced risk to the organization. All user data and programs exist in
the desktop virtualization server, not on client devices.

6.6. Server Virtualization


Server virtualization is used to mask server resources from server users. This can
include the number and identity of operating systems, processors, and individual
physical servers.

Server Virtualization Definition


Server virtualization is the process of dividing a physical server into multiple unique
and isolated virtual servers by means of a software application. Each virtual server
can run its own operating systems independently.

Key Benefits of Server Virtualization:


● Higher server availability
● Cheaper operating costs
● Eliminate server complexity
● Increased application performance
● Deploy workload quicker

Three Kinds of Server Virtualization:

1. Full Virtualization: Full virtualization uses a hypervisor, a type of software


that directly communicates with a physical server's disk space and CPU. The
hypervisor monitors the physical server's resources and keeps each virtual
server independent and unaware of the other virtual servers. It also relays
resources from the physical server to the correct virtual server as it runs
applications. The biggest limitation of using full virtualization is that a
hypervisor has its own processing needs. This can slow down applications and
impact server performance.

2. Para-Virtualization: Unlike full virtualization, para-virtualization involves


the entire network working together as a cohesive unit. Since each operating
system on the virtual servers is aware of one another in para-virtualization, the
hypervisor does not need to use as much processing power to manage the
operating systems.
3. OS-Level Virtualization: Unlike full and para-virtualization, OS-level
virtualization does not use a hypervisor. Instead, the virtualization capability,
which is part of the physical server operating system, performs all the tasks of
a hypervisor. However, all the virtual servers must run the same operating
system in this server virtualization method.

Why Server Virtualization?


Server virtualization is a cost-effective way to provide web hosting services and
effectively utilize existing resources in IT infrastructure. Without server
virtualization, servers only use a small part of their processing power. This results in
servers sitting idle because the workload is distributed to only a portion of the
network’s servers. Data centres become overcrowded with underutilized servers,
causing a waste of resources and power.

By having each physical server divided into multiple virtual servers, server
virtualization allows each virtual server to act as a unique physical device. Each
virtual server can run its own applications and operating system. This process
increases the utilization of resources by making each virtual server act as a physical
server and increases the capacity of each physical machine.

Lab Exercise 1:
Specifically, I'm going to walk through the creation of a simple Python Flask app
that provides a RESTful web service. The service will provide an endpoint to:

● Ingest a JSON formatted payload (webhook) from Threat Stack


● Parse the payload for Threat Stack Alert IDs
● Retrieve detailed alert data from Threat Stack
● Archive the webhook and alert data to AWS S3

But before I jump in, there are a couple of things to keep in mind. First, I will not be
bothering with any sort of frontend display functionality, so you don't need to worry
about HTML or CSS. Second, the code organization follows Flask's own suggestions: I
am going to skip the single-module pattern and go straight to
the Packages and Blueprints model.

There is a large range of Flask tutorials. On one hand, there are tutorials that explain
how to build small, simple apps (where the entire app fits in a single file). On the
other hand, there are tutorials that explain how to build much larger, complicated
apps. This tutorial fills a sweet spot in the middle and demonstrates a structure that is
simple, but which can immediately accommodate increasingly complex requirements.

Project structure

The structure of the project that I'm going to build, which comes from Explore Flask,
is shown below:

Threatstack-to-s3

├── app

│ ├── __init__.py

│ ├── models

│ │ ├── __init__.py

│ │ ├── s3.py

│ │ └── threatstack.py

│ └── views

│ ├── __init__.py

│ └── s3.py

├── gunicorn.conf.py
├── requirements.osx.txt

├── requirements.txt

└── threatstack-to-s3.py

Top-level files

I'll start the discussion with the top-level files that are useful to me as I build the
service:

Gunicorn.conf.py: This is a configuration file for the Gunicorn WSGI HTTP server
that will serve up this app. While the application can run and accept connections on its
own, Gunicorn is more efficient at handling multiple connections and allowing the
app to scale with load.

Requirements.txt/requirements.osx.txt: The app's dependencies are listed in this


file. It is used by the pip utility to install the needed Python packages. For information
on installing dependencies, see the Setup section of this README.md.

Threatstack-to-s3.py: This is the application launcher. It can be run directly using


"python" if you are doing local debugging, or it can be passed as an argument to
"gunicorn" as the application entry point. For information on how to launch a service,
see README.md.

App package (app/ directory)

The app package is my application package. The logic for the application is
underneath this directory. As I mentioned earlier, I have chosen to break the app into
a collection of smaller modules rather than use a single, monolithic module file.

The following four usable modules defined in this package are:

● app
● app.views.s3
● app.models.threatstack
● app.models.s3

Note: app.views and app.models do not provide anything and their __init__.py files
are empty.

App module
The app module has the job of creating the Flask application. It exports a single
function, create_app(), that will create a Flask application object and configure it.
Currently it initializes application blueprints that correspond to my application views.
Eventually, create_app() will do other things such as initialize logging, but I'm
skipping that now for clarity and simplicity.

app/__init__.py

from flask import Flask

def _initialize_blueprints(application):
'''
Register Flask blueprints
'''
from app.views.s3 import s3
application.register_blueprint(s3, url_prefix='/api/v1/s3')

def create_app():
'''
Create an app by initializing components.
'''
application = Flask(__name__)

_initialize_blueprints(application)

# Do it!
return application

This module is used by threatstack-to-s3.py to start the application. It


imports create_app() and then uses it to create a Flask application instance.

threatstack-to-s3.py

#!/usr/bin/env python
from app import create_app

# Gunicorn entry point.


application = create_app()

if __name__ == '__main__':
# Entry point when run via Python interpreter.
print("== Running in debug mode ==")
application.run(host='localhost', port=8080, debug=True)
Views and Flask blueprints

Before discussing the remaining three modules, I'll talk about what views and Flask
blueprints are and then dive into the app.views.s3 module.

Views: Views are what the application consumer sees. There's no front end to this
application, but there is a public API endpoint. Think of a view as what can and
should be exposed to the person or thing (e.g., the consumer) who is using this
application. The best practice is to keep views as simple as possible. If an endpoint's
job is to take data in and copy it to S3, make it perform that function, but hide the
details of how that was done in the application models. Views should mostly represent
the actions a consumer wants to see happen, while the details (which consumers
shouldn't care about) live in the application models (described later).

Flask Blueprints: Earlier I said that I am going to use a Packages and Blueprints
layout instead of a single module application. Blueprints contain a portion of my API
endpoint structure. This lets me logically group related portions of my API. In my
case, each view module is its own blueprint.

Learn more

Modular Applications with Blueprints documentation on the Flask website.

Explore Flask is a book about best practices and patterns for developing web
applications with Flask.

App.views.s3 module

The threatstack-to-s3 service takes Threat Stack webhook HTTP requests in and
stores a copy of the alert data in S3. This is where I store the set of API endpoints that
allow someone to do this. If you look back at app/__init__.py, you will see that I
have rooted the set of endpoints at /api/v1/s3.

From app/__init__.py:

from app.views.s3 import s3

application.register_blueprint(s3, url_prefix='/api/v1/s3')

I used this path for a few reasons:

● API: To note that this is an API and I should not expect a front end. Maybe one
day I'll add a front end. Probably not, but I find this useful mentally and as a
sign to others
● V1: This is version 1 of the API. If I need to make breaking changes to
accommodate new requirements, I can add a v2 so that two APIs exist as I
migrate all consumers over to the new version
● S3: This is the service I'm connecting to and manipulating. I have some freedom
here to name this portion of the path whatever I want, but I like to keep it
descriptive. If the service was relaying data to HipChat, for example, I could
name this portion of the path hipchat

In app.views.s3, I am providing a single endpoint for now, /alert, which represents


the object I'm manipulating, and that responds only to the HTTP POST request
method.

Remember: When building APIs, URL paths should represent nouns and HTTP
request methods should represent verbs.

app/views/s3.py

'''
API to archive alerts from Threat Stack to S3
'''

from flask import Blueprint, jsonify, request


import app.models.s3 as s3_model
import app.models.threatstack as threatstack_model

s3 = Blueprint('s3', __name__)

@s3.route('/alert', methods=['POST'])
def put_alert():
'''
Archive Threat Stack alerts to S3.
'''
webhook_data = request.get_json()
for alert in webhook_data.get('alerts'):
alert_full = threatstack_model.get_alert_by_id(alert.get('id'))
s3_model.put_webhook_data(alert)
s3_model.put_alert_data(alert_full)

status_code = 200
success = True
response = {'success': success}

return jsonify(response), status_code


Now I'll walk through some key parts of the module. If you're familiar enough with
Python, you can skip the next few lines on imports, but if you're wondering why I
rename what I import, then follow along.

from flask import Blueprint, jsonify, request


import app.models.s3 as s3_model
import app.models.threatstack as threatstack_model

I'm a fan of typing brevity and consistency. I could have done this the following way
to import the model modules:

import app.models.s3
import app.models.threatstack

But that would mean I'd be using functions like:

app.models.s3.put_webhook_alert(alert)

I could have done this as well:

from app.models import s3, threatstack



However, this would break when I create the s3 Blueprint object a few lines later
because I'd overwrite the s3 model module.

s3 = Blueprint('s3', __name__)  # We've just overwritten the s3 module we imported.

For these reasons, importing the model modules and renaming them slightly is just
easier.

Now I'll walk through the app endpoint and function associated with it.

@s3.route('/alert', methods=['POST'])
def put_alert():
'''
Archive Threat Stack alerts to S3.
'''
The first line is called a decorator. I'm adding a route to the s3 Blueprint
called /alert (which expands to /api/v1/s3/alert) so that when an HTTP POST request
is made to it, put_alert() is called.

The body of the function is pretty simple:

● Get the request's JSON data


● Iterate over the array in the alerts key
● For each alert:
o Retrieve the alert detail from Threat Stack
o Store the alert info in the request in S3
o Store the alert detail in S3

webhook_data = request.get_json()
for alert in webhook_data.get('alerts'):
alert_full = threatstack_model.get_alert_by_id(alert.get('id'))
s3_model.put_webhook_data(alert)
s3_model.put_alert_data(alert_full)

Once that's done, I return a simple JSON doc back, indicating the success or failure of
the transaction. (Note: There's no error handling in place, so of course I've hardcoded
the success response and HTTP status code. I'll change that when error handling is
added at a later date.)

status_code = 200
success = True
response = {'success': success}

return jsonify(response), status_code



At this point, I've satisfied my request and done what the consumer requested. Notice
that I haven't included any code demonstrating how I fulfilled the request. What did I
have to do to get the alert's detail? What actions did I perform to store the alert? How
are the alerts stored and named in S3? The consumer doesn't really care about those
details. This is a good way to think about organizing your code in your own service:
What the consumer needs to know about should live in your view. The details the
consumer doesn't need to know should live in your model, which I am about to cover.
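
For completeness, here is how a consumer might exercise the endpoint once the service is running locally on port 8080 (as configured in threatstack-to-s3.py). The sample alert ID and payload shape simply follow the webhook format assumed by the view code above and are illustrative only.

import requests

# Hypothetical Threat Stack-style webhook payload: a list of alert IDs.
payload = {"alerts": [{"id": "587c0159a907346eccb84004", "created_at": 1484660460000}]}

resp = requests.post(
    "http://localhost:8080/api/v1/s3/alert",   # route registered in app/__init__.py
    json=payload,
    timeout=10,
)
print(resp.status_code, resp.json())           # expect: 200 {'success': True}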

Before discussing the remaining modules, I'll talk about models, which are how to
talk to the services I'm using, such as Threat Stack and S3.
Models

Models describe "things," and these "things" are what I want to perform actions on.
Typically, when you search for help on Flask models, blogs and documentation like to
use databases in their examples. While what I'm doing right now isn't far off, I'm just
storing data in an object store instead of a database. It's not the only sort of thing I
might do in the future with the data received from Threat Stack.

Additionally, I've chosen to skip an object-oriented approach in favor of a procedural


style. In more advanced Python, I would model an alert object and provide a means of
manipulating it. But this introduces more complexity than is needed for the given task
of storing data in S3 and also makes the code more complicated for demonstrating a
simple task. I've chosen brevity and clarity over technical correctness for this.

App.models.threatstack Module

The app.models.threatstack module, as you can guess, handles communication with


Threat Stack.

'''
Communicate with Threat Stack
'''
import os
import requests

THREATSTACK_BASE_URL = os.environ.get('THREATSTACK_BASE_URL', 'https://app.threatstack.com/api/v1')
THREATSTACK_API_KEY = os.environ.get('THREATSTACK_API_KEY')

def get_alert_by_id(alert_id):
'''
Retrieve an alert from Threat Stack by alert ID.
'''
alerts_url = '{}/alerts/{}'.format(THREATSTACK_BASE_URL, alert_id)

resp = requests.get(
alerts_url,
headers={'Authorization': THREATSTACK_API_KEY}
)

return resp.json()

Just a quick run through of a few spots of note:


THREATSTACK_BASE_URL = os.environ.get('THREATSTACK_BASE_URL', 'https://app.threatstack.com/api/v1')
THREATSTACK_API_KEY = os.environ.get('THREATSTACK_API_KEY')

I don't want to keep the Threat Stack API key in my code. This is just good clean
code/security living. I'm going to get the API key from my environment for now
because it's a quick and simple solution. At some point, I should centralize all
configuration in a single file instead of hiding it here, so the code and setup are a little
cleaner. That's a job for another time, and for now the setup is documented
in README.md.

def get_alert_by_id(alert_id):
'''
Retrieve an alert from Threat Stack by alert ID.
'''
alerts_url = '{}/alerts/{}'.format(THREATSTACK_BASE_URL, alert_id)

resp = requests.get(
alerts_url,
headers={'Authorization': THREATSTACK_API_KEY}
)

return resp.json()

The get_alert_by_id() function takes an alert ID, queries the Threat Stack platform
for the alert data, and returns that data. I'm using the Python requests module to make
an HTTP GET request to the Threat Stack API endpoint that returns alert info for the
given alert.

Read the Threat Stack API documentation.

App.models.s3 Module

The app.models.s3 module handles connectivity to AWS S3.

'''
Manipulate objects in AWS S3.
'''
import boto3
import json
import os
import time

TS_AWS_S3_BUCKET = os.environ.get('TS_AWS_S3_BUCKET')
TS_AWS_S3_PREFIX = os.environ.get('TS_AWS_S3_PREFIX', None)
def put_webhook_data(alert):
'''
Put alert webhook data in S3 bucket.
'''
alert_time = time.gmtime(alert.get('created_at')/1000)
alert_time_path = time.strftime('%Y/%m/%d/%H/%M', alert_time)
alert_key = '/'.join(['webhooks', alert_time_path, alert.get('id')])
if TS_AWS_S3_PREFIX:
alert_key = '/'.join([TS_AWS_S3_PREFIX, alert_key])

s3_client = boto3.client('s3')
s3_client.put_object(
Body=json.dumps(alert),
Bucket=TS_AWS_S3_BUCKET,
Key=alert_key
)

return None

def put_alert_data(alert):
'''
Put alert data in S3.
'''
alert_id = alert.get('id')
alert_key = '/'.join(['alerts',
alert_id[0:2],
alert_id[2:4],
alert_id
])

if TS_AWS_S3_PREFIX:
alert_key = '/'.join([TS_AWS_S3_PREFIX, alert_key])

s3_client = boto3.client('s3')
s3_client.put_object(
Body=json.dumps(alert),
Bucket=TS_AWS_S3_BUCKET,
Key=alert_key
)

return None

I'll walk through the interesting parts:

TS_AWS_S3_BUCKET = os.environ.get('TS_AWS_S3_BUCKET')
TS_AWS_S3_PREFIX = os.environ.get('TS_AWS_S3_PREFIX', None)
Again, there's no config file for this app, but I need to set an S3 bucket name and
optional prefix. I should fix this eventually—the setup is documented in
the README.md, which is good enough for now.

The functions put_webhook_data() and put_alert_data() have a lot of duplicate


code. I haven't refactored them because it's easier to see the logic before refactoring.
If you look closely, you'll realize that the only difference between them is how
the alert_key is defined. I'll focus on put_webhook_data():

def put_webhook_data(alert):
'''
Put alert webhook data in S3 bucket.
'''
alert_time = time.gmtime(alert.get('created_at')/1000)
alert_time_path = time.strftime('%Y/%m/%d/%H/%M', alert_time)
alert_key = '/'.join(['webhooks', alert_time_path, alert.get('id')])
if TS_AWS_S3_PREFIX:
alert_key = '/'.join([TS_AWS_S3_PREFIX, alert_key])

s3_client = boto3.client('s3')
s3_client.put_object(
Body=json.dumps(alert),
Bucket=TS_AWS_S3_BUCKET,
Key=alert_key
)

return None

This function takes in a single argument named alert. Looking back


at app/views/s3.py, alert is just the JSON data that was sent to the endpoint.
Webhook data is stored in S3 by date and time. The alert
587c0159a907346eccb84004 occurring at 2017-01-17 13:51 is stored in S3 as
webhooks/2017/01/17/13/51/587c0159a907346eccb84004.

I start by getting the alert time. Threat Stack has sent the alert time in milliseconds
since the Unix epoch, and that needs to be converted into seconds, which is how
Python handles time. I take that time and parse it into a string that will be the
directory path. I then join the top-level directory where I store webhook data, the
time-based path, and finally the alert ID to form the path to the webhook data in S3.

Boto 3 is the primary module in Python for working with AWS resources. I initialize
a boto3 client object so I can talk to S3 and put the object there.
The s3_client.put_object() is fairly straightforward with
its Bucket and Key arguments, which are the name of the S3 bucket and the path to
the S3 object I want to store. The Body argument is my alert converted back to a
string.
Lab Exercise :2

In this lab, you will learn how to install the Oracle Solaris 11.2 Image for Oracle
VM VirtualBox, the easiest way to get up and running with Oracle Solaris
11.2.

Prerequisites

This lab is the first in a series of labs for Oracle Solaris 11. All of the labs in the series
have these prerequisites in common:

● Operating system: Windows, Mac OS X, Linux, or Oracle Solaris on x86

● Memory: 2 GB of RAM

Before starting the lab, ensure you have installed the following:

● Download and install Oracle VM VirtualBox; see Installation Details for notes on
installing on any of the above operating systems.
● Download and install the Oracle VM VirtualBox Extension Pack and follow the
instructions at Installing VirtualBox and extension packs.
Also, you must enable hardware virtualization support in the BIOS. Oracle Solaris
depends on those capabilities.

Exercise 1: Download the Oracle Solaris 11 VM for Oracle VM


VirtualBox

Download the template (that is, the virtual machine [VM]) called Oracle Solaris 11.2
Oracle VM Template for Oracle VM VirtualBox.

Exercise 2: Import the Oracle Solaris 11.2 VM into Oracle VM VirtualBox

1. Start Oracle VM VirtualBox.

2. Select File > Import Appliance. (This lab shows screen shots on a Mac. Screens
might look slightly different on a PC.)
Figure 1. Selecting the appliance to import

Browse to the location where you downloaded the Oracle Solaris 11.2 VM and select
it. Notice that Figure 1 mentions the OVF format, but the downloaded file is a .ova
file, which is the entire archive (including OVF.xml). Click Continue.
● 3. Before importing the VM, check the memory settings.

Figure 2. Appliance settings screen

Scroll down to check how much memory is allocated to the image. Oracle Solaris
11.2 (or later) requires a minimum of 2 GB of memory.
Figure 3. Checking the amount of allocated memory

● 4. Click the Import button.


Exercise 3: Start the Oracle Solaris 11.2 VM

In this exercise, we will run Oracle Solaris 11 for the first time—getting a basic
understanding of what's there:

● 1.Select the Oracle Solaris 11.2 VM and click the green arrow labeled Start.
Figure 4. Starting the VM

On first boot, the System Configuration Tool runs, prompting you to enter system
information and providing an explanation of what is to follow. Note that during
installation, you have to actively switch between the VM and your host OS. Oracle
VM VirtualBox will open a window to explain this. After the VM boots, the
environments are integrated, so when you move the mouse pointer over the VM, any
input will be directed to the VM, and when you move the mouse pointer outside the
VM, subsequent input will go to the host OS.
Figure 5. First screen of the System Configuration Tool

● 3. The next screen will prompt you for the system name.
● 4. The third screen will prompt for networking settings. Choose Automatic.
● 5. Next will be three screens to set the time zone:
- Set the region.
- Set the country.
- Set the time zone.
● 6. Then set the date.
● 7. And select the keyboard.
● 8. Set the password. Make sure you enter user account information as well as the root
password. You will log in through the user account.
Figure 6. Screen for entering user account and password information

● 9. Next, there will be two screens to enable the Oracle Configuration Manager, that is,
'phone home' capability. Unless you enter your My Oracle Support credentials (e-mail
address and password) this data gathering facility will not be activated. (No specific
user information is collected and Oracle treats the data collected as a customer's
private information.)
● 10. Finally, you will be presented with a summary page:

Figure 7. Summary screen

● 11. Press F2 to apply the specified configuration, and then Oracle Solaris will
complete the configuration/boot process.
● 12. Log in to Oracle Solaris using the user account you set up in Step 8 above. For this
example, we created the user demo during the configuration step, so we now log into
that account.

Figure 8. Login screen

● 13. After logging in, you should see the blank background of the user's desktop. Bring
up a terminal window by clicking the icon that looks like a computer screen (located
on the left side of the top bar of the Oracle VM VirtualBox window).

Figure 9. Opening a terminal window

● 14. To get started, investigate the Oracle VM VirtualBox package by running the
command pkginfo -l SUNWvboxguest in the terminal window. The Oracle Solaris
guest additions package creates tighter integration between the host OS and Oracle
Solaris. For example, you can cut and paste text between the two operating systems.
You can also put Oracle Solaris into full-screen mode. Do this now by selecting
Machine > Switch to Fullscreen. Exiting full-screen mode is most easily accomplished
by moving your mouse cursor to the bottom middle of the screen, which will cause a
menu to appear.

Figure 10. Investigating the Oracle VM VirtualBox package

● 15. Next, enter the following command:
zfs list
This will print out data about all the pools and subpools created. Because there is only
one pool in use on this system, a more succinct way to get information is to just look
at rpool with the zpool(1M) command:
zpool list rpool

Figure 11. Listing data about all the pools and subpools

● 16. If you regularly use sudo(1), you can type in a command such as the following and
enter the password of the demo account:
sudo cat /etc/sudoers
Doing this will give you privileges to run as the root user for five minutes. The demo
account user attributes are in /etc/user_attr. If you look at the contents of the file, you
will notice that when the demo user was created, it was given the role of type=root.
Hence, it can operate with root privileges.
Exercise 4: Take a Snapshot

There are two ways to take a snapshot of your environment. The first is the traditional
mechanism for VMs—capture all the information of that machine so that you can start
it up from that saved state later. This includes a snapshot of the local file system.

To take a snapshot, from the VirtualBox menu, select Machine > Take
Snapshot. Give the snapshot a name and optional description:
Figure 12. Creating a snapshot
The other approach is using the capability of ZFS. There are two ways to do this: one
for system administrators and one for users.

Lab Exercise: 3

What is Web Service?


Web service is a standardized medium to propagate communication between the
client and server applications on the WWW (World Wide Web). A web service is a
software module that is designed to perform a certain set of tasks.

● Web services in cloud computing can be searched for over the network and
can also be invoked accordingly.
● When invoked, the web service would be able to provide the functionality to
the client, which invokes that web service.
How Web Services Work?


The above diagram shows a very simplistic view of how a web service would actually
work. The client would invoke a series of web service calls via requests to a server
which would host the actual web service.
These requests are made through what is known as remote procedure calls. Remote
Procedure Calls (RPC) are calls made to methods which are hosted by the relevant
web service.
As an example, Amazon provides a web service that provides prices for products sold
online via amazon.com. The front end or presentation layer can be in .Net or Java but
either programming language would have the ability to communicate with the web
service.
The main component of a web service design is the data which is transferred between
the client and the server, and that is XML. XML (Extensible Markup Language) is a
counterpart to HTML: an easy-to-understand intermediate language that is
understood by many programming languages.
So when applications talk to each other, they actually talk in XML. This provides a
common platform for applications developed in various programming languages to talk
to each other.
Web services use something known as SOAP (Simple Object Access Protocol) for
sending the XML data between applications. The data is sent over normal HTTP. The
data which is sent from the web service to the application is called a SOAP message.
The SOAP message is nothing but an XML document. Since the document is written
in XML, the client application calling the web service can be written in any
programming language.
Why do you need a Web Service?
Modern-day business applications use a variety of programming platforms to develop
web-based applications. Some applications may be developed in Java, others in .Net,
while some other in Angular JS, Node.js, etc.
More often than not, these heterogeneous applications need some sort of
communication to happen between them. Since they are built using different
development languages, it becomes really difficult to ensure accurate communication
between applications.
Here is where web services come in. Web services provide a common platform that
allows multiple applications built on various programming languages to have the
ability to communicate with each other.
Type of Web Service
There are mainly two types of web services.

1. SOAP web services.


2. RESTful web services.

In order for a web service to be fully functional, there are certain components that
need to be in place. These components need to be present irrespective of whatever
development language is used for programming the web service.
Let's look at these components in more detail.
SOAP (Simple Object Access Protocol)
SOAP is known as a transport-independent messaging protocol. SOAP is based on
transferring XML data as SOAP Messages. Each message has something which is
known as an XML document. Only the structure of the XML document follows a
specific pattern, but not the content. The best part of Web services and SOAP is that
it is all sent via HTTP, which is the standard web protocol.
Here is what a SOAP message consists of

● Each SOAP document needs to have a root element known as the <Envelope>
element. The root element is the first element in an XML document.
● The "envelope" is in turn divided into 2 parts. The first is the header, and the
next is the body.
● The header contains the routing data which is basically the information which
tells to which client the XML document needs to be sent.
● The body will contain the actual message.

The diagram below shows a simple example of the communication via SOAP (SOAP
Protocol). We will discuss SOAP in more detail later.
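
As a hedged illustration of SOAP over HTTP, the Python snippet below builds a minimal SOAP 1.1 envelope by hand and posts it with the requests library. The endpoint URL, the SOAPAction value and the body element are hypothetical placeholders, not a real service.

import requests

# Minimal SOAP 1.1 envelope; the body element and endpoint are hypothetical.
soap_body = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header/>
  <soap:Body>
    <TutorialRequest>
      <TutorialID>T100</TutorialID>
    </TutorialRequest>
  </soap:Body>
</soap:Envelope>"""

resp = requests.post(
    "http://example.com/tutorialservice",   # hypothetical SOAP endpoint
    data=soap_body,
    headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": "Tutorial"},
    timeout=10,
)
print(resp.status_code)
print(resp.text)   # the SOAP response message, also an XML document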
WSDL (Web services description language)
A web service cannot be used if it cannot be found. The client invoking the web
service should know where the web service actually resides.
Secondly, the client application needs to know what the web service actually does, so
that it can invoke the right web service. This is done with the help of the WSDL,
known as the Web services description language. The WSDL file is again an XML-
based file which basically tells the client application what the web service does. By
using the WSDL document, the client application would be able to understand where
the web service is located and how it can be utilized.
Web Service Example
A Web services example of a WSDL file is given below.
<definitions>
<message name="TutorialRequest">
<part name="TutorialID" type="xsd:string"/>
</message>

<message name="TutorialResponse">
<part name="TutorialName" type="xsd:string"/>
</message>

<portType name="Tutorial_PortType">
<operation name="Tutorial">
<input message="tns:TutorialRequest"/>
<output message="tns:TutorialResponse"/>
</operation>
</portType>
<binding name="Tutorial_Binding" type="tns:Tutorial_PortType">
<soap:binding style="rpc"
transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="Tutorial">
<soap:operation soapAction="Tutorial"/>
<input>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:Tutorialservice"
use="encoded"/>
</input>

<output>
<soap:body
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
namespace="urn:examples:Tutorialservice"
use="encoded"/>
</output>
</operation>
</binding>
</definitions>
The important aspects to note about the above WSDL declaration examples of web
services are as follows:

1. <message> - The message parameter in the WSDL definition is used to define


the different data elements for each operation performed by the web service.
So in the web services examples above, we have 2 messages which can be
exchanged between the web service and the client application, one is the
"TutorialRequest", and the other is the "TutorialResponse" operation. The
TutorialRequest contains an element called "TutorialID" which is of the type
string. Similarly, the TutorialResponse operation contains an element called
"TutorialName" which is also a type string.
2. <portType> - This actually describes the operation which can be performed
by the web service, which in our case is called Tutorial. This operation can
take 2 messages; one is an input message, and the other is the output message.
3. <binding> - This element contains the protocol which is used. So in our case,
we are defining it to use http (http://schemas.xmlsoap.org/soap/http). We
also specify other details for the body of the operation, like the namespace and
whether the message should be encoded.
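
One way to see these WSDL elements in action is with a Python SOAP client such as zeep, which reads the WSDL and generates the call for you. The WSDL URL below is a hypothetical placeholder for wherever the Tutorial service description might be hosted; treat this as a sketch rather than a working client.

from zeep import Client   # pip install zeep

# zeep parses the <message>, <portType> and <binding> sections of the WSDL
# and exposes each <operation> as a method on client.service.
client = Client("http://example.com/tutorialservice?wsdl")   # hypothetical WSDL location

# Invokes the "Tutorial" operation: sends a TutorialRequest (TutorialID)
# and returns the TutorialResponse (TutorialName).
tutorial_name = client.service.Tutorial(TutorialID="T100")
print(tutorial_name)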

Universal Description, Discovery, and Integration (UDDI)


UDDI is a standard for describing, publishing, and discovering the web services that
are provided by a particular service provider. It provides a specification which helps
in hosting the information on web services.
Now we discussed in the previous topic about WSDL and how it contains information
on what the Web service actually does. But how can a client application locate a
WSDL file to understand the various operations offered by a web service? So UDDI
is the answer to this and provides a repository on which WSDL files can be hosted. So
the client application will have complete access to the UDDI, which acts as a database
containing all the WSDL files.
Just as a telephone directory has the name, address and telephone number of a
particular person, the UDDI registry has the relevant information for the web service,
so that a client application knows where it can be found.
Web Services Advantages
We already understand why web services came about in the first place, which was to
provide a platform which could allow different applications to talk to each other.
But let's look at the list of web services advantages for why it is important to use web
services.
1. Exposing Business Functionality on the network - A web service is a unit of
managed code that provides some sort of functionality to client applications or
end users. This functionality can be invoked over the HTTP protocol which
means that it can also be invoked over the internet. Nowadays all applications
are on the internet which makes the purpose of Web services more useful.
That means the web service can be anywhere on the internet and provide the
necessary functionality as required.
2. Interoperability amongst applications - Web services allow various
applications to talk to each other and share data and services among
themselves. All types of applications can talk to each other. So instead of
writing specific code which can only be understood by specific applications,
you can now write generic code that can be understood by all applications
3. A Standardized Protocol which everybody understands - Web services use
standardized industry protocol for the communication. All the four layers
(Service Transport, XML Messaging, Service Description, and Service
Discovery layers) uses well-defined protocols in the web services protocol
stack.
4. Reduction in cost of communication - Web services use SOAP over HTTP
protocol, so you can use your existing low-cost internet for implementing web
services.
Web Services Architecture
Every framework needs some sort of architecture to make sure the entire framework
works as desired, and the same is true of web services. The Web Services Architecture
consists of three distinct roles, as given below:

1. Provider - The provider creates the web service and makes it available to
client application who want to use it.
2. Requestor - A requestor is nothing but the client application that needs to
contact a web service. The client application can be a .Net, Java, or any other
language based application which looks for some sort of functionality via a
web service.
3. Broker - The broker is nothing but the application which provides access to
the UDDI. The UDDI, as discussed in the earlier topic enables the client
application to locate the web service.

The diagram below showcases how the Service provider, the Service requestor and
Service registry interact with each other.
Web Services Architecture

1. Publish - A provider informs the broker (service registry) about the existence
of the web service by using the broker's publish interface to make the service
accessible to clients
2. Find - The requestor consults the broker to locate a published web service
3. Bind - With the information it gained from the broker(service registry) about
the web service, the requestor is able to bind, or invoke, the web service.

Web service Characteristics


Web services have the following special behavioral characteristics:
1. They are XML-Based - Web Services uses XML to represent the data at the
representation and data transportation layers. Using XML eliminates any
networking, operating system, or platform sort of dependency since XML is
the common language understood by all.
2. Loosely Coupled – Loosely coupled means that the client and the web service
are not bound to each other, which means that even if the web service changes
over time, it should not change the way the client calls the web service.
Adopting a loosely coupled architecture tends to make software systems more
manageable and allows simpler integration between different systems.
3. Synchronous or Asynchronous functionality - Synchronicity refers to the
binding of the client to the execution of the service. In synchronous
operations, the client will actually wait for the web service to complete an
operation. An example of this is probably a scenario wherein a database read
and write operation are being performed. If data is read from one database and
subsequently written to another, then the operations have to be done in a
sequential manner. Asynchronous operations allow a client to invoke a service
and then execute other functions in parallel. This is one of the common and
probably the most preferred techniques for ensuring that other services are not
stopped when a particular operation is being carried out.
4. Ability to support Remote Procedure Calls (RPCs) - Web services enable
clients to invoke procedures, functions, and methods on remote objects using
an XML-based protocol. Remote procedures expose input and output
parameters that a web service must support.
5. Supports Document Exchange - One of the key benefits of XML is its
generic way of representing not only data but also complex documents. These
documents can be as simple as representing a current address, or they can be
as complex as representing an entire book.

MCAD22E3 CLOUD COMPUTING

MODULE 7

7.1. Resource Provisioning and Methods


7.2. Cloud Management Products
7.3. Cloud Storage
7.4. Provisioning Cloud Storage
7.5. Managed and Unmanaged Cloud Storage

7.1. Resource Provisioning and Methods

Cloud Provisioning
Cloud provisioning is the allocation of a cloud provider's resources and services to a
customer.

Cloud provisioning is a key feature of the cloud computing model, relating to how a
customer procures cloud services and resources from a cloud provider. The growing
catalog of cloud services that customers can provision includes infrastructure as a
service (IaaS), software as a service (SaaS) and platform as a service (PaaS) in public
or private cloud environments.

Types of cloud provisioning


The cloud provisioning process can be conducted using one of three delivery models.
Each delivery model differs depending on the kinds of resources or services an
organization purchases, how and when the cloud provider delivers those resources or
services, and how the customer pays for them. The three models are advanced
provisioning, dynamic provisioning and user self-provisioning.

With advanced provisioning, the customer signs a formal contract of service with the
cloud provider. The provider then prepares the agreed-upon resources or services for
the customer and delivers them. The customer is charged a flat fee or is billed on a
monthly basis.

With dynamic provisioning, cloud resources are deployed flexibly to match a


customer's fluctuating demands. Cloud deployments typically scale up to
accommodate spikes in usage and scale down when demands decrease. The customer
is billed on a pay-per-use basis. When dynamic provisioning is used to create a hybrid
cloud environment, it is sometimes referred to as cloud bursting.

With user self-provisioning, also called cloud self-service, the customer buys
resources from the cloud provider through a web interface or portal. This usually
involves creating a user account and paying for resources with a credit card. Those
resources are then quickly spun up and made available for use -- within hours, if not
minutes. Examples of this type of cloud provisioning include an employee purchasing
cloud-based productivity applications via the Microsoft 365 suite or G Suite.

Cloud provisioning can be conducted in one of three processes: advanced, dynamic and
user self-provisioning.
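
A small sketch of self-provisioning through an API rather than a web portal: the boto3 call below asks Amazon EC2 for a single small virtual machine. The AMI ID, key pair name and region are placeholders that differ per account, so treat this as an illustration of the pattern rather than a ready-to-run script.

import boto3

# Programmatic self-provisioning: request one small virtual machine from EC2.
ec2 = boto3.client("ec2", region_name="us-east-1")        # region is a placeholder

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",                      # placeholder AMI ID
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",                                 # placeholder key pair
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "self-provisioned-demo"}],
    }],
)
print(response["Instances"][0]["InstanceId"])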
Why cloud provisioning matters
Cloud provisioning offers organizations numerous benefits that aren't available with
traditional provisioning approaches.

A commonly cited benefit is scalability. In the traditional IT provisioning model, an


organization makes large investments in its on-premises infrastructure. This requires
extensive preparation and forecasting of infrastructure needs, as the on-premises
infrastructure is often set up to last for several years. In the cloud provisioning model,
however, organizations can simply scale up and scale down their cloud resources
based on short-term usage requirements.

Organizations can also benefit from cloud provisioning's speed. For example, an
organization's developers can quickly spin up an array of workloads on demand,
removing the need for an IT administrator who provisions and manages compute
resources.

Another benefit of cloud provisioning is the potential cost savings. While traditional
on-premises technology can exact large upfront investments from an organization,
many cloud providers allow customers to pay for only what they consume. However,
the attractive economics of cloud services can present their own challenges,
which organizations should address in a cloud management strategy.

Challenges of cloud provisioning


Cloud provisioning presents several challenges to organizations.

Complex management and monitoring. Organizations may need to rely on multiple provisioning tools to customize how they use cloud resources. Many enterprises also
deploy workloads on more than one cloud platform, which makes it even more
challenging to have a central console to view everything.

Resource and service dependencies. Applications and workloads in the cloud often
tap into basic cloud infrastructure resources, such as compute, networking and
storage. Beyond those, public cloud providers' big selling point is in higher-level
ancillary services, such as serverless functions, machine learning and big data
capabilities. However, those services may carry dependencies that might not be
obvious, which can lead to unexpected overuse and surprise costs.

Policy enforcement. A self-service provisioning model helps streamline how users request and manage cloud resources but requires strict rules to ensure they don't
provision resources they shouldn't. Recognize that different groups of users require
different levels of access and frequency -- a DevOps team may deploy multiple daily
updates, while line-of-business users might use self-service provisioning on a
quarterly basis. Set up rules that govern who can provision which types of resources,
for what duration and with what budgetary controls, including a chargeback system.

Adherence to policies also creates consistency in cloud provisioning. For example, specify related steps such as backup, monitoring and integration with a configuration
management database -- even agreed-upon naming conventions when a resource is
provisioned to ensure consistency for management and monitoring.

Cost controls. Beyond provisioning policies, automated monitoring and alerts about
usage and pricing thresholds are essential. Be aware that these might not be real-time
warnings; in fact, an alert about an approaching budget overrun for a cloud service
could arrive hours or days after the fact.
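
As an illustration of such an automated alert, the boto3 sketch below creates a CloudWatch alarm on AWS's EstimatedCharges billing metric. It assumes billing alerts are enabled on the account (these metrics are published only in the us-east-1 region), and the threshold and SNS topic ARN are placeholders.

# Sketch: alert when the estimated AWS bill crosses a threshold.
# Assumes billing metrics are enabled; the SNS topic ARN is a placeholder.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="monthly-spend-over-500-usd",
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    Statistic="Maximum",
    Period=21600,              # evaluate every 6 hours
    EvaluationPeriods=1,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:billing-alerts"],  # placeholder ARN
)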
Cloud provisioning tools and software
Organizations can manually provision whatever resources and services they need, but
public cloud providers offer tools to provision multiple resources and services:

● AWS CloudFormation

● Microsoft Azure Resource Manager

● Google Cloud Deployment Manager

● IBM Cloud Orchestrator

Alternatively, third-party tools for cloud resource provisioning include the following:

● CloudBolt

● Snow (formerly Embotics) Commander

● Morpheus Data

● Flexera (formerly RightScale)

● CloudSphere (formerly HyperGrid and iQuate)

● Scalr

Some organizations further automate the provisioning process as part of a broader cloud management strategy through orchestration and configuration management
tools, such as HashiCorp's Terraform, Red Hat Ansible, Chef and Puppet.
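
These tools broadly follow a desired-state model: the operator declares what should exist, and the tool reconciles the real environment against that declaration. The toy Python sketch below is not Terraform, Ansible, Chef or Puppet; it only illustrates the reconcile loop such tools automate, with hypothetical create_server and delete_server helpers standing in for real provider calls.

# Toy desired-state reconciliation loop (illustrative only).
# create_server/delete_server are hypothetical stand-ins for provider API calls.

desired = {"web-1", "web-2", "worker-1"}          # declared configuration
actual = {"web-1", "worker-1", "worker-old"}      # what currently exists

def create_server(name):
    print(f"creating {name}")

def delete_server(name):
    print(f"deleting {name}")

def reconcile(desired, actual):
    for name in sorted(desired - actual):   # missing resources get created
        create_server(name)
    for name in sorted(actual - desired):   # unmanaged leftovers get removed
        delete_server(name)

reconcile(desired, actual)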

7.2. Cloud Management Products


Cloud Management products and emerging standards
Cloud Management

Several providers have products designed for cloud computing management (VMware, OpenQRM, CloudKick, and Managed Methods), along with the big players
like BMC, HP, IBM Tivoli and CA. Each uses a variety of methods to warn of
impending problems or send up the red flag when a sudden problem occurs. Each also
tracks performance trends.

While they all have features that differentiate them from each other, they’re also
focused on one key concept: providing information about cloud computing systems. If
your needs run into provisioning, the choices become more distinct than choosing
“agent vs. agentless” or “SNMP vs. WBEM.”
The main cloud infrastructure management products offer similar core features:

● Most support different cloud types (often referred to as hybrid clouds).


● Most support the on-the-fly creation and provisioning of new objects and the
destruction of unnecessary objects, like servers, storage, and/or apps.
● Most provide the usual suite of reports on status (uptime, response time, quota
use, etc.) and have a dashboard that can be drilled into.

When it comes to meeting those three criteria, there are a few vendors that offer
pervasive approaches in handling provisioning and managing metrics in hybrid
environments: RightScale, Kaavo, Zeus, Scalr and Morph. There are also options
offered by cloud vendors themselves that meet the second and third criteria, such as
CloudWatch from Amazon Web Services.

The large companies known for their traditional data center monitoring applications
have been slow to hit the cloud market, and what products they do have are rehashes
of existing applications that do little in the way of providing more than reporting and
alerting tools. CA is on an acquisition spree to fix this and just acquired 3Tera, a
cloud provisioning player.

An example of the confusion in the industry is IBM’s Tivoli product page for cloud
computing. You’ll notice that clicking the Getting Started tab results in a 404 error.
Nice work, IBM.

Meanwhile, HP’s OpenView (now called Operations Manager) can manage cloud-
based servers, but only insofar as it can manage any other server. BMC is working on
a cloud management tool, but doesn’t have anything beyond its normal products out at
the moment.

In place of these behemoths, secondary players making splashes on the market are
offering monitoring-focused applications from companies like Scout, UpTime
Systems, Cloudkick, NetIQ and ScienceLogic. There is also the “app formerly known
as” Hyperic, now owned by VMware through the acquisition of SpringSource.

In truth, we could rival John Steinbeck and Robert Jordan in word count when it
comes to writing about all the products in this field, though within a year or two it
should be a much smaller space as acquisitions occur, companies fail and the market
sorts itself out. There’s a lot on the way in cloud computing, not the least of which is
specifications. Right now the cloud is the Wild West: vast, underpopulated, and
lacking order except for a few spots of light.

7.3. Cloud Storage


These are the best infrastructure management and provisioning options available
today:

RightScale
RightScale is the big boy on the block right now. Like many vendors in the nascent
market, they offer a free edition with limitations on features and capacity, designed to
introduce you to the product (and maybe get you hooked, à la K.C. Gillette's famous
business model at the turn of the 20th century). RightScale’s product is broken down
into four components:

● Cloud Management Environment


● Cloud-Ready ServerTemplate and Best Practice Deployment Library
● Adaptable Automation Engine
● Multi-Cloud Engine

A fifth feature states that the “Readily Extensible Platform supports programmatic
access to the functionality of the RightScale Platform.” In looking at the product,
these features aren’t really separate from one another, but make a nice, integrated
offering.

RightScale’s management environment is the main interface users will have with the
software. It is designed to walk a user through the initial process of migrating to the
cloud using their templates and library. The management environment is then used for
(surprise!) managing that environment, namely continuing builds and ensuring
resource availability. This is where the automation engine comes into play: being able
to quickly provision and put into operation additional capacity, or remove that excess
capacity, as needed. Lastly, there is the Multi-Cloud Engine, supporting Amazon,
GoGrid, Eucalyptus and Rackspace.

RightScale is also working on supporting the Chef open-source systems integration specifications, as well. Chef is designed from the ground up for the cloud.

Kaavo
Kaavo plays in a very similar space to RightScale. The product is typically used for:

● Single-click deployment of complex multi-tier applications in the cloud (Dev, QA, Prod)
● Handling demand bursts/variations by automatically adding/removing
resources
● Run-time management of application infrastructure in the cloud

● Encryption of persisted data in the cloud


● Automation of workflows to handle run-time production exceptions without
human intervention

The core of Kaavo's product is called IMOD. IMOD handles configuration, provisioning and changes (adjustments in their terminology) to the cloud
environment, and across multiple vendors in a hybrid model. Like all
major CIM players, Kaavo’s IMOD sits at the “top” of the stack, managing the
infrastructure and application layers.

One great feature in IMOD is its multi-cloud, single system tool. For instance, you
can create a database backend in Rackspace while putting your presentation servers
on Amazon. Supporting Amazon and Rackspace in the public space and Eucalyptus in
the private space is a strong selling point, though it should be noted that most cloud
management can support Eucalyptus if it can also support Amazon, as
Eucalyptus mimics Amazon EC2 very closely.

Both Kaavo and RightScale offer scheduled "ramp-ups" or "ramp-downs" (dynamic allocation based on demand) and monitoring tools to ensure that information and
internal metrics (like SLAs) are transparently available. The dynamic allocation even
helps meet the demands of those SLAs. Both offer the ability to keep templates as
well to ease the deployment of multi-tier systems.

Zeus
Zeus was famous for its rock-solid Web server, one that didn’t have a lot of market
share but did have a lot of fanatical fans and top-tier customers. With Apache, and to
a lesser-extent, IIS, dominating that market, not to mention the glut of load balancers
out there, Zeus took its expertise in the application server space and came up with
the Application Delivery Controller piece of the Zeus Traffic Controller. It uses
traditional load balancing tools to test availability and then spontaneously generate or
destroy additional instances in the cloud, providing on-the-fly provisioning. Zeus
currently supports this on the Rackspace and, to a lesser extent, Amazon platforms.

Scalr
Scalr is a young project hosted on Google Code and Scalr.net that creates dynamic
clusters, similar to Kaavo and RightScale, on the Amazon platform. It supports
triggered upsizing and downsizing based on traffic demands, snapshots (which can be
shared, incidentally, a very cool feature), and the custom building of images for each
server or server-type, also similar to RightScale. Being a new release, Scalr does not
support the wide number of platforms, operating systems, applications, and databases
that the largest competitors do, sticking to the traditional expanded-LAMP
architecture (LAMP plus Ruby, Tomcat, etc.) that comprises many content systems.

Morph
While not a true management platform, the MSP-minded Morph product line offers similar functionality in its own private space. Morph CloudServer is a newer product on the market, filling the management and provisioning space as an appliance. It is aimed at the enterprise seeking to deploy a private cloud. Its top-tier product, the Morph CloudServer, is based on the IBM BladeCenter and supports hundreds of virtual machines.

Under the hood are an Ubuntu Linux operating system and the Eucalyptus cloud
computing platform. Aimed at the managed service provider market, Morph allows
for the creation of private clouds and the dynamic provisioning within those closed
clouds. While still up-and-coming, Morph has made quite a splash and bears
watching, particularly because of its open-source roots and participation in open-
cloud organizations.

CloudWatch
Amazon’s CloudWatch works on Amazon’s platform only, which limits its overall
usefulness as it cannot be a hybrid cloud management tool. Since Amazon’s Elastic
Compute Cloud (EC2) is the biggest platform out there (though Rackspace claims it is
closing that gap quickly), it still bears mentioning.

CloudWatch for EC2 supports dynamic provisioning (called auto-scaling), monitoring, and load-balancing, all managed through a central management console
— the same central management console used by Amazon Web Services. Its biggest
advantage is that it requires no additional software to install and no additional website
to access applications through. While the product is clearly not for enterprises that
need hybrid support, those that exclusively use Amazon should know that it is as
robust and functional as the other market players.
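
As a small illustration of the monitoring data CloudWatch exposes programmatically, the boto3 sketch below reads the average CPU utilization of one EC2 instance over the last hour. The instance ID is a placeholder and AWS credentials are assumed to be configured.

# Sketch: read average CPU utilization for one EC2 instance via CloudWatch.
# The instance ID is a placeholder; assumes AWS credentials are configured.
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,                 # 5-minute buckets
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 2), "%")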

Emerging Standards

Cloud standards are still evolving and are being developed by a range of U.S. and European standards organizations, including the Open Data Center Alliance, the Distributed Management Task Force (DMTF), the standards consortium OMG, the storage and networking standards group SNIA, and the European telecommunications and network standards group, ETSI.

All of the organizations are doing their own work with end users and vendors to
establish cloud standards, which are then discussed among the organizations.

Cloud Management Standards by DMTF

DMTF is working to address management interoperability for cloud systems.

Technologies like cloud computing and virtualization are rapidly being adopted by
enterprise IT managers to better deliver services to their customers, lower IT costs and
improve operational efficiencies.

DMTF’s Cloud Management Initiative is focused on developing interoperable cloud


infrastructure management standards and promoting adoption of those standards in the
industry. The work of DMTF working groups promoted by the Cloud Management
Initiative is focused on achieving interoperable cloud infrastructure management
between cloud service providers and their consumers and developers.
Cloud Working Groups

DMTF's Cloud Management Initiative is promoting the work of the Cloud Management Work Group, the Cloud Auditing Data Federation Working Group, the
System Virtualization, Partitioning, Clustering Working Group and the Software
Entitlement Working Group. Click on the links below to see an overview of each
work group’s activities, work group charter, specifications, work-in-progress
specifications and other technical documents.

● Cloud Management Working Group


● Cloud Auditing Data Federation Working Group
● Software Entitlement Working Group
● System Virtualization, Partitioning, and Clustering Working Group
7.4. Provisioning Cloud Storage
What Does Cloud Provisioning Mean?
Cloud provisioning refers to the processes for the deployment and integration of cloud
computing services within an enterprise IT infrastructure. This is a broad term that
incorporates the policies, procedures and an enterprise’s objective in sourcing cloud
services and solutions from a cloud service provider.

Cloud Provisioning Explained


Cloud provisioning primarily defines how, what and when an organization will
provision cloud services. These services can be internal, public or hybrid cloud
products and solutions. There are three different delivery models:

● Dynamic/On-Demand Provisioning: The customer or requesting application is provided with resources at run time.
● User Provisioning: The user/customer provisions a cloud device or service themselves.
● Post-Sales/Advanced Provisioning: The customer is provided with the resource upon contract/service signup.

From a provider’s standpoint, cloud provisioning can include the supply and
assignment of required cloud resources to the customer. For example, the creation of
virtual machines, the allocation of storage capacity and/or granting access to cloud
software.
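
Seen from the SDK level, provisioning cloud storage often amounts to a handful of API calls. The boto3 sketch below creates an Amazon S3 bucket with default server-side encryption and writes one object into it; the bucket name and region are placeholders, and bucket names must be globally unique.

# Sketch: provision a storage bucket and store one object (boto3 / Amazon S3).
# Bucket name and region are placeholders; bucket names must be globally unique.
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")
bucket = "example-cloud-notes-bucket"   # placeholder name

s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},  # omit in us-east-1
)

# Turn on default server-side encryption for everything written to the bucket.
s3.put_bucket_encryption(
    Bucket=bucket,
    ServerSideEncryptionConfiguration={
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
    },
)

s3.put_object(Bucket=bucket, Key="notes/hello.txt", Body=b"hello cloud storage")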

7.5. Managed and Unmanaged Cloud Storage


Managed vs Unmanaged Cloud

Which cloud service is best for your business?


There is no doubt that the pandemic is fast-tracking digital transformation and
cloud adoption for businesses across the globe. But whilst migrating to the
cloud is the first step, managing a complex cloud environment can be a much
more difficult task.

What is an unmanaged cloud?


An unmanaged cloud provides the core cloud services but with a ‘do it
yourself’ model. The customer rents access to the infrastructure, but they are
responsible for running it as well as all the tools and applications that run on
top of it.
Some examples of unmanaged clouds are AWS and Azure. A lot of unmanaged
clouds have a pay-as-you-go pricing model, which can be appealing as a way to
avoid wasting resources and money. However, using an unmanaged cloud does
mean that you require an expert in-house IT team in order to manage the cloud,
which can be expensive.
What is a managed cloud?
With a managed cloud, the hosting provider acts as an extension of your IT
team, removing the need for an in-house team.
Using a managed cloud provider means that you will have a team of expert
engineers at your disposal to implement a tailor-made cloud solution. Cloud
design, configuration, storage and networking are all managed for you by the
provider, as well as having a predictable billing model each month.

Managed hosting providers take away the worry of keeping your cloud
environment running efficiently, as well as monitoring your platform for cyber
threats and any potential security issues.
Hyve’s managed cloud
Hyve takes on the management of your cloud environment, helping with all
stages of the process including designing and planning your cloud
infrastructure, migration, and ongoing monitoring and management. We
proactively monitor your cloud and manage it up to the application layer,
saving you time and money.
● Reliability
On our high-availability VMware cloud, we guarantee 99.999% uptime SLA as
well as 100% power and network uptime SLA. We also offer a 20-minute
hardware replacement SLA for any faulty hardware.

● Scalability
Sudden traffic spikes without the correct scalability can cause a system to fail.
With a managed cloud you can rest assured that your hosting provider will be
monitoring traffic 24/7/365, and will scale out your cloud platform as and when it
is required.

● Control costs
Having a managed cloud provider removes the need for an in-house IT team.
Hiring and training staff is expensive, so a managed cloud is a great way to
reduce expenses. If you use an unmanaged cloud without the correct expertise in
place to manage it, costs can easily spin out of control, leaving you with a large
bill at the end of the month.

● Reduce risk
Using a managed provider helps to reduce risks to your business and prepares
you for any potential pitfalls. Managed providers will be experts in all areas of
cloud hosting and cybersecurity, and will be able to assist with any custom or
complex solutions.

● 24/7 support
With a managed hosting provider you will have a team of expert engineers
dedicated to your business. They will proactively monitor your cloud platform
and be available to assist you 24/7/365.

Managed v unmanaged
Managing a cloud platform, whether public or private, can seem relatively simple, but a cloud platform running a variety of services and applications can be surprisingly complex to manage.

Although an unmanaged cloud often seems like the cheaper option, issues are
likely to occur along the way that need to be resolved by an expert. Having a
dedicated team of in-house IT experts working 24/7 is costly, and without this,
your cloud platform could be at risk. Managed providers become an extension
of your business and provide you with all you need for your business to
succeed.
Some customers also choose to retain some in-house engineers to do frontline
support or work on specific projects, but our technical support team fills any IT skills gaps.
Managed Public Cloud
However, if you prefer the setup of a public cloud, Hyve can offer
a management layer for AWS, Google and Azure clouds. Our expert
engineers can work with your business to create a deployment or migration
plan and add the management layer on top, helping you optimise your public
cloud platform without needing an in-house IT team.

MCAD22E3 CLOUD COMPUTING

MODULE 8

8.1. Cloud Security Overview


8.2. Cloud Security Challenges
8.3. Virtual Machine Security
8.4. Application Security
8.5. Data Security

8.1 Cloud security defined


Cloud security, also known as cloud computing security, consists of a set of policies,
controls, procedures and technologies that work together to protect cloud-based
systems, data, and infrastructure. These security measures are configured to protect
cloud data, support regulatory compliance and protect customers' privacy as well as
setting authentication rules for individual users and devices. From authenticating
access to filtering traffic, cloud security can be configured to the exact needs of the
business. And because these rules can be configured and managed in one place,
administration overheads are reduced and IT teams empowered to focus on other
areas of the business.
The way cloud security is delivered will depend on the individual cloud provider or
the cloud security solutions in place. However, implementation of cloud security
processes should be a joint responsibility between the business owner and solution
provider.

Why is cloud security important?


For businesses making the transition to the cloud, robust cloud security is imperative.
Security threats are constantly evolving and becoming more sophisticated, and cloud
computing is no less at risk than an on-premise environment. For this reason, it is
essential to work with a cloud provider that offers best-in-class security that has been
customized for your infrastructure.

Cloud security offers many benefits, including:

Centralized security: Just as cloud computing centralizes applications and data, cloud security centralizes protection. Cloud-based business networks consist of
numerous devices and endpoints that can be difficult to manage when dealing
with shadow IT or BYOD. Managing these entities centrally enhances traffic analysis
and web filtering, streamlines the monitoring of network events and results in fewer
software and policy updates. Disaster recovery plans can also be implemented and
actioned easily when they are managed in one place.

Reduced costs: One of the benefits of utilizing cloud storage and security is that it
eliminates the need to invest in dedicated hardware. Not only does this reduce capital
expenditure, but it also reduces administrative overheads. Where once IT teams were
firefighting security issues reactively, cloud security delivers proactive security
features that offer protection 24/7 with little or no human intervention.

Reduced Administration: When you choose a reputable cloud services provider or cloud security platform, you can kiss goodbye to manual security configurations and
almost constant security updates. These tasks can have a massive drain on resources,
but when you move them to the cloud, all security administration happens in one
place and is fully managed on your behalf.

Reliability: Cloud computing services offer the ultimate in dependability. With the
right cloud security measures in place, users can safely access data and applications
within the cloud no matter where they are or what device they are using.

More and more organizations are realizing the many business benefits of moving their
systems to the cloud. Cloud computing allows organizations to operate at scale,
reduce technology costs and use agile systems that give them the competitive edge.
However, it is essential that organizations have complete confidence in their cloud
computing security and that all data, systems and applications are protected from data
theft, leakage, corruption and deletion.

All cloud models are susceptible to threats. IT departments are naturally cautious
about moving mission-critical systems to the cloud and it is essential the right security
provisions are in place, whether you are running a native cloud, hybrid or on-premise
environment. Cloud security offers all the functionality of traditional IT security, and allows businesses to harness the many advantages of cloud computing while remaining secure and also ensuring that data privacy and compliance requirements are met.
Secure Data in the Cloud
Cloud data security becomes increasingly important as we move our devices, data
centers, business processes, and more to the cloud. Ensuring quality cloud data
security is achieved through comprehensive security policies, an organizational
culture of security, and cloud security solutions.

Selecting the right cloud security solution for your business is imperative if you want
to get the best from the cloud and ensure your organization is protected from
unauthorized access, data breaches and other threats. Forcepoint Cloud Access
Security Broker (CASB) is a complete cloud security solution that protects cloud
apps and cloud data, prevents compromised accounts and allows you to set security
policies on a per-device basis.

8.2. Cloud Security Challenges

Today’s businesses want it all: secure data and applications accessible anywhere from
any device. It’s possible with cloud technology, but there are inherent cloud
computing security challenges to making it a reality.

What can enterprise businesses do to reap the benefits of cloud technology while
ensuring a secure environment for sensitive information? Recognizing those
challenges is the first step to finding solutions that work. The next step is choosing the
right tools and vendors to mitigate those cloud security challenges.

In our technology driven world, security in the cloud is an issue that should be
discussed from the board level all the way down to new employees. The CDNetworks
blog recently discussed “what is cloud security” and explained some of its benefits.
Now that we understand what cloud security is, let’s take a look at some of the key
challenges that may be faced and why you want to prevent unauthorized access at all
costs.

CHALLENGE 1: DDOS AND DENIAL-OF-SERVICE ATTACKS

As more and more businesses and operations move to the cloud, cloud providers are
becoming a bigger target for malicious attacks. Distributed denial of service (DDoS)
attacks are more common than ever before. Verisign reported that the IT services, cloud platform (PaaS) and SaaS sector was the most frequently targeted industry during the first quarter of 2015.

A DDoS attack is designed to overwhelm website servers so they can no longer respond to legitimate user requests. If a DDoS attack is successful, it renders a website useless
for hours, or even days. This can result in a loss of revenue, customer trust and brand
authority.

Complementing cloud services with DDoS protection is no longer just a good idea for
the enterprise; it’s a necessity. Websites and web-based applications are core
components of 21st century business and require state-of-the-art cybersecurity.
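
Dedicated DDoS mitigation happens upstream at the network edge, but the underlying idea of refusing excess traffic can be illustrated at the application layer with a token-bucket rate limiter. The Python sketch below is a toy, in-process example and is not a substitute for a real DDoS protection service.

# Toy token-bucket rate limiter: each client gets `rate` requests per second
# with bursts up to `capacity`. Illustrative only; real DDoS protection
# happens upstream of the application.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}  # one bucket per client IP

def handle_request(client_ip: str) -> str:
    bucket = buckets.setdefault(client_ip, TokenBucket(rate=5, capacity=10))
    return "200 OK" if bucket.allow() else "429 Too Many Requests"

print(handle_request("203.0.113.7"))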
CHALLENGE 2: DATA BREACHES

Known data breaches in the U.S. hit a record-high of 738 in 2014, according to the
Identity Theft Resource Center, and hacking was (by far) the number one cause.
That’s an incredible statistic and only emphasizes the growing challenge to secure
sensitive data.

Traditionally, IT professionals have had great control over the network infrastructure
and physical hardware (firewalls, etc.) securing proprietary data. In the cloud (in all
scenarios including private cloud, public cloud, and hybrid cloud situations), some of
those security controls are relinquished to a trusted partner meaning cloud
infrastructure can increase security risks. Choosing the right vendor, with a strong
record of implementing strong security measures, is vital to overcoming this
challenge.

CHALLENGE 3: DATA LOSS

When business critical information is moved into the cloud, it’s understandable to be
concerned with its security. Losing cloud data, either through accidental deletion and
human error, malicious tampering including the installation of malware (i.e. DDoS),
or an act of nature that brings down a cloud service provider, could be disastrous for
an enterprise business. Often a DDoS attack is only a diversion for a greater threat,
such as an attempt to steal or delete data.

To face this challenge, it’s imperative to ensure there is a disaster recovery process in
place, as well as an integrated system to mitigate malicious cyberattacks. In addition,
protecting every network layer, including the application layer (layer 7), should be
built-in to a cloud security solution.

CHALLENGE 4: INSECURE ACCESS CONTROL POINTS

One of the great benefits of the cloud is it can be accessed from anywhere and from
any device. But, what if the interfaces and particularly the application programming
interfaces (APIs) users interact with aren’t secure? Hackers can find and gain access
to these types of vulnerabilities and exploit authentication via APIs if given enough
time.
A behavioral web application firewall examines HTTP requests to a website to ensure
it is legitimate traffic. This always-on device helps protect web applications and APIs
from security breaches within cloud environments and data centers that are not on-
premises.
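
One widely used way to harden an API access point is to require every request to carry a signature computed over its contents with a shared secret, so that forged or tampered requests can be rejected. The sketch below shows a server-side check using Python's standard hmac module; the secret value and message layout are illustrative assumptions, not any particular provider's scheme.

# Sketch: verify an HMAC-SHA256 request signature on the server side.
# The shared secret and message layout are illustrative assumptions.
import hmac
import hashlib

SHARED_SECRET = b"replace-with-a-real-secret"     # placeholder

def sign(method: str, path: str, body: bytes, timestamp: str) -> str:
    message = f"{method}\n{path}\n{timestamp}\n".encode() + body
    return hmac.new(SHARED_SECRET, message, hashlib.sha256).hexdigest()

def verify(method, path, body, timestamp, received_signature) -> bool:
    expected = sign(method, path, body, timestamp)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, received_signature)

sig = sign("POST", "/v1/orders", b'{"item": 42}', "2024-01-01T00:00:00Z")
print(verify("POST", "/v1/orders", b'{"item": 42}', "2024-01-01T00:00:00Z", sig))  # True
print(verify("POST", "/v1/orders", b'{"item": 99}', "2024-01-01T00:00:00Z", sig))  # False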

CHALLENGE 5: NOTIFICATIONS AND ALERTS

Awareness and proper communication of security threats is a cornerstone of network security and the same goes for cloud computing security. Alerting the appropriate
website or application managers as soon as a threat is identified should be part of a
thorough data security and access management plan. Speedy mitigation of a threat
relies on clear and prompt communication so steps can be taken by the proper entities
and impact of the threat minimized.

FINAL THOUGHTS ON CLOUD SECURITY CHALLENGES

Cloud computing security issues and challenges are not insurmountable. With the
right cloud service provider (CSP), technology, and forethought, enterprises can
leverage the benefits of cloud technology.

CDNetworks’ cloud security solution integrates web performance with the latest in
cloud security technology. With 160 points of presence, websites and cloud
applications are accelerated on a global scale and, with our cloud security, our clients’
cloud-based assets are protected with 24/7 end to end security, including DDoS
mitigation at the network and application levels.

8.3. Virtual Machine Security

Virtualized Security
What is virtualized security?
Virtualized security, or security virtualization, refers to security solutions that are
software-based and designed to work within a virtualized IT environment. This differs
from traditional, hardware-based network security, which is static and runs on devices
such as traditional firewalls, routers, and switches.

In contrast to hardware-based security, virtualized security is flexible and dynamic.


Instead of being tied to a device, it can be deployed anywhere in the network and is
often cloud-based. This is key for virtualized networks, in which operators spin up
workloads and applications dynamically; virtualized security allows security services
and functions to move around with those dynamically created workloads.

Cloud security considerations (such as isolating multitenant environments in public cloud environments) are also important to virtualized security. The flexibility of
virtualized security is helpful for securing hybrid and multi-cloud environments,
where data and workloads migrate around a complicated ecosystem involving
multiple vendors.
How does virtualized security work?
Virtualized security can take the functions of traditional security hardware appliances
(such as firewalls and antivirus protection) and deploy them via software. In addition,
virtualized security can also perform additional security functions. These functions
are only possible due to the advantages of virtualization, and are designed to address
the specific security needs of a virtualized environment.

For example, an enterprise can insert security controls (such as encryption) between
the application layer and the underlying infrastructure, or use strategies such as micro-
segmentation to reduce the potential attack surface.

Virtualized security can be implemented as an application directly on a bare metal hypervisor (a position it can leverage to provide effective application monitoring) or
as a hosted service on a virtual machine. In either case, it can be quickly deployed
where it is most effective, unlike physical security, which is tied to a specific device.

What are the benefits of virtualized security?


Virtualized security is now effectively necessary to keep up with the complex security
demands of a virtualized network, plus it’s more flexible and efficient than traditional
physical security. Here are some of its specific benefits:

● Cost-effectiveness: Virtualized security allows an enterprise to maintain a secure network without a large increase in spending on expensive proprietary hardware.
Pricing for cloud-based virtualized security services is often determined by usage,
which can mean additional savings for organizations that use resources efficiently.
● Flexibility: Virtualized security functions can follow workloads anywhere, which is
crucial in a virtualized environment. It provides protection across multiple data
centers and in multi-cloud and hybrid cloud environments, allowing an organization
to take advantage of the full benefits of virtualization while also keeping data secure.
● Operational efficiency: Quicker and easier to deploy than hardware-based security,
virtualized security doesn’t require IT teams to set up and configure multiple
hardware appliances. Instead, they can set up security systems through centralized
software, enabling rapid scaling. Using software to run security technology also
allows security tasks to be automated, freeing up additional time for IT teams.
● Regulatory compliance: Traditional hardware-based security is static and unable to
keep up with the demands of a virtualized network, making virtualized security a
necessity for organizations that need to maintain regulatory compliance.

What are the risks of virtualized security?


The increased complexity of virtualized security can be a challenge for IT, which in
turn leads to increased risk. It’s harder to keep track of workloads and applications in
a virtualized environment as they migrate across servers, which makes it more
difficult to monitor security policies and configurations. And the ease of spinning
up virtual machines can also contribute to security holes.

It’s important to note, however, that many of these risks are already present in a
virtualized environment, whether security services are virtualized or not.
Following enterprise security best practices (such as spinning down virtual machines
when they are no longer needed and using automation to keep security policies up to
date) can help mitigate such risks.

How is physical security different from virtualized security?
Traditional physical security is hardware-based, and as a result, it’s inflexible and
static. The traditional approach depends on devices deployed at strategic points across
a network and is often focused on protecting the network perimeter (as with a
traditional firewall). However, the perimeter of a virtualized, cloud-based network is
necessarily porous and workloads and applications are dynamically created,
increasing the potential attack surface.

Traditional security also relies heavily upon port and protocol filtering, an approach
that’s ineffective in a virtualized environment where addresses and ports are assigned
dynamically. In such an environment, traditional hardware-based security is not
enough; a cloud-based network requires virtualized security that can move around the
network along with workloads and applications.

What are the different types of virtualized security?
There are many features and types of virtualized security, encompassing network
security, application security, and cloud security. Some virtualized security
technologies are essentially updated, virtualized versions of traditional security
technology (such as next-generation firewalls). Others are innovative new
technologies that are built into the very fabric of the virtualized network.

Some common types of virtualized security features include:


● Segmentation, or making specific resources available only to specific applications
and users. This typically takes the form of controlling traffic between different
network segments or tiers.
● Micro-segmentation, or applying specific security policies at the workload level to
create granular secure zones and limit an attacker’s ability to move through the
network. Micro-segmentation divides a data center into segments and allows IT teams
to define security controls for each segment individually, bolstering the data center’s
resistance to attack.
● Isolation, or separating independent workloads and applications on the same network.
This is particularly important in a multitenant public cloud environment, and can also
be used to isolate virtual networks from the underlying physical infrastructure,
protecting the infrastructure from attack.

8.4. Application Security


Learn how to mitigate threats by shrinking the application attack surface across
environments

Application security describes security measures at the application level that aim to
prevent data or code within the app from being stolen or hijacked. It encompasses the
security considerations that happen during application development and design, but it
also involves systems and approaches to protect apps after they get deployed.

Application security may include hardware, software, and procedures that identify or
minimize security vulnerabilities. A router that prevents anyone from viewing a
computer’s IP address from the Internet is a form of hardware application security.
But security measures at the application level are also typically built into the software,
such as an application firewall that strictly defines what activities are allowed and
prohibited. Procedures can entail things like an application security routine that
includes protocols such as regular testing.

Application security definition


Application security is the process of developing, adding, and testing security features
within applications to prevent security vulnerabilities against threats such as
unauthorized access and modification.
Why application security is important
Application security is important because today’s applications are often available over
various networks and connected to the cloud, increasing vulnerabilities to security
threats and breaches. There is increasing pressure and incentive to not only ensure
security at the network level but also within applications themselves. One reason for
this is because hackers are going after apps with their attacks more today than in the
past. Application security testing can reveal weaknesses at the application level,
helping to prevent these attacks.
Types of application security
Different types of application security features include authentication, authorization,
encryption, logging, and application security testing. Developers can also code
applications to reduce security vulnerabilities.

● Authentication: When software developers build procedures into an application to ensure that only authorized users gain access to it. Authentication procedures ensure that a user is who they say they are. This can be accomplished by requiring the user to provide a user name and password when logging in to an application. Multi-factor authentication requires more than one form of authentication—the factors might include something you know (a password), something you have (a mobile device), and something you are (a thumb print or facial recognition). A short sketch combining authentication with the authorization check described next appears after this list.
● Authorization: After a user has been authenticated, the user may be authorized to
access and use the application. The system can validate that a user has permission to
access the application by comparing the user’s identity with a list of authorized users.
Authentication must happen before authorization so that the application matches only
validated user credentials to the authorized user list.
● Encryption: After a user has been authenticated and is using the application, other
security measures can protect sensitive data from being seen or even used by a
cybercriminal. In cloud-based applications, where traffic containing sensitive data
travels between the end user and the cloud, that traffic can be encrypted to keep the
data safe.
● Logging: If there is a security breach in an application, logging can help identify who
got access to the data and how. Application log files provide a time-stamped record of
which aspects of the application were accessed and by whom.
● Application security testing: A necessary process to ensure that all of these security
controls work properly.
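
As noted above, the sketch below ties the first two controls together: it verifies a password against a salted PBKDF2 hash (authentication) and then checks the user's role against a permission table (authorization). It is a minimal, self-contained illustration, not a production identity system.

# Minimal authentication + authorization sketch (illustrative only).
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)

# A tiny in-memory "user store"; a real system would use a database.
salt = os.urandom(16)
users = {
    "alice": {"salt": salt, "hash": hash_password("correct horse", salt), "role": "admin"},
}
permissions = {"admin": {"read", "write"}, "viewer": {"read"}}

def authenticate(username: str, password: str) -> bool:
    user = users.get(username)
    if user is None:
        return False
    return hmac.compare_digest(user["hash"], hash_password(password, user["salt"]))

def authorize(username: str, action: str) -> bool:
    role = users[username]["role"]
    return action in permissions.get(role, set())

if authenticate("alice", "correct horse") and authorize("alice", "write"):
    print("access granted")   # authentication first, then authorization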
Application security in the cloud
Application security in the cloud poses some extra challenges. Because cloud
environments provide shared resources, special care must be taken to ensure that users
only have access to the data they are authorized to view in their cloud-based
applications. Sensitive data is also more vulnerable in cloud-based applications
because that data is transmitted across the Internet from the user to the application and
back.
Mobile application security
Mobile devices also transmit and receive information across the Internet, as opposed
to a private network, making them vulnerable to attack. Enterprises can use virtual
private networks (VPNs) to add a layer of mobile application security for employees
who log in to applications remotely. IT departments may also decide to vet mobile
apps and make sure they conform to company security policies before allowing
employees to use them on mobile devices that connect to the corporate network.
Web application security
Web application security applies to web applications—apps or services that users
access through a browser interface over the Internet. Because web applications live on
remote servers, not locally on user machines, information must be transmitted to and
from the user over the Internet. Web application security is of special concern to
businesses that host web applications or provide web services. These businesses often
choose to protect their network from intrusion with a web application firewall. A web
application firewall works by inspecting and, if necessary, blocking data packets that
are considered harmful.
What are application security controls?
Application security controls are techniques to enhance the security of an application
at the coding level, making it less vulnerable to threats. Many of these controls deal
with how the application responds to unexpected inputs that a cybercriminal might
use to exploit a weakness. A programmer can write code for an application in such a
way that the programmer has more control over the outcome of these unexpected
inputs. Fuzzing is a type of application security testing where developers test the
results of unexpected values or inputs to discover which ones cause the application to
act in an unexpected way that might open a security hole.
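
The toy harness below shows the basic shape of such a fuzz test: it throws many random byte strings at a hypothetical parse_request function and records any input that makes it fail in an unexpected way.

# Toy fuzzing harness: feed random inputs to a parser and record crashes.
# parse_request is a hypothetical function standing in for real application code.
import random

def parse_request(data: bytes) -> dict:
    # Deliberately fragile placeholder "parser" used only for the demo.
    text = data.decode("utf-8")              # may raise UnicodeDecodeError
    key, value = text.split("=", 1)          # may raise ValueError
    return {key: value}

random.seed(0)
crashes = []
for i in range(1000):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(1, 20)))
    try:
        parse_request(blob)
    except (UnicodeDecodeError, ValueError):
        pass                                  # rejected cleanly: acceptable
    except Exception as exc:                  # anything else is a finding
        crashes.append((blob, exc))

print(f"{len(crashes)} unexpected failures out of 1000 random inputs")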
What is application security testing?
Application developers perform application security testing as part of the software
development process to ensure there are no security vulnerabilities in a new or
updated version of a software application. A security audit can make sure the
application is in compliance with a specific set of security criteria. After the
application passes the audit, developers must ensure that only authorized users can
access it. In penetration testing, a developer thinks like a cybercriminal and looks for
ways to break into the application. Penetration testing may include social engineering
or trying to fool users into allowing unauthorized access. Testers commonly
administer both unauthenticated security scans and authenticated security scans (as
logged-in users) to detect security vulnerabilities that may not show up in both states.
8.5. Data Security
Why is data security important?
Data security is the practice of protecting digital information from unauthorized
access, corruption, or theft throughout its entire lifecycle. It’s a concept that
encompasses every aspect of information security from the physical security of
hardware and storage devices to administrative and access controls, as well as the
logical security of software applications. It also includes organizational policies and
procedures.

When properly implemented, robust data security strategies will protect an organization's information assets against cybercriminal activities, but they also guard
against insider threats and human error, which remains among the leading causes of
data breaches today. Data security involves deploying tools and technologies that
enhance the organization’s visibility into where its critical data resides and how it is
used. Ideally, these tools should be able to apply protections like encryption, data
masking, and redaction of sensitive files, and should automate reporting to streamline
audits and adherence to regulatory requirements.

Business challenges

Digital transformation is profoundly altering every aspect of how today's businesses operate and compete. The sheer volume of data that enterprises create, manipulate,
and store is growing, and drives a greater need for data governance. In addition,
computing environments are more complex than they once were, routinely spanning
the public cloud, the enterprise data center, and numerous edge devices ranging from
Internet of Things (IoT) sensors to robots and remote servers. This complexity creates
an expanded attack surface that’s more challenging to monitor and secure.
At the same time, consumer awareness of the importance of data privacy is on the
rise. Fueled by increasing public demand for data protection initiatives, multiple new
privacy regulations have recently been enacted, including Europe’s General Data
Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).
These rules join longstanding data security provisions like the Health Insurance
Portability and Accountability Act (HIPAA), protecting electronic health records, and
the Sarbanes-Oxley Act (SOX), protecting shareholders in public companies from
accounting errors and financial fraud. With maximum fines in the millions of dollars,
every enterprise has a strong financial incentive to ensure it maintains compliance.

The business value of data has never been greater than it is today. The loss of trade
secrets or intellectual property (IP) can impact future innovations and profitability.
So, trustworthiness is increasingly important to consumers, with a full 75% reporting
that they will not purchase from companies they don’t trust to protect their data.

Types of data security

Encryption
Using an algorithm to transform normal text characters into an unreadable format,
encryption keys scramble data so that only authorized users can read it. File and
database encryption solutions serve as a final line of defense for sensitive volumes by
obscuring their contents through encryption or tokenization. Most solutions also
include security key management capabilities.
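
As a small illustration of encryption in application code, the sketch below uses the Fernet recipe from the third-party cryptography package (an assumed dependency) to encrypt and decrypt a record with a symmetric key. In practice the key would be held in a key management service rather than alongside the data.

# Sketch: symmetric encryption of a record with Fernet (cryptography package).
# In practice the key would live in a key-management service, not in code.
from cryptography.fernet import Fernet

key = Fernet.generate_key()       # 32-byte urlsafe base64 key
fernet = Fernet(key)

record = b'{"customer": "alice", "card": "4111-1111-1111-1111"}'
token = fernet.encrypt(record)    # ciphertext safe to store or transmit

print(token[:40], b"...")
print(fernet.decrypt(token) == record)   # True for an authorized key holder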

Data Erasure
More secure than standard data wiping, data erasure uses software to completely
overwrite data on any storage device. It verifies that the data is unrecoverable.

Data Masking
By masking data, organizations can allow teams to develop applications or train
people using real data. It masks personally identifiable information (PII) where
necessary so that development can occur in environments that are compliant.
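
A very small example of masking in code: the toy function below blanks out email addresses and all but the last four digits of card-like numbers before a record leaves the production boundary. Real masking tools are policy-driven and format-aware; this regex-based sketch is purely illustrative.

# Toy data-masking sketch: hide PII before data leaves the production boundary.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
CARD = re.compile(r"\b(?:\d[ -]?){12,15}(\d{4})\b")

def mask(text: str) -> str:
    text = EMAIL.sub("<email-masked>", text)
    text = CARD.sub(lambda m: "**** **** **** " + m.group(1), text)
    return text

print(mask("alice@example.com paid with 4111 1111 1111 1111"))
# -> "<email-masked> paid with **** **** **** 1111"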

Data Resiliency
Resiliency is determined by how well a data center is able to endure or recover from any type of failure – from hardware problems to power shortages and other disruptive events.

Data security capabilities and solutions

Data security tools and technologies should address the growing challenges inherent
in securing today’s complex, distributed, hybrid, and/or multicloud computing
environments. These include understanding where data resides, keeping track of who
has access to it, and blocking high-risk activities and potentially dangerous file
movements. Comprehensive data protection solutions that enable enterprises to adopt
a centralized approach to monitoring and policy enforcement can simplify the task.

Data discovery and classification tools


Sensitive information can reside in structured and unstructured data repositories
including databases, data warehouses, big data platforms, and cloud environments.
Data discovery and classification solutions automate the process of identifying
sensitive information, as well as assessing and remediating vulnerabilities.
Data and file activity monitoring
File activity monitoring tools analyze data usage patterns, enabling security teams to
see who is accessing data, spot anomalies, and identify risks. Dynamic blocking and
alerting can also be implemented for abnormal activity patterns.

Vulnerability assessment and risk analysis tools


These solutions ease the process of detecting and mitigating vulnerabilities such as
out-of-date software, misconfigurations, or weak passwords, and can also identify
data sources at greatest risk of exposure.

Automated compliance reporting


Comprehensive data protection solutions with automated reporting capabilities can
provide a centralized repository for enterprise-wide compliance audit trails.

Data security strategies

A comprehensive data security strategy incorporates people, processes, and technologies. Establishing appropriate controls and policies is as much a question of
organizational culture as it is of deploying the right tool set. This means making
information security a priority across all areas of the enterprise.

Physical security of servers and user devices


Regardless of whether your data is stored on-premises, in a corporate data center, or
in the public cloud, you need to ensure that facilities are secured against intruders and
have adequate fire suppression measures and climate controls in place. A cloud
provider will assume responsibility for these protective measures on your behalf.

Access management and controls


The principle of “least-privilege access” should be followed throughout your entire IT
environment. This means granting database, network, and administrative account
access to as few people as possible, and only those who absolutely need it to get their
jobs done.

Application security and patching


All software should be updated to the latest version as soon as possible after patches
or new versions are released.

Backups
Maintaining usable, thoroughly tested backup copies of all critical data is a core
component of any robust data security strategy. In addition, all backups should be
subject to the same physical and logical security controls that govern access to the
primary databases and core systems.

Employee education
Training employees in the importance of good security practices and password
hygiene and teaching them to recognize social engineering attacks transforms them
into a “human firewall” that can play a critical role in safeguarding your data.
Network and endpoint security monitoring and controls
Implementing a comprehensive suite of threat management, detection, and response
tools and platforms across your on-premises environment and cloud platforms can
mitigate risks and reduce the probability of a breach.

Lab Exercise 1:
Use security tools like ACUNETIX, ETTERCAP to scan web
applications on the cloud.

HERE IS A LIST OF RECOMMENDED TOOLS FOR PEN TESTING CLOUD SECURITY:

ACUNETIX

This information gathering tool scans web applications on the cloud and
lists possible vulnerabilities that might be present in the given web
application. Most of the scanning is focused on finding SQL injection and
cross site scripting Vulnerabilities. It has both free and paid versions, with
paid versions including added functionalities. After scanning, it generates a
detailed report describing vulnerabilities along with the suitable action that
can be taken to remedy the loophole.

This tool can be used for scanning cloud applications. Beware: there is
always a chance of false positives. Any security flaw, if discovered through
scanning, should be verified. The latest version of this software, Acunetix
WVS version 8, has a report template for checking compliance with ISO
27001, and can also scan for HTTP denial of service attacks.

AIRCRACK-NG – A TOOL FOR WI-FI PEN TESTERS

This is a comprehensive suite of tools designed specifically for network pen testing and security. This tool is useful for scanning Infrastructure as a
Service (IaaS) models. Having no firewall, or a weak firewall, makes it very
easy for malicious users to exploit your network on the cloud through virtual
machines. This suite consists of many tools with different functionalities,
which can be used for monitoring the network for any kind of malicious
activity over the cloud.
Its main functions include:

● Aircrack-ng – Cracks WEP or WPA encryption keys with dictionary attacks
● Airdecap-ng – Decrypts captured packet files of WEP and WPA keys
● Airmon-ng – Puts your network interface card, like Alfa card, into
monitoring mode
● Aireplay-ng – This is a packet injection tool
● Airodump-ng – Acts as a packet sniffer on networks
● Airtun-ng – Can be used for virtual tunnel interfaces
● Airolib-ng – Acts as a library for storing captured passwords and
ESSID
● Packetforge-ng – Creates forged packets, which are used for packet
injection
● Airbase-ng – Used for attacking clients through various techniques.
● Airdecloak-ng – Capable of removing WEP cloaking.

Several other tools are also available in this suite, including easside-ng, wesside-ng and tkiptun-ng. Aircrack-ng can be used both from the command line and through a graphical interface; the GUI version is named Gerix Wi-Fi Cracker, a freely available network security tool licensed under the GNU GPL.

CAIN & ABEL

This is a password recovery tool. Cain is used by penetration testers for recovering passwords by sniffing networks, brute forcing and decrypting passwords. It also allows pen testers to intercept VoIP conversations that might be occurring through the cloud. This multi-functionality tool can decode
Wi-Fi network keys, unscramble passwords, discover cached passwords,
etc. An expert pen tester can analyze routing protocols as well, thereby
detecting any flaws in protocols governing cloud security. The feature that
separates Cain from similar tools is that it identifies security flaws in
protocol standards rather than exploiting software vulnerabilities. This tool
is very helpful for recovering lost passwords.

In the latest version of Cain, the ‘sniffer’ feature allows for analyzing
encrypted protocols such as SSH-1 and HTTPS. This tool can be utilized for
ARP cache poisoning, enabling sniffing of switched LAN devices, thereby
performing Man in the Middle (MITM) attacks. Further functionalities have
been added in the latest version, including authentication monitors for
routing protocols, brute-force for most of the popular algorithms and
cryptanalysis attacks.

ETTERCAP

Ettercap is a free and open source tool for network security, designed for
analyzing computer network protocols and performing MITM attacks. It is
usually accompanied with Cain. This tool can be used for pen testing cloud
networks and verifying leakage of information to an unauthorized third
party. It has four methods of functionality:

● IP-based Scanning – Network security is scanned by filtering IP-based packets.

● MAC-based Scanning – Here packets are filtered based on MAC addresses. This is used for sniffing connections through channels.

● ARP-based functionality – ARP poisoning is used for sniffing into switched LAN through an MITM attack operating between two hosts (full duplex).

● Public-ARP based functionality – In this functionality mode, ettercap uses one victim host to sniff all other hosts on a switched LAN network (half duplex).
Lab Exercise 2:
Understanding Cloud Computing Vulnerabilities

Vulnerability: An Overview

Vulnerability is a prominent factor of risk. ISO 27005 defines risk as “the potential
that a given threat will exploit vulnerabilities of an asset or group of assets and
thereby cause harm to the organization,” measuring it in terms of both the likelihood
of an event and its consequence. The Open Group’s risk taxonomy offers a useful
overview of risk factors (see Figure 1).


Figure 1. Factors contributing to risk according to the Open Group's risk taxonomy. Risk corresponds to the product of loss event frequency (left) and probable loss magnitude (right). Vulnerabilities influence the loss event frequency.
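
To make that product concrete, here is a small, purely illustrative calculation in which all numbers are invented:

# Illustrative only: annualized risk = loss event frequency x probable loss magnitude.
loss_event_frequency = 0.25        # assumed: expected successful attacks per year
probable_loss_magnitude = 80_000   # assumed: expected cost per event, in dollars

annualized_risk = loss_event_frequency * probable_loss_magnitude
print(f"Expected annual loss: ${annualized_risk:,.0f}")   # prints $20,000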

The Open Group’s taxonomy uses the same two top-level risk factors as ISO 27005:
the likelihood of a harmful event (here, loss event frequency) and its consequence
(here, probable loss magnitude). The probable loss magnitude's subfactors (on the
right in Figure 1) influence a harmful event’s ultimate cost. The loss event frequency
subfactors (on the left) are a bit more complicated. A loss event occurs when a threat
agent (such as a hacker) successfully exploits a vulnerability. The frequency with
which this happens depends on two factors:

● The frequency with which threat agents try to exploit a vulnerability. This frequency
is determined by both the agents’ motivation (What can they gain with an attack?
How much effort does it take? What is the risk for the attackers?) and how much
access (“contact”) the agents have to the attack targets.

● The difference between the threat agents’ attack capabilities and the system’s strength
to resist the attack.

This second factor brings us toward a useful definition of vulnerability.

Defining Vulnerability

According to the Open Group’s risk taxonomy, “Vulnerability is the probability that
an asset will be unable to resist the actions of a threat agent. Vulnerability exists when
there is a difference between the force being applied by the threat agent, and an
object’s ability to resist that force.”

So, vulnerability must always be described in terms of resistance to a certain type of attack. To provide a real-world example, a car’s inability to protect its driver against injury when hit frontally by a truck driving 60 mph is a vulnerability; the resistance of the car’s crumple zone is simply too weak compared to the truck’s force. Against the “attack” of a biker, or even a small car driving at a more moderate speed, the car’s resistance strength is perfectly adequate.

We can also describe computer vulnerability - that is, security-related bugs that you
close with vendor-provided patches - as a weakening or removal of a certain
resistance strength. A buffer-overflow vulnerability, for example, weakens the
system’s resistance to arbitrary code execution. Whether attackers can exploit this
vulnerability depends on their capabilities.

Vulnerabilities and Cloud Risk

We’ll now examine how cloud computing influences the risk factors in Figure 1,
starting with the right-hand side of the risk factor tree.

From a cloud customer perspective, the right-hand side dealing with probable
magnitude of future loss isn’t changed at all by cloud computing: the consequences
and ultimate cost of, say, a confidentiality breach, is exactly the same regardless of
whether the data breach occurred within a cloud or a conventional IT infrastructure.
For a cloud service provider, things look somewhat different: because systems that were previously separate now share the same infrastructure, a loss event could entail a considerably larger impact. But this fact is easily grasped and
incorporated into a risk assessment: no conceptual work for adapting impact analysis
to cloud computing seems necessary.

So, we must search for changes on Figure 1’s left-hand side - the loss event
frequency. Cloud computing could change the probability of a harmful event’s
occurrence. As we show later, cloud computing causes significant changes in the
vulnerability factor. Of course, moving to a cloud infrastructure might change the
attackers’ access level and motivation, as well as the effort and risk - a fact that must
be considered as future work. But, for supporting a cloud-specific risk assessment, it
seems most profitable to start by examining the exact nature of cloud-specific
vulnerabilities.

Cloud-Specific Vulnerabilities

Based on the abstract view of cloud computing we presented earlier, we can now
move toward a definition of what constitutes a cloud-specific vulnerability. A
vulnerability is cloud specific if it

● is intrinsic to or prevalent in a core cloud computing technology,

● has its root cause in one of NIST’s essential cloud characteristics,

● is caused when cloud innovations make tried-and-tested security controls difficult or impossible to implement, or

● is prevalent in established state-of-the-art cloud offerings.

We now examine each of these four indicators.

Core-Technology Vulnerabilities

Cloud computing’s core technologies - Web applications and services, virtualization, and cryptography - have vulnerabilities that are either intrinsic to the technology or prevalent in the technology’s state-of-the-art implementations. Three examples of such vulnerabilities are virtual machine escape, session riding and hijacking, and insecure or obsolete cryptography.
First, the possibility that an attacker might successfully escape from a virtualized
environment lies in virtualization’s very nature. Hence, we must consider this
vulnerability as intrinsic to virtualization and highly relevant to cloud computing.

Second, Web application technologies must overcome the problem that, by design, the
HTTP protocol is a stateless protocol, whereas Web applications require some notion
of session state. Many techniques implement session handling and - as any security
professional knowledgeable in Web application security will testify - many session
handling implementations are vulnerable to session riding and session hijacking.
Whether session riding/hijacking vulnerabilities are intrinsic to Web application
technologies or are “only” prevalent in many current implementations is arguable; in
any case, such vulnerabilities are certainly relevant for cloud computing.

Finally, cryptanalysis advances can render any cryptographic mechanism or algorithm insecure as novel methods of breaking them are discovered. It’s even more common
to find crucial flaws in cryptographic algorithm implementations, which can turn
strong encryption into weak encryption (or sometimes no encryption at all). Because
broad uptake of cloud computing is unthinkable without the use of cryptography to
protect data confidentiality and integrity in the cloud, insecure or obsolete
cryptography vulnerabilities are highly relevant for cloud computing.

Essential Cloud Characteristic Vulnerabilities

As we noted earlier, NIST describes five essential cloud characteristics: on-demand self-service, ubiquitous network access, resource pooling, rapid elasticity, and measured service.

Following are examples of vulnerabilities with root causes in one or more of these
characteristics:

● Unauthorized access to management interface. The cloud characteristic on-demand self-service requires a management interface that’s accessible to cloud service users. Unauthorized access to the management interface is therefore an especially relevant vulnerability for cloud systems: the probability that unauthorized access could occur is much higher than for traditional systems where the management functionality is accessible only to a few administrators.

● Internet protocol vulnerabilities. The cloud characteristic ubiquitous network access means that cloud services are accessed via network using standard protocols. In most cases, this network is the Internet, which must be considered untrusted. Internet protocol vulnerabilities - such as vulnerabilities that allow man-in-the-middle attacks - are therefore relevant for cloud computing.

● Data recovery vulnerability. The cloud characteristics of pooling and elasticity entail
that resources allocated to one user will be reallocated to a different user at a later
time. For memory or storage resources, it might therefore be possible to recover data
written by a previous user.

● Metering and billing evasion. The cloud characteristic of measured service means that
any cloud service has a metering capability at an abstraction level appropriate to the
service type (such as storage, processing, and active user accounts). Metering data is
used to optimize service delivery as well as billing. Relevant vulnerabilities include
metering and billing data manipulation and billing evasion.

Thus, we can leverage NIST’s well-founded definition of cloud computing in reasoning about cloud computing issues.

Defects in Known Security Controls

Vulnerabilities in standard security controls must be considered cloud specific if cloud innovations directly cause the difficulties in implementing the controls. Such vulnerabilities are also known as control challenges.

Here, we treat three examples of such control challenges. First, virtualized networks
offer insufficient network-based controls. Given the nature of cloud services, the
administrative access to IaaS network infrastructure and the ability to tailor network
infrastructure are typically limited; hence, standard controls such as IP-based network
zoning can’t be applied. Also, standard techniques such as network-based
vulnerability scanning are usually forbidden by IaaS providers because, for example,
friendly scans can’t be distinguished from attacker activity. Finally, technologies such
as virtualization mean that network traffic occurs on both real and virtual networks,
such as when two virtual machine environments (VMEs) hosted on the same server
communicate. Such issues constitute a control challenge because tried and tested
network-level security controls might not work in a given cloud environment.

The second challenge is poor key management procedures. As noted in a recent European Network and Information Security Agency study, cloud computing
infrastructures require management and storage of many different kinds of keys.
Because virtual machines don’t have a fixed hardware infrastructure and cloud-based
content is often geographically distributed, it’s more difficult to apply standard
controls - such as hardware security module (HSM) storage - to keys on cloud
infrastructures.

Finally, security metrics aren’t adapted to cloud infrastructures. Currently, there are
no standardized cloud-specific security metrics that cloud customers can use to
monitor the security status of their cloud resources. Until such standard security
metrics are developed and implemented, controls for security assessment, audit, and
accountability are more difficult and costly, and might even be impossible to employ.

Prevalent Vulnerabilities in State-of-the-Art Cloud Offerings


Although cloud computing is relatively young, there are already myriad cloud
offerings on the market. Hence, we can complement the three cloud-specific
vulnerability indicators presented earlier with a fourth, empirical indicator: if a
vulnerability is prevalent in state-of-the-art cloud offerings, it must be regarded as
cloud-specific. Examples of such vulnerabilities include injection vulnerabilities and
weak authentication schemes.

Injection vulnerabilities are exploited by manipulating service or application inputs to interpret and execute parts of them against the programmer’s intentions. Examples of injection vulnerabilities include

● SQL injection, in which the input contains SQL code that’s erroneously executed in the database back end (see the sketch after this list);

● command injection, in which the input contains commands that are erroneously
executed via the OS; and

● cross-site scripting, in which the input contains JavaScript code that’s erroneously
executed by a victim’s browser.
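
The following minimal Python sketch makes the SQL injection case above concrete. It uses an in-memory SQLite database; the table, column names, and the attacker string are purely illustrative, and the point is simply the contrast between string concatenation and a parameterized query.

import sqlite3

# Illustrative in-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cr3t')")

user_input = "' OR '1'='1"  # attacker-controlled value

# Vulnerable: the input is spliced into the SQL string, so the WHERE
# clause is rewritten and every row is returned.
unsafe_rows = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Safer: a parameterized query treats the input purely as data.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(unsafe_rows)  # leaks all rows
print(safe_rows)    # returns nothing

The same principle, treating user input strictly as data rather than as code, also underlies the standard defenses against command injection and cross-site scripting.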

In addition, many widely used authentication mechanisms are weak. For example,
usernames and passwords for authentication are weak due to
● insecure user behavior (choosing weak passwords, reusing passwords, and so on), and

● inherent limitations of one-factor authentication mechanisms.

Also, the authentication mechanisms’ implementation might have weaknesses and allow, for example, credential interception and replay. The majority of Web applications in current state-of-the-art cloud services employ usernames and passwords as authentication mechanism.

Architectural Components and Vulnerabilities

Cloud service models are commonly divided into SaaS, PaaS, and IaaS, and each
model influences the vulnerabilities exhibited by a given cloud infrastructure. It’s
helpful to add more structure to the service model stacks: Figure 2 shows a cloud
reference architecture that makes the most important security-relevant cloud
components explicit and provides an abstract overview of cloud computing for
security issue analysis.
Figure 2. The cloud reference architecture. We map cloud-specific vulnerabilities
to components of this reference architecture, which gives us an overview of
which vulnerabilities might be relevant for a given cloud service.

The reference architecture is based on work carried out at the University of California,
Los Angeles, and IBM. It inherits the layered approach in that layers can encompass
one or more service components. Here, we use “service” in the broad sense of
providing something that might be both material (such as shelter, power, and
hardware) and immaterial (such as a runtime environment). For two layers, the cloud
software environment and the cloud software infrastructure, the model makes the
layers’ three main service components - computation, storage, and communication -
explicit.

Top layer services also can be implemented on layers further down the stack, in effect
skipping intermediate layers. For example, a cloud Web application can be
implemented and operated in the traditional way - that is, running on top of a standard
OS without using dedicated cloud software infrastructure and environment
components. Layering and compositionality imply that the transition from providing
some service or function in-house to sourcing the service or function can take place
between any of the model’s layers.

In addition to the original model, we’ve identified supporting functions relevant to services in several layers and added them to the model as vertical spans over several horizontal layers.
Our cloud reference architecture has three main parts:

● Supporting (IT) infrastructure. These are facilities and services common to any IT
service, cloud or otherwise. We include them in the architecture because we want to
provide the complete picture; a full treatment of IT security must account for a cloud
service’s non-cloud-specific components.
● Cloud-specific infrastructure. These components constitute the heart of a cloud
service; cloud-specific vulnerabilities and corresponding controls are typically
mapped to these components.
● Cloud service consumer. Again, we include the cloud service customer in the
reference architecture because it’s relevant to an all-encompassing security treatment.

Also, we make explicit the network that separates the cloud service consumer from
the cloud infrastructure; the fact that access to cloud resources is carried out via a
(usually untrusted) network is one of cloud computing’s main characteristics.

Using the cloud reference architecture’s structure, we can now run through the
architecture’s components and give examples of each component’s cloud-specific
vulnerabilities.

Cloud Software Infrastructure and Environment

The cloud software infrastructure layer provides an abstraction level for basic IT
resources that are offered as services to higher layers: computational resources
(usually VMEs), storage, and (network) communication. These services can be used
individually, as is typically the case with storage services, but they’re often bundled
such that servers are delivered with certain network connectivity and (often) access to
storage. This bundle, with or without storage, is usually referred to as IaaS.
The cloud software environment layer provides services at the application platform
level:
● a development and runtime environment for services and applications written in one
or more supported languages;

● storage services (a database interface rather than file share); and

● communication infrastructure, such as Microsoft’s Azure service bus.

Vulnerabilities in both the infrastructure and environment layers are usually specific
to one of the three resource types provided by these two layers. However, cross-tenant
access vulnerabilities are relevant for all three resource types. The virtual machine
escape vulnerability we described earlier is a prime example. We used it to
demonstrate a vulnerability that’s intrinsic to the core virtualization technology, but it
can also be seen as having its root cause in the essential characteristic of resource
pooling: whenever resources are pooled, unauthorized access across resources
becomes an issue.
Hence, for PaaS, where the technology to separate different tenants (and tenant
services) isn’t necessarily based on virtualization (although that will be increasingly
true), cross-tenant access vulnerabilities play an important role as well. Similarly,
cloud storage is prone to cross-tenant storage access, and cloud communication - in
the form of virtual networking - is prone to cross-tenant network access.

Computational Resources

A highly relevant set of computational resource vulnerabilities concerns how virtual machine images are handled: the only feasible way of providing nearly identical server images - thus providing on-demand service for virtual servers - is by cloning template images.

Vulnerable virtual machine template images cause OS or application vulnerabilities to spread over many systems. An attacker might be able to analyze configuration, patch
level, and code in detail using administrative rights by renting a virtual server as a
service customer and thereby gaining knowledge helpful in attacking other customers’
images. A related problem is that an image can be taken from an untrustworthy
source, a new phenomenon brought on especially by the emerging marketplace of
virtual images for IaaS services. In this case, an image might, for example, have been
manipulated so as to provide back-door access for an attacker.

Data leakage by virtual machine replication is a vulnerability that’s also rooted in the
use of cloning for providing on-demand service. Cloning leads to data leakage
problems regarding machine secrets: certain elements of an OS - such as host keys
and cryptographic salt values - are meant to be private to a single host. Cloning can
violate this privacy assumption. Again, the emerging marketplace for virtual machine
images, as in Amazon EC2, leads to a related problem: users can provide template
images for other users by turning a running image into a template. Depending on how
the image was used before creating a template from it, it could contain data that the
user doesn’t wish to make public.

There are also control challenges here, including those related to cryptography use.
Cryptographic vulnerabilities due to weak random number generation might exist if
the abstraction layer between the hardware and OS kernel introduced by virtualization
is problematic for generating random numbers within a VME. Such generation
requires an entropy source on the hardware level. Virtualization might have flawed
mechanisms for tapping that entropy source, or having several VMEs on the same
host might exhaust the available entropy, leading to weak random number generation.
As we noted earlier, this abstraction layer also complicates the use of advanced
security controls, such as hardware security modules, possibly leading to poor key
management procedures.
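
As a small side illustration of the random number concern, the sketch below contrasts a general-purpose PRNG with Python's OS-backed CSPRNG when generating key material. It is illustrative only; it does not model the hypervisor entropy problem itself, which affects the operating-system source that the secrets module ultimately draws on.

import random
import secrets

# Keys derived from a general-purpose PRNG are predictable and must not
# be used for cryptographic purposes.
weak_key = random.getrandbits(256).to_bytes(32, "big")

# The secrets module draws from the operating system's CSPRNG, which is
# exactly the entropy source that virtualization can complicate.
strong_key = secrets.token_bytes(32)

print("weak  :", weak_key.hex())
print("strong:", strong_key.hex())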

Storage

In addition to data recovery vulnerability due to resource pooling and elasticity, there’s a related control challenge in media sanitization, which is often hard or
impossible to implement in a cloud context. For example, data destruction policies
applicable at the end of a life cycle that require physical disk destruction can’t be
carried out if a disk is still being used by another tenant.

Because cryptography is frequently used to overcome storage-related vulnerabilities, this core technology’s vulnerabilities - insecure or obsolete cryptography and poor
key management - play a special role for cloud storage.

Communication

The most prominent example of a cloud communications service is the networking provided for VMEs in an IaaS environment. Because of resource pooling, several
customers are likely to share certain network infrastructure components:
vulnerabilities of shared network infrastructure components, such as vulnerabilities in
a DNS server, Dynamic Host Configuration Protocol, and IP protocol vulnerabilities,
might enable network-based cross-tenant attacks in an IaaS infrastructure.

Virtualized networking also presents a control challenge: again, in cloud services, the
administrative access to IaaS network infrastructure and the possibility for tailoring
network infrastructure are usually limited. Also, using technologies such as
virtualization leads to a situation where network traffic occurs not only on “real”
networks but also within virtualized networks (such as for communication between
two VMEs hosted on the same server); most implementations of virtual networking
offer limited possibilities for integrating network-based security. All in all, this
constitutes a control challenge of insufficient network-based controls because tried-
and-tested network-level security controls might not work in a given cloud
environment.

Cloud Web Applications

A Web application uses browser technology as the front end for user interaction. With
the increased uptake of browser-based computing technologies such as JavaScript,
Java, Flash, and Silverlight, a Web cloud application falls into two parts:

● an application component operated somewhere in the cloud, and

● a browser component running within the user’s browser.

In the future, developers will increasingly use technologies such as Google Gears to
permit offline usage of a Web application’s browser component for use cases that
don’t require constant access to remote data. We’ve already described two typical
vulnerabilities for Web application technologies: session riding and hijacking
vulnerabilities and injection vulnerabilities.
Other Web-application-specific vulnerabilities concern the browser’s front-end
component. Among them are client-side data manipulation vulnerabilities, in which
users attack Web applications by manipulating data sent from their application
component to the server’s application component. In other words, the input received
by the server component isn’t the “expected” input sent by the client-side component,
but altered or completely user-generated input. Furthermore, Web applications also
rely on browser mechanisms for isolating third-party content embedded in the
application (such as advertisements, mashup components, and so on). Browser
isolation vulnerabilities might thus allow third-party content to manipulate the Web
application.

Services and APIs

It might seem obvious that all layers of the cloud infrastructure offer services, but for
examining cloud infrastructure security, it’s worthwhile to explicitly think about all of
the infrastructure’s service and application programming interfaces. Most services are
likely Web services, which share many vulnerabilities with Web applications. Indeed,
the Web application layer might be realized completely by one or more Web services
such that the application URL would only give the user a browser component. Thus
the supporting services and API functions share many vulnerabilities with the Web
applications layer.

Management Access

NIST’s definition of cloud computing states that one of cloud services’ central
characteristics is that they can be rapidly provisioned and released with minimal
management effort or service provider interaction. Consequently, a common element
of each cloud service is a management interface - which leads directly to the
vulnerability concerning unauthorized access to the management interface.
Furthermore, because management access is often realized using a Web application or
service, it often shares the vulnerabilities of the Web application layer and
services/API component.

Identity, Authentication, Authorization, and Auditing Mechanisms

All cloud services (and each cloud service’s management interface) require
mechanisms for identity management, authentication, authorization, and auditing
(IAAA). To a certain extent, parts of these mechanisms might be factored out as a
stand-alone IAAA service to be used by other services. Two IAAA elements that must
be part of each service implementation are execution of adequate authorization checks
(which, of course, use authentication and/or authorization information received from
an IAA service) and cloud infrastructure auditing.

Most vulnerabilities associated with the IAAA component must be regarded as cloud-
specific because they’re prevalent in state-of-the-art cloud offerings. Earlier, we gave
the example of weak user authentication mechanisms; other examples include
● Denial of service by account lockout. One often-used security control - especially for
authentication with username and password - is to lock out accounts that have
received several unsuccessful authentication attempts in quick succession. Attackers
can use such attempts to launch DoS attacks against a user.

● Weak credential-reset mechanisms. When cloud computing providers manage user credentials themselves rather than using federated authentication, they must provide a
mechanism for resetting credentials in the case of forgotten or lost credentials. In the
past, password-recovery mechanisms have proven particularly weak.

● Insufficient or faulty authorization checks. State-of-the-art Web application and service cloud offerings are often vulnerable to insufficient or faulty authorization
checks that can make unauthorized information or actions available to users. Missing
authorization checks, for example, are the root cause of URL-guessing attacks. In
such attacks, users modify URLs to display information of other user accounts.

● Coarse authorization control. Cloud services’ management interfaces are particularly prone to offering authorization control models that are too coarse. Thus, standard
security measures, such as duty separation, can’t be implemented because it’s
impossible to provide users with only those privileges they strictly require to carry out
their work.

● Insufficient logging and monitoring possibilities. Currently, no standards or mechanisms exist to give cloud customers logging and monitoring facilities within
cloud resources. This gives rise to an acute problem: log files record all tenant events
and can’t easily be pruned for a single tenant. Also, the provider’s security monitoring
is often hampered by insufficient monitoring capabilities. Until we develop and
implement usable logging and monitoring standards and facilities, it’s difficult - if not
impossible - to implement security controls that require logging and monitoring.
Of all these IAAA vulnerabilities, in the experience of cloud service providers,
currently, authentication issues are the primary vulnerability that puts user data in
cloud services at risk.

Provider

Vulnerabilities that are relevant for all cloud computing components typically concern
the provider - or rather users’ inability to control cloud infrastructure as they do their
own infrastructure. Among the control challenges are insufficient security audit
possibilities, and the fact that certification schemes and security metrics aren’t
adapted to cloud computing. Further, standard security controls regarding audit,
certification, and continuous security monitoring can’t be implemented effectively.

Cloud computing is in constant development; as the field matures, additional cloud-specific vulnerabilities certainly will emerge, while others will become less of an
issue. Using a precise definition of what constitutes a vulnerability from the Open
Group’s risk taxonomy and the four indicators of cloud-specific vulnerabilities we
identify here offers a precision and clarity level often lacking in current discourse
about cloud computing security.

Control challenges typically highlight situations in which otherwise successful security controls are ineffective in a cloud setting. Thus, these challenges are of
special interest for further cloud computing security research. Indeed, many current
efforts - such as the development of security metrics and certification schemes, and
the move toward full-featured virtualized network components - directly address
control challenges by enabling the use of such tried-and-tested controls for cloud
computing.

MCAD22E3 CLOUD COMPUTING

MODULE 9

9.1. HDFS MapReduce

9.2. Google App Engine (GAE)

9.3. Google Apps

9.4 Google File System (GFS)

9.5. Programming Environment for GAE


9.1. HDFS and MapReduce
The main difference between HDFS and MapReduce is that HDFS is a distributed file
system that provides high throughput access to application data while MapReduce is a
software framework that processes big data on large clusters reliably.

Big data is a collection of a large data set. It has three main properties: volume,
velocity, and variety. Hadoop is software that allows storing and managing big data.
It is an open source framework written in Java. Moreover, it supports distributed
processing of large data sets across clusters of computers. HDFS and MapReduce are
two modules in Hadoop architecture.
What is HDFS
HDFS stands for Hadoop Distributed File System. It is a distributed file system of
Hadoop to run on large clusters reliably and efficiently. Also, it is based on the
Google File System (GFS). Moreover, it also has a list of commands to interact with
the file system.
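
As a rough illustration of those commands, the short Python sketch below simply shells out to the standard hdfs dfs utility. It assumes a configured Hadoop client on the PATH, and the paths and file names are hypothetical.

import subprocess

commands = [
    ["hdfs", "dfs", "-mkdir", "-p", "/user/demo"],        # create a directory
    ["hdfs", "dfs", "-put", "input.txt", "/user/demo/"],  # copy a local file into HDFS
    ["hdfs", "dfs", "-ls", "/user/demo"],                 # list the directory contents
    ["hdfs", "dfs", "-cat", "/user/demo/input.txt"],      # print the file back
]

for cmd in commands:
    subprocess.run(cmd, check=True)  # stop if any command fails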

Furthermore, HDFS works according to the master/slave architecture. The master node, or name node, manages the file system metadata, while the slave nodes, or data nodes, store the actual data.
Figure 1: HDFS Architecture

Besides, a file in an HDFS namespace is split into several blocks. Data nodes store these blocks, and the name node maps the blocks to the data nodes, which handle the reading and writing operations with the file system. Furthermore, they perform tasks such as block creation, deletion, etc. as instructed by the name node.

What is MapReduce
MapReduce is a software framework that allows writing applications to process big
data simultaneously on large clusters of commodity hardware. This framework
consists of a single master job tracker and one slave task tracker per cluster node. The
master performs resource management, scheduling jobs on slaves, monitoring and re-
executing the failed tasks. On the other hand, the slave task tracker executes the tasks instructed by the master and constantly sends the task status information back to the master.
Figure 2: MapReduce Overview

Also, there are two tasks associated with MapReduce: the map task and the reduce task. The map task takes input data and divides it into tuples of key-value pairs, while the reduce task takes the output from a map task as input and combines those data tuples into a smaller set of tuples. Furthermore, the map task is performed before the reduce task.
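
To make the two phases concrete, here is a minimal word-count sketch written in the style of Hadoop Streaming. It assumes the same script is invoked with the argument map for the map phase and reduce for the reduce phase, and that the framework sorts the map output by key before the reduce phase reads it.

#!/usr/bin/env python3
import sys
from itertools import groupby


def map_phase(lines):
    # Emit a (word, 1) pair for every word on every input line.
    for line in lines:
        for word in line.strip().split():
            print(word + "\t1")


def reduce_phase(lines):
    # Input arrives sorted by key, so identical words are adjacent.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        print(word + "\t" + str(total))


if __name__ == "__main__":
    if sys.argv[1:] == ["map"]:
        map_phase(sys.stdin)
    else:
        reduce_phase(sys.stdin)

The same logic can be tested locally by piping a text file through the map phase, an external sort, and then the reduce phase.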

Difference Between HDFS and MapReduce


Definition
HDFS is a Distributed File System that reliably stores large files across machines in a
large cluster. In contrast, MapReduce is a software framework for easily writing
applications which process vast amounts of data in parallel on large clusters of
commodity hardware in a reliable, fault-tolerant manner. These definitions explain the
main difference between HDFS and MapReduce.

Main Functionality
Another difference between HDFS and MapReduce is that the HDFS provides high-
performance access to data across highly scalable Hadoop clusters while MapReduce
performs the processing of big data.
9.2. Google App Engine (GAE)

App Engine

Google App Engine (often referred to as GAE or simply App Engine, and also used
by the acronym GAE/J) is a platform as a service (PaaS) cloud computing platform
for developing and hosting web applications in Google-managed data centers.
Applications are sandboxed and run across multiple servers. App Engine offers
automatic scaling for web applications—as the number of requests increases for an
application, App Engine automatically allocates more resources for the web
application to handle the additional demand.

Google App Engine is free up to a certain level of consumed resources. Fees are
charged for additional storage, bandwidth, or instance hours required by the
application. It was first released as a preview version in April 2008, and came out of
preview in September 2011.

Currently, the supported programming languages are Python, Java (and, by extension,
other JVM languages such as Groovy, JRuby, Scala, Clojure, Jython and PHP via a
special version of Quercus), and Go. Google has said that it plans to support more
languages in the future, and that the Google App Engine has been written to be
language independent.

Python web frameworks that run on Google App Engine include GAE framework,
Django, CherryPy, Pyramid, Flask, web2py and webapp2, as well as a custom
Google-written webapp framework and several others designed specifically for the
platform that emerged since the release. Any Python framework that supports the
WSGI using the CGI adapter can be used to create an application; the framework can
be uploaded with the developed application. Third-party libraries written in pure
Python may also be uploaded.
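
As an illustration, a minimal request handler using the webapp2 framework mentioned above looks roughly like the sketch below (based on the classic Python runtime; the route and the message are arbitrary).

import webapp2


class MainPage(webapp2.RequestHandler):
    def get(self):
        # Respond to GET / with a plain-text greeting.
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello from Google App Engine!')


# The WSGI application object that App Engine serves.
app = webapp2.WSGIApplication([('/', MainPage)], debug=True)

Before deployment, such a handler would be paired with an app.yaml configuration file that maps incoming URLs to this WSGI application.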

Google App Engine supports many Java standards and frameworks. Core to this is the
servlet 2.5 technology using the open-source Jetty Web Server, along with
accompanying technologies such as JSP. JavaServer Faces operates with some
workarounds. Though the datastore used may be unfamiliar to programmers, it is
easily accessed and supported with JPA. JDO and other methods of reading and
writing data are also provided. The Spring Framework works with GAE, however the
Spring Security module (if used) requires workarounds. Apache Struts 1 is supported,
and Struts 2 runs with workarounds.

The Django web framework and applications running on it can be used on App
Engine with modification. Django-nonrel aims to allow Django to work with non-
relational databases and the project includes support for App Engine.

Applications developed for the Grails web application framework may be modified
and deployed to Google App Engine with very little effort using the App Engine
Plugin.

9.3. Google Apps


Google Apps is a service from Google providing independently customizable versions
of several Google products under a custom domain name. It features several Web
applications with similar functionality to traditional office suites, including Gmail,
Google Groups, Google Calendar, Talk, Docs and Sites.

Google Apps for Business is free for 30 days, then US$5 per user account per month or US$50 per user account per year. Google Apps for Education is free and offers the same
amount of storage as free Gmail accounts. Google Apps for Education combines
features from the Standard and Premier editions.

In addition to shared apps (calendar, docs, etc.), there is Google Apps Marketplace, an
App “store” for Google Apps users. It contains various apps, both free and for a fee,
which can be installed to customize the Google Apps experience for the user.

Google Apps is available in a number of distinct editions. Each edition has a limit on
the number of users that may be active at any given time. Google Apps launched with
a default user allotment of 200 users, which was shortly changed to 100 users. In
addition, users could request to have their user limit increased through a manual
process taking (at least) 1–2 weeks for approval. In January 2009, the cap was
changed so that all new accounts would receive only 50 users as opposed to 100, and
could not request more without payment. This was confirmed as relating to the launch
of the Google Apps commercial reseller program. Existing Standard Edition users
before January 2009 kept their old allocation, in addition to the ability to “request”
more users, though these limit requests are now commonly answered with suggestions
to “upgrade your subscription”. In 2011, the limit on the free Google Apps product
was further reduced to 10 users, effective for new users.

The subscription level of a Google Apps edition is billed based on the total number of
available users in the Apps account, and the edition features apply to all users
accounts in that subscription. It is not possible to purchase upgrades for a subset of
users: to increase the user limit, subscriptions must be purchased for all accounts. For
example, an upgrade from a “Standard” limit of 50 users to allow up to 60 users
would involve paying for 60 users, whether they are used or not.

Google Apps (formerly Google Apps Standard Edition)

● Free
● Brandable name and logos in the control panel, i.e. @yourdomain.com
● Same storage space as regular gmail.com accounts (over 10,300 MB as of
October 15th, 2012)
● Text ads standard (can be turned off in each account)
● Limited to 10 users within same domain.
● Email attachments cannot be larger than 25 megabytes.
● Limited to sending email to 500 external recipients per day per email account.

Google Apps Partner Edition / Google Apps for ISPs


Same as standard edition with the following exceptions:

● No limit on number of mailboxes


● Google API is available to use to manage and provision accounts
● Paid service with tech support available, with pricing starting at $0.35 per mailbox through resellers such as www.ikano.com.

Google Apps for Business (formerly Google Apps Premier Edition)

● US$50 per account per year, or US$5 per account monthly


● Text ads optional
● Integrated Postini policy-based messaging security
● Conference room/resource scheduling
● 99.9% e-mail uptime guarantee
● APIs available for Single Sign On
● 24/7 phone support
● Google Video, a service similar to YouTube with private groups
(discontinued)
● Limited to sending email to 2000 external recipients per day per email
account.
● Storage space 25 GB in each account, allocated for use across all products
including e-mail.

Google Apps for Education (formerly Google Apps Education Edition)


Same as Google Apps for Business except for the following:

● Free for K-12 schools, colleges, and universities with up to 30,000 users
● No ads for faculty, staff, or students
● Google may serve ads to accounts not associated with enrolled students, staff
or volunteers
● Storage space 25 GB as of June 24, 2011

Google Apps for Non-profits (formerly Google Apps Education Edition)


Same as Google Apps for Business except for the following:
● Free for accredited 501(c)(3) non-profit entities with less than 3,000 users
● Large non-profits eligible for 40% discount on Google Apps for Business
● No ads for faculty, staff, or students
● Google may serve ads to accounts not associated with staff or volunteers
● Storage space 25 GB as of June 24, 2011

9.4. Google File System (GFS)


Google File System (GFS) is a scalable distributed file system (DFS) created by
Google Inc. and developed to accommodate Google’s expanding data processing
requirements. GFS provides fault tolerance, reliability, scalability, availability and
performance to large networks and connected nodes. GFS is made up of several
storage systems built from low-cost commodity hardware components. It is optimized
to accommodate Google's different data use and storage needs, such as its search
engine, which generates huge amounts of data that must be stored.

The Google File System capitalized on the strength of off-the-shelf servers while
minimizing hardware weaknesses. GFS is also known as GoogleFS.

9.5. Programming Environment for GAE

Google File System (GFS)

The GFS node cluster is a single master with multiple chunk servers that are
continuously accessed by different client systems. Chunk servers store data as Linux
files on local disks. Stored data is divided into large chunks (64 MB), which are
replicated in the network a minimum of three times. The large chunk size reduces
network overhead.
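
A quick back-of-the-envelope sketch shows what the 64 MB chunk size and threefold replication imply for a single file; the 1 GB file size used here is just an example.

CHUNK_MB = 64        # GFS chunk size described above
REPLICATION = 3      # minimum number of replicas per chunk

file_mb = 1024       # a hypothetical 1 GB file
chunks = -(-file_mb // CHUNK_MB)         # ceiling division: 16 chunks
raw_storage_mb = file_mb * REPLICATION   # 3072 MB of raw storage consumed

print(chunks, "chunks,", raw_storage_mb, "MB of raw storage")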

GFS is designed to accommodate Google’s large cluster requirements without burdening applications. Files are stored in hierarchical directories identified by path
names. Metadata - such as namespace, access control data, and mapping information -
is controlled by the master, which interacts with and monitors the status updates of
each chunk server through timed heartbeat messages.

GFS features include:


● Fault tolerance
● Critical data replication
● Automatic and efficient data recovery
● High aggregate throughput
● Reduced client and master interaction because of large chunk server size
● Namespace management and locking
● High availability

The largest GFS clusters have more than 1,000 nodes with 300 TB disk storage
capacities. This can be accessed by hundreds of clients on a continuous basis.

MCAD22E3 CLOUD COMPUTING

MODULE 10

10.1. Case Studies: Openstack, Heroku and Docker Containers

10.3. Amazon EC2

10.4. AWS

10.5. Microsoft Azure

10.6 Google Compute Engine


10.1. Case Studies: Openstack, Heroku and
Docker Containers

Heroku vs. Docker


Heroku runs on dynos, which they describe as “a lightweight container
running a single user-specified command”. In essence, Heroku abstracts
the container away from the user and puts a sandbox up around what it can
do. Docker is an open-source container standard that can run just about
anywhere. With Docker you get infinitely more flexibility and portability
because the user controls the underlying container rather than having that
defined by Heroku.
Again, I want to emphasize that I’m not anti-Heroku. Heroku made life
much easier for a lot of developers when it debuted a decade ago. It was the
first major platform to deliver “application-centric” development by freeing
developers from all of the hassles that come with maintaining and
provisioning infrastructure. With Heroku, you can push your code out from
Git into a pre-provisioned environment with just a few commands.
However, like any platform, Heroku has some limitations. The two big ones
are price and inflexible deployment.
Price
You can’t do an apples-to-apples comparison between Heroku and a
Docker-based alternative. Heroku and Docker are not the same thing, and
there’s no straightforward way to compare the cost of deploying a given
number of apps via Heroku to what it would cost to do the same thing using
Docker.
But if you’re weighing the price of Heroku’s PaaS against what it would
cost to set up a Docker stack on a public cloud, or, even more conveniently,
use a Containers-as-a-Service, or CaaS, solution for deploying apps via
Docker, there’s a decent chance you’ll find Heroku to be the costlier option.
For example, Heroku charges $25/month for a standard Dyno, which
has 512 megabytes of memory and one CPU. AWS’s t2.nano instance,
which happens to have exactly the same amount of resources, costs only
$4.75/month.
Again, this is not an apples-to-apples comparison. The AWS t2.nano
instance just gives you infrastructure, which you can use to set up your own
app deployment environment. Heroku gives you a load of features in
addition to the infrastructure. Plus, I haven’t mentioned storage costs. But
the point is that if you’re working on a tight budget, you might get a better
deal by paying just for hosting, then setting up your own app deployment
pipeline using Docker and a compatible CI/CD platform.
Deployment Limitations
Heroku bundles an app delivery pipeline with the infrastructure required to
host it. That’s great if you like one-stop shopping and don’t mind hosting
your delivery chain on Heroku’s public PaaS.
It’s a big drawback, however, if you would rather run the delivery pipeline
on-premises, or on a private cloud. The most privacy you can get from
Heroku is Private Spaces, which basically means running Heroku not with
normal Heroku servers, but with network isolation. That’s very different
from the privacy (and compliance) benefits you get from keeping things on
your own local servers.
When it comes to deployment flexibility, then, Heroku is pretty inflexible.
It’s certainly much less flexible than a delivery pipeline built using Docker,
which you can run on a public cloud, a private cloud, an on-premises server
or even just your local workstation (not that you should do that in
production). Heck, you can even now run Docker on Raspberry Pi if you
want to.
I’m not saying the Heroku model is flawed. For many users, a hosted, fully
managed delivery pipeline is a great value, especially for developers who
want to focus just on their apps and nothing else, which is exactly what
Heroku is designed to let them do. But if you want more flexibility in how
you deploy your delivery pipeline, Heroku’s a poor choice.
From Heroku to Docker
You might be wondering why anyone is still using Heroku at all. If Docker
is generally more cost-effective and flexible than Heroku, why didn’t
developers migrate to Docker in droves when Docker came out in 2013?
The answer is that Docker was much harder to set up than Heroku, and that
is now only beginning to change. In contrast, Heroku is a turnkey solution.
You don’t have to spend time building your pipeline or provisioning your
infrastructure.
However, as the Docker ecosystem has expanded over the past few years,
the road for migrating from Heroku to Docker has become much smoother.
In fact, there are now multiple possible roads to get from Heroku to Docker
without hitting traffic jams.
Docker DIY
The first and probably most obvious approach is to build a Docker-based
delivery chain from scratch. There’s nothing stopping you from doing this,
and one of the benefits is that you can choose to use whichever components
you want for your stack. The choice of integration server, orchestrator,
hosting infrastructure and so on is up to you. But, of course, the big
drawback is that you have to build all of this yourself. The fact that this DIY
approach used to be the only way to migrate from Heroku to Docker is why
not everyone migrated from Heroku to Docker three years ago.
Dokku
A second, more user-friendly option is Dokku, one of the earliest Docker-
based alternatives to Heroku. Dokku is a Docker-based PaaS that basically
lets you do the same thing as Heroku, but on the infrastructure of your
choosing (including on-premises or in the cloud). It’s free, and it’s pretty
easy to set up.
So what’s not to love? Mostly the fact that Dokku is not designed for large,
distributed environments. Yes, there are ways to scale out Dokku, but once
you start investing time and energy in these tricks, you might as well build
your own Docker delivery pipeline from scratch. Dokku also has no official
user-friendly web interface for administration, which is not ideal
(although third-party interfaces exist). Last but not least, Dokku only works
on Ubuntu, which is fine if you like Ubuntu (I happen to), but it could be a
limitation if you like keeping your deployment options flexible.
CaaS + Docker-Ready CI/CD
The third possible migration path involves combining general-purpose
Containers-as-a-Service (or CaaS) platforms, which provide a turnkey
solution for setting up Docker-compatible infrastructure, with Dockerized
CI/CD pipelines built using platforms like Codefresh. The maturation of
both of these kinds of platforms over the past couple of years is what has
really changed to make Heroku-to-Docker migration easy.
CaaS platforms let you set up a Docker stack very easily, complete with an
orchestrator, registry and everything else you need to deploy your
containerized apps. There are now lots of CaaS options available, from
AWS ECS and Azure Container Service to OpenShift and Rancher. Some
run in the cloud, some run on-premises, and some can run in both places.
What this means is that CaaS provides an easy, flexible and free
replacement for the infrastructure part of Heroku’s PaaS.
Meanwhile, Docker-ready pipeline management platforms like Codefresh
replace the other half of Heroku’s functionality. They let you push code
from GitHub repositories with very little effort and deploy it as Docker
containers. Then, your CaaS provides a fuss-free solution for running those
containers.

10.3. Amazon EC2


Elastic Compute Cloud (EC2)

Amazon EC2, or Elastic Compute Cloud, refers to an on-demand computing
service on the AWS cloud platform. Under computing, it includes all the
services a computing device can offer to you, along with the flexibility of
a virtual environment. It also allows the user to configure their instances as per
their requirements, i.e. allocate the CPU, RAM, and storage according to the
needs of the current task. The user can even dismantle the virtual device once its
task is completed and it is no longer required. For providing all these scalable
resources, AWS charges a bill amount at the end of every month; the bill
amount is entirely dependent on your usage.

Features of Amazon EC2:

Functionality – EC2 provides its users a true virtual computing platform,
where they can perform various operations and even launch another EC2 instance from
this virtually created environment. This increases the security of these
virtual devices. Beyond creating instances, EC2 also allows us to customize our
environment as per our requirements at any point of time during the life span
of the virtual machine. Amazon EC2 itself comes with a set of default
AMI (Amazon Machine Image) options supporting various operating systems
along with some pre-configured resources like RAM, storage, etc.
Besides these AMI options, we can also create an AMI curated from a
combination of default and user-defined configurations. For future
purposes, we can store this user-defined AMI, so that next time the user won’t
have to re-configure a new AMI from scratch; instead, the user can simply
reuse the stored AMI while creating a new EC2 machine.

Operating Systems – Amazon EC2 includes a wide range of operating
systems to choose from while selecting your AMI. Beyond these built-in
options, users are even given the privilege to upload their own operating
systems and opt for them when selecting an AMI while launching an
EC2 instance. Currently, AWS has the following most preferred set of
operating systems available on the EC2 console.
● Amazon Linux
● Windows Server
● Ubuntu Server
● SUSE Linux
● Red Hat Linux

Software – Amazon is single-handedly ruling the cloud computing market
because of the variety of options available to its users. It allows users
to choose from various software packages to run on their EC2 machines. This
whole service is provided through the AWS Marketplace on the AWS platform.
Numerous software products, like SAP, LAMP and Drupal, etc., are available on
AWS to use.

Scalability and Reliability – EC2 provides us the facility to scale up or
scale down as per the needs. All dynamic scenarios can easily be tackled by
EC2 with the help of this feature. And because of the flexibility of volumes
and snapshots, it is highly reliable for its users. Due to the scalable nature
of the machines, many organizations like Flipkart and Amazon rely on EC2 on
days when humongous traffic occurs on their portals.
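
To give a feel for launching an instance programmatically rather than through the console, here is a hedged boto3 sketch. It assumes AWS credentials are already configured locally, and the AMI ID shown is a placeholder that must be replaced with a real image ID for your region.

import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID (hypothetical)
    InstanceType="t2.micro",          # free-tier-eligible instance type
    MinCount=1,
    MaxCount=1,
)
print("Launched instance:", instances[0].id)

# Terminate the instance once it is no longer needed so that it does not
# keep accruing charges.
instances[0].terminate()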

First, log in to your AWS account. Once you are directed to the management
console, click on “Services” from the left and, from the listed options, click
on EC2. Afterward, you will be redirected to the EC2 console, from which you
can launch and manage instances.

This was all about introducing Amazon EC2, or Amazon Elastic Compute Cloud.
If you are a free tier account user, make sure you delete all the instances or
services you have used before logging out of your AWS account.

10.4. AWS
Introduction to Amazon Web Services
Amazon Web Services (AWS), a subsidiary of Amazon.com, has invested
billions of dollars in IT resources distributed across the globe. These resources
are shared among all the AWS account holders across the globe, yet the accounts
themselves are entirely isolated from each other. AWS provides on-demand IT
resources to its account holders on a pay-as-you-go pricing model with no
upfront cost. Enterprises use AWS to reduce the capital expenditure of building
their own private IT infrastructure (which can be expensive depending upon the
enterprise’s size and nature). All the maintenance cost is also borne by
AWS, which saves a fortune for the enterprises.

AWS Global Infrastructure

The AWS global infrastructure is massive and is divided into geographical


regions. The geographical regions are then divided into separate availability
zones. While selecting the geographical regions for AWS, three factors come
into play:
● Optimizing Latency
● Reducing cost
● Government regulations (Some services are not available for some regions)

Each region is divided into at least two availability zones that are physically
isolated from each other, which provides business continuity for the
infrastructure as in a distributed system. If one zone fails to function, the
infrastructure in other availability zones remains operational. The largest region, North Virginia (US-East), has six availability zones. These availability zones
are connected by high-speed fiber-optic networking.
There are over 100 edge locations distributed all over the globe that are used
for the CloudFront content delivery network. CloudFront can cache frequently
used content, such as images and videos, at these edge locations and serve it
from the location closest to the end user for high-speed delivery. It also
protects from DDoS attacks.

AWS Management Console

The AWS management console is a web-based interface to access AWS. It


requires an AWS account and also has a smartphone application for the same
purpose. Cost monitoring is also done through the console.

AWS resources can also be accessed through various Software Development
Kits (SDKs), which allow developers to create applications with AWS as their
backend. There are SDKs for all the major languages (e.g., JavaScript, Python,
Node.js, .Net, PHP, Ruby, Go, C++). There are mobile SDKs for Android, iOS,
React Native, Unity, and Xamarin. AWS can also be accessed by making
HTTP calls using the AWS-API. AWS also provides a Command Line
Interface (CLI) for remotely accessing the AWS and can implement scripts to
automate many processes.
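
For example, a few lines of Python with the boto3 SDK are enough to list the S3 buckets in an account, assuming credentials have already been configured through the CLI or environment variables.

import boto3

s3 = boto3.client("s3")        # uses the credentials configured for the account
response = s3.list_buckets()   # a single API call enumerates the buckets

for bucket in response["Buckets"]:
    print(bucket["Name"], bucket["CreationDate"])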

AWS Cloud Computing Models

There are three cloud computing models available on AWS.


1. Infrastructure as a Service (IaaS): It is the basic building block of cloud
IT. It generally provides access to data storage space, networking features,
and computer hardware (virtual or dedicated hardware). It is highly flexible
and gives management controls over the IT resources to the developer. For
example, VPC, EC2, EBS.

2. Platform as a Service (PaaS): This is a type of service where AWS


manages the underlying infrastructure (usually operating system and
hardware). This helps the developer to be more efficient as they do not have
to worry about undifferentiated heavy lifting required for running the
applications such as capacity planning, software maintenance, resource
procurement, patching, etc., and focus more on deployment and
management of the applications. For example, RDS, EMR, ElasticSearch

3. Software as a Service(SaaS): It is a complete product that usually runs on


a browser. It primarily refers to end-user applications. It is run and managed
by the service provider. The end-user only has to worry about the
application of the software suitable to its needs. For example,
Salesforce.com, Web-based email, Office 365

10.5. Microsoft Azure

Introduction to Microsoft Azure | A cloud computing service

What is Azure?
Azure is Microsoft’s cloud platform, just like Google has its Google Cloud
and Amazon has its Amazon Web Services, or AWS. Generally, it is a
platform through which we can use Microsoft’s resources. For example, to set
up a huge server, we would require huge investment, effort, physical space and so
on. In such situations, Microsoft Azure comes to our rescue. It provides us
with virtual machines, fast processing of data, analytical and monitoring tools
and so on to make our work simpler. The pricing of Azure is also simple and
cost-effective, popularly termed “Pay As You Go”, which means you pay only
for what you use.

Azure History
Microsoft unveiled Windows Azure in early October 2008, but it went live only
in February 2010. Later, in 2014, Microsoft changed its name from Windows
Azure to Microsoft Azure. Azure provided a service platform for .NET
services, SQL Services, and many Live Services. Many people were still very
skeptical about “the cloud”. As an industry, we were entering a brave new
world with many possibilities. Microsoft Azure keeps getting bigger and better;
more tools and more functionalities are being added. It has had two
releases so far: the well-known Microsoft Azure v1 and the
later Microsoft Azure v2. Microsoft Azure v1 was more JSON-script
driven than the new version v2, which has an interactive UI for simplification and
easy learning. Microsoft Azure v2 is still in preview.

Azure can help in our business in the following ways-


● No upfront capital: We do not have to worry about capital expenditure, as Azure cuts out the high cost of hardware. You simply pay as you go and enjoy a subscription-based model that is kind to your cash flow. Setting up an Azure account is also very easy: you simply register in the Azure Portal, select your required subscription, and get going.

● Less operational cost: Azure has a low operational cost because it runs on its own servers, whose only job is to keep the cloud functional and bug-free; it is usually a whole lot more reliable than your own on-location server.

● Cost effective: If we set up a server on our own, we need to hire a tech support team to monitor it and make sure things are working fine. There may also be situations where the tech support team takes too much time to solve an issue on the server. In this regard, Azure is far more pocket-friendly.

● Easy backup and recovery options: Azure keeps backups of all your valuable data. In disaster situations, you can recover all your data in a single click without your business getting affected. Cloud-based backup and recovery solutions save time, avoid a large up-front investment, and roll up third-party expertise as part of the deal.

● Easy to implement: It is very easy to implement your business models in Azure. With a couple of on-click activities you are good to go, and there are several tutorials to help you learn and deploy faster.

● Better security: Azure provides more security than local servers, so you can be carefree about your critical data and business applications, as they stay safe in the Azure cloud. Even in natural disasters, where local resources can be harmed, Azure is a rescue: the cloud is always on.

● Work from anywhere: Azure gives you the freedom to work from anywhere and everywhere. It just requires a network connection and credentials. And with most major Azure cloud services offering mobile apps, you are not restricted to a particular device.

● Increased collaboration: With Azure, teams can access, edit and share documents anytime, from anywhere, and can work toward their goals hand in hand. Another advantage of Azure is that it preserves records of activity and data. Timestamps are one example of Azure’s record keeping; they improve team collaboration by establishing transparency and increasing accountability.
Microsoft Azure Services

The following are some of the services that Microsoft Azure offers:


1. Compute: Includes Virtual Machines, Virtual Machine Scale Sets,
Functions for serverless computing, Batch for containerized batch
workloads, Service Fabric for microservices and container orchestration,
and Cloud Services for building cloud-based apps and APIs.

2. Networking: With Azure you can use a variety of networking tools, like Virtual Network, which can connect to on-premise data centers; Load Balancer; Application Gateway; VPN Gateway; Azure DNS for domain hosting; Content Delivery Network; Traffic Manager; ExpressRoute for dedicated private network fiber connections; and Network Watcher for monitoring and diagnostics.

3. Storage: Includes Blob, Queue, File and Disk Storage, as well as a Data
Lake Store, Backup and Site Recovery, among others.

4. Web + Mobile: Creating Web + Mobile applications is very easy as it


includes several services for building and deploying applications.

5. Containers: Azure includes Container Service, which supports Kubernetes, DC/OS or Docker Swarm, and Container Registry, as well as tools for microservices.

6. Databases: Azure also includes several SQL-based databases and related tools.

7. Data + Analytics: Azure has big data tools like HDInsight for Hadoop, Spark, R Server, HBase and Storm clusters.

8. AI + Cognitive Services: Azure supports developing applications with artificial intelligence capabilities through services like the Computer Vision API, Face API, Bing Web Search, Video Indexer, and Language Understanding Intelligent Service (LUIS).

9. Internet of Things: Includes IoT Hub and IoT Edge services that can be
combined with a variety of machine learning, analytics, and
communications services.

10. Security + Identity: Includes Security Center, Azure Active Directory,


Key Vault and Multi-Factor Authentication Services.
11. Developer Tools: Includes cloud development services like Visual Studio
Team Services, Azure DevTest Labs, HockeyApp mobile app deployment
and monitoring, Xamarin cross-platform mobile development and more.
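
To give a feel for how these services are consumed, here is a minimal sketch using the Azure CLI, assuming the CLI is installed and you are signed in; the resource group and VM names are made-up examples, and the image alias may differ between CLI versions:

az login                                          # authenticate against your Azure account
az group create --name DemoRG --location eastus   # create a resource group (example name)
az vm create --resource-group DemoRG --name demo-vm --image UbuntuLTS --admin-username azureuser --generate-ssh-keys   # create a small Linux VM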

10.6 Google Compute Engine


Google Cloud Platform (GCP)

Before we begin learning about Google Cloud Platform, let us briefly recall what cloud computing is. Basically, it is using someone else’s computers over the internet. Examples: GCP, AWS, IBM Cloud, etc. Some interesting features of cloud computing are as follows:
● You get computing resources on demand and as self-service. Customers use a simple user interface and get the computing power, storage and network they need, without human intervention.
● You can access these cloud resources over the internet from anywhere on the globe.
● The provider of these resources has a huge pool of them and allocates them to customers out of that pool.
● The resources are elastic. If you need more resources you can get more, rapidly; if you need less, you can scale back down.
● Customers pay only for what they use or reserve. If they stop using resources, they stop paying.

Three Categories of Cloud Services

● Infrastructure as a Service (IaaS): It provides you with all the hardware components you require, such as computing power, storage, network, etc.
● Platform as a Service (PaaS): It provides you with a platform that you can use to develop applications, software, and other projects.
● Software as a Service (SaaS): It provides you with complete software to use, like Gmail, Google Drive, etc.

Google Cloud Platform

All the services listed above are provided by Google, hence the name Google Cloud Platform (GCP). Apart from these, there are many other services provided by GCP, along with many related concepts, which we are going to discuss in this section.
Regions and zones:

Let’s start at the finest-grained level (i.e. the smallest or first step in the hierarchy), the zone. A zone is an area where Google Cloud Platform resources, like virtual machines or storage, are deployed.

For example, when you launch a virtual machine in GCP using Compute Engine, it runs in a zone you specify (say, europe-west2-a). Although people think of a zone as being a GCP data center, that is not strictly accurate, because a zone does not always correspond to one physical building. You can still visualize a zone that way, though.

Zones are grouped into regions, which are independent geographic areas and much larger than zones (for example, zones such as europe-west2-a are grouped into the single region europe-west2), and you can choose which regions you want your GCP resources to be placed in. All the zones within a region have fast network connectivity among them. Locations within a region usually have round-trip network latencies of under five milliseconds.

As a part of developing a fault-tolerant application, you’ll need to spread your


resources across multiple zones in a region. That helps protect against
unexpected failures. You can run resources in different regions too. Lots of
GCP customers do this, both to bring their applications closer to users around
the world, and also to guard against the loss of a whole region, say, due to a
natural disaster.
A few GCP services support deploying resources in what we call a multi-region. For example, Google Cloud Storage lets you place data within the Europe multi-region. What that means is that the data is stored redundantly in at least two different geographic locations, separated by at least 160 kilometers, within Europe. At the time this was written, GCP had 15 regions; visit cloud.google.com to see the current count.
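
As a small illustration of choosing a zone, here is a sketch with the gcloud command-line tool, assuming the Google Cloud SDK is installed and a project is configured (the instance name and machine type are example values):

gcloud compute zones list                                                                        # see the zones and regions available to the project
gcloud compute instances create demo-instance --zone=europe-west2-a --machine-type=e2-medium    # place a VM in a specific zone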

Pricing

Google was the first major cloud provider to bill by the second, instead of rounding up to larger units of time, for its virtual-machines-as-a-service offering. This may not sound like a big deal, but charges for rounding up can really add up for customers who are creating and running lots of virtual machines. Per-second billing is available for virtual machine use through Compute Engine and for several other services too.

Compute Engine provides automatically applied sustained use discounts, which are discounts that you get for running a virtual machine for a significant portion of the billing month. When you run an instance for at least 25% of a month, Compute Engine automatically gives you a discount for each incremental minute you use it. Here is one more way Compute Engine saves you money.

Normally, you choose a virtual machine type from a standard set of predefined sizes, but Compute Engine also offers custom machine types, so that you can fine-tune the sizes of the virtual machines you use. That way, you can tailor your pricing to your workloads.
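
A minimal sketch of a custom machine type with gcloud, assuming the same SDK setup as above (the vCPU and memory figures are only example values):

gcloud compute instances create custom-vm --zone=europe-west2-a --custom-cpu=2 --custom-memory=4GB   # fine-tune vCPU count and memory instead of using a predefined type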

Open APIs

Some people are afraid to bring their workloads to the cloud because they are afraid they will get locked into a specific vendor. But in many ways, Google gives customers the ability to run their applications elsewhere if Google is no longer the best provider for their needs. Here are some examples of how Google helps its customers avoid feeling locked in. GCP services are compatible with open-source products. For example, take Cloud Bigtable, a database that uses the interface of the open-source database Apache HBase, which gives customers the benefit of code portability. As another example, Cloud Dataproc offers the open-source big data environment Hadoop as a managed service.
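
For example, a managed Hadoop/Spark cluster can be created with a single command; a sketch assuming the Dataproc API is enabled in the project (the cluster name and region are example values):

gcloud dataproc clusters create demo-cluster --region=europe-west2   # spin up a managed Hadoop/Spark cluster
gcloud dataproc clusters delete demo-cluster --region=europe-west2   # tear it down when finished to avoid charges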

Why choose GCP?

● GCP allows you to choose between computing, storage, big data, machine learning, and application services for your web, mobile, analytics, and back-end solutions.
● It’s global and it is cost-effective.
● It’s open-source friendly.
● It’s designed for security.

Advantages of GCP

1. Good documentation: There are many pages in total, including a reasonably detailed API reference guide.
2. Different storage classes for every need: Regional (frequent use), Nearline (infrequent use), and Coldline (long-term storage).
3. High durability: This means that data survives even in the event of the simultaneous loss of two disks.
4. Many regions available to store your data: North America, South America, Europe, Asia, and Australia.
5. The “Console” tab within the documentation allows you to try the different SDKs free of charge. It is incredibly useful for developers.
6. One of the best free tiers in the industry: $300 of free credit to start with any GCP product during the first year; afterwards, 5 GB of storage to use free of charge.

Disadvantages of GCP

1. The support fee is quite hefty: around 150 USD per month for the most basic tier (Silver class).
2. Downloading data from Google Cloud Storage is expensive: 0.12 USD per GB.
3. The Google Cloud Platform web interface is somewhat confusing; it is easy to get lost while browsing the menus.
4. Prices in both Microsoft Azure (around 0.018 USD per GB/month) and Backblaze B2 (about 0.005 USD per GB/month) are lower than Google Cloud Storage.
5. It has a complex pricing scheme, similar to AWS S3, so it is easy to incur unexpected costs (e.g., number of requests, transfers, etc.).
Lab Exercise 1 :

Best Steps to Use Devstack for Openstack Installation on CentOS 7

In this OpenStack tutorial, I will take you through the best way to use Devstack for downloading and deploying OpenStack on RedHat/CentOS 7. OpenStack is a free and open-source software platform for cloud computing, mostly deployed as infrastructure-as-a-service, whereby virtual servers and other resources are made available to customers. In this tutorial I will use the Devstack tool, but there is another way, through the Packstack tool, which can also be used to download and deploy OpenStack.

You can even use this tutorial to download and set up OpenStack on a laptop using Devstack, but before going through those steps you need to understand the difference between configuring OpenStack on a server and configuring it on a laptop. OpenStack can also be configured through Packstack on CentOS.

Openstack Installation and Configuration

Before going through OpenStack installation and configuration on your system, make sure all the prerequisites are in place, or the installation will fail. It is also possible to set up OpenStack on a laptop using the OpenStack minimal setup.

What is RDO Openstack

RDO OpenStack is an OpenStack distribution launched by RedHat in 2013. RDO is commonly expanded as the RPM Distribution of OpenStack. RDO was launched for Red Hat Enterprise Linux (RHEL) and other Linux distributions such as CentOS and Fedora.

Packstack vs Devstack

a) Packstack is mostly suitable for Red Hat-family Linux distributions like CentOS and Fedora. It basically uses Puppet modules to deploy the various OpenStack components over SSH.
b) Devstack is a set of scripts written to create a minimal OpenStack environment, which can be used to set up OpenStack on a laptop as well.
Visit OpenStack Packstack CentOS for OpenStack installation through Packstack on CentOS 7.

Openstack Configuration Step by Step

Step 1: Prerequisites
a) A minimal OpenStack setup requires at least 4 GB of memory available in your system.
b) Make sure the latest versions of Python and pip are installed in your system.
c) Install git using yum install git

[root@localhost~]# yum install git

Loaded plugins: fastestmirror

Loading mirror speeds from cached hostfile

epel/x86_64/metalink | 7.9 kB 00:00:00

* base: centos.excellmedia.net

* epel: mirrors.aliyun.com

* extras: centos.excellmedia.net

* updates: centos.excellmedia.net

base | 3.6 kB 00:00:00

epel | 5.3 kB 00:00:00

extras | 2.9 kB 00:00:00

kubernetes/signature | 454 B 00:00:00

kubernetes/signature | 1.4 kB 00:00:00 !!!

puppetlabs-pc1 | 2.5 kB 00:00:00

updates | 2.9 kB 00:00:00

(1/4): epel/x86_64/updateinfo | 1.0 MB 00:00:00

(2/4): kubernetes/primary | 60 kB 00:00:01

(3/4): puppetlabs-pc1/x86_64/primary_db | 234 kB 00:00:01

(4/4): epel/x86_64/primary_db | 6.9 MB 00:00:02

kubernetes 433/433

Resolving Dependencies

--> Running transaction check


.................................................................

Step 2: Create an User

Devstack performs a lot of changes to your system, hence it does not perform the installation as the root user. Instead, you need to create a user with sudo access to perform the installation. In our example, I have created the user stackuser.

[root@localhost ~]# useradd -d /home/stackuser -m stackuser

[root@localhost ~]# cat /etc/passwd | grep -i stackuser

stackuser:x:1000:1000::/home/stackuser:/bin/bash

Note: You can also use the create-stack-user.sh script to create a user.

Step 3: Provide sudo access to the User

You need to provide sudo access to stackuser.

[root@localhost ~]# echo "stackuser ALL=(ALL) NOPASSWD: ALL" | tee /etc/sudoers.d/stackuser

stackuser ALL=(ALL) NOPASSWD: ALL

Step 4: Openstack Download using Devstack

Switch to stackuser and start the git clone. Here we are using the Rocky version of Devstack; you can choose any version as per your requirement.

[root@localhost ~]# su - stackuser

[stackuser@localhost ~]$ git clone https://github.com/openstack-dev/devstack.git -b stable/rocky devstack/

Cloning into 'devstack'...

remote: Enumerating objects: 54, done.

remote: Counting objects: 100% (54/54), done.

remote: Compressing objects: 100% (54/54), done.

remote: Total 44484 (delta 30), reused 17 (delta 0), pack-reused 44430

Receiving objects: 100% (44484/44484), 14.11 MiB | 3.84 MiB/s, done.

Resolving deltas: 100% (30950/30950), done.


Step 5: Configure local.conf for Openstack Deployment

Once Devstack is downloaded locally on your system, go to the devstack directory and configure local.conf for OpenStack deployment as mentioned below. Here ADMIN_PASSWORD is the password used for the dashboard and service accounts, HOST_IP should be set to your machine's IP address, and RECLONE=yes tells Devstack to re-clone the OpenStack repositories on each run.

[stackuser@localhost devstack]$ cat local.conf

[[local|localrc]]

ADMIN_PASSWORD=test@123

DATABASE_PASSWORD=\$ADMIN_PASSWORD

RABBIT_PASSWORD=\$ADMIN_PASSWORD

SERVICE_PASSWORD=\$ADMIN_PASSWORD

HOST_IP=192.168.0.104

RECLONE=yes

Step 6: OpenStack Installation and Configuration

Now perform the OpenStack installation and configuration by running the stack.sh script as shown below.

[stackuser@localhost devstack]$ ./stack.sh

OpenStack installation and configuration usually takes around 20-25 minutes, depending on your network bandwidth.

Output:-

This is your host IP address: 192.168.0.104

This is your host IPv6 address: ::1

Horizon is now available at http://192.168.0.104/dashboard

Keystone is serving at http://192.168.0.104/identity/

The default users are: admin and demo

The password: test@123

DevStack Version: rocky

OS Version: CentOS Linux Release 7.7.1908(Core)

stack.sh completed in 3000 seconds.

Step 7: Test Openstack Installation


Congratulations!! The OpenStack installation and configuration is now complete. You can go to the URL below and provide the username/password.

URL: http://192.168.0.104/dashboard

User: admin

Pass: test@123
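
Optionally, the deployment can also be verified from the command line; a minimal sketch using the openrc file that Devstack generates (paths assume the devstack directory from Step 4):

[stackuser@localhost devstack]$ source openrc admin admin    # load admin credentials into the shell
[stackuser@localhost devstack]$ openstack service list       # list the registered OpenStack services
[stackuser@localhost devstack]$ openstack image list         # confirm the default CirrOS image was uploaded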

Lab Exercise 2:
HOW TO CREATE NEW LINUX VM IN OPENSTACK
DASHBOARD (HORIZON)?

Go to Project → Compute → Instances.

Click "Launch Instance".


Insert the name of the instance (e.g. "vm01") and click the Next button.

Select the Instance Boot Source (e.g. "Image"), and choose the desired image (e.g. "Ubuntu 16.04 LTS") by clicking on the arrow.

If you do not need the system disk to be bigger than the size defined in the chosen flavor, we recommend setting the "Create New Volume" option to "No".
Choose Flavor (eg. eo1.xsmall).

Click "Networks" and then choose desired networks.


Open "Security Groups". After that, choose "allow_ping_ssh_rdp" and "default".

Choose or generate an SSH keypair for your VM. Next, launch your instance by clicking on the blue button.
You will see "Instances" menu with your newly created VM.

Open the drop-down menu and choose "Console".


Click on the black terminal area (to activate access to the console). Type: eoconsole
and hit Enter.

Insert and retype new password.

Now you can type commands.


After you finish, type "exit".

This will close the session.


Lab Exercise 3:
Launching the first Openstack Instance

This exercise demonstrates how to launch the first OpenStack instance. In the previous sections, we set up the OpenStack software and went through the OpenStack dashboard functionalities. In order to launch OpenStack instances, we first need to create a network security group, rules, and key pairs to access the instances from other networks. In the security rules, I will allow port 22 and the ping protocol through the firewall. Note that once you have downloaded the key pair, there is no way to download it again, for security reasons. Let’s create the first OpenStack instance.

Create the Network security group & Configure the Rules:

1. Login to Openstack Dashboard as normal user. (demo)

2. Navigate to Access & Security. Select the tab called “Security Groups”.

Access & Security – Openstack

3. Click on “Create Security group”. Enter the name and description for the
security group.
Create Security Group

4. Once the group has been created successfully, click on “Manage Rules”.

Manage the Network Group Rules

5. Click on “Add Rule”.


Add Rule – Openstack

6. Allow ssh from anywhere to the instances.

Allow SSH – Openstack

7. Similarly, allow “ping” to this host from anywhere as well.


Allow ICMP -Ping

Once you have added those rules to the security group, it will look like below.

Security Rules – Openstack
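
The equivalent security group and rules can also be created from the OpenStack CLI; a minimal sketch, with the group name chosen here purely as an example:

openstack security group create demo-secgroup --description "allow ssh and ping"
openstack security group rule create --proto tcp --dst-port 22 demo-secgroup   # allow SSH from anywhere
openstack security group rule create --proto icmp demo-secgroup                # allow ping from anywhere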

Create the key-pair to access the instance:

1. Login to Openstack Dashboard.

2. Navigate to Access & Security. Click the tab called “Key Pairs” and click on “Create Key Pair”.
Key Pairs – Openstack

3. Enter the key pair name (keep some meaningful name). Click on “Create Key Pair”.

Enter the Key Pair Name

4. The key pair will be automatically downloaded to your laptop. If it did not download, click the link to download it. Keep the key safe, since you cannot download it again.
Download Key pair – Openstack
In case you lose the key, you need to create a new key pair and use it.

At this point, we have created the new security group and key pair. The security group allows “ssh” and ping from anywhere.
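
The key pair step can likewise be done from the CLI; a small sketch using the key pair name from this exercise:

openstack keypair create UAPAIR > UAPAIR.pem   # generate the key pair and save the private key locally
chmod 600 UAPAIR.pem                           # restrict permissions so ssh will accept the key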

Launch the New OpenstackInstance :

1. Login to Openstack Dashboard.

2. Click on “Launch Instance ” tab.

Launch instance Openstack

3. Select the instance details like below.


Enter the Instance Details

● Availability Zone – nova (you need to select your compute node; in our case the control node and compute node are the same).
● Instance Name – Enter the desired instance name.
● Flavour – Select an available flavour according to your need (see the details on the right side).
● Instance Count – Enter the instance count.
● Boot Source – Select boot from a pre-defined image.
● Image Name – Select “cirros”, since it is a very small Linux footprint suitable for testing OpenStack.

4. Click on the Access & Security tab for the instance. From the drop-down box, select the key pair “UAPAIR” which we created earlier. Also select the security group which we created. Click “Launch” to launch the new instance.
Select the security group & Key Pair

5. Here you can see that the instance has been launched. It will take a few minutes to boot the instance, depending on the size of the image we selected.

Openstack Instance Launched

6. Once the instance is completely up, you can see a screen like the one below.

Openstack Instance is up

In the IP address tab, you can get the private IP address of the instance. Using this IP, you should be able to access the instance.
7. If you would like to see the instance console, click the instance name and select the console tab. You should be able to access the instance here as well by double-clicking the console bar.

Instance Console

In OpenStack’s Kilo branch, the console may not load properly if you did not add the parameter below to the local.conf file during installation.

“enable_service n-cauth”

8. You can also check the log to know whether the instance has booted or not (if the console is not working due to the above-mentioned issue).

openstack instance log


You should be able to access the instance within the private IP range (if you did not allocate a floating IP). Here I am accessing the instance from the control node.

stack@uacloud:~$ ssh [email protected]

[email protected]'s password:

$ sudo su -

# ifconfig -a

eth0      Link encap:Ethernet  HWaddr FA:16:3E:A6:81:BE

inet addr:192.168.204.2 Bcast:192.168.204.255 Mask:255.255.255.0

inet6 addr: fe80::f816:3eff:fea6:81be/64 Scope:Link

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

RX packets:114 errors:0 dropped:0 overruns:0 frame:0

TX packets:72 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:14089 (13.7 KiB) TX bytes:8776 (8.5 KiB)

lo Link encap:Local Loopback

inet addr:127.0.0.1 Mask:255.0.0.0

inet6 addr: ::1/128 Scope:Host

UP LOOPBACK RUNNING MTU:16436 Metric:1

RX packets:0 errors:0 dropped:0 overruns:0 frame:0


TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:0

RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

# route

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

default         192.168.204.3   0.0.0.0         UG    0      0        0 eth0

192.168.204.0   *               255.255.255.0   U     0      0        0 eth0
