Cloud Computing Presentation Notes
• Massive Scale
• Homogeneity
• Virtualization
• Low-Cost Software
• Resilient Computing
• Geographic Distribution
• Service Orientation
• Advanced Security
CLOUD COMPUTING: ARCHITECTURE
CLOUD COMPUTING:
DEPLOYMENT MODELS
• Public Cloud
• Private Cloud
• Community Cloud
• Hybrid Cloud
CLOUD COMPUTING:
SERVICE MODELS
• Infrastructure-as-a-Service (IaaS):
• Virtual computing, storage and network
resources that can be provisioned on demand
• Platform-as-a-Service (PaaS):
• Application development frameworks,
operating systems and deployment
frameworks
• Software-as-a-Service (SaaS):
• Applications, management and user
interfaces provided over a network
OPEN SOURCE PRIVATE CLOUD
SOFTWARE: CLOUDSTACK
• Apache CloudStack is open source cloud software that can be used for
creating private cloud offerings.
• CloudStack manages the network, storage, and compute nodes that make up a
cloud infrastructure.
• A CloudStack installation consists of a Management Server and the cloud
infrastructure that it manages.
• Zones
• The Management Server manages one or more zones where each zone is typically a single
datacenter.
• Pods
• Each zone has one or more pods. A pod is a rack of hardware comprising a switch and
one or more clusters.
• Cluster
• A cluster consists of one or more hosts and a primary storage. A host is a compute node
that runs guest virtual machines.
• Primary Storage
• The primary storage of a cluster stores the disk volumes for all the virtual machines
running on the hosts in that cluster.
• Secondary Storage
• Each zone has a secondary storage that stores templates, ISO images, and disk volume
snapshots.
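The zone, pod, cluster, and host hierarchy described above can be sketched as nested data structures. This is an illustrative model only: the class names, field names, and example values are assumptions made for this sketch and are not part of the CloudStack API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """A compute node that runs guest virtual machines."""
    name: str
    guest_vms: List[str] = field(default_factory=list)


@dataclass
class Cluster:
    """One or more hosts that share a primary storage."""
    name: str
    primary_storage: str
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Pod:
    """A rack of hardware: a switch plus one or more clusters."""
    name: str
    switch: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    """Typically a single datacenter, with zone-wide secondary storage."""
    name: str
    secondary_storage: str
    pods: List[Pod] = field(default_factory=list)

    def all_hosts(self) -> List[Host]:
        """Walk zone -> pod -> cluster -> host."""
        return [h for p in self.pods for c in p.clusters for h in c.hosts]


# Hypothetical example values, chosen only for illustration.
zone = Zone(
    name="zone-1",
    secondary_storage="nfs://secondary/zone-1",
    pods=[Pod(
        name="pod-1",
        switch="switch-1",
        clusters=[Cluster(
            name="cluster-1",
            primary_storage="nfs://primary/cluster-1",
            hosts=[Host("host-1", ["vm-a", "vm-b"]), Host("host-2", ["vm-c"])],
        )],
    )],
)
```

Calling zone.all_hosts() walks the same containment path that the Management Server is described as managing: zones contain pods, pods contain clusters, and clusters contain hosts.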
OPEN SOURCE PRIVATE CLOUD
SOFTWARE: EUCALYPTUS
• Eucalyptus is open source software for building private and
hybrid clouds that are compatible with Amazon Web Services (AWS) APIs.
• Node Controller (NC)
• The NC hosts the virtual machine instances and manages the virtual network endpoints.
• Storage Controller (SC)
• The SC manages the Eucalyptus block volumes and snapshots for the instances within its
specific cluster. It is equivalent to the AWS Elastic Block Store (EBS).
• Walrus
• Walrus is equivalent to Amazon S3 and serves as persistent storage for all of the
virtual machines in the Eucalyptus cloud. It can be used as a simple Storage-as-a-Service.
OPEN SOURCE PRIVATE CLOUD
SOFTWARE: OPENSTACK
A data center is a specialized IT infrastructure that houses centralized IT resources, such as servers, databases, and software systems.
Data centers typically comprise the following technologies and components:
• Virtualization
• Standardization and Modularity
• Automation
• Remote Operation and Management
• High Availability
• Security-Aware Design, Operation and Management
• Facilities
• Computing Hardware
• Storage Hardware
• Networking Hardware
VIRTUALIZATION
• Full Virtualization
• In full virtualization, the virtualization layer completely decouples the guest OS from the underlying hardware. The guest
OS requires no modification and is not aware that it is being virtualized. Full virtualization is enabled by direct execution
of user requests and binary translation of OS requests.
• Para-Virtualization
• In para-virtualization, the guest OS is modified to enable communication with the hypervisor to improve performance and
efficiency. The guest OS kernel is modified to replace non-virtualizable instructions with hypercalls that communicate
directly with the virtualization layer hypervisor.
• Hardware Virtualization
• Hardware-assisted virtualization is enabled by hardware features such as Intel’s Virtualization Technology (VT-x) and
AMD’s AMD-V. In hardware-assisted virtualization, privileged and sensitive calls are set to automatically trap to the
hypervisor. Thus, there is no need for either binary translation or para-virtualization.
CHARACTERISTICS OF VIRTUALIZED ENVIRONMENTS
• Increased Security
• Managed Execution
• Sharing
• Aggregation
• Emulation
• Isolation
• Performance Tuning
• Virtual machine migration
• Portability
TAXONOMY OF VIRTUALIZATION TECHNIQUES
• Execution virtualization
• Machine reference model
• Hardware-level virtualization
• Hypervisors
• Hardware virtualization techniques
• Operating system-level virtualization
• Programming language-level virtualization
• Application-level virtualization
• Other types of virtualization
• Storage virtualization
• Network virtualization
• Desktop virtualization
• Application server virtualization
PROS OF VIRTUALIZATION
• Managed execution
• Isolation
• Simplified allocation and partitioning of resources
• Portability and self-containment
• Efficient use of resources
• Disaster Recovery
• Cloud Migration
CONS OF VIRTUALIZATION
• Performance degradation
• Virtualization interposes an abstraction layer between the guest and the host, which causes increased latencies for the guest.
• In hardware virtualization, the VMM is executed and scheduled together with other applications, sharing the resources of the
host with them and thereby causing performance degradation.
• In the case of programming language virtual machines, binary translation and interpretation can slow down the execution of
managed applications.
• Inefficiency and degraded user experience
• Virtualization can sometimes lead to an inefficient use of the host.
• Some of the specific features of the host cannot be exposed by the abstraction layer and then become inaccessible. This can
happen due to device drivers in the case of hardware virtualization and due to a lack of specific libraries in the case of
programming-level virtualization.
• Security holes and new threats
• In the case of hardware virtualization, malicious programs can preload themselves before the OS and act as a thin VMM
toward it. The OS is then controlled and can be manipulated to extract sensitive information.
• Examples: BluePill, SubVirt, etc.
IMPLEMENTATION LEVELS OF VIRTUALIZATION
Feature           | VMware ESXi          | Hyper-V                        | KVM                   | VirtualBox
Host OS           | VMware ESXi, vSphere | Windows Server                 | Linux (kernel-based)  | Windows, Linux, macOS
Guest OS Support  | Wide range           | Windows, Linux                 | Various Linux         | Windows, Linux, macOS, more
Hypervisor Type   | Type 1 (Bare-Metal)  | Type 1 (Bare-Metal)            | Type 1 (Bare-Metal)   | Type 2 (Hosted)
Management Tools  | vCenter, Web Client  | Hyper-V Manager, System Center | libvirt, virt-manager | VirtualBox Manager
High Availability | Yes (HA/DRS)         | Yes (Failover Clustering)      | Manual setup          | No native support
• The cloud storage device mechanism represents storage devices that are
designed specifically for cloud-based provisioning.
• Instances of these devices can be virtualized and are able to provide
fixed-increment capacity allocation in support of the pay-per-use
mechanism.
• Cloud storage devices can be exposed for remote access via cloud
storage services.
• Cloud Storage Levels
• Files
• Blocks
• Datasets
• Objects
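The fixed-increment capacity allocation mentioned above can be illustrated with a small calculation. This is a sketch under stated assumptions: the 10 GB increment, the price of 5 cents per GB, and the function names are hypothetical values chosen for the example, not figures from any provider.

```python
import math


def allocated_capacity_gb(requested_gb: float, increment_gb: int = 10) -> int:
    """Round a storage request up to the next whole increment."""
    if requested_gb <= 0:
        return 0
    return math.ceil(requested_gb / increment_gb) * increment_gb


def charge_cents(requested_gb: float, increment_gb: int = 10,
                 price_cents_per_gb: int = 5) -> int:
    """Pay-per-use charge for the allocated (not the requested) capacity."""
    return allocated_capacity_gb(requested_gb, increment_gb) * price_cents_per_gb
```

With these assumed values, a request for 23 GB is served by three 10 GB increments, so the consumer is billed for 30 GB.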
• They can be designed to forward collected usage data to a log database for
post-processing and reporting purposes.
• Monitoring Agent
• A monitoring agent is an intermediary, event-driven program that exists as a service agent and
resides along existing communication paths to transparently monitor and analyse dataflows.
• This type of cloud usage monitor is commonly used to measure network traffic and message
metrics.
• Resource Agent
• A resource agent is a processing module that collects usage data by having event-driven
interactions with specialized resource software.
• This module is used to monitor usage metrics based on pre-defined observable events at the
resource software level, such as initiating, suspending, resuming, and vertical scaling.
• Polling Agent
• A polling agent is a processing module that collects cloud service usage data by polling IT
resources.
• This type of cloud service monitor is commonly used to periodically monitor IT resource status,
such as uptime and downtime.
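A polling agent of the kind described above can be sketched as follows. The class name, the check_status callable, and the tick counter are assumptions made for this illustration. A real agent would sleep between polls and probe an actual IT resource, whereas this sketch replays a fixed list of statuses so that it stays deterministic.

```python
from typing import Callable, List, Tuple


class PollingAgent:
    """Collects status samples by polling an IT resource."""

    def __init__(self, check_status: Callable[[], bool]):
        self.check_status = check_status
        self.samples: List[Tuple[int, bool]] = []  # (tick, is_up)

    def poll(self, tick: int) -> bool:
        """Probe the resource once and record the result."""
        try:
            up = bool(self.check_status())
        except Exception:
            up = False  # a failed probe is counted as downtime
        self.samples.append((tick, up))
        return up

    def uptime_ratio(self) -> float:
        """Fraction of recorded polls in which the resource was up."""
        if not self.samples:
            return 0.0
        return sum(1 for _, up in self.samples if up) / len(self.samples)


# Simulated resource: up, up, down, up.
statuses = iter([True, True, False, True])
agent = PollingAgent(check_status=lambda: next(statuses))
for tick in range(4):
    agent.poll(tick)
```

The recorded samples could then be forwarded to a log database for reporting.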
CLOUD INFRASTRUCTURE MECHANISM:
RESOURCE REPLICATION
• Bandwidth consumption
• When a failure is detected, the failed instance is removed from the load
balancing scheduler. Whichever IT resource remains operational when a
failure is detected takes over the processing.
• Active-Passive
• Database Cluster
• It usually implements a load balancer mechanism that is either embedded within the
cluster management platform or set up as a separate IT resource.
• HA Cluster
• Cloud Storage Gateway: Transforms cloud storage protocols and encodes storage devices
to facilitate data transfer and storage
• Transport Protocols
• Messaging Protocols
• The two primary types of portals that are created with the remote
administration systems are:
• Self-Service Portal
CLOUD MANAGEMENT MECHANISM:
RESOURCE MANAGEMENT SYSTEM
• Tasks that are typically automated and implemented through the resource
management system include:
• Managing virtual IT resource templates that are used to create pre-built instances, such as
virtual server images.
• Allocating and releasing virtual IT resources into the available physical infrastructure in
response to the starting, pausing, resuming, and termination of virtual IT resource
instances.
• Enforcing usage and security policies throughout the lifecycle of cloud service instances.
• The master node runs the NameNode and JobTracker processes and
the slave nodes run the DataNode and TaskTracker components of
Hadoop.
• NameNode
• NameNode keeps the directory tree of all files in the file system and tracks
where across the cluster the file data is kept. It does not store the data of
these files itself. Client applications talk to the NameNode whenever they
wish to locate a file, or when they want to add/copy/move/delete a file.
• Secondary NameNode
• NameNode is a Single Point of Failure for the HDFS Cluster. An optional
Secondary NameNode which is hosted on a separate machine creates
checkpoints of the namespace.
• JobTracker
• The JobTracker is the service within Hadoop that distributes MapReduce
tasks to specific nodes in the cluster, ideally the nodes that have the data, or
at least are in the same rack.
APACHE HADOOP
• TaskTracker
• Each TaskTracker has a defined number of slots which indicate the number
of tasks that it can accept.
• DataNode
• A functional HDFS filesystem has more than one DataNode, with data
replicated across them.
• Map: In the map phase, data is read from a distributed file system and
partitioned among a set of computing nodes in the cluster. The data is sent
to the nodes as a set of key-value pairs. The Map tasks process the input
records independently of each other and produce intermediate results as
key-value pairs. The intermediate results are stored on the local disk of the
node running the Map task.
• Reduce: When all the Map tasks are completed, the Reduce phase begins
in which the intermediate data with the same key is aggregated.
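The two phases above can be sketched in plain Python with a word count. This is a single-process illustration, not Hadoop code: the function names are assumptions, and the grouping step that sits between the two phases is made explicit here as shuffle.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit an intermediate (word, 1) pair for every word in a record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group intermediate values by key before the Reduce phase."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: aggregate all values that share the same key."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run every Map task, then group by key, then run the Reduce tasks."""
    intermediate = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(intermediate)
    return dict(reduce_counts(k, v) for k, v in grouped.items())
```

Each call to map_words depends only on its own input record, which is what allows the Map tasks to run independently on different nodes.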
• MapReduce job execution starts when the client applications submit jobs to the JobTracker.
• The JobTracker returns a JobID to the client application. The JobTracker talks to the
NameNode to determine the location of the data.
• The JobTracker locates TaskTracker nodes with available slots at or near the data.
• The TaskTrackers send out heartbeat messages to the JobTracker, usually every few minutes,
to reassure the JobTracker that they are still alive. These messages also inform the JobTracker
of the number of available slots, so the JobTracker can stay up to date with where in the
cluster new work can be delegated.
• The JobTracker submits the work to the TaskTracker nodes when they poll for tasks. To
choose a task for a TaskTracker, the JobTracker uses various scheduling algorithms (default is
FIFO).
• The TaskTracker nodes are monitored using the heartbeat signals that are sent by the
TaskTrackers to JobTracker.
• The TaskTracker spawns a separate JVM process for each task so that any task failure does
not bring down the TaskTracker.
• The TaskTracker monitors these spawned processes while capturing the output and exit
codes. When the process finishes, successfully or not, the TaskTracker notifies the
JobTracker. When the job is completed, the JobTracker updates its status.
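The interaction between heartbeats, free slots, and FIFO scheduling described above can be sketched as a small simulation. This is not the Hadoop API: the class name, method names, and the timeout-based failure check are assumptions made for this illustration, and data locality is deliberately left out.

```python
from collections import deque
from typing import Deque, Dict, List


class JobTrackerSketch:
    """Hands out queued tasks in FIFO order when trackers report free slots."""

    def __init__(self, heartbeat_timeout: int):
        self.pending: Deque[str] = deque()
        self.last_seen: Dict[str, int] = {}
        self.heartbeat_timeout = heartbeat_timeout

    def submit(self, tasks: List[str]) -> None:
        """Queue the tasks of a newly submitted job."""
        self.pending.extend(tasks)

    def heartbeat(self, tracker: str, free_slots: int, now: int) -> List[str]:
        """Record that a tracker is alive and assign it up to free_slots tasks."""
        self.last_seen[tracker] = now
        assigned: List[str] = []
        while free_slots > 0 and self.pending:
            assigned.append(self.pending.popleft())  # FIFO: oldest task first
            free_slots -= 1
        return assigned

    def failed_trackers(self, now: int) -> List[str]:
        """Trackers whose last heartbeat is older than the timeout."""
        return sorted(t for t, seen in self.last_seen.items()
                      if now - seen > self.heartbeat_timeout)


jt = JobTrackerSketch(heartbeat_timeout=10)
jt.submit(["t1", "t2", "t3"])
first = jt.heartbeat("tracker-a", free_slots=2, now=0)
second = jt.heartbeat("tracker-b", free_slots=2, now=5)
late = jt.failed_trackers(now=12)
```

In this run tracker-a receives the two oldest tasks, tracker-b receives the remaining one, and tracker-a is later reported as failed because it has been silent for longer than the assumed timeout.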
DEVELOPING A MAPREDUCE APPLICATION
• Writing a program in MapReduce follows a certain pattern. You start by writing your map and reduce
functions, ideally with unit tests to make sure they do what you expect. Then you write a driver
program to run a job, which can run from your IDE using a small subset of the data to check that it is
working. If it fails, you can use your IDE’s debugger to find the source of the problem. With this
information, you can expand your unit tests to cover this case and improve your mapper or reducer as
appropriate to handle such input correctly.
• When the program runs as expected against the small dataset, you are ready to unleash it on a cluster.
Running against the full dataset is likely to expose some more issues, which you can fix as before, by
expanding your tests and mapper or reducer to handle the new cases. Debugging failing programs in
the cluster is a challenge, so we look at some common techniques to make it easier.
• After the program is working, you may wish to do some tuning, first by running through some
standard checks for making MapReduce programs faster and then by doing task profiling. Profiling
distributed programs is not easy, but Hadoop has hooks to aid the process.
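The test-first pattern described above can be sketched with a mapper and a reducer written as plain functions, together with tests that cover a malformed record. The record format ("year,temperature") and all function names are assumptions made for this example. The tests are plain functions that a runner such as pytest could collect, or that can be called directly.

```python
from typing import Iterable, Iterator, Tuple


def map_max_temp(record: str) -> Iterator[Tuple[str, int]]:
    """Emit (year, temperature) for a 'year,temperature' record; skip bad input."""
    parts = record.strip().split(",")
    if len(parts) != 2:
        return  # wrong number of fields: drop the record
    year, temp = parts
    try:
        yield year, int(temp)
    except ValueError:
        return  # malformed temperature: drop the record


def reduce_max_temp(year: str, temps: Iterable[int]) -> Tuple[str, int]:
    """Keep the highest temperature seen for a year."""
    return year, max(temps)


def test_valid_record() -> None:
    assert list(map_max_temp("1950,22")) == [("1950", 22)]


def test_malformed_records_are_skipped() -> None:
    # Cases like these are typically added after a failure on real data.
    assert list(map_max_temp("1950,n/a")) == []
    assert list(map_max_temp("garbage")) == []


def test_reducer() -> None:
    assert reduce_max_temp("1950", [0, 22, -11]) == ("1950", 22)
```

Because the mapper and reducer are ordinary functions, they can be tested without a cluster, which is the point of starting with a small local dataset.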
UNIT V
• Basic Terms and Concepts
• Threat Agents
• Cloud Security Threats
• Cloud Security Mechanism
• Encryption
• Hashing
• Digital Signature
• Public Key Infrastructure
• Identity and Access Management
• Single Sign-On
• Cloud Based Security Groups
• Hardened Virtual Server Images
FUNDAMENTAL CLOUD SECURITY:
BASIC TERMS AND CONCEPTS
• Confidentiality: Confidentiality is the characteristic of something being made accessible only to authorized parties.
• Integrity: Integrity is the characteristic of not having been altered by an unauthorized party.
• Authenticity: Authenticity is the characteristic of something having been provided by an authorized party.
• Availability: Availability is the characteristic of being accessible and usable during a specified time period.
• Threat: A threat is a potential security violation that can challenge defenses in an attempt to breach privacy and/or cause harm.
• Vulnerability: A vulnerability is a weakness that can be exploited either because it is protected by insufficient security controls, or because
existing security controls are overcome by an attack.
• Risk: Risk is the possibility of loss or harm arising from performing an activity.
• Security Controls: Security controls are countermeasures used to prevent or respond to security threats and to reduce or avoid risk.
• Security Mechanisms: Security mechanisms are components comprising a defensive framework that protects IT resources, information, and
services.
• Security Policies: A security policy establishes a set of security rules and regulations.
FUNDAMENTAL CLOUD SECURITY:
THREAT AGENTS
• Anonymous Attacker
• An anonymous attacker is a non-trusted cloud service consumer without permissions in the cloud.
• It typically exists as an external software program that launches network-level attacks through public networks.
• When anonymous attackers have limited information on security policies and defenses, it can inhibit their ability to formulate effective attacks.
• Therefore, anonymous attackers often resort to committing acts like bypassing user accounts or stealing user credentials, while using methods that either ensure
anonymity or require substantial resources for prosecution.
• It typically exists as a service agent (or a program pretending to be a service agent) with compromised or malicious logic.
• It may also exist as an external program able to remotely intercept and potentially corrupt message contents.
• Trusted Attacker
• A trusted attacker shares IT resources in the same cloud environment as the cloud consumer and attempts to exploit legitimate credentials to target cloud providers
and the cloud tenants with whom they share IT resources.
• Unlike anonymous attackers (which are non-trusted), trusted attackers usually launch their attacks from within a cloud’s trust boundaries by abusing legitimate
credentials or via the appropriation of sensitive and confidential information.
• Trusted attackers (also known as malicious tenants) can use cloud-based IT resources for a wide range of exploitations, including the hacking of weak authentication
processes, the breaking of encryption, the spamming of e-mail accounts, or the launching of common attacks, such as denial of service campaigns.
TYPES OF THREAT AGENTS
• Malicious Insider
• Malicious Insiders are human threat agents acting on behalf of or in relation to the cloud provider.
• They are typically current or former employees or third parties with access to the cloud provider’s premises.
• This type of threat agent carries tremendous damage potential, as the malicious insider may have administrative privileges for accessing cloud consumer IT
resources.
CLOUD SECURITY THREATS
• Traffic Eavesdropping
• Traffic eavesdropping occurs when data being transferred to or within a cloud (usually from the cloud consumer to the cloud provider) is passively intercepted by a
malicious service agent for illegitimate information gathering purposes.
• The aim of this attack is to directly compromise the confidentiality of the data and, possibly, the confidentiality of the relationship between the cloud consumer and
cloud provider.
• Because of the passive nature of the attack, it can more easily go undetected for extended periods of time.
• Malicious Intermediary
• The malicious intermediary threat arises when messages are intercepted and altered by a malicious service agent, thereby potentially compromising the message’s
confidentiality and/or integrity.
• It may also insert harmful data into the message before forwarding it to its destination.
• Denial of Service
• The objective of the denial of service (DoS) attack is to overload IT resources to the point where they cannot function properly.
• The network is overloaded with traffic to reduce its responsiveness and cripple its performance.
• Multiple cloud service requests are sent, each of which is designed to consume excessive memory and processing resources.
• Insufficient Authorization
• The insufficient authorization attack occurs when access is granted to an attacker erroneously or too broadly, resulting in the attacker getting access to IT resources
that are normally protected.
• This is often a result of the attacker gaining direct access to IT resources that were implemented under the assumption that they would only be accessed by trusted
consumer programs.
• A variation of this attack, known as weak authentication, can result when weak passwords or shared accounts are used to protect IT resources.
• Within cloud environments, these types of attacks can lead to significant impacts depending on the range of IT resources and the range of access to those IT
resources the attacker gains.
• Virtualization Attack
• A virtualization attack exploits vulnerabilities in the virtualization platform to jeopardize its confidentiality, integrity, and/or availability.
• A trusted attacker successfully accesses a virtual server to compromise its underlying physical server.
• Within public clouds, where a single physical IT resource may be providing virtualized IT resources to multiple cloud consumers, such an attack can have
significant repercussions.
• Malicious cloud service consumers can target shared IT resources with the intention of compromising cloud consumers or other IT resources that share the same
trust boundary.
• The consequence is that some or all of the other cloud service consumers could be impacted by the attack and/or the attacker could use virtual IT resources against
others that happen to also share the same trust boundary.
CLOUD SECURITY MECHANISM:
ENCRYPTION
• When encryption is applied to plaintext data, the data is paired with a string of
characters called an encryption key, a secret message that is established by and
shared among authorized parties. The encryption key is used to decrypt the
ciphertext back into its original plaintext format.
• The encryption mechanism can help counter the traffic eavesdropping, malicious
intermediary, insufficient authorization, and overlapping trust boundaries
security threats.
• Symmetric Encryption: It uses the same key for both encryption and decryption, both of
which are performed by authorized parties that use the one shared key.
• Asymmetric Encryption: It relies on the use of two different keys, namely a private key
and a public key. The private key is known only to its owner while the public key is
commonly available. A document that was encrypted with a private key can only be
correctly decrypted with the corresponding public key and vice versa.
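The difference between the two schemes can be shown with a toy example. Both halves are illustrations only and must not be used to protect real data: the symmetric part is a simple XOR with a random key of the same length as the message, and the asymmetric part is textbook RSA with very small primes and no padding. The variable names and the sample values are assumptions made for this sketch.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Symmetric toy cipher: the same key both encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Symmetric: one shared key is used in both directions.
plaintext = b"cloud data"
shared_key = secrets.token_bytes(len(plaintext))
ciphertext = xor_bytes(plaintext, shared_key)
recovered = xor_bytes(ciphertext, shared_key)

# Asymmetric: textbook RSA with tiny primes (insecure, illustration only).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)

message = 65                 # a message encoded as an integer smaller than n

# Encrypt with the public key, decrypt with the private key ...
encrypted_with_public = pow(message, e, n)
decrypted_with_private = pow(encrypted_with_public, d, n)

# ... and vice versa: encrypt with the private key, decrypt with the public key.
encrypted_with_private = pow(message, d, n)
decrypted_with_public = pow(encrypted_with_private, e, n)
```

In both directions of the asymmetric case, only the matching key of the pair recovers the original message.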
CLOUD SECURITY MECHANISM:
HASHING
• Once hashing has been applied to a message, it is locked, and no key is provided
for the message to be unlocked.
• The message sender can then utilize the hashing mechanism to attach the
message digest to the message.
• The recipient applies the same hash function to the message to verify that the
produced message digest is identical to the one that accompanied the message.
• In addition to its utilization for protecting stored data, cloud threats that can be
mitigated by the hashing mechanism include malicious intermediary and
insufficient authorization.
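The sender and recipient steps above can be sketched with Python's hashlib module. The choice of SHA-256 and the function names are assumptions made for this example, since the notes do not name a specific hash function.

```python
import hashlib
import hmac


def message_digest(message: bytes) -> str:
    """Sender side: compute the digest that is attached to the message."""
    return hashlib.sha256(message).hexdigest()


def verify(message: bytes, attached_digest: str) -> bool:
    """Recipient side: recompute the digest and compare it to the attached one."""
    return hmac.compare_digest(message_digest(message), attached_digest)


original = b"transfer 100 units to account 42"
digest = message_digest(original)
```

If the message is altered in transit, the recomputed digest no longer matches the attached one. Note that a plain digest protects integrity only when the digest itself cannot be replaced by the party that altered the message.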
CLOUD SECURITY MECHANISM:
DIGITAL SIGNATURE
• A digital signature provides evidence that the message received is the same as the one
created by its rightful sender.
• Both hashing and asymmetrical encryption are involved in the creation of a digital
signature, which essentially exists as a message digest that was encrypted by a private
key and appended to the original message.
• The recipient verifies the signature validity and uses the corresponding public key to
decrypt the digital signature, which produces the message digest.
• The hashing mechanism can also be applied to the original message to produce this
message digest.
• Identical results from the two different processes indicate that the message maintained
its integrity.
• The digital signature mechanism helps mitigate the malicious intermediary, insufficient
authorization, and overlapping trust boundaries security threats.
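The signing and verification steps above can be sketched by combining a hash function with textbook RSA. This is a toy under stated assumptions: the key pair is built from two fixed, publicly known Mersenne primes, there is no padding, and the digest is reduced modulo the key so that it fits. Real signatures use randomly generated keys and a standard padding scheme. All names are hypothetical.

```python
import hashlib

# Toy RSA key pair (insecure: the primes are fixed and publicly known).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """SHA-256 message digest, reduced modulo N to fit the toy key."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the message digest with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: decrypt the signature with the public key and compare it
    with a digest computed independently from the received message."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"invoice 1001: pay 250")
```

The two digests in verify come from the two different processes named above: one from decrypting the signature, the other from hashing the received message.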
CLOUD SECURITY MECHANISM:
PUBLIC KEY INFRASTRUCTURE (PKI)
• The Public Key Infrastructure (PKI) is a common approach for managing the
issuance of asymmetric keys, which exists as a system of protocols, data formats,
rules, and practices that enable large-scale systems to securely use public key
cryptography.
• This system is used to associate public keys with their corresponding key owners
(known as public key identification) while enabling the verification of key
validity.
• PKIs rely on the use of digital certificates, which are digitally signed data structures
that bind public keys to certificate owner identities, as well as to related
information, such as validity periods.
• The IAM mechanism is primarily used to counter the insufficient authorization, denial of service, and overlapping trust
boundaries threats.
CLOUD SECURITY MECHANISM:
SINGLE SIGN-ON (SSO)
• The single sign-on (SSO) mechanism enables one cloud service consumer to be
authenticated by a security broker, which establishes a security context that is
persisted while the cloud service consumer accesses other cloud services or
cloud-based IT resources. Otherwise, the cloud service consumer would need to
re-authenticate itself with every subsequent request.
• The credentials initially provided by the cloud service consumer remain valid for
the duration of a session, while its security context information is shared.
• The SSO mechanism’s security broker is especially useful when a cloud service
consumer needs to access cloud services residing on different clouds.
• The mechanism does not directly counter any of the cloud security threats. It
primarily enhances the usability of cloud-based environments for access and
management of distributed IT resources and solutions.
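The broker-based flow above can be sketched as follows. This is a simplified illustration, not a real SSO protocol: the class names, the token format, and the use of an HMAC tag as the security context are assumptions made for this example. Session expiry is omitted, and passwords are kept in plain text only to keep the sketch short.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional


class SecurityBroker:
    """Authenticates a consumer once and issues a token other services accept."""

    def __init__(self, credentials: Dict[str, str]):
        self._credentials = credentials
        self._key = secrets.token_bytes(32)  # known only to the broker

    def _tag(self, user: str) -> str:
        return hmac.new(self._key, user.encode(), hashlib.sha256).hexdigest()

    def authenticate(self, user: str, password: str) -> Optional[str]:
        """Check the credentials once and return a token, or None on failure."""
        expected = self._credentials.get(user)
        if expected is None or not hmac.compare_digest(
                expected.encode(), password.encode()):
            return None
        return f"{user}:{self._tag(user)}"

    def validate(self, token: str) -> Optional[str]:
        """Return the user name if the token was issued by this broker."""
        user, _, tag = token.partition(":")
        if hmac.compare_digest(tag.encode(), self._tag(user).encode()):
            return user
        return None


class CloudService:
    """A service that trusts the broker instead of asking for credentials."""

    def __init__(self, name: str, broker: SecurityBroker):
        self.name = name
        self.broker = broker

    def handle(self, token: str) -> str:
        user = self.broker.validate(token)
        if user is None:
            return f"{self.name}: access denied"
        return f"{self.name}: hello {user}"


broker = SecurityBroker({"alice": "s3cret"})
token = broker.authenticate("alice", "s3cret")
storage = CloudService("storage", broker)
compute = CloudService("compute", broker)
```

The consumer presents credentials once to the broker. Both services then accept the same token without asking for the password again.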
CLOUD SECURITY MECHANISM:
CLOUD-BASED SECURITY GROUPS
• Networks are segmented into logical cloud-based security groups that form logical network
perimeters.
• Each cloud-based IT resource is assigned to at least one logical cloud-based security group.
Each logical cloud-based security group is assigned specific rules that govern the
communication between the security groups.
• Multiple virtual servers running on the same physical server can become members of
different logical cloud-based security groups.
• Cloud-based security groups delineate areas where different security measures can be applied.
Properly implemented cloud-based security groups help limit unauthorized access to IT
resources in the event of a security breach. This mechanism can be used to help counter the
denial of service, insufficient authorization, and overlapping trust boundaries threats, and is
closely related to the logical network perimeter mechanism.
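The group membership and rule model above can be sketched as a small rule check. The class name, the rule format (source group, destination group, port), and the deny-by-default behaviour are assumptions made for this illustration and do not reflect any particular provider's API.

```python
from typing import Dict, Set, Tuple


class SecurityGroups:
    """Maps resources to groups and checks traffic against group rules."""

    def __init__(self) -> None:
        self.membership: Dict[str, Set[str]] = {}        # resource -> groups
        self.allowed: Set[Tuple[str, str, int]] = set()  # (src, dst, port)

    def assign(self, resource: str, group: str) -> None:
        """Put a resource into a security group (it may belong to several)."""
        self.membership.setdefault(resource, set()).add(group)

    def allow(self, src_group: str, dst_group: str, port: int) -> None:
        """Permit traffic from one group to another on a given port."""
        self.allowed.add((src_group, dst_group, port))

    def is_allowed(self, src: str, dst: str, port: int) -> bool:
        """Deny by default; allow only if some pair of groups has a rule."""
        return any((sg, dg, port) in self.allowed
                   for sg in self.membership.get(src, ())
                   for dg in self.membership.get(dst, ()))


groups = SecurityGroups()
# Two virtual servers that could run on the same physical server.
groups.assign("web-vm", "web")
groups.assign("db-vm", "db")
groups.allow("web", "db", 5432)
```

With these rules, web-vm may reach db-vm on port 5432, while the reverse direction and every other port are rejected.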
CLOUD SECURITY MECHANISM:
HARDENED VIRTUAL SERVER IMAGES