CC Compiled Notes
We are excited to offer you a complete set of notes for Cloud Computing,
covering all the essential topics across five key modules. Whether you're
preparing for exams or looking to deepen your understanding of cloud
technologies, these notes will serve as your go-to resource for navigating the
complexities of the subject.
Here’s what you can expect from each module:
Module 1: A comprehensive introduction to Cloud Computing, exploring
its service models (IaaS, PaaS, SaaS) and core concepts.
Module 2: A closer look at cloud deployment models (Public, Private,
Hybrid) and virtualization, laying the foundation for cloud architecture.
Module 3: Insights into cloud architecture and security practices to
ensure data integrity and safety in cloud environments.
Module 4: Exploring cloud storage solutions, resource management
strategies, and scalability techniques for efficient cloud operations.
Module 5: Keeping up with the latest trends and innovations in Cloud
Computing and understanding its diverse applications in real-world
scenarios.
These notes are crafted to not only aid in exam preparation but also provide
valuable insights that will serve you throughout your career in cloud
technologies. As you progress through the modules, you'll gain a well-rounded
understanding of the concepts, tools, and best practices in the ever-evolving
cloud space.
We hope these notes empower you to succeed in your studies and beyond. Let's
embark on this learning journey and unlock the potential of Cloud Computing
together!
1. Introduction
Introduction, Cloud Computing at a Glance, Historical Developments, Building Cloud
Computing Environments, Amazon Web Services (AWS), Google AppEngine, Microsoft
Azure, Hadoop, Force.com and Salesforce.com, Manjrasoft Aneka
2. Virtualization
Introduction, Characteristics of Virtualized Environments, Taxonomy of Virtualization
Techniques, Execution Virtualization, Other Types of Virtualization, Virtualization and
Cloud Computing, Pros and Cons of Virtualization, Technology Examples
3. Cloud Computing Architecture
Introduction, Cloud Reference Model, Types of Clouds, Economics of the Cloud, Open
Challenges
4. Cloud Security
Risks, Top Concerns for Cloud Users, Privacy Impact Assessment, Trust, OS Security,
VM Security, Security Risks Posed by Shared Images and the Management OS
5. Cloud Platforms in Industry
Amazon Web Services: Compute services, Storage services, Communication services,
Additional services. Google AppEngine: Architecture and core concepts, Application
life cycle, Cost model, Observations
MODULE 1
Cloud Computing: The Future of Utility-Based Computing Services
Utility Computing Concept:
Computing is being transformed into a model resembling utility services like water,
electricity, and gas.
In this model, users access services based on their requirements, without worrying
about where the services are hosted.
This approach is termed utility computing.
Cloud Computing as a New Paradigm:
Cloud computing is the latest paradigm aiming to make the vision of utility
computing a reality.
It changes how we design systems, develop applications, and leverage existing
services.
Dynamic Provisioning:
Cloud computing is based on the concept of dynamic provisioning, which applies to
services, computing power, storage, networking, and IT infrastructure.
Resources are made available via the Internet and are offered on a pay-per-use basis
by cloud vendors.
Scalability and Flexibility:
Cloud services allow users to scale infrastructure up or down based on demand.
Users only pay for the time they use these resources, which offers cost-efficiency and
flexibility.
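The pay-per-use idea can be made concrete with a small metering sketch. This is purely illustrative: the meter names and unit rates below are invented and do not reflect any real provider's pricing.

```python
# Toy pay-per-use billing: each meter has an invented unit rate, and the
# bill is simply metered quantity times rate, summed over all meters used.
RATES = {
    "compute_hours": 0.10,     # hypothetical $/instance-hour
    "storage_gb_month": 0.02,  # hypothetical $/GB-month
    "egress_gb": 0.05,         # hypothetical $/GB transferred out
}

def monthly_bill(usage: dict) -> float:
    """Sum metered usage times unit rate; unused meters cost nothing."""
    return sum(RATES[meter] * qty for meter, qty in usage.items())

bill = monthly_bill({"compute_hours": 720, "storage_gb_month": 50, "egress_gb": 10})
print(f"${bill:.2f}")  # 72.00 + 1.00 + 0.50 = $73.50
```

Note that a month with zero usage costs nothing, which is exactly the contrast with owning idle infrastructure.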
Ease of Access and Deployment:
With cloud services, anyone with a credit card can subscribe to services, deploy, and
configure servers for an application.
The infrastructure can be adjusted quickly to meet changing application needs.
Cloud Computing at a Glance
1. Leonard Kleinrock's Vision (1969):
Leonard Kleinrock, one of the chief scientists behind ARPANET, predicted the rise of
computer utilities, akin to electricity or telephone services.
He anticipated a future where computing services would be available on demand, much like
today's utilities, serving homes and offices across the country.
Cloud Computing 21CS72
This consolidation is essential for achieving the long-term vision of cloud computing as a
universal, on-demand utility service.
The Vision of Cloud Computing
1. Provisioning of Resources:
Cloud computing enables users to easily provision virtual hardware, runtime environments,
and services with just a credit card, requiring no up-front commitments.
The entire computing stack can be transformed into utilities, allowing systems to be deployed
within hours with minimal maintenance costs.
2. Growing Adoption Across Industries:
Initially met with skepticism, cloud computing has now become a widespread practice across
various business sectors and application domains due to its convenience and scalability.
The rapid demand has accelerated technical development, making services more
sophisticated and cost-effective.
3. Challenges in Vendor Standardization:
Despite advancements, cloud computing often limits users to services from a single vendor
due to the lack of standardization, making it difficult to migrate between providers.
The long-term goal is to establish an open market for IT services where cloud resources can
be traded as utilities, free from technological and legal barriers.
4. A Global Cloud Marketplace:
The vision for the future includes a global digital marketplace where cloud services can be
traded and accessed easily.
This would allow service discovery through automated processes, reducing the need for
manual intervention and enabling seamless integration into existing systems.
5. Cloud as a Utility for Diverse Stakeholders:
Different stakeholders use cloud services for various purposes: developers rely on scalable
runtime environments, end users enjoy web-based document access and processing, and
enterprises leverage on-demand storage and computing power.
The establishment of unified standards is crucial for ensuring smooth interaction between
different cloud technologies and for enabling a global cloud marketplace.
Defining a Cloud
Cloud computing has become a widely used term, encompassing a range of technologies,
services, and concepts. It is often associated with virtualized infrastructure, hardware on
demand, utility computing, IT outsourcing, platform and software as a service (SaaS), and
more. The term "cloud" historically emerged from telecommunications, symbolizing the
network or the Internet in system diagrams. In cloud computing, the Internet acts as both the
medium and the platform through which services are delivered.
Definitions
1. Armbrust's Definition:
Cloud computing refers to applications delivered as services over the Internet and the
hardware and system software in datacenters providing those services.
It covers the entire stack, from underlying hardware to high-level software services,
introducing the concept of Everything as a Service (XaaS). XaaS allows IT infrastructure,
platforms, databases, and more to be delivered as a service, priced based on usage.
2. NIST's Definition:
Cloud computing is a model for on-demand network access to a shared pool of configurable
computing resources (networks, servers, storage, applications, services). These resources can
be rapidly provisioned and released with minimal management effort.
Utility-Oriented Approach
Cloud computing adopts a utility-oriented model, where services are delivered with a
pricing model, typically a "pay-per-use" strategy. Users can rent virtual hardware, access
online storage, or use development platforms, paying only for their effective usage. This
model eliminates the need for large up-front costs and allows services to be accessed via a
web browser or API.
Reese identifies three key criteria to determine if a service qualifies as cloud computing:
1. Web Accessibility: The service is accessible via a nonproprietary web browser or
API.
2. Zero Capital Expenditure: No up-front costs are needed to begin.
3. Pay-per-Use: Users are charged based on actual usage.
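Reese's three criteria can be encoded as a tiny checker. This is a sketch for illustration only: the `ServiceOffering` class and its field names are invented, not part of any real API.

```python
# Hypothetical encoding of Reese's three cloud criteria.
from dataclasses import dataclass

@dataclass
class ServiceOffering:
    web_accessible: bool   # reachable via a nonproprietary browser or API
    upfront_cost: float    # capital expenditure required before first use
    billed_per_use: bool   # charges track actual consumption

def qualifies_as_cloud(s: ServiceOffering) -> bool:
    """A service qualifies as cloud computing iff all three criteria hold."""
    return bool(s.web_accessible and s.upfront_cost == 0 and s.billed_per_use)

print(qualifies_as_cloud(ServiceOffering(True, 0.0, True)))     # e.g. a typical IaaS offering
print(qualifies_as_cloud(ServiceOffering(True, 5000.0, True)))  # licensed on-premises software
```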
Service-Level Agreements (SLAs)
For enterprise-level services, cloud providers and users typically establish Service-Level
Agreements (SLAs), which define the quality of service, including uptime, performance, and
support terms. This relationship ensures that cloud resources are managed according to the
user's business needs.
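As a worked illustration of what an uptime clause in an SLA implies, the sketch below converts a guaranteed uptime percentage into the maximum downtime a provider may accrue over a billing period (a 30-day month is assumed here):

```python
# Convert an SLA uptime guarantee into allowable downtime per month.
def max_monthly_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    total_minutes = days * 24 * 60          # 43,200 minutes in a 30-day month
    return total_minutes * (1 - uptime_percent / 100)

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime -> {max_monthly_downtime_minutes(sla):.1f} min/month")
```

Each extra "nine" of guaranteed uptime cuts the permitted downtime by a factor of ten, which is why stricter SLA tiers are priced accordingly.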
Buyya's Definition
Buyya et al. describe cloud computing as a parallel and distributed system consisting of
virtualized computers that are dynamically provisioned and presented as unified computing
resources. These resources are managed through SLAs negotiated between providers and
consumers, which dictate service quality and usage.
A Closer Look at Cloud Computing
Cloud computing is revolutionizing the way enterprises, governments, institutions, and
research organizations build and manage their computing systems. Its ease of access and
integration, often as simple as making a credit card transaction online, makes cloud resources
highly practical for various market segments.
Real-World Examples
1. Large Enterprises:
o The New York Times: When the New York Times needed to convert its
digital library of past editions into a web-friendly format, the task required
immense computing power for a short duration. Instead of investing in
infrastructure, they used Amazon EC2 and S3 cloud resources. This allowed
them to complete the task in 36 hours with no additional ongoing costs after
relinquishing the resources.
2. Small Enterprises and Startups:
o Animoto: A company that transforms images, music, and video fragments into
customized videos for users. Their need for considerable storage and backend
processing fluctuates. Instead of owning servers, they rely entirely on Amazon
Web Services (AWS), scaling from 70 to 8,500 servers in just one week due
to user demand.
3. System Developers:
o Little Fluffy Toys: This London-based company created a widget for
providing information on nearby bicycle rentals. They used Google
AppEngine to handle the widget’s computing needs, allowing them to launch
the product within a week by focusing solely on business logic rather than
infrastructure management.
4. End Users:
o Apple iCloud: Allows users to store documents in the cloud and access them
from any device. For example, a user can take a photo with a smartphone, edit
it later on a laptop, and see it updated on a tablet. This seamless experience is
entirely transparent to the user, requiring no manual device syncing.
Cloud Computing Models
Cloud computing operates on a pay-as-you-go basis, which accommodates various needs
across sectors, including computing power, storage, and application runtime environments.
This model not only provides on-demand IT services but also reshapes how IT resources are
perceived—as utilities, similar to electricity or water.
The three major deployment models for cloud computing include:
1. Public Clouds:
o These are third-party provided environments (e.g., virtualized datacenters)
made available to consumers on a subscription basis. Public clouds allow users
to quickly access compute, storage, and application services, with data and
applications hosted on the provider's premises.
2. Private Clouds:
o Large organizations with substantial computing infrastructures replicate cloud
IT services in-house, creating a private cloud. This model allows them to
manage data and applications internally while still benefiting from the cloud's
flexibility and scalability.
3. Hybrid Clouds:
o A combination of both public and private cloud infrastructures, allowing
organizations to leverage the benefits of both while maintaining control over
certain sensitive or critical workloads.
IaaS supports users who require control over infrastructure to build scalable systems.
PaaS caters to developers building applications in a managed environment.
SaaS benefits end users who need scalable applications without any involvement in
infrastructure or software management.
1. Infrastructure-as-a-Service (IaaS)
Description: IaaS provides the fundamental building blocks of computing systems in
the form of virtualized hardware, storage, and networking. These resources are made
available on demand and can be scaled dynamically based on user needs.
Components:
o Virtual Hardware: Delivered as virtual machine instances that can be
customized with specific software stacks.
o Virtual Storage: Available either as raw disk space or in the form of object
storage for managing data entities.
o Virtual Networking: Manages connectivity among virtual instances and the
Internet or private networks.
Examples: Amazon EC2, Amazon S3, Rightscale, vCloud.
Use Case: Ideal for users who want to build dynamically scalable computing systems
or manage large-scale data processing tasks.
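The dynamic scaling that IaaS enables can be sketched as a toy autoscaling policy. The function and its parameters are hypothetical, not any provider's API; it merely shows the core decision of sizing instance count to observed load.

```python
# Toy autoscaling policy: provision just enough instances for the load,
# clamped between a minimum (availability floor) and a maximum (cost cap).
import math

def instances_needed(requests_per_sec: float, capacity_per_instance: float,
                     min_instances: int = 1, max_instances: int = 100) -> int:
    """Return how many instances to provision for the observed load."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))

print(instances_needed(50, 100))    # light load -> floor of 1
print(instances_needed(4200, 100))  # traffic spike -> 42
```

Under pay-per-use pricing, releasing the extra instances once the spike passes is what turns scalability into cost efficiency.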
2. Platform-as-a-Service (PaaS)
Description: PaaS provides a higher level of abstraction by offering scalable runtime
environments for applications. It frees users from infrastructure management,
allowing them to focus on application development using predefined APIs and
libraries. The service provider manages scalability and fault tolerance.
Components:
o Runtime Environments: Scalable and elastic environments for application
execution.
o Middleware Platforms: Abstract environments that handle application
deployment and execution.
Examples: Windows Azure, Google AppEngine, Hadoop, Aneka.
Use Case: Ideal for developers who need a scalable programming platform for
building new applications without worrying about underlying infrastructure.
3. Software-as-a-Service (SaaS)
Description: SaaS delivers software applications on demand, often through a web
browser. These applications are hosted on the provider's infrastructure, making them
more scalable and accessible without requiring users to install or maintain software.
Components:
o End-User Applications: Common desktop functionalities such as document
management, CRM, and photo editing are replicated in a cloud environment.
o Social Networking and Other Services: Cloud infrastructure supports
applications like social networking sites, which need to handle massive user
interaction loads.
Examples: Google Docs, Salesforce, Facebook, Flickr.
Use Case: Best for end users who need applications like email, CRM, or document
management, without the hassle of software development or infrastructure
management.
Characteristics and Benefits of Cloud Computing
Cloud computing provides a variety of characteristics that offer substantial benefits for both
Cloud Service Consumers (CSCs) and Cloud Service Providers (CSPs). The key
characteristics include:
1. No Up-front Commitments
Description: Users do not need to invest heavily in IT infrastructure or software
before starting to use cloud services. Resources are available on demand, and users
only pay for what they use.
Benefit: This drastically reduces capital expenditures for organizations, allowing
them to avoid costly upfront investments.
2. On-demand Access
Description: Cloud resources are available whenever they are needed, and users can
scale their consumption based on demand.
Benefit: This increases flexibility and agility, allowing organizations to dynamically
scale resources to handle traffic spikes or unexpected workloads without having to
plan capacity in advance.
3. Nice Pricing (Pay-as-you-go)
Description: Cloud services operate on a pay-as-you-go model, where users are billed
for the resources they consume (such as compute hours, storage space, etc.).
Benefit: Costs are turned into operating expenses (Opex) rather than capital
expenditures (Capex), allowing businesses to better manage their budgets and
reducing the need for expensive hardware investments.
4. Simplified Application Acceleration and Scalability
Impact: Organizations using cloud services in multiple jurisdictions may face legal
disputes regarding data ownership and access. Conflicts between local and
international laws can complicate data handling, particularly for multinational
companies.
1. Distributed Systems
Definition: A distributed system is a collection of independent computers that appears
to its users as a single coherent system. This concept is crucial for cloud computing,
which hides the complexity of its architecture behind a unified interface.
XML (eXtensible Markup Language): XML is used to structure data in a way that
is both machine-readable and human-readable, facilitating data interchange between
web services.
AJAX (Asynchronous JavaScript and XML): AJAX enables web applications to
update content asynchronously without requiring a full page reload. This results in a
more responsive and dynamic user experience.
Web Services: Web services allow applications to communicate and share data over
the internet using standard protocols. They play a crucial role in integrating and
composing different web functionalities.
RSS (Really Simple Syndication): RSS feeds allow users to receive updates from
websites without visiting them. It helps in aggregating and distributing content
efficiently.
Examples of Web 2.0 Applications
Google Documents: An online document editor that allows real-time collaboration
and sharing of documents, leveraging cloud infrastructure for storage and processing.
Google Maps: Provides interactive maps and location-based services using AJAX and
other web technologies.
Facebook: A social networking site that uses Web 2.0 technologies to provide a
highly interactive and personalized user experience.
Flickr: An image and video hosting service that harnesses user contributions for
content creation and sharing.
YouTube: A video-sharing platform that allows users to upload, view, and interact
with video content.
Wikipedia: An online encyclopedia that relies on user-generated content and
community collaboration.
Service-Oriented Computing (SOC) and Its Role in Cloud Computing
Service-Oriented Computing (SOC) represents a foundational paradigm for developing and
managing applications and systems within cloud computing. SOC focuses on using services
as the core building blocks for creating scalable, flexible, and interoperable systems. Here’s a
detailed overview of SOC and its influence on cloud computing:
Concepts of Service-Oriented Computing
1. Definition of a Service:
o Abstraction: A service is a self-contained and platform-independent
component that performs a specific function or task.
o Loose Coupling: Services are designed to be loosely coupled, meaning they
can interact with each other without being tightly integrated. This allows for
flexibility and reusability.
2. Mainframe Era:
Mainframe Computing: During the era of mainframe computers, companies like IBM
provided computing resources to large organizations such as banks and government agencies.
These early systems were among the first instances of utility-like computing, where
organizations paid for the computing power they used.
Improvements: This model led to advancements in mainframe technology, including
enhanced operating systems, process control, and user-metering features.
3. Cluster Computing:
Academic and Research Use: The concept of utility computing extended to academic and
research institutions with the advent of cluster computing. Institutions could access powerful
computing resources externally to tackle complex computational problems, known as "Grand
Challenge" problems, without needing to invest in their own infrastructure.
4. Internet and Web Technologies:
Global Access: The widespread adoption of the Internet and web technologies facilitated the
realization of utility computing on a global scale. Computing grids emerged, offering planet-scale distributed computing infrastructure accessible on demand.
Market Orientation: Computing grids introduced market-oriented elements, allowing users
to bid for or purchase computing resources, such as storage and computation, much like any
other commodity.
5. E-commerce and Online Services:
E-commerce Infrastructure: The rise of e-commerce in the late 1990s, which allowed
consumers to buy a wide range of goods and services online, contributed to the adoption of
utility computing. The development of online payment systems made it easier for users to
purchase computing resources and services.
Public Interest: Although interest in online services waned after the dot-com bubble burst,
the infrastructure for online payments and services had already been established, paving the
way for utility computing.
Building Cloud Computing Environments
Creating effective cloud computing environments involves both developing applications that
leverage cloud capabilities and designing the infrastructure and systems that deliver these
cloud services. Here's a detailed breakdown of how to approach both aspects:
1. Application Development
Applications that utilize cloud computing benefit from dynamic scaling and on-demand
resource allocation. This is crucial for handling varying workloads and complex processes.
The main categories of cloud applications include:
1. Web Applications:
Scalability: Web applications benefit significantly from cloud computing due to its ability to
scale resources based on user demand. This is essential for applications with fluctuating user
interactions and workload.
Web 2.0 Technologies: With Web 2.0, the Web has become a platform for complex and
interactive applications. These applications interact with users and backend services across
multiple tiers, making them sensitive to infrastructure sizing and deployment variability.
2. Resource-Intensive Applications:
Data-Intensive and Compute-Intensive: These applications require substantial resources
but only intermittently. Examples include scientific simulations and large-scale data analyses.
On-Demand Resource Usage: Cloud computing allows these applications to access massive
compute power and storage only when needed, avoiding the need for permanent
infrastructure investments.
3. Cloud Benefits:
Dynamic Scaling: Cloud environments provide methods for dynamically scaling compute
power, storage, and networking resources.
Runtime Environments: Cloud platforms offer environments designed for scalability and
dynamic resource allocation.
Application Services: Cloud services mimic desktop applications but are hosted and
managed on the provider's side, making integration seamless. These services are often
accessed through RESTful Web services, simplifying development and management.
2. Infrastructure and System Development
Developing cloud infrastructure and systems involves several core technologies and requires
addressing unique challenges:
1. Distributed Computing:
o Foundation: Cloud computing systems are distributed, and managing these
distributed resources effectively is crucial.
o Dynamism: The ability to provision new nodes and services on demand adds
complexity. This is primarily managed at the middleware layer.
2. Infrastructure-as-a-Service (IaaS):
o Resource Management: IaaS provides scalable resources (compute, storage,
networking) that can be added or removed as needed.
or Web services API. EC2 instances can be saved as images to serve as templates; these images are then stored in the Simple Storage Service (S3).
Simple Storage Service (S3): Provides scalable storage in the cloud organized into
buckets. S3 allows storage of various types of objects, including files and disk images,
accessible globally.
Additional Services: AWS includes a variety of other services for networking,
caching, DNS, and databases (both relational and non-relational), supporting
comprehensive cloud computing solutions.
2. Google App Engine
Google App Engine is designed for developing scalable web applications, leveraging
Google’s infrastructure to handle dynamic scaling:
Runtime Environment: Offers a secure environment for web applications with
services such as in-memory caching, scalable data stores, job queues, messaging, and
cron tasks.
Development and Deployment: Developers use the App Engine SDK to build and
test applications locally. Once tested, applications can be deployed to App Engine
with easy migration, cost containment through quotas, and availability across the
globe.
Supported Languages: Python, Java, and Go.
3. Microsoft Azure
Microsoft Azure provides a comprehensive cloud platform with the following features:
Roles: Applications are organized into roles: Web roles for hosting web applications,
Worker roles for generic workload processing, and Virtual Machine roles for fully
customizable environments including operating systems.
Additional Services: Azure offers support for storage (both relational and blobs),
networking, caching, content delivery, and more, complementing the execution of
cloud applications.
4. Hadoop
Apache Hadoop is an open-source framework for processing large data sets:
MapReduce: Implements the MapReduce programming model developed by Google,
which consists of two operations: map (transforming input data) and reduce
(aggregating map results).
Usage: Yahoo! has integrated Hadoop into its infrastructure for data processing and
operates one of the largest Hadoop clusters. Hadoop is also available for academic
use.
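The MapReduce model described above can be illustrated with a minimal in-memory word count. This is a toy sketch of the programming model only, not Hadoop's actual API: map emits (key, value) pairs, a shuffle step groups values by key, and reduce aggregates each group.

```python
# Minimal in-memory MapReduce: word count.
from collections import defaultdict

def map_phase(document: str):
    """Map: emit (word, 1) for every word in the input."""
    for word in document.split():
        yield (word.lower(), 1)

def reduce_phase(pairs):
    """Shuffle values by key, then reduce each group by summation."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(map_phase("the cloud runs the cloud"))
print(counts)  # {'the': 2, 'cloud': 2, 'runs': 1}
```

In Hadoop, the same two operations are distributed across a cluster, with the framework handling partitioning, shuffling, and fault tolerance.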
MODULE 2
VIRTUALIZATION
Introduction
Virtualization refers to a set of technologies and concepts that provide an abstract environment,
whether virtual hardware or operating systems, to run applications. Historically, virtualization has
been available in many forms, offering virtual environments at various levels, such as the operating
system, programming languages, and applications. While it has been around for a while, recent
developments have made it more prevalent, especially in delivering Infrastructure-as-a-Service (IaaS)
solutions for cloud computing.
Virtualization has gained momentum due to several factors:
Increased Performance and Computing Capacity: Modern PCs, even desktops, are
powerful enough to run virtual machines alongside regular tasks without noticeable
performance degradation.
Underutilized Hardware and Software: Powerful computers are often not fully utilized,
especially in enterprise environments, where office computers may be idle after business
hours. Virtualization can help maximize their use.
Lack of Space: Companies are often constrained by the physical space needed for data
centers. Virtualization enables server consolidation, which reduces the number of physical
servers required.
Greening Initiatives: Reducing energy consumption and the carbon footprint of data centers
is becoming increasingly important. Virtualization helps reduce power consumption and the
need for cooling by consolidating servers.
Rising Administrative Costs: As server numbers grow, so do administrative costs.
Virtualization can reduce the number of physical servers, which helps lower these costs.
In addition to hardware virtualization, other forms of virtualization have played a key role in
development, particularly virtual machine-based programming languages like Java and .NET. Java,
released in 1995, became popular for integrating small applications (applets), and by the early 2000s,
Java and .NET were used for enterprise-class applications. These developments proved that
virtualization could be implemented without significant performance overhead, paving the way for the
widespread adoption of virtualization technologies in data centers.
Characteristics of Virtualized Environments
Virtualized environments refer to the creation of virtual versions of various system components such
as hardware, software, storage, or networks. The virtualization process involves three key
components: guest, host, and the virtualization layer. These components operate as follows:
Guest: This is the system interacting with the virtualization layer instead of directly with the
host.
Host: Represents the original environment where the virtualization layer operates.
Virtualization Layer: The software responsible for recreating the environment where the
guest will run, ensuring the separation between guest and host.
Virtualization can be applied in different areas such as hardware, storage, and networking. For
instance, in hardware virtualization, the guest is typically a system image that includes an operating
system and applications running on virtual hardware managed by the Virtual Machine Manager
(VMM). This is supported by the host's physical hardware.
Key Characteristics of Virtualized Environments
1. Increased Security:
o Virtualization enables secure and isolated execution of guest systems. The virtual
machine (VM) provides a layer between the guest and the host, which can filter and
manage operations. For example, sensitive data on the host can be hidden from the
guest. This capability is essential when running untrusted code or isolated
applications like Java applets in sandboxed environments. VMs like VMware
Desktop, VirtualBox, and Parallels ensure that the virtual file system is separated
from the host system, making it ideal for running potentially harmful applications
without compromising host security.
2. Managed Execution:
o Virtualized environments provide greater control over how guest systems are
executed, offering benefits like:
Sharing: Allows multiple guests to share the same physical resources,
improving utilization. For example, in data centers, resource sharing helps
reduce server counts and power consumption.
Aggregation: The reverse of sharing, where multiple physical hosts can be
grouped together and represented as a single virtual system. This is useful in
distributed computing scenarios.
Emulation: Allows a guest to run in an environment emulated by the virtual
layer, which is different from the physical host. This feature is useful for
testing software on different platforms or running legacy applications.
Isolation: Virtualization provides each guest with a completely separate
environment, preventing interference between guests and protecting the host
from harmful operations. This is particularly useful in multi-tenant
environments.
3. Portability:
o One of the main advantages of virtualization is portability. In hardware
virtualization, virtual machine images can be moved and run on different virtual
machines with minimal effort. Similarly, applications developed for platforms like
Java (JVM) or .NET can be run on any system supporting the corresponding virtual
machine, providing flexibility and consistency across different environments.
Additional Benefits
Performance Tuning: Modern virtualization technologies allow fine-tuning of resources
allocated to guests, improving performance. For example, guests can be assigned specific
amounts of memory or processing power, and their performance can be optimized.
Virtual Machine Migration: This feature allows the movement of a guest system from one
physical machine to another without disrupting its execution, which is particularly useful in
virtualized data centers for load balancing and maintenance tasks.
These features, including enhanced security, flexibility in execution, and portability, make virtualized
environments a powerful tool for optimizing IT infrastructure and application deployment.
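The filtering and isolation behavior described above (the virtualization layer mediating every guest operation and hiding sensitive host resources) can be modeled as a toy sketch. All class names and paths below are invented for illustration.

```python
# Toy model of the virtualization layer as a mediating filter: every guest
# read goes through the layer, which denies access to paths the host has
# marked as sensitive, so a misbehaving guest cannot reach them.
class VirtualizationLayer:
    def __init__(self, host_files: dict, hidden: set):
        self._host_files = host_files
        self._hidden = hidden

    def guest_read(self, path: str) -> str:
        if path in self._hidden:
            raise PermissionError(f"guest may not access {path}")
        return self._host_files[path]

vmm = VirtualizationLayer(
    host_files={"/app/config": "ok", "/host/secrets": "key"},
    hidden={"/host/secrets"},
)
print(vmm.guest_read("/app/config"))   # allowed: visible to the guest
try:
    vmm.guest_read("/host/secrets")    # filtered by the layer
except PermissionError as e:
    print(e)
```

Real VMMs enforce this at the level of privileged instructions and device access rather than file paths, but the mediation principle is the same.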
Taxonomy of Virtualization Techniques
Virtualization involves various techniques aimed at emulating different aspects of computing. The
taxonomy of these techniques is based on the service or entity being emulated. Broadly, virtualization
can be categorized into the following areas:
Execution Virtualization
Storage Virtualization
Network Virtualization
Of these, execution virtualization is the most developed and widely used, meriting deeper
investigation and further classification. The execution virtualization techniques can be divided into
two main categories based on the type of host they require:
Cloud Computing 21CS72
1. Process-Level Virtualization:
o These techniques run on top of an existing operating system, which has full control
over the hardware.
o The virtualization layer creates isolated environments for applications, which operate
as if they have their own operating system resources but share the underlying kernel.
o Examples include containers like Docker and LXC, which offer isolated user spaces
for running applications.
2. System-Level Virtualization:
o These techniques are implemented directly on hardware and do not require (or require
minimal support from) an existing operating system.
o This type of virtualization provides a guest system with a virtual environment that
closely mimics the underlying hardware.
[Figure: taxonomy of virtualization techniques. Execution virtualization divides into
process-level techniques (application-level, programming-language-level, operating-
system-level, multiprogramming, high-level VMs, emulation) and system-level techniques
(emulation, hardware-assisted virtualization, full virtualization, paravirtualization,
partial virtualization), alongside storage virtualization and network virtualization.]
o Examples include hypervisors like VMware ESXi, Hyper-V, and KVM, which
allow multiple operating systems to run on a single physical machine by abstracting
the hardware.
Categories within Execution Virtualization
Execution virtualization techniques can be classified based on the type of virtual environment
provided to the guest:
1. Bare Hardware Virtualization:
o The guest operating system runs directly on a virtualized version of the hardware. The
hypervisor manages virtual machines by providing them with virtualized access to
physical hardware.
o Example: Hypervisors like VMware, Xen, and Hyper-V.
[Figures: the machine reference model (Applications issue API calls to Libraries, which
use the ISA exposed by Hardware) and the hypervisor reference architecture, in which
virtual machines issue ISA instructions to a dispatcher that routes them either to an
allocator (for resource-affecting instructions) or to interpreter routines.]
The VMM architecture allows guest OSs to run on top of the virtual environment transparently, as if
they were running directly on physical hardware.
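The classic VMM organization (dispatcher, allocator, interpreter routines) can be sketched in a few lines. The operation names and routing rules below are invented for illustration; a real hypervisor dispatches on trapped hardware instructions, not strings.

```python
# Illustrative sketch of the VMM reference architecture: a dispatcher
# receives every trapped instruction, hands resource-affecting ones to
# the allocator, and runs an interpreter routine for the rest.

ALLOCATOR_OPS = {"set_page_table", "grant_device"}   # change resource state

def allocator(op):
    return f"allocator: reassigned resources for {op}"

def interpreter(op):
    return f"interpreter: emulated {op} on behalf of the guest"

def dispatcher(trapped_op):
    """Entry point invoked when a guest instruction traps to the VMM."""
    if trapped_op in ALLOCATOR_OPS:
        return allocator(trapped_op)
    return interpreter(trapped_op)
```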
Goldberg and Popek Criteria for Virtualization
For efficient hardware-level virtualization, the following criteria must be met, as defined by Goldberg
and Popek (1974):
1. Equivalence: The behavior of a guest OS under a VMM should be identical to its behavior on
the physical host.
2. Resource Control: The VMM must have full control over virtualized resources.
3. Efficiency: Most of the machine instructions must execute directly on the hardware without
VMM intervention.
Theorem 3.1: Conditions for Constructing a Virtual Machine Manager (VMM)
This theorem states that for any conventional third-generation computer, a VMM can be constructed if
the sensitive instructions of the system are a subset of the privileged instructions. Sensitive
instructions are those that affect system resources; if every such instruction traps to the
hypervisor when executed in user mode, the VMM can manage the resources without
significant performance degradation. Resource control is ensured because the hypervisor
operates in the most privileged mode (Ring 0), while non-sensitive instructions execute directly
on the hardware without hypervisor intervention, maintaining the system's equivalence property.
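Theorem 3.1 reduces to a set inclusion that can be checked directly. The sketch below is a hedged illustration; the instruction names are examples, not a complete instruction set.

```python
# Popek-Goldberg condition as a set relation: a VMM can be constructed
# when every sensitive instruction is also privileged (i.e., it traps
# when executed outside the most privileged mode).

def virtualizable(sensitive, privileged):
    """True if sensitive is a subset of privileged."""
    return set(sensitive) <= set(privileged)

# Classic x86 fails this test: instructions such as POPF behave
# differently in user mode instead of trapping, so they are sensitive
# but not privileged -- the motivation for binary translation and,
# later, hardware-assisted virtualization.
```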
Theorem 3.2: Recursive Virtualization
A system is recursively virtualizable if:
1. It is virtualizable (i.e., supports VMM construction as per Theorem 3.1).
2. A VMM without timing dependencies can be built on the system.
This allows for nested virtualization, meaning a VMM can run another VMM on top of itself, as long
as the system's resources support it.
Theorem 3.3: Hybrid VMM Construction
This theorem introduces the concept of a hybrid virtual machine (HVM), which may be built for
systems where the set of user-sensitive instructions is a subset of privileged instructions. Unlike full
virtualization, HVMs rely on interpreting more instructions (especially those related to sensitive
system behaviors) instead of executing them directly. This approach is less efficient than full
virtualization but still enables system emulation.
Virtualization Techniques
Hardware-Assisted Virtualization: This method uses hardware features (such as Intel VT
and AMD-V) to support virtual machine managers in running guest OSes in isolation. It
improves performance by reducing the need for software-based emulation, which was the
standard before these hardware extensions were introduced.
Full Virtualization: Refers to complete hardware emulation, allowing the guest OS to run
without modifications. While it provides complete isolation and security, it can introduce
performance challenges, particularly with privileged instructions. Hardware-assisted
virtualization is often combined with full virtualization to mitigate these issues.
Paravirtualization: In this method, the guest OS is modified to interact directly with the host
for certain performance-critical operations, leading to improved performance. It requires the
guest OS source code to be accessible for modifications, making it more suitable for open-
source systems. Xen is a notable example using paravirtualization.
Partial Virtualization: This technique emulates only part of the hardware, allowing certain
applications to run in isolation but not the full operating system. Address space virtualization
is an example, commonly used in time-sharing systems where multiple users or applications
share the same hardware but operate in separate memory spaces.
Operating System-Level Virtualization: Here, different user-space instances are created and
isolated within the same OS kernel, allowing multiple applications to run concurrently in
separate environments. Unlike hardware virtualization, there is no hypervisor. This technique
is efficient for scenarios where servers share the same OS and resources. Examples include
FreeBSD Jails, Solaris Zones, and OpenVZ.
Application-level virtualization
This is a technique that allows applications to run in environments where they are not natively
supported. Unlike traditional installation, applications are executed as if they were running in their
expected environment. This technique mainly deals with partial file systems, libraries, and operating
system components, which are emulated by a lightweight program or component responsible for
executing the application.
Key Features of Application-level Virtualization:
1. Runtime Emulation: A thin layer emulates the necessary parts of the runtime environment,
allowing applications to function without needing to be fully installed or supported by the
host system.
2. Emulation vs. Hardware-level Virtualization:
o Emulation: Executes programs compiled for different hardware architectures.
o Hardware Virtualization: Emulates a complete hardware environment that allows
running entire operating systems.
3. Strategies for Emulation:
o Interpretation: Each source instruction is interpreted and emulated, leading to poor
performance due to the overhead of interpreting every instruction.
o Binary Translation: Translates source instructions to native ones, caching translated
blocks for reuse. It has high initial overhead but improves performance as cached
instructions are reused.
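The trade-off between the two strategies can be sketched with a toy translation cache. The counters and the `native:` prefix are illustrative stand-ins for real per-instruction and per-block translation work.

```python
# Toy comparison of the two emulation strategies: interpretation pays a
# cost on every execution of every instruction, while binary translation
# pays once per block and then reuses the cached translated code.

translation_cache = {}

def run_interpreted(block, counters):
    for instr in block:
        counters["interpreted"] += 1          # cost on every execution

def run_translated(block, counters):
    key = tuple(block)
    if key not in translation_cache:
        counters["translated"] += len(block)  # one-time translation cost
        translation_cache[key] = [f"native:{i}" for i in block]
    return translation_cache[key]             # cached native code reused
```

Executing the same three-instruction block ten times costs 30 interpretation steps but only 3 translation steps, which is why translation's high initial overhead amortizes quickly.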
Advantages:
Handling Missing Libraries: Application-level virtualization can use replacement libraries
or remap calls to compatible functions, enabling applications to run even with missing
components.
Lighter Virtualization: Compared to hardware-level virtualization, this method is less
resource-intensive because it emulates only parts of the runtime environment, not the whole
hardware.
Compatibility: It allows applications that are otherwise incompatible to run together on the
same host system.
Popular Solutions:
Wine: Runs Windows applications on Unix-like operating systems. It also provides
Winelib, a development library for porting Windows applications to Unix.
CrossOver: A solution for running Windows applications on macOS.
VMware ThinApp: Captures installed applications and packages them into an executable
that runs independently of the host operating system.
Advantages of Virtualization:
1. Managed Execution and Isolation: Virtualization enables the creation of secure, isolated
environments where harmful operations are restricted, allowing better management of
computing resources. This capability is crucial for server consolidation, providing better
control and enhanced security in IT environments.
2. Portability: Virtualized systems, such as virtual machine instances, can be easily transported
as files. They are self-contained, simplifying their administration. This portability is useful for
migrating workloads and managing applications across different hardware systems.
3. Resource Efficiency: Virtualization allows multiple systems to share the same hardware
resources without interference. This leads to more efficient use of computing power and can
reduce costs by consolidating servers, especially in underutilized environments.
4. Cost Reduction: With fewer physical machines needed to handle the same workload,
virtualization lowers hardware, maintenance, and energy costs. This leads to energy
conservation and reduced environmental impact.
5. Security: Virtualization offers the advantage of controlled, sandboxed environments, reducing
the risk of harmful software affecting the underlying system.
6. Dynamic Resource Allocation: Virtualization enables dynamic adjustment of resources to
meet current load demands, improving system flexibility. This is particularly useful for
applications that need to scale in real-time.
Disadvantages of Virtualization:
1. Performance Degradation: Since virtualization adds an extra layer between the hardware
and guest system, it can increase latency and reduce overall performance. This is particularly
noticeable in hardware virtualization due to the overhead involved in managing virtual
processors, privileged instructions, and paging.
2. Inefficiency and Degraded User Experience: Some host features may not be accessible to
the virtual machine due to the abstraction layer. For example, drivers for specific hardware
devices may not be fully utilized, leading to inefficiencies or reduced capabilities in
virtualized environments.
3. Security Vulnerabilities: Virtualization can introduce new security risks, such as malware
that can emulate virtual environments, compromising the host system. Examples like BluePill
and SubVirt demonstrate how virtualization malware can infiltrate a system by manipulating
the guest operating system.
4. Complexity in Resource Management: Virtualization adds complexity to managing
resources, which can sometimes result in suboptimal resource allocation, negatively
impacting performance or efficiency.
Xen and Paravirtualization
Xen is a widely-used open-source virtualization platform that implements paravirtualization.
Initially developed at the University of Cambridge, Xen has grown with significant contributions from
the open-source community and is also offered commercially through Citrix as XenSource. Xen is
versatile, supporting desktop, server, and cloud computing via the Xen Cloud Platform (XCP).
The key element of Xen's architecture is the Xen Hypervisor, which enables efficient
paravirtualization and, more recently, hardware-assisted virtualization for full virtualization.
Paravirtualization vs Full Virtualization
Paravirtualization in Xen differs from full virtualization because it modifies the guest operating
system to eliminate performance penalties related to special instruction management. This
modification leads to high-performance execution, making Xen suitable for x86 architecture on
commodity machines and servers. Full virtualization, on the other hand, emulates the entire system
without modification to the guest OS, which can introduce performance loss.
Xen Architecture
Xen uses a privilege model mapped onto the classic x86 security rings (0 to 3):
The Xen Hypervisor operates at the highest privilege level (Ring 0) and controls the guest
OS's access to the hardware.
Guest OS instances run within domains, with Domain 0 having privileged access for
managing the other guest systems (Domain U). Domain 0 hosts an HTTP interface for
managing virtual machines (VMs) and serves as the base for cloud Infrastructure-as-a-Service
(IaaS) systems.
[Figure: Xen privilege model on the x86 rings. Ring 3: user applications and VM
management tools (HTTP interface, access to the Xen hypervisor) in user domains
(Domain U). Ring 1: the guest OS, with a modified codebase issuing hypercalls into the
Xen VMM. Ring 0: the Xen hypervisor, handling privileged instructions, memory
management, CPU state registers, and device I/O, on top of the hardware.]
By running guest operating systems in Ring 1, Xen maintains Application Binary Interface (ABI)
compatibility. However, some system calls from Ring 3 to Ring 0 can cause traps or faults, requiring
modifications to guest operating systems. These modifications involve hypercalls, which allow the
Xen hypervisor to handle sensitive instructions.
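The hypercall path can be modeled with a toy sketch. The class and operation names below are ours for illustration and do not reflect Xen's actual interface; the point is only the control flow, in which the modified guest calls into the hypervisor instead of executing a privileged instruction.

```python
# Toy model of the paravirtualized control flow: the guest kernel in
# Ring 1 cannot issue privileged instructions, so its modified codebase
# replaces them with hypercalls into the Ring 0 hypervisor, which
# validates and performs the sensitive operation.

class Hypervisor:                      # conceptually runs in Ring 0
    def __init__(self):
        self.log = []
    def hypercall(self, op, *args):
        # Sensitive work happens here, under hypervisor control.
        self.log.append((op, args))
        return "ok"

class ParavirtGuest:                   # conceptually runs in Ring 1
    def __init__(self, hv):
        self.hv = hv
    def update_page_table(self, entry):
        # A native kernel would execute a privileged instruction;
        # the modified guest issues a hypercall instead.
        return self.hv.hypercall("mmu_update", entry)
```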
VMware: Full Virtualization
VMware implements full virtualization, a technology that replicates the underlying hardware and
presents it to the guest operating system. The guest OS operates as though it has direct access to the
physical hardware, remaining unaware of the virtualization layer. This contrasts with
paravirtualization (e.g., Xen), where the guest OS must be modified. VMware’s full virtualization
supports both desktop and server environments using Type I and Type II hypervisors.
Hypervisors in VMware:
1. Type I Hypervisors: Also known as bare-metal hypervisors, these run directly on the server
hardware, allowing VMware to virtualize server environments. An example is VMware
ESXi, which manages virtual machines (VMs) at the hardware level.
2. Type II Hypervisors: These run on top of an existing operating system, making them suitable
for desktop environments. VMware Workstation is an example of this, enabling users to run
multiple guest operating systems on a single desktop machine.
Full Virtualization Mechanism
VMware achieves full virtualization by using:
Direct execution for non-sensitive instructions, where the guest OS runs instructions directly
on the host CPU.
Binary translation for sensitive instructions, which involves dynamically translating
privileged instructions to ensure safe execution in a virtual environment.
This mechanism allows VMware to virtualize x86 architectures without needing modifications to the
guest OS, providing broad compatibility across various operating systems.
Full Virtualization and Binary Translation in VMware
VMware is renowned for its ability to virtualize x86 architectures, allowing guest operating systems
(OS) to run unmodified on its hypervisors. Prior to the introduction of hardware-assisted
virtualization (such as Intel VT-x and AMD-V in 2006), VMware relied on dynamic binary
translation to achieve full virtualization.
The Challenge with x86 Architecture
The x86 architecture does not natively meet the conditions required for classical virtualization,
particularly because the set of sensitive instructions (which control hardware and privileged actions)
is not fully encapsulated within the privileged instruction set (executed in Ring 0). In a virtualized
environment, where the guest OS runs in Ring 1 instead of Ring 0, these sensitive instructions can
misbehave, causing errors or traps.
Dynamic Binary Translation
To manage these issues, VMware utilized dynamic binary translation, a technique where sensitive
instructions that trigger traps are dynamically translated into a different set of instructions that can
be safely executed in the virtualized environment without causing exceptions. Once translated, the
new instruction set is cached, eliminating the need to retranslate them during subsequent executions.
VMware GSX Server: An early VMware product that virtualizes server environments,
specifically targeting web server virtualization. It supports remote management and scripting
for VMs. The GSX Server architecture includes a daemon process (serverd) that manages VM
instances and communicates with the VMware driver.
VMware ESX and ESXi Servers: These are hypervisor-based solutions installed directly on
bare-metal servers:
o VMware ESX includes a service console based on a modified Linux kernel to
manage the hypervisor.
o VMware ESXi is a more lightweight solution with a reduced memory footprint,
providing minimal OS overhead. It employs the VMkernel, a minimal POSIX-
compliant operating system for resource scheduling, I/O stacks, and device drivers.
3. Infrastructure Virtualization and Cloud Solutions
VMware provides an entire cloud infrastructure stack:
VMware vSphere: The core of VMware’s data center virtualization, vSphere ties together a
pool of virtualized servers, providing services like virtual storage, virtual networking, and
[Figure: the VMware product stack — application layer (Zimbra), platform layer
(vFabric), cloud and virtualization management (vCloud, vCenter), and infrastructure
virtualization (pools of vSphere-virtualized servers).]
Architecture:
Hyper-V enables multiple concurrent executions of guest operating systems through partitions—
completely isolated environments where operating systems run. The architecture includes a parent
partition, which has direct access to hardware, and child partitions that host guest operating systems
without direct hardware access.
Key components of the Hyper-V hypervisor include:
Hypercalls Interface: Entry point for executing sensitive instructions, allowing
Enlightened I/O optimizes I/O operations by allowing guest operating systems to use an interpartition
communication channel, VMBus, rather than relying on hardware emulation. This feature benefits
hypervisor-aware operating systems, enhancing performance for storage, networking, and other
subsystems.
The architecture includes:
VMBus: Communication channel for data exchange between partitions.
Virtual Service Providers (VSPs): Kernel-level drivers in the parent partition providing
hardware access.
Virtual Service Clients (VSCs): Virtual device drivers in child partitions.
Legacy operating systems that lack hypervisor awareness still function but rely on less efficient device
driver emulation.
Parent Partition
The parent partition runs the host OS and implements the virtualization stack, directly accessing
device drivers and mediating access for child partitions. It manages the lifecycle of child partitions via
the Virtualization Infrastructure Driver (VID) and instantiates a Virtual Machine Worker Process
(VMWP) for each child partition.
Child Partitions
Child partitions execute guest operating systems in isolated environments. They can be either
enlightened (benefiting from Enlightened I/O) or unenlightened (relying on hardware emulation).
Cloud Computing and Infrastructure Management
Hyper-V serves as the foundation of Microsoft's virtualization infrastructure, complemented by
additional components for enhanced server virtualization capabilities. Windows Server Core, a
streamlined version of Windows Server 2008, offers a reduced maintenance footprint by removing
unnecessary features, while still allowing remote management through PowerShell.
System Center Virtual Machine Manager (SCVMM) 2008 enhances virtual machine management
with capabilities like:
Creation and management of virtual instances
Virtual to Virtual (V2V) and Physical to Virtual (P2V) conversions
Delegated administration
Intelligent VM placement and host capacity management
SCVMM integrates with various virtualization platforms, particularly benefiting from Hyper-V’s
infrastructure.
----------------------------------------------END OF MODULE 2--------------------------------------------------
MODULE 3
CLOUD COMPUTING ARCHITECTURE
The Cloud Reference Model
Cloud computing supports various IT services that can be consumed as utilities and delivered
through networks, most commonly the internet. This model encompasses multiple aspects,
including infrastructure, development platforms, applications, and services. The cloud
reference model organizes these elements in a layered architecture, which provides a
structured view of how cloud computing resources are managed and utilized.
Types of Clouds
Cloud computing is categorized into several types based on the administrative domain and
the deployment model. Each type addresses different needs for service delivery, resource
management, and security.
1. Public Clouds
Definition: Public clouds are open to the general public and are managed by third-
party cloud service providers. The services offered, such as computing power, storage,
and applications, are delivered over the Internet.
Key Features:
o Multitenancy: Public clouds support multiple users with isolated
environments, ensuring secure and efficient service delivery.
o Scalability: They can scale resources up or down based on demand, making
them ideal for businesses with variable needs.
o Cost-Efficiency: Public clouds are based on a pay-as-you-go model, reducing
the need for upfront investments in hardware and infrastructure.
o Examples: Amazon EC2 (IaaS), Google AppEngine (PaaS), Salesforce.com
(SaaS).
o Architecture: They are typically made up of geographically dispersed
datacenters to handle large-scale user demand and ensure reliability.
2. Private Clouds
Definition: Private clouds are deployed within the premises of an organization,
offering computing resources solely for internal users. They are designed for
organizations that need greater control over their infrastructure and security.
Key Features:
o Control and Security: Private clouds provide organizations with control over
their infrastructure, reducing security risks associated with public clouds.
o Customization: These clouds can be tailored to meet specific organizational
needs, such as compliance with regulatory requirements.
o Efficiency: Organizations can optimize their existing IT resources and reduce
the burden of managing physical infrastructure.
o Examples: VMware vCloud, Eucalyptus, OpenNebula.
3. Hybrid Clouds
A hybrid cloud integrates private and public cloud infrastructures, allowing enterprises to
benefit from the scalability and cost-effectiveness of public clouds while maintaining
sensitive data and critical workloads in a private cloud. This hybrid approach enables
businesses to optimize their IT resources, balancing between control and flexibility.
Key Characteristics:
Private Cloud Integration: Enterprises can maintain their existing IT infrastructure
while utilizing public cloud resources when required.
Dynamic Provisioning: Hybrid clouds offer dynamic provisioning, where additional
resources from public clouds can be provisioned when needed and released when no
longer necessary. This is known as cloudbursting, a concept where a private cloud
expands into a public cloud to handle peak loads.
Security: Security concerns are generally limited to the public portion of the cloud,
where less sensitive workloads are run. Sensitive data can still remain within the
private cloud.
Scalability: By leveraging external resources, hybrid clouds address scalability
challenges, especially during demand surges. These resources are rented temporarily,
ensuring cost efficiency.
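The cloudbursting decision described above can be sketched as a small provisioning rule: serve demand from private capacity first and rent public-cloud instances only for the overflow. The units, threshold, and per-instance capacity below are illustrative assumptions.

```python
# Minimal sketch of a cloudbursting policy: private capacity absorbs
# the base load; public-cloud instances are rented only for the peak
# and released when demand drops.

def provision(demand, private_capacity, per_instance=10):
    """Return (private_used, public_instances_rented) for a given load.

    demand / private_capacity are in arbitrary load units;
    per_instance is the load one rented public instance can absorb.
    """
    private_used = min(demand, private_capacity)
    overflow = max(0, demand - private_capacity)
    # Rent just enough public instances to absorb the overflow.
    public_instances = -(-overflow // per_instance)   # ceiling division
    return private_used, public_instances
```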
4. Community Clouds
A community cloud is designed for a specific community of users, such as a government
agency, industry group, or research organization. It addresses shared concerns like security,
compliance, or regulatory requirements while providing cloud services.
Key Characteristics:
Shared Infrastructure: Multiple organizations share the infrastructure, which is
managed either by the organizations themselves or a third party. It may be located on-
premise or off-premise.
Specific Community Needs: The cloud is tailored to meet the needs of a specific
community, such as government bodies, healthcare organizations, or scientific
research entities.
Collaboration: Community clouds facilitate collaboration by providing a shared
environment for joint operations while maintaining compliance with security and
privacy concerns.
Example Industries:
o Healthcare: Community clouds can store patient data in a private cloud while
using shared infrastructure for non-critical services.
o Energy & Core Industries: These sectors can benefit from a community
cloud that bundles services for management, deployment, and orchestration of
operations.
o Scientific Research: Science clouds support large-scale scientific computing
by providing distributed infrastructure to multiple research organizations.
Benefits:
Openness: Community clouds often emphasize openness, removing vendor
dependencies and promoting fair competition among solutions.
Scalability: The infrastructure scales as the community expands, growing organically
with the demand from users.
Environmental Sustainability: By utilizing underutilized resources, community
clouds have a smaller carbon footprint and can be more sustainable.
MODULE 4
CLOUD SECURITY
Security Risks Faced by Cloud Users
Cloud computing offers numerous advantages, but it also introduces a range of security risks,
which can be broadly categorized into three areas: traditional security threats, threats related
to system availability, and threats related to third-party data control.
1. Traditional Security Threats:
o Denial-of-Service (DoS) Attacks: Cloud services are vulnerable to distributed
denial-of-service (DDoS) attacks, which prevent legitimate users from
accessing cloud services.
o Phishing and SQL Injection: These attacks target user credentials and
databases, often exploiting weak input validation mechanisms.
o Cross-Site Scripting (XSS): This common web application vulnerability
allows attackers to inject malicious scripts into websites, potentially bypassing
access controls.
o Authentication and Authorization: In cloud environments, ensuring proper
user authentication and role-based access control (RBAC) is challenging.
Users must be assigned privileges based on their roles, and these policies must
align with both organizational and cloud security standards.
o Infrastructure Protection: The cloud infrastructure used by users, including
the local network, must be protected against attacks originating from external
sources. This task is complicated by the fact that some components, such as
cloud clients and virtual machines, are outside the user's traditional security
perimeter.
2. System Availability Risks:
o Service Outages and Failures: Cloud services depend on the availability of
various systems, such as data centers and network infrastructure. Failures,
such as power outages or hardware malfunctions, can disrupt cloud services,
leading to business downtime.
o Data Lock-In: In situations where an organization depends heavily on cloud-
hosted data, service outages or disruptions could result in an inability to access
critical business data, which can lead to operational failures.
o Phase Transition Phenomena: Complex systems such as cloud infrastructure
are subject to unexpected behaviors under certain conditions, such as system
overloads or resource allocation failures, which may disrupt availability.
3. Third-Party Data Control:
o Data Privacy and Espionage: Storing sensitive data on the cloud can expose
it to third-party risks, especially if the cloud provider or subcontractors have
2. Outputs:
o PIA report summarizing:
Risk assessments.
Security measures.
Cross-border data flow concerns.
3. Tools:
o SaaS-based tools with a knowledge base maintained by experts.
o Systems generate reports based on templates and user-provided details.
3. Phases of Trust
1. Building Phase: Formation of trust.
2. Stability Phase: Sustained trust over time.
3. Dissolution Phase: Decline or loss of trust due to violations.
1. Purpose of an OS
Manages hardware resources.
Provides security through defined policies for access control, authentication, and
cryptography.
3. Trusted Applications
Special applications with privileges for security functions.
Operate with the minimum privileges necessary to reduce risks.
5. Trusted Paths
Ensure secure interactions between users and trusted software.
Protect against malicious software impersonating trusted systems.
7. Application-Specific Security
Applications above the OS layer often implement better security.
Example: Digital signatures for e-commerce transactions.
8. Limitations
Commodity OSes offer low assurance and are susceptible to attacks.
Compromising one application can endanger the entire platform.
Weak authentication mechanisms and lack of trusted paths add to vulnerabilities.
The security risks posed by shared images in cloud environments, particularly in the context
of Infrastructure as a Service (IaaS) and Amazon Machine Images (AMIs), are multifaceted
and significant. Below is a summary and analysis of the key concerns and findings:
3. Unsolicited Connections
Malicious or compromised AMIs may establish outgoing connections:
o Leak Privileged Information: Such as IP addresses or system events.
o Modified Syslog Daemons: Found in some instances, forwarding sensitive
data to external entities.
4. Malware
Some AMIs were found to include:
o Keylogging Trojans: Capture sensitive user inputs, such as passwords.
6. Misconfigured Instances
Failure to run cloud-init scripts means SSH host keys are not regenerated, so
instances launched from the same image share identical keys, enabling man-
in-the-middle attacks.
Malicious actors can use tools like NMap to identify vulnerabilities in the SSH setup
of shared AMIs.
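A minimal hygiene check of the kind implied by these findings — scanning an image's filesystem for leftover keys and history files before publishing it — might look like the sketch below. The filename patterns are illustrative examples, not an exhaustive or official checklist.

```python
# Hedged sketch of a pre-publication scan over an image filesystem tree,
# flagging files that commonly leak credentials from shared images.

import os

RISKY_SUFFIXES = ("authorized_keys", "id_rsa",
                  "ssh_host_rsa_key", ".bash_history")

def scan_image_tree(root):
    """Return paths under root that look like leftover secrets or keys."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(RISKY_SUFFIXES):
                findings.append(os.path.join(dirpath, name))
    return sorted(findings)
```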
Trade-offs:
Enhanced security measures introduce performance overheads:
o VM build time: Increased by 1.7–2.3 times.
o VM save time: Increased by 1.3–1.5 times.
o VM restore time: Increased by 1.7–1.9 times.
-------------------------------------END OF MODULE 4-------------------------------------------------
MODULE 5
Cloud platforms in Industry
Amazon Web Services
Amazon Web Services (AWS) is a comprehensive cloud computing platform that supports the
development of flexible, scalable applications. Its services are designed to provide
infrastructure scalability, messaging, and data storage solutions. AWS is accessible
via SOAP or RESTful web service interfaces and includes a web-based console for
administration, monitoring, and cost management on a pay-as-you-go pricing model.
At the core of the AWS ecosystem are its foundational services, which provide the building
blocks for scalable and reliable applications. Amazon Elastic Compute Cloud (EC2) offers
flexible virtual server capacity, while Amazon Simple Storage Service (S3) ensures secure
and scalable object storage.
These core services are complemented by solutions like Elastic Block Store (EBS) for high-
performance block storage and Amazon Relational Database Service (RDS) for managing
relational databases. Networking capabilities are strengthened with Amazon Virtual Private
Cloud (VPC) for isolated environments, Elastic Load Balancing for traffic distribution, and
Amazon Route 53 for DNS management.
Additionally, communication tools such as Amazon Simple Queue Service (SQS) and Simple
Notification Service (SNS) facilitate seamless integration and messaging between
applications, providing the infrastructure required for modern, connected systems.
Compute Services
Compute services form the backbone of cloud computing platforms, offering scalable and
flexible resources for deploying applications. Amazon Elastic Compute Cloud (EC2) stands
as the primary compute service within AWS, providing an Infrastructure-as-a-Service (IaaS)
model that has become a standard for cloud solutions. EC2 enables users to deploy virtual
servers, referred to as instances, created from Amazon Machine Images (AMIs). These
instances come preconfigured with operating systems and software stacks, with customizable
memory, processor, and storage options. Users can remotely access these instances to
configure or install additional software, ensuring flexibility and control.
A product code can also be associated with AMIs, enabling owners to monetize their
images whenever they are used to launch EC2 instances.
3. EC2 Environment
The EC2 environment enables virtual instances to operate efficiently by providing the
necessary services to host applications. Key aspects of the EC2 environment include:
1. Networking and Addressing:
Instances are assigned an internal IP address by default, allowing
communication within the EC2 network and enabling internet access as
clients.
Elastic IPs can be associated with instances, providing static IPs that can be
remapped to different instances as needed. This feature is useful for failover
implementations and public internet accessibility.
Each instance with an external IP is also assigned a domain name (e.g., ec2-
xxx-xxx-xxx.compute-x.amazonaws.com), where the IP and the availability
zone are encoded in the domain.
2. Availability Zones and Regions:
At the time these notes were written, EC2 offered five regions, priced
differently:
Two in the United States (Virginia, Northern California)
One in Europe (Ireland)
Two in Asia Pacific (Singapore, Tokyo)
Each region is subdivided into availability zones, and users can influence
the deployment location of their instances to some extent.
3. Security:
Key Pairs: Instance owners can associate key pairs for secure, remote root
access.
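The Elastic IP remapping described under Networking and Addressing above can be sketched in a few lines. This is an illustrative model only (the addresses and instance IDs are made up, and the real operation is performed through the EC2 API, not a local dictionary):

```python
# Sketch: remapping an Elastic IP between instances for failover.
# All names and addresses are illustrative, not real AWS identifiers.

class ElasticIpPool:
    def __init__(self):
        self.mapping = {}  # elastic IP -> instance id

    def associate(self, eip, instance_id):
        # Re-associating an already-mapped IP simply remaps it.
        self.mapping[eip] = instance_id

    def resolve(self, eip):
        return self.mapping.get(eip)

pool = ElasticIpPool()
pool.associate("203.0.113.10", "i-primary")
assert pool.resolve("203.0.113.10") == "i-primary"

# On failure of the primary, remap the same static IP to a standby:
pool.associate("203.0.113.10", "i-standby")
assert pool.resolve("203.0.113.10") == "i-standby"
```

Because clients keep using the same static address, the failover is invisible to them; only the mapping behind the IP changes.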
Storage Services
Amazon Simple Storage Service (S3): Overview
Amazon S3 is a highly scalable and distributed object storage service. It allows users to store
data in any format and access it over the web. Below is a summary of key components,
concepts, and advanced features of Amazon S3:
Core Components
1. Buckets:
Virtual containers for storing data objects.
Serve as top-level namespaces and are globally unique.
Cannot be nested, i.e., no sub-buckets.
Objects within a bucket share the same geographic region.
2. Objects:
The actual data stored in buckets, identified by a unique name within the
bucket.
Immutable once uploaded (cannot rename or modify directly).
Maximum size: 5 GB per object (a limit later raised to 5 TB via multipart upload).
Objects support metadata (system or user-defined key-value pairs).
Concepts
1. Hierarchy:
S3 provides a flat structure with logical directories simulated using object
naming conventions (e.g., folder1/file1.txt).
2. RESTful Interface:
S3 operations are accessed via HTTP methods:
GET/HEAD: Retrieve data/metadata.
PUT/POST: Upload objects.
DELETE: Remove objects.
Accessible via Uniform Resource Identifiers (URIs).
3. Bucket Addressing:
Canonical Form: http://s3.amazonaws.com/bucket_name/
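The flat hierarchy and addressing conventions above can be sketched with a small model. The bucket name, keys, and listing helper below are illustrative, not the real S3 API; they only mimic how key prefixes and a delimiter simulate directories:

```python
# Minimal model of S3's flat key namespace: "directories" are just
# key prefixes, and listing with a delimiter groups them.
# Bucket and key names are illustrative.

bucket = "my-bucket"
objects = {
    "folder1/file1.txt": b"hello",
    "folder1/file2.txt": b"world",
    "folder2/file3.txt": b"!",
    "readme.txt": b"top-level",
}

def url(bucket, key):
    # Canonical-form addressing, as in the notes above.
    return f"http://s3.amazonaws.com/{bucket}/{key}"

def list_objects(prefix="", delimiter="/"):
    """Return (keys, common_prefixes), mimicking a delimiter listing."""
    keys, common = [], set()
    for k in objects:
        if not k.startswith(prefix):
            continue
        rest = k[len(prefix):]
        if delimiter in rest:
            # Everything up to the first delimiter acts like a folder.
            common.add(prefix + rest.split(delimiter)[0] + delimiter)
        else:
            keys.append(k)
    return sorted(keys), sorted(common)

assert url(bucket, "folder1/file1.txt") == \
    "http://s3.amazonaws.com/my-bucket/folder1/file1.txt"
# Top level: one real key, two simulated "folders".
assert list_objects() == (["readme.txt"], ["folder1/", "folder2/"])
# "Descending" into a folder is just listing by its prefix.
assert list_objects("folder1/") == (
    ["folder1/file1.txt", "folder1/file2.txt"], [])
```

The point of the sketch is that S3 itself stores no directory tree: the grouping is computed at listing time from the key names.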
2. Amazon Elastic Block Store (EBS)
Amazon EBS provides persistent block-level storage volumes that can be attached to
EC2 instances, complementing an instance's volatile local storage.
Costs are calculated based on allocated storage ($0.10/GB/month) and I/O requests ($0.10 per
million requests), making EBS a reliable and cost-effective storage option for various
applications.
3. Amazon ElastiCache
Amazon ElastiCache is a fully managed in-memory caching service that provides ultra-fast
data retrieval for applications hosted on AWS. By reducing the latency involved in accessing
frequently used data, ElastiCache helps optimize application performance. It supports popular
caching engines like Memcached and Redis, making it versatile and compatible with
existing caching protocols.
Key Features:
1. In-Memory Data Store: ElastiCache enables fast data access by storing data in
memory instead of disk-based storage.
2. Seamless Scalability: Clusters can be elastically scaled to handle growing application
demands, with features such as automatic sharding and replication.
3. Failover and Recovery: Automatic detection and recovery of failed cache nodes
ensure high availability without manual intervention.
4. Compatibility with Existing Tools: ElastiCache supports the Memcached and Redis
protocols, allowing users to migrate existing applications without code modifications.
5. Management Features:
Automatic software patching.
Backup and restore capabilities (especially for Redis).
Monitoring and metrics integration via Amazon CloudWatch.
Use Cases:
Web Session Storage: Store user session data for faster retrieval in web applications.
Gaming Leaderboards: Provide real-time leaderboard rankings in multiplayer
gaming environments.
Content Caching: Cache dynamic or frequently accessed content to reduce database
load.
Machine Learning: Speed up machine learning workflows by caching intermediate
computations.
Analytics: Enhance the performance of analytics platforms by caching query results
or pre-computed data.
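The automatic sharding mentioned under Key Features can be sketched as client-side key hashing across cache nodes, which is the basic idea behind spreading load in a Memcached-style cluster. The node names and keys below are illustrative, and this is not the real ElastiCache client:

```python
import hashlib

# Sketch of key sharding across cache nodes: a stable hash assigns each
# key to exactly one node, so load spreads across the cluster.
# Node names are illustrative.

NODES = ["cache-node-1", "cache-node-2", "cache-node-3"]

def node_for(key):
    # Stable hash so the same key always maps to the same node.
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# Each node holds its own in-memory store.
stores = {n: {} for n in NODES}

def cache_set(key, value):
    stores[node_for(key)][key] = value

def cache_get(key):
    return stores[node_for(key)].get(key)

cache_set("session:42", {"user": "alice"})
assert cache_get("session:42") == {"user": "alice"}
# The key lives on exactly one node:
assert sum("session:42" in s for s in stores.values()) == 1
```

Real deployments typically use consistent hashing so that adding or removing a node remaps only a fraction of the keys; the simple modulo here is the minimal version of the same idea.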
Pricing:
ElastiCache pricing depends on:
1. Node Type: Based on the type of EC2 instances used for the cache nodes.
2. Cluster Size: Number of nodes in the cluster.
3. Data Transfer: Charges may apply for data transferred out of AWS regions.
4. Features Used:
Redis-specific features like backup, restore, and multi-AZ deployment may
incur additional costs.
Advanced Features:
Redis Support:
Multi-AZ deployments with automatic failover.
Data persistence through backup and restore options.
Streams and geospatial data processing.
Monitoring and Insights: Integrated with CloudWatch for real-time monitoring of
cache performance, latency, and resource utilization.
Benefits:
1. Low Latency and High Throughput: Ideal for applications requiring sub-
millisecond response times.
2. Ease of Management: Managed by AWS, eliminating the need for server setup,
configuration, and maintenance.
3. Security: Supports Amazon Virtual Private Cloud (VPC) and encryption for secure
access and data transmission.
4. Structured Storage Solutions in AWS
In enterprise applications, structured storage solutions are essential for managing and
analyzing large volumes of data. Amazon Web Services (AWS) provides several options for
structured data storage, including preconfigured EC2 AMIs, Amazon Relational Database
Service (RDS), and Amazon SimpleDB. These services are tailored to different use cases,
depending on the level of control and complexity required for data management.
4.1. Preconfigured EC2 AMIs
What it is: Preconfigured Amazon Machine Images (AMIs) are templates that come
with a database management system (DBMS) pre-installed. These AMIs allow users
to create EC2 instances with popular database engines such as IBM DB2, Microsoft
SQL Server, MySQL, Oracle, PostgreSQL, Sybase, and Vertica.
Management responsibility: The user is responsible for the configuration,
management, maintenance, and patching of the database. This solution gives users
flexibility and control over the database setup but requires more administrative
overhead.
Storage: EC2 instances created from these AMIs can be paired with Amazon Elastic
Block Store (EBS) for persistent storage.
Cost: The pricing is based on EC2 instance types and follows the EC2 hourly cost
model.
4.2. Amazon RDS (Relational Database Service)
What it is: Amazon RDS is a managed relational database service that operates on top
of EC2 infrastructure. It provides automatic management of database instances,
including backups, patching, and failover mechanisms.
Database Engines: It supports MySQL, Oracle, and other RDBMS systems, offering
a fully managed solution that handles many operational aspects for the user.
Key Features:
Multi-AZ Deployment: Ensures high availability by maintaining standby
replicas in different availability zones (AZs) that can take over in case of a
failure.
Read Replicas: Improves read-heavy workloads by providing replicas of the
database that handle read requests, reducing load on the primary database.
Automated Backups: Provides features like automated daily backups, point-
in-time recovery, and snapshot management.
Management: AWS takes care of most of the administrative tasks, including
hardware provisioning, database setup, patching, and scaling.
Pricing: RDS is priced based on the instance type (Standard or High-Memory) and
storage capacity. Users can also choose between On-Demand and Reserved instances
to optimize costs.
Pricing Example (2011-2012) for On-Demand RDS Instances:
Small Instance: $0.11/hour (1.7 GB memory)
Extra Large Instance: $0.88/hour (15 GB memory)
High-Memory Instances can cost as much as $2.60/hour for quadruple extra
large.
4.3. Amazon SimpleDB
What it is: Amazon SimpleDB is a lightweight, scalable, and flexible NoSQL data
store for applications that do not require a fully relational database model. It supports
semi-structured data, allowing attributes to vary between items within the same
domain.
Data Model: SimpleDB uses a key-value pair model organized into domains, where
each domain is similar to a table in relational databases but more flexible, as different
items in a domain can have different attributes.
Key Features:
Eventual Consistency: SimpleDB uses an eventually consistent model,
meaning updates to data might not be immediately visible to all readers, but
will eventually converge over time. It offers an option to block reads during
updates for stronger consistency.
Conditional Operations: Allows for conditional insertions and deletions to
prevent lost updates in multi-writer scenarios.
Scalability: Handles large quantities of data efficiently with simple queries
and indexing.
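The flexible data model and conditional operations described above can be sketched with a dictionary-based domain. The item and attribute names are illustrative, and this is only a model of the semantics, not the SimpleDB API:

```python
# Sketch of SimpleDB-style domains: items in one domain may carry
# different attributes, and conditional puts guard against lost updates.
# Domain, item, and attribute names are illustrative.

domain = {}  # item name -> dict of attributes

def put_attributes(item, attrs, expected=None):
    """Write attrs; if expected=(name, value) is given, only write when
    the item currently has that value (a conditional put)."""
    current = domain.get(item, {})
    if expected is not None:
        name, value = expected
        if current.get(name) != value:
            return False  # condition failed; update rejected
    current.update(attrs)
    domain[item] = current
    return True

# Items with different attribute sets can share one domain:
put_attributes("book1", {"title": "Dune", "year": "1965"})
put_attributes("cd1", {"title": "Kind of Blue", "artist": "Miles Davis"})

# A conditional update succeeds only against the expected current value,
# so a second writer holding a stale value cannot clobber the first:
assert put_attributes("book1", {"year": "1966"}, expected=("year", "1965"))
assert not put_attributes("book1", {"year": "1999"}, expected=("year", "1965"))
assert domain["book1"]["year"] == "1966"
```

This is the multi-writer scenario the notes mention: without the condition, the second writer would silently overwrite the first writer's update.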
Pricing:
Free for the first 25 machine hours per month.
Charges apply for additional machine hours ($0.14 per hour beyond the
free tier) and for data transfer.
SimpleDB is better suited for applications with smaller, semi-structured data
rather than large-scale object storage, making it more suitable for real-time
applications that need fast access to small data objects.
Pricing Example (2011-2012):
Data Transfer: Charged based on data storage and data transfer outside AWS.
Machine Usage: Additional charges apply for machine usage once the free
machine-hour allowance is exceeded.
4.4. Comparison with Amazon S3
Amazon S3 is a simple storage service designed for large objects, such as files and
backups. SimpleDB, by contrast, is designed for small, fast-access semi-structured
data and is not optimized for large object storage. While SimpleDB excels at querying
small objects quickly, S3 is a better choice for storing large files and long-term data
storage.
5. Amazon CloudFront
CloudFront is Amazon's content delivery network (CDN) service, built on top of its
distributed storage infrastructure. It optimizes the delivery of static and streaming web
content by using edge servers that are distributed globally. These edge servers reduce the
transfer time for content requests, ensuring faster access for users regardless of their location.
CloudFront works by creating a distribution, which connects to an origin server that holds the
original version of the content. The origin can be an S3 bucket, an EC2 instance, or an
external server outside of Amazon’s infrastructure. Once a distribution is set up, CloudFront
provides a domain name (e.g., my-distribution.cloudfront.net) that users can reference. When
a user requests content, CloudFront directs them to the nearest edge server, which serves the
content. If the content is not available or has expired, the request is redirected to the origin
server to fetch the latest version.
The service supports both static content (HTTP/HTTPS) and streaming content (RTMP -
Real-Time Messaging Protocol). Users can control which protocols are allowed and
configure access rules to limit or manage distribution. Additionally, CloudFront offers the
ability to invalidate or update content in its cache before it expires, ensuring content is up-to-
date.
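The edge-serving behavior described above (serve from the nearest edge while the copy is fresh, fall back to the origin on a miss or expiry) can be simulated with a small cache. A counter stands in for the clock, and all names and the TTL value are illustrative:

```python
import itertools

# Sketch of CDN edge caching: serve from the edge while the cached copy
# is fresh, otherwise fetch from the origin server. The "clock" is a
# counter standing in for time; names and the TTL are illustrative.

clock = itertools.count()
TTL = 3  # ticks a cached copy stays fresh

origin = {"/index.html": "v1"}   # origin server content
edge_cache = {}                  # path -> (content, cached_at)
origin_fetches = 0

def get(path):
    global origin_fetches
    now = next(clock)
    entry = edge_cache.get(path)
    if entry and now - entry[1] < TTL:
        return entry[0]          # edge hit: no origin traffic
    origin_fetches += 1          # miss or expired: go to the origin
    content = origin[path]
    edge_cache[path] = (content, now)
    return content

assert get("/index.html") == "v1"   # tick 0: miss, fetched from origin
assert get("/index.html") == "v1"   # tick 1: edge hit
origin["/index.html"] = "v2"
assert get("/index.html") == "v1"   # tick 2: still fresh, stale copy served
assert get("/index.html") == "v2"   # tick 3: expired, re-fetched
assert origin_fetches == 2
```

The stale read at tick 2 is exactly the situation invalidation addresses: evicting the entry before its TTL expires forces the next request back to the origin.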
Delivering content through CloudFront is typically cheaper than serving the same
content directly from S3, because the service is optimized for distributing frequently
accessed content across the globe rather than for long-term storage of large files. The
pricing model for CloudFront reflects this optimization.
Communication Services
Amazon Web Services (AWS) provides various tools to facilitate communication between
applications and services within the AWS ecosystem. These tools are divided into two major
categories: virtual networking and messaging.
1. Virtual Networking
Virtual networking in AWS offers users a range of services that control connectivity between
compute and storage services within the AWS environment. Some of the key services
include:
1. Amazon Virtual Private Cloud (VPC):
Amazon VPC allows users to create private networks within AWS, giving
them control over the network’s structure and connectivity. Users can choose
between predefined templates for common network setups or create fully
customized networks for more advanced configurations.
Templates include public subnets, isolated networks, private networks with
internet access via NAT (Network Address Translation), and hybrid networks
that combine AWS and private resources.
VPC enables users to control connectivity between different services (e.g.,
EC2 instances and S3 buckets) through AWS Identity and Access Management
(IAM).
Cost: As of 2011, Amazon VPC was priced at $0.50 per connection hour.
2. Amazon Direct Connect:
Direct Connect allows users to create dedicated, high-bandwidth connections
between their private networks and AWS locations, ensuring consistent
performance. These connections can be further partitioned into multiple
logical connections.
This service is ideal for scenarios requiring high bandwidth between an on-
premise network and AWS services like EC2, S3, or VPC.
Available ports are limited to two locations in the U.S., but users can utilize
external providers for higher bandwidth access.
Pricing:
1 Gbps port: $0.30 per hour
10 Gbps port: $2.25 per hour
Inbound traffic is free; outbound traffic costs $0.02 per GB.
3. Amazon Route 53:
Route 53 offers DNS services that allow AWS resources to be accessed
through custom domain names, rather than using Amazon’s default domain
names.
Route 53’s global network of DNS servers facilitates reliable access to
resources like EC2 instances and S3 buckets under user-controlled domain
names.
It also supports dynamic DNS, allowing AWS resources to be mapped to
domain names as they are launched or created (e.g., EC2 instances or S3
buckets).
Users can manage hosted zones and edit available resources through the Route
53 Web service.
Pricing:
$1 per month per hosted zone
$0.50 per million queries for the first billion queries per month, with
reduced rates for higher query volumes ($0.25 per million queries over
1 billion).
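The Route 53 rates quoted above ($1 per month per hosted zone, $0.50 per million queries up to 1 billion, $0.25 per million beyond) imply a simple tiered monthly cost, sketched here as a worked example:

```python
# Worked example of the Route 53 rates listed above. The function only
# encodes the quoted tiered pricing; it is not an AWS billing API.

def route53_monthly_cost(hosted_zones, queries):
    first_tier = min(queries, 1_000_000_000)
    overflow = max(queries - 1_000_000_000, 0)
    return (hosted_zones * 1.00
            + first_tier / 1_000_000 * 0.50
            + overflow / 1_000_000 * 0.25)

# 2 zones, 10 million queries: 2 * $1 + 10 * $0.50 = $7.00
assert route53_monthly_cost(2, 10_000_000) == 7.00
# 1 zone, 1.2 billion queries: $1 + 1000 * $0.50 + 200 * $0.25 = $551.00
assert route53_monthly_cost(1, 1_200_000_000) == 551.00
```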
2. Messaging
Messaging services offered by AWS enable communication between applications by using
various message transmission models. These services include Amazon Simple Queue
Service (SQS), Amazon Simple Notification Service (SNS), and Amazon Simple Email
Service (SES).
1. Amazon SQS:
Description: SQS is a distributed messaging queue service that allows
applications to send and receive messages. It uses a disconnected model,
meaning messages are stored in a queue, and applications can retrieve them at
their own pace.
Functionality: Users can create an unlimited number of message queues and
configure access control. Messages are stored securely and redundantly within
AWS for a limited time. When a message is read, it is locked to prevent
duplication. The lock expires after a predefined period.
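The locking behavior described above can be sketched as a visibility timeout: a received message is hidden from other readers until the lock expires or the message is deleted. A counter stands in for the clock, and the timeout value and message bodies are illustrative:

```python
import itertools

# Sketch of SQS-style message locking: a read hides the message for a
# visibility timeout; if it is not deleted in time, it reappears.
# The counter stands in for a clock; names are illustrative.

clock = itertools.count()
VISIBILITY_TIMEOUT = 2  # ticks

queue = []  # list of dicts: {"body": ..., "locked_until": ...}

def send(body):
    queue.append({"body": body, "locked_until": -1})

def receive():
    now = next(clock)
    for msg in queue:
        if msg["locked_until"] <= now:
            msg["locked_until"] = now + VISIBILITY_TIMEOUT
            return msg["body"]
    return None  # nothing visible right now

def delete(body):
    queue[:] = [m for m in queue if m["body"] != body]

send("task-1")
assert receive() == "task-1"   # tick 0: read and locked until tick 2
assert receive() is None       # tick 1: still locked, hidden
assert receive() == "task-1"   # tick 2: lock expired, redelivered
delete("task-1")
assert receive() is None       # queue is empty
```

The redelivery at tick 2 shows why consumers must delete a message after processing it: the lock only prevents duplication temporarily, not forever.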
2. Amazon SNS:
Description: SNS provides a publish-subscribe model for connecting
applications, allowing applications to receive notifications about updates on
specific topics.
Functionality: Users can create topics, and other applications can subscribe to
these topics. When a message is published to a topic, all subscribers are
automatically notified. SNS supports various notification methods, including
HTTP/HTTPS, email, and SQS.
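The publish-subscribe model described above can be sketched in a few lines. The different endpoint kinds (HTTP/HTTPS, email, SQS) are represented here by plain callables, and the topic name is illustrative:

```python
# Sketch of an SNS-style publish-subscribe model: subscribers register a
# handler for a topic, and every publish notifies all of them. Endpoint
# kinds are represented by plain callables; names are illustrative.

topics = {}  # topic name -> list of handler callables

def subscribe(topic, handler):
    topics.setdefault(topic, []).append(handler)

def publish(topic, message):
    # Publisher does not know who is listening; the topic fans out.
    for handler in topics.get(topic, []):
        handler(message)

received = []
subscribe("orders", lambda m: received.append(("email", m)))
subscribe("orders", lambda m: received.append(("sqs", m)))

publish("orders", "order-1001 shipped")
assert received == [("email", "order-1001 shipped"),
                    ("sqs", "order-1001 shipped")]
```

The contrast with SQS is visible in the sketch: SNS pushes one message to every subscriber at publish time, whereas an SQS queue holds each message for a single consumer to pull later.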
3. Amazon SES:
Description: SES is an email service that enables users to send email
messages through AWS infrastructure. It is a scalable service primarily for
transactional and marketing emails.
Functionality: Users must verify an email address to use SES, after which
they can send emails via SMTP or raw email methods. SES provides feedback
on email delivery and failure notifications, along with detailed statistics to
improve email campaigns.
Google AppEngine
Google AppEngine is a Platform as a Service (PaaS) solution designed to facilitate the
development and hosting of scalable web applications. It leverages Google's distributed
infrastructure to handle high volumes of traffic, automatically scaling applications and
allocating additional computing resources as needed. AppEngine also simplifies the process
of application development with built-in services that support easy scaling and resource
management. It supports applications built in languages such as Java, Python, and Go.
1. Infrastructure
The primary role of AppEngine's infrastructure is to efficiently serve user requests. It does
this by utilizing Google's extensive network of servers across various data centers. For
each HTTP request, AppEngine identifies the servers hosting the application, evaluates their
current load, and may allocate more resources or redirect the request to a server that can
handle it.
Key features of AppEngine's infrastructure include:
Dynamic Resource Allocation: If necessary, additional servers are added to balance
the load, ensuring that applications can scale seamlessly.
State Independence: AppEngine applications are designed in such a way that they do
not implicitly maintain state between requests. This allows the infrastructure to route
requests to any available server without worrying about state synchronization,
simplifying load balancing and resource management.
Performance Monitoring: The infrastructure also monitors the performance of
applications and collects statistics that are later used for billing purposes.
2. Runtime Environment
Sandboxing:
One of the key functions of the runtime environment in AppEngine is sandboxing.
Sandboxing isolates applications in a protected environment to ensure that they do not affect
the server or other applications. This isolation helps maintain security and stability.
Sandboxing Mechanism: AppEngine restricts access to certain system resources to
prevent potentially harmful actions. For instance, an application cannot write to the
server’s file system or access external networks (except for services like Mail,
UrlFetch, and XMPP).
Execution Restrictions: The runtime environment imposes several restrictions to
prevent long-running tasks (e.g., requests must complete within 30 seconds) and limit
operations outside the scope of requests, queued tasks, or cron jobs.
AppEngine only supports managed or interpreted languages (such as Java, Python, and
Go), and sandboxing ensures that applications developed in these languages are safe and
resource-controlled.
Supported Runtimes
AppEngine supports three programming languages for application
development: Java, Python, and Go. Each language has its own runtime environment
tailored for the AppEngine platform.
1. Java:
AppEngine supports Java 6, and developers can use Java tools like Java
Server Pages (JSP) and Servlets.
Java applications interact with AppEngine services via Java libraries that
provide interfaces for the platform's abstractions.
The Java SDK allows development with either Java 5 or Java 6, but some
Java libraries may not be compatible with AppEngine’s sandbox restrictions.
2. Python:
AppEngine uses an optimized Python 2.5.2 interpreter.
Similar to Java, Python applications can use the standard library, but with
restrictions on certain modules that perform potentially harmful operations.
AppEngine also offers a webapp framework for developing Python-based
web applications.
3. Go:
AppEngine also provides a runtime environment for the Go programming
language, initially released as an experimental SDK.
Application Services
AppEngine offers a set of built-in services that address common application needs,
including external resource integration, image manipulation, and asynchronous
computation. Below are the primary services and their functionalities:
1. UrlFetch
Purpose: Enables web applications to retrieve remote resources via HTTP/HTTPS.
Key Features:
Applications can perform synchronous or asynchronous web requests.
Allows integration of remote resources into the application's workflow,
aligning with the Service-Oriented Architecture (SOA) model.
Supports setting request deadlines to control timeouts.
Use Cases:
Fetching fragments of HTML or data from external APIs.
Rendering resources from remote servers within a single web page.
2. MemCache
Purpose: Provides a distributed in-memory caching system for frequently accessed
objects, enhancing application speed.
Key Features:
Acts as a volatile storage mechanism for fast data retrieval.
Automatically removes rarely accessed objects.
Advantages:
Reduces latency by serving data from memory instead of persistent storage.
Allows developers to implement a lookup hierarchy: first check MemCache,
then DataStore if needed.
Use Cases:
Storing session data.
Caching user-specific or frequently accessed data.
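The lookup hierarchy described above (check MemCache first, fall back to the DataStore on a miss, then populate the cache) can be sketched as follows. The key names are illustrative, and plain dictionaries stand in for both services:

```python
# Sketch of the MemCache lookup hierarchy: check the volatile cache
# first, fall back to the (simulated) DataStore on a miss, then populate
# the cache for the next request. Names are illustrative.

datastore = {"user:1": {"name": "alice"}}  # persistent store stand-in
memcache = {}                               # volatile in-memory cache
datastore_reads = 0

def get(key):
    global datastore_reads
    if key in memcache:
        return memcache[key]       # fast path: served from memory
    datastore_reads += 1
    value = datastore.get(key)     # slow path: persistent storage
    if value is not None:
        memcache[key] = value      # populate the cache for next time
    return value

assert get("user:1") == {"name": "alice"}  # miss: hits the DataStore
assert get("user:1") == {"name": "alice"}  # hit: served from MemCache
assert datastore_reads == 1
```

Because MemCache is volatile, an entry may disappear at any time; the hierarchy guarantees correctness by always keeping the DataStore as the source of truth.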
4. Account Management
Integration with Google Accounts:
Simplifies user authentication and account management.
Leverages Google’s authentication system, eliminating the need for custom
solutions.
Key Features:
Enables storing user profiles as key-value pairs attached to Google accounts.
Particularly beneficial for corporate environments using Google Apps,
allowing seamless integration with other Google services.
Use Cases:
Handling user authentication and personalization in web applications.
Managing user data securely.
5. Image Manipulation
Purpose: Provides lightweight tools for basic image processing directly within web
applications.
Key Features:
Supports resizing, rotation, mirroring, and image enhancement.
Optimized for speed, ensuring efficient performance for routine image tasks.
Use Cases:
Adding watermarks or branding to user-uploaded images.
Resizing or formatting images for responsive design.
Compute Services
1. Task Queues
Purpose: Enable applications to schedule tasks for later execution, especially for
long-running computations.
Key Features:
Delayed Execution: Tasks are submitted for execution at a later time, outside
the scope of the original web request.
Queue Configuration: Supports up to 10 queues, each with configurable
execution rates.
Failure Handling: Automatically retries tasks in case of transient failures to
ensure successful completion.
How It Works:
A task is defined by a web request to a specified URL.
The queue invokes the request handler, passing the task payload as part of the
web request.
The request handler performs the task execution, while the queue handles
retries if needed.
Use Cases:
Background processing, such as generating reports or resizing images.
Long-running computations that exceed the maximum response time for a web
request.
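The queue mechanics above (deferred execution by a handler, with automatic retries on transient failures) can be sketched as follows. The handler and payload names are illustrative, and callables stand in for the web-request handlers a real queue would invoke:

```python
# Sketch of Task Queue semantics: tasks run outside the original web
# request, and transient failures are retried automatically. Callables
# stand in for request handlers; names are illustrative.

def run_queue(tasks, max_retries=3):
    """Run each (handler, payload) task, retrying on failure."""
    results = []
    for handler, payload in tasks:
        for attempt in range(1, max_retries + 1):
            try:
                results.append(handler(payload))
                break
            except Exception:
                if attempt == max_retries:
                    results.append(None)  # gave up after retries
    return results

calls = {"n": 0}

def flaky_resize(payload):
    # Fails twice, then succeeds: a stand-in for a transient failure.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return f"resized {payload}"

out = run_queue([(flaky_resize, "image.png")])
assert out == ["resized image.png"]
assert calls["n"] == 3  # failed twice, succeeded on the third attempt
```

Cron jobs, described next, share the deferred-handler shape but lack this retry loop: a failed cron invocation simply waits for its next scheduled run.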
2. Cron Jobs
Purpose: Schedule operations to run at specific times or intervals, independent of
user requests.
Key Features:
Executes tasks at predefined times, such as daily, weekly, or hourly schedules.
No Automatic Retries: Unlike Task Queues, tasks are not retried upon failure.
Ideal for periodic or maintenance tasks.
How It Works:
Similar to Task Queues, the service invokes a request handler for the
scheduled task.
The request handler executes the operation at the specified time.
Use Cases:
Sending scheduled email notifications or reminders.
Performing regular maintenance, such as clearing temporary data or updating
cached information.
Scheduling batch processing jobs.
Application Life Cycle
Google AppEngine simplifies the development and deployment process for scalable web
applications by providing tools for all phases, including testing, development, deployment,
and monitoring.
1. Development and Testing
Developers can use local development servers to simulate the AppEngine
runtime environment.
The servers provide mock implementations of services like DataStore,
MemCache, and UrlFetch.
Java SDK:
Supports Java 5 and Java 6 environments.
Offers integration with Eclipse through Google AppEngine plug-ins.
Allows for servlet-based development with additional tools for
building web applications.
Python SDK:
Supports Python 2.5 and includes the GoogleAppEngineLauncher tool.
Comes with a built-in web framework (webapp) and supports others
like Django.
Provides command-line tools for monitoring, deploying, and
debugging.
2. Deployment and Management
Applications are deployed using a unique Application Identifier that serves
as its address (http://application-id.appspot.com).
Once uploaded, AppEngine handles the scaling, monitoring, and management.
Observations
AppEngine’s scalable and sandboxed runtime ensures secure and isolated execution of
applications.
Services are designed to handle common web development needs efficiently.
To fully utilize AppEngine, developers need to adapt to its specific application model
and scaling architecture.
This framework is ideal for building applications that require robust scalability and benefit
from a pay-as-you-go model.
Scientific applications
ECG Analysis in the Cloud:
The integration of cloud computing with healthcare has enabled significant advancements,
particularly in remote diagnostic and monitoring processes. One notable example is ECG
(Electrocardiogram) analysis, which utilizes cloud technologies to assist in diagnosing
heart conditions more effectively. This approach combines wearable sensors, mobile devices,
and cloud-hosted systems to create an efficient and scalable health-monitoring infrastructure.
1. Data Collection:
Wearable computing devices equipped with ECG sensors continuously capture the
patient's heartbeat data.
2. Data Transmission:
The ECG data is transmitted from the wearable device to the patient's mobile device,
which forwards it to a cloud-hosted web service.
3. Cloud Infrastructure:
Data Storage: ECG data is stored using cloud services like Amazon S3.
Processing Workflow: The system uses scalable platforms like Aneka and
EC2 instances to process the data dynamically based on demand.
Analysis: The processing involves extracting waveforms and comparing them
against reference waveforms to detect anomalies.
4. Notification:
If anomalies are detected, doctors and emergency personnel are immediately notified
to intervene.
Advantages of Cloud-Based ECG Monitoring
1. Elasticity:
Cloud infrastructures dynamically scale resources up or down based on the workload,
ensuring efficient use of computing power and reducing costs associated with over-
provisioning.
2. Ubiquity:
Cloud-hosted systems are accessible from any internet-connected device, allowing
seamless integration with hospital systems and remote access by healthcare providers.
3. Cost Savings:
Pay-per-use pricing models reduce the need for capital investment in on-
premises infrastructure.
Volume discounts for frequent usage make this solution cost-effective for
large-scale implementations.
4. Improved Patient Care:
Continuous monitoring enables early detection of potential heart conditions.
Immediate notification of anomalies ensures timely intervention, potentially
saving lives.
Cloud-Based Protein Structure Prediction in Biology
Protein structure prediction is a computationally intensive task crucial for understanding
biological processes and designing drugs for disease treatment. Using cloud computing for
this purpose offers a dynamic, scalable, and cost-efficient alternative to traditional
supercomputing or cluster computing infrastructures.
Protein Structure Prediction Overview
1. Objective:
Determine the 3D geometric structure of a protein from its gene sequence. This
task is computationally demanding, which is why cloud-based systems such as
the Jeeva portal, built on Aneka, have been developed to support it.
1. Key Features:
Uses machine learning (Support Vector Machines - SVMs) for classifying
protein structures into secondary structures (E, H, and C).
Decomposes tasks into three sequential phases: Initialization, Classification,
and Final Prediction.
Exploits parallel processing in the Classification phase, running multiple
classifiers concurrently to reduce computation time.
2. Task Execution:
Tasks are translated into a task graph and executed in the cloud using Aneka.
After completion, results are visualized through the Jeeva portal.
1. Salesforce.com
Salesforce.com is one of the most popular and advanced CRM solutions available today, with
over 100,000 customers worldwide. It provides highly customizable CRM solutions that can
be integrated with additional features developed by third parties. The platform is built on the
Force.com cloud development platform, which serves as a scalable and high-performance
middleware to execute all operations within Salesforce applications.
The Force.com platform has evolved from supporting only CRM applications to
accommodating a broad range of cloud-based applications. At its core, the platform features a
metadata architecture, which offers flexibility and scalability by storing the core logic and
business rules of applications as metadata, rather than in specific components and tables. This
metadata is stored in the Force.com store, enabling a runtime engine to retrieve and perform
operations on the data. Despite operating in isolated containers, applications logically share
the same database structure, and the runtime engine processes them uniformly.
A full-text search engine further enhances the user experience, allowing efficient navigation
through large data sets. The search engine updates its indexing data continuously in the
background as users interact with the application.
Users can customize Salesforce applications by using the Force.com framework or leveraging
programmatic APIs in popular programming languages. The framework allows users to
visually define the data structure or core application logic, while APIs offer a more
conventional development approach based on web services. Additionally, users can enhance
processes and logic by writing scripts in APEX, a Java-like language that supports both
object-oriented and procedural programming. APEX enables users to define on-demand
scripts, triggers, and complex queries to access and manipulate data within the platform.
4. NetSuite
NetSuite is a comprehensive suite of cloud-based applications designed to manage various
aspects of business operations. It includes three main products: NetSuite Global
ERP, NetSuite Global CRM, and NetSuite Global Ecommerce. Additionally, NetSuite
offers an all-in-one solution, NetSuite OneWorld, which integrates these three products. The
services are powered by two major data centers on the East and West coasts of the United
States, ensuring 99.5% uptime and high availability.
NetSuite provides not only prepackaged solutions but also infrastructure for customized
applications, enabling businesses to extend the platform’s functionality. The NetSuite
Business Operating System (NS-BOS) is a complete stack for building Software-as-a-
Service (SaaS) business applications that leverage NetSuite's products.
With its SuiteFlex online development environment, businesses can create new web
applications that integrate NetSuite capabilities and distribute them through SuiteBundler.
The entire NetSuite infrastructure is hosted in its own data centers, which guarantee
application uptime and availability.
Productivity Applications in the Cloud
Productivity applications have become a key area in cloud computing, providing users with
essential tools like document storage, office automation, and full desktop environments—all
hosted in the cloud. These services allow users to work from anywhere, using any device
connected to the Internet, while also providing enhanced accessibility, scalability, and
collaboration features.
1. Dropbox and iCloud
Cloud-based document storage has become an essential part of many users' daily routines,
thanks to the seamless access to files across multiple platforms. Dropbox has emerged as one
of the most popular solutions for online document storage. It offers users the ability to
synchronize files across devices and platforms, including Windows, Mac, Linux, and mobile
devices. Users can access their Dropbox folder via a browser or by installing a Dropbox
client. All changes made to files in the folder are automatically synchronized across all
devices, ensuring consistency and availability. Dropbox's key strength lies in its cross-
platform support, allowing users to work on files from any device with ease.
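The synchronization the paragraph describes rests on detecting which files changed since the last sync. A minimal sketch of that detection step, assuming a simple content-hash index (real clients like Dropbox use more elaborate block-level schemes):

```python
import hashlib
from pathlib import Path

def snapshot(folder: Path) -> dict:
    """Index a folder: map each file's relative path to a hash of its contents."""
    return {
        str(p.relative_to(folder)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in folder.rglob("*") if p.is_file()
    }

def changed_files(old: dict, new: dict) -> set:
    """Paths that were added, or whose contents changed, since the last snapshot."""
    return {path for path, digest in new.items() if old.get(path) != digest}
```

Comparing two snapshots yields exactly the set of files the client needs to upload, which is what keeps every device's copy of the folder consistent.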
iCloud, developed by Apple, provides a similar cloud storage and synchronization solution,
but with a focus on seamless integration across iOS-based devices. Unlike Dropbox, iCloud
operates transparently—users don't need to manually sync files. For example, photos taken
on an iPhone are automatically available on an iMac, and documents edited on one device
update on all others. However, iCloud is primarily oriented toward Apple devices; access
from other platforms is limited to a web interface with a reduced feature set.
Other similar services, like Windows Live, Amazon Cloud Drive, and CloudMe, offer
similar features with varying levels of integration and device support.
2. Google Docs
Google Docs is a powerful SaaS (Software as a Service) application that delivers basic office
automation functions, such as creating and editing text documents, spreadsheets, and
presentations, directly in the browser.
Social Networking Applications
Facebook, with over 800 million users, is one of the largest social networking sites
worldwide. To support such a large user base, Facebook uses cloud computing for
scalability and performance.
Cloud Infrastructure:
Facebook is backed by two data centers designed to efficiently manage large-scale
operations, reduce costs, and have a minimal environmental impact.
The infrastructure is built on inexpensive hardware, which is complemented by a
customized software stack developed by Facebook.
Back-End Technology Stack:
Facebook uses the LAMP stack (Linux, Apache, MySQL, PHP) as its foundational
technology.
The back-end consists of additional services written in various languages to support
different functionalities such as news feeds, search, notifications, and more. These
services provide high-performance functionality, with critical parts of the service
being optimized using faster languages than PHP.
Database and Caching:
The user data is stored in a distributed MySQL database cluster, where the data is
primarily in the form of key-value pairs.
For fast retrieval, frequently accessed data is cached, improving the overall
performance of the system.
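The caching strategy described above is commonly implemented as a cache-aside pattern: read from the cache first, fall back to the database on a miss, then populate the cache. A minimal in-memory sketch (the `Cache` class here is a stand-in for a system like memcached, and the dict-backed `db` stands in for the MySQL cluster):

```python
import time

class Cache:
    """Tiny in-memory cache with a TTL, standing in for a memcached-style tier."""
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]
        return None

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

def get_profile(user_id: int, cache: Cache, db: dict):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"user:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = db[user_id]      # slow path: hit the backing store
        cache.set(key, profile)
    return profile
```

Repeated reads of a hot key are then served from memory, which is the performance win the notes attribute to caching frequently accessed data.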
Communication and Development Tools:
Thrift: An important tool for enabling cross-language communication. It allows
services written in different languages to communicate by taking care of serialization,
deserialization, and data exchange.
Scribe: A service for aggregating log data in real-time.
RightScale: Used for auto-scaling to manage the infrastructure and ensure capacity is
dynamically added as needed.
Media Applications
Media applications are well-suited for leveraging cloud computing, especially for
tasks like video processing (encoding, transcoding, rendering), which can be
computationally demanding.
Examples:
Animoto
Animoto is an example of a cloud-based video creation platform, where users can
generate videos by uploading photos, video fragments, and music. The process is
computationally intensive, particularly for video rendering.
Cloud Infrastructure:
Animoto utilizes Amazon Web Services (AWS), specifically Amazon EC2 for web
front-end and worker nodes, Amazon S3 for storing images, music, and videos,
and Amazon SQS for communication between components.
Scalability:
Animoto’s system uses auto-scaling with RightScale, which monitors system load
and dynamically adds or removes worker instances as needed. During peak times,
Animoto can scale to 4,000 servers without dropping requests, though some delays in
rendering are acceptable.
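A scaling policy of this kind can be reduced to a simple rule: size the worker pool in proportion to the pending load, clamped between a floor and a ceiling. The sketch below is an illustrative threshold policy, not RightScale's actual algorithm; the 4,000 ceiling echoes Animoto's reported peak.

```python
import math

def desired_workers(queue_depth: int, jobs_per_worker: int,
                    minimum: int = 1, maximum: int = 4000) -> int:
    """Threshold autoscaling: one worker per `jobs_per_worker` queued jobs,
    clamped between a floor (keep the service warm) and a hard ceiling."""
    needed = math.ceil(queue_depth / jobs_per_worker)
    return max(minimum, min(maximum, needed))
```

Polling the queue depth periodically and launching or terminating instances to match `desired_workers` gives the elastic behavior described above: capacity grows under load and shrinks when the backlog drains.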
Key Workflow:
The video creation process involves users uploading media and selecting themes, after
which rendering tasks are queued via SQS and processed by EC2 worker nodes.
Once rendered, users are notified about the completion.
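The queue-and-worker workflow above can be sketched with Python's standard library, using an in-process `queue.Queue` as a stand-in for SQS and threads as stand-ins for EC2 worker nodes; the "rendering" step is a placeholder.

```python
import queue
import threading

jobs = queue.Queue()   # stands in for the SQS rendering queue
results = {}           # completed renders, keyed by job id

def worker():
    """Worker-node loop: pull a job, 'render' it, record completion."""
    while True:
        job = jobs.get()
        if job is None:            # sentinel: shut this worker down
            break
        job_id, media = job
        results[job_id] = f"rendered({media})"   # placeholder for real rendering
        jobs.task_done()

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

# Front end enqueues rendering tasks (uploaded media plus a chosen theme).
for i, media in enumerate(["photos+theme-a", "clips+theme-b"]):
    jobs.put((i, media))
jobs.join()                        # block until every queued job is processed

for _ in threads:                  # one sentinel per worker to stop them
    jobs.put(None)
for t in threads:
    t.join()
```

Decoupling the front end from the workers through the queue is what lets the worker pool grow and shrink independently of user traffic.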
Encoding.com
Encoding.com is a cloud-based video-transcoding service that converts user-submitted
videos into the desired output formats on demand.
The service integrates with both AWS (EC2, S3, CloudFront) and Rackspace (Cloud
Servers, Cloud Files) for handling the storage and processing of videos.
Pricing Models:
Encoding.com offers several pricing plans: monthly subscriptions, pay-as-you-go, and
special rates for high volumes.
Workflow:
Users upload videos for conversion through various interfaces (web, APIs, or desktop
applications), specify the desired output format, and the transcoding is processed in
the cloud. This enables seamless integration into a variety of workflows.
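Programmatic integration of this kind usually comes down to serializing a job request and submitting it to the service's API. The sketch below builds such a request; the field names are invented for illustration, since each service (Encoding.com included) defines its own request schema.

```python
import json

def build_transcode_job(source_url: str, output_format: str, notify_url: str) -> str:
    """Serialize a transcoding request. Field names here are hypothetical;
    real services define their own request formats."""
    return json.dumps({
        "source": source_url,                 # where the input video lives
        "format": {"output": output_format},  # desired output encoding
        "notify": notify_url,                 # callback invoked on completion
    })

job = build_transcode_job("https://example.com/raw/talk.mov", "mp4",
                          "https://example.com/hooks/done")
```

A client would POST this payload to the service and either poll for status or wait for the completion callback, which is what makes the workflow easy to embed in larger pipelines.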
Module-2
1. Define virtualization and explain its types in detail, including the pros and cons of
each.
2. Explain the characteristics of virtualized environments.
Module-3
1. How can a small business use cloud computing to improve its operations?
3. Describe the fundamental features of the economic and business model behind
cloud computing.
4. What do the infrastructure, platform, and application layers do in cloud computing?
5. How does the Cloud Reference Model help when using multiple cloud services?
6. What is "pay-as-you-go" pricing in cloud services, and what costs are involved?
Module-4
1. How can cloud providers gain trust from businesses?
4. How well does a Privacy Impact Assessment (PIA) help reduce privacy risks in the
cloud?
6. What are the long-term financial benefits and risks of using cloud computing for a
new business?
Module-5
1. Explain the various storage services provided by AWS and their primary use cases.
2. Summarize the key concepts of S3 and explain their significance in cloud storage.
3. Illustrate and explain the architecture of Google App Engine, detailing the role of
each component.
4. Describe the core components of Google App Engine and their functions in
application deployment.
5. Explain the functionalities of Dropbox and Animoto, including how they serve their
respective user bases.
9. Analyse how cloud computing enables remote ECG monitoring by breaking down
its key components and examining their interactions in ensuring seamless
functionality.
10. Examine and explain the development technologies currently supported by App
Engine, discussing how they can be utilized in application development.