Cloud Computing - Piyushwairale
Exam 2024
General IT Knowledge
Cloud Computing
For Notes & Test Series
www.piyushwairale.com
Piyush Wairale
MTech, IIT Madras
Course Instructor at IIT Madras BS Degree
No. of Tests: 5
Price: Rs.300
Cloud Computing Notes
by Piyush Wairale
Instructions:
• Kindly go through the lectures/videos on our website www.piyushwairale.com
• Read this study material carefully and make your own handwritten short notes. (Short notes must not be
more than 5-6 pages)
Contents
1 Introduction to Cloud Computing
7 Virtualization
10 Containerization
12 References
2 Key Characteristics of Cloud Computing
• On-Demand Self-Service: Users can access computing resources (such as server time and network storage)
as needed, without requiring human interaction with each service provider.
• Broad Network Access: Cloud services are available over the network and can be accessed through standard
mechanisms by a variety of devices (e.g., laptops, mobile phones, tablets, etc.).
• Resource Pooling: The cloud provider pools computing resources to serve multiple consumers, using a
multi-tenant model. Resources such as storage, processing, memory, and network bandwidth are shared
among users, ensuring high availability.
• Rapid Elasticity: Cloud services can be rapidly and elastically provisioned to scale up or down based on
demand. For end users, this gives the impression of unlimited resources.
• Measured Service: Cloud computing systems automatically control and optimize resource use by leveraging
a metering capability. Resources such as storage, processing, bandwidth, and user accounts are tracked,
enabling a pay-per-use model (a minimal metering sketch follows this list).
• Cost Efficiency: The pay-as-you-go model allows organizations to avoid upfront infrastructure costs and
pay only for what they consume.
• Security: Cloud providers offer enhanced security, with features such as data encryption, secure access
controls, and compliance with industry standards like GDPR and HIPAA.
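To make the Measured Service and pay-per-use ideas above concrete, here is a minimal metering sketch in Python. The unit rates and usage figures are invented for illustration and do not reflect any provider's actual pricing.

# Hypothetical pay-per-use billing sketch; rates and usage are illustrative only.
RATES = {
    "compute_hours": 0.05,     # assumed cost per VM-hour
    "storage_gb_month": 0.02,  # assumed cost per GB stored for a month
    "egress_gb": 0.09,         # assumed cost per GB of outbound bandwidth
}

def monthly_bill(usage: dict) -> float:
    """Multiply each metered quantity by its unit rate and sum the results."""
    return sum(RATES[item] * quantity for item, quantity in usage.items())

if __name__ == "__main__":
    usage = {"compute_hours": 720, "storage_gb_month": 500, "egress_gb": 100}
    print(f"Estimated monthly charge: ${monthly_bill(usage):.2f}")  # 55.00 for these figures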
3 Types of Cloud Services (Service Models)
Cloud computing is commonly divided into three main service models (IaaS, PaaS, and SaaS), each offering different
levels of control, flexibility, and management.
Consumer’s Responsibilities
• Data and Information: Consumers are responsible for managing and securing their data stored in the
cloud. This includes data classification, encryption, and access controls (a minimal client-side encryption sketch
follows this list).
• Access Security: Ensuring that only authorized personnel, services, or devices have access to the cloud
resources.
• Endpoint Management: Securing the devices (e.g., laptops, smartphones) that connect to the cloud
resources.
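As a minimal sketch of the consumer-side data-protection responsibility described above, the Python snippet below encrypts data locally before it would be uploaded to cloud storage. It assumes the third-party cryptography package is installed; the payload is hypothetical and no upload is actually performed.

# Client-side encryption before upload (assumes: pip install cryptography).
from cryptography.fernet import Fernet

# The key must be stored safely (for example, in a key management service);
# losing it makes the encrypted data unrecoverable.
key = Fernet.generate_key()
cipher = Fernet(key)

plaintext = b"customer records to be stored in the cloud"
ciphertext = cipher.encrypt(plaintext)   # this is what would be uploaded
recovered = cipher.decrypt(ciphertext)   # done after downloading the object

assert recovered == plaintext
print("Encrypted payload length:", len(ciphertext))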
4.4 Multi-cloud
A fourth, and increasingly likely, scenario is multi-cloud. In a multi-cloud scenario, you use multiple public
cloud providers. Maybe you use different features from different cloud providers, or maybe you started your
cloud journey with one provider and are in the process of migrating to a different one. Regardless, in a
multi-cloud environment you deal with two or more public cloud providers and manage resources and security
in each environment.
4.5 Community Cloud
Description: A community cloud is shared by several organizations with common concerns, such as industry
standards, security, compliance requirements, or shared mission objectives. It can be managed internally or by a
third-party vendor.
Advantages: Shared costs and resources between organizations with similar needs.
Examples: Government agencies, research institutions, universities.
• Scalability: Cloud services can be scaled based on business needs, making them highly adaptable to changing
workloads.
• Disaster Recovery: Cloud computing offers robust disaster recovery solutions due to its distributed nature.
• Accessibility: Cloud-based applications and data are accessible from any location with an internet connec-
tion.
• Collaboration: Cloud platforms allow for easier collaboration as users can access shared resources from
anywhere.
• Cost Efficiency: Typically cheaper, as the costs of hardware and maintenance are shared across multiple
customers.
• Scalability: Highly scalable, offering near-instant provisioning of additional resources to accommodate vari-
able demand.
• Maintenance: Managed by the cloud provider, which handles infrastructure, security updates, and compli-
ance.
• Accessibility: Services are accessible from anywhere via the internet, supporting geographically distributed
teams.
5.1.2 Advantages
• Lower Costs: No need for large upfront investments in hardware or infrastructure.
• High Availability: Public cloud providers offer extensive redundancy and failover mechanisms, ensuring
high availability.
• Flexibility: Resources can be scaled up or down depending on business needs.
• No Maintenance: The cloud provider takes care of all infrastructure management, freeing the client from
these responsibilities.
• Global Reach: Providers like AWS, Azure, and Google Cloud offer data centers worldwide, allowing busi-
nesses to serve global customers with low latency.
5.1.3 Challenges
• Security Concerns: Since resources are shared, there can be concerns about data security and compliance,
especially for sensitive data.
• Limited Control: Organizations have limited control over infrastructure, making it harder to customize
environments.
• Compliance Issues: Some industries (e.g., healthcare, finance) have strict compliance requirements, which
may limit the use of public clouds for certain data or applications.
5.2.2 Advantages
• Enhanced Security: Provides better control over data privacy and compliance, particularly important for
highly regulated industries.
• Full Customization: The organization has control over all aspects of the cloud environment, including
software and hardware configurations.
• Compliance: Easier to comply with strict regulatory requirements, such as HIPAA, GDPR, or PCI-DSS.
• High Performance: Dedicated resources can result in higher performance for specific workloads, without
the latency concerns of a shared environment.
5.2.3 Challenges
• Higher Costs: Requires higher upfront investments in infrastructure, and ongoing maintenance costs are
generally higher than public cloud.
• Limited Scalability: Scaling resources requires investing in new hardware, which may not be as rapid or
cost-effective as the public cloud.
• Complex Management: The organization needs IT staff to manage and maintain the private cloud, which
can add complexity and overhead.
• Cost Considerations: Public cloud is more cost-effective for organizations with less sensitive workloads or
when rapid scaling is required. Private cloud, on the other hand, is suitable for organizations that require
greater control and security, despite the higher cost.
• Security and Compliance: For industries like healthcare or finance, private cloud often makes more sense
due to its higher security levels and easier compliance management.
• Workload Nature: Applications that require significant customization or high performance may benefit
more from private cloud environments, while public cloud is ideal for general-purpose workloads with fluctu-
ating demand.
Definitions
• Distributed Computing: A computational approach where multiple independent computers (nodes) work
together to solve a problem. Each node in the system operates independently, and they communicate over a
network to share tasks and results.
• Parallel Computing: A computational technique where multiple processors work simultaneously on differ-
ent parts of a single problem. It involves breaking down a problem into smaller sub-problems that can be
solved concurrently to achieve faster results.
• Cloud Computing: A model of delivering computing services over the internet. It provides on-demand
access to computing resources (servers, storage, databases, etc.) and services, allowing users to leverage
distributed resources without owning the underlying infrastructure.
The architectural differences between Distributed Computing and Parallel Computing are outlined below:
Distributed Computing
• Description: Distributed computing involves multiple independent systems (nodes), each with its own
processor and memory, connected over a network.
• Structure: Each node operates independently with its own resources and communicates with other nodes
over the network to solve complex problems collaboratively.
• Use Case: Distributed computing is used for scenarios that require high availability, scalability, and fault
tolerance, such as large-scale scientific simulations, cloud-based services, and distributed databases.
Parallel Computing
• Description: Parallel computing involves multiple processors working simultaneously within a single system
that shares the same memory.
• Structure: All processors are tightly coupled and communicate through shared memory, allowing them to
process multiple tasks in parallel.
• Use Case: Parallel computing is ideal for scenarios requiring high computational power and fast processing
speeds, such as image processing, scientific simulations, and large-scale data analysis.
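A minimal sketch of the parallel-computing idea just described: one problem (summing squares) is split into sub-problems that worker processes on a single machine solve concurrently, using only the Python standard library. Note that Python worker processes each have their own memory, so this illustrates the split-and-solve-concurrently idea rather than a strict shared-memory model.

# Parallel computing sketch: one machine, several worker processes,
# a single problem split into chunks that are processed concurrently.
from multiprocessing import Pool

def sum_of_squares(chunk):
    """Sub-problem: sum the squares of one slice of the input."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    numbers = list(range(1_000_000))
    chunks = [numbers[i::4] for i in range(4)]      # split the problem four ways
    with Pool(processes=4) as pool:
        partial_results = pool.map(sum_of_squares, chunks)
    print("Total:", sum(partial_results))           # combine the sub-results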
Key Differences
• Distributed Computing: The system is composed of multiple independent nodes. Each node has its own
processor and memory and communicates with other nodes over a network.
• Parallel Computing: The system uses multiple processors within a single system, sharing the same memory.
The processors work simultaneously to execute different parts of a task in parallel.
Comparison Table
Aspect            Distributed Computing                        Parallel Computing
Composition       Multiple independent nodes (computers)       Multiple processors within a single system
Memory            Each node has its own memory                 All processors share the same memory
Communication     Over a network                               Through shared memory
Typical strengths High availability, scalability,              High computational power, fast
                  fault tolerance                              processing speeds
Use Cases
• Distributed Computing: Used in applications that require high availability and fault tolerance, such as
distributed databases, scientific research, and complex simulations.
• Parallel Computing: Used in applications that require intensive computations, such as image processing,
scientific simulations, and machine learning model training.
• Cloud Computing: Used for on-demand access to scalable resources and services, including web hosting,
data analytics, machine learning, and serverless computing.
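As a sketch of on-demand, programmatic access to cloud resources, the snippet below lists EC2 instances in one region with the AWS SDK for Python. It assumes boto3 is installed and AWS credentials are already configured on the machine; the region name is an arbitrary example.

# Querying cloud resources on demand via an SDK
# (assumes: pip install boto3, plus AWS credentials configured locally).
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an arbitrary example

response = ec2.describe_instances()
for reservation in response.get("Reservations", []):
    for instance in reservation["Instances"]:
        print(instance["InstanceId"], instance["State"]["Name"])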
7 Virtualization
Virtualization extends beyond server virtualization to include various other components of IT infrastructure, pro-
viding significant benefits to IT managers and enterprises. This section outlines the different types of virtualization.
Types of Virtualization
1. Desktop Virtualization: Allows multiple desktop operating systems to run on the same computer using
Virtual Machines (VMs).
• Virtual Desktop Infrastructure (VDI): Runs multiple desktops in VMs on a central server and
streams them to users who log in from any device.
• Local Desktop Virtualization: Runs a hypervisor on a local computer, enabling the user to run one
or more additional operating systems without altering the primary OS.
2. Network Virtualization: Uses software to create a virtual view of the network, abstracting hardware
components such as switches and routers.
• Software-Defined Networking (SDN): Virtualizes hardware that controls network traffic routing.
• Network Function Virtualization (NFV): Virtualizes network hardware appliances such as firewalls
and load balancers, simplifying their configuration and management.
3. Storage Virtualization: Aggregates all storage devices into a single shared pool that can be accessed and
managed as a unified storage entity.
4. Data Virtualization: Creates a software layer between applications and data sources, allowing applications
to access data irrespective of source, format, or location.
5. Application Virtualization: Allows applications to run without being installed on the user’s operating
system.
• Local Application Virtualization: Runs the entire application on the endpoint device in a runtime
environment.
• Application Streaming: Streams parts of the application to the client device as needed.
• Server-based Application Virtualization: Runs applications entirely on a server, sending only the
interface to the client.
6. Data Center Virtualization: Abstracts the data center’s hardware to create multiple virtual data centers,
enabling multiple clients to access their own infrastructure.
7. CPU Virtualization: Divides a single CPU into multiple virtual CPUs, enabling multiple VMs to share
processing power.
8. GPU Virtualization: Allows multiple VMs to utilize the processing power of a single GPU for tasks such
as video rendering and AI computations.
• Pass-through GPU: Allocates the entire GPU to a single guest OS.
• Shared vGPU: Divides the GPU into multiple virtual GPUs for server-based VMs.
9. Linux Virtualization: Utilizes the kernel-based virtual machine (KVM) on Linux to create x86-based VMs.
It is highly customizable and supports security-hardened workloads.
10. Cloud Virtualization: Virtualizes resources such as servers, storage, and networking to provide:
• High Availability: Supports disaster recovery and high availability through VM backups and migrations.
• Resource Optimization: Maximizes hardware resource utilization and reduces power consumption.
Virtualization is a powerful technology that optimizes IT infrastructure, reduces costs, and provides flexibility
and scalability. Understanding the different types of virtualization can help organizations implement the best
solutions to meet their business needs.
Virtual machines
Virtual machines are virtual environments that simulate a physical computer in software form. They normally
comprise several files containing the VM’s configuration, the storage for the virtual hard drive, and some snapshots
of the VM that preserve its state at a particular point in time.
• A hypervisor is the software layer that coordinates VMs. It serves as an interface between the VM and the
underlying physical hardware, ensuring that each has access to the physical resources it needs to execute.
It also ensures that the VMs don’t interfere with each other by impinging on each other’s memory space or
compute cycles.
• Type 1 hypervisor: A type 1 hypervisor, or a bare metal hypervisor, interacts directly with the underlying
machine hardware. A bare metal hypervisor is installed directly on the host machine’s physical hardware,
not through an operating system. In some cases, a type 1 hypervisor is embedded in the machine’s firmware.
The type 1 hypervisor negotiates directly with server hardware to allocate dedicated resources to VMs. It
can also flexibly share resources, depending on various VM requests.
• Type 2 hypervisor: A type 2 hypervisor, or hosted hypervisor, interacts with the underlying host machine
hardware through the host machine’s operating system. You install it on the machine, where it runs as an
application.
The type 2 hypervisor negotiates with the operating system to obtain underlying system resources. However,
the host operating system prioritizes its own functions and applications over the virtual workloads.
8.1 What’s the difference between Type 1 and Type 2 Hypervisors?
Type 1 and type 2 hypervisors are software you use to run one or more virtual machines (VMs) on a single physical
machine. A virtual machine is a digital replica of a physical machine. It’s an isolated computing environment that
your users experience as completely independent of the underlying hardware. The hypervisor is the technology that
makes this possible. It manages and allocates physical resources to VMs and communicates with the underlying
hardware in the background.
The type 1 hypervisor sits on top of the bare metal server and has direct access to the hardware resources.
Because of this, the type 1 hypervisor is also known as a bare metal hypervisor. In contrast, the type 2 hypervisor
is an application installed on the host operating system. It’s also known as a hosted or embedded hypervisor.
Virtualization is technology that you use to create virtual representations of hardware components like server
or network resources. The software representation uses the underlying physical resource to operate as if it were
a physical component. Similarly, a VM is a software-based instance of a computer, with elements like memory,
processing power, storage, and an operating system.
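As a hedged illustration of how software interacts with a hypervisor, the sketch below connects to a local KVM/QEMU hypervisor through libvirt and lists its virtual machines. It assumes the libvirt-python bindings are installed and a libvirt daemon is running; without one, the connection call simply fails.

# Listing the VMs managed by a local KVM/QEMU hypervisor via libvirt
# (assumes: libvirt-python installed and a running libvirt daemon).
import libvirt

conn = libvirt.open("qemu:///system")               # connect to the local hypervisor
try:
    for domain in conn.listAllDomains():            # each defined VM is a "domain"
        state, _reason = domain.state()
        running = "running" if state == libvirt.VIR_DOMAIN_RUNNING else "not running"
        print(f"{domain.name()}: {running}")
finally:
    conn.close()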
Container Pros
• Iteration speed: Because containers are lightweight and include only high-level software, they are very fast to
modify and iterate on.
• Robust ecosystem: Most container runtime systems offer a hosted public repository of pre-made containers.
These container repositories contain many popular software applications, such as databases or messaging systems,
that can be instantly downloaded and executed, saving time for development teams.
Container Cons
• Shared host exploits: Containers all share the same underlying hardware system below the operating system
layer, so it is possible that an exploit in one container could break out of the container and affect the shared
hardware.
• Public image risk: Most popular container runtimes have public repositories of pre-built containers. There is a
security risk in using one of these public images, as they may contain exploits or may be vulnerable to being
hijacked by nefarious actors.
Popular Container Runtimes
• RKT: Pronounced "Rocket", RKT is a security-first container system. RKT containers do not allow
insecure container functionality unless the user explicitly enables insecure features. RKT containers aim to
address the cross-contamination security issues that other container runtime systems suffer from.
• Linux Containers (LXC): The Linux Containers project is an open-source Linux container runtime system.
LXC is used to isolate operating-system-level processes from each other. Docker originally used LXC behind
the scenes. Linux Containers aims to offer a vendor-neutral, open-source container runtime.
• CRI-O: CRI-O is an implementation of the Kubernetes Container Runtime Interface (CRI) that allows the
use of Open Container Initiative (OCI) compatible runtimes. It is a lightweight alternative to using Docker
as the runtime for Kubernetes.
Virtual Machine Pros
• Full isolation security: Virtual machines run in isolation as fully standalone systems. This means that
virtual machines are immune to exploits or interference from other virtual machines on a shared host.
An individual virtual machine can still be hijacked by an exploit, but the exploited virtual machine will be
isolated and unable to contaminate neighboring virtual machines.
• Interactive development: Containers are usually static definitions of the expected dependencies and config-
uration needed to run the container. Virtual machines are more dynamic and can be interactively developed.
Once the basic hardware definition is specified for a virtual machine the virtual machine can then be treated
as a bare bones computer. Software can manually be installed to the virtual machine and the virtual ma-
chine can be snapshotted to capture the current configuration state. The virtual machine snapshots can be
used to restore the virtual machine to that point in time or spin up additional virtual machines with that
configuration.
Virtual Machine Cons
• Iteration speed: Virtual machines are time-consuming to build and regenerate because they encompass a
full-stack system. Any modification to a virtual machine snapshot can take significant time to regenerate and
to validate that it behaves as expected.
• Storage size cost: Virtual machines can take up a lot of storage space. They can quickly grow to several
gigabytes in size. This can lead to disk space shortages on the virtual machine's host machine.
10 Containerization
Containerization is a software deployment process that bundles an application’s code with all the files and libraries
it needs to run on any infrastructure. Traditionally, to run any application on your computer, you had to install the
version that matched your machine’s operating system. For example, you needed to install the Windows version
of a software package on a Windows machine. However, with containerization, you can create a single software
package, or container, that runs on all types of devices and operating systems.
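A minimal sketch of this workflow using the Docker SDK for Python: it runs a throwaway container from a small public image. It assumes Docker Engine is running locally and the docker Python package is installed; the image and command are just examples.

# Running a containerized workload with the Docker SDK for Python
# (assumes: a running Docker Engine and `pip install docker`).
import docker

client = docker.from_env()  # talk to the local Docker daemon

# Pulls the example alpine image if needed, runs one command, then removes the container.
output = client.containers.run("alpine:latest", "echo hello from a container", remove=True)
print(output.decode().strip())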
• Portability: Software developers use containerization to deploy applications in multiple environments without
rewriting the program code. They build an application once and deploy it on multiple operating systems. For
example, they run the same containers on Linux and Windows operating systems. Developers also upgrade
legacy application code to modern versions using containers for deployment.
• Scalability: Containers are lightweight software components that run efficiently. For example, a containerized
application launches faster than a virtual machine because it doesn't need to boot an operating system.
Therefore, software developers can easily add multiple containers for different applications on a single machine.
The container cluster uses computing resources from the same shared operating system, but one container
doesn't interfere with the operation of other containers.
• Fault tolerance: Software development teams use containers to build fault-tolerant applications. They use
multiple containers to run microservices on the cloud. Because containerized microservices operate in isolated
user spaces, a single faulty container doesn’t affect the other containers. This increases the resilience and
availability of the application.
• Agility: Containerized applications run in isolated computing environments. Software developers can trou-
bleshoot and change the application code without interfering with the operating system, hardware, or other
application services. They can shorten software release cycles and work on updates quickly with the container
model.
• Cloud migration: Cloud migration, or the lift-and-shift approach, is a software strategy that involves
encapsulating legacy applications in containers and deploying them in a cloud computing environment. Or-
ganizations can modernize their applications without rewriting the entire software code.
• Adoption of microservice architecture: Organizations seeking to build cloud applications with microser-
vices require containerization technology. The microservice architecture is a software development approach
that uses multiple, interdependent software components to deliver a functional application. Each microservice
has a unique and specific function. A modern cloud application consists of multiple microservices. For exam-
ple, a video streaming application might have microservices for data processing, user tracking, billing, and
personalization. Containerization provides the software tool to pack microservices as deployable programs on
different platforms.
• IoT devices: Internet of Things (IoT) devices contain limited computing resources, making manual software
updating a complex process. Containerization allows developers to deploy and update applications across
IoT devices easily.
• Docker: Docker, or Docker Engine, is a popular open-source container runtime that allows software developers
to build, deploy, and test containerized applications on various platforms. Docker containers are self-contained
packages of applications and related files that are created with the Docker framework.
• Linux: Linux is an open-source operating system with built-in container technology. Linux containers are
self-contained environments that allow multiple Linux-based applications to run on a single host machine.
Software developers use Linux containers to deploy applications that write or read large amounts of data.
Linux containers do not copy the entire operating system to their virtualized environment. Instead, the
containers consist of necessary functionalities allocated in the Linux namespace.
• Kubernetes: Kubernetes is a popular open-source container orchestrator that software developers use to
deploy, scale, and manage a vast number of microservices. It has a declarative model that makes automating
containers easier. The declarative model ensures that Kubernetes takes the appropriate action to fulfil the
requirements based on the configuration files.
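To make the declarative model concrete, the sketch below uses the official Kubernetes Python client to compare the declared and currently running replica counts of the Deployments in one namespace. It assumes the kubernetes package is installed and a kubeconfig for a reachable cluster is available; the namespace is an arbitrary example.

# Reading declaratively managed workloads with the Kubernetes Python client
# (assumes: `pip install kubernetes` and a valid kubeconfig for a reachable cluster).
from kubernetes import client, config

config.load_kube_config()          # load credentials from the local kubeconfig
apps = client.AppsV1Api()

for deployment in apps.list_namespaced_deployment(namespace="default").items:
    desired = deployment.spec.replicas             # what the configuration declares
    ready = deployment.status.ready_replicas or 0  # what the cluster is currently running
    print(f"{deployment.metadata.name}: {ready}/{desired} replicas ready")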
Key Principles of CI
• Frequent Code Integration: Developers integrate code into the repository multiple times a day, ensuring
that the latest code is available for testing.
• Automated Builds: CI involves automated builds to compile and package the code for deployment.
• Automated Testing: Automated tests run with each integration to verify the correctness and quality of
the code.
• Immediate Feedback: Quick feedback is provided to developers if there are any integration or testing
issues, allowing them to resolve problems quickly.
Benefits of CI
• Early Detection of Bugs: CI helps identify bugs early in the development process, reducing the complexity
of fixes.
• Reduced Integration Issues: Frequent integrations ensure that integration problems are addressed early
and not delayed until the end of the development cycle.
• Improved Collaboration: CI promotes collaboration between team members by integrating and testing
code frequently.
• Continuous delivery usually means a developer’s changes to an application are automatically bug tested
and uploaded to a repository (like GitHub or a container registry), where they can then be deployed to a
live production environment by the operations team. It’s an answer to the problem of poor visibility and
communication between dev and business teams. To that end, the purpose of continuous delivery is to have a
codebase that is always ready for deployment to a production environment, and ensure that it takes minimal
effort to deploy new code.
• Continuous deployment addresses the problem of overloading operations teams with manual processes that slow
down app delivery. It builds on the benefits of continuous delivery by automating the next stage in the pipeline.
• In practice, continuous deployment means that a developer's change to a cloud application could go live within
minutes of writing it (assuming it passes automated testing). This makes it much easier to continuously receive
and incorporate user feedback. Taken together, these connected CI/CD practices make the deployment process
less risky, since it is easier to release changes to apps in small pieces rather than all at once.
Key Principles of CD
• Automated Deployment Pipelines: CD involves setting up automated pipelines that build, test, and
package the application.
• Continuous Testing: Continuous testing ensures that the application is thoroughly tested at every stage
of the pipeline.
• Deployable Code: Every code change is kept in a deployable state, making it easy to release the latest
version of the application.
• Manual Approval: CD pipelines can include manual approval steps before deploying to production, allowing
for better control over releases.
Benefits of CD
• Faster Time-to-Market: CD enables rapid and reliable delivery of new features and bug fixes to customers.
• Reduced Deployment Risk: Automated testing and validation at each stage of the pipeline reduce the
risk of issues in production.
• Improved Quality and Reliability: Continuous testing ensures that the application is thoroughly validated
before release, leading to higher quality and reliability.
CI/CD Pipeline
The CI/CD pipeline is a series of automated processes that take code from version control through build, testing,
and deployment stages (a minimal sketch in code follows the list below). It typically consists of the following stages:
1. Code Commit: Developers commit code changes to the version control system (e.g., Git).
2. Build Stage: The code is automatically built and compiled into an executable format.
3. Test Stage: Automated tests are run to validate the functionality, performance, and security of the appli-
cation.
4. Deploy Stage: The application is packaged and deployed to different environments (e.g., development,
staging, production) based on the pipeline configuration.
5. Monitoring and Feedback: The deployed application is monitored for performance and errors, and feed-
back is provided to developers for continuous improvement.
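The sketch below models these stages as plain Python functions chained together, with subprocess calls standing in for real tools. The build tag, test command, and deploy step are placeholders for whatever a real pipeline would invoke; any failing stage stops the run.

# A toy CI/CD pipeline: each stage is a function; a failing stage stops the run.
# The shell commands are placeholders for a project's real build/test/deploy tooling.
import subprocess
import sys

def run(stage: str, command: list[str]) -> None:
    print(f"--- {stage} ---")
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"{stage} failed; stopping the pipeline.")
        sys.exit(result.returncode)

def pipeline() -> None:
    run("Build", ["docker", "build", "-t", "myapp:latest", "."])  # placeholder image tag
    run("Test", ["python", "-m", "pytest"])                       # placeholder test suite
    run("Deploy", ["echo", "deploying myapp:latest to staging"])  # placeholder deploy step

if __name__ == "__main__":
    pipeline()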
12 References
• atlassian.com