Notes
Unit 01 : Introduction to Cloud Computing
Cloud computing is the delivery of various services via the Internet—these services include data storage,
servers, databases, networking, and software. Instead of owning their own computing infrastructure or
data centers, companies can lease access to anything from applications to storage from a cloud service
provider.
Cloud computing services are deployed based on the type of access and the ownership of the
infrastructure. The primary deployment models are Public Cloud, Private Cloud, Community Cloud, and
Hybrid Cloud. Here's an explanation of each:
1. Public Cloud
A public cloud is a type of cloud infrastructure that is open for use by the general public. It is owned,
managed, and operated by third-party cloud service providers.
Examples: Microsoft Azure, Amazon Web Services (AWS), Google Cloud Platform (GCP).
Features:
o Accessible via the internet to anyone willing to pay.
o Cost-effective since resources are shared among multiple users.
o Highly scalable and flexible.
Advantages:
o No maintenance for the user.
o Pay-as-you-go pricing.
Disadvantages:
o Less control over resources.
o May not meet specific security or compliance needs.
2. Private Cloud
A private cloud is a dedicated cloud environment used exclusively by a single organization. It may be hosted
on-premises or by a third-party provider but is not shared with other users.
Examples: VMware, OpenStack, or a custom private setup.
Features:
o Offers higher security and control compared to public clouds.
o Can be tailored to meet specific organizational needs.
Advantages:
o Greater security and privacy.
o Ideal for businesses with strict compliance requirements.
Disadvantages:
o Higher cost due to dedicated resources.
o Requires in-house expertise for management and maintenance.
3. Community Cloud
A community cloud is a cloud infrastructure shared by multiple organizations with similar interests, goals,
or compliance requirements. It is operated for the benefit of the specific community.
Examples: Government organizations sharing infrastructure for data sharing or compliance
purposes.
Features:
o Collaboration and resource sharing among a specific group of organizations.
o Managed either internally or by a third-party provider.
Advantages:
o Cost is shared among participants.
o Enhanced collaboration within the community.
Disadvantages:
o Limited scalability compared to public clouds.
o Managing shared governance can be challenging.
4. Hybrid Cloud
A hybrid cloud combines two or more cloud models (public, private, or community), allowing data and
applications to be shared between them. It provides the flexibility of multiple deployment environments.
Examples: A company might use a private cloud for sensitive operations and a public cloud for less
critical workloads.
Features:
o Seamless integration of public and private clouds.
o Offers the best of both worlds by balancing control and scalability.
Advantages:
o Greater flexibility in handling workloads.
o Cost optimization by using public cloud resources for less sensitive tasks.
Disadvantages:
o Complex to manage and integrate.
o Security concerns when data moves between public and private clouds.
Comparison of Cloud Deployment Models (Public, Private, Community, and Hybrid Cloud)
Ownership:
o Public Cloud: Owned and managed by a third-party provider.
o Private Cloud: Owned by a single organization.
o Community Cloud: Shared ownership by a group of organizations.
o Hybrid Cloud: Combination of public and private clouds.
Accessibility:
o Public Cloud: Open to the general public.
o Private Cloud: Restricted to a single organization.
o Community Cloud: Restricted to a specific group of users.
o Hybrid Cloud: Both public and private resources are accessible.
Cost:
o Public Cloud: Pay-per-use; cost is shared among users.
o Private Cloud: High cost due to dedicated resources.
o Community Cloud: Costs are shared among community members.
o Hybrid Cloud: Costs depend on the combination used.
Control:
o Public Cloud: Limited control over infrastructure.
o Private Cloud: Full control over resources.
o Community Cloud: Shared control among participating organizations.
o Hybrid Cloud: Partial control over private cloud resources.
Security:
o Public Cloud: Lower security; data is shared over the public network.
o Private Cloud: High security; dedicated infrastructure.
o Community Cloud: Moderate security, tailored for community needs.
o Hybrid Cloud: Balances security across public and private clouds.
Scalability:
o Public Cloud: Highly scalable with on-demand resources.
o Private Cloud: Limited by the organization’s resources.
o Community Cloud: Moderate scalability, depending on community agreements.
o Hybrid Cloud: Highly scalable with public resources and private infrastructure.
Use Case:
o Public Cloud: Suitable for startups or general-purpose applications.
o Private Cloud: Ideal for sensitive data and compliance-heavy environments.
o Community Cloud: Best for organizations with shared goals (e.g., healthcare).
o Hybrid Cloud: Suitable for organizations requiring flexibility and resource optimization.
Examples:
o Public Cloud: AWS, Microsoft Azure, Google Cloud.
o Private Cloud: VMware Private Cloud, OpenStack.
o Community Cloud: Government or healthcare-specific clouds.
o Hybrid Cloud: A company using AWS for testing and a private cloud for operations.
Cloud Components :
1. Servers
2. Data center
3. Networking
4. Virtualization
5. Security
6. Client computer
Cloud computing offers three primary service models: Infrastructure as a Service (IaaS), Platform as a
Service (PaaS), and Software as a Service (SaaS). These models differ in the level of control and
responsibility users have over the resources.
1. Infrastructure as a Service (IaaS) :
IaaS provides virtualized computing resources over the internet, such as servers, storage, and networking.
It gives users control over the underlying infrastructure while offloading hardware maintenance to the
cloud provider.
Examples: Amazon EC2 (AWS), Microsoft Azure Virtual Machines, Google Compute Engine.
Features:
o Users manage the operating system, applications, and middleware.
o Scalability to meet demand.
o Resources are billed on a pay-as-you-use basis.
Advantages:
o Flexibility to configure infrastructure as needed.
o Cost savings by avoiding physical hardware investment.
Disadvantages:
o Users are responsible for managing the software stack.
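To make the IaaS model concrete, here is a minimal sketch of provisioning a virtual server programmatically with AWS's boto3 library. The region, AMI ID, and instance size below are placeholder assumptions, not values from these notes.

    # Minimal IaaS provisioning sketch using boto3 (pip install boto3).
    # Assumes AWS credentials are already configured; the AMI ID is a placeholder.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch one small virtual machine: the provider owns and maintains the
    # physical hardware, while we choose the guest OS image and instance size.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder guest OS image
        InstanceType="t2.micro",
        MinCount=1,
        MaxCount=1,
    )
    instance_id = response["Instances"][0]["InstanceId"]
    print("Launched instance:", instance_id)

    # Pay-as-you-go: terminating the instance stops billing for it.
    ec2.terminate_instances(InstanceIds=[instance_id])

The user remains responsible for everything above the virtual hardware (OS patches, middleware, applications), which is exactly the IaaS responsibility split described above.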
Comparison of IaaS, PaaS, and SaaS :
o IaaS: The provider manages hardware, virtualization, and networking; the user manages the OS, middleware, applications, and data (e.g., Amazon EC2).
o PaaS: The provider additionally manages the OS and runtime platform; the user manages only applications and data (e.g., Google App Engine, Heroku).
o SaaS: The provider manages the entire stack; the user simply consumes the application over the internet (e.g., Gmail, Microsoft 365).
Cloud Economics refers to the financial aspects of cloud computing, emphasizing cost efficiency, scalability, and optimization. Businesses adopt cloud solutions to reduce upfront infrastructure investment and to optimize operating costs. Key benefits include the following (a simple cost sketch follows the list):
1. Cost Efficiency
2. Scalability and Flexibility
3. Faster Time to Market
4. Accessibility and Collaboration
5. Business Continuity
6. Global Reach
7. Security Enhancements
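A toy illustration of the cost-efficiency argument, with entirely assumed numbers: compare a fixed up-front server purchase against pay-as-you-go pricing for a part-time workload.

    # Toy cost comparison (all figures are assumptions for illustration only).
    UPFRONT_SERVER_COST = 10_000      # buy and run a server for 3 years (capex)
    CLOUD_RATE_PER_HOUR = 0.10        # assumed pay-as-you-go VM price

    hours_per_month = 200             # the workload only runs part-time
    months = 36

    cloud_total = CLOUD_RATE_PER_HOUR * hours_per_month * months
    print(f"On-premises (capex): ${UPFRONT_SERVER_COST:,.2f}")
    print(f"Cloud pay-as-you-go: ${cloud_total:,.2f}")
    # With a part-time workload the cloud wins here; a 24/7 workload can flip
    # the result, which is the trade-off cloud economics asks you to evaluate.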
The architecture of cloud computing can be visualized with key components divided into the front end and
the back end of the system, interacting through the internet. Here’s a simplified view of cloud computing
architecture:
Frontend
Components:
o User Interface (UI): Web browsers or applications used to interact with the cloud (e.g.,
Google Drive, AWS Management Console).
o Client Devices: Computers, smartphones, or tablets that access cloud services.
Purpose: Provides the interface for users to access and utilize cloud resources.
Backend
Management: This part handles administrative tasks and operations of the cloud services.
Application: The software applications hosted on the cloud.
Service: Various services provided by the cloud, such as computing power, databases, networking,
and messaging.
Storage: Manages data storage solutions, ensuring efficient data management and retrieval.
Security: Ensures protection and security of data and applications hosted in the cloud.
Communication between the frontend and backend components happens over the internet, facilitating
seamless interaction and service delivery.
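In practice, this frontend-to-backend communication is usually an authenticated HTTPS request. A minimal sketch with Python's requests library; the endpoint URL and token below are hypothetical, not a real service.

    # Front end -> back end over the internet (hypothetical endpoint).
    import requests

    # The browser or client app (front end) issues an HTTPS request; the cloud
    # back end authenticates it, runs the service logic, and returns data.
    resp = requests.get(
        "https://api.example-cloud.com/v1/files",    # hypothetical URL
        headers={"Authorization": "Bearer <token>"}, # placeholder credential
        timeout=10,
    )
    resp.raise_for_status()
    print(resp.json())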
1.5 Cloud-Based Integrated Development Environment (IDE)
A cloud-based IDE (e.g., AWS Cloud9, Replit) lets developers write, run, and debug code entirely in the browser, with no local setup. Typical steps: open the cloud IDE in a browser, sign in, create or open a project, write code in the editor, run it on the provider's servers, and debug using the built-in debugger and console output.
Unit 02 : Virtualization
Introduction to Virtualization
Virtualization is the process of creating virtual versions of physical hardware, such as servers, storage,
networks, or even applications. It allows multiple operating systems or workloads to run on a single
physical machine by abstracting the hardware resources.
The Virtualization Reference Model is a conceptual framework that describes the components and layers in a virtualized environment. It explains how physical resources are abstracted and managed to provide virtualized services.
1. Physical Layer
Definition: The underlying hardware resources (CPU, memory, storage, and network devices) on which all virtualization is built.
2. Virtualization Layer
Definition: This layer abstracts the physical hardware and creates virtual instances.
Key Technology: Hypervisor (or Virtual Machine Monitor).
Hypervisor Types:
o Type 1 (Bare-Metal): Runs directly on hardware (e.g., VMware ESXi, Microsoft Hyper-V).
o Type 2 (Hosted): Runs on an existing OS (e.g., Oracle VirtualBox, VMware Workstation).
Purpose:
o Allocates physical resources to virtual machines.
o Provides isolation between virtual machines.
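As a small illustration of talking to a hypervisor programmatically, here is a sketch using the libvirt Python bindings against a local KVM host (KVM appears later in these notes; the connection URI is an assumption that varies by setup).

    # Sketch: query a hypervisor via libvirt (pip install libvirt-python).
    # Assumes a local libvirt/KVM daemon is running.
    import libvirt

    conn = libvirt.open("qemu:///system")   # connect to the local hypervisor
    try:
        for dom in conn.listAllDomains():   # every defined virtual machine
            state, _ = dom.state()
            running = state == libvirt.VIR_DOMAIN_RUNNING
            print(dom.name(), "running" if running else "stopped")
    finally:
        conn.close()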
3. Virtual Machines (VMs)
Definition: Software emulations of complete computers; each VM runs its own guest operating system and applications on a share of the physical host's resources.
4. Management Layer
Definition: Provides tools and software for managing virtualized resources and environments.
Components:
o Orchestration Tools: Automate resource allocation (e.g., Kubernetes, OpenStack).
o Monitoring Tools: Track performance and usage (e.g., vCenter, Prometheus).
o Resource Management: Allocate and optimize hardware resources dynamically.
Purpose: Simplifies the management of VMs, storage, and networks.
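A small sketch of the monitoring side using the official Kubernetes Python client (an assumption here is that a cluster is reachable and a local kubeconfig exists).

    # Sketch: inspect orchestrated resources with the Kubernetes Python client
    # (pip install kubernetes). Assumes kubeconfig points at a reachable cluster.
    from kubernetes import client, config

    config.load_kube_config()              # read credentials from ~/.kube/config
    v1 = client.CoreV1Api()

    # List every pod the orchestrator is managing, across all namespaces.
    for pod in v1.list_pod_for_all_namespaces().items:
        print(pod.metadata.namespace, pod.metadata.name, pod.status.phase)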
5. Access Layer
Definition: The interface through which users interact with and manage the virtualized environment.
Components:
o Web portals or dashboards for administrators and users.
o APIs for programmatic access to virtual resources.
Purpose: Provide easy access for managing and using virtual machines and resources.
------------------------------------------------------------------------------------------------------------------------------------
Types of Virtualization
1. Server Virtualization
o Divides a single server into multiple virtual servers.
o Example: VMware ESXi, Microsoft Hyper-V.
o Use: Hosting multiple applications.
2. Desktop Virtualization
o Provides virtual desktops hosted on a remote server.
o Example: Citrix Virtual Apps, VMware Horizon.
o Use: Remote work environments.
3. Network Virtualization
o Abstracts physical networks into virtual networks.
o Example: VMware NSX, Cisco ACI.
o Use: Simplifies cloud network management.
4. Storage Virtualization
o Combines physical storage into a virtual pool.
o Example: VMware vSAN, NetApp.
o Use: Simplified storage management.
5. Application Virtualization
o Runs apps without installing them on the OS.
o Example: Citrix XenApp, Microsoft App-V.
o Use: Compatibility and simplified deployment.
6. Data Virtualization
o Integrates data from multiple sources into one view.
o Example: Denodo, IBM Data Virtualization.
o Use: Business intelligence and analytics.
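Data virtualization presents one logical view over several sources without permanently moving the data. The toy sketch below imitates the idea with pandas (in-memory tables stand in for live systems; real products such as Denodo do this at query time across actual databases).

    # Toy data-virtualization sketch: one unified view over two "sources".
    import pandas as pd

    crm = pd.DataFrame({"customer_id": [1, 2], "name": ["Asha", "Ravi"]})
    billing = pd.DataFrame({"customer_id": [1, 2], "balance": [120.0, 80.5]})

    # The "virtual" integrated view: consumers query this, not the sources.
    unified = crm.merge(billing, on="customer_id")
    print(unified)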
These are prominent virtualization technologies used to create and manage virtualized environments.
Here's a brief overview:
1. VMware: A commercial virtualization suite; VMware ESXi is a bare-metal (Type 1) hypervisor, while VMware Workstation is a hosted (Type 2) hypervisor.
2. Microsoft Hyper-V: Microsoft's bare-metal hypervisor, built into Windows Server and Windows.
3. KVM: An open-source hypervisor built into the Linux kernel.
4. Xen: An open-source bare-metal hypervisor used by many cloud providers.
Virtual machines (VMs) provide numerous benefits in modern IT environments, especially for virtualization, cloud computing, and resource management.
Steps to create a virtual machine in VMware:
1. Install VMware: Download and install VMware Workstation or Fusion on your physical machine.
2. Launch VMware: Open the application and select "Create a New Virtual Machine".
3. Choose Configuration: Select Typical for default settings or Custom for advanced configurations.
4. Provide OS Installation Media: Use a physical disc, ISO file, or choose to install the OS later.
5. Select OS Type: Choose the guest operating system (e.g., Windows, Linux) and version.
6. Name and Save VM: Enter a name and specify a location to save VM files.
7. Configure Resources: Assign CPU, memory (RAM), storage, and network settings.
8. Create and Power On: Finish setup, power on the VM, and install the OS.
9. Install VMware Tools: Optimize performance by installing VMware Tools after OS installation.
10. Save and Use: Save settings and start using the VM.
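The wizard steps above can also be scripted. As a hedged sketch, here is the equivalent flow on a libvirt/KVM host rather than VMware Workstation; the XML is a deliberately minimal, assumed definition (real ones also specify install media, NICs, and more), and the disk path is a placeholder.

    # Sketch: create and power on a VM via libvirt, a scripted analogue of the
    # VMware wizard above (pip install libvirt-python; paths are placeholders).
    import libvirt

    DOMAIN_XML = """
    <domain type='kvm'>
      <name>demo-vm</name>                      <!-- "Name and Save VM" -->
      <memory unit='MiB'>2048</memory>          <!-- "Configure Resources" -->
      <vcpu>2</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2'/>
          <source file='/var/lib/libvirt/images/demo.qcow2'/> <!-- placeholder -->
          <target dev='vda' bus='virtio'/>
        </disk>
      </devices>
    </domain>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.defineXML(DOMAIN_XML)   # register the VM
    dom.create()                       # "Create and Power On"
    conn.close()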
Features of Virtualization
o Partitioning: Multiple virtual machines share one physical machine's resources.
o Isolation: A fault or crash in one VM does not affect the others.
o Encapsulation: A VM's entire state is stored in files, making it easy to copy and move.
o Hardware independence: VMs can be migrated between different physical servers.
Unit 03 : Storage in Clouds
Storage in Cloud :
Cloud storage refers to storing data on remote servers accessed via the internet. Instead of relying on local
storage devices, cloud services offer scalable, on-demand storage solutions.
Storage system architecture refers to the design and organization of storage systems, defining how data is
stored, managed, and accessed. A typical storage architecture involves several key components:
Storage Devices
HDDs and SSDs: These devices store data physically. SSDs are faster and more reliable than HDDs.
Storage Networks
DAS (Direct-Attached Storage), NAS (Network-Attached Storage), and SAN (Storage Area Network) connect storage devices to servers and users, either directly or over dedicated networks.
Storage Controllers
These manage the interaction between storage devices and users. RAID controllers offer
redundancy and improve performance.
Data Virtualization
Virtualization abstracts physical storage, allowing more efficient use of storage resources. In addition, backup, encryption, and deduplication services help manage, secure, and reduce the storage footprint.
Diagram : Storage Devices (HDD/SSD) → Storage Controllers → Data Network (DAS, NAS, SAN) →
Virtualization Layer (if applicable) → User Access and Management.
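A quick worked example of what a RAID controller trades for the redundancy mentioned above, using assumed disk counts and sizes:

    # Usable capacity under common RAID levels (assumed example: 4 x 4 TB disks).
    disks, size_tb = 4, 4

    raid0 = disks * size_tb             # striping only: no redundancy
    raid1 = (disks // 2) * size_tb      # mirroring: half the raw capacity
    raid5 = (disks - 1) * size_tb       # one disk's worth of space for parity

    print(f"RAID 0: {raid0} TB usable, RAID 1: {raid1} TB, RAID 5: {raid5} TB")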
Virtualized Data Center (VDC) Architecture and Environment
A Virtualized Data Center (VDC) utilizes virtualization technologies across the data center infrastructure,
including servers, storage, networking, desktops, and applications. VDC allows for flexibility, scalability,
and improved resource utilization while reducing costs and complexity.
Server Virtualization
Hypervisor (e.g., VMware ESXi, Microsoft Hyper-V, KVM) abstracts physical servers to create virtual
machines (VMs).
Multiple VMs can run on the same physical hardware, maximizing CPU, memory, and storage
utilization.
Storage Virtualization
Physical storage devices (e.g., hard drives, SSDs) are abstracted into a unified pool of storage.
SAN (Storage Area Networks) or NAS (Network-Attached Storage) provide flexible and scalable
storage solutions.
Features like Data Deduplication, RAID, and Snapshots improve storage efficiency and availability.
Networking Virtualization
Virtual network switches and routers create logical networks, separate from the physical network.
Software-Defined Networking (SDN) enables flexible network configurations and centralized
management of the network infrastructure.
Desktop Virtualization
Virtual Desktop Infrastructure (VDI) allows the creation and management of virtual desktops that
are hosted on centralized servers.
Users access these desktops remotely, making it easier to manage and secure.
Application Virtualization
Applications run on a centralized server or in the cloud, with users accessing them remotely, rather
than installing them locally on physical machines.
Tools like Citrix XenApp or VMware Horizon can provide application delivery.
Virtualization Techniques :
Server Virtualization
VMware, Hyper-V, and KVM are used to create multiple VMs on a single physical server.
Benefits: Improved hardware utilization, simplified management, isolation, and security.
Storage Virtualization
Physical storage from multiple devices is pooled into virtual volumes (e.g., SAN, vSAN).
Benefits: Simplified provisioning, better utilization, and easier data migration.
Networking Virtualization
SDN allows for programmable networks where software controls network behavior.
Benefits: Easier management, improved scalability, and faster network adjustments.
Desktop Virtualization
VDI (Virtual Desktop Infrastructure) or Remote Desktop Services allow users to access virtualized
desktops remotely.
Benefits: Centralized desktop management, reduced hardware costs, and better security.
Application Virtualization
App Virtualization Tools (e.g., VMware Horizon, Citrix XenApp) allow applications to run on
centralized servers.
Benefits: Simplified application management, no need for local installation, and easy updates.
Provisioning and Managing Cloud Storage :
6. Use Management Tools
o Use cloud providers’ management tools (e.g., AWS Management Console, Google Cloud Storage Manager) for provisioning, monitoring, and scaling storage.
7. Consider Data Lifecycle Management
o Implement data tiering strategies (hot, cold, and archival storage) to optimize cost based on data access frequency (see the sketch after this list).
8. Implement Monitoring and Scaling
o Set up monitoring (e.g., CloudWatch, Prometheus) for real-time tracking of storage utilization.
o Enable auto-scaling to automatically increase or decrease storage capacity based on demand.
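As referenced in step 7, here is a hedged sketch of data tiering on AWS S3 with boto3; the bucket name and day thresholds are assumptions for illustration.

    # Sketch: lifecycle-based data tiering on S3 with boto3 (bucket name and
    # thresholds are placeholders; requires configured AWS credentials).
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-notes-bucket",             # placeholder bucket
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "tier-cold-data",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},      # apply to all objects
                    "Transitions": [
                        # hot -> cold after 30 days, cold -> archive after 90
                        {"Days": 30, "StorageClass": "STANDARD_IA"},
                        {"Days": 90, "StorageClass": "GLACIER"},
                    ],
                }
            ]
        },
    )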
Block-Level Storage Virtualization
Definition: In block-level virtualization, data is stored in blocks, and the storage system abstracts
these blocks from the physical storage hardware. The virtualization layer presents logical storage
volumes to the operating system, enabling the use of different storage devices as a single,
consolidated block storage resource.
Example: Technologies like SAN (Storage Area Network) and iSCSI.
File-Level Storage Virtualization
Definition: In file-level virtualization, the data is organized in a file system (such as NTFS or ext4)
rather than as individual blocks. Virtualization occurs at the file level, and the system abstracts the
files across different storage systems, often using a network file system.
Example: NAS (Network Attached Storage) solutions.
Virtual Storage Area Network (VSAN)
Definition: A VSAN is a software-defined storage solution that creates a virtualized storage area
network by pooling together storage resources across multiple physical machines. It abstracts the
underlying storage infrastructure, providing a unified, flexible virtual storage layer that is easily
scalable.
Key Features:
1. Virtualizes both block and file storage resources.
2. It is integrated into hyperconverged infrastructure (HCI), with storage, compute, and
networking combined into a single platform.
3. It is often used with VMware vSphere for virtualized environments.
Benefits of VSAN
1. Scalability: Easily scale storage capacity by adding new nodes to the network.
2. Simplified Management: Centralized management for both compute and storage resources.
3. Cost Efficiency: Reduces the need for separate storage hardware and simplifies the infrastructure.
4. Performance: Ensures better resource utilization with optimized storage performance across virtual
machines.
5. High Availability: Built-in fault tolerance and disaster recovery features ensure continuous uptime.
Cloud File Systems: Google File System (GFS) and Hadoop Distributed File System (HDFS)
Google File System (GFS) and Hadoop Distributed File System (HDFS) are both distributed file systems
designed to handle large-scale data storage across clusters of machines. They share many similarities but
also have key differences.
Google File System (GFS)
Purpose: Designed to meet Google's needs for scalable and fault-tolerant data storage to support its search engine, indexing, and data-processing systems.
Characteristics:
1. Data Replication: GFS replicates data across multiple nodes (typically three copies) to ensure
fault tolerance and high availability.
2. Write-Once, Append-Only: GFS allows files to be written once and then read multiple times,
with the ability to append data to existing files but no modification of existing data.
3. Chunk-Based Storage: Data is divided into fixed-size chunks (64MB by default), and each
chunk is stored across multiple machines. Each chunk has a primary replica and several
replicas for redundancy.
4. Fault Tolerance: GFS has mechanisms to detect node failures and automatically replicate
missing or corrupted data from replicas.
5. Metadata: Metadata (e.g., file names, permissions) is stored in a centralized master server,
but actual data is distributed across worker nodes.
6. Optimized for Large Files: Optimized for large files (typically in GB to TB sizes) and heavy
workloads, like MapReduce.
Hadoop Distributed File System (HDFS)
Purpose: Designed for storing and processing large datasets in the Hadoop ecosystem.
Characteristics:
1. Data Replication: HDFS replicates each block across multiple DataNodes (default replication factor of 3) for fault tolerance.
2. Write-Once, Read-Many: Files are written once and read many times; appends are supported, but random writes are not.
3. Block-Based Storage: Files are split into large blocks (128MB or 256MB) distributed across DataNodes.
4. Fault Tolerance: Failed or corrupted blocks are automatically re-replicated from healthy copies.
5. Metadata: Metadata is stored centrally in the NameNode, while DataNodes hold the actual data blocks.
6. Optimized for Large Files: HDFS is optimized for high throughput of large files, ideal for batch processing and analytic workloads.
A small worked example of what these chunk/block sizes and replication factors imply follows.
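Using only the figures stated in the lists above, a back-of-the-envelope sketch:

    # Worked example: raw storage needed for one 10 GB file under GFS- and
    # HDFS-style settings (chunk/block sizes and replication factor from above).
    import math

    file_gb = 10
    replication = 3                                # default in both systems

    gfs_chunks = math.ceil(file_gb * 1024 / 64)    # 64 MB chunks
    hdfs_blocks = math.ceil(file_gb * 1024 / 128)  # 128 MB blocks

    print(f"GFS: {gfs_chunks} chunks, ~{file_gb * replication} GB raw stored")
    print(f"HDFS: {hdfs_blocks} blocks, ~{file_gb * replication} GB raw stored")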
Common Fault-Tolerance Mechanisms in GFS and HDFS :
3. Error Detection:
o Both systems employ error detection methods during data transfers, ensuring that data
remains intact during inter-node communication. Corrupted data is replaced from replica
copies.
4. Block/Chunk Recovery:
o In case of data corruption or node failure, both systems re-replicate the data from healthy
replicas to maintain fault tolerance and data integrity.
5. Heartbeat and Failure Detection:
o HDFS uses a Heartbeat mechanism to detect DataNode failures. In GFS, the master server
tracks replicas and initiates recovery in case of failure.
Comparison of GFS and HDFS :
Purpose:
o GFS: Designed for Google’s internal large-scale data storage needs.
o HDFS: Designed for storing and processing large datasets in the Hadoop ecosystem.
Data Storage Units:
o GFS: Chunk-based storage (64MB by default).
o HDFS: Block-based storage (128MB or 256MB blocks).
Data Replication:
o GFS: Default replication factor is 3.
o HDFS: Default replication factor is 3.
File Modification:
o GFS: Write-once, append-only.
o HDFS: Write-once, read-many.
Metadata Storage:
o GFS: A centralized master node stores metadata.
o HDFS: Metadata is stored in the NameNode (master).
Write Access:
o GFS: File chunks cannot be modified once written, but can be appended.
o HDFS: Files can be written once and appended; no random writes.
Integration:
o GFS: Primarily integrated into Google infrastructure.
o HDFS: Integrated into the Hadoop ecosystem (e.g., MapReduce, Spark).