
CLOUD COMPUTING MECHANISM-I

Prof. Santosh Kumar Swain


School of Computer Engineering
KIIT
Cloud Infrastructure Mechanism
Fundamental building blocks of cloud infrastructure include:
• Logical Network Perimeter
• Virtual Server
• Cloud Storage Device
• Cloud Usage Monitor
• Resource Replication
• Ready-Made Environment
Each of these mechanisms plays a critical role in creating a robust, secure,
and scalable cloud infrastructure, catering to the diverse needs of
businesses and developers.
Logical Network Perimeter
• The Logical network perimeter establishes a virtual network boundary that can encompass
and isolate a group of related cloud-based IT resources that may be physically distributed. It
is defined as the isolation of a network environment from the rest of a communications
network.
• The Logical network perimeter focuses on securing the flow of data and ensuring authorized
access in a networked environment.
• Unlike physical perimeters, logical perimeters rely on software-defined tools to protect
cloud-based infrastructure.

• Description: A virtual boundary created around a cloud environment to secure and isolate
cloud resources. It is implemented through firewalls, VPNs, and VLANs to control access and
ensure that only authorized entities can interact with the cloud infrastructure.
• Purpose: Enhances security and compliance by segregating data and restricting access.
Key features
• The Logical Network Perimeter is a virtual boundary that defines and protects the digital resources of a network. Unlike a physical network perimeter, which is tied to physical devices such as routers or firewalls, a logical perimeter is enforced entirely through software.
• The key features of Logical Network Perimeter are
– Access Control: Policies to regulate who can access the network.
– Segmentation: Dividing the network into smaller, secure segments.
– Encryption: Protecting data in transit within and outside the perimeter.
– Virtual Boundaries: Implemented via software-defined networking (SDN),
virtual private networks (VPNs), or cloud-based solutions.
Components of LNP

• Firewalls:
– Monitor and control incoming and outgoing traffic based on predefined rules.
– Types include network firewalls (e.g., AWS Network Firewall) and web application firewalls (e.g., Azure WAF).
• Virtual Private Networks (VPNs):
– Secure communication by encrypting data transmitted between on-premises environments and cloud services.
– Examples include AWS Client VPN and Azure VPN Gateway.
• Gateways:
– Internet Gateways: Enable cloud resources to communicate with the internet securely.
– NAT Gateways: Allow instances without public IPs to access external services while keeping them private.
– API Gateways: Manage and secure API traffic.
• Identity and Access Management (IAM):
– Enforces authentication and authorization for accessing resources.
• Intrusion Detection and Prevention Systems (IDPS):
– Detect and block malicious activity within the cloud network.
• Network Segmentation Tools:
– Virtual LANs (VLANs), security groups, and subnets are used to segregate traffic.
• Cloud Security Posture Management (CSPM):
– Monitors compliance and configuration of network perimeters.
Purpose
• Access Control: Regulates which users, devices, or services can
interact with network resources.
• Data Security: Protects sensitive information from
unauthorized access.
• Threat Mitigation: Identifies, monitors, and blocks malicious
traffic or activities.
• Compliance: Helps organizations meet industry standards and
regulations (e.g., HIPAA, PCI DSS).
Security Implications
• Enhanced Security Posture:
– Reduces exposure to threats by segmenting networks and applying least-
privilege access.
• Compliance with Standards:
– Helps align with industry-specific security frameworks and regulations.
• Dynamic Threat Mitigation:
– Real-time monitoring and response to malicious activities.
• Challenges:
– Misconfigurations in cloud settings can lead to vulnerabilities.
– Insider threats or compromised accounts may bypass logical perimeters.
Examples of Implementation of LNP in Cloud Infrastructure

• Amazon Web Services (AWS):


– Components Used:
• Security Groups to define inbound and outbound rules for instances.
• VPCs (Virtual Private Clouds) to isolate resources.
• AWS WAF for web application protection.
– Use Case: A multi-tier application with separate subnets for web servers (public-facing) and databases (private).
• Microsoft Azure:
– Components Used:
• Azure Firewall to inspect and filter network traffic.
• Network Security Groups (NSGs) to control traffic at the VM level.
• Azure VPN Gateway for secure remote access.
– Use Case: Hybrid cloud deployment with on-premises integration over a VPN.
• Google Cloud Platform (GCP):
– Components Used:
• VPC firewalls for custom network rules.
• Identity-Aware Proxy (IAP) to secure app access.
– Use Case: Configuring a serverless backend protected by Cloud Armor and API Gateway.
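To make the AWS example above concrete, here is a minimal sketch of defining perimeter-style rules in code with the AWS SDK for Python (boto3). The region, VPC ID, and port/CIDR values are illustrative assumptions, not values from this deck.
```python
import boto3

# Assumed placeholders: region, VPC ID, and CIDR ranges are illustrative only.
ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a security group that acts as part of the logical network perimeter
# around the web tier of a multi-tier application.
sg = ec2.create_security_group(
    GroupName="web-tier-sg",
    Description="Perimeter rules for public-facing web servers",
    VpcId="vpc-0123456789abcdef0",  # hypothetical VPC
)

# Allow inbound HTTPS from the internet and nothing else; all other inbound
# traffic is implicitly denied by the security group's default behavior.
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTPS from anywhere"}],
    }],
)
```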
Virtual Server
• A virtual server is a simulated instance of a physical server that runs
within a virtualized environment. Virtual servers leverage hypervisors
(such as VMware ESXi, Microsoft Hyper-V, or KVM) to share the
resources of a single physical machine across multiple virtual
instances.
• Description: A software-based emulation of a physical server that
runs on a hypervisor. Virtual servers are scalable and can be
provisioned or de-provisioned based on demand.
• Purpose: Provides flexible and cost-effective compute resources
without the need for physical hardware.
Key Characteristics of Virtual Server

• Key characteristics include:
– Flexibility: Easily scalable and modifiable to meet changing demands. Compatible with multiple
operating systems and applications.
– Isolation: Each virtual server operates independently, even on the same host.
– Scalability: Resources like CPU, RAM, and storage can be adjusted dynamically.
– Cost/Resource Efficiency: Reduces the need for multiple physical servers.
– Disaster Recovery: Simplifies replication and backup.

• A hypervisor is firmware or a low-level program that is key to enabling virtualization. It is used to divide and allocate cloud resources among several customers. Because it monitors and manages cloud services/resources, the hypervisor is also called a VMM (Virtual Machine Monitor) or Virtual Machine Manager.
Hypervisor
A hypervisor, also known as a virtual machine monitor (VMM), is a piece of software, firmware, or hardware that
creates and manages virtual machines (VMs). It allows multiple operating systems to run on a single physical host
simultaneously by abstracting and partitioning the hardware resources such as CPU, memory, storage, and network.

Types of Hypervisors
1. Type 1 (Bare-Metal Hypervisors)
– Runs directly on the host's hardware.
– Does not require an underlying operating system.
– Provides better performance and efficiency due to direct access to hardware.
– Common in data centers and enterprise environments.
– Examples:
• VMware ESXi, Microsoft Hyper-V, Xen
• KVM (Kernel-based Virtual Machine)
2. Type 2 (Hosted Hypervisors)
– Runs on top of an existing operating system.
– Easier to set up and use but may have overhead since it relies on the host OS.
– Common for desktop and personal use.
– Examples:
• VMware Workstation, Oracle VirtualBox, Parallels Desktop
• QEMU (can also act as a Type 1 in certain configurations)
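To make the hypervisor's role concrete, the following sketch uses the libvirt Python bindings to connect to a local KVM/QEMU host and list its virtual machines. It assumes a default local libvirt installation; the connection URI and environment are assumptions, not part of this deck.
```python
import libvirt  # pip install libvirt-python; requires libvirt on the host

# Connect to the local QEMU/KVM hypervisor (assumes a default libvirt setup).
conn = libvirt.open("qemu:///system")

# Enumerate all virtual machines (libvirt calls them "domains") managed
# by this hypervisor, whether running or shut off.
for dom in conn.listAllDomains():
    state, max_mem_kib, mem_kib, vcpus, cpu_time_ns = dom.info()
    print(f"{dom.name():20s} active={dom.isActive()} vCPUs={vcpus} mem={mem_kib // 1024} MiB")

conn.close()
```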
Functions of a Hypervisor

• Resource Allocation: Dynamically allocates resources like CPU, memory, and storage to VMs.
• Isolation: Ensures that VMs operate independently, preventing
interference or security breaches between them.
• Hardware Emulation: Presents virtual hardware to VMs, allowing
them to run any compatible operating system.
• Consolidation: Reduces hardware costs by allowing multiple VMs to
share a single physical machine.
• Snapshots and Cloning: Enables capturing the current state of a VM
or duplicating it for backups or testing.
Roles of Hypervisors in Virtual Server Management

• Allocates hardware resources like CPU, memory, and storage to VMs.
• Ensures isolation between VMs to prevent interference.
• Facilitates live migration, cloning, and backups of virtual
servers.
• Monitors and optimizes performance across virtualized
environments.
Difference Between Virtual and Physical Servers

Aspect | Physical Server | Virtual Server
Definition | A standalone hardware-based server. | A software-based emulation of a server.
Resource Usage | Dedicated to a single purpose or workload. | Shares physical resources with other VMs.
Flexibility | Limited scalability; hardware upgrades required. | Highly scalable with configurable resources.
Cost | Higher upfront costs for hardware. | Lower costs due to shared infrastructure.
Deployment | Time-intensive setup and provisioning. | Rapid provisioning through virtualization.
Fault Tolerance | Requires additional hardware for redundancy. | Enhanced through snapshots and failovers.
Benefits of Virtual Servers in Cloud Computing
• Cost Efficiency:
– Reduces hardware investment by sharing physical resources.
• Scalability and Flexibility:
– Adjust resources as needed to match demand.
• Improved Disaster Recovery:
– Snapshots and backups allow quick recovery.
• Resource Optimization:
– Maximizes hardware utilization.
• Ease of Management:
– Centralized management tools simplify deployment and monitoring.
• Portability:
– Virtual servers can be easily migrated across environments.
Case Studies of Virtual Server Usage

1. AWS EC2 (Amazon Elastic Compute Cloud):


• Overview: Provides scalable virtual servers (instances) in the cloud.
• Features:
– Wide range of instance types optimized for different workloads.
– Elastic Load Balancing for high availability.
– Auto Scaling to handle fluctuating demands.
• Use Case:
– Netflix: Uses EC2 for streaming infrastructure, handling high traffic during peak hours.
2. Google Cloud Compute Engine:
• Overview: Offers virtual machines running on Google’s infrastructure.
• Features:
– Predefined or custom machine types.
– Live migration for maintenance without downtime.
– Integration with AI and ML tools.
• Use Case:
– PayPal: Leverages Compute Engine to ensure scalability and reliability for payment processing systems.
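As a concrete illustration of provisioning a virtual server programmatically (here on EC2, as in the case study above), this minimal boto3 sketch launches and tags one instance. The AMI ID, key pair name, and region are placeholder assumptions.
```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch one small virtual server; the AMI ID and key pair are hypothetical
# placeholders that would be replaced with real values for your account.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t3.micro",
    KeyName="my-keypair",
    MinCount=1,
    MaxCount=1,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "demo-virtual-server"}],
    }],
)

instance_id = response["Instances"][0]["InstanceId"]
print("Launched virtual server:", instance_id)

# The same API can de-provision the instance when it is no longer needed:
# ec2.terminate_instances(InstanceIds=[instance_id])
```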
Integration of Logical Network Perimeter and Virtual Servers

• Enhanced Security: Logical network perimeters can enforce


security policies for virtual servers, including micro-
segmentation.
• Dynamic Workloads: Virtual servers can migrate across data
centers or clouds, with the logical perimeter adapting
dynamically.
• Centralized Management: Combining these technologies
simplifies monitoring and management in hybrid or multi-cloud
environments.
Cloud Storage Device

• Description: A virtualized storage service that provides scalable, on-demand access to data storage. It typically supports different storage tiers, including object, block, and file storage.
• Purpose: Facilitates data storage, backup, and retrieval in a
scalable and secure manner.
Categories of Storage Devices in Cloud Computing

Cloud computing storage devices fall into three main categories:
• Object Storage: For unstructured data, used in services like
AWS S3 or Google Cloud Storage.
• Block Storage: Low-latency storage for databases and virtual
machines, e.g., AWS EBS or Azure Managed Disks.
• File Storage: Shared storage for applications requiring file
system access, e.g., AWS EFS or Azure Files.
a. Block Storage

• Definition: Provides raw storage blocks to be formatted by the user, similar to hard
drives. It works at the block level.
• Use Cases: Databases, virtual machine disks, high-performance workloads
• Examples: Amazon Elastic Block Store (EBS), Google Persistent Disks, Microsoft Azure Disk Storage
• Advantages:
– Low latency
– High IOPS (Input/Output Operations Per Second)
– Granular control over data
• Disadvantages:
– No metadata support
– Typically more expensive than other options
b. Object Storage

• Definition: Data is stored as objects, each containing the data, metadata, and a
unique identifier.
• Use Cases: Media storage, archival and backup, static website hosting
• Examples: Amazon S3, Google Cloud Storage, Azure Blob Storage
• Advantages:
– Scalability
– Supports metadata and retrieval by unique IDs
– Cost-effective for large amounts of unstructured data
• Disadvantages:
– High latency compared to block storage
– Not ideal for transactional data
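To show what working with object storage looks like in practice, the following boto3 sketch stores and retrieves an object together with user-defined metadata. The bucket name, key, and metadata values are placeholder assumptions.
```python
import boto3

s3 = boto3.client("s3")
bucket = "example-media-bucket"  # hypothetical bucket name

# Store an object: the data, a unique key, and optional metadata travel together.
s3.put_object(
    Bucket=bucket,
    Key="reports/2024/summary.txt",
    Body=b"quarterly summary ...",
    Metadata={"department": "finance", "classification": "internal"},
)

# Retrieve the object by its unique key; the metadata comes back with it.
obj = s3.get_object(Bucket=bucket, Key="reports/2024/summary.txt")
print(obj["Metadata"])          # {'department': 'finance', ...}
print(obj["Body"].read()[:20])  # first bytes of the stored data
```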
What is the difference between block storage and object storage?

• Block storage is appropriate for databases and applications that need direct disk access because it offers raw storage volumes that can be read and modified at the block level.
• In contrast, object storage provides seamless scalability,
metadata capabilities, and streamlined data administration. It
is perfect for handling large volumes of unstructured data.
Cyfuture Cloud's STaaS includes block and object storage
options for diverse storage needs.
What is cloud file storage?

• Cloud file storage is a method for storing data in the cloud that provides servers and
applications access to data through shared file systems. This compatibility makes cloud file
storage ideal for workloads that rely on shared file systems and provides simple integration
without code changes.
• What is a cloud file system?
• A cloud file system is a hierarchical storage system in the cloud that provides shared access
to file data. Users can create, delete, modify, read, and write files, as well as organize them
logically in directory trees for intuitive access.
• What is cloud file sharing?
• Cloud file sharing is a service that provides simultaneous access for multiple users to a
common set of files stored in the cloud. Security for online file storage is managed with user
and group permissions so that administrators can control access to the shared file data.
How does cloud file storage help with collaboration?
• Cloud file storage allows team members to access, view, and edit the same files in near real-time and simultaneously, from virtually any location. Edits are visible to users or groups as they are made, and changes are synced and saved so that users or groups see
the most recent version of the file. Collaboration through cloud file sharing offers many
benefits:
– Work together and achieve shared goals, even with remote members.
– Schedule work flexibly by sharing tasks between collaborators in different time zones.
– Share and edit large files, like video or audio files, with ease.
– Receive notifications when files are edited or updated in real time.
– Share ideas or suggestions by leaving comments on shared files.
• What are the use cases for cloud file storage?
• Cloud file storage provides the flexibility to support and integrate with existing applications, plus the ease
to deploy, manage, and maintain all your files in the cloud. These two key advantages give organizations
the ability to support a broad spectrum of applications and verticals. Use cases such as large content
repositories, development environments, media stores, and user home directories are ideal workloads for
cloud-based file storage. Some example use cases for file storage are as follows.
Cloud-Based Storage Solutions

a. Amazon S3 (Simple Storage Service)


• Object-based storage designed for scalability, availability, and durability.
• Features:
– Lifecycle policies for automated data archiving.
– Different storage classes (e.g., Standard, Intelligent-Tiering, Glacier).
– Cross-region replication.
b. Google Cloud Storage
• Object storage solution offering multiple storage classes:
– Standard, Nearline, Coldline, and Archive.
• Integration with other Google services.
• Supports lifecycle management and automatic cost optimization.
c. Microsoft Azure Blob Storage
• Offers hot, cool, and archive access tiers.
• Seamless integration with Azure services.
• Focuses on security, redundancy, and compliance.
Storage Management Techniques
Effective management of cloud storage ensures performance, cost efficiency, and scalability.
a. Tiering and Archiving
• Moving less frequently accessed data to lower-cost storage classes (e.g., Amazon S3 Glacier or Google
Coldline).
• Automating transitions using lifecycle policies.
b. Data Compression
• Reducing storage costs and speeding up data transfers by compressing files.
c. Snapshots and Cloning
• Using snapshots to back up block storage volumes (e.g., AWS EBS Snapshots).
• Cloning volumes for testing and development.
d. Monitoring and Optimization
• Tracking storage usage and costs through monitoring tools like AWS Cost Explorer or Google Cloud
Monitoring.
• Identifying and deleting unused resources.
e. Encryption and Access Management
• Encrypting data at rest and in transit (e.g., using AES-256 encryption).
• Applying strict access controls using IAM (Identity and Access Management).
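As a sketch of the tiering and lifecycle automation described above, the following boto3 call transitions objects under a prefix to a colder storage class after 30 days and expires them after a year. The bucket name, prefix, and retention periods are assumptions.
```python
import boto3

s3 = boto3.client("s3")

# Lifecycle rule: move infrequently accessed log objects to a cheaper
# archive class, then expire them entirely after a year.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-log-bucket",  # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-then-expire-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365},
        }]
    },
)
```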
Data Backup & Redundancy
• In the digital age, losing valuable data can be a fatal blow to any
organization. Gone are the days of filing cabinets and storage boxes –
today, so much as a power outage can do damage if your business
doesn’t have a robust data recovery solution in place.
• Depending on what data is lost, a company may suffer anything from a
brief period to rebuild to a setback so severe it signals the end of the line.
• With this in mind, there's no question that data recovery is a critical component of a reliable business continuity plan. In discussions with
suppliers, your IT team or when acting as a liaison between the two,
there’s a good chance you’ve come across two key terms relating to data
recovery: back-up and redundancy.
Redundancy
• Users upload data to servers via an internet connection, where it is saved on a virtual machine on
a physical server. To maintain availability and provide redundancy, cloud providers will often
spread data to multiple virtual machines in data centers located across the world.
• Data redundancy refers to the practice of keeping data in two or more places within a database
or data storage system.
• Data redundancy ensures an organization can provide continued operations or services in the
event something happens to its data -- for example, in the case of data corruption or data loss.
• Data redundancy is a practical tool that can make all the difference in a disaster. Through this
failsafe system, data is stored in two places, so that if one element fails, data can still be used by
supplementing from another location.
• Think of it as the reserve parachute: if a skydiver jumps from a plane and the mechanism of their
first parachute fails for whatever reason, they can use the reserve chute to glide down safely
• Redundancy is designed to increase your operational time, boost staff productivity and reduce
the amount of time that a system is unavailable due to a failure.
Back-up
• Back-up, on the other hand, is designed to kick-in when something goes wrong, allowing you to completely rebuild no
matter what caused the failure.
• Unlike redundancy, which attempts to prevent failure, back-up processes treat failure as an inevitability; they work on the premise that things will go wrong, and therefore prepare a back-up plan for when (not 'if') this happens. This type of data backup system is ideal for protecting against human error such as accidental deletion or overwriting of data.
• On-site or off-site?
• A back-up can be hosted on-site and/or off-site. As with most IT solutions, there are advantages and disadvantages to either
approach.
– For instance, on-site back-ups are much quicker to make and far quicker to restore. They are also generally more
cost-effective to run as they do not need to be transferred to an off-site location. This can be particularly
beneficial in high-volume environments where the time to complete a back-up can be considerable.
– The key drawback to on-site back-ups is the lack of protection they offer in the event of total facility failure or
extreme scenarios such as a terrorist attack.
– Off-site back-ups are advantageous insofar as they offer a higher level of protection since they contain the same
information but are placed in an entirely different location, mitigating the risk of facility failure or attack.
– Generally speaking, off-site back-ups aren’t used to restore day-to-day data that is accidentally deleted, since it’s
much quicker to restore this information from an on-site back-up. However, in the name of continuity and
effective protection, they are of equal importance in preventing against data loss should the worst happen.
what’s the difference between redundancy and backup?

• While both terms are often used in the same conversations, this isn't an either/or decision. Data back-ups and redundancy offer two different and equally valuable
solutions to ensuring business continuity in the face of unplanned accidents, unexpected
attacks or system failures.
• Redundancy is designed to increase your operational time, boost staff productivity and
reduce the amount of time that a system is unavailable due to a failure.
• Back-up, on the other hand, is designed to kick-in when something goes wrong, allowing
you to completely rebuild no matter what caused the failure.
• In short, redundancy prevents failure while back-ups prevent loss. In a modern business environment that is inherently dependent on access to large volumes of data, it's clear that operational redundancy and back-ups are both critical elements of an effective continuity strategy.
• As interchangeable as they may seem, backup and redundancy have two distinct
meanings, and both play an important role.
Data Backup and Redundancy Techniques
Ensuring data durability and availability is crucial in cloud storage.
a. Data Backup Techniques
– Snapshot Backups: Periodic point-in-time backups of block storage volumes.
– Incremental Backups: Storing only the changes made since the last backup.
– Cloud-to-Cloud Backups: Backing up data across multiple cloud platforms.
b. Redundancy Models
– Replication:
• Cross-region replication for disaster recovery (e.g., Amazon S3 CRR).
• High availability through multi-zone replication.
– RAID-Like Architectures: (Redundant Array Independent Disks)
• Some storage solutions mimic RAID for distributed data.
c. Disaster Recovery Plans
– Ensuring data is available in case of a failure or disaster.
– Testing recovery strategies periodically.
d. Durability Guarantees
– Many providers offer "11 nines" (99.999999999%) durability for object storage (e.g., Amazon S3).
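As an illustration of the snapshot backups listed above, the sketch below creates a point-in-time snapshot of a block storage volume with boto3 and tags it for retention tracking. The volume ID, region, and tag values are placeholder assumptions.
```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a point-in-time snapshot of an EBS volume (volume ID is hypothetical).
snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",
    Description="Nightly backup of database volume",
    TagSpecifications=[{
        "ResourceType": "snapshot",
        "Tags": [{"Key": "retention", "Value": "30d"}],
    }],
)
print("Snapshot started:", snapshot["SnapshotId"], snapshot["State"])
```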
What is RAID?

• RAID (redundant array of independent disks) is a way of storing the same data in different places on multiple hard disks or solid-state drives (SSDs) to protect data in the case of a drive failure.
• RAID combines multiple disks into a single array to enhance data storage reliability or increase system performance. Today, RAID is actively used in IT to protect data in case of failures and to speed up read and write operations.
• The concept of the technology was proposed in 1987 by a group of researchers from the University of California, Berkeley.
• The main goals of RAID are data protection and improved speed of operation.
HOW RAID WORKS?
• RAID works by placing data on multiple disks and allowing
input/output (I/O) operations to overlap in a balanced way,
improving performance.
• Because using multiple disks increases the mean time between
failures, storing data redundantly also increases fault tolerance.
HOW IT WORKS?
• RAID arrays appear to the operating system (OS) as a single logical drive.
• RAID employs the techniques of disk mirroring or disk striping.
• Striping partitions help spread data over multiple disk drives.
• Mirroring will copy identical data onto more than one drive.
• Each drive's storage space is divided into units ranging from a sector of 512 bytes up to
several megabytes.
• The stripes of all the disks are interleaved and addressed in order.
• Disk mirroring and disk striping can also be combined in a RAID array.
• In a single-user system where large records are stored, the stripes are typically set up to be
small (512 bytes, for example) so that a single record spans all the disks and can be
accessed quickly by reading all the disks at the same time.
• In a multiuser system, better performance requires a stripe wide enough to hold the typical
or maximum size record, enabling overlapped disk I/O across drives.
RAID CONTROLLER
• A RAID controller is a device used to manage hard disk drives in
a storage array.
• It can be used as a level of abstraction between the OS and the
physical disks, presenting groups of disks as logical units.
• Using a RAID controller can improve performance and help
protect data in case of a crash
• A RAID controller can be implemented in hardware, software, or firmware.
RAID LEVELS
• RAID devices use different versions, called levels.
• There are different RAID levels, however, and not all aim to provide
redundancy.
• Originally, the RAID concept defined seven levels of RAID, numbered 0 through 6.
• This numbered system enabled those in IT to differentiate RAID
versions.
• The number of levels has since expanded and has been broken into
three categories: standard, nested and nonstandard RAID levels.
standard RAID levels
• In modern computing, the standard RAID levels include:
• RAID 0: Data striping
• RAID 1: Disk mirroring
• RAID 2: Bit-level data striping with Hamming-Code parity
• RAID 3: Bit-level data striping with dedicated parity
• RAID 4: Block-level striping with dedicated parity
• RAID 5: Data striping with parity
• RAID 6: Data striping with double parity
Nested RAID Levels
• Sometimes referred to as hybrid RAID, the typical nested RAID configuration
combines two or more standardized RAID levels into one package. Depending on
the RAID levels you choose, your system might benefit from increased
performance, greater data redundancy, or improved data integrity.
• The most popular nested RAID configurations in modern computing include:
• RAID 01: Data striping and disk mirroring
• RAID 03: Byte-level data striping and a dedicated parity
• RAID 10: Block-level data striping and disk mirroring
• RAID 50: Block-level data striping and distributed parity
• RAID 60: Block-level striping and dual parity
• RAID 100: A stripe of RAID 10s
Non-Standard RAID Levels
• In some cases- non-standard RAID levels are used. Many non-standard RAID levels were developed
by individual manufacturers for use in their products, and some are only designed to work with
specific system specifications. Others are implemented as needed - regardless of the system,
products, or devices used.
• RAID 1E: Striped mirroring
• RAID 5E: RAID 5 implementation with a hot spare drive
• RAID 5EE: RAID 5 implementation with a hot spare drive that is integrated into the data striping set
• RAID 6E: RAID 6 implementation with a hot spare drive that is active in the block rotation scheme
• There are other forms of non-standard RAID, too. For example, RAID-DP uses double-parity disk
protection to help prevent disk errors and data corruption. RAID-Z was designed by Sun Microsystems specifically for use with the ZFS file system, and Linux MD RAID 10 only works on Linux-
based operating systems. Since manufacturers and developers are free to create their own
configurations to meet specialty needs, the possibilities are nearly endless.
RAID 0 (Striping)

• RAID 0 distributes data across all disks in the array, splitting it into blocks and writing each block to a
separate disk.
• Advantages: increased speed of operation due to parallel writing of data to both disks.
• Disadvantages: lack of redundancy — failure of one disk will result in the loss of all data in the disk
array.
• Application: Used in tasks where high performance is a priority, for example, video processing or
working with large files.
RAID 1 (Mirroring)

• RAID 1 creates a copy of the data on each disk, ensuring its safety.
• Advantages: high fault tolerance - if one disk fails, all information remains on the second.
• Disadvantages: Reduces available storage space by half.
• Examples of Use: used in systems where high reliability is required, for example, in banking systems.
RAID 2
• This configuration uses striping across disks, with some disks
storing error checking and correcting (ECC) information.
• RAID 2 also uses a dedicated Hamming code parity, a linear form of
ECC.
• RAID 2 has no advantage over RAID 3 and is no longer used.
RAID 3
• This technique uses striping and dedicates one drive to storing parity information. The
embedded ECC information is used to detect errors.
• Data recovery is accomplished by calculating the exclusive OR (XOR) of the information recorded on the other drives. Because an I/O operation addresses all the drives at the same time, RAID 3 cannot
overlap I/O.
• For this reason, RAID 3 is best for single-user systems with long record applications
RAID 4
• This level uses large stripes, which means a user can read
records from any single drive.
• Overlapped I/O can then be used for read operations. Because
all write operations are required to update the parity drive, no
I/O overlapping is possible
RAID 5 (Striping with Distributed Parity)

• RAID 5 distributes data and parity across disks, providing a balance between speed and reliability.
• This level is based on parity block-level striping. The parity information is striped across each drive, enabling the array
to function, even if one drive were to fail.
• The array's architecture enables read and write operations to span multiple drives. This results in performance better than that of a single drive, but not as high as a RAID 0 array. RAID 5 requires at least three disks.
• Advantages: effective combination of performance, reliability, and disk space utilization.
• Disadvantages: recovery after a failure can take time and reduce performance.
• Application: widely used on servers where both performance and protection are important.
RAID 6 (Striping with Double Parity)

• RAID 6 is similar to RAID 5 but uses double parity information, allowing it to withstand the failure of
two disks simultaneously. RAID 6 requires an array of 4 or more disks to operate.
• Advantages: higher reliability compared to RAID 5.
• Disadvantages: higher disk overhead, as two disks' worth of capacity is used for parity; slower write performance than RAID 5 arrays.
• Where it is used: ideal for databases and critical applications.
RAID 10 (RAID 1+0)

• RAID 10 combines the methods of RAID 1 and RAID 0, creating mirrored pairs of disks with data
distributed across all pairs.
• Advantages: fast operation and data protection.
• Disadvantages: high disk costs, as half of the capacity is used for duplication.
• Examples of Use: ideal for databases and critical applications.
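Before moving on to the other nested levels, the capacity and fault-tolerance trade-offs among the levels above can be summarized numerically. The following small Python sketch applies the standard formulas for equally sized disks; the example array of four 4 TB disks is an assumption for illustration.
```python
def raid_summary(level: str, disks: int, disk_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, guaranteed number of disk failures
    tolerated) for an array of equally sized disks."""
    if level == "RAID0":    # striping only: full capacity, no redundancy
        return disks * disk_tb, 0
    if level == "RAID1":    # n-way mirror: one disk's capacity, n-1 failures
        return disk_tb, disks - 1
    if level == "RAID5":    # one disk's worth of distributed parity
        return (disks - 1) * disk_tb, 1
    if level == "RAID6":    # two disks' worth of parity
        return (disks - 2) * disk_tb, 2
    if level == "RAID10":   # striped mirror pairs: half the raw capacity
        return disks * disk_tb / 2, 1   # more if failures hit different pairs
    raise ValueError(f"unknown level {level}")

for level in ("RAID0", "RAID1", "RAID5", "RAID6", "RAID10"):
    usable, tolerated = raid_summary(level, disks=4, disk_tb=4.0)
    print(f"{level:7s} 4 x 4 TB -> usable {usable:4.1f} TB, tolerates {tolerated} failure(s)")
```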
Other Nested RAID
• RAID 01 (RAID 0+1). RAID 0+1 is similar to RAID 1+0, except the data
organization method is slightly different. Rather than creating a mirror
and then striping it, RAID 0+1 creates a stripe set and then mirrors the
stripe set.
• RAID 03 (RAID 0+3, also known as RAID 53 or RAID 5+3). This level
uses striping in RAID 0 style for RAID 3's virtual disk blocks. This offers
higher performance than RAID 3, but at a higher cost.
• RAID 50 (RAID 5+0). This configuration combines RAID 5 distributed
parity with RAID 0 striping to improve RAID 5 performance without
reducing data protection.
BENEFITS OF RAID
• Improved cost-effectiveness because lower-priced disks are used in large
numbers.
• Using multiple hard drives enables RAID to improve the performance of a single
hard drive.
• Increased computer speed and reliability after a crash, depending on the
configuration.
• Reads and writes can be performed faster than with a single drive with RAID 0.
This is because a file system is split up and distributed across drives that work
together on the same file.
• There is increased availability and resiliency with RAID 5. With mirroring (RAID 1), two drives can contain the same data, ensuring one will continue to work if the other fails.
Cloud Usage Monitor

• A cloud usage monitor is a mechanism that tracks, analyzes, and logs resource usage within a cloud environment to optimize utilization and minimize costs.
• It provides visibility into resource usage patterns such as CPU usage, memory usage,
network traffic, and storage usage.
• It enables businesses to identify underutilized resources and make the necessary
adjustments to optimize performance and reduce expenses.
• It provides detailed metrics on resource consumption, user activities, and
performance.
• Cloud usage monitoring is implemented via automatic monitoring software, giving
control and central access over cloud infrastructure.
• Purpose:
– Enables billing, performance optimization, and resource management.
– By using cloud usage monitoring tools, businesses can stay ahead of performance issues,
identify areas for improvement, and maintain optimal cloud performance.
Roles of Usage Monitoring in Cloud

• Key Roles:
– Performance Management: Ensures workloads meet performance requirements by monitoring key metrics
(e.g., CPU, memory, Network and Disk I/O).
– Cost Optimization: Identifies underutilized resources to reduce unnecessary expenses.
– Troubleshooting: Provides insights into system behavior to diagnose and resolve issues quickly.
– Scalability: Helps predict demand trends to provision resources dynamically.
– Compliance(Log Analysis & Regular Audit): Tracks resource usage for audit, error logs for debugging and
regulatory compliance.
– Health Checks:
• Regular checks on system and service status (e.g., HTTP responses, latency).
– Alerts and Notifications:
• Automated alerts for predefined thresholds or abnormal behavior.
• Cloud monitoring assesses three main areas: Performance Monitoring, Security
Monitoring, and Compliance Monitoring. Each area is critical for managing the health,
security, and regulatory compliance of cloud services.
Benefits of Monitors in Cloud Computing

– Improved system performance.
– Valuable insights into cloud resources, allowing businesses to make data-driven decisions that improve efficiency and reduce expenses.
– Proactive issue detection and resolution, with early warnings of potential problems that help businesses avoid unexpected costs and performance issues.
– Optimized resource usage, which improves performance and ensures businesses get the most value from their cloud investments.
– Compliance with Service Level Agreements (SLAs).
Tools for Monitoring Storage Usage

• a. AWS CloudWatch
• A monitoring and observability service.
• Features:
– Tracks metrics like CPU utilization, disk I/O, and network activity.
– Logs management via Amazon CloudWatch Logs.
– Alerts and automated responses using CloudWatch Alarms.
– Dashboards for visualization.
• b. Azure Monitor
• A comprehensive monitoring service for Azure resources.
• Features:
– Tracks metrics and logs for VMs, databases, and web apps.
– Integrated with Azure Alerts for incident response.
– Application Insights for performance monitoring of applications.
• c. Google Cloud Operations Suite (formerly Stackdriver)
• Features:
– Tracks resource metrics, logs, and traces.
– Real-time monitoring with customizable dashboards.
• d. Third-Party Tools
• Datadog: Unified monitoring across multi-cloud environments.
• New Relic: Application performance monitoring (APM) and infrastructure monitoring.
• SolarWinds: Hybrid IT monitoring with deep analytics.
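As an example of pulling such usage metrics programmatically, the boto3 sketch below fetches average CPU utilization for one EC2 instance over the last hour via CloudWatch. The instance ID and region are placeholder assumptions.
```python
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

# Average CPU utilization of a single (hypothetical) instance, in 5-minute buckets.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f'{point["Average"]:.1f}%')
```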
Storage Usage Metrics to Monitor
• Monitoring the usage of storage devices is a critical aspect of managing
the infrastructure in cloud computing. It helps in optimizing performance,
managing costs, ensuring data availability, and detecting potential issues.
Key metrics for monitoring cloud storage include:
– Capacity Usage: The amount of used vs. available storage.
– I/O Operations: Read/write operations per second (IOPS).
– Latency: Time taken to perform storage operations.
– Throughput: Data transfer rate during storage operations.
– Error Rates: Frequency of storage-related errors.
– Snapshot and Backup Sizes: For disaster recovery and compliance.
Metrics Tracked in Cloud Monitoring

The most commonly monitored metrics include:


a. Compute Metrics
– CPU Utilization: Measures processing power usage.
– Instance Uptime: Tracks the running state of VMs.
b. Memory Metrics
– Memory Usage: Tracks how much memory is in use.
– Memory Swapping: Indicates excessive memory usage.
c. Storage Metrics
– Disk Usage: Monitors available and used storage space.
– I/O Operations: Tracks read/write operations.
– Storage Latency: Measures response time for storage requests.
d. Network Metrics
– Inbound/Outbound Traffic: Tracks data transfer rates.
– Packet Loss: Indicates network health and reliability.
– Latency: Measures delays in data transmission
Real-Time vs Historical Usage Monitoring

a. Real-Time Monitoring
– Provides live insights into resource usage.
– Useful for:
• Immediate issue detection and resolution.
• Dynamic scaling based on demand spikes.
– Tools: AWS CloudWatch Live Metrics, Azure Monitor Live Metrics.
b. Historical Monitoring
– Stores past usage data for trend analysis and capacity planning.
– Useful for:
• Identifying usage patterns.
• Forecasting future resource needs.
– Tools: Google Cloud Logging, AWS CloudWatch Logs
Benefits for Resource Optimization and Cost Control

a. Resource Optimization
– Ensures resources are neither over- nor under-provisioned.
– Identifies idle or underutilized resources for rightsizing.
– Helps with workload placement for optimal performance.
b. Cost Control
– Detects and eliminates unused resources (e.g., unattached volumes).
– Tracks cost-related metrics to avoid budget overruns.
– Enables cost allocation by tagging resources for usage accountability.
c. Improved Application Performance
– Detects performance bottlenecks (e.g., high CPU usage).
– Ensures applications run smoothly under varying loads.
d. Proactive Issue Resolution
– Triggers alerts for potential problems (e.g., memory thresholds).
– Reduces downtime through faster incident response.
Resource Replication

• Description: The process of duplicating cloud resources, such as servers, data, or applications, across multiple locations or systems.
• Data replication is the process by which data residing on a physical/virtual
server(s) or cloud instance (primary instance) is continuously replicated or
copied to a secondary server(s) or cloud instance (standby instance).
• Organizations replicate data to support high availability, backup, and/or
disaster recovery.
• (By contrast, an Automated Scaling Listener is a service agent that constantly monitors workload demands and automatically triggers scaling actions; it is covered later in this deck.)
• Purpose: Improves reliability, fault tolerance, and availability by creating
redundancies.
How does resource replication occur?

• Resource replication can involve synchronous, asynchronous, or hybrid replication.
• Synchronous replication requires the bandwidth of a LAN
between the servers, possibly with an extended LAN in two
geographically remote computer zones.
• Asynchronous replication can be implemented on a low speed
WAN.
• Hybrid data replication is a method of data replication that
combines synchronous and asynchronous replication
Synchronous data replication
• Synchronous data replication is a method of data replication that ensures that the
source and the target data are always identical and consistent.
• This is accomplished through a process where every write operation on the source
is immediately replicated to the target, and the source waits for the confirmation
from the target before proceeding to the next operation, maintaining data integrity.
• This approach prevents data loss or inconsistency in case of a failure or a disaster
• Drawbacks:
– include higher network bandwidth and latency requirements since the source and the target
need to communicate constantly and synchronously
– lower performance and throughput since the source has to wait for the target to acknowledge
every write operation.
– involves higher cost and complexity due to needing similar hardware and software configurations
close to each other.
Asynchronous data replication

• Asynchronous data replication is a method of data replication that does not guarantee that the source and the target data are always identical and consistent.
• This means that the source does not wait for the confirmation from the target
before proceeding to the next operation, and the target receives the data in
batches or intervals, resulting in less network bandwidth and latency overhead, as
well as higher performance and throughput.
• However, there are some risks associated with this approach, such as
– data loss or inconsistency in case of a failure or disaster,
– data corruption or conflicts due to different versions or sequences of data from the
source,
– data recovery or synchronization challenges when attempting to reconcile and align the
source and target after a failure or disaster.
Hybrid data replication

• Hybrid data replication is a method of data replication that combines synchronous and asynchronous replication, depending on the type, priority,
and frequency of the data.
• This way, you can balance the trade-offs between data consistency and
availability, performance, and cost.
• However, there are complexities associated with hybrid data
replication, such as
– the need to classify and categorize data for synchronous or asynchronous replication,
– manage and monitor data replication processes across multiple methods and
locations,
– test and validate the replication methods to ensure they meet business requirements.
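The behavioural difference between the two replication modes can be sketched in a few lines of Python. The replicate_to_target function and the target names are hypothetical stand-ins rather than a real replication API; the point is only where the caller waits for acknowledgement.
```python
import queue
import threading


def replicate_to_target(target: str, record: bytes) -> None:
    """Hypothetical stand-in for sending one write to a replica over the network."""
    ...  # e.g., an RPC or storage API call


# Synchronous replication: the write does not complete until every replica
# has acknowledged, so primary and replicas never diverge.
def synchronous_write(record: bytes, targets: list[str]) -> None:
    for target in targets:
        replicate_to_target(target, record)   # caller blocks on each acknowledgement
    # only now is the write reported as successful to the application


# Asynchronous replication: the write returns immediately and a background
# worker drains a queue, so replicas may briefly lag behind the primary.
replication_queue: "queue.Queue[tuple[bytes, list[str]]]" = queue.Queue()

def asynchronous_write(record: bytes, targets: list[str]) -> None:
    replication_queue.put((record, targets))  # enqueue and return right away

def replication_worker() -> None:
    while True:
        record, targets = replication_queue.get()
        for target in targets:
            replicate_to_target(target, record)
        replication_queue.task_done()

threading.Thread(target=replication_worker, daemon=True).start()
```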
Synchronous vs. asynchronous replication

• Synchronous replication simultaneously writes data to two systems instead of one. Asynchronous replication
writes data to the primary storage array first and then copies data to the replica.
• Synchronous replication ensures data consistency but may introduce latency and performance issues due to
waiting for acknowledgments. On the other hand, asynchronous replication offers better performance but risks
potential data loss during failures.
• Synchronous replication is the process of copying data over a network to create multiple current copies of the
data. Synchronous replication is mainly used for high-end transactional applications that need instant failover if
the primary node fails.
• The benefits of asynchronous replication
– Asynchronous replication requires substantially less bandwidth than synchronous replication.
– It is designed to work over long distances.
– asynchronous replication can tolerate some degradation in connectivity.
• In contrast, synchronous replication allows failover from primary to secondary data storage to occur nearly instantaneously, ensuring little to no application downtime. However, as noted above, it requires the bandwidth of
a LAN between the servers, possibly with an extended LAN in two geographically remote computer zones and
may also require specialized hardware (depending on the implementation).
How does resource replication occur?
The process of resource replication typically involves several steps:
• Data Distribution: When data is stored in the cloud, it's often replicated across
multiple servers or data centers. This distribution ensures that if one server fails, the
data can still be accessed from another location without interruption.
• Automatic Replication: Cloud platforms often have built-in mechanisms for
automatic replication. When a file or piece of data is uploaded to the cloud, it's
automatically replicated to multiple locations according to predefined replication
policies set by the cloud provider or user.
• Load Balancing: Resource replication also involves load balancing mechanisms to
evenly distribute workloads across replicated resources. This ensures optimal
performance and prevents any single server from becoming overwhelmed with
requests.
How does resource replication occur?
• Synchronization: To ensure consistency across replicated resources, synchronization
mechanisms are employed. Changes made to data or applications in one location are
synchronized with all other replicated instances in real-time or at defined intervals.
• Failover and Disaster Recovery: Replication plays a crucial role in failover and disaster recovery
scenarios. If one server or data center experiences a failure, traffic can be rerouted to
replicated resources, minimizing downtime and ensuring continuity of service.
• Geographical Distribution: Cloud providers often replicate resources across multiple
geographic regions to improve performance and provide resilience against natural disasters or
regional outages.
• Overall, resource replication in cloud computing is essential for maintaining high availability, reliability, and resilience in the face of hardware failures, network issues, or other disruptions.
• By distributing data and services across multiple locations, cloud providers ensure that users can access their resources consistently and without interruption.
Ready-Made Environment

• Description: Pre-configured environments (e.g., development, testing, or production setups) provided as a service. These
environments include the necessary software, tools, and
configurations.
• Purpose: Simplifies deployment and accelerates development
cycles by offering an out-of-the-box solution.
Automated Scaling Listener
• An Automated Scaling Listener in cloud computing is a mechanism that monitors
the state or performance of applications, systems, or resources and triggers
scaling actions based on predefined rules or real-time conditions. This ensures
optimal resource utilization, cost management, and application performance.
• The automated scaling listener mechanism is a service agent that helps in
monitoring and tracking communications between users and cloud services
being accessed in order to achieve dynamic scaling.
• Scaling refers to the ability of a system to handle the increasing workload by
adding resources or nodes to the system. Cloud scalability in cloud computing
refers to the ability to increase or decrease IT resources as needed to meet
changing demand.
• Automated scaling listeners are deployed within the cloud, typically near the
firewall, from where they automatically track workload status information.
Automated scaling listener
• Workloads can be assessed based on the number of requests made by
cloud users or by the demands placed on the backend by particular kinds
of requests. For instance, processing a tiny amount of incoming data can
take a lot of time.
• Automated Scaling listeners can respond to workload fluctuation
conditions in a variety of ways, including:
• Automatically Adjusting IT Resources based on previously set parameters
by the cloud consumer (Auto Scaling).
• Automatic Notification of the cloud consumer when workloads go above
or below predetermined thresholds. This gives the cloud user the option
to change how its present IT resources are allocated. (Auto Notification)
Core Components of an Automated Scaling Listener
• Monitoring Metrics
– Tracks metrics such as CPU usage, memory utilization, request rates, disk I/O, or application-specific metrics.
– These metrics are collected through monitoring tools like AWS CloudWatch, Azure Monitor, or Google Cloud Monitoring.
• Scaling Policies
– Rules that define when and how to scale resources.
– Types:
• Threshold-based Scaling: Scales resources when a metric exceeds or falls below a specific threshold (e.g., add instances if CPU > 80%).
• Schedule-based Scaling: Scales based on predefined schedules (e.g., increase capacity during peak hours).
• Predictive Scaling: Uses machine learning models to anticipate future demand and scale proactively.
• Action Triggers
– The listener identifies when conditions match scaling policies and initiates scaling actions:
• Scale-Out: Add resources (e.g., instances, containers) to handle increased demand.
• Scale-In: Remove resources to save costs when demand decreases.
• Execution
– Uses cloud-native orchestration tools or APIs to adjust resource levels.
– Examples:
• AWS Auto Scaling adjusts EC2 instances or ECS tasks.
• Kubernetes Horizontal Pod Autoscaler scales pods in a cluster.
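A minimal sketch of the threshold-based policy described above, written as a polling loop in Python. The helper functions for reading a metric and adjusting capacity are hypothetical placeholders, not a real cloud SDK API; in practice they would call a monitoring service and an orchestration or auto-scaling API.
```python
import time

# Hypothetical helpers; real implementations would query a monitoring API
# (e.g., CloudWatch) and adjust an auto-scaling group or container cluster.
def average_cpu_percent() -> float: ...
def current_instance_count() -> int: ...
def set_instance_count(n: int) -> None: ...

SCALE_OUT_AT = 80.0   # add capacity above this CPU threshold
SCALE_IN_AT = 20.0    # remove capacity below this threshold
MIN_INSTANCES, MAX_INSTANCES = 2, 10

def scaling_listener(poll_seconds: int = 60) -> None:
    """Poll a workload metric and apply simple threshold-based scaling rules."""
    while True:
        cpu = average_cpu_percent()
        count = current_instance_count()
        if cpu > SCALE_OUT_AT and count < MAX_INSTANCES:
            set_instance_count(count + 1)      # scale out
        elif cpu < SCALE_IN_AT and count > MIN_INSTANCES:
            set_instance_count(count - 1)      # scale in
        time.sleep(poll_seconds)
```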
Failover Systems in Distributed Computing

• The failover system technique makes use of established cloud technologies to provide redundant implementations in order to boost the availability and reliability of
IT resources. When an active IT resource becomes unavailable, a failover system is set
up to automatically transition to a redundant or standby instance of that resource.
• Mission-critical software or reusable services that could create a single point of
failure for several applications are frequently subject to failover systems. In order to
host one or more redundant implementations of the same IT resource at each
location, a failover system can cover more than one geographical area. To provide
redundant IT resource instances that are continually monitored for the detection of
defects and unavailability conditions, this approach may rely on the resource
replication mechanism.

• A failover system ensures high availability by automatically redirecting workloads to a standby system when a primary system fails.
Types of Failover
• Three forms of failover exist:
– automatic failover (without data loss),
– planned manual failover (without data loss),
– forced manual failover (with possible data loss), typically called forced failover. Automatic and
planned manual failover preserve all your data.

• Key Components:
– Redundancy:
• Duplicate resources to act as backups.
• Can be active-passive or active-active.
– Automatic Detection:
• Monitoring systems identify failures in real-time.
– Failover Mechanism:
• Seamless redirection of traffic or workloads to backup resources.
– Recovery Time Objective (RTO) and Recovery Point Objective (RPO):
• Metrics to ensure minimal downtime and data loss.
Importance of Failover Mechanisms in System Design
• Failover is a crucial component of system design, particularly in settings where dependability and uptime matter most. Failover is important for the following reasons:
– High Availability: In the event that a system or component fails, failover makes sure that services continue to be
offered. This is essential for systems like banking systems, emergency services, and e-commerce platforms that must
be available around-the-clock.
– Redundancy: In the event of a breakdown, failover systems offer redundancy by having backup parts or resources
prepared to take over. The possibility that a single point of failure will bring down the entire system is reduced by this
redundancy.
– Fault Tolerance: By automatically identifying faults and rerouting workload or traffic to healthy components, failover
techniques enhance fault tolerance. This lessens the effect that malfunctions have on the system as a whole.
– Disaster Recovery: Failover is a crucial part of any disaster recovery strategy. Failover techniques aid in the prompt restoration of services and reduction of downtime in the event of a disaster, such as a hardware malfunction,
network outage, or natural disaster.
– Business Continuity: By reducing downtime and guaranteeing the continued availability of vital services, failover
assures business continuity. This is especially crucial for companies whose operations significantly depend on their IT
infrastructure.
– Customer Satisfaction: Higher customer satisfaction is a result of dependable services. By preserving service
dependability and availability, failover techniques make sure that users may continue to access the services they
require.
Failover Strategies
• Active-Active
– In an active-active setup, redundant IT resource implementations actively and synchronously support
the workload (Figure 1). It is necessary to load balance among active instances. The failed instance is
eliminated from the load balancing scheduler when a failure is discovered (Figure 2). Whenever a fault is
discovered, the processing is transferred to the IT resource that is still active
• Active-Passive
– A standby or inactive implementation is triggered in an active-passive setup to take over processing
from an IT resource that becomes unavailable, and the associated workload is directed to the instance
taking over the operation.
– Some failover systems are made to reroute workloads to operational IT resources. These systems rely on
specialized load balancers to identify failure conditions and remove instances of failed IT resources from
the workload distribution. This kind of failover system is appropriate for IT resources that support
stateless processing and don’t need execution state management. The redundant or standby IT resource
implementations are required to share their state and execution context in technology architectures that
are often based on clustering and virtualization technologies.
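The active-passive pattern above can be sketched as a simple health-check loop. The endpoints and the point_traffic_at function are hypothetical placeholders for a real load balancer or DNS update; this is only an illustration of the detection-and-redirect logic.
```python
import time
import urllib.request

PRIMARY = "https://primary.example.internal/health"    # hypothetical endpoints
STANDBY = "https://standby.example.internal/health"

def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers its health check with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

def point_traffic_at(endpoint: str) -> None:
    """Hypothetical stand-in for updating a load balancer or DNS record."""
    print("routing traffic to", endpoint)

def failover_monitor(poll_seconds: int = 10) -> None:
    active = PRIMARY
    point_traffic_at(active)
    while True:
        if active == PRIMARY and not is_healthy(PRIMARY):
            active = STANDBY                 # fail over to the standby instance
            point_traffic_at(active)
        elif active == STANDBY and is_healthy(PRIMARY):
            active = PRIMARY                 # fail back once the primary recovers
            point_traffic_at(active)
        time.sleep(poll_seconds)
```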
Failover Systems in Computing
• Benefits:
– Minimizes downtime and disruptions.
– Ensures business continuity.
– Meets compliance and SLA requirements.
• Challenges and Best Practices
• Challenges:
– Complex setup for large-scale systems.
– Balancing cost and performance.
– Ensuring proper testing and validation.
• Best Practices:
– Implement regular failover testing.
– Use multi-region or multi-zone deployments.
– Monitor failover systems themselves.
– Optimize cost by scaling resources dynamically.
Failover Systems in Cloud Computing
• Cloud Computing Platforms:
– Cloud computing platforms implement failover mechanisms to maintain uptime and availability of virtualized resources and services.

• Approaches:
– DNS Failover:
• Uses DNS to route traffic to healthy resources.
• Example: AWS Route 53, Cloudflare.
– Load Balancer Failover:
• Distributes traffic across multiple instances and redirects when an instance fails.
• Example: AWS Elastic Load Balancer, Azure Load Balancer.
– Cluster-based Failover:
• Utilizes clustered servers or containers to ensure high availability.
• Example: Kubernetes, VMware vSphere.
– Backup and Restore:
• A simpler failover strategy involving restoring from backups.

– Hypervisor-based failover mechanisms, such as VMware HA (High Availability) and Microsoft Hyper-V Replica, automatically restart virtual machines on healthy hosts in case of host
failures.
– Load balancers and DNS-based failover solutions distribute incoming traffic across multiple servers or data centers and redirect traffic to healthy instances during failures or performance
degradation.

• Conclusion
Monitoring and failover systems are integral to the resilience of cloud-based infrastructures. By combining robust
monitoring tools with a reliable failover mechanism, organizations can ensure high availability and maintain service
continuity even during unexpected disruptions. Monitoring and failover systems are critical components of cloud
computing infrastructures, ensuring reliability, availability, and seamless service delivery.
Thank You!
