Unit-4 - Cloud Storage and Database Services
Cloud Databases
Types of cloud databases (SQL, NoSQL)
Data scaling and replication
Cloud Storage Solutions
Cloud storage:-
Cloud storage is a data storage model in which digital information
such as documents, photos, videos and other forms of media is
stored on virtual or cloud servers hosted by third parties.
It allows you to transfer data to an offsite storage system and
access it whenever needed.
Cloud storage is a service model in which data is transmitted to and
stored on remote storage systems, where it is maintained,
managed, backed up and made available to users over a network,
typically the internet.
Users generally pay for their cloud data storage at a monthly rate
based on consumption.
Cloud storage is a virtual locker where we can remotely stash any data.
When we upload a file to a cloud-based service like Google Drive,
OneDrive, or iCloud, that file gets copied over the Internet to a data
server: an actual physical space where companies store files on
multiple hard drives.
Features of Cloud storage:-
It offers greater availability of resources.
Easy maintenance is one of the key benefits of using cloud computing.
Cloud computing offers broad network access.
It is largely automated.
Security is one of the major components: using cloud computing, you
can secure data across the network.
Examples:-
Google Drive, Dropbox, Microsoft OneDrive, Amazon S3, and iCloud.
These services provide users with a convenient, scalable, and accessible
way to store and manage their data in the cloud, without the need to
maintain physical storage infrastructure.
Advantages of Cloud storage:-
Scalability: Storage capacity can be expanded and
performance can be enhanced as needs grow.
Flexibility: Data can be managed and scaled according to
rules you define.
Simpler Data Migrations: New data can be added and old data removed
when required, which eliminates disruptive data migrations.
Recovery: In the event of a hard drive failure or other hardware
malfunction, you can still access your files in the cloud.
Disadvantages of Cloud storage:-
Data centers require electricity and reliable internet connectivity to
operate; without them the system will not work properly.
Support for cloud storage is often limited, especially if you are using
a free version of a cloud provider.
When you use a cloud provider, your data is no longer on your own
physical storage.
Cloud-based storage depends on having an internet
connection. If you are on a slow network, you may have trouble
accessing your storage.
Managed Cloud storage:-
Managed cloud storage is a fully-managed service provided by
cloud vendors, where the cloud provider takes care of the entire
storage infrastructure, including hardware, software, and
maintenance tasks.
The key characteristics of managed cloud storage are:
Fully managed: The cloud provider handles all aspects of storage
management, including provisioning, scaling, backups,
redundancy, and updates.
High availability and durability: Managed cloud storage services
are designed to provide high availability and durability, with data
replicated across multiple data centers and geographic regions for
fault tolerance and data protection.
Automatic scaling: Storage capacity can be automatically scaled
up or down based on your needs, without the need for manual
intervention.
Pay-as-you-go pricing: You pay only for the storage space you use,
typically on a per-gigabyte or per-terabyte basis.
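The pay-as-you-go idea can be illustrated with a small calculation. This is a minimal sketch: the function name and the rate of 0.02 per gigabyte per month are hypothetical values chosen for illustration, not any provider's actual pricing.

```python
def monthly_storage_cost(stored_gb: float, rate_per_gb_month: float) -> float:
    """Return the monthly bill for the storage actually used."""
    if stored_gb < 0 or rate_per_gb_month < 0:
        raise ValueError("usage and rate must be non-negative")
    return stored_gb * rate_per_gb_month


# Hypothetical rate (assumption): 0.02 currency units per GB per month.
RATE = 0.02

small_bill = monthly_storage_cost(500, RATE)    # 500 GB stored
large_bill = monthly_storage_cost(2048, RATE)   # 2 TB (2048 GB) stored
print(small_bill, large_bill)
```

The bill grows in direct proportion to the data stored, so there is no charge for capacity that is not used.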
Examples:
Storage type | Access Method | Performance
Object storage | RESTful APIs, typically HTTP/HTTPS | Stores unlimited data with minimal latency.
Block storage | Block device interface, e.g., iSCSI, Fibre Channel | High-performance, low latency, and rapid data transfer.
File storage | File system protocols, e.g., NFS, SMB/CIFS | Offers high performance for shared file access.
Data consistency and durability
Data Consistency:-
Example: In a shopping app, when you purchase something, the app
immediately shows the correct information everywhere.
In cloud computing environments, data consistency is typically
achieved through one of two main models:
strong consistency
eventual consistency
These models represent different approaches to maintaining data
integrity and synchronization across distributed systems.
Real world applications:
Financial Transactions
E-commerce Inventory Management
Collaborative Document Editing
Strong Consistency:
Strong consistency, also known as strict consistency and typically
implemented with synchronous replication, ensures that data is always
consistent across all nodes or replicas in the system.
This means that any read operation will always return the most recent,
updated value as seen by the latest write operation, regardless of which
node or replica is accessed.
In a strongly consistent system, write operations are propagated
synchronously to all replicas before they are acknowledged as
successful.
This ensures that all replicas have the same view of the data at any
given point in time.
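The behaviour described above can be sketched in a few lines. This is a toy, in-memory model under simplifying assumptions (no network, no failures); the class name `StronglyConsistentStore` is made up for illustration and is not a real library API.

```python
class StronglyConsistentStore:
    """Toy store: a write is acknowledged only after every replica has it."""

    def __init__(self, replica_count: int) -> None:
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key: str, value: str) -> bool:
        # Synchronous replication: update all replicas before acknowledging.
        for replica in self.replicas:
            replica[key] = value
        return True  # the acknowledgement is returned only after the loop completes

    def read(self, key: str, replica_index: int) -> str:
        # Any replica returns the latest acknowledged value.
        return self.replicas[replica_index][key]


store = StronglyConsistentStore(replica_count=3)
store.write("balance", "100")
store.write("balance", "80")
values = [store.read("balance", i) for i in range(3)]
print(values)  # every replica reports the same, most recent value
```

The cost of this guarantee is that each write waits for the slowest replica before it is acknowledged.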
Eventual Consistency:
Eventual consistency is a more relaxed consistency model that
prioritizes availability and partition tolerance over strict consistency.
In an eventually consistent system, changes to data are propagated
asynchronously to replicas, which means that different nodes or
replicas may have temporarily inconsistent views of the data.
However, the system is designed to eventually converge to a consistent
state, where all replicas will eventually become consistent with each
other after the updates have been fully propagated and reconciled.
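A minimal sketch of asynchronous propagation is shown here. It is a toy, single-process model; the class name `EventuallyConsistentStore` and the explicit `propagate` step (standing in for a background replication process) are assumptions made for illustration.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy store: writes are acknowledged before the other replicas see them."""

    def __init__(self, replica_count: int) -> None:
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # updates not yet applied to the other replicas

    def write(self, key, value) -> None:
        # Acknowledge after updating only the first replica.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index: int):
        return self.replicas[replica_index].get(key)

    def propagate(self) -> None:
        # Background step: apply queued updates, in order, to the other replicas.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas[1:]:
                replica[key] = value


store = EventuallyConsistentStore(3)
store.write("likes", 10)
stale = store.read("likes", 2)   # replica 2 has not caught up yet
store.propagate()
fresh = store.read("likes", 2)   # after propagation it matches replica 0
print(stale, fresh)
```

Between the write and the propagation step, a reader of replica 2 sees an old view; once propagation finishes, all replicas agree.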
Advantages of Data Consistency:-
Data Integrity: Ensuring data consistency helps maintain the
accuracy and integrity of data across multiple nodes or replicas in the
cloud. This is crucial for applications that require reliable and
trustworthy data.
Improved Data Quality: Consistent data reduces the risk of data
corruption, duplication, or inconsistencies, which can lead to better
data quality and decision-making processes.
Simplified Data Management: By maintaining data consistency,
organizations can simplify data management processes, such as
backups, migrations, and synchronization across different cloud
environments or regions.
Enhanced User Experience: Applications that rely on consistent data
can provide a better user experience by ensuring that users access the
same, up-to-date information regardless of their location or the node
they interact with.
Disadvantages of Data Consistency:-
Performance Impact: Achieving strong data consistency can often
come at the cost of performance. Techniques like distributed
transactions, quorum-based protocols, or synchronous replication can
introduce latency and overhead, potentially impacting application
performance.
Availability Trade-offs: According to the CAP theorem (Consistency,
Availability, Partition Tolerance), in the presence of network partitions,
distributed systems must choose between consistency and availability.
Prioritizing consistency may sacrifice availability in certain scenarios.
Increased Complexity: Implementing and maintaining data
consistency in distributed cloud environments can be complex,
requiring specialized techniques, protocols, and tools. This complexity
can increase development and operational costs.
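The quorum-based protocols mentioned above usually rest on a simple counting rule: with N replicas, a write quorum of W and a read quorum of R, every read overlaps the latest write when W + R > N. The sketch below checks that rule and confirms it by brute-force enumeration; the function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Counting rule: read and write quorums must share a replica."""
    return w + r > n


def overlap_by_enumeration(n: int, w: int, r: int) -> bool:
    """Check every possible write set against every possible read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(quorums_overlap(3, 2, 2))  # majority writes and majority reads
print(quorums_overlap(3, 1, 1))  # fast, but a read can miss the latest write
```

Larger quorums give stronger consistency but mean waiting for more replicas per operation, which is the latency cost described in the list above.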
Data Durability:-
Data durability in the cloud refers to the ability of cloud storage
systems to ensure that data remains intact, accessible, and
recoverable even in the face of various failures, disruptions, or
unforeseen events.
Cloud service providers employ various techniques and strategies to
achieve data durability and provide reliable storage services to their
customers.
Example:
If you save a document in a cloud service, the service ensures that
your document is safe even if there is a power outage or server failure.
Replication and Redundancy:
Cloud storage providers replicate data across multiple storage nodes,
data centers, and even geographic regions to ensure redundancy and
fault tolerance.
If one storage node or data center fails, the data can be retrieved from
other replicas, ensuring data durability and availability.
Continuous Data Protection:
Cloud storage systems often employ continuous data protection
mechanisms, such as write-ahead logging or journaling, to ensure data
consistency and recoverability in the event of system failures or
crashes.
These techniques capture all data modifications and allow for point-in-
time recovery or roll-forward recovery, minimizing data loss.
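Write-ahead logging can be sketched as follows: each change is appended to a log on disk before it is applied, and after a crash the log is replayed to rebuild the state. This is a simplified, single-file illustration; the class name `WalStore` and the JSON-lines log format are assumptions, not how any particular cloud provider implements it.

```python
import json
import os
import tempfile


class WalStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self) -> None:
        # Roll-forward recovery: re-apply every logged change in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key: str, value: str) -> None:
        # 1. Append the change to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.data[key] = value


log_file = os.path.join(tempfile.mkdtemp(), "store.wal")
store = WalStore(log_file)
store.put("doc", "draft 1")
store.put("doc", "draft 2")

# Simulate a crash: the in-memory state is lost, but the log survives.
recovered = WalStore(log_file)
print(recovered.data)
```

Replaying only part of the log, up to a chosen record, is the basis of point-in-time recovery.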
Advantages of Data Durability:-
High availability and fault tolerance: Cloud providers implement
redundancy and replication techniques, ensuring data remains
accessible even in the event of hardware failures, network outages, or
data center disruptions.
Disaster recovery and business continuity: By storing data in
multiple geographic locations and providing backup and recovery
services, cloud providers enable organizations to recover from disasters
and minimize downtime.
Scalability and elasticity: Cloud storage systems can easily scale up
or down to accommodate changing data volumes, ensuring data
durability without compromising performance or capacity.
Automatic data repair and healing: Cloud storage systems often
incorporate self-healing mechanisms that automatically detect and
repair corrupted or missing data, reducing the need for manual
intervention and minimizing data loss.
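The self-healing idea in the last point can be sketched with checksums: a stored digest identifies which replica is corrupted, and a healthy copy overwrites it. This is a minimal illustration; the `heal` function and the in-memory list of replicas are hypothetical stand-ins for a real storage system.

```python
import hashlib


def checksum(blob: bytes) -> str:
    """Return the SHA-256 digest used to verify a stored object."""
    return hashlib.sha256(blob).hexdigest()


def heal(replicas: list, expected: str) -> int:
    """Overwrite corrupted replicas with a healthy copy; return the repair count."""
    healthy = next((r for r in replicas if checksum(r) == expected), None)
    if healthy is None:
        raise RuntimeError("no healthy replica left to repair from")
    repaired = 0
    for index, blob in enumerate(replicas):
        if checksum(blob) != expected:
            replicas[index] = healthy
            repaired += 1
    return repaired


original = b"quarterly-report"
expected = checksum(original)
replicas = [original, b"quarterly-rep0rt", original]  # the second copy is corrupted
repaired_count = heal(replicas, expected)
print(repaired_count)
```

Repair is only possible while at least one healthy copy remains, which is why replication and self-healing are used together.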
Disadvantages of Data Durability:-
Vendor lock-in: Organizations may become dependent on a specific
cloud provider's data durability mechanisms, making it challenging to
migrate data or switch providers due to potential compatibility issues or
data transfer costs.
Potential security risks: While cloud providers implement robust
security measures, the shared nature of cloud infrastructure can
introduce potential security risks, such as data breaches or
unauthorized access, which could compromise data durability.
Network dependencies: Cloud storage relies heavily on network
connectivity, and any network disruptions or latency issues could
impact data durability and availability, particularly for synchronous
replication or real-time data processing.
Cloud Database
A cloud database is a database
built to run in a public or hybrid
cloud environment to help
organize, store, and manage
data within an organization.
Cloud databases can be offered
as a managed database-as-a-
service (DBaaS) or deployed on
a cloud-based virtual machine
(VM) and self-managed by an
in-house IT team.
A cloud database is a database service that runs on a cloud computing
platform, provided as a cloud service by a third-party cloud provider.
Instead of hosting and managing a database on-premises using your
own hardware and infrastructure, you can leverage a cloud database
offered by companies like Amazon Web Services (AWS), Microsoft
Azure, Google Cloud Platform, and others.
There are two primary cloud database deployment models.
Traditional database (self-managed)
Database as a service (DBaaS)
Traditional database (self-managed):-
It is very similar to an onsite, in-house managed database, except for
infrastructure provisioning.
In this case, an organization purchases virtual machine space from a
cloud services provider, and the database is deployed to the cloud.
The organization's developers, working in a DevOps model, or its
traditional IT staff control the database.
The organization is responsible for oversight and database
management.
Database as a service (DBaaS):-
In this model, an organization contracts with a cloud services provider
through a fee-based subscription service.
The service provider offers a variety of real-time operational,
maintenance, administrative, and database management tasks to the
end user.
The database runs on the service provider’s infrastructure.
This usage model typically includes automation in the areas of
provisioning, backup, scaling, high availability, security, patching, and
health monitoring.
Advantages of Cloud Database:-
Access: Cloud access makes mobile access to data much easier.
Scalability: The rapid scalability of cloud databases can easily
accommodate data asset increases and user base growth.
Performance: Automatic alerts about performance issues enable
optimization of indexes and access patterns in order to hit performance
targets.
Reliability: Cloud databases are usually replicated and backed up
automatically, so single-point-of-failure concerns are minimized.
Disadvantages of Cloud Database:-
Vendor Lock-In: Migrating from one cloud provider to another can be
challenging due to the lack of standardization and potential vendor
lock-in, making it difficult to switch providers if needed.
Network Dependency: Cloud databases rely on a stable and high-speed
internet connection to function properly. Network disruptions or latency
issues can impact performance and availability.
Security Concerns: While cloud providers offer robust security
measures, organizations may have concerns about potential data
breaches, unauthorized access, or other security risks associated with
storing data in a shared cloud environment.
Limited Control and Customization: Cloud databases are managed by
the cloud provider, which may limit organizations' ability to customize
or fine-tune the database configurations to their specific needs.
Types of Cloud Databases
There are four main types of cloud databases.
Relational Databases (SQL Databases)
NoSQL Databases
In-Memory Databases
Data Warehouses
Relational Databases (SQL Databases):-
SQL cloud databases are based on the
relational database model, which organizes
data into tables with rows and columns,
enforcing relationships and data integrity
through schemas and constraints.
These databases use Structured Query
Language (SQL) for defining and
manipulating data.
Relational cloud databases are ideal for
structured data, such as retail analytics
data related to transactions, inventory, or
customer information.
Characteristics of Relational Databases (SQL Databases):-
Structured data model with predefined schemas.
Support for ACID (Atomicity, Consistency, Isolation, Durability)
properties.
Use SQL for querying and manipulating data.
Suitable for complex transactions and relationships.
Vertically scalable (scale up/down resources for a single instance).
SQL cloud databases are well-suited for applications that require
strict data consistency, complex queries, and transactions, such as e-
commerce platforms, banking systems, and enterprise resource
planning (ERP) systems.
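The schema, constraint, and transaction ideas above can be tried locally. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a managed cloud SQL service; the `accounts` table and the `transfer` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "alice", 100), (2, "bob", 50)],
)
conn.commit()


def transfer(conn, src: int, dst: int, amount: int) -> bool:
    """Move money atomically: both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


ok = transfer(conn, 1, 2, 30)          # succeeds
rejected = transfer(conn, 1, 2, 500)   # would overdraw alice, so it is rolled back
balances = dict(conn.execute("SELECT owner, balance FROM accounts"))
print(ok, rejected, balances)
```

In the rejected transfer, the credit to the destination account has already run when the debit fails, and the rollback undoes it. That is atomicity in practice.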
Examples of Relational Databases:
Amazon Relational Database Service (RDS)
Microsoft Azure SQL Database
Google Cloud SQL
IBM Db2 on Cloud
Popular Relational Databases (SQL Databases):-
Amazon Relational Database Service (RDS):
Provider: Amazon Web Services (AWS)
IBM Db2 on Cloud:
Provider: IBM
Description:-
Based on the IBM Db2 database engine, with compatibility for on-
premises Db2 databases.
Offers features like high availability, workload management, and data
partitioning.
Integrates with other IBM Cloud services like Watson Studio and
Cloud Object Storage.
Pricing is based on the deployment model (Virtual Private Cloud or
bare metal), compute resources, and storage used.
Advantages of Relational Databases (SQL Databases):-
Data Integrity: Relational databases enforce data integrity through
the use of schemas, constraints, and transactions, ensuring that data
remains consistent and accurate.
ACID Compliance: SQL databases adhere to the ACID (Atomicity,
Consistency, Isolation, Durability) properties, which guarantee
reliable and consistent data operations, even in the event of failures
or concurrent transactions.
Structured Query Language (SQL): SQL is a standardized and widely-
adopted language for managing and querying data, making it easier
to work with and integrate with various applications and tools.
Relationships and Joins: Relational databases allow for defining and
querying relationships between data through joins, enabling complex
data modeling and analysis.
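A join across two related tables might look like the following sketch. It uses Python's built-in `sqlite3` module and an in-memory database in place of a cloud SQL service; the `customers` and `orders` tables and their rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total       INTEGER NOT NULL
);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO orders VALUES (10, 1, 250), (11, 1, 100), (12, 2, 75);
""")

# Join each customer to their orders, then summarize per customer.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()
print(rows)
```

The relationship is expressed once, through the `customer_id` column, and the query combines the two tables at read time.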
Disadvantages of Relational Databases (SQL Databases):-
Scalability Challenges: Traditional SQL databases can face scalability
challenges when dealing with large volumes of data or high write
loads, as they rely on vertical scaling (adding more resources to a
single node).
Schema Rigidity: Relational databases require a predefined schema,
which can make it difficult to adapt to changing data requirements or
handle unstructured or semi-structured data.
Complexity: SQL databases can be complex to set up, configure, and
optimize, often requiring specialized database administrators and
expertise.
Less Suitable for Unstructured Data: SQL databases are primarily
designed for structured data and may not be the best choice for
handling large volumes of unstructured or semi-structured data,
such as documents, media files, or IoT sensor data.
NoSQL Databases:-
NoSQL stands for Not only SQL.
NoSQL cloud databases are designed to handle large volumes of semi-
structured or unstructured data.
They offer flexible data models, horizontal scalability, and relaxed
consistency compared to traditional SQL databases.
Unlike a relational database, NoSQL databases are non-tabular,
meaning they don't store data in relational tables and rows with strict
schemas. Because of this flexibility, NoSQL databases are able to
store a variety of data types with varying schemas.
These databases are designed to handle unstructured data, such as
social media posts, log files, and user-generated content.
Types of NoSQL Database:
Key-value stores
Column-oriented databases
Document-based databases
Graph-based databases
Characteristics of NoSQL Databases:-
Flexible data models (key-value, document, column-family, graph)
Horizontally scalable (scale out by adding more nodes)
Eventual consistency (data consistency achieved over time)
High availability and partition tolerance
Optimized for high-throughput and low-latency operations
Examples:
Amazon DynamoDB (Key-Value)
Azure Cosmos DB (Multi-model)
Google Cloud Datastore (Document-oriented)
Redis (In-memory key-value store)
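Horizontal scaling is commonly achieved by spreading keys across nodes. The sketch below shows one simple approach, hash-based sharding, with plain dictionaries standing in for nodes. The function name `shard_for` is hypothetical, and real systems often use more elaborate schemes such as consistent hashing.

```python
import hashlib


def shard_for(key: str, node_count: int) -> int:
    """Map a key to a node index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# Four in-memory dictionaries stand in for four database nodes.
nodes = [dict() for _ in range(4)]
user_ids = ["user:1", "user:2", "user:3", "user:4", "user:5"]

for user_id in user_ids:
    nodes[shard_for(user_id, len(nodes))][user_id] = {"id": user_id}

# A lookup goes straight to the one node that owns the key.
owner = nodes[shard_for("user:3", len(nodes))]
print(owner["user:3"])
```

With plain modulo sharding, changing the node count remaps most keys, which is one reason production systems tend to prefer consistent hashing.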
Key-value stores:-
A key-value data store is a type of
database that stores data as a
collection of key-value pairs.
In this type of data store, each data
item is identified by a unique key, and
the value associated with that key can
be anything, such as a string,
number, object, or even another data
structure.
Key features of the key-value store:
Simplicity, Scalability and Speed.
Example:-Amazon DynamoDB
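The key-value model is small enough to sketch with a Python dictionary. The `KeyValueStore` class below is a toy written for illustration only; it is not the DynamoDB API.

```python
class KeyValueStore:
    """Toy key-value store: each unique key maps to an arbitrary value."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, value) -> None:
        self._items[key] = value  # overwrites any existing value for the key

    def get(self, key: str, default=None):
        return self._items.get(key, default)

    def delete(self, key: str) -> None:
        self._items.pop(key, None)


store = KeyValueStore()
store.put("session:42", "active")                        # string value
store.put("cart:42", {"items": ["pen", "notebook"]})     # nested structure
store.put("visits:42", 7)                                # number

cart = store.get("cart:42")
store.delete("session:42")
missing = store.get("session:42")
print(cart, missing)
```

Every operation addresses exactly one key, which is what makes this model simple to partition and fast to serve.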
Column-oriented databases (Wide Column):-
A wide column data store is a type of NoSQL database that stores data
in columns rather than rows, making it highly scalable and flexible.
In a wide column data store, data is organized into column families,
which are groups of columns that share the same attributes.
Each row in a wide column data store is identified by a unique row
key, and the columns in that row are further divided into column
names and values.
Key features of column-oriented databases:
Scalability.
Compression.
Very responsive.
Example:- Amazon Keyspaces (for Apache Cassandra)
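The row key, column family, and column layout can be pictured as nested maps. The sketch below is an in-memory illustration only: the helper names and sample rows are made up, and it does not reflect how Cassandra or Amazon Keyspaces store data on disk.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key: str, family: str, column: str, value) -> None:
    table[row_key][family][column] = value


def get_family(row_key: str, family: str) -> dict:
    """Return all columns of one column family for a row (empty if absent)."""
    if row_key not in table or family not in table[row_key]:
        return {}
    return dict(table[row_key][family])


put("user#1", "profile", "name", "Asha")
put("user#1", "profile", "city", "Pune")
put("user#1", "activity", "last_login", "2024-01-05")
put("user#2", "profile", "name", "Ravi")  # this row has fewer columns

profile = get_family("user#1", "profile")
no_activity = get_family("user#2", "activity")
print(profile, no_activity)
```

Rows do not need to share the same set of columns, which is where the flexibility of the model comes from.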
Document-based databases:-
In a document database, the data is stored in documents.
Each document is typically a nested structure of keys and values.
The values can be atomic data types, or complex elements such as
lists, arrays, nested objects.
A document database stores data in JSON, BSON, or XML documents.
Documents are retrieved by unique keys.
It may also be possible to retrieve only parts of a document.
Example:-Amazon DocumentDB (with MongoDB compatibility)
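The document model can be sketched with JSON-style dictionaries. The helpers `insert`, `find_by_id`, and `project` are hypothetical names used for illustration; they are not the MongoDB or DocumentDB API.

```python
import json

collection = {}  # document id -> document


def insert(doc_id: str, document: dict) -> None:
    # Round-trip through JSON so only JSON-compatible data is stored, as a copy.
    collection[doc_id] = json.loads(json.dumps(document))


def find_by_id(doc_id: str):
    """Retrieve a whole document by its unique key."""
    return collection.get(doc_id)


def project(doc_id: str, path: str):
    """Return only part of a document, following a dotted path like 'address.city'."""
    node = collection[doc_id]
    for part in path.split("."):
        node = node[part]
    return node


insert("u1", {
    "name": "Asha",
    "tags": ["admin", "beta"],                       # list value
    "address": {"city": "Pune", "pin": "411001"},    # nested object
})

whole = find_by_id("u1")
city = project("u1", "address.city")
print(whole["tags"], city)
```

Nested objects and lists live inside a single document, so related data can be read in one lookup.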
Graph-based databases:-
Graph databases are used to store and query highly connected data.
Data can be modeled in the form of entities (also referred to as nodes, or
vertices) and the relationships between those entities (also referred to as
edges).
The strength or nature of the relationships also carries significant
meaning in graph databases.
Users can then traverse the graph structure by starting at a defined set
of nodes or edges and travel across the graph, along defined relationship
types or strengths, until they reach some defined condition.
Results can be returned in the form of literals, lists, maps, or graph
traversal paths.
An example of a social network graph
Example:-Amazon Neptune
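A traversal of the kind described above can be sketched with a list of edges and a breadth-first search. The small social graph and the function names below are invented for illustration; real graph databases expose this through query languages rather than hand-written loops.

```python
from collections import deque

# (source, relationship type, target) edges of a small, made-up social graph
edges = [
    ("alice", "FRIEND", "bob"),
    ("bob", "FRIEND", "carol"),
    ("alice", "FOLLOWS", "dave"),
    ("carol", "FRIEND", "erin"),
]


def neighbours(node: str, rel_type: str) -> list:
    """Nodes reachable from `node` over one edge of the given type."""
    return [dst for src, rel, dst in edges if src == node and rel == rel_type]


def reachable(start: str, rel_type: str, max_hops: int) -> list:
    """Breadth-first traversal along one relationship type, up to max_hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    found = []
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # stop condition reached for this path
        for nxt in neighbours(node, rel_type):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                frontier.append((nxt, hops + 1))
    return found


friends_of_friends = reachable("alice", "FRIEND", max_hops=2)
print(friends_of_friends)
```

The traversal starts at a defined node, follows only `FRIEND` edges, and stops at the hop limit, matching the start set, relationship type, and stop condition described above.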
Popular Non-Relational Databases (NoSQL Databases):-
Amazon DynamoDB (Key-Value):
Provider: Amazon Web Services (AWS)
Description:-