Module3 4

AWS for Cloud Computing
Module 3: AWS Database Services
• AWS Lambda
• Amazon DynamoDB
• Amazon ECS (Elastic Container Service) & Amazon S3 Glacier
• Amazon Kinesis, Amazon Redshift
• Amazon EMR (Elastic MapReduce), AWS Disaster Recovery and Backup
Amazon Kinesis
• Kinesis is a platform on AWS for collecting and processing streaming data.
• The Amazon Kinesis service allows users to process large amounts of data per second in real time, including audio, video, application logs, website clickstreams, and IoT telemetry.
What is streaming data?
Streaming data is data that is generated continuously by thousands of data sources, which send their data records simultaneously and in small sizes. Examples:
• Stock prices
• YouTube, Facebook, OTT platforms
• IoT sensor data
https://www.windy.com/-Rain-thunder-rain?rain,5.519,47.520,3
How Amazon Kinesis Works?
• The working of Amazon Kinesis can be divided into four stages, as follows.
1. Data Ingestion
Amazon Kinesis collects or receives data from different sources such as applications, sensors, and so on. The data can arrive in different formats, such as JSON and binary, and can come from real-time applications.
2. Sharding and Scaling
The data received from the different sources is divided into smaller parts called shards, which allow parallel processing and support redundancy and fault tolerance. Amazon Kinesis can scale horizontally by adding shards as the requirement grows. The shard count is not fixed, although AWS account quotas apply.
3. Processing and Buffering
After sharding, the data is prepared for further use, for example by filtering or aggregating records before they are stored.
4. Making the Data Accessible
Once the steps above are complete, the data must be made accessible. Kinesis offers several ways to access and use your data stream:
• Kinesis Data Streams API
• Kinesis Firehose
• Kinesis Analytics
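The sharding stage can be illustrated with a small sketch. Kinesis Data Streams hashes each record's partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains it. The equal split of the hash space and the helper names `shard_ranges` and `shard_for_key` are assumptions made for illustration; this is a local simulation, not a call to the AWS API.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 digests are 128-bit integers


def shard_ranges(num_shards):
    """Split the 128-bit hash key space into equal, contiguous ranges (assumed layout)."""
    size = HASH_SPACE // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == num_shards - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key, ranges):
    """Return the index of the shard whose range contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hash_key <= end:
            return index
    raise ValueError("hash key outside all shard ranges")


ranges = shard_ranges(4)
for key in ["sensor-1", "sensor-2", "clickstream-user-42"]:
    print(key, "-> shard", shard_for_key(key, ranges))
```

Records with the same partition key always land on the same shard, which is why the choice of partition key determines how evenly the load spreads.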
What Can I Do With Kinesis Data Streams?
Amazon Kinesis streams data in real time and helps you handle it and act on it according to your organization’s goals. The following broad categories will get you started:
• Real-time data ingestion and processing: Amazon Kinesis takes in data as it is generated. This helps in applications such as healthcare, where the data is used to monitor and regulate a patient's health and to predict an emergency beforehand. It can also be used by OTT platforms to personalize content according to the user's experience.
• Streamlined data delivery and storage: Real-time data can be stored for further research and later use, and Amazon Kinesis can be integrated with other AWS services.
• Real-time insights and automation: The collected data is analyzed as it arrives, so you can react immediately to anomalies, fraud attempts, or other critical events. You can also monitor key metrics to support data-driven decision making.
Core Services of Kinesis
• Kinesis Streams
• Kinesis Firehose
• Kinesis Analytics
Kinesis Streams
• Kinesis streams consist of shards.
• Each shard supports up to 5 read transactions per second, up to a maximum total read rate of 2 MB per second, and up to 1,000 records per second for writes, up to a maximum total write rate of 1 MB per second.
• Suppose EC2 instances, mobile phones, laptops, and IoT devices are producing data. They are known as producers.
• The data is moved to the Kinesis stream and stored in shards. By default, data is retained in shards for 24 hours. You can increase the retention period to 7 days (AWS has since raised the maximum to 365 days).
• Once the data is stored in shards, EC2 instances known as consumers take the data from the shards and turn it into useful data. Once the consumers have performed their calculations, the useful data is moved to other AWS services such as DynamoDB, S3, EMR, or Redshift.
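The per-shard limits above translate directly into a sizing calculation. The sketch below estimates how many shards a stream needs: it takes the largest of the three requirements (write volume, write record rate, read volume). The function name `shards_needed` and the example workload are hypothetical, and the read figure assumes consumers share the standard 2 MB per second per shard.

```python
import math

# Per-shard limits quoted in the slide above.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_sec / READ_MB_PER_SHARD),
    )


# Hypothetical workload: 3.5 MB/s written as 2,500 records/s, 7 MB/s read in total.
print(shards_needed(3.5, 2500, 7.0))  # prints 4
```

In this example the write volume and the read volume both call for 4 shards, while the record rate alone would need only 3.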
Kinesis Firehose
• Kinesis Firehose is a service for delivering streaming data to destinations such as Amazon S3, Amazon Redshift, and Amazon Elasticsearch.
• With Kinesis Firehose, you do not have to manage the resources.
• EC2 instances, mobile phones, laptops, and IoT devices produce the data. They are also known as producers. Producers send the data to Kinesis Firehose.
• With Kinesis Firehose there are no shards or streams to manage, and you do not have to resize shards manually to keep up with the data. It is completely automated. You do not even have to worry about consumers. Data can be analyzed by using a Lambda function.
• Once the data has been analyzed, it is sent directly to S3. One important thing about Kinesis Firehose is that there is no automatic retention window; Firehose simply sends the data directly to S3 or another destination.
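The Lambda step mentioned above might look like the following sketch. When Firehose invokes a transformation function it passes a batch of base64-encoded records, and the function returns each `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed` plus the re-encoded `data`. The JSON payload shape and the added `source` field are assumptions for illustration only.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and hand it back for delivery."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Undecodable records are returned unchanged and flagged as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        payload["source"] = "firehose-demo"  # hypothetical enrichment field
        line = json.dumps(payload) + "\n"    # newline-delimited JSON for S3
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```

The handler can be exercised locally by passing a dictionary with a `records` list, without any AWS resources.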
Kinesis Analytics
• Kinesis Analytics is a Kinesis service in which streaming data is processed and analyzed using standard SQL.
• It is very similar to Firehose.
• However, Kinesis Analytics allows you to run SQL queries on the data that exists within Kinesis Firehose. You can use the SQL queries to store the results in S3, Redshift, or an Elasticsearch cluster. Essentially, data is analyzed inside Kinesis using a SQL-type query language.
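A typical streaming SQL query groups records into fixed time windows and aggregates each window. The sketch below shows, in plain Python rather than Kinesis Analytics SQL, what a per-key count over tumbling (non-overlapping) windows computes. The event tuples and the function name `tumbling_window_counts` are hypothetical.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key) for fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for timestamp, key in events:
        # Every timestamp belongs to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical (timestamp in seconds, ticker symbol) events.
events = [(0, "AAPL"), (12, "AAPL"), (59, "MSFT"),
          (60, "AAPL"), (61, "AAPL"), (119, "MSFT")]
for (start, key), n in sorted(tumbling_window_counts(events, 60).items()):
    print(start, key, n)
```

With a 60-second window, the first three events fall into the window starting at 0 and the remaining three into the window starting at 60.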
Applications
• Real-time application monitoring: Amazon Kinesis provides real-time data from your applications. In a health application, for example, it provides a live feed of data so that you can act on the issues it points out.
• Fraud detection and prevention: Amazon Kinesis helps you guard against fraudulent activity by analyzing transaction data, so that you can detect suspicious patterns and block fraudulent transactions before they happen.
• Personalized recommendations and marketing: Amazon Kinesis helps you analyze customer data so that you understand your customers better. You can recommend personalized products to customers in real time.
• IoT analytics and predictive maintenance: Kinesis unlocks the full potential of your connected devices through the examination of sensor data from electronics, automobiles, or machinery.
Amazon Redshift
• Amazon Redshift is a big data analytics service.
• Redshift is a fast, powerful, fully managed, petabyte-scale data warehouse service in the cloud.
Redshift Configuration
A Redshift warehouse is a collection of computing resources known as nodes, and these nodes are organized in a group known as a cluster. Each cluster runs a Redshift engine and contains one or more databases.
• Redshift supports two types of configuration:
• Single node
• Multi-node
• Single node: A single node stores up to 160 GB.
• Multi-node: A multi-node cluster consists of more than one node. Its nodes are of two types:
• Leader Node - It manages the client connections and receives queries. A leader node receives the queries from the client applications, parses them, and develops the execution plans. It coordinates the parallel execution of these plans with the compute nodes, combines the intermediate results from all the nodes, and then returns the final result to the client application.
• Compute Node - A compute node executes the execution plans. The intermediate results are then sent to the leader node for aggregation before being sent back to the client application. A cluster can have up to 128 compute nodes.
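The division of work between the leader node and the compute nodes is a scatter-gather pattern. The sketch below simulates it locally for an average: each "compute node" returns an intermediate sum and count for its slice of the data, and the "leader node" combines them into the final result. The thread pool, the function names, and the sample slices are illustrative assumptions, not how Redshift is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node_sum_count(rows):
    """Compute node: run the plan on its slice and return an intermediate result."""
    return sum(rows), len(rows)


def leader_node_average(slices):
    """Leader node: fan the plan out, then combine the intermediate results."""
    with ThreadPoolExecutor(max_workers=max(1, len(slices))) as pool:
        partials = list(pool.map(compute_node_sum_count, slices))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Hypothetical data already distributed across three compute nodes.
slices = [[10, 20, 30], [40, 50], [60]]
print(leader_node_average(slices))  # prints 35.0
```

Returning a sum and a count, rather than a per-node average, is what lets the leader combine the partial results correctly when the slices have different sizes.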
Redshift features
• Easy to set up, deploy, and manage
Automated provisioning and backup
• Cost-effective
No upfront costs, pay as you go
• Scale quickly to meet your needs
Amazon Redshift can scale the nodes up or down as your needs change.
• Faster performance
• Secure
You can also enable encryption, so that all the data written to disk is encrypted.
