What Is Streaming Data
Streaming data is data that is generated continuously by thousands of data sources, which
typically send data records simultaneously and in small sizes (on the order of kilobytes).
Streaming data includes a wide variety of data such as log files generated by customers using
your mobile or web applications, ecommerce purchases, in-game player activity, information
from social networks, financial trading floors, or geospatial services, and telemetry from
connected devices or instrumentation in data centers.
• A financial institution tracks changes in the stock market in real time, computes value-at-risk,
and automatically rebalances portfolios based on stock price movements.
• A real-estate website tracks a subset of data from consumers’ mobile devices and makes
real-time recommendations of properties to visit based on their geolocation.
• A solar power company has to maintain power throughput for its customers, or pay penalties.
It implemented a streaming data application that monitors all of the panels in the field and
schedules service in real time, thereby minimizing the periods of low throughput from each
panel and the associated penalty payouts.
• A media publisher streams billions of clickstream records from its online properties,
aggregates and enriches the data with demographic information about users, and optimizes
content placement on its site, delivering more relevant content and a better experience to its audience.
• An online gaming company collects streaming data about player-game interactions, and feeds
the data into its gaming platform. It then analyzes the data in real time and offers incentives
and dynamic experiences to engage its players.
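The solar panel example above can be sketched in a few lines to show what processing a stream record by record looks like. This is only an illustration under stated assumptions: the `PanelMonitor` class, the three-reading window, and the 100-watt threshold are hypothetical and do not come from any AWS service or from the company described.

```python
from collections import defaultdict, deque


class PanelMonitor:
    """Keeps a sliding window of recent readings per panel (hypothetical design)."""

    def __init__(self, window_size=3, min_avg_watts=100.0):
        self.window_size = window_size
        self.min_avg_watts = min_avg_watts
        # deque(maxlen=...) drops the oldest reading automatically once it is full.
        self.windows = defaultdict(lambda: deque(maxlen=window_size))

    def observe(self, panel_id, watts):
        """Ingest one reading; return True if the panel should be serviced."""
        window = self.windows[panel_id]
        window.append(watts)
        if len(window) < self.window_size:
            return False  # not enough readings yet to judge this panel
        return sum(window) / len(window) < self.min_avg_watts


monitor = PanelMonitor(window_size=3, min_avg_watts=100.0)
# Readings arrive one at a time, interleaved across panels (assumed sample data).
readings = [
    ("p1", 120.0), ("p2", 90.0),
    ("p1", 110.0), ("p2", 80.0),
    ("p1", 115.0), ("p2", 70.0),
]
flagged = [panel for panel, watts in readings if monitor.observe(panel, watts)]
print(flagged)
```

Each reading is handled as it arrives and only a small window per panel is kept in memory, which is the main difference from loading a full data set and analyzing it in a batch.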
Amazon Kinesis is a platform for streaming data on AWS. It offers powerful services that make
it easy to load and analyze streaming data, and it enables you to build custom streaming data
applications for specialized needs. It includes two services: Amazon Kinesis Firehose and
Amazon Kinesis Streams.
In addition, you can run other streaming data platforms, such as Apache Kafka, Apache Flume,
Apache Spark Streaming, and Apache Storm, on Amazon EC2 and Amazon EMR.
Amazon Kinesis Streams
Amazon Kinesis Streams enables you to build your own custom applications that process or
analyze streaming data for specialized needs. It can continuously capture and store terabytes of
data per hour from hundreds of thousands of sources. You can then build applications that
consume the data from Amazon Kinesis Streams to power real-time dashboards, generate alerts,
implement dynamic pricing and advertising, and more. Amazon Kinesis Streams supports your
choice of stream processing framework, including the Kinesis Client Library (KCL), Apache Storm, and
Apache Spark Streaming.
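As a rough sketch of the producer side, the snippet below shows how an application might write records to a stream. It assumes the boto3 Kinesis client and its `put_record` call; the stream name `game-events`, the `player_id` key field, and the helper functions are hypothetical names chosen for illustration. The client is passed in as an argument so the logic can be exercised without AWS credentials.

```python
import json


def build_record(event, key_field="player_id"):
    """Serialize one event into keyword arguments for put_record."""
    return {
        "Data": json.dumps(event).encode("utf-8"),
        # Records that share a partition key are routed to the same shard.
        "PartitionKey": str(event[key_field]),
    }


def send_events(client, stream_name, events):
    """Send events one at a time and return how many were sent."""
    sent = 0
    for event in events:
        client.put_record(StreamName=stream_name, **build_record(event))
        sent += 1
    return sent


# With AWS credentials configured, a real client could be used like this
# (the stream name is an assumption and must already exist):
#   import boto3
#   client = boto3.client("kinesis")
#   send_events(client, "game-events", [{"player_id": 42, "action": "level_up"}])
```

Choosing the player ID as the partition key is a design assumption: it keeps each player's events together on one shard, which is useful when a consumer needs to see them in order.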
Amazon Kinesis Firehose
Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture and
automatically load streaming data into Amazon S3 and Amazon Redshift, enabling near real-time
analytics with the business intelligence tools and dashboards you’re already using. It lets you
quickly implement an ELT (extract, load, transform) approach and start gaining benefits from
streaming data.
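A minimal sketch of loading data through Firehose might look like the following. It assumes the boto3 Firehose client and its `put_record_batch` call, which accepts up to 500 records per request; the delivery stream name `clickstream-to-s3` and the helper functions are hypothetical. Events are written as newline-delimited JSON, a common but assumed choice that keeps the raw records easy to transform after loading, in line with the ELT approach.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_batches(events, batch_size=MAX_BATCH):
    """Encode events as newline-delimited JSON and split them into batches."""
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def deliver(client, delivery_stream, events):
    """Send all events in batches and return the number of API calls made."""
    calls = 0
    for batch in to_batches(events):
        client.put_record_batch(DeliveryStreamName=delivery_stream, Records=batch)
        calls += 1
    return calls


# With AWS credentials configured, a real client could be used like this
# (the delivery stream name is an assumption and must already exist):
#   import boto3
#   client = boto3.client("firehose")
#   deliver(client, "clickstream-to-s3", [{"page": "/home", "user": "u1"}])
```

This sketch does not inspect the response. A production producer would check `FailedPutCount` in each response and resend any records that were not accepted.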