Unit 3
* The 0th order moment is the number of distinct elements in the stream
* The 1st order moment is the length of the stream
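Both statements follow from the general definition of the k-th order moment: the sum, over all distinct elements, of (number of occurrences) raised to the power k. The sketch below is a minimal exact computation that keeps every count in memory, so it only illustrates the definition and is not a streaming approximation. The function name frequency_moment and the sample stream are assumptions made for this example.

```python
from collections import Counter


def frequency_moment(stream, k):
    """Return the k-th order moment: sum of (occurrences of each distinct element) ** k."""
    counts = Counter(stream)
    return sum(m ** k for m in counts.values())


# Made-up stream: "a" occurs 3 times, "b" twice, "c" once.
stream = ["a", "b", "a", "c", "b", "a"]

print(frequency_moment(stream, 0))  # number of distinct elements
print(frequency_moment(stream, 1))  # length of the stream
```

With k = 0 every distinct element contributes 1, and with k = 1 every occurrence contributes 1, which is why the two moments reduce to the distinct count and the stream length.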
Counting of 1’s in a Window
• Infinite Stream Processing
• The volume of data is so large that it cannot be stored. There is hardly
a chance to look at all of it.
• Important queries are likely to ask only about the most recent data or
summaries of data.
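As a rough illustration of a query over only the most recent data, the sketch below counts the 1's among the last N bits of a bit stream. It is an exact counter that stores the whole window, so it only suits windows small enough to fit in memory. The class name WindowOnesCounter and the sample bits are assumptions made for this example.

```python
from collections import deque


class WindowOnesCounter:
    """Exact count of 1's among the most recent `window_size` bits."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.ones = 0

    def add(self, bit):
        """Append a new bit and evict the oldest bit once the window is full."""
        self.window.append(bit)
        self.ones += bit
        if len(self.window) > self.window_size:
            self.ones -= self.window.popleft()

    def count(self):
        return self.ones


counter = WindowOnesCounter(window_size=4)
for bit in [1, 0, 1, 1, 0, 1]:
    counter.add(bit)

print(counter.count())  # 1's among the last four bits: 1, 1, 0, 1
```

Memory grows linearly with the window size here, which is exactly the cost that becomes a problem when the window itself is too large to store.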
Decaying Windows
• Useful in applications which need identification of most common elements
• Decaying window concept assigns more weight to recent elements
• The technique computes a smooth aggregation of all the 1’s ever seen in
the stream, with decaying weights.
• The further back in the stream an element appears, the less weight it is given.
• The effect of exponentially decaying weights is to spread out the weights of
the stream elements as far back in time as the stream flows
When a new element arrives, the score is updated in two steps, where c is a
small constant:
1. Multiply the current sum/score by the value (1−c).
2. Add the weight corresponding to the new element.
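The two update steps can be sketched as follows, keeping one decayed score per element so that the most common recent elements can be identified. This is a minimal sketch under stated assumptions: each arriving element is given a weight of 1, and the function name decayed_scores, the sample stream, and the value c = 0.5 are chosen only to make the arithmetic easy to follow (in practice c would be much smaller).

```python
def decayed_scores(stream, c):
    """Maintain an exponentially decaying score for every element seen so far."""
    scores = {}
    for element in stream:
        # Step 1: multiply every current score by (1 - c).
        for key in scores:
            scores[key] *= (1 - c)
        # Step 2: add the weight of the new element (assumed to be 1 here).
        scores[element] = scores.get(element, 0.0) + 1.0
    return scores


scores = decayed_scores(["x", "y", "x"], c=0.5)
print(scores)

# The element with the highest decayed score is the most common recent one.
print(max(scores, key=scores.get))
```

Because old contributions are multiplied by (1 − c) at every arrival, an element seen i arrivals ago carries weight (1 − c) raised to the power i, which is the exponential decay described above.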
Real-Time Analytics Platform (RTAP) Application:
Real-time analytics
• Refers to finding meaningful patterns in data at the time the data is
received
• A Real-Time Analytics Platform (RTAP) analyses and correlates the data, and
predicts outcomes in real time.
• Manages and processes data and helps timely decision-making
• Helps to develop dynamic analysis applications
• Leads to evolution of business intelligence
Widely used RTAPs
• Apache Spark Streaming: a Big Data platform for data stream analytics in
real time.
• Cisco Connected Streaming Analytics (CSA): a platform that delivers
insights from high-velocity streams of live data from multiple sources and
enables immediate action.
• Oracle Stream Analytics (OSA): a platform that provides a graphical
interface to “Fast Data”.
• SAP HANA: a streaming analytics tool that also performs real-time analytics.
• SQLstream Blaze: an analytics platform offering a real-time, easy-to-use, and
powerful visual development environment for developers and analysts.
• TIBCO StreamBase: a streaming analytics platform for quickly building
applications that accelerate action.
• Informatica: a real-time data streaming tool that transforms a
torrent of small messages and events into business agility.
• IBM Stream Computing: a data streaming tool that analyzes a broad
range of streaming data (unstructured text, video, audio, geospatial, sensor),
helping organizations spot opportunities and risks and make
decisions in real time.
RTAP Applications
1. Fraud detection systems for online transactions
2. Log analysis for understanding usage patterns
3. Click analysis for online recommendations
4. Social Media Analytics
5. Push notifications to the customers for location-based
advertisements for retail
6. Action by emergency services in response to events such as fires and
accidents in industry
7. Healthcare monitoring, where any abnormal measurement requires an
immediate reaction
Case Studies
Case Study 1: Streaming to a Cloud-Based Lambda Architecture
Qlik is working with a Fortune 500 healthcare solution provider to hospitals,
pharmacies, clinical laboratories, and doctors that is investing in cloud analytics
to identify opportunities for improving quality of care.
Data is streamed from source databases such as SQL Server and Oracle to a
Kafka message queue, which in turn feeds a Lambda architecture on Amazon Web
Services (AWS) Simple Storage Service (S3).
Case Study 2: Streaming to the Data Lake
Decision makers at an international food industry leader, which we’ll call
“Suppertime,” needed a current view and continuous integration of
production capacity data, customer orders, and purchase orders to process
and distribute efficiently.
Qlik Replicate injects this data stream, along with any changes to the source
metadata and data definition language (DDL) changes, to a Kafka message
queue that feeds HDFS and HBase consumers that subscribe to the relevant
message topics.
Case Study 3: Streaming, Data Lake, and Cloud Architecture
Qlik Replicate CDC is remotely capturing updates and DDL changes from
source databases (Oracle, SQL Server, MySQL, and DB2) at four locations in
the United States. Qlik Replicate then sends that data through an encrypted
File Channel connection over a wide area network (WAN) to a virtual
machine–based instance of Qlik Replicate in the Azure cloud.
Case Study 4: Supporting Microservices on the AWS Cloud Architecture
Updates are captured from the DB2 transaction log and sent via encrypted multipathing to the
AWS cloud. There, certain transaction records are copied straight to an RDS
database, using the same schemas as the source, for analytics by a single line
of business.
Case Study 5: Real-Time Operational Data Store/Data Warehouse
The goal was to improve the efficiency of the company’s data replication process. This required
continuous copies of transactional data from the company’s production Oracle
database to an operational data store (ODS) based on SQL Server. Although
the target is an ODS rather than a full-fledged data warehouse, this case study
serves our purpose of illustrating the advantages of CDC for high-scale
structured analysis and reporting.
Real Time Sentiment Analysis:
Real-time sentiment analysis is a machine learning (ML) technique that
automatically recognizes and extracts the sentiment in a text as soon as the
text appears. It is most commonly used to analyze brand and product mentions in
live social comments and posts.
Why Do We Need Real-Time Sentiment Analysis?
Real-time sentiment analysis has several applications for brand and customer
analysis.
1. Live social feeds from video platforms like Instagram or Facebook
2. Real-time sentiment analysis of text feeds from platforms such as Twitter.
This helps in promptly addressing negative or wrongful social mentions and in
detecting threats such as cyberbullying.
3. Live monitoring of influencer live streams.
4. Live video streams of interviews, news broadcasts, seminars, panel
discussions, speaker events, and lectures.
5. Live audio streams such as in virtual meetings on Zoom or Skype, or at
product support call centers for customer feedback analysis.
6. Live monitoring of product review platforms for brand mentions.
7. Up-to-date scanning of news websites for relevant news through keywords
and hashtags, along with the sentiment in the news.
How Is Real-Time Sentiment Analysis Done?
Live sentiment analysis is done through machine learning algorithms that are
trained to recognize and analyze all data types from multiple data sources,
across different languages, for sentiment.
Step 1 - Data collection
To extract sentiment from live feeds from social media or other online
sources, we first need to connect to the live APIs of those specific platforms,
such as Instagram or Facebook.
Step 2 - Data processing
All the data gathered from the various platforms is now prepared for analysis.
All text data in comments is cleaned up and processed for the next stage. All
non-text data from live video or audio feeds is transcribed and also added to
the text pipeline.
Step 3 - Data analysis
All the data is now analyzed using native natural language processing (NLP),
semantic clustering, and aspect-based sentiment analysis.
Step 4 - Data visualization
All the intelligence derived from the real-time sentiment analysis is now
showcased on a reporting dashboard in the form of statistics, graphs, and other
visual elements. From this sentiment analysis dashboard you can also set alerts
for brand mentions and keywords in live feeds.
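The four steps can be sketched end to end in a few lines. This is a toy illustration only: it swaps the trained ML models described above for a tiny hand-written word lexicon, replaces the live platform APIs with an in-memory list, and prints a summary instead of drawing a dashboard. The lexicon, the brand name "Acme", the sample messages, and all function names are assumptions made for this example.

```python
import re

# Hypothetical, tiny sentiment lexicon; a real system would use trained models.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "broken"}


def clean(text):
    """Step 2: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score(tokens):
    """Step 3: number of positive words minus number of negative words."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def label(value):
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def analyze_feed(messages, brand):
    """Step 4: tally sentiment labels and collect alerts for negative brand mentions."""
    summary = {"positive": 0, "negative": 0, "neutral": 0}
    alerts = []
    for message in messages:  # Step 1 stand-in: a list instead of a live API
        tokens = clean(message)
        sentiment = label(score(tokens))
        summary[sentiment] += 1
        if sentiment == "negative" and brand.lower() in tokens:
            alerts.append(message)
    return summary, alerts


# Made-up messages standing in for a live social feed.
feed = [
    "I love the new Acme phone, great camera!",
    "Acme support is terrible and my screen is broken",
    "Just arrived at the airport",
]
summary, alerts = analyze_feed(feed, brand="Acme")
print(summary)
print(alerts)
```

In a real deployment each message would be scored as it arrives, and the summary and alerts would feed the reporting dashboard rather than standard output.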
The Most Important Features Of A Real-Time Sentiment Analysis Platform
* Multiplatform
* Multimedia
* Multilingual
* Web scraping
* Alerts
* Reporting
Stock Market Prediction
1. Stock market prediction and analysis are among the most difficult tasks to
carry out.
2. There are numerous causes for this, including market volatility and a variety
of other dependent and independent variables that influence the value of a
certain stock in the market.
3. These variables make it extremely difficult for any stock market expert to
anticipate the rise and fall of the market with great precision.
4. With the introduction of machine learning and its strong algorithms, the most
recent market research and stock market prediction work has begun to
include such approaches in analyzing stock market data.
5. Machine learning algorithms are widely used by many organizations
in stock market prediction.
6. This article will walk through a simple implementation of analyzing and
forecasting the stock prices of a popular worldwide online retail store in
Python using various machine learning algorithms.
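As a rough idea of what the simplest such forecast can look like, the sketch below fits a straight-line trend to a short series of closing prices by ordinary least squares and extrapolates one step ahead. It is not the implementation the article refers to; the prices are made up for illustration, and the function name fit_line is an assumption. Real stock data is far noisier than this series.

```python
def fit_line(ys):
    """Ordinary least-squares fit of y = slope * x + intercept for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up daily closing prices, used only to illustrate the arithmetic.
closing_prices = [100.0, 102.0, 104.0, 106.0, 108.0]

slope, intercept = fit_line(closing_prices)

# Extrapolate the fitted trend to the next day (x = 5).
forecast = slope * len(closing_prices) + intercept
print(slope, intercept, forecast)
```

A straight-line fit captures only the overall trend, so it ignores the volatility and the many dependent and independent variables noted above.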