Lec 1 Introduction To Big Data Analytics
expandedramblings.com
Source: https://enablecomp.com/
Competitor Analysis
Online traffic to websites and related social media
Market Analysis
Trends and market segment analysis
Productivity Enhancement
Analyze employee tracking data
Cost Cutting
Reduce energy bills, optimize routes, predict
demand, improve process efficiency and automation [6]
Targeted Marketing
Analyze purchasing history and target the right people
for a product
Improved Customer Relations
Analyze customer feedback and make adjustments
[6] Forbes (01/08/2016), "Big Data Analytics' Potential to Revolutionize Manufacturing Is Within Reach"
Big Data Analytics: Introduction 13 / 68
Industries Benefiting from Big Data Analytics
Retail: Advertising, Targeted marketing,
recommendation system, customer loyalty, inventory
management, demand prediction
Banking and Financial: Customer loyalty and churn,
fraud detection, risk assessment
Brands: 66% of brands use data analytics to plan product
and service launches and their timing
Logistics and Transportation: Fleet management,
maintenance needs, driver risk assessment, real-time
tracking
Health Care: Efficiency in healthcare operations,
predictive analytics, outbreak prediction, immunization
strategy
Government & Utility Companies: Surveys & census,
development planning, health, education, energy
supply & demand management
[7] International Data Corporation (IDC) - Big data analytics company
Big Data Analytics - Market
1 Volume
2 Velocity
3 Variety
4 Veracity
5 Value
Source: https://openautomationsoftware.com/
Relational Data
Text Data
Multimedia Data
Time Series Data
Sequential Data
Streams
Graphs and Homogeneous Networks
Graphs and Heterogeneous Networks
Bio-sequences
Discretized music and audio data
Text
Data Collection
What data is needed and available?
Identify sources of data and relevance of data
Are there enough instances? Are all relevant features there?
Identify datasets, acquire and retrieve
Sources: RDBMS, .txt, web services (soup), RSS,
IMDAD ULLAH KHAN Big Data Analytics: Introduction 59 / 68
The Analytics Process
Data Preparation
Make the data ready for analytics
Exploratory Data Analysis: Describe, Summarize, Visualize
Pre-process: Improve data quality, clean data,
transformation, standardization, normalization
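The standardization and normalization steps named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the sample `ages` values are hypothetical and not taken from the lecture. It assumes z-score standardization (mean 0, unit population standard deviation) and min-max normalization (rescaling to [0, 1]), which are two common readings of these terms.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical; there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Min-max normalization: rescale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature; map everything to 0.0 by convention (an assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column, for illustration only.
ages = [20, 30, 40, 50, 60]
print(standardize(ages))
print(min_max_normalize(ages))
```

Which of the two to use depends on the analysis that follows: standardization is often preferred when features have different units, while min-max scaling keeps values in a fixed range.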
Data Analysis
Apply analytical techniques
Supervised and unsupervised learning, Graph analytics
to discover patterns in data
to find relationships in data
to (automatically) extract knowledge from data
to summarize data in ways that are understandable and useful
Predictive Analytics
Predict the value of an attribute based on the values of other attributes
Predicted attribute: Target/dependent/response variable
Attributes used to predict: Predictor/explanatory/independent variables
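To make the predictor/target distinction concrete, here is a small sketch that fits a straight line by ordinary least squares with a single predictor. Least squares is one possible predictive technique chosen here for illustration; the lecture does not prescribe it. The function name `fit_line` and the sample data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: x is the predictor (independent) variable,
# y is the target (dependent/response) variable.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

slope, intercept = fit_line(x, y)
print(slope, intercept)        # 2.0 1.0
print(slope * 5 + intercept)   # predicted target for x = 5: 11.0
```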
Data Analytics Tasks
Clustering: Partition data into meaningful groups
Outlier Detection: Detect points that are unusual (unlike others)
Classification: Assign (predefined) class labels to each object
Regression: Find a function that models the (continuous) target variable
Association Analysis: Find patterns in data that
describe relationships
Recommendation: Predict an unknown rating
based on known ratings
Community Detection: Find (overlapping) communities
of nodes in networks
Centrality and Important Nodes: Find important
(or evaluate importance of) nodes in networks
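As one concrete instance of the tasks listed above, outlier detection can be sketched with a simple z-score rule: flag points that lie more than a chosen number of standard deviations from the mean. This is only one of many possible approaches and is an assumption made for illustration; the function name `find_outliers`, the threshold of 2.0, and the sample readings are all hypothetical.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # No spread in the data, so nothing can be unusual under this rule.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one unusual point.
readings = [10, 11, 9, 10, 12, 10, 50]
print(find_outliers(readings))  # [50]
```

The rule is sensitive to the outliers themselves, since they inflate the mean and standard deviation; more robust variants use the median instead.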
Machine Learning for Data Analytics
Supervised Learning
Clustering
Outlier detection
Modeling the density of data
Dimensionality reduction