Abhijitya Midsem
Layman Explanation:
Data is all around us! It can come from different sources like social media, smart devices, businesses,
and even nature. It can be simple numbers or complex videos and images.
Examples:
• Google Maps gathers live traffic data from millions of users to provide accurate route
suggestions.
2. Classification of Data
• Velocity: The speed at which data is generated (e.g., live stock market prices).
• Veracity: Accuracy and trustworthiness (e.g., news sources may have fake news).
Big Data platforms manage and process massive data that normal systems cannot handle.
We moved from manual calculations to AI-driven analytics for handling large-scale data.
8. Analysis vs Reporting
1. Regression Modeling
Regression is used to find the relationship between variables. It helps in making predictions based on
historical data.
Real-time Example:
Imagine an e-commerce company wants to predict the sales of a product based on its price and
advertising budget. By analyzing past data, regression modeling can determine whether increasing
the ad budget leads to higher sales and by how much.
Another Example:
A bank uses regression to predict a customer’s loan repayment probability based on their income,
credit score, and previous repayment history.
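The ad-budget example can be sketched in a few lines of Python. This is only a minimal illustration: the numbers are made up, only one input (ad budget) is used to keep it short, and `fit_line` is a hypothetical helper name, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past data: ad budget (in thousands) and units sold.
ad_budget = [1, 2, 3, 4, 5]
units_sold = [12, 14, 16, 18, 20]

slope, intercept = fit_line(ad_budget, units_sold)
predicted_sales = slope * 6 + intercept  # forecast for a budget of 6

print(slope, intercept, predicted_sales)  # 2.0 10.0 22.0
```

Here the slope says that, in this toy data, each extra unit of ad budget goes with 2 more units sold.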
2. Multivariate Analysis
Multivariate analysis studies multiple factors affecting an outcome. It helps in understanding the
combined effect of different variables.
Real-time Example:
A restaurant chain wants to find out why some of its outlets perform better than others. They
analyze data on location, menu pricing, customer reviews, and weather conditions to understand
what factors contribute to higher sales.
Another Example:
A hospital studies how age, diet, physical activity, and genetics together influence a patient’s risk of
heart disease.
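A common first step in multivariate analysis is to check how strongly each factor moves together with the outcome. The sketch below does this for the restaurant example using Pearson correlation. The outlet figures and the names `pearson` and `outlet_factors` are assumptions for illustration, and pairwise correlation is only a first look: it does not by itself capture the combined effect of the factors.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data for five outlets.
outlet_factors = {
    "avg_menu_price": [10, 12, 15, 18, 20],
    "review_score": [3.0, 3.5, 4.0, 4.5, 5.0],
    "rainy_days": [5, 2, 8, 3, 6],
}
sales = [200, 240, 300, 330, 400]

# Correlation of each factor with sales (values near +1 or -1 mean a strong link).
correlations = {name: pearson(values, sales)
                for name, values in outlet_factors.items()}

for name, r in correlations.items():
    print(f"{name}: {r:.2f}")
```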
3. Bayesian Modeling
Bayesian modeling uses probabilities to update predictions when new data becomes available.
Real-time Example:
A weather forecasting system predicts rain probability based on temperature, humidity, and wind
speed. If a sudden drop in pressure occurs, the model updates its probability and predicts rain with
higher certainty.
Another Example:
A medical AI system predicts the likelihood of a patient having cancer based on test results and
symptoms. If new symptoms appear, the model adjusts its probability estimate accordingly.
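The updating step in the weather example can be written with Bayes' rule. This is a minimal sketch: the probabilities are invented for illustration and `bayes_update` is a hypothetical helper name.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1 - prior)
    return numerator / evidence

# Assumed numbers: 30% chance of rain before the new reading.
prior_rain = 0.30
# Assumed: a pressure drop is seen 80% of the time before rain,
# and 20% of the time when no rain follows.
posterior_rain = bayes_update(prior_rain, 0.8, 0.2)

print(round(posterior_rain, 2))  # about 0.63, higher than the prior of 0.30
```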
4. Classification
Real-time Example:
A banking system detects fraudulent transactions by classifying them as "genuine" or "fraud" based
on transaction amount, location, and past patterns.
Another Example:
An email system automatically filters spam based on patterns learned from past emails. For example,
if an email contains too many promotional words, it is classified as spam.
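The spam example can be sketched as a very simple rule-based classifier. The word list, the threshold, and the name `classify_email` are assumptions made for illustration. Real spam filters learn such patterns from labeled emails instead of using a fixed hand-written rule.

```python
# Hypothetical list of promotional words.
PROMO_WORDS = {"free", "offer", "discount", "winner", "sale"}

def classify_email(text, threshold=3):
    """Label an email 'spam' if it has at least `threshold` promotional words."""
    words = text.lower().split()
    promo_count = sum(1 for w in words if w.strip(".,!?:;") in PROMO_WORDS)
    return "spam" if promo_count >= threshold else "not spam"

print(classify_email("Free offer! Big discount sale today"))  # spam
print(classify_email("Meeting moved to 3 pm, agenda attached"))  # not spam
```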
5. Time Series Analysis
Time series analysis examines data collected over time to identify patterns and make predictions.
Real-time Example:
A stock market analyst uses time series analysis to predict stock prices based on historical trends and
market conditions.
Another Example:
An electricity company forecasts power demand by analyzing past consumption trends. If the data
shows increased power usage during summer, the company prepares for higher production.
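One of the simplest time series forecasts is a moving average: the next value is estimated as the average of the last few observations. The sketch below applies it to made-up monthly power demand. `moving_average_forecast` is a hypothetical helper name, and real forecasting would also account for trend and seasonality.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window

# Hypothetical monthly power demand (in megawatt-hours).
monthly_demand = [100, 110, 120, 130, 140, 150]

forecast = moving_average_forecast(monthly_demand, window=3)
print(forecast)  # 140.0, the average of 130, 140 and 150
```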
6. Rule Induction
Real-time Example:
An online grocery store finds that customers who buy milk also tend to buy eggs. The system
automatically suggests eggs to customers purchasing milk, increasing sales.
Another Example:
A university analyzes student study habits and finds that students who revise for more than 6 hours
a week tend to score above 80% in exams.
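Rules such as "milk → eggs" are usually judged by two numbers: support (how often the items appear together) and confidence (how often the rule holds when the first item is bought). The sketch below computes both on a handful of invented shopping baskets. The function names are illustrative only.

```python
# Hypothetical shopping baskets.
transactions = [
    {"milk", "eggs", "bread"},
    {"milk", "eggs"},
    {"milk", "bread"},
    {"eggs", "butter"},
    {"milk", "eggs", "butter"},
]

def count_containing(itemset):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in transactions if itemset <= basket)

def support(itemset):
    """Fraction of all baskets that contain the itemset."""
    return count_containing(itemset) / len(transactions)

def confidence(antecedent, consequent):
    """Of the baskets with `antecedent`, the fraction that also have `consequent`."""
    return count_containing(antecedent | consequent) / count_containing(antecedent)

print(support({"milk", "eggs"}))       # 0.6  (3 of 5 baskets)
print(confidence({"milk"}, {"eggs"}))  # 0.75 (3 of the 4 milk baskets)
```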
7. Neural Networks
Neural networks are loosely modeled on the human brain: layers of connected units learn to
recognize patterns and make decisions.
Real-time Example:
A self-driving car uses neural networks to recognize traffic signs, pedestrians, and road lanes,
allowing it to drive safely without human intervention.
Another Example:
A smartphone’s face recognition system scans the user's face and identifies unique features to
unlock the device.
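The basic building block of a neural network is a single artificial neuron. The sketch below trains one neuron (a perceptron) on a toy pattern: output 1 only when both inputs are 1. This is far simpler than the networks used for traffic signs or faces, and the names `train_perceptron` and `predict` are assumptions for illustration.

```python
def step(z):
    """Fire (1) when the weighted sum reaches zero, otherwise stay off (0)."""
    return 1 if z >= 0 else 0

def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_perceptron(samples, epochs=10):
    """Perceptron learning rule: nudge weights whenever a prediction is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + error * x for w, x in zip(weights, inputs)]
            bias += error
    return weights, bias

# Toy pattern: the output is 1 only when both inputs are 1.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)

for inputs, target in and_samples:
    print(inputs, predict(weights, bias, inputs), "expected", target)
```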
8. Principal Component Analysis (PCA)
Real-time Example:
A clothing brand wants to understand customer preferences. Instead of analyzing hundreds of
customer behavior factors, PCA condenses them into about 5 combined factors (principal components)
that capture most of the variation in purchases.
Another Example:
A medical research team studies genetic data to find what drives a disease. PCA reduces the
dataset from 100,000 genes to about 10 principal components, which point to the groups of genes
that matter most.
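PCA works by finding new combined directions in the data that hold most of the variation. The sketch below does this for just two factors, where the math has a simple closed form. The data points and the name `pca_2d` are assumptions for illustration. Real datasets have many more factors and are normally handled with a numerical library.

```python
from math import sqrt, atan2, cos, sin

def pca_2d(points):
    """Return (direction, scores, explained) for the first principal component."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2
    spread = sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    largest, smallest = half_trace + spread, half_trace - spread

    # Direction of greatest variance (the first principal component).
    angle = 0.5 * atan2(2 * sxy, sxx - syy)
    direction = (cos(angle), sin(angle))

    # Each point reduced from two numbers to one score along that direction.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    explained = largest / (largest + smallest)
    return direction, scores, explained

# Hypothetical data: two customer factors that move together.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
direction, scores, explained = pca_2d(points)
print(direction, explained)
```

Because the two factors rise together, almost all of the variation is kept by a single combined factor.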
9. Fuzzy Logic and Decision Trees
Fuzzy logic handles decisions where categories are not clear-cut, while decision trees reach a
decision through a sequence of simple questions.
Real-time Example:
A washing machine decides the best wash cycle based on how dirty clothes are. Instead of just
"clean" or "dirty," it uses fuzzy logic to determine levels like "slightly dirty," "moderately dirty," and
"heavily dirty" and adjusts the wash cycle accordingly.
Another Example:
A customer service chatbot uses decision trees to answer questions. If a customer asks about
refund policies, the bot follows a predefined tree structure to guide them to the correct solution.
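The washing machine example can be sketched with fuzzy membership functions. A dirt reading belongs partly to "slightly", "moderately", and "heavily" dirty at the same time, and the wash time is a blend of the three. The 0 to 100 dirt scale, the cycle times, and the function names are all assumptions for illustration.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def dirt_memberships(dirt_level):
    """Degree (0 to 1) to which a 0-100 dirt reading fits each fuzzy level."""
    return {
        "slightly dirty": triangular(dirt_level, -50, 0, 50),
        "moderately dirty": triangular(dirt_level, 0, 50, 100),
        "heavily dirty": triangular(dirt_level, 50, 100, 150),
    }

# Assumed wash time (minutes) for each level.
CYCLE_MINUTES = {"slightly dirty": 20, "moderately dirty": 40, "heavily dirty": 60}

def wash_minutes(dirt_level):
    """Blend the cycle times, weighted by how strongly each level applies."""
    memberships = dirt_memberships(dirt_level)
    total = sum(memberships.values())
    if total == 0:
        raise ValueError("dirt_level must be between 0 and 100")
    weighted = sum(memberships[name] * CYCLE_MINUTES[name] for name in memberships)
    return weighted / total

print(dirt_memberships(25))  # half "slightly", half "moderately"
print(wash_minutes(25))      # 30.0
```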
10. Stochastic Search
Real-time Example:
A chess AI samples thousands of possible move sequences at random before choosing the most
promising move against an opponent.
Another Example:
A delivery company uses stochastic search to find the fastest delivery route based on live traffic
data. Instead of following a fixed route, it constantly searches for better options in real time.
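The delivery route example can be sketched with one simple stochastic search method, stochastic hill climbing: randomly swap two stops and keep the change only if the route gets shorter. The stop coordinates and function names are assumptions for illustration. This finds a good route, not necessarily the best one, and it uses straight-line distance in place of live traffic data.

```python
import random
from math import dist

def route_length(route, stops):
    """Total straight-line distance when visiting stops in the given order."""
    return sum(dist(stops[a], stops[b]) for a, b in zip(route, route[1:]))

def stochastic_route_search(stops, iterations=2000, seed=0):
    """Randomly swap two stops; keep the swap only if the route gets shorter."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best = list(range(len(stops)))
    best_len = route_length(best, stops)
    for _ in range(iterations):
        i, j = rng.sample(range(len(stops)), 2)
        candidate = best[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_len = route_length(candidate, stops)
        if candidate_len < best_len:
            best, best_len = candidate, candidate_len
    return best, best_len

# Hypothetical delivery stops as (x, y) map coordinates.
stops = [(0, 0), (5, 5), (1, 0), (6, 5), (2, 1), (5, 6)]

start_len = route_length(list(range(len(stops))), stops)
best_route, best_len = stochastic_route_search(stops)
print(start_len, best_len, best_route)
```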