CPEN106 Machine Learning and Predictive Analytics
By Joven R. Ramos
CPEN106: Elective 2 - Big Data Analytics
COURSE OUTCOMES
CO2: Analyze different machine learning models and big data
technologies to solve real-world data problems. (Bloom’s Level:
Analyze – Level 4)
Concept Overview
GRADIENT BOOSTING MACHINES (GBM): Boosts weak models iteratively to reduce errors. Example: Sales forecasting in retail businesses.
XGBOOST REGRESSION: Optimized version of GBM for faster and better performance. Example: Used in financial analytics for predicting stock prices.
ARTIFICIAL NEURAL NETWORKS (ANN) FOR REGRESSION: Deep learning model that captures complex patterns. Example: Predicting customer lifetime value in e-commerce.
MACHINE LEARNING 1:
SUPERVISED LEARNING ALGORITHMS
Regression Algorithms (For Predicting Continuous Values)
Example Code: Predicting House Prices
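A minimal sketch of such a house-price regression in Python using scikit-learn; the features (size in square meters, bedrooms, age) and the toy prices are illustrative assumptions, not data from the slide.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Toy dataset: [size_sqm, bedrooms, age_years] -> price in millions (illustrative)
X = np.array([[50, 1, 10], [80, 2, 5], [120, 3, 2], [150, 4, 8],
              [200, 5, 1], [65, 2, 15], [90, 3, 20], [175, 4, 3]])
y = np.array([1.5, 2.4, 3.9, 4.5, 6.2, 1.9, 2.6, 5.4])

# Hold out part of the data to check how well the model generalizes
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)          # learn one weight per feature

y_pred = model.predict(X_test)
print("RMSE:", mean_squared_error(y_test, y_pred) ** 0.5)
print("R^2 :", r2_score(y_test, y_pred))

The same train/fit/predict/evaluate pattern applies if LinearRegression is swapped for GBM, XGBoost, or an ANN regressor.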
Classification Algorithms (For Predicting Categorical Values)
LOGISTIC REGRESSION: A statistical model for binary classification problems. Example: Predicting whether a loan will be approved (Yes/No).
K-NEAREST NEIGHBORS (KNN): Classifies data points based on their closest neighbors. Example: Recommender systems (e.g., Netflix suggesting movies).
GRADIENT BOOSTING MACHINES (GBM): Iteratively improves weak classifiers. Example: Customer churn prediction in telecom.
XGBOOST CLASSIFIER: A highly efficient gradient boosting algorithm. Example: Credit scoring models in finance.
NAÏVE BAYES CLASSIFIER: Uses probability for classification. Example: Spam email detection.
ARTIFICIAL NEURAL NETWORKS (ANN) FOR CLASSIFICATION: Deep learning model for complex patterns. Example: Image recognition (e.g., Google Photos categorizing images).
CONVOLUTIONAL NEURAL NETWORKS (CNN): Extracts spatial features for classification tasks. Example: Detecting pneumonia from X-ray images.
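As one concrete instance from the table above, a minimal sketch in Python of logistic regression for a Yes/No loan-approval decision; the features (monthly income, credit score) and the toy data are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Toy dataset: [monthly_income, credit_score] -> approved (1) / rejected (0)
X = np.array([[20000, 580], [45000, 700], [30000, 640], [80000, 760],
              [25000, 600], [60000, 720], [35000, 610], [90000, 780]])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Scale the features first so the very different ranges are comparable
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_train, y_train)

print("Predictions (1 = approved):", clf.predict(X_test))
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))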
MACHINE LEARNING 2:
DEEP LEARNING FUNDAMENTALS
Deep learning is a subset of machine learning that uses neural networks with
multiple layers to model complex patterns in data.
Neurons (similar to brain cells) receive inputs, process them, and pass
outputs.
Activation Functions (ReLU, Sigmoid, Softmax) determine neuron firing.
Loss Functions measure the model’s error.
Backpropagation & Optimization (Gradient Descent) adjust weights to
reduce error.
Hidden Layers: Process and extract patterns using weights and activation
functions (ReLU, Sigmoid).
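A minimal sketch in Keras tying these pieces together: Dense layers of neurons, ReLU and Softmax activations, a cross-entropy loss, and an optimizer that backpropagates gradients to adjust the weights. The data shapes and layer sizes are illustrative placeholders, not values from the slides.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Random placeholder data: 100 samples, 20 features, 3 classes
X = np.random.rand(100, 20).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, size=100), num_classes=3)

model = models.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(32, activation="relu"),    # hidden layer: weights + ReLU activation
    layers.Dense(3, activation="softmax"),  # output layer: class probabilities
])

# The loss measures the error; the optimizer backpropagates gradients (gradient descent)
# to adjust the weights and reduce that error.
model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=16, verbose=0)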
Purpose:
CNNs are designed for image recognition by extracting spatial features from
images.
✅ Use Cases:
Face recognition (e.g., unlocking smartphones).
Medical imaging (e.g., detecting pneumonia from X-rays).
Example Code: CNN for Image Classification (MNIST Digits)
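A minimal sketch of such a CNN on MNIST in Keras; the layer sizes and epoch count are illustrative choices rather than the slide's exact code.

import tensorflow as tf
from tensorflow.keras import layers, models

# Load and normalize the MNIST digit images (28x28 grayscale)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),  # extract spatial features
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),        # 10 digit classes
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, batch_size=128, validation_split=0.1)
print("Test accuracy:", model.evaluate(x_test, y_test, verbose=0)[1])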
Implementing Neural Networks
Purpose:
LSTMs are ideal for processing sequential data like time-series, speech, and
text.
✅ Use Cases:
Stock price prediction.
Speech recognition (Google Assistant, Siri).
Example Code: LSTM for Stock Price Prediction
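A minimal sketch of such an LSTM in Keras; a synthetic price series stands in for real market data, and the 30-day window and layer sizes are illustrative assumptions.

import numpy as np
from tensorflow.keras import layers, models
from sklearn.preprocessing import MinMaxScaler

# Synthetic "closing price" series (placeholder for real stock data)
prices = np.sin(np.linspace(0, 20, 400)) * 10 + 100 + np.random.normal(0, 0.5, 400)
scaler = MinMaxScaler()
scaled = scaler.fit_transform(prices.reshape(-1, 1))

# Turn the series into supervised samples: 30 past days -> next day
window = 30
X = np.array([scaled[i:i + window] for i in range(len(scaled) - window)])
y = scaled[window:]

model = models.Sequential([
    layers.Input(shape=(window, 1)),
    layers.LSTM(50),   # remembers patterns across the 30-day window
    layers.Dense(1),   # predicted next-day price (scaled)
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)

# Predict the day after the last window and convert back to price units
next_scaled = model.predict(scaled[-window:].reshape(1, window, 1), verbose=0)
print("Predicted next price:", scaler.inverse_transform(next_scaled)[0, 0])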
CNN-LSTM models combine CNNs (for feature extraction) and LSTMs (for
sequence learning).
✅ Use Cases:
Video classification (e.g., detecting suspicious activities in security
footage).
Weather prediction (using images + time-series data).
Example Code: CNN-LSTM for Video Classification
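A minimal sketch of a CNN-LSTM in Keras; random tensors stand in for real video clips, and the frame count, image size, and class count are illustrative assumptions.

import numpy as np
from tensorflow.keras import layers, models

frames, height, width, channels, num_classes = 10, 64, 64, 3, 2

# Per-frame CNN feature extractor
cnn = models.Sequential([
    layers.Input(shape=(height, width, channels)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
])

# TimeDistributed applies the same CNN to every frame, then the LSTM learns
# how the per-frame features change over time.
model = models.Sequential([
    layers.Input(shape=(frames, height, width, channels)),
    layers.TimeDistributed(cnn),
    layers.LSTM(32),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Placeholder data: 8 random "videos" with binary labels
X = np.random.rand(8, frames, height, width, channels).astype("float32")
y = np.random.randint(0, num_classes, size=8)
model.fit(X, y, epochs=2, batch_size=4, verbose=0)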
PREDICTIVE ANALYTICS
Developing Predictive Models
✅ Use Cases:
Retail: Forecasting customer demand.
Finance: Predicting loan defaults.
» Forecasting Techniques
Forecasting is the process of using historical data to make future predictions. It
is widely used in business, finance, weather prediction, and environmental
monitoring.
This section explains three major time-series forecasting techniques:
» 1. Moving Averages (MA)
✅ Concept:
A moving average smooths a time series by averaging values over a sliding window of recent observations, filtering out short-term noise.
✅ Use Cases:
✔ Stock price trends (e.g., 50-day moving average in stock trading).
✔ Temperature smoothing for weather trends.
✔ Sales forecasting (e.g., moving average of monthly revenue).
Example: Moving Average in Python
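A minimal sketch using pandas; the synthetic monthly sales series is a placeholder for real data, and the moving average is drawn in red to match the interpretation below.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic monthly revenue with an upward trend plus noise (placeholder data)
months = pd.date_range("2022-01-01", periods=36, freq="MS")
sales = pd.Series(200 + np.arange(36) * 5 + np.random.normal(0, 20, 36), index=months)

# 3-month simple moving average: each point is the mean of the last 3 months
moving_avg = sales.rolling(window=3).mean()

plt.plot(sales, color="blue", label="Monthly sales")
plt.plot(moving_avg, color="red", label="3-month moving average")
plt.legend()
plt.title("Moving Average Smoothing")
plt.show()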
✅ Interpretation:
The red line smooths out random fluctuations, revealing the true trend.
Moving Averages work best for short-term forecasting but struggle with
seasonality.
» 2. ARIMA (Auto-Regressive Integrated Moving Average)
✅ Concept:
ARIMA is a powerful statistical model for forecasting time-series data by
considering:
Auto-regression (AR): Uses past values to predict future values.
Integration (I): Differencing is used to make the data stationary.
Moving Average (MA): Uses past forecast errors to improve predictions.
✅ Use Cases:
✔ Stock market forecasting.
✔ Weather forecasting (e.g., rainfall prediction).
✔ Demand forecasting for businesses (e.g., sales, inventory).
Example: ARIMA for Stock Price Prediction
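A minimal sketch using statsmodels; a synthetic price series replaces real stock data, and the (p, d, q) = (5, 1, 0) order is an illustrative, untuned choice.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic daily closing prices (random walk with drift, placeholder data)
prices = pd.Series(100 + np.cumsum(np.random.normal(0.1, 1.0, 250)))

# ARIMA(5, 1, 0): 5 autoregressive lags, first-order differencing, no MA terms
model = ARIMA(prices, order=(5, 1, 0))
fitted = model.fit()

# Forecast the next 10 days
print(fitted.forecast(steps=10))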
✅ Interpretation:
ARIMA models time-series trends and patterns to predict future values.
It works well for stationary time-series data (where statistical properties such as the mean and variance do not change over time).
» 3. LSTM (Long Short-Term Memory Networks) – Deep Learning for Time-
Series Forecasting
✅ Concept:
LSTM is a type of Recurrent Neural Network (RNN) designed to capture
long-term dependencies in time-series data.
It remembers past values over time, making it ideal for complex
forecasting problems.
✅ Use Cases:
✔ Weather forecasting (e.g., temperature, rainfall).
✔ Stock market trend prediction.
✔ Energy consumption forecasting.
Example: LSTM for Weather Forecasting
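A minimal sketch in Keras; synthetic daily temperatures stand in for real weather records, and the 14-day window is an illustrative assumption.

import numpy as np
from tensorflow.keras import layers, models

# Synthetic daily temperatures with a yearly cycle plus noise (placeholder data)
days = np.arange(730)
temps = 27 + 3 * np.sin(2 * np.pi * days / 365) + np.random.normal(0, 0.8, 730)

# Supervised samples: 14 past days -> next day's temperature
window = 14
X = np.array([temps[i:i + window] for i in range(len(temps) - window)])[..., np.newaxis]
y = temps[window:]

model = models.Sequential([
    layers.Input(shape=(window, 1)),
    layers.LSTM(32),   # captures dependencies across the 14-day window
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)

# Predict tomorrow's temperature from the last 14 days
print("Predicted next-day temperature:",
      model.predict(temps[-window:].reshape(1, window, 1), verbose=0)[0, 0])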
✅ Interpretation:
LSTM can capture long-term dependencies in weather data.
Works best when large datasets are available for training.
» Comparison of Forecasting Techniques
Moving Averages (MA): Best for short-term trend smoothing. Strengths: simple, fast, removes noise. Weaknesses: does not predict future values by itself.
ARIMA: Best for stationary series with clear trends and patterns. Strengths: models past values and past forecast errors statistically. Weaknesses: struggles when the data are not stationary.
LSTM: Best for complex, long-term dependencies. Strengths: captures long-term patterns in large datasets. Weaknesses: needs large datasets and longer training.
PERFORMANCE METRICS FOR EVALUATION
1. Regression Model Metrics. Used when the model predicts continuous numerical values, such as house prices or temperature forecasts.
Mean Squared Error (MSE): The average of the squared differences between predicted and actual values. Ideal value: Closer to 0.
Root Mean Squared Error (RMSE): The square root of MSE, interpretable in the same units as the output. Ideal value: Closer to 0.
R² Score (Coefficient of Determination): Measures how well the model explains the variance in the target variable. Ideal value: Closer to 1.
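As a quick illustration of these regression metrics (with made-up predictions, not values from the slides), scikit-learn computes them directly:

import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3.0, 2.5, 4.0, 5.5])   # actual values (illustrative)
y_pred = np.array([2.8, 2.7, 4.2, 5.0])   # model predictions (illustrative)

mse = mean_squared_error(y_true, y_pred)
print("MSE :", mse)
print("RMSE:", mse ** 0.5)                # same units as the target
print("R^2 :", r2_score(y_true, y_pred))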
2. Classification Model Metrics. Used when the model predicts categorical
labels, such as spam detection or disease diagnosis.
Precision: Measures how many predicted positives were actually correct. Ideal value: Closer to 1.
Recall (Sensitivity): Measures how many actual positives were correctly identified. Important for medical diagnosis. Ideal value: Closer to 1.
F1 Score: Harmonic mean of precision and recall. Best for imbalanced datasets. Ideal value: Closer to 1.
Confusion Matrix: Table showing TP, FP, FN, and TN for model predictions. Ideal value: N/A.
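A minimal sketch (with illustrative labels, not data from the slides) of computing these classification metrics and the confusion matrix with scikit-learn:

from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # actual labels (illustrative)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # predicted labels (illustrative)

print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 Score :", f1_score(y_true, y_pred))
print("Confusion matrix (rows: actual, columns: predicted):")
print(confusion_matrix(y_true, y_pred))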
3. Deep Learning Model Metrics. Deep learning models use similar metrics to other machine learning models, with some additional ones for complex architectures such as CNNs and LSTMs.