Predictive Analysis Assignment Answers
Predictive Analysis uses statistical techniques and machine learning to forecast future outcomes
based on historical data. The goal is to predict future events and trends.
3. Data Preprocessing: Clean and prepare the data for analysis (e.g., handling missing values).
4. Feature Engineering: Select or create important features that improve model accuracy.
7. Evaluation: Assess the model's performance using metrics like accuracy, precision, recall, etc.
9. Monitoring: Continuously monitor the model for accuracy and update it as needed.
3. Clustering: Grouping data into clusters based on similarity (e.g., customer segmentation).
4. Time Series Forecasting: Predicting future values in a sequence (e.g., stock prices).
- Descriptive Analysis: Focuses on what has happened in the past using historical data (e.g., sales
reports).
engines).
- Predictive Analysis: Forecasts future outcomes based on past data (e.g., demand prediction).
Feature selection is the process of identifying the most important variables in a dataset that
6. How can imbalanced datasets affect predictive analysis, and what techniques can be used to
Imbalanced datasets occur when one class significantly outnumbers the other (e.g., in fraud detection).
- Synthetic Data Generation (SMOTE): Create synthetic samples for the minority class.
- Use Different Metrics: Instead of accuracy, use metrics like precision, recall, or the F1-score.
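The two techniques above can be illustrated with a minimal sketch in plain Python. The labels, the sample points, and the helper names (classification_metrics, smote_like_sample) are all hypothetical and chosen only for illustration. The oversampling helper is a simplified stand-in for SMOTE: it interpolates toward the single nearest minority neighbour, whereas SMOTE picks among k nearest neighbours.

```python
import random


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def smote_like_sample(minority, rng):
    """Create one synthetic point between a minority sample and its nearest
    minority neighbour (simplified: real SMOTE uses k nearest neighbours)."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    neighbour = min(
        others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p))
    )
    gap = rng.random()  # position along the segment base -> neighbour
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


# Hypothetical labels: 95 legitimate (0) and 5 fraudulent (1) transactions.
y_true = [0] * 95 + [1] * 5
# A model that always predicts the majority class.
y_pred = [0] * 100
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# Hypothetical minority-class feature vectors.
minority = [(1.0, 2.0), (1.5, 1.8), (3.0, 3.2)]
synthetic = smote_like_sample(minority, random.Random(0))
print(synthetic)
```

In this made-up example the always-majority model reaches 95% accuracy while its recall and F1-score are both zero, which is why accuracy alone can be misleading on imbalanced data.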
7. In finance, it is of interest to look at the relationship between Y (a stock's average return) and X
(the overall market return). The slope coefficient computed by linear regression is called the stock's beta.
Given:
X: 10 12 8 15 9 11 8 10 13 11
Y: 11 15 3 18 10 12 6 7 18 13
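Linear regression can be used to compute the stock's beta, which represents the sensitivity of the
stock's returns to market returns. Beta is the slope of the least-squares regression line fitted to the
data: beta = Sxy / Sxx, where Sxy = sum((x - mean(X)) * (y - mean(Y))) and Sxx = sum((x - mean(X))^2).
For the data above, mean(X) = 10.7, mean(Y) = 11.3, Sxy = 91.9 and Sxx = 44.1, so
beta = 91.9 / 44.1, or approximately 2.08.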
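The slope calculation can be checked with a short script. This is a minimal sketch in plain Python using the X and Y values given in the question. The function name ols_slope_intercept and the variable names market and stock are illustrative choices, not part of the assignment.

```python
def ols_slope_intercept(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Data from the question: X = market return, Y = stock return.
market = [10, 12, 8, 15, 9, 11, 8, 10, 13, 11]
stock = [11, 15, 3, 18, 10, 12, 6, 7, 18, 13]

beta, alpha = ols_slope_intercept(market, stock)
print(f"beta = {beta:.4f}, intercept = {alpha:.4f}")
# prints: beta = 2.0839, intercept = -10.9977
```

A beta of about 2.08 means that, in this sample, the stock's return moved by roughly 2.08 units for each unit change in the market return.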
Y: 25 30 11 22 27 19
- b1: The coefficient of X1, which indicates the change in Y for a unit change in X1 when X2 is held constant.
- b2: The coefficient of X2, which indicates the change in Y for a unit change in X2 when X1 is held constant.
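How such coefficients might be estimated can be sketched by solving the least-squares normal equations for a model of the form Y = b0 + b1*X1 + b2*X2. The predictor values below are made up for illustration and are not the assignment's data. The function name fit_two_predictor_regression is likewise hypothetical. The sketch assumes the predictors are not perfectly collinear.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    # Build the augmented 3x4 system [X'X | X'y].
    system = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes X'X is invertible (predictors are not perfectly collinear).
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(system[r][col]))
        system[col], system[pivot] = system[pivot], system[col]
        pivot_value = system[col][col]
        system[col] = [v / pivot_value for v in system[col]]
        for r in range(3):
            if r != col:
                factor = system[r][col]
                system[r] = [v - factor * p
                             for v, p in zip(system[r], system[col])]
    return tuple(row[3] for row in system)


# Made-up predictors (NOT the assignment's data).
# y is constructed to follow 1 + 2*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictor_regression(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the made-up y values were generated from known coefficients, the fit should recover values close to b0 = 1, b1 = 2 and b2 = 3, which gives a quick sanity check on the solver.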
The CART (Classification and Regression Trees) algorithm builds a decision tree by splitting the
dataset into smaller subsets based on feature values. Each split is chosen to minimize the Gini impurity of the resulting subsets.
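A single CART-style split can be sketched as follows, assuming a classification task, one numeric feature, and Gini impurity as the split criterion. The data (ages, bought) and the helper names (gini, best_threshold) are hypothetical. A full tree would apply the same search recursively to each resulting subset and across all features.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split value <= threshold."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        # Impurity of the split, weighted by the size of each side.
        score = (len(left) * gini(left)
                 + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: customer age and whether they bought a product.
ages = [22, 25, 30, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_threshold(ages, bought)
print(threshold, impurity)  # prints: 32.5 0.0
```

In this toy example the split at 32.5 separates the two classes perfectly, so the weighted Gini impurity is zero.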