Predictive Analysis Assignment Answers

Uploaded by

Ganesh Shejule
Copyright
© All Rights Reserved

Predictive Analysis Assignment - Answers

1. What is Predictive Analysis? Explain the life cycle of Predictive Analysis.

Predictive Analysis uses statistical techniques and machine learning to forecast future outcomes, events, and trends based on historical data.

Life Cycle of Predictive Analysis:

1. Define Objective: Identify the problem to be solved or the outcome to be predicted.

2. Data Collection: Gather relevant data from various sources.

3. Data Preprocessing: Clean and prepare the data for analysis (handling missing values, normalizing data, etc.).

4. Feature Engineering: Select or create important features that improve model accuracy.

5. Model Selection: Choose an appropriate predictive model (e.g., regression, classification).

6. Model Training: Train the model using historical data.

7. Evaluation: Assess the model's performance using metrics like accuracy, precision, recall, etc.

8. Deployment: Deploy the model into production to make predictions.

9. Monitoring: Continuously monitor the model for accuracy and update it as needed.
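
The middle of the life cycle (preprocessing, training, evaluation) can be sketched in a few lines. This is only an illustrative sketch: the data, the variable names, and the choice of a one-variable least-squares model and mean absolute error are assumptions, not part of the assignment.

```python
# Hypothetical data: (advertising spend, sales); None marks a missing value.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (4.0, 8.2), (5.0, 9.8), (6.0, 12.1)]

# Step 3 (preprocessing): drop rows that contain a missing value.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Hold out the last row for evaluation; train on the rest.
train, test = clean[:-1], clean[-1:]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


# Step 6 (training): fit the model on the training rows.
intercept, slope = fit_line(train)

# Step 7 (evaluation): mean absolute error on the held-out rows.
mae = sum(abs(y - (intercept + slope * x)) for x, y in test) / len(test)
print(f"intercept={intercept:.3f} slope={slope:.3f} MAE={mae:.3f}")
```

In practice the held-out set would be larger and chosen at random; a single row is used here only to keep the sketch short.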

2. Explain different applications of Data Analysis.

Applications of data analysis include:

- Finance: Risk management, fraud detection, stock market prediction.

- Healthcare: Diagnosis predictions, treatment effectiveness, patient risk assessment.

- Marketing: Customer segmentation, sentiment analysis, sales forecasting.

- Manufacturing: Predictive maintenance, quality control.

- Retail: Demand forecasting, product recommendations.


3. Explain different types of Predictive Analysis.

1. Classification: Predicting discrete labels (e.g., spam or not spam).

2. Regression: Predicting continuous values (e.g., price prediction).

3. Clustering: Grouping data into clusters based on similarity (e.g., customer segmentation).

4. Time Series Forecasting: Predicting future values in a sequence (e.g., stock prices).

4. What is the difference between Descriptive, Prescriptive, and Predictive Analysis?

- Descriptive Analysis: Focuses on what has happened in the past using historical data (e.g., sales reports).

- Prescriptive Analysis: Suggests actions based on predictive analysis (e.g., recommendation engines).

- Predictive Analysis: Forecasts future outcomes based on past data (e.g., demand prediction).

5. Explain the concept of feature selection in predictive analysis.

Feature selection is the process of identifying the most important variables in a dataset that influence the predictive model. It helps in:

- Reducing model complexity.

- Improving accuracy by eliminating irrelevant features.

- Reducing overfitting by focusing on the most important predictors.
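
One simple way to do this is a filter method that ranks features by the absolute value of their Pearson correlation with the target. The sketch below assumes that approach; the dataset, the feature names, and the choice to keep the top two features are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical dataset: two informative features and one unrelated feature.
features = {
    "size": [50, 60, 80, 100, 120],
    "rooms": [2, 2, 3, 4, 4],
    "noise": [3, 1, 4, 1, 5],
}
target = [150, 180, 240, 310, 360]

# Score each feature by |correlation| with the target, highest first.
scores = {name: abs(pearson(col, target)) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Keep the two highest-scoring features (the cutoff is an arbitrary choice).
selected = ranked[:2]
print(selected)
```

Correlation only captures linear, one-feature-at-a-time relationships, so this is a starting point rather than a complete selection procedure.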

6. How can imbalanced datasets affect predictive analysis, and what techniques can be used to address this issue?

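Imbalanced datasets occur when one class significantly outnumbers the other (e.g., fraud detection, where fraud cases are rare). A model trained on such data tends to favor the majority class, so it can report high accuracy while missing most of the minority-class cases.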


Techniques to address imbalance:

- Resampling: Oversample the minority class or undersample the majority class.

- Synthetic Data Generation (SMOTE): Create synthetic samples for the minority class.

- Use Different Metrics: Instead of accuracy, use metrics like precision, recall, or the F1-score.
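
The sketch below illustrates two of these points under stated assumptions: hypothetical labels with 95 legitimate and 5 fraudulent cases, a naive model that always predicts the majority class, and plain random oversampling (duplicating minority rows, not SMOTE). The function names are illustrative.

```python
import random

# Hypothetical labels: 95 legitimate (0) and 5 fraudulent (1) transactions.
y_true = [0] * 95 + [1] * 5
# A naive model that always predicts the majority class.
y_pred = [0] * 100


def metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


accuracy, precision, recall, f1 = metrics(y_true, y_pred)
# Accuracy is 0.95 even though no fraud case is detected (recall is 0.0).
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")


def random_oversample(labels, seed=0):
    """Duplicate random minority (label 1) entries until both classes match in size."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(labels) if label == 1]
    majority = [i for i, label in enumerate(labels) if label == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra  # row indices of the balanced dataset


balanced_idx = random_oversample(y_true)
balanced_labels = [y_true[i] for i in balanced_idx]
print(balanced_labels.count(0), balanced_labels.count(1))
```

Oversampling should be applied to the training data only, so that duplicated rows do not leak into the evaluation set.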

7. In finance, it is of interest to look at the relationship between Y (a stock's average return) and X (the overall market return). The slope coefficient computed by linear regression is called the stock's beta.

Given:

X: 10 12 8 15 9 11 8 10 13 11

Y: 11 15 3 18 10 12 6 7 18 13

Linear regression can be used to compute the stock's beta, which represents the sensitivity of the stock's returns to market returns. The beta value will be the slope of the regression line fitted to the data.
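
As a worked check, the slope can be computed from the given data with the usual least-squares formula, beta = S_xy / S_xx, where S_xy is the sum of products of deviations from the means and S_xx is the sum of squared deviations of X. The variable names below are illustrative.

```python
# Data from the question: X is the market return, Y is the stock's return.
market = [10, 12, 8, 15, 9, 11, 8, 10, 13, 11]
stock = [11, 15, 3, 18, 10, 12, 6, 7, 18, 13]

n = len(market)
mean_x = sum(market) / n   # 10.7
mean_y = sum(stock) / n    # 11.3

# Sums of cross-products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(market, stock))  # about 91.9
s_xx = sum((x - mean_x) ** 2 for x in market)                            # about 44.1

beta = s_xy / s_xx                 # slope of the regression line
alpha = mean_y - beta * mean_x     # intercept of the regression line

print(f"beta = {beta:.4f}, intercept = {alpha:.4f}")
# prints: beta = 2.0839, intercept = -10.9977
```

A beta of about 2.08 means that, for this sample, the stock's return moves roughly 2.08 units for each unit change in the market return.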

8. Given the following set of data:

Y: 25 30 11 22 27 19

X1: 3.5 6.7 1.5 0.3 4.6 2.0

X2: 5.0 4.2 8.5 1.4 3.6 1.3

a. Calculate the multiple regression equation.

The multiple regression equation is: Y = b0 + b1 * X1 + b2 * X2

b. Values of intercept b0, b1, and b2:

- b0: Intercept, the value of Y when X1 and X2 are zero.

- b1: The coefficient of X1, which indicates the change in Y for a unit change in X1.

- b2: The coefficient of X2, which indicates the change in Y for a unit change in X2.

c. Explain the significance of b0, b1, and b2:

- b0: Base value of Y without the effect of X1 and X2.


- b1: Impact of X1 on Y, holding X2 constant.

- b2: Impact of X2 on Y, holding X1 constant.

d. Predict Y when X1 = 3.0 and X2 = 2.7:

Substitute these values into the fitted regression equation: Y = b0 + b1 * 3.0 + b2 * 2.7.
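
The coefficients can be obtained from the given data by solving the least-squares normal equations. The sketch below uses the closed-form solution for two predictors on mean-centered data; the helper names are illustrative, and the rounded values in the note after the code are approximate.

```python
# Data from the question.
y = [25, 30, 11, 22, 27, 19]
x1 = [3.5, 6.7, 1.5, 0.3, 4.6, 2.0]
x2 = [5.0, 4.2, 8.5, 1.4, 3.6, 1.3]


def mean(values):
    return sum(values) / len(values)


def centered_sum(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u, mean_v = mean(u), mean(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)

# Solve the 2x2 normal equations with Cramer's rule:
#   s11 * b1 + s12 * b2 = s1y
#   s12 * b1 + s22 * b2 = s2y
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Part d: prediction at X1 = 3.0, X2 = 2.7.
y_hat = b0 + b1 * 3.0 + b2 * 2.7
print(f"b0={b0:.2f} b1={b1:.2f} b2={b2:.2f} prediction={y_hat:.2f}")
```

With this data the fit comes out to roughly Y = 20.39 + 2.34 * X1 - 1.33 * X2, which gives a predicted Y of about 23.83 at X1 = 3.0 and X2 = 2.7.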

9. Draw a Decision Tree using the CART Algorithm.

The CART (Classification and Regression Trees) algorithm builds a decision tree by splitting the dataset into smaller subsets based on feature values. Each split is chosen to minimize the Gini impurity (for classification) or variance (for regression).
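
The split-selection step for classification can be sketched as follows. The question does not include a dataset, so the ages and labels below are hypothetical, and the sketch covers only a single split on one numeric feature rather than a full tree.

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Find the threshold t that minimizes the weighted Gini of 'value <= t'."""
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: customer age and whether the customer buys.
ages = [22, 25, 30, 35, 40, 50]
buys = ["no", "no", "no", "yes", "yes", "yes"]

threshold, score = best_split(ages, buys)
print(f"root impurity={gini(buys)} split at age <= {threshold}, weighted Gini={score}")
```

A full CART tree repeats this search over every feature at each node and recurses on the two resulting subsets until a stopping rule is met.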
