
Problems

The document provides instructions for two predictive modeling tasks. For the first task, the goal is to predict diabetes based on a provided dataset, using the training set to build a model and k-fold cross validation, and testing on the test set to minimize misclassification costs. For the second task, the goal is to analyze a telecom dataset to provide a report with suggestions for how the client can improve revenue, using variables from 'longmon' to 'wireten' which represent different revenue sources. Both tasks require building models and providing a report on the approach, process, and performance.

Uploaded by

ankiosa

Q1. Look at the 'Custom Diabetes Dataset.xlsx' dataset. The objective is to predict diabetes on the
basis of the information given. The dataset is divided into two parts, i.e. a blue part and a green part.
The blue part is the training set and the green part is the test set. Build a model using the training set
and test the model using the test set. Use k-fold cross validation. Do not use any technique which
was not covered in class. The costs of misclassification are given below:
                               Predicted Cases
                               Diabetes     No Diabetes
Actual Cases    Diabetes       0            150
                No Diabetes    50           0

Build a model that lowers the total cost of misclassification on the test dataset. Write a report
on the approach, the process, and finally the performance of the model.
[10]
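
The cost-based evaluation asked for in Q1 can be sketched as below. This is a minimal illustration, not a prescribed solution: it assumes labels are coded 1 (Diabetes) and 0 (No Diabetes), that 150 is the cost of predicting No Diabetes for an actual Diabetes case and 50 is the cost of the opposite error, and that the model produces a score or probability per row. The function names (total_cost, kfold_indices, best_threshold) are hypothetical, and whether tuning a cut-off is among the techniques covered in class is not stated here.

```python
# Sketch only. Assumed reading of the cost table:
# 150 = actual Diabetes predicted as No Diabetes, 50 = the reverse, 0 = correct.
COST_FALSE_NEGATIVE = 150
COST_FALSE_POSITIVE = 50


def total_cost(actual, predicted):
    """Sum misclassification costs; labels are 1 (Diabetes) or 0 (No Diabetes)."""
    cost = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 0:
            cost += COST_FALSE_NEGATIVE
        elif a == 0 and p == 1:
            cost += COST_FALSE_POSITIVE
    return cost


def kfold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, valid
        start += size


def best_threshold(actual, scores, candidates):
    """Pick the candidate cut-off on the scores with the lowest total cost."""
    def cost_at(threshold):
        predicted = [1 if s >= threshold else 0 for s in scores]
        return total_cost(actual, predicted)
    return min(candidates, key=cost_at)
```

One possible use: within each fold from kfold_indices, fit the model on the training rows, score the validation rows, and record total_cost; the cut-off chosen on the training set is then applied once to the green (test) rows. If the rows in the sheet are ordered in any way, shuffling them before forming folds is advisable.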
Q2. A telecom company has given you the telecom dataset 'Telecom Data.xlsx'. Prepare a
comprehensive report advising your client on how to improve revenue. The variables from longmon to
wireten are revenues. Look at the Variable Description tab to understand the variables.
[20]
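
As a starting point for Q2, the revenue variables can be isolated by their position in the sheet. The sketch below assumes the columns between longmon and wireten are contiguous and appear in that order in the header row; the helper names (revenue_columns, revenue_share) are hypothetical, and any column names other than longmon and wireten would come from the Variable Description tab.

```python
def revenue_columns(header, first="longmon", last="wireten"):
    """Return the column names from `first` to `last`, inclusive, in sheet order."""
    start = header.index(first)
    end = header.index(last)
    return header[start:end + 1]


def revenue_share(row, columns):
    """Return each revenue column's share of one customer's total revenue."""
    total = sum(row[c] for c in columns)
    if total == 0:
        # Avoid division by zero for customers with no recorded revenue.
        return {c: 0.0 for c in columns}
    return {c: row[c] / total for c in columns}
```

With pandas, the same inclusive range can be selected with df.loc[:, "longmon":"wireten"], since label-based slicing in .loc includes both endpoints. Comparing revenue shares across customer groups is one way to see which revenue sources have room to grow.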
