FAQs - Supervised Learning
1. Data Understanding:
1 C. Compare Column names of all the 3 DataFrames and clearly write observations. [1 Mark]
→ Compare the column names of all three DataFrames. Since the datasets will be merged by rows,
checking that the column names, their order, and their types match is mandatory. Use a simple
comparison operator to check whether all three DataFrames have the same column names, and write
your observations from the result.
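One way to sketch this check, assuming three hypothetical DataFrames named df1, df2, and df3 (substitute your own):

```python
import pandas as pd

# Three small example frames standing in for the project datasets
# (df1, df2, df3 are placeholder names).
df1 = pd.DataFrame(columns=["A", "B", "Class"])
df2 = pd.DataFrame(columns=["A", "B", "Class"])
df3 = pd.DataFrame(columns=["A", "B", "Class"])

# Element-wise comparison checks both the names and their order.
same_12 = (df1.columns == df2.columns).all()
same_13 = (df1.columns == df3.columns).all()
print(same_12 and same_13)  # True only when all three frames share identical columns
```

If the comparison returns False, inspect `df1.columns.difference(df2.columns)` to see which names differ before merging.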
1 E. Observe and share the variation in the ‘Class’ feature of all the 3 DataFrames. [1 Mark]
→ Check the distribution and categories of the ‘Class’ variable in each DataFrame.
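A minimal sketch of this check, using two hypothetical ‘Class’ columns in place of the actual ones:

```python
import pandas as pd

# Hypothetical 'Class' columns from two of the frames.
s1 = pd.Series([0, 0, 1, 1, 1], name="Class")
s2 = pd.Series([1, 2, 2, 3, 3], name="Class")

# value_counts shows both the categories present and their balance.
print(s1.value_counts())
print(s2.value_counts())
# normalize=True gives proportions, which makes imbalance easier to spot.
print(s1.value_counts(normalize=True))
```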
Proprietary content. ©Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited
3. Data Analysis:
3 C. Visualize a pairplot with 3 classes distinguished by colors and share insights. [2 Marks]
→ Create a pairplot for the given variables, with the data points colored by the ‘Class’ categories.
4. Model Building:
4 D. Print all the possible performance metrics for both train and test data. [2 Marks]
→ Print the performance metrics of the classification models, including accuracy, precision, recall,
F1 score, etc., for both the train and test data.
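These can be sketched with scikit-learn's metric helpers; the labels below are hypothetical stand-ins for your train/test predictions:

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix)

# Hypothetical true labels and model predictions.
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

print("Accuracy:", accuracy_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
# classification_report bundles precision, recall and F1 per class.
print(classification_report(y_true, y_pred))
```

Run the same calls once with the train-set predictions and once with the test-set predictions to compare the two.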
5. Performance Improvement:
5 A. Experiment with various parameters to improve performance of the base model. [2 Marks]
→ So far you have run the default model; now tune it by changing the parameters of
KNeighborsClassifier() or the SVM function. First, explore on your own which parameters the models
expose and check how you can fine-tune them by changing those options. Only a little research is
needed. (Detailed parameter tuning will be covered in the Feature Engineering course.)
Reference link for hyperparameter tuning for a KNN problem -
https://medium.datadriveninvestor.com/k-nearest-neighbors-in-python-hyperparameters-tuning-716734bc557f
You can explore and tune the hyperparameters for other models too. You can learn about Grid search
and Random search cross-validation techniques and use them.
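One way to sketch Grid search for KNN, on synthetic data standing in for the project's training set (the grid values are illustrative, not prescribed):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data in place of the project's training set.
X, y = make_classification(n_samples=200, n_features=6, random_state=42)

# A small, illustrative grid over common KNN parameters.
param_grid = {
    "n_neighbors": [3, 5, 7, 9],
    "weights": ["uniform", "distance"],
    "metric": ["euclidean", "manhattan"],
}

# 5-fold cross-validated search over every grid combination.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
print(round(search.best_score_, 3))
```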
→ These variables hold binary or multi-class codes such as 0/1 or 1/2/3 but may be read in as numeric.
Hence, convert them to ‘object’ type so they are treated as categorical.
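A minimal sketch of the conversion, with hypothetical column names:

```python
import pandas as pd

# 'gender' and 'stage' are hypothetical coded categorical columns.
df = pd.DataFrame({"gender": [0, 1, 1, 0], "stage": [1, 2, 3, 1]})
print(df.dtypes)  # both columns start as int64

# Cast the coded columns to object so downstream steps treat them
# as categories rather than numbers.
df[["gender", "stage"]] = df[["gender", "stage"]].astype("object")
print(df.dtypes)
```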
2 C. Check for unexpected values in each categorical variable and impute them with the best suitable
value. [2 Marks]
→ Unexpected values are entries outside a feature’s valid set: for example, if all values in a feature
should be 0/1, then ‘?’, ‘a’, or 1.5 are unexpected values that need treatment.
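One reasonable treatment is to mark such entries as missing and impute with the mode; a sketch with a hypothetical binary column named 'flag':

```python
import numpy as np
import pandas as pd

# 'flag' should be binary 0/1, but contains '?' and 'a' as junk values.
df = pd.DataFrame({"flag": ["0", "1", "?", "1", "a", "0", "1"]})

# Replace anything outside the valid set with NaN, then impute with the
# mode (the most frequent valid value) as one reasonable choice.
valid = {"0", "1"}
df["flag"] = df["flag"].where(df["flag"].isin(valid), np.nan)
df["flag"] = df["flag"].fillna(df["flag"].mode()[0])
print(df["flag"].tolist())
```

Other imputation choices (e.g. a domain-specific default) can be equally valid; justify whichever you pick.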
4. Performance Improvement:
4 A. Train a base model each for SVM and KNN. [4 Marks]
→ Build a base model for each algorithm on the balanced data, without tuning any parameters.
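A minimal sketch of the two base models with default parameters, on synthetic data standing in for the prepared dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic balanced data in place of the project's prepared dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Default (untuned) base models.
knn = KNeighborsClassifier().fit(X_train, y_train)
svm = SVC().fit(X_train, y_train)
print("KNN test accuracy:", round(knn.score(X_test, y_test), 3))
print("SVM test accuracy:", round(svm.score(X_test, y_test), 3))
```

These untuned scores become the baseline that the tuned models in 4 B must beat.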
4 B. Tune parameters for each of the models wherever required and finalize a model. [3 Marks]
(Optional: Experiment with various Hyperparameters - Research required)
→ Tune the parameters as performed in Part A, Question 5 A.
You can tune the model by changing the parameters of KNeighborsClassifier() or the SVM function.
First, explore on your own which parameters the models expose and check how you can fine-tune
them by changing those options. Only a little research is needed. (Detailed parameter tuning will be
covered in the Feature Engineering course.)
Reference link for hyperparameter tuning for a KNN problem -
https://medium.datadriveninvestor.com/k-nearest-neighbors-in-python-hyperparameters-tuning-716734bc557f
You can explore and tune the hyperparameters for other models too.
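For the SVM, a Random search sketch is one option; the parameter ranges below are illustrative assumptions, not prescribed values:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic data in place of the project's training set.
X, y = make_classification(n_samples=200, n_features=6, random_state=1)

# Sample C and gamma from log-uniform ranges instead of a fixed grid.
param_dist = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e1)}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=10, cv=5,
                            random_state=1)
search.fit(X, y)
print(search.best_params_)
```

Random search evaluates a fixed number of sampled settings, which scales better than an exhaustive grid when several parameters are continuous.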