Exercise #2
[4]
i) True/False
1- Regression analysis is a subcategory of supervised learning that aims to predict a categorical class label of a
new instance.
2- Classification is a method used to predict continuous target variables, which represent numerical values.
3- Data cleaning is the process of manipulating the data to make it usable for analysis, while data wrangling is
the process of making sure that the data is accurate and consistent.
4- Data scaling is the process of ensuring that all features contribute equally to the model, so that features
with larger values do not dominate.
5- Due to imbalanced data, most DL algorithms are unable to predict the minority class's data properly.
6- The F1 score, F1 = 1 / (1 + (Fn + Fp) / (2Tp)), ranges between 0 and 1.
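The expression in item 6 can be compared with the two usual ways of writing F1: the count form 2Tp / (2Tp + Fp + Fn) and the harmonic mean of precision and recall. The sketch below is only an illustration; the function names and the counts in the usage note are made up for the example and do not come from this exercise.

```python
def f1_count_form(tp: int, fp: int, fn: int) -> float:
    """F1 written directly in counts: 2Tp / (2Tp + Fp + Fn)."""
    return 2 * tp / (2 * tp + fp + fn)


def f1_item6_form(tp: int, fp: int, fn: int) -> float:
    """F1 as written in item 6: 1 / (1 + (Fn + Fp) / (2Tp)).

    This form divides by Tp, so it is only defined for tp > 0.
    """
    return 1 / (1 + (fn + fp) / (2 * tp))


def f1_harmonic_form(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall (requires tp > 0)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the three forms.
    for tp, fp, fn in [(40, 10, 10), (30, 10, 20)]:
        print(tp, fp, fn,
              round(f1_count_form(tp, fp, fn), 4),
              round(f1_item6_form(tp, fp, fn), 4),
              round(f1_harmonic_form(tp, fp, fn), 4))
```

For any Tp > 0 the three forms agree, and because (Fn + Fp) / (2Tp) is never negative, the value stays in the interval (0, 1].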
ii) In the context of DL, choose from set B the definition that matches the expression in A.
A | B
1. The philosophy of deep learning (DL) is | a program generated by an algorithm without being explicitly programmed by any human being.
2. A deep learning (DL) model is | refers to a set of training examples where the labels are already known.
3. Overfitting occurs | associated tags or labels representing the outcome or category of the data.
4. Deep learning (DL) is | built on the rules of inferences, heuristics, discovery, reasoning, induction, and guesswork.
5. Supervised learning | is represented as a vector x ∈ ℝ^n, where each entry x_i of the vector is another feature.
6. Unsupervised learning | rescaling features to the range of [0, 1].
7. Labeled data comes with | is to reduce the effect of overfitting.
8. A dataset example or sample | when a model learns to perform well on the training data but does not generalize well to unseen data.
9. Normalization refers to | multi-neural network architecture statistical tool to explore and analyze the data.
10. Advantage of using a CNN over an MLP | is based on finding meaningful patterns and groups in the unlabeled data based on features and purposes.
iii) Using the idea of a fully functioning feedforward network, show how to design a network for a simple task:
learning the XOR function. Draw the resulting network.
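One well-known hand-built network for XOR has two inputs, a hidden layer of two ReLU units, and one linear output unit. The sketch below hard-codes one such set of weights instead of training them. The weights, the function names, and the choice of ReLU are assumptions made for illustration; they are not taken from this exercise sheet, and other weight settings work as well.

```python
def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def xor_net(x1: float, x2: float) -> float:
    """Forward pass of a 2-2-1 feedforward network with fixed weights.

    Hidden layer: h1 = relu(x1 + x2)      (weights 1, 1; bias 0)
                  h2 = relu(x1 + x2 - 1)  (weights 1, 1; bias -1)
    Output:       y  = h1 - 2 * h2        (weights 1, -2; bias 0)
    """
    h1 = relu(1.0 * x1 + 1.0 * x2 + 0.0)
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)
    return 1.0 * h1 - 2.0 * h2 + 0.0


if __name__ == "__main__":
    # Evaluate the network on the four binary input pairs.
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, xor_net(x1, x2))
```

In a drawing, this corresponds to two input nodes fully connected to two hidden nodes, which in turn connect to a single output node. The hidden layer is what makes the task solvable, since a single linear unit cannot separate the XOR classes.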
Sample solutions:
Solution of Q2
[2] i)
Error = (Fp + Fn) / (total data set) = (10 + 2 + 14 + 18) / 200 = 0.22 = 22%
Success = 1 - Error = 1 - 0.22 = 0.78 = 78%
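The arithmetic in part i) can be reproduced in a few lines. The sketch assumes that the four counts 10, 2, 14 and 18 are the misclassified entries (Fp + Fn) of a confusion matrix that is not shown here, and that the data set has 200 samples in total. The variable names are illustrative.

```python
# Misclassified counts as they appear in the solution; their position in the
# confusion matrix is an assumption, since the matrix itself is not shown.
misclassified = [10, 2, 14, 18]
total_samples = 200

error = sum(misclassified) / total_samples   # (Fp + Fn) / total
success = 1 - error                          # fraction classified correctly

print(f"Error   = {error:.2f} ({error:.0%})")
print(f"Success = {success:.2f} ({success:.0%})")
```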
ii)
Accuracy = (88+40+12)/200 = 0.7 = 70%, Precision = 0.58 = 58%, Recall = 0.62 = 62%, F1 Score = 0.63 = 63%
iii)