DMML Assignment
Submitted to:
Name: Dr. Md Alamgir Kabir
Department: CSE
Daffodil International University
Submitted by:
Name: Ahanaf Al Mashfi
Id: 222-15-6402
Sec: 62_G
Daffodil International University
1. Dataset
Example: A dataset containing sales records, including dates, amounts, and customer information.
2. Feature
Definition: An individual measurable property or characteristic of a phenomenon being observed.
Example: In a dataset of cars, features could include horsepower, weight, and fuel efficiency.
3. Label
Definition: The outcome variable that a machine learning model predicts.
Example: In a dataset predicting house prices, the label would be the actual price of the house.
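The split between features and a label can be illustrated with a minimal sketch. The house records, column names, and values below are hypothetical and chosen only for illustration.

```python
# Hypothetical house records; the column names and values are illustrative only.
houses = [
    {"area_sqft": 1200, "bedrooms": 3, "price": 150000},
    {"area_sqft": 800, "bedrooms": 2, "price": 100000},
    {"area_sqft": 1500, "bedrooms": 4, "price": 200000},
]

LABEL = "price"  # the column the model should learn to predict

# Features: every column except the label.
X = [[value for key, value in row.items() if key != LABEL] for row in houses]
# Label: the outcome value for each record.
y = [row[LABEL] for row in houses]

print(X)  # [[1200, 3], [800, 2], [1500, 4]]
print(y)  # [150000, 100000, 200000]
```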
4. Training Data
Definition: A subset of the dataset used to train the machine learning model.
Example: A portion of the data used to teach a model to recognize handwritten digits.
5. Testing Data
Definition: A separate subset of the dataset used to evaluate the performance of the model.
Example: The data not used during training to test how well the model predicts new, unseen data.
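One possible way to separate training data from testing data is sketched below. The helper name `train_test_split`, the 80/20 ratio, and the fixed seed are assumptions made for this illustration, not a prescribed method.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = rows[:]             # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))             # stand-in for ten data records
train, test = train_test_split(data, test_ratio=0.2)
print(len(train), len(test))       # 8 2
```

Shuffling before splitting helps avoid a test set that only contains, for example, the most recent records.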
6. Algorithm
Definition: A set of rules or procedures for solving a problem or performing a task in machine learning.
Example: Decision trees, neural networks, and support vector machines are examples of algorithms.
7. Overfitting
Definition: A modeling error that occurs when a model learns the details and noise in the training data
to the extent that it performs poorly on new data.
Example: A model that achieves 95% accuracy on training data but only 60% on testing data might be
overfitting.
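The gap between training and testing accuracy can be reproduced with a deliberately extreme sketch: a hypothetical `Memorizer` model that stores training pairs verbatim and falls back to the most common training label for unseen inputs. The toy task (odd or even) and all names here are assumptions for illustration.

```python
from collections import Counter


def parity(x):
    """True rule behind the toy data: is x odd or even?"""
    return "even" if x % 2 == 0 else "odd"


class Memorizer:
    """Deliberately overfit model: memorizes the training pairs."""

    def fit(self, xs, ys):
        self.table = dict(zip(xs, ys))
        # Fallback for unseen inputs: the most frequent training label.
        self.default = Counter(ys).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


def accuracy(model, xs, ys):
    correct = sum(model.predict(x) == y for x, y in zip(xs, ys))
    return correct / len(xs)


train_x = [1, 2, 3, 4, 5]
test_x = [6, 7, 8, 9]
train_y = [parity(x) for x in train_x]
test_y = [parity(x) for x in test_x]

model = Memorizer().fit(train_x, train_y)
train_acc = accuracy(model, train_x, train_y)
test_acc = accuracy(model, test_x, test_y)
print(train_acc, test_acc)  # 1.0 0.5
```

The model is perfect on the data it has seen and no better than guessing on new data, because it learned the individual examples rather than the rule.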
8. Underfitting
Definition: A situation where a model is too simple to capture the underlying patterns in the data.
Example: A linear regression model applied to a non-linear dataset, where a straight line cannot
represent the data accurately.
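A small sketch of this situation, assuming toy data generated from y = x^2: an ordinary least-squares line is fitted and its error is measured. The function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]          # non-linear (quadratic) relationship

slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
mse = sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)

print(slope, intercept)            # 0.0 2.0
print(mse)                         # 2.8
```

The best straight line here is flat (slope 0), so it predicts 2 for every input and misses the curve entirely, even on the data it was fitted to.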
9. Cross-Validation
Definition: A technique for assessing how the results of a statistical analysis will generalize to an
independent dataset.
Example: K-fold cross-validation, where the dataset is split into 'k' subsets; the model is trained on
'k-1' subsets and tested on the remaining subset, repeated so that each subset is used for testing once.
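The splitting step of k-fold cross-validation can be sketched as follows. The generator name `k_fold_indices` is hypothetical, and the sketch only produces index splits without shuffling; training and scoring a model on each split is left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(6, 3))
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Each record appears in exactly one test fold, so averaging the k test scores uses every record for evaluation once.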
Example: A confusion matrix shows the True Positive, False Positive, True Negative, and False Negative counts.
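The four counts can be tallied directly from true and predicted labels. This is a minimal sketch for binary labels; the function name `confusion_counts` and the sample labels are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classification task."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1      # predicted positive, actually positive
        elif pred == positive:
            counts["FP"] += 1      # predicted positive, actually negative
        elif truth == positive:
            counts["FN"] += 1      # predicted negative, actually positive
        else:
            counts["TN"] += 1      # predicted negative, actually negative
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```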
Precision: The ratio of correctly predicted positive observations to the total predicted positives.
Recall: The ratio of correctly predicted positive observations to all actual positives.
Example: In email spam detection, precision measures how many emails identified as spam were
actually spam, while recall measures how many actual spam emails were correctly identified.
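Both ratios follow from the true positive (TP), false positive (FP), and false negative (FN) counts: precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below applies them to hypothetical spam-filter counts; the numbers are made up for illustration.

```python
def precision(tp, fp):
    """Share of predicted positives that were actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that were correctly predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical spam filter results:
tp = 40   # spam emails correctly flagged as spam
fp = 10   # legitimate emails wrongly flagged as spam
fn = 20   # spam emails that were missed

p = precision(tp, fp)
r = recall(tp, fn)
print(round(p, 2), round(r, 2))  # 0.8 0.67
```

Here 80% of flagged emails really were spam, but the filter caught only about 67% of all spam.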
Example: Predicting house prices based on historical sales data, where past prices (labels) are
provided.
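Learning from labeled examples can be sketched with a very simple method: predict the price of the most similar past sale (a 1-nearest-neighbor rule). The method is a stand-in chosen for brevity, and the sales figures and function name are hypothetical.

```python
# Hypothetical past sales: (area in square feet, sale price). The price is the label.
past_sales = [
    (1000, 100000),
    (1500, 150000),
    (2000, 210000),
]


def predict_price(area, labeled_examples):
    """Return the label (price) of the training example closest in area."""
    nearest = min(labeled_examples, key=lambda pair: abs(pair[0] - area))
    return nearest[1]


predicted = predict_price(1400, past_sales)
print(predicted)  # 150000
```

The prediction is only possible because every past record comes with its known price.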
Example: Customer segmentation analysis where data on purchasing behavior is grouped without
predefined labels.
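Grouping without labels can be sketched with a tiny two-cluster k-means on one-dimensional data. The spending values, the function name `k_means_1d`, and the choice to start from the minimum and maximum values are assumptions made to keep the example short and repeatable.

```python
def k_means_1d(values, iterations=10):
    """Two-cluster k-means on 1-D data, starting from the min and max values."""
    centroids = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break                  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Hypothetical monthly spending per customer; no labels are given.
spending = [10, 12, 11, 95, 100, 105]
centroids, clusters = k_means_1d(spending)
print(centroids)  # [11.0, 100.0]
print(clusters)   # [[10, 12, 11], [95, 100, 105]]
```

The two groups (low spenders and high spenders) emerge from the data alone; nobody told the algorithm which customer belongs where.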