DMML Assignment

The document outlines key concepts in data mining and machine learning, including definitions and examples of data, features, labels, training and testing data, algorithms, overfitting, underfitting, cross-validation, confusion matrices, precision and recall, as well as supervised and unsupervised learning. It serves as an educational resource for understanding fundamental terms and techniques in the field. The assignment is submitted by Ahanaf Al Mashfi to Dr. Md Alamgir Kabir at Daffodil International University.

Uploaded by

Atick Arman

Assignment

Course Code: CSE325


Course Title: Data Mining and Machine Learning

Submitted to
Name: Dr. Md Alamgir Kabir
Department: CSE
Daffodil International University

Submitted by:
Name: Ahanaf Al Mashfi
Id: 222-15-6402
Sec: 62_G
Daffodil International University

Submission Date: 05-04-2025


1. Data
Definition: Raw facts and figures that can be processed to extract information.

Example: A dataset containing sales records, including dates, amounts, and customer information.

2. Feature
Definition: An individual measurable property or characteristic of a phenomenon being observed.

Example: In a dataset of cars, features could include horsepower, weight, and fuel efficiency.

3. Label
Definition: The outcome variable that a machine learning model predicts.

Example: In a dataset predicting house prices, the label would be the actual price of the house.
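
The distinction between features and labels can be sketched in a few lines of Python. The dataset below is invented for illustration, and the names houses, FEATURE_NAMES, LABEL_NAME, and split_features_labels are assumptions, not part of any library.

```python
# Hypothetical toy dataset: each dictionary is one observation (one house).
houses = [
    {"area_sqft": 1200, "bedrooms": 3, "price": 250000},
    {"area_sqft": 800, "bedrooms": 2, "price": 180000},
    {"area_sqft": 1500, "bedrooms": 4, "price": 320000},
]

FEATURE_NAMES = ["area_sqft", "bedrooms"]  # measurable inputs
LABEL_NAME = "price"                       # outcome the model should predict


def split_features_labels(rows, feature_names, label_name):
    """Separate each row into a feature vector and its label."""
    X = [[row[name] for name in feature_names] for row in rows]
    y = [row[label_name] for row in rows]
    return X, y


X, y = split_features_labels(houses, FEATURE_NAMES, LABEL_NAME)
print("Features:", X)
print("Labels:  ", y)
```

Here X holds one list of feature values per house, and y holds the matching price for each house.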

4. Training Data
Definition: A subset of the dataset used to train the machine learning model.

Example: A portion of the data used to teach a model to recognize handwritten digits.

5. Testing Data
Definition: A separate subset of the dataset used to evaluate the performance of the model.

Example: The data not used during training to test how well the model predicts new, unseen data.
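
One common way to obtain training and testing data is to shuffle the dataset and hold out a fraction of it. The following is a minimal sketch using only the standard library; the function name train_test_split, the 30% test fraction, and the seed are assumptions chosen for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out a fraction of it for testing."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(10))  # stand-in for ten observations
train, test = train_test_split(samples, test_fraction=0.3, seed=42)
print("Training data:", train)
print("Testing data: ", test)
```

The two subsets never share an observation, so the testing data stays unseen during training.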

6. Algorithm
Definition: A set of rules or procedures for solving a problem or performing a task in machine learning.

Example: Decision trees, neural networks, and support vector machines are examples of algorithms.

7. Overfitting
Definition: A modeling error that occurs when a model learns the details and noise in the training data
to the extent that it performs poorly on new data.

Example: A model that achieves 95% accuracy on training data but only 60% on testing data might be
overfitting.
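
The gap between training and testing accuracy can be reproduced on a toy problem. The sketch below uses a 1-nearest-neighbour classifier, chosen here only because it memorizes its training data. The data points are invented: the underlying rule is "label 1 when x is at least 5", and two training labels (x = 3 and x = 7) are deliberately noisy.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    _, nearest_label = min(train, key=lambda point: abs(point[0] - x))
    return nearest_label


def accuracy(train, dataset):
    """Fraction of points in dataset whose label is predicted correctly."""
    correct = sum(1 for x, label in dataset if predict_1nn(train, x) == label)
    return correct / len(dataset)


# (x, label) pairs; the labels at x = 3 and x = 7 are noise.
train_points = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
# Unseen points that follow the underlying rule without noise.
test_points = [(1.5, 0), (2.9, 0), (3.2, 0), (4.4, 0),
               (5.5, 1), (6.8, 1), (7.2, 1), (8.5, 1)]

train_acc = accuracy(train_points, train_points)
test_acc = accuracy(train_points, test_points)
print("Training accuracy:", train_acc)
print("Testing accuracy: ", test_acc)
```

Because the model reproduces the noisy labels exactly, it scores perfectly on the training points but misclassifies the test points that fall next to the noisy ones.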

8. Underfitting
Definition: A situation where a model is too simple to capture the underlying patterns in the data.

Example: A linear regression model fitted to non-linear data, which fails to represent the
relationship accurately.

9. Cross-Validation
Definition: A technique for assessing how the results of a statistical analysis will generalize to an
independent dataset.

Example: K-fold cross-validation, where the dataset is split into 'k' subsets (folds); the model is
trained on 'k-1' folds and tested on the remaining fold, repeating until each fold has served as the
test set once.
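
The k-fold splitting scheme can be sketched as follows. The function name k_fold_indices is an assumption, and the folds are taken in order without shuffling to keep the example easy to follow.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(n_samples=6, k=3)
for number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"Fold {number}: train on {train_idx}, test on {test_idx}")
```

In practice the model would be trained and scored once per fold, and the k scores averaged to estimate how well it generalizes.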

10. Confusion Matrix


Definition: A table used to evaluate the performance of a classification model by comparing predicted
results with actual results.

Example: For a binary classifier, it shows the counts of True Positives, False Positives, True
Negatives, and False Negatives.
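
The four counts can be tallied directly from paired actual and predicted labels. This is a minimal sketch for the binary case; the function name confusion_counts and the label lists are invented for illustration.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN, and FN for a binary classification problem."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["TP"] += 1   # predicted positive, actually positive
        elif p == positive:
            counts["FP"] += 1   # predicted positive, actually negative
        elif a == positive:
            counts["FN"] += 1   # predicted negative, actually positive
        else:
            counts["TN"] += 1   # predicted negative, actually negative
    return counts


actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```

For these eight observations the classifier makes one false positive and one false negative, and the remaining six predictions are correct.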

11. Precision and Recall


Definitions:

Precision: The ratio of correctly predicted positive observations to the total predicted positives.

Recall: The ratio of correctly predicted positive observations to all actual positives.

Example: In email spam detection, precision measures how many emails identified as spam were
actually spam, while recall measures how many actual spam emails were correctly identified.
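
In terms of confusion-matrix counts, precision is TP / (TP + FP) and recall is TP / (TP + FN). The sketch below applies these formulas to invented spam-filter counts; the function names and the numbers are assumptions for illustration.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical spam filter results:
tp = 40  # spam emails correctly flagged as spam
fp = 10  # legitimate emails wrongly flagged as spam
fn = 20  # spam emails that slipped through

p = precision(tp, fp)
r = recall(tp, fn)
print(f"Precision: {p:.2f}")
print(f"Recall:    {r:.2f}")
```

With these counts, 40 of the 50 flagged emails are spam, and 40 of the 60 actual spam emails are caught.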

12. Supervised Learning


Definition: A type of machine learning where the model is trained on labeled data.

Example: Predicting house prices based on historical sales data, where past prices (labels) are
provided.
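
As a minimal sketch of supervised learning, the code below fits a straight line to labeled examples by ordinary least squares and then predicts a price for an unseen house. The areas and prices are invented and perfectly linear so that the result is easy to check; real sales data would be noisier. The names fit_line, areas, and prices are assumptions.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labeled training data: area in square feet (feature), past price (label).
areas = [800, 1000, 1200, 1500]
prices = [160000, 200000, 240000, 300000]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 1100 + intercept  # prediction for an unseen house
print(f"Predicted price for 1100 sq ft: {predicted_price:.0f}")
```

The labels guide the fit: the line is chosen to minimize the squared difference between predicted and known prices.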

13. Unsupervised Learning


Definition: A type of machine learning where the model learns from unlabeled data to find hidden
patterns.

Example: Customer segmentation analysis where data on purchasing behavior is grouped without
predefined labels.
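
Customer segmentation can be sketched with a small one-dimensional k-means clustering routine. K-means is only one of several possible grouping methods and is used here as an assumption; the spending figures, the starting centers, and the function name k_means_1d are likewise invented for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the given starting centers using k-means."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical monthly spending per customer; no labels are provided.
spending = [10, 12, 11, 95, 100, 105]
centers, clusters = k_means_1d(spending, centers=[10, 100])
print("Cluster centers:", centers)
print("Clusters:       ", clusters)
```

The routine separates low spenders from high spenders using only the spending values themselves, without being told which group any customer belongs to.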
