ITT306 Data Science Study Guide

The study guide for ITT306 - Data Science covers essential topics including the data science process, data quality, evaluation metrics, regression and classification techniques, clustering methods, and association rule mining. It also emphasizes the use of Python libraries such as NumPy and Pandas for data manipulation and analysis. Each module provides key concepts, formulas, and examples to facilitate understanding of data science principles.

Uploaded by

amnahibakt

ITT306 - Data Science: Frequently Asked Topics Study Guide

Module I: Introduction to Data Science

- Data Science Process (Diagram + Explanation)

Define stages: Data collection, preprocessing, analysis, modeling, interpretation.

- Data Quality

Handling noisy data, missing values, and outliers.
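
One possible way to handle missing values and outliers is sketched below, assuming median imputation and the 1.5 x IQR rule; the guide names neither technique, and the readings list is made-up data.

```python
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one missing value and one extreme value.
readings = [10, 12, None, 11, 13, 95, 12]
filled = fill_missing_with_median(readings)
outliers = iqr_outliers(filled)
print(filled)
print(outliers)
```

Whether a flagged value is noise or a genuine observation still depends on the domain, so the rule only marks candidates for review.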

- Evaluation Metrics

Sensitivity, specificity, precision, recall, F1-score (with formulas and examples).
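
The standard definitions of these metrics can be computed directly from confusion-matrix counts. The sketch below assumes a binary classifier and nonzero denominators; the counts are made up for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # fraction of predicted positives that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
    }

# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```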

Module II: Regression & Classification

- Linear and Non-linear Regression

Equations, plots, R² statistic.
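
As a small worked example, a least-squares line and its R² statistic (1 minus the ratio of residual to total sum of squares) can be computed as follows. The five data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(slope, intercept, r2)
```

For this data the fitted line is y = 2.2 + 0.6x and R² is 0.6, so the line explains about 60% of the variance in y.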

- Regression Tree & Multiple Linear Regression

Tree construction, interpretation of coefficients.

- Classification vs Regression

- Entropy & Gini Index

Used in decision trees (formulas, examples).
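
Both impurity measures depend only on the class proportions at a node: entropy is the negative sum of p·log2(p), and the Gini index is 1 minus the sum of p². A minimal sketch, with a made-up node of 9 "yes" and 5 "no" labels:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

# Hypothetical node containing 9 positive and 5 negative examples.
labels = ["yes"] * 9 + ["no"] * 5
print(round(entropy(labels), 3))
print(round(gini(labels), 3))
```

Both measures are 0 for a pure node and largest when the classes are evenly mixed, which is why a decision tree prefers splits that reduce them the most.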

- Supervised vs Unsupervised Learning

Module III: Clustering & SVM

- K-Means Clustering

Manual clustering of 8 points (a commonly asked problem).
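
A hand calculation of this kind can be checked with a short implementation of the usual assign-then-update loop. The eight points and the three starting centroids below are illustrative assumptions, not values taken from the guide, and squared Euclidean distance is assumed.

```python
def kmeans(points, centroids, max_iter=100):
    """Plain k-means on 2-D points with given starting centroids."""
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data: eight points, with three of them used as starting centroids.
points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
start = [(2, 10), (5, 8), (1, 2)]
final_centroids, final_clusters = kmeans(points, start)
for centroid, cluster in zip(final_centroids, final_clusters):
    print(centroid, cluster)
```

Printing the clusters after each pass instead of only at the end reproduces the iteration tables usually written out by hand.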

- Hierarchical Clustering

Dendrogram interpretation.
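
The heights in a dendrogram are the distances at which clusters merge. The sketch below assumes single-linkage agglomerative clustering on one-dimensional points (the guide does not say which linkage is expected) and records each merge with its distance.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points; returns the merges in order."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

# Hypothetical data.
merges = single_linkage([1, 2, 6, 7, 15])
for left, right, height in merges:
    print(left, "+", right, "at height", height)
```

Read bottom-up, the merge heights here are 1, 1, 4 and 8; cutting the dendrogram at any height between 4 and 8 leaves two clusters, {1, 2, 6, 7} and {15}.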

- SVM

Maximal margin hyperplane, linear vs non-linear boundaries.
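
For a linear SVM with hyperplane w·x + b = 0 scaled so that the closest points satisfy y(w·x + b) = 1, the margin width is 2/||w||. The sketch below checks these conditions for a hand-picked hyperplane and four made-up points; it does not train an SVM.

```python
import math

def decision_value(w, b, x):
    """Signed value w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the two margin boundaries w.x + b = +1 and -1."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane x1 + x2 - 3 = 0 and labelled points (label is +1 or -1).
w, b = (1.0, 1.0), -3.0
data = {(1, 1): -1, (2, 2): 1, (0, 1): -1, (3, 2): 1}

# Every point must satisfy y * (w.x + b) >= 1; equality marks a support vector.
separable = all(y * decision_value(w, b, x) >= 1 for x, y in data.items())
support_vectors = [x for x, y in data.items()
                   if abs(y * decision_value(w, b, x) - 1) < 1e-9]
margin = margin_width(w)
print(separable, support_vectors, margin)
```

Here the support vectors are (1, 1) and (2, 2), and the margin equals the distance between them, so no other separating line could have a wider margin for this data.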


- Principal Component Analysis (PCA)

Steps to compute components, significance.
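
The usual steps (center the data, compute the covariance matrix, take its eigenvectors, sort by eigenvalue, project) can be sketched with NumPy as follows. The function name pca and the small 2-D dataset are assumptions made for illustration.

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigen-decomposition of the covariance matrix."""
    X_centered = X - X.mean(axis=0)                 # step 1: center each feature
    cov = np.cov(X_centered, rowvar=False)          # step 2: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)          # step 3: eigen-decomposition
    order = np.argsort(eigvals)[::-1]               # step 4: sort by variance, descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]
    explained_ratio = eigvals[:n_components] / eigvals.sum()
    projected = X_centered @ components             # step 5: project onto the components
    return projected, components, explained_ratio

# Hypothetical 2-D data with strongly correlated features.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
projected, components, explained_ratio = pca(X, n_components=1)
print(components.ravel())
print(explained_ratio)
```

The explained-variance ratio is what gives each component its significance: a ratio close to 1 for the first component means the data can be reduced to one dimension with little loss.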

Module IV: Association Rule Mining

- Apriori Algorithm

Step-by-step with frequent itemset generation.
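
The level-wise procedure (count candidates, keep the frequent ones, join them into larger candidates, prune any candidate with an infrequent subset) can be sketched in plain Python. The five transactions and the minimum support count of 3 are made-up values for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(transactions, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With these values there are four frequent single items and four frequent pairs; the only surviving 3-item candidate, {bread, milk, diapers}, appears in just two transactions and is dropped.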

- FP-Growth

Constructing FP-Trees, advantages over Apriori.

- Support & Confidence

Definitions with calculations.
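
Support of an itemset is the fraction of transactions containing it, and the confidence of a rule A -> B is support(A ∪ B) / support(A). A minimal calculation on made-up transactions:

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
sup = support(transactions, {"diapers", "beer"})
conf_forward = confidence(transactions, {"diapers"}, {"beer"})
conf_reverse = confidence(transactions, {"beer"}, {"diapers"})
print(sup, conf_forward, conf_reverse)
```

Here {diapers, beer} has support 3/5 = 0.6, diapers -> beer has confidence 0.6/0.8 = 0.75, and beer -> diapers has confidence 1.0, which shows that confidence is not symmetric.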

- Single vs Multidimensional Rules

Module V: Python for Data Science

- Essential Libraries

NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn.

- Array Creation & Operations

NumPy functions, transposing, reshaping.
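
A few of the array operations named above, as a short sketch; the variable names are arbitrary.

```python
import numpy as np

a = np.arange(6)            # 1-D array holding 0..5
m = a.reshape(2, 3)         # same data viewed as 2 rows x 3 columns
t = m.T                     # transpose: 3 rows x 2 columns
z = np.zeros((2, 2))        # 2x2 array of zeros
scaled = m * 2              # element-wise arithmetic
col_sums = m.sum(axis=0)    # sum down each column

print(m)
print(t.shape)
print(col_sums)
```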

- Pandas Basics

Series, DataFrames, Boolean indexing, reindexing.
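
The four Pandas topics above fit in one short sketch; the names and scores are made-up data.

```python
import pandas as pd

# Series: a labelled 1-D array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# DataFrame: a table of labelled columns.
df = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "score": [72, 45, 88]})

# Boolean indexing: keep only the rows where the condition is True.
passed = df[df["score"] >= 50]

# Reindexing: reorder by label; labels missing from the original become NaN.
r = s.reindex(["c", "a", "d"])

print(passed)
print(r)
```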

- Jupyter vs IPython
