INAIO Syllabus
● Learning Fundamentals
○ Concept of an ML model: Inputs, outputs, and learning from data.
○ Train-Test Process
○ Purpose of loss functions
● Evaluation Metrics
○ For Regression: Root Mean Squared Error, Mean Absolute Error
○ For Classification: Accuracy, Precision, Recall, F1-Score, Confusion matrix,
ROC Curve and AUC (basic interpretation)
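As an illustration of the metrics listed above, the following is a minimal sketch in plain Python. The function names and the small sample lists are hypothetical and chosen only for this example; ROC and AUC are left out to keep it short.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN), the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical regression targets and predictions.
reg_true = [3.0, 5.0, 2.0, 7.0]
reg_pred = [2.0, 5.0, 4.0, 7.0]
print("RMSE:", rmse(reg_true, reg_pred))
print("MAE:", mae(reg_true, reg_pred))

# Hypothetical binary labels and predictions (1 is the positive class).
cls_true = [1, 0, 1, 1, 0, 1]
cls_pred = [1, 0, 0, 1, 1, 1]
print(classification_metrics(cls_true, cls_pred))
```

On the sample regression data the absolute errors are 1, 0, 2 and 0, so MAE is 0.75 while RMSE is larger (about 1.118) because squaring gives more weight to the bigger error.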
Section 3: Algorithms
Students are NOT expected to have any prior knowledge of these ML algorithms. Questions
on the topics below will include all the relevant background information in the problem
statement and students are only expected to reason about them and apply them to specific
problems.
● Supervised Learning
○ Linear Regression
○ Linear Classifier
○ K-Nearest Neighbors (K-NN)
○ Decision Trees
● Unsupervised Learning
○ K-Means Clustering
○ Principal Component Analysis (PCA)
○ Hierarchical Clustering
● Probabilistic Models
○ Naive Bayes
● Famous Algorithms
○ PageRank
○ TF-IDF Ranking
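To give a sense of what applying one of these algorithms to a specific problem can look like, here is a minimal sketch of K-Nearest Neighbors classification in plain Python. The function name `knn_predict`, the Euclidean distance measure, and the toy points are assumptions made for this example, not part of the syllabus.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k nearest neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical 2-D training data with two well-separated classes.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2, 2)))  # closest to the "A" points
print(knn_predict(train_points, train_labels, (6, 5)))  # closest to the "B" points
```

An odd value of k is used here so that a two-class vote cannot end in a tie.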
Stage 2 - Syllabus
● Learning Fundamentals
○ Concept of an ML model: Inputs, outputs, and learning from data.
○ Train-Test Process
○ Purpose of loss functions
○ Overfitting and underfitting
○ Hyperparameter Tuning
● Evaluation Metrics
○ For Regression: Root Mean Squared Error, Mean Absolute Error
○ For Classification: Accuracy, Precision, Recall, F1-Score, Confusion matrix,
ROC Curve and AUC (basic interpretation)
Section 3: ML Algorithms
Students are not expected to know these algorithms in depth, but some familiarity will be helpful.
They should be able to apply these algorithms to specific problems/datasets when provided
with minimal guidance.
● Supervised Learning
○ Linear Regression
○ Linear Classifier
○ K-Nearest Neighbors (K-NN)
○ Decision Trees
● Unsupervised Learning
○ K-Means Clustering
○ Principal Component Analysis (PCA)
○ Hierarchical Clustering
● Probabilistic Models
○ Naive Bayes
● Famous Algorithms
○ PageRank
○ TF-IDF Ranking
● Python Basics
○ Syntax, loops, and functions
○ File handling and basic input-output operations
● Data Handling
○ NumPy and Pandas for efficient data processing
● Visualization
○ Matplotlib and Seaborn for creating and interpreting visualizations
● ML Libraries
○ Introduction to scikit-learn: classification, clustering, regression, metrics modules
○ Training and testing models with specific algorithms on provided datasets
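The train-and-test workflow named in the last item can be sketched with scikit-learn as follows. The choice of the bundled Iris dataset, the K-NN classifier, the 70/30 split, and the fixed random seed are assumptions for this illustration; any provided dataset and listed algorithm could take their place.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small dataset that ships with scikit-learn (assumed here for illustration).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing; the seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit a K-NN classifier on the training portion only.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Evaluate on the held-out test portion.
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Test accuracy:", test_accuracy)
```

Keeping the test samples out of `fit` is what makes the reported accuracy an estimate of performance on unseen data.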