ICT202B AI ML and Emerging Technologies UNIT 3 (Classification and Regression) 2
CO2 :: Discuss the advanced topics of Python language used for programming
CO3 :: Apply neural networks for medical diagnosis by medical image classification
CO6 :: Validate the application of machine learning for biological data analysis
UNIT I
Data & Feature Engineering : Data vs information, types of data: numerical data (discrete and continuous),
categorical data (ordinal and nominal data), time series data, unstructured data, data labelling, what is a feature,
importance of feature selection, feature selection algorithms, sequential forward selection, sequential backward
selection, bidirectional feature selection, feature extraction
UNIT II
Advanced Python packages : introduction to NumPy, creation and accessing of nD arrays, operations on nD
arrays, introduction to pandas, DataFrame, reading csv/excel data, dimensionality reduction, PCA and LDA,
visualization using Matplotlib, line plot, subplots, scatter plot, bar graph, histogram, pie chart
UNIT III
Classification & Regression : Introduction to classification, KNN, Decision Tree, Naive Bayes classifier, Support
Vector Machine classifier, classification on a given dataset, Introduction to regression, linear regression,
Polynomial regression, regression on a given dataset
Machine Learning
Introduction to classification
The K-Nearest Neighbors (K-NN) algorithm is a versatile and widely used
machine learning algorithm, valued mainly for its simplicity and ease of
implementation.
• It does not require any assumptions about the underlying data distribution.
• It can also handle both numerical and categorical data, making it a flexible
choice for various types of datasets in classification and regression tasks.
• It is a non-parametric method that makes predictions based on the
similarity of data points in a given dataset.
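• With a sufficiently large K, K-NN is fairly robust to isolated outliers, because each
prediction is a vote over several neighbours; with a small K (for example K = 1) it is
sensitive to noisy or mislabelled points.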
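The similarity-based prediction described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, the Euclidean distance, the majority vote, and the toy data (including one deliberately mislabelled point) are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those neighbours
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D data, made up for illustration: two clusters,
# plus one mislabelled "B" point sitting inside the "A" cluster.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.05, 0.95),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B", "B"]

query = (1.05, 0.97)
print("k=1:", knn_predict(points, labels, query, k=1))
print("k=3:", knn_predict(points, labels, query, k=3))
```

With k=1 the query follows the single mislabelled neighbour, while with k=3 the two genuine "A" neighbours outvote it, which is why the choice of K matters when the data contain noise.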
Naive Bayes classifier
Model complexity
As the degree of the model increases, its fit to the training data generally improves, but so
does the risk of over-fitting. A degree that is too low carries the opposite risk of under-fitting the data.
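The effect of the degree can be seen by fitting polynomials of several degrees to the same data and comparing training and test error. The sketch below assumes NumPy is available and uses synthetic data (a quadratic trend plus Gaussian noise); the degrees 1, 2 and 10 and the helper `mse` are choices made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): quadratic trend plus noise
x_train = np.linspace(-3, 3, 20)
y_train = 0.5 * x_train**2 - x_train + rng.normal(0, 1.0, x_train.size)
x_test = np.linspace(-2.9, 2.9, 50)
y_test = 0.5 * x_test**2 - x_test + rng.normal(0, 1.0, x_test.size)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial `coeffs` evaluated on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 2, 10):
    # Least-squares polynomial fit on the training data only
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree={degree:2d}  train MSE={results[degree][0]:.3f}  "
          f"test MSE={results[degree][1]:.3f}")
```

The training error can only go down as the degree rises, so it is the gap between training and test error that signals over-fitting; a degree-1 model, by contrast, cannot capture the curvature at all and under-fits.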
Model selection
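There are two common approaches to choosing the order of a polynomial model. Forward selection
starts with the lowest-order model and adds higher-order terms one at a time until an added term no
longer improves the fit meaningfully. Backward elimination starts with the highest-order model
considered and removes the highest-order term one at a time for as long as the fit does not get meaningfully worse.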
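A forward-selection loop over the polynomial degree might look like the sketch below. It is one possible variant under stated assumptions: it uses the drop in held-out validation error as the stopping rule rather than a statistical significance test, and the names `forward_select_degree`, `max_degree` and `min_gain`, as well as the synthetic data, are hypothetical.

```python
import numpy as np

def validation_mse(degree, x_tr, y_tr, x_val, y_val):
    """Fit a polynomial of the given degree on the training set, score it on validation."""
    coeffs = np.polyfit(x_tr, y_tr, degree)
    return float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))

def forward_select_degree(x_tr, y_tr, x_val, y_val, max_degree=8, min_gain=0.01):
    """Start at degree 1 and raise the degree one step at a time.

    Stop as soon as the next degree lowers the validation MSE by no more
    than `min_gain` (an absolute threshold, chosen here for illustration).
    """
    best_degree = 1
    best_err = validation_mse(1, x_tr, y_tr, x_val, y_val)
    for degree in range(2, max_degree + 1):
        err = validation_mse(degree, x_tr, y_tr, x_val, y_val)
        if best_err - err <= min_gain:
            break  # the extra term did not help enough
        best_degree, best_err = degree, err
    return best_degree

# Synthetic data (an assumption): quadratic trend with small Gaussian noise
rng = np.random.default_rng(1)
x_tr = np.linspace(-3, 3, 30)
y_tr = 0.5 * x_tr**2 - x_tr + 2 + rng.normal(0, 0.05, x_tr.size)
x_val = np.linspace(-2.8, 2.8, 20)
y_val = 0.5 * x_val**2 - x_val + 2 + rng.normal(0, 0.05, x_val.size)

chosen = forward_select_degree(x_tr, y_tr, x_val, y_val)
print("selected degree:", chosen)
```

Backward elimination would run the same comparison in the opposite direction, starting from `max_degree` and lowering the degree while the validation error does not rise noticeably. Note that a step-by-step forward search can stop too early when a single added term contributes nothing on its own, for example the quadratic term of a purely odd function.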