Machine Learning
What is it?
• The term was coined by Arthur Samuel in 1959.
• ML is the ability of a machine to learn from and replicate human behavior. It allows
programs to learn automatically and makes computers more intelligent.
• Differences:
ML (machine learning) refers to algorithms that learn and perform based on the data they are exposed to.
DL (deep learning) refers to layers of neural networks built with ML algorithms.
AI (artificial intelligence) is the broader field; it leverages different techniques, including ML and DL.
Types of ML
• Supervised learning → The algorithm learns from labeled data (inputs paired with known
outputs) to deliver an output for new inputs. Some commonly known supervised learning
methods are: linear regression, logistic regression, support vector machines, decision trees.
Supervised learning handles two tasks: classification and regression.
• Unsupervised learning → The program finds hidden patterns in unlabeled data and recognises the relations between them.
• Reinforcement learning → The program learns from its previous errors. An interpreter rewards
the algorithm when it finds the correct solution.
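The contrast between supervised and unsupervised learning can be sketched on a toy data set. This is a minimal illustration, not a reference method: the data values, the function names `fit_threshold` and `two_means`, and the choice of a class-mean threshold and a two-cluster grouping are all assumptions made for the example.

```python
# Hypothetical 1-D data: hours studied per student.
hours = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
passed = [0, 0, 0, 1, 1, 1]  # labels, available only in the supervised setting


def fit_threshold(xs, ys):
    """Supervised: use the labels to place a threshold midway between the class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def two_means(xs, iterations=10):
    """Unsupervised: group the points into two clusters without using any labels."""
    c0, c1 = min(xs), max(xs)  # initial cluster centres
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1


threshold = fit_threshold(hours, passed)


def predict(x):
    """Classify a new input with the learned threshold."""
    return 1 if x > threshold else 0


centres = two_means(hours)
print(threshold)     # 5.0
print(predict(6.5))  # 1
print(centres)       # (2.0, 8.0)
```

The supervised function needs `passed` to learn anything; the unsupervised one recovers the same two groups from `hours` alone, but it cannot say which group means "passed".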
ML Pipeline
• It is a series of sequential steps used to codify and automate ML workflows to produce ML
models. The steps in the sequence are repeated until a satisfactory model is achieved.
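A pipeline of sequential steps can be sketched as a list of functions applied one after another. The step names (`clean`, `extract_targets`, `fit_mean_model`), the data, and the trivial "model" that only predicts the mean are hypothetical and serve purely as an illustration.

```python
# Hypothetical raw data: (feature, target) pairs, one with a missing target.
raw = [(1.0, 10.0), (2.0, None), (3.0, 14.0)]


def clean(rows):
    """Step 1: drop rows that contain a missing value."""
    return [row for row in rows if None not in row]


def extract_targets(rows):
    """Step 2: keep only the target column."""
    return [target for _, target in rows]


def fit_mean_model(targets):
    """Step 3: 'train' a trivial model, here just the mean of the targets."""
    return sum(targets) / len(targets)


PIPELINE = [clean, extract_targets, fit_mean_model]


def run_pipeline(steps, data):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


model = run_pipeline(PIPELINE, raw)
print(model)  # 12.0
```

Because the steps are codified as a list, the whole sequence can be rerun unchanged whenever the data or one of the steps is revised.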
Python in ML
• Packages → folders of modules that form the building blocks of Python-based
programming.
• Libraries → collections of packages or specific files containing prewritten code (NumPy,
Pandas, Matplotlib, etc.).
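The difference between a package and a plain module can be checked from inside Python. The sketch below uses only the standard library so that it runs without installing anything; `is_package` is a helper name introduced for this example.

```python
import json  # standard-library package: a folder containing several modules
import math  # a single module, not a package
from json import decoder  # one module that lives inside the json package


def is_package(module):
    """A package is a module that carries a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(json))  # True
print(is_package(math))  # False
print(decoder.__name__)  # json.decoder
```

Third-party libraries such as NumPy or Pandas are imported the same way once they have been installed.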
Supervised learning algorithms → i. Classification (assigns each input to one of a set of discrete
categories) ii. Regression (establishes the relation between input and output variables; it is
suitable for situations where the output variable is a real or continuous value)
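As an illustration of regression, the relation between one input variable and a continuous output can be estimated with an ordinary least-squares line. This is a minimal sketch with made-up data that lies exactly on y = 2x + 1; `fit_line` is a name chosen for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # hypothetical outputs
slope, intercept = fit_line(xs, ys)
print(slope, intercept)       # 2.0 1.0
print(slope * 5 + intercept)  # prediction for a new input x = 5: 11.0
```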
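Tree → a hierarchical structure used to represent data.
The place where data resides is called a node.
The lines going out of the nodes are called edges or branches; each one connects a node on an
upper level to a node on the level below it.
Root node (the top node, which has no parent), parent node, child nodes, leaf (a node with no children)
Binary search tree → each node has at most two children; values in the left subtree are smaller than the node's value and values in the right subtree are larger.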
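The terms above (node, root, parent, child, leaf) map directly onto a small binary search tree. The following is a minimal sketch; the class and function names are chosen for illustration and duplicates are simply ignored.

```python
class Node:
    """One node of the tree: holds a value and links (edges) to two children."""

    def __init__(self, value):
        self.value = value
        self.left = None   # child with smaller values
        self.right = None  # child with larger values


def insert(root, value):
    """Insert a value and return the (possibly new) root of the subtree."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root  # equal values are ignored


def contains(root, value):
    """Walk down from the root, going left or right at each node."""
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False


def is_leaf(node):
    """A leaf is a node without children."""
    return node.left is None and node.right is None


root = None
for v in [8, 3, 10, 1, 6]:
    root = insert(root, v)

print(root.value)            # 8 (the root node)
print(contains(root, 6))     # True
print(contains(root, 7))     # False
print(is_leaf(root.right))   # True: 10 has no children
```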
Decision Tree
ML is all about training models to predict outcomes based on large amounts of data (data sets). An
ML algorithm builds and trains the tree from the provided data set.
What kinds of problems are decision trees suitable for? → We can use them for both classification
and regression.
Decision trees can work with both numerical and categorical features.
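How an algorithm might build a classification tree from a data set can be sketched in a few lines. The notes do not name a splitting criterion, so this example assumes Gini impurity and greedy threshold splits on numerical features only; the data, the function names, and the `max_depth` parameter are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 means every label in the group is the same."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Greedily pick the (feature, threshold) with the lowest weighted impurity."""
    best = None  # (impurity, feature index, threshold)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must send data to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return best


def build(rows, labels, depth=0, max_depth=3):
    """Return a nested dict for internal nodes, or a class label for leaves."""
    split = best_split(rows, labels) if depth < max_depth and gini(labels) > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    _, feature, threshold = split
    left = [(r, lab) for r, lab in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, lab) for r, lab in zip(rows, labels) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build([r for r, _ in left], [lab for _, lab in left], depth + 1, max_depth),
        "right": build([r for r, _ in right], [lab for _, lab in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the branches from the root until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree


# Hypothetical data: [age, income in thousands] -> buys the product?
rows = [[25, 30], [30, 40], [35, 80], [45, 90], [50, 35], [55, 85]]
labels = ["no", "no", "yes", "yes", "no", "yes"]
tree = build(rows, labels)
print(tree)                     # {'feature': 1, 'threshold': 40, 'left': 'no', 'right': 'yes'}
print(predict(tree, [40, 75]))  # yes
```

The tree ignores age entirely because income alone separates the classes, and the printed structure can be read directly as a rule (income above 40 means "yes").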
Pros
• White-box model (simple for the scientist to understand and interpret), in contrast to a
black-box model such as a neural network.
• Built-in feature selection.
• Require little data preprocessing and preparation.
• Perform well on large datasets (low computational complexity in the prediction phase).
Cons
• Prone to overfitting.
• Unstable: small variations in the data can produce a very different tree.
• No guarantee of a globally optimal tree, because construction is greedy.
• Unbalanced datasets can bias the tree toward the majority class.