ML 1
Well-Posed Learning Problem – A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
A problem can be identified as a well-posed learning problem if it has three traits –
Task
Performance Measure
Experience
Some examples that illustrate well-posed learning problems are –
1. Learning to filter emails as spam or not spam (see the code sketch after this list)
Task – Classifying emails as spam or not spam
Performance Measure – The fraction of emails accurately classified as spam or not spam
Experience – Observing the user label emails as spam or not spam
2. A checkers learning problem
Task – Playing the game of checkers
Performance Measure – Percent of games won against an opponent
Experience – Playing practice games against itself
3. Handwriting Recognition Problem
Task – Recognizing handwritten words within images
Performance Measure – Percent of words accurately classified
Experience – A database of handwritten words with given classifications
4. A Robot Driving Problem
Task – Driving on public four-lane highways using vision sensors
Performance Measure – Average distance traveled before an error
Experience – A sequence of images and steering commands recorded while observing a human driver
5. Fruit Prediction Problem
Task – Recognizing and predicting different kinds of fruit
Performance Measure – The variety of fruits correctly predicted
Experience – Training the machine with large datasets of fruit images
6. Face Recognition Problem
Task – Recognizing different types of faces
Performance Measure – The fraction of faces correctly recognized
Experience – Training the machine with large datasets of different face images
7. Automatic Translation of Documents
Task – Translating a document from one language to another
Performance Measure – The fraction of text correctly translated into the other language
Experience – Training the machine with a large dataset of documents in different languages
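To make the first example concrete, here is a minimal sketch of the spam-filtering problem in T/P/E terms, written with scikit-learn; the tiny toy emails and labels are invented purely for illustration.
```python
# A minimal T/P/E sketch of spam filtering with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Experience E: emails the user has already labeled as spam (1) or not (0)
emails = ["win a free prize now", "meeting at 10am tomorrow",
          "free money click here", "project report attached"]
labels = [1, 0, 1, 0]

# Task T: classify emails as spam or not spam
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

# Performance measure P: fraction of emails classified correctly
print(model.score(emails, labels))          # accuracy on the labeled emails
print(model.predict(["free prize money"]))  # [1] -> classified as spam
```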
Machine Learning
The Machine Learning Tutorial covers both the fundamentals and more complex ideas of machine
learning. Students and professionals in the workforce can benefit from our machine learning
tutorial.
A rapidly developing field of technology, machine learning allows computers to automatically learn
from previous data. For building mathematical models and making predictions based on historical
data or information, machine learning employs a variety of algorithms. It is currently being used for a variety of tasks, including speech recognition, email filtering, auto-tagging on Facebook, recommender systems, and image recognition.
You will learn about the many different methods of machine learning, including reinforcement
learning, supervised learning, and unsupervised learning, in this machine learning tutorial.
Regression and classification models, clustering techniques, hidden Markov models, and various
sequential models will all be covered.
A subset of artificial intelligence known as machine learning focuses primarily on the creation of
algorithms that enable a computer to independently learn from data and previous experiences.
Arthur Samuel first used the term "machine learning" in 1959. It could be summarized as follows:
Without being explicitly programmed, machine learning enables a machine to automatically learn
from data, improve performance from experiences, and predict things.
Machine learning algorithms create a mathematical model that, without being explicitly
programmed, aids in making predictions or decisions with the assistance of sample historical data,
or training data. For the purpose of developing predictive models, machine learning brings
together statistics and computer science. Algorithms that learn from historical data are either
constructed or utilized in machine learning. The performance will rise in proportion to the quantity
of information we provide.
A machine is said to learn if it can improve its performance by gaining more data.
Let's say we have a complex problem in which we need to make predictions. Instead of writing
code, we just need to feed the data to generic algorithms, which build the logic based on the data
and predict the output. Our perspective on the issue has changed as a result of machine learning.
In short, a machine learning algorithm takes historical data as input, builds a model from it, and uses that model to predict the output for new data.
By providing them with a large amount of data and allowing them to automatically explore the data, build models, and predict the required output, we can train machine learning algorithms. A cost function can be used to measure the algorithm's performance, which generally improves with the quantity of data provided; the sketch below shows a cost function driving learning. Using machine learning, we can save both time and money.
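As a minimal sketch of this idea, the snippet below feeds historical data points to a simple learning loop in which a cost function (mean squared error) measures performance and drives improvement; the data and learning rate are illustrative assumptions.
```python
# A minimal sketch: a cost function measures model performance, and the
# model improves by reducing that cost on historical data.
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])   # historical input data
y = np.array([2.1, 4.0, 6.2, 7.9])   # historical outputs (roughly y = 2x)

w, b, lr = 0.0, 0.0, 0.01            # model parameters and learning rate
for _ in range(2000):
    pred = w * X + b
    cost = np.mean((pred - y) ** 2)  # cost function: mean squared error
    # adjust parameters along the gradient of the cost
    w -= lr * np.mean(2 * (pred - y) * X)
    b -= lr * np.mean(2 * (pred - y))

print(w, b)   # close to 2 and 0: the model learned the pattern from the data
```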
The significance of machine learning can be easily understood from its use cases. Currently, machine learning is used in self-driving vehicles, cyber fraud detection, face recognition, friend suggestions on Facebook, and so on. Various top companies such as Netflix and Amazon have built machine learning models that use a huge amount of data to analyze user interest and recommend products accordingly.
The following example shows the importance of Machine Learning:
Example: In a driverless car, training data is fed to the algorithm, such as how to drive the car on highways and on busy, narrow streets, with factors like speed limits, parking, stopping at signals, etc. A logical and mathematical model is then created on the basis of that data, and afterwards the car operates according to this model. Also, the more data that is fed, the more efficient the output produced.
Step 1- Choosing the Training Experience: The first and most important task is to choose the training data or training experience that will be fed to the machine learning algorithm. It is important to note that the data or experience we feed to the algorithm has a significant impact on the success or failure of the model, so the training data or experience should be chosen wisely.
Below are the attributes which impact the success or failure of the model:
The training experience should provide direct or indirect feedback regarding choices. For example, while playing chess, the training experience can provide feedback to the learner such as: if this move were chosen instead of that one, the chances of success would increase.
The second important attribute is the degree to which the learner controls the sequence of training examples. For example, when training data is first fed to the machine, its accuracy is very low, but as it gains experience by playing again and again against itself or an opponent, the algorithm receives feedback and controls the chess game accordingly.
The third important attribute is how well the training experience represents the distribution of examples over which the final performance will be measured. A machine learning algorithm gains experience by going through many different cases and examples; by passing through more and more examples, its performance increases.
Step 2- Choosing the Target Function: The next important step is choosing the target function. According to the knowledge fed to the algorithm, the machine learns a NextMove function which describes what type of legal move should be taken. For example, while playing chess against an opponent, when the opponent plays, the machine learning algorithm decides which of the possible legal moves to take in order to succeed.
Step 3- Choosing a Representation for the Target Function: Once the machine knows all the possible legal moves, the next step is to choose an optimized representation of the target function, e.g. linear equations, a hierarchical graph representation, tabular form, etc. Using this representation, the NextMove function selects the target move, i.e. the legal move that promises the highest success rate; a minimal sketch of the linear-equation representation follows. For example, if the machine has 4 possible moves while playing chess, it chooses the optimized move that provides success to it.
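As a minimal sketch of the linear-equation representation named above (in the spirit of the classic checkers example), the snippet below scores board states with a weighted sum of features; the feature names and weights are hypothetical, not a complete checkers engine.
```python
# A minimal sketch of a linear target-function representation:
# V(board) = w0 + w1*x1 + ... + wn*xn. Features and weights are illustrative.
def board_features(board):
    """Extract numeric features x1..x4 from a board state (hypothetical keys)."""
    return [board["my_pieces"], board["opp_pieces"],
            board["my_kings"], board["opp_kings"]]

def evaluate(board, weights):
    """V(board): weighted sum of features plus a bias term w0."""
    w0, ws = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(ws, board_features(board)))

def next_move(legal_boards, weights):
    """Choose the legal next board state with the highest estimated value."""
    return max(legal_boards, key=lambda b: evaluate(b, weights))

weights = [0.0, 1.0, -1.0, 1.5, -1.5]  # favor own pieces/kings, penalize opponent's
```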
Step 4- Choosing a Function Approximation Algorithm: A learning algorithm is then needed to estimate the parameters of the chosen representation from the training experience, adjusting them so that the estimated values better match the outcomes observed in the training examples.
Step 5- Final Design: The final design is created at last, once the system has gone through many examples, failures and successes, and correct and incorrect decisions, and knows what the next step should be.
Example: Deep Blue is an intelligent computer that won a chess match against chess expert Garry Kasparov, becoming the first computer to beat a human world chess champion.
o Noisy Data - Noise in the data is responsible for inaccurate predictions and affects decisions as well as accuracy in classification tasks.
o Incorrect Data - Incorrect data is likewise responsible for faulty results from machine learning models; hence, it may also affect the accuracy of the results.
o Generalizing of Output Data - Sometimes generalizing output data becomes complex, which results in comparatively poor future actions.
Further, if we use non-representative training data in the model, it results in less accurate predictions. A machine learning model is said to be ideal if it predicts well for generalized cases and provides accurate decisions. If there is too little training data, there will be sampling noise in the model, called a non-representative training set; such a model will be biased toward one class or group and will not be accurate in its predictions. Hence, we should use representative data in training to protect against bias and to make accurate predictions without any drift.
Overfitting:
Overfitting is one of the most common issues faced by machine learning engineers and data scientists. Whenever a machine learning model is trained with a huge amount of data, it may start capturing the noise and inaccuracies present in the training data set, which negatively affects the performance of the model. Consider a simple example where the training data contains 1,000 mangoes, 1,000 apples, 1,000 bananas, and 5,000 papayas. There is then a considerable probability of an apple being identified as a papaya, because of the massive amount of biased data in the training set; the predictions are negatively affected. A main reason behind overfitting is the use of highly flexible non-linear methods, which can build unrealistic data models; it can often be reduced by using simpler linear and parametric algorithms, as the sketch below illustrates.
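The snippet below is a minimal sketch of this contrast: on synthetic data with a linear trend plus noise, a high-degree polynomial (a very flexible non-linear model) typically reaches a much lower training error but a higher test error than a simple linear fit.
```python
# A minimal sketch of overfitting: a flexible high-degree polynomial fits
# the noisy training points almost perfectly but generalizes worse than a
# simple linear model. The data is synthetic, chosen for illustration.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.2, 10)  # linear trend + noise
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test                             # true underlying relation

for degree in (1, 7):                           # simple vs overly flexible model
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # degree 7 typically shows a tiny training error but a larger test error
    print(degree, train_err, test_err)
```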
Underfitting:
Underfitting is just the opposite of overfitting. Whenever a machine learning model is trained with too little data, it produces incomplete and inaccurate predictions, destroying the accuracy of the machine learning model.
Underfitting occurs when the model is too simple to capture the underlying structure of the data, just like an undersized pair of pants. This generally happens when we have limited data in the data set and try to build a linear model with non-linear data. In such scenarios the model loses its complexity, the rules of the machine learning model become too simple to apply to the data set, and the model starts making wrong predictions as well.
8. Customer Segmentation
Customer segmentation is also an important issue while developing a machine learning algorithm: identifying which customers act on the recommendations shown by the model and which do not even check them. An algorithm is therefore needed to recognize customer behavior and trigger a relevant recommendation for the user based on past experience.
9. Process Complexity of Machine Learning
The complexity of the machine learning process is another major issue faced by machine learning engineers and data scientists. Machine learning and artificial intelligence are very new technologies, still in an experimental phase and continuously changing over time. Much of the work proceeds by trial and error, so the probability of error is higher than expected. Furthermore, the process includes analyzing the data, removing data bias, training the data, applying complex mathematical calculations, and so on, making the procedure more complicated and quite tedious.
Classification
Classification deals with predicting categorical target variables, which represent discrete classes
or labels. For instance, classifying emails as spam or not spam, or predicting whether a patient has
a high risk of heart disease. Classification algorithms learn to map the input features to one of the
predefined classes.
Classification algorithms are used to solve classification problems in which the output variable is categorical, such as "Yes" or "No", "Male" or "Female", "Red" or "Blue", etc. The classification algorithms predict the categories present in the dataset. Some real-world examples of classification algorithms are spam detection, email filtering, etc.; a minimal sketch follows below.
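Here is a minimal classification sketch using scikit-learn's built-in iris dataset; the model learns to map input features to one of the predefined classes, and accuracy (the fraction classified correctly) serves as the performance measure.
```python
# A minimal classification sketch: learn a mapping from input features
# to one of the predefined discrete classes.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)            # learn the feature -> class mapping
print(clf.score(X_test, y_test))     # fraction of test samples classified correctly
print(clf.predict(X_test[:3]))       # predicted class labels for new inputs
```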
Regression
Regression, on the other hand, deals with predicting continuous target variables, which represent
numerical values. For example, predicting the price of a house based on its size, location, and
amenities, or forecasting the sales of a product. Regression algorithms learn to map the input
features to a continuous numerical value.
Regression algorithms are used to solve regression problems in which the output variable is continuous; some of them, such as linear regression, assume a linear relationship between the input and output variables. They are used to predict continuous output variables, such as market trends, weather, etc.
Here are some regression algorithms; a minimal sketch using the first of them follows the list:
Linear Regression
Polynomial Regression
Ridge Regression
Lasso Regression
Decision tree
Random Forest
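The following is a minimal sketch of linear regression, the first algorithm in the list above; the tiny house-size/price data is invented purely for illustration.
```python
# A minimal regression sketch: predict a continuous numerical value.
from sklearn.linear_model import LinearRegression

sizes = [[50], [80], [110], [140]]   # input feature: house size in square meters
prices = [150, 240, 330, 420]        # continuous target: price in $1000s

reg = LinearRegression()
reg.fit(sizes, prices)               # learn the feature -> numeric value mapping
print(reg.predict([[100]]))          # predicted price for a 100 m^2 house (~300)
```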
Unsupervised Learning
Clustering
Clustering is the process of grouping data points into clusters based on their similarity. This
technique is useful for identifying patterns and relationships in data without the need for labeled
examples.
Here are some clustering algorithms, with a minimal K-Means sketch after the list (the last two, Principal Component Analysis and Independent Component Analysis, are strictly speaking dimensionality-reduction techniques that are often used alongside clustering):
K-Means Clustering algorithm
Mean-shift algorithm
DBSCAN Algorithm
Principal Component Analysis
Independent Component Analysis
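Here is a minimal K-Means sketch; the 2-D points are synthetic, chosen to form two obvious groups.
```python
# A minimal clustering sketch with K-Means: group unlabeled points by similarity.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1, 1], [1.2, 0.8], [0.9, 1.1],    # group near (1, 1)
                   [8, 8], [8.1, 7.9], [7.8, 8.2]])   # group near (8, 8)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(km.labels_)           # cluster assignment for each point, e.g. [0 0 0 1 1 1]
print(km.cluster_centers_)  # learned cluster centers, near (1,1) and (8,8)
```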
Association
Association rule learning is a technique for discovering relationships between items in a dataset. It
identifies rules that indicate the presence of one item implies the presence of another item with a
specific probability.
Here are some association rule learning algorithms:
Apriori Algorithm
Eclat
FP-growth Algorithm
3. Semi-Supervised Learning
Semi-supervised learning is a machine learning approach that sits between supervised and unsupervised learning, using both labelled and unlabelled data. It is particularly useful when obtaining labeled data is costly, time-consuming, or resource-intensive, or when labeling requires particular skills and resources.
We use these techniques when dealing with data of which a small portion is labeled and the large remaining portion is unlabeled. We can use unsupervised techniques to predict labels and then feed these labels to supervised techniques; the sketch below shows this idea with scikit-learn. The technique is mostly applicable to image data sets, where usually not all images are labeled.
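A minimal sketch of this idea, assuming scikit-learn's LabelSpreading estimator: most labels of the built-in iris dataset are hidden (marked -1 for unlabeled) and the algorithm propagates the few remaining labels to the unlabeled points.
```python
# A minimal semi-supervised sketch: learn from a few labeled points plus
# many unlabeled ones (-1 marks an unlabeled sample).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
y_partial = np.copy(y)
y_partial[rng.random(len(y)) < 0.8] = -1   # hide ~80% of labels

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_partial)                    # learns from labeled + unlabeled points
print((model.transduction_ == y).mean())   # fraction of all points labeled correctly
```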
Association Rule Learning
Association rule learning is one of the important concepts of machine learning, and it is employed in market basket analysis, web usage mining, continuous production, etc. Market basket analysis is a technique used by various big retailers to discover the associations between items. We can understand it by taking the example of a supermarket, where all products that are frequently purchased together are placed together.
For example, if a customer buys bread, he will most likely also buy butter, eggs, or milk, so these products are stored on one shelf or mostly nearby.
Association rule learning can be divided into three types of algorithms:
1. Apriori
2. Eclat
3. FP-Growth Algorithm
In a rule of the form "If X, then Y", the "if" element X is called the antecedent and the "then" statement Y is called the consequent. Relationships in which we find an association between two single items are known as single cardinality; as the number of items in a rule increases, the cardinality increases accordingly. To measure the associations between thousands of data items, there are several metrics, given below:
o Support
o Confidence
o Lift
Support
Support is the frequency with which an itemset X appears in the dataset: the ratio of the number of transactions containing X to the total number of transactions, i.e. Support(X) = freq(X) / N.
Confidence
Confidence indicates how often the rule has been found to be true: how often items X and Y occur together in the dataset, given that X occurs. It is the ratio of the number of transactions that contain both X and Y to the number of transactions that contain X, i.e. Confidence(X -> Y) = Support(X and Y) / Support(X).
Lift
Lift measures the strength of a rule. It is the ratio of the observed support to the support expected if X and Y were independent of each other: Lift(X -> Y) = Support(X and Y) / (Support(X) * Support(Y)). It has three possible ranges of values:
o Lift = 1: The occurrence of the antecedent and the consequent are independent of each other.
o Lift > 1: It indicates the degree to which the two itemsets are dependent on each other.
o Lift < 1: It tells us that one item is a substitute for the other, which means one item has a negative effect on the other.
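As a minimal sketch, these three metrics can be computed directly in plain Python for the hypothetical rule "bread -> butter" over a toy list of transactions (the data is invented for illustration):
```python
# A minimal sketch computing support, confidence, and lift for the rule
# "bread -> butter" over a toy list of transactions.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
n = len(transactions)

def support(items):
    """Fraction of transactions containing all the given items."""
    return sum(items <= t for t in transactions) / n

sup_x, sup_y = support({"bread"}), support({"butter"})
sup_xy = support({"bread", "butter"})

confidence = sup_xy / sup_x        # P(butter | bread) = 0.5 / 0.75
lift = sup_xy / (sup_x * sup_y)    # > 1 here, so the itemsets are dependent
print(sup_xy, confidence, lift)    # 0.5, 0.666..., 1.333...
```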
Apriori Algorithm
This algorithm uses frequent itemsets to generate association rules. It is designed to work on databases that contain transactions. The algorithm uses a breadth-first search and a hash tree to count candidate itemsets efficiently.
It is mainly used for market basket analysis and helps to understand the products that can be bought together. It can also be used in the healthcare field to find drug reactions for patients. A minimal usage sketch follows.
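A minimal usage sketch, assuming the third-party mlxtend library (its apriori and association_rules helpers) and a small invented one-hot basket table:
```python
# A minimal Apriori sketch for market basket analysis with mlxtend
# (pip install mlxtend); the basket data is invented for illustration.
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# rows = transactions, columns = items (True if the item was bought)
baskets = pd.DataFrame(
    [[True, True, False],
     [True, True, True],
     [True, False, True],
     [False, True, True]],
    columns=["bread", "butter", "milk"],
)

frequent = apriori(baskets, min_support=0.5, use_colnames=True)  # frequent itemsets
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```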
Eclat Algorithm
Eclat stands for Equivalence Class Transformation. This algorithm uses a depth-first search technique to find frequent itemsets in a transaction database, and it generally executes faster than the Apriori algorithm.
Applications of association rule learning include:
o Market Basket Analysis: This is one of the most popular examples and applications of association rule mining. The technique is commonly used by big retailers to determine the associations between items.
o Medical Diagnosis: Association rules help in identifying the probability of illness for a particular disease, which supports diagnosing and treating patients.
o Protein Sequence: Association rules help in determining the synthesis of artificial proteins.
o Association rule learning is also used for catalog design, loss-leader analysis, and many other applications.