Lecture bsmd -Introduction to ML
CODE : TEE
Lecturer: MASESE CHUMA-
CONTACT: 0701260004/0725999196
Objective
By the end of this session, students will be able to:
i. Discuss the concepts of MACHINE LEARNING
--------------------------------------------------------------------------------------------------
WHAT IS MACHINE LEARNING?
Machine Learning is the field of study that gives computers the capability to learn without being explicitly
programmed. ML is one of the most exciting technologies one is likely to come across. As is
evident from the name, it gives the computer the ability that makes it more similar to humans: the ability to learn.
Machine learning is actively used today, perhaps in many more places than one would expect.
It has been the quest of computer scientists from the earliest days to discover whether computers can learn,
for example:
• Learning from medical records which treatments are most effective for new diseases
• Homes or office buildings learning to optimize energy costs based on the particular usage patterns of the
occupants
• Personal software assistants learning the evolving interests of their users in order to highlight relevant
information on websites visited
• Discovering new knowledge hidden in massive datasets
• And so on...
• Machine learning is about extracting knowledge from data. It can be defined as follows: machine
learning is a subfield of artificial intelligence that enables machines to learn from past data or
experiences without being explicitly programmed.
• Machine learning is the science (and art) of programming computers so that they can learn from data.
• Machine learning is the field of study that gives computers the ability to learn without being explicitly
programmed. (Arthur Samuel, 1959)
• A good start at a Machine Learning definition is that it is a core sub-area of Artificial Intelligence (AI).
ML applications learn from experience (that is, from data), as humans do, without direct programming. When
exposed to new data, these applications learn, grow, change, and develop by themselves. In other
words, with Machine Learning, computers find insightful information without being told where to
look. Instead, they do this by leveraging algorithms that learn from data in an iterative process.
• A computer program is said to learn from experience E with respect to some task T and some
performance measure P, if its performance on T, as measured by P, improves with experience E. (Tom
Mitchell, 1997)
Example: a spam filter learns to flag spam given examples of spam emails flagged by users and examples of
non-spam emails (also called ham).
The examples that the system uses to learn are called the training set.
In this case, the task T is to flag spam for new emails, the experience E is the training data, and the performance
measure P needs to be defined.
You can use the ratio of correctly classified emails as your P. This particular performance measure is called
accuracy.
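The accuracy measure from the spam filter example can be sketched in a few lines of Python. This is only an
illustration: the labels and predictions are made-up values, not real data, and `accuracy` is a hypothetical
helper name.

```python
def accuracy(true_labels, predicted_labels):
    """Performance measure P: the ratio of correctly classified emails."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Made-up example: what users flagged vs. what the filter predicted.
true_labels = ["spam", "ham", "ham", "spam", "ham"]
predicted_labels = ["spam", "ham", "spam", "spam", "ham"]

print(accuracy(true_labels, predicted_labels))  # 4 of 5 correct -> 0.8
```

If the filter learns, this number should go up as it sees more training examples (experience E).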
Other examples:
• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine learning applications for
everyday life.
• Knowing what customers are saying about you on Twitter? Machine learning combined with linguistic rule
creation.
• Fraud detection? One of the more obvious, important uses in our world today.
How Does Machine Learning Work?
Machine Learning is, undoubtedly, one of the most exciting subsets of Artificial Intelligence. It learns from
the specific data that is fed into the machine. It is important to understand what makes
Machine Learning work and, thus, how it can be used in the future.
The Machine Learning process starts with inputting training data into the selected algorithm. The training data
may be known or unknown data, and it is used to develop the final Machine Learning algorithm. The type of
training data does affect the algorithm, and that concept will be covered shortly.
To test whether the algorithm works correctly, new input data is fed into it, and the predictions are checked
against the expected results.
If the predictions are not as expected, the algorithm is re-trained, possibly many times, until the desired
output is obtained. This enables the Machine Learning algorithm to keep learning on its own and to produce
answers whose accuracy gradually increases over time.
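The train, check, and re-train cycle described above can be sketched as follows. This is a minimal
illustration under stated assumptions, not a real ML system: the model has a single weight, the data is made
up to follow y = 2x, and all names (`train_one_round`, `mean_abs_error`, `tolerance`) are hypothetical.

```python
def predict(w, x):
    """A deliberately tiny model: one weight, prediction = w * x."""
    return w * x

def train_one_round(w, data, learning_rate=0.05):
    """One gradient-descent step on the mean squared error over the training data."""
    gradient = sum(2 * (predict(w, x) - y) * x for x, y in data) / len(data)
    return w - learning_rate * gradient

def mean_abs_error(w, data):
    """Check the predictions against the known answers."""
    return sum(abs(predict(w, x) - y) for x, y in data) / len(data)

# Made-up data following y = 2x.
training_data = [(1, 2), (2, 4), (3, 6), (4, 8)]
new_data = [(5, 10), (6, 12)]   # new input data used to check the model

w = 0.0             # untrained model
tolerance = 0.01    # the "desired output": average error no larger than this
max_rounds = 100    # safety limit so the loop always ends
rounds = 0
while mean_abs_error(w, new_data) > tolerance and rounds < max_rounds:
    w = train_one_round(w, training_data)   # re-train
    rounds += 1

print(rounds, round(w, 3), mean_abs_error(w, new_data))
```

Each pass through the loop is one round of re-training; the loop stops once the predictions on the new data
are close enough to the expected values.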
Traditional approach
Consider how you would write a spam filter using traditional programming techniques.
1. You might notice that some words or phrases (such as "4U", "credit card", "free" and "amazing") tend to
come up a lot in the subject. Perhaps you would also notice a few other patterns in the sender's name and
the email's body.
2. You would write a detection algorithm for each pattern that you noticed, and your program would flag
emails as spam if a number of these patterns are detected.
3. You would test your program, and repeat steps 1 and 2 until it is good enough.
A spam filter based on machine learning instead learns such patterns from examples of spam and ham.
The ML program will be much shorter, easier to maintain, and most likely more accurate.
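The traditional, hand-written approach can be sketched as follows. The phrase list comes from the examples in
the text; the function names and the threshold of two matching patterns are assumptions made for
illustration only.

```python
# Hand-written patterns, taken from the phrases mentioned in the text.
SUSPICIOUS_PHRASES = ["4u", "credit card", "free", "amazing"]

def count_matches(subject):
    """Count how many of the hand-written patterns occur in the subject line."""
    subject = subject.lower()
    return sum(phrase in subject for phrase in SUSPICIOUS_PHRASES)

def is_spam(subject, threshold=2):
    """Flag the email as spam if at least `threshold` patterns are detected."""
    return count_matches(subject) >= threshold

print(is_spam("Amazing FREE credit card 4U"))  # True: all four patterns match
print(is_spam("Minutes of Monday's meeting"))  # False: no pattern matches
```

Every new trick used by spammers means another entry in the list or another rule, which is why such programs
tend to grow long and hard to maintain. Note also that plain substring matching is crude: "free" would match
"freedom" as well.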
To summarize, machine learning is well suited to:
• Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one machine
learning algorithm can often simplify code and perform better.
• Complex problems for which there is no good solution at all using a traditional approach: the best
machine learning techniques can find a solution.
• Fluctuating environments: a machine learning system can adapt to new data.
• Getting insights about complex problems and large amounts of data.
Machine learning is a buzzword in today's technology, and it is growing rapidly. We use machine learning
in our daily lives, often without knowing it, in products such as Google Maps, Google Assistant, and
Alexa. Below are some of the most prominent real-world applications of Machine Learning:
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to identify objects,
persons, places, and so on in digital images. A popular use case of image recognition and face detection
is automatic friend-tagging suggestion:
Facebook provides a feature that suggests friends to tag. Whenever we upload a photo with our
Facebook friends, we automatically get a tagging suggestion with their names, and the technology behind this
is machine learning's face detection and recognition algorithms.
It is based on the Facebook project named "DeepFace," which is responsible for face recognition and person
identification in the picture.
2. Speech Recognition
When using Google, we get the option to "Search by voice". This comes under speech recognition, a
popular application of machine learning.
Speech recognition is the process of converting voice instructions into text; it is also known as "speech to
text" or "computer speech recognition." At present, machine learning algorithms are widely used in
speech recognition applications. Google Assistant, Siri, Cortana, and Alexa use speech
recognition technology to follow voice instructions.
3. Traffic prediction:
If we want to visit a new place, we use Google Maps, which shows us the shortest route and predicts the
traffic conditions.
It predicts whether traffic is clear, slow-moving, or heavily congested using two sources:
o Real-time location of vehicles from the Google Maps app and sensors
o Average time taken on past days at the same time of day
Everyone who uses Google Maps helps to make the app better: it takes information from the user
and sends it back to its database to improve performance.
4. Product recommendations:
Machine learning is widely used by e-commerce and entertainment companies such as Amazon and Netflix
to recommend products to the user. Whenever we search for a product on Amazon, we start getting
advertisements for the same product while surfing the internet on the same browser, and this is because of
machine learning.
Google infers the user's interests using various machine learning algorithms and suggests products
accordingly.
Similarly, when we use Netflix, we find recommendations for series, movies, and so on, and
this is also done with the help of machine learning.
5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars, in which machine learning
plays a significant role. Tesla, a well-known car manufacturer, is working on self-driving cars. It uses
machine learning to train car models to detect people and objects while driving.
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is automatically filtered as important, normal, or spam. Important mail
arrives in our inbox marked with the important symbol, and spam goes to our spam box; the technology
behind this is machine learning. Below are some spam filters used by Gmail:
o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes
classifier are used for email spam filtering and malware detection.
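To give a feel for how one of these algorithms works, here is a minimal Naïve Bayes spam classifier written
in plain Python. It is a teaching sketch under stated assumptions, not how Gmail or any real filter is
implemented: the four training emails are made up, words are split on spaces only, and the function names are
hypothetical.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """examples: list of (text, label). Returns word counts and email counts per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    email_counts = Counter()
    for text, label in examples:
        email_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, email_counts

def classify(text, word_counts, email_counts):
    """Return the label with the highest log-probability for the given text."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_emails = sum(email_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(email_counts[label] / total_emails)   # log prior
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not give probability zero.
            p = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training set: examples of spam and ham.
training_set = [
    ("free credit card offer", "spam"),
    ("amazing free prize 4u", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lecture notes for monday", "ham"),
]
word_counts, email_counts = train_naive_bayes(training_set)
print(classify("free prize offer", word_counts, email_counts))       # spam
print(classify("monday lecture agenda", word_counts, email_counts))  # ham
```

Unlike a rule-based filter, nothing here names a specific spam word: the classifier learns which words are
typical of spam from the training set.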
7. Virtual Personal Assistant:
We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name
suggests, they help us find information using voice instructions. These assistants can help us in
various ways through voice instructions alone, such as playing music, calling someone, opening an email, or
scheduling an appointment.
Machine learning algorithms are an important part of these virtual assistants.
The assistants record our voice instructions, send them to a server in the cloud, decode them using ML
algorithms, and act accordingly.
8. Online Fraud Detection:
Machine learning makes our online transactions safe and secure by detecting fraudulent transactions. Whenever
we perform an online transaction, fraud can take place in various ways, such as fake accounts, fake IDs, and
money stolen in the middle of a transaction. To detect this, a feed-forward neural network checks whether a
transaction is genuine or fraudulent.
For each genuine transaction, the output is converted into hash values, and these values become the input
for the next round. Each genuine transaction follows a specific pattern, which changes for a fraudulent
transaction; the network detects this change and so makes our online transactions more secure.
9. Stock Market Trading:
Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups and
downs in share prices, so a long short-term memory (LSTM) neural network is used to predict stock market
trends.
10. Medical Diagnosis:
In medical science, machine learning is used for disease diagnosis. With this, medical technology is growing
very fast and is able to build 3D models that can predict the exact position of lesions in the brain.
11. Automatic Language Translation:
Nowadays, if we visit a new place and do not know the language, it is not a problem: machine learning
helps us by converting text into a language we know. Google's GNMT (Google Neural Machine Translation)
provides this feature; it is a neural machine translation system that translates text into our familiar
language, and this is called automatic translation.
The technology behind automatic translation is a sequence-to-sequence learning algorithm, which can be
combined with image recognition to translate text from one language to another.
Artificial Intelligence (AI) | Machine Learning (ML)
Develop an intelligent system that performs a variety of complex jobs. | Construct machines that can only accomplish the jobs for which they have been trained.
AI has a broad variety of applications. | ML allows systems to learn new things from data.
2. Evaluation
Evaluation involves assessing the performance of machine learning models. It includes methods for measuring
how well a model generalizes to unseen data and how accurately it predicts outcomes. Evaluation metrics vary
depending on the specific task and objectives of the machine learning project. Common evaluation techniques
include cross-validation and holdout validation.
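The two evaluation techniques named above differ only in how the data is divided. The sketch below shows the
splitting step in plain Python; the function names are hypothetical, the data is a stand-in list of numbers,
and in practice the data would normally be shuffled first.

```python
def holdout_split(data, test_fraction=0.25):
    """Holdout validation: keep the last part of the data aside for testing."""
    n_test = int(len(data) * test_fraction)
    return data[:len(data) - n_test], data[len(data) - n_test:]

def k_fold_splits(data, k):
    """Cross-validation: yield (train, test) pairs; each fold is the test set once."""
    fold_size = len(data) // k
    for i in range(k):
        start = i * fold_size
        # The last fold also takes any leftover items.
        end = len(data) if i == k - 1 else start + fold_size
        yield data[:start] + data[end:], data[start:end]

data = list(range(8))   # stand-in for 8 labelled examples

train, test = holdout_split(data)
print(train, test)      # [0, 1, 2, 3, 4, 5] [6, 7]

for train, test in k_fold_splits(data, k=4):
    print(train, test)  # each example appears in exactly one test fold
```

With holdout validation the model is trained once and tested once. With cross-validation it is trained and
tested k times, and the k scores are averaged to give a more stable estimate of performance on unseen data.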
3. Optimization
Optimization is the process of refining and improving machine learning models to enhance their performance.
It involves adjusting model parameters or hyperparameters to minimize errors or maximize accuracy.
Optimization techniques aim to find the best possible model for a given task by iteratively adjusting the model
based on feedback from the evaluation process.
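As a small illustration of optimization driven by evaluation, the sketch below tries several values of one
setting (a decision threshold) and keeps the value that scores best. The validation data, the candidate
values, and the name `threshold_accuracy` are all made up for this example.

```python
def threshold_accuracy(threshold, data):
    """Fraction of examples classified correctly by the rule `score >= threshold`."""
    correct = sum((score >= threshold) == is_spam for score, is_spam in data)
    return correct / len(data)

# Made-up validation data: (spam score produced by some model, true label).
validation_data = [
    (0.1, False), (0.3, False), (0.4, False),
    (0.6, True), (0.7, True), (0.9, True),
]

# Candidate values for the setting being tuned.
candidates = [0.2, 0.5, 0.8]

# Evaluate every candidate and keep the one with the best accuracy.
best_threshold = max(candidates, key=lambda t: threshold_accuracy(t, validation_data))
print(best_threshold, threshold_accuracy(best_threshold, validation_data))  # 0.5 1.0
```

Trying every candidate in a fixed list is the simplest form of search. Other optimization methods follow
the same pattern of adjusting, evaluating, and keeping what works best.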
Although machine learning is used across many industries and helps organizations make more informed,
data-driven choices that are more effective than classical methodologies, it still has many problems that
cannot be ignored. Here are some common issues that professionals face when building ML skills and
creating an application from scratch.
Further, if we use non-representative training data, the model makes less accurate predictions. A
machine learning model is said to be ideal if it generalizes well, that is, if it makes accurate
predictions for new cases. If there is too little training data, the sample is affected by sampling noise, and
the result is a non-representative training set. A model trained on it will not be accurate in its predictions,
and it will be biased against one class or group.
Hence, we should use representative training data to protect against bias and to make accurate
predictions without any drift.
Overfitting is one of the most common issues faced by Machine Learning engineers and data scientists.
It occurs when a machine learning model fits its training data too closely and starts capturing the noise and
inaccurate entries in the training data set. The model then performs well on the training data but poorly on
new data. A skewed training set has a similar negative effect. Let's understand this with a simple example
where the training data set contains 1000 mangoes, 1000 apples, 1000 bananas, and 5000 papayas. Then there
is a considerable probability that an apple is identified as a papaya, because the training data set is
heavily biased towards papayas; hence prediction is negatively affected. Overfitting is more likely with
flexible, non-linear methods, because they can build unrealistic models of the data. Simpler models, such as
linear and parametric algorithms, are less prone to it.
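The effect of fitting noise can be seen in a small experiment. This is a sketch under stated assumptions: the
data is made up, two training labels are deliberately wrong to play the role of noise, the "memorizer" simply
copies the label of the closest training example, and the threshold of the simple model is fixed by hand for
brevity rather than learned.

```python
def memorizer_predict(x, training_data):
    """Overfitted model: copy the label of the closest training example."""
    nearest_x, nearest_label = min(training_data, key=lambda pair: abs(pair[0] - x))
    return nearest_label

def simple_predict(x):
    """Simple model: a single threshold."""
    return 1 if x >= 5 else 0

def accuracy(predict, data):
    """Fraction of examples for which the prediction equals the label."""
    return sum(predict(x) == label for x, label in data) / len(data)

# Made-up data. The underlying rule is "label 1 when x >= 5",
# but two training labels are noisy (x = 2 and x = 7 are flipped).
training_data = [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0),
                 (5, 1), (6, 1), (7, 0), (8, 1), (9, 1)]
# New, noise-free examples for checking how well each model generalizes.
test_data = [(x + 0.4, 1 if x + 0.4 >= 5 else 0) for x in range(10)]

memorizer = lambda x: memorizer_predict(x, training_data)
print(accuracy(memorizer, training_data), accuracy(memorizer, test_data))            # 1.0 0.8
print(accuracy(simple_predict, training_data), accuracy(simple_predict, test_data))  # 0.8 1.0
```

The memorizer is perfect on the training set because it has learned the noise as well, and that is exactly
why it does worse on new data. A large gap between training accuracy and test accuracy is the usual sign of
overfitting.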
Methods to reduce overfitting:
Underfitting:
Underfitting is just the opposite of overfitting. When a machine learning model is trained with too little
data, it produces incomplete and inaccurate results, which destroys the accuracy of the
machine learning model.
Underfitting occurs when our model is too simple to capture the underlying structure of the data, just like
undersized trousers. This generally happens when we have limited data in the data set and try to fit a
linear model to non-linear data. In such scenarios, the model is not complex enough: its rules are too
simple for the data set, and it starts making wrong predictions.
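Fitting a straight line to data that follows a curve shows underfitting in miniature. The sketch below uses
made-up points on y = x squared; `fit_line` is a hypothetical helper that computes the ordinary least-squares
line.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return slope, mean_y - slope * mean_x

def mean_squared_error(predict, points):
    """Average squared difference between prediction and true value."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)

# Made-up non-linear data: y = x squared, for x = -3 ... 3.
points = [(x, x * x) for x in range(-3, 4)]

slope, intercept = fit_line(points)
line = lambda x: slope * x + intercept   # linear model
curve = lambda x: x * x                  # a model with the right shape

print(slope, intercept)                   # 0.0 4.0: the best line is flat
print(mean_squared_error(line, points))   # 12.0: large error even on the training data
print(mean_squared_error(curve, points))  # 0.0
```

The line has a large error on the very data it was fitted to. Poor performance on the training data itself is
the usual sign of underfitting, and the remedy is a model that is flexible enough for the data.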
8. Customer Segmentation
Customer segmentation is also an important issue when developing a machine learning algorithm: we need to
distinguish the customers who paid for the recommendations shown by the model from those who do not even
check them. Hence, an algorithm is needed to recognize customer behavior and trigger a relevant
recommendation for the user based on past experience.