Benjamin Presentation Writeup
A RESEARCH CONDUCTED ON
Chapter 1:
Chapter 2:
Chapter 3:
Chapter 4:
Chapter 5:
5.0 Summary
5.1 Conclusion
ABSTRACT
Machine learning has experienced significant evolution due to advancements
in technology and the development of sophisticated algorithms. Early machine
learning efforts were limited by computational power and data availability, but as
technology advanced, so did the capabilities of machine learning systems.
Technological advancements such as improved hardware, faster processors, and
increased storage capacities have enabled the processing of larger datasets and the
execution of more complex algorithms.
Another critical factor in the evolution of machine learning is the development of
algorithms. Early algorithms were relatively simple, focusing on linear
relationships and basic pattern recognition. Over time, researchers developed more
complex algorithms, including neural networks, decision trees, and support vector
machines, which allowed for better handling of non-linear relationships and more
sophisticated pattern recognition. This write-up is concerned with machine
learning and its vast applications in Computational Science.
CHAPTER ONE
1.0 DESCRIPTION OF MACHINE LEARNING: Machine learning (ML) is a
branch of artificial intelligence (AI) that enables computers to “self-learn” from
training data and improve over time, without being explicitly programmed.
Machine learning algorithms are able to detect patterns in data and learn from
them, in order to make their own predictions. In short, machine learning algorithms
and models learn through experience.
In traditional programming, a computer engineer writes a series of directions that
instruct a computer how to transform input data into a desired output. Instructions
are mostly based on an IF-THEN structure: when certain conditions are met, the
program executes a specific action.
Machine learning, on the other hand, is an automated process that enables
machines to solve problems with little or no human input, and take actions based
on past observations.
While artificial intelligence and machine learning are often used interchangeably,
they are two different concepts. AI is the broader concept – machines making
decisions, learning new skills, and solving problems in a similar way to humans –
whereas machine learning is a subset of AI that enables intelligent systems to
autonomously learn new things from data.
Instead of programming machine learning algorithms to perform tasks, you can
feed them examples of labeled data (known as training data), which helps them
make calculations, process data, and identify patterns automatically.
Machine learning can be put to work on massive amounts of data and, on many such
tasks, can perform more accurately than humans. It can save time and money on
tasks and analyses such as identifying customer pain points to improve customer
satisfaction, automating support tickets, and mining data from internal sources
and across the internet.
1.1 BRIEF HISTORY AND EVOLUTION OF MACHINE LEARNING:
The history of machine learning dates back to the 1950s and 1960s when
researchers in artificial intelligence (AI) began exploring ways to enable machines
to learn from data. One of the earliest attempts was the creation of programs that
could play games like chess and checkers. These early efforts laid the groundwork
for more advanced machine learning techniques by demonstrating that machines
could be programmed to perform tasks traditionally thought to require human
intelligence.
In the 1950s, Alan Turing, a pioneer in computer science, proposed the concept of
a “learning machine” that could modify its behavior based on experience. This idea
was a precursor to modern machine learning, emphasizing the potential for
machines to improve their performance over time. Turing’s work, along with that
of other early AI researchers, set the stage for future developments in the field.
The 1960s saw the development of some of the first machine learning algorithms,
including the nearest neighbor algorithm. Researchers began to understand the
potential of using statistical methods and probability theory to enable machines to
make decisions based on data. This period marked the beginning of a shift from
rule-based systems to data-driven approaches in AI research.
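The nearest neighbor idea mentioned above is simple enough to sketch directly: classify a new point with the label of its closest training example. This is only an illustration, not a historical implementation, and the 2-D points and labels below are made up.

```python
import math

def nearest_neighbor(train, query):
    """Classify `query` with the label of the closest training point.
    `train` is a list of ((x, y), label) pairs."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    closest = min(train, key=lambda pair: dist(pair[0], query))
    return closest[1]

# Hypothetical 2-D points: class "A" clusters near the origin, "B" near (5, 5).
train = [((0, 0), "A"), ((1, 0), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(nearest_neighbor(train, (0.5, 0.5)))  # → A
print(nearest_neighbor(train, (5.5, 4.8)))  # → B
```

The "decision based on data" here is nothing more than a distance comparison, which is exactly what made this an early, tractable statistical method.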
In 1957, Frank Rosenblatt introduced the perceptron algorithm, one of the earliest
breakthroughs in machine learning. The perceptron is a type of artificial neuron
that serves as a building block for neural networks. Rosenblatt’s work
demonstrated that machines could learn to recognize patterns and make decisions
based on input data, paving the way for future developments in neural network
research.
The perceptron algorithm works by adjusting the weights of the input features to
minimize the error on classification tasks. This adjustment is made through an
error-driven update rule, closely related to gradient descent, which iteratively
updates the weights based on the error of the predictions. Although the original
perceptron was limited to linear
decision boundaries, it laid the foundation for more complex neural network
architectures that could handle non-linear relationships.
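The weight-adjustment process described above can be sketched in a few lines. The task below (learning the logical AND function, which is linearly separable) and the learning rate and epoch count are arbitrary choices for illustration.

```python
def train_perceptron(samples, epochs=10, lr=0.1):
    """Perceptron learning rule: nudge the weights toward examples
    the current model misclassifies."""
    w = [0.0, 0.0]   # weights for the two inputs
    b = 0.0          # bias term
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred        # 0 if correct, +1 or -1 if wrong
            w[0] += lr * err * x1      # adjust each weight by the error
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# Logical AND: output 1 only when both inputs are 1 (linearly separable).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
for (x1, x2), target in data:
    assert (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == target
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop eventually classifies every example correctly; a task like XOR, which is not linearly separable, would never converge, which is exactly the limitation noted above.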
Rosenblatt’s perceptron sparked significant interest in the field of machine learning
and led to the development of multilayer perceptrons (MLPs) and backpropagation,
techniques that enabled the training of deeper neural networks. These
advancements allowed for more accurate and efficient learning from data, further
advancing the capabilities of machine learning systems.
The 1980s and 1990s marked a period of resurgence in machine learning, driven
by the introduction of new techniques and algorithms. One of the key
developments during this time was the backpropagation algorithm, which enabled
the training of deep neural networks. Backpropagation allows for the efficient
calculation of gradients, making it possible to train multilayer perceptrons with
calculation of gradients, making it possible to train multilayer perceptrons with
multiple hidden layers.
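What backpropagation computes is the gradient of the loss with respect to every weight, by applying the chain rule layer by layer. A toy illustration with a single hidden unit follows; the network shape, starting weights, and learning rate are all arbitrary, and the check is simply that one gradient-descent step reduces the error.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# A one-input, one-hidden-unit, one-output network: y = v * sigmoid(w*x + b) + c.
w, b, v, c = 0.5, 0.0, 0.5, 0.0   # arbitrary starting weights
x, t = 1.0, 1.0                   # one training example (input, target)
lr = 0.1

def loss(w, b, v, c):
    y = v * sigmoid(w * x + b) + c
    return 0.5 * (y - t) ** 2

# Forward pass.
h = sigmoid(w * x + b)
y = v * h + c
before = 0.5 * (y - t) ** 2

# Backward pass: propagate the error back through each layer (chain rule).
dy = y - t                  # dL/dy
dv, dc = dy * h, dy         # gradients for the output-layer parameters
dh = dy * v                 # error flowing back into the hidden unit
dw = dh * h * (1 - h) * x   # sigmoid'(z) = h * (1 - h)
db = dh * h * (1 - h)

# One gradient-descent step.
w, b, v, c = w - lr * dw, b - lr * db, v - lr * dv, c - lr * dc
after = loss(w, b, v, c)
assert after < before   # the step reduced the training error
```

The efficiency claim in the text comes from reusing intermediate quantities (here `dy` and `dh`) instead of differentiating each weight from scratch, which is what makes training networks with many hidden layers feasible.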
Another significant advancement was the development of support vector machines
(SVMs) by Vladimir Vapnik and his colleagues. SVMs are powerful classification
algorithms that work by finding the optimal hyperplane that separates data points
of different classes. They are particularly effective in high-dimensional spaces and
have become a staple in the machine learning toolkit.
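Training an SVM involves solving an optimization problem for the maximum-margin hyperplane, which is beyond a short sketch, but the resulting classifier is simple: the sign of w · x + b. The hyperplane below is hypothetical (chosen by hand, not learned).

```python
def svm_predict(w, b, x):
    """Classify by which side of the hyperplane w . x + b = 0 the point is on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical separating hyperplane x1 + x2 - 6 = 0 (assumed already found).
w, b = [1.0, 1.0], -6.0
print(svm_predict(w, b, [5, 5]))   # → 1  (above the hyperplane)
print(svm_predict(w, b, [1, 2]))   # → -1 (below it)
```

The "optimal" hyperplane the text refers to is the one of this form that leaves the widest margin between the two classes.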
The 1980s also saw the rise of decision tree algorithms and ensemble methods such
as bagging and boosting. These techniques improved the accuracy and robustness
of machine learning models by combining the predictions of multiple models. The
development of these methods laid the groundwork for the later success of
algorithms like random forests and gradient boosting machines.
CHAPTER TWO
2.0 KEY ASPECTS OF MACHINE LEARNING:
There are three key aspects of Machine Learning, which are as follows:
i. Task: the main problem we are interested in solving. The task may involve
predictions, recommendations, estimations, and so on.
ii. Experience: learning from historical or past data, which is then used to
estimate and resolve future tasks.
iii. Performance: the capacity of a machine to resolve a machine learning task
and provide the best possible outcome. Performance depends on the type of
machine learning problem.
2.1 Techniques in Machine Learning
Machine Learning techniques are divided mainly into the following four categories:
1. Supervised Learning
Supervised learning applies when a machine has sample data, i.e., input data
paired with the correct output labels. The labels are used to check the
correctness of the model's predictions. Supervised learning helps us predict
future events with the help of past experience and labeled examples: the
algorithm first analyses a known training dataset and then produces an inferred
function that makes predictions about output values. It can also measure its
errors during this learning process and correct them.
Example: Let's assume we have a set of images tagged as "dog". A machine
learning algorithm is trained on these dog images so that it can distinguish
whether or not an image contains a dog.
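The supervised pattern just described, learn from labeled examples and then predict, can be sketched with a deliberately simple model: a one-feature threshold classifier (a "decision stump"). The single numeric feature and its labels below are invented purely for illustration.

```python
def train_stump(samples):
    """Pick the threshold on a single feature that misclassifies the
    fewest labeled training examples (predict 1 when value >= threshold)."""
    candidates = sorted(x for x, _ in samples)
    def errors(t):
        return sum((1 if x >= t else 0) != y for x, y in samples)
    return min(candidates, key=errors)

# Made-up labeled training data: (feature value, label); label 1 = "dog".
train = [(0.1, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
threshold = train_stump(train)
predict = lambda x: 1 if x >= threshold else 0
print(predict(0.2), predict(0.7))  # → 0 1
```

The correct labels play exactly the role described above: they let the learner score each candidate threshold and keep the one whose predictions are most often correct.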
2. Unsupervised Learning
In unsupervised learning, a machine is trained with input samples only, while
the outputs (labels) are unknown. The training data is neither classified nor
labeled; hence, the machine may not always produce output as reliably as in
supervised learning.
Although unsupervised learning is less common in practical business settings, it
helps in exploring data and can draw inferences that describe hidden structures
in unlabeled datasets.
Example: Let's assume a machine is given a set of documents belonging to
different categories (Type A, B, and C) and has to organize them into
appropriate groups. Because the machine receives only the input samples, with
no output labels, it can group the documents into Type A, Type B, and Type C
clusters, but there is no guarantee that the grouping is correct.
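Grouping data without labels, as in the example above, is often done with clustering. A minimal sketch using k-means on invented one-dimensional feature values: each point is assigned to its nearest center, and each center then moves to the mean of its group. The data and starting centers are arbitrary.

```python
def kmeans_1d(values, centers, iters=10):
    """Group numbers around k centers: assign each value to its nearest
    center, then move each center to the mean of its group."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Invented 1-D features; no labels are given, the structure is discovered.
values = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centers, groups = kmeans_1d(values, centers=[0.0, 10.0])
print(sorted(groups[0]), sorted(groups[1]))  # → [0.8, 1.0, 1.2] [7.9, 8.0, 8.3]
```

Note that the algorithm discovers two groups but has no idea what they mean, which mirrors the caveat above: the grouping may or may not match the true categories.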
3. Reinforcement Learning
Reinforcement Learning is a feedback-based machine learning technique. In this
type of learning, agents (computer programs) explore the environment,
perform actions, and on the basis of their actions, they get rewards as feedback. For
each good action, they get a positive reward, and for each bad action, they get a
negative reward. The goal of a Reinforcement learning agent is to maximize the
positive rewards. Since there is no labeled data, the agent is bound to learn by its
experience only.
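The reward-driven loop described above can be sketched with a two-armed bandit, one of the simplest reinforcement learning settings: an epsilon-greedy agent that usually picks the action with the best average reward so far, but occasionally explores. The reward values, exploration rate, and step count are made up.

```python
import random

def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: usually exploit the best-looking action,
    but explore a random action a fraction `epsilon` of the time."""
    rng = random.Random(seed)
    totals = [0.0] * len(rewards)   # accumulated reward per action
    counts = [0] * len(rewards)     # times each action was tried
    for _ in range(steps):
        if rng.random() < epsilon or not any(counts):
            action = rng.randrange(len(rewards))            # explore
        else:                                               # exploit
            action = max(range(len(rewards)),
                         key=lambda a: totals[a] / counts[a] if counts[a] else 0.0)
        totals[action] += rewards[action]   # feedback from the environment
        counts[action] += 1
    return counts

# Action 0 yields a negative reward, action 1 a positive one (made-up numbers).
counts = run_bandit(rewards=[-1.0, +1.0])
print(counts[1] > counts[0])   # the agent learns to prefer the rewarding action
```

There is no labeled dataset anywhere in this loop: the agent's only teacher is the reward signal it collects from its own actions, exactly as the text states.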
4. Semi-supervised Learning
Semi-supervised Learning is an intermediate technique between supervised and
unsupervised learning. It operates on datasets that contain a small amount of
labeled data alongside mostly unlabeled data. Because labels are costly to
obtain, this reduces the cost of building a machine learning model while still
increasing its accuracy and performance.
Semi-supervised learning helps data scientists overcome the drawbacks of
supervised and unsupervised learning. Speech analysis, web content
classification, protein sequence classification, and text document
classification are some important applications of semi-supervised learning.
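One common semi-supervised strategy is self-training: fit on the few labeled points, then repeatedly label the nearest unlabeled point and add it to the training set. The sketch below uses one-dimensional features and a nearest-labeled-neighbor rule; the data points and labels are invented for illustration.

```python
def self_train(labeled, unlabeled):
    """Self-training: grow the labeled set by repeatedly giving the
    closest unlabeled point the label of its nearest labeled neighbor."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Find the (unlabeled, labeled) pair with the smallest distance.
        x, (lx, ly) = min(((u, l) for u in pool for l in labeled),
                          key=lambda pair: abs(pair[0] - pair[1][0]))
        labeled.append((x, ly))   # adopt the neighbor's label
        pool.remove(x)
    return labeled

labeled = [(0.0, "A"), (10.0, "B")]   # the few costly hand-labeled points
unlabeled = [1.0, 2.0, 9.0]           # the cheap unlabeled data
result = dict(self_train(labeled, unlabeled))
print(result[2.0], result[9.0])  # → A B
```

This shows the cost argument above in miniature: two hand labels end up covering five points, with the unlabeled data filling in the rest of the structure.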