Ai & ML Week-1

Uploaded by

ಹರಿ ಶಂ
Artificial Intelligence and Machine Learning Code: 20CS511

Artificial Intelligence and Machine Learning

• Fundamentals
• Machine learning types
• Machine learning workflow
• Machine learning applications
• Challenges in ML
• Building a model – steps involved

Machine Learning: Fundamentals


Machine learning (ML) is a type of artificial intelligence (AI) that allows software
applications to become more accurate at predicting outcomes without being explicitly
programmed to do so. Machine learning algorithms use historical data as input to predict
new output values.
(OR)
Machine learning is a subset of AI, which enables the machine to automatically learn
from data, improve performance from past experiences, and make predictions.
(OR)
Machine learning is the field of study that gives computers the capability to learn
without being explicitly programmed. ML is one of the most exciting technologies that
one could come across. As is evident from the name, it gives the computer the ability
that makes it more similar to humans: the ability to learn. Machine learning is actively
being used today, perhaps in many more places than one would expect.
Search Educations Page 1
Machine learning types
1. Supervised Machine Learning
• As its name suggests, supervised machine learning is based on supervision. In the
supervised learning technique, we train the machine using a "labeled" dataset, and
based on that training, the machine predicts the output.
• Supervised learning works on labeled data.
• Each input has a corresponding labeled output. The goal of supervised machine
learning is to learn a mapping from the input to the output. The input variables are
called attributes, features, or predictors.
• The output variable is also called the response variable or target variable.
• For example, consider the problem of building a utility for predicting the selling
price of a car. The dataset is shown below:


From the given dataset, the machine learning algorithm learns the mapping from input
variables to output variable. This learning is represented in the form of a model. When a
new instance is given to the model as shown below, it can predict its output value.
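
As a minimal sketch of "learning a mapping from input to output", the snippet below fits a straight line to a tiny, invented car dataset (car age as the only feature, selling price as the target) and then predicts the price of a new instance. The data values and the function names `fit_line` and `predict` are assumptions made for illustration; they are not the dataset from these notes.

```python
# Hypothetical training data: (car age in years, selling price in thousands).
# These numbers are invented for illustration only.
ages = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [9.0, 8.0, 7.0, 6.0, 5.0]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the learned mapping to a new input."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line(ages, prices)     # "training": learn the mapping
new_price = predict(model, 6.0)    # prediction for a new, unseen instance
print(round(new_price, 2))
```

Here the feature (age) plays the role of the predictor and the price is the target variable.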

Some other examples of supervised learning:

• Given an email defined by its collection of phrases (X), predict whether the mail is
spam (Y).
• Given a medical brain scan image (X), predict whether the patient has tumors (Y).

2. Unsupervised Machine Learning


The goal of an unsupervised learning algorithm is to group or categorize an unsorted
dataset according to its similarities, patterns, and differences.
Machines are instructed to find the hidden patterns from the input dataset.
Unsupervised machine learning has no explicitly defined output.
The idea is to discover knowledge or structure in the data.

For example, an online retailer will have data about all items that the customers purchased.
Unsupervised learning algorithms can be applied on this data to group customers
based on their buying patterns.

Grouping news articles based on topics like sports, politics, business, etc. is another
example of unsupervised learning.
This task of discovering inherent clusters or groups in the data is known as clustering.
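
Clustering can be illustrated with a small k-means sketch. The customer data below is invented, and using the first k points as initial centroids is a simplification chosen to keep the example deterministic; real implementations usually pick initial centroids more carefully.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points serve as initial centroids (a simplification)."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical customers: (orders per month, average basket value).
customers = [(1, 10), (9, 95), (2, 12), (8, 90), (1, 11), (10, 100)]
labels, centroids = kmeans(customers, k=2)
print(labels)
```

No labels are given to the algorithm; the two groups (occasional and frequent buyers) emerge from the data alone.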

3. Reinforcement Learning
Reinforcement learning works on a feedback-based process, in which an AI agent (a
software component) automatically explores its surroundings by trial and error: taking
actions, learning from experience, and improving its performance.
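
The feedback loop can be sketched with an epsilon-greedy agent, one simple trial-and-error strategy among many. The environment here is hypothetical: each action always returns a fixed reward, and the names `run_agent`, `rewards`, and `epsilon` are assumptions for this sketch.

```python
import random

def run_agent(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    n_actions = len(rewards)
    value = [0.0] * n_actions   # running estimate of each action's reward
    count = [0] * n_actions     # how often each action was taken
    for step in range(steps):
        if step < n_actions:
            action = step                        # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)    # explore a random action
        else:
            action = max(range(n_actions), key=lambda a: value[a])  # exploit
        reward = rewards[action]                 # feedback from the environment
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]  # incremental mean
    return value, count

# Hypothetical environment: each action always yields a fixed reward.
value, count = run_agent(rewards=[1.0, 5.0, 2.0])
best_action = max(range(len(value)), key=lambda a: value[a])
print(best_action, count)
```

The agent is never told which action is best; it discovers this from the rewards it receives.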


Machine Learning Process

The diagram above illustrates the Machine Learning process.


In this process, relevant data is first gathered and then cleaned and transformed
through a process called Feature Engineering.
During Feature Engineering, handling missing values, handling outliers, and
creating new features out of existing ones are some of the common tasks performed.
After feature engineering, the data is split into Train Data and Test Data. The Train Data is
used for training the machine learning model.
Once the model is built, it is validated against the Test Data for accuracy.
This accuracy helps us in estimating the performance on previously unseen data.
If the model performance on both Train and Test Data is satisfactory, the model may be
deployed.
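
The feature engineering tasks and the train/test split described above can be sketched as follows. The records, the column names, and the outlier cap `AREA_CAP` are all invented for illustration; median imputation and capping are just one possible choice for each task.

```python
import random
import statistics

# Hypothetical raw records; None marks a missing value.
rows = [
    {"area": 50.0, "rooms": 2, "price": 100.0},
    {"area": None, "rooms": 3, "price": 150.0},
    {"area": 70.0, "rooms": 3, "price": 140.0},
    {"area": 900.0, "rooms": 4, "price": 180.0},   # suspicious outlier
    {"area": 90.0, "rooms": 4, "price": 185.0},
]

# 1. Handle missing values: fill with the median of the known values.
known = [r["area"] for r in rows if r["area"] is not None]
median_area = statistics.median(known)
for r in rows:
    if r["area"] is None:
        r["area"] = median_area

# 2. Handle outliers: cap values at an assumed upper bound.
AREA_CAP = 200.0
for r in rows:
    r["area"] = min(r["area"], AREA_CAP)

# 3. Create a new feature out of existing ones.
for r in rows:
    r["area_per_room"] = r["area"] / r["rooms"]

# 4. Split into Train Data and Test Data (80/20), shuffling reproducibly.
rng = random.Random(42)
shuffled = rows[:]
rng.shuffle(shuffled)
cut = int(0.8 * len(shuffled))
train, test = shuffled[:cut], shuffled[cut:]
print(len(train), len(test))
```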


Once deployed, the model makes predictions on new data; these predictions/insights are
used to make business decisions.

Machine learning workflow


To execute and produce results successfully, a machine learning model must
automate some standard workflows.
These standard workflows can be automated with the help of Scikit-learn
Pipelines.
From a data scientist's perspective, a pipeline is a generalized but very important concept.
It basically allows data to flow from its raw format to some useful information.
The working of pipelines can be understood with the help of the following diagram −

The blocks of ML pipelines are as follows −


Data ingestion − As the name suggests, this is the process of importing the data for use
in the ML project. The data can be extracted in real time or in batches from a single
system or from multiple systems. It is one of the most challenging steps because the
quality of the data can affect the whole ML model.
Data preparation − After importing the data, we need to prepare it for use in our
ML model. Data preprocessing is one of the most important techniques of data preparation.
ML model training − The next step is to train our ML model. Various families of ML
algorithms (supervised, unsupervised, reinforcement) are available to extract features
from the data and make predictions.
Model evaluation − Next, we need to evaluate the ML model. In the case of an AutoML
pipeline, the ML model can be evaluated with the help of various statistical methods and
business rules.
ML model retraining − In the case of an AutoML pipeline, the first model is not
necessarily the best one. The first model is considered a baseline model, and we can
train it repeatedly to increase the model's accuracy.


Deployment − At last, we need to deploy the model. This step involves applying and
migrating the model to business operations for their use.
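
The blocks listed above can be pictured as functions chained one after another. The sketch below is a plain-Python illustration under simplifying assumptions: the data is hard-coded, the "model" is a single weight fitted by least squares, and all function names are hypothetical. It is not how Scikit-learn Pipelines are implemented.

```python
def ingest():
    """Data ingestion: here just a hard-coded, hypothetical batch of (x, y) pairs."""
    return [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (4.0, 8.2)]

def prepare(raw):
    """Data preparation: drop records with missing inputs."""
    return [(x, y) for x, y in raw if x is not None]

def train(data):
    """Model training: fit y = w * x by least squares (no intercept)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

def evaluate(w, data):
    """Model evaluation: mean absolute error (on the training data, for brevity)."""
    return sum(abs(w * x - y) for x, y in data) / len(data)

def deploy(w):
    """Deployment: expose the trained model as a callable."""
    return lambda x: w * x

data = prepare(ingest())
w = train(data)
error = evaluate(w, data)
model = deploy(w)
print(round(w, 3), round(error, 3), round(model(3.0), 3))
```

Retraining would simply mean running the same chain again on fresh data and comparing the new error with the baseline.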

Machine learning Applications


4. Challenges in ML
• Inadequate Training Data
• Poor quality of data
• Monitoring and maintenance
• Getting bad recommendations
• Lack of skilled resources
• Process Complexity of Machine Learning
• Data Bias
• Slow implementations and results

Building a model – steps involved


Step 1: Collect Data
Given the problem you want to solve, you will have to investigate and obtain data that
you will use to feed your machine.
Step 2: Prepare the data

This is a good time to visualize your data and check if there are correlations between the
different characteristics that we obtained.
You must also separate the data into two groups: one for training and the other for model
evaluation. A split of approximately 80/20 is common, but the ratio can vary
depending on the case and the volume of data we have.

At this stage, you can also pre-process your data by normalizing, eliminating duplicates,
and making error corrections.
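
A minimal sketch of these preparation tasks, assuming a tiny invented dataset: duplicates are removed, each feature is normalized with min-max scaling (one of several normalization schemes), and the correlation between the two characteristics is checked with a hand-written Pearson coefficient.

```python
import math

# Hypothetical samples: (hours studied, exam score); one row is duplicated.
samples = [(1.0, 52.0), (2.0, 54.0), (2.0, 54.0), (3.0, 56.0), (5.0, 60.0)]

# Eliminate duplicates while keeping the original order.
unique = list(dict.fromkeys(samples))

def min_max(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours = min_max([h for h, _ in unique])
scores = min_max([s for _, s in unique])
correlation = pearson(hours, scores)
print(hours, round(correlation, 3))
```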

Step 3: Choose the model

You can choose among algorithms for classification, regression (e.g. linear regression),
clustering (e.g. k-means), nearest-neighbor methods (e.g. K-Nearest Neighbors), deep
learning (e.g. neural networks), Bayesian methods, etc.
Which model to use depends on the data you are going to process,
such as images, sound, text, or numerical values.

In the following table, we will see some models and their applications that you can
apply in your projects:

Model                            Applications

Linear Regression                Price prediction
Fully connected networks         Classification
Convolutional Neural Networks    Image processing
Recurrent Neural Networks        Voice recognition
Random Forest                    Fraud detection
Reinforcement Learning           Learning by trial and error
Generative Models                Image creation
K-means                          Segmentation
k-Nearest Neighbors              Recommendation systems
Bayesian Classifiers             Spam and noise filtering


Step 4: Train your machine model


You will need to train the model on the datasets so that it runs smoothly and shows an
incremental improvement in the prediction rate.
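
The "incremental improvement" during training can be made visible with a small gradient descent loop. This is a sketch under stated assumptions: the data is invented (generated from y = 3x), the model has a single weight `w`, and the learning rate and number of epochs are arbitrary choices.

```python
# Hypothetical training set generated from y = 3x (values invented for the sketch).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def mse(w):
    """Mean squared error of the model y = w * x on the training set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

w = 0.0                # initial guess for the single weight
learning_rate = 0.01
losses = []
for epoch in range(50):
    # Gradient of the MSE with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad
    losses.append(mse(w))

print(round(w, 3), round(losses[0], 3), round(losses[-1], 6))
```

Each epoch lowers the error a little, and the weight moves toward the value that generated the data.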

Step 5: Evaluation
You will have to check your trained model against your evaluation data set.
For a balanced two-class problem, an accuracy of 50% or less is no better than random
guessing, so that model will not be useful.
If you reach 90% or more, you can have good confidence in the results that the model
gives you.
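
As a small illustration, accuracy is the fraction of evaluation examples the model gets right. The labels and predictions below are invented, and the verdict strings simply mirror the 50% and 90% rules of thumb stated above.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical evaluation set: true labels vs. the model's predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

acc = accuracy(y_true, y_pred)
if acc <= 0.5:
    verdict = "not useful"
elif acc >= 0.9:
    verdict = "good confidence"
else:
    verdict = "needs improvement"
print(acc, verdict)
```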

Step 6: Parameter Tuning


If during the evaluation you did not obtain good predictions, you must set a new
configuration of parameters in your model and return to the training step.
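
One simple way to carry out this loop is to try several configurations and keep the one that scores best on held-out data. The sketch below tunes the number of neighbors `k` of a tiny 1-D k-nearest-neighbors classifier; the data, the candidate values, and the helper names are assumptions made for illustration.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(train, validation, k):
    """Share of validation points that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)

# Hypothetical 1-D data: (feature, label); one training label is noisy.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
validation = [(2.8, 0), (3.2, 0), (6.5, 1), (7.5, 1)]

# Try several configurations and keep the one with the best validation accuracy.
results = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(results, key=results.get)
print(results, best_k)
```

With k = 1 the noisy training label misleads the model; larger values of k smooth it out.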

Step 7: Prediction or Inference


You are now ready to use your machine learning model to infer results in real-life
scenarios.


Examples of ML in Daily life

Machine Learning

• Pipeline
  o Data engineering
  o Machine Learning
  o Deployment
• What is Data Science?
• How Data Science works?
• Data Science uses

Pipeline
Machine Learning

A machine learning pipeline is a way to control and automate the workflow it
takes to produce a machine learning model. Machine learning pipelines consist of
multiple sequential steps that do everything from data extraction and preprocessing to
model training and deployment.

Machine learning pipelines are iterative, as every step is repeated to continuously
improve the accuracy of the model and achieve the end goal.

An example of an ML pipeline (O'Reilly)

• The term Pipeline is generally used to describe an independent sequence of
steps that are arranged together to achieve a task.
• This task could be machine learning or not.
• Machine Learning Pipelines are very common, but they are not the only type of
pipeline that exists.
• Data Orchestration Pipelines are another example.


According to Microsoft docs, there are three scenarios:

Deployment
The deployment of machine learning models (or pipelines) is the process of making
models available in production where web applications, enterprise software (ERPs) and
APIs can consume the trained model by providing new data points, and get the
predictions.

In short, Deployment in Machine Learning is the method by which you integrate a
machine learning model into an existing production environment to make practical
business decisions based on data. It is the last stage in the machine learning lifecycle.

Normally the term Machine Learning Model Deployment is used to describe
deployment of the entire Machine Learning Pipeline, in which the model itself is only
one component of the Pipeline.


An example of a machine learning pipeline built using sklearn

As you can see in the above example, this pipeline consists of a
Logistic Regression model. There are several steps in the pipeline that
have to be executed before training can begin, such as imputation
of missing values, one-hot encoding, scaling, and Principal
Component Analysis (PCA).
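
A pipeline with the steps named above could look roughly like the following scikit-learn sketch. The toy data, the column names, and the step names ("preprocess", "pca", "model") are invented here, and this particular arrangement (separate numeric and categorical branches inside a ColumnTransformer) is one reasonable layout, not necessarily the one in the original figure.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; column names and values are invented for this sketch.
X = pd.DataFrame({
    "age": [25.0, 32.0, np.nan, 51.0, 46.0, 29.0, 38.0, 60.0],
    "income": [30.0, 45.0, 52.0, np.nan, 80.0, 35.0, 61.0, 90.0],
    "city": ["A", "B", "A", "C", "B", "A", "C", "B"],
})
y = [0, 0, 0, 1, 1, 0, 1, 1]

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # fill missing numbers
    ("scale", StandardScaler()),                    # zero mean, unit variance
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer(
    [("num", numeric, ["age", "income"]), ("cat", categorical, ["city"])],
    sparse_threshold=0.0,   # always return a dense array for PCA
)

pipe = Pipeline([
    ("preprocess", preprocess),
    ("pca", PCA(n_components=2)),
    ("model", LogisticRegression()),
])
pipe.fit(X, y)              # every preprocessing step runs before training
preds = pipe.predict(X)
print(preds)
```

Calling `fit` on the pipeline runs each preprocessing step in order and only then trains the final model; `predict` reuses the same fitted steps on new data.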

Data engineering
Data engineering is the process of designing and building systems that
let people collect and analyze raw data from multiple sources and
formats.

These systems empower people to find practical applications of the
data, which businesses can use to thrive.

What Do Data Engineers Do?


Data engineering is a skill that is in increasing demand. Data engineers
are the people who design the system that unifies data and can help you
navigate it. Data engineers perform many different tasks including:

Acquisition: Finding all the different data sets around the business

Cleansing: Finding and cleaning any errors in the data

Conversion: Giving all the data a common format

Disambiguation: Interpreting data that could be interpreted in
multiple ways

Deduplication: Removing duplicate copies of data
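
Three of these tasks (cleansing, conversion, and deduplication) can be sketched on a handful of invented records. The field names, the two date formats, and the helper `to_iso` are assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical records gathered from two systems with different conventions.
raw = [
    {"name": " Alice ", "joined": "2023-01-15"},
    {"name": "alice", "joined": "15/01/2023"},      # same person, other format
    {"name": "Bob", "joined": "2023-02-01"},
    {"name": "", "joined": "2023-03-09"},           # invalid: empty name
]

def to_iso(date_text):
    """Conversion: accept either known date format, emit ISO 8601."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(date_text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {date_text!r}")

cleaned = []
seen = set()
for record in raw:
    name = record["name"].strip().lower()       # cleansing: normalize text
    if not name:
        continue                                # cleansing: drop bad rows
    row = (name, to_iso(record["joined"]))
    if row in seen:
        continue                                # deduplication
    seen.add(row)
    cleaned.append({"name": row[0], "joined": row[1]})

print(cleaned)
```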


Once this is done, data may be stored in a central repository such as a
data lake or data lakehouse.
Data engineers may also copy and move subsets of data into a data
warehouse.

Machine learning pipeline


A machine learning pipeline is a way to codify and automate the
workflow it takes to produce a machine learning model.

Machine learning pipelines consist of multiple sequential steps that do
everything from data extraction and preprocessing to model training
and deployment.

What is Data Science?


Data science is the domain of study that deals with vast volumes of
data using modern tools and techniques to find unseen patterns,
derive meaningful information, and make business decisions.

For example, finance companies can use a customer's banking and
bill-paying history to assess creditworthiness and loan risk.
How Data Science works?
Data science uses techniques such as machine learning and artificial
intelligence to extract meaningful information and to predict future
patterns and behaviours. Advances in technology, the internet, social
media, and the use of technology have all increased access to big data.

Data Science uses


Data science is used in marketing, finance, human resources,
healthcare, government programmes, and any other industry that
generates data.
For example, marketing departments use data science to determine which
product is most likely to sell.
