
Shivajirao S Jondhale College of Engineering, Dombivli (E)

Department of AIML Engineering

Laboratory Manual
Machine Learning Lab
Subject Code: CSL604

Semester – VI

Prepared by

Prof. Rashmi K Mahajan

Department of Artificial Intelligence and

Machine Learning

Shivajirao S. Jondhale College of Engineering,

Dombivli (E)

Affiliated to University of Mumbai



COURSE: Machine Learning Lab

COURSE CODE: CSL 604

Semester-VI

INDEX
Sr. No. | Topic
1 | Vision
2 | Mission
3 | Program Educational Objectives (PEOs)
4 | Program Outcomes (POs)
5 | Program Specific Outcomes (PSOs)
6 | Syllabus
7 | Course Objectives and Course Outcomes
8 | List of Experiments
9 | CO-PO Mapping Matrix and CO-PSO Mapping Matrix


VISION
• To impart quality technical education in the department of Artificial Intelligence and
Machine Learning for creating competent and ethically strong engineers with capabilities of
accepting new challenges.

MISSION

• To provide learners with the technical knowledge to build a lifelong learning career in
the Artificial Intelligence and Machine Learning domain.
• To develop the ability among learners to analyze, design, and implement solutions to
engineering problems and real-world applications by providing novel Artificial
Intelligence and Machine Learning solutions.
• To promote close interaction among industry, faculty and learners to enrich the learning
process and enhance career opportunities for learners.

Program Educational Objectives (PEO)


• Impel Learners to acquire in-depth understanding of Artificial Intelligence & Machine
Learning that will enable them to pursue higher education or professional positions in
the field of engineering.
• Prepare Learners to demonstrate technical skills and competency in the Artificial
Intelligence & Machine Learning field.
• Inculcate in Learners, professional and ethical attitude, good leadership qualities and
commitment to social responsibilities.

Program Outcomes (POs)

Program Specific Outcomes (PSOs)

• PSO1 : Ability to understand the concepts and key issues in artificial intelligence and
its associated fields to achieve adequate perspectives in real-time applications.
• PSO2 : Ability to design, implement solutions for various domains using Machine
learning and Deep Learning techniques.


University Syllabus for the lab


Lab Code | Lab Name | Credit

CSL604 | Machine Learning Lab | 1

Prerequisite: C Programming Language.


Lab Objectives:
1 To introduce platforms such as Anaconda and COLAB that are suitable for Machine Learning
2 To implement various Regression techniques
3 To develop Neural Network based learning models
4 To implement Clustering techniques
Lab Outcomes:
After successful completion of the course students will be able to:
1 Implement various Machine learning models
2 Apply suitable Machine learning models for a given problem
3 Implement Neural Network based models
4 Apply Dimensionality Reduction techniques

Suggested Experiments: Students are required to complete at least 10 experiments.


Sr. No. Name of the Experiment
1 Introduction to platforms such as Anaconda, COLAB
2 Study of Machine Learning Libraries and tools (Python library, tensorflow, keras,...)
Implementation of following algorithms for a given example data set-
3 Linear Regression.
4 Logistic Regression.
5 Support Vector Machines
6 Hebbian Learning
7 Expectation-Maximization algorithm
8 McCulloch Pitts Model.
9 Single Layer Perceptron Learning algorithm
10 Error Backpropagation Perceptron Training Algorithm
11 Principal Component Analysis
12 Applications of above algorithms as a case study (E.g. Hand Writing Recognition
using MNIST data set, classification using IRIS data set, etc)


Useful Links:
1 https://www.learndatasci.com/out/edx-columbia-machine-learning/
2 https://www.learndatasci.com/out/oreilly-hands-machine-learning-scikit-learn-keras-and-tensorflow-2nd-edition/
3 https://www.learndatasci.com/out/google-machine-learning-crash-course/
4 https://www.learndatasci.com/out/edx-columbia-machine-learning/
Term Work:
1 Term work should consist of 10 experiments.
2 Journal must include at least 2 assignments.
3 The final certification and acceptance of term work requires satisfactory performance of
laboratory work and minimum passing marks in term work.
4 Total 25 Marks (Experiments: 15 marks, Attendance (Theory & Practical): 05 marks,
Assignments: 05 marks)
Oral & Practical Exam
Based on the entire syllabus of CSL604 and CSC604

Course Objectives: The course aims:


1 To introduce Machine learning concepts
2 To develop mathematical concepts required for Machine learning algorithms
3 To understand various Regression techniques
4 To understand Clustering techniques
5 To develop Neural Network based learning models

Course Outcomes:
After successful completion of the course students will be able to:
1 Comprehend basics of Machine Learning
2 Build Mathematical foundation for machine learning
3 Understand various Machine learning models
4 Select suitable Machine learning models for a given problem
5 Build Neural Network based models
6 Apply Dimensionality Reduction techniques


LIST OF EXPERIMENTS
Expt. No. | Name of the Experiment | COs
1 | Getting introduced to platforms such as Anaconda, COLAB. | CO1
2 | Study of Machine Learning Libraries and tools (Python library, tensorflow, keras, ...) |
3 | Implementation of Simple Linear Regression & Multiple Linear Regression in Python. | CO2
4 | Implementation of Logistic Regression in Python. | CO2
5 | Implementation of Support Vector Machine Regression & Classifier in Python. | CO3
6 | Hebbian Learning | CO3
7 | Expectation-Maximization algorithm | CO6
8 | McCulloch Pitts Model | CO6
9 | Single Layer Perceptron Learning algorithm | CO5
10 | Error Backpropagation Perceptron Training Algorithm | CO4
11 | Principal Component Analysis |
12 | Applications of above algorithms as a case study (e.g. Handwriting Recognition using the MNIST data set, classification using the IRIS data set, etc.) |


CO-PO Mapping Matrix

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12

CO1

CO2

CO3

CO4

CO5

CO6
CO-PSO Mapping Matrix

PSO1 PSO2

CO1

CO2

CO3

CO4

CO5

CO6


EXPERIMENT NO- 1
AIM: Introduction to platforms such as Anaconda and Google COLAB.

RESOURCES REQUIRED: H/W :- P4 machine


S/W :- Google Colaboratory , Anaconda Navigator, Jupyter Notebooks

THEORY:
Introduction to Anaconda
The Anaconda distribution is a comprehensive platform for data science and scientific computing
in Python. Anaconda simplifies the process of setting up and working with the various libraries
and tools commonly used in data science, machine learning, and scientific computing.

Experiment Steps
1. Installation of Anaconda
1. Download Anaconda:

• Go to the Anaconda website.


• Choose the appropriate version for your operating system (Windows, macOS,
Linux) and download the installer.
2. Install Anaconda:

• Follow the installation instructions for your operating system.


• During installation, you can choose to add Anaconda to your system PATH,
which makes it easier to access Anaconda from the command line.
3. Verify Installation:
• Open a new terminal or command prompt.
• Type conda --version to check if the installation was successful.
2. Anaconda Navigator
1. Launch Anaconda Navigator:
• Open Anaconda Navigator from your applications or Start menu.
2. Explore Navigator:
• Get familiar with the Anaconda Navigator interface.


• Identify the key components such as Home, Environments, and Jupyter


Notebooks.
3. Creating and Managing Environments
1. Create a New Environment:
• Use Anaconda Navigator to create a new environment.
• Choose the Python version and give your environment a name.
2. Manage Environments:
• Activate and deactivate environments.
• Install and remove packages using the Conda package manager.
4. Jupyter Notebooks with Anaconda
1. Launch Jupyter Notebook:
• Open Jupyter Notebook from Anaconda Navigator.
2. Create a New Notebook:
• Create a new Jupyter Notebook within your Anaconda environment.

3. Execute Code in Notebook:


• Write and execute a simple Python code snippet in the Jupyter Notebook.
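For this step, a snippet like the following could be typed into a notebook cell and run (a minimal sketch; the array values are arbitrary and only meant for illustration):

import numpy as np

# Create a small array and print a few summary statistics
values = np.array([2, 4, 6, 8, 10])
print("Mean:", values.mean())
print("Sum:", values.sum())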

5. Introduction to Google Colab


5.1 Overview of Google Colab
1. Go to Google Colab.
2. Overview of the interface and features.
5.2 Setting Up a Colab Notebook
1. Create a new Colab notebook.
2. Understand the collaborative features.

5.3 Collaboration Features


1. Share and collaborate on a Colab notebook.
2. Commenting and version history.


6. Comparing Google Colab and Jupyter Notebook (Anaconda)

Feature | Google Colab | Jupyter Notebook
Accessibility and Setup | Cloud-based, requires no local setup. | Requires local installation and setup.
Collaboration | Real-time collaboration with team members. | Collaboration requires external services or plugins.
Hardware Resources | Free access to GPU and TPU resources. | Relies on local hardware resources for computation.
Integration with Cloud | Seamless integration with Google Drive. | Limited native integration with cloud services.
Library Management | Pre-installed libraries for machine learning. | Requires manual installation of libraries.
Visualization Tools | Supports Matplotlib, Seaborn, Plotly, etc. | Similar support for visualization libraries.
Educational Use | Ideal for teaching and learning with ease. | Widely used in educational settings.
Offline Usage | Limited offline functionality. | Full functionality available offline.
Command Line Integration | Supports shell commands within notebooks. | Limited support; requires external plugins.
Ease of Sharing | Shareable via links or Google Drive integration. | Sharing involves file transfer or external services.
Community Support | Strong community support and resources. | Well-established community with extensive resources.
Customization | Limited customization options. | Highly customizable based on local environment.

7. Conclusion
This lab document provides a structured outline for conducting an introduction to Anaconda
and Google Colab. In this experiment, we were introduced to the Anaconda distribution and its
capabilities, and we learned how to install Anaconda, create and manage Python environments,
and use Jupyter Notebooks for interactive coding.

Additional Resources
• Anaconda Documentation
• Jupyter Notebook Documentation


Installing Anaconda and Python


To learn machine learning, we will use the Python programming language in this tutorial. So,
in order to use Python for machine learning, we need to install it on our computer system along
with compatible IDEs (Integrated Development Environments).

In this topic, we will learn to install Python and an IDE with the help of Anaconda
distribution.

Anaconda distribution is a free and open-source platform for the Python and R programming
languages. It can be easily installed on any OS such as Windows, Linux, and macOS. It
provides more than 1500 Python/R data science packages which are suitable for developing
machine learning and deep learning models.

Anaconda distribution provides installation of Python with various IDEs such as Jupyter
Notebook, Spyder, Anaconda Prompt, etc. Hence it is a very convenient packaged solution
which you can easily download and install on your computer. It will automatically install
Python and some basic IDEs and libraries with it.

Below some steps are given to show the downloading and installing process of Anaconda and
IDE:

Step-1: Download Anaconda Python:

• To download Anaconda on your system, first open your favorite browser, search for
Download Anaconda Python, and click on the first link as given in the below image.
Alternatively, you can download it directly from this link:
https://www.anaconda.com/distribution/#download-section.


• After clicking on the first link, you will reach the download page of Anaconda, as
shown in the below image:

• Since Anaconda is available for Windows, Linux, and macOS, you can download it as
per your OS type by clicking on the available options shown in the below image. It
provides both Python 2.7 and Python 3.7 versions; since the latest version is 3.7, we
will download the Python 3.7 version. After clicking on the download option, it will
start downloading on your computer.


Note: In this topic, we are downloading Anaconda for Windows; you can choose it as per your
OS.

Step- 2: Install Anaconda Python (Python 3.7 version):

Once the download is complete, go to Downloads and double-click on the ".exe" file of
Anaconda (Anaconda3-2019.03-Windows-x86_64.exe). It will open a setup window for the
Anaconda installation, as given in the below image; then click on Next.


• It will open a License Agreement window; click on the "I Agree" option and move further.

• In the next window, you will get two options for installations as given in the below
image. Select the first option (Just me) and click on Next.


• Now you will get a window for the installation location; here, you can leave it as the
default or change it by browsing to a location, and then click on Next. Consider the below image:


• Now select the second option, and click on Install.

• Once the installation is complete, click on Next.


• Once the installation is completed, tick the checkbox if you want to learn more about
Anaconda and Anaconda Cloud, and click on Finish to end the process.

Note: Here, we will use the Spyder IDE to run Python programs.
Step- 3: Open Anaconda Navigator

• After successful installation of Anaconda, use Anaconda Navigator to launch a Python
IDE such as Spyder or Jupyter Notebook.
• To open Anaconda Navigator, press the Windows key, search for Anaconda Navigator,
and click on it. Consider the below image:


• After opening the Navigator, launch the Spyder IDE by clicking on the Launch button
given below Spyder. It will open the Spyder IDE on your system.

Run your Python program in Spyder IDE.

• Open Spyder IDE, it will look like the below image:


• Write your first program, and save it using the .py extension.
• Run the program using the triangle Run button.
• You can check the program's output on the console pane at the bottom right side.

Step- 4: Close the Spyder IDE.


How to use Colaboratory

To use Colaboratory, you must have a Google account.

On your first visit, you will see a Welcome To Colaboratory notebook with links to video
introductions and basic information on how to use Colab.

Create a workbook

From the File menu, click New notebook to create a workbook.

If you are not yet logged in to a Google account, the system will prompt you to log in.

The notebook will by default have a generic name; click on the filename field to rename it.


The file type, IPYNB, is short for "IPython notebook" because IPython was the forerunner of
Jupyter Notebook.

The interface allows you to insert various kinds of cells, mainly text and code, which have
their own shortcut buttons under the menu bar via the Insert menu.

Because notebooks are meant for sharing, there are accommodations throughout for
structured documentation.

Code, debug, repeat

You can insert Python code to execute in a code cell. The code can be entirely standalone or
imported from various Python libraries.

A notebook can be treated as a rolling log of work, with earlier code snippets being no longer
executed in favor of later ones, or treated as an evolving set of code blocks intended for
ongoing execution. The Runtime menu offers execution options, such as Run all, Run
before or Run the focused cell, to match either approach.

Each code cell has a run icon on its left edge. You can type code into a cell
and hit the run icon to execute it immediately.


If the code generates an error, the error output will appear beneath the cell. Correcting the
problem and hitting run again replaces the error info with program output. The first line of
code, in its own cell, imports the NumPy library, which is the source of the arange
function. Colab has many common libraries pre-loaded for easy import into programs.
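As an illustration of the behaviour described above, a code cell such as the following could be run in Colab (a minimal sketch; the values passed to arange are arbitrary):

import numpy as np

# NumPy is pre-loaded in Colab, so the import works without any extra setup
x = np.arange(0, 10, 2)
print(x)    # [0 2 4 6 8]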

A text cell provides basic rich text using Markdown formatting by default and allows for the
insertion of images, HTML code and LaTeX formatting.

As you add text on the left side of the text cell, the formatted output appears on the right.

Once you stop editing a block, only the final formatted version shows.


Incorporating data into the notebook

After getting comfortable with the interface and using it for initial test coding, you must
eventually provide the code with data to analyze or otherwise manipulate.

Colab can mount a user's Google Drive to the VM hosting their notebook using a code cell.

Once you hit run, Google will ask for permission to mount the drive.

If you allow it to connect, you will then have access to the files in your Google Drive via the
/my_drive path.

If you prefer not to grant access to your Drive space, you can instead upload files from your
local machine or from any network file space mounted as a drive.

With file access, many functions are available to read data in various ways. For example,
importing the Pandas library gives access to functions such as read_csv and read_json.
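A minimal sketch of mounting Google Drive and reading a file with Pandas is shown below; the mount point /content/drive and the file name sample.csv are illustrative assumptions, not part of the original manual.

from google.colab import drive
import pandas as pd

# Google will ask for permission before mounting the Drive at this mount point
drive.mount('/content/drive')

# Read a CSV file from Drive (the path and file name are hypothetical)
df = pd.read_csv('/content/drive/MyDrive/sample.csv')
print(df.head())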


Save and share

By default, Colab puts notebooks in a Colab Notebooks folder under My Drive in Google
Drive.

The File menu enables notebooks to be saved as named revisions in the version history,
relocated using Move, or saved as a copy in Drive or GitHub. It also allows you to download
and upload notebooks. Tools based on Jupyter provide broad compatibility, so you can
create notebooks in one place and then upload and use them in another.

You can use the Share button in the upper right to grant other Google users access to the
notebook and to copy links.

Google also provides example notebooks illustrating available resources, such as pre-trained
image classifiers and language transformers, as well as addressing common business
problems, such as working with BigQuery or performing time series analytics. It also
provides links to introductory Python coding notebooks.


Experiment No: 2
Aim: Study of Machine learning Libraries and Tools

Objective: The objective of this experiment is to provide students with hands-on experience
in using popular machine learning libraries and tools. Participants will explore libraries such
as scikit-learn, TensorFlow, and PyTorch, and familiarize themselves with essential machine
learning tasks.
Prerequisites
• Basic understanding of Python programming language.
• Familiarity with fundamental machine learning concepts.
RESOURCES REQUIRED: H/W :- P4 machine
S/W :- Jupyter Notebook

Theory:

Best Python libraries for Machine Learning


Machine learning is the science of programming computers so that they can learn from
different types of data. According to Arthur Samuel's definition, machine learning is the
"field of study that gives computers the ability to learn without being explicitly programmed".
The concept of machine learning is used for solving many different types of real-world problems.

Earlier, users performed machine learning tasks by manually coding all the algorithms and
using mathematical and statistical formulas. This process was time-consuming, inefficient,
and tiresome compared to using Python libraries, frameworks, and modules. Today, Python is
one of the most popular and productive languages for machine learning; it has replaced many
other languages because it offers a vast collection of libraries that make work easier and simpler.

Here are some of the best libraries of Python used for Machine Learning:

• NumPy
• SciPy
• Scikit-learn
• Pandas
• Matplotlib
• Seaborn


• TensorFlow
• Keras
• PyTorch

Top Python Machine Learning Libraries


1) NumPy

NumPy is a well-known general-purpose array-processing package. An extensive collection of
high-complexity mathematical functions makes NumPy powerful for processing large multi-
dimensional arrays and matrices. NumPy is very useful for handling linear algebra, Fourier
transforms, and random numbers. Other libraries like TensorFlow use NumPy at the backend
for manipulating tensors.

With NumPy, you can define arbitrary data types and easily integrate with most databases.
NumPy can also serve as an efficient multi-dimensional container for generic data of any
datatype. The key features of NumPy include a powerful N-dimensional array object,
broadcasting functions, and out-of-the-box tools to integrate C/C++ and Fortran code.

Its key features are as below:

• Supports n-dimensional arrays to enable vectorization, indexing, and broadcasting


operations.
• Supports Fourier transforms, mathematical functions, linear algebra methods, and
random number generators.
• Implementable on different computing platforms, including distributed and GPU
computing.
• Easy-to-use high-level syntax with optimized Python code to provide high speed
and flexibility.
• In addition, NumPy enables the numerical operations of many libraries associated
with data science, data visualization, image processing, quantum computing,
signal processing, geographic processing, bioinformatics, etc. So, it is one of the
most versatile machine learning libraries.

Advantages:

• It can easily deal with multidimensional data.


• It helps in the matrix manipulation of data and operations such as transpose, reshape,
and much more.
• It enables enhanced performance and management of garbage collection by providing
a dynamic data structure.
• It allows improving the performance of Machine Learning models.

Disadvantages:

• It is highly dependent on non-Pythonic entities. It uses the functionalities of Cython


and other libraries that use C or C++.
• Its high productivity comes at a price.


• Its data types are hardware-native and not Python-native, so it costs heavily when
NumPy entities have to be translated back to Python-equivalent entities and vice versa.

import numpy as nup

# Then, create two arrays of rank 2

K = nup.array([[2, 4], [6, 8]])

R = nup.array([[1, 3], [5, 7]])

# Then, create two arrays of rank 1

P = nup.array([10, 12])

S = nup.array([9, 11])

# Then, we will print the Inner product of vectors

print ("Inner product of vectors: ", nup.dot(P, S), "\n")

# Then, we will print the Matrix and Vector product

print ("Matrix and Vector product: ", nup.dot(K, P), "\n")

# Now, we will print the Matrix and matrix product

print ("Matrix and matrix product: ", nup.dot(K, R))

Output:

Inner product of vectors: 222

Matrix and Vector product: [ 68 156]

Matrix and matrix product: [[22 34]


[46 74]]

2) SciPy

SciPy is a popular library among Machine Learning developers as it contains numerous
modules for performing optimization, linear algebra, integration, and statistics. The SciPy
library is different from the SciPy stack, as the SciPy library is one of the core packages which
make up the SciPy stack. The SciPy library is also used for image manipulation tasks.

Advantages:

• It is perfect for image manipulation.


• It offers basic processing features for mathematical operations.
• It provides effective integration for numerics and their optimizations.
• It also facilitates the processing of signals.

Disadvantages:

• There is no major disadvantage of using SciPy. However, there can be confusion


between SciPy stack and SciPy library as the SciPy library is included in the stack.

Example 1:

from scipy import signal as sg
import numpy as nup

K = nup.arange(45).reshape(9, 5)
domain_1 = nup.identity(3)
print (K, end = 'KK')
print (sg.order_filter (K, domain_1, 1))

Output:

[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]
[15 16 17 18 19]
[20 21 22 23 24]
[25 26 27 28 29]
[30 31 32 33 34]
[35 36 37 38 39]
[40 41 42 43 44]] KK [[ 0. 1. 2. 3. 0.]
[ 5. 6. 7. 8. 3.]
[10. 11. 12. 13. 8.]
[15. 16. 17. 18. 13.]
[20. 21. 22. 23. 18.]
[25. 26. 27. 28. 23.]
[30. 31. 32. 33. 28.]
[35. 36. 37. 38. 33.]
[ 0. 35. 36. 37. 38.]]

Example 2:

from scipy.signal import chirp as cp
from scipy.signal import spectrogram as sp
import matplotlib.pyplot as plot
import numpy as nup

t_T = nup.linspace(3, 10, 300)
w_W = cp(t_T, f0 = 4, f1 = 2, t1 = 5, method = 'linear')
plot.plot(t_T, w_W)
plot.title ("Linear Chirp")
plot.xlabel ('Time in Seconds')
plot.show()

Output:

3) Scikit-learn

Scikit-learn is a Python library which is used for classical machine learning algorithms. It is
built on top of two basic Python libraries, NumPy and SciPy. Scikit-learn is popular among
Machine Learning developers as it supports supervised and unsupervised learning algorithms.
This library can also be used for data analysis and data mining.

The following features of scikit-learn make it one of the best machine learning libraries in
Python:

• Easy to use for precise predictive data analysis


• Simplifies solving complex ML problems like classification, preprocessing, clustering,
regression, model selection, and dimensionality reduction
• Plenty of inbuilt machine learning algorithms
• Helps build a fundamental to advanced level ML model


• Developed on top of prevalent libraries like SciPy, NumPy, and Matplotlib

Example:

from sklearn import datasets as ds
from sklearn import metrics as mt
from sklearn.tree import DecisionTreeClassifier as dtc

# load the iris dataset
dataset_1 = ds.load_iris()

# fit a CART model to the data
model_1 = dtc()
model_1.fit(dataset_1.data, dataset_1.target)
print(model_1)

# make predictions
expected_1 = dataset_1.target
predicted_1 = model_1.predict(dataset_1.data)

# summarize the fit of the model
print (mt.classification_report(expected_1, predicted_1))
print(mt.confusion_matrix(expected_1, predicted_1))

Output:

DecisionTreeClassifier()
precision recall f1-score support

0 1.00 1.00 1.00 50


1 1.00 1.00 1.00 50
2 1.00 1.00 1.00 50

accuracy 1.00 150


macro avg 1.00 1.00 1.00 150
weighted avg 1.00 1.00 1.00 150

[[50 0 0]
[ 0 50 0]
[ 0 0 50]]

4) Pandas

Pandas is a Python library that is mainly used for data analysis. Users have to prepare the
dataset before using it for training a machine learning model. Pandas makes this easy for
developers as it is developed specifically for data extraction and preparation. It has a wide
variety of tools for analysing data in detail, providing high-level data structures.

Advantages:

• It has descriptive, quick, and compliant data structures.


• It supports operations such as grouping, integrating, iterating, re-indexing, and
representing data.
• It is very flexible for usage in association with other libraries.
• It contains inherent data manipulation functionalities that can be implemented using
minimal commands.
• It can be implemented in a large variety of areas, especially related to business and
education, due to its optimized performance.

Disadvantages:

• Its plotting functionality is based on Matplotlib, which means that an inexperienced
programmer needs to be acquainted with both libraries to understand which one will be
better for solving a specific business problem.
• It is much less suitable for quantitative modeling and n-dimensional arrays. In such
scenarios, where we need to work on quantitative or statistical modeling, we can use
NumPy or SciPy.

The two main types of data structures used by Pandas are:

• Series (1-dimensional)
• DataFrame (2-dimensional)

These two put together can handle a vast majority of data requirements and use cases from
most sectors like science, statistics, social, finance, and of course, analytics and other areas of
engineering.

Pandas supports and performs well with different kinds of data, including the following:

• Tabular data with columns of heterogeneous data. For instance, consider data coming
from a SQL table or an Excel spreadsheet.
• Ordered and unordered time series data. The frequency of the time series need not be
fixed, unlike in other libraries and tools; Pandas is exceptionally robust in handling
uneven time-series data.
• Arbitrary matrix data with homogeneous or heterogeneous data in the rows and columns.
• Any other form of statistical or observational data sets. The data need not be labeled at
all; the Pandas data structures can process it even without labeling.

It was launched as an open-source Python library in 2009. Currently, it has become one of the
favourite Python libraries for machine learning among many ML enthusiasts. The reason is
that it offers some robust techniques for data analysis and data manipulation. This library is
extensively used in academia. Moreover, it supports different commercial domains like
business and web analytics, economics, statistics, neuroscience, finance, advertising, etc. It
also works as a foundational library for many advanced Python libraries.


Here are some of its key features:

• Handles missing data


• Handles time series data
• Supports indexing, slicing, reshaping, subsetting, joining, and merging of large datasets
• Offers optimized code for Python using C and Cython
• Powerful DataFrame object for broad data manipulation support

Example:

import pandas as pad

data_1 = {"Countries": ["Bhutan", "Cape Verde", "Chad", "Estonia", "Guinea", "Kenya", "Libya", "Mexico"],
          "capital": ["Thimphu", "Praia", "N'Djamena", "Tallinn", "Conakry", "Nairobi", "Tripoli", "Mexico City"],
          "Currency": ["Ngultrum", "Cape Verdean escudo", "CFA Franc", "Estonia Kroon; Euro", "Guinean franc", "Kenya shilling", "Libyan dinar", "Mexican peso"],
          "population": [20.4, 143.5, 12.52, 135.7, 52.98, 76.21, 34.28, 54.32]}

data_1_table = pad.DataFrame(data_1)
print(data_1_table)

Output:

Countries capital Currency population


0 Bhutan Thimphu Ngultrum 20.40
1 Cape Verde Praia Cape Verdean escudo 143.50
2 Chad N'Djamena CFA Franc 12.52
3 Estonia Tallinn Estonia Kroon; Euro 135.70
4 Guinea Conakry Guinean franc 52.98
5 Kenya Nairobi Kenya shilling 76.21
6 Libya Tripoli Libyan dinar 34.28
7 Mexico Mexico City Mexican peso 54.32

5) Matplotlib


Matplotlib is a Python library that is used for data visualization. It is used by developers when
they want to visualize the data and its patterns. It is a 2-D plotting library that is used to create
2-D graphs and plots.

It has a module, pyplot, which is used for plotting graphs, and it provides different features for
controlling line styles, font properties, formatting axes, and more. Matplotlib provides different
types of graphs and plots such as histograms, error charts, bar charts, and many more.

Example 1:

import matplotlib.pyplot as plot
import numpy as nup

# Prepare the data
K = nup.linspace(2, 4, 8)
R = nup.linspace(5, 7, 9)
Q = nup.linspace(0, 1, 3)

# Plot the data
plot.plot(K, K, label = 'K')
plot.plot(R, R, label = 'R')
plot.plot(Q, Q, label = 'Q')

# Add a legend
plot.legend()

# Show the plot
plot.show()

Output:

Example 2:

import matplotlib.pyplot as plot

# Creating dataset-1
K_1 = [8, 4, 6, 3, 5, 10, 13, 16, 12, 21]
R_1 = [11, 6, 13, 15, 17, 5, 3, 2, 8, 19]

# Creating dataset-2
K_2 = [6, 9, 18, 14, 16, 15, 11, 16, 12, 20]
R_2 = [16, 4, 10, 13, 18, 20, 6, 2, 17, 15]

plot.scatter(K_1, R_1, c = "Black", linewidths = 2, marker = "s", edgecolor = "Brown", s = 50)
plot.scatter(K_2, R_2, c = "Purple", linewidths = 2, marker = "^", edgecolor = "Grey", s = 200)

plot.xlabel ("X-axis")
plot.ylabel ("Y-axis")
print ("Scatter Plot")
plot.show()

Output:


Matplotlib is a data visualization library that is used for 2D plotting to produce publication-
quality image plots and figures in a variety of formats. The library helps to generate histograms,
plots, error charts, scatter plots, bar charts with just a few lines of code.

It provides a MATLAB-like interface and is exceptionally user-friendly. It works by using


standard GUI toolkits like GTK+, wxPython, Tkinter, or Qt to provide an object-oriented API
that helps programmers to embed graphs and plots into their applications.

It is the oldest Python machine learning library. However, it is still not obsolete. It is one of
the most innovative data visualization libraries for Python. So, the ML community admires it.

The following features of the Matplotlib library make it a famous Python machine learning
library among the ML community:

• Its interactive charts and plots allow fascinating data storytelling


• Offers an extensive list of plots appropriate for a particular use case
• Charts and plots are customizable and exportable to various file formats
• Offers embeddable visualizations with different GUI applications
• Various Python frameworks and libraries extend Matplotlib

Below are some of the advantages and disadvantages of Matplotlib.

Advantages:

• It helps produce plots that are configurable, powerful, and accurate.


• It can be easily streamlined with the Jupyter Notebook.
• It supports GUI toolkits that include wxPython, Qt, and Tkinter.
• It is leveraged with a structure that can support Python as well as IPython shells.

Disadvantages:

• It has a strong dependency on NumPy and other such libraries for the SciPy stack.
• It has a high learning curve as its use takes quite a lot of knowledge and application
from the learner’s end.
• It can be confusing for developers as it provides two distinct interfaces: the object-
oriented API and the MATLAB-style interface.
• It is primarily used for data visualization. It is not suitable for data analysis. To get both
data visualization and data analysis, we will have to integrate it with other libraries.

6) Seaborn

Seaborn is a library in Python that allows us to create analytical graphs.


Seaborn is based on Matplotlib and includes the data structures of pandas.


Below are some advantages and disadvantages of Seaborn.

Advantages:

• It produces graphs that are more appealing than those created with Matplotlib.
• It has integrated packages that are unavailable in Matplotlib.
• It uses less code for visualizing graphs.
• It is integrated with pandas for visualizing and analyzing data.

Disadvantages:

• Prior knowledge of Matplotlib is required to work with Seaborn.


• Seaborn does not provide the same degree of customization that Matplotlib offers.
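A minimal Seaborn sketch is shown below for reference (not part of the original manual); it uses the example "tips" dataset that Seaborn can load, and assumes an internet connection is available to fetch it.

import seaborn as sns
import matplotlib.pyplot as plot

# Load one of Seaborn's example datasets
tips = sns.load_dataset("tips")

# A scatter plot with a fitted regression line, built on top of Matplotlib
sns.regplot(x="total_bill", y="tip", data=tips)
plot.show()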

7) TensorFlow

TensorFlow was developed for Google's internal use by the Google Brain team. Its first
release came in November 2015 under the Apache License 2.0. TensorFlow is a popular
computational framework for creating machine learning models. TensorFlow supports a
variety of different toolkits for constructing models at varying levels of abstraction.

TensorFlow exposes very stable Python and C++ APIs. It can also expose backward-compatible
APIs for other languages, but they might be unstable. TensorFlow has a flexible
architecture with which it can run on a variety of computational platforms: CPUs, GPUs, and
TPUs. TPU stands for Tensor Processing Unit, a hardware chip built around TensorFlow for
machine learning and artificial intelligence.

TensorFlow empowers some of the largest contemporary AI models globally. It is recognized
as an end-to-end Deep Learning and Machine Learning library for solving practical challenges.

The following key features of TensorFlow make it one of the best machine learning libraries
for Python:

• Comprehensive control on developing a machine learning model and robust neural


network
• Deploy models on cloud, web, mobile, or edge devices through TFX, TensorFlow.js,
and TensorFlow Lite
• Supports abundant extensions and libraries for solving complex problems
• Supports different tools for integration of Responsible AI and ML solutions
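A minimal TensorFlow sketch is shown below for reference (not part of the original manual); it demonstrates eager tensor operations and automatic differentiation, with arbitrary example values.

import tensorflow as tf

# Constant tensors and a matrix product (executed eagerly)
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
print(tf.matmul(a, b).numpy())

# Automatic differentiation: d(x^2)/dx at x = 3 is 6
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x
print(tape.gradient(y, x).numpy())    # 6.0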

8) Keras


Keras had over 200,000 users as of November 2017. Keras is an open-source library used for
neural networks and machine learning. Keras can run on top of TensorFlow, Theano, Microsoft
Cognitive Toolkit, R, or PlaidML. Keras can also run efficiently on both CPU and GPU.

Keras works with neural-network building blocks like layers, objectives, activation functions,
and optimizers. Keras also has a number of features for working with image and text data that
come in handy when writing deep neural network code.

Apart from standard neural networks, Keras supports convolutional and recurrent neural
networks.

It was released in 2015, and by now it is a cutting-edge open-source Python deep learning
framework and API. It is similar to TensorFlow in several aspects, but it is designed with a
human-centred approach to make DL and ML accessible and easy for everybody.

You can conclude that Keras is one of the most versatile machine learning libraries for Python
because it includes:

• Everything that TensorFlow provides, presented in an easy-to-understand format.
• Quickly runs various DL iterations with full deployment proficiencies.
• Support for large TPU and GPU clusters, which facilitates commercial Python machine
learning.
• It is used in various applications, including natural language processing, computer
vision, reinforcement learning, and generative deep learning. So, it is useful for graph,
structured, audio, and time series data.
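A minimal Keras sketch is shown below for reference (not part of the original manual); the toy data, layer sizes, and number of epochs are illustrative assumptions.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical toy data: 100 samples with 4 features and binary labels
X = np.random.rand(100, 4)
y = (X.sum(axis=1) > 2.0).astype(int)

# A small sequential network for binary classification
model = keras.Sequential([
    layers.Dense(8, activation="relu", input_shape=(4,)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
print(model.evaluate(X, y, verbose=0))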

9) PyTorch
PyTorch has a range of tools and libraries that support computer vision, machine learning,
and natural language processing. The PyTorch library is open-source and is based on the
Torch library. The most significant advantage of the PyTorch library is its ease of learning
and use.

PyTorch integrates smoothly with the Python data science stack, including NumPy; you will
hardly notice a difference between NumPy and PyTorch. PyTorch also allows developers to
perform computations on tensors. PyTorch has a robust framework for building computational
graphs on the go and even changing them at runtime. Other advantages of PyTorch include
multi-GPU support, simplified preprocessors, and custom data loaders.

Facebook released PyTorch as a powerful competitor of TensorFlow in 2016. It has now


attained huge popularity among deep learning and machine learning researchers. Various
aspects of PyTorch suggest that it is one of the outstanding Python libraries for machine
learning. Here are some of its key capabilities.


• Full support for the development of customized deep neural networks
• Production-ready with TorchServe
• Supports distributed computing through the torch.distributed backend
• Supports various extensions and tools to solve complex problems
• Compatible with all leading cloud platforms for extensible deployment
• Also supported on GitHub as an open-source Python framework
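A minimal PyTorch sketch is shown below for reference (not part of the original manual); it demonstrates tensor computation and autograd with arbitrary example values.

import torch

# A tensor that tracks gradients and a fixed weight tensor
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
w = torch.tensor([[0.5], [0.25]])

# Matrix multiplication followed by a scalar loss
y = x @ w
loss = y.sum()

# Backpropagation fills x.grad with d(loss)/dx
loss.backward()
print(loss.item())
print(x.grad)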

Conclusion

In this experiment, we have discussed different Python libraries which are used for performing
machine learning tasks. We have also shown examples for each library.


EXPERIMENT 3: Linear Regression

Title: Exploring Linear Regression: A Comprehensive Experimental Analysis

Aim: To study & explore the performance of linear regression in various real-world scenarios.

RESOURCES REQUIRED: H/W :- P4 machine


S/W :- Google Colaboratory or Jupyter Notebook

Introduction:

Linear regression is a fundamental statistical technique widely used in various fields to model
the relationship between a dependent variable and one or more independent variables.

Linear regression serves as a powerful tool for predictive modeling and understanding the
relationship between variables. This study investigates its applicability in diverse scenarios,
addressing questions regarding its assumptions, accuracy, and potential challenges.

2. Experimental Design:

Dataset Selection: We selected a diverse set of datasets from different domains, ranging from
finance and healthcare to social sciences. Each dataset was preprocessed to handle missing
values and outliers.

Assumption Check: We rigorously examined the assumptions of linear regression, including


linearity, independence, homoscedasticity, and normality of residuals. Diagnostic plots and
statistical tests were employed to assess the validity of these assumptions.

Model Training: Linear regression models were trained on each dataset using standard
techniques. The performance metrics such as Mean Squared Error (MSE), R-squared, and
adjusted R-squared were employed to evaluate model fitness.

Theory:

Regression is a method of modelling a target value based on independent predictors. This
method is mostly used for forecasting and finding out cause-and-effect relationships between
variables. Regression techniques mostly differ based on the number of independent variables
and the type of relationship between the independent and dependent variables.


Linear Regression

Simple linear regression is a type of regression analysis where the number of independent
variables is one and there is a linear relationship between the independent (x) and dependent (y)
variable. Based on the given data points, we try to plot a best fit straight line that models the
points as closely as possible. The line can be modelled using the linear equation shown below.

y = a_0 + a_1 * x ## Linear Equation

The motive of the linear regression algorithm is to find the best values for a_0 and a_1. Before
moving on to the algorithm, let’s have a look at two important concepts you must know to
better understand linear regression.

Cost Function

The cost function helps us to figure out the best possible values for a_0 and a_1 which would
provide the best fit line for the data points. Since we want the best values for a_0 and a_1, we
convert this search problem into a minimization problem where we would like to minimize the
error between the predicted value and the actual value.

Minimization and Cost Function

We choose the mean squared error as the function to minimize. The difference between the
predicted value and the ground truth measures the error. We square the error difference, sum
over all data points, and divide that value by the total number of data points. This gives the
average squared error over all the data points, so this cost function is also known as the Mean
Squared Error (MSE) function:

J(a_0, a_1) = (1/N) * Σ (pred_i - y_i)^2   ## Cost Function (MSE)

Now, using this MSE function, we are going to change the values of a_0 and a_1 such that the
MSE value settles at the minima.


Gradient Descent

The next important concept needed to understand linear regression is gradient descent.
Gradient descent is a method of updating a_0 and a_1 to reduce the cost function(MSE). The
idea is that we start with some values for a_0 and a_1 and then we change these values
iteratively to reduce the cost. Gradient descent helps us on how to change the values.

Gradient Descent

To draw an analogy, imagine a pit in the shape of U and you are standing at the topmost point
in the pit and your objective is to reach the bottom of the pit. There is a catch, you can only
take a discrete number of steps to reach the bottom. If you decide to take one step at a time you
would eventually reach the bottom of the pit but this would take a longer time. If you choose
to take longer steps each time, you would reach sooner but, there is a chance that you could
overshoot the bottom of the pit and not exactly at the bottom. In the gradient descent algorithm,
the number of steps you take is the learning rate. This decides on how fast the algorithm
converges to the minima.

Convex vs Non-convex function

Sometimes the cost function can be a non-convex function, where you could settle at a local
minimum, but for linear regression it is always a convex function.

You may be wondering how to use gradient descent to update a_0 and a_1. To update a_0 and
a_1, we take gradients from the cost function. To find these gradients, we take partial
derivatives of the cost function with respect to a_0 and a_1:

D_a0 = (2/N) * Σ (pred_i - y_i)          ## partial derivative w.r.t. a_0
D_a1 = (2/N) * Σ (pred_i - y_i) * x_i    ## partial derivative w.r.t. a_1

Deriving these partial derivatives requires some calculus, but if you don't know calculus, it is
alright; you can take them as given.


The partial derivatives are the gradients, and they are used to update the values of a_0 and a_1:

a_0 = a_0 - alpha * D_a0    ## update rules
a_1 = a_1 - alpha * D_a1

Alpha is the learning rate, which is a hyperparameter that you must specify. A smaller learning
rate gets you closer to the minima but takes more time to reach them; a larger learning rate
converges sooner, but there is a chance that you could overshoot the minima.
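The following is a minimal sketch of gradient descent for simple linear regression on a small synthetic dataset; it is not the manual's official program, and the data, learning rate alpha, and epoch count are illustrative assumptions.

import numpy as np

# Hypothetical synthetic data: y is roughly 2*x + 1 plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
y = 2 * x + 1 + rng.normal(0, 0.3, size=x.shape)

a_0, a_1 = 0.0, 0.0    # initial parameters
alpha = 0.05           # learning rate (assumed)
N = len(x)

for epoch in range(1000):
    pred = a_0 + a_1 * x
    error = pred - y
    # Partial derivatives of the MSE cost with respect to a_0 and a_1
    d_a0 = (2 / N) * error.sum()
    d_a1 = (2 / N) * (error * x).sum()
    # Gradient descent update
    a_0 -= alpha * d_a0
    a_1 -= alpha * d_a1

print("a_0:", round(a_0, 3), "a_1:", round(a_1, 3))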

4. Results:

Assumption Validation: The experimental results demonstrated that while linear regression
assumes certain conditions, deviations from these assumptions did not always lead to
significant model deterioration. Robustness to violations of normality and homoscedasticity
was observed in specific scenarios.

Model Performance: The performance of linear regression varied across datasets. In some
cases, it provided accurate predictions, while in others, its simplicity led to underfitting. The
importance of feature selection and engineering emerged as crucial factors in improving model
performance.

5. Conclusion: This experimental analysis contributes to a deeper understanding of linear


regression, emphasizing its strengths and potential pitfalls. As a foundational statistical tool,
linear regression continues to play a crucial role in various applications, and awareness of its
assumptions is paramount for accurate modeling.

PROGRAM:

OUTPUT:


EXPERIMENT 4: Logistic Regression

Title: Exploring the Effectiveness of Logistic Regression in Binary Classification Tasks

Aim: To study & explore the performance of logistic regression in various real-world
scenarios.

RESOURCES REQUIRED: H/W :- P4 machine


S/W :- Google Colaboratory or Jupyter Notebook

Introduction:

Logistic regression is a statistical method used for binary classification, where the dependent
variable is categorical and has only two possible outcomes. It models the probability that an
instance belongs to a particular class based on one or more independent variables. Despite
its simplicity, logistic regression can provide valuable insights into the relationships between
predictors and outcomes, making it a versatile tool in machine learning.

Theory :

• Logistic regression is one of the most popular Machine Learning algorithms, which
comes under the Supervised Learning technique. It is used for predicting the
categorical dependent variable using a given set of independent variables.
• Logistic regression predicts the output of a categorical dependent variable; therefore the
outcome must be a categorical or discrete value. It can be Yes or No, 0 or 1, True or False,
etc., but instead of giving the exact values 0 and 1, it gives probabilistic values that lie
between 0 and 1.
• Logistic Regression is much like Linear Regression except in how it is used: Linear
Regression is used for solving regression problems, whereas Logistic Regression is used for
solving classification problems.
• In Logistic regression, instead of fitting a regression line, we fit an "S" shaped logistic
function, which predicts two maximum values (0 or 1).
• The curve from the logistic function indicates the likelihood of something such as
whether the cells are cancerous or not, a mouse is obese or not based on its weight,
etc.
• Logistic Regression is a significant machine learning algorithm because it has the
ability to provide probabilities and classify new data using continuous and discrete
datasets.
• Logistic Regression can be used to classify observations using different types of data and
can easily determine the most effective variables for the classification. The image below
shows the logistic function:


Logistic Function (Sigmoid Function):

• The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
• It maps any real value into another value within a range of 0 and 1.
• The value of the logistic regression must be between 0 and 1, which cannot go
beyond this limit, so it forms a curve like the "S" form. The S-form curve is called the
Sigmoid function or the logistic function.
• In logistic regression, we use the concept of a threshold value, which decides between the
two classes: values above the threshold are mapped to 1, and values below the threshold are
mapped to 0 (see the short sketch after this list).
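A minimal sketch of the sigmoid and the thresholding step (the 0.5 threshold and the sample
values are illustrative choices):

import numpy as np

def sigmoid(z):
    # Maps any real value into the range (0, 1)
    return 1 / (1 + np.exp(-z))

z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probs = sigmoid(z)
labels = (probs >= 0.5).astype(int)   # apply the threshold value
print(probs.round(3))   # [0.018 0.269 0.5 0.731 0.982]
print(labels)           # [0 0 1 1 1]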

Assumptions for Logistic Regression:

• The dependent variable must be categorical in nature.


• The independent variables should not exhibit multicollinearity.

Logistic Regression Equation:

The Logistic Regression equation can be obtained from the Linear Regression equation. The
mathematical steps to get the Logistic Regression equation are given below:

• We know the equation of a straight line can be written as:

y = b0 + b1x1 + b2x2 + ... + bnxn

• In Logistic Regression, y can be between 0 and 1 only, so let's divide the above equation
by (1 - y):

y / (1 - y), which is 0 for y = 0 and infinity for y = 1

• But we need a range between -[infinity] and +[infinity], so we take the logarithm of the
equation and it becomes:

log[ y / (1 - y) ] = b0 + b1x1 + b2x2 + ... + bnxn

The above equation is the final equation for Logistic Regression.
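A quick numerical check of this log-odds (logit) transform, using illustrative probabilities:

import numpy as np

p = np.array([0.1, 0.5, 0.9])
log_odds = np.log(p / (1 - p))   # maps (0, 1) to (-infinity, +infinity)
print(log_odds.round(3))         # [-2.197  0.     2.197]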

Type of Logistic Regression:

On the basis of the categories, Logistic Regression can be classified into three types:

• Binomial: In binomial Logistic regression, there can be only two possible types of the
dependent variables, such as 0 or 1, Pass or Fail, etc.
• Multinomial: In multinomial Logistic regression, there can be 3 or more possible
unordered types of the dependent variable, such as "cat", "dogs", or "sheep"
• Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered types
of dependent variables, such as "low", "Medium", or "High".
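As an illustration of the multinomial case, scikit-learn's LogisticRegression handles a
three-class problem such as the Iris dataset directly; the sketch below is an illustrative
example, not part of the prescribed program:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Three unordered classes (setosa, versicolor, virginica)
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

print(clf.predict(X[:5]))                 # predicted class labels
print(clf.predict_proba(X[:5]).round(2))  # per-class probabilities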

PROGRAM and OUTPUT:

# import the necessary libraries


from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# load the breast cancer dataset


X, y = load_breast_cancer(return_X_y=True)

# split the train and test dataset


X_train, X_test,\
y_train, y_test = train_test_split(X, y,
test_size=0.20,
random_state=23)
# LogisticRegression
clf = LogisticRegression(random_state=0)
clf.fit(X_train, y_train)

# Prediction
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)


print("Logistic Regression model accuracy (in %):", acc*100)


OUTPUT:
Logistic Regression model accuracy (in %): 94.73684210526315

from sklearn.model_selection import train_test_split


from sklearn import datasets, linear_model, metrics

# load the digit dataset


digits = datasets.load_digits()

# defining feature matrix(X) and response vector(y)


X = digits.data
y = digits.target

# splitting X and y into training and testing sets


X_train, X_test,\
y_train, y_test = train_test_split(X, y,
test_size=0.4,
random_state=1)

# create logistic regression object


reg = linear_model.LogisticRegression()

# train the model using the training sets


reg.fit(X_train, y_train)

# making predictions on the testing set


y_pred = reg.predict(X_test)

# comparing actual response values (y_test)


# with predicted response values (y_pred)
print("Logistic Regression model accuracy(in %):",
metrics.accuracy_score(y_test, y_pred)*100)

OUTPUT:

Logistic Regression model accuracy(in %): 96.52294853963839


EXPERIMENT 5: Support Vector Machine

Title: Exploring the Effectiveness of Support Vector Machine

Aim: To study and explore the performance and implementation of Support Vector Machine
regression and classification in Python.

RESOURCES REQUIRED: H/W :- P4 machine


S/W :- Google Colaboratory or Jupyter Notebook

Introduction:

Support Vector Machines (SVMs) are a class of supervised learning algorithms that analyze
data for classification and regression analysis. The primary objective of SVMs is to find the
optimal hyperplane that separates data points into different classes while maximizing the
margin, which is the distance between the hyperplane and the nearest data points of each
class. SVMs are particularly useful in scenarios where the data is not linearly separable, as
they can map the data to a higher-dimensional space using kernel functions to facilitate
separation.

Theory:

Support Vector Machine Algorithm


Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms,
which is used for Classification as well as Regression problems. However, primarily, it is
used for Classification problems in Machine Learning.

The goal of the SVM algorithm is to create the best line or decision boundary that can
segregate n-dimensional space into classes so that we can easily put the new data point in the
correct category in the future. This best decision boundary is called a hyperplane.

SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme
cases are called support vectors, and hence the algorithm is termed Support Vector Machine.
Consider the below diagram, in which there are two different categories that are classified
using a decision boundary or hyperplane:


Types of SVM

SVM can be of two types:

• Linear SVM: Linear SVM is used for linearly separable data, which means that if a dataset
can be classified into two classes using a single straight line, then such data is termed
linearly separable data, and the classifier used is called a Linear SVM classifier.
• Non-linear SVM: Non-linear SVM is used for non-linearly separable data, which means that if
a dataset cannot be classified using a straight line, then such data is termed non-linear
data, and the classifier used is called a Non-linear SVM classifier.

Hyperplane and Support Vectors in the SVM algorithm:

Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-


dimensional space, but we need to find out the best decision boundary that helps to classify
the data points. This best boundary is known as the hyperplane of SVM.

The dimensions of the hyperplane depend on the number of features present in the dataset: if
there are 2 features (as shown in the image), the hyperplane is a straight line, and if there
are 3 features, the hyperplane is a 2-dimensional plane.

We always create the hyperplane that has the maximum margin, which means the maximum distance
between the hyperplane and the nearest data points of each class.


Support Vectors:

The data points or vectors that are closest to the hyperplane and that affect the position of
the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are
called support vectors.

How does SVM work?

Linear SVM:

The working of the SVM algorithm can be understood with an example. Suppose we have a dataset
that has two tags (green and blue), and the dataset has two features, x1 and x2. We want a
classifier that can classify the pair (x1, x2) of coordinates as either green or blue.
Consider the below image:

Since this is a 2-D space, we can easily separate these two classes using a straight line. But
there can be multiple lines that separate these classes. Consider the below image:

Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary
or region is called a hyperplane. The SVM algorithm finds the closest points of the lines from
both classes. These points are called support vectors. The distance between the support vectors
and the hyperplane is called the margin, and the goal of SVM is to maximize this margin. The
hyperplane with maximum margin is called the optimal hyperplane.
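A short scikit-learn sketch of a linear SVM on synthetic, linearly separable data, showing the
learned hyperplane and the support vectors (the dataset and parameter values are illustrative
assumptions):

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters (synthetic data)
X, y = make_blobs(n_samples=60, centers=2, random_state=6)

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The support vectors are the points closest to the hyperplane
print("Support vectors per class:", clf.n_support_)
print("Hyperplane: w =", clf.coef_, " b =", clf.intercept_)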


Non-Linear SVM:

If data is linearly arranged, then we can separate it by using a straight line, but for non-linear
data, we cannot draw a single straight line. Consider the below image:

So to separate these data points, we need to add one more dimension. For linear data, we have
used two dimensions x and y, so for non-linear data, we will add a third dimension z. It can
be calculated as:

z = x² + y²

By adding the third dimension, the sample space will become as below image:


Since we are now in 3-D space, the decision surface looks like a plane parallel to the x-axis.
If we convert it back to 2-D space with z = 1, the boundary becomes x² + y² = 1.

Hence we get a circular boundary of radius 1 for the non-linear data.
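The effect of this mapping can be sketched in code: adding the hand-crafted feature z = x² + y²
makes circular data linearly separable, while an RBF kernel performs an equivalent non-linear
mapping implicitly (the synthetic data and parameters below are illustrative assumptions):

import numpy as np
from sklearn.svm import SVC

# Synthetic data: points inside the unit circle are class 1, outside are class 0
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 1).astype(int)

# Option 1: add the third dimension z = x^2 + y^2 by hand and use a linear SVM
z = (X[:, 0]**2 + X[:, 1]**2).reshape(-1, 1)
X3 = np.hstack([X, z])
linear_on_mapped = SVC(kernel="linear").fit(X3, y)

# Option 2: let the RBF kernel do the non-linear mapping implicitly
rbf = SVC(kernel="rbf").fit(X, y)

print("Linear SVM on mapped data:", linear_on_mapped.score(X3, y))
print("RBF SVM on original data: ", rbf.score(X, y))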

# Load the important packages


from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.svm import SVC

# Load the datasets


cancer = load_breast_cancer()
X = cancer.data[:, :2]
y = cancer.target

#Build the model


svm = SVC(kernel="rbf", gamma=0.5, C=1.0)
# Trained the model
svm.fit(X, y)

# Plot Decision Boundary


DecisionBoundaryDisplay.from_estimator(
svm,
X,
response_method="predict",
cmap=plt.cm.Spectral,
alpha=0.8,
xlabel=cancer.feature_names[0],
ylabel=cancer.feature_names[1],
)

# Scatter plot
plt.scatter(X[:, 0], X[:, 1],
c=y,
s=20, edgecolors="k")
plt.show()

OUTPUT:


EXPERIMENT 6: Hebbian Learning

Title: Implementation of Hebbian Learning Rules

Aim: To implement Hebbian learning rules to train a neural network to perform the logical
AND operation.

RESOURCES REQUIRED: H/W :- P4 machine


S/W :- Google Colaboratory or Jupyter Notebook

Introduction:

The implementation of artificial neural networks (ANNs) to perform logical operations is a


fundamental application of neural networks in the field of artificial intelligence. The logical
AND operation is a binary operation that returns true (1) only when both inputs are true (1).
We use Hebbian learning, a type of unsupervised learning rule based on synaptic plasticity, to
train the neural network. Additionally, we employ a bipolar sigmoid function to introduce non-
linearity into the output of the neural network.

Theory :

Hebbian Learning Rule Algorithm :

1. Set all weights to zero, wi = 0 for i=1 to n, and bias to zero.


2. For each input vector, S(input vector) : t(target output pair), repeat steps 3-5.
3. Set activations for input units with the input vector Xi = Si for i = 1 to n.
4. Set the corresponding output value to the output neuron, i.e. y = t.
5. Update the weight and bias by applying the Hebb rule for all i = 1 to n:
wi(new) = wi(old) + xi·y and b(new) = b(old) + y.

Methodology:

1]. Neural Network Architecture: We design a simple neural network consisting of two input
neurons and one output neuron. The input neurons represent the two binary inputs of the AND
gate, and the output neuron produces the result of the AND operation.

2]. Activation Function: We use a bipolar sigmoid function as the activation function for the
output neuron. The bipolar sigmoid function maps the output of the neuron to the range [-1, 1],
introducing non-linearity into the network.


3]. Hebbian Learning: We implement Hebbian learning rules to update the weights of the
connections between neurons. Hebbian learning is a local learning rule that strengthens the
connections between neurons when they are simultaneously active.

4]. Training: We train the neural network using Hebbian learning with input patterns
corresponding to all possible combinations of binary inputs for the AND gate.

5]. Testing: After training, we evaluate the performance of the neural network by feeding it
with different input patterns and observing the corresponding output.

Implementing AND Gate :

Truth Table of AND Gate using bipolar sigmoidal function

There are 4 training samples, so there will be 4 iterations. Also, the activation function used
here is Bipolar Sigmoidal Function so the range is [-1,1].
Step 1 :
Set weight and bias to zero, w = [ 0 0 0 ]T and b = 0.
Step 2 :
Set input vector Xi = Si for i = 1 to 4.
X1 = [ -1 -1 1 ]T
X2 = [ -1 1 1 ]T
X3 = [ 1 -1 1 ]T
X4 = [ 1 1 1 ]T
Step 3 :
Output value is set to y = t.
Step 4 :
Modifying weights using Hebbian Rule:
First iteration –
w(new) = w(old) + x1y1 = [ 0 0 0 ]T + [ -1 -1 1 ]T . [ -1 ] = [ 1 1 -1 ]T
For the second iteration, the final weight of the first one will be used and so on.
Second iteration –


w(new) = [ 1 1 -1 ]T + [ -1 1 1 ]T . [ -1 ] = [ 2 0 -2 ]T
Third iteration –
w(new) = [ 2 0 -2]T + [ 1 -1 1 ]T . [ -1 ] = [ 1 1 -3 ]T
Fourth iteration –
w(new) = [ 1 1 -3]T + [ 1 1 1 ]T . [ 1 ] = [ 2 2 -2 ]T
So, the final weight matrix is [ 2 2 -2 ]T

Testing the network :

The network with the final weights


For x1 = -1, x2 = -1, b = 1, Y = (-1)(2) + (-1)(2) + (1)(-2) = -6
For x1 = -1, x2 = 1, b = 1, Y = (-1)(2) + (1)(2) + (1)(-2) = -2
For x1 = 1, x2 = -1, b = 1, Y = (1)(2) + (-1)(2) + (1)(-2) = -2
For x1 = 1, x2 = 1, b = 1, Y = (1)(2) + (1)(2) + (1)(-2) = 2
The results are all compatible with the original table.
Decision Boundary :
2x1 + 2x2 – 2b = y
Replacing y with 0, 2x1 + 2x2 – 2b = 0
Since bias, b = 1, so 2x1 + 2x2 – 2(1) = 0
2( x1 + x2 ) = 2
The final equation, x2 = -x1 + 1
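The hand calculation above can be reproduced with a short NumPy sketch (the code mirrors the
worked example, so the final weights and net inputs should match):

import numpy as np

# Bipolar training samples [x1, x2, bias] and AND targets
X = np.array([[-1, -1, 1],
              [-1,  1, 1],
              [ 1, -1, 1],
              [ 1,  1, 1]])
t = np.array([-1, -1, -1, 1])

w = np.zeros(3)
for xi, ti in zip(X, t):
    w = w + ti * xi          # Hebb rule: w(new) = w(old) + x*t
print("Final weights:", w)   # [ 2.  2. -2.]
print("Net inputs:", X @ w)  # [-6. -2. -2.  2.] -> signs match the AND targets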

Results & Conclusion:

The experiment demonstrates the successful implementation of Hebbian learning rules for
training a neural network to perform the logical AND operation. The weights of the neural
network are adjusted during the training process to effectively perform the AND operation.
After training, the neural network successfully produces the correct output for all possible
input patterns of the AND gate.


Decision Boundary of AND Function

Program And OUTPUT:

import numpy as np

# Define bipolar sigmoid function
def bipolar_sigmoid(x):
    return 2 / (1 + np.exp(-x)) - 1

# Define Hebbian learning function
def hebbian_learning(input_pattern, output, weights, learning_rate):
    delta_w = learning_rate * np.outer(output, input_pattern)
    weights += delta_w

# Define AND gate function
def AND_gate(input_data):   # input_data is not used; kept to match the original call below
    weights = np.random.rand(1, 2)  # Initialize weights randomly
    learning_rate = 0.1

    # Define input patterns for AND gate
    input_patterns = [[-1, -1], [-1, 1], [1, -1], [1, 1]]
    expected_outputs = [-1, -1, -1, 1]

    # Training
    for _ in range(1000):
        for i in range(len(input_patterns)):
            input_pattern = np.array(input_patterns[i])
            output = bipolar_sigmoid(np.dot(weights, input_pattern))
            hebbian_learning(input_pattern, expected_outputs[i], weights, learning_rate)

    # Testing
    print("Weights after training:", weights)
    print("\nTesting:")
    for i in range(len(input_patterns)):
        input_pattern = np.array(input_patterns[i])
        output = bipolar_sigmoid(np.dot(weights, input_pattern))
        print("Input:", input_patterns[i], "Output:", output)

# Test the AND gate function
AND_gate([])

OUTPUT:

Weights after training: [[200.47566818 200.73044635]]

Testing:
Input: [-1, -1] Output: [-1.]
Input: [-1, 1] Output: [0.12670444]
Input: [1, -1] Output: [-0.12670444]
Input: [1, 1] Output: [1.]


EXPERIMENT 7: McCulloch-Pitts Model

Title: Implementation of the McCulloch-Pitts Model.

Aim: To implement logic gates using a McCulloch-Pitts neural network.

Introduction:

The McCulloch-Pitts model, proposed by Warren McCulloch and Walter Pitts in 1943, is one
of the earliest neural network models. It describes the behavior of a simplified neuron, which
receives input signals, applies weights to them, sums them up, and produces an output based
on a threshold. In this experiment, we apply the McCulloch-Pitts model to implement an AND
gate, a fundamental logical operation.

THEORY:
Logic gates are elementary building blocks of digital circuits, performing logical operations
on binary inputs. By implementing logic gates using neural networks, we bridge the gap
between conventional digital computing and neural computation. The McCulloch-Pitts
neural network model, inspired by the biological neurons, offers a simple yet powerful
framework for modeling logical operations.

Methodology:
McCulloch-Pitts Neurons: We implement the McCulloch-Pitts neuron model, which
receives binary input signals and produces binary outputs based on predefined thresholds.

AND Gate Implementation: We configure a McCulloch-Pitts neural network to simulate


the behavior of an AND gate. The network activates its output neuron only when all input
neurons are active.

OR Gate Implementation: Similarly, we design a neural network to function as an OR gate,


where the output neuron activates if any of the input neurons are active.

NOT Gate Implementation: For the NOT gate, we employ a single McCulloch-Pitts neuron
that inversely mirrors its input signal.

Experiment Execution: We test each logic gate implementation by providing different input
combinations and observing the corresponding outputs. We verify that the neural networks
accurately compute the truth tables of their respective logic gates.
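The gate programs that follow all share the same weighted-sum-and-threshold rule; a small
helper illustrating that rule (the function name and the specific weight/threshold choices are
illustrative):

def mp_neuron(inputs, weights, threshold):
    # Fires (returns 1) only when the weighted sum reaches the threshold
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# AND: both inputs are needed to reach the threshold of 2
print([mp_neuron([a, b], [1, 1], 2) for a in (0, 1) for b in (0, 1)])  # [0, 0, 0, 1]
# OR: a single active input is enough to reach the threshold of 1
print([mp_neuron([a, b], [1, 1], 1) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 1]
# NOT: an inhibitory weight of -1 with threshold 0 inverts the input
print([mp_neuron([a], [-1], 0) for a in (0, 1)])                       # [1, 0]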


Architecture of OR Gate

Architecture of AND Gate

Architecture of NOT Gate


Architecture of NOR Gate

Results:
The implemented McCulloch-Pitts neural networks successfully emulate the behavior of
logic gates. For each gate (AND, OR, and NOT), the neural networks produce the expected
outputs based on their truth tables, demonstrating the ability to perform logical operations
using neural computation.

Conclusion:
This experiment highlights the versatility of the McCulloch-Pitts neural network model in
simulating logical operations. By configuring the connections and thresholds of McCulloch-
Pitts neurons, we can construct neural networks that mimic the behavior of conventional
logic gates. This integration of neural and digital computing paradigms opens avenues for
novel approaches to information processing and computational tasks.

Logic Gates using McCulloch-Pitts Neuron


from tabulate import tabulate
FOR AND GATE
#inputs
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
w1 = [1, 1, 1, 1]
w2 = [1, 1, 1, 1]
t = 2
#output
print("x1 x2 w1 w2 t O")
for i in range(len(x1)):
    if (x1[i]*w1[i] + x2[i]*w2[i]) >= t:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 1)
    else:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 0)

OUTPUT AND GATE:


x1 x2 w1 w2 t O
0 0 1 1 2 0
0 1 1 1 2 0
1 0 1 1 2 0
1 1 1 1 2 1

FOR OR GATE
#inputs
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
w1 = [1, 1, 1, 1]
w2 = [1, 1, 1, 1]
t = 1
#output
print("x1 x2 w1 w2 t O")
for i in range(len(x1)):
    if (x1[i]*w1[i] + x2[i]*w2[i]) >= t:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 1)
    else:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 0)

OUTPUT OR GATE:

x1 x2 w1 w2 t O
0 0 1 1 1 0
0 1 1 1 1 1
1 0 1 1 1 1
1 1 1 1 1 1

FOR NOT GATE

#inputs
x = [0, 1]
w = [-1, -1]
t = 0
#output
print("x w t O")
for i in range(len(x)):
    if (x[i]*w[i]) >= t:
        print(x[i],' ',w[i],' ',t,' ', 1)
    else:
        print(x[i],' ',w[i],' ',t,' ', 0)

OUTPUT NOT GATE:

x w t O
0 -1 0 1
1 -1 0 0

FOR NAND GATE

#inputs
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
w1 = [-1, -1, -1, -1]
w2 = [-1, -1, -1, -1]
t = -2
#output
print("x1 x2 w1 w2 t O")
for i in range(len(x1)):
    if (x1[i]*w1[i] + x2[i]*w2[i]) > t:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 1)
    else:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 0)

OUTPUT NAND GATE:

x1 x2 w1 w2 t O
0 0 -1 -1 -2 1
0 1 -1 -1 -2 1
1 0 -1 -1 -2 1
1 1 -1 -1 -2 0

FOR NOR GATE

#inputs
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
w1 = [1, 1, 1, 1]
w2 = [1, 1, 1, 1]
t = 0
#output
print("x1 x2 w1 w2 t O")
for i in range(len(x1)):
    if (x1[i]*w1[i] + x2[i]*w2[i]) <= t:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 1)
    else:
        print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',t,' ', 0)

OUTPUT NOR GATE:

x1 x2 w1 w2 t O
0 0 1 1 0 1
0 1 1 1 0 0
1 0 1 1 0 0
1 1 1 1 0 0

FOR EXOR GATE

#inputs
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
w1 = [1, 1, 1, 1]
w2 = [1, 1, 1, 1]
w3 = [1, 1, 1, 1]
w4 = [-1, -1, -1, -1]
w5 = [-1, -1, -1, -1]
w6 = [1, 1, 1, 1]
t1 = [0.5, 0.5, 0.5, 0.5]
t2 = [-1.5, -1.5, -1.5, -1.5]
t3 = [1.5, 1.5, 1.5, 1.5]
def XOR(a, b):
    # Output computed directly; the weights/thresholds above list the two-layer
    # McCulloch-Pitts construction for XOR and are printed alongside the result.
    if a != b:
        return 1
    else:
        return 0
#output
print('x1 x2 w1 w2 w3 w4 w5 w6 t1 t2 t3 O')
for i in range(len(x1)):
    print(x1[i],' ',x2[i],' ',w1[i],' ',w2[i],' ',w3[i],' ',w4[i],' ',
          w5[i],' ',w6[i],' ',t1[i],' ',t2[i],' ',t3[i],' ',XOR(x1[i],x2[i]))

OUTPUT EXOR GATE:

x1 x2 w1 w2 w3 w4 w5 w6 t1 t2 t3 O
0 0 1 1 1 -1 -1 1 0.5 -1.5 1.5 0
0 1 1 1 1 -1 -1 1 0.5 -1.5 1.5 1
1 0 1 1 1 -1 -1 1 0.5 -1.5 1.5 1
1 1 1 1 1 -1 -1 1 0.5 -1.5 1.5 0


EXPERIMENT 8: Principal Component Analysis

Title: Implementation of PCA on the Iris dataset using Python and the scikit-learn library.

Aim: To transform high-dimensional data into a lower-dimensional space while preserving the
most important information in the dataset.

RESOURCES REQUIRED: H/W :- P4 machine


S/W :- Google Colaboratory or Jupyter Notebook

Introduction:

PCA, which stands for Principal Component Analysis, is a dimensionality reduction technique
used in data analysis and machine learning. It aims to transform high-dimensional data into a
lower-dimensional space while preserving the most important information in the dataset.

The main idea behind PCA is to identify the directions (principal components) in which the
data varies the most. These directions represent the axes along which the data points have the
highest variance. By projecting the data onto these principal components, PCA allows for a
more compact representation of the dataset while minimizing information loss.

Here's how PCA works :

1. Standardization: PCA typically starts with standardizing the features of the dataset to
have a mean of 0 and a standard deviation of 1. This step is crucial because PCA is
sensitive to the scale of features.
2. Covariance Matrix Calculation: PCA calculates the covariance matrix of the
standardized data. The covariance matrix summarizes the relationships between all
pairs of features in the dataset. It provides information about how each feature varies
with respect to others.
3. Eigendecomposition: PCA performs eigendecomposition on the covariance matrix to
obtain the eigenvectors and eigenvalues. Eigenvectors represent the directions of
maximum variance in the data, while eigenvalues indicate the magnitude of variance
along these directions.
4. Selection of Principal Components: PCA sorts the eigenvalues in descending order
and selects the top k eigenvectors corresponding to the largest eigenvalues, where k is
the desired number of principal components. These principal components capture most
of the variance in the data.
5. Projection: PCA projects the original data onto the selected principal components to
obtain the new feature space. Each data point is represented by its coordinates along
the principal components.
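A small NumPy sketch of these five steps, using the Iris data for illustration (this is a
from-scratch walk-through, separate from the scikit-learn program given later):

import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data

# 1. Standardize each feature to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features
cov = np.cov(X_std.T)

# 3. Eigendecomposition of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort eigenvalues in descending order and keep the top 2 eigenvectors
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order[:2]]

# 5. Project the standardized data onto the principal components
X_proj = X_std @ components
print("Explained variance ratio:", (eigvals[order[:2]] / eigvals.sum()).round(4))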

PCA has several applications, including:


• Dimensionality Reduction: PCA reduces the number of features in the dataset while
retaining most of the important information. This can lead to simpler and more
interpretable models and faster computation.
• Visualization: PCA can be used to visualize high-dimensional data in lower
dimensions (e.g., 2D or 3D) for easier interpretation and exploration.
• Noise Reduction: PCA can help in reducing noise and redundancy in the data by
focusing on the most informative features.
• Feature Engineering: PCA can be used as a preprocessing step for feature engineering
in machine learning tasks to improve model performance.

Overall, PCA is a powerful technique for data analysis and preprocessing, particularly when
dealing with high-dimensional datasets.

Results : The PCA implementation successfully reduced the dimensionality of the Iris dataset
from four dimensions to two dimensions. The scatter plot of the data points in the new 2-
dimensional space showed a clear separation between the three species of iris flowers. The first
principal component (PC1) and the second principal component (PC2) captured a significant
amount of variance in the dataset, as evidenced by their high explained variance ratios.

Conclusion: PCA proved to be an effective technique for dimensionality reduction and


visualization of the Iris dataset. By reducing the dataset to two dimensions, we were able to
visualize the data and observe patterns and relationships between the iris samples. PCA can be
a valuable tool in exploratory data analysis and feature engineering tasks, providing insights
into high-dimensional datasets.

Program And OUTPUT

PCA

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Load the Iris dataset


iris = load_iris()
X = iris.data
y = iris.target


feature_names = iris.feature_names

# Standardize the feature matrix


X_std = StandardScaler().fit_transform(X)

# Perform PCA
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)

# Create a DataFrame for the reduced features


df_pca = pd.DataFrame(data=X_pca, columns=['PC1', 'PC2'])
df_pca['target'] = y

# Plot the PCA results


plt.figure(figsize=(8, 6))
targets = np.unique(y)
colors = ['r', 'g', 'b']
for target, color in zip(targets, colors):
    indices_to_keep = df_pca['target'] == target
    plt.scatter(df_pca.loc[indices_to_keep, 'PC1'],
                df_pca.loc[indices_to_keep, 'PC2'],
                c=color,
                s=50)
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA of Iris Dataset')
plt.legend(iris.target_names)
plt.grid()
plt.show()


# Explained variance ratio


print("Explained Variance Ratio:", pca.explained_variance_ratio_)

OUTPUT:

Explained variance ratio: [0.72962445 0.22850762]


Error Back Propagation


import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

        # Initialize weights randomly
        self.weights_input_hidden = np.random.rand(input_size, hidden_size)
        self.weights_hidden_output = np.random.rand(hidden_size, output_size)

    def forward_pass(self, X):
        # Input to hidden layer
        self.hidden_input = np.dot(X, self.weights_input_hidden)
        self.hidden_output = sigmoid(self.hidden_input)

        # Hidden to output layer
        self.output_input = np.dot(self.hidden_output, self.weights_hidden_output)
        self.output = sigmoid(self.output_input)

        return self.output

    def backward_pass(self, X, y, output):
        # Compute error
        self.output_error = y - output

        # Compute gradients for weights between hidden and output layer
        delta_output = self.output_error * sigmoid_derivative(output)
        self.weights_hidden_output += np.dot(self.hidden_output.T, delta_output)

        # Compute gradients for weights between input and hidden layer
        delta_hidden = np.dot(delta_output, self.weights_hidden_output.T) * sigmoid_derivative(self.hidden_output)
        self.weights_input_hidden += np.dot(X.T, delta_hidden)

    def train(self, X, y, epochs):
        for epoch in range(epochs):
            # Forward pass
            output = self.forward_pass(X)

            # Backward pass
            self.backward_pass(X, y, output)

            # Calculate and print mean squared error
            mse = np.mean(np.square(y - output))
            print(f'Epoch {epoch+1}/{epochs}, Mean Squared Error: {mse}')

# Example usage:
# Input data (4 features), target output (1 output)
X = np.array([[0, 0, 1, 1],
[0, 1, 1, 0],
[1, 0, 1, 0],
[1, 1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T # Target output


# Initialize and train the neural network


input_size = 4
hidden_size = 3
output_size = 1
epochs = 100
nn = NeuralNetwork(input_size, hidden_size, output_size)
nn.train(X, y, epochs)

OUTPUT:
Epoch 1/100, Mean Squared Error: 0.3909198935257375
Epoch 2/100, Mean Squared Error: 0.36761319023216843
Epoch 3/100, Mean Squared Error: 0.3407214073553024
Epoch 4/100, Mean Squared Error: 0.3128286116839329
Epoch 5/100, Mean Squared Error: 0.2881041414280933
Epoch 6/100, Mean Squared Error: 0.27002608479593215
Epoch 7/100, Mean Squared Error: 0.25907428765598484
Epoch 8/100, Mean Squared Error: 0.2533005830016185
Epoch 9/100, Mean Squared Error: 0.25045070634549754
Epoch 10/100, Mean Squared Error: 0.2490243571166713
Epoch 11/100, Mean Squared Error: 0.24823559789439453
Epoch 12/100, Mean Squared Error: 0.24771525949253997
Epoch 13/100, Mean Squared Error: 0.24729744852157265
Epoch 14/100, Mean Squared Error: 0.2469081741342835
Epoch 15/100, Mean Squared Error: 0.24651359625068736
Epoch 16/100, Mean Squared Error: 0.24609708181511797
Epoch 17/100, Mean Squared Error: 0.24564922657424218
Epoch 18/100, Mean Squared Error: 0.24516354434357218
Epoch 19/100, Mean Squared Error: 0.24463460268932458
Epoch 20/100, Mean Squared Error: 0.2440572086387934
Epoch 21/100, Mean Squared Error: 0.24342604391988595
Epoch 22/100, Mean Squared Error: 0.24273549260704744
Epoch 23/100, Mean Squared Error: 0.24197955150030104
Epoch 24/100, Mean Squared Error: 0.2411517767755278
Epoch 25/100, Mean Squared Error: 0.2402452475666762
Epoch 26/100, Mean Squared Error: 0.2392525388390253
Epoch 27/100, Mean Squared Error: 0.2381657010349174
Epoch 28/100, Mean Squared Error: 0.2369762463018299
Epoch 29/100, Mean Squared Error: 0.23567514226139646
Epoch 30/100, Mean Squared Error: 0.23425281493398856
Epoch 31/100, Mean Squared Error: 0.23269916288778134
Epoch 32/100, Mean Squared Error: 0.23100358505675164
Epoch 33/100, Mean Squared Error: 0.2291550250220682
Epoch 34/100, Mean Squared Error: 0.2271420349034005
Epoch 35/100, Mean Squared Error: 0.22495286238338627
Epoch 36/100, Mean Squared Error: 0.22257556481787466
Epoch 37/100, Mean Squared Error: 0.21999815490373723
Epoch 38/100, Mean Squared Error: 0.21720878302685684
Epoch 39/100, Mean Squared Error: 0.21419596222428522
Epoch 40/100, Mean Squared Error: 0.21094884264329808

Epoch 41/100, Mean Squared Error: 0.20745754332096747


Epoch 42/100, Mean Squared Error: 0.2037135496681245
Epoch 43/100, Mean Squared Error: 0.1997101844896994
Epoch 44/100, Mean Squared Error: 0.19544315750608954
Epoch 45/100, Mean Squared Error: 0.1909111914858925
Epoch 46/100, Mean Squared Error: 0.18611671040409528
Epoch 47/100, Mean Squared Error: 0.18106655522215986
Epoch 48/100, Mean Squared Error: 0.17577266656396578
Epoch 49/100, Mean Squared Error: 0.1702526448569532
Epoch 50/100, Mean Squared Error: 0.16453007605344375
Epoch 51/100, Mean Squared Error: 0.1586345069240508
Epoch 52/100, Mean Squared Error: 0.15260097934283845
Epoch 53/100, Mean Squared Error: 0.14646909144029285
Epoch 54/100, Mean Squared Error: 0.14028163493030232
Epoch 55/100, Mean Squared Error: 0.13408293923919218
Epoch 56/100, Mean Squared Error: 0.1279171073089984
Epoch 57/100, Mean Squared Error: 0.1218263376896841
Epoch 58/100, Mean Squared Error: 0.11584949359685069
Epoch 59/100, Mean Squared Error: 0.11002101890147944
Epoch 60/100, Mean Squared Error: 0.10437023591234376
Epoch 61/100, Mean Squared Error: 0.0989210072970789
Epoch 62/100, Mean Squared Error: 0.09369171146959493
Epoch 63/100, Mean Squared Error: 0.08869546583855525
Epoch 64/100, Mean Squared Error: 0.08394053021852792
Epoch 65/100, Mean Squared Error: 0.07943082805054885
Epoch 66/100, Mean Squared Error: 0.07516653194631293
Epoch 67/100, Mean Squared Error: 0.07114467013467872
Epoch 68/100, Mean Squared Error: 0.0673597204113831
Epoch 69/100, Mean Squared Error: 0.0638041675161059
Epoch 70/100, Mean Squared Error: 0.06046900811026459
Epoch 71/100, Mean Squared Error: 0.05734419448604978
Epoch 72/100, Mean Squared Error: 0.0544190137060713
Epoch 73/100, Mean Squared Error: 0.051682403068194535
Epoch 74/100, Mean Squared Error: 0.049123205720009076
Epoch 75/100, Mean Squared Error: 0.0467303720862622
Epoch 76/100, Mean Squared Error: 0.04449311372893038
Epoch 77/100, Mean Squared Error: 0.042401016546805254
Epoch 78/100, Mean Squared Error: 0.04044412003639947
Epoch 79/100, Mean Squared Error: 0.03861296884611634
Epoch 80/100, Mean Squared Error: 0.03689864219401266
Epoch 81/100, Mean Squared Error: 0.03529276598517502
Epoch 82/100, Mean Squared Error: 0.03378751172673332
Epoch 83/100, Mean Squared Error: 0.03237558564122715
Epoch 84/100, Mean Squared Error: 0.031050210747906447
Epoch 85/100, Mean Squared Error: 0.029805104128391033
Epoch 86/100, Mean Squared Error: 0.028634451120435986
Epoch 87/100, Mean Squared Error: 0.02753287778795062
Epoch 88/100, Mean Squared Error: 0.026495422690140116
Epoch 89/100, Mean Squared Error: 0.02551750870930918
Epoch 90/100, Mean Squared Error: 0.0245949154867061
Epoch 91/100, Mean Squared Error: 0.023723752850307478
Epoch 92/100, Mean Squared Error: 0.02290043548988526
Epoch 93/100, Mean Squared Error: 0.02212165903620465
Epoch 94/100, Mean Squared Error: 0.021384377626881536
Epoch 95/100, Mean Squared Error: 0.020685782986324383
Epoch 96/100, Mean Squared Error: 0.02002328500719115
Epoch 97/100, Mean Squared Error: 0.019394493792556117
Epoch 98/100, Mean Squared Error: 0.018797203098795626
Epoch 99/100, Mean Squared Error: 0.018229375106906295


Epoch 100/100, Mean Squared Error: 0.017689126442850214
