18B15EC314 - Python For Signal Processing and Communication (1) Updated
To become a Center of Excellence in the field of IT & related emerging areas education,
training and research comparable to the best in the world for producing professionals who
shall be leaders in innovation, entrepreneurship, creativity and management.
MISSION
COURSE OUTCOMES:
COGNITIVE LEVELS
At the completion of the course, students will be able to:
C310.1 Recall various concepts, syntax and operations of Python (Remembering Level, C1)
4. Signal Transformations Writing codes to compute DFT (Discrete Fourier Transform) and
IDFT (Inverse Discrete Fourier Transform) for the spectral analysis
of signals.
5. Signal Operations Writing codes for generating various signal operations.
6. Data Wrangling To transform raw data to a clean and organized format ready for use.
10. Pulse Code Modulation To perform pulse code modulation and demodulation.
14. Virtual Lab 2 To learn the concepts of Constructor and Inheritance in Python
programming language. To implement those concepts in solving a simple
problem in the simulator.
Evaluation Criteria
Components Maximum Marks
Viva 1(Mid Sem Viva) 20
Viva 2(End Sem Viva) 20
Assessment Components 30
Attendance and Discipline 15
Virtual Lab 05
Report 10
Total 100
Project based learning: Students will implement SVMs for image classification using a standard image
classification dataset. Additionally, students in groups of two to three will realize one application of
machine learning using Python programming.
Recommended Reading material: Author(s), Title, Edition, Publisher, Year of Publication etc. (Text books,
Reference Books, Journals, Reports, Websites etc. in the IEEE format)
1. J. UNPINGCO: Python for Signal Processing, Springer International Publishing Switzerland, 2014.
2. M. WICKERT: Signal Processing and Communications: Teaching and Research Using IPython Notebook, in
Proc. of the 14th Python in Science Conf. (SciPy 2015).
3. B. P. LATHI: Modern Digital and Analog Communication Systems: Python Textbook Companion, Oxford
University Press Inc.
Experiment 1
Theory:
Python
• High level, interpreted, and general-purpose dynamic programming language.
Kernels: The "computational engine" which executes code blocks of the notebook
Cells: A container for code or text (e.g., this is written within a markdown cell)
Text cells use Markdown. If you're unfamiliar with Markdown syntax,
check out this cheat sheet.
Code cells contain Python code. A few important keyboard shortcuts for working with code cells:
Shift + Enter : Executes the current cell and moves to the next
Shift + Tab : Brings up documentation. Try this after entering
np.ones(
Importing Modules
What is a module?
Certain functions, such as split() and len(), are always available. To access more functions, you must
import their module.
Module:
A file containing functions, definitions, and/or executable statements.
The module name is just the file name with .py removed.
To import:
Method 1 (Recommended)
import module1 as mod1
mod1.some_func()
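Method 2 imports names directly into the current namespace (shown here as a common alternative; the handout lists only Method 1):
from module1 import some_func
some_func()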
This chapter covers the most common NumPy operations we are likely to run into
Solution
Use NumPy to create a one-dimensional array:
# load library
import numpy as np
# create a vector as a row
vector_row = np.array([1, 2, 3])
print(vector_row)
[1 2 3]
# create a vector as a column
vector_column = np.array([[1], [2], [3]])
vector_column
array([[1],
       [2],
       [3]])
Discussion
NumPy's main data structure is the multidimensional array.
See Also
Vectors, Math is Fun (https://www.mathsisfun.com/algebra/vectors.html)
Euclidian vector, Wikipedia (https://en.wikipedia.org/wiki/Euclidean_vector)
1.2 Creating a Matrix
Problem
You need to create a matrix.
Solution
Use NumPy to create a two-dimensional array:
# load library
import numpy as np
# create a matrix
matrix=np.array([[1, 2],[1, 2],[1, 2]])
matrix
array([[1, 2],
       [1, 2],
       [1, 2]])
Discussion
To create a matrix we can use a NumPy two-dimensional array. In our solution, the matrix contains three
rows and two columns (a column of 1s and a column of 2s).
NumPy actually has a dedicated matrix data structure (np.mat). However, the matrix data structure is not
recommended for two reasons. First, arrays are the de facto standard data structure of NumPy. Second, the
vast majority of NumPy operations return arrays, not matrix objects.
See Also
Matrix, Wikipedia (https://en.wikipedia.org/wiki/Matrix_(mathematics))
Matrix, Wolfram MathWorld (http://mathworld.wolfram.com/Matrix.html)
Solution
Create a sparse matrix:
# load libraries
import numpy as np
from scipy import sparse
# create a matrix
matrix = np.array([[0, 0],
                   [0, 1],
                   [3, 0]])
print(matrix)
[[0 0]
 [0 1]
 [3 0]]
# convert the matrix to a compressed sparse row (CSR) matrix
matrix_sparse = sparse.csr_matrix(matrix)
matrix_sparse
<3x2 sparse matrix of type '<class 'numpy.longlong'>'
with 2 stored elements in Compressed Sparse Row format>
Discussion
A frequent situation in machine learning is having a huge amount of data; however most of the
elements in the data are zeros. For example, imagine a matrix where the columns are every movie
on Netflix, the rows are every Netflix user, and the values are how many times a user has watched
that particular movie. This matrix would have tens of thousands of columns and millions of rows!
However, since most users do not watch most movies, the vast majority of elements would be zero.
Sparse matrices only store nonzero elements and assume all other values will be zero, leading to
significant computational savings. In our solution, we created a NumPy array with two nonzero
values, then converted it into a sparse matrix. If we view the sparse matrix we can see that only the
nonzero values are stored:
print(matrix_sparse)
(1, 1)    1
(2, 0)    3
There are a number of types of sparse matrices. However, in compressed sparse row (CSR)
matrices, (1, 1) and (2, 0) represent the (zero-indexed) indices of the non-zero values 1 and 3,
respectively. For example, the element 1 is in the second row and second column. We can see the
advantage of sparse matrices if we create a much larger matrix with many more zero elements and
then compare this larger matrix with our original sparse matrix:
# create larger matrix
matrix_large = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                         [3, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
# convert to CSR and view the stored elements
matrix_large_sparse = sparse.csr_matrix(matrix_large)
print(matrix_large_sparse)
(1, 1)    1
(2, 0)    3
As we can see, despite the fact that we added many more zero elements in the larger matrix, its
sparse representation is exactly the same as our original sparse matrix. That is, the addition of zero
elements did not change the size of the sparse matrix.
As mentioned, there are many different types of sparse matrices, such as compressed sparse
column, list of lists, and dictionary of keys. While an explanation of the different types and their
implications is outside the scope, it is worth noting that while there is no “best” sparse matrix type,
there are meaningful differences between them and we should be conscious about why we are
choosing one type over another.
See Also
Sparse matrices, SciPy documentation
(https://docs.scipy.org/doc/scipy/reference/sparse.html)
101 Ways to Store a Sparse Matrix (https://medium.com/@jmaxg3/101-ways-to-store-a-sparse-matrix-c7f2bf15a229)
Solution
NumPy's arrays make it easy to select one or more elements in a vector or matrix:
# load library
import numpy as np
# create row vector
vector = np.array([1, 2, 3, 4, 5, 6])
# create matrix
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Discussion
Like most things in Python, NumPy arrays are zero-indexed, meaning that the index of the first
element is 0, not 1. With that caveat, NumPy offers a wide variety of methods for selecting (i.e.,
indexing and slicing) elements or groups of elements in arrays:
# view the vector
vector
array([1, 2, 3, 4, 5, 6])
# select everything up to and including the third element
vector[:3]
array([1, 2, 3])
# select the first two rows and all columns of the matrix
matrix[:2, :]
array([[1, 2, 3],
       [4, 5, 6]])
# select all rows and the second column
matrix[:,1:2]
array([[2],
[5],
[8]])
Solution
Use shape, size, and ndim:
# load library
import numpy as np
# create matrix
matrix = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]])
# view number of rows and columns
matrix.shape
(3, 4)
# view number of elements (rows * columns)
matrix.size
12
Discussion
This might seem basic (and it is); however, time and again it will be valuable to check the shape and
size of an array, both for further calculations and simply as a gut check after some operation.
Solution
Use NumPy's vectorize:
# load library
import numpy as np
# create matrix
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Discussion
A lambda function is a small anonymous function. A lambda function can take any number of
arguments, but can only have one expression. NumPy’s vectorize class converts a function into a
function that can apply to all elements in an array or slice of an array. It’s worth noting that vectorize is
essentially a for loop over the elements and does not increase performance. Furthermore, NumPy
arrays allow us to perform operations between arrays even if their dimensions are not the same (a
process called broadcasting). For example, we can create a much simpler version of our solution
using broadcasting:
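A minimal sketch of both approaches (the helper name add_100 is illustrative):
# create a lambda function that adds 100 to something
add_100 = lambda i: i + 100
# create a vectorized version of the function
vectorized_add_100 = np.vectorize(add_100)
# apply the function to all elements in the matrix
vectorized_add_100(matrix)
# the same result using broadcasting
matrix + 100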
Solution
Use NumPy's max and min:
# load library
import numpy as np
# create matrix
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Discussion
Often we want to know the maximum and minimum value in an array or subset of an array. This can
be accomplished with the max and min methods. Using the axis parameter we can also apply the
operation along a certain axis:
# find the maximum element in each column
np.max(matrix, axis=0)
array([7, 8, 9])
# find the maximum element in each row
np.max(matrix, axis=1)
array([3, 6, 9])
Unsolved Problems
1.8 Calculating the Average, Variance, and Standard Deviation
Problem
You want to calculate some descriptive statistics about an array.
Solution
Use NumPy's mean, var, and std:
# load library
import numpy as np
# create matrix
matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
# return mean
np.mean(matrix)
# return variance
np.var(matrix)
# return standard deviation
np.std(matrix)
Discussion
Just like with max and min, we can easily get descriptive statistics about the whole matrix or do
calculations along a single axis:
# load library
import numpy as np
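The matrix creation and reshape call are omitted in this excerpt; a minimal sketch consistent with the discussion below (a 4x3 matrix, so matrix.size is 12):
# create a 4x3 matrix
matrix = np.array([[ 1,  2,  3],
                   [ 4,  5,  6],
                   [ 7,  8,  9],
                   [10, 11, 12]])
# reshape into a 2x6 matrix (same number of elements)
matrix.reshape(2, 6)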
Discussion
reshape allows us to restructure an array so that we maintain the same data but it is organized as a
different number of rows and columns. The only requirement is that the shape of the original and
new matrix contain the same number of elements (i.e., the same size). We can see the size of a
matrix using size:
matrix.size
12
One useful argument in reshape is -1, which effectively means "as many as needed," so reshape(1, -1)
means one row and as many columns as needed:
matrix.reshape(1, -1)
Finally, if we provide one integer, reshape will return a 1D array of that length:
matrix.reshape(12)
Solution
Use NumPy's eig from the linear algebra module:
import numpy as np
from numpy import linalg
A= np.array([[5, 3], [-6, -4]])
A
array([[ 5, 3],
[-6, -4]])
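The eig call itself is omitted in the excerpt; a minimal sketch (for this A the eigenvalues work out to 2 and -1):
# compute eigenvalues and eigenvectors of A
eigenvalues, eigenvectors = linalg.eig(A)
eigenvalues  # array containing 2. and -1. (order not guaranteed)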
Problem
You want to find the inverse of a given matrix.
Solution
Use NumPy's inv from linear algebra
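A minimal sketch, reusing the matrix A defined above:
import numpy as np
from numpy import linalg
# compute the inverse of A
A_inv = linalg.inv(A)
A_inv
# check: A @ A_inv should give the identity matrix
np.round(A @ A_inv)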
Prob 2 Write a program to create a 2x2 matrix of ones and add a border of zeros around it.
Experiment 2
Theory:
The Python basics (kernels, cells, and importing modules) are the same as described in Experiment 1.
Plotting
2.0 Introduction
Matplotlib is the most popular graphing and data visualization library for Python. matplotlib.pyplot is a
collection of command style functions that make matplotlib work like MATLAB. Each pyplot
function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure,
plots some lines in a plotting area, decorates the plot with labels, etc. In matplotlib.pyplot various
states are preserved across function calls, so that it keeps track of things like the current figure and
plotting area, and the plotting functions are directed to the current axes.
Reference: https://matplotlib.org/stable/
Installation
The easiest way to install matplotlib is to use pip. Type the following command in the terminal:
pip install matplotlib
To import
import matplotlib.pyplot as plt
Discussion
If you provide a single list or array to the plot() command, matplotlib assumes it is a sequence of y
values and automatically generates the x values for you. Since Python ranges start with 0, the
default x vector has the same length as y but starts with 0.
# x axis values
x = [1,2,3,4,5]
# corresponding y axis values
y = [2,4,1,3,5]
# plot the points
plt.plot(x, y)
# name the axes
plt.xlabel('x - axis')
plt.ylabel('y - axis')
# Title to plot
plt.title('Plot 2.2')
plt.show()
Note:
Define the x-axis and corresponding y-axis values as lists.
Plot them using the .plot() function.
Name the x-axis and y-axis using the .xlabel() and .ylabel() functions.
Give a title to your plot using the .title() function.
To view the plot, use the .show() function.
Discussion
plot() is a versatile command, and will take an arbitrary number of arguments. For example, to plot x
versus y, you can directly issue the command:
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
See Also
https://www.geeksforgeeks.org/graph-plotting-in-python-set-1/
Discussion
For every x, y pair of arguments, there is an optional third argument which is the format string that
indicates the color and line type of the plot. The letters and symbols of the format string are from
MATLAB, and you concatenate a color string with a line style string. The default format string is ‘b-‘,
which is a solid blue line. In the above example, to plot with red circles, use the format string 'ro':
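A minimal sketch:
# plot x versus y with red circles
plt.plot([1, 2, 3, 4], [1, 4, 9, 16], 'ro')
plt.show()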
# line 1 points
x1 = [1,2,3,4,5]
y1 = [2,4,1,3,5]
# plotting the line 1 points
plt.plot(x1, y1, label = "line 1")
# line 2 points
x2 = [1,2,3,4,5]
y2 = [4,1,3,2,5]
# plotting the line 2 points
plt.plot(x2, y2, label = "line 2")
# show a legend on the plot
plt.legend()
plt.show()
Note
How to differentiate between the lines: by using a name (label) which is passed as an argument to the
.plot() function. The small rectangular box giving information about the type of line and its color is
called the legend.
Solution
To set the x-axis values, use the np.arange() method, in which the first two arguments give the range
and the third gives the step-wise increment (similar to linspace in MATLAB); the result is a NumPy
array. To get the corresponding y-axis values, apply the predefined np.sin() function to the NumPy
array. Finally, the points are plotted by passing the x and y arrays to the plt.plot() function, as
sketched below.
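A minimal sketch:
import numpy as np
import matplotlib.pyplot as plt
# x axis: 0 to 2*pi in steps of 0.1
x = np.arange(0, 2*np.pi, 0.1)
# corresponding sine values
y = np.sin(x)
plt.plot(x, y)
plt.show()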
Also see
plt.setp() command for MATLAB-style parameter setting.
Matplotlib allows you to provide such an object (e.g. a dict) with the data keyword argument. If provided, then you
may generate plots with strings corresponding to the keys of these variables.
# assemble a dict of named data (values here are illustrative)
data = {'a': np.arange(50),
        'c': np.random.randint(0, 50, 50),
        'd': np.abs(np.random.randn(50)) * 100}
data['b'] = data['a'] + 10 * np.random.randn(50)
#scatter plot: the string arguments refer to entries of the data dict
plt.scatter('a', 'b', c='c', s='d', data=data)
plt.xlabel('entry a')
plt.ylabel('entry b')
plt.show()
Solution
matplotlib.pyplot.stem() creates stem plots. A Stem plot plots vertical lines at each x position
covered under the graph from the baseline to y, and places a marker there.
Problem: To generate a discrete time unit impulse and unit step signal.
Solution
n = np.linspace(-5, 5, 11)
delta = 1*(n==0)
u = 1*(n>=0)
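The signals can then be displayed with stem plots (a minimal sketch):
plt.subplot(211)
plt.stem(n, delta)
plt.title('Unit Impulse')
plt.subplot(212)
plt.stem(n, u)
plt.title('Unit Step')
plt.show()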
Solution: Matplotlib allows you to pass categorical variables directly to many plotting
functions.
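A minimal sketch of plotting categorical variables (the names and values are illustrative):
names = ['group_a', 'group_b', 'group_c']
values = [1, 10, 100]
plt.bar(names, values)
plt.show()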
# f(t), t1, and t2 are not defined in the excerpt; these definitions are assumed
def f(t):
    return np.exp(-t) * np.cos(2*np.pi*t)
t1 = np.arange(0.0, 5.0, 0.1)
t2 = np.arange(0.0, 5.0, 0.02)
plt.figure()
#subplot matrix size 2 x 1, window 1
plt.subplot(211)
plt.plot(t1, f(t1), 'bo', t2, f(t2), 'k')
plt.show()
Unsolved Problem
Problem 1: Write a program to generate figure given below:
Post Lab Exercise
Problem 1: Write a program to generate figure given below:
Problem 2: Write a program to plot two sequences given below in one figure window.
y[n] = u[n+3] - u[n-4], u[n] is unit step sequence
y[n] = r[n], n = -5:20, r[n] is unit ramp sequence
Experiment 3
Theory:
Convolution is a mathematical operation used to express the relation between the input and output of
an LTI system. It relates the input, output and impulse response of an LTI system. The discrete-time
convolution of two sequences is defined as
y[n] = x[n] ∗ h[n] = ∑_{k=−∞}^{+∞} x[k] h[n − k]
convolution =np.convolve(x1,x2)
# Title to plot
plt.title('DT Convolution')
plt.show()
convolution
array([0. , 1. , 2.5, 5. , 3.5, 3. ])
convolution =np.convolve(x1,x2 )
# Title to plot
plt.title('DT Convolution')
plt.show()
convolution
array([ 0. , 1. , -2.5, 3. , 0.5, -3. ])
convolution =np.convolve(x1,x2 )
#Setting range of time axis
l1 = np.size(x1)
l2 = np.size(x2)
n=np.arange(n1[0]+n2[0],n1[-1]+n2[-1],1)
# Title to plot
plt.title('DT Convolution')
plt.show()
convolution
array([ 0. , 1. , -2.5, 4. , -1.5])
Unsolved Problem
Problem 4: Compute the convolution of the following discrete time sequences x1[n]=[0,1,0,1,0,1]
x2[n]=[1,0,0,0]
Post Lab Exercise
#importing libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy.fft
# input sequence
x1=([0,4,2,0])
#DFT computation of x1
dft=scipy.fft.fft(x1)
plt.subplot(2, 1, 1)
plt.stem(dft.real, use_line_collection = True)
# naming the x axis
plt.xlabel('k')
# naming the y axis
plt.ylabel('Real{x[k]}')
# Title to plot
plt.title('Real part of DFT')
plt.show()
plt.subplot(2, 1, 2)
plt.stem(dft.imag, use_line_collection = True)
# naming the x axis
plt.xlabel('k')
# naming the y axis
plt.ylabel('Img{X{k}}')
# Title to plot
plt.title('Imaginary Part of DFT')
plt.show()
print('DFT X[k] =',dft)
DFT X[k] = [ 6.-0.j -2.-4.j -2.-0.j -2.+4.j]
#importing libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy.fft
# input sequence
Xk=[6,-2-4j,-2,-2+4j]
#IDFTcomputation of Xk
idft=scipy.fft.ifft(Xk)
plt.subplot(2, 1, 1)
plt.stem(idft.real, use_line_collection = True)
# naming the x axis
plt.xlabel('n')
# naming the y axis
plt.ylabel('Real{x[n]}')
# Title to plot
plt.title('Real part of IDFT')
plt.show()
plt.subplot(2, 1, 2)
plt.stem(idft.imag, use_line_collection = True)
# naming the x axis
plt.xlabel('n')
# naming the y axis
plt.ylabel('Img{x[n]}')
# Title to plot
plt.title('Imaginary Part of IDFT')
plt.show()
print('IDFT x[n] =',idft)
from scipy.fft import fft, fftfreq, fftshift
N = 32
t = np.arange(N)
#Finding DFT
x=np.sin(t)
sp = fftshift(fft(x))
#Creating DFT sample frequencies
freq = fftshift(fftfreq(N))
plt.figure(figsize=(4,8))
#Plotting x(t)
plt.subplot(211)
plt.grid()
plt.plot(t, x)
# naming the x axis
plt.xlabel('t')
# naming the y axis
plt.ylabel('sin(t)')
#Plotting DFT
plt.subplot(212)
plt.grid()
plt.plot(freq, abs(sp)) # Real Part
plt.show()
Discussion: Observe spectrum for larger values of N like 64, 128 and 512.
Unsolved Problems:
Experiment 6
Aim: To transform raw data to a clean and organized format ready for use.
Data wrangling is the process of cleaning and unifying messy and complex data sets for easy
access and analysis. This process typically includes manually converting and mapping data from
one raw form into another format to allow for more convenient consumption and organization of
the data.
The most common data structure used to "wrangle" data is the data frame, which can be both
intuitive and incredibly versatile. Data frames are tabular, meaning that they are based on rows and
columns like you'd find in a spreadsheet.
#importing
import pandas as pd
dataframe = pd.DataFrame()
Pandas is a software library written for the Python programming language for data manipulation
and analysis. In particular, it offers data structures and operations for manipulating numerical
tables and time series.
url = "https://raw.githubusercontent.com/chrisalbon/simulated_datasets/master/titanic.csv"
df = pd.read_csv(url)
# show first two rows
print('First two rows: \n',df.head(2))
# also try for last two rows
# show dimensions
print('\n Dimensions: ',df.shape)
# also try
print("\n Dimensions: {}".format(df.shape))
# show statistics
print('\n Data statistics are')
df.describe()
Dimensions: (1313, 6)
Dimensions: (1313, 6)
print('\n', df.iloc[0:3])
[4 rows x 6 columns]
Data Frames do not need to be numerically indexed. We can set the index of a Data Frame to any
value where the value is unique to each row. For example, we can set the index to be passenger
names and then select rows using a name:
# set index
df = df.set_index(df['Name'])
# show row
df.loc['Allen, Miss Elisabeth Walton']
Discussion
To select individual rows and slices of rows, pandas provides two methods, loc and iloc:
# multiple conditions
df[(df["Sex"] == "female") & (df["Age"] == 30)].head(2)
# replace any instance of 'female' with Woman and 'male' with Man
df['Sex'].replace(['female', 'male'], ['Woman', 'Man']).head(5)
Try replacing values across the entire DataFrame object by specifying the whole DataFrame instead of a
single column:
df.replace(1, "One").head(2)
# nunique returns the number of unique values
df['Sex'].nunique()
# value_counts will display all unique values with the number of times each value appears
df['Sex'].value_counts()
Experiment 7
Image as Data
7.0 Introduction
Image data is most often used to represent graphic or pictorial data. The term image inherently reflects a
graphic representation: photographic or traced objects that represent the underlying pixel data of an area
of an image, created, collected and stored using image capture devices.
Image data is typically stored in a variety of de facto industry standard proprietary formats. These often
reflect the most popular image processing systems. Other graphic image formats, such as TIFF, GIF, PCX,
etc., are used to store ancillary image data. Most GIS software will read such formats and allow you to
display this data.
With extensive examples, this experiment introduces the central Python packages you will need for working with
images, along with the basic tools for reading images, converting, plotting or saving results, and so on.
PIL—The Python Imaging Library The Python Imaging Library (PIL) provides general image handling and
lots of useful basic image operations like resizing, cropping, rotating, color conversion and much more. PIL
is free and available from http://www.pythonware.com/products/pil/.
PIL is the Python Imaging Library which provides the python interpreter with image editing capabilities.
Pillow is the friendly PIL fork and an easy to use library developed by Alex Clark and other contributors.
The main package of skimage only provides a few utilities for converting between image data types; for
most features, you need to import one of the following subpackages:
color: color space conversion
data: test images and example data
draw: drawing primitives (lines, text, etc.) that operate on NumPy arrays
exposure: image intensity adjustment (e.g., histogram equalization)
feature: feature detection and extraction (e.g., texture analysis, corners)
filters: sharpening, edge finding, rank filters, thresholding
graph: graph-theoretic operations (e.g., shortest paths)
io: reading, saving, and displaying images and video
measure: measurement of image properties (e.g., region properties and contours)
metrics: metrics corresponding to images (e.g., distance metrics, similarity)
morphology: morphological operations (e.g., opening or skeletonization)
restoration: restoration algorithms (e.g., deconvolution, denoising)
segmentation: partitioning an image into multiple regions
transform: geometric and other transforms (e.g., rotation or the Radon transform)
util: generic utilities
viewer: a simple graphical user interface for visualizing results and exploring parameters
7.1.1 Using PIL : With PIL, you can read images from most formats and write to the most common ones.
The most important module is the Image module. PIL.Image.open() Opens and identifies the given image
file.
# importing PIL
from PIL import Image
# Read image
I=Image.open("x1.png")
print('Size of Image',I.size)
Image.open("x1.png")
# Read Images using matplotlib
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
img = mpimg.imread("x1.png")
# Output Images
plt.imshow(img)
plt.show()
7.1.3 Using OpenCV : OpenCV (Open Source Computer Vision) is a computer vision library that contains
various functions to perform operations on pictures or videos. It was originally developed by Intel, was
later maintained by Willow Garage, and is now maintained by Itseez. This library is cross-platform: it is
available for multiple programming languages such as Python and C++.
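A minimal sketch of reading and displaying an image with OpenCV in Colab (cv2_imshow replaces cv2.imshow, which does not work inside notebooks):
import cv2
from google.colab.patches import cv2_imshow  # for image display in Colab
image = cv2.imread("x1.png")
cv2_imshow(image)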
from skimage import io
image = io.imread("x1.png")
io.imshow(image)
<matplotlib.image.AxesImage at 0x7f621570c050>
Image.open("x11.png")
7.2.2 Using Matplotlib: Matplotlib plots can be saved as image files using the plt.savefig() function.
The plt.savefig() function needs to be called before the plt.show() line, as shown below.
x = [0, 2, 4, 6]
y = [1, 3, 4, 8]
plt.plot(x,y)
plt.xlabel('x values')
plt.ylabel('y values')
plt.title('plotted x and y values')
plt.legend(['line 1'])
# save the figure before calling show() (the filename is illustrative)
plt.savefig('plot.png')
plt.show()
cv2_imshow(image)
# Filename
filename = 'savedImage.jpg'
cv2.imwrite(filename,image[:,:200,:100])
img = cv2.imread('savedImage.jpg')
# Filename
filename = 'savedImage.png'
# save with scikit-image (the save call is missing in the excerpt), then display
io.imsave(filename, image)
io.imshow('savedImage.png')
#image = io.imread("http://host.robots.ox.ac.uk/pascal/VOC/images/voc2005_11c.jpg")
image = io.imread("x1.png")
#Check the image matrix data type (to know the bit depth of the image)
print(image.dtype)
# Check the height of image
print(image.shape[0])
# Check the width of image
print(image.shape[1])
# Check the number of channels of the image
print(image.shape[2])
uint8
512
512
3
Using PIL:
im = Image.open("x1.png")
width, height = im.size
print(width)
512
print(height)
512
print(im.filename)
x1.png
print(im.format)
PNG
print(im.format_description)
Portable network graphics
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage.transform import rotate
image = io.imread("x2.png")
def image_augment(image):
    fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(15, 8))
    ax[0].imshow(image)
    ax[0].axis('off')
    ax[1].imshow(rotate(image, angle=45, mode='wrap'))
    ax[1].axis('off')
    ax[2].imshow(np.fliplr(image))
    ax[2].axis('off')
# Apply on an image
image_augment(image)
import cv2
from google.colab.patches import cv2_imshow # for image display
image = cv2.imread("x2.png")
cv2_imshow(image)
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2_imshow(gray_image)
7.3.2.2 Converting images from BGR to RGB.
import numpy as np
import pandas as pd
import cv2 as cv
from google.colab.patches import cv2_imshow # for image display
from skimage import io
from PIL import Image # Pillow library
import matplotlib.pylab as plt
# urls is a list of image locations (not defined in the excerpt; these paths are illustrative)
urls = ["x1.png", "x2.png"]
images = []
for url in urls:
    image = io.imread(url)
    images.append(image)
image_2 = cv.cvtColor(image, cv.COLOR_BGR2RGB)
final_frame = cv.hconcat((image, image_2))
cv2_imshow(final_frame)
print('\n')
For more on image processing using python:
Problem 2 Load an image from Google Drive, rotate it by 45 degrees, and display both
images.
Problem 3 Load at least three colour images and convert them into grayscale. Display original
images as well as images in grayscale format.
Experiment Number 8
Note that when down-sampling an image, resize and rescale should perform Gaussian smoothing to avoid aliasing artifacts. See the
anti_aliasing and anti_aliasing_sigma arguments to these functions.
Downscale serves the purpose of down-sampling an n-dimensional image by integer factors using the local mean on the elements of each
block of the size factors given as a parameter to the function.
import matplotlib.pyplot as plt
from skimage import color, data
from skimage.transform import rescale, resize, downscale_local_mean
image = color.rgb2gray(data.astronaut())
# rescaled/resized/downscaled versions referenced below (factors assumed from the skimage example)
image_rescaled = rescale(image, 0.25, anti_aliasing=False)
image_resized = resize(image, (image.shape[0] // 4, image.shape[1] // 4), anti_aliasing=True)
image_downscaled = downscale_local_mean(image, (4, 3))
fig, axes = plt.subplots(nrows=2, ncols=2)
ax = axes.ravel()
ax[0].imshow(image, cmap='gray')
ax[0].set_title("Original image")
ax[1].imshow(image_rescaled, cmap='gray')
ax[1].set_title("Rescaled image (aliasing)")
ax[2].imshow(image_resized, cmap='gray')
ax[2].set_title("Resized image (no aliasing)")
ax[3].imshow(image_downscaled, cmap='gray')
ax[3].set_title("Downscaled image (no aliasing)")
ax[0].set_xlim(0, 512)
ax[0].set_ylim(512, 0)
plt.tight_layout()
plt.show()
%matplotlib inline
The following operations on an image adjust brightness and darkness. Brightening an image is done using addition, and darkening using subtraction.
Multiplication can be used to change the contrast of the image. Contrast is the difference in the intensity values of the pixels of an image; multiplying the intensity values by a
constant can make the difference larger or smaller.
#from google.colab import files
import cv2
import numpy as np
import matplotlib.pyplot as plt
#image=files.upload()
img=cv2.imread('lena.jpg')
#img = cv2.resize(img, (300,300))
matrix=np.ones(img.shape, dtype="uint8")*70
img_brighter=cv2.add(img,matrix)
img_darker=cv2.subtract(img,matrix)
plt.subplot(141);
plt.imshow(img)
plt.title("original image")
plt.subplot(142);
plt.imshow(img_brighter)
plt.title("brighter image")
plt.subplot(143);
plt.imshow(img_darker)
plt.title("darker image");
First we try reconstruction by dilation starting at the edges of the image. We initialize a seed image to the minimum intensity of the image, and set its border
to be the pixel values in the original image. These maximal pixels will get dilated in order to reconstruct the background image.
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter
from skimage import data
from skimage.util import img_as_float
from skimage.morphology import reconstruction
# Convert to float: Important for subtraction later which won't work with uint8
image = img_as_float(data.coins())
image = gaussian_filter(image, 1)
seed = np.copy(image)
seed[1:-1, 1:-1] = image.min()
mask = image
# reconstruct the background by dilation
dilated = reconstruction(seed, mask, method='dilation')
Subtracting the dilated image leaves an image with just the coins and a flat, black background, as shown below.
fig, ax0 = plt.subplots()
ax0.imshow(image, cmap='gray')
ax0.set_title('original image')
ax0.axis('off')
fig.tight_layout()
Although the features (i.e. the coins) are clearly isolated, the coins surrounded by a bright background in the original image are dimmer in the subtracted
image. We can attempt to correct this using a different seed image.
Instead of creating a seed image with maxima along the image border, we can use the features of the image itself to seed the reconstruction process. Here, the
seed image is the original image minus a fixed value, h.
h = 0.4
seed = image - h
dilated = reconstruction(seed, mask, method='dilation')
hdome = image - dilated
To get a feel for the reconstruction process, we plot the intensity of the mask, seed, and dilated images along a slice of the image (indicated by the red line).
# (ax2 and yslice are assumed from the omitted figure-setup code)
ax2.imshow(hdome, cmap='gray')
ax2.axhline(yslice, color='r', alpha=0.4)
ax2.set_title('image - dilated')
ax2.axis('off')
fig.tight_layout()
plt.show()
As you can see in the image slice, each coin is given a different baseline intensity in the reconstructed image; this is because we used the local intensity (shifted
by h) as a seed value. As a result, the coins in the subtracted image have similar pixel intensities. The final result is known as
the h-dome of an image since this tends to isolate regional maxima of height h. This operation is particularly useful when your images are unevenly
illuminated.
%matplotlib inline
Mean filters
This example compares the following mean filters of the rank filter package:
local mean: all pixels belonging to the structuring element are used to compute the average gray level.
percentile mean: only use values between percentiles p0 and p1 (here 10% and 90%).
bilateral mean: only use pixels of the structuring element having a gray level situated inside g-s0 and g+s1 (here g-500 and g+500).
Percentile and usual mean give similar results here; these filters smooth the complete image (background and details). The bilateral mean exhibits a high filtering
rate for continuous areas (i.e. background) while higher image frequencies remain untouched.
from skimage import data
from skimage.morphology import disk
image = data.coins()
selem = disk(20)
plt.tight_layout()
plt.show()
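The filtering calls themselves are omitted in this excerpt; a minimal sketch using skimage's rank filters:
from skimage.filters import rank
# local, percentile, and bilateral means over the disk-shaped neighborhood
loc_mean = rank.mean(image, selem)
percentile_mean = rank.mean_percentile(image, selem, p0=.1, p1=.9)
bilateral_mean = rank.mean_bilateral(image, selem, s0=500, s1=500)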
In-Lab Practice Exercise
%matplotlib inline
from skimage import filters
from skimage.data import camera
image = camera()
edge_roberts = filters.roberts(image)
edge_sobel = filters.sobel(image)
fig, axes = plt.subplots(ncols=2, sharex=True, sharey=True, figsize=(8, 4))
axes[0].imshow(edge_roberts, cmap=plt.cm.gray)
axes[0].set_title('Roberts Edge Detection')
axes[1].imshow(edge_sobel, cmap=plt.cm.gray)
axes[1].set_title('Sobel Edge Detection')
for ax in axes:
ax.axis('off')
plt.tight_layout()
plt.show()
Different operators compute different finite-difference approximations of the gradient. For example, the Scharr filter exhibits less rotational variance than the Sobel
filter, which is in turn better than the Prewitt filter [1]_ [2]_ [3]_. The difference between the Prewitt and Sobel filters and the Scharr filter is illustrated below with an
image that is the discretization of a rotation-invariant continuous function. The discrepancy between
the Prewitt and Sobel filters, and the Scharr filter, is stronger for regions of the image where the direction of the gradient is close to diagonal, and for regions
with high spatial frequencies. For the example image the differences between the filter results are very small and the filter results are visually almost
indistinguishable.
.. [1] https://en.wikipedia.org/wiki/Sobel_operator#Alternative_operators
.. [2] B. Jaehne, H. Scharr, and S. Koerkel. Principles of filter design. In Handbook of Computer Vision and Applications. Academic Press, 1999.
.. [3] https://en.wikipedia.org/wiki/Prewitt_operator
x, y = np.ogrid[:100, :100]
# rotation-invariant test image (this definition is assumed; it is not shown in the excerpt)
image_rot = np.exp(1j * np.hypot(x, y) ** 1.3 / 20.).real
edge_sobel = filters.sobel(image_rot)
edge_scharr = filters.scharr(image_rot)
edge_prewitt = filters.prewitt(image_rot)
fig, axes = plt.subplots(ncols=2, sharex=True, sharey=True, figsize=(8, 4))
axes[0].imshow(image_rot, cmap=plt.cm.gray)
axes[0].set_title('Original image')
axes[1].imshow(edge_scharr, cmap=plt.cm.gray)
axes[1].set_title('Scharr Edge Detection')
for ax in axes:
ax.axis('off')
plt.tight_layout()
plt.show()
As in the previous example, here we illustrate the rotational invariance of the filters. The top row shows a rotationally invariant image along with the angle of its
analytical gradient. The other two rows contain the difference between the different gradient approximations (Sobel, Prewitt,
Scharr & Farid) and the analytical gradient.
The Farid & Simoncelli derivative filters [4], [5] are the most rotationally invariant, but require a 5x5 kernel, which is computationally more intensive
than a 3x3 kernel.
.. [4] Farid, H. and Simoncelli, E. P., "Differentiation of discrete multidimensional signals", IEEE Transactions on Image Processing 13(4): 496-508, 2004.
:DOI: 10.1109/TIP.2004.823819
x, y = np.mgrid[-10:10:255j, -10:10:255j]
image_rotinv = np.sin(x ** 2 + y ** 2)
image_x = 2 * x * np.cos(x ** 2 + y ** 2)
image_y = 2 * y * np.cos(x ** 2 + y ** 2)
# helper to compute the gradient angle (definition assumed from the skimage example)
def angle(dx, dy):
    return np.mod(np.arctan2(dy, dx), np.pi)
true_angle = angle(image_x, image_y)
angle_farid = angle(filters.farid_h(image_rotinv),
filters.farid_v(image_rotinv))
angle_sobel = angle(filters.sobel_h(image_rotinv),
filters.sobel_v(image_rotinv))
angle_scharr = angle(filters.scharr_h(image_rotinv),
filters.scharr_v(image_rotinv))
angle_prewitt = angle(filters.prewitt_h(image_rotinv),
filters.prewitt_v(image_rotinv))
axes[0].imshow(image_rotinv, cmap=plt.cm.gray)
axes[0].set_title('Original image')
axes[1].imshow(true_angle, cmap=plt.cm.hsv)
axes[1].set_title('Analytical gradient angle')
fig.subplots_adjust(right=0.8)
colorbar_ax = fig.add_axes([0.90, 0.10, 0.02, 0.50])
# color_ax is assumed to be the mappable returned by one of the difference-image imshow calls
fig.colorbar(color_ax, cax=colorbar_ax, ticks=[0, 0.01, 0.02])
for ax in axes:
ax.axis('off')
plt.show()
Check whether it is possible to perform background and foreground extraction using the above
techniques.
Experiment – 9
Theory:
Reconstruction of signal:
import numpy as np
import matplotlib.pyplot as plt
fs = 5                               # sampling rate (assumed; fs is not defined in this excerpt)
t = np.linspace(-1, 1, 100)          # redefine this here for convenience
ts = np.arange(-1, 1 + 1/fs, 1/fs)   # sample points (assumed definition)
num_coeffs = len(ts)
sm = 0
for k in range(-num_coeffs, num_coeffs):  # since function is real, need both sides
    sm += np.sin(2*np.pi*(k/fs)) * np.sinc(k - fs*t)
#plt.subplot(1, 2, 2)
plt.plot(t, sm, label='Reconstructed Sine Wave')
plt.plot(ts, np.sin(2*np.pi*ts), 'o', label='Samples')
plt.xlabel('Time', fontsize=15)
plt.ylabel('Amplitude', fontsize=15)
plt.legend(fontsize=10, loc='upper right')
plt.show()
Detailed explanation of reconstruction:
k = 0
plt.plot(t, np.sinc(k - fs*t),
         t, np.sinc(k+1 - fs*t), '--', k/fs, 1, 'o', (k)/fs, 0, 'o',
         t, np.sinc(k-1 - fs*t), '--', k/fs, 1, 'o', (-k)/fs, 0, 'o')
plt.hlines(0, -1, 1)
plt.vlines(0, -.2, 1)
plt.annotate('sample value goes here',
             xy=(0, 1),
             xytext=(-1+.1, 1.1),
             arrowprops={'facecolor': 'red', 'shrink': 0.05},
             )
plt.annotate('no interference here',
             xy=(0, 0),
             xytext=(-1+.1, 0.5),
             arrowprops={'facecolor': 'green', 'shrink': 0.05},
             )
plt.show()
k = np.array(sorted(set((t*fs).astype(int))))  # sorted coefficient list
plt.plot(t, (np.sin(2*np.pi*(k[:,None]/fs))*np.sinc(k[:,None]-fs*t)).T, '--',           # individual Whittaker functions
         t, (np.sin(2*np.pi*(k[:,None]/fs))*np.sinc(k[:,None]-fs*t)).sum(axis=0), 'k-', # Whittaker interpolant
         k/fs, np.sin(2*np.pi*k/fs), 'ob')                                              # samples
#plt.set_xlabel('time',fontsize=14)
plt.axis((-1.1, 1.1, -1.1, 1.1));
Assignment: Change the sampling rate to below the Nyquist rate and observe the
aliasing problem.
Experiment - 10
Theory:
Pulse code modulation(PCM) was invented by Alec Reeves in 1937 to obtain a digital
representation of analog message signals m(t). In essence, m(t) is sampled at rate Fs samples
per second and then each sample is quantized to b bits which are in turn transmitted serially,
e.g., using flat-top PAM. In telephony Fs = 8000 samples per second and b = 8 are the most
commonly used parameters for individual subscriber lines, resulting in a binary signal with bit
rate Fb = 64 kbit/sec. One advantage of using PCM is that several signals can easily be
multiplexed in time so that they can share a single communication channel. A T1 carrier, for
example, is used in telephony to transmit 24 multiplexed PCM signals with a total rate of 1.544
Mbit/sec (this includes some overhead for synchronization). A second advantage is that
repeaters that need to be used to compensate for losses over large distances can (within some
limits) restore the signal perfectly because only two signal levels need to be distinguished.
The most common technique to change an analog signal to digital data is called pulse code
modulation (PCM). A PCM encoder has the following three processes: the analog signal is sampled,
the sampled signal is quantized, and the quantized values are encoded as streams of bits.
Python Implementation
import matplotlib.pyplot as plt
import numpy as np
A = 1
fm = 10
fs = 80
n = 3
t = np.arange(0, 1, (1 / (100 * fm)))
x = A * np.cos(2 * np.pi * fm * t)
#---Sampling-----
ts = np.arange(0, 1, (1 / (fs)))
xs = A * np.cos(2 * np.pi * fm * ts)
#xs Sampled signal
#--Quantization---
x1 = xs + A
x1 = x1 / (2 * A)
L = (-1 + 2 ** n)# Levels
x1 = L * x1
xq = np.round(x1)
r = xq / L
r = 2 * A * r
r = r - A
#r quantized signal
#Calculations
MSE = np.sum((xs - r)**2) / len(xs)# mean squared quantization error
Bitrate = n * fs
Stepsize = 2 * A / L
QNoise = ((Stepsize) ** 2) / 12
plt.figure(1)
plt.plot(t, x, label= 'Original Signal',linewidth = 2)
plt.title('Sampling')
plt.ylabel('Amplitude')
plt.xlabel('Time t(in sec)')
plt.figure(2)
plt.stem(ts, x1,use_line_collection = True)
plt.title('Quantization')
plt.ylabel('Levels L')
Problem 1
A signal m(t) band-limited to 3 kHz is sampled at a rate 33⅓% higher than the Nyquist rate. The
maximum acceptable error in the sample amplitude (the maximum quantization error) is 0.5% of
the peak amplitude mp. The quantized samples are binary coded. Find the minimum bandwidth
of a channel required to transmit the encoded binary signal. If 24 such signals are time-division
multiplexed, determine the minimum transmission bandwidth required to transmit the
multiplexed signal.
Solution:
The Nyquist rate is 2 x 3000 = 6000 Hz; sampling 33⅓% higher gives Fs = 8000 samples per second.
The maximum quantization error is half a step, Δ/2 = mp/L, so mp/L ≤ 0.005 mp requires L ≥ 200;
the nearest power of 2 is L = 256, so we need n = log2(256) = 8 bits per sample. We require to
transmit a total of C = 8 x 8000 = 64,000 bit/s. Because we can transmit up to 2 bit/s per hertz of
bandwidth, we require a minimum transmission bandwidth BT = C/2 = 32 kHz. The multiplexed signal
has a total of CM = 24 x 64,000 = 1.536 Mbit/s, which requires a minimum of 1.536/2 = 0.768 MHz of
transmission bandwidth.
# find the smallest power of 2 greater than or equal to L
for i in range(0, 11):
    j = 2**i
    if j >= L:
        L1 = j
        break
Solution:
Problem 3:
Calculate the number of bits required, the bandwidth required for 30 encoders, and the signalling rate for a
PCM signal with the following specifications:
maximum frequency f_m = 4x10^3 Hz
maximum amplitude of input signal x_max = 3.8
average power of signal P = 30x10^-3 W
signal-to-noise ratio in dB, S/N_dB = 20
import math
#given
f_m = 4.*10**3#maximum frequency or band
x_max = 3.8#maximun input signal
P = 30.*10**-3#average power of signal
SbyN_dB= 20.#signal to noise ratio in db
#calculations
SbyN = math.exp((SbyN_dB/10)*math.log(10));
v = round((math.log10((SbyN*(x_max)**2)/(3*P))/math.log10(2.)/2.))#number of bits required
BW = 30*v*f_m#transmission channel bandwidth (greater than or equal to the obtained value)
r=BW*2#signalling rate is two times the transmission bandwidth
#results
print ("i.Number of bits required (bits) = ",round(v,2))
print ("ii.Bandwidth required for 30 PCM coders (kHz) = ",round(BW/1000.,0))
print ("iii.Signalling rate (bitspersecond) = ",round(r/1000.,0))
EXPERIMENT-11
11.1 Frequency shift keying modulation-Frequency-shift keying (FSK) is a method of transmitting digital signals using discrete frequency changes of a carrier wave.
The two binary states, logic 0 (low) and 1 (high), in a binary frequency-shift keying mechanism are each represented by an analog waveform at a different frequency.
import numpy as np
import math
import matplotlib.pyplot as plot
amplitude = 1
frequencies = [1200, 4000]
X = []
Y1 = []
Y2 = []
signal= [1,0,1,1,0,1,0,1,0,0]
interval = 1000
phase = 0
for s in signal:
    frequency = frequencies[s]
    for t in range(interval):
        phase += frequency/1000000 * math.pi * 2
        Y2.append(s)
        Y1.append(math.sin(phase) * amplitude)
X = range(len(signal)*interval)
fig, ax = plot.subplots(figsize=(16, 2))
plot.plot(X,Y1)
plot.plot(X,Y2)
plot.title('Digital Signal')
plot.xlabel('Time')
plot.ylabel('Amplitude')
plot.grid(True, which='both')
11.2 Amplitude shift keying-ASK is a type of modulation where the digital signal is represented as a change in amplitude. In order to
carry out amplitude shift keying, we require a carrier signal and a binary sequence signal. It is also known as On-Off keying, because
the carrier wave switches between 0 and 1 according to the high and low levels of the input signal.
import numpy as num
import matplotlib.pyplot as plt
F1=10
F2=2
A=3;#Amplitude
t=num.arange(0,1,0.001)
x=A*num.sin(2*num.pi*F1*t)#Carrier Sine wave
u=[]#Message signal
b=[0.2,0.4,0.6,0.8,1.0]
s=1
for i in t:
    if(i==b[0]):
        b.pop(0)
        if(s==0):
            s=1
        else:
            s=0
    #print(s,i,b)
    u.append(s)
v=[]#Sine wave multiplied with square wave
for i in range(len(t)):
    v.append(A*num.sin(2*num.pi*F1*t[i])*u[i])
'''plt.plot(t,x);
plt.xlabel('Time');
plt.ylabel('Amplitude');
plt.title('Carrier');
plt.grid(True)
plt.show()
#plt.plot(t,u)
plt.xlabel('Time')
plt.ylabel('Amplitude')
plt.title('Square wave Pulses')
plt.grid(True)
plt.show()'''
plt.plot(t,v)
plt.xlabel('Time')
plt.ylabel('Amplitude')
plt.title('ASK Signal')
plt.grid(True)
plt.show()
11.3 Phase shift keying modulation- PSK is the digital modulation technique in which the phase of the carrier signal is changed by
varying the sine and cosine inputs at a particular time. This is also called 2-phase PSK or Phase Reversal Keying. In this technique,
the sine wave carrier takes two phase reversals, 0° and 180°.
BPSK is basically a Double Sideband Suppressed Carrier (DSBSC) modulation scheme, with the message being the digital information.
'''plt.plot(t,x);
plt.xlabel("time");
plt.ylabel("Amplitude");
plt.title("Carrier");
plt.grid(True)
plt.show()'''
u=[]#Message signal
b=[0.2,0.4,0.6,0.8,1.0]
s=1
for i in t:
    if(i==b[0]):
        b.pop(0)
        if(s==0):
            s=1
        else:
            s=0
    #print(s,i,b)
    u.append(s)
#print(u)
'''plt.plot(t,u)
plt.xlabel('time')
plt.ylabel('Amplitude')
plt.title('Message Signal')
plt.grid(True)
plt.show()'''
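# The computation of the PSK waveform v is missing in the original excerpt.
# A minimal sketch (assumed): shift the carrier phase by pi when the message bit is 1
v=[]
for i in range(len(t)):
    v.append(A*num.sin(2*num.pi*F1*t[i] + num.pi*u[i]))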
plt.plot(t,v);
#plt.axis([0 1 -6 6]);
plt.xlabel("t");
plt.ylabel("y");
plt.title("PSK");
plt.grid(True)
plt.show()
POST LAB EXERCISE: Do a self-exercise on the generation of the following modulation techniques with the
help of the in-lab exercise.
Experiment - 12
Aim: To build Linear Regression models and Logistic Regression models for classification.
Linear Regression
Introduction
A regression model is, in machine learning and statistical analysis, a model that relates known data points in order to estimate a
certain function F and approximate the value of Y for a data point X.
Linear regression is one of the simplest supervised learning algorithms. In fact, it is so simple that it is sometimes not considered machine
learning at all!
Whatever you believe, the fact is that linear regression--and its extensions--continues to be a common and useful method of making
predictions when the target vector is a quantitative value (e.g. home price, age).
Problem 1: Simple Linear Regression- We will start with the most familiar linear regression, a straight-line fit to data. A straight-line fit is a
model of the form y = ax + b, where a is commonly known as the slope, and b is commonly known as the intercept.
Consider the following data, which is scattered about a line with a slope of 2 and an intercept of -5:
Reference
https://colab.research.google.com/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.06-Linear-Regression.ipynb#scrollTo=_i6mW05AbG0S
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import numpy as np
rng = np.random.RandomState(1)
x = 10 * rng.rand(50)
y = 2 * x - 5 + rng.randn(50)
plt.scatter(x, y);
We can use Scikit-Learn's LinearRegression estimator to fit this data and construct the best-fit line:
from sklearn.linear_model import LinearRegression
model = LinearRegression(fit_intercept=True)
model.fit(x[:, np.newaxis], y)
xfit = np.linspace(0, 10, 1000)
yfit = model.predict(xfit[:, np.newaxis])
plt.scatter(x, y)
plt.plot(xfit, yfit);
The slope and intercept of the data are contained in the model's fit parameters, which in Scikit-Learn are always marked by a trailing underscore.
Here the relevant parameters are coef_ and intercept_:
model.intercept_
-4.998577085553204
model.coef_[0]
2.0272088103606953
Problem 2: You want to train a model that represents a linear relationship between a (2-D) feature matrix and a target vector, using
LinearRegression from scikit-learn.
Reference: Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
[https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html]
Solution:
Linear regression assumes that the relationship between the features and the target vector is approximately linear. In our solution, for the sake of
explanation we have trained our model using only two features. This means our linear model will be:
y = β₀ + β₁x₁ + β₂x₂ + ϵ
where y is our target, xᵢ is the data for a single feature, β₁ and β₂ are the coefficients identified by fitting the model, and ϵ is the error. After we have
fit our model, we can view the value of each parameter. For example, β₀, also called the bias or intercept, can be viewed using intercept_:
# Load libraries
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
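# The loading and fitting steps are omitted in this excerpt; a minimal sketch
# consistent with the outputs below (using the first two Boston features):
boston = load_boston()
features = boston.data[:, 0:2]
target = boston.target
# create and fit the linear regression
regression = LinearRegression()
model = regression.fit(features, target)
# view the intercept (bias), the feature coefficients, and the first target value in dollars
model.intercept_
model.coef_
target[0]*1000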
22.485628113468223
array([-0.35207832, 0.11610909])
24000.0
model.predict(features)[0]*1000
24573.366631705547
Polynomial regression is an extension of linear regression to allow us to model nonlinear relationships. To create a polynomial regression, convert
the linear function we used earlier
y = β₀ + β₁x₁ + ϵ
into a polynomial function by adding polynomial features:
y = β₀ + β₁x₁ + β₂x₁² + ... + β_d x₁ᵈ + ϵ
where d is the degree of the polynomial. How are we able to use a linear regression for a nonlinear function?
Problem 3
You want to model a nonlinear relationship y = sin(x).
Solution
Create a polynomial regression by including polynomial features in a linear regression model:
With this transform in place, we can use the linear model to fit much more complicated relationships between 𝑥 and 𝑦 . For example, here is a
sine wave with noise:
rng = np.random.RandomState(1)
x = 10 * rng.rand(50)
y = np.sin(x) + 0.1 * rng.randn(50)
# 7th-degree polynomial basis pipeline (poly_model is not defined in the excerpt)
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
poly_model = make_pipeline(PolynomialFeatures(7), LinearRegression())
poly_model.fit(x[:, np.newaxis], y)
xfit = np.linspace(0, 10, 1000)
yfit = poly_model.predict(xfit[:, np.newaxis])
plt.scatter(x, y)
plt.plot(xfit, yfit);
Our linear model, through the use of 7th-order polynomial basis functions, can provide an excellent fit to this non-linear data!
Problem 4
You want to model a nonlinear relationship using LinearRegression from scikit-learn.
Solution
Create a polynomial regression by including polynomial features in a linear regression model:
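The code for this solution is omitted in the excerpt; a minimal sketch consistent with the outputs below (the first Boston feature raised to powers up to degree 3):
# load library
from sklearn.preprocessing import PolynomialFeatures
# take a single feature
features = boston.data[:, 0:1]
# create polynomial features x, x^2, and x^3
polynomial = PolynomialFeatures(degree=3, include_bias=False)
features_polynomial = polynomial.fit_transform(features)
# view the first observation's value, squared value, and cubed value
features[0]
features[0]**2
features[0]**3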
array([0.00632])
array([3.99424e-05])
array([2.52435968e-07])
#bias parameter/intercept
regression.intercept_
25.190479369326766
0.2177048869902171
# (the plot call for X versus the predicted y is omitted in the excerpt)
plt.xlabel('X')
plt.ylabel('y,Predicted')
plt.legend()
plt.show()
Logistic Regression
Despite being called a regression, logistic regression is actually a widely used supervised classification technique. It allows us to predict the
probability that an observation is of a certain class.
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data with only two classes (assumed here, to match the binary prediction below)
iris = datasets.load_iris()
features = iris.data[:100, :]
target = iris.target[:100]
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
# Create logistic regression object and train model
logistic_regression = LogisticRegression(random_state=0)
model = logistic_regression.fit(features_standardized, target)
# Predict class of a new observation (values are illustrative)
model.predict([[.5, .5, .5, .5]])
array([1])
Discussion
Despite having "regression" in its name, a logistic regression is actually a widely used binary classifier (i.e. the target vector can only take two
values). In a logistic regression, a linear model (e.g. β₀ + β₁x) is included in a logistic (also called sigmoid) function, 1/(1 + e⁻ᶻ), such that:
P(yᵢ = 1 | X) = 1 / (1 + e^(−(β₀ + β₁x)))
where P(yᵢ = 1 | X) is the probability of the ith observation's target, yᵢ, being class 1, X is the training data, β₀ and β₁ are the parameters to be
learned, and e is Euler's number. The effect of the logistic function is to constrain the value of the function's output to between 0 and 1 so that it can
be interpreted as a probability. If P(yᵢ = 1 | X) is greater than 0.5, class 1 is predicted; otherwise class 0 is predicted.
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
# Create one-vs-rest logistic regression object (assumed; object creation is omitted in the excerpt)
logistic_regression = LogisticRegression(random_state=0, multi_class="ovr")
# Train model
model = logistic_regression.fit(features_standardized, target)
array([2])
array([1])
Problem 1 Consider a dataset with curvilinear relationship. Build polynomial regression model using LinearRegression from
scikit-learn.
#creating a dataset with curvilinear relationship
x=10*np.random.normal(0,1,70)
y=10*(-x**2)+np.random.normal(-100,100,70)
Problem 2 Load one dataset from sklearn datasets and build a logistic regression classifier.
For datasets see link: https://scikit-learn.org/stable/datasets.html