Machinelearning Prac
EXPERIMENT-1
AIM: Write a program to read a data set from a CSV file.
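import csv
content_csv_space = []
content_csv_tab = []
# Read a space-delimited file ('data_space.csv' is a placeholder file name)
with open('data_space.csv', newline='') as csvfile:
    spamreader_space = csv.reader(csvfile, delimiter=' ')
    for row in spamreader_space:
        content_csv_space.append(row)
# Read a tab-delimited file ('data_tab.csv' is a placeholder file name)
with open('data_tab.csv', newline='') as csvfile:
    spamreader_tab = csv.reader(csvfile, delimiter='\t')
    for row in spamreader_tab:
        content_csv_tab.append(row)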
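To see how the delimiter argument changes the parsing without needing a file on disk, here is a minimal sketch that feeds csv.reader an in-memory string. The sample text and the variable names are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: the same two records, once space-separated
# and once tab-separated.
space_text = "name age\nasha 21\n"
tab_text = "name\tage\nasha\t21\n"

# csv.reader accepts any iterable of lines, so io.StringIO stands in for a file.
rows_space = list(csv.reader(io.StringIO(space_text), delimiter=' '))
rows_tab = list(csv.reader(io.StringIO(tab_text), delimiter='\t'))

# Reading tab-separated text with the wrong delimiter leaves each line unsplit.
rows_wrong = list(csv.reader(io.StringIO(tab_text), delimiter=' '))

print(rows_space)
print(rows_tab)
print(rows_wrong)
```

When the delimiter matches the file, every row comes back as a list of fields. When it does not, each row is a single-element list holding the whole line, which is a quick way to spot a wrong delimiter.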
EXPERIMENT-2
Linear regression uses the relationship between the data points to draw a straight line through all of them.
Python has methods for finding a relationship between data points and for drawing a line of linear regression.
We will show you how to use these methods instead of going through the mathematical formula.
Example:
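import matplotlib.pyplot as plt
from scipy import stats
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
# Fit the least-squares line; r is the correlation coefficient
slope, intercept, r, p, std_err = stats.linregress(x, y)
# Return the point on the fitted line for a given x value
def myfunc(x):
    return slope * x + intercept
mymodel = list(map(myfunc, x))
plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()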
It is important to know how strong the relationship between the values of the x-axis and the values
of the y-axis is. If there is no relationship, linear regression cannot be used to predict
anything.
This relationship is measured by the correlation coefficient, called r. The r value ranges from -1 to 1,
where 0 means no relationship, and 1 (and -1) means 100% related.
Python and the SciPy module will compute this value for you; all you have to do is feed it
the x and y values.
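As a minimal sketch of reading the r value, the snippet below passes the same x and y lists used in the example to scipy.stats.linregress and prints the correlation coefficient. It assumes SciPy is installed; the variable name result is chosen here for illustration.

```python
from scipy import stats

# Same sample data as the linear regression example.
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# linregress returns the slope, intercept, r value, p value and standard error.
result = stats.linregress(x, y)
r = result.rvalue

# A value close to -1 or 1 indicates a strong linear relationship.
print(round(r, 2))
```

For this data r comes out at about -0.76. The relationship is not perfect, but it is strong enough for linear regression to be of some use for prediction.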
EXPERIMENT-3
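import numpy as np
import matplotlib.pyplot as plt
plt.plot(np.array([1, 2, 3, 4]), np.array([1, 2, 3, 4]) * 2)  # plot y = 2x without axis labels
plt.show() # display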
The output above has no labels on the x-axis or the y-axis. Labels are necessary for
understanding what the dimensions of the chart represent. The following example shows
how to add axis labels and a title to the chart:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3, 4])
y = x*2
plt.plot(x, y)
# Label the axes and give the chart a title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Line chart of y = 2x")
plt.show()
EXPERIMENT-4
To create a histogram, the first step is to create bins: divide the whole range of the
values into a series of intervals, then count the values that fall into each interval. Bins are
consecutive, non-overlapping intervals of a variable. The matplotlib.pyplot.hist()
function is used to compute and create a histogram of x.
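The binning step described above can be sketched with numpy.histogram, which does the counting without drawing anything. The sample values and bin edges below are made up for illustration.

```python
import numpy as np

# Hypothetical sample values and bin edges, chosen for illustration.
values = [1, 2, 2, 3, 5, 7, 8, 8, 9, 10]
edges = [0, 4, 8, 12]  # three bins: [0, 4), [4, 8), [8, 12]

# np.histogram counts how many values fall into each interval.
# Every bin is half-open except the last, which includes its right edge.
counts, bin_edges = np.histogram(values, bins=edges)

print(counts)  # [4 2 4]
```

The counts are the bar heights that a histogram of the same values with the same bins would show.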
The code below creates a histogram of some random values:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
from matplotlib.ticker import PercentFormatter
# Creating dataset
np.random.seed(23685752)
N_points = 10000
n_bins = 20
# Creating distribution
x = np.random.randn(N_points)
y = .8 ** x + np.random.randn(10000) + 25
legend = ['distribution']
# Creating figure and axes
fig, axs = plt.subplots(1, 1,figsize =(10, 7),tight_layout = True)
# Remove x, y ticks
axs.xaxis.set_ticks_position('none')
axs.yaxis.set_ticks_position('none')
# Add x, y gridlines
axs.grid(True, color ='grey',linestyle ='-.', linewidth = 0.5,alpha = 0.6)
# Add Text watermark
fig.text(0.9, 0.15, 'Jeeteshgavande30',
         fontsize = 12,
         color ='red',
         ha ='right',
         va ='bottom',
         alpha = 0.7)
# Creating histogram
N, bins, patches = axs.hist(x, bins = n_bins)
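# Setting color: map a value derived from each bar's count to a colormap
fracs = ((N**(1 / 5)) / N.max())
norm = colors.Normalize(fracs.min(), fracs.max())
for thisfrac, thispatch in zip(fracs, patches):
    thispatch.set_facecolor(plt.cm.viridis(norm(thisfrac)))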
# Show plot
plt.show()
EXPERIMENT-5
A bar graph is a graphical representation of data in which each category is drawn as a
rectangle. The length or height of each bar represents the value of that category in the dataset.
In a bar chart, one axis represents the categories of a column in the dataset and the other axis
represents the values or counts associated with them. Bar charts can be plotted
vertically or horizontally. A vertical bar chart is often called a column chart. When the bars are
arranged from high to low counts, the chart is called a Pareto chart.
The matplotlib API in Python provides the bar() function, which can be used in
MATLAB style or through the object-oriented API. The syntax of the bar() function when used with
the axes is as follows:
ax.bar(x, height, width=0.8, bottom=None, align='center')
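import pandas as pd
import matplotlib.pyplot as plt
# Load the dataset ('cars.csv' is a placeholder name; the file needs 'car' and 'price' columns)
df = pd.read_csv('cars.csv')
name = df['car'].head(12)
price = df['price'].head(12)
# Figure Size
fig, ax = plt.subplots(figsize =(16, 9))
# Draw one bar per car
ax.bar(name, price)
# Remove x, y Ticks
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
# Add x, y gridlines
ax.grid(True, color ='grey',linestyle ='-.', linewidth = 0.5,alpha = 0.2)
# Show Plot
plt.show()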
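The bar() call can also be tried without a CSV file. The sketch below uses a small in-memory data set; the category names and values are made up for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, made up for illustration.
names = ['A', 'B', 'C']
prices = [12, 30, 21]

fig, ax = plt.subplots(figsize=(6, 4))

# ax.bar returns a container holding one Rectangle per category.
bars = ax.bar(names, prices, width=0.6, color='steelblue')

ax.set_xlabel('Category')
ax.set_ylabel('Value')
ax.set_title('Sample bar chart')

# The height of each rectangle equals the value it represents.
heights = [bar.get_height() for bar in bars]
print(heights)

# Call plt.show() to display the chart in an interactive session.
```

Replacing ax.bar with ax.barh draws the same data as horizontal bars, with the categories on the y-axis.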
EXPERIMENT-6
AIM: Draw Matplotlib Pie Charts.
INTRODUCTION:
A Pie Chart is a circular statistical plot that can display only one series of data. The area of the chart is the
total percentage of the given data. The area of each slice of the pie represents the percentage of that part of
the data. The slices of the pie are called wedges. The area of a wedge is determined by the length of the arc of
the wedge, and it represents the relative percentage of that part with respect to the whole data. Pie
charts are commonly used in business presentations on sales, operations, survey results, resources, etc., as
they provide a quick summary.
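The relationship between a value, its percentage and the size of its wedge can be sketched in plain Python. The four values below are made up for illustration.

```python
# Hypothetical values for four categories, made up for illustration.
data = [10, 20, 30, 40]
total = sum(data)

# Each wedge's share of the whole, as a percentage.
percentages = [100 * value / total for value in data]

# The same share expressed as the wedge's central angle; all angles sum to 360.
angles = [360 * value / total for value in data]

print(percentages)  # [10.0, 20.0, 30.0, 40.0]
print(angles)       # [36.0, 72.0, 108.0, 144.0]
```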
Program Code
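# Import libraries
import numpy as np
import matplotlib.pyplot as plt
# Creating dataset (the numeric values are illustrative sample data)
cars = ['AUDI', 'BMW', 'FORD',
        'TESLA', 'JAGUAR', 'MERCEDES']
data = [23, 17, 35, 29, 12, 41]
# Offset of each wedge from the centre, and the wedge colours
explode = (0.1, 0.0, 0.2, 0.3, 0.0, 0.0)
colors = ("orange", "cyan", "brown", "grey", "indigo", "beige")
# Wedge properties
wp = { 'linewidth' : 1, 'edgecolor' : "green" }
# Label each wedge with its percentage and absolute value
def func(pct, allvalues):
    absolute = int(pct / 100. * np.sum(allvalues))
    return "{:.1f}%\n({:d})".format(pct, absolute)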
# Creating plot
fig, ax = plt.subplots(figsize =(10, 7))
wedges, texts, autotexts = ax.pie(data,
autopct = lambda pct: func(pct, data),
explode = explode,
labels = cars,
shadow = True,
colors = colors,
startangle = 90,
wedgeprops = wp,
textprops = dict(color ="magenta"))
# Adding legend
ax.legend(wedges, cars,
title ="Cars",
loc ="center left",
bbox_to_anchor =(1, 0, 0.5, 1))
# show plot
plt.show()
EXPERIMENT-7
AIM: Write a program to draw a scatter plot using Python.
INTRODUCTION:
A scatter plot is a diagram where each value in the data set is represented by a dot.
The Matplotlib module has a method for drawing scatter plots. It needs two arrays of the same length, one
for the values of the x-axis and one for the values of the y-axis. In the example below, the x array holds
the age of each car and the y array holds the speed of each car:
import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]
plt.scatter(x, y)
plt.show()
What we can read from the diagram is that the two fastest cars were both 2 years old,
and the slowest car was 12 years old.
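The reading of the diagram can be checked directly against the data. This sketch assumes, as that reading implies, that x is the age of each car and y is its speed.

```python
# Same data as the scatter plot: x is taken to be the age of each car, y its speed.
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86]

# Pair each speed with its age and sort from fastest to slowest.
cars = sorted(zip(y, x), reverse=True)

fastest_two = cars[:2]   # the two highest speeds
slowest = cars[-1]       # the lowest speed

print(fastest_two)  # [(111, 2), (103, 2)]
print(slowest)      # (77, 12)
```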
EXPERIMENT-8
AIM: Write a program which demonstrates Machine Learning - Polynomial
Regression.
INTRODUCTION:
Polynomial Regression is a form of linear regression in which the relationship between the independent
variable x and the dependent variable y is modeled as an nth-degree polynomial. Polynomial regression fits a
nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x).
Why Polynomial Regression:
There are some relationships that a researcher will hypothesize to be curvilinear. Such
cases call for a polynomial term.
Inspection of residuals. If we try to fit a linear model to curved data, a scatter plot of the residuals (Y-axis)
against the predictor (X-axis) will have patches of many positive residuals in the middle. In
such a situation, a linear model is not appropriate.
An assumption in the usual multiple linear regression analysis is that all the independent variables are
independent of each other. In a polynomial regression model, this assumption is not satisfied, because the
terms x, x^2, ..., x^n are all derived from the same variable.
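The residual pattern mentioned above can be reproduced with a small sketch. It fits a straight line to deliberately curved data with numpy.polyfit and prints the residuals; the data (an inverted parabola) is made up for illustration.

```python
import numpy as np

# Hypothetical curved data: an inverted parabola, made up for illustration.
x = np.arange(-5, 6)          # -5, -4, ..., 5
y = -(x ** 2)

# Fit a straight line (degree-1 polynomial) by least squares.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Residuals are positive in the middle and negative at both ends,
# which signals that a straight line is the wrong model.
print(np.round(residuals, 2))
```

A well-fitting model would leave residuals scattered around zero with no such pattern.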
Uses of Polynomial Regression:
These are basically used to define or describe non-linear phenomena such as:
The growth rate of tissues.
Progression of disease epidemics
Distribution of carbon isotopes in lake sediments
The basic goal of regression analysis is to model the expected value of a dependent variable y in terms of the
value of an independent variable x. In simple regression, we use the following equation:
y = a + bx + e
Here y is the dependent variable, x is the independent variable, a is the y-intercept, b is the slope and e is
the error term.
In general, we can model it as an nth-degree polynomial:
y = a + b1x + b2x^2 + ... + bnx^n + e
Since the regression function is linear in the unknown coefficients a, b1, ..., bn, these models are linear
from the point of view of estimation.
Hence the coefficients can be estimated with the least squares technique, and the fitted model is then used to
compute the response value y.
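As a minimal sketch of why least squares applies, the snippet below builds a design matrix whose columns are 1, x and x^2 and solves for the coefficients with numpy.linalg.lstsq. The data is generated from a known quadratic, chosen here for illustration, so the recovered coefficients can be checked.

```python
import numpy as np

# Hypothetical data generated from y = 1 + 2x + 3x^2 with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1 + 2 * x + 3 * x ** 2

# Design matrix with columns [1, x, x^2]. The model is nonlinear in x
# but linear in the coefficients (a, b1, b2), so ordinary least squares applies.
A = np.vander(x, N=3, increasing=True)

coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
a, b1, b2 = coeffs

print(np.round(coeffs, 6))
```

The solution recovers a = 1, b1 = 2 and b2 = 3 up to floating-point error. With noisy data the same call returns the coefficients that minimise the sum of squared errors.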
Divide the dataset into two components, X and y. X will contain column 1 (the slice 1:2 selects only the
column at index 1 and keeps the result two-dimensional), and y will contain column 2.
X = datas.iloc[:, 1:2].values
y = datas.iloc[:, 2].values
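The difference between the two selections is easiest to see from the array shapes. The sketch below uses a small NumPy array as a stand-in for datas.values; the numbers are made up, and the assumption is that iloc followed by .values behaves like the same slice on the underlying array.

```python
import numpy as np

# Hypothetical table with three columns, standing in for datas.values.
table = np.array([[1, 10, 100],
                  [2, 20, 200],
                  [3, 30, 300]])

X = table[:, 1:2]   # a slice keeps two dimensions: shape (3, 1)
y = table[:, 2]     # an integer index drops a dimension: shape (3,)

print(X.shape)  # (3, 1)
print(y.shape)  # (3,)
```

Writing 1:2 instead of 1 is what keeps X two-dimensional, with one row per sample and one column per feature.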
Step 5: Visualise the Linear Regression results using a scatter plot.
Step 7: Predict new results with both Linear and Polynomial Regression. Note that the input variable
must be in a NumPy 2D array.
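The code for the prediction step is not shown here, so the following is only a sketch of the idea under stated assumptions. It uses numpy.polyfit and numpy.polyval rather than whichever library the full program uses, and the data (points on y = x^2) is made up. Unlike estimators that expect a 2D input array, these NumPy functions take plain 1D arrays and scalars.

```python
import numpy as np

# Hypothetical curved data following y = x^2, made up for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = x ** 2

# Fit a straight line and a degree-2 polynomial by least squares.
linear_coeffs = np.polyfit(x, y, 1)
poly_coeffs = np.polyfit(x, y, 2)

# Predict the response for a new input with both models.
new_x = 7.0
linear_pred = np.polyval(linear_coeffs, new_x)
poly_pred = np.polyval(poly_coeffs, new_x)

print(round(float(linear_pred), 2))
print(round(float(poly_pred), 2))
```

The true value at x = 7 is 49. The polynomial model reproduces it, while the straight line falls well short, which is the behaviour the comparison in Step 7 is meant to expose.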