
DEV Lab Manual


Uploaded by

palaniappan.cse


Ex. No.   Date         Name of the Experiment

1(a)      25-08-2021   Working with numpy arrays
1(b)      25-08-2021   Program to perform array slicing
1(c)      25-08-2021   Program to perform array slicing
2(a)      28-08-2021   Create a dataframe using a list of elements
2(b)      28-08-2021   Create a dataframe using the dictionary
2(c)      28-08-2021   Column selection
2(d)      02-09-2021   Checking for missing values using isnull() and notnull()
2(e)      02-09-2021   Dropping missing values using dropna()
3(a)      11-09-2021   Basic plots using matplotlib
3(b)      11-09-2021   Compute the x and y coordinates and create a plot
3(c)      15-09-2021   Drawing multiple lines using plot function
3(d)      15-09-2021   Basic plots using matplotlib
4(a)      17-09-2021   Python program to show the conditional frequency distribution
4(b)      17-09-2021   Python program to determine the frequency of words, of a particular genre, in the Brown corpus
4(c)      22-09-2021   Python program to find the frequency of the last character appearing in all names
                       associated with males and females respectively, and to compare them
5(a)      25-10-2021   Python program for finding the average of a list using a loop
5(b)      25-10-2021   Python program to find the average of a list using built-in functions
5(c)      06-10-2021   Python program to find the average of a list using the mean function
5(d)      09-10-2021   Python program to find the average of a list using the NumPy library
6(a)      13-10-2021   Python program to show the variance of a sample set
6(b)      21-10-2021   Python program to show variance on a range of data-types
6(c)      23-10-2021   Python program to show statistics
7         23-10-2021   Python program to create a normal curve
8         30-10-2021   Correlation and scatter plots
9         30-10-2021   Correlation coefficient
10(a)     13-11-2021   Simple linear regression with scikit-learn
10(b)     13-11-2021   Multiple linear regression with scikit-learn

Ex No: 1 (a) WORKING WITH NUMPY ARRAYS
Date: 25-08-2021

NUMPY:
NumPy (Numerical Python) is a Python library for working with arrays. It also provides functions for
linear algebra, Fourier transforms, and matrices. NumPy was created in 2005 by Travis Oliphant. It is
an open-source project and can be used freely.

It is a general-purpose array-processing package that provides a high-performance multidimensional
array object and tools for working with these arrays.

It is the fundamental package for scientific computing with Python. Its important features include:

A powerful N-dimensional array object
Sophisticated (broadcasting) functions
Tools for integrating C/C++ and Fortran code
Useful linear algebra, Fourier transform, and random number capabilities
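
Broadcasting, one of the features listed above, can be illustrated with a minimal sketch. The array values and the names matrix, row, shifted and doubled are illustrative assumptions and are not part of the lab program.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 2, 5]])
row = np.array([10, 20, 30])

# The 1-D row is broadcast across both rows of the 2-D array
shifted = matrix + row

# A scalar is broadcast to every element
doubled = matrix * 2

print(shifted)
print(doubled)
```

Both results keep the shape (2, 3) of the larger operand; no explicit loop is needed.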

AIM
Write a Python program to demonstrate basic array characteristics.

ALGORITHM
Step1: Start
Step2: Import numpy module

Step3: Print the basic characteristics of array

Step4: Stop

PROGRAM
import numpy as np

# Creating array object
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

# Printing type of arr object
print("Array is of type: ", type(arr))

# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)

# Printing shape of array
print("Shape of array: ", arr.shape)

# Printing size (total number of elements) of array
print("Size of array: ", arr.size)

# Printing type of elements in array
print("Array stores elements of type: ", arr.dtype)

OUTPUT
Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int32

RESULT
Thus the Python program for working with NumPy arrays has been implemented and executed
successfully.

Ex No: 1 (b) PROGRAM TO PERFORM ARRAY SLICING
Date: 25-08-2021

SLICING:
Similar to Python lists, NumPy arrays can be sliced. Since arrays may be multidimensional, we must
specify a slice for each dimension of the array.
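
Giving one slice per dimension can be sketched as follows. This is a minimal example; the names top_left, last_column and rows_from_1 are assumptions chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

# One slice per dimension: rows first, then columns
top_left = a[0:2, 0:2]    # rows 0-1, columns 0-1
last_column = a[:, 2]     # every row, column 2 only
rows_from_1 = a[1:]       # an omitted trailing slice selects all columns

print(top_left)
print(last_column)
print(rows_from_1)
```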

AIM
Write a Python Program to Perform Array Slicing.

ALGORITHM
Step1: Start
Step2: import numpy module
Step3: Create an array and apply the slicing operator

Step4: Print the output

Step5: Stop

PROGRAM
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
print(a)

# Slice from row index 1 to the end (drops the first row)
print("After slicing")
print(a[1:])

OUTPUT
[[1 2 3]
 [3 4 5]
 [4 5 6]]
After slicing
[[3 4 5]
 [4 5 6]]

RESULT
Thus the Python program to perform array slicing has been implemented and executed successfully.

Ex No: 1 (c) PROGRAM TO PERFORM ARRAY SLICING
Date: 25-08-2021

AIM
Write a Python Program to Perform Array Slicing.

ALGORITHM
Step1: Start
Step2: import numpy module
Step3: Create an array and apply the slicing operator
Step4: Print the output

Step5: Stop

PROGRAM
# array to begin with
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
print('Our array is:')
print(a)

# this returns array of items in the second column
print('The items in the second column are:')
print(a[..., 1])
print('\n')

# Now we will slice all items from the second row
print('The items in the second row are:')
print(a[1, ...])
print('\n')

# Now we will slice all items from column 1 onwards
print('The items column 1 onwards are:')
print(a[..., 1:])

OUTPUT:
Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
The items in the second column are:
[2 4 5]


The items in the second row are:
[3 4 5]


The items column 1 onwards are:
[[2 3]
 [4 5]
 [5 6]]

RESULT
Thus the python program to perform array slicing has been implemented and executed successfully.

Ex No: 2 (a) CREATE A DATAFRAME USING A LIST OF ELEMENTS
Date: 28-08-2021

PANDAS:

Pandas is a Python library used to analyze data. A Pandas DataFrame is a two-dimensional data
structure, like a two-dimensional array or a table with rows and columns. A DataFrame can be
created from a list, a dictionary, a list of dictionaries, and so on.

AIM
Write a program to create a dataframe using a list of elements.

ALGORITHM
Step1: Start
Step2: import pandas module
Step3: Create a dataframe using list of elements

Step4: Print the output

Step5: Stop

PROGRAM
# import pandas as pd
import pandas as pd

# list of strings
lst = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)

OUTPUT
0
0 A
1 B
2 C
3 D
4 E
5 F
6 G

RESULT
Thus the python program for dataframe using list of elements has been implemented and executed
successfully.

Ex No: 2 (b) CREATE A DATAFRAME USING THE DICTIONARY
Date: 28-08-2021

DATAFRAME:
To create a DataFrame from a dict of ndarrays/lists, all the arrays must be of the same length. If an
index is passed, its length should be equal to the length of the arrays. If no index is passed, the
index defaults to range(n), where n is the array length.
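
The effect of passing an index can be sketched as below. The labels 'rank1' to 'rank4' and the names df_default and df_ranked are assumptions made for illustration only.

```python
import pandas as pd

data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}

# No index passed: the default index is range(n)
df_default = pd.DataFrame(data)

# Explicit index: its length must match the length of the lists
df_ranked = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])

print(list(df_default.index))
print(df_ranked)
```

Passing an index whose length differs from the lists raises a ValueError.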

AIM

Write a program to create a dataframe using dictionary of elements.

ALGORITHM
Step1: Start
Step2: import pandas module
Step3: Create a dataframe using the dictionary
Step4: Print the output

Step5: Stop

PROGRAM
import pandas as pd

# initialise data of lists
data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}

# Create DataFrame
df = pd.DataFrame(data)

# Print the output
print(df)

OUTPUT:
    Name  Age
0    Tom   20
1   nick   21
2  krish   19
3   jack   18

RESULT
Thus the Python program to create a dataframe using a dictionary has been implemented and executed
successfully.

Ex No: 2 (c) COLUMN SELECTION
Date: 28-08-2021

Column Selection
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and
columns. We can perform basic operations on rows/columns like selecting, deleting, adding, and
renaming.

Column Selection: To select a column in a Pandas DataFrame, we access it by its column name. A
list of column names selects several columns at once.

AIM
Write a program to select a column from dataframe.

ALGORITHM
Step1: Start

Step2: import pandas module

Step3: Create a dataframe using the dictionary
Step4: Select the specific columns and print the output
Step5: Stop

PROGRAM
import pandas as pd

# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
print(df)

# select two columns
print(df[['Name', 'Qualification']])

OUTPUT:

RESULT
Thus the Python program for column selection has been implemented and executed successfully.

Ex No: 2 (d) CHECKING FOR MISSING VALUES USING ISNULL() AND NOTNULL()
Date: 02-09-2021

To check for missing values in a Pandas DataFrame, we use the functions isnull() and notnull(). Both
functions help in checking whether a value is NaN or not. These functions can also be used on a Pandas
Series to find null values in the series.
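
Since the program in this exercise calls only isnull(), the relationship between the two functions can be sketched separately. This is a minimal example; the names scores, missing, present and missing_per_column are illustrative assumptions.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({'First Score': [100, 90, np.nan, 95],
                       'Second Score': [30, 45, 56, np.nan]})

missing = scores.isnull()     # True where a value is NaN
present = scores.notnull()    # True where a value is not NaN

# Summing the boolean mask counts the missing values in each column
missing_per_column = missing.sum()
print(missing_per_column)
```

The two masks are element-wise opposites, so either one can be used to filter rows.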

AIM
Write a program to check the missing values from the dataframe.

ALGORITHM
Step1: Start
Step2: import pandas module
Step3: Create a dataframe using the dictionary
Step4: Check the missing values using isnull() function
Step5: print the output
Step6: Stop

PROGRAM
# importing pandas as pd
import pandas as pd

# importing numpy as np
import numpy as np

# dictionary of lists (named data to avoid shadowing the built-in dict)
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, 45, 56, np.nan],
        'Third Score': [np.nan, 40, 80, 98]}

# creating a dataframe from the dictionary
df = pd.DataFrame(data)

# using isnull() function
print(df.isnull())

OUTPUT:

RESULT
Thus the Python program for checking for missing values using isnull() and notnull() has been
implemented and executed successfully.

Ex No: 2 (e) DROPPING MISSING VALUES USING DROPNA()
Date: 02-09-2021

To drop null values from a dataframe, we use the dropna() function. This function drops
rows or columns of a dataset that contain null values, in different ways.
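
A few of these ways can be sketched as follows, using the axis and how parameters of dropna(). The variable names rows_any, cols_any and rows_all are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First Score': [100, 90, np.nan, 95],
                   'Second Score': [30, np.nan, 45, 56],
                   'Third Score': [52, 40, 80, 98],
                   'Fourth Score': [np.nan, np.nan, np.nan, 65]})

rows_any = df.dropna()            # drop rows with at least one NaN
cols_any = df.dropna(axis=1)      # drop columns with at least one NaN
rows_all = df.dropna(how='all')   # drop rows only if every value is NaN

print(rows_any)
print(cols_any)
print(rows_all)
```

dropna() returns a new DataFrame in each case and leaves df unchanged.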

AIM
Write a program to drop rows with at least one NaN value (null value).

ALGORITHM
Step1: Start
Step2: import pandas module
Step3: Create a dataframe using the dictionary
Step4: Drop the null values using dropna() function
Step5: print the output
Step6: Stop

PROGRAM
# Drop rows with at least one NaN value (null value)

# importing pandas as pd
import pandas as pd

# importing numpy as np
import numpy as np

# dictionary of lists (named data to avoid shadowing the built-in dict)
data = {'First Score': [100, 90, np.nan, 95],
        'Second Score': [30, np.nan, 45, 56],
        'Third Score': [52, 40, 80, 98],
        'Fourth Score': [np.nan, np.nan, np.nan, 65]}

# creating a dataframe from dictionary
df = pd.DataFrame(data)

# using dropna() function
print(df.dropna())

OUTPUT:

RESULT
Thus the Python program for dropping missing values has been implemented and executed successfully.

Ex No: 3 (a) BASIC PLOTS USING MATPLOTLIB
Date: 11-09-2021

MATPLOTLIB:
Matplotlib is a Python library for visualizing and analyzing data. Its graphical, pictorial
visualizations help in better understanding the data. Matplotlib is a comprehensive library for
static, animated and interactive visualizations.

AIM
Write a python program to create a simple plot using plot() function.

ALGORITHM
Step1:Define the x-axis and corresponding y-axis values as lists.

Step2:Plot them on canvas using .plot() function.

Step3:Give a name to x-axis and y-axis using .xlabel() and .ylabel() functions.
Step4:Give a title to your plot using .title() function.

Step5:Finally, to view your plot, we use .show() function.

Step6: Stop

PROGRAM
import matplotlib.pyplot as plt

# x axis values
x = [1, 2, 3]
# corresponding y axis values
y = [2, 4, 1]

# plotting the points
plt.plot(x, y)

# naming the x axis
plt.xlabel('x - axis')
# naming the y axis
plt.ylabel('y - axis')

# giving a title to my graph
plt.title('My first graph!')

# function to show the plot
plt.show()

OUTPUT:

RESULT
Thus the python program for basic Matplotlib has been implemented and executed successfully.

Ex No: 3 (b) COMPUTE THE X AND Y COORDINATES AND CREATE A PLOT
Date: 11-09-2021

AIM
Write a python program to create a plot by computing the x and y coordinates.

ALGORITHM
Step1: Compute the x and y coordinates for points on a sine curve
Step2: Plot the points using matplotlib
Step3:Display the output
Step4: Stop

PROGRAM
import numpy as np
import matplotlib.pyplot as plt

# x values from 0 to 3*pi in steps of 0.1, and their sine
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)

plt.plot(x, y)
plt.show()

OUTPUT

RESULT
Thus the python program to compute X and Y coordinates has been implemented and executed
successfully.

Ex No: 3 (c) DRAWING MULTIPLE LINES USING PLOT FUNCTION
Date: 15-09-2021

AIM
Write a python program to draw multiple lines using plot() function.

ALGORITHM
Step1: Compute the x and y coordinates for points on a sine and cosine curve
Step2: Plot the points using matplotlib
Step3:Display the output
Step4: Stop

PROGRAM
import numpy as np
import matplotlib.pyplot as plt

# Compute the x and y coordinates for points on sine and cosine curves
x = np.arange(0, 3 * np.pi, 0.1)
y_sin = np.sin(x)
y_cos = np.cos(x)

# Plot the points using matplotlib
plt.plot(x, y_sin)
plt.plot(x, y_cos)
plt.xlabel('x axis label')
plt.ylabel('y axis label')
plt.title('Sine and Cosine')
plt.legend(['Sine', 'Cosine'])
plt.show()

OUTPUT

RESULT
Thus the Python program to draw multiple lines using the plot function has been implemented and executed
successfully.

Ex No: 3 (d) BASIC PLOT USING MATPLOTLIB
Date: 15-09-2021

AIM
Write a python program for basic plot using matplotlib

ALGORITHM
Step1: import the library
Step2: Plot the points using matplotlib
Step3: Display the output
Step4: Stop

PROGRAM
Line plot :

from matplotlib import pyplot as plt

x = [5, 2, 9, 4, 7]
y = [10, 5, 8, 4, 2]

plt.plot(x, y)
plt.show()

Bar plot :

from matplotlib import pyplot as plt

x = [5, 2, 9, 4, 7]
y = [10, 5, 8, 4, 2]

plt.bar(x, y)
plt.show()

Histogram :

from matplotlib import pyplot as plt

y = [10, 5, 8, 4, 2]

plt.hist(y)
plt.show()

Scatter Plot :

from matplotlib import pyplot as plt

x = [5, 2, 9, 4, 7]
y = [10, 5, 8, 4, 2]

plt.scatter(x, y)
plt.show()

RESULT
Thus the python program for basic plot using Matplotlib has been implemented and executed
successfully.

Ex No: 4 (a) CONDITIONAL FREQUENCY DISTRIBUTION
Date: 17-09-2021

Conditional Frequency:
In the previous topic, you studied frequency distributions. The FreqDist function computes
the frequency of each item in a list. While computing a frequency distribution, you observe the
occurrence count of an event.

A conditional frequency distribution is a collection of frequency distributions, each computed for a
given condition.

For computing a conditional frequency, you have to attach a condition to every occurrence of an
event, so that each observation becomes a (condition, event) pair.
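
The idea can be sketched with the Python standard library alone. This is an illustrative stand-in built on collections.Counter, not NLTK's own implementation; the conditions 'F' and 'V' and the item names are sample data.

```python
from collections import Counter, defaultdict

# Each occurrence is a (condition, event) pair
pairs = [('F', 'apple'), ('F', 'apple'), ('F', 'kiwi'),
         ('V', 'cabbage'), ('V', 'cabbage'), ('V', 'potato')]

# One frequency distribution (a Counter) per condition
cfd = defaultdict(Counter)
for condition, event in pairs:
    cfd[condition][event] += 1

print(sorted(cfd))   # the conditions
print(cfd['V'])      # Counter({'cabbage': 2, 'potato': 1})
```

Looking up a condition returns the frequency distribution of the events observed under that condition.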

AIM
To write a python program to show the conditional Frequency distribution

ALGORITHM
Step 1: Start
Step 2: Import Pandas, Numpy And Nltk

Step 4: Display The Frequency Of Each Items In The List

Step 5: Stop

PROGRAM:
import numpy as np   # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import nltk

# Frequency of each item in a plain list
items = ['apple', 'apple', 'kiwi', 'cabbage', 'cabbage', 'potato']
nltk.FreqDist(items)

# (condition, item) pairs: 'F' for fruit, 'V' for vegetable
c_items = [('F', 'apple'), ('F', 'apple'), ('F', 'kiwi'),
           ('V', 'cabbage'), ('V', 'cabbage'), ('V', 'potato')]

cfd = nltk.ConditionalFreqDist(c_items)
cfd.conditions()
cfd.plot()
cfd['V']

OUTPUT

FreqDist({'cabbage': 2, 'potato': 1})

RESULT
Thus the python program for conditional frequency distribution has been implemented and executed
successfully.

Ex No: 4 (b) FREQUENCY OF WORDS, OF A PARTICULAR GENRE, IN BROWN CORPUS
Date: 17-09-2021

AIM
To write a Python program to determine the frequency of words, of a particular genre, in the Brown corpus.

ALGORITHM
Step 1: Start
Step 2: Import All Necessary Libraries
Step 3: Display The Frequency Of Each Items In The List
Step 4:Setting Cumulative Argument Value To True.
Step 5: Stop

PROGRAM:

import nltk
from nltk.corpus import brown  # the corpus must be available, e.g. via nltk.download('brown')

# One frequency distribution of words per genre
cfd = nltk.ConditionalFreqDist([(genre, word)
                                for genre in brown.categories()
                                for word in brown.words(categories=genre)])
cfd
cfd.conditions()
cfd.tabulate(conditions=['government', 'humor', 'reviews'],
             samples=['leadership', 'worship', 'hardship'])
cfd.plot(conditions=['government', 'humor', 'reviews'],
         samples=['leadership', 'worship', 'hardship'])
cfd.tabulate(conditions=['government', 'humor', 'reviews'],
             samples=['leadership', 'worship', 'hardship'], cumulative=True)
news_fd = cfd['news']
news_fd.most_common(3)
news_fd['the']

OUTPUT :

             leadership  worship  hardship
government           12        3         2
humor                 1        0         0
reviews              14        1         2

             leadership  worship  hardship
government           12       15        17
humor                 1        1         1
reviews              14       15        17

RESULT
Thus the python program frequency of words, of a particular genre, in brown corpus has been
implemented and executed successfully.

Ex No: 4 (c) FREQUENCY OF LAST CHARACTER APPEARING IN ALL NAMES ASSOCIATED
WITH MALES AND FEMALES RESPECTIVELY AND COMPARES THEM
Date: 22-09-2021

AIM
To write a Python program to find the frequency of the last character appearing in all names associated
with males and females respectively, and to compare them.

ALGORITHM
Step 1: Start
Step 2: Import All Necessary Libraries
Step 3: Display The Frequency Of Each Items In The List
Step 4: Plot
Step 5: Stop

PROGRAM
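import nltk
from nltk.corpus import names  # the corpus must be available, e.g. via nltk.download('names')

# (gender, last character) pairs; the file ids are 'female.txt' and 'male.txt'
nt = [(fid.split('.')[0], name[-1]) for fid in names.fileids() for name in names.words(fid)]
cfd2 = nltk.ConditionalFreqDist(nt)
cfd2['female']['a']
cfd2['male']['a']
cfd2['female'] > cfd2['male']
cfd2.tabulate(samples=['a', 'e'])
cfd2.plot()

OUTPUT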
            a     e
female   1773  1432
male       29   468

RESULT
Thus the python program frequency of last character appearing in all names associated with males
and females respectively and compares them has been implemented and executed successfully.

Ex No: 5 (a) AVERAGE OF LIST USING LOOP
Date: 25-10-2021

AIM
To write a Python program for finding the average of a list using a loop.

ALGORITHM
Step 1: Start
Step 2: Define A Function Cal_Average
Step 3: Sum_Num = Sum_Num + T
Step 4: Avg = Sum_Num / Len(Num)
Step 5: Stop

PROGRAM:
def cal_average(num):
    # Add up every element, then divide by the number of elements
    sum_num = 0
    for t in num:
        sum_num = sum_num + t
    avg = sum_num / len(num)
    return avg

print("The average is", cal_average([18, 25, 3, 41, 5]))

OUTPUT:
The average is 18.4

RESULT
Thus the Python program for finding the average of a list using a loop has been implemented and executed
successfully.

Ex No: 5 (b) AVERAGE OF LIST USING BUILT IN FUNCTIONS
Date: 25-10-2021

AIM
To write a python program to find the average of list using built in functions.

ALGORITHM
STEP 1: Start
STEP 2: define a list
STEP 3: avg = sum(number_list)/len(number_list)
STEP 4: print avg
STEP 5: Stop

PROGRAM
number_list = [45, 34, 10, 36, 12, 6, 80]
avg = sum(number_list)/len(number_list)
print("The average is ", round(avg,2))
OUTPUT:
The average is 31.86

RESULT
Thus the Python program for finding the average of a list using built-in functions has been implemented and
executed successfully.

Ex No: 5 (c) AVERAGE OF LIST USING MEAN FUNCTION
Date: 06-10-2021

AIM
To write a python program to find the average of list using mean function.

ALGORITHM
Step 1: Start
Step 2: Define A List
Step 3: Import Mean From Statistics
Step 4: Avg = Mean(Number_List)
Step 5: Print Avg
Step 6: Stop

PROGRAM
from statistics import mean
number_list = [45, 34, 10, 36, 12, 6, 80]
avg = mean(number_list)
print("The average is ", round(avg,2))
OUTPUT:
The average is 31.86

RESULT
Thus the Python program for the average of a list using the mean function has been implemented and executed
successfully.

Ex No: 5 (d) AVERAGE OF LIST USING NUMPY LIBRARY
Date: 09-10-2021

AIM
To write a python program to find the average of list using numpy library.

ALGORITHM
Step 1: Start
Step 2: Import Mean From Numpy
Step 3: Define A List
Step 4: Avg = Mean(Number_List)
Step 5: Print Avg
Step 6: Stop

PROGRAM
from numpy import mean
number_list = [45, 34, 10, 36, 12, 6, 80]
avg = mean(number_list)
print ("The average is ", round(avg,2))
OUTPUT:
The average is 31.86

RESULT
Thus the python program average of list using numpy library has been implemented and executed
successfully.

Ex No: 6 (a) VARIANCE OF SAMPLE SET.
Date: 13-10-2021

AIM
To write a python program to show variance of sample set.

ALGORITHM
Step 1: Start
Step 2: Import Statistics
Step 3: Define A List
Step 4: Print Statistics.Variance(Sample)
Step 5: Stop

PROGRAM
import statistics
sample = [2.74, 1.23, 2.63, 2.22, 3, 1.98]
print("Variance of sample set is %s" % statistics.variance(sample))

OUTPUT :
Variance of sample set is 0.40924

RESULT
Thus the python program to show variance of sample set has been implemented and executed
successfully.

Ex No: 6 (b) VARIANCE ON A RANGE OF DATA-TYPES
Date: 21-10-2021

AIM
To write a python program to show variance on a range of data-types.
ALGORITHM
Step 1: Start
Step 2: Import All Necessary Libraries
Step 3: Define Samples
Step 4: Print Variance Of Sample
Step 5: Stop

PROGRAM
from statistics import variance
from fractions import Fraction as fr
sample1 = (1, 2, 5, 4, 8, 9, 12)
sample2 = (-2, -4, -3, -1, -5, -6)
sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)
sample4 = (fr(1, 2), fr(2, 3), fr(3, 4),fr(5, 6), fr(7, 8))
sample5 = (1.23, 1.45, 2.1, 2.2, 1.9)
print("Variance of Sample1 is ",variance(sample1))
print("Variance of Sample2 is ",variance(sample2))
print("Variance of Sample3 is ",variance(sample3))
print("Variance of Sample4 is ", variance(sample4))
print("Variance of Sample5 is ",variance(sample5))

OUTPUT
Variance of Sample1 is 15.80952380952381
Variance of Sample2 is 3.5
Variance of Sample3 is 61.125
Variance of Sample4 is 1/45
Variance of Sample5 is 0.17613000000000006

RESULT
Thus the python program to show variance on a range of data-types has been implemented and
executed successfully.

Ex No: 6 (c) STATISTICS
Date: 23-10-2021

AIM
To write a python program to show statistics.

ALGORITHM
Step 1: Start
Step 2: Import Statistics
Step 3: Define A List
Step 4: M=Statistics.Mean(Sample)
Step 5: Stop

PROGRAM
import statistics
sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)
# Passing the precomputed mean as xbar avoids recalculating it
m = statistics.mean(sample)
print("Variance of Sample set is ", statistics.variance(sample, xbar=m))

OUTPUT
Variance of Sample set is 0.3656666666666667

RESULT
Thus the python program to show statistics has been implemented and executed successfully.

Ex No: 7 CREATE NORMAL CURVE
Date: 23-10-2021

AIM

To write a python program to create a normal curve.

ALGORITHM

STEP 1: Start

STEP 2: import all necessary packages

STEP 3: create distribution

STEP 4: visualize the distribution

STEP 5: Stop

PROGRAM

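from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb

# Values from 1 to 10 in steps of 0.01
data = np.arange(1, 10, 0.01)

# Normal probability density with mean 5.3 and standard deviation 1
pdf = norm.pdf(data, loc=5.3, scale=1)

# Visualize the distribution (x and y are passed as keywords)
sb.set_style('whitegrid')
sb.lineplot(x=data, y=pdf, color='black')
plt.xlabel('Heights')
plt.ylabel('Probability Density')

OUTPUT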

Text(0, 0.5, 'Probability Density')

RESULT
Thus the python program to create a normal curve has been implemented and executed
successfully.

Ex No: 8 CORRELATION AND SCATTER PLOTS
Date: 30-10-2021

CORRELATION:
Correlation means an association. It is a measure of the extent to which two variables are related.

AIM:
To write a Python program for correlation and scatter plots.

ALGORITHM:
Step 1: Importing the libraries.
Step 2: Finding the Correlation between two variables.
Step 3: Plotting the graph. Here we are using scatter plots. A scatter plot is a diagram where each
value in the data set is represented by a dot. Also, it shows a relationship between two variables.

PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

y = pd.Series([1, 2, 3, 4, 3, 5, 4])
x = pd.Series([1, 2, 3, 4, 5, 6, 7])

# Pearson correlation between the two series
correlation = y.corr(x)
print(correlation)

plt.scatter(x, y)

# This will fit the best line into the graph
plt.plot(np.unique(x), np.poly1d(np.polyfit(x, y, 1))(np.unique(x)), color='red')
plt.show()

OUTPUT:

RESULT:
Thus the Python program for correlation and scatter plots has been implemented and executed
successfully.

SCATTER PLOT:
A scatter plot is a graph of two sets of data along the two axes. It is used to visualize the
relationship between the two variables.

In Python's matplotlib, a scatter plot can be created using pyplot.plot() or pyplot.scatter().
Using these functions, you can add more features to your scatter plot, like changing the size, color or
shape of the points.

i) SIMPLE SCATTER PLOT

AIM:
To write a Python program for a simple scatter plot.

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt

# Figure settings are applied before the figure is created
plt.rcParams.update({'figure.figsize': (10, 8), 'figure.dpi': 100})

x = range(50)
y = np.arange(50) + np.random.randint(0, 30, 50)

plt.scatter(x, y)
plt.title('Simple Scatter plot')
plt.xlabel('X - value')
plt.ylabel('Y - value')
plt.show()

OUTPUT:

RESULT
Thus the python program for simple scatter Plot has been implemented and executed successfully.

ii) SIMPLE SCATTERPLOT WITH COLORED POINTS

AIM:
To write a Python program for a simple scatterplot with colored points.

PROGRAM:
import numpy as np
import matplotlib.pyplot as plt

x = range(50)
y = np.arange(50) + np.random.randint(0, 30, 50)

plt.rcParams.update({'figure.figsize': (10, 8), 'figure.dpi': 100})

# Color each point by its y value using the 'Spectral' colormap
plt.scatter(x, y, c=y, cmap='Spectral')
plt.colorbar()
plt.title('Simple Scatter plot')
plt.xlabel('X - value')
plt.ylabel('Y - value')
plt.show()

OUTPUT:

RESULT:
Thus the Python program for a simple scatterplot with colored points has been implemented and
executed successfully.

9. CORRELATION COEFFICIENT
Variables within a dataset can be related for many reasons.
For example:
One variable could cause or depend on the values of another variable.
One variable could be lightly associated with another variable.
Two variables could depend on a third unknown variable.

It can be useful in data analysis and modelling to better understand the relationships between
variables. The statistical relationship between two variables is referred to as their correlation.

A correlation could be positive, meaning both variables move in the same direction, or negative,
meaning the variables move in opposite directions. A correlation can also be neutral or zero,
meaning that the variables are unrelated.

Positive Correlation: both variables change in the same direction.
Neutral Correlation: no relationship in the change of the variables.
Negative Correlation: variables change in opposite directions.
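
The three cases can be sketched with np.corrcoef(). The data below are constructed assumptions chosen to make each case obvious. The "neutral" series is a symmetric pattern with no overall upward or downward trend, so its linear correlation with x is zero even though it is computed from x.

```python
import numpy as np

x = np.arange(10, dtype=float)

same_direction = 2 * x + 1            # rises when x rises
opposite_direction = -3 * x + 5       # falls when x rises
no_trend = (x - x.mean()) ** 2        # symmetric about the mean of x

r_positive = np.corrcoef(x, same_direction)[0, 1]
r_negative = np.corrcoef(x, opposite_direction)[0, 1]
r_neutral = np.corrcoef(x, no_trend)[0, 1]

print(r_positive, r_negative, r_neutral)
```

The first two coefficients are close to +1 and -1 respectively, and the third is close to 0.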

Ex No: 9 (a) NUMPY CORRELATION CALCULATION
Date: 30-10-2021

AIM:
To write a program to calculate the correlation coefficient.

ALGORITHM:
STEP 1: Import the numpy packages.
STEP 2: Define two NumPy arrays. Call them x and y
STEP 3: Call np.corrcoef() with both arrays as arguments
STEP 4: corrcoef() returns the correlation matrix, which is a two-dimensional array with the
correlation coefficients.

PROGRAM:
import numpy as np
x = np.arange(10, 20)
y = np.array([2, 1, 4, 5, 8, 12, 18, 25, 96, 48])
r = np.corrcoef(x, y)
print(r)

OUTPUT:

RESULT:
Thus the Python program to calculate the correlation coefficient has been implemented and executed
successfully.

Ex No: 9 (b) CORRELATION
Date: 30-10-2021

The Pearson correlation coefficient can be used to summarize the strength of the linear relationship
between two data samples. The Pearson correlation coefficient is calculated as the covariance of the
two variables divided by the product of the standard deviation of each data sample. It is the
normalization of the covariance between the two variables to give an interpretable score.

Pearson's correlation coefficient = covariance(X, Y) / (stdv(X) * stdv(Y))
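
The formula can be checked directly with NumPy. This is a minimal sketch; the sample values and the names r_manual and r_numpy are assumptions chosen for illustration. The sample covariance and sample standard deviations (ddof=1) are used consistently.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 3.0, 5.0, 4.0])

# covariance(X, Y) / (stdv(X) * stdv(Y)), all with ddof=1
covariance = np.cov(x, y, ddof=1)[0, 1]
r_manual = covariance / (np.std(x, ddof=1) * np.std(y, ddof=1))

# Cross-check against NumPy's built-in correlation matrix
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_manual, r_numpy)
```

Both values agree up to floating-point rounding, since np.corrcoef() applies the same normalization.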

AIM:
To write a program to calculate the Pearson correlation coefficient between two variables.

ALGORITHM:
Step 1: Import The Needed Packages.
Step 2: Provide The Data.
Step 3: The pearsonr() SciPy Function Calculates The Pearson Correlation
Coefficient Between Two Data Samples With The Same Length.
Step 4: Display The Correlation Coefficient.

PROGRAM:
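from numpy.random import randn
from numpy.random import seed
from scipy.stats import pearsonr

# Seed the random number generator so the result is repeatable
seed(1)

# Two related samples: data2 is data1 plus Gaussian noise
data1 = 20 * randn(1000) + 100
data2 = data1 + (10 * randn(1000) + 50)

# pearsonr returns the coefficient and a p-value; only the coefficient is used
corr, _ = pearsonr(data1, data2)
print('Pearsons correlation:', corr)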

OUTPUT:
Pearsons correlation: 0.887611908579531

RESULT:
Thus the python program to calculate the Pearson correlation coefficient between two variables has
been implemented and executed successfully.

10. REGRESSION

Ex No: 10 (a) SIMPLE LINEAR REGRESSION WITH SCIKIT-LEARN
Date: 13-11-2021

AIM:
To write a program for simple linear regression with scikit-learn.

ALGORITHM:
Step 1: Import The Packages And Classes.
Step 2: Provide Data To Work With And Eventually Do Appropriate Transformations.
Step 3: Create A Regression Model And Fit It With Existing Data.
Step 4: Check The Results Of Model Fitting To Know Whether The Model Is Satisfactory.
Step 5: Apply The Model For Predictions.

PROGRAM:
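import numpy as np
from sklearn.linear_model import LinearRegression

# The input must be two-dimensional: one column, one row per observation
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

# Create the model and fit it to the data
model = LinearRegression().fit(x, y)

# Coefficient of determination (R squared)
r_sq = model.score(x, y)
print('coefficient of determination:', r_sq)

# Predict the response for the same inputs
y_pred = model.predict(x)
print('predicted response:', y_pred)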

OUTPUT:
coefficient of determination: 0.715875613747954
predicted response: [ 8.33333333 13.73333333 19.13333333 24.53333333 29.93333333
35.33333333]

RESULT:
Thus the python program simple linear regression with scikit-learn has been implemented and
executed successfully.

Ex No: 10 (b) MULTIPLE LINEAR REGRESSION WITH SCIKIT-LEARN
Date: 13-11-2021

AIM:
To write a program for multiple linear regression with scikit-learn.

ALGORITHM:
Step 1:Import Packages And Classes
Step 2: Provide Data
Step 3:Create A Model And Fit It
Step 4: Get Results
Step 5: Predict Response

PROGRAM:
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row of x holds two input features for one observation
x = [[0, 1], [5, 1], [15, 2], [25, 5], [35, 11], [45, 15], [55, 34], [60, 35]]
y = [4, 5, 20, 14, 32, 22, 38, 43]

x, y = np.array(x), np.array(y)
model = LinearRegression().fit(x, y)

r_sq = model.score(x, y)
print('coefficient of determination:', r_sq)

print('intercept:', model.intercept_)
print('slope:', model.coef_)

y_pred = model.predict(x)
print('predicted response:', y_pred)

OUTPUT:
coefficient of determination: 0.8615939258756775
intercept: 5.52257927519819
slope: [0.44706965 0.25502548]
predicted response: [ 5.77760476 8.012953 12.73867497 17.9744479 23.97529728 29.4660957
38.78227633 41.27265006]

RESULT:
Thus the python program multiple linear regression with scikit-learn has been implemented and
executed successfully.
