Lab Manual Data Science

Uploaded by

saisateeshwar

LIST OF EXPERIMENTS

NAME OF THE EXPERIMENT
1. Working with Numpy arrays
2. Working with Pandas data frames
3. Basic plots using Matplotlib
4. Frequency distributions
5. Averages
6. Variability
7. Normal curves
8. Correlation and scatter plots
9. Correlation coefficient
10. Regression

CONTENTS

Sl. No. | Name of the Experiment | Page No. | Marks (100) | Staff Signature
1 | Working with Numpy arrays | | |
2 | Working with Pandas data frames | | |
3 | Develop python program for Basic plots using Matplotlib | | |
4 | Develop python program for Frequency distributions | | |
5 | Develop python program for Variability | | |
6 | Develop python program for Averages | | |
7 | Develop python program for Normal Curves | | |
8 | Develop python program for Correlation and scatter plots | | |
9 | Develop python program for Correlation coefficient | | |
10 | Develop python program for Simple Linear Regression | | |

Ex. No.: 1
Working with Numpy arrays

AIM
To work with Numpy arrays.

ALGORITHM
Step 1: Start
Step 2: Import numpy module
Step 3: Print the basic characteristics and operations of array
Step 4: Stop

PROGRAM
    import numpy as np
    # Creating array object
    arr = np.array([[1, 2, 3], [4, 2, 5]])
    # Printing type of arr object
    print("Array is of type: ", type(arr))
    # Printing array dimensions (axes)
    print("No. of dimensions: ", arr.ndim)
    # Printing shape of array
    print("Shape of array: ", arr.shape)
    # Printing size (total number of elements) of array
    print("Size of array: ", arr.size)
    # Printing type of elements in array
    print("Array stores elements of type: ", arr.dtype)

OUTPUT
    Array is of type:  <class 'numpy.ndarray'>
    No. of dimensions:  2
    Shape of array:  (2, 3)
    Size of array:  6
    Array stores elements of type:  int32

Program to Perform Array Slicing
    import numpy as np
    a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
    print(a)
    print("After slicing")
    # all rows from index 1 onwards
    print(a[1:])

Output
    [[1 2 3]
     [3 4 5]
     [4 5 6]]
    After slicing
    [[3 4 5]
     [4 5 6]]

Program to Perform Array Slicing
    # array to begin with
    import numpy as np
    a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
    print('Our array is:')
    print(a)
    # this returns array of items in the second column
    print('The items in the second column are:')
    print(a[..., 1])
    print('\n')
    # Now we will slice all items from the second row
    print('The items in the second row are:')
    print(a[1, ...])
    print('\n')
    # Now we will slice all items from column 1 onwards
    print('The items column 1 onwards are:')
    print(a[..., 1:])

Output:
    Our array is:
    [[1 2 3]
     [3 4 5]
     [4 5 6]]
    The items in the second column are:
    [2 4 5]


    The items in the second row are:
    [3 4 5]


    The items column 1 onwards are:
    [[2 3]
     [4 5]
     [5 6]]

Result:
Thus the working with Numpy arrays was successfully completed.

Ex. No.: 2
Create a dataframe using a list of elements

Aim:
To work with Pandas data frames.

ALGORITHM
Step 1: Start
Step 2: Import numpy and pandas module
Step 3: Create a dataframe using the dictionary
Step 4: Print the output
Step 5: Stop

PROGRAM
    import numpy as np
    import pandas as pd
    # First row holds the column labels, first column holds the row labels
    data = np.array([['', 'Col1', 'Col2'], ['Row1', 1, 2], ['Row2', 3, 4]])
    print(pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:]))
    # Take a 2D array as input to your DataFrame
    my_2darray = np.array([[1, 2, 3], [4, 5, 6]])
    print(pd.DataFrame(my_2darray))
    # Take a dictionary as input to your DataFrame
    my_dict = {1: ['1', '3'], 2: ['1', '2'], 3: ['2', '4']}
    print(pd.DataFrame(my_dict))
    # Take a DataFrame as input to your DataFrame
    my_df = pd.DataFrame(data=[4, 5, 6, 7], index=range(0, 4), columns=['A'])
    print(pd.DataFrame(my_df))
    # Take a Series as input to your DataFrame
    my_series = pd.Series({"United Kingdom": "London", "India": "New Delhi", "United States": "Washington", "Belgium": "Brussels"})
    print(pd.DataFrame(my_series))
    df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6]]))
    # Use the `shape` property
    print(df.shape)
    # Or use the `len()` function with the `index` property
    print(len(df.index))

Output:
         Col1 Col2
    Row1    1    2
    Row2    3    4
       0  1  2
    0  1  2  3
    1  4  5  6
       1  2  3
    0  1  1  2
    1  3  2  4
       A
    0  4
    1  5
    2  6
    3  7
                             0
    United Kingdom      London
    India            New Delhi
    United States   Washington
    Belgium           Brussels
    (2, 3)
    2

Result:
Thus the working with Pandas data frames was successfully completed.

Ex. No.: 3
Basic plots using Matplotlib

Aim:
To draw basic plots in a Python program using Matplotlib.

ALGORITHM
Step 1: Start
Step 2: Import Matplotlib module
Step 3: Create basic plots using Matplotlib
Step 4: Print the output
Step 5: Stop

Program 3a
    # importing the required module
    import matplotlib.pyplot as plt
    # x axis values
    x = [1, 2, 3]
    # corresponding y axis values
    y = [2, 4, 1]
    # plotting the points
    plt.plot(x, y)
    # naming the x axis
    plt.xlabel('x - axis')
    # naming the y axis
    plt.ylabel('y - axis')
    # giving a title to my graph
    plt.title('My first graph!')
    # function to show the plot
    plt.show()

Output:
[Figure: line plot titled "My first graph!", with axes labelled "x - axis" and "y - axis"]
Program 3b
    import matplotlib.pyplot as plt
    a = [1, 2, 3, 4, 5]
    b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]
    plt.plot(a)
    # o is for circles and r is for red
    plt.plot(b, "or")
    plt.plot(list(range(0, 22, 3)))
    # naming the x-axis
    plt.xlabel('Day ->')
    # naming the y-axis
    plt.ylabel('Temp ->')
    c = [4, 2, 6, 8, 3, 20, 13, 15]
    plt.plot(c, label='4th Rep')
    # get current axes command
    ax = plt.gca()
    # get command over the individual boundary lines of the graph body
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    # set the range or the bounds of the left boundary line to a fixed range
    ax.spines['left'].set_bounds(-3, 40)
    # set the interval by which the x-axis sets the marks
    plt.xticks(list(range(-3, 10)))
    # set the interval by which the y-axis sets the marks
    plt.yticks(list(range(-3, 20, 3)))
    # legend denotes which color signifies what
    ax.legend(['1st Rep', '2nd Rep', '3rd Rep', '4th Rep'])
    # annotate command helps to write text ON THE GRAPH;
    # xy denotes the position on the graph
    plt.annotate('Temperature V/s Days', xy=(1.01, -2.15))
    # gives a title to the graph
    plt.title('All Features Discussed')
    plt.show()

Output:
[Figure: four series plotted on one set of axes, titled "All Features Discussed", with axes labelled "Day ->" and "Temp ->"]

Program 3c
    import matplotlib.pyplot as plt
    a = [1, 2, 3, 4, 5]
    b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]
    c = [4, 2, 6, 8, 3, 20, 13, 15]
    # use fig whenever you want the output in a new window;
    # also specify the window size you want to be displayed
    fig = plt.figure(figsize=(10, 10))
    # creating multiple plots in a single figure
    sub1 = plt.subplot(2, 2, 1)
    sub2 = plt.subplot(2, 2, 2)
    sub3 = plt.subplot(2, 2, 3)
    sub4 = plt.subplot(2, 2, 4)
    sub1.plot(a, 'sb')
    # x axis values of the subplot advance by 1 within the specified range
    sub1.set_xticks(list(range(0, 10, 1)))
    sub1.set_title('1st Rep')
    sub2.plot(b, 'or')
    # x axis values of the subplot advance by 2 within the specified range
    sub2.set_xticks(list(range(0, 10, 2)))
    sub2.set_title('2nd Rep')
    # a list can be passed directly to the plot function instead of a reference
    sub3.plot(list(range(0, 22, 3)), 'vg')
    sub3.set_xticks(list(range(0, 10, 1)))
    sub3.set_title('3rd Rep')
    sub4.plot(c, 'Dm')
    # similarly we can set the ticks for the y-axis:
    # range(start (inclusive), end (exclusive), step)
    sub4.set_yticks(list(range(0, 24, 2)))
    sub4.set_title('4th Rep')
    # without calling plt.show() no plot will be visible
    plt.show()

Output:
[Figure: 2 x 2 grid of subplots titled "1st Rep", "2nd Rep", "3rd Rep" and "4th Rep"]

Result:
Thus the basic plots using Matplotlib in Python program was successfully completed.

Ex. No.: 4
Frequency distributions

Aim:
To count the frequency of occurrence of a word in a body of text, which is often needed during text processing.

ALGORITHM
Step 1: Start the Program
Step 2: Create text file blake-poems.txt
Step 3: Import the word_tokenize function and gutenberg
Step 4: Write the code to count the frequency of occurrence of a word in a body of text
Step 5: Print the result
Step 6: Stop the process

Program:
    # Requires the NLTK 'gutenberg' corpus and tokenizer data (see nltk.download)
    from nltk.tokenize import word_tokenize
    from nltk.corpus import gutenberg
    sample = gutenberg.raw("blake-poems.txt")
    token = word_tokenize(sample)
    # keep only the first 50 tokens
    wlist = []
    for i in range(50):
        wlist.append(token[i])
    # count how often each token occurs within those 50 tokens
    wordfreq = [wlist.count(w) for w in wlist]
    # zip() is lazy in Python 3, so build a list before printing
    print("Pairs\n" + str(list(zip(token, wordfreq))))

Output:
    Pairs
    [('[', 1), ('Poems', 1), ('by', 1), ('William', 1), ('Blake', 1), ('1789', 1), (']', 1), ('SONGS', 2), ('OF', 3), ('INNOCENCE', 2), ('AND', 1), ('OF', 3), ('EXPERIENCE', 1), ('and', 1), ('THE', 1), ('BOOK', 1), ('of', 2), ('THEL', 1), ('SONGS', 2), ('OF', 3), ('INNOCENCE', 2), ('INTRODUCTION', 1), ('Piping', 2), ('down', 1), ('the', 1), ('valleys', 1), ('wild', 1), (',', 3), ('Piping', 2), ('songs', 1), ('of', 2), ('pleasant', 1), ('glee', 1), (',', 3), ('On', 1), ('a', 2), ('cloud', 1), ('I', 1), ('saw', 1), ('a', 2), ('child', 1), (',', 3), ('And', 1), ('he', 1), ('laughing', 1), ('said', 1), ('to', 1), ('me', 1), (':', 1), ...]

Result:
Thus the frequency distribution program to count the frequency of occurrence of a word in a body of text using Python was successfully completed.

Averages

Aim:
To compute weighted averages in Python, either by defining your own functions or by using Numpy.

ALGORITHM
Step 1: Start the Program
Step 2: Create the employees_salary table and save as .csv file
Step 3: Import packages (pandas and numpy) and the employees_salary table itself
Step 4: Calculate weighted sum and average using Numpy Average() Function
Step 5: Stop the process

Program 6c
    # Method Using Numpy Average() Function
    import pandas as pd
    from numpy import average
    df = pd.read_csv('employees_salary.csv')  # file name assumed from Step 2
    weighted_avg_m3 = round(average(df['salary_p_year'], weights=df['employees_number']), 2)
    print(weighted_avg_m3)

Output:
    44225. 5

Result:
Thus the computation of weighted averages in Python, either by defining your own functions or by using Numpy, was successfully completed.

Variability

Aim:
To write a Python program to calculate the variance.

ALGORITHM
Step 1: Start the Program
Step 2: Import statistics module: from statistics import variance
Step 3: Import fractions as parameter values: from fractions import Fraction as fr
Step 4: Create tuple of a set of positive and negative numbers
Step 5: Print the variance of each sample
Step 6: Stop the process

Program:
    # Python code to demonstrate variance()
    # function on varying range of data-types
    # importing statistics module
    from statistics import variance
    # importing fractions as parameter values
    from fractions import Fraction as fr
    # tuple of a set of positive integers
    # numbers are spread apart but not very much
    sample1 = (1, 2, 5, 4, 8, 9, 12)
    # tuple of a set of negative integers
    sample2 = (-2, -4, -3, -1, -5, -6)
    # tuple of a set of positive and negative numbers
    # data-points are spread apart considerably
    sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)
    # tuple of a set of fractional numbers
    sample4 = (fr(1, 2), fr(2, 3), fr(3, 4), fr(5, 6), fr(7, 8))
    # tuple of a set of floating point values
    sample5 = (1.23, 1.45, 2.1, 2.2, 1.9)
    # Print the (sample) variance of each sample
    print("Variance of Sample1 is %s " % (variance(sample1)))
    print("Variance of Sample2 is %s " % (variance(sample2)))
    print("Variance of Sample3 is %s " % (variance(sample3)))
    print("Variance of Sample4 is %s " % (variance(sample4)))
    print("Variance of Sample5 is %s " % (variance(sample5)))

Output:
    Variance of Sample1 is 15.80952380952381
    Variance of Sample2 is 3.5
    Variance of Sample3 is 61.125
    Variance of Sample4 is 1/45
    Variance of Sample5 is 0.17613000000000006

Result:
Thus the computation for variance was successfully completed.

Ex. No.: 7
Normal Curve

Aim:
To create a normal curve using a Python program.

ALGORITHM
Step 1: Start the Program
Step 2: Import package scipy and the norm function from scipy.stats
Step 3: Import packages numpy, matplotlib and seaborn
Step 4: Create the distribution
Step 5: Visualize the distribution
Step 6: Stop the process

Program:
    # import required libraries
    from scipy.stats import norm
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sb
    # Creating the distribution: normal density with mean 5.3 and standard deviation 1
    data = np.arange(1, 10, 0.01)
    pdf = norm.pdf(data, loc=5.3, scale=1)
    # Visualizing the distribution
    sb.set_style('whitegrid')
    # recent seaborn releases require x and y as keyword arguments
    sb.lineplot(x=data, y=pdf, color='black')
    plt.xlabel('Heights')
    plt.ylabel('Probability Density')
    plt.show()

Output:
[Figure: normal curve, with axes labelled "Heights" and "Probability Density"]

Result:
Thus the normal curve using Python program was successfully completed.

Correlation and scatter plots

Aim:
To write a Python program for correlation with scatter plot.

ALGORITHM
Step 1: Start the Program
Step 2: Create variable y1, y2
Step 3: Create variable x, y3 using random function
Step 4: Plot the scatter plot
Step 5: Print the result
Step 6: Stop the process

Program:
    # Scatterplot and Correlations
    import numpy as np
    import matplotlib.pyplot as plt
    # Data
    x = np.random.randn(100)
    y1 = x * 5 + 9
    y2 = -5 * x
    y3 = np.random.randn(100)
    # Plot
    plt.rcParams.update({'figure.figsize': (10, 8), 'figure.dpi': 100})
    plt.scatter(x, y1, label=f'y1, Correlation = {np.round(np.corrcoef(x, y1)[0, 1], 2)}')
    plt.scatter(x, y2, label=f'y2 Correlation = {np.round(np.corrcoef(x, y2)[0, 1], 2)}')
    plt.scatter(x, y3, label=f'y3 Correlation = {np.round(np.corrcoef(x, y3)[0, 1], 2)}')
    plt.title('Scatterplot and Correlations')
    plt.legend()
    plt.show()

Output:
[Figure: scatter plot titled "Scatterplot and Correlations"]

Result:
Thus the Correlation and scatter plots using Python program was successfully completed.

Correlation coefficient

Aim:
To write a Python program to compute correlation coefficient.

ALGORITHM
Step 1: Start the Program
Step 2: Import math package
Step 3: Define correlation coefficient function
Step 4: Calculate correlation using formula
Step 5: Print the result
Step 6: Stop the process

Program:
    # Python Program to find correlation coefficient.
    import math
    # function that returns correlation coefficient.
    def correlationCoefficient(X, Y, n):
        sum_X = 0
        sum_Y = 0
        sum_XY = 0
        squareSum_X = 0
        squareSum_Y = 0
        i = 0
        while i
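
The correlation coefficient listing breaks off at the loop in this copy. As a rough, self-contained sketch of Step 4, assuming the intended formula is the usual Pearson one, r = (n*sum(xy) - sum(x)*sum(y)) / sqrt((n*sum(x^2) - sum(x)^2) * (n*sum(y^2) - sum(y)^2)), the running sums named in the listing could be combined as follows. The function name `pearson_r` and the sample data are illustrative only and do not come from the manual.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    # Running sums, mirroring sum_X, sum_Y, sum_XY, squareSum_X, squareSum_Y
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    square_sum_x = sum(x * x for x in xs)
    square_sum_y = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * square_sum_x - sum_x ** 2) * (n * square_sum_y - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined when either input is constant")
    return numerator / denominator

# Made-up sample data, chosen only to exercise the function
X = [15, 18, 21, 24, 27]
Y = [25, 25, 27, 31, 32]
print(round(pearson_r(X, Y), 6))
```

For data that lie exactly on a rising straight line, such as xs = [1, 2, 3, 4] and ys = [3, 5, 7, 9], the function returns 1.0; np.corrcoef(X, Y)[0, 1] from the scatter plot experiment should give the same value and can serve as a cross-check.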
