LIST OF EXPERIMENTS

Ex. No    Name of the Experiment
1         Download, install and explore the features of Python for data analytics
2         Working with Numpy arrays
3         Working with Pandas data frames
4         Basic plots using Matplotlib
5.A       Frequency distributions - Iris data set
5.B       Mean, Mode, Standard Deviation
5.C       Variability
5.D       Normal curves - Iris data set
5.E       Correlation and scatter plots - Iris data set
5.F       Correlation coefficient - Iris data set
5.G       Regression - Linear Regression
6.A       Univariate Analysis - Pima Indians Diabetes for Diabetic Patients
6.B.1     Bivariate Analysis - Linear Regression - Pima Indians Diabetes for Diabetic Patients
6.B.2     Bivariate Analysis - Logistic Regression - Pima Indians Diabetes
7.1       Supervised Learning Algorithms - Linear Regression - Pima Indians Diabetes for Non-Diabetic Patients
7.2       Unsupervised Learning Algorithms
8         Various plotting functions


Ex. No: 1    DOWNLOAD, INSTALL AND EXPLORE THE FEATURES OF PYTHON FOR DATA ANALYTICS

AIM:
To download, install and explore the features of the Python, Numpy and Pandas packages.

DESCRIPTION:
1. Download Python (if it is not pre-installed) and the Numpy and Pandas packages from the web.
2. Check the Python version before installing packages, because most systems already have Python installed. For example, run the command python --version in the command prompt.
3. Install the packages on Windows through pip, the package manager for installing and managing Python software packages. For example, run pip install numpy in the command prompt.

FEATURES:
1. PYTHON
Python is a portable, cross-platform language.
Python is a high-level, interpreted language.
Python is a dynamically typed language.

2. NUMPY
Numpy stands for Numerical Python.
Numpy is one of the most commonly used packages for scientific computing in Python.
It provides a high-performance N-dimensional array object.
It contains tools for integrating code from C/C++ and Fortran.
It can be used as a multidimensional container for generic data.
It has linear algebra, Fourier transform and random number capabilities.
It provides broadcasting functions.

3. PANDAS
The name Pandas derives from "panel data" and is also read as "Python Data Analysis Library".
Fast and efficient DataFrame object with default and customized indexing.
Tools for loading data into in-memory data objects from different file formats.
Data alignment and integrated handling of missing data.
Reshaping and pivoting of data sets.

RESULT:
Thus the above packages were downloaded and installed, and their features were studied.


Ex. No: 2    WORKING WITH NUMPY ARRAYS

AIM:
To write a python program using numpy arrays.

ALGORITHM:
1. Import the numpy package.
2. Create an array of type float from a list using array() and print it.
3. Create an array from a tuple using array() and print it.
4. Create a 3*4 array of zeros using zeros() and print it.
5. Create a constant-value array of complex type using full() and print it.
6. Create a sequence of integers from 0 to 30 in steps of 5 using arange() and print it.
7. Reshape a 3*4 array to a 2*2*3 array using reshape() and print it.
8. Flatten an array using flatten() and print it.
9. Join two arrays using concatenate() and print the result.
10. Split an array using array_split() and print the result.
11. Sort a given array using sort() and print the result.
12. Search for a given key value in an array using where() and print the positions.

PROGRAM:
import numpy as np

#1.Creating array from list with type float
a=np.array([[1,2,4],[5,8,7]],dtype='float')
print("\n 1.Array created using passed list:\n",a)

#2.Creating array from tuple
b=np.array((1,3,2))
print("\n 2.Array created using passed tuple:\n",b)

#3.Creating a 3*4 array with all zeros
c=np.zeros((3,4))
print("\n 3.An array initialized with all zeros:\n",c)

#4.Create a constant value array of complex type
d=np.full((3,3),6,dtype='complex')
print("\n 4.An array initialized with all 6's with type complex:\n",d)

#5.Create a sequence of integers from 0 to 30 with steps of 5 (30 itself is excluded)
e=np.arange(0,30,5)
print("\n 5.A sequence array with steps of 5:\n",e)

#6.Reshaping 3*4 array to 2*2*3 array
f=np.array([[1,2,3,4],[5,2,4,2],[1,2,0,1]])
fnew=f.reshape(2,2,3)
print("\n 6.Old array\n",f)
print("\n New array\n",fnew)

#7.Flatten array
g=np.array([[1,2,3],[4,5,6]])
gflat=g.flatten()
print("\n 7.Old array\n",g)
print("\n Flatten array\n",gflat)

#8.Join array
arr1=np.array([1,2,3])
arr2=np.array([4,5,6])
arr=np.concatenate((arr1,arr2))
print("\n 8.Joined array\n",arr)

#9.Split
h=np.array([1,2,3,4,5,6])
hnew=np.array_split(h,3)
print("\n 9.Split array\n",hnew)

#10.Sort (a 2D array is sorted along each row)
i=np.array([3,2,0,1])
print("\n 10.Sorted 1D array:\n",np.sort(i))
j=np.array([[3,2,4],[5,0,1]])
print("\n 10.Sorted 2D array:\n",np.sort(j))

#11.Search
k=np.array([1,2,3,4,5,4,4])
knew=np.where(k==4)
print("\n 11.Searched array position:\n",knew)

OUTPUT:
 1.Array created using passed list:
 [[1. 2. 4.]
 [5. 8. 7.]]

 2.Array created using passed tuple:
 [1 3 2]

 3.An array initialized with all zeros:
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

 4.An array initialized with all 6's with type complex:
 [[6.+0.j 6.+0.j 6.+0.j]
 [6.+0.j 6.+0.j 6.+0.j]
 [6.+0.j 6.+0.j 6.+0.j]]

 5.A sequence array with steps of 5:
 [ 0  5 10 15 20 25]

 6.Old array
 [[1 2 3 4]
 [5 2 4 2]
 [1 2 0 1]]

 New array
 [[[1 2 3]
  [4 5 2]]

 [[4 2 1]
  [2 0 1]]]

 7.Old array
 [[1 2 3]
 [4 5 6]]

 Flatten array
 [1 2 3 4 5 6]

 8.Joined array
 [1 2 3 4 5 6]

 9.Split array
 [array([1, 2]), array([3, 4]), array([5, 6])]

 10.Sorted 1D array:
 [0 1 2 3]

 10.Sorted 2D array:
 [[2 3 4]
 [0 1 5]]

 11.Searched array position:
 (array([3, 5, 6], dtype=int64),)

RESULT:
Thus the above program is executed successfully.


Ex. No: 3    WORKING WITH PANDAS DATA FRAMES

AIM:
To write a python program using Pandas data frames.

ALGORITHM:
1. Start the program.
2. Import the pandas package.
3. Create two different dictionaries of employee data.
4. Convert the dictionaries into data frames.
5. Use the concat() method to join the two data frames into a single data frame.
6. Stop the program.

PROGRAM:
import pandas as pd

#Define a dictionary containing employee data
data1={'Name':['Jai','Princi','Gaurav','Anuj'],
       'Age':[27,24,22,32],
       'Address':['Nagpur','Kanpur','Allahabad','Kannuaj'],
       'Qualification':['M.sc','MA','MCA','P.hd']}

#Define a second dictionary containing employee data
data2={'Name':['Abhi','Ayushi','Dhiraj','Hitesh'],
       'Age':[17,14,12,52],
       'Address':['Nagpur','Kanpur','Allahabad','Kannuaj'],
       'Qualification':['B.tech','BA','B.com','B.hons']}

#Convert the first dictionary into a DataFrame with index 0 to 3
df=pd.DataFrame(data1,index=[0,1,2,3])

#Convert the second dictionary into a DataFrame with index 4 to 7
df1=pd.DataFrame(data2,index=[4,5,6,7])
print("DataFrame1\n\n",df,"\n\n","DataFrame2\n\n",df1,"\n\n")

#Using the concat() method to stack the two frames row-wise
frames=[df,df1]
res1=pd.concat(frames)
print("The concatenating Dataframes 1 and 2 \n\n",res1)

OUTPUT:
DataFrame1

     Name  Age    Address  Qualification
0     Jai   27     Nagpur           M.sc
1  Princi   24     Kanpur             MA
2  Gaurav   22  Allahabad            MCA
3    Anuj   32    Kannuaj           P.hd

DataFrame2

     Name  Age    Address  Qualification
4    Abhi   17     Nagpur         B.tech
5  Ayushi   14     Kanpur             BA
6  Dhiraj   12  Allahabad          B.com
7  Hitesh   52    Kannuaj         B.hons

The concatenating Dataframes 1 and 2

     Name  Age    Address  Qualification
0     Jai   27     Nagpur           M.sc
1  Princi   24     Kanpur             MA
2  Gaurav   22  Allahabad            MCA
3    Anuj   32    Kannuaj           P.hd
4    Abhi   17     Nagpur         B.tech
5  Ayushi   14     Kanpur             BA
6  Dhiraj   12  Allahabad          B.com
7  Hitesh   52    Kannuaj         B.hons

RESULT:
Thus the above program is executed successfully.
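
As a small follow-up to the concat() step in Ex. No: 3, the sketch below shows two optional arguments of pd.concat() that matter when the frames being joined share index labels. It is only an illustration: the two-row frames and the keys 'batch1' and 'batch2' are made-up values, not part of the exercise data.

```python
import pandas as pd

# Two small frames that deliberately reuse the index labels 0 and 1
df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Age': [27, 24]}, index=[0, 1])
df1 = pd.DataFrame({'Name': ['Abhi', 'Ayushi'], 'Age': [17, 14]}, index=[0, 1])

# A plain concat keeps the original labels, so 0 and 1 each appear twice
plain = pd.concat([df, df1])

# ignore_index=True discards the old labels and renumbers the rows 0..n-1
renumbered = pd.concat([df, df1], ignore_index=True)

# keys=... adds an outer index level recording which frame each row came from
labelled = pd.concat([df, df1], keys=['batch1', 'batch2'])

print(list(plain.index))
print(list(renumbered.index))
print(labelled.loc['batch2'])
```

In the exercise itself the two frames were given distinct index values (0 to 3 and 4 to 7), which is why the plain concat() call there already produces unique row labels.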
Ex. No: 4    BASIC PLOTS USING MATPLOTLIB

AIM:
To write a python program for drawing a dotted line for the given data.

ALGORITHM:
Step 1: Start the program
Step 2: Import the matplotlib.pyplot and numpy packages
Step 3: Read xpoints and ypoints from the arrays
Step 4: Plot a dotted line through the above xpoints, ypoints
Step 5: Stop the program

PROGRAM:
import matplotlib.pyplot as plt
import numpy as np

xpoints=np.array([1,2,3,4])
ypoints=np.array([3,8,1,10])
plt.plot(xpoints,ypoints,linestyle='dotted')
plt.show()

OUTPUT:
(Plot: a dotted line through the points (1, 3), (2, 8), (3, 1) and (4, 10).)

RESULT:
Thus the above program is executed successfully.


Ex. No: 5.A    FREQUENCY DISTRIBUTION - IRIS DATA SET

AIM:
To write a python program for frequency distribution using the Iris data set.

ALGORITHM:
Step 1: Start the program
Step 2: Import the pandas package
Step 3: Read data from the iris.csv data set
Step 4: Count the frequency of each variety using value_counts() and print it
Step 5: Stop the program

PROGRAM:
import pandas as pd
import numpy as np

data=pd.read_csv('iris.csv')
# Number of rows for each value of the 'variety' column
df=data['variety'].value_counts()
print(df)

Iris data set:
s.no  sepal.length  sepal.width  petal.length  petal.width  variety
1     5.1           3.5          1.4           0.2          Setosa
2     4.9           3            1.4           0.2          Setosa
3     4.7           3.2          1.3           0.2          Setosa
4     4.6           3.1          1.5           0.2          Setosa
5     5             3.6          1.4           0.2          Setosa
6     5.4           3.9          1.7           0.4          Setosa
7     4.6           3.4          1.4           0.3          Setosa
8     5             3.4          1.5           0.2          Setosa
9     4.4           2.9          1.4           0.2          Setosa
10    4.9           3.1          1.5           0.1          Setosa
11    5.4           3.7          1.5           0.2          Setosa
12    4.8           3.4          1.6           0.2          Setosa
13    4.8           3            1.4           0.1          Setosa
14    4.3           3            1.1           0.1          Setosa
15    5.8           4            1.2           0.2          Setosa
...
141   6.7           3.1          5.6           2.4          Virginica
142   6.9           3.1          5.1           2.3          Virginica
143   5.8           2.7          5.1           1.9          Virginica
144   6.8           3.2          5.9           2.3          Virginica
145   6.7           3.3          5.7           2.5          Virginica
146   6.7           3            5.2           2.3          Virginica
147   6.3           2.5          5             1.9          Virginica
148   6.5           3            5.2           2            Virginica
149   6.2           3.4          5.4           2.3          Virginica
150   5.9           3            5.1           1.8          Virginica

OUTPUT:
Setosa        50
Virginica     50
Versicolor    50
Name: variety, dtype: int64

RESULT:
Thus the above program is executed successfully.


Ex. No: 5.B    MEAN, MODE, STANDARD DEVIATION

AIM:
To write a python program for mean, mode and standard deviation using the given data set.

ALGORITHM:
Step 1: Start the program
Step 2: Import the statistics and numpy packages
Step 3: Read data from the list
Step 4: Print the values of mean, mode and standard deviation for the given list
Step 5: Stop the program

PROGRAM:
import statistics as stat
import numpy as np

speed=[99,86,87,88,111,86,103,87,94,78,77,85,86]
print("The mean is")
print(np.mean(speed))
print("The mode is")
print(stat.mode(speed))
print("The standard deviation is")
# np.std() returns the population standard deviation (divides by n)
print(np.std(speed))

OUTPUT:
The mean is
89.76923076923077
The mode is
86
The standard deviation is
9.258292301032677

RESULT:
Thus the above program is executed successfully.


Ex. No: 5.C    VARIABILITY

AIM:
To write a python program for variability using the given data set.

ALGORITHM:
Step 1: Start the program
Step 2: Import the numpy package
Step 3: Read data from the list
Step 4: Print the value of variance for the given list
Step 5: Stop the program

PROGRAM:
import numpy as np

# Named 'data' rather than 'list' so the built-in list type is not shadowed
data=[2,4,4,4,5,5,7,9]
print("variance is:")
# np.var() returns the population variance (divides by n)
print(np.var(data))

OUTPUT:
variance is:
4.0

RESULT:
Thus the above program is executed successfully.


Ex. No: 5.D    NORMAL CURVES - IRIS DATA SET

AIM:
To write a python program for Normal curves using the Iris data set.

ALGORITHM:
Step 1: Start the program
Step 2: Import the pandas and matplotlib.pyplot packages
Step 3: Read data from the iris.csv data set
Step 4: Get x, y from the data set
Step 5: Plot a dotted line curve using the x, y values
Step 6: Stop the program

PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt

df=pd.read_csv("iris.csv")
x=df['sepal.length']
y=df['petal.length']
plt.plot(x,y,linestyle='dotted')
plt.show()

OUTPUT:
(Plot: dotted line of petal.length against sepal.length.)

RESULT:
Thus the above program Normal Curves using Iris Data Set is verified.


Ex. No: 5.E    CORRELATION AND SCATTER PLOT - IRIS DATA SET

AIM:
To write a python program for Correlation and Scatter Plot using the Iris data set.

ALGORITHM:
1. Start the program.
2. Import the pandas and matplotlib.pyplot packages.
3. Read the iris.csv file.
4. Draw the scatter plot for the 3 different varieties of flowers using their sepal length and petal length.
5. Stop the program.

PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt

df=pd.read_csv("iris.csv")

# One subset per variety
df1=df[df.variety=='Setosa']
x1=df1['sepal.length']
y1=df1['petal.length']
df2=df[df.variety=='Versicolor']
x2=df2['sepal.length']
y2=df2['petal.length']
df3=df[df.variety=='Virginica']
x3=df3['sepal.length']
y3=df3['petal.length']

plt.scatter(x1,y1,color='red',marker='o',label='Setosa')
plt.scatter(x2,y2,color='blue',marker='s',label='Versicolor')
plt.scatter(x3,y3,color='green',marker='x',label='Virginica')
plt.title('iris_scatterplot')
plt.xlabel("sepal.length[cm]")
plt.ylabel("petal.length[cm]")
# Show the variety labels given to scatter()
plt.legend()
plt.show()

OUTPUT:
(Plot: scatter plot titled iris_scatterplot, with sepal.length[cm] on the x-axis and petal.length[cm] on the y-axis.)

RESULT:
Thus the above program Scatter Plot using Iris Data Set is verified.


Ex. No: 5.F    CORRELATION COEFFICIENT - IRIS DATA SET

AIM:
To write a python program for Correlation using the Iris data set.

ALGORITHM:
1. Start the program.
2. Import the pandas and matplotlib.pyplot packages.
3. Read the iris.csv file.
4. Calculate the correlation coefficient r using corr(method="pearson") defined in the pandas package.
5. Print the coefficient r for each pair of numeric columns.
6. Stop the program.

PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt

df=pd.read_csv("iris.csv")
# numeric_only=True skips the text column 'variety'
# (the argument exists in pandas 1.5 and later and is needed from pandas 2.0)
x=df.corr(method='pearson',numeric_only=True)
print(x)

OUTPUT:
              sepal.length  sepal.width  petal.length  petal.width
sepal.length      1.000000    -0.117570      0.871754     0.817941
sepal.width      -0.117570     1.000000     -0.428440    -0.366126
petal.length      0.871754    -0.428440      1.000000     0.962865
petal.width       0.817941    -0.366126      0.962865     1.000000

RESULT:
Thus the above program Correlation Coefficient using Iris Data Set is verified.


Ex. No: 5.G    REGRESSION - LINEAR REGRESSION

AIM:
To write a python program for linear regression using the given data.

ALGORITHM:
Step 1: Start the program
Step 2: Import the matplotlib.pyplot and scipy packages
Step 3: Get x, y values from the lists for linregress()
Step 4: Plot x, y values using a scatter plot
Step 5: Plot the regression line by calculating y = slope*x + intercept
Step 6: Stop the program

PROGRAM:
import matplotlib.pyplot as plt
from scipy import stats

x=[5,7,8,7,2,17,2,9,4,11,12,9,6]
y=[99,86,87,88,111,86,103,87,94,78,77,85,86]

# Least-squares fit of y on x
slope,intercept,r,p,std_err=stats.linregress(x,y)

def myfunc(x):
    return slope*x+intercept

# Predicted y for every x, used to draw the regression line
mymodel=list(map(myfunc,x))
plt.scatter(x,y)
plt.plot(x,mymodel)
plt.show()

OUTPUT:
(Plot: scatter of x and y with the fitted regression line.)

RESULT:
Thus the above program is executed successfully.


Ex. No: 6.A    UNIVARIATE ANALYSIS - PIMA INDIANS DIABETES FOR DIABETIC PATIENTS

AIM:
To write a python program for univariate analysis using the Pima Indians Diabetes data set for diabetic patients.

ALGORITHM:
Step 1: Start the program
Step 2: Import the pandas package
Step 3: Read data from the Pima Indians Diabetes csv data set
Step 4: Separate the diabetic patients from the data set
Step 5: Calculate frequency, mean, standard deviation, mode, variance, skewness and kurtosis using the built-in functions value_counts(), mean(), std(), mode(), var(), skew(), kurt() for the independent fields pregnant, glucose, bp, skin, insulin, bmi, pedigree and age in the data set
Step 6: Stop the program

PROGRAM:
import pandas as pd

df=pd.read_csv("pima1.csv")
# Keep only the diabetic patients
df1=df[df.outcome==1]

# Print the same statistics, in the same order, for every field
for col in ['pregnant','glucose','bp','skin','insulin','bmi','pedigree','age']:
    print(df1[col].value_counts())
    print(df1[col].mean())
    print(df1[col].std())
    print(df1[col].mode())
    print(df1[col].var())
    print(df1[col].skew())
    print(df1[col].kurt())

Pima Indians Diabetics.csv
s.no  pregnant  glucose  bp  skin  insulin  bmi   pedigree  age  class
1     6         148      72  35    0        33.6  0.627     50   1
2     1         85       66  29    0        26.6  0.351     31   0
3     8         183      64  0     0        23.3  0.672     32   1
4     1         89       66  23    94       28.1  0.167     21   0
5     0         137      40  35    168      43.1  2.288     33   1
6     5         116      74  0     0        25.6  0.201     30   0
7     3         78       50  32    88       31    0.248     26   1
8     10        115      0   0     0        35.3  0.134     29   0
9     2         197      70  45    543      30.5  0.158     53   1
10    8         125      96  0     0        0     0.232     54   1
11    4         110      92  0     0        37.6  0.191     30   0
12    10        168      74  0     0        38    0.537     34   1
13    10        139      80  0     0        27.1  1.441     57   0
14    1         189      60  23    846      30.1  0.398     59   1
15    5         166      72  19    175      25.8  0.587     51   1
16    7         100      0   0     0        30    0.484     32   1
17    0         118      84  47    230      45.8  0.551     31   1
18    7         107      74  0     0        29.6  0.254     31   1
19    1         103      30  38    83       43.3  0.183     33   0
20    1         115      70  30    96       34.6  0.529     32   1
...
761   2         88       58  26    16       28.4  0.766     22   0
762   9         170      74  31    0        44    0.403     43   1
763   9         89       62  0     0        22.5  0.142     33   0
764   10        101      76  48    180      32.9  0.171     63   0
765   2         122      70  27    0        36.8  0.34      27   0
766   5         121      72  23    112      26.2  0.245     30   0
767   1         126      60  0     0        30.1  0.349     47   1
768   1         93       70  31    0        30.4  0.315     23   0

OUTPUT:
8     2
6     1
0     1
3     1
2     1
10    1
1     1
5     1
Name: pregnant, dtype: int64
4.777777777777778
3.492054473292827
0    8
Name: pregnant, dtype: int64
12.194444444444445
0.07976827393633572
-1.4086477342894645
148    1
183    1
137    1
78     1
197    1
125    1
168    1
189    1
166    1
Name: glucose, dtype: int64
154.55555555555554
37.4069215223303
0     78
1    125
2    137
3    148
4    166
5    168
6    183
7    189
8    197
Name: glucose, dtype: int64
1399.2777777777776
-1.0313853707975722
0.9781004807238077
72    2
64    1
40    1
50    1
70    1
96    1
74    1
60    1
Name: bp, dtype: int64
66.44444444444444
15.898986690282428
0    72
Name: bp, dtype: int64
252.77777777777777
0.13656074039998925
1.0136266496454889
0     3
35    2
32    1
45    1
23    1
19    1
Name: skin, dtype: int64
21.0
17.392527130926087
0    0
Name: skin, dtype: int64
302.5
-0.21810449977720303
-1.6245455521188052
0      4
168    1
88     1
543    1
846    1
175    1
Name: insulin, dtype: int64
202.22222222222223
297.7233521987223
0    0
Name: insulin, dtype: int64
88639.19444444445
1.6550059400947466
1.9781799097232406
33.6    1
23.3    1
43.1    1
31.0    1
30.5    1
0.0     1
38.0    1
30.1    1
25.8    1
Name: bmi, dtype: int64
28.37777777777778
12.189521912053994
0     0.0
1    23.3
2    25.8
3    30.1
4    30.5
5    31.0
6    33.6
7    38.0
8    43.1
Name: bmi, dtype: float64
148.58444444444447
-1.6632175863057026
3.989200103644359
0.627    1
0.672    1
2.288    1
0.248    1
0.158    1
0.232    1
0.537    1
0.398    1
0.587    1
Name: pedigree, dtype: int64
0.6385555555555555
0.6462886566989844
0    0.158
1    0.232
2    0.248
3    0.398
4    0.537
5    0.587
6    0.627
7    0.672
8    2.288
Name: pedigree, dtype: float64
0.41768902777777767
2.521186410463069
6.957869024161264
50    1
32    1
33    1
26    1
53    1
54    1
34    1
59    1
51    1
Name: age, dtype: int64
43.55555555555556
12.135805608931685
0    26
1    32
2    33
3    34
4    50
5    51
6    53
7    54
8    59
Name: age, dtype: int64
147.27777777777774
-0.23884551536673332
-1.9149668580541754

RESULT:
Thus the above program for univariate analysis on the Pima Indians Diabetes data set for diabetic patients is verified.


Ex. No: 6.B.1    BIVARIATE ANALYSIS - LINEAR REGRESSION - PIMA INDIANS DIABETES FOR DIABETIC PATIENTS

AIM:
To write a python program for linear regression using the Pima Indians Diabetes data set for diabetic patients.

ALGORITHM:
Step 1: Start the program
Step 2: Import the pandas, matplotlib.pyplot and scipy packages
Step 3: Read data from the Pima Indians Diabetes csv data set
Step 4: Separate the diabetic patients from the data set
Step 5: Get x, y values from the data set for linregress()
Step 6: Plot x, y values using a scatter plot
Step 7: Plot the regression line by calculating y = slope*x + intercept
Step 8: Stop the program

PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

df=pd.read_csv("pima1.csv")
# Keep only the diabetic patients
df1=df[df.outcome==1]
x=df1["age"]
y=df1["bp"]

# Least-squares fit of bp on age
slope,intercept,r,p,std_err=stats.linregress(x,y)

def myfunc(x):
    return slope*x+intercept

mymodel=list(map(myfunc,x))
plt.scatter(x,y)
plt.plot(x,mymodel)
plt.show()

OUTPUT:
(Plot: scatter of bp against age for diabetic patients with the fitted regression line.)

RESULT:
Thus the above program Bivariate Analysis on Linear Regression using Pima Indians Diabetes for Diabetic Patients is verified.


Ex. No: 6.B.2    BIVARIATE ANALYSIS - LOGISTIC REGRESSION - PIMA INDIANS DIABETES

AIM:
To write a python program for logistic regression using the Pima Indians Diabetes data set.

ALGORITHM:
Step 1: Start the program
Step 2: Import the numpy, pandas and sklearn packages
Step 3: Read data from the Pima Indians Diabetes csv data set
Step 4: Get X, y values from the data set for linear_model.LogisticRegression()
Step 5: Predict y for the given x value based on the Pima Indians Diabetes data set
Step 6: Stop the program

PROGRAM:
import numpy as np
import pandas as pd
from sklearn import linear_model

df=pd.read_csv("pima1.csv")
X=df[["bp"]]
y=df[["outcome"]]

# Fit outcome (0 or 1) against blood pressure
logr=linear_model.LogisticRegression()
logr.fit(X.values,y.values.ravel())

# Predict the outcome for a single patient with bp = 66
predicted=logr.predict(np.array([66]).reshape(1,-1))
print(predicted)

OUTPUT:
[1]

RESULT:
Thus the above program Bivariate Analysis on Logistic Regression using the Pima Indians Diabetes data set is verified.
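
To make the prediction step of Ex. No: 6.B.2 less of a black box: a fitted logistic regression model with one input converts a value x into a probability through the logistic (sigmoid) function, p = 1 / (1 + e^-(b0 + b1*x)), and the predicted class is 1 when p exceeds 0.5. The sketch below shows that arithmetic in plain Python. The coefficients B0 and B1 are hypothetical values chosen only for illustration; they are not the values fitted on pima1.csv.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, b0, b1):
    """Probability of class 1 for input x under intercept b0 and slope b1."""
    return sigmoid(b0 + b1 * x)

def predict(x, b0, b1):
    """Class label: 1 when the probability exceeds 0.5, otherwise 0."""
    return 1 if predict_proba(x, b0, b1) > 0.5 else 0

# Hypothetical coefficients, NOT fitted on the Pima data
B0, B1 = -3.0, 0.05

for bp in (40, 66, 90):
    print(bp, round(predict_proba(bp, B0, B1), 3), predict(bp, B0, B1))
```

With scikit-learn, the fitted values corresponding to b0 and b1 are available as logr.intercept_ and logr.coef_ after fit(), and logr.predict_proba() returns the probabilities directly.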
Ex. No: 7.1    SUPERVISED LEARNING ALGORITHMS - LINEAR REGRESSION - PIMA INDIANS DIABETES FOR NON-DIABETIC PATIENTS

AIM:
To write a python program for linear regression using the Pima Indians Diabetes data set for non-diabetic patients.

ALGORITHM:
Step 1: Start the program
Step 2: Import the pandas, matplotlib.pyplot and scipy packages
Step 3: Read data from the Pima Indians Diabetes csv data set
Step 4: Separate the non-diabetic patients from the data set
Step 5: Get x, y values from the data set for linregress()
Step 6: Plot x, y values using a scatter plot
Step 7: Plot the regression line by calculating y = slope*x + intercept
Step 8: Stop the program

PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

df=pd.read_csv("pima1.csv")
# Keep only the non-diabetic patients
df1=df[df.outcome==0]
x=df1["age"]
y=df1["bp"]

# Least-squares fit of bp on age
slope,intercept,r,p,std_err=stats.linregress(x,y)

def myfunc(x):
    return slope*x+intercept

mymodel=list(map(myfunc,x))
plt.scatter(x,y)
plt.plot(x,mymodel)
plt.show()

OUTPUT:
(Plot: scatter of bp against age for non-diabetic patients with the fitted regression line.)

RESULT:
Thus the above program Linear Regression using Pima Indians Diabetes for Non-Diabetic Patients is verified.


Ex. No: 7.2    UNSUPERVISED LEARNING ALGORITHMS - K-MEANS CLUSTERING - IRIS DATA SET

AIM:
To write a python program for K-means clustering using the Iris data set.

ALGORITHM:
Step 1: Start the program
Step 2: Import the packages
Step 3: Load the Iris data set
Step 4: Choose the number of clusters k
Step 5: Select k random points from the data as centroids
Step 6: Assign all the points to the closest cluster centroid
Step 7: Recompute the centroids of the newly formed clusters
Step 8: Repeat step 6 and step 7 until the centroids stop changing or the maximum number of iterations is reached
Step 9: Stop the program

PROGRAM:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Load the Iris data set as a DataFrame
iris = sns.load_dataset('iris')
# Feature columns used for clustering (all but the last two columns)
X = iris.iloc[:, :-2]

# Within-cluster sum of squares (WCSS) for k = 1 to 10, for the elbow method
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)

# From the above list, with the help of the elbow method,
# we can get the number of clusters to provide.
kmeans = KMeans(n_clusters=3, init='k-means++', max_iter=300, n_init=10, random_state=0)
y_kmeans = kmeans.fit_predict(X)

# Visualising the clusters
# (K-means cluster numbers are arbitrary; the species names are only nominal labels)
cols = iris.columns
plt.scatter(X.loc[y_kmeans == 0, cols[0]], X.loc[y_kmeans == 0, cols[1]], s=100, c='purple', label='Iris-setosa')
plt.scatter(X.loc[y_kmeans == 1, cols[0]], X.loc[y_kmeans == 1, cols[1]], s=100, c='orange', label='Iris-versicolour')
plt.scatter(X.loc[y_kmeans == 2, cols[0]], X.loc[y_kmeans == 2, cols[1]], s=100, c='green', label='Iris-virginica')

# Plotting the centroids of the clusters
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=100, c='red', label='Centroids')
plt.legend()
plt.show()

OUTPUT:
(Plot: scatter of the three clusters on the first two feature columns, with the centroids marked in red.)

RESULT:
Thus the above program is executed successfully.


Ex. No: 8    VARIOUS PLOTTING FUNCTIONS - SUB PLOT & HISTOGRAM - IRIS DATA SET

AIM:
To write a python program for sub plot and Histogram using the Iris data set.

ALGORITHM:
1. Start the program.
2. Import the pandas and matplotlib.pyplot packages.
3. Read the iris.csv file.
4. Draw the sub plots using subplot().
5. Draw the histogram using hist() with sepal.width on the x-axis.
6. Draw the histogram using hist() with sepal.length on the x-axis.
7. Draw the histogram using hist() with petal.length on the x-axis.
8. Draw the histogram using hist() with petal.width on the x-axis.
9. Stop the program.

PROGRAM:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df=pd.read_csv("iris.csv")

# subplot(2,2,n) selects panel n of a 2 x 2 grid
x1=df["sepal.width"]
plt.subplot(2,2,1)
plt.hist(x1,bins=20,color="green")
plt.title("sepal width in cm")
plt.xlabel("sepal width cm")
plt.ylabel("count")

x2=df["sepal.length"]
plt.subplot(2,2,2)
plt.hist(x2,bins=20,color="blue")
plt.title("sepal length in cm")
plt.xlabel("sepal length cm")
plt.ylabel("count")

x3=df["petal.length"]
plt.subplot(2,2,3)
plt.hist(x3,bins=20,color="red")
plt.title("petal length in cm")
plt.xlabel("petal length cm")
plt.ylabel("count")

x4=df["petal.width"]
plt.subplot(2,2,4)
plt.hist(x4,bins=20,color="yellow")
plt.title("petal width in cm")
plt.xlabel("petal width cm")
plt.ylabel("count")

plt.show()

OUTPUT:
(Plot: 2 x 2 grid of histograms titled sepal width in cm, sepal length in cm, petal length in cm and petal width in cm, each with count on the y-axis.)

RESULT:
Thus the above program sub plot and Histogram using Iris Data Set is verified.
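
The bar heights that plt.hist() draws in Ex. No: 8 are simply counts of values per bin. As a rough illustration, np.histogram() returns those counts and the bin edges without plotting anything. The eight lengths below are made-up sample values, not rows read from iris.csv, and the choice of 3 bins over the range 4.5 to 7.5 is arbitrary.

```python
import numpy as np

# Made-up sample of lengths in cm (not taken from iris.csv)
values = np.array([4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.7, 7.0])

# Split the range 4.5..7.5 into 3 equal-width bins and count the values in each
counts, edges = np.histogram(values, bins=3, range=(4.5, 7.5))

# Print one line per bin: lower edge, upper edge, number of values
for low, high, n in zip(edges[:-1], edges[1:], counts):
    print(f"{low:.1f} to {high:.1f}: {n}")
# The counts are 4, 2 and 2 for the bins 4.5-5.5, 5.5-6.5 and 6.5-7.5
```

plt.hist(values, bins=3, range=(4.5, 7.5)) would draw three bars with these same heights; with bins=20, as in the exercise, the range of each column is divided into 20 equal-width bins instead.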