Experiment-2-1-Ml Kritika

The document loads and analyzes the Iris dataset using Pandas. It displays the first few rows, renames columns, extracts numeric columns, calculates statistics like mean and standard deviation, and shows subsets of rows.

Uploaded by

KRITIKA DAS

Experiment-2.1

Aim: Study of Different Python Libraries

1. Pandas Library:
Load a dataset (Iris dataset: https://www.kaggle.com/datasets/uciml/iris) using pandas.
Display the first few rows to understand its structure.
Calculate basic statistics (mean, median, standard deviation, etc.) for a numerical column in
the dataset.
Perform data filtering to extract rows based on specific conditions (e.g., SepalLengthCm>5.0).

In [1]:
import numpy as np
import pandas as pd

In [3]:
iris_df = pd.read_csv('C:/Users/kriti/Downloads/Iris.csv')
print(iris_df.head())

Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species


0 1 5.1 3.5 1.4 0.2 Iris-setosa
1 2 4.9 3.0 1.4 0.2 Iris-setosa
2 3 4.7 3.2 1.3 0.2 Iris-setosa
3 4 4.6 3.1 1.5 0.2 Iris-setosa
4 5 5.0 3.6 1.4 0.2 Iris-setosa

In [4]:
new_col_name = ["ID", "SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm", "Species"]
iris_df.columns = new_col_name
iris_df.head()

Out[4]: ID SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa
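
Assigning a full list to `iris_df.columns` needs every name, in the right order. A minimal sketch of an alternative, assuming only the `Id` column has to change: `DataFrame.rename` takes a mapping and leaves the other columns alone. The frame `sample_df` is a stand-in built from the first two rows shown above, not the full dataset.

```python
import pandas as pd

# Stand-in for the loaded Iris frame (first two rows from the output above).
sample_df = pd.DataFrame({
    "Id": [1, 2],
    "SepalLengthCm": [5.1, 4.9],
    "SepalWidthCm": [3.5, 3.0],
    "PetalLengthCm": [1.4, 1.4],
    "PetalWidthCm": [0.2, 0.2],
    "Species": ["Iris-setosa", "Iris-setosa"],
})

# Rename only the column that changes; rename() returns a new DataFrame
# and leaves sample_df itself untouched.
renamed_df = sample_df.rename(columns={"Id": "ID"})
print(list(renamed_df.columns))
```

Because the mapping names the old column explicitly, a reordered CSV would not silently mislabel the data.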

In [5]:
x = iris_df[iris_df.columns[1:-1]]
x.head()

Out[5]: SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm

0 5.1 3.5 1.4 0.2

1 4.9 3.0 1.4 0.2

2 4.7 3.2 1.3 0.2

3 4.6 3.1 1.5 0.2


4 5.0 3.6 1.4 0.2

In [6]:
y = iris_df[iris_df.columns[-1]]
y.head()

Out[6]: 0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
Name: Species, dtype: object

In [7]:
sepal_length_stats = iris_df["SepalLengthCm"].describe()
print(sepal_length_stats)

count 150.000000
mean 5.843333
std 0.828066
min 4.300000
25% 5.100000
50% 5.800000
75% 6.400000
max 7.900000
Name: SepalLengthCm, dtype: float64
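
`describe()` reports the median as the `50%` row. When the statistics are needed one at a time, each has its own Series method. The sketch below assumes a short stand-in Series holding the five sepal lengths from the `head()` output above, so its numbers differ from the full-column summary.

```python
import pandas as pd

# First five SepalLengthCm values from the head() output (not the full column).
sepal_length = pd.Series([5.1, 4.9, 4.7, 4.6, 5.0], name="SepalLengthCm")

mean_value = sepal_length.mean()
median_value = sepal_length.median()   # same quantity describe() labels "50%"
std_value = sepal_length.std()         # sample standard deviation (ddof=1)

print("Mean   :", mean_value)
print("Median :", median_value)
print("Std    :", std_value)
```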

In [8]:
petal_width_stats = iris_df["PetalWidthCm"].describe()
print(petal_width_stats)

count 150.000000
mean 1.198667
std 0.763161
min 0.100000
25% 0.300000
50% 1.300000
75% 1.800000
max 2.500000
Name: PetalWidthCm, dtype: float64

In [9]:
iris_df.head(10)

Out[9]: ID SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

5 6 5.4 3.9 1.7 0.4 Iris-setosa

6 7 4.6 3.4 1.4 0.3 Iris-setosa

7 8 5.0 3.4 1.5 0.2 Iris-setosa

8 9 4.4 2.9 1.4 0.2 Iris-setosa


9 10 4.9 3.1 1.5 0.1 Iris-setosa

In [10]:
iris_df.tail(10)

Out[10]: ID SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

140 141 6.7 3.1 5.6 2.4 Iris-virginica

141 142 6.9 3.1 5.1 2.3 Iris-virginica

142 143 5.8 2.7 5.1 1.9 Iris-virginica

143 144 6.8 3.2 5.9 2.3 Iris-virginica

144 145 6.7 3.3 5.7 2.5 Iris-virginica

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

In [11]:
iris_df[15:50]

Out[11]: ID SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

15 16 5.7 4.4 1.5 0.4 Iris-setosa

16 17 5.4 3.9 1.3 0.4 Iris-setosa

17 18 5.1 3.5 1.4 0.3 Iris-setosa

18 19 5.7 3.8 1.7 0.3 Iris-setosa

19 20 5.1 3.8 1.5 0.3 Iris-setosa

20 21 5.4 3.4 1.7 0.2 Iris-setosa

21 22 5.1 3.7 1.5 0.4 Iris-setosa

22 23 4.6 3.6 1.0 0.2 Iris-setosa

23 24 5.1 3.3 1.7 0.5 Iris-setosa

24 25 4.8 3.4 1.9 0.2 Iris-setosa

25 26 5.0 3.0 1.6 0.2 Iris-setosa

26 27 5.0 3.4 1.6 0.4 Iris-setosa

27 28 5.2 3.5 1.5 0.2 Iris-setosa

28 29 5.2 3.4 1.4 0.2 Iris-setosa

29 30 4.7 3.2 1.6 0.2 Iris-setosa

30 31 4.8 3.1 1.6 0.2 Iris-setosa

31 32 5.4 3.4 1.5 0.4 Iris-setosa

32 33 5.2 4.1 1.5 0.1 Iris-setosa


33 34 5.5 4.2 1.4 0.2 Iris-setosa

34 35 4.9 3.1 1.5 0.1 Iris-setosa

35 36 5.0 3.2 1.2 0.2 Iris-setosa

36 37 5.5 3.5 1.3 0.2 Iris-setosa

37 38 4.9 3.1 1.5 0.1 Iris-setosa

38 39 4.4 3.0 1.3 0.2 Iris-setosa

39 40 5.1 3.4 1.5 0.2 Iris-setosa

40 41 5.0 3.5 1.3 0.3 Iris-setosa

41 42 4.5 2.3 1.3 0.3 Iris-setosa

42 43 4.4 3.2 1.3 0.2 Iris-setosa

43 44 5.0 3.5 1.6 0.6 Iris-setosa

44 45 5.1 3.8 1.9 0.4 Iris-setosa

45 46 4.8 3.0 1.4 0.3 Iris-setosa

46 47 5.1 3.8 1.6 0.2 Iris-setosa

47 48 4.6 3.2 1.4 0.2 Iris-setosa

48 49 5.3 3.7 1.5 0.2 Iris-setosa

49 50 5.0 3.3 1.4 0.2 Iris-setosa

In [12]:
iris_df.groupby("Species").head(5)

Out[12]: ID SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

1 2 4.9 3.0 1.4 0.2 Iris-setosa

2 3 4.7 3.2 1.3 0.2 Iris-setosa

3 4 4.6 3.1 1.5 0.2 Iris-setosa

4 5 5.0 3.6 1.4 0.2 Iris-setosa

50 51 7.0 3.2 4.7 1.4 Iris-versicolor

51 52 6.4 3.2 4.5 1.5 Iris-versicolor

52 53 6.9 3.1 4.9 1.5 Iris-versicolor

53 54 5.5 2.3 4.0 1.3 Iris-versicolor

54 55 6.5 2.8 4.6 1.5 Iris-versicolor

100 101 6.3 3.3 6.0 2.5 Iris-virginica

101 102 5.8 2.7 5.1 1.9 Iris-virginica

102 103 7.1 3.0 5.9 2.1 Iris-virginica

103 104 6.3 2.9 5.6 1.8 Iris-virginica

104 105 6.5 3.0 5.8 2.2 Iris-virginica
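
`groupby("Species").head(5)` only shows the first rows of each group. The same grouping can also feed an aggregate. A minimal sketch, assuming a stand-in frame with two rows per species copied from the output above:

```python
import pandas as pd

# Two rows per species, copied from the output above; a stand-in for iris_df.
sample_df = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "Species": ["Iris-setosa", "Iris-setosa",
                "Iris-versicolor", "Iris-versicolor",
                "Iris-virginica", "Iris-virginica"],
})

# One mean per species instead of the first rows of each group.
mean_by_species = sample_df.groupby("Species")["SepalLengthCm"].mean()
print(mean_by_species)
```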


In [13]:
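# Boolean mask: True for rows with sepal length above 5.0
# (named "mask" so the built-in filter() is not shadowed)
mask = iris_df["SepalLengthCm"] > 5.0
sel = iris_df[mask]
sel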

Out[13]: ID SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 1 5.1 3.5 1.4 0.2 Iris-setosa

5 6 5.4 3.9 1.7 0.4 Iris-setosa

10 11 5.4 3.7 1.5 0.2 Iris-setosa

14 15 5.8 4.0 1.2 0.2 Iris-setosa

15 16 5.7 4.4 1.5 0.4 Iris-setosa

... ... ... ... ... ... ...

145 146 6.7 3.0 5.2 2.3 Iris-virginica

146 147 6.3 2.5 5.0 1.9 Iris-virginica

147 148 6.5 3.0 5.2 2.0 Iris-virginica

148 149 6.2 3.4 5.4 2.3 Iris-virginica

149 150 5.9 3.0 5.1 1.8 Iris-virginica

118 rows × 6 columns
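
Several conditions can be combined in one selection. The sketch below assumes a five-row stand-in frame (values taken from the outputs above) and keeps only versicolor rows whose sepal length exceeds 5.0.

```python
import pandas as pd

# Stand-in rows taken from the outputs above; iris_df would be used in practice.
sample_df = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 7.0, 5.5, 6.3],
    "Species": ["Iris-setosa", "Iris-setosa",
                "Iris-versicolor", "Iris-versicolor", "Iris-virginica"],
})

# Each condition is a boolean Series; combine them with & (and) or | (or).
long_sepal = sample_df["SepalLengthCm"] > 5.0
is_versicolor = sample_df["Species"] == "Iris-versicolor"
selected = sample_df[long_sepal & is_versicolor]

# Written inline, each comparison needs its own parentheses
# because & binds more tightly than > and ==.
same = sample_df[(sample_df["SepalLengthCm"] > 5.0)
                 & (sample_df["Species"] == "Iris-versicolor")]
print(selected)
```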

In [14]:
iris_df.shape

Out[14]: (150, 6)

2. Matplotlib Library:
Create a line plot to visualize the trend of a numerical variable over time.
Generate a histogram to understand the distribution of a numerical variable in the dataset.
Create a bar chart to compare the performance of different categories.
Plot a scatter plot to explore the relationship between two numerical variables.
Customize your plots with labels, titles, colors, and styles.

In [15]:
import matplotlib.pyplot as plt

# Draw the same column twice: once as a connected line, once as point markers.
plt.plot(iris_df["SepalWidthCm"], '-', label="Line")
plt.plot(iris_df["SepalWidthCm"], 'o', label="Dots")
plt.xlabel("Index")
plt.ylabel("Sepal Width (cm)")
plt.title("Trend of Sepal Width")
plt.legend()
plt.show()
In [16]:
!pip install plotly

Requirement already satisfied: plotly in c:\users\kriti\anaconda3\lib\site-packages (5.16.1)
Requirement already satisfied: packaging in c:\users\kriti\anaconda3\lib\site-packages (from plotly) (20.9)
Requirement already satisfied: tenacity>=6.2.0 in c:\users\kriti\anaconda3\lib\site-packages (from plotly) (8.2.3)
Requirement already satisfied: pyparsing>=2.0.2 in c:\users\kriti\anaconda3\lib\site-packages (from packaging->plotly) (2.4.7)

In [17]:
import plotly.express as px

species_count = iris_df["Species"].value_counts()
figure = px.pie(iris_df, values=species_count, names=species_count.index)
figure.show()

[Pie chart: Iris-setosa 33.3%, Iris-versicolor 33.3%, Iris-virginica 33.3%]

In [18]:
figure = px.histogram(iris_df , x = "SepalLengthCm")
figure.show()

[Histogram: x-axis SepalLengthCm (about 4 to 8), y-axis count (0 to 30)]

In [19]:
plt.scatter(iris_df["SepalLengthCm"] , iris_df["PetalLengthCm"])
plt.xlabel("Sepal Length (cm)")
plt.ylabel("Petal Length (cm)")
plt.title("Sepal Length vs Petal Length")
plt.show()


In [20]:
species_counts = iris_df["Species"].value_counts()
plt.bar(species_counts.index , species_counts.values)
plt.xlabel("Species")
plt.ylabel("count")
plt.show()
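
The last task in this section asks for customized colors and styles. A minimal sketch, assuming a handful of (sepal length, petal length) pairs copied from the outputs above, draws one scatter series per species. The color and marker choices are arbitrary and only for illustration.

```python
import matplotlib.pyplot as plt

# A few (sepal length, petal length) pairs per species from the outputs above.
samples = {
    "Iris-setosa": ([5.1, 4.9, 4.7], [1.4, 1.4, 1.3]),
    "Iris-versicolor": ([7.0, 6.4, 6.9], [4.7, 4.5, 4.9]),
    "Iris-virginica": ([6.3, 5.8, 7.1], [6.0, 5.1, 5.9]),
}
# Color and marker per species (arbitrary illustrative choices).
styles = {
    "Iris-setosa": ("tab:blue", "o"),
    "Iris-versicolor": ("tab:orange", "s"),
    "Iris-virginica": ("tab:green", "^"),
}

fig, ax = plt.subplots(figsize=(6, 4))
for species, (sepal, petal) in samples.items():
    color, marker = styles[species]
    ax.scatter(sepal, petal, color=color, marker=marker, label=species)

ax.set_xlabel("Sepal Length (cm)")
ax.set_ylabel("Petal Length (cm)")
ax.set_title("Sepal Length vs Petal Length")
ax.grid(True, linestyle="--", alpha=0.5)   # light dashed grid
ax.legend()
```

Calling `plt.show()` afterwards displays the figure, as in the cells above.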

3. Seaborn Library:
Create a box plot to visualize the distribution of a numerical variable across different
categories.
Generate a heatmap to explore the correlation between numerical variables.
Customize the appearance of seaborn plots using various parameters.

In [21]:
import seaborn as sns
sns.boxplot(x="Species",y="SepalLengthCm",data = iris_df)
plt.xlabel("Species")
plt.ylabel("Sepal Length (cm)")
plt.title("Distribution of Sepal Length across species")
plt.show()
In [22]:
# Heatmap to explore the correlation between numerical variables

df_filter = iris_df.select_dtypes(include = [np.number])


sns.heatmap(df_filter.corr())

Out[22]: <AxesSubplot:>
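
`select_dtypes` keeps the `ID` column because it is numeric, so it appears in the heatmap even though it is a row label rather than a measurement. A minimal sketch, assuming the identifier should be left out, computes the correlation matrix on a small stand-in frame:

```python
import numpy as np
import pandas as pd

# Stand-in rows copied from the outputs above; iris_df would be used in practice.
sample_df = pd.DataFrame({
    "ID": [1, 2, 51, 52, 101, 102],
    "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "PetalLengthCm": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "Species": ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
                "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
})

# Keep numeric columns, then drop the identifier before correlating.
numeric_df = sample_df.select_dtypes(include=[np.number]).drop(columns="ID")
corr = numeric_df.corr()
print(corr)

# The matrix could then be drawn with value labels, for example:
# sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
```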

4. NumPy Library:
Create a NumPy array and perform basic operations like addition, subtraction, and
multiplication.
Use NumPy functions to calculate statistical measures like mean, median, and standard
deviation.
Reshape and slice NumPy arrays to extract specific data elements.
Perform element-wise operations and broadcasting with NumPy arrays.
Apply mathematical functions (e.g., exponential, logarithm) to NumPy arrays.
In [23]:
x = np.array([25 , 7 ,8 , 9 , 10 , 12])
y = np.array([10 , 20 , 58 , 100 , 204 , 7])

z = x + y

w = x - y

j = x * y

print("Addition : ", z)
print("Subtraction : ", w)
print("Multiplication : ", j)

Addition : [ 35 27 66 109 214 19]
Subtraction : [ 15 -13 -50 -91 -194 5]
Multiplication : [ 250 140 464 900 2040 84]

In [24]:
#statistics in numpy

print("Mean : ",np.mean(x))
print("Std Deviation : ",np.std(x))
print("Variance : ",np.var(x))

Mean : 11.833333333333334
Std Deviation : 6.094168432927407
Variance : 37.138888888888886
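
The task list also asks for the median, which the cell above does not compute. Note too that `np.std` divides by n by default, whereas the pandas `std` reported by `describe()` divides by n - 1. A short sketch on the same array:

```python
import numpy as np

x = np.array([25, 7, 8, 9, 10, 12])

median_value = np.median(x)       # middle of the sorted values
population_std = np.std(x)        # ddof=0: divides by n (NumPy default)
sample_std = np.std(x, ddof=1)    # divides by n - 1, the pandas default

print("Median : ", median_value)
print("Population std : ", population_std)
print("Sample std : ", sample_std)
```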

In [25]:
x = np.arange(1,11)
x1 = np.reshape(x , (2,5))
x1

Out[25]: array([[ 1, 2, 3, 4, 5],
       [ 6, 7, 8, 9, 10]])

In [26]:
# numpy slicing
x1[0:1 , 2:5]

Out[26]: array([[3, 4, 5]])
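
The result above keeps two dimensions because both axes were sliced. A brief sketch of how an integer index changes the shape, using the same 2 x 5 array:

```python
import numpy as np

x1 = np.arange(1, 11).reshape(2, 5)

kept_2d = x1[0:1, 2:5]   # slice on both axes: result stays 2-D
row_1d = x1[0, 2:5]      # integer index on axis 0: that axis is dropped
column = x1[:, 1]        # every row, second column

print(kept_2d.shape, row_1d.shape, column)
```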

In [27]:
# scalar broadcasting
x2 = x1 + 5
print(x2)

[[ 6 7 8 9 10]
[11 12 13 14 15]]
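
Broadcasting also works between arrays of different shapes, provided each pair of dimensions is either equal or 1. A minimal sketch, with `row_offsets` and `col_weights` as made-up illustrative values:

```python
import numpy as np

x1 = np.arange(1, 11).reshape(2, 5)

row_offsets = np.array([[10], [20]])       # shape (2, 1): one value per row
col_weights = np.array([1, 0, 1, 0, 1])    # shape (5,): one value per column

shifted = x1 + row_offsets    # (2, 5) + (2, 1) -> (2, 5)
masked = x1 * col_weights     # (2, 5) * (5,)   -> (2, 5)

print(shifted)
print(masked)
```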

In [28]:
# logarithmic function
y = np.log(x)
plt.subplot(1,2,1)
plt.plot(x,y)
plt.title("Logarithmic Function")

# exponential function
plt.subplot(1,2,2)
f = np.exp(x)
plt.plot(x,f)
plt.title("Exponential Function")

Out[28]: Text(0.5, 1.0, 'Exponential Function')
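
`np.log` is the natural logarithm, so `np.exp` undoes it. NumPy also provides base-10 and base-2 variants. A short numerical check on the same range of values:

```python
import numpy as np

x = np.arange(1, 11)

round_trip = np.exp(np.log(x))   # exp undoes the natural log
log10_values = np.log10(x)       # base-10 logarithm
log2_values = np.log2(x)         # base-2 logarithm

print(np.allclose(round_trip, x))
print(log10_values[-1], log2_values[7])   # log10(10) and log2(8)
```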

5. SciPy Library:
Use SciPy to perform numerical integration for a given mathematical function.

In [29]:
from scipy.integrate import quad

# x = np.arange(0 , 2*np.pi , 0.1)

# y = np.sin(x)

def integrand(m):
    return np.sin(m)

fun_intr , error = quad( integrand , 0 , np.pi)

print(fun_intr)
print(error)

# plt.plot(x , fun_intr)

2.0
2.220446049250313e-14
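
The `quad` result can be cross-checked against a hand-written trapezoidal rule. This is only a sketch: the grid of 1001 points is an arbitrary choice, and the trapezoidal estimate is expected to be less accurate than `quad`.

```python
import numpy as np
from scipy.integrate import quad

def integrand(m):
    return np.sin(m)

quad_value, quad_error = quad(integrand, 0, np.pi)

# Composite trapezoidal rule on a uniform grid, written out by hand.
grid = np.linspace(0, np.pi, 1001)   # 1000 intervals (arbitrary choice)
values = integrand(grid)
trapezoid_value = np.sum((values[:-1] + values[1:]) / 2 * np.diff(grid))

print(quad_value, trapezoid_value)
```

Both estimates should land close to the exact value of 2 for the integral of sin over [0, pi].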
Name-Kritika Das

Regd no-2101020068

Rollno-CSE21068
