
Notes DV

The document covers various topics in Python programming, specifically focusing on Fibonacci series generation, Pandas Series, and DataFrames. It includes examples of creating and manipulating data structures, performing statistical analysis, and working with CSV files. Additionally, it discusses data visualization techniques using the Iris dataset.

Uploaded by

ansarisshadan748

3/2/25, 7:19 PM notesDV

In [1]: # Write a program to print the Fibonacci series

def fibo(n):
    # Print the first two terms, then n further terms
    a, b = 0, 1
    print(a, b, end=" ")
    for _ in range(n):
        a, b = b, a + b
        print(b, end=" ")

fibo(7)

0 1 1 2 3 5 8 13 21
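
The function above prints each term as it is computed. As a minimal sketch of the same iteration, the variant below collects the terms in a list instead, which makes the result easier to reuse or check. The name `fibonacci_terms` and its `count` parameter are illustrative and do not appear in the original notes.

```python
def fibonacci_terms(count):
    """Return the first `count` Fibonacci numbers as a list."""
    terms = []
    a, b = 0, 1
    for _ in range(count):
        terms.append(a)
        # Advance the pair: the new b is the sum of the previous two terms
        a, b = b, a + b
    return terms


# Nine terms correspond to the two seed values plus the seven printed by fibo(7)
print(*fibonacci_terms(9))
```

Returning a list rather than printing keeps the computation separate from the display, so the same function can feed a plot or a Series later on.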

LAB on DATA VISUALIZATION


PYTHON - PANDAS

Series :
A Pandas Series is like a column in a table. It is a one-dimensional labeled array that can hold data of any type.

In [4]: import pandas as pd


arr=[1,2,3,4,5]
ser=pd.Series(arr)
print(ser, ser[0])

# Labels: supply your own index instead of the default 0, 1, 2, ...


arr2=["Rajesh","Suresh","Ramesh","Mahesh"]
serL=pd.Series(arr2,index= ["a","b","c","d"])
print(serL,serL["a"])

0 1
1 2
2 3
3 4
4 5
dtype: int64
a Rajesh
b Suresh
c Ramesh
d Mahesh
dtype: object Rajesh
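
The cell above reads single values with square brackets. As a hedged sketch, the snippet below shows the explicit accessors `.loc` (by label) and `.iloc` (by position) on the same labeled Series, which avoids ambiguity when labels are themselves integers. The variable names are illustrative.

```python
import pandas as pd

names = pd.Series(["Rajesh", "Suresh", "Ramesh", "Mahesh"],
                  index=["a", "b", "c", "d"])

first_by_label = names.loc["a"]      # look up by index label
first_by_position = names.iloc[0]    # look up by integer position
subset = names.loc[["b", "d"]]       # a list of labels returns a smaller Series

print(first_by_label, first_by_position)
print(subset.tolist())
```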

In [5]: import pandas as pd


# Making a Series from a dictionary: the keys become the labels, so no separate index is required
dict1={"Porsche":"Panamera","Toyota":"MK5","Dodge":"Ram"}
serD=pd.Series(dict1)
print(serD)

Porsche Panamera
Toyota MK5
Dodge Ram
dtype: object

DataFrames -

file:///C:/Users/mnnbt/OneDrive/Desktop/Stuudi/Data Visualization/notesDV.html 1/19



A two-dimensional labeled data structure, akin to a table in a database or a spreadsheet.

In [29]: import pandas as pd


data={
"Rajesh":["eng","mat","hin"],
"Rahul":["sci","sst","eng"]
}
data2=data
df=pd.DataFrame(data)
print(df)

# Locating rows in a DataFrame with .loc


print("\nRows of df : \n",df.loc[[0,1]])

#Labeled Indexes
dfL=pd.DataFrame(data2,index=["s1","s2","s3"])
print("\n",dfL,"\n\n", dfL.loc["s1"])

# CSV files can also be loaded as DataFrames
# dfcsv = pd.read_csv("file.csv")

Rajesh Rahul
0 eng sci
1 mat sst
2 hin eng

Rows of df :
Rajesh Rahul
0 eng sci
1 mat sst

Rajesh Rahul
s1 eng sci
s2 mat sst
s3 hin eng

Rajesh eng
Rahul sci
Name: s1, dtype: object
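
The commented-out `read_csv` line in the cell above needs a path given as a string. A minimal sketch of loading CSV data follows; it reads from an in-memory string through `io.StringIO` so that it runs without any file on disk. The CSV text and the names `csv_text` and `df_csv` are made up for illustration.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (hypothetical data)
csv_text = "name,subject\nRajesh,eng\nRahul,sci\n"

# read_csv accepts a path string or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
print(df_csv.loc[0, "subject"])
```

With a real file, the `io.StringIO(...)` argument would be replaced by the quoted path, for example `pd.read_csv("file.csv")`.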

In [1]: import pandas as pd


df=pd.DataFrame(columns=["name","age","per"])
# To input data from the user:
# for i in range(2):
#     name = input("Enter Name:")
#     age = int(input("Enter Age:"))
#     per = int(input("Enter Percentage:"))
#
#     row = [name, age, per]
#     # add a new row to the DataFrame
#     df.loc[len(df)] = row
# print(df)

# adding new data to a dataframe

dict1={"name":["Raj","Rej","Rij","Roj"],"age":[18,14,16,15],"per":[97,94,74,86]}
df2=pd.DataFrame(dict1)
df=pd.concat([df,df2],ignore_index=True)
print(df)


# Find the mean, mode, median, standard deviation and variance (var) of age

print("Mean= ",df["age"].mean())
print("Mode= ",df["age"].mode())
print("Median", df["age"].median())
print("Standard Deviation= ",df["age"].std())
print("Variance= ",df["age"].var())

#find range of percentage


print("Range of % :", (df["per"].max()-df["per"].min()))
print(df.describe())

name age per


0 Raj 18 97
1 Rej 14 94
2 Rij 16 74
3 Roj 15 86
Mean= 15.75
Mode= 0 14
1 15
2 16
3 18
Name: age, dtype: object
Median 15.5
Standard Deviation= 1.707825127659933
Variance= 2.9166666666666665
Range of % : 23
name age per
count 4 4 4
unique 4 4 4
top Raj 18 97
freq 1 1 1
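
In the output above, `describe()` reports count, unique, top and freq, and `mode()` shows `dtype: object`. That suggests the `age` and `per` columns were stored as generic Python objects rather than numbers after the concatenation with the empty DataFrame. Assuming that diagnosis, the sketch below rebuilds the same data with `dtype=object` to imitate that state and then converts the two columns with `astype`, after which `describe()` returns numeric statistics.

```python
import pandas as pd

# Rebuild the data with object dtype to imitate the state seen in the output
df = pd.DataFrame(
    {"name": ["Raj", "Rej", "Rij", "Roj"],
     "age": [18, 14, 16, 15],
     "per": [97, 94, 74, 86]},
    dtype=object,
)

# Convert the numeric columns explicitly
df = df.astype({"age": "int64", "per": "int64"})

summary = df.describe()   # now summarises only the numeric columns
print(df.dtypes)
print(summary)
```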

Iris Data Analysis


In [10]: import pandas as pd

# Creating a DataFrame


df = pd.read_csv("iris.csv")

# Data frame Functions


print("****************** Get Started with iris DataSet ******************\n", d
print("\n****************** Unique Variety ******************\n",df["species"].u
print("\n****************** Describe ******************\n",df.describe())
print("\n****************** Sample ******************\n", df.sample(2*10))
print("\n****************** Min ******************\n", df["sepal_length"].min(),
print("\n****************** Max ******************\n", df["sepal_length"].max(),
print("\n****************** Variety Count ******************\n", df["species"].v

# Adding a new column to the DataFrame


df["sepal.area"] = df["sepal_length"] * df["sepal_width"]

# Get the information about the data (Count, Mean, standard deviation, Min, Max,
print("\n****************** Describe ******************\n", df.describe())

# Get the mean of each numeric column per species using groupby


gg = df.groupby("species").mean()
print("\n****************** Group By ******************\n", gg)


# Printing rows whose petal length is greater than 4


print("\n****************** Petal length ******************\n", df[df["petal_len

# Comparing sepal width against 2 (prints a boolean Series)


print("\n****************** Sepal width ******************\n", df["sepal_width"]

# Saving Data frame to a csv file


#df.to_csv("UpdatedCsv.csv")

# Creating a new column named 'test'; the values of this column: sepal.len
df["test"] = df["sepal_length"] * df["sepal_width"]
df.to_csv("UpdatedCsv.csv")

# Correlation of sepal length and petal length


print("\n****************** Correlation ******************\n", df["sepal_length"

# Dropping columns from the DataFrame (drop returns a new DataFrame; df itself is unchanged)


delete = df.drop(columns=["sepal.area", "test"])
print(delete)

# Adding a new column


df["Ratio"] = df["sepal_length"]/df["petal_length"]
print("\n****************** Ratio of sepal length & petal lenght ***************

# Multiplying the value of petal length by 2


df["petal_length"] = df["petal_length"]*2
print("\n****************** Multiply by 2 petal length value ******************\

# Group by species and print only the mean of the petal length column
print("\n****************** Group By species & get petal length mean ***********

# Group by species and print only the variance of the sepal length column
print("\n****************** Group By species & get varience of sepal length ***

# Grouping by two keys: first species, then sepal length


print("\n****************** Group By species & get the mean of sepal length ****


****************** Get Started with iris DataSet ******************


sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa
.. ... ... ... ... ...
145 6.7 3.0 5.2 2.3 virginica
146 6.3 2.5 5.0 1.9 virginica
147 6.5 3.0 5.2 2.0 virginica
148 6.2 3.4 5.4 2.3 virginica
149 5.9 3.0 5.1 1.8 virginica

[150 rows x 5 columns]

****************** Unique Variety ******************


['setosa' 'versicolor' 'virginica']

****************** Describe ******************


sepal_length sepal_width petal_length petal_width
count 150.000000 150.000000 150.000000 150.000000
mean 5.843333 3.054000 3.758667 1.198667
std 0.828066 0.433594 1.764420 0.763161
min 4.300000 2.000000 1.000000 0.100000
25% 5.100000 2.800000 1.600000 0.300000
50% 5.800000 3.000000 4.350000 1.300000
75% 6.400000 3.300000 5.100000 1.800000
max 7.900000 4.400000 6.900000 2.500000

****************** Sample ******************


sepal_length sepal_width petal_length petal_width species
4 5.0 3.6 1.4 0.2 setosa
125 7.2 3.2 6.0 1.8 virginica
64 5.6 2.9 3.6 1.3 versicolor
57 4.9 2.4 3.3 1.0 versicolor
95 5.7 3.0 4.2 1.2 versicolor
110 6.5 3.2 5.1 2.0 virginica
88 5.6 3.0 4.1 1.3 versicolor
48 5.3 3.7 1.5 0.2 setosa
127 6.1 3.0 4.9 1.8 virginica
68 6.2 2.2 4.5 1.5 versicolor
128 6.4 2.8 5.6 2.1 virginica
13 4.3 3.0 1.1 0.1 setosa
121 5.6 2.8 4.9 2.0 virginica
39 5.1 3.4 1.5 0.2 setosa
101 5.8 2.7 5.1 1.9 virginica
117 7.7 3.8 6.7 2.2 virginica
15 5.7 4.4 1.5 0.4 setosa
31 5.4 3.4 1.5 0.4 setosa
148 6.2 3.4 5.4 2.3 virginica
113 5.7 2.5 5.0 2.0 virginica

****************** Min ******************


4.3
2.0

****************** Max ******************


7.9
4.4


****************** Variety Count ******************


species
setosa 50
versicolor 50
virginica 50
Name: count, dtype: int64

****************** Describe ******************


sepal_length sepal_width petal_length petal_width sepal.area
count 150.000000 150.000000 150.000000 150.000000 150.000000
mean 5.843333 3.054000 3.758667 1.198667 17.806533
std 0.828066 0.433594 1.764420 0.763161 3.368693
min 4.300000 2.000000 1.000000 0.100000 10.000000
25% 5.100000 2.800000 1.600000 0.300000 15.645000
50% 5.800000 3.000000 4.350000 1.300000 17.660000
75% 6.400000 3.300000 5.100000 1.800000 20.325000
max 7.900000 4.400000 6.900000 2.500000 30.020000

****************** Group By ******************


sepal_length sepal_width petal_length petal_width sepal.area
species
setosa 5.006 3.418 1.464 0.244 17.2088
versicolor 5.936 2.770 4.260 1.326 16.5262
virginica 6.588 2.974 5.552 2.026 19.6846

****************** Petal length ******************


sepal_length sepal_width petal_length petal_width species \
50 7.0 3.2 4.7 1.4 versicolor
51 6.4 3.2 4.5 1.5 versicolor
52 6.9 3.1 4.9 1.5 versicolor
54 6.5 2.8 4.6 1.5 versicolor
55 5.7 2.8 4.5 1.3 versicolor
.. ... ... ... ... ...
145 6.7 3.0 5.2 2.3 virginica
146 6.3 2.5 5.0 1.9 virginica
147 6.5 3.0 5.2 2.0 virginica
148 6.2 3.4 5.4 2.3 virginica
149 5.9 3.0 5.1 1.8 virginica

sepal.area
50 22.40
51 20.48
52 21.39
54 18.20
55 15.96
.. ...
145 20.10
146 15.75
147 19.50
148 21.08
149 17.70

[84 rows x 6 columns]

****************** Sepal width ******************


0 True
1 True
2 True
3 True


4 True
...
145 True
146 True
147 True
148 True
149 True
Name: sepal_width, Length: 150, dtype: bool

****************** Correlation ******************


0.8717541573048712
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa
.. ... ... ... ... ...
145 6.7 3.0 5.2 2.3 virginica
146 6.3 2.5 5.0 1.9 virginica
147 6.5 3.0 5.2 2.0 virginica
148 6.2 3.4 5.4 2.3 virginica
149 5.9 3.0 5.1 1.8 virginica

[150 rows x 5 columns]

****************** Ratio of sepal length & petal lenght ******************


sepal_length sepal_width petal_length petal_width species \
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa
.. ... ... ... ... ...
145 6.7 3.0 5.2 2.3 virginica
146 6.3 2.5 5.0 1.9 virginica
147 6.5 3.0 5.2 2.0 virginica
148 6.2 3.4 5.4 2.3 virginica
149 5.9 3.0 5.1 1.8 virginica

sepal.area test Ratio


0 17.85 17.85 3.642857
1 14.70 14.70 3.500000
2 15.04 15.04 3.615385
3 14.26 14.26 3.066667
4 18.00 18.00 3.571429
.. ... ... ...
145 20.10 20.10 1.288462
146 15.75 15.75 1.260000
147 19.50 19.50 1.250000
148 21.08 21.08 1.148148
149 17.70 17.70 1.156863

[150 rows x 8 columns]

****************** Multiply by 2 petal length value ******************


sepal_length sepal_width petal_length petal_width species \
0 5.1 3.5 2.8 0.2 setosa
1 4.9 3.0 2.8 0.2 setosa
2 4.7 3.2 2.6 0.2 setosa


3 4.6 3.1 3.0 0.2 setosa


4 5.0 3.6 2.8 0.2 setosa
.. ... ... ... ... ...
145 6.7 3.0 10.4 2.3 virginica
146 6.3 2.5 10.0 1.9 virginica
147 6.5 3.0 10.4 2.0 virginica
148 6.2 3.4 10.8 2.3 virginica
149 5.9 3.0 10.2 1.8 virginica

sepal.area test Ratio


0 17.85 17.85 3.642857
1 14.70 14.70 3.500000
2 15.04 15.04 3.615385
3 14.26 14.26 3.066667
4 18.00 18.00 3.571429
.. ... ... ...
145 20.10 20.10 1.288462
146 15.75 15.75 1.260000
147 19.50 19.50 1.250000
148 21.08 21.08 1.148148
149 17.70 17.70 1.156863

[150 rows x 8 columns]

****************** Group By species & get petal length mean ******************


species
setosa 2.928
versicolor 8.520
virginica 11.104
Name: petal_length, dtype: float64

****************** Group By species & get varience of sepal length ******************
species
setosa 0.124249
versicolor 0.266433
virginica 0.404343
Name: sepal_length, dtype: float64

****************** Group By species & get the mean of sepal length ******************
sepal_width petal_length petal_width sepal.area \
species sepal_length
setosa 4.3 3.000000 2.200000 0.100000 12.900000
4.4 3.033333 2.666667 0.200000 13.346667
4.5 2.300000 2.600000 0.300000 10.350000
4.6 3.325000 2.650000 0.225000 15.295000
4.7 3.200000 2.900000 0.200000 15.040000
4.8 3.180000 3.160000 0.200000 15.264000
4.9 3.075000 2.950000 0.125000 15.067500
5.0 3.362500 2.900000 0.287500 16.812500
5.1 3.600000 3.125000 0.312500 18.360000
5.2 3.666667 2.933333 0.166667 19.066667
5.3 3.700000 3.000000 0.200000 19.610000
5.4 3.660000 3.080000 0.320000 19.764000
5.5 3.850000 2.700000 0.200000 21.175000
5.7 4.100000 3.200000 0.350000 23.370000
5.8 4.000000 2.400000 0.200000 23.200000
versicolor 4.9 2.400000 6.600000 1.000000 11.760000
5.0 2.150000 6.800000 1.000000 10.750000


5.1 2.500000 6.000000 1.100000 12.750000


5.2 2.700000 7.800000 1.400000 14.040000
5.4 3.000000 9.000000 1.500000 16.200000
5.5 2.440000 7.960000 1.180000 13.420000
5.6 2.820000 8.120000 1.300000 15.792000
5.7 2.820000 8.200000 1.220000 16.074000
5.8 2.666667 8.000000 1.133333 15.466667
5.9 3.100000 9.000000 1.650000 18.290000
6.0 2.800000 9.050000 1.425000 16.800000
6.1 2.875000 9.000000 1.325000 17.537500
6.2 2.550000 8.800000 1.400000 15.810000
6.3 2.700000 9.333333 1.466667 17.010000
6.4 3.050000 8.800000 1.400000 19.520000
6.5 2.800000 9.200000 1.500000 18.200000
6.6 2.950000 9.000000 1.350000 19.470000
6.7 3.066667 9.400000 1.533333 20.546667
6.8 2.800000 9.600000 1.400000 19.040000
6.9 3.100000 9.800000 1.500000 21.390000
7.0 3.200000 9.400000 1.400000 22.400000
virginica 4.9 2.500000 9.000000 1.700000 12.250000
5.6 2.800000 9.800000 2.000000 15.680000
5.7 2.500000 10.000000 2.000000 14.250000
5.8 2.733333 10.200000 2.066667 15.853333
5.9 3.000000 10.200000 1.800000 17.700000
6.0 2.600000 9.800000 1.650000 15.600000
6.1 2.800000 10.500000 1.600000 17.080000
6.2 3.100000 10.200000 2.050000 19.220000
6.3 2.933333 10.733333 1.983333 18.480000
6.4 2.920000 10.920000 2.060000 18.688000
6.5 3.050000 10.800000 2.000000 19.825000
6.7 3.040000 11.200000 2.220000 20.368000
6.8 3.100000 11.400000 2.200000 21.080000
6.9 3.133333 10.800000 2.233333 21.620000
7.1 3.000000 11.800000 2.100000 21.300000
7.2 3.266667 11.933333 1.966667 23.520000
7.3 2.900000 12.600000 1.800000 21.170000
7.4 2.800000 12.200000 1.900000 20.720000
7.6 3.000000 13.200000 2.100000 22.800000
7.7 3.050000 13.200000 2.200000 23.485000
7.9 3.800000 12.800000 2.000000 30.020000

test Ratio
species sepal_length
setosa 4.3 12.900000 3.909091
4.4 13.346667 3.304029
4.5 10.350000 3.461538
4.6 15.295000 3.559524
4.7 15.040000 3.276442
4.8 15.264000 3.076692
4.9 15.067500 3.325000
5.0 16.812500 3.483001
5.1 18.360000 3.294678
5.2 19.066667 3.549206
5.3 19.610000 3.533333
5.4 19.764000 3.541357
5.5 21.175000 4.079670
5.7 23.370000 3.576471
5.8 23.200000 4.833333
versicolor 4.9 11.760000 1.484848
5.0 10.750000 1.471861


5.1 12.750000 1.700000


5.2 14.040000 1.333333
5.4 16.200000 1.200000
5.5 13.420000 1.386771
5.6 15.792000 1.387017
5.7 16.074000 1.399954
5.8 15.466667 1.450605
5.9 18.290000 1.316964
6.0 16.800000 1.335784
6.1 17.537500 1.361708
6.2 15.810000 1.409819
6.3 17.010000 1.352653
6.4 19.520000 1.455297
6.5 18.200000 1.413043
6.6 19.470000 1.467391
6.7 20.546667 1.429420
6.8 19.040000 1.416667
6.9 21.390000 1.408163
7.0 22.400000 1.489362
virginica 4.9 12.250000 1.088889
5.6 15.680000 1.142857
5.7 14.250000 1.140000
5.8 15.853333 1.137255
5.9 17.700000 1.156863
6.0 15.600000 1.225000
6.1 17.080000 1.167092
6.2 19.220000 1.219907
6.3 18.480000 1.180168
6.4 18.688000 1.172889
6.5 19.825000 1.206754
6.7 20.368000 1.198188
6.8 21.080000 1.194453
6.9 21.620000 1.280415
7.1 21.300000 1.203390
7.2 23.520000 1.207236
7.3 21.170000 1.158730
7.4 20.720000 1.213115
7.6 22.800000 1.151515
7.7 23.485000 1.169186
7.9 30.020000 1.234375
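
The group-by steps in the Iris cell compute one statistic at a time. As a hedged sketch, `groupby(...).agg(...)` with named aggregation can produce several per-species statistics in one call. The small frame below uses only four rows copied from the printed output (rows 0, 1, 148 and 149, before petal length was doubled), so the numbers differ from the full-dataset results; the column names `petal_mean` and `sepal_var` are illustrative.

```python
import pandas as pd

# Four rows taken from the printed Iris output, as a stand-in for iris.csv
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.2, 5.9],
    "petal_length": [1.4, 1.4, 5.4, 5.1],
    "species": ["setosa", "setosa", "virginica", "virginica"],
})

# Named aggregation: output_column=(input_column, function)
per_species = sample.groupby("species").agg(
    petal_mean=("petal_length", "mean"),
    sepal_var=("sepal_length", "var"),
)

print(per_species)
```

Named aggregation keeps the result flat, with one clearly named column per statistic, instead of requiring a separate print for each group-by.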

Height Weight

In [6]: import pandas as pd


df=pd.read_csv("height_weight_data.csv")
print (df.describe())

Index Height (Inches) Weight (Pounds)


count 200.000000 200.000000 200.000000
mean 100.500000 67.949800 127.221950
std 57.879185 1.940363 11.960959
min 1.000000 63.430000 97.900000
25% 50.750000 66.522500 119.895000
50% 100.500000 67.935000 127.875000
75% 150.250000 69.202500 136.097500
max 200.000000 73.900000 158.960000

In [3]: !pip install matplotlib


Requirement already satisfied: matplotlib in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (3.10.0)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (1.3.1)
Requirement already satisfied: cycler>=0.10 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (4.56.0)
Requirement already satisfied: kiwisolver>=1.3.1 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (1.4.8)
Requirement already satisfied: numpy>=1.23 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (2.2.1)
Requirement already satisfied: packaging>=20.0 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (24.2)
Requirement already satisfied: pillow>=8 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (11.1.0)
Requirement already satisfied: pyparsing>=2.3.1 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (3.2.1)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from matplotlib) (2.9.0.post0)
Requirement already satisfied: six>=1.5 in c:\users\mnnbt\onedrive\desktop\stuudi\data visualization\.venv\lib\site-packages (from python-dateutil>=2.7->matplotlib) (1.17.0)
[notice] A new release of pip is available: 24.3.1 -> 25.0.1
[notice] To update, run: python.exe -m pip install --upgrade pip

In [13]: import pandas as pd


import matplotlib.pyplot as plt

# Define column names as strings


columns = ['qty', 'gender', 'day', 'time', 'smoker', 'total_bill', 'tip']

# Create a DataFrame with the defined columns


df = pd.DataFrame(columns=columns)

# Add data to the DataFrame

df = pd.DataFrame( columns=columns)
df.loc[0] = [1, 'M', 'Sunday', 12, 'Yes', 200, 18]
df.loc[1] = [2, 'F', 'Monday', 2, 'No', 300, 16]
df.loc[2] = [1, 'M', 'Sunday', 12, 'Yes', 150, 20]
df.loc[3] = [24, 'F', 'Monday', 21, 'No', 400, 11]
df.loc[4] = [2, 'F', 'Monday', 2, 'No', 250.12, 13]
df.loc[5] = [1, 'M', 'Sunday', 16, 'Yes', 320.1, 14]
df.loc[6] = [24, 'F', 'Monday', 6, 'No', 102.12, 15]
# Print the DataFrame
print(df)

# Create a scatter plot of total bill vs tip


plt.figure(figsize=(8,6))
plt.scatter(df["total_bill"], df["tip"])
plt.xlabel("Total Bill ($)")
plt.ylabel("Tip ($)")
plt.title("Total Bill vs Tip")
plt.grid(True)
#plt.show()
plt.title("Total Bill vs Tip")


plt.scatter(df["tip"],df["total_bill"],colour="green")
plt.scatter(df["qty"],df["tip"],colour="#88c999")
plt.show()

qty gender day time smoker total_bill tip


0 1 M Sunday 12 Yes 200.00 18
1 2 F Monday 2 No 300.00 16
2 1 M Sunday 12 Yes 150.00 20
3 24 F Monday 21 No 400.00 11
4 2 F Monday 2 No 250.12 13
5 1 M Sunday 16 Yes 320.10 14
6 24 F Monday 6 No 102.12 15


---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[13], line 34
30 #plt.show()
31 plt.title("Total Bill vs Tip")
---> 34 plt.scatter(df["tip"],df["total_bill"],colour="green")
35 plt.scatter(df["qty"],df["tip"],colour="#88c999")
36 plt.show()

File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\_api\deprecation.py:453, in make_keyword_only.<locals>.wrapper(*args, **kwargs)
447 if len(args) > name_idx:
448 warn_deprecated(
449 since, message="Passing the %(name)s %(obj_type)s "
450 "positionally is deprecated since Matplotlib %(since)s; the "
451 "parameter will become keyword-only in %(removal)s.",
452 name=name, obj_type=f"parameter of {func.__name__}()")
--> 453 return func(*args, **kwargs)

File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\pyplot.py:3939, in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, edgecolors, colorizer, plotnonfinite, data, **kwargs)
3919 @_copy_docstring_and_deprecators(Axes.scatter)
3920 def scatter(
3921 x: float | ArrayLike,
(...)
3937 **kwargs,
3938 ) -> PathCollection:
-> 3939 __ret = gca().scatter(
3940 x,
3941 y,
3942 s=s,
3943 c=c,
3944 marker=marker,
3945 cmap=cmap,
3946 norm=norm,
3947 vmin=vmin,
3948 vmax=vmax,
3949 alpha=alpha,
3950 linewidths=linewidths,
3951 edgecolors=edgecolors,
3952 colorizer=colorizer,
3953 plotnonfinite=plotnonfinite,
3954 **({"data": data} if data is not None else {}),
3955 **kwargs,
3956 )
3957 sci(__ret)
3958 return __ret

File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\_api\deprecation.py:453, in make_keyword_only.<locals>.wrapper(*args, **kwargs)
447 if len(args) > name_idx:
448 warn_deprecated(
449 since, message="Passing the %(name)s %(obj_type)s "
450 "positionally is deprecated since Matplotlib %(since)s; the "
451 "parameter will become keyword-only in %(removal)s.",
452 name=name, obj_type=f"parameter of {func.__name__}()")
--> 453 return func(*args, **kwargs)


File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\__init__.py:1521, in _preprocess_data.<locals>.inner(ax, data, *args, **kwargs)
1518 @functools.wraps(func)
1519 def inner(ax, *args, data=None, **kwargs):
1520 if data is None:
-> 1521 return func(
1522 ax,
1523 *map(cbook.sanitize_sequence, args),
1524 **{k: cbook.sanitize_sequence(v) for k, v in kwargs.items()})
1526 bound = new_sig.bind(ax, *args, **kwargs)
1527 auto_label = (bound.arguments.get(label_namer)
1528 or bound.kwargs.get(label_namer))

File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\axes\_axes.py:5019, in Axes.scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, edgecolors, colorizer, plotnonfinite, **kwargs)
5015 keys_str = ", ".join(f"'{k}'" for k in extra_keys)
5016 _api.warn_external(
5017 "No data for colormapping provided via 'c'. "
5018 f"Parameters {keys_str} will be ignored")
-> 5019 collection._internal_update(kwargs)
5021 # Classic mode only:
5022 # ensure there are margins to allow for the
5023 # finite size of the symbols. In v2.x, margins
5024 # are present by default, so we disable this
5025 # scatter-specific override.
5026 if mpl.rcParams['_internal.classic_mode']:

File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\artist.py:1233, in Artist._internal_update(self, kwargs)
1226 def _internal_update(self, kwargs):
1227 """
1228 Update artist properties without prenormalizing them, but generating
1229 errors as if calling `set`.
1230
1231 The lack of prenormalization is to maintain backcompatibility.
1232 """
-> 1233 return self._update_props(
1234 kwargs, "{cls.__name__}.set() got an unexpected keyword argument "
1235 "{prop_name!r}")

File c:\Users\mnnbt\OneDrive\Desktop\Stuudi\Data Visualization\.venv\Lib\site-packages\matplotlib\artist.py:1206, in Artist._update_props(self, props, errfmt)
1204 func = getattr(self, f"set_{k}", None)
1205 if not callable(func):
-> 1206 raise AttributeError(
1207 errfmt.format(cls=type(self), prop_name=k),
1208 name=k)
1209 ret.append(func(v))
1210 if ret:

AttributeError: PathCollection.set() got an unexpected keyword argument 'colour'
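
The last line of the traceback names the cause: `scatter` received a keyword `colour`, which Matplotlib does not recognise. The accepted spellings are `color` (or `c`). A minimal sketch of the two failing calls with the keyword corrected follows, using the values from the DataFrame printed above as plain lists. The `Agg` backend line is an assumption so that the sketch runs without a display; in a notebook it can be dropped and `plt.show()` called instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for a headless run
import matplotlib.pyplot as plt

# Values copied from the DataFrame printed by the cell
tip = [18, 16, 20, 11, 13, 14, 15]
total_bill = [200, 300, 150, 400, 250.12, 320.1, 102.12]
qty = [1, 2, 1, 24, 2, 1, 24]

fig, ax = plt.subplots(figsize=(8, 6))
bill_points = ax.scatter(tip, total_bill, color="green")   # 'color', not 'colour'
qty_points = ax.scatter(qty, tip, color="#88c999")
ax.set_title("Total Bill vs Tip")
```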


In [17]: import pandas as pd


# Define column names as strings
columns = ['qty', 'gender', 'day', 'time', 'smoker', 'total_bill', 'tip']

# Create a DataFrame with the defined columns


df = pd.DataFrame(columns=columns)

# Add data to the DataFrame


df.loc[0] = [1, 'M', 'Sunday', 12, 'Yes', 200, 18]
df.loc[1] = [2, 'F', 'Monday', 2, 'No', 300, 16]
df.loc[2] = [1, 'M', 'Sunday', 12, 'Yes', 150, 20]
df.loc[3] = [24, 'F', 'Monday', 21, 'No', 400, 11]
df.loc[4] = [2, 'F', 'Monday', 2, 'No', 250.12, 13]
df.loc[5] = [1, 'M', 'Sunday', 16, 'Yes', 320.1, 14]

print(df)

rw = df.iloc[5]
print(rw)

import matplotlib.pyplot as plt

plt.figure(figsize=(8,6))
plt.plot(df["total_bill"], df["tip"], marker="^", color="green", linestyle='-')
plt.xlabel("Total Bill")
plt.ylabel("Tip")
plt.title("Change in tip w.r.t. total bill")

df["five_percent_bill"] = df["total_bill"] * 0.05


print(df)


#plt.plot(df["five_percent_bill"], df["tip"], color="red", marker="+", linewidth


#plt.show()

colors=["red","yellow","green","blue","orange","black"]
sizes=[10,20,30,40,50,60]
plt.scatter(df["tip"], df["total_bill"], c=colors, s=sizes, alpha=0.7)
plt.scatter(df["qty"],df["tip"],color="#aa1199")

# Label each (qty, tip) point with its coordinates
for i, j in zip(df["qty"], df["tip"]):
    plt.text(i, j, f"({i},{j})")
plt.show()

qty gender day time smoker total_bill tip


0 1 M Sunday 12 Yes 200.00 18
1 2 F Monday 2 No 300.00 16
2 1 M Sunday 12 Yes 150.00 20
3 24 F Monday 21 No 400.00 11
4 2 F Monday 2 No 250.12 13
5 1 M Sunday 16 Yes 320.10 14
qty 1
gender M
day Sunday
time 16
smoker Yes
total_bill 320.1
tip 14
Name: 5, dtype: object
qty gender day time smoker total_bill tip five_percent_bill
0 1 M Sunday 12 Yes 200.00 18 10.000
1 2 F Monday 2 No 300.00 16 15.000
2 1 M Sunday 12 Yes 150.00 20 7.500
3 24 F Monday 21 No 400.00 11 20.000
4 2 F Monday 2 No 250.12 13 12.506
5 1 M Sunday 16 Yes 320.10 14 16.005


In [16]: import pandas as pd


import matplotlib.pyplot as plt
# New Data Frame Student Information
student_column = ['Id', 'Name', 'english_marks', 'maths_marks']
student_df = pd.DataFrame(columns=student_column)

student_df.loc[len(student_df)] = [100, 'Deepanshu', 90, 85.5]


student_df.loc[len(student_df)] = [101, 'Prathu', 80, 82]
student_df.loc[len(student_df)] = [102, 'Dhurv', 75, 87]
student_df.loc[len(student_df)] = [103, 'Prath', 79.9, 89]
student_df.loc[len(student_df)] = [104, 'Diven', 80, 90]
student_df.loc[len(student_df)] = [105, 'Madhav', 85, 79]
student_df.loc[len(student_df)] = [106, 'Manan', 87, 77]
student_df.loc[len(student_df)] = [107, 'Akash', 90, 80.5]
student_df.loc[len(student_df)] = [108, 'Geetika', 88, 83.4]
student_df.loc[len(student_df)] = [109, 'Ranbir', 92, 91.2]
student_df.loc[len(student_df)] = [110, 'Mansh', 80, 75]

#plt.bar(student_df['Id'], student_df['english_marks'], color="darkred")

plt.barh(student_df['Id'], student_df['english_marks'], color="darkblue")


plt.show()

colors=["green","blue"]

plt.hist(student_df[["english_marks", "maths_marks"]],bins=4,color=colors,alpha=

plt.legend (fontsize=10)

plt.title(label= "Marks of students", fontweight= "bold")

plt.show()


l= ['apple', 'banana', 'orange', 'pear', 'grapes']

share= [15,20,25,40,55]

fig, ex=plt.subplots()

#ex.pie(share,autopct="%1.1f%%",labels=l,colors=["red","orange","gray","brown","
ex.pie(share,radius=0.5,autopct="%1.1f%%",labels=l,colors=["red","orange","gray"
plt.show()
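
The `pie` call above is cut off at the page edge, so its full colour list is not recoverable. As a hedged sketch of the same chart, the version below keeps the labels, shares, `radius` and `autopct` from the cell and leaves the colours at Matplotlib's defaults. It also captures the return value of `pie` to read back the percentage labels. The `Agg` backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for a headless run
import matplotlib.pyplot as plt

labels = ["apple", "banana", "orange", "pear", "grapes"]
share = [15, 20, 25, 40, 55]

fig, ax = plt.subplots()
# With autopct set, pie returns the wedges, the label texts and the percentage texts
wedges, texts, autotexts = ax.pie(
    share, labels=labels, autopct="%1.1f%%", radius=0.5
)

percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
```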
