Python For AI - Part2

The document provides a comprehensive guide on how to create and manipulate a Pandas DataFrame in Python, including creating a DataFrame, accessing and modifying rows and columns, adding and removing data, and handling missing values. It also covers searching for specific data and exporting DataFrames to and from CSV files. The examples demonstrate various operations using a dataset of doctors with attributes such as ID, name, age, fees, city, and assets.

Uploaded by

sunny.shah6498

1. creating a dataframe

>>> import pandas as pd

>>> doctor={"did":[101,102,103,104,105,106,107,108,109,110],
... "dname":["ravi saxena","anil ram","ruby roy","sonu das","ram prakash","sohini kaur","john","michel","mihir pal","md adil"],
... "age":[45,23,43,54,39,38,43,42,38,40],
... "fees":[500,1200,700,800,500,900,1500,700,800,1000],
... "city":["delhi","mumbai","delhi","kolkata","jaipur","delhi","lucknow","mumbai","delhi","agra"],
... "assets":[500000,78900,432000,76890,347654,890000,592000,100000,23000,450030]}

>>> df=pd.DataFrame(doctor)

>>> print(df)

   did        dname  age  fees     city  assets
0  101  ravi saxena   45   500    delhi  500000
1  102     anil ram   23  1200   mumbai   78900
2  103     ruby roy   43   700    delhi  432000
3  104     sonu das   54   800  kolkata   76890
4  105  ram prakash   39   500   jaipur  347654
5  106  sohini kaur   38   900    delhi  890000
6  107         john   43  1500  lucknow  592000
7  108       michel   42   700   mumbai  100000
8  109    mihir pal   38   800    delhi   23000
9  110      md adil   40  1000     agra  450030
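
A quick way to confirm that the dataframe was built as intended is to look at its shape, column names, and dtypes. The sketch below uses a shortened three-row version of the doctor dictionary; that shortening is an assumption made only to keep the example small.

```python
import pandas as pd

# Shortened version of the doctor dictionary (illustration only)
doctor = {
    "did": [101, 102, 103],
    "dname": ["ravi saxena", "anil ram", "ruby roy"],
    "fees": [500, 1200, 700],
}
df = pd.DataFrame(doctor)

n_rows, n_cols = df.shape          # (number of rows, number of columns)
column_names = list(df.columns)    # dictionary keys, in insertion order
print(n_rows, n_cols)
print(column_names)
print(df.dtypes)                   # one dtype per column
```

Each dictionary key becomes a column, and every list must have the same length.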

2. accessing a column

>>> print(df["dname"])

0    ravi saxena
1       anil ram
2       ruby roy
3       sonu das
4    ram prakash
5    sohini kaur
6           john
7         michel
8      mihir pal
9        md adil
Name: dname, dtype: object
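
Selecting with a single label, as above, gives a Series. Passing a list of labels gives a DataFrame instead. A minimal sketch on a shortened copy of the data:

```python
import pandas as pd

df = pd.DataFrame({
    "did": [101, 102, 103],
    "dname": ["ravi saxena", "anil ram", "ruby roy"],
    "fees": [500, 1200, 700],
})

one_column = df["dname"]               # single label -> Series
two_columns = df[["dname", "fees"]]    # list of labels -> DataFrame
print(type(one_column).__name__)
print(two_columns)
```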

3. accessing a row

>>> print(df.loc[0])

did               101
dname     ravi saxena
age                45
fees              500
city            delhi
assets         500000
Name: 0, dtype: object

4. accessing multiple rows

>>> print(df.loc[[0,3]])

   did        dname  age  fees     city  assets
0  101  ravi saxena   45   500    delhi  500000
3  104     sonu das   54   800  kolkata   76890
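
loc selects rows by index label. When the labels are not simply 0, 1, 2, ..., selection by position is done with iloc. The sketch below assumes a small dataframe with string labels to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"did": [101, 102, 103], "fees": [500, 1200, 700]},
    index=["doc1", "doc2", "doc3"],
)

by_label = df.loc["doc2"]      # row whose index label is "doc2"
by_position = df.iloc[1]       # second row, whatever its label is
first_two = df.iloc[0:2]       # positional slice, end is excluded
print(by_label)
print(first_two)
```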

5. adding a column

>>> df["contact"]=[9787879781,9564656735,7908980981,1,1,1,1,1,1,1]

>>> print(df)

   did        dname  age  fees     city  assets     contact
0  101  ravi saxena   45   500    delhi  500000  9787879781
1  102     anil ram   23  1200   mumbai   78900  9564656735
2  103     ruby roy   43   700    delhi  432000  7908980981
3  104     sonu das   54   800  kolkata   76890           1
4  105  ram prakash   39   500   jaipur  347654           1
5  106  sohini kaur   38   900    delhi  890000           1
6  107         john   43  1500  lucknow  592000           1
7  108       michel   42   700   mumbai  100000           1
8  109    mihir pal   38   800    delhi   23000           1
9  110      md adil   40  1000     agra  450030           1
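
A new column can also be computed from an existing one, and a list of values must have exactly one entry per row. The sketch below illustrates both points; the column name assets_in_lakh is hypothetical and not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "dname": ["ravi saxena", "anil ram", "ruby roy"],
    "fees": [500, 1200, 700],
    "assets": [500000, 78900, 432000],
})

# Derived column: element-wise arithmetic on an existing column
df["assets_in_lakh"] = df["assets"] / 100000   # hypothetical column name

# A list with the wrong number of values is rejected
length_error = None
try:
    df["contact"] = [9787879781, 9564656735]   # 2 values for 3 rows
except ValueError as err:
    length_error = str(err)

print(df)
print(length_error)
```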

6. adding a row

>>> df.loc[10]=[111,"rana",38,900,"delhi",678234,9957676910]

>>> print(df)

    did        dname  age  fees     city  assets     contact
0   101  ravi saxena   45   500    delhi  500000  9787879781
1   102     anil ram   23  1200   mumbai   78900  9564656735
2   103     ruby roy   43   700    delhi  432000  7908980981
3   104     sonu das   54   800  kolkata   76890           1
4   105  ram prakash   39   500   jaipur  347654           1
5   106  sohini kaur   38   900    delhi  890000           1
6   107         john   43  1500  lucknow  592000           1
7   108       michel   42   700   mumbai  100000           1
8   109    mihir pal   38   800    delhi   23000           1
9   110      md adil   40  1000     agra  450030           1
10  111         rana   38   900    delhi  678234  9957676910
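
Assigning to a new label with loc is one way to add a row. Another option, sketched here on a shortened dataframe, is to build a one-row dataframe and join it with pd.concat.

```python
import pandas as pd

df = pd.DataFrame({
    "did": [101, 102],
    "dname": ["ravi saxena", "anil ram"],
    "fees": [500, 1200],
})

# One-row dataframe with the same column names
new_row = pd.DataFrame([{"did": 111, "dname": "rana", "fees": 900}])

# ignore_index=True renumbers the result 0..n-1
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```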

7. updating a row

>>> df.loc[2]=[111,"rana",38,900,"delhi",678234,9957676910]

>>> print(df)

    did        dname  age  fees     city  assets     contact
0   101  ravi saxena   45   500    delhi  500000  9787879781
1   102     anil ram   23  1200   mumbai   78900  9564656735
2   111         rana   38   900    delhi  678234  9957676910
3   104     sonu das   54   800  kolkata   76890           1
4   105  ram prakash   39   500   jaipur  347654           1
5   106  sohini kaur   38   900    delhi  890000           1
6   107         john   43  1500  lucknow  592000           1
7   108       michel   42   700   mumbai  100000           1
8   109    mihir pal   38   800    delhi   23000           1
9   110      md adil   40  1000     agra  450030           1
10  111         rana   38   900    delhi  678234  9957676910

8. removing a column without changing the dataframe (drop returns a new dataframe; df itself stays the same)

>>> df.drop("contact",axis=1)

    did        dname  age  fees     city  assets
0   101  ravi saxena   45   500    delhi  500000
1   102     anil ram   23  1200   mumbai   78900
2   111         rana   38   900    delhi  678234
3   104     sonu das   54   800  kolkata   76890
4   105  ram prakash   39   500   jaipur  347654
5   106  sohini kaur   38   900    delhi  890000
6   107         john   43  1500  lucknow  592000
7   108       michel   42   700   mumbai  100000
8   109    mihir pal   38   800    delhi   23000
9   110      md adil   40  1000     agra  450030
10  111         rana   38   900    delhi  678234

>>> print(df)

    did        dname  age  fees     city  assets     contact
0   101  ravi saxena   45   500    delhi  500000  9787879781
1   102     anil ram   23  1200   mumbai   78900  9564656735
2   111         rana   38   900    delhi  678234  9957676910
3   104     sonu das   54   800  kolkata   76890           1
4   105  ram prakash   39   500   jaipur  347654           1
5   106  sohini kaur   38   900    delhi  890000           1
6   107         john   43  1500  lucknow  592000           1
7   108       michel   42   700   mumbai  100000           1
8   109    mihir pal   38   800    delhi   23000           1
9   110      md adil   40  1000     agra  450030           1
10  111         rana   38   900    delhi  678234  9957676910
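
Because drop returns a new dataframe by default, its result can be kept in a separate variable while the original stays intact. A minimal sketch on a shortened dataframe:

```python
import pandas as pd

df = pd.DataFrame({
    "did": [101, 102],
    "dname": ["ravi saxena", "anil ram"],
    "contact": [9787879781, 9564656735],
})

# axis=1 means "columns"; the result is a new dataframe
without_contact = df.drop("contact", axis=1)

print(list(df.columns))               # original still has "contact"
print(list(without_contact.columns))  # the copy does not
```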

9. removing a column with changes in the dataframe

>>> df.drop("contact",axis=1,inplace=True)

>>> print(df)

    did        dname  age  fees     city  assets
0   101  ravi saxena   45   500    delhi  500000
1   102     anil ram   23  1200   mumbai   78900
2   111         rana   38   900    delhi  678234
3   104     sonu das   54   800  kolkata   76890
4   105  ram prakash   39   500   jaipur  347654
5   106  sohini kaur   38   900    delhi  890000
6   107         john   43  1500  lucknow  592000
7   108       michel   42   700   mumbai  100000
8   109    mihir pal   38   800    delhi   23000
9   110      md adil   40  1000     agra  450030
10  111         rana   38   900    delhi  678234

10. removing multiple columns with changes in the dataframe

>>> df.drop(["city","age"],axis=1,inplace=True)

>>> print(df)

    did        dname  fees  assets
0   101  ravi saxena   500  500000
1   102     anil ram  1200   78900
2   111         rana   900  678234
3   104     sonu das   800   76890
4   105  ram prakash   500  347654
5   106  sohini kaur   900  890000
6   107         john  1500  592000
7   108       michel   700  100000
8   109    mihir pal   800   23000
9   110      md adil  1000  450030
10  111         rana   900  678234

11. removing a row with changes in the dataframe

>>> df.drop(2,axis=0,inplace=True)

>>> print(df)

    did        dname  fees  assets
0   101  ravi saxena   500  500000
1   102     anil ram  1200   78900
3   104     sonu das   800   76890
4   105  ram prakash   500  347654
5   106  sohini kaur   900  890000
6   107         john  1500  592000
7   108       michel   700  100000
8   109    mihir pal   800   23000
9   110      md adil  1000  450030
10  111         rana   900  678234

12. removing multiple rows with changes in the dataframe

>>> df.drop([1,4],axis=0,inplace=True)
>>> print(df)

    did        dname  fees  assets
0   101  ravi saxena   500  500000
3   104     sonu das   800   76890
5   106  sohini kaur   900  890000
6   107         john  1500  592000
7   108       michel   700  100000
8   109    mihir pal   800   23000
9   110      md adil  1000  450030
10  111         rana   900  678234
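
After rows are dropped, the remaining rows keep their old index labels, so the index has gaps (0, 3, 5, ... above). If a fresh 0..n-1 index is wanted, reset_index(drop=True) is one option. A minimal sketch on a shortened dataframe:

```python
import pandas as pd

df = pd.DataFrame({
    "did": [101, 102, 104, 105],
    "dname": ["ravi saxena", "anil ram", "sonu das", "ram prakash"],
})

df = df.drop([1, 3], axis=0)        # labels 1 and 3 are removed, gaps remain
labels_after_drop = list(df.index)

df = df.reset_index(drop=True)      # drop=True discards the old labels
labels_after_reset = list(df.index)

print(labels_after_drop)
print(labels_after_reset)
```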

13. giving names to the indexes

>>> df=pd.DataFrame(doctor,index=["doc1","doc2","doc3","doc4","doc5","doc6","doc7","doc8","doc9","doc10"])

>>> print(df)

       did        dname  age  fees     city  assets
doc1   101  ravi saxena   45   500    delhi  500000
doc2   102     anil ram   23  1200   mumbai   78900
doc3   103     ruby roy   43   700    delhi  432000
doc4   104     sonu das   54   800  kolkata   76890
doc5   105  ram prakash   39   500   jaipur  347654
doc6   106  sohini kaur   38   900    delhi  890000
doc7   107         john   43  1500  lucknow  592000
doc8   108       michel   42   700   mumbai  100000
doc9   109    mihir pal   38   800    delhi   23000
doc10  110      md adil   40  1000     agra  450030

14. updating a particular value in the dataframe

>>> df.loc["doc6","fees"]=1200
>>> print(df)

       did        dname  age  fees     city  assets
doc1   101  ravi saxena   45   500    delhi  500000
doc2   102     anil ram   23  1200   mumbai   78900
doc3   103     ruby roy   43   700    delhi  432000
doc4   104     sonu das   54   800  kolkata   76890
doc5   105  ram prakash   39   500   jaipur  347654
doc6   106  sohini kaur   38  1200    delhi  890000
doc7   107         john   43  1500  lucknow  592000
doc8   108       michel   42   700   mumbai  100000
doc9   109    mihir pal   38   800    delhi   23000
doc10  110      md adil   40  1000     agra  450030

>>> # float("nan") is a real missing value; the string "NaN" would be stored as text and break numeric comparisons such as fees>100

>>> df.loc["doc7","fees"]=float("nan")

>>> print(df)

       did        dname  age    fees     city  assets
doc1   101  ravi saxena   45   500.0    delhi  500000
doc2   102     anil ram   23  1200.0   mumbai   78900
doc3   103     ruby roy   43   700.0    delhi  432000
doc4   104     sonu das   54   800.0  kolkata   76890
doc5   105  ram prakash   39   500.0   jaipur  347654
doc6   106  sohini kaur   38  1200.0    delhi  890000
doc7   107         john   43     NaN  lucknow  592000
doc8   108       michel   42   700.0   mumbai  100000
doc9   109    mihir pal   38   800.0    delhi   23000
doc10  110      md adil   40  1000.0     agra  450030

15. searching: displaying all the doctors who stay in mumbai

>>> df[df["city"]=="mumbai"]

      did     dname  age    fees    city  assets
doc2  102  anil ram   23  1200.0  mumbai   78900
doc8  108    michel   42   700.0  mumbai  100000

searching: displaying all the doctors who stay in mumbai and whose fees are 1200

>>> print(df[(df["fees"]==1200) & (df["city"]=="mumbai")])

      did     dname  age    fees    city  assets
doc2  102  anil ram   23  1200.0  mumbai   78900

searching: displaying all the doctors who stay in mumbai and whose fees are greater than 100

>>> print(df[(df["fees"]>100) & (df["city"]=="mumbai")])

      did     dname  age    fees    city  assets
doc2  102  anil ram   23  1200.0  mumbai   78900
doc8  108    michel   42   700.0  mumbai  100000
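
Each condition in these searches produces a boolean Series, and the Series are combined with & (and) or | (or). Every condition needs its own parentheses because & and | bind more tightly than == and >. The sketch below, on a five-row subset of the data, also shows isin for matching against several values.

```python
import pandas as pd

df = pd.DataFrame({
    "dname": ["ravi saxena", "anil ram", "john", "michel", "md adil"],
    "fees": [500, 1200, 1500, 700, 1000],
    "city": ["delhi", "mumbai", "lucknow", "mumbai", "agra"],
})

in_mumbai = df["city"] == "mumbai"    # boolean Series, one value per row
high_fee = df["fees"] > 1000

both = df[in_mumbai & high_fee]       # rows where both conditions hold
either = df[in_mumbai | high_fee]     # rows where at least one holds
delhi_or_agra = df[df["city"].isin(["delhi", "agra"])]

print(both)
print(either)
print(delhi_or_agra)
```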

16. transferring data from a dataframe to a csv file

>>> df.to_csv("abc.csv")

17. transferring data from a csv file to a dataframe

>>> df1=pd.read_csv("abc.csv")
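
to_csv writes the index as the first column of the file by default. Reading the file back without further options turns that column into ordinary data named "Unnamed: 0". The sketch below shows one way to restore it with index_col=0; the temporary directory is an assumption used only to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {"did": [101, 102], "fees": [500, 1200]},
    index=["doc1", "doc2"],
)

# Write into a temporary directory (assumption, any writable path works)
path = os.path.join(tempfile.mkdtemp(), "abc.csv")
df.to_csv(path)

plain = pd.read_csv(path)                   # index comes back as a data column
restored = pd.read_csv(path, index_col=0)   # first column becomes the index

print(list(plain.columns))
print(restored)
```

If the index carries no information, df.to_csv(path, index=False) leaves it out of the file instead.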

18. Checking for missing values

>>> import numpy as np

>>> df=pd.DataFrame(doctor)

>>> df.loc[2,"city"]=np.nan

>>> df.loc[7,"age"]=np.nan

>>> df.loc[2,"age"]=np.nan

>>> print(df)

   did        dname   age  fees     city  assets
0  101  ravi saxena  45.0   500    delhi  500000
1  102     anil ram  23.0  1200   mumbai   78900
2  103     ruby roy   NaN   700      NaN  432000
3  104     sonu das  54.0   800  kolkata   76890
4  105  ram prakash  39.0   500   jaipur  347654
5  106  sohini kaur  38.0   900    delhi  890000
6  107         john  43.0  1500  lucknow  592000
7  108       michel   NaN   700   mumbai  100000
8  109    mihir pal  38.0   800    delhi   23000
9  110      md adil  40.0  1000     agra  450030

>>> print(df.isnull())

     did  dname    age   fees   city  assets
0  False  False  False  False  False   False
1  False  False  False  False  False   False
2  False  False   True  False   True   False
3  False  False  False  False  False   False
4  False  False  False  False  False   False
5  False  False  False  False  False   False
6  False  False  False  False  False   False
7  False  False   True  False  False   False
8  False  False  False  False  False   False
9  False  False  False  False  False   False
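
The full True/False table is hard to scan on larger data. Summing it per column gives a count of missing values, and any(axis=1) picks out the rows that contain at least one. A minimal sketch on a three-row subset:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "dname": ["ravi saxena", "ruby roy", "michel"],
    "age": [45.0, np.nan, np.nan],
    "city": ["delhi", np.nan, "mumbai"],
})

# True counts as 1, so the sum is the number of missing values per column
missing_per_column = df.isnull().sum()

# Rows with at least one missing value
rows_with_missing = df[df.isnull().any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```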

19. Filling the missing values (NaN) by zero

>>> df.fillna(0)

   did        dname   age  fees     city  assets
0  101  ravi saxena  45.0   500    delhi  500000
1  102     anil ram  23.0  1200   mumbai   78900
2  103     ruby roy   0.0   700        0  432000
3  104     sonu das  54.0   800  kolkata   76890
4  105  ram prakash  39.0   500   jaipur  347654
5  106  sohini kaur  38.0   900    delhi  890000
6  107         john  43.0  1500  lucknow  592000
7  108       michel   0.0   700   mumbai  100000
8  109    mihir pal  38.0   800    delhi   23000
9  110      md adil  40.0  1000     agra  450030
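
Filling everything with zero also puts a 0 into the city column, as row 2 shows. fillna accepts a dictionary so that each column gets its own replacement. In the sketch below the choice of the mean age and the placeholder text "unknown" are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "dname": ["ravi saxena", "ruby roy", "sonu das"],
    "age": [45.0, np.nan, 54.0],
    "city": ["delhi", np.nan, "kolkata"],
})

# One replacement value per column; mean() skips NaN by default
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(filled)
print(df)   # fillna returned a new dataframe, df still has its NaN values
```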

20. Deleting the rows with missing values (NaN)

>>> df.loc[2,"city"]=np.nan

>>> df.loc[7,"age"]=np.nan

>>> df.loc[2,"age"]=np.nan

>>> df.dropna()

   did        dname   age  fees     city  assets
0  101  ravi saxena  45.0   500    delhi  500000
1  102     anil ram  23.0  1200   mumbai   78900
3  104     sonu das  54.0   800  kolkata   76890
4  105  ram prakash  39.0   500   jaipur  347654
5  106  sohini kaur  38.0   900    delhi  890000
6  107         john  43.0  1500  lucknow  592000
8  109    mihir pal  38.0   800    delhi   23000
9  110      md adil  40.0  1000     agra  450030
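
dropna() with no arguments removes a row if any of its columns is missing. When only certain columns matter, the subset parameter restricts the check to those columns. A minimal sketch on a three-row subset:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "dname": ["ravi saxena", "ruby roy", "michel"],
    "age": [45.0, np.nan, np.nan],
    "city": ["delhi", np.nan, "mumbai"],
})

any_missing_removed = df.dropna()                 # drops rows 1 and 2
city_required = df.dropna(subset=["city"])        # drops only row 1

print(any_missing_removed)
print(city_required)
```

Like drop and fillna, dropna returns a new dataframe and leaves df unchanged unless the result is assigned back.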
