TAMIL

The document provides instructions for 4 assignments involving analyzing datasets using pandas in Python. Assignment 1 involves creating DataFrames from dictionaries, inserting rows, and setting indexes. Assignment 2 involves analyzing a world alcohol consumption dataset and selecting rows. Assignment 3 involves splitting a student DataFrame by school/class and sorting. Assignment 4 involves creating pivot tables to analyze sales data by region, manager, items. The learning outcome is to develop pandas skills in manipulating, analyzing and visualizing data.

Uploaded by

RM Vignesh

Date:

Ex. No:

Assignment – I

Aim

To develop a python code using pandas and to learn the basic concepts of
inserting, removing, rearranging, selecting and sorting information
using python code with the help of pandas.

No 1 :
Consider the dictionary “Student”:
Student = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
                    'Matthew', 'Laura', 'Kevin', 'Jonas'],
           'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
           'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
           'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}

Develop a python code using Pandas for the following:


1. Create a DataFrame with the default index.
2. Insert a row after index 3 and reset the index value.
3. Create a DataFrame with the index as
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

Python Code
1.
import numpy as np
import pandas as pd

# Exam data: one list per column
Student = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
                    'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
           'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
           'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
           'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
                       'no', 'no', 'yes']}

# Build the DataFrame; with no index given, pandas numbers the rows 0, 1, 2, ...
student = pd.DataFrame(Student, columns=['name', 'score', 'attempts', 'qualify'])
print(student)
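2.
print(student)
# Add the new row under the label 3.5, which sorts between 3 and 4
student.loc[3.5] = ['charan', 10, 3, 'yes']
# Sort by label so the row moves into place, then renumber from 0
student = student.sort_index().reset_index(drop=True)
print(student)

Department of Computer Science and Engineering
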
3.
data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James',
                 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
        'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
        'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes',
                    'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
# Use the letters as row labels instead of the default 0..9 index
df = pd.DataFrame(data, index=labels)
print(df)
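
The row insertion in part 2 relies on a fractional label (3.5) and a sort. As a hedged alternative, the same result can be sketched by slicing the DataFrame around the insertion point and joining the pieces with pd.concat. The names new_row, position and result are illustrative, and the data is shortened to five rows to keep the example small.

```python
import numpy as np
import pandas as pd

# Same columns as the exercise, shortened to five rows for readability.
student = pd.DataFrame({
    'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily'],
    'score': [12.5, 9, 16.5, np.nan, 9],
    'attempts': [1, 3, 2, 3, 2],
    'qualify': ['yes', 'no', 'yes', 'no', 'no'],
})

# The row to place after index 3 (same values as in part 2).
new_row = pd.DataFrame(
    [{'name': 'charan', 'score': 10, 'attempts': 3, 'qualify': 'yes'}]
)

# Rows 0..3 stay in front of the new row, the rest follow it.
position = 4
result = pd.concat(
    [student.iloc[:position], new_row, student.iloc[position:]],
    ignore_index=True,  # renumber the index 0, 1, 2, ...
)
print(result)
```

This avoids turning the index into floats along the way, at the cost of building a one-row DataFrame for the new record.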

Output


No 2 : Consider the World alcohol consumption dataset containing Year, WHO
region (the place where the beverage was produced), Country (the place where the
beverage was consumed), Beverage Types, Display Value (the average
consumption of beverages per person). (Dataset is available in LMS)
Develop a python code using Pandas for the following:


1. Display the dimensions or shape of the World alcohol consumption dataset after
converting the csv file to a DataFrame.
2. Select the first 10 rows with the last column from the DataFrame.
3. Create a random sample with replacement.

1.
import pandas as pd
# World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')
# shape is a (rows, columns) tuple; size would only give rows * columns
data = w_a_con.shape
print("shape", data)
2.
import pandas as pd
# World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')

# First 10 rows of the last column ('Display Value' in the column list above)
print(w_a_con.iloc[:10, -1])
3.
# Import libraries
import pandas as pd

# Load the same dataset as in parts 1 and 2
df = pd.read_csv('world_alcohol.csv')
# Column names as listed in the exercise statement
columns = ['Year', 'WHO region', 'Country', 'Beverage Types', 'Display Value']
df = df.loc[:, columns]
df = df.head(15)
# With replace=True a row may be drawn more than once
print(df.sample(n=15, replace=True, random_state=2))
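
Since the LMS dataset is not reproduced here, the three operations can be tried on a small stand-in table. The sketch below uses made-up rows that only mimic the column layout described in the exercise; the name toy and all values in it are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; only the column names follow the exercise statement.
toy = pd.DataFrame({
    'Year': [1986, 1986, 1985, 1986, 1987],
    'WHO region': ['Western Pacific', 'Americas', 'Africa', 'Americas', 'Africa'],
    'Country': ['A', 'B', 'C', 'D', 'E'],
    'Beverage Types': ['Wine', 'Other', 'Wine', 'Beer', 'Beer'],
    'Display Value': [0.0, 0.5, 1.62, 4.27, 1.98],
})

# shape is a (rows, columns) tuple, size is their product.
rows, cols = toy.shape
print(rows, cols, toy.size)

# First 3 rows, last column only (positional selection).
first_three_last_col = toy.iloc[:3, -1]
print(first_three_last_col)

# Sampling with replacement may return the same row more than once,
# so n is allowed to exceed the number of rows.
sample = toy.sample(n=8, replace=True, random_state=2)
print(sample)
```

Drawing 8 rows from a 5-row table is only possible with replace=True; with the default replace=False pandas raises an error when n is larger than the table.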
OUTPUT:


No 3:
Consider the DataFrame "Student" containing School_Code, Class, Name, Date_of_Birth, Age,
Height and Weight. The dataset is given below:

School_Code  Class  Name            Date_of_Birth  Age  Height  Weight
s001         V      Alberto Franco  15/05/2002     12   173     35
s002         V      Gino Mcneill    17/05/2002     12   192     32
s003         VI     Ryan Parkes     16/02/1999     13   186     33
s001         VI     Eesha Hinton    25/09/1998     13   167     30
s002         V      Gino Mcneill    11/05/2002     14   151     31
s004         VI     David Parkes    15/09/1997     12   159     32

Develop a python code using Pandas for the following:

1. Split the DataFrame by school code and get the mean, min, and max value of age for each school.
2. Split the dataframe into groups based on school code and class.
3. Sort the DataFrame permanently in ascending order based on the student name.
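
The solution code for this exercise uses a DataFrame named student without showing how it is created. A minimal sketch of building it from the table above follows; parsing Date_of_Birth into real dates is an optional extra step and not something the exercise asks for, and age_stats is an illustrative name.

```python
import pandas as pd

# Rows copied from the table in the exercise.
student = pd.DataFrame({
    'School_Code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'Class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'Name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes',
             'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'Date_of_Birth': ['15/05/2002', '17/05/2002', '16/02/1999',
                      '25/09/1998', '11/05/2002', '15/09/1997'],
    'Age': [12, 12, 13, 13, 14, 12],
    'Height': [173, 192, 186, 167, 151, 159],
    'Weight': [35, 32, 33, 30, 31, 32],
})

# Optional: turn the day/month/year strings into datetime values.
student['Date_of_Birth'] = pd.to_datetime(student['Date_of_Birth'],
                                          format='%d/%m/%Y')

# Split by school and summarise the age within each group.
age_stats = student.groupby('School_Code')['Age'].agg(['mean', 'min', 'max'])
print(age_stats)
```

groupby splits the rows by the key column and agg computes each listed statistic once per group, which is what task 1 asks for.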


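1.
# student is the DataFrame described in the table above
# Mean, min and max age computed separately for each school
print(student.groupby('School_Code')['Age'].agg(['mean', 'min', 'max']))
2.
# One group per (school code, class) combination
group = student.groupby(['School_Code', 'Class'])
for key, rows in group:
    print(key)
    print(rows)
3.
# inplace=True sorts student itself and returns None, so print afterwards
student.sort_values(by='Name', ascending=True, inplace=True)
print(student.head())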
OUTPUT:

No 4: Consider the DataFrame "Sales" containing OrderDate, Region, Manager, SalesMan, Item,
Units, Unit_price and Sale_amt. (Dataset is available in LMS)

Develop a python code using Pandas for the following:

1. Create a Pivot table and find the total sale amount region wise and manager wise.
2. Create a Pivot table and find the manager wise count of sales and mean value of sale amount.
3. Create a Pivot table and find the maximum and minimum sale value of the items.


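1.
import pandas as pd

# Raw string so the backslash in the Windows path is taken literally
df = pd.read_excel(r'E:\SaleData.xlsx')
# Total sale amount for each Region/Manager pair, plus a grand total row
table = pd.pivot_table(df, index=["Region", "Manager"], values="Sale_amt",
                       aggfunc="sum", fill_value=0, margins=True)
print(table)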
2.
import pandas as pd
df = pd.read_excel(r'E:\SaleData.xlsx')
# Number of sales and mean sale amount for each manager
table = pd.pivot_table(df, index="Manager", values="Sale_amt",
                       aggfunc=["count", "mean"])
print(table)
3.
import pandas as pd
df = pd.read_excel(r'E:\SaleData.xlsx')
# Largest and smallest sale amount recorded for each item
table = pd.pivot_table(df, index='Item', values='Sale_amt',
                       aggfunc=['max', 'min'])
print(table)
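
The Excel file from the LMS is not reproduced here, so the three pivot tables can be checked against a few hand-made rows instead. In the sketch below only the column names come from the exercise; the name sales and every value in it are made up for illustration.

```python
import pandas as pd

# Made-up sales rows; only the column names follow the exercise.
sales = pd.DataFrame({
    'Region': ['East', 'East', 'West', 'West', 'East'],
    'Manager': ['Martha', 'Martha', 'Douglas', 'Douglas', 'Douglas'],
    'Item': ['Pen', 'Desk', 'Pen', 'Pen', 'Desk'],
    'Sale_amt': [100.0, 400.0, 150.0, 50.0, 300.0],
})

# 1. Total sale amount per (Region, Manager) pair.
total = pd.pivot_table(sales, index=['Region', 'Manager'],
                       values='Sale_amt', aggfunc='sum')

# 2. Number of sales and mean sale amount per manager.
per_manager = pd.pivot_table(sales, index='Manager',
                             values='Sale_amt', aggfunc=['count', 'mean'])

# 3. Largest and smallest sale per item.
per_item = pd.pivot_table(sales, index='Item',
                          values='Sale_amt', aggfunc=['max', 'min'])

print(total, per_manager, per_item, sep='\n\n')
```

When aggfunc is a list, the resulting columns are two-level: the function name on top and the value column beneath it, for example ('mean', 'Sale_amt').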

Output:


Learning Outcomes
I have learnt to develop a python code using pandas and the basic
concepts of inserting, removing, rearranging, selecting and sorting
information using python code with the help of pandas.
