Data Frame

The document summarizes key pandas DataFrame operations in Python. It describes how to create DataFrames from lists or dictionaries, select columns and rows, add and delete columns, and append and drop rows. DataFrames are two-dimensional data structures that allow easy storage and manipulation of tabular data with rows and columns.

Uploaded by

Ayaan Saleem

Python Tutorial

Pandas Dataframe
This article describes the pandas.DataFrame data structure: how to create one, how
its index works, and how to add and delete rows and columns.
In short: it’s a two-dimensional data structure (like a table) with rows and columns.
Related course: Data Analysis with Python Pandas
Create DataFrame
What is a Pandas DataFrame
Pandas is a data manipulation module. Its DataFrame lets you easily store and
manipulate tabular data (rows and columns) in Python.
A dataframe can be created from a list (see below), or from a dictionary or numpy
array (see bottom).
Create DataFrame from list
You can turn a single list into a pandas dataframe:
import pandas as pd
data = [1,2,3]
df = pd.DataFrame(data)

The contents of the dataframe are then:
>>> df
   0
0  1
1  2
2  3
>>>
To the left of the contents, you’ll see that every row has an index (0,1,2).
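As a small sketch of where those numbers come from (assuming the same single-list dataframe as above), you can ask the dataframe for its row and column labels directly:

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3])

# Row labels (the index) default to 0, 1, 2.
print(list(df.index))    # [0, 1, 2]

# The single unnamed column gets the default label 0.
print(list(df.columns))  # [0]

# The values can be pulled back out as a plain list.
print(df[0].tolist())    # [1, 2, 3]
```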
This works for tables (lists of lists) too:
import pandas as pd
data = [['Axel',32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data,columns=['Name','Age'])

This outputs:
>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45
>>>
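If you want to check the size of such a table without printing it, a minimal sketch (using the same example data) could look like this:

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# shape is (number of rows, number of columns)
print(df.shape)          # (3, 2)

# the column names, in order
print(list(df.columns))  # ['Name', 'Age']

# one dtype per column; the exact dtype names depend on your pandas version
print(df.dtypes)
```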

Columns
Select column
To select a column, you can use the column name.
Step 1: Create frame:
>>> df = pd.DataFrame(data,columns=['Name','Age'])
>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45

Step 2: Select by column name:


>>> df['Name']
0     Axel
1    Alice
2     Alex
Name: Name, dtype: object
>>> df['Age']
0    32
1    26
2    45
Name: Age, dtype: int64
>>>
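Selecting with a single name gives back a Series, while selecting with a list of names gives back a DataFrame. A short sketch with the same example data (the variable names `ages` and `subset` are just for illustration):

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

ages = df['Age']      # single label -> Series
subset = df[['Age']]  # list of labels -> DataFrame

print(type(ages).__name__)    # Series
print(type(subset).__name__)  # DataFrame
print(ages.tolist())          # [32, 26, 45]
```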

Column Addition
You can add a column to a dataframe. So this:
>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45

Becomes this:
>>> df
    Name  Age  Example
0   Axel   32        1
1  Alice   26        2
2   Alex   45        3
>>>

Here’s how to do that:


Step 1: Create the dataframe
>>> data = [['Axel',32], ['Alice', 26], ['Alex', 45]]
>>> df = pd.DataFrame(data,columns=['Name','Age'])
>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45

Step 2: Create a new dataframe that holds the new column:
>>> c = pd.DataFrame([1,2,3], columns=['Example'])

Step 3: Assign the column of the newly created dataframe to a new column of your dataframe:
>>> df['Example'] = c['Example']
>>> df
    Name  Age  Example
0   Axel   32        1
1  Alice   26        2
2   Alex   45        3
>>>
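A shorter route, sketched here with the same example data, is to assign a list directly. When you assign a Series (or a column of another dataframe) instead, pandas matches the values on index labels rather than on position; the `shifted` Series and the `Aligned` column below are made up to show that.

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# A list of the right length can be assigned directly.
df['Example'] = [1, 2, 3]

# A Series is matched on index labels, not on position.
shifted = pd.Series([10, 20, 30], index=[2, 1, 0])
df['Aligned'] = shifted

print(df['Example'].tolist())  # [1, 2, 3]
print(df['Aligned'].tolist())  # [30, 20, 10]
```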

Column deletion
To delete a column, you can use the keyword del.

The original dataframe:


>>> df
    Name  Age  Example
0   Axel   32        1
1  Alice   26        2
2   Alex   45        3

Then delete it:


>>> del df['Example']

And it will delete that column:


>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45
>>>
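The del keyword changes the dataframe in place. If you would rather keep the original and get a new dataframe without the column, one option is .drop(columns=...), sketched below with the same example data (the name `trimmed` is just for illustration):

```python
import pandas as pd

data = [['Axel', 32, 1], ['Alice', 26, 2], ['Alex', 45, 3]]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Example'])

# drop() returns a new dataframe; df itself is left untouched
trimmed = df.drop(columns=['Example'])

print(list(trimmed.columns))  # ['Name', 'Age']
print(list(df.columns))       # ['Name', 'Age', 'Example']
```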

Rows
Select row
You can select a row using .loc[label].
>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45
>>> df.loc[0]
Name    Axel
Age       32
Name: 0, dtype: object
>>> df.loc[2]
Name    Alex
Age       45
Name: 2, dtype: object
>>>

You can select by integer position too, using .iloc[index].
>>> df.iloc[0]
Name    Axel
Age       32
Name: 0, dtype: object
>>>
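With the default index, .loc[0] and .iloc[0] return the same row, so the difference is easy to miss. A small sketch, assuming the same data but with made-up string labels 'a', 'b', 'c' as the index, shows the two apart:

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data, columns=['Name', 'Age'], index=['a', 'b', 'c'])

# .loc uses the labels of the index
print(df.loc['b', 'Name'])  # Alice

# .iloc uses integer positions (row 1, column 0)
print(df.iloc[1, 0])        # Alice

# negative positions count from the end, like in a list
print(df.iloc[-1, 0])       # Alex
```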

Append row
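You can append a row by calling the .append() method on the dataframe. Note that
.append() was deprecated in pandas 1.4 and removed in pandas 2.0; on newer versions
use pd.concat() instead.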
First create a new dataframe:
>>> user = pd.DataFrame([['Vivian',33]], columns=['Name','Age'])

Then add it to the existing dataframe:


>>> df = df.append(user)
>>> df
     Name  Age
0    Axel   32
1   Alice   26
2    Alex   45
0  Vivian   33
>>>
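On pandas versions without .append(), the same result can be sketched with pd.concat(). The ignore_index=True argument is optional; it renumbers the rows so that the new row does not repeat the label 0 (the name `combined` is just for illustration):

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
user = pd.DataFrame([['Vivian', 33]], columns=['Name', 'Age'])

# ignore_index=True renumbers the rows 0..3 instead of keeping 0, 1, 2, 0
combined = pd.concat([df, user], ignore_index=True)

print(list(combined.index))       # [0, 1, 2, 3]
print(combined['Name'].tolist())  # ['Axel', 'Alice', 'Alex', 'Vivian']
```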

Delete row
To delete a row, you can use the method .drop(index).
Start by creating a frame:
>>> data = [['Axel',32], ['Alice', 26], ['Alex', 45]]
>>> df = pd.DataFrame(data,columns=['Name','Age'])
>>> df
    Name  Age
0   Axel   32
1  Alice   26
2   Alex   45

Let’s delete the first row:


>>> df = df.drop(0)
>>> df
    Name  Age
1  Alice   26
2   Alex   45
>>>
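Notice that the remaining rows keep their old labels (1 and 2). If you want the index to start from zero again, one way is .reset_index(drop=True), sketched here with the same example data:

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

df = df.drop(0)
print(list(df.index))     # [1, 2]

# drop=True throws the old labels away instead of keeping them as a column
df = df.reset_index(drop=True)
print(list(df.index))     # [0, 1]
print(df.loc[0, 'Name'])  # Alice
```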

DataFrame creation
Create DataFrame from dictionary
If you have a dictionary, you can turn it into a dataframe.
>>> import pandas as pd
>>> d = {'one':[1,2,3], 'two':[2,3,4], 'three':[3,4,5] }
>>> df = pd.DataFrame(d)
>>> df
   one  two  three
0    1    2      3
1    2    3      4
2    3    4      5
>>>

The keys in the dictionary become the columns of the DataFrame. The dictionary has
no values for the index, so by default the index counts from zero. You can also set
it yourself:
>>> df = pd.DataFrame(d, index=['first','second','third'])
>>> df
        one  two  three
first     1    2      3
second    2    3      4
third     3    4      5
>>>
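Once the index has labels of its own, you can use them with .loc to look up rows or single values. A short sketch with the same dictionary (the names `default` and `labelled` are just for illustration):

```python
import pandas as pd

d = {'one': [1, 2, 3], 'two': [2, 3, 4], 'three': [3, 4, 5]}

default = pd.DataFrame(d)
labelled = pd.DataFrame(d, index=['first', 'second', 'third'])

print(list(default.index))            # [0, 1, 2]
print(list(labelled.index))           # ['first', 'second', 'third']

# row label first, then column name
print(labelled.loc['second', 'two'])  # 3
```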

Create DataFrame from array


An array (numpy array) can be converted into a dataframe too.
>>> import numpy as np
>>> ar = np.array([[1,2,3],[4,5,6],[6,7,8]])
>>> ar
array([[1, 2, 3],
       [4, 5, 6],
       [6, 7, 8]])

Then turn it into a dataframe with the line:


>>> df = pd.DataFrame(ar)
>>> df
   0  1  2
0  1  2  3
1  4  5  6
2  6  7  8
>>>

When you create a DataFrame from a multi-dimensional array, you can assign the index
and columns yourself; otherwise you get the default numeric labels, which are not
very readable.
>>> df = pd.DataFrame(ar, index=['A','B','C'], columns=['One','Two','Three'])
>>> df
   One  Two  Three
A    1    2      3
B    4    5      6
C    6    7      8
>>>
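The labels do not have to be given at creation time. As a sketch with the same array, you can also assign them afterwards through the index and columns attributes:

```python
import numpy as np
import pandas as pd

ar = np.array([[1, 2, 3], [4, 5, 6], [6, 7, 8]])

df = pd.DataFrame(ar)  # default labels 0, 1, 2
df.index = ['A', 'B', 'C']
df.columns = ['One', 'Two', 'Three']

print(list(df.index))      # ['A', 'B', 'C']
print(list(df.columns))    # ['One', 'Two', 'Three']
print(df.loc['B', 'Two'])  # 5
```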

Create from DataFrame


You can copy parts of a dataframe into a new dataframe.
Using the dataframe above:
>>> df2 = df[['One','Two']].copy()
>>> df2
   One  Two
A    1    2
B    4    5
C    6    7
>>>
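The point of .copy() is that the new dataframe is independent of the old one. A minimal sketch, using the same values as above, changes the copy and checks that the original is untouched:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [6, 7, 8]],
                  index=['A', 'B', 'C'],
                  columns=['One', 'Two', 'Three'])

df2 = df[['One', 'Two']].copy()
df2.loc['A', 'One'] = 100   # change the copy only

print(df2.loc['A', 'One'])  # 100
print(df.loc['A', 'One'])   # 1
```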

Create from CSV


If you have a csv file (Google Sheets can save as csv), you can load it like this:
# Import pandas as pd
import pandas as pd
# Import the cats.csv data: cats
cats = pd.read_csv('cats.csv')
# Print out cats
print(cats)
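If you do not have a cats.csv file at hand, you can try read_csv on text held in memory. The sketch below uses io.StringIO and a made-up two-column file; the column names and values are assumptions for illustration, not the contents of the real cats.csv.

```python
import io
import pandas as pd

# Hypothetical stand-in for cats.csv (the real file's columns are not given)
csv_text = "name,age\nTom,3\nFelix,5\n"

# read_csv accepts a file path or a file-like object
cats = pd.read_csv(io.StringIO(csv_text))

print(list(cats.columns))  # ['name', 'age']
print(cats.shape)          # (2, 2)
```

The first line of the file is used for the column names, and each following line becomes one row.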

© 2021 https://pythonbasics.org
