
Data analysis using Python Pandas


In this tutorial, we are going to look at data analysis with the Python pandas library. pandas is written mostly in Python, with its performance-critical parts implemented in Cython and C, so speed is rarely a problem. It is one of the most widely used libraries for data analysis. pandas has two main data structures: Series and DataFrame. Let's look at them one by one.

1. Series

A Series is a one-dimensional labeled array: each value is paired with an index label. We can create a Series object using the pandas.Series(data, index) class. Series accepts a scalar (such as an integer), a list, or a dictionary as data. Let's see some examples.

Example

# importing the pandas library
import pandas as pd
# data
data = [1, 2, 3]
# creating Series object
# Series automatically takes the default index
series = pd.Series(data)
print(series)

Output

If you run the above program, you will get the following result.

0    1
1    2
2    3
dtype: int64

How do we set a customized index? Pass the labels as the second argument, as in the following example.

Example

# importing the pandas library
import pandas as pd
# data
data = [1, 2, 3]
# index
index = ['a', 'b', 'c']
# creating Series object
series = pd.Series(data, index)
print(series)

Output

If you run the above program, you will get the following result.

a    1
b    2
c    3
dtype: int64
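
Series also accepts a single scalar value as data. The following is a minimal sketch, assuming we want the same value stored under every label; the value 5 and the labels are only illustrative.

```python
# importing the pandas library
import pandas as pd
# a scalar is repeated once for each index label
series = pd.Series(5, index=['a', 'b', 'c'])
print(series)
```

With scalar data the index decides the length of the Series, so it should be passed explicitly.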

When we pass a dictionary as data to the Series class, it uses the keys as the index and the values as the data. Let's see an example.

Example

# importing the pandas library
import pandas as pd
# data
data = {'a':97, 'b':98, 'c':99}
# creating Series object
series = pd.Series(data)
print(series)

Output

If you run the above program, you will get the following results.

a    97
b    98
c    99
dtype: int64
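
A dictionary and an index can also be passed together. The sketch below assumes we want only some of the keys, in a different order; the label 'z' is a made-up key that is not in the dictionary, included to show what happens to a missing label.

```python
# importing the pandas library
import pandas as pd
# data
data = {'a': 97, 'b': 98, 'c': 99}
# each label is looked up in the dictionary
# 'z' has no matching key, so its value becomes NaN
series = pd.Series(data, index=['c', 'a', 'z'])
print(series)
```

Because NaN is a floating-point value, the resulting Series holds floats rather than integers.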

We can access the data in a Series using its index labels. Let's see an example.

Example

# importing the pandas library
import pandas as pd
# data
data = {'a':97, 'b':98, 'c':99}
# creating Series object
series = pd.Series(data)
# accessing the data from the Series using indexes
print(series['a'], series['b'], series['c'])

Output

If you run the above code, you will get the following results.

97 98 99
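
Besides square brackets with a label, pandas provides the .loc accessor for label-based access and the .iloc accessor for position-based access. The following sketch reuses the same data; the variable names are only illustrative.

```python
# importing the pandas library
import pandas as pd
# creating Series object
series = pd.Series({'a': 97, 'b': 98, 'c': 99})
# access by label
by_label = series.loc['b']
# access by integer position (0 is the first element)
by_position = series.iloc[0]
# a list of labels returns a smaller Series
subset = series[['a', 'c']]
print(by_label, by_position)
print(subset)
```

Using .loc and .iloc makes it explicit whether a label or a position is meant, which helps when the index itself contains integers.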

2. DataFrame

We have seen how to use the Series class in pandas. Now let's see how to use the DataFrame class. A DataFrame is a two-dimensional data structure in pandas that contains rows and columns.

We can create DataFrame objects from lists, dictionaries, Series, and so on. Let's create a DataFrame using lists.

Example

# importing the pandas library
import pandas as pd
# lists
names = ['Tutorialspoint', 'Mohit', 'Sharma']
ages = [25, 32, 21]
# creating a DataFrame
data_frame = pd.DataFrame({'Name': names, 'Age': ages})
# printing the DataFrame
print(data_frame)

Output

If you run the above program, you will get the following results.

             Name  Age
0  Tutorialspoint   25
1           Mohit   32
2          Sharma   21

Let's see how to create a DataFrame object from Series objects.

Example

# importing the pandas library
import pandas as pd
# Series
_1 = pd.Series([1, 2, 3])
_2 = pd.Series([1, 4, 9])
_3 = pd.Series([1, 8, 27])
# creating a DataFrame
data_frame = pd.DataFrame({"a":_1, "b":_2, "c":_3})
# printing the DataFrame
print(data_frame)

Output

If you run the above code, you will get the following results.

   a  b   c
0  1  1   1
1  2  4   8
2  3  9  27
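
A DataFrame can also be built row by row from a list of dictionaries, where each dictionary is one row and its keys become the column names. This is a minimal sketch; the rows reuse sample values from the earlier example.

```python
# importing the pandas library
import pandas as pd
# each dictionary is one row
rows = [
    {'Name': 'Mohit', 'Age': 32},
    {'Name': 'Sharma', 'Age': 21},
]
# creating a DataFrame
data_frame = pd.DataFrame(rows)
# printing the DataFrame and its (rows, columns) shape
print(data_frame)
print(data_frame.shape)
```

This form is convenient when the data arrives one record at a time, for example from a file parsed line by line.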

We can access the data in a DataFrame using a column name. Let's see an example.

Example

# importing the pandas library
import pandas as pd
# Series
_1 = pd.Series([1, 2, 3])
_2 = pd.Series([1, 4, 9])
_3 = pd.Series([1, 8, 27])
# creating a DataFrame
data_frame = pd.DataFrame({"a":_1, "b":_2, "c":_3})
# accessing the entire column with name 'a'
print(data_frame['a'])

Output

If you run the above code, you will get the following results.

0    1
1    2
2    3
Name: a, dtype: int64
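
Column access is only one way to read a DataFrame. The sketch below, which assumes the same values as the example above but builds the columns from plain lists for brevity, shows how to select several columns, a single row, and a single cell, and how to compute a simple statistic on a column.

```python
# importing the pandas library
import pandas as pd
# creating a DataFrame
data_frame = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 4, 9], 'c': [1, 8, 27]})
# a list of column names returns a smaller DataFrame
two_columns = data_frame[['a', 'c']]
# .loc with a row label returns that row as a Series
first_row = data_frame.loc[0]
# .loc with a row label and a column name returns one value
cell = data_frame.loc[2, 'c']
# each column is a Series, so Series methods such as mean() work on it
column_mean = data_frame['b'].mean()
print(two_columns)
print(first_row)
print(cell, column_mean)
```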

Conclusion

In this tutorial, we created Series and DataFrame objects from lists, dictionaries, and other Series, and accessed their data using index labels and column names.