PYTHON PANDAS-Module1

Python Pandas is an open source Python library that provides high performance tools for data manipulation, management, and analysis. It allows users to load, prepare, manipulate, model, and analyze data. Pandas provides fast calculations across datasets and handles missing data. It also supports reshaping and merging of data as well as visualization. Python Pandas is well-suited for working with large tabular datasets containing different data formats.

Uploaded by

Rishabh Roy

PYTHON PANDAS (MODULE-1)

Introduction to Data Science and Data Structure:

Data Science or Data Analytics is the process of analyzing large sets of data points in order to handle
huge volumes of data, which is an area of concern for large business organisations. Data processing is
the analysis of data to get the required or productive output from a large data set.

Data Structure is a way of storing and organising data for a specific purpose so that it can be accessed
and utilized in an appropriate and specific way. Two Python libraries widely used for handling data
structures are:
i) Pandas
ii) NumPy
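
As a rough illustration of how the two libraries store data, the sketch below holds the same values in a NumPy array and in a Pandas Series. The marks and student names are made up for the example and are not part of the original notes.

```python
import numpy as np
import pandas as pd

# A NumPy array stores values of one type, addressed by integer position.
marks_array = np.array([72, 85, 90])

# A Pandas Series wraps the same values and adds a label (index) for each one.
# The student names below are hypothetical.
marks_series = pd.Series(marks_array, index=["Asha", "Ravi", "Meena"])

print(marks_array[1])        # access by position
print(marks_series["Ravi"])  # access by label
```

Both lines print the same mark, reached once by position and once by label.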

Introduction to Python Pandas:

1. Python Pandas is an open source Python library that provides high performance tools for data
manipulation, management and analysis, built on its powerful data structures. It was developed by Wes
McKinney. The name “Pandas” is derived from “panel data”.
2. Using Pandas, we can accomplish 5 typical steps in the processing and analysis of data. They are
Load, Prepare, Manipulate, Model and Analyze.
3. Pandas provides a fast and efficient way to perform calculations across datasets.
4. Pandas provides data alignment and has functionality to find and fill missing data.
5. Pandas provides high performance techniques for merging, joining and reshaping of data.
6. Pandas supports visualization or pictorial representation of data.
So, Python Pandas is most suited for handling large tabular datasets comprising data of different
formats.
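
The points above can be sketched in a few lines. This is only a minimal illustration: the two small tables, their column names and the choice to fill the missing price with the mean are assumptions made for the example, and the Model step is left out.

```python
import pandas as pd

# Load: build two small tables in memory (real data might come from a file).
sales = pd.DataFrame({
    "item": ["pen", "book", "bag"],
    "price": [10.0, None, 250.0],   # None marks a missing price
})
stock = pd.DataFrame({
    "item": ["pen", "book", "bag"],
    "quantity": [100, 40, 15],
})

# Prepare: fill the missing price with the mean of the known prices.
sales["price"] = sales["price"].fillna(sales["price"].mean())

# Manipulate: merge the two tables on their shared "item" column.
combined = pd.merge(sales, stock, on="item")

# Analyze: compute the stock value per item and the overall total.
combined["value"] = combined["price"] * combined["quantity"]
total_value = combined["value"].sum()

print(combined)
print(total_value)
```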

How to install Python Pandas?

Installing Python Pandas in Python IDLE 3.6:

Step-1: Search for “Command Prompt”. Right click on the “Command Prompt” icon and click on “Run
as Administrator”.

Step-2: Type the commands one after another, as given in the screenshot below:


Note: The system/computer must be connected to the Internet. The above commands will download the
Pandas package onto the system.
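
The screenshot is not reproduced in these notes. The usual command for installing the package is `pip install pandas`, though that is an assumption about what the screenshot showed. Once installation has finished, a short check such as the following can be run in IDLE to confirm that Pandas is available:

```python
import pandas as pd

# If the import above succeeds, Pandas is installed for this interpreter.
installed_version = pd.__version__
print("Pandas version:", installed_version)
```

If the import raises `ModuleNotFoundError`, the package was not installed for the Python interpreter being used.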

In Anaconda Navigator:

Pandas is installed by default with Anaconda, so it is already available in the Spyder application of
Anaconda Navigator. No separate installation is required.

In PyCharm:

Step-1: Open the PyCharm application. Click on File and then click on Settings.

Step-2: The Settings dialog box will open. Click on the Project Interpreter option in the left panel.

Step-3: Click on the + sign in the right panel. The “Available Packages” dialog box will appear.

Step-4: Type “pandas” in the search box. Select it and then click on “Install Package”.

Note: The system/computer must be connected to the Internet. The above steps will download the Pandas
package onto the system.

Variation in Python Pandas Data Structure:

Pandas provides the following three types of data structure:
i) Series : 1-D data structure containing homogeneous, mutable data.
ii) DataFrame : 2-D data structure containing heterogeneous, mutable data.
iii) Panel : 3-D data structure (outside the syllabus). Panel has been removed from recent versions of Pandas.
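
A minimal sketch of the first two structures is given below. The names, ages and cities are invented for illustration only.

```python
import pandas as pd

# Series: one-dimensional, labelled, all values of one data type.
ages = pd.Series([15, 16, 17], index=["a", "b", "c"])
ages["b"] = 18           # values can be changed in place (mutable)

# DataFrame: two-dimensional, each column may hold a different data type.
students = pd.DataFrame({
    "name": ["Asha", "Ravi"],      # text column
    "age": [15, 16],               # integer column
    "passed": [True, False],       # boolean column
})
students["city"] = ["Pune", "Delhi"]   # a new column can be added later

print(ages.ndim, students.ndim)   # number of dimensions of each structure
print(students.dtypes)            # data type of each column
```

Here the Series holds values of a single type, while each DataFrame column keeps its own type.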
