03 Python Packages for Data Science.en

The document provides an overview of key Python libraries for data analysis, categorizing them into scientific computing libraries and visualization tools. It highlights Pandas for data manipulation, NumPy for array processing, and Matplotlib and Seaborn for data visualization. Additionally, it mentions Scikit-learn and Statsmodels for machine learning and statistical modeling.

Uploaded by

Lougmiri Mohamed
Copyright
© All Rights Reserved

In order to do data analysis in Python, we should first tell you a little bit about
the main packages relevant to analysis in Python. A Python library is
a collection of functions and methods that allow you to perform lots of actions
without writing the code yourself. The libraries usually contain built-in modules providing
different functionalities which you can use directly. And there are extensive
libraries offering a broad range of facilities. We have divided the Python data
analysis libraries into three groups. The first group is called scientific
computing libraries. Pandas offers data structures and tools for effective data
manipulation and analysis. It provides fast access to structured data. The
primary instrument of Pandas is a two-dimensional table consisting of column and
row labels, which is called a DataFrame. It is designed to provide easy indexing
functionality. The NumPy library uses arrays for its inputs and outputs. It can be
extended to objects for matrices, and with minor coding changes, developers can
perform fast array processing. SciPy includes functions for some advanced math
problems as listed on this slide, as well as data visualization.

Using data visualization methods is the best way to communicate with others, showing them
meaningful results of analysis. These libraries enable you to create graphs, charts
and maps. The Matplotlib package is the most well-known library for data
visualization. It is great for making graphs and plots. The graphs are also highly
customizable. Another high-level visualization library is Seaborn. It is based on
Matplotlib. It's very easy to generate various plots such as heat maps, time series
and violin plots.

With machine learning algorithms, we're able to develop a model
using our data set and obtain predictions. The algorithmic libraries tackle the
machine learning tasks from basic to complex. Here we introduce two packages. The
Scikit-learn library contains tools for statistical modeling, including regression,
classification, clustering, and so on. This library is built on NumPy, SciPy and
Matplotlib. Statsmodels is also a Python module that allows users to explore data,
estimate statistical models and perform statistical tests.
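
As a rough illustration of the DataFrame and array ideas mentioned in this transcript, here is a minimal sketch using Pandas and NumPy. The table contents, the row labels (car_a, car_b, car_c) and the column names (price, horsepower) are invented for this example and do not come from the lecture.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up purely for illustration.
df = pd.DataFrame(
    {"price": [13495.0, 16500.0, 13950.0], "horsepower": [111, 154, 102]},
    index=["car_a", "car_b", "car_c"],
)

# Label-based indexing: pick one cell by its row label and column label.
price_b = df.loc["car_b", "price"]

# A column converts to a NumPy array, and arithmetic on the array
# applies to every element without an explicit loop.
prices = df["price"].to_numpy()
discounted = prices * 0.9

# NumPy functions take arrays as input; here the result is a single number.
mean_hp = float(np.mean(df["horsepower"].to_numpy()))
```

The row and column labels are what make lookups like df.loc["car_b", "price"] possible, and handing the column to NumPy is one way to get the fast array processing described above.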
