Python Pandas Series
The advantages of Pandas over Excel are:
● Scalability - Pandas is limited only by hardware and can manipulate larger
quantities of data.
● Speed - Pandas is much faster than Excel, which is especially noticeable when
working with larger quantities of data.
● Automation - Many of the tasks that can be achieved with Pandas are extremely
easy to automate, reducing the number of tedious and repetitive tasks that need
to be performed daily.
● Interpretability - It is very easy to interpret what happens when each task is
run, and it is relatively easy to find and fix errors.
● Advanced Functions - Performing advanced statistical analysis and creating
complex visualizations is very straightforward.
Module: A module is a file that contains Python functions. It is a .py
file holding executable Python code or statements.
Package: A package is a namespace that contains multiple
packages or modules. It is a directory that contains a special
file, __init__.py.
A namespace is a system that gives a unique name to each and
every object in Python. An object might be a variable or a
method.
Library: A library is a collection of packages. Conceptually, there is no
difference between a package and a Python library.
Dataframe
Installation of pandas:
The data in a series is mutable, i.e. its values can be changed, but the size of
a series is immutable, i.e. the size of the series cannot be changed.
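As a minimal sketch of value mutability (the series `s` and its values are made up for illustration), an existing element can be overwritten while the length stays the same:

```python
import pandas as pd

# Hypothetical series used only for illustration
s = pd.Series([10, 20, 30])

# Overwrite the element at position 1; the values change in place
s.iloc[1] = 99

print(s.tolist())
print(len(s))  # the length is still 3
```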
CREATING A SERIES
A Pandas series can be created from a list, a dictionary, a
scalar value, and so on.
Syntax
pandas.Series(data, index, name)
where
data: takes various forms such as an ndarray, list, constant/scalar
value, dictionary, or mathematical expression
index: labels for the data; the values must be hashable and the same length
as data, but need not be unique. The default is np.arange(n), i.e. 0 to n-1,
if no index is passed.
name: allows you to give a name to a series object
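The three parameters described above can be sketched together as follows. The values, labels, and the name "scores" are placeholders chosen for illustration, not taken from the original material:

```python
import pandas as pd

# data, index and name passed explicitly as keyword arguments
scores = pd.Series(
    data=[10, 20, 30],
    index=["a", "b", "c"],
    name="scores",
)

print(scores)
print(scores.name)

# Without an index, labels default to 0, 1, ..., n-1
unlabelled = pd.Series([10, 20, 30])
print(unlabelled.index.tolist())
```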
Series() with arguments
The data argument can be any of the following:
● a sequence (list)
● an ndarray
● a scalar value
● a Python dictionary (here, the keys of the dictionary become the
indexes of the series)
● a mathematical expression/function
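The forms of data listed above can be sketched as follows. All variable names and values are illustrative assumptions, and this is one possible way to write each case rather than the only one:

```python
import numpy as np
import pandas as pd

# From a sequence (list): the index defaults to 0..n-1
s_list = pd.Series([10, 20, 30])

# From an ndarray
s_array = pd.Series(np.array([1.5, 2.5, 3.5]))

# From a scalar value: the scalar is repeated for every index label
s_scalar = pd.Series(5, index=["a", "b", "c"])

# From a dictionary: keys become the index, values become the data
s_dict = pd.Series({"Jan": 31, "Feb": 28, "Mar": 31})

# From a mathematical expression evaluated on an ndarray
base = np.arange(1, 5)
s_expr = pd.Series(base ** 2, index=base)

print(s_dict)
print(s_expr)
```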
Creating a series with index of string type
Strings can be used as index labels for the elements of a series.
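A short sketch of a string-labelled series, using invented subject names and marks:

```python
import pandas as pd

# Hypothetical marks, labelled by subject name
marks = pd.Series([85, 92, 78], index=["maths", "physics", "chemistry"])

# Elements are then looked up by their string label
print(marks["physics"])
```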
Creating a series using two different lists
Two lists are passed as arguments to the Series() constructor: one
supplies the index and the other supplies the values.
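For illustration, the two lists might look like this (the names `months` and `days` are assumptions). Passing them as keyword arguments makes it explicit which list plays which role:

```python
import pandas as pd

months = ["Jan", "Feb", "Mar"]   # used as the index
days = [31, 28, 31]              # used as the values

# Keyword arguments avoid any ambiguity about argument order
s = pd.Series(index=months, data=days)

print(s)
```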
Creating a series using missing values (nan)
In certain situations, we need to create a series object whose size is
defined but in which some elements or data are missing. This is handled by
using NaN (Not a Number) values. NaN is provided by the NumPy library, and a
missing value is written as np.nan.
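A minimal sketch with made-up readings, two of which are missing:

```python
import numpy as np
import pandas as pd

# Hypothetical readings; np.nan marks the missing entries
readings = pd.Series([21.5, np.nan, 23.0, np.nan])

print(len(readings))          # size includes the missing entries
print(readings.isna().sum())  # number of missing values
print(readings.count())       # number of non-missing values
```

The presence of NaN forces a floating-point dtype, so integer data mixed with np.nan is stored as floats.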
Creating a series using a range()
A series can also be created from a range() object.
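The original example is not available here, so the following is only a sketch of how range() might be used, with illustrative names:

```python
import pandas as pd

# range(5) produces the values 0, 1, 2, 3, 4
s = pd.Series(range(5))

# range(start, stop, step) combined with a custom string index
evens = pd.Series(range(0, 10, 2), index=list("abcde"))

print(s.tolist())
print(evens)
```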
● loc: loc is used for indexing and selecting based on name, i.e.
by row label (and, for a DataFrame, by column label). It refers to
label-based indexing.
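A brief sketch of label-based selection on a series, using invented labels and prices:

```python
import pandas as pd

# Hypothetical prices labelled by item name
prices = pd.Series([120, 80, 45], index=["apple", "banana", "cherry"])

# Select a single element by label
one = prices.loc["banana"]

# Label slices include BOTH endpoints, unlike positional slices
subset = prices.loc["apple":"banana"]

# Select several labels at once, in the order requested
picked = prices.loc[["cherry", "apple"]]

print(one)
print(subset)
print(picked)
```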
Note: Arithmetic operations between two series match elements by index
label. Labels that are not present in both objects produce NaN.
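The alignment behaviour in the note can be sketched with two small series whose indexes overlap only partly (names and values are illustrative):

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# Only "y" and "z" exist in both; "x" and "w" become NaN
total = a + b
print(total)

# One option if NaN is not wanted: treat a missing side as 0
filled = a.add(b, fill_value=0)
print(filled)
```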
Vector operations on a series
A vector operation means that when you apply a function or expression to
an object, it is applied individually to each item of the object. Since Series
objects are built upon NumPy arrays (ndarrays), they also support
vectorized operations, just like ndarrays.
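A small sketch of element-wise behaviour, with made-up values:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 4, 9, 16])

# The expression is applied to every element, with no explicit loop
doubled = s * 2

# NumPy functions also operate element by element and return a Series
roots = np.sqrt(s)

# Comparisons are vectorized too, giving a boolean Series
mask = s > 5

print(doubled.tolist())
print(roots.tolist())
print(mask.tolist())
```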
# syntax of Series.sort_values()
● Series.sort_values(axis=0, ascending=True)
Sort a pandas series in ascending order (the default):
sortedseries = myseries.sort_values()
sortedseries = myseries.sort_values(ascending=True)
# sort in descending order, in place
myseries.sort_values(ascending=False, inplace=True)
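As a runnable sketch of the calls above, assuming a small made-up `myseries`:

```python
import pandas as pd

myseries = pd.Series([30, 10, 20], index=["a", "b", "c"])

# Returns a new, sorted series; each value keeps its original label
ascending = myseries.sort_values()
descending = myseries.sort_values(ascending=False)

print(ascending)
print(descending)

# Without inplace=True the original series is left unchanged
print(myseries.tolist())
```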