Operations in Series
Addition of 2 Series
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5])
• series2 = pd.Series([6, 7, 8, 9, 10])
• series3 = series1 + series2
• print(series3)
• OUTPUT
• 0 7
• 1 9
• 2 11
• 3 13
• 4 15
• dtype: int64
Addition of 2 Series
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5],['A','F','E','T','S'])
• series2 = pd.Series([6, 7, 8, 9, 10])
• series3 = series1 + series2
• print(series3)
• OUTPUT
• 0 NaN
• 1 NaN
• 2 NaN
• 3 NaN
• 4 NaN
• A NaN
• E NaN
• F NaN
• S NaN
• T NaN
• dtype: float64
• The two series have no index label in common, so every entry in the result is NaN.
Addition of 2 Series
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5],['A','F','E','T','S'])
• series2 = pd.Series([6, 7, 8, 9, 10],['F','A','M','T','V'])
• series3 = series1 + series2
• print(series3)
• OUTPUT
• A 8.0
• E NaN
• F 8.0
• M NaN
• S NaN
• T 13.0
• V NaN
• dtype: float64
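The three addition examples above differ only in how the index labels of the two series overlap. The following is a minimal sketch, reusing the values from the last example, that makes the alignment visible; the names `result` and `common` are illustrative and not part of the original slides.

```python
import pandas as pd

series1 = pd.Series([1, 2, 3, 4, 5], index=['A', 'F', 'E', 'T', 'S'])
series2 = pd.Series([6, 7, 8, 9, 10], index=['F', 'A', 'M', 'T', 'V'])

result = series1 + series2

# Labels found in both series; only these get a numeric sum
common = series1.index.intersection(series2.index)
print(sorted(common))            # ['A', 'F', 'T']

# Every other label in the combined index becomes NaN
print(result.dropna())
print(int(result.isna().sum()))  # 4
```

Values are matched by label, not by position: 'A' is first in series1 and second in series2, yet the sum for 'A' is 1 + 7.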
Other arithmetic operators on 2 Series
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5])
• series2 = pd.Series([6, 7, 8, 9, 10])
• series3 = series1 + series2
• print(series3)
• series4 = series1 - series2
• print(series4)
• series5 = series1 * series2
• print(series5)
• series6 = series1 / series2
• print(series6)
• series7 = series1 // series2
• print(series7)
• series8 = series1 % series2
• print(series8)
• series9 = series1 ** series2
• print(series9)
add()
• seriesA.add(seriesB) is equivalent to seriesA + seriesB.
• add() also accepts a fill_value, which is used in place of an element that is missing from seriesA or seriesB for a given index label.
• seriesA.add(seriesB, fill_value=0)
• import pandas as pd
• seriesA = pd.Series([1, 2, 3, 4, 5],["a",1,"w","r",4])
• seriesB = pd.Series([6, 7, 8, 9, 10])
• series3 = seriesA.add(seriesB, fill_value=0)
• print(series3)
• Output:
• 0 6.0
• 1 9.0
• 2 8.0
• 3 9.0
• 4 15.0
• a 1.0
• r 4.0
• w 3.0
• dtype: float64
• Like addition, subtraction, multiplication and division can be done either with the
corresponding mathematical operator or by explicitly calling the
appropriate method:
• sub()
• mul()
• div()
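As a minimal sketch of the methods listed above, the example below calls sub(), mul() and div() with a fill_value, in the same way as add(). The series `seriesA` and `seriesB` and their labels are made up for illustration.

```python
import pandas as pd

seriesA = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
seriesB = pd.Series([1, 2, 4], index=['b', 'c', 'd'])

# Operator form: a label present in only one series gives NaN
print(seriesA - seriesB)

# Method form: a label missing from one series is treated as fill_value
diff = seriesA.sub(seriesB, fill_value=0)   # 'a' -> 10 - 0, 'd' -> 0 - 4
prod = seriesA.mul(seriesB, fill_value=1)   # 'a' -> 10 * 1, 'd' -> 1 * 4
quot = seriesA.div(seriesB, fill_value=1)   # 'a' -> 10 / 1, 'd' -> 1 / 4

print(diff)
print(prod)
print(quot)
```

The fill value is chosen per operation here: 0 leaves a sum or difference unchanged, while 1 leaves a product or quotient unchanged.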
Vector operations on Series object
• A vector operation applies an operation or expression to every element of a
Series at once, without an explicit loop.
• import numpy as np
• import pandas as pd
• a = np.array([2,4,6,8,10])
• number = 2
• s1 = pd.Series(a)
• s2 = s1 + number
• print(s1,'\n',s2)
• OUTPUT
• 0 2
• 1 4
• 2 6
• 3 8
• 4 10
• dtype: int32
• 0 4
• 1 6
• 2 8
• 3 10
• 4 12
• dtype: int32
Possible legal operations
• S+2
• S-2
• S*2
• S**2
• S/2
• S//2
• S%2
• S>2
• S<2
• S<=22
• S>=2
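A short sketch of a few of the operations listed above, assuming a small series `S` with made-up values. Arithmetic operators return a new Series of numbers, while comparison operators return a Series of True/False values.

```python
import pandas as pd

S = pd.Series([1, 2, 3, 22, 30])

print(S ** 2)    # arithmetic: a new Series of squares
print(S // 2)    # floor division, element by element
print(S % 2)     # remainder, element by element
print(S > 2)     # comparison: a Series of True/False values
print(S <= 22)
```

None of these expressions changes S itself; each one produces a new Series with the same index.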
Accessing Elements of a Series
• There are two common ways for accessing the elements of a series:
• Indexing and Slicing.
• (A) Indexing
• Indexing in Series is similar to that for NumPy arrays, and is used to
access elements in a series.
• Indexes are of two types: positional index and labelled index.
• Positional index takes an integer value that corresponds to its position in
the series starting from 0, whereas labelled index takes any user-defined
label as index.
Example showing usage of the positional index
• import pandas as pd
• seriesNum = pd.Series([10,20,30])
• print(seriesNum[2])
• Output
• 30
Example showing usage of the labelled index
• import pandas as pd
• seriesNum = pd.Series([10,20,30], index=["Feb","Mar","Apr"])
• print(seriesNum["Mar"])
• Output
• 20
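• import pandas as pd
• seriesNum = pd.Series([10,20,30], index=[1,"Mar","Apr"])
• print(seriesNum[2])
• Output
• KeyError: 2
• The index here contains an integer label (1), so the integer 2 is looked up as a label, not as a position. No label 2 exists, hence the KeyError.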
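• import pandas as pd
• seriesNum = pd.Series([10,20,30], index=['feb',"Mar","Apr"])
• print(seriesNum[2])
• Output
• 30
• Here all labels are strings, so the integer 2 is treated as a position. Recent pandas versions deprecate this fallback; seriesNum.iloc[2] selects by position explicitly.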
• seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
index=['India', 'USA', 'UK', 'France'])
• print(seriesCapCntry['India'])
• OUTPUT
• NewDelhi
• We can also access an element of the series using the positional index:
• print(seriesCapCntry[1])
• OUTPUT
• WashingtonDC
More than one element of a series can be accessed using a
list of positional integers or a list of index labels
• import pandas as pd
• seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
index=['India', 'USA', 'UK', 'France'])
• print(seriesCapCntry[[3,2]])
• OUTPUT
• France Paris
• UK London
• dtype: object
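The indexing examples above rely on pandas deciding whether an integer inside the square brackets is a position or a label. As a hedged sketch, the same lookups can be written with the explicit accessors .iloc (always position) and .loc (always label), which avoids the ambiguity seen in the KeyError example.

```python
import pandas as pd

seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
                           index=['India', 'USA', 'UK', 'France'])

print(seriesCapCntry.iloc[1])              # by position: WashingtonDC
print(seriesCapCntry.loc['UK'])            # by label: London
print(seriesCapCntry.iloc[[3, 2]])         # list of positions
print(seriesCapCntry.loc[['UK', 'USA']])   # list of labels

# With an integer label in the index, .iloc still counts positions
seriesNum = pd.Series([10, 20, 30], index=[1, "Mar", "Apr"])
print(seriesNum.iloc[2])                   # 30
print(seriesNum.loc[1])                    # 10
```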
The index values associated with the series can be altered
by assigning new index values
• import pandas as pd
• seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
index=['India', 'USA', 'UK', 'France'])
• print(seriesCapCntry)
• seriesCapCntry.index = [10, 20, 30, 40]
• print(seriesCapCntry)
• OUTPUT
• India NewDelhi
• USA WashingtonDC
• UK London
• France Paris
• dtype: object
• 10 NewDelhi
• 20 WashingtonDC
• 30 London
• 40 Paris
• dtype: object
• (B) Slicing
• Sometimes, we may need to extract a part of a series. This can be done through
slicing.
• We define which part of the series is to be sliced by specifying the start and
end parameters [start:end] with the series name.
• When positional indices are used for slicing, the value at the end index position is
excluded, i.e., only (end - start) data values of the series are
extracted.
• If labelled indexes are used for slicing, the value at the end index label is also
included in the output.
• >>> seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
index=['India', 'USA', 'UK', 'France'])
• >>> seriesCapCntry[1:3] # excludes the value at index position 3
• USA WashingtonDC
• UK London
• dtype: object
• >>> seriesCapCntry['USA' : 'France'] # includes the value at label 'France'
• USA WashingtonDC
• UK London
• France Paris
• dtype: object
• >>> seriesCapCntry[ : : -1] # reverse order
• France Paris
• UK London
• USA WashingtonDC
• India NewDelhi
• dtype: object
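The difference between the two kinds of slice can be checked by counting the elements returned. This is a minimal sketch using the explicit accessors .iloc (positions) and .loc (labels); the names `by_position`, `by_label` and `every_second` are illustrative.

```python
import pandas as pd

seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
                           index=['India', 'USA', 'UK', 'France'])

by_position = seriesCapCntry.iloc[1:3]           # end position 3 is excluded
by_label = seriesCapCntry.loc['USA':'France']    # end label 'France' is included

print(len(by_position))   # 2, i.e. end - start
print(len(by_label))      # 3

# A step can be given as a third value, as in [ : : -1] above
every_second = seriesCapCntry.iloc[::2]
print(every_second)
```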
MODIFYING ELEMENTS OF SERIES OBJECT
• <SERIES_OBJECT>[INDEX] = <NEW_DATA>
• <SERIES_OBJECT>[START:STOP] = <NEW_DATA>
import numpy as np
import pandas as pd
seriesAlph = pd.Series(np.arange(10,16,1),index = ['a', 'b', 'c', 'd', 'e', 'f'])
print(seriesAlph)
output
a 10
b 11
c 12
d 13
e 14
f 15
dtype: int32
seriesAlph[1:3] = 50
print(seriesAlph)
OUTPUT
a 10
b 50
c 50
d 13
e 14
f 15
dtype: int32
When positional indices are used for slicing,
the value at end index position is excluded, i.e.,
only (end - start) number of data values of the
series are extracted. However with labelled
indexes the value at the end index label is also
included in the output.
Starting again from the original seriesAlph:
print(seriesAlph)
seriesAlph['c':'e'] = 500
print(seriesAlph)
OUTPUT
a 10
b 11
c 12
d 13
e 14
f 15
dtype: int32
a 10
b 11
c 500
d 500
e 500
f 15
dtype: int32
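The slides above show the slice form `<SERIES_OBJECT>[START:STOP] = <NEW_DATA>`. As a hedged sketch, the single-element form `<SERIES_OBJECT>[INDEX] = <NEW_DATA>` is shown below together with the explicit .iloc and .loc accessors; the new values 99, 0 and 7 are arbitrary.

```python
import numpy as np
import pandas as pd

seriesAlph = pd.Series(np.arange(10, 16, 1),
                       index=['a', 'b', 'c', 'd', 'e', 'f'])

seriesAlph['b'] = 99           # single element, by label
seriesAlph.iloc[0] = 0         # single element, by position
seriesAlph.loc['e':'f'] = 7    # label slice, end label 'f' included

print(seriesAlph)
```

All three assignments change seriesAlph in place; no new Series is created.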
Renaming Indexes
• <object>.index = <new index array>
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5],[10,20,50,30,90])
• print(series1)
• series1.index = ['a','b','c','d','e']
• print(series1)
• OUTPUT
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• dtype: int64
• a 1
• b 2
• c 3
• d 4
• e 5
• dtype: int64
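One detail the renaming example does not show: the new index array must contain exactly one label per element. The sketch below, which reuses series1 from the slide, assigns a list that is too short and catches the resulting ValueError; the flag `mismatch_rejected` is only there for illustration.

```python
import pandas as pd

series1 = pd.Series([1, 2, 3, 4, 5], [10, 20, 50, 30, 90])

# The new index must supply exactly one label per element
try:
    series1.index = ['a', 'b', 'c']
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("ValueError:", err)

# A correctly sized list is accepted; the values are unchanged
series1.index = ['a', 'b', 'c', 'd', 'e']
print(series1['c'])   # 3
```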
head() ,tail(), count()
• head(n)
• Returns the first n members of the series. If the value for n is not passed, then by
default n takes 5 and the first five members are displayed
• count()
• Returns the number of non-NaN values in the Series
• tail(n)
• Returns the last n members of the series. If the value for n is not passed, then by
default n takes 5 and the last five members are displayed
head()
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5,8,11,45],[10,20,50,30,90,100,20,50])
• print( series1)
• print(series1.head(2))
• print(series1.head())
• print(series1.head(100))
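head() - OUTPUT
• Output of print(series1)
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64
• Output of print(series1.head(2))
• 10 1
• 20 2
• dtype: int64
• Output of print(series1.head())
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• dtype: int64
• Output of print(series1.head(100))
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64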
tail()
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5,8,11,45],[10,20,50,30,90,100,20,50])
• print( series1)
• print(series1.tail(2))
• print(series1.tail())
• print(series1.tail(100))
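tail() - OUTPUT
• Output of print(series1)
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64
• Output of print(series1.tail(2))
• 20 11
• 50 45
• dtype: int64
• Output of print(series1.tail())
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64
• Output of print(series1.tail(100))
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64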
count()
• import pandas as pd
• series1 = pd.Series([1, 2, 3, 4, 5,8,11,45],[10,20,50,30,90,100,20,50])
• print(series1)
• s = series1.tail(100)
• print(s.count())
• OUTPUT
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64
• 8
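The count() example above contains no missing values, so count() and the length of the series agree. The following sketch, using a made-up series `marks` with two NaN entries, shows where they differ.

```python
import numpy as np
import pandas as pd

marks = pd.Series([45, np.nan, 30, np.nan, 50, 12, 8])

print(len(marks))               # 7: every element, NaN included
print(marks.count())            # 5: only the non-NaN values
print(marks.head(3))            # first three members, one of them NaN
print(marks.tail(2).count())    # 2: the last two members are both valid
```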
Filtering entries in a Series object using
Boolean expressions
• import pandas as pd
• series = pd.Series([1, 2, 3, 4, 5,8,11,45],[10,20,50,30,90,100,20,50])
• print( series)
• print(series>5)
• s=series[series>5]
• print(s)
• print("no: of items",s.count())
Output
• 10 1
• 20 2
• 50 3
• 30 4
• 90 5
• 100 8
• 20 11
• 50 45
• dtype: int64
• 10 False
• 20 False
• 50 False
• 30 False
• 90 False
• 100 True
• 20 True
• 50 True
• dtype: bool
• 100 8
• 20 11
• 50 45
• dtype: int64
• no: of items 3
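The filter above uses a single condition. As a minimal sketch on the same series, several conditions can be combined with & (and), | (or) and ~ (not); the names `between`, `extremes` and `not_big` are illustrative.

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5, 8, 11, 45],
                   [10, 20, 50, 30, 90, 100, 20, 50])

# Each condition needs its own parentheses when combined
between = series[(series > 2) & (series < 10)]     # both conditions hold
extremes = series[(series < 2) | (series > 40)]    # either condition holds
not_big = series[~(series > 5)]                    # condition does not hold

print(between)
print(extremes)
print("no: of items", not_big.count())
```

The Python keywords and, or and not do not work element by element on a Series, which is why the symbols are used instead.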
Sorting Series Values- sort_values()
• print(series)
• print(series.sort_values())
• OUTPUT
• 10 11
• 20 2
• 50 13
• 30 4
• 90 35
• 100 8
• 20 11
• 50 45
• dtype: int64
• 20 2
• 30 4
• 100 8
• 10 11
• 20 11
• 50 13
• 90 35
• 50 45
• dtype: int64
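The slide does not show how `series` was created for this example. The sketch below assumes values and index labels read off the printed output, then shows two variations: sorting in descending order with ascending=False, and sorting by index label with sort_index().

```python
import pandas as pd

# Assumed definition, reconstructed from the output shown on the slide
series = pd.Series([11, 2, 13, 4, 35, 8, 11, 45],
                   [10, 20, 50, 30, 90, 100, 20, 50])

descending = series.sort_values(ascending=False)   # largest value first
by_label = series.sort_index()                     # ordered by index label

print(descending)
print(by_label)

# Both calls return a new Series; the original order is unchanged
print(series.iloc[0])   # 11
```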