Unit – III Python for Data Science
UNIT III (10 hours): NumPy Basics: Arrays and Vectorized Computation - The
NumPy ndarray - Creating ndarrays - Data Types for ndarrays - Arithmetic with
NumPy Arrays - Basic Indexing and Slicing - Boolean Indexing - Transposing Arrays
and Swapping Axes. Universal Functions: Fast Element-Wise Array Functions -
Mathematical and Statistical Methods - Sorting - Unique and Other Set Logic
NumPy:
NumPy, short for Numerical Python, is one of the most important
foundational packages for numerical computing in Python. Among other things, it
provides:
ndarray, an efficient multidimensional array providing fast array-oriented
arithmetic operations and flexible broadcasting capabilities.
Mathematical functions for fast operations on entire arrays of data without
having to write loops.
Tools for reading/writing array data to disk and working with memory-
mapped files.
Linear algebra, random number generation, and Fourier transform
capabilities.
A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.
One of the reasons NumPy is so important for numerical computations in
Python is because it is designed for efficiency on large arrays of data. There are a
number of reasons for this:
NumPy internally stores data in a contiguous block of memory,
independent of other built-in Python objects. NumPy's library of algorithms
written in the C language can operate on this memory without any type checking
or other overhead.
NumPy arrays also use much less memory than built-in Python sequences.
NumPy operations perform complex computations on entire arrays without
the need for Python for loops.
import numpy as np
my_arr = np.arange(1000000)
my_list = list(range(1000000))
B V Raju College Page 1
Now let’s multiply each sequence by 2:
%time for _ in range(10): my_arr2 = my_arr * 2
CPU times: user 20 ms, sys: 8 ms, total: 28 ms
Wall time: 26.5 ms
%time for _ in range(10): my_list2 = [x * 2 for x in my_list]
CPU times: user 408 ms, sys: 64 ms, total: 472 ms
Wall time: 473 ms
NumPy-based algorithms are generally 10 to 100 times faster (or more)
than their pure Python counterparts and use significantly less memory.
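The %time lines above only work inside IPython or Jupyter. As a rough sketch of the same comparison in a plain Python script, the standard timeit module can be used instead. The variable names numpy_seconds and list_seconds are illustrative, and the measured times depend on the machine.

```python
import timeit

import numpy as np

n = 1_000_000
my_arr = np.arange(n)
my_list = list(range(n))

# Time 10 repetitions of each approach, as in the %time runs above.
numpy_seconds = timeit.timeit(lambda: my_arr * 2, number=10)
list_seconds = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print(f"NumPy array : {numpy_seconds:.4f} s")
print(f"Python list : {list_seconds:.4f} s")

# Both approaches compute the same values.
same_values = (my_arr * 2).tolist() == [x * 2 for x in my_list]
print("Same result:", same_values)
```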
Difference Between List and Array:
| List | Array |
| --- | --- |
| A list can hold elements of different data types, for example [1, 3.4, 'hello', 'a@']. | All elements of an array have the same data type, for example an array of floats: [1.2, 5.4, 2.7]. |
| Lists do not support element-wise operations such as addition or multiplication, because their elements may not be of the same type. | Arrays support element-wise operations. For example, if A1 is an array, A1/3 divides each element of the array by 3. |
| Because a list can contain objects of different data types, Python must store the type information for every element along with its value. Lists therefore take more space in memory and are less efficient. | A NumPy array takes up less space in memory than a list, because it does not need to store the data type of each element separately. |
| The list is part of core Python. | The array (ndarray) is part of the NumPy library. |
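The differences listed above can be checked directly. The following is a minimal sketch with made-up values; the memory figures are approximate and vary by platform and Python version.

```python
import sys

import numpy as np

values = [1, 2, 3, 4]
arr = np.array(values)

# The * operator repeats a list but multiplies an array element by element.
print(values * 2)   # [1, 2, 3, 4, 1, 2, 3, 4]
print(arr * 2)      # [2 4 6 8]

# Dividing a list by a number is an error; dividing an array is element-wise.
try:
    values / 2
except TypeError as exc:
    print("list / 2 fails:", exc)
print(arr / 2)

# Rough memory comparison for one million integers.
big_list = list(range(1_000_000))
big_arr = np.arange(1_000_000)
list_bytes = sys.getsizeof(big_list) + sum(sys.getsizeof(x) for x in big_list)
print("list bytes (container + int objects):", list_bytes)
print("array bytes (data buffer only):", big_arr.nbytes)
```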
THE NUMPY NDARRAY: A MULTIDIMENSIONAL ARRAY OBJECT
One of the key features of NumPy is its N-dimensional array object, or
ndarray, which is a fast, flexible container for large datasets in Python. Arrays
enable us to perform mathematical operations on whole blocks of data using
similar syntax to the equivalent operations between scalar elements.
import numpy as np
# Generate some random data
data = np.random.randn(2, 3)
print(data)
data=data * 10
print(data)
data= data + data
print(data)
output:
[[ 1.02769038 -0.45400781 -1.09134785]
[-0.74483404 -0.89984109 -0.04883344]]
[[ 10.27690378 -4.54007811 -10.91347855]
[ -7.44834037 -8.99841092 -0.48833438]]
[[ 20.55380756 -9.08015622 -21.82695709]
[-14.89668073 -17.99682185 -0.97666877]]
In the first example, all of the elements have been multiplied by 10. In the
second, the corresponding values in each “cell” in the array have been added to
each other.
An ndarray is a generic multidimensional container for homogeneous data;
that is, all of the elements must be the same type.
Every array has a shape, a tuple indicating the size of each dimension, and
a dtype, an object describing the data type of the array:
print(data.shape)
(2, 3)
print(data.dtype)
float64
Arrays:
Example:
import numpy as np
a = np.arange(15).reshape(3, 5)
print("array is",a)
print("array size is",a.shape)
print("array dimensions",a.ndim)
print("item size is",a.itemsize)
print("type of array",type(a))
O/P:
array is [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
array size is (3, 5)
array dimensions 2
item size is 4
type of array <class 'numpy.ndarray'>
The item size is 4 bytes when the default integer type is int32 and 8 bytes when it is int64; the default depends on the platform and NumPy version.
CREATING NDARRAYS
The easiest way to create an array is to use the array function. This accepts
any sequence-like object (including other arrays) and produces a new NumPy
array containing the passed data.
Example:
import numpy as np
a = np.array([2,3,4])
print("array is",a)
print("data type", a.dtype)
b = np.array([1.2, 3.5, 5.1])
print("array b",b.dtype)
O/p:
array is [2 3 4]
data type int32
array b float64
The integer dtype shown is int32 or int64, depending on the platform's default integer type.
The array function transforms sequences of sequences into two-dimensional
arrays, sequences of sequences of sequences into three-dimensional arrays, and so on.
Example:
import numpy as np
b = np.array([(1.5,2,3), (4,5,6)])
print("two dim array",b)
O/P:
two dim array [[1.5 2.  3. ]
 [4.  5.  6. ]]
The type of the array can also be explicitly specified at creation time:
Example:
import numpy as np
c = np.array( [ [1,2], [3,4] ], dtype=complex )
print("complex array",c)
O/P:
complex array [[1.+0.j 2.+0.j]
[3.+0.j 4.+0.j]]
To create sequences of numbers, NumPy provides a function analogous to
range that returns arrays instead of lists.
np.arange( 10, 30, 5 )
array([10, 15, 20, 25])
np.arange( 0, 2, 0.3 ) # it accepts float arguments
array([ 0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])
The function zeros creates an array full of zeros, the function ones creates
an array full of ones, and the function empty creates an array whose initial
content is random and depends on the state of the memory. By default, the dtype
of the created array is float64.
Example:
import numpy as np
a=np.zeros( (3,4) )
print("array a is",a)
b=np.ones( (2,3,4), dtype=np.int16 )
print("array b is",b)
O/P:
array a is [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
array b is [[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]
| Function | Description |
| --- | --- |
| array | Convert input data (list, tuple, array, or other sequence type) to an ndarray, either by inferring a dtype or by explicitly specifying one; copies the input data by default |
| asarray | Convert input to ndarray, but do not copy if the input is already an ndarray |
| arange | Like the built-in range, but returns an ndarray instead of a list |
| ones, ones_like | Produce an array of all 1s with the given shape and dtype; ones_like takes another array and produces a ones array of the same shape and dtype |
| zeros, zeros_like | Like ones and ones_like, but producing arrays of 0s instead |
| empty, empty_like | Create new arrays by allocating new memory, but do not populate them with any values as ones and zeros do |
| full, full_like | Produce an array of the given shape and dtype with all values set to the indicated "fill value"; full_like takes another array and produces a filled array of the same shape and dtype |
| eye, identity | Create a square N × N identity matrix (1s on the diagonal and 0s elsewhere) |
Table 4-1. Array creation functions
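A few of the functions in Table 4-1 are sketched below. The array named template and the fill value 7.0 are made up for illustration.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)

zeros = np.zeros_like(template)   # same shape and dtype as template, all 0
sevens = np.full((2, 2), 7.0)     # 2 x 2 array filled with 7.0
identity = np.eye(3)              # 3 x 3 identity matrix

# asarray returns the same object when the input is already an ndarray,
# while array makes a copy by default.
same_object = np.asarray(template) is template
is_copy = np.array(template) is not template

print(zeros.shape, zeros.dtype)   # (2, 3) int16
print(sevens)
print(identity)
print(same_object, is_copy)       # True True
```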
Data Types for ndarrays:
NumPy supports a much greater variety of numerical types than Python
does. The primitive types it supports are tied closely to those in C.
By default, Python has these data types:
strings - used to represent text data, the text is given under quote marks.
e.g. "ABCD"
integer - used to represent integer numbers. e.g. -1, -2, -3
float - used to represent real numbers. e.g. 1.2, 42.42
boolean - used to represent True or False.
complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j
NumPy has some extra data types, and it can refer to data types with a
one-character code, such as i for integers and u for unsigned integers.
Below is a list of the data type kinds in NumPy and the characters used to
represent them.
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )
| Type | Type code | Description |
| --- | --- | --- |
| int8, uint8 | i1, u1 | Signed and unsigned 8-bit (1 byte) integer types |
| int16, uint16 | i2, u2 | Signed and unsigned 16-bit integer types |
| int32, uint32 | i4, u4 | Signed and unsigned 32-bit integer types |
| int64, uint64 | i8, u8 | Signed and unsigned 64-bit integer types |
| float16 | f2 | Half-precision floating point |
| float32 | f4 or f | Standard single-precision floating point; compatible with C float |
| float64 | f8 or d | Standard double-precision floating point; compatible with C double and Python float object |
| float128 | f16 or g | Extended-precision floating point |
| complex64, complex128, complex256 | c8, c16, c32 | Complex numbers represented by two 32-bit, 64-bit, or 128-bit floats, respectively |
| bool | ? | Boolean type storing True and False values |
| object | O | Python object type; a value can be any Python object |
| string_ (named bytes_ in NumPy 2.0 and later) | S | Fixed-length ASCII string type (1 byte per character); for example, to create a string dtype with length 10, use 'S10' |
| unicode_ (named str_ in NumPy 2.0 and later) | U | Fixed-length Unicode type (number of bytes platform specific); same specification semantics as string_ (e.g., 'U10') |
Table 4-2. NumPy data types
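The type codes in Table 4-2 can be passed to np.dtype or to the dtype argument of array creation functions. The sketch below only inspects a handful of codes; the array named small is made up for illustration.

```python
import numpy as np

# Type codes from the table are shorthand for the full dtype names.
print(np.dtype("i4") == np.int32)     # True
print(np.dtype("f8") == np.float64)   # True
print(np.dtype("u1").name)            # uint8

# itemsize is the number of bytes used by one element.
for code in ["i1", "i2", "i4", "i8", "f2", "f4", "f8", "c16"]:
    dt = np.dtype(code)
    print(code, dt.name, dt.itemsize)

# A type code can be passed straight to an array creation function.
small = np.array([1, 2, 3], dtype="u1")
print(small.dtype, small.nbytes)      # uint8 3
```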
Checking the Data Type of an Array:
The NumPy array object has a property called dtype that returns the data
type of the array:
Example 1:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)
O/P:
int64
Example 2:
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
O/P:
<U6
Creating Arrays With a Defined Data Type:
We use the array() function to create arrays. This function takes an
optional argument, dtype, that allows us to define the expected data type of the
array elements:
Example:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='S')
print("array is",arr)
print("array type is",arr.dtype)
O/P:
array is [b'1' b'2' b'3' b'4']
array type is |S1
We can explicitly convert or cast an array from one dtype to another
using ndarray’s astype method:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr.dtype)
#Output: int64
float_arr = arr.astype(np.float64)
print(float_arr.dtype)
#Output: float64
In this example, integers were cast to floating point. If we cast some
floating-point numbers to be of integer dtype, the decimal part will be truncated:
import numpy as np
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
print(arr)
#Output: [ 3.7 -1.2 -2.6  0.5 12.9 10.1]
print(arr.astype(np.int32))
#Output: [ 3 -1 -2  0 12 10]
If we have an array of strings representing numbers, we can use astype to
convert them to numeric form:
import numpy as np
# np.bytes_ is the fixed-length byte string type (named np.string_ before NumPy 2.0)
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
print(numeric_strings.astype(float))
#Output: [ 1.25 -9.6  42.  ]
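Two further properties of astype are worth seeing in code: it returns a new array rather than changing the original, and it raises ValueError when a value cannot be converted. This is a small sketch with made-up values.

```python
import numpy as np

arr = np.array([1, 2, 3])
float_arr = arr.astype(np.float64)

# astype returns a new array, so changing the copy leaves arr untouched.
float_arr[0] = 99.5
print(arr)         # [1 2 3]
print(float_arr)

# Casting fails with ValueError when a string is not a valid number.
bad = np.array(["1.5", "abc"])
conversion_failed = False
try:
    bad.astype(float)
except ValueError as exc:
    conversion_failed = True
    print("conversion failed:", exc)

# Another array's dtype can be reused as the target type.
ints = np.arange(3)
like_float = ints.astype(float_arr.dtype)
print(like_float.dtype)   # float64
```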
Arithmetic with NumPy Arrays:
NumPy provides a wide range of arithmetic operations, such as addition,
subtraction, multiplication, and division, which can be performed on NumPy arrays.
Arithmetic between two arrays is element-wise, and it is only possible if
the arrays have the same shape or if their shapes satisfy the rules of broadcasting.
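As a brief illustration of the broadcasting rule mentioned above, the sketch below adds arrays of different but compatible shapes, and then shows an incompatible pair. The array names and values are made up for illustration.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)    # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)
column = np.array([[100], [200]])      # shape (2, 1)

# The row is stretched across both rows of the matrix.
print(matrix + row)
# [[10 21 32]
#  [13 24 35]]

# The column is stretched across all three columns.
print(matrix + column)
# [[100 101 102]
#  [203 204 205]]

# Shapes (2, 3) and (2,) cannot be broadcast, so NumPy raises ValueError.
shapes_rejected = False
try:
    matrix + np.array([1, 2])
except ValueError as exc:
    shapes_rejected = True
    print("incompatible shapes:", exc)
```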
Addition
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing addition using arithmetic operator
add_ans = a+b
print(add_ans)
output:
[ 7 77 23 130]
Subtraction
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing subtraction using arithmetic operator
sub_ans = a-b
print(sub_ans)
output:
[ 3 67 3 70]
Multiplication
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing multiplication using arithmetic operator
mul_ans = a*b
print(mul_ans)
output:
[ 10 360 130 3000]
Division
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Performing division using arithmetic operators
div_ans = a/b
print(div_ans)
output:
[ 2.5        14.4         1.3         3.33333333]
import numpy as np
# Defining both the arrays
a = np.array([5, 72, 13, 100])
b = np.array([2, 5, 10, 30])
# Comparison, division by a scalar, and exponentiation are also element-wise
print(a>b)
print(a/2)
print(a**2)
output:
[ True True True True]
[ 2.5 36. 6.5 50. ]
[ 25 5184 169 10000]
Basic Indexing and Slicing:
Indexing and slicing apply to sequence data types. A sequence keeps its
elements in the order in which they were inserted, which allows us to access
them by position using indexing and slicing.
The built-in sequence types in Python are list, tuple, str, range, bytes,
and bytearray, and indexing and slicing apply to all of them.
NumPy indexing accesses an element of an array by its index, which starts
from 0.
Slicing a NumPy array means extracting the elements in a specific range, in
the same way that slicing obtains a substring, subtuple, or sublist from a string,
tuple, or list.
There are two types of indexing: basic and advanced. Advanced indexing is
further divided into Boolean and purely integer indexing. Negative index values
count from the end of the array.
To get specific elements from NumPy arrays, indexing and slicing are used
together: a slice is written in terms of index positions.
Indexing an Array
Indexing is used to access individual elements. It is also possible to extract
entire rows, columns, or planes from multi-dimensional arrays with numpy
indexing. Indexing starts from 0. Let's see an array example below to understand
the concept of indexing:
| Element of array | 2 | 3 | 11 | 9 | 6 | 4 | 10 | 12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
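Indexing in one dimension:
A one-dimensional array is indexed much like a Python list: a single
integer selects one element and a slice selects a range. NumPy arrays can also be
indexed with other arrays or with any other sequence such as a list; this is called
indexing using index arrays, and it selects a group of elements at once.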
Example:
import numpy as np
arr = np.arange(10)
print(arr)
#Output: [0 1 2 3 4 5 6 7 8 9]
print(arr[5])
#Output: 5
print(arr[5:8])
#Output: [5 6 7]
arr[5:8] = 12
print(arr)
#Output: [ 0  1  2  3  4 12 12 12  8  9]
As you can see, if you assign a scalar value to a slice, as in arr[5:8] = 12, the
value is propagated (or broadcast) to the entire selection. An important first
distinction from Python's built-in lists is that array slices are views on the
original array. This means that the data is not copied, and any modifications to
the view will be reflected in the source array.
To give an example of this, I first create a slice of arr:
arr_slice = arr[5:8]
print(arr_slice)
Output:
[12 12 12]
Now, when I change values in arr_slice, the mutations are reflected in the
original array arr:
arr_slice[1] = 12345
print(arr)
Output:
[    0     1     2     3     4    12 12345    12     8     9]
The "bare" slice [:] will assign to all values in an array:
arr_slice[:] = 64
print(arr)
Output:
[ 0  1  2  3  4 64 64 64  8  9]
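When an independent copy of a slice is needed instead of a view, the copy method can be called on the slice. The sketch below contrasts the two; np.shares_memory is used here only to make the difference visible.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]          # a view: shares memory with arr
copy = arr[5:8].copy()   # an independent copy

copy[:] = -1             # does not affect arr
print(arr)               # [0 1 2 3 4 5 6 7 8 9]

view[:] = 64             # writes through to arr
print(arr)               # [ 0  1  2  3  4 64 64 64  8  9]

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, copy))   # False
```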
Indexing in 2 Dimensions
Example:
import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)
print("Element at 0th row and 0th column of arr1 is:",arr1[0,0])
print("Element at 1st row and 2nd column of arr1 is:",arr1[1,2])
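O/P:
Array arr1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Element at 0th row and 0th column of arr1 is: 0
Element at 1st row and 2nd column of arr1 is: 6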
Picking a Row or Column in 2-D NumPy Array
import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)
print("\n")
print("1st row :\n",arr1[1])
print("1st column :\n",arr1[:,1])
O/P:
Array arr1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


1st row :
 [4 5 6 7]
1st column :
 [1 5 9]
Consider a two-dimensional array, arr2d. Slicing this array is a bit
different:
import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[:2])
Output:
[[1 2 3]
 [4 5 6]]
As you can see, it has sliced along axis 0, the first axis. A slice, therefore,
selects a range of elements along an axis. It can be helpful to read the
expression arr2d[:2] as “select the first two rows of arr2d.”
import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[:2, 1:])
Output:
[[2 3]
 [5 6]]
When slicing like this, you always obtain array views of the same number of
dimensions. By mixing integer indexes and slices, you get lower dimensional
slices.
For example, I can select the second row but only the first two columns like
so:
import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[1, :2])
Output:
[4 5]
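The effect of mixing integers and slices on the number of dimensions can be seen by printing the shape of each result. A small sketch using the same arr2d (the names row_1d and row_2d are illustrative):

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row_1d = arr2d[1, :2]      # integer + slice -> one dimension is dropped
row_2d = arr2d[1:2, :2]    # slice + slice   -> both dimensions are kept

print(row_1d, row_1d.shape)   # [4 5] (2,)
print(row_2d, row_2d.shape)   # [[4 5]] (1, 2)

# The same rule gives either a 1-D column or a 2-D column.
print(arr2d[:, 0].shape)      # (3,)
print(arr2d[:, :1].shape)     # (3, 1)
```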
Note that a colon by itself means to take the entire axis, so you can slice
only higher dimensional axes by doing:
import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print( arr2d[:, :1])
Output:
[[1]
 [4]
 [7]]
Of course, assigning to a slice expression assigns to the whole selection:
import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[:2, 1:] = 0
print( arr2d)
Output:
[[1 0 0]
 [4 0 0]
 [7 8 9]]
Indexing in 3 Dimensions
A 3-D array has three dimensions. Suppose we write an index as (i, j, k),
where i refers to the 1st dimension, j refers to the 2nd dimension, and k refers
to the 3rd dimension.
Example:
import numpy as np
arr=np.arange(12)
arr1=arr.reshape(2,2,3)
print("Array arr1:\n",arr1)
print("Element:",arr1[1,0,2])
O/P:
Array arr1:
 [[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]
Element: 8
Explanation: We want to access the element of the array at index (1, 0, 2).
Indexing starts from 0.
The 1st digit, 1, refers to the 1st dimension, which holds two 2-D arrays:
1st array: [[0, 1, 2], [3, 4, 5]] and 2nd array: [[6, 7, 8], [9, 10, 11]]
Selecting 1 gives the 2nd array: [[6, 7, 8], [9, 10, 11]]
The 2nd digit, 0, refers to the 2nd dimension, which holds two 1-D arrays:
Array 1: [6, 7, 8] and Array 2: [9, 10, 11]
Selecting 0 gives the 1st array: [6, 7, 8]
The 3rd digit, 2, refers to the 3rd dimension, which holds three values: 6, 7, 8
Selecting 2 gives the 3rd value, so the output is 8.
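The explanation above can be followed one step at a time in code. In this sketch the intermediate names block, row, and element are illustrative.

```python
import numpy as np

arr1 = np.arange(12).reshape(2, 2, 3)

block = arr1[1]        # 1st index: the second 2 x 3 block
row = block[0]         # 2nd index: the first row of that block
element = row[2]       # 3rd index: the third value in that row

print(block)
print(row)             # [6 7 8]
print(element)         # 8

# Indexing in one step gives the same element.
print(element == arr1[1, 0, 2])   # True
```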
Basic Slicing and indexing
Using basic indexing and slicing we can access a particular element or group
of elements from an array.
Basic indexing and slicing return views of the original array.
Basic slicing occurs when index in arr[index] is:
a slice object (constructed by start: stop: step)
an integer
or a tuple of slice objects and integers
Example:
import numpy as np
arr = np.arange(12)
print(arr)
print("Element at index 6 of an array arr:",arr[6])
print("Element from index 3 to 8 of an array arr:",arr[3:8])
O/P:
[ 0  1  2  3  4  5  6  7  8  9 10 11]
Element at index 6 of an array arr: 6
Element from index 3 to 8 of an array arr: [3 4 5 6 7]
Slicing a 2D Array
In a 2-D array, we specify start:stop twice: once for the rows and once for
the columns.
Example:
import numpy as np
arr=np.arange(12)
arr1=arr.reshape(3,4)
print("Array arr1:\n",arr1)
print("\n")
print("elements of 1st row and 1st column upto last column :\n",arr1[1:,1:4])
O/P:
Array arr1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


elements of 1st row and 1st column upto last column :
 [[ 5  6  7]
 [ 9 10 11]]
The first slice selects rows, so slicing starts from the row at index 1 and
continues to the last row, because no ending index is given. The second slice then
selects the columns at index 1 to index 3, and the result is printed as output.
Negative Slicing and Indexing
Negative indexing counts from the end of the array: the last element has
index -1, the second-to-last has index -2, and so on. Slicing can use these
negative indexes in the same way as positive ones.
Example:
import numpy as np
arr = np.array([10,20,30,40,50,60,70,80,90])
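print("Element at index 2 or -7 of an array arr:",arr[-7])
print("Sliced elements from index -8 (or 1) up to, but not including, index -3 (or 6) of an array arr:",arr[-8:-3])
O/P:
Element at index 2 or -7 of an array arr: 30
Sliced elements from index -8 (or 1) up to, but not including, index -3 (or 6) of an array arr: [20 30 40 50 60]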
Boolean Indexing:
Boolean indexing occurs when the index object is a Boolean array, that is, an
array of True and False values, typically produced by applying a condition to an array.
The elements at the positions where the Boolean array is True are returned.
This is used to filter the values of the desired elements.
Example:
import numpy as np
arr = np.array([11,6,41,10,29,50,55,45])
print(arr[arr>35])
O/P:
[41 50 55 45]
The elements that satisfy the given condition, i.e., those greater than 35,
are printed as output.
import numpy as np
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)
print(names)
print(data)
print(names == 'Bob')
print(data[names == 'Bob'])
print(data[names == 'Bob', 2:])
print(data[names == 'Bob', 3])
Output (the random values differ on every run; they are shown here rounded to four decimals):
['Bob' 'Joe' 'Will' 'Bob' 'Will' 'Joe' 'Joe']
[[ 0.0929  0.2817  0.769   1.2464]
 [ 1.0072 -1.2962  0.275   0.2289]
 [ 1.3529  0.8864 -2.0016 -0.3718]
 [ 1.669  -0.4386 -0.5397  0.477 ]
 [ 3.2489 -1.0212 -0.5771  0.1241]
 [ 0.3026  0.5238  0.0009  1.3438]
 [-0.7135 -0.8312 -2.3702 -1.8608]]
[ True False False  True False False False]
[[ 0.0929  0.2817  0.769   1.2464]
 [ 1.669  -0.4386 -0.5397  0.477 ]]
[[ 0.769   1.2464]
 [-0.5397  0.477 ]]
[1.2464 0.477 ]
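Boolean conditions can also be combined and used for assignment. The sketch below reuses the names array but replaces the random data with fixed, made-up values so that the result is reproducible.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
# Fixed, made-up values (0 to 27) instead of random data.
data = np.arange(28).reshape(7, 4)

# Combine conditions with | (or) and & (and); the parentheses are required.
mask = (names == 'Bob') | (names == 'Will')
selected = data[mask]                 # rows 0, 2, 3 and 4
print(mask)
print(selected)

# ~ negates a Boolean array: every row except Bob's.
not_bob = data[~(names == 'Bob')]
print(not_bob.shape)                  # (5, 4)

# Assigning through a Boolean mask changes only the selected elements.
data[data > 20] = 0
print(data[-2:])
```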
Transposing Arrays and Swapping Axes:
The numpy.transpose() function is frequently used in matrix computations.
This function permutes or reverses the dimensions of the given array and returns
the modified array.
For a two-dimensional array, numpy.transpose() changes the row elements
into column elements and the column elements into row elements.
Syntax
numpy.transpose(a, axes=None)
Transposing is a special form of reshaping that similarly returns a view on
the underlying data without copying anything. Arrays have the transpose
method and also the special T attribute:
Example:
import numpy as np
a= np.arange(6).reshape((2,3))
print("array",a)
b=np.transpose(a)
print("transpose array",b )
O/P:
array [[0 1 2]
 [3 4 5]]
transpose array [[0 3]
 [1 4]
 [2 5]]
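numpy.transpose() with axes
For higher dimensional arrays, transpose will accept a tuple of axis numbers to
permute the axes (for extra mind bending):
import numpy as np
a= np.arange(16).reshape(2, 2, 4)
print("array is\n",a)
print("\n")
# Reorder the axes: the second axis becomes first, the first becomes second,
# and the last axis is unchanged. (The order (0, 1, 2) would leave the array as it is.)
b=a.transpose(1, 0, 2)
print("transpose with axes\n",b)
O/P:
array is
 [[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[ 8  9 10 11]
  [12 13 14 15]]]


transpose with axes
 [[[ 0  1  2  3]
  [ 8  9 10 11]]

 [[ 4  5  6  7]
  [12 13 14 15]]]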
Simple transposing with .T is just a special case of swapping axes. ndarray
has the method swapaxes which takes a pair of axis numbers:
import numpy as np
a= np.arange(16).reshape(2, 2, 4)
print("array is\n",a)
print("\n")
b=a.swapaxes(1,2)
print("transpose with swapaxes\n",b)
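O/P:
array is
 [[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[ 8  9 10 11]
  [12 13 14 15]]]


transpose with swapaxes
 [[[ 0  4]
  [ 1  5]
  [ 2  6]
  [ 3  7]]

 [[ 8 12]
  [ 9 13]
  [10 14]
  [11 15]]]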
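The following sketch checks two points made in this section: the different ways of transposing agree, and the transpose is a view of the original data. It ends with a common use of .T, the inner matrix product computed with np.dot. The arrays are made up for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# .T, the transpose method, and np.transpose all give the same result.
print(np.array_equal(a.T, a.transpose()))     # True
print(np.array_equal(a.T, np.transpose(a)))   # True

# The transpose is a view: writing to it changes the original array.
t = a.T
t[0, 1] = 99
print(a[1, 0])    # 99

# A common use: the inner matrix product of an array with itself.
b = np.arange(6).reshape(3, 2)
product = np.dot(b.T, b)
print(product)
# [[20 26]
#  [26 35]]
```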
Universal Functions: Fast Element-Wise Array Functions
A universal function, or ufunc, is a function that performs element-wise
operations on data in ndarrays. You can think of them as fast vectorized wrappers
for simple functions that take one or more scalar values and produce one or more
scalar results.
There are two types of universal functions:
1. Unary ufuncs, which take a single array and return an array.
2. Binary ufuncs (such as add or maximum), which take two arrays and return a
single array as the result.
Unary ufuncs:
| Function | Description |
| --- | --- |
| abs, fabs | Compute the absolute value element-wise for integer, floating-point, or complex values |
| sqrt | Compute the square root of each element (equivalent to arr ** 0.5) |
| square | Compute the square of each element (equivalent to arr ** 2) |
| exp | Compute the exponent e^x of each element |
| log, log10, log2, log1p | Natural logarithm (base e), log base 10, log base 2, and log(1 + x), respectively |
| sign | Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative) |
| ceil | Compute the ceiling of each element (i.e., the smallest integer greater than or equal to that number) |
| floor | Compute the floor of each element (i.e., the largest integer less than or equal to each element) |
| rint | Round elements to the nearest integer, preserving the dtype |
| modf | Return fractional and integral parts of array as separate arrays |
| isnan | Return boolean array indicating whether each value is NaN (Not a Number) |
| isfinite, isinf | Return boolean array indicating whether each element is finite (non-inf, non-NaN) or infinite, respectively |
| cos, cosh, sin, sinh, tan, tanh | Regular and hyperbolic trigonometric functions |
| arccos, arccosh, arcsin, arcsinh, arctan, arctanh | Inverse trigonometric functions |
| logical_not | Compute truth value of not x element-wise (equivalent to ~arr) |
import numpy as np
arr = np.arange(10)
print("array is\n",arr)
print("\n")
a=np.sqrt(arr)
print("square root\n",a)
print("\n")
b=np.exp(arr)
print("exponent is\n",b)
O/p:
Binary ufuncs:
| Function | Description |
| --- | --- |
| add | Add corresponding elements in arrays |
| subtract | Subtract elements in second array from first array |
| multiply | Multiply array elements |
| divide, floor_divide | Divide or floor divide (truncating the remainder) |
| power | Raise elements in first array to powers indicated in second array |
| maximum, fmax | Element-wise maximum; fmax ignores NaN |
| minimum, fmin | Element-wise minimum; fmin ignores NaN |
| mod | Element-wise modulus (remainder of division) |
| copysign | Copy sign of values in second argument to values in first argument |
| greater, greater_equal, less, less_equal, equal, not_equal | Perform element-wise comparison, yielding boolean array (equivalent to infix operators >, >=, <, <=, ==, !=) |
| logical_and, logical_or, logical_xor | Compute element-wise truth value of logical operation (equivalent to infix operators &, \|, ^) |
import numpy as np
arr = np.arange(5)
arr1=np.arange(5,10)
print("arr",arr)
print("\n")
print("arr1",arr1)
print("add is\n",np.add(arr,arr1))
print("div is\n",np.divide(arr,arr1))
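O/P:
arr [0 1 2 3 4]


arr1 [5 6 7 8 9]
add is
 [ 5  7  9 11 13]
div is
 [0.         0.16666667 0.28571429 0.375      0.44444444]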
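Two further ufunc features are sketched below with made-up values: modf, a unary ufunc that returns two arrays, and the optional out argument, which writes the result into an existing array. The results are converted with tolist() only to keep the printed output simple.

```python
import numpy as np

x = np.array([1.5, -2.25, 3.0])
y = np.array([2.0, -3.0, 0.5])

# Binary ufunc: element-wise maximum of two arrays.
largest = np.maximum(x, y)
print(largest.tolist())       # [2.0, -2.25, 3.0]

# modf is a unary ufunc that returns two arrays:
# the fractional parts and the integral parts.
fractional, integral = np.modf(x)
print(fractional.tolist())    # [0.5, -0.25, 0.0]
print(integral.tolist())      # [1.0, -2.0, 3.0]

# The optional out argument writes the result into an existing array.
result = np.empty_like(x)
np.add(x, y, out=result)
print(result.tolist())        # [3.5, -5.25, 3.5]
```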
Mathematical and Statistical Methods:
Math Methods
NumPy contains a large number of various mathematical operations.
NumPy provides standard trigonometric functions, functions for arithmetic
operations, handling complex numbers, etc.
Trigonometric Functions
NumPy has standard trigonometric functions which return trigonometric
ratios for a given angle in radians.
| Function | Description |
| --- | --- |
| sin() | Compute sine element-wise. |
| cos() | Compute cosine element-wise. |
| tan() | Compute tangent element-wise. |
| degrees() | Convert angles from radians to degrees. |
| rad2deg() | Convert angles from radians to degrees. |
| deg2rad() | Convert angles from degrees to radians. |
| radians() | Convert angles from degrees to radians. |
numpy.sin(x): computes the trigonometric sine of every element of the array x.
numpy.cos(x): computes the trigonometric cosine of every element of the array x.
numpy.tan(x): computes the trigonometric tangent of every element of the array x.
Example
import numpy as np
a = np.array([0,30,45,60,90])
print ('Sine of different angles:' )
# Convert to radians by multiplying with pi/180
print (np.sin(a*np.pi/180) )
print ('\n' )
print ('Cosine values for angles in array:' )
print (np.cos(a*np.pi/180) )
print ('\n' )
print ('Tangent values for given angles:' )
print (np.tan(a*np.pi/180))
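Here is its output:
Sine of different angles:
[0.         0.5        0.70710678 0.8660254  1.        ]


Cosine values for angles in array:
[1.00000000e+00 8.66025404e-01 7.07106781e-01 5.00000000e-01
 6.12323400e-17]


Tangent values for given angles:
[0.00000000e+00 5.77350269e-01 1.00000000e+00 1.73205081e+00
 1.63312394e+16]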
| Function | Description |
| --- | --- |
| rint() | Round elements to the nearest integer. |
| fix() | Round to nearest integer towards zero. |
| floor() | Return the floor of the input, element-wise. |
| ceil() | Return the ceiling of the input, element-wise. |
| trunc() | Return the truncated value of the input, element-wise. |
numpy.round(a, decimals=0, out=None): This mathematical function rounds
an array to the given number of decimals. Values exactly halfway between two
rounded values are rounded to the nearest even value. (Older NumPy versions also
offered the alias numpy.round_, which was removed in NumPy 2.0.)
# Python program explaining
# the round() function
import numpy as np
in_array = [.5, 1.5, 2.5, 3.5, 4.5, 10.1]
print ("Input array : \n", in_array)
round_off_values = np.round(in_array)
print ("\nRounded values : \n", round_off_values)
in_array = [.53, 1.54, .71]
print ("\nInput array : \n", in_array)
round_off_values = np.round(in_array)
print ("\nRounded values : \n", round_off_values)
in_array = [.5538, 1.33354, .71445]
print ("\nInput array : \n", in_array)
round_off_values = np.round(in_array, decimals = 3)
print ("\nRounded values : \n", round_off_values)
Output :
Input array :
[0.5, 1.5, 2.5, 3.5, 4.5, 10.1]
Rounded values :
[ 0. 2. 2. 4. 4. 10.]
Input array :
[0.53, 1.54, 0.71]
Rounded values :
[ 1. 2. 1.]
Input array :
[0.5538, 1.33354, 0.71445]
Rounded values :
[ 0.554 1.334 0.714]
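The rounding functions in the table differ mainly in how they treat negative numbers and halfway values. The sketch below applies each of them to the same made-up values so the results can be compared side by side.

```python
import numpy as np

values = np.array([-2.5, -1.7, -0.2, 0.2, 1.7, 2.5])

rounded = np.rint(values)      # nearest integer, halfway values go to even
fixed = np.fix(values)         # towards zero
truncated = np.trunc(values)   # drops the fractional part
floored = np.floor(values)     # towards negative infinity
ceiled = np.ceil(values)       # towards positive infinity

print("rint :", rounded)
print("fix  :", fixed)
print("trunc:", truncated)
print("floor:", floored)
print("ceil :", ceiled)
```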
Statistics Methods
Numpy provides various statistical functions which are used to perform some
statistical data analysis.
Mean
Median
Range (peak to peak)
Standard Deviation
Variance
np.mean
Computes the arithmetic mean of an array along a specific axis. By default,
the mean is taken over the flattened array. For example:
import numpy as np
array_for_mean = np.array([[1, 2], [3, 4]])
m=np.mean(array_for_mean) #default
print(array_for_mean)
print("mean",m)
y=np.mean(array_for_mean, axis = 1) #mean to be calculated along axis
print("mean axis",y)
output:
[[1 2]
[3 4]]
mean 2.5
mean axis [1.5 3.5]
np.median
Computes the median of the array along a specific axis. The median is the
middle value in a sorted (ascending or descending) list of numbers.
import numpy as np
array_for_median = np.array([[10, 7, 4], [3, 2, 1]])
x=np.median(array_for_median) #default
print(array_for_median)
print("Median", x)
output:
[[10 7 4]
[ 3 2 1]]
Median 3.5
Arranging the elements of the array in ascending order gives 1, 2, 3, 4, 7,
10. The list has an even number of elements (6), so there is no single middle
value. The median is therefore the average of the two middle values, the 3rd and
4th elements: (3 + 4) / 2 = 7 / 2 = 3.5
np.ptp
measures the range along a specific axis of an array. The range is the
difference between the maximum and minimum values in a matrix/array.
import numpy as np
array_for_range = np.array([[-85, 60, 94, 53],
[3, -12, 54, 14],
[32, 45, -66, 36]])
x=np.ptp(array_for_range)
print(x)
output:
179
The maximum value is 94 while the minimum value is -85, so the range is
94 - (-85) = 94 + 85 = 179
np.std
The standard deviation measures the spread of the values around their mean
value. np.std is the NumPy function that computes the standard deviation
of an array along a specified axis.
import numpy as np
array_for_stddev = np.array([[7, 8, 9], [10, 11, 12]])
x=np.std(array_for_stddev)
print(array_for_stddev)
print("standard deviation",x)
output:
[[ 7 8 9]
[10 11 12]]
standard deviation 1.707825127659933
np.var
np.var is the NumPy function that measures the variance of an array along
a specified axis. Variance is the average of the squared deviations of all
observed values from their mean.
import numpy as np
array_for_variance = np.array([[1, 2, 3], [6, 7, 8]])
x=np.var(array_for_variance) #default
print(array_for_variance)
print("variance",x)
output:
[[1 2 3]
[6 7 8]]
variance 6.916666666666667
| Method | Description |
| --- | --- |
| sum | Sum of all the elements in the array or along an axis; zero-length arrays have sum 0 |
| mean | Arithmetic mean; zero-length arrays have NaN mean |
| std, var | Standard deviation and variance, respectively, with optional degrees of freedom adjustment (default denominator n) |
| min, max | Minimum and maximum |
| argmin, argmax | Indices of minimum and maximum elements, respectively |
| cumsum | Cumulative sum of elements starting from 0 |
| cumprod | Cumulative product of elements starting from 1 |
Table 4-5. Basic array statistical methods
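Several of the methods in Table 4-5 are sketched below on made-up data, including the axis argument and the degrees of freedom adjustment (the ddof parameter of var and std).

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(scores.sum())            # 21
print(scores.sum(axis=0))      # column sums: [5 7 9]
print(scores.mean(axis=1))     # row means:   [2. 5.]
print(scores.argmax())         # 5 (index in the flattened array)
print(scores.cumsum())         # [ 1  3  6 10 15 21]
print(scores.cumprod(axis=1))  # running product along each row

# ddof=1 divides by n - 1 (sample variance) instead of n.
data = np.array([1, 2, 3, 6, 7, 8])
population_var = data.var()
sample_var = data.var(ddof=1)
print(population_var, sample_var)
```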
Sorting:
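Like Python's built-in list type, NumPy arrays can be sorted in-place using the sort
method. The np.sort function, used in the example below, instead returns a sorted
copy and leaves the original array unchanged.
Syntax
ndarray.sort(axis=-1, kind=None, order=None)
| Parameter | Description |
| --- | --- |
| axis | Optional. Axis along which to sort. The default, -1, sorts along the last axis. |
| kind | Optional. Sorting algorithm to use: 'quicksort', 'mergesort', 'heapsort', or 'stable'. |
| order | Optional. For arrays with named fields, the field or fields to compare first. |
Unlike list.sort, the NumPy sort has no reverse or key parameter; to sort in
descending order, sort in ascending order and then reverse the result.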
Example:
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
print("\n")
arr1 = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr1))
print("\n")
arr2 = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr2))
O/P:
[0 1 2 3]


['apple' 'banana' 'cherry']


[[2 3 4]
 [0 1 5]]
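The difference between the np.sort function and the in-place sort method, along with descending order and sorting along another axis, can be sketched as follows (the values are made up).

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)     # returns a new sorted array
print(arr)                     # [3 2 0 1]  (unchanged)
print(sorted_copy)             # [0 1 2 3]

arr.sort()                     # sorts arr itself and returns None
print(arr)                     # [0 1 2 3]

# Descending order: sort ascending, then reverse.
descending = np.sort(arr)[::-1]
print(descending)              # [3 2 1 0]

# axis=0 sorts each column of a 2-D array instead of each row.
arr2 = np.array([[3, 2, 4], [5, 0, 1]])
by_column = np.sort(arr2, axis=0)
print(by_column)
# [[3 0 1]
#  [5 2 4]]
```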
Unique and Other Set Logic
NumPy has some basic set operations for one-dimensional ndarrays. Probably the
most commonly used one is np.unique, which returns the sorted unique values in
an array.
Example:
import numpy as np
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
print( np.unique(names))
ints = np.array([3, 3, 3, 2, 2, 1, 1, 4, 4])
print( np.unique(ints))
O/P:
['Bob' 'Joe' 'Will']
[1 2 3 4]
Another function, np.in1d, tests membership of the values in one array in
another, returning a boolean array. Newer NumPy versions recommend np.isin,
which gives the same result for one-dimensional input.
Example:
import numpy as np
values = np.array([6, 0, 0, 3, 2, 5, 6])
print( np.in1d(values, [2, 3, 6]))
O/P:
[ True False False  True  True False  True]
Array set operations
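As a brief sketch of the set functions NumPy provides for one-dimensional arrays, the example below applies several of them to two small made-up arrays, x and y.

```python
import numpy as np

x = np.array([1, 2, 3, 3, 4])
y = np.array([3, 4, 5, 5])

print(np.unique(x))           # [1 2 3 4]
print(np.intersect1d(x, y))   # [3 4]       (in both x and y)
print(np.union1d(x, y))       # [1 2 3 4 5] (in x or y)
print(np.setdiff1d(x, y))     # [1 2]       (in x but not in y)
print(np.setxor1d(x, y))      # [1 2 5]     (in exactly one of the two)
print(np.isin(x, y))          # [False False  True  True  True]
```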