3 Introduction To Numpy
7/18/24, 11:44 AM 3-introduction-to-numpy.

ipynb - Colab

In this series of articles, we will cover the basics of data analysis using Python. The lessons build on one another gradually,
helping students form a solid analytical mindset. This lesson covers the essentials of scientific computing in Python using
NumPy.

What is NumPy?
NumPy is short for Numerical Python and, as the name indicates, it is designed for scientific computing. The basic object in
NumPy is the ndarray, which is short for n-dimensional array; in a mathematical context, it means a multi-dimensional array.

Many mathematical operations, such as differentiation, optimization, and solving simultaneous equations, are easiest to carry out when expressed in
matrix form, which was the purpose of programming languages like MATLAB.

Unlike Python's built-in sequences, the ndarray has some interesting properties that ease mathematical computation:

NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a
new array and delete the original.
The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory.
NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are
executed more efficiently and with less code than is possible using Python’s built-in sequences.
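
The three properties above can be sketched in a few lines. This is a minimal illustration, not part of the original notebook; the variable names (`values`, `bigger`, `mixed`, `doubled`) are made up for the example.

```python
import numpy as np

# A Python list can grow in place.
values = [1, 2, 3]
values.append(4)

# An ndarray cannot: np.append returns a new, larger array
# and leaves the original array untouched.
arr = np.array([1, 2, 3])
bigger = np.append(arr, 4)
print(arr.shape, bigger.shape)

# All elements share one data type, so mixing ints and floats
# upcasts every element to float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# One expression operates on every element, with no explicit loop.
doubled = arr * 2
print(doubled)
```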

Data analytics itself will not involve many mathematical computation problems. Later on, however, when we start working with data in
tables, you will find that a table is more or less a 2-dimensional array. That is why it is essential to know a bit about arrays, which will
become tables of data in future lessons.

What is an array?
An array is a mathematical object that holds numbers organized in rows and columns. The structure of the array allows
selecting (indexing) any of its items. Later on we will see how to do this in code.

Below is a graph for the structure of arrays.

https://colab.research.google.com/drive/1Eu1iJwqopohMA9DYh1T5AhPVwsSwgF3J#printMode=true 1/9

Creating a NumPy array


import numpy as np

# Create a 2-d array


arr2d = np.array([[1, 2, 3], [4, 5, 6]])
# Print its content
print(arr2d)
# Print array type
print(type(arr2d))

[[1 2 3]
[4 5 6]]
<class 'numpy.ndarray'>

# Let's create a single-row array. The double brackets make it 2-d with shape (1, 3); np.array([1, 2, 3]) would be truly 1-d


arr1d = np.array([[1, 2, 3]])
# print its content
print(arr1d)
# print array type
print(type(arr1d))

[[1 2 3]]
<class 'numpy.ndarray'>
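
Note that the double brackets in the cell above produce a 2-d array with a single row, not a 1-d array. A minimal sketch of the difference, using the hypothetical names `flat` and `row`:

```python
import numpy as np

# One pair of brackets: a truly 1-d array.
flat = np.array([1, 2, 3])
# Two pairs of brackets: a 2-d array that happens to have one row.
row = np.array([[1, 2, 3]])

# ndim is the number of dimensions; shape lists the length of each one.
print(flat.ndim, flat.shape)
print(row.ndim, row.shape)
```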

Array Shape


Array shape is one of the most important things to keep track of when dealing with arrays and array math in general.

For a 2-d array it is simply: shape = (N rows, N columns)


It is important to know, especially when multiplying arrays. Let's see it in a more visual way.


Let's see how to get this info in Python using .shape

print(arr2d.shape)
print(arr1d.shape)

(2, 3)
(1, 3)
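
Shape matters most when multiplying arrays: for a matrix product, the number of columns of the first array must equal the number of rows of the second. The sketch below uses illustrative arrays `a` and `b` that are not part of the original notebook.

```python
import numpy as np

a = np.ones((2, 3))  # 2 rows, 3 columns
b = np.ones((3, 2))  # 3 rows, 2 columns

# (2, 3) times (3, 2) is valid and gives a (2, 2) result.
product = np.dot(a, b)
print(product.shape)

# (2, 3) times (2, 3) is not valid: the inner sizes 3 and 2 do not match.
try:
    np.dot(a, a)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```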

Special Types of Arrays


In some cases, we will need to create special types of arrays, such as an array of zeros, an identity array, or an array of ones. Let's see some
examples that can be implemented using NumPy.

np.zeros() : creates an array of all zeros.
np.ones() : creates an array of all ones.
np.empty() : creates an uninitialized array; its contents are arbitrary leftover memory values, not proper random numbers.
np.full() : creates an array filled with the same number.
np.eye() : creates an identity array.

And much more!

zeros = np.zeros((2,2)) # Create an array of all zeros


print(zeros)

[[0. 0.]
[0. 0.]]

ones = np.ones((5,2)) # Create an array of all ones


print(ones)

[[1. 1.]
[1. 1.]
[1. 1.]
[1. 1.]
[1. 1.]]

full = np.full((5,4), -9) # Create a constant array


print(full)

[[-9 -9 -9 -9]
[-9 -9 -9 -9]
[-9 -9 -9 -9]
[-9 -9 -9 -9]
[-9 -9 -9 -9]]

eye = np.eye(5) # Create a 5x5 identity matrix


print(eye)


[[1. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 1. 0. 0.]
[0. 0. 0. 1. 0.]
[0. 0. 0. 0. 1.]]

np.random.seed(0) # Fix the seed for reproducible values; try commenting and uncommenting this line


random = np.random.random((2,2)) # Create an array filled with random values
print(random)

[[0.5488135 0.71518937]
[0.60276338 0.54488318]]
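
The effect of the seed can be checked directly. The following is a small sketch (the names `first`, `second`, and `third` are illustrative): resetting the seed replays the same sequence of values, while continuing without resetting gives new values.

```python
import numpy as np

# Same seed, same values.
np.random.seed(0)
first = np.random.random((2, 2))

np.random.seed(0)
second = np.random.random((2, 2))
print(np.array_equal(first, second))

# Without resetting the seed, the generator moves on to new values.
third = np.random.random((2, 2))
print(np.array_equal(first, third))
```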

Array Slicing and Indexing


Array slicing means selecting a part of an array rather than the whole array. Indexing was touched on previously.

# Create the following rank-2 array with shape (3, 4)


myarr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(myarr)

[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]

Suppose that we want to select the second row of myarr.

This is called slicing. It can be done using the following syntax.

arr[start_index_for_rows:end_index_for_rows, start_index_for_columns:end_index_for_columns]

Just keep in mind two things:

Python is a zero-indexed language, so the first column is column zero, and the same applies to rows.
The end boundary in the above syntax is exclusive, so the slice stops just before that boundary.

# Let's write the syntax for selecting the second row.


row_r2 = myarr[1:2, :]
print(row_r2)

[[5 6 7 8]]

Now, let's try selecting the second column in the same manner.

col_c2 = myarr[:, 1:2]


print(col_c2)

[[ 2]
[ 6]
[10]]

Now, let's select the slice made up of the first two rows and the first two columns.

myarr_slice = myarr[:2, :2]


print(myarr_slice)

[[1 2]
[5 6]]
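
One detail worth noting about the slices above: a range such as 1:2 keeps the dimension, while a plain integer index drops it. A minimal sketch, where `row_slice` and `row_index` are illustrative names:

```python
import numpy as np

myarr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Slicing with a range keeps two dimensions: the result has shape (1, 4).
row_slice = myarr[1:2, :]
# Using a single integer drops that dimension: the result has shape (4,).
row_index = myarr[1, :]

print(row_slice.shape)
print(row_index.shape)

# Both hold the same values from the second row.
print(row_index)
```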

Integer Array Indexing


What if we want to select some specific elements in the array?

We select them by their mathematical (row, column) positions, applying zero-based indexing.

The mathematical way of indexing is as follows.


Let's see some code examples.

# Let's define a new array


newarr = np.array([[1,2], [3, 4], [5, 6]])
# Print it
print(newarr)

[[1 2]
[3 4]
[5 6]]

Now we will try to select the items at the following (row, column) indexes:

(0, 0)
(1, 1)
(2, 0)

The values will be 1 , 4 , and 5 respectively.

print(newarr[[0, 1, 2], [0, 1, 0]]) # Prints "[1 4 5]"

[1 4 5]
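
Integer array indexing pairs up the row list and the column list position by position. The sketch below, with the illustrative names `rows`, `cols`, `picked`, and `one_by_one`, shows that it gives the same result as looking up each (row, column) pair individually.

```python
import numpy as np

newarr = np.array([[1, 2], [3, 4], [5, 6]])

rows = [0, 1, 2]
cols = [0, 1, 0]

# Integer array indexing: selects (0, 0), (1, 1) and (2, 0) in one step.
picked = newarr[rows, cols]

# The same selection done one element at a time.
one_by_one = np.array([newarr[r, c] for r, c in zip(rows, cols)])

print(picked)
print(np.array_equal(picked, one_by_one))
```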

Boolean Indexing


In some scenarios, the task is to select elements based on some criteria, such as the elements greater than 2, or less than or equal to -1. Luckily, NumPy
can do this type of indexing easily without the need to construct any loops. Let's see how!

# Define an array
otherNewArray = np.array([[1,2], [3, 4], [5, 6]])
# Let's print it
print(otherNewArray)
# Construct a boolean index (to check for elements greater than 2)
bool_idx = (otherNewArray > 2)
# Print the result of the boolean index
print(bool_idx)
# Now we will use this index to print all elements greater than 2
print(otherNewArray[bool_idx])

[[1 2]
[3 4]
[5 6]]
[[False False]
[ True True]
[ True True]]
[3 4 5 6]
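
The boolean index does not need its own variable, and several criteria can be combined. This is a hedged sketch rather than part of the original notebook; `data`, `both`, and `either` are illustrative names. Note that each condition needs its own parentheses, and that & and | are used instead of the Python keywords and / or.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])

# The condition can be written directly inside the brackets.
print(data[data > 2])

# Combine criteria with & (and) and | (or).
both = data[(data > 2) & (data % 2 == 0)]   # greater than 2 and even
either = data[(data <= 1) | (data >= 6)]    # at most 1, or at least 6

print(both)
print(either)
```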


Data Types in NumPy


NumPy has some data types that can be referred to with a single character, like i for integers, u for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them.

i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )

In general, we will not use all of them. Only the most common ones, such as integer , float , and string , are used heavily.

Now, let's see how to check the data type of a NumPy array.

x = np.array([1, 2])
print(x.dtype)

int64

Here, the data type of the inner elements, which must all be the same, is int64 (the default integer type on most 64-bit platforms).

y = np.array([1.0, 2.0])
print(y.dtype)

float64

This one is float64.

While creating a NumPy array, we can force a specific data type. Let's see the following example.

z = np.array([1.0, 2.0], dtype='S')


print(z.dtype)

|S3

Here, the elements of the array z are byte strings of length 3 (the S type), not Python str objects. Let's define a float array.

f = np.array([1, 2], dtype='f')


print(f.dtype)

float32
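
The one-character codes can also carry a size in bytes, and an existing array can be converted with astype. The sketch below is an illustration under those assumptions, with made-up names (`ints`, `floats`, `truncated`); note that converting floats to integers drops the fractional part rather than rounding.

```python
import numpy as np

# 'i4' means a 4-byte (32-bit) integer, 'f8' an 8-byte (64-bit) float.
ints = np.array([1, 2], dtype='i4')
floats = np.array([1, 2], dtype='f8')
print(ints.dtype, ints.itemsize)
print(floats.dtype, floats.itemsize)

# astype returns a converted copy; float to int discards the fraction.
truncated = np.array([1.7, 2.2]).astype('i4')
print(truncated)
```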

Array Math


NumPy supports a wide range of mathematical operations on arrays. Let's see some examples.

First, we will define two arrays to which all of the following operations will be applied.

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)


Elementwise sum


print(x + y)
print('='*10)
print(np.add(x, y))
print('='*10)

[[ 6. 8.]
[10. 12.]]
==========
[[ 6. 8.]
[10. 12.]]
==========

Elementwise difference


print(x - y)
print('='*10)
print(np.subtract(x, y))
print('='*10)

[[-4. -4.]
[-4. -4.]]
==========
[[-4. -4.]
[-4. -4.]]
==========

Elementwise product


print(x * y)
print('='*10)
print(np.multiply(x, y))
print('='*10)

[[ 5. 12.]
[21. 32.]]
==========
[[ 5. 12.]
[21. 32.]]
==========

Elementwise division


print(x / y)
print('='*10)
print(np.divide(x, y))
print('='*10)

[[0.2 0.33333333]
[0.42857143 0.5 ]]
==========
[[0.2 0.33333333]
[0.42857143 0.5 ]]
==========

Elementwise square root


print(np.sqrt(x))

[[1. 1.41421356]
[1.73205081 2. ]]

Transpose of a Matrix



print(x)
print('='*10)
print(x.T)

[[1. 2.]
[3. 4.]]
==========
[[1. 3.]
[2. 4.]]

Dot Product


np.dot(x, y)

array([[19., 22.],
[43., 50.]])
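
It is easy to confuse the dot product with the elementwise product. As a short sketch using the same x and y values as above (the names `elementwise`, `matrix_product`, and `with_operator` are illustrative), the * operator multiplies matching entries, while np.dot and the @ operator perform matrix multiplication.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

# Elementwise product: each entry of x times the matching entry of y.
elementwise = x * y

# Matrix product: rows of x combined with columns of y.
matrix_product = np.dot(x, y)
# The @ operator is equivalent to np.dot for 2-d arrays.
with_operator = x @ y

print(elementwise)
print(matrix_product)
print(np.array_equal(matrix_product, with_operator))
```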

Broadcasting
The term "broadcasting" describes how NumPy handles arrays of different shapes during arithmetic operations. Subject to certain constraints,
the smaller array is broadcast across the bigger array so that they have compatible shapes.

Because NumPy's core is implemented in C, broadcasting offers a way to vectorize array operations so that looping happens in C rather than in Python.
It does this without making needless copies of data, which usually leads to efficient algorithm implementations.
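
Whether two shapes are compatible is decided by comparing them from the last dimension backwards: each pair of sizes must be equal, or one of them must be 1 (or missing). A minimal sketch with illustrative arrays `grid`, `row`, and `col`:

```python
import numpy as np

grid = np.arange(12).reshape(4, 3)           # shape (4, 3)
row = np.array([1, 0, 1])                    # shape (3,)
col = np.array([[10], [20], [30], [40]])     # shape (4, 1)

# (4, 3) with (3,): the trailing sizes match, so row is applied to every row.
print((grid + row).shape)
# (4, 3) with (4, 1): the size-1 dimension is stretched across the columns.
print((grid + col).shape)

# (4, 3) with (4,): trailing sizes 3 and 4 differ, so broadcasting fails.
try:
    grid + np.array([1, 2, 3, 4])
    failed = False
except ValueError:
    failed = True
print(failed)
```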

In the following example, we need to add the elements of the vector v to each row of the array x . We will do this using two methods:

The conventional Python loop.


The broadcasting method.

x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])


v = np.array([1, 0, 1])
print(x)
print('='*10)
print(v)

[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
==========
[1 0 1]

Let's create an empty array y with the same shape as x that will hold the result of the addition.

%%time
# The %%time magic above reports the execution time for the whole cell
# Create an empty matrix with the same shape as x
y = np.empty_like(x)
# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

print(y)

[[ 2 2 4]
[ 5 5 7]
[ 8 8 10]
[11 11 13]]
CPU times: user 275 µs, sys: 37 µs, total: 312 µs
Wall time: 308 µs

Now, let's use the concept of broadcasting.

%%time
z = x + v
print(z)


[[ 2 2 4]
[ 5 5 7]
[ 8 8 10]
[11 11 13]]
CPU times: user 692 µs, sys: 0 ns, total: 692 µs
Wall time: 672 µs

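Broadcasting is clearly easier to implement. On an array this small, the single-run timings above are dominated by overhead and noise (here the broadcasting cell even measured slower), but on large arrays broadcasting is typically much faster than an explicit Python loop.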
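
A single %%time run on a 4x3 array is too noisy to compare the two methods. The sketch below is one possible fairer comparison, not the notebook's own benchmark: it assumes a larger array (the size of 20,000 rows is arbitrary) and uses the standard timeit module. The function names are illustrative, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

# A larger array than in the lesson, so the loop cost becomes visible.
x = np.arange(60_000).reshape(20_000, 3)
v = np.array([1, 0, 1])

def add_with_loop():
    # Add v to each row with an explicit Python loop.
    y = np.empty_like(x)
    for i in range(x.shape[0]):
        y[i, :] = x[i, :] + v
    return y

def add_with_broadcasting():
    # Let NumPy broadcast v across all rows.
    return x + v

# Both methods must give the same result.
same_result = np.array_equal(add_with_loop(), add_with_broadcasting())
print(same_result)

# Time several runs of each method instead of a single one.
loop_time = timeit.timeit(add_with_loop, number=3)
broadcast_time = timeit.timeit(add_with_broadcasting, number=3)
print(f"loop: {loop_time:.4f} s, broadcasting: {broadcast_time:.4f} s")
```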

This notebook is part of my Python for Data Analysis course. If you find it useful, you can upvote it! Also, you can follow me on LinkedIn and
Twitter.

Below are the contents of the whole course:

1. Introduction to Python
2. Iterative Operations & Functions in Python
