Data Science

By:
Dharna Ahuja
What is Data Science?
• Data Science is about data gathering, analysis and
decision-making.

• Data Science is about finding patterns in data through analysis and using them to make predictions.

• By using Data Science, companies are able to make:

• Better decisions (should we choose A or B)
• Predictive analysis (what will happen next?)
• Pattern discoveries (finding patterns, or hidden
information, in the data)
Where is Data Science Needed?
• Data Science is used in many industries in the world today, e.g.
banking, consultancy, healthcare, and manufacturing.
• Examples of where Data Science is needed:
• For route planning: To discover the best routes to ship
• To foresee delays for flight/ship/train etc. (through predictive analysis)
• To create promotional offers
• To find the best suited time to deliver goods
• To forecast the next year's revenue for a company
• To analyze the health benefits of training
• To predict who will win elections
How Does a Data Scientist Work?
• A Data Scientist must find patterns within the data. Before he/she can find
the patterns, he/she must organize the data in a standard format.
• Here is how a Data Scientist works:
1. Ask the right questions - To understand the business problem.
2. Explore and collect data - From database, web logs, customer
feedback, etc.
3. Extract the data - Transform the data to a standardized format.
4. Clean the data - Remove erroneous values from the data.
5. Find and replace missing values - Check for missing values and replace
them with a suitable value (e.g. an average value).
6. Normalize data - Scale the values to a common, practical range (e.g. 140 cm is
smaller than 1.8 m, yet the number 140 is larger than 1.8, so values must
share a unit and scale before they are compared).
7. Analyze data, find patterns and make future predictions.
8. Represent the result - Present the result with useful insights in a way
the "company" can understand.
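Steps 5 and 6 above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the height values are made up, and the rule "anything below 3 is in metres" is an assumption chosen for this example, not a general method.

```python
import numpy as np

# Hypothetical heights: most in centimetres, one in metres, one missing.
heights = np.array([140.0, 180.0, np.nan, 1.8, 165.0])

# Bring everything to one unit (assumption: values below 3 are metres).
heights_cm = np.where(heights < 3, heights * 100, heights)

# Step 5: replace the missing value with the mean of the known values.
mean_cm = np.nanmean(heights_cm)
filled = np.where(np.isnan(heights_cm), mean_cm, heights_cm)

# Step 6: min-max scaling to the range [0, 1].
scaled = (filled - filled.min()) / (filled.max() - filled.min())

print(filled)
print(scaled)
```

After scaling, the smallest height maps to 0 and the largest to 1, so the values can be compared regardless of the original unit.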
What is Data?
• Data is a collection of information.
• One purpose of Data Science is to structure data, making it
interpretable and easy to work with.
• Data can be categorized into two groups:
• Structured data
• Unstructured data
Unstructured Data
• Unstructured data is not organized. We must
organize the data for analysis purposes.
Structured Data
• Structured data is organized and
easier to work with.
Data perspective
• Read data

• Data processing and cleaning

• Summarizing data

• Visualization

• Deriving insights from data

Data science using Python
• Python libraries provide key feature sets which are essential
for data science.
• Data manipulation and pre-processing
– Python’s pandas library offers a variety of functions for data wrangling
and manipulation
• Data summary
• Visualization
– Plotting libraries like ‘matplotlib’ and ‘seaborn’ help condense
statistical information and identify trends and
relationships.
• Machine learning libraries like ‘scikit-learn’ offer a wide range of
machine learning algorithms.
Introduction
• We live in a world that’s drowning in data.

• Data is generated from various sources.

– Websites track every user’s every click.

– Your smartphone is building up a record of your location.
– Sensors from electronic devices record real time information.
– E-commerce websites collect purchasing habits
Popular tools used in data science
• Data pre-processing and analytics
– Python, R, Microsoft Excel, SAS, SPSS

• Data exploration and visualization

– Tableau, Qlikview, Microsoft Excel

• Parallel and distributed computing in case of big data

– Apache Spark, Apache Hadoop
Jupyter notebook
• Web application for creating and editing documents called
‘notebooks’, which combine code, output and text.

• Supported on Linux, macOS and Windows platforms.
• Available as open-source software.
Python-Numpy
• NumPy stands for Numerical Python.

• Fundamental package for numerical computations in Python.

• Supports N-dimensional array objects that can be used for
processing multi-dimensional data.

• Supports different data types.

• Using Numpy we can perform

– Mathematical and logical operations on arrays
– Fourier transforms
– Linear algebra operations
– Random number generation
Why Use NumPy?
• In Python we have lists that serve the purpose of arrays,
but they are slow to process.
• NumPy aims to provide an array object that is up to 50x
faster than traditional Python lists.
• The array object in NumPy is called ndarray.
• Arrays are very frequently used in data science, where speed and
resources are very important.
Why is NumPy Faster Than Lists?
• NumPy arrays are stored in one contiguous block of
memory, unlike lists, so processes can access and
manipulate them very efficiently.
• This behavior is called locality of reference in computer
science.
• This is the main reason why NumPy is faster than lists. It is also
optimized to work with the latest CPU architectures.
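As a rough illustration of the points above, the sketch below checks that an array is stored contiguously and compares a pure-Python loop with the equivalent vectorized call. It checks only that both give the same answer; it does not measure speed, and the array size is an arbitrary choice for the example.

```python
import numpy as np

values = list(range(1_000))
arr = np.array(values)

# The array's elements sit in one contiguous block of memory.
print(arr.flags['C_CONTIGUOUS'])

# Pure-Python loop over the list elements.
list_total = 0
for v in values:
    list_total += v * v

# Vectorized equivalent: one call that runs over the whole block.
array_total = int(np.sum(arr * arr))

print(list_total == array_total)
```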
Installation of NumPy

pip install numpy

Create an array
• An array is an ordered, fixed-length collection of elements of a
basic data type.
• Syntax: numpy.array(object)
import numpy as np
x = np.array([2, 3, 4, 5])
print(type(x))  # output: <class 'numpy.ndarray'>
print(x)        # output: [2 3 4 5]
Arrays
• NumPy accepts mixed entries (e.g. numbers and text), but all elements are
coerced to the same data type.
x = np.array([2, 3, 'n', 5])
print(x)  # every element becomes a string: ['2' '3' 'n' '5']

0-D Arrays
• 0-D arrays, or Scalars, are the
elements in an array. Each value in
an array is a 0-D array.
• import numpy as np

arr = np.array(42)

print(arr)
1-D Arrays
• An array that has 0-D arrays as its elements is called a uni-dimensional
or 1-D array.
• These are the most common and basic arrays.

• import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)
2-D Arrays
• An array that has 1-D arrays as its elements is called a 2-D
array.
• These are often used to represent matrices or 2nd-order
tensors.
• import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)
3-D arrays
• An array that has 2-D arrays (matrices) as its elements is
called 3-D array.
• These are often used to represent a 3rd order tensor.
• import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3],

[4, 5, 6]]])

print(arr)
Check Number of Dimensions?
• NumPy arrays provide the ndim attribute, an
integer that tells us how many dimensions the array has.
• import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3],
[4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
NumPy Array Indexing
• Access Array Elements
• import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[0])
• Get third and fourth elements from the following array and add them.
• import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[2] + arr[3])
Generate arrays using linspace()
• numpy.linspace() - returns evenly spaced numbers over the
given range; the count is set by the number of samples.

• Syntax: numpy.linspace(start, stop, num, endpoint, retstep, dtype)

• start - start of the interval

• stop - end of the interval
• num - number of samples to generate (default: 50)
• endpoint - if True (the default), stop is included as the last sample
• retstep - if True, return the step size along with the samples (default: False)
• dtype - data type of the output array
Example
• Generate an array b with start=1 and stop=5:
• b = np.linspace(start=1, stop=5, num=10, endpoint=True, retstep=False)
• endpoint=True includes the stop value as
the last sample.
• retstep=False returns only the samples, not
the step size.
• print(b)
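To make the effect of retstep and endpoint concrete, here is a small sketch using the same range. Using num=5 is just a choice that keeps the numbers easy to read.

```python
import numpy as np

# retstep=True returns a (samples, step) tuple.
samples, step = np.linspace(start=1, stop=5, num=5, endpoint=True, retstep=True)
print(samples)  # [1. 2. 3. 4. 5.]
print(step)     # 1.0

# endpoint=False leaves out the stop value, so the step becomes (5 - 1) / 5.
open_samples, open_step = np.linspace(1, 5, num=5, endpoint=False, retstep=True)
print(open_samples)
print(open_step)
```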
Generate arrays using arange()
• numpy.arange() - returns evenly spaced numbers within the
given range, based on the step size; the stop value itself is excluded.

• Syntax: numpy.arange(start,stop,step)

• Start- start of interval range.

• Stop- end of interval range

• Step- step size of interval

Example
• Generate an array with start=2 and stop=10 by
specifying step=2:

• d = np.arange(start=2, stop=10, step=2)  # [2 4 6 8]; stop is excluded
Generate arrays using ones()
• numpy.ones() - returns an array of the given shape and type filled
with ones.

• Syntax: numpy.ones(shape, dtype)

• shape - integer or sequence of integers

• dtype - data type (default: float)

• np.ones((3, 4), dtype='int') creates an array of ones with 3 rows and 4 columns.

Generate arrays using zeros()
• numpy.zeros() - returns an array of the given shape and type
filled with zeros.

• Syntax: numpy.zeros(shape, dtype)

• shape - integer or sequence of integers

• dtype - data type (default: float)

• np.zeros((3, 4)) creates a float array of zeros with 3 rows and 4 columns.
Generate arrays using random.rand()
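• numpy.random.rand() - returns an array of the given shape filled
with random values drawn uniformly from [0, 1).

• Syntax: numpy.random.rand(d0, d1, ..., dn)

• d0, d1, ..., dn - integers giving the size of each dimension (passed as
separate arguments, not as a tuple).

• np.random.rand(5) returns a 1-D array of 5 random values.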
Generate arrays using random.rand()
• Generate an array of random values with 5
rows and 2 columns:

• np.random.rand(5, 2) returns an array with 5 rows and 2 columns.
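The three generators can be compared side by side. This is a minimal sketch; the fixed seed is an assumption added only so that repeated runs print the same random values.

```python
import numpy as np

ones = np.ones((3, 4), dtype=int)   # 3 rows, 4 columns of integer ones
zeros = np.zeros((3, 4))            # float zeros by default

np.random.seed(0)                   # assumption: fixed seed for repeatable output
rand = np.random.rand(5, 2)         # dimensions passed as separate arguments

print(ones.shape, ones.dtype)
print(zeros.dtype)
print(rand.shape, rand.min() >= 0.0, rand.max() < 1.0)
```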
Generate arrays using logspace()
• numpy.logspace() - returns numbers spaced evenly on a
log scale.
• Syntax:
– numpy.logspace(start, stop, num, endpoint, base, dtype)
• start - the sequence starts at base ** start.
• stop - the sequence ends at base ** stop (when endpoint is True).
• num - number of samples to generate (default: 50)
• endpoint - if True, stop is the last sample.
• base - base of the log space (default: 10.0)
• dtype - data type of the output array.
• Generate an array with 5 samples with base
10.0:
• np.logspace(1, 10, num=5, endpoint=True,
base=10.0)
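One way to read logspace() is as linspace() applied to the exponents. The sketch below checks that reading for the example above; the variable names are illustrative only.

```python
import numpy as np

exponents = np.linspace(1, 10, num=5)   # evenly spaced exponents from 1 to 10
log_samples = np.logspace(1, 10, num=5, endpoint=True, base=10.0)

print(log_samples[0], log_samples[-1])  # first and last sample: 10**1 and 10**10
print(np.allclose(log_samples, 10.0 ** exponents))
```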
Reshaping an array
• reshape()-> recasts an array to new shape
without changing its data.
• grid=np.arange(start=1,stop=10).reshape(3,3)
• print(grid)
• import numpy as np
• arr=np.array([1,2,3,4,5,6,7,8,9,10,11,12])
• newarr=arr.reshape(4,3)
• print(newarr)
Reshape From 1-D to 3-D
• Convert the following 1-D array with 12
elements into a 3-D array.
• The outermost dimension will have 2 arrays
that each contain 3 arrays of 2 elements:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(2, 3, 2)

print(newarr)
Can We Reshape Into any Shape?
• Yes, as long as both shapes hold the same number of elements.
• We can reshape an 8-element 1-D array into a 2-D array with 2 rows
of 4 elements, but we cannot reshape it into a 2-D array with 3 rows
of 3 elements, as that would require 3x3 = 9 elements.

• import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

newarr = arr.reshape(3, 3)  # raises ValueError: 8 elements cannot fill a 3x3 array

print(newarr)
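The element-count rule can be checked directly. The sketch below shows a valid reshape, the -1 shorthand that lets NumPy infer one dimension, and the error raised by an incompatible shape; the try/except wrapper is only there so the script keeps running.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

ok = arr.reshape(2, 4)         # 2 * 4 == 8, so this works
inferred = arr.reshape(4, -1)  # -1 asks NumPy to work out the remaining dimension

try:
    arr.reshape(3, 3)          # 3 * 3 == 9, so this fails
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)

print(ok.shape, inferred.shape)
```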
Numpy addition
• numpy.add() -> performs elementwise
addition between two arrays of compatible shape.
• numpy.add(array_1, array_2)
• Create two arrays a and b.
• a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
• b = np.arange(start=11, stop=20).reshape(3, 3)
• np.add(a, b)
Numpy multiplication
• numpy.multiply() -> performs elementwise
multiplication between two arrays.
• numpy.multiply(array_1, array_2)
Other Numpy functions
• numpy.subtract() -> performs elementwise
subtraction between two arrays.
• numpy.divide() -> returns an elementwise
division of inputs.
• numpy.remainder() -> returns the elementwise
remainder of division.
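The elementwise functions listed above can be tried on two 3x3 arrays. This sketch assumes both arrays have the same shape; each function also has an operator form, noted in the comments.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.arange(start=11, stop=20).reshape(3, 3)

print(np.add(a, b))        # same as a + b
print(np.subtract(b, a))   # same as b - a
print(np.multiply(a, b))   # same as a * b
print(np.divide(b, a))     # same as b / a (true division, float result)
print(np.remainder(b, a))  # same as b % a
```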
Accessing components of an array
• Components of an array can be accessed using
index number.
• a = [[1 2 3]
       [4 5 6]
       [7 8 9]]

• Extract the element with index (0, 1) from a:

a[0, 1]  # returns 2
Access 3-D Arrays
• Access the third element of the second array of the first
array:

• import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]],
                [[7, 8, 9], [10, 11, 12]]])

print(arr[0, 1, 2])
Example Explained
• The first number represents the first dimension, which contains
two arrays:
[[1, 2, 3], [4, 5, 6]]
and:
[[7, 8, 9], [10, 11, 12]]
• Since we selected 0, we are left with the first array:
• [[1, 2, 3], [4, 5, 6]]
• The second number represents the second dimension, which also contains two arrays:
[1, 2, 3]
and:
[4, 5, 6]

• Since we selected 1, we are left with the second array:

• [4, 5, 6]
• The third number represents the third dimension, which contains three values:
4
5
6
Since we selected 2, we end up with the third value:
6
Negative Indexing
• Use negative indexing to access an array from the end.
• Print the last element from the 2nd dim:
• import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('Last element from 2nd dim: ', arr[1, -1])

NumPy Array Slicing
• import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])
Slicing 2-D Arrays
• From the second element, slice elements from
index 1 to index 4 (not included):
• import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

• From both elements, return index 2:

• import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 2])

• From both elements, slice index 1 to index 4 (not included), this will
return a 2-D array:
• import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 1:4])
NumPy Data Types
• NumPy has some extra data types, and refers to data types with
one character, like i for integers, u for unsigned integers etc.
• Below is a list of the data type kinds in NumPy and the characters used
to represent them (as reported by dtype.kind).
• i - integer
• b - boolean
• u - unsigned integer
• f - float
• c - complex float
• m - timedelta
• M - datetime
• O - object
• S - string (bytes)
• U - unicode string
• V - void (fixed chunk of memory)
• The NumPy array object has a property called dtype that returns the data
type of the array:
• import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr.dtype)
Creating Arrays With a Defined
Data Type
• We use the array() function to create arrays. It can
take an optional argument, dtype, that defines the
expected data type of the array elements:
• import numpy as np

arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype)
• For i,u,f,S and U we can define size as well.
• Create an array with data type 4 bytes integer:
• import numpy as np

arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)
Converting Data Type on Existing
Arrays
• The best way to change the data type of an existing array,
is to make a copy of the array with the astype() function.
• The astype() function creates a copy of the array, and
allows you to specify the data type as a parameter.
• The data type can be specified using a string, like 'f' for
float, 'i' for integer etc.
• Change data type from float to integer by using 'i' as
parameter value:
• import numpy as np

arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('i')

print(newarr)
print(newarr.dtype)
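Two details of astype() are easy to miss: converting floats to integers drops the fractional part rather than rounding, and the original array is left untouched. The values below are chosen only to make that visible.

```python
import numpy as np

arr = np.array([1.1, 2.1, 3.9])

as_int = arr.astype('i')    # fractional part is dropped, not rounded
as_bool = arr.astype(bool)  # non-zero values become True

print(as_int, as_int.dtype)
print(as_bool)
print(arr)                  # the original array is unchanged
```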
NumPy Array Copy vs View
• The main difference between a copy and a view of an array
is that the copy is a new array, and the view is just a view
of the original array.
• The copy owns the data and any changes made to the copy
will not affect original array, and any changes made to the
original array will not affect the copy.
• The view does not own the data and any changes made to the
view will affect the original array, and any changes made to the
original array will affect the view.
• COPY:
• Make a copy, change the original array, and display both
arrays:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
arr[0] = 42

print(arr)
print(x)
• The copy SHOULD NOT be affected by the changes made to the original array.
Make Changes in the VIEW
• Make a view, change the view, and
display both arrays:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.view()
x[0] = 31

print(arr)
print(x)
• The original array SHOULD be affected
by the changes made to the view.
VIEW
• Make a view, change the original array,
and display both arrays:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.view()
arr[0] = 42

print(arr)
print(x)
• The view SHOULD be affected by the
changes made to the original array.
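Whether an array owns its data can be checked with the base attribute: it is None for an array that owns its data, and refers to the original array for a view. The sketch also shows that basic slicing produces a view.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

copy = arr.copy()
view = arr.view()
sliced = arr[1:3]        # basic slicing also produces a view

print(copy.base)         # None: the copy owns its data
print(view.base is arr)  # True: the view borrows arr's data
print(np.shares_memory(arr, sliced))
```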
Flattening the arrays
• Flattening array means converting a multidimensional array
into a 1D array.
• We can use reshape(-1) to do this.
• Convert the array into a 1D array:

• import numpy as np

arr = np.array([[1, 2, 3],

[4, 5, 6]])

newarr = arr.reshape(-1)

print(newarr)
NumPy Array Iterating
• Iterating means going through elements one by one.
• As we deal with multi-dimensional arrays in numpy, we can
do this using basic for loop.
• If we iterate on a 1-D array it will go through each element
one by one.

• import numpy as np

arr = np.array([1, 2, 3])

for x in arr:
    print(x)
Iterating 2-D Arrays
• In a 2-D array it will go through all
the rows.
• import numpy as np

arr = np.array([[1, 2, 3],

[4, 5, 6]])

for x in arr:
    print(x)
• To return the actual values, the scalars,
we have to iterate the arrays in each
dimension.
• Iterate on each scalar element of the 2-
D array:
• import numpy as np

arr = np.array([[1, 2, 3],

[4, 5, 6]])

for x in arr:
    for y in x:
        print(y)
Iterating 3-D Arrays
• import numpy as np

arr = np.array([[[1, 2, 3],

[4, 5, 6]],
[[7, 8, 9], [10, 11, 12]]])

for x in arr:
    for y in x:
        for z in y:
            print(z)
Iterating Arrays Using nditer()
• Iterating on Each Scalar Element
• With basic for loops, iterating through
each scalar of an n-dimensional array needs
n nested loops, which can be difficult to
write for arrays with very high
dimensionality. nditer() visits every scalar with a single loop.
• import numpy as np
arr = np.array([[[1, 2], [3, 4]],
                [[5, 6], [7, 8]]])
for x in np.nditer(arr):
    print(x)
Iterating With Different Step Size
• We can slice the array first and then
iterate over the result; here every second column is skipped.
• import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8]])

for x in np.nditer(arr[:, ::2]):
    print(x)
NumPy Joining Array
• Joining means putting contents of two or more arrays in a
single array.
• In SQL we join tables based on a key, whereas in NumPy
we join arrays by axes.
• We pass a sequence of arrays that we want to join to the
concatenate(), along with the axis. If axis is not explicitly
passed, it is taken as 0.
• import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)
• Join two 2-D arrays along rows
(axis=1):
• import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2),

axis=1)

print(arr)
Splitting NumPy Arrays
• Splitting is reverse operation of Joining.
• Joining merges multiple arrays into one and Splitting
breaks one array into multiple.
• We use array_split() for splitting arrays, we pass
it the array we want to split and the number of
splits.
• Split the array in 3 parts:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr)
• Split the array in 4 parts:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 4)

print(newarr)
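When the array cannot be divided evenly, array_split() makes the leading parts one element longer, whereas split() refuses the request. The sketch below shows both behaviours for the 6-element array; the try/except is only there to keep the script running.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

parts = np.array_split(arr, 4)      # uneven split is allowed
print([p.tolist() for p in parts])  # [[1, 2], [3, 4], [5], [6]]

try:
    np.split(arr, 4)                # split() demands equal-sized parts
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)                 # True
```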
Split Into Arrays
• The return value of the array_split() function is a
list containing each of the splits as an array.
• If you split an array into 3 arrays, you can access them
from the result just like any array element:
• import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr[0])
print(newarr[1])
print(newarr[2])
Splitting 2-D Arrays
• Split the 2-D array into three 2-D
arrays.
• import numpy as np

arr = np.array([[1, 2], [3, 4],

[5, 6], [7, 8], [9, 10],
[11, 12]])

newarr = np.array_split(arr, 3)

print(newarr)
Searching Arrays
• You can search an array for a certain
value, and return the indexes that get a
match.
• To search an array, use the where()
function.
• Find the indexes where the value is 4:
• import numpy as np
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x)  # a tuple holding the index array: (array([3, 5, 6]),)
Sorting Arrays
• Sorting means putting elements in an ordered
sequence.
• Ordered sequence is any sequence that has an
order corresponding to elements, like numeric or
alphabetical, ascending or descending.
• NumPy has a function called sort() that returns a sorted copy
of the specified array; the original array is left unchanged.
• import numpy as np

arr = np.array([3, 2, 0, 1])

print(np.sort(arr))
Array dimensions
• Create an array a
• a=np.array([[1,2,3],[4,5,6],[7,8,9]])
• shape -> attribute that returns the dimensions of an array as a tuple
• array_name.shape
• Extract elements from second and third row of
array a.
• a[1:3]
• Extract elements from first column of array a.
• a[: , 0] -> this means take all the rows
• Extract elements from the first row of array a.
• a[0, : ]
Subset of arrays
• Array a=> [[1 2 3]
[4 5 6]
[7 8 9]]
• Subset a 2X2 array from the original array a
• Consider the first two rows and columns from a
• a_sub=a[:2,:2]
• print(a_sub)
• Suppose you want to change the value 1 to
12 in the array a_sub:
• a_sub = [[1 2]
           [4 5]]

• a_sub[0, 0] = 12

• Because a_sub is a view of a, modifying the subset
also updates the original array.
• print(a)
Modifying array using transpose()
• numpy.transpose()-> permute the dimensions
of array.
• Syntax: numpy.transpose(array)
• It will interchange the rows and columns.
Modifying array using append()
• append() -> adds values at the end of the array and returns a new array.
• Syntax: numpy.append(array, values, axis)
• Adding a new array to ‘a’ as a row:
• a_row = np.append(a, [[10, 11, 14]], axis=0)
• Adding a new array to ‘a’ as a column:
• Create an array and reshape it to a column array:
• col = np.array([21, 22, 23]).reshape(3, 1)

• a_col = np.append(a, col, axis=1)
Modifying array using insert()
• insert()-> adds values at a given position and
axis in an array.
• numpy.insert(array,obj,values,axis)
• array- input array
• Obj- index position
• values - array of values to be inserted
• axis-axis along which values should be
inserted
• Consider array a
• a=[[12,2,3],[4,5,6],[7,8,9]]
• Insert new array along row and at the 1st index
position.
• a_ins=np.insert(a,1,[13,15,16],axis=0)
• print(a_ins)
Modifying array using delete()
• delete() -> removes values at a given position
and axis in an array.
• numpy.delete(array, obj, axis)
• array - input array
• obj - index (or indices) of the row or column to be removed
• axis - axis along which the values should be
removed
• Delete the third row from the existing array a_ins:
• a_del = np.delete(a_ins, 2, axis=0)
• The third row has index 2.
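The three modifying functions can be run together on the array used in these slides. Note that each call returns a new array and leaves its input unchanged, which the last line of this sketch confirms.

```python
import numpy as np

a = np.array([[12, 2, 3], [4, 5, 6], [7, 8, 9]])

a_row = np.append(a, [[10, 11, 14]], axis=0)   # new last row
a_ins = np.insert(a, 1, [13, 15, 16], axis=0)  # new row at index 1
a_del = np.delete(a_ins, 2, axis=0)            # drop the row at index 2

print(a_row.shape, a_ins.shape, a_del.shape)   # (4, 3) (4, 3) (3, 3)
print(a_del)
print(a.shape)                                 # (3, 3): a itself is unchanged
```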
Matrices
• Rectangular arrangement of numbers in rows
and columns.
• Rows run horizontally and columns run
vertically.
a11 a12 a13
a21 a22 a23
a31 a32 a33
The above matrix is 3X3
a11
a21
a31

The above matrix is 3X1

Create a matrix
• matrix() -> returns a matrix from an array-like
object or a string of data.
• numpy.matrix(data)
• import numpy as np
a = np.matrix("1,2,3,4;4,5,6,7;7,8,9,10")
print(a)
• shape[0] -> returns the number of rows.
• shape[1] -> returns the number of columns.
• size -> attribute that returns the number of elements in a
matrix.
Modifying matrix using insert()
• insert() -> adds values at a given position and axis
in a matrix.
• Syntax: numpy.insert(matrix, obj, values, axis)
• Adding the matrix ‘col_new’ as a new column
to a
• Create a matrix:
• col_new = np.matrix("2,3,4")
• a = np.insert(a, 0, col_new, axis=1)
• Adding the matrix ‘row_new’ as a new row to
a
• row_new = np.matrix("4,5,6,7,9")
• a = np.insert(a, 0, row_new, axis=0)
Modifying matrix using index
• Elements of a matrix can be modified using
index number.
• The value of 1 should be updated to -3 in
matrix a. So,
• a[1,1]=-3
Accessing elements of matrix using
index
• Extract elements from second row of matrix a:
• print(a[1,:])
• Extract elements from third column:
• print(a[:,2])
• Extract element with index (1,2) from a
• print(a[1,2])
Matrix addition
• numpy.add() -> performs elementwise
addition between two matrices.
– numpy.add(matrix_1,matrix_2)
• Create two matrices A and B:
• A = np.matrix(np.arange(0, 20)).reshape(5, 4)
• B = np.matrix(np.arange(20, 40)).reshape(5, 4)
• np.add(A, B)
• numpy.subtract() -> performs elementwise subtraction in the same way: np.subtract(A, B)
Matrix Multiplication
• numpy.dot() -> performs matrix multiplication
between two matrices.
• Syntax: numpy.dot(matrix_1,matrix_2)
• For matrix multiplication, number of columns
in matrix A should be equal to number of rows
in matrix B.
• Transpose matrix B to make it 4X5 in
dimension.
• B = np.transpose(B)
• np.dot(A, B)
• numpy.matmul() and the @ operator can also be used for
matrix multiplication.
• numpy.multiply(matrix_1, matrix_2), by contrast, performs elementwise multiplication.
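The difference between the matrix product and the elementwise product shows up in the result shapes. This sketch uses plain ndarrays rather than np.matrix objects, which is an assumption made for the example; the same functions accept both.

```python
import numpy as np

A = np.arange(0, 20).reshape(5, 4)   # 5x4
B = np.arange(20, 40).reshape(5, 4)  # 5x4

product = np.dot(A, B.T)             # (5x4) . (4x5) -> 5x5
same_product = A @ B.T               # the @ operator performs matrix multiplication
elementwise = np.multiply(A, B)      # shapes must match; result stays 5x4

print(product.shape, elementwise.shape)
print(np.array_equal(product, same_product))
print(np.array_equal(product, np.matmul(A, B.T)))
```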
Linear Algebra
The Numpy library can be used to perform a variety of mathematical/scientific
operations such as matrix cross and dot products.

What is a System of Linear Equations?

In mathematics, a system of linear equations (or linear system) is a collection of two or
more linear equations involving the same set of variables.
The ultimate goal of solving a system of linear equations is to find the values of the
unknown variables. Here is an example of a system of linear equations with two
unknown variables, x and y:

Equation 1:

4x + 3y = 20
-5x + 9y = 26
To solve the above system of linear equations, we need to find the values of the x and y
variables.
In the matrix solution, the system of linear equations to be solved is represented in the
form of matrix AX = B.
For instance, we can represent Equation 1 in the form of a matrix as follows:
• A = [[ 4 3]
       [-5 9]]

• X = [[x]
       [y]]

• B = [[20]
       [26]]

• To find the value of x and y variables in Equation 1, we need to find the values in
the matrix X. To do so, we can take the dot product of the inverse of matrix A, and
the matrix B as shown below:
• X = inverse(A).B
Using the inv() and dot() Methods
• import numpy as np
m_list = [[4, 3], [-5, 9]]
A = np.array(m_list)
inv_A = np.linalg.inv(A)

• The next step is to find the dot product between the inverse of matrix A
and the matrix B. Note that a matrix dot product is only
possible if the inner dimensions of the matrices are
equal, i.e. the number of columns of the left matrix must match the
number of rows in the right matrix.
• To find the dot product with the NumPy library, the dot() function (or the
ndarray.dot() method) is used. The following script finds the dot product
between the inverse of matrix A and the matrix B, which is the solution of Equation 1.
B = np.array([20, 26])
X = inv_A.dot(B)
print(X)  # approximately [2. 4.], i.e. x = 2 and y = 4
Using the solve() Method
• The NumPy library contains the linalg.solve() function, which can be used
to directly find the solution of a system of linear equations:
A = np.array([[4, 3, 2], [-2, 2, 3], [3, -5, 2]])
B = np.array([25, -10, -4])
X2 = np.linalg.solve(A, B)
print(X2)  # approximately [ 5.  3. -2.]
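Both routes can be compared on Equation 1. The sketch below solves the 2x2 system with the explicit inverse and with solve(), then substitutes the answer back into the equations as a check.

```python
import numpy as np

A = np.array([[4, 3], [-5, 9]])
B = np.array([20, 26])

x_inv = np.linalg.inv(A).dot(B)     # explicit inverse, then dot product
x_solve = np.linalg.solve(A, B)     # solves A x = B without forming the inverse

print(x_solve)
print(np.allclose(x_inv, x_solve))  # both routes agree
print(np.allclose(A @ x_solve, B))  # substituting back reproduces B
```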
Finding Determinant
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])

print("Determinant of A:", np.linalg.det(A))  # approximately -306
Numpy Matrix
• NumPy matrices are strictly 2-dimensional, while NumPy arrays (ndarrays)
are N-dimensional.
• The main advantage of NumPy matrices is that they provide a convenient
notation for matrix multiplication: if a and b are matrices, then a*b is their
matrix product.

• Syntax:

• numpy.matrix(data, dtype=None)

• a = np.matrix('1 2;3 4')

• det_matrix = np.linalg.det(a)
• print(det_matrix)  # approximately -2
Rank of matrix
• numpy.linalg.matrix_rank() -> returns the rank of
the matrix.
• Syntax: numpy.linalg.matrix_rank(matrix)
• x = np.matrix("4,5,6,7;2,-3,2,3;3,4,5,6;4,7,8,9")
• rank_matrix = np.linalg.matrix_rank(x)
Inverse of a matrix
• numpy.linalg.inv() -> returns the multiplicative
inverse of a matrix.
• numpy.linalg.inv(matrix)
• A = np.matrix("3,1,2;3,2,5;6,7,8")
• inv_A = np.linalg.inv(A)
System of linear equations
• 3x+y+2z=2
• 3x+2y+5z=-1
• 6x+7y+8z=3
• Now we can write the equations in the form
of Ax=b:
  [3 1 2] [x]   [ 2]
  [3 2 5] [y] = [-1]
  [6 7 8] [z]   [ 3]
• numpy.linalg.solve() -> returns the solution to
the system Ax=b
• numpy.linalg.solve(matrix_A, matrix_b)
• Create the matrices A and b:
• A = np.matrix("3,1,2;3,2,5;6,7,8")
• b = np.matrix("2,-1,3").transpose()
• sol_linear = np.linalg.solve(A, b)
sum()
• a = np.array([(1, 2, 3), (3, 4, 5)])
• print(a.sum())     # sum of all elements: 18
• print(np.sqrt(a))  # elementwise square root
• print(np.std(a))   # standard deviation
Vertical & Horizontal Stacking
• If you want to concatenate two arrays rather than
add them elementwise, you can do it in two
ways: vertical stacking and horizontal
stacking.
• f = np.array([1, 2, 3])
• g = np.array([4, 5, 6])
• print(np.vstack((f, g)))
• print(np.hstack((f, g)))
• Vertical stacking: [[1 2 3] [4 5 6]]
• Horizontal stacking: [1 2 3 4 5 6]
NumPy Array Iteration
• NumPy provides an iterator object, nditer,
which can be used to iterate over the given
array using the standard Python iterator interface.
• import numpy as np
a = np.array([[1, 2, 3, 4], [2, 4, 5, 6], [10, 20, 39, 3]])
print("Printing array:")
print(a)
print("Iterating over the array:")
for x in np.nditer(a):
    print(x, end=' ')
Array Sorting
• print(np.sort(a, 1)) -> sorts the values within each row (along axis 1)
• print(np.sort(a, 0)) -> sorts the values within each column (along axis 0)
numpy.mean()
• The sum of the elements along an axis,
divided by the number of elements, is known
as the arithmetic mean. The numpy.mean()
function is used to compute the arithmetic
mean along the specified axis.
• numpy.mean(a, axis=None, dtype=None, out=
None, keepdims=<no value>)
• import numpy as np
• a = np.array([[1, 2], [3, 4]])
• b=np.mean(a)
• b
• x = np.array([[5, 6], [7, 34]])
• y=np.mean(x)
• y
• import numpy as np
• a = np.array([[2, 4], [3, 5]])
• b=np.mean(a,axis=0)
• c=np.mean(a,axis=1)
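The effect of the axis argument is easiest to see from the results. In this sketch, axis=0 averages down each column and axis=1 averages across each row; the variable names are illustrative.

```python
import numpy as np

a = np.array([[2, 4], [3, 5]])

overall = np.mean(a)             # mean of all four values
per_column = np.mean(a, axis=0)  # one mean per column
per_row = np.mean(a, axis=1)     # one mean per row

print(overall)     # 3.5
print(per_column)  # [2.5 4.5]
print(per_row)     # [3. 4.]
```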
