
📚 Vectors, Matrices, and Linear Algebra Basics

Linear algebra forms the foundation of many areas in mathematics, science, and
engineering. This notebook introduces the core concepts of vectors and matrices,
covering their definitions, operations, geometric interpretations, and applications in
solving systems of linear equations. A strong grasp of these basics is essential for fields
like machine learning, physics, and computer graphics.

Basic Linear Algebra Concepts (Vectors and Matrices)
Vectors
A vector is essentially a list or an array of numbers arranged in a specific order. Vectors
are fundamental in linear algebra and can represent various physical quantities such as
velocity, force, and displacement.

Definition of a Vector
A vector is a mathematical object that has both magnitude (size) and direction. It is
typically represented as a list of numbers, which are called the components or elements
of the vector.

For example, a vector v in 3-dimensional space might look like:

v = [v1 , v2 , v3 ]

where ( v_1, v_2, v_3 ) are the components of the vector.

Types of Vectors
1. Row Vector: A row vector is a 1×n matrix (one row and multiple columns). It is
written horizontally.

Example:

v = [v1 , v2 , v3 ]

2. Column Vector: A column vector is an n×1 matrix (multiple rows and one column).
It is written vertically.

Example:


$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$$

Vector Operations
1. Vector Addition
Vector addition is the operation of adding two vectors by adding their corresponding
components. If you have two vectors ( \mathbf{v} = [v_1, v_2] ) and ( \mathbf{w} = [w_1,
w_2] ), their sum ( \mathbf{v} + \mathbf{w} ) is:

v + w = [v1 + w1 , v2 + w2 ]

2. Scalar Multiplication
Scalar multiplication involves multiplying each component of a vector by a scalar (a
constant). If ( \mathbf{v} = [v_1, v_2] ) and ( c ) is a scalar, then the scalar multiplication ( c
\mathbf{v} ) is:

cv = [c ⋅ v1 , c ⋅ v2 ]

3. Dot Product
The dot product (or scalar product) of two vectors ( \mathbf{v} = [v_1, v_2] ) and (
\mathbf{w} = [w_1, w_2] ) is calculated as:

v ⋅ w = v 1 ⋅ w1 + v 2 ⋅ w2

The dot product is a scalar value and gives a measure of how much one vector extends in
the direction of another.

Geometric Interpretation of Vectors


Magnitude: The magnitude (or length) of a vector ( \mathbf{v} = [v_1, v_2] ) is the
distance from the origin to the point ( (v_1, v_2) ) in space. It is calculated as:

$$|\mathbf{v}| = \sqrt{v_1^2 + v_2^2}$$

Direction: The direction of a vector represents the angle it makes with respect to a
reference axis, typically the x-axis. Vectors can be visualized as arrows pointing in the
direction of the vector’s components.

Unit Vector: A unit vector is a vector with a magnitude of 1. It is used to represent direction without magnitude.

A unit vector in the direction of ( \mathbf{v} = [v_1, v_2] ) is given by:

$$\hat{\mathbf{v}} = \frac{\mathbf{v}}{|\mathbf{v}|}$$

Example in Code:
In [1]: import numpy as np

# Defining vectors
v = np.array([2, 3])
w = np.array([1, 4])

# Vector addition
addition = v + w

# Scalar multiplication
scalar_mult = 2 * v

# Dot product
dot_product = np.dot(v, w)

# Magnitude of a vector
magnitude_v = np.linalg.norm(v)

# Unit vector of v
unit_vector_v = v / np.linalg.norm(v)

print("Vector Addition:", addition)


print("Scalar Multiplication:", scalar_mult)
print("Dot Product:", dot_product)
print("Magnitude of v:", magnitude_v)
print("Unit Vector of v:", unit_vector_v)

Vector Addition: [3 7]
Scalar Multiplication: [4 6]
Dot Product: 14
Magnitude of v: 3.605551275463989
Unit Vector of v: [0.5547002 0.83205029]

Geometric Intuition of Vectors


A vector is a mathematical object that has both magnitude (length) and direction.
Geometrically, vectors are often represented as arrows in space, where the direction of
the arrow represents the direction of the vector, and the length of the arrow represents
the magnitude of the vector.

1. Magnitude (Length) of a Vector


The magnitude of a vector is the length of the arrow representing the vector. It is a
scalar quantity and can be calculated using the Euclidean distance formula.

For example, in 2D space, the magnitude of the vector v = [v1, v2] is given by:

$$|\mathbf{v}| = \sqrt{v_1^2 + v_2^2}$$

In 3D space, the magnitude of a vector v = [v1, v2, v3] is:

$$|\mathbf{v}| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$


2. Direction of a Vector
The direction of a vector is the angle it makes with a reference axis (typically the x-axis in
2D or the x-y plane in 3D). The direction is independent of the vector's magnitude.

In 2D, the angle (θ) between the vector v = [v1, v2] and the x-axis can be found using the
formula:

$$\theta = \tan^{-1}\left(\frac{v_2}{v_1}\right)$$

3. Unit Vector
A unit vector is a vector with a magnitude of 1, which represents the direction of the
original vector but with no magnitude. You can convert any vector v to a unit vector v̂ by
dividing it by its magnitude:

v̂ = v/|v|

This gives you a vector with the same direction as v, but with a magnitude of 1.

4. Vector Addition
The sum of two vectors v and w can be visualized geometrically using the tip-to-tail
method. This means placing the tail of the second vector w at the tip of the first vector
v. The resulting vector is the vector from the tail of v to the tip of w.

In component form, if:

v = [v1, v2], w = [w1, w2]

Then the sum of the vectors is:

v + w = [v1 + w1, v2 + w2]

5. Scalar Multiplication
When you multiply a vector by a scalar, you stretch or shrink the vector. If the scalar is
positive, the direction remains the same, but if the scalar is negative, the vector reverses
direction.

For example, multiplying a vector v = [v1, v2] by a scalar c gives the vector:

c ∗ v = [c ∗ v1, c ∗ v2]

If c > 1, the vector becomes longer, and if 0 < c < 1, it shrinks. If c < 0, the vector also
flips direction.

6. Dot Product
The dot product of two vectors measures how much one vector extends in the direction
of the other. It is defined as:


v ⋅ w = |v||w|cos(θ)

Where θ is the angle between the two vectors. Geometrically:

If the vectors are orthogonal (perpendicular), their dot product is 0.


If the vectors are parallel, the dot product is maximized.
If the vectors point in opposite directions, the dot product is negative.
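
Example in Code (an illustrative sketch, not part of the original notebook; the sample vectors below are assumed): the relation v ⋅ w = |v||w|cos(θ) can be rearranged to recover the angle between two vectors.

import numpy as np

# Two sample vectors (assumed values for illustration)
v = np.array([3, 4])
w = np.array([4, 3])

# Rearranging v . w = |v||w| cos(theta) to recover the angle between them
cos_theta = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
theta_deg = np.degrees(np.arccos(cos_theta))

print("Dot Product:", np.dot(v, w))                             # 24
print("Angle between v and w (degrees):", round(theta_deg, 2))  # about 16.26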

Visual Example:
Consider the 2D vector v = [3, 4]. This vector can be represented as an arrow from the
origin (0,0) to the point (3,4).

Magnitude of the vector v:

$$|\mathbf{v}| = \sqrt{3^2 + 4^2} = 5$$

Direction of the vector v is the angle θ it makes with the x-axis:

$$\theta = \tan^{-1}\left(\frac{4}{3}\right) \approx 53.13^\circ$$

Example in Code:
In [2]: import numpy as np
import matplotlib.pyplot as plt

# Define the vector


v = np.array([3, 4])

# Plot the vector


plt.quiver(0, 0, v[0], v[1], angles='xy', scale_units='xy', scale=1, color="r")

# Set limits for the plot


plt.xlim(-1, 5)
plt.ylim(-1, 5)

# Add labels
plt.xlabel('X')
plt.ylabel('Y')

# Add grid and title


plt.grid(True)
plt.title('Geometric Representation of Vector v = [3, 4]')

# Show the plot


plt.show()


In [12]: import numpy as np


import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Define the original vector


v = np.array([3, 4])

# Define the scalar for operations


scalar = 2

# Operations
v_add = v + scalar
v_sub = v - scalar
v_mult = v * scalar
v_div = v / scalar

# Create a figure with only 5 subplots (2x3 grid without the last one)
fig, axs = plt.subplots(2, 3, figsize=(15, 10))

# Remove the last subplot (bottom-right) by hiding its axes


fig.delaxes(axs[1, 2])

# Set limits and grid for all subplots


for ax in axs.flat:
    ax.set_xlim(-1, 8)  # Adjusting limit to fit the vectors better
    ax.set_ylim(-1, 8)  # Adjusting limit to fit the vectors better
    ax.grid(True)

# Original vector plot (only once)


axs[0, 0].quiver(0, 0, v[0], v[1], angles='xy', scale_units='xy', scale=1, color="r")
axs[0, 0].set_title('Original Vector (v)')


# Scalar addition plot


axs[0, 1].quiver(0, 0, v_add[0], v_add[1], angles='xy', scale_units='xy', scale=1, color="g")
axs[0, 1].set_title(f'Scalar Addition (v + {scalar})')

# Scalar subtraction plot


axs[0, 2].quiver(0, 0, v_sub[0], v_sub[1], angles='xy', scale_units='xy', scale=1, color="b")
axs[0, 2].set_title(f'Scalar Subtraction (v - {scalar})')

# Scalar multiplication plot


axs[1, 0].quiver(0, 0, v_mult[0], v_mult[1], angles='xy', scale_units='xy', scale=1, color="orange")
axs[1, 0].set_title(f'Scalar Multiplication (v * {scalar})')

# Scalar division plot


axs[1, 1].quiver(0, 0, v_div[0], v_div[1], angles='xy', scale_units='xy', scale=1, color="purple")
axs[1, 1].set_title(f'Scalar Division (v / {scalar})')

# Plot the original vector on all subplots (excluding the last one)
for ax in axs.flat:
    ax.quiver(0, 0, v[0], v[1], angles='xy', scale_units='xy', scale=1, color="r")

# Creating legend with Patch objects


handles = [
Patch(color='r', label="Original Vector (v)"),
Patch(color='g', label=f'Scalar Addition (v + {scalar})'),
Patch(color='b', label=f'Scalar Subtraction (v - {scalar})'),
Patch(color='orange', label=f'Scalar Multiplication (v * {scalar})'),
Patch(color='purple', label=f'Scalar Division (v / {scalar})')
]

# Add the legend to respective subplots only


axs[0, 0].legend(handles=[handles[0]], loc='upper left')
axs[0, 1].legend(handles=[handles[1]], loc='upper left')
axs[0, 2].legend(handles=[handles[2]], loc='upper left')
axs[1, 0].legend(handles=[handles[3]], loc='upper left')
axs[1, 1].legend(handles=[handles[4]], loc='upper left')

# Adjust layout and display the plot


plt.tight_layout()
plt.show()


Matrices
Definition of a Matrix
A matrix is a two-dimensional array of numbers arranged in rows and columns. Each
number in the matrix is called an element. Matrices are usually represented by
uppercase letters like A, B, etc.

Example of a 2 × 3 matrix:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$

Rows: Horizontal lines


Columns: Vertical lines

Types of Matrices
| Type | Description | Example |
| --- | --- | --- |
| Square Matrix | Number of rows = number of columns | 2 × 2, 3 × 3, etc. |
| Rectangular Matrix | Number of rows ≠ number of columns | 2 × 3, 3 × 2, etc. |
| Row Matrix | Only one row | [1 2 3] |
| Column Matrix | Only one column | [[1], [2], [3]] |
| Diagonal Matrix | Non-diagonal elements are zero | [[1, 0], [0, 2]] |
| Scalar Matrix | Diagonal matrix with all diagonal elements equal | [[5, 0], [0, 5]] |
| Identity Matrix | Diagonal elements = 1, others = 0 | I = [[1, 0], [0, 1]] |
| Orthogonal Matrix | Preserves lengths and angles | A^T A = I |
| Upper Triangular Matrix | All elements below the main diagonal are 0 | [[1, 2], [0, 3]] |
| Lower Triangular Matrix | All elements above the main diagonal are 0 | [[1, 0], [2, 3]] |
| Symmetric Matrix | A = A^T | [[1, 2], [2, 3]] |
| Skew-Symmetric Matrix | A^T = −A | [[0, 2], [−2, 0]] |
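
As a quick illustration (assumed, not from the original notebook), several of these matrix types can be built directly with NumPy helpers:

import numpy as np

identity  = np.eye(2)                     # identity matrix
diagonal  = np.diag([1, 2])               # diagonal matrix
scalar_m  = 5 * np.eye(2)                 # scalar matrix
upper_tri = np.triu([[1, 2], [4, 3]])     # upper triangular: zeros below the diagonal
lower_tri = np.tril([[1, 5], [2, 3]])     # lower triangular: zeros above the diagonal
symmetric = np.array([[1, 2], [2, 3]])

print(np.array_equal(symmetric, symmetric.T))  # True, so the matrix is symmetric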

Matrix Operations
1. Matrix Addition
Matrices must have the same dimensions.
Add corresponding elements.

Example:

$$A + B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}$$

2. Matrix Multiplication
The number of columns of the first matrix must match the number of rows of the
second matrix.
Row-by-column multiplication.

Example:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$$

$$A \times B = \begin{bmatrix} (1)(5) + (2)(6) \\ (3)(5) + (4)(6) \end{bmatrix} = \begin{bmatrix} 17 \\ 39 \end{bmatrix}$$

3. Transpose of a Matrix
Rows become columns, columns become rows.


Example:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \;\Rightarrow\; A^{\top} = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}$$

4. Inverse of a Matrix
Exists only for square matrices where det(A) ≠ 0.

$$A \times A^{-1} = I$$

Formula for a 2 × 2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$:

$$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

Identity Matrix and Its Properties


I × A = A × I = A

Acts as "1" for matrices.


Only square matrices have identity matrices.

Example:

$$I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
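
The operations above can be verified numerically. The following is a minimal sketch (not from the original notebook) that reuses the example matrices from this section:

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
b = np.array([[5], [6]])

print(A + B)               # matrix addition -> [[ 6  8] [10 12]]
print(A @ b)               # matrix multiplication -> [[17] [39]]
print(A.T)                 # transpose -> [[1 3] [2 4]]
A_inv = np.linalg.inv(A)   # inverse exists because det(A) = -2 is nonzero
print(A @ A_inv)           # approximately the 2 x 2 identity matrix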

Rank of a Matrix
The rank of a matrix is the maximum number of linearly independent rows or
columns.
It represents the dimension of the vector space spanned by its rows/columns.
Rank gives insight into whether a system of equations has a unique solution.

Example:

A 3 × 3 matrix with full rank (3) has 3 independent rows.

Matrix Dot Product (Matrix Multiplication Rules)


1. Associative Property
(AB)C = A(BC)

Grouping does not affect the result.

2. Distributive Property
A(B + C) = AB + AC

Matrix multiplication distributes over matrix addition.


Other Important Concepts


Determinant: A scalar value that can be computed from the elements of a square
matrix and determines whether the matrix is invertible.
Singular Matrix: A matrix with a determinant of zero; it does not have an inverse.
Trace of Matrix: Sum of diagonal elements of a square matrix.
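
All three quantities are available in NumPy. A short sketch (assumed for illustration) using a singular matrix whose second row is twice its first:

import numpy as np

A = np.array([[1, 2], [3, 4]])
S = np.array([[1, 2], [2, 4]])   # row 2 = 2 * row 1

print(np.linalg.det(A))   # -2.0, so A is invertible
print(np.linalg.det(S))   # 0.0 (up to rounding), so S is singular and has no inverse
print(np.trace(A))        # 1 + 4 = 5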

Conclusion
Matrices are fundamental structures in linear algebra that help in solving systems of
equations, performing transformations, machine learning algorithms, graphics, and much
more. A solid understanding of matrix types, operations, and properties builds the
foundation for more advanced topics like Linear Regression, PCA, Neural Networks, etc.

Rank of a Matrix
The rank of a matrix is the maximum number of linearly independent rows or
columns.
It represents the dimension of the vector space spanned by its rows/columns.
Rank gives insight into whether a system of equations has a unique solution.

Example 1 (Full Rank):


A 3 × 3 matrix is full rank if all its rows are linearly independent.

Consider:

$$B = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$

Let's check:

Row 1: (1, 0, 2)
Row 2: (0, 1, 3)
Row 3: (4, 5, 6)

There is no way to express Row 3 as a linear combination of Row 1 and Row 2.

If we try:

a(1, 0, 2) + b(0, 1, 3) = (4, 5, 6)

Expanding:

(a, b, 2a + 3b) = (4, 5, 6)

Setting up equations:

a = 4


b = 5

2a + 3b = 6

Substituting a = 4, b = 5 into the third equation:

2(4) + 3(5) = 8 + 15 = 23 ≠ 6

❌ The third condition fails.


Thus, Row 3 is not a linear combination of Row 1 and Row 2.

✅ Therefore, all rows are independent, and the rank of matrix is 3 (full rank).

Key Points to Remember


Rank ≤ minimum(number of rows, number of columns).
Rank = number of non-zero rows after row-reduction (Row Echelon Form).
Full rank: when rank = number of rows (or columns) for square matrices.

Example 2 (Not Full Rank):


A 3 × 3 matrix ideally has full rank (rank = 3) if all rows are linearly independent.
Consider the matrix:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$

Let's check if the third row is a linear combination of the first two.

Suppose:

Row3 = a × Row1 + b × Row2

Setting up equations:

a(1, 2, 3) + b(4, 5, 6) = (7, 8, 9)

Expanding:

(1a + 4b, 2a + 5b, 3a + 6b) = (7, 8, 9)

Solving:

1a + 4b = 7

2a + 5b = 8

3a + 6b = 9

From the first equation:

a = 7 − 4b

Substituting into the second:


2(7 − 4b) + 5b = 8 ⇒ 14 − 8b + 5b = 8 ⇒ −3b = −6 ⇒ b = 2

Then:

a = 7 − 4(2) = −1

Checking in the third equation:

3(−1) + 6(2) = −3 + 12 = 9 (Matches)

Thus:

Row3 = (−1) × Row1 + 2 × Row2

✅ Therefore, the third row is dependent on the first two rows.

Conclusion:
The rows are not fully independent.
Hence, the rank of the matrix is 2, not 3 (matrix is not full rank).

Example 3 (Clear Dependent Row):

Let's take:

$$B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$

Here:

Row 2 is exactly 2 × Row 1.


Therefore, Row 2 is dependent on Row 1.

Thus:

Only two rows are linearly independent (Row 1 and Row 3).
Hence, rank = 2.

Quick check:

2 × (1, 2, 3) = (2, 4, 6)

which matches Row 2 exactly.

✅ Conclusion:
Rank counts only independent rows.
If a row is a multiple or linear combination of others, it does not add to the rank.

Key Points to Remember


Rank ≤ minimum(number of rows, number of columns).


Rank = number of non-zero rows after row-reduction (Row Echelon Form).
Full rank: when rank = number of rows (or columns) for square matrices.
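
As a check, np.linalg.matrix_rank reproduces the ranks worked out in the three examples above (an illustrative sketch, not part of the original notebook):

import numpy as np

B_full = np.array([[1, 0, 2], [0, 1, 3], [4, 5, 6]])
A_dep  = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B_dep  = np.array([[1, 2, 3], [2, 4, 6], [7, 8, 9]])

print(np.linalg.matrix_rank(B_full))  # 3: full rank
print(np.linalg.matrix_rank(A_dep))   # 2: row 3 = (-1)*row 1 + 2*row 2
print(np.linalg.matrix_rank(B_dep))   # 2: row 2 = 2*row 1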

Matrix-Vector Multiplication

How Vectors and Matrices Interact


Matrix-vector multiplication is a fundamental operation in linear algebra where a matrix
is multiplied by a vector to produce another vector.

If ( \mathbf{A} ) is an ( m \times n ) matrix and ( \mathbf{x} ) is a vector of size ( n \times 1 ), then the product ( \mathbf{A} \mathbf{x} ) is a vector of size ( m \times 1 ).
Each element of the resulting vector is the dot product of a row of the matrix with the input vector.

Example:

Let

$$\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} 7 \\ 8 \end{bmatrix}$$

Then,

$$\mathbf{A}\mathbf{x} = \begin{bmatrix} (1)(7) + (2)(8) \\ (3)(7) + (4)(8) \\ (5)(7) + (6)(8) \end{bmatrix} = \begin{bmatrix} 7 + 16 \\ 21 + 32 \\ 35 + 48 \end{bmatrix} = \begin{bmatrix} 23 \\ 53 \\ 83 \end{bmatrix}$$
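
The same product can be checked with NumPy's @ operator (a minimal sketch, assuming the matrix and vector above):

import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
x = np.array([7, 8])

print(A @ x)   # [23 53 83], matching the hand computation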

Importance of Matrix Multiplication in Solving Linear Systems
Matrix-vector multiplication is crucial for:

Representing Systems of Linear Equations:


A system of equations can be compactly written as ( \mathbf{A} \mathbf{x} =
\mathbf{b} ).
Here, ( \mathbf{A} ) is the coefficient matrix, ( \mathbf{x} ) is the vector of
unknowns, and ( \mathbf{b} ) is the result vector.

Example:

The system of equations

$$2x + 3y = 5$$

$$4x + y = 6$$

can be written in matrix form:

$$\begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 5 \\ 6 \end{bmatrix}$$

Solving for unknowns:

If ( \mathbf{A} ) is invertible, the solution is:

$$\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}$$
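
A short sketch (assumed for illustration) that solves the example system above, both through the inverse and with np.linalg.solve, which is generally preferred in practice:

import numpy as np

A = np.array([[2, 3], [4, 1]])
b = np.array([5, 6])

x_inv   = np.linalg.inv(A) @ b    # x = A^-1 b
x_solve = np.linalg.solve(A, b)   # solves A x = b without forming the inverse

print(x_inv)    # [1.3 0.8]
print(x_solve)  # [1.3 0.8]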

Applications:

Linear Regression (Finding weights)


Computer Graphics (Transformations)
Machine Learning models (Forward pass computations)

Conclusion
Matrix-vector multiplication builds the bridge between linear algebra and real-world
problem solving by providing a compact way to represent and compute linear systems,
transformations, and optimizations.

In [21]: import numpy as np


import matplotlib.pyplot as plt

# Define matrix A and vector x


A = np.array([[2, 3],
[4, 1]])
x = np.array([1, 2])

# Perform matrix-vector multiplication


b = A @ x # Resultant vector

# Set up the plot


fig, axs = plt.subplots(1, 2, figsize=(12, 5))
fig.suptitle('Matrix-Vector Multiplication Visualization', fontsize=16)

# Plot 1: Matrix A rows and vector x separately


axs[0].quiver(0, 0, A[0, 0], A[0, 1], angles='xy', scale_units='xy', scale=1, color='b', label='Row 1 of A')
axs[0].quiver(0, 0, A[1, 0], A[1, 1], angles='xy', scale_units='xy', scale=1, color='g', label='Row 2 of A')
axs[0].quiver(0, 0, x[0], x[1], angles='xy', scale_units='xy', scale=1, color='r', label='Vector x')

axs[0].set_xlim(-1, 7)
axs[0].set_ylim(-1, 7)
axs[0].set_aspect('equal')
axs[0].grid(True)
axs[0].legend()
axs[0].set_title('Matrix Rows and Vector')

# Plot 2: Resultant Vector b


axs[1].quiver(0, 0, b[0], b[1], angles='xy', scale_units='xy', scale=1, color='m', label='b = A @ x')

axs[1].set_xlim(-1, 15)
axs[1].set_ylim(-1, 15)
axs[1].set_aspect('equal')


axs[1].grid(True)
axs[1].legend()
axs[1].set_title('Resultant Vector')

plt.show()

Linear Equations and Systems of Equations

Understanding Linear Equations


A simple linear equation in one variable is of the form:

y = mx + b

Where:

( y ) is the dependent variable,


( m ) is the slope of the line,
( x ) is the independent variable,
( b ) is the y-intercept.

A System of Linear Equations


A system of linear equations consists of two or more linear equations that share
common variables. A typical system with two equations might look like this:

$$a_1 x + b_1 y = c_1$$

$$a_2 x + b_2 y = c_2$$

Where:

( a_1, a_2 ) are coefficients of ( x ),


( b_1, b_2 ) are coefficients of ( y ),
( c_1, c_2 ) are constants.


Matrix Representation of Linear Systems


A system of linear equations can be compactly represented as a matrix equation:

Ax = b

Where:

( \mathbf{A} ) is the coefficient matrix,


( \mathbf{x} ) is the vector of unknowns,
( \mathbf{b} ) is the result vector.

For the example above:

$$\mathbf{A} = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$$

Solution of Systems of Equations Using Matrix Operations


The solution to the system ( \mathbf{A} \mathbf{x} = \mathbf{b} ) can be found using
various matrix operations:

Gaussian Elimination: A method for solving a system by transforming the augmented matrix to row echelon form.
Inverse Matrix Method: If ( \mathbf{A} ) is invertible, the solution is given by:

$$\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}$$

Where ( \mathbf{A}^{-1} ) is the inverse of matrix ( \mathbf{A} ).

Introduction to Least Squares Method

Concept of Fitting a Line to Data


Regression is the process of predicting a dependent variable ( y ) from one or more
independent variables ( x_1, x_2, \dots, x_n ).
The goal of regression is to find the relationship between the variables.
Linear regression is a method where we fit a straight line to data points,
represented as:

y = β0 + β1 x + ϵ

Where:

( y ) is the dependent variable,


( x ) is the independent variable,
( \beta_0 ) is the intercept,


( \beta_1 ) is the slope,


( \epsilon ) is the error term.

Least Squares Approach


The least squares method is used to minimize the error (or residuals) between the
actual data points and the predicted values.
The objective is to minimize the sum of squared differences (residuals) between the
observed values and the values predicted by the model.

The residual for each data point is:

ri = yi − (β0 + β1 xi )

Where:

( r_i ) is the residual for the ( i^{th} ) data point,


( y_i ) is the actual value,
( (\beta_0 + \beta_1 x_i) ) is the predicted value.

The sum of squared residuals (error) is:


$$E(\beta_0, \beta_1) = \sum_{i=1}^{n} r_i^2 = \sum_{i=1}^{n} \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2$$

Matrix Formulation of the Least Squares Problem


To solve the least squares problem, we can rewrite it in matrix form:

y = Xβ + ϵ

Where:

( \mathbf{y} ) is the vector of observed values ( [y_1, y_2, \dots, y_n]^T ),


( \mathbf{X} ) is the matrix of input features,

$$\mathbf{X} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix},$$
( \boldsymbol{\beta} ) is the vector of coefficients ( [\beta_0, \beta_1]^T ),
( \mathbf{\epsilon} ) is the error term.

To minimize the error, we compute:

$$\boldsymbol{\beta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$


This gives us the values of ( \beta_0 ) and ( \beta_1 ) that minimize the sum of squared
residuals.
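
A minimal sketch of the normal equation on a small set of hypothetical data points (the x and y values below are assumed purely for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Design matrix with a column of ones for the intercept beta_0
X = np.column_stack([np.ones_like(x), x])

# Normal equation: beta = (X^T X)^-1 X^T y
beta = np.linalg.inv(X.T @ X) @ X.T @ y
print("Intercept (beta_0):", beta[0])
print("Slope (beta_1):", beta[1])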

Gradient Descent (Optional, if you want to dive deeper)

Understanding Gradient Descent


Gradient Descent is an optimization algorithm used to minimize a function by
iteratively moving toward the minimum value of the function.
It is primarily used for minimizing the cost function in machine learning, including
linear regression.

The core idea of gradient descent is to update the parameters of the model (in our case, (
\beta_0 ) and ( \beta_1 )) in the direction of the steepest descent of the cost function,
which tells us how far off our predictions are from the actual values.

How it helps in minimizing the cost function in linear regression
In the context of linear regression, the cost function we aim to minimize is the mean
squared error (MSE), which is a measure of how well the model fits the data.

The MSE is given by:


$$J(\beta_0, \beta_1) = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - (\beta_0 + \beta_1 x_i) \right)^2$$

Where:

( J(\beta_0, \beta_1) ) is the cost function,


( y_i ) are the actual values,
( x_i ) are the input values,
( \beta_0 ) and ( \beta_1 ) are the model parameters (intercept and slope).

The gradient of the cost function with respect to the parameters ( \beta_0 ) and ( \beta_1
) is calculated to determine the direction of the steepest slope, and the parameters are
updated accordingly:

Gradient Descent Update Rule:


$$\beta_0 := \beta_0 - \alpha \frac{\partial J}{\partial \beta_0}$$

$$\beta_1 := \beta_1 - \alpha \frac{\partial J}{\partial \beta_1}$$

Where:


( \alpha ) is the learning rate (a small positive value that controls the step size of
each update),
( \frac{\partial J}{\partial \beta_0} ) and ( \frac{\partial J}{\partial \beta_1} ) are the
partial derivatives of the cost function with respect to ( \beta_0 ) and ( \beta_1 ),
respectively.

By iteratively applying these update rules, gradient descent moves toward the values of (
\beta_0 ) and ( \beta_1 ) that minimize the cost function, thereby finding the best fit line
for the data.
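
A minimal gradient descent sketch for this cost function; the data, learning rate, and iteration count below are assumed for illustration:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

beta0, beta1 = 0.0, 0.0   # initial parameters
alpha = 0.01              # learning rate
n = len(x)

for _ in range(5000):
    y_pred = beta0 + beta1 * x
    # Partial derivatives of the MSE cost J with respect to beta0 and beta1
    d_beta0 = (-2 / n) * np.sum(y - y_pred)
    d_beta1 = (-2 / n) * np.sum((y - y_pred) * x)
    beta0 -= alpha * d_beta0
    beta1 -= alpha * d_beta1

print("Intercept:", beta0, "Slope:", beta1)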

Linear Regression Algorithm

Mathematics of Linear Regression


The equation of linear regression is given by:

Y = Xβ + ϵ

Where:

( \mathbf{Y} ) is the vector of observed values (dependent variable),


( \mathbf{X} ) is the matrix of input features (independent variables),
( \boldsymbol{\beta} ) is the vector of coefficients (parameters of the model),
( \boldsymbol{\epsilon} ) is the error term (residuals).

Solving for the Coefficients Using Matrices


To find the best-fitting values for the coefficients ( \boldsymbol{\beta} ), we use the
normal equation. This equation allows us to solve for ( \boldsymbol{\beta} ) directly
using matrix operations:

$$\boldsymbol{\beta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$

Where:

( \mathbf{X}^T ) is the transpose of the matrix ( \mathbf{X} ),


( (\mathbf{X}^T \mathbf{X})^{-1} ) is the inverse of the matrix ( \mathbf{X}^T
\mathbf{X} ),
( \mathbf{y} ) is the vector of observed values.

The normal equation provides an exact solution for the coefficients, minimizing the sum
of squared residuals between the predicted values and the actual values.

Understanding Model Evaluation


After fitting a linear regression model, we evaluate its performance using various metrics:

1. Mean Squared Error (MSE):


The Mean Squared Error (MSE) measures the average of the squared
differences between the actual and predicted values. It gives an idea of how well
the model fits the data.
The formula for MSE is:

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Where:

( y_i ) is the actual value,


( \hat{y}_i ) is the predicted value,
( n ) is the number of data points.
2. ( R^2 ) (R-squared):

The ( R^2 ) score measures the proportion of the variance in the dependent
variable that is explained by the independent variables.
The formula for ( R^2 ) is:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

Where:

( y_i ) is the actual value,


( \hat{y}_i ) is the predicted value,
( \bar{y} ) is the mean of the actual values.
The value of ( R^2 ) lies between 0 and 1:

A value closer to 1 indicates that the model explains a large portion of the
variance in the dependent variable.
A value closer to 0 indicates that the model explains very little of the variance.

How to Evaluate the Performance of a Linear Regression Model
The performance of a linear regression model can be evaluated using MSE and (
R^2 ).
A low MSE value indicates that the model’s predictions are close to the actual values,
while a high ( R^2 ) value indicates that the model explains a significant portion of
the variance in the data.
By using these metrics, we can assess how well our model fits the data and whether
improvements or adjustments are necessary.
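
A short sketch computing both metrics for hypothetical actual and predicted values (the numbers below are assumed for illustration):

import numpy as np

y_actual = np.array([3.0, 5.0, 7.0, 9.0])
y_pred   = np.array([2.8, 5.1, 7.3, 8.9])

mse = np.mean((y_actual - y_pred) ** 2)
ss_res = np.sum((y_actual - y_pred) ** 2)             # residual sum of squares
ss_tot = np.sum((y_actual - np.mean(y_actual)) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot

print("MSE:", mse)
print("R^2:", r2)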
