Linear Models
Using the least-squares formulas with n = 5, ∑x = 15, and ∑y = 25:
m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²) = 65/50 = 1.3
b = (∑y − m∑x)/n
b = (25 − 1.3×15)/5
b = (25 − 19.5)/5
b = 5.5/5 = 1.1
So, the required equation of least squares is y = mx + b = 1.3x + 1.1.
https://www.cuemath.com/data/least-squares/
https://www.geeksforgeeks.org/least-square-method/
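The arithmetic above can be checked with a short script. It uses only the sums given in the example (n = 5, ∑x = 15, ∑y = 25, and the slope ratio 65/50); no extra data is assumed:

```python
# Recompute the least-squares slope and intercept from the sums above.
n = 5
sum_x, sum_y = 15, 25
m = 65 / 50                      # m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
b = (sum_y - m * sum_x) / n      # b = (∑y − m∑x) / n
print(m, b)                      # slope 1.3, intercept 1.1 (up to float rounding)
```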
Multiple Linear Regression
Multivariate Linear Regression
• Multivariate regression is a statistical technique that
establishes a relationship between multiple data variables. It
estimates a linear equation for analysing one or more dependent
(outcome) variables as a function of one or more predictor
variables, possibly measured at different points in time.
Assumptions
• The validity and reliability of the multivariate regression findings depend upon
the following four assumptions:
• Linearity: The correlation between the predictor and outcome variables is linear.
• Independence: The observations are independent of each other, i.e., the value
of one observation should not influence the value of any other observation.
• Homoscedasticity: The variance of the errors (residuals) is even across all levels
of the explanatory variables. This ensures that the spread of residuals is the
same for all predicted values.
• Normality: The residuals (differences between observed and predicted values)
should be normally distributed, ensuring that statistical inferences about
regression coefficients are valid.
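These residual assumptions can be checked numerically. The sketch below fits an ordinary least-squares model to synthetic data (the data is made up purely for illustration) and inspects the residuals; a histogram or Q-Q plot would normally be used to eyeball normality:

```python
import numpy as np

# Synthetic data: two predictors, a known linear signal, Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Ordinary least squares: prepend an intercept column, solve, take residuals.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef

# With an intercept, residuals average exactly zero; their spread should be
# roughly constant across fitted values (homoscedasticity) and close to the
# noise scale used above.
print(residuals.mean())   # close to 0
print(residuals.std())    # close to the noise scale 0.5
```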
Advantages
• Some of the benefits of this model are discussed below:
• Better Comprehends Relationships: Unlike simple linear regression, which considers
only one predictor, multivariate regression can account for interactions and
interdependencies among various predictors, capturing complex relationships
between these variables.
• Reliable Predictions: By including multiple predictors, the model might provide more
accurate estimations than simple regression models, leading to a better fit for the
data.
• Correlation, Strength, and Direction: Multivariate regression can help identify which
explanatory variables significantly influence the dependent variable, establishing a
correlation and quantifying the direction and strength of these correlations.
Disadvantages
The various limitations of this regression technique are as follows:
• Difficult to Interpret: Multivariate regression can be challenging to interpret, especially for
individuals unfamiliar with statistical analyses, due to multiple predictors.
• Complex Calculations: Since this model incorporates multiple variables, its computation involves
complex mathematical calculations.
• Extensive Data Requirement: Multivariate regression requires a larger sample size than simple
regression. Small sample sizes can result in unreliable parameter estimates and low statistical
power.
• Overfitting: It occurs when the model fits the training data too closely, capturing noise rather
than the underlying pattern.
https://www.wallstreetmojo.com/multivariate-regression/
Steps to achieve multivariate regression
• Step 1: Select the features
• First, select the features that drive the multivariate regression, i.e., those
most responsible for the change in your dependent variable.
• Step 2: Normalize the feature
• Now that we have our selected features, it is time to scale them to a common range (preferably 0-1) so that
analysing them becomes easier.
• To rescale each feature, we can use min-max scaling: x' = (x − min(x)) / (max(x) − min(x)).
• Step 3: Select loss function and formulate a hypothesis
• A formulated hypothesis is the predicted value of the response variable, denoted h(x).
• A loss function measures the error the hypothesis makes on a single example; a cost
function aggregates that loss over the whole training set.
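Steps 2 and 3 can be sketched as follows. The tiny dataset and the zero-initialised parameters are placeholders for illustration, not values from the slides:

```python
import numpy as np

def min_max_scale(col):
    # Step 2: rescale one feature column to the range [0, 1].
    return (col - col.min()) / (col.max() - col.min())

def hypothesis(X, w, b):
    # Step 3: linear hypothesis h(x) = w·x + b.
    return X @ w + b

def mse_cost(X, y, w, b):
    # Cost = average of the per-example squared losses.
    return np.mean((hypothesis(X, w, b) - y) ** 2)

X = np.array([[1.0, 50.0], [2.0, 60.0], [3.0, 80.0]])
y = np.array([10.0, 20.0, 30.0])

X_scaled = np.apply_along_axis(min_max_scale, 0, X)   # each column -> [0, 1]
print(mse_cost(X_scaled, y, np.zeros(2), 0.0))        # mean(y**2) ≈ 466.67
```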
• Step 4: Minimize the cost and loss function
• Since the cost function aggregates the individual losses, minimizing the cost
also minimizes the loss. Minimization algorithms are run over the dataset to
adjust the parameters of the hypothesis.
• One of the minimization algorithms that can be used is the gradient descent algorithm.
• Step 5: Test the hypothesis
• The formulated hypothesis is then tested with a test set to check its accuracy and correctness.
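Steps 4 and 5 can be sketched with a minimal gradient-descent loop; the synthetic data, learning rate, and iteration count below are illustrative assumptions, not values prescribed here:

```python
import numpy as np

# Synthetic, noiseless data with a known linear relationship.
rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 2))
y = 4.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5

# Step 5 needs a held-out test set, so split before fitting.
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

# Step 4: gradient descent on the mean-squared-error cost.
w, b, lr = np.zeros(2), 0.0, 0.3
for _ in range(5000):
    err = X_train @ w + b - y_train
    w -= lr * 2 * X_train.T @ err / len(y_train)   # dCost/dw
    b -= lr * 2 * err.mean()                       # dCost/db

# Step 5: evaluate the fitted hypothesis on the test set.
test_mse = np.mean((X_test @ w + b - y_test) ** 2)
print(w.round(3), round(b, 3), test_mse)   # ≈ [4. 1.], 0.5, near-zero error
```

With noiseless data the loop recovers the true coefficients almost exactly; on real data the test error would reflect both noise and model misfit.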