Linear Regression Research Paper

This paper demonstrates the use of linear regression to predict employee salaries based on years of experience using Python and relevant libraries. The dataset was split into training and testing sets, and the model's performance was evaluated using Mean Squared Error and R² Score. The results indicate a strong linear relationship, making this approach effective for salary estimations in HR planning.

Uploaded by

ahmedaboamod08

Analyzing the Relationship Between Years of Experience and Salary Using Linear Regression Algorithm

1. Introduction
In the field of data analysis and machine learning, linear regression is one of the simplest
and most commonly used algorithms for predicting numerical values. This paper aims to
demonstrate how linear regression can be used to predict an employee’s salary based on
their years of experience, using Python and libraries such as pandas and scikit-learn.

2. Dataset Description
The dataset used in this experiment contains two main columns:
- YearsExperience: Number of years of professional experience.
- Salary: The corresponding salary.
The data was loaded using the pandas library and visually inspected to verify the structure
and column names.
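
The loading and inspection step can be sketched as follows. The paper does not give the file name or the actual values, so the rows here are hypothetical and are read from an in-memory string; with a real file, the same pd.read_csv call would take the file path instead.

```python
import io

import pandas as pd

# Hypothetical sample rows (assumption): the paper's real data is not shown.
csv_text = """YearsExperience,Salary
1.0,40000
2.0,45000
3.5,52000
5.0,61000
"""

# Read the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Inspect the structure: first rows, column names, and column types.
print(df.head())
print(df.columns.tolist())
print(df.dtypes)
```

Checking the column names at this stage catches typos or stray whitespace in the header before the columns are used as model inputs.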

3. Methodology

3.1 Data Splitting


The dataset was split into a training set (80%) and a testing set (20%) using scikit-learn's
train_test_split function, with random_state=42 fixed so that every run produces the same
split and therefore the same results.
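
A minimal sketch of this split is given here. The data is synthetic (an assumption, since the original dataset is not reproduced), and the row count of 30 is chosen only for illustration; the test_size and random_state values are the ones stated in the text.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 30 rows with a roughly linear trend.
rng = np.random.default_rng(0)
years = np.round(rng.uniform(1, 10, size=30), 1)
salary = 30000 + 9000 * years + rng.normal(0, 4000, size=30)
df = pd.DataFrame({"YearsExperience": years, "Salary": salary})

X = df[["YearsExperience"]]  # 2-D feature table, as scikit-learn expects
y = df["Salary"]             # 1-D target

# 80% training, 20% testing; the fixed seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))
```
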

3.2 Model Construction


The linear regression model was built using the LinearRegression class from the
sklearn.linear_model module. The model was trained on the training data using the fit()
method.
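
The construction and training step might look like the sketch that follows. The data is again synthetic (an assumption), generated around a known slope and intercept so that the fitted coefficients can be compared with the values used to create it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): salary = 30000 + 9000 * years + noise.
rng = np.random.default_rng(0)
years = rng.uniform(1, 10, size=30)
salary = 30000 + 9000 * years + rng.normal(0, 4000, size=30)

X = years.reshape(-1, 1)  # scikit-learn expects a 2-D feature array
y = salary

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Build the model and fit it on the training data only.
model = LinearRegression()
model.fit(X_train, y_train)

slope = model.coef_[0]        # estimated salary increase per year of experience
intercept = model.intercept_  # estimated salary at zero years of experience
print(f"salary = {intercept:.0f} + {slope:.0f} * years")
```

The slope is the quantity of practical interest here: it is the estimated change in salary for each additional year of experience.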

3.3 Model Evaluation


The model’s performance was evaluated using the following metrics:
- Mean Squared Error (MSE): the average of the squared differences between predicted and
actual values; lower values indicate a better fit.
- R² Score (Coefficient of Determination): the proportion of the variance in the target
variable that the model explains; values closer to 1 indicate a better fit.
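
Both metrics can be computed with scikit-learn, as in the sketch that follows. It assumes synthetic data and evaluation on the held-out testing set, and it also recomputes each metric by hand so that the definitions are explicit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): salary = 30000 + 9000 * years + noise.
rng = np.random.default_rng(0)
years = rng.uniform(1, 10, size=30)
salary = 30000 + 9000 * years + rng.normal(0, 4000, size=30)
X = years.reshape(-1, 1)
y = salary

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Library versions of the two metrics.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# The same quantities written out from their definitions.
residual_ss = np.sum((y_test - y_pred) ** 2)       # sum of squared errors
total_ss = np.sum((y_test - y_test.mean()) ** 2)   # variance around the mean
mse_manual = residual_ss / len(y_test)
r2_manual = 1 - residual_ss / total_ss

print(f"MSE: {mse:.2f}  R2: {r2:.4f}")
```

Because MSE is expressed in squared salary units, its square root (RMSE) is often easier to interpret alongside R².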

4. Results
The model captured the linear relationship between years of experience and salary: as
experience increases, the predicted salary also increases, and the data follow a nearly
linear pattern.
The results were also visualized in a figure, which is not reproduced here.
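
A plot of this kind could be produced along the following lines. This is only a sketch: the paper does not say which plotting library was used, so matplotlib is an assumption, as are the synthetic data and the choice to show training points, test points, and the fitted line together.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): salary = 30000 + 9000 * years + noise.
rng = np.random.default_rng(0)
years = rng.uniform(1, 10, size=30)
salary = 30000 + 9000 * years + rng.normal(0, 4000, size=30)
X = years.reshape(-1, 1)
y = salary

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Scatter the observed points and draw the fitted regression line over them.
fig, ax = plt.subplots()
ax.scatter(X_train[:, 0], y_train, label="Training data")
ax.scatter(X_test[:, 0], y_test, label="Test data")
x_line = np.linspace(years.min(), years.max(), 100)
ax.plot(x_line, model.predict(x_line.reshape(-1, 1)), color="black",
        label="Fitted line")
ax.set_xlabel("YearsExperience")
ax.set_ylabel("Salary")
ax.legend()

# Render to an in-memory PNG; fig.savefig("<file name>") would write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```
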

5. Conclusion
The linear regression algorithm proved to be effective in predicting salaries based on years
of experience, especially when the relationship between the variables is linear. This method
provides a simple yet powerful way to make salary estimations, which can be useful in areas
like HR planning and compensation analysis.

6. References
- Scikit-learn Documentation: https://scikit-learn.org/
- VanderPlas, J. Python Data Science Handbook. O'Reilly Media.
- Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow.
O'Reilly Media.
