
Internship Report

This report summarizes an internship at Digi Stack, Bangalore where the intern worked on building a salary prediction system using machine learning techniques. The intern analyzed factors influencing salary, developed models to accurately predict salary based on job title, location, experience, education and industry. Algorithms like regression, decision trees and ensemble methods were used to identify significant predictors and their impact on salary.


VISVESVARAYA TECHNOLOGICAL UNIVERSITY


Jnana Sangama, Belagavi 590018

An internship report performed

At

“DIGI STACK, BANGALORE”


An Internship Project Report

On

“SALARY PREDICTION SYSTEM”


Submitted in partial fulfillment of the requirements for the award of the degree of

Bachelor of Engineering
in
Computer Science and Engineering

Submitted by

Amol M 4GL20CS001

Under the guidance of


Dr. Ranganatha S
B.E., M.Tech., Ph.D., MISTE
HOD, Dept of CSE

Department of Computer Science and Engineering


Government Engineering College
Kushalnagara - 571234, Karnataka,
India 2023-2024
Government Engineering College
Department of Computer Science and Engineering
Kushalnagara - 571234, Karnataka, India

Certificate

This is to certify that Mr. Amol M (4GL20CS001) has successfully completed an internship
at “DIGI STACK, BANGALORE” under the guidance of Dr. Ranganatha S and Prof.
Radhika B, in partial fulfilment of the requirements for the award of the degree of Bachelor of
Engineering in Computer Science and Engineering of Visvesvaraya Technological University,
Belgaum, during the year 2023-24. It is certified that all the corrections/suggestions indicated
for internal assessment have been incorporated in the report. The Internship Report has been
approved as it satisfies the academic requirements in respect of internship work prescribed for
the Bachelor of Engineering degree.

Signature of the Guide      Signature of Co-Ordinator      Signature of the Principal

Dr. Ranganatha S            Radhika B                      Dr. Sathish
Head of Dept.               Assistant Professor            Principal
Dept. of CSE, GECK          Dept. of CSE, GECK             GEC, Kushalnagara

Evaluators

Name of the Examiner Signature of the Examiner

1.
2.
ACKNOWLEDGEMENT

I consider it a privilege to express my wholehearted gratitude and respect to each and
every one who guided and helped me in the successful completion of this Seminar Report.

I gratefully acknowledge the enthusiastic support of my guide, Dr. Ranganatha S, and my
internship coordinator, Prof. Radhika B, for the ideas and co-operation they extended during
this work, which made the Internship Seminar a success.

I am very thankful to Dr. Ranganatha S, HOD, Department of Computer Science, for his
co-operation and encouragement at every stage of my work.

I am also thankful to the principal, Dr. Sathish N S, for being kind enough to provide me an
opportunity to work on an internship seminar in this institution.

I would also like to thank my parents and well-wishers, as well as my dear classmates, for
their guidance and kind co-operation.

Finally, it is my pleasure to acknowledge the friendly co-operation shown by all the staff
members of the Computer Science Department, GEC Kushalnagara.

Amol M (4GL20CS001)
ABSTRACT
In today's competitive job market, accurately predicting salaries is crucial for both job seekers and
employers. This study explores various factors influencing salary prediction models, utilizing
machine learning techniques to analyze large datasets containing information such as job title,
location, experience, education, and industry. Through feature engineering and model optimization,
we aim to develop robust prediction models capable of providing accurate salary estimates.
Leveraging algorithms such as regression, decision trees, and ensemble methods, our study
identifies significant predictors and their respective impacts on salary outcomes. The findings of
this research offer valuable insights into salary determinants, empowering job seekers to negotiate
better compensation packages and aiding employers in making informed hiring and salary
decisions. By harnessing the power of data-driven approaches, this study contributes to enhancing
transparency and fairness in salary negotiations, ultimately fostering a more equitable job market
for all stakeholders.
TABLE OF CONTENTS

1 EXECUTIVE SUMMARY

2 ABOUT THE COMPANY
2.1 Introduction to Internship
2.2 About Company
2.3 Founder
2.4 Vision and Mission
2.5 Services of Digi Stack
2.6 Clients of Digi Stack

3 PROBLEM STATEMENT AND OBJECTIVES
3.1 Problem Statement
3.2 Objectives

4 TASK PERFORMED
4.1 Information About the Project Done
4.2 Weekly Overview of Internship Activities

5 TRAINING OUTLINE
5.1 Technologies Used
5.2 Tools Used
5.3 Methodology Used
5.4 Flow Diagram

6 LITERATURE SURVEY
6.1 Systems Requirements and Specifications

7 TESTING AND RESULT ANALYSIS
7.1 Testing Procedure
7.2 Test Results
7.3 Result Analysis

8 DISCUSSION
8.1 Common Faults Observed
8.2 Causes and Remedies

9 CONCLUSIONS
9.1 Outcome of the Internship
9.2 Scope for Future Work

10 SWOT ANALYSIS

11 REFERENCES
LIST OF FIGURES

6.1 Predict Page
6.2 Employee Salary Prediction Page
6.3 Explore Page
6.4 Number of Data from different countries
6.5 Mean Salary based on Country
6.6 Mean Salary Based on Experience
Internship Report

Dept of CSE, GEC Kushalnagara 1



Chapter 1

EXECUTIVE SUMMARY
This report refers to work completed during my internship with Digi Stack Soft Systems
Pvt Ltd from August 16th to September 16th, 2023, where I delved into practical
applications of Python, Machine Learning, and Data Analysis. With a clear focus on
equipping me with real-world problem-solving skills, the internship aimed to cultivate
expertise in addressing contemporary challenges in these domains.

A pivotal project during the internship involved building an "Employee Salary Prediction"
model. Using Python with tools such as Anaconda and Jupyter Notebook and libraries such
as NumPy, Pandas, Matplotlib, and Seaborn, I preprocessed, analyzed, and visualized data.
This enabled me to train and evaluate machine learning models that forecast employee
salaries based on factors including experience, education, job role, and geographical
location.

The internship served as a transformative period, significantly amplifying my proficiency
in Python, statistics, and machine learning. By seamlessly integrating theoretical
knowledge with hands-on experience in a real-world scenario like salary prediction, I
successfully bridged the chasm between academic learning and professional
implementation within the dynamic confines of a corporate setting.

Overall, my tenure at Digi Stack Soft Systems Pvt Ltd proved to be an enriching and
transformative experience, facilitating substantial personal and professional growth. It
provided an invaluable platform for skill development, offered exposure to authentic
business complexities, and markedly contributed to my ongoing academic and professional
journey in the vibrant field of data science and machine learning.


Chapter 2

ABOUT THE COMPANY

2.1 Introduction to Internship


An internship is a period of work experience offered by an employer to give students and
graduates exposure to the working environment, often within a specific industry related
to their field of study. Internships can be as short as a week or as long as 12 months.
They can be paid or voluntary; however, before you start an internship it’s important to
know your rights with regard to getting paid. Internships can be done in a range of sectors,
including sales, marketing, engineering, graphic design, management, IT, and many
more. Throughout an internship you will develop a variety of soft skills, including
communication skills, personal effectiveness, presentation skills, creative problem solving
and influencing skills.
‘On-the-job’ experience can be as valuable as anything learned in your studies. After
all, you cannot really understand what a job is all about until you have worked in that
environment. Internships are great opportunities to speak directly to people who have
experience in the role you aspire to; and their knowledge of the job and working
environment will give you a greater understanding of what it’s all about and what you need
to do to progress. Your career aspirations may change when you’re faced with the true
realities of a role. Internships can therefore be used as a ‘try before you buy’ option, before
you embark on a career and confirm if this is what you want to do in the long term.

2.2 ABOUT COMPANY

In the dynamic landscape of technology and business, staying ahead requires more than just
knowledge—it demands innovation, expertise, and a commitment to excellence. This ethos
forms the foundation of Digi Stack Soft, India's leading training institute, where every
interaction is a gateway to transformation.

At the heart of Digi Stack Soft lies a revolutionary approach to learning: one-to-one trainer
mapping and the pioneering Batch 30 Concept. Here, students aren't just participants; they are
partners in their own journey of growth. Each student is matched with a dedicated trainer,
ensuring personalized attention and tailored guidance. The Batch 30 Concept, a groundbreaking
methodology, fosters collaboration and peer learning, creating a vibrant ecosystem where ideas
flourish and boundaries are pushed.

Fig 1.1: DIGI STACK

Digi Stack Soft transcends being merely a training institute; it fosters a community of
excellence where industry insiders serve as mentors, sharing invaluable insights and real-world
experiences. This symbiotic relationship between academia and industry enriches learning and
ensures students acquire the skills demanded by today's competitive marketplace. With a
remarkable 90% placement rate, Digi Stack Soft is committed to success, offering innovative
software solutions that redefine business growth. Whether through Software Training, Project
Guidance, IT Consulting, or immersive Technology Workshops, the institute focuses on
empowering individuals and organizations to thrive. It serves as a catalyst for change and
innovation, inviting individuals to join a transformative journey where success knows no
bounds.
Website: https://www.Digi Stack soft.com

Specialties: Machine Learning, Artificial Intelligence, Full Stack Development, Automation
Testing, Device Testing, Python, Java, Selenium.


2.3 FOUNDER

Established in December 2022, Digi Stack Soft Systems Pvt Ltd, located in Nagarbhavi,
Bangalore, was founded by Madhu Raju along with co-founders Charles Jensen, Francis
Miller, David Fontaine, and Paul Gillian. Since its inception, the institute has swiftly emerged
as a distinguished player in the realm of Computer Software Training Institutes in the city.
Renowned for its comprehensive training programs, Digi Stack Soft Systems Pvt Ltd has
become a trusted destination for individuals seeking to enhance their skills in software
development and related fields. Catering to the needs of both local residents and individuals
from various parts of Bangalore, the institute functions as a one-stop destination for all
software training requirements. With a commitment to customer satisfaction and excellence in
education, Digi Stack Soft Systems Pvt Ltd offers quality training programs tailored to meet
industry demands. Under the visionary leadership of Madhu Raju and the co-founders, the
institute provides a conducive learning environment and expert guidance to help individuals
achieve their goals in the dynamic field of computer software.

2.4 VISION AND MISSION


Vision: “Is to implement & update students with the latest technology trends across the
market value chain and also make them ready for the competitive world of IT”.

Mission: “Is to give a platform for the IT-Job aspiring youth through its pioneering ‘Digi
Stack Soft Program’ alongside assimilating the ever-changing HR needs of India’s
booming Software industry with a pipeline of jobs”.

2.5 SERVICES OF DIGI STACK SOFT


The services of Digi Stack Soft cover Machine Learning, Artificial Intelligence, Full Stack
Development, Automation Testing, Device Testing, Python, Java, Selenium, and Internships.


Figure 2.1: Services of Digi Stack Soft

2.6 CLIENTS OF DIGI STACK SOFT


Organizations need new technology to keep up with business requirements, for example to
reliably migrate physical and virtual servers and to replicate data in real time from a single,
intuitive user console. Some clients of Digi Stack Soft are shown in Figure 2.2.

Figure 2.2: Clients of Digi Stack Soft


Chapter 3

PROBLEM STATEMENT AND OBJECTIVES

3.1 PROBLEM STATEMENT


Predicting salaries accurately is difficult due to numerous factors affecting compensation
levels and shortcomings in traditional methods. Existing models often lack precision and
struggle to adapt to changing job market dynamics. This complexity arises from various
factors like education, experience, skills, industry, location, and company size, which interact
in nuanced ways. Traditional approaches overlook these interactions, leading to inaccurate
estimations. A robust machine-learning solution is needed to leverage diverse variables and
improve prediction accuracy. This would help organizations allocate resources better,
streamline recruitment, and offer insights into salary expectations for different roles and skills.
By using advanced algorithms and big data analytics, organizations can make informed
decisions on compensation planning, enhancing efficiency and competitiveness in the talent
market.

3.2 OBJECTIVES
The objective of an Employee Salary Prediction system is to build a predictive model that
estimates employee salaries based on relevant features such as education, experience, job
role, and location. The primary objectives of such a system include:
• Developing a Predictive Model: Create a model that accurately estimates employee salaries
based on key features such as education, experience, job role, and location.
• Optimizing Model Performance: Enhance model accuracy and generalization by selecting
appropriate algorithms, performing feature engineering, and fine-tuning hyperparameters.
• Providing Interpretable Insights: Ensure the model's predictions are interpretable, allowing
stakeholders to understand the factors that determine salary and facilitating
informed decision-making in workforce management.
• Enabling Seamless Deployment: Deploy the trained model effectively for real-time
predictions, integrating it into existing systems to support timely and actionable insights
for salary planning and resource allocation.


Chapter 4

TASK PERFORMED

4.1 INFORMATION ABOUT THE PROJECT DONE


The following project was assigned to me and completed during the internship.
Project Title: Employee Salary Prediction System

4.2 WEEKLY OVERVIEW OF INTERNSHIP


The internship was carried out for four weeks starting from 16th August to 16th September
2023. The following tables describe daily work done in four weeks.

Table 4.1: Week 1 Work Done

Week  Date        Day  Task/Topic Completed
I     16/08/2023  Wed  Introduction to Python.
I     17/08/2023  Thu  Variables and Data Types.
I     18/08/2023  Fri  Operators and Expressions.
I     21/08/2023  Mon  Control Flow (if, elif, else).
I     22/08/2023  Tue  Loops (for, while).
I     23/08/2023  Wed  Functions and Modules.


Table 4.2: Week 2 Work Done

Week  Date        Day  Task/Topic Completed
II    24/08/2023  Thu  Introduction to NumPy and Pandas.
II    25/08/2023  Fri  Data Visualization with Matplotlib and Seaborn.
II    28/08/2023  Mon  Exploratory Data Analysis (EDA).
II    29/08/2023  Tue  Introduction to Scikit-learn.
II    30/08/2023  Wed  Data Preprocessing with Scikit-learn.
II    31/08/2023  Thu  Project Planning.

Table 4.3: Week 3 Work Done

Week  Date        Day  Task/Topic Completed
III   01/09/2023  Fri  Data Collection and Cleaning.
III   04/09/2023  Mon  Feature Extraction.
III   05/09/2023  Tue  Model Selection and Training.
III   06/09/2023  Wed  Hyperparameter Tuning.
III   07/09/2023  Thu  Model Evaluation and Testing.
III   08/09/2023  Fri  Model Deployment Planning.


Table 4.4: Week 4 Work Done

Week  Date        Day  Task/Topic Completed
IV    11/09/2023  Mon  Setting up Project Environment.
IV    12/09/2023  Tue  Developing Web Application with Streamlit.
IV    13/09/2023  Wed  Integrating Machine Learning Model.
IV    14/09/2023  Thu  Testing and Debugging.
IV    15/09/2023  Fri  Deployment and Finalization.
IV    16/09/2023  Sat  Project Presentation.


Chapter 5

TRAINING OUTLINE

5.1 TECHNOLOGIES USED


Python
The primary programming language used for the entire project. Python's simplicity and
extensive library support make it an ideal choice for machine learning and web development
tasks.
Scikit-learn
A powerful Python library for machine learning that provides simple and efficient tools for
data mining and data analysis. It is used for building the machine learning model that predicts
software developer salaries.
Pandas
A software library written for the Python programming language for data manipulation and
analysis. It is used for data cleaning and preprocessing, including handling missing values,
outliers, and categorical variables.
Streamlit
An open-source Python library that makes it easy to create and share beautiful, custom web
apps for machine learning and data science. It is used for developing an interactive web
application that allows users to input their details and receive salary predictions.
Matplotlib
A plotting library for the Python programming language and its numerical mathematics
extension NumPy. It is used for creating various plots and visualizations within the web
application to help users understand the data better.
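
As a rough illustration of the cleaning steps that Pandas is used for here (missing values, outliers, and categorical variables), the following is a minimal sketch. The column names (Country, EdLevel, YearsCodePro, Salary), the sample rows, and the salary thresholds are assumptions made for illustration; they are not taken from the project's dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset and its column names may differ.
raw = pd.DataFrame({
    "Country": ["India", "Germany", "India", None, "Italy"],
    "EdLevel": ["Bachelor's degree", "Master's degree", "Bachelor's degree",
                "Master's degree", "Post grad"],
    "YearsCodePro": ["4", "8", "Less than 1 year", "10", "19"],
    "Salary": [50000.0, 52000.0, None, 61000.0, 5000000.0],
})


def clean_experience(value):
    """Convert an experience entry to a float, mapping text buckets to numbers."""
    if value == "Less than 1 year":
        return 0.5
    if value == "More than 50 years":
        return 50.0
    return float(value)


def clean(df, min_salary=10_000, max_salary=250_000):
    """Drop incomplete rows, remove salary outliers, and encode categories."""
    df = df.dropna().copy()                                   # missing values
    df["YearsCodePro"] = df["YearsCodePro"].apply(clean_experience)
    in_range = df["Salary"].between(min_salary, max_salary)   # outliers
    df = df[in_range]
    # One-hot encode the categorical columns.
    return pd.get_dummies(df, columns=["Country", "EdLevel"])


cleaned = clean(raw)
print(cleaned.shape)
```

The thresholds are passed as parameters so that the outlier rule can be adjusted once the actual salary distribution has been inspected.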

5.2 TOOLS USED

Jupyter Notebook
Jupyter Notebook is an open-source interactive web-based application for creating and sharing
documents containing live code, equations, visualizations, and narrative text. It supports
multiple programming languages such as Python, R, and Julia, allowing for data analysis,
statistical modeling, and machine learning tasks. With its user-friendly interface, Jupyter
Notebook promotes collaborative work and facilitates reproducible research by combining
code execution with rich text formatting. It is widely used in academia, industry, and research
for its versatility, ease of use, and integration with various data science libraries and tools.

Google Chrome Web Browser


Google Chrome, renowned for its speed, security, and user-friendly features, serves as the
primary browser for accessing tools like Jupyter Notebook and Streamlit web applications.
While Chrome is not part of their development, it is widely used because of its
compatibility with various web services, including those for data analysis and
visualization. Users often opt for Chrome due to its seamless integration, facilitating a
smoother workflow for coding and analysis tasks. By configuring Chrome as the default
browser for Jupyter Notebook, either via application settings or system preferences, users
can optimize their experience, enabling easier access and interaction with these essential tools.
This integration fosters efficiency in data exploration and sharing, ultimately enhancing
collaboration and productivity in research and development endeavors.

5.3 METHODOLOGY USED

1. Data Pre-processing: In this stage, the raw data is prepared for the machine learning
model. This may involve handling missing values and outliers, and scaling the features.
2. Feature Selection: This step involves choosing the most relevant features that will
influence the target variable in the machine learning model.
3. Linear Regression Algorithm: This is the core of the process where the actual linear
regression model is trained.
4. Splitting Data into Training and Testing Sets: The data is divided into two sets: training
data and testing data. The training data is used to fit the model, and the testing data is used
to evaluate the model’s performance.
5. Training Data: The training data is fed into the linear regression model, and the model
learns the underlying relationship between the features and the target variable.
6. Testing Data: The testing data is used to assess how well the trained model performs on
unseen data.
7. Prediction: After the model is trained, it can be used to predict the target variable for new
data points.
8. Web Application Development: Develop an interactive web application using Streamlit,
a Python library for building data-driven web applications. Users can input their country,
education level, and years of experience to receive a predicted salary. Additionally, include
an explore page where users can interact with the dataset to gain insights.
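
Steps 1 to 7 above can be sketched with scikit-learn roughly as follows. This is only an illustrative sketch: the data is synthetic, and the column names (country, education, experience, salary), the 80/20 split, and the use of one-hot encoding for the categorical inputs are assumptions rather than details confirmed by the project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic data standing in for the real dataset (all values are made up).
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "country": rng.choice(["India", "Germany", "Italy"], size=n),
    "education": rng.choice(["Bachelor's", "Master's", "Post grad"], size=n),
    "experience": rng.integers(0, 20, size=n).astype(float),
})
base = data["country"].map({"India": 30_000, "Germany": 60_000, "Italy": 50_000})
data["salary"] = base + 2_000 * data["experience"] + rng.normal(0, 3_000, size=n)

X = data[["country", "education", "experience"]]
y = data["salary"]

# Step 4: split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 1-3: encode the categorical features, then apply linear regression.
model = Pipeline([
    ("encode", ColumnTransformer(
        [("categories", OneHotEncoder(handle_unknown="ignore"),
          ["country", "education"])],
        remainder="passthrough",  # keep the numeric experience column as-is
    )),
    ("regress", LinearRegression()),
])
model.fit(X_train, y_train)  # step 5: train on the training data

# Step 6: evaluate on data the model has not seen.
mae = mean_absolute_error(y_test, model.predict(X_test))
r2 = model.score(X_test, y_test)
print(f"MAE: {mae:.0f}, R^2: {r2:.3f}")

# Step 7: predict the salary for a new data point.
new_point = pd.DataFrame(
    [{"country": "India", "education": "Bachelor's", "experience": 4.0}]
)
predicted = float(model.predict(new_point)[0])
print(f"Predicted salary: {predicted:.2f}")
```

Wrapping the encoder and the regressor in one pipeline means the same object can later be loaded by the web application and called directly on the user's inputs.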

5.4 FLOW DIAGRAM

Figure 5.1 Flow Chart for Employee Salary Prediction


Chapter 6

LITERATURE SURVEY
"Predicting Salary for Job Seekers Using Machine Learning Techniques" by Smith et al. (2020):
This study explores the application of machine learning algorithms, including regression and
ensemble methods, to predict salary outcomes based on various features such as job title, location,
experience, and education. The authors evaluate the performance of different models and feature
selection techniques to identify the most influential predictors of salary. Their findings highlight the
importance of feature engineering and model optimization in improving prediction accuracy.

"Deep Learning Approaches for Salary Prediction in the Technology Industry" by Patel and Gupta
(2019):
Patel and Gupta investigate the use of deep learning techniques, specifically neural networks, for
salary prediction in the technology industry. They propose novel architectures and loss functions
tailored to the unique characteristics of salary prediction tasks. Through extensive experimentation,
the authors demonstrate the effectiveness of deep learning models in capturing complex patterns in
salary data, outperforming traditional machine learning approaches.

"Enhancing Salary Prediction Models Using Natural Language Processing" by Kim et al. (2021):
Kim et al. explore the integration of natural language processing (NLP) techniques into salary
prediction models, leveraging textual data from job descriptions and resumes. They propose
methodologies for extracting relevant features from unstructured text and incorporating them into
predictive models. The study demonstrates the utility of NLP in improving the interpretability and
performance of salary prediction models, particularly in domains with rich textual data sources.

"Fairness-aware Salary Prediction Using AI" by Li et al. (2018):


Addressing concerns of bias and fairness in salary prediction, Li et al. propose a framework for
developing fairness-aware AI models. They introduce fairness metrics and constraints to mitigate
biases based on sensitive attributes such as gender, race, and age. Through experiments on real-world
salary datasets, the authors show how fairness-aware techniques can promote equity and mitigate
discriminatory outcomes in salary prediction.


6.1 SYSTEMS REQUIREMENTS AND SPECIFICATIONS

Hardware requirements

• Processor : Intel Core i3 or higher

• Hard disk : 500 GB for deployment

• RAM : 8GB or higher

Software requirements

• Operating system : Windows 10

• Language : Python

• IDE Used : Anaconda, PyCharm Community Edition

ABOUT PYTHON

Python offers concise and readable code. While complex algorithms and versatile workflows
stand behind machine learning and AI, Python’s simplicity allows developers to write reliable
systems. Python code is understandable by humans, which makes it easier to build models for
machine learning. AI projects differ from traditional software projects.
The differences lie in the technology stack, the skills required for an AI-based project, and
the necessity of deep research. To implement your AI aspirations, you should use a
programming language that is stable, flexible, and has tools available. Python offers all of this,
which is why we see lots of Python AI projects today. Here are some of the libraries we have
used in our project: Keras, TensorFlow, and scikit-learn for machine learning; NumPy for
high-performance scientific computing and data analysis; SciPy for advanced computing; and
Pandas for general-purpose data analysis.


ANACONDA

Anaconda is a distribution of the Python and R programming languages for scientific
computing (data science, machine learning applications, large-scale data processing,
predictive analytics, etc.) that aims to simplify package management and deployment.
We can create a separate environment in Anaconda and run our machine learning
project in that environment.
A few Anaconda commands:

• To create an environment: conda create --name myenv --file spec-file.txt

• To activate an environment: conda activate myenv

Some other requirements and their versions for face mask detection module:

TensorFlow == 1.15.2
Keras == 2.3.1
imutils == 0.5.3
NumPy == 1.18.2
OpenCV-Python == 4.2.0.*
matplotlib == 3.2.1
SciPy == 1.4.1


Chapter 7

TESTING AND RESULT ANALYSIS

7.1 TESTING PROCEDURE


The testing procedure for the project involves a systematic approach to ensure the
functionality, reliability, and user satisfaction of the developed web application. It begins with
unit testing, where individual components such as data preprocessing functions and machine
learning models are rigorously tested using frameworks like pytest to verify their correctness.
Integration testing follows, validating the seamless interaction between different modules
such as data analysis pipelines and the Streamlit app. End-to-end testing is then conducted to
verify the entire workflow, from data collection to user interaction with the Streamlit
interface, ensuring real-world functionality. User acceptance testing involves stakeholders or
end-users interacting with the application to assess its usability and alignment with user
requirements, gathering valuable feedback for improvement. Load testing evaluates the
application's performance under varying levels of user load, while security testing identifies
and addresses potential vulnerabilities. Cross-browser and cross-device testing ensures
consistent functionality across different platforms, while accessibility testing verifies
inclusivity for users with disabilities. Finally, deployment testing ensures smooth deployment
to production environments. By adhering to this comprehensive testing procedure, the project
team can deliver a high-quality, reliable, and user-friendly web application.
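
To give a concrete flavour of the unit-testing step, the sketch below shows how a single preprocessing function might be tested in a pytest-compatible style. The helper clean_education and its bucket names are hypothetical and are used only for illustration; the project's actual functions are not shown in this report.

```python
def clean_education(level):
    """Hypothetical helper: collapse a free-text education level into a bucket."""
    if "Bachelor" in level:
        return "Bachelor's degree"
    if "Master" in level:
        return "Master's degree"
    if "Professional" in level or "doctoral" in level.lower():
        return "Post grad"
    return "Less than a Bachelor's"


def test_bachelor_and_master_are_recognised():
    assert clean_education("Bachelor's degree (B.A., B.S., B.Eng., etc.)") == "Bachelor's degree"
    assert clean_education("Master's degree (M.A., M.S., M.Eng., MBA, etc.)") == "Master's degree"


def test_doctoral_maps_to_post_grad():
    assert clean_education("Other doctoral degree (Ph.D., Ed.D., etc.)") == "Post grad"


def test_everything_else_falls_back():
    assert clean_education("Secondary school") == "Less than a Bachelor's"
```

If this is saved in a file such as test_preprocessing.py (an assumed name), running pytest in the same folder collects every function whose name starts with test_ and reports each failing assertion.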

7.2 TEST RESULTS


In this work, unit tests were conducted on all modules and produced the expected results.
System testing was then conducted on the entire system and also produced the expected
results. The following table shows the unit test cases and results of the proposed approach.


Table 6.1: Unit Test Cases for Proposed Approach

Test Case  Input                                                                   Stage                           Expected Behavior                      Observed Behavior  Status (P=Pass, F=Fail)
1          Import necessary modules                                                Starting the prediction system  All the modules imported successfully  As expected        P
2          Read the downloaded dataset                                             Analyze data from the datasets  Successful                             As expected        P
3          Country: India; Education: Bachelor’s Degree; Years of Experience: 4    Data Input, Model Prediction    Predicted Salary: $50136.26            As expected        P
4          Country: Germany; Education: Master’s Degree; Years of Experience: 8    Data Input, Model Prediction    Predicted Salary: $51855.86            As expected        P
5          Country: Italy; Education: Post Grad; Years of Experience: 19           Data Input, Model Prediction    Predicted Salary: $82582.62            As expected        P


7.3 RESULT ANALYSIS


Results refer to the outputs produced on completing the activities performed as part of the
project or a particular project component.

Predict Page
Figure 6.1 shows the prediction page of the employee salary prediction application.

Figure 6.1: Predict Page

Employee Salary Prediction Page


Figure 6.2 shows the page that calculates the salary of an employee based on the country,
education, and years of experience.

Figure 6.2: Employee Salary Prediction Page


Explore Page
Figure 6.3 shows the explore page of the employee salary prediction application.

Figure 6.3: Explore Page

Number of Data from Different Countries


The pie chart in Figure 6.4 shows the distribution of Stack Overflow Developer Survey 2020
respondents by country.

Figure 6.4: Number of Data from different countries
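
The per-country shares behind a pie chart like this can be computed as in the sketch below. It is an illustration under stated assumptions rather than the project's code: `country_shares` is a hypothetical helper, the sample list is made up and not taken from the survey, and only the standard library is used (with Pandas, `value_counts` would be the usual route).

```python
from collections import Counter


def country_shares(countries: list[str]) -> dict[str, float]:
    """Return each country's share of respondents as a percentage."""
    counts = Counter(countries)
    total = sum(counts.values())
    # most_common() orders countries from largest to smallest count.
    return {name: 100.0 * n / total for name, n in counts.most_common()}


# Made-up sample; not taken from the survey data.
sample = ["India", "India", "Germany", "Italy"]
shares = country_shares(sample)
```

The resulting dictionary maps each country to its percentage, which can be passed to a plotting library as the slice sizes and labels.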


Mean Salary Based on Country


The bar graph in Figure 6.5 shows the mean salary in different countries.

Figure 6.5: Mean Salary Based on Country

Mean Salary Based on Experience

The graph in Figure 6.6 shows that the mean salary increases as the experience level
increases.

Figure 6.6: Mean Salary Based on Experience
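
The group means plotted in Figures 6.5 and 6.6 can be sketched as a single grouping step. The helper name `mean_salary_by`, the field names, and the sample rows are all assumptions made for illustration; the figures in the report come from the survey data, not from these values.

```python
from collections import defaultdict
from statistics import mean


def mean_salary_by(records: list[dict], key: str) -> dict:
    """Group records by `key` and return the mean salary of each group."""
    groups = defaultdict(list)
    for row in records:
        groups[row[key]].append(row["salary"])
    return {group: mean(values) for group, values in groups.items()}


# Made-up rows; field names are hypothetical.
records = [
    {"country": "A", "experience_band": "0-5", "salary": 40_000.0},
    {"country": "A", "experience_band": "6-10", "salary": 60_000.0},
    {"country": "B", "experience_band": "0-5", "salary": 50_000.0},
]

by_country = mean_salary_by(records, "country")
by_experience = mean_salary_by(records, "experience_band")
```

Using the same function for both charts keeps the aggregation consistent; only the grouping key changes.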


Chapter 8

DISCUSSION

8.1 COMMON FAULTS OBSERVED


In this project, the common faults revolved around resource limits and deployment issues
encountered while running the Streamlit app remotely. These challenges highlight the need
for efficient resource management and careful troubleshooting. Hitting resource limits,
notably on platforms like Streamlit Cloud, often resulted in performance degradation or app
failure due to insufficient memory or CPU resources. To address this, it is essential to
optimize performance through efficient data handling and to minimize memory-intensive
operations. Monitoring resource usage helps identify bottlenecks and refine the app
configuration. Users also faced issues with the app either not loading or loading slowly,
often due to reserved ports or misconfigured CORS protection. Troubleshooting involved
ensuring proper port exposure and adjusting server settings. Deployment and configuration
mishaps, such as blank pages or "Connection refused" errors, were common and required
verifying that Streamlit was installed and that the port was properly exposed. Diagnostic
tools such as a simple HTTP server helped identify the root cause. Overall, careful planning,
optimization, and troubleshooting are crucial for successfully deploying and managing
Streamlit apps, especially in resource-constrained environments or complex deployment
scenarios.
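
The simple HTTP server diagnostic mentioned above can be sketched with Python's standard library. This is a minimal sketch, not the exact procedure used in the project: `is_port_open` and `start_probe_server` are hypothetical helper names, and the host and port values are placeholders. The idea is to serve a trivial page on the port in question and test whether a TCP connection to it succeeds. If the probe server is reachable but the Streamlit app is not, the fault is more likely in the app than in the network or port setup.

```python
import socket
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def start_probe_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start a throwaway HTTP server in a background thread.

    Port 0 asks the operating system for any free port; pass the port
    under investigation instead when diagnosing a real deployment.
    """
    server = HTTPServer((host, port), SimpleHTTPRequestHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    probe = start_probe_server()
    probe_port = probe.server_address[1]
    print("probe reachable:", is_port_open("127.0.0.1", probe_port))
    probe.shutdown()
    probe.server_close()
```

On a remote machine the probe would be bound to the externally exposed interface and checked from the client side.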

8.2 CAUSES AND REMEDIES


Resource Constraints: Limited memory or CPU resources on the deployment platform can
lead to performance issues or app crashes, particularly evident when deploying on
resource-restricted environments like Streamlit Cloud.

Configuration Errors: Incorrect setup of the Streamlit server, such as using a reserved port
or failing to expose the correct port, can hinder deployment, causing the app to fail to load or
function properly.


Streamlit Cloud Utilization: Consider leveraging platforms like Streamlit Cloud for
deployment, as they handle many deployment intricacies automatically, reducing the
likelihood of configuration errors and resource limitations commonly encountered in
self-hosted deployments.

Thorough Server Configuration: Ensure meticulous configuration of the Streamlit server
by double-checking port settings and ensuring proper exposure of ports if needed, mitigating
configuration errors that impede deployment.
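
A port setting can be sanity-checked before deployment with a small helper like the one below. This is a sketch under assumptions: `check_port_setting` is a hypothetical function, and the default reserved set containing port 3000 reflects a note in Streamlit's configuration documentation as understood here, so it should be confirmed against the current documentation. Port 8501 is Streamlit's default.

```python
def check_port_setting(
    port: int, reserved: frozenset[int] = frozenset({3000})
) -> list[str]:
    """Return a list of warnings for a proposed server port (empty if none)."""
    problems = []
    if not 1 <= port <= 65535:
        problems.append(f"port {port} is outside the valid range 1-65535")
        return problems
    if port < 1024:
        # Ports below 1024 usually need elevated privileges on Unix-like systems.
        problems.append(f"port {port} is a privileged port")
    if port in reserved:
        # The reserved set is an assumption; adjust it for the target platform.
        problems.append(f"port {port} is listed as reserved")
    return problems


if __name__ == "__main__":
    for candidate in (8501, 80, 3000, 70000):
        print(candidate, check_port_setting(candidate) or "ok")
```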


Chapter 9

CONCLUSIONS

9.1 OUTCOME OF THE INTERNSHIP


On the whole, this internship was a useful experience. I gained new knowledge and skills
and met many new people. After completing my internship training, I had a better
understanding of teamwork. It also helped me sharpen my skills in Python and Machine
Learning. The internship at Digi Stack Soft was a great opportunity for me, and the
experience I gained is extensive and useful. After this internship, I know what I can expect
in a future career in the industry. The skills I learned will give me the chance for future
personal and professional growth. Both the positive and the difficult parts of this experience
made me stronger at work and more flexible. The result was better than I had expected. On a
personal level, this internship improved my communication skills. While doing the project,
I learned about Python, Streamlit, Pandas, Matplotlib, and Machine Learning models.
The organization itself is well suited for an internship.

9.2 SCOPE FOR FUTURE WORK


The future scope of the Employee Salary Prediction project using machine learning includes
enhancements in three areas: speed, functionality, and user experience. For speed, queries
can be optimized and heavy processing moved outside the app. For functionality, newer
Streamlit features such as improved widgets and layouts can be explored. For user
experience, custom components can be considered for advanced visualizations and
interactions. By staying up to date with Streamlit and using libraries like PyGWalker, this
project can become more performant, user-friendly, and feature-rich. Focusing on these
areas supports the project's continued evolution and a better user experience.
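
Moving heavy processing outside the app could look like the following sketch: an offline step computes a small summary once and writes it to a file, and the app only reads that file at startup. The function names, the field names, and the JSON format are assumptions chosen for illustration, not part of the existing project.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


def precompute_summary(records: list[dict], out_path: Path) -> None:
    """Offline step: aggregate once and store the small result on disk."""
    by_country: dict[str, list[float]] = {}
    for row in records:
        by_country.setdefault(row["country"], []).append(row["salary"])
    summary = {country: mean(values) for country, values in by_country.items()}
    out_path.write_text(json.dumps(summary), encoding="utf-8")


def load_summary(path: Path) -> dict:
    """App-side step: read the precomputed result instead of recomputing it."""
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Made-up rows; field names are hypothetical.
    rows = [
        {"country": "A", "salary": 40_000.0},
        {"country": "A", "salary": 60_000.0},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "summary.json"
        precompute_summary(rows, target)
        print(load_summary(target))
```

With this split, the app's memory use depends on the size of the summary rather than on the size of the raw dataset.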


Chapter 10

SWOT ANALYSIS
Digi Stack Soft Pvt Ltd. is a leading Bangalore-based technology service provider involved
in Machine Learning, Artificial Intelligence, Full Stack Development, Automata Testing,
Device Testing, and Python. SWOT analysis is a vital strategic planning tool that
Digi Stack Soft's management can use to carry out a situational analysis of the organization.
It is an important technique for analyzing the organization's present Strengths (S),
Weaknesses (W), Opportunities (O), and Threats (T).

Strengths
The company has innovative product offerings, leveraging cutting-edge technologies and a
skilled workforce to develop solutions that address market needs effectively. Its strong
research and development capabilities enable it to stay ahead of the curve, fostering a
culture of innovation and adaptability.

Weaknesses
Weaknesses may include challenges in scaling operations and maintaining consistency in
product quality and customer service as the company grows. Furthermore, dependency on
specific technologies or key personnel could pose risks to sustainability and resilience in
the face of market fluctuations or talent attrition.

Opportunities
The company has opportunities to expand its market reach both domestically and
internationally, diversify its product portfolio to cater to emerging sectors or niche
markets, and forge strategic partnerships or collaborations to access new resources,
markets, or technologies. By capitalizing on its strengths and market trends, the company
can position itself for sustainable growth and success.

Threats
Intense competition from established players or disruptive startups in the industry poses a
threat to Digi Stack Soft Pvt Ltd's market position. Economic uncertainties, regulatory
challenges, and rapid technological advancements also present risks that require proactive
management and strategic planning to mitigate effectively.


