Internship Report
At
On
Bachelor of Engineering
in
Computer Science and Engineering
Submitted by
Amol M 4GL20CS001
Certificate
This is to certify that Mr. Amol M (4GL20CS001) has successfully completed an internship
at “DIGI STACK, BANGALORE” under the guidance of Dr. Ranganatha S & Prof.
Radhika B towards the partial fulfilment of the requirements for the award of the degree of Bachelor of
Engineering in Computer Science and Engineering of Visvesvaraya Technological University, Belgaum,
during the year 2023-24. It is certified that all the corrections/suggestions indicated for internal
assessment have been incorporated in the report. The Internship Report has been approved as it
satisfies the academic requirements in respect of internship work prescribed for the Bachelor of
Engineering degree.
Evaluators
1.
2.
ACKNOWLEDGEMENT
I gratefully acknowledge the enthusiastic support of my guides and internship coordinators,
Dr. Ranganatha S & Prof. Radhika B, for their ideas and for the co-operation shown to me
during this venture, which made this Internship Seminar a great success.
I am very thankful to Dr. Ranganatha S, HOD, Department of Computer Science, for his
co-operation and encouragement at every stage of my work.
I am also thankful to the principal, Dr. Sathish N S, for being kind enough to provide me an
opportunity to work on an internship seminar in this institution.
I would also like to thank my parents and well-wishers as well as my dear classmates for
their guidance and their kind co-operation.
Finally, it is my pleasure to acknowledge the friendly co-operation shown by all the staff
members of the Computer Science Department, GEC Kushalnagara.
Amol M (4GL20CS001)
ABSTRACT
In today's competitive job market, accurately predicting salaries is crucial for both job seekers and
employers. This study explores various factors influencing salary prediction models, utilizing
machine learning techniques to analyze large datasets containing information such as job title,
location, experience, education, and industry. Through feature engineering and model optimization,
we aim to develop robust prediction models capable of providing accurate salary estimates.
Leveraging algorithms such as regression, decision trees, and ensemble methods, our study
identifies significant predictors and their respective impacts on salary outcomes. The findings of
this research offer valuable insights into salary determinants, empowering job seekers to negotiate
better compensation packages and aiding employers in making informed hiring and salary
decisions. By harnessing the power of data-driven approaches, this study contributes to enhancing
transparency and fairness in salary negotiations, ultimately fostering a more equitable job market
for all stakeholders.
TABLE OF CONTENTS
1 EXECUTIVE SUMMARY
4 TASK PERFORMED
4.1 Information About the Project Done
5 TRAINING OUTLINE
6 LITERATURE SURVEY
7 TESTING AND RESULT ANALYSIS
8 DISCUSSION
9 CONCLUSIONS
10 SWOT ANALYSIS
11 REFERENCES
LIST OF FIGURES
Chapter 1
EXECUTIVE SUMMARY
This report describes the work completed during my internship with Digi Stack Soft Systems
Pvt Ltd from August 16th to September 16th, 2023, where I delved into practical
applications of Python, Machine Learning, and Data Analysis. With a clear focus on
equipping me with real-world problem-solving skills, the internship aimed to cultivate
expertise in addressing contemporary challenges in these domains.
A pivotal project during the internship involved building an "Employee Salary Prediction"
model. Using Python with the Anaconda distribution, Jupyter Notebook, and libraries such as
NumPy, Pandas, Matplotlib, and Seaborn, I preprocessed, analyzed, and visualized data.
This enabled me to train and evaluate machine learning models that estimate employee
salaries from factors including experience, education, job role, and geographical
location.
Overall, my tenure at Digi Stack Soft Systems Pvt Ltd proved to be an enriching and
transformative experience, facilitating substantial personal and professional growth. It
provided an invaluable platform for skill development, offered exposure to authentic
business complexities, and markedly contributed to my ongoing academic and professional
journey in the vibrant field of data science and machine learning.
Chapter 2
In the dynamic landscape of technology and business, staying ahead requires more than just
knowledge—it demands innovation, expertise, and a commitment to excellence. This ethos
forms the foundation of Digi Stack Soft, India's leading training institute, where every
interaction is a gateway to transformation.
At the heart of Digi Stack Soft lies a revolutionary approach to learning: one-to-one trainer
mapping and the pioneering Batch 30 Concept. Here, students aren't just participants; they are
partners in their own journey of growth. Each student is matched with a dedicated trainer,
ensuring personalized attention and tailored guidance. The Batch 30 Concept, a groundbreaking
methodology, fosters collaboration and peer learning, creating a vibrant ecosystem where ideas
flourish and boundaries are pushed.
Digi Stack Soft transcends being merely a training institute; it fosters a community of
excellence where industry insiders serve as mentors, sharing invaluable insights and real-world
experiences. This symbiotic relationship between academia and industry enriches learning and
ensures students acquire the skills demanded by today's competitive marketplace. With a
remarkable 90% placement rate, Digi Stack Soft is committed to success, offering innovative
software solutions that redefine business growth. Whether through Software Training, Project
Guidance, IT Consulting, or immersive Technology Workshops, the institute focuses on
empowering individuals and organizations to thrive. It serves as a catalyst for change and
innovation, inviting individuals to join a transformative journey where success knows no
bounds.
Website: https://www.Digi Stack soft.com
2.3 FOUNDER
Established in December 2022, Digi Stack Soft Systems Pvt Ltd, located in Nagarbhavi,
Bangalore, was founded by Madhu Raju along with co-founders Charles Jensen, Francis
Miller, David Fontaine, and Paul Gillian. Since its inception, the institute has swiftly emerged
as a distinguished player in the realm of Computer Software Training Institutes in the city.
Renowned for its comprehensive training programs, Digi Stack Soft Systems Pvt Ltd has
become a trusted destination for individuals seeking to enhance their skills in software
development and related fields. Catering to the needs of both local residents and individuals
from various parts of Bangalore, the institute functions as a one-stop destination for all
software training requirements. With a commitment to customer satisfaction and excellence in
education, Digi Stack Soft Systems Pvt Ltd offers quality training programs tailored to meet
industry demands. Under the visionary leadership of Madhu Raju and the co-founders, the
institute provides a conducive learning environment and expert guidance to help individuals
achieve their goals in the dynamic field of computer software.
Mission: “Is to give a platform for the IT-Job aspiring youth through its pioneering ‘Digi
Stack Soft Program’ alongside assimilating the ever-changing HR needs of India’s
booming Software industry with a pipeline of jobs”.
Chapter 3
3.2 OBJECTIVES
The objective of an Employee Salary Prediction system is to build a predictive model that estimates
employee salaries based on relevant features such as education, experience, job role, and
location. The primary objectives of such a system include:
• Developing a Predictive Model: Create a model that accurately estimates employee salaries
based on key features such as education, experience, job role, and location.
• Optimizing Model Performance: Enhance model accuracy and generalization by selecting
appropriate algorithms, performing feature engineering, and fine-tuning hyperparameters.
• Providing Interpretable Insights: Ensure the model's predictions are interpretable, allowing
stakeholders to understand the factors influencing salary determinants and facilitating
informed decision-making in workforce management.
• Enabling Seamless Deployment: Deploy the trained model effectively for real-time
predictions, integrating it into existing systems to support timely and actionable insights
for salary planning and resource allocation.
Chapter 4
TASK PERFORMED
21/08/2023 Mon
Introduction to Scikit-learn.
29/08/2023 Tue
Data Preprocessing with Scikit-learn.
30/08/2023 Wed
14/09/2023 Thu
Chapter 5
TRAINING OUTLINE
Jupyter Notebook
Jupyter Notebook is an open-source interactive web-based application for creating and sharing
documents containing live code, equations, visualizations, and narrative text. It supports
multiple programming languages such as Python, R, and Julia, allowing for data analysis,
statistical modeling, and machine learning tasks. With its user-friendly interface, Jupyter
Notebook promotes collaborative work and facilitates reproducible research by combining
code execution with rich text formatting. It is widely used in academia, industry, and research
for its versatility, ease of use, and integration with various data science libraries and tools.
1. Data Pre-processing: In this stage, the raw data is prepared for the machine learning
model. This may involve handling missing values, and outliers, and scaling the features.
2. Feature Selection: This step involves choosing the most relevant features that will
influence the target variable in the machine learning model.
3. Linear Regression Algorithm: This is the core of the process where the actual linear
regression model is trained.
4. Splitting Data into Training and Testing Sets: The data is divided into two sets: training
data and testing data. The training data is used to fit the model, and the testing data is used
to evaluate the model’s performance.
5. Training Data: The training data is fed into the linear regression model, and the model
learns the underlying relationship between the features and the target variable.
6. Testing Data: The testing data is used to assess how well the trained model performs on
unseen data.
7. Prediction: After the model is trained, it can be used to predict the target variable for new
data points.
8. Web Application Development: Develop an interactive web application using Streamlit,
a Python library for building data-driven web applications. Users can input their country,
education level, and years of experience to receive a predicted salary. Additionally, include
an explore page where users can interact with the dataset to gain insights.
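Steps 1 to 7 above can be illustrated with a short scikit-learn sketch. This is a minimal example under stated assumptions, not the code used during the internship: the data is synthetic, and the column names (Country, EdLevel, YearsExperience, Salary) and the choice of one-hot encoding and standard scaling are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical synthetic data standing in for the real dataset.
rng = np.random.default_rng(0)
n = 200
years = rng.integers(0, 25, size=n)
df = pd.DataFrame({
    "Country": rng.choice(["Germany", "India", "Italy"], size=n),
    "EdLevel": rng.choice(["Bachelor's degree", "Master's degree", "Post grad"], size=n),
    "YearsExperience": years,
    "Salary": 30000 + 2000 * years + rng.normal(0, 3000, size=n),
})

# Step 1: pre-processing (a no-op here, since the synthetic data has no gaps).
df = df.dropna()

# Step 2: feature selection.
X = df[["Country", "EdLevel", "YearsExperience"]]
y = df["Salary"]

# Step 4: split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Steps 3 and 5: encode and scale the features, then fit a linear regression.
preprocess = ColumnTransformer([
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["Country", "EdLevel"]),
    ("numeric", StandardScaler(), ["YearsExperience"]),
])
model = Pipeline([("preprocess", preprocess), ("regress", LinearRegression())])
model.fit(X_train, y_train)

# Step 6: evaluate on unseen data (R^2 score).
r2 = model.score(X_test, y_test)
print(f"R^2 on the test set: {r2:.3f}")

# Step 7: predict the salary for a new data point.
new_employee = pd.DataFrame(
    [{"Country": "Germany", "EdLevel": "Master's degree", "YearsExperience": 8}]
)
predicted = model.predict(new_employee)[0]
print(f"Predicted salary: ${predicted:.2f}")
```

Wrapping the encoders and the regressor in one Pipeline means the same transformations are applied at training and at prediction time, which helps avoid mismatched inputs when the model is later called from a web application.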
Chapter 6
LITERATURE SURVEY
"Predicting Salary for Job Seekers Using Machine Learning Techniques" by Smith et al. (2020):
This study explores the application of machine learning algorithms, including regression and
ensemble methods, to predict salary outcomes based on various features such as job title, location,
experience, and education. The authors evaluate the performance of different models and feature
selection techniques to identify the most influential predictors of salary. Their findings highlight the
importance of feature engineering and model optimization in improving prediction accuracy.
"Deep Learning Approaches for Salary Prediction in the Technology Industry" by Patel and Gupta
(2019):
Patel and Gupta investigate the use of deep learning techniques, specifically neural networks, for
salary prediction in the technology industry. They propose novel architectures and loss functions
tailored to the unique characteristics of salary prediction tasks. Through extensive experimentation,
the authors demonstrate the effectiveness of deep learning models in capturing complex patterns in
salary data, outperforming traditional machine learning approaches.
"Enhancing Salary Prediction Models Using Natural Language Processing" by Kim et al. (2021):
Kim et al. explore the integration of natural language processing (NLP) techniques into salary
prediction models, leveraging textual data from job descriptions and resumes. They propose
methodologies for extracting relevant features from unstructured text and incorporating them into
predictive models. The study demonstrates the utility of NLP in improving the interpretability and
performance of salary prediction models, particularly in domains with rich textual data sources.
Hardware requirements
Software requirements
• Language : Python
ABOUT PYTHON
Python offers concise and readable code. While complex algorithms and versatile workflows stand
behind machine learning and AI, Python’s simplicity allows developers to write reliable
systems. Python code is easy for humans to understand, which makes it easier to build models for
machine learning. AI projects differ from traditional software projects.
The differences lie in the technology stack, the skills required for an AI-based project, and
the necessity of deep research. To implement AI aspirations, one should use a
programming language that is stable, flexible, and has tools available. Python offers all of this,
which is why we see many Python AI projects today. Here are some of the libraries we have used in
our project: Keras, TensorFlow, and scikit-learn for machine learning; NumPy for
high-performance scientific computing and data analysis; SciPy for advanced computing; and Pandas for
general-purpose data analysis.
ANACONDA
Some other requirements and their versions for face mask detection module:
TensorFlow == 1.15.2
Keras == 2.3.1
imutils == 0.5.3
NumPy == 1.18.2
OpenCV-Python == 4.2.0.*
matplotlib == 3.2.1
SciPy == 1.4.1
Chapter 7
Test case 4: Data Input, Model Prediction
Input: Country: Germany; Education: Master’s Degree; Years of Experience: 8
Predicted Salary: $51855.86
Result: As expected, P

Test case 5: Data Input, Model Prediction
Input: Country: Italy; Education: Post Grad; Years of Experience: 19
Predicted Salary: $82582.62
Result: As expected, P
Predict Page
Figure 6.1 shows the prediction page of the Employee Salary Prediction application.
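A prediction page of this kind could be sketched in Streamlit roughly as follows. This is an illustrative outline, not the application's actual source: the widget labels, the `show_predict_page` and `format_salary` helpers, and the `predict_fn` callable (any function that maps country, education level, and years of experience to a salary) are hypothetical names introduced here.

```python
from typing import Callable, Sequence


def format_salary(value: float) -> str:
    """Format a predicted salary the way the test cases display it."""
    return f"${value:.2f}"


def show_predict_page(
    predict_fn: Callable[[str, str, int], float],
    countries: Sequence[str],
    education_levels: Sequence[str],
) -> None:
    """Render the input widgets and show a prediction when the button is pressed."""
    # Imported lazily so format_salary can be reused without Streamlit installed.
    import streamlit as st

    st.title("Employee Salary Prediction")
    st.write("Enter the details below to estimate a salary.")

    # The three inputs described in the training outline.
    country = st.selectbox("Country", countries)
    education = st.selectbox("Education Level", education_levels)
    years = st.slider("Years of Experience", 0, 50, 3)

    # Only call the model once the user asks for a prediction.
    if st.button("Calculate Salary"):
        salary = predict_fn(country, education, years)
        st.subheader(f"The estimated salary is {format_salary(salary)}")
```

To try it, a script such as app.py (a hypothetical file name) would call `show_predict_page` with a trained model's prediction function and be launched with `streamlit run app.py`.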
Explore Page
Figure 6.3 shows the explore page of the Employee Salary Prediction application.
Figure 6.6 shows that the mean salary increases as the experience
level increases.
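The aggregation behind a chart like this can be sketched with a pandas group-by. The rows below are made-up illustrative values, not the project's dataset, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical sample rows; the real explore page would use the full dataset.
df = pd.DataFrame({
    "YearsExperience": [1, 1, 5, 5, 10, 10],
    "Salary": [30000, 34000, 45000, 47000, 60000, 66000],
})

# Mean salary for each experience level, ordered by experience.
mean_by_experience = (
    df.groupby("YearsExperience")["Salary"].mean().sort_index()
)
print(mean_by_experience)
```

The resulting Series can be passed to a plotting library such as Matplotlib to draw the line chart.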
Chapter 8
DISCUSSION
Configuration Errors: Incorrect setup of the Streamlit server, such as using a reserved port
or failing to expose the correct port, can hinder deployment, causing the app to fail to load or
function properly.
Managed Deployment: Consider leveraging platforms like Streamlit Cloud, as they handle many
deployment intricacies automatically, reducing the likelihood of configuration errors and
resource limitations commonly encountered in self-hosted deployments.
Thorough Server Configuration: Ensure meticulous configuration of the Streamlit server
by double-checking port settings and ensuring proper exposure of ports if needed, mitigating
configuration errors that impede deployment.
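A port can be checked before launching the server with a small standard-library sketch like the one below. The helper names are hypothetical. The sketch assumes a Unix-like host, where ports below 1024 normally require elevated privileges, and uses 8501, Streamlit's default port.

```python
import socket

STREAMLIT_DEFAULT_PORT = 8501  # Streamlit's default server port


def is_privileged_port(port: int) -> bool:
    """Ports below 1024 usually need elevated privileges on Unix-like systems."""
    return 0 < port < 1024


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # The port is already in use or binding is not permitted.
            return False
        return True


if is_privileged_port(STREAMLIT_DEFAULT_PORT):
    print("Port is in the privileged range; choose a port of 1024 or above.")
elif not is_port_free(STREAMLIT_DEFAULT_PORT):
    print("Port is already in use; choose another port.")
else:
    print("Port looks usable.")
```

The chosen port can then be passed when starting the app, for example `streamlit run app.py --server.port 8501`, where app.py is a placeholder script name.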
Chapter 9
CONCLUSIONS
Chapter 10
SWOT ANALYSIS
Digi Stack Soft Pvt Ltd. is a Bangalore-based leading Technology Service Provider involved
in Machine Learning, Artificial Intelligence, Full stack Development, Automata Testing,
Device Testing, and Python. SWOT analysis is a vital strategic planning tool that can be
used by Digi Stack Soft's managers to carry out a situational analysis of the organization. It is an
important technique for analyzing the present Strengths (S), Weaknesses (W), Opportunities
(O), and Threats (T).
Strengths
The company has innovative product offerings, leveraging cutting-edge technologies and a skilled
workforce to develop solutions that address market needs effectively. The company's strong
research and development capabilities enable it to stay ahead of the curve, fostering a
culture of innovation and adaptability.
Weakness
Weaknesses may include challenges in scaling operations and maintaining consistency in product
quality and customer service as the company grows. Furthermore, dependency on specific
technologies or key personnel could pose risks to sustainability and resilience in the face
of market fluctuations or talent attrition.
Opportunities
The company has opportunities to expand its market reach both domestically and internationally,
diversify its product portfolio to cater to emerging sectors or niche markets, and forge
strategic partnerships or collaborations to access new resources, markets, or technologies.
By capitalizing on its strengths and market trends, the company can position itself for
sustainable growth and success.
Threats
Intense competition from established players or disruptive startups in the industry poses a
threat to Digi Stack Soft Pvt Ltd's market position. Economic uncertainties, regulatory
challenges, and rapid technological advancements also present risks that require proactive
management and strategic planning to mitigate effectively.