Career Recommendation
Table of contents
Abstract
1. Chapter 1: Introduction
○ 1.1 Background
○ 1.2 Problem Statement
○ 1.3 Objectives
○ 1.4 Scope of the Project
2. Chapter 2: Literature Survey
○ 2.1 Overview
○ 2.2 Traditional vs AI Approaches
○ 2.3 Computer Vision and Deep Learning in Agriculture
○ 2.4 Related Work and Research
3. Chapter 3: System Analysis
○ 3.1 Existing System
○ 3.2 Proposed System
○ 3.3 Feasibility Study and Requirements Analysis
■ 3.3.1 Functional Requirements
■ 3.3.2 Non-Functional Requirements
4. Chapter 4: System Design
○ 4.1 System Architecture
○ 4.2 Use Case Diagram
○ 4.3 Data Flow Diagram and Database Design
○ 4.4 UI Design
5. Chapter 5: Implementation
○ 5.1 Technology Stack
○ 5.2 Dataset and Preprocessing
○ 5.3 Django Web Interface
○ 5.4 Image Upload and Live Prediction
1.3 Objectives
The primary objectives of the project are:
● To develop an AI-based system that analyzes users’ academic data, interests, and
skills to recommend suitable career paths.
● To design and implement a user-friendly web application for seamless interaction.
● To utilize Machine Learning models trained on career datasets for accurate
predictions.
● To provide detailed insights into recommended career paths, including necessary
qualifications and potential growth opportunities.
This chapter outlines the major components of the system, describing how each module contributes to the
overall functionality. The system architecture is modular, allowing for future expansion and easy
maintenance. The four primary components are: Frontend, Backend, Database, and Machine Learning
Service.
Frontend
The user interface is built using Django templates and styled with Tailwind CSS, a utility-first CSS
framework that allows for rapid design and consistent aesthetics across pages. The frontend is designed to
be clean, responsive, and accessible, ensuring usability on both desktop and mobile devices.
Backend
The backend is developed in Django 4.x, a high-level Python web framework known for its scalability,
security, and built-in ORM. It handles:
● Routing and Views: URL mappings are created for all major functionalities such as data entry,
prediction, and dashboard access. Views control the logic for rendering templates or processing
prediction requests.
● Authentication and Authorization: Built-in Django auth is used for login, logout, and password
protection. Users are grouped into roles using Django's Group model, enabling role-based access
control.
● Model Integration: The machine learning model is integrated into Django as a Python module. It
is invoked during form processing to generate real-time predictions.
● Data Validation and Security: The backend ensures input validation, uses CSRF protection on
forms, and handles exceptions gracefully to prevent system crashes.
The architecture follows the Model-View-Template (MVT) design pattern, which cleanly separates data,
logic, and presentation layers.
Database
The system uses PostgreSQL as its primary relational database management system. PostgreSQL was
selected due to its robustness, ACID compliance, and support for advanced features such as JSON storage,
indexing, and role management.
● User Accounts: Stores information about registered users, their roles, and credentials.
● Prediction Logs: Captures the submitted input values, recommended careers, timestamps, and user ID
for traceability.
● Training Dataset Management (Admin Only): A provision to upload and manage datasets for
retraining the ML model in the future.
● Feedback Records: Planned for future implementation, where users can submit feedback on the
recommendation quality.
The Django ORM abstracts SQL queries and handles migrations seamlessly, which simplifies database
operations and schema evolution.
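To make the prediction-log record more concrete, the following is a minimal sketch that uses Python's built-in sqlite3 module instead of the Django ORM and PostgreSQL described above, so that it runs on its own. The table name, the column names, and the helpers log_prediction and logs_for_user are assumptions made for illustration; they are not taken from the project's models.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per prediction, linked to the user who made it.
SCHEMA = """
CREATE TABLE prediction_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    input_text TEXT NOT NULL,
    recommendation TEXT NOT NULL,
    created_at TEXT NOT NULL
)
"""


def log_prediction(conn, user_id, input_text, recommendation):
    """Insert one prediction record and return its row id."""
    created_at = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO prediction_log (user_id, input_text, recommendation, created_at) "
        "VALUES (?, ?, ?, ?)",
        (user_id, input_text, recommendation, created_at),
    )
    conn.commit()
    return cur.lastrowid


def logs_for_user(conn, user_id):
    """Return (input_text, recommendation) pairs for one user, oldest first."""
    rows = conn.execute(
        "SELECT input_text, recommendation FROM prediction_log "
        "WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
conn.execute(SCHEMA)
log_prediction(conn, 1, "Finance", "Investment Banker")
log_prediction(conn, 1, "Design", "UI/UX Designer")
print(logs_for_user(conn, 1))
```

In the Django application the same fields would normally be declared on a model class, and the ORM would generate the table through a migration.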
Machine Learning Service
The core intelligence of the system lies in the Machine Learning service, implemented as a standalone
Python module integrated into the Django backend. The service uses a Random Forest Classifier trained
on a structured dataset containing career labels and user-input features.
● Model Training Script: Written in Python using pandas, scikit-learn, and NumPy. The
model is trained offline and validated using test data before deployment.
● Model Serialization: The trained model is serialized using joblib for efficient storage and fast
loading at runtime.
● Runtime Inference: When a user submits the input form, the Django view loads the
serialized model and passes the input to generate predictions in real-time.
● Result Output: The prediction result is returned to the user interface along with optional metadata
like confidence score (planned in future).
The design ensures that the model can be updated independently of the web app. Admins can retrain the
model offline and replace the serialized .pkl file without needing to redeploy the entire system.
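The offline-train, serialize, and load-at-runtime cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the project's training script: the two numeric features, the four training rows, and the file name career_model.pkl are hypothetical. The maximum of predict_proba is shown as one possible basis for the planned confidence score.

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier

# Hypothetical training data: [math_score, design_score] -> career label.
X_train = [[90, 20], [85, 30], [25, 90], [30, 85]]
y_train = ["Data Scientist", "Data Scientist", "UI/UX Designer", "UI/UX Designer"]

# Offline step: train the model and serialize it to disk.
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X_train, y_train)
model_path = os.path.join(tempfile.mkdtemp(), "career_model.pkl")
joblib.dump(model, model_path)

# Runtime step: load the serialized model and run inference on new input.
loaded = joblib.load(model_path)
sample = [[88, 25]]
label = loaded.predict(sample)[0]

# predict_proba returns per-class probabilities; the largest one can act as a confidence score.
confidence = float(loaded.predict_proba(sample).max())
print(label, round(confidence, 2))
```

Because the web application only depends on the file at model_path, replacing that file with a newly trained model changes the predictions without any change to the view code.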
Integration Workflow
1. User Interaction: The user fills out the prediction form via the frontend
interface.
2. Form Submission: The input is sent to a Django view through a secure POST request.
3. Model Prediction: The backend view invokes the ML service, loads the model, and performs
inference.
4. Result Storage and Display: The prediction is stored in the database and returned to the frontend
for display.
5. Admin Access (Optional): Admins can view all prediction logs and performance analytics.
This modular and loosely coupled architecture ensures that each part of the system is independently
testable, replaceable, and scalable.
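The workflow steps can be illustrated without Django by a small, framework-free sketch. Everything named here is hypothetical: KeywordModel stands in for the trained classifier, the list logs stands in for the prediction-log table, and handle_submission plays the role of the Django view.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PredictionLog:
    """One stored prediction (step 4 of the workflow)."""
    user_id: int
    user_input: str
    prediction: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class KeywordModel:
    """Stand-in for the trained model: maps a known keyword to a career."""

    def __init__(self, mapping, default):
        self.mapping = mapping
        self.default = default

    def predict(self, inputs):
        return [self.mapping.get(text.strip().lower(), self.default) for text in inputs]


def handle_submission(user_id, form_data, model, log_store):
    """Steps 2 to 4: validate the input, run inference, store and return the result."""
    user_input = form_data.get("interest", "").strip()
    if not user_input:
        raise ValueError("The 'interest' field is required.")
    prediction = model.predict([user_input])[0]
    log_store.append(PredictionLog(user_id, user_input, prediction))
    return prediction


model = KeywordModel({"finance": "Investment Banker"}, default="Software Developer")
logs = []  # stands in for the database table of prediction logs
result = handle_submission(7, {"interest": "Finance"}, model, logs)
print(result, len(logs))
```

Since the model is passed in as an argument, the handler can be tested with a stand-in like this one and later given the real classifier, which is the kind of loose coupling the architecture aims for.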
Technical Feasibility: The system is built using widely available open-source tools (Python,
Django, Scikit-learn). No proprietary software or specialized hardware is required. The ML
model is trained offline and loaded into memory during runtime using joblib.
Operational Feasibility: The system is easy to use and deploy. Once trained, the model can
serve multiple prediction requests in real time without needing re-training unless explicitly
required.
Economic Feasibility: Since the system uses open-source tools and publicly available
datasets, there are no direct costs involved. It is feasible for individual developers and
academic institutions.
4.4 UI Design
[Placeholder: insert a screenshot of the user interface here, with a brief explanation
of what it shows.]
Chapter 5: Implementation
1. Programming Languages
Python
Used for data processing, model training (machine learning), and back-end
development (Django framework).
HTML/CSS
Used for designing the web front-end interface (forms, templates).
2. Machine Learning and Data Science
Pandas – For data loading and manipulation.
3. Web Framework
Django (Python-based Web Framework)
4. Database
SQLite
5. Front-End
Django Templates (HTML with template tags)
Used to render web forms, display results, and interact with users dynamically.
6. Deployment Tools (if applicable)
No specific deployment tools were provided in the project, but typical options for
a Django project would include:
o Gunicorn + Nginx for production servers
7. Development Tools
Jupyter Notebook / Python scripts (for model development)
Example Entries
Interest        RecommendedCareer
--------------  -------------------------
Data Science    Data Scientist
Data Science    Machine Learning Engineer
Finance         Investment Banker
Programming     Software Developer
Design          UI/UX Designer
Key Characteristics
Categorical data: Both features are text-based.
One-to-many relationship: One interest can map to multiple careers.
No missing values: The dataset is clean with 139 complete records.
This dataset is likely used for training a text classification model where the system predicts
the career category based on user interest keywords.
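The characteristics listed above can be checked with a few lines of pandas. The sketch below uses only the five example entries, typed in by hand, because the full career.csv is not reproduced in this report; the variable names are illustrative.

```python
import pandas as pd

# Rows copied from the example entries; the real project would use pd.read_csv('career.csv').
data = pd.DataFrame(
    {
        "Interest": ["Data Science", "Data Science", "Finance", "Programming", "Design"],
        "RecommendedCareer": [
            "Data Scientist",
            "Machine Learning Engineer",
            "Investment Banker",
            "Software Developer",
            "UI/UX Designer",
        ],
    }
)

# Missing values per column; all zeros means every record is complete.
missing = data.isna().sum()

# Distinct careers per interest; a value above 1 indicates a one-to-many mapping.
careers_per_interest = data.groupby("Interest")["RecommendedCareer"].nunique()

print(missing.to_dict())
print(careers_per_interest.to_dict())
```

An interest that maps to several careers means a single-label classifier can return only one of them, which is worth keeping in mind when interpreting its output.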
Project Overview
This project is a Machine Learning-based Career Recommendation System integrated
into a Django web application. The system analyzes user inputs such as skills, interests, and
educational background to suggest suitable career options.
Core Components
1. Machine Learning Model (train_model.py)
The script trains a classification model using data from career.csv.
It likely uses user-provided keywords (skills/interests) and maps them to appropriate
careers.
The model is serialized (pickled) and used later in predictions through the web
interface.
2. Input Data (input.txt and career.csv)
input.txt: Possibly a user input or feature set used in training/testing.
career.csv: Contains training data mapping user features to career titles.
3. Web Application (Django Framework)
views.py: Handles the logic for processing form submissions, loading the trained
model, predicting a career, and rendering the result.
models.py: Contains Django model definitions (though this project doesn’t seem to
rely heavily on database models).
urls.py: Maps URLs to view functions for routing web traffic.
Templates (HTML): index.html and others form the UI, allowing users to input data
and view recommended careers.
User Flow
1. User visits the web application.
2. On the homepage (index.html), they fill in a form (e.g., entering skills or interests).
3. The form submission triggers a Django view.
4. The backend uses the ML model to predict a suitable career.
5. The result is displayed back to the user via the web interface.
Admin Panel:
The admin panel is a built-in interface provided by Django to manage database
records.
admin.py
from django.contrib import admin
from .models import *

# Register your models here.
Code Explanation:
1. from django.contrib import admin
o Imports Django's admin functionality, which allows you to register
and manage models via the admin site.
2. from .models import *
o Imports all models from the models.py file in the same app. This
makes them available for registration in the admin panel.
3. # Register your models here.
o This is a placeholder. To use the admin panel effectively, you need
to register any model classes like this:
admin.site.register(YourModelName)
Example
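models.py
from django.db import models

class Career(models.Model):
    interest = models.CharField(max_length=100)
    recommendation = models.CharField(max_length=100)

views.py
import pickle
from django.shortcuts import render

# Load the saved vectorizer and model once, when the module is imported
with open('vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

def index(request):
    prediction = None
    if request.method == 'POST':
        user_input = request.POST['interest']
        # Vectorize the input text, then make a prediction
        prediction = model.predict(vectorizer.transform([user_input]))[0]
    return render(request, 'home/index.html', {'prediction': prediction})

urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('', views.index, name='index'),
]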
<!DOCTYPE html>
<html>
<head>
    <title>Career Recommendation</title>
</head>
<body>
    <h2>Career Recommendation System</h2>
    <form method="POST">
        {% csrf_token %}
        <label>Enter your interest or skill:</label>
        <input type="text" name="interest" required>
        <button type="submit">Get Recommendation</button>
    </form>
    {% if prediction %}
    <h3>Recommended Career: {{ prediction }}</h3>
    {% endif %}
</body>
</html>
Flow
User Input → URL Routing → View Logic → ML Prediction → Rendered Result
1. Data Preparation
Interest          RecommendedCareer
----------------  ------------------
Data Science      Data Scientist
Programming       Software Developer
Finance           Financial Analyst
Machine Learning  ML Engineer
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
import pickle
# Load dataset
data = pd.read_csv('career.csv')
It begins by loading a dataset (career.csv) that contains two columns: user interests (Interest)
and corresponding career recommendations (RecommendedCareer). The interests, which are
in text form, are converted into numeric features using CountVectorizer, a tool that
transforms text into a matrix of token counts.
The transformed data is then used to train a MultinomialNB (Naive Bayes) classifier. Once
trained, the model is capable of predicting a career based on a new text input describing a
user’s interest. Both the trained model and the vectorizer are saved using Python’s pickle
module so they can be reused in a Django app without retraining. When integrated with
Django views, this model takes user input from a form, transforms it using the vectorizer, and
generates a prediction. The result might be something like recommending "Data Scientist".
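The training and saving steps described above can be sketched end to end as follows. This is a minimal version under stated assumptions: the four rows stand in for career.csv, and the files are written to a temporary directory rather than to the project folder. Only the file names model.pkl and vectorizer.pkl are taken from this report.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Stand-in for pd.read_csv('career.csv'); rows taken from the data preparation example.
data = pd.DataFrame(
    {
        "Interest": ["Data Science", "Programming", "Finance", "Machine Learning"],
        "RecommendedCareer": [
            "Data Scientist",
            "Software Developer",
            "Financial Analyst",
            "ML Engineer",
        ],
    }
)

# Convert the interest text to token counts, then fit the Naive Bayes classifier.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(data["Interest"])
model = MultinomialNB()
model.fit(X, data["RecommendedCareer"])

# Save both objects: prediction needs the same fitted vectorizer as training.
out_dir = tempfile.mkdtemp()  # hypothetical output location
for name, obj in (("model.pkl", model), ("vectorizer.pkl", vectorizer)):
    with open(os.path.join(out_dir, name), "wb") as f:
        pickle.dump(obj, f)

# Reload the files and predict, as the Django view would at runtime.
with open(os.path.join(out_dir, "vectorizer.pkl"), "rb") as f:
    loaded_vectorizer = pickle.load(f)
with open(os.path.join(out_dir, "model.pkl"), "rb") as f:
    loaded_model = pickle.load(f)

prediction = loaded_model.predict(loaded_vectorizer.transform(["Finance"]))[0]
print(prediction)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.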
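def index(request):
    prediction = None  # stays None on a plain GET, so only the form is shown
    if request.method == 'POST':
        user_input = request.POST['interest']
        # vectorizer and model are the objects loaded from vectorizer.pkl and model.pkl
        input_vector = vectorizer.transform([user_input])
        prediction = model.predict(input_vector)[0]
    return render(request, 'home/index.html', {'prediction': prediction})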
{% if prediction %}
<h3>Recommended Career: {{ prediction }}</h3>
{% endif %}
This HTML code is a simple Django template form that takes user input for career
recommendation. The form uses the POST method, so the input is sent in the request body
rather than in the URL, and includes {% csrf_token %} to protect against Cross-Site Request
Forgery, a standard Django security measure.
The input field lets users type in their interest, such as "Artificial Intelligence," and the
submit button sends the data to the backend view.
When the form is submitted, the Django view processes the interest input, uses the saved
machine learning model to make a prediction, and returns the result. If a prediction exists in
the context, it is displayed using Django’s template syntax.
The {% if prediction %} block ensures that the output appears only after the user submits a
valid input. The result is shown as a recommendation, such as: "Recommended Career: AI
Engineer." This creates a dynamic user experience without page redirection.
1. ML Model Development
A dataset (career.csv) containing Interest and RecommendedCareer is used to train a
text classification model.
CountVectorizer converts user interests (text) into numerical format.
MultinomialNB (Naive Bayes classifier) is trained on the vectorized data.
The trained model and vectorizer are saved using pickle as model.pkl and
vectorizer.pkl.
2. Django Web Application
A Django project is created to provide a web interface for users to input their
interests.
The main components are:
o views.py: Handles user requests and connects to the ML model.
o index.html: Frontend page with a form for user input and displays prediction.
o urls.py: Routes the base URL to the correct view (index).
3. Connecting Django with the ML Model
When a user submits the form:
o Django captures the interest via POST in views.py.
o The interest is passed to the loaded vectorizer and ML model.
o The model predicts a career based on the input and returns it.
The prediction is sent back to index.html and displayed.
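The prediction step can be made a little more robust than a single predict call. The sketch below is one possible approach, not the project's implementation: the helper recommend_careers is hypothetical, and the small training set is typed in for illustration. It returns several ranked careers and an empty list when none of the input words were seen during training.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Small illustrative training set; the real project trains on career.csv.
interests = ["Data Science", "Data Science", "Finance", "Programming", "Design"]
careers = [
    "Data Scientist",
    "Machine Learning Engineer",
    "Investment Banker",
    "Software Developer",
    "UI/UX Designer",
]

vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(interests), careers)


def recommend_careers(text, top_k=2):
    """Return up to top_k careers ranked by probability, or [] for unknown input."""
    vector = vectorizer.transform([text])
    if vector.nnz == 0:  # no word of the input appears in the training vocabulary
        return []
    probabilities = model.predict_proba(vector)[0]
    ranked = sorted(zip(model.classes_, probabilities), key=lambda pair: pair[1], reverse=True)
    return [str(career) for career, _ in ranked[:top_k]]


print(recommend_careers("data science"))
print(recommend_careers("underwater basket weaving"))
```

Without the empty-vector check, an input made up entirely of unseen words would still receive a prediction, but one based only on the class priors rather than on anything the user typed.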
6.4 Advantages
Limitations
1. Data Dependency: The accuracy of recommendations heavily depends on the quality
and diversity of the training data; biased or incomplete data can lead to poor
suggestions.
2. Lack of Human Insight: AI cannot fully understand a user's emotions, personality
traits, or life context, which are often important in career decisions.
3. Static Model Limitations: If the model isn't frequently updated with new career
trends and job market shifts, its suggestions may become outdated.
4. Oversimplification: AI may reduce complex human aspirations into a single output,
overlooking nuanced preferences or multifaceted skill sets.
5. Privacy Concerns: Collecting and processing user input for predictions raises data
privacy and security issues, especially if sensitive information is involved.
6. Limited Adaptability: AI models may struggle with ambiguous or unusual inputs
that a human counselor could interpret more effectively.
7. Lack of Accountability: When users rely on AI advice, there's often no clear
responsibility if the guidance turns out to be inappropriate or misleading.
Future Enhancements