Admission Prediction

The document discusses developing a machine learning model to predict a student's chances of admission to a university based on their profile and historical admission data. It aims to help students identify which universities may be a good fit based on their scores and attributes. The proposed model uses algorithms like linear regression, random forest, and CatBoost to analyze past admission data and provide accurate predictions to guide students in their higher education decisions.

Uploaded by

Adithya Ampolu

UNIVERSITY ADMISSION PREDICTION USING

MACHINE LEARNING
A Project report submitted in partial fulfilment of the requirements for the

award of the degree of

BACHELOR OF TECHNOLOGY

IN

COMPUTER SCIENCE AND ENGINEERING

Submitted by

POTTI RAJU (REG.NO 19NR1A0586)

SURU SIVAJI (REG.NO 19NR1A05A5)

SIDDALA VENUVATHI (REG.NO 19NR1A05A0)

NOWPADA DHARMA RAJU (REG.NO 19NR1A0575)

Under the guidance of


MRS. T. CHAITANYA

ASSISTANT PROFESSOR

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BABA INSTITUTE OF TECHNOLOGY AND SCIENCES


(Permanently Approved by A.I.C.T.E & Affiliated to J.N.T.U, Kakinada)

PM Palem, Madhurawada, Visakhapatnam - 530048, Andhra Pradesh


(2019-2023)

BABA INSTITUTE OF TECHNOLOGY AND SCIENCES
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CERTIFICATE

This is to certify that the project work entitled “UNIVERSITY ADMISSION PREDICTION
USING MACHINE LEARNING” is the bonafide work of Potti Raju (19NR1A0586),
Suru Sivaji (19NR1A05A5), Siddala Venuvathi (19NR1A05A0) and Nowpada Dharma Raju
(19NR1A0575), carried out during the year 2019-2023 in partial fulfilment of the requirements for the award of
the degree of BACHELOR OF TECHNOLOGY from BABA INSTITUTE OF TECHNOLOGY
AND SCIENCES, affiliated to J.N.T.U. Kakinada, P.M. Palem, Madhurawada, Visakhapatnam.

Project Guide: MRS. T. CHAITANYA, Assistant Professor

Head of the Department: Mr. S. DURGA PRASAD, Assistant Professor

EXTERNAL EXAMINER

DECLARATION

We, Potti Raju (19NR1A0586), Suru Sivaji (19NR1A05A5), Siddala Venuvathi (19NR1A05A0)
and Nowpada Dharma Raju (19NR1A0575), of 4-1 semester B.Tech. in the Department of
Computer Science and Engineering, BITS, Visakhapatnam, hereby declare that the project work
entitled “UNIVERSITY ADMISSION PREDICTION USING MACHINE LEARNING” was
carried out by us and is submitted in partial fulfilment of the requirements for the award of Bachelor of
Technology in Computer Science and Engineering, under the guidance of MRS. T. CHAITANYA
at Baba Institute of Technology and Sciences during the academic year 2019-2023, and that it has not
been submitted to any other university for the award of any kind of degree.

PROJECT STUDENTS

Potti Raju (19NR1A0586)

Suru Sivaji (19NR1A05A5)

Siddala Venuvathi (19NR1A05A0)

Nowpada Dharma Raju (19NR1A0575)

ACKNOWLEDGEMENT

We would like to express our deep gratitude to our project guide Mrs. T. Chaitanya,
Assistant Professor, Department of Computer Science and Engineering, BITS, for her guidance,
unsurpassed knowledge and immense encouragement.

We are grateful to Mr. S. Durga Prasad, Head of the Department, Department of Computer
Science and Engineering, for providing us with the required facilities for the completion of the
project work.

We are very much thankful to the Principal Dr. Mr. Govinda Raju and Management, BITS,
PM Palem, for their encouragement and cooperation to carry out this work.

We express our thanks to all the teaching faculty of the Department of CSE, whose suggestions during
reviews helped us accomplish our project. We would like to thank all the non-teaching staff of
the Department of CSE, BITS, for their great assistance in accomplishing our project.

We would like to thank our parents, friends, and classmates for their encouragement throughout
our project period. Last but not least, we thank everyone who supported us directly or indirectly
in completing this project successfully.

Potti Raju (19NR1A0586)

Suru Sivaji (19NR1A05A5)

Siddala Venuvathi (19NR1A05A0)

Nowpada Dharma Raju (19NR1A0575)

TABLE OF CONTENTS

S.NO TITLE

ABSTRACT

1. INTRODUCTION
1.1 PROBLEM STATEMENT
1.2 FEASIBILITY STUDY
1.2.1 ECONOMICAL FEASIBILITY
1.2.2 TECHNICAL FEASIBILITY
1.2.3 OPERATIONAL FEASIBILITY
1.3 EXISTING SYSTEM
1.4 PROPOSED SYSTEM
1.5 REQUIREMENT ELICITATION
1.6 TECHNOLOGIES

2. SOFTWARE REQUIREMENT SPECIFICATION
2.1 INTRODUCTION
2.1.1 PURPOSE & SCOPE
2.1.2 GLOSSARY
2.1.3 REFERENCES
2.1.4 OVERVIEW OF DOCUMENT
2.2 REQUIREMENT SPECIFICATIONS
2.2.1 EXTERNAL INTERFACE REQUIREMENT
2.2.2 FUNCTIONAL REQUIREMENT
2.2.3 USE CASE DIAGRAM
2.2.4 SYSTEM SPECIFIC MODULES
2.2.5 OTHER NON-FUNCTIONAL REQUIREMENTS
2.2.5.1 NON-FUNCTIONAL REQUIREMENTS

3. ANALYSIS
3.1 INTRODUCTION
3.2 USE CASE DIAGRAM
3.3 INTERACTION DIAGRAM
3.3.1 SEQUENCE DIAGRAM
3.3.2 COLLABORATION DIAGRAM
3.4 STATE DIAGRAM
3.5 ACTIVITY DIAGRAM

4. DESIGN
4.1 ARCHITECTURE
4.2 DATA FLOW
4.3 CLASS DIAGRAM
4.4 COMPONENT DIAGRAM
4.5 DEPLOYMENT DIAGRAM
4.6 USER INTERFACE DESIGN

ABSTRACT
UNIVERSITY ADMISSION PREDICTION USING MACHINE LEARNING

At present, students often have difficulty finding a suitable institution for higher
studies based on their profile. Some advisory services and online apps
recommend universities, but the consultancies charge large fees and the online apps are not accurate. The
aim of this research is therefore to develop a model that accurately predicts a student's chance of admission
to a university. The model also provides an analysis of scores versus chance of admission based on
historical data, so that students can understand whether their profile is suitable. The proposed
model uses the Linear Regression and Random Forest algorithms, but the CatBoost algorithm gives the
highest accuracy.

KEYWORDS: Linear Regression, Random Forest, CatBoost.

CHAPTER 1
1. INTRODUCTION

Education plays a very important role in a person's life because its quality shapes their future.
After graduating, students often have doubts about pursuing higher studies and choosing the best
university. Most students prefer universities with global recognition, so a large percentage of
students from India choose the United States of America for higher studies. Even though there are
reputed universities in India, graduates find it difficult to gain admission to highly rated
universities, and getting placed is also difficult because job opportunities are few.

As students are not sure which university is better, they invest time and money in
guidance. Apart from consultancy offices and advisors, there are blogs and websites that
guide students about their chances of admission, but those resources are not very
accurate and cannot be relied on completely. Educational institutes may apply knowledge mining to the
admission inquiry forms they collect in order to concentrate on the most relevant details in the
data. Knowledge mining finds information hidden in the data that queries and reports are unable to
disclose.
This technique should be used to evaluate trends among students seeking college admission,
using data collected from admission forms filled out over many
years.
This paper establishes a machine learning model that takes into account parameters such as
GRE Score, TOEFL Score, University Rating, Statement of Purpose, Letter of Recommendation
strength, undergraduate GPA and research experience.

After getting all the inputs, it predicts the chance of admission. On unseen test instances, the
trained model gives statistically meaningful estimates of the probability of
admission and, accordingly, offers an unbiased assessment.

1.1 PROBLEM STATEMENT:


Students are often worried about their chances of admission to highly rated universities. The aim
of this machine learning model is to help students shortlist universities that match their profiles. The
predicted output gives them a fair idea of their admission chances at a particular university, so that
they know in advance whether they have a chance of being accepted.

1.2 FEASIBILITY STUDY
The feasibility study is a test of the proposed system in terms of its workability,
its ability to meet the user's requirements, its effective use of resources and its cost effectiveness.

The feasibility study describes the pros and cons of undertaking a project before a lot of
time and money is invested in it. For feasibility analysis, some understanding of the major requirements for
the system is essential.

Three key considerations involved in the feasibility analysis are:

1. Economical Feasibility

2. Technical Feasibility

3. Operational Feasibility

1.2.1 ECONOMICAL FEASIBILITY:

• This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can put into the research and development
of the system is limited, so the expenditures must be justified. The developed system is
well within the budget; this was achieved because most of the technologies used are freely
available. Only the customized products had to be purchased.
• For this project, the main cost is the documentation cost.

1.2.2 TECHNICAL FEASIBILITY:

• Technical feasibility assessment focuses on the technical resources currently available in the
organization. It also evaluates the hardware and software requirements of the proposed
system.

• Technical feasibility analysis is an attempt to study the project from a technician's
angle.

• For this project, no specific hardware or software is required. Only a recent version
of Python is needed.

1.2.3 OPERATIONAL FEASIBILITY:

• This aspect of the study checks the level of acceptance of the system by the user. This includes
the process of training the user to use the system efficiently. The user must not feel threatened
by the system, but must instead accept it as a necessity.
• The level of acceptance by the users depends on the methods that are employed to
educate users about the system and to make them familiar with it. Their level of confidence
must be raised so that they can also offer constructive criticism, which is welcome,
as they are the final users of the system.
• This project does not require any special skill set to operate.

1.3 EXISTING SYSTEM:

After graduating, students often have doubts about pursuing
higher studies and choosing the best university. Most students prefer universities with
global recognition, so a large percentage of students from India choose the United States of America for
higher studies. Even though there are reputed universities in India, graduates
find it difficult to gain admission to highly rated universities, and getting placed
is also difficult because job opportunities are few.

DISADVANTAGES OF EXISTING SYSTEM:

• Trends among students seeking college admission must be evaluated from data collected from
admission forms filled out over
many years.
• As students are not sure which university is better, they invest time and money in
guidance.

Algorithm: Classification and Regression.

1.4 PROPOSED SYSTEM:

In the model development, the dataset is split into a training set (80%) and a test set (20%).
The training set has 400 profiles and the test set has 100 profiles. The dataset used for modelling looks like this.
Pre-processing is a crucial step in the method. The aim is to clean the data and prepare it for use in a
prediction algorithm. A few improvements are required for the data obtained from Occidental College
in order to make it suitable for the proposed machine learning algorithms.

Determining how to deal with missing data is a common problem in data cleaning. Since the feature
in question could be a good predictor of the outcome, it is critical to locate missing entries
and apply a treatment, based on the variable type, that enables us to use the data in the
model. The data was pre-processed and split at random into two sets: a training set and a testing
set.
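
The missing-data treatment and the 80%/20% random split described above could be sketched as follows. This is only a minimal illustration using the Python standard library. The function names (impute_missing, train_test_split), the choice of mean imputation for a numeric column, the fixed random seed and the sample data are all assumptions made for the example, not details taken from the project.

```python
import random
from statistics import mean


def impute_missing(rows, column):
    """Return new rows where None in a numeric column is replaced by the column mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [dict(r, **{column: fill if r[column] is None else r[column]}) for r in rows]


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data: 500 profiles, one of them with a missing CGPA value.
profiles = [{"cgpa": 6.0 + (i % 40) / 10, "chance": 0.5} for i in range(500)]
profiles[7]["cgpa"] = None

clean = impute_missing(profiles, "cgpa")
train, test = train_test_split(clean, test_fraction=0.2)
print(len(train), len(test))  # 400 100
```

With 500 profiles and a test fraction of 0.2, the sketch reproduces the 400/100 split stated above. Dropping incomplete rows instead of imputing them would be an equally valid treatment, depending on the variable.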

ADVANTAGES OF PROPOSED SYSTEM:

• The percentage chance of admission can now be predicted. Recommended universities are also shown,
based on where students with similar profiles have a higher chance of admission.
• The variable to be predicted is Chance of Admit. The steps involved in model development
are mentioned below.

Algorithm: Linear Regression, CatBoost.

1.5 REQUIREMENT ELICITATION:

GRE Score: Graduate Record Exam (GRE) score. The score is out of 340 points (numeric).
TOEFL Score: Test of English as a Foreign Language (TOEFL) score, which is out of 120
points (numeric).
University Rating: Rating that indicates the ranking of the bachelor's university among other
universities. The score is out of 5 (numeric).
SOP: Statement of Purpose (SOP), a document written to show the candidate's life, ambitions
and motivations for the chosen degree or university. The score is out of 5 points (numeric).
LOR: Letter of Recommendation (LOR) strength. The letter verifies the candidate's professional
experience, builds credibility, boosts confidence and attests to the candidate's competency. The score is out of 5
points (numeric).
CGPA: Undergraduate GPA (CGPA) out of 10 (numeric).
Research: Research experience that can support the application, such as publishing research papers
in conferences or working as a research assistant with a university professor (either yes or no) (categorical).
Chance of Admit: The dependent variable to be predicted. It is the chance of admission for the
given inputs and ranges from 0 to 1 (numeric).
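
The attributes listed above can be captured in a small input record that checks each value against its stated maximum before it is passed to a model. The sketch below is illustrative only. The class name ApplicantProfile, the field names, the lower bound of 0 for every score and the encoding of Research as 1.0 or 0.0 are assumptions made for the example.

```python
from dataclasses import dataclass

# Upper bounds as stated in the attribute descriptions; the lower bound 0 is assumed.
MAX_VALUES = {
    "gre_score": 340,
    "toefl_score": 120,
    "university_rating": 5,
    "sop": 5,
    "lor": 5,
    "cgpa": 10,
}


@dataclass
class ApplicantProfile:
    gre_score: float
    toefl_score: float
    university_rating: float
    sop: float
    lor: float
    cgpa: float
    research: bool  # categorical: yes (True) or no (False)

    def validate(self):
        """Return a list of error messages; an empty list means the profile is valid."""
        errors = []
        for name, upper in MAX_VALUES.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                errors.append(f"{name} must be between 0 and {upper}, got {value}")
        return errors

    def to_features(self):
        """Return the inputs as a list of floats, with Research encoded as 1.0 or 0.0."""
        numeric = [float(getattr(self, name)) for name in MAX_VALUES]
        return numeric + [1.0 if self.research else 0.0]


profile = ApplicantProfile(320, 110, 4, 4.5, 4.0, 8.7, True)
invalid = ApplicantProfile(350, 110, 4, 4.5, 4.0, 8.7, False)
print(profile.validate())       # []
print(len(invalid.validate()))  # 1
```

Keeping the limits in one dictionary means the same table can drive both the validation and the order of the feature vector.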

1.6 TECHNOLOGIES USED:

• UML

• HTML

• CSS
• PYTHON

• DJANGO
UML:

The Unified Modelling Language (UML) is an open method used to specify, visualize,
construct and document the artifacts of an object-oriented software-intensive system under
development. It offers a standard way to write a system's blueprints, including conceptual components
such as actors, business processes and the system's components and activities.

HTML:

• HTML, an initialism of Hypertext Markup Language, is the predominant markup language
for web pages. It provides a means to describe the structure of text-based information in a
document.
• HTML describes the structure of a Web page
• HTML consists of a series of elements
• HTML elements tell the browser how to display the content

CSS:

• CSS stands for Cascading Style Sheets. It is a style sheet language which is used to describe
the look and formatting of a document written in markup language. It provides an additional
feature to HTML.
• CSS describes how HTML elements are to be displayed on screen, paper, or in other media
• CSS saves a lot of work. It can control the layout of multiple web pages all at once.

PYTHON:

• Python is a general-purpose, interpreted, interactive, object-oriented, high-level
programming language.
• As an interpreted language, Python has a design philosophy that emphasizes code readability
(notably using whitespace indentation to delimit code blocks rather than curly brackets or
keywords), and a syntax that allows programmers to express concepts in fewer lines of code
than might be used in languages such as C++ or Java.

DJANGO:

• Django is a Python framework that makes it easier to create websites using Python.
• Django takes care of the difficult parts so that you can concentrate on building your web
applications.
• Django is especially helpful for database-driven websites. Django emphasizes reusability of
components, rapid development, and the DRY principle (don't repeat
yourself), and comes with ready-to-use features such as a login system, database connection and
CRUD operations (Create, Read, Update, Delete).

CHAPTER 2

SOFTWARE REQUIREMENT SPECIFICATION

2.1 INTRODUCTION

A Software Requirement Specification (SRS), as the name suggests, is a complete specification
and description of the requirements that must be fulfilled for the successful development of a
software system. These requirements can be functional as well as non-functional, depending upon
the type of requirement.

The software requirement specification plays an important role in creating quality software
solutions. Specification is basically a representation process: requirements are represented in a
manner that ultimately leads to successful software implementation. There is a set of guidelines to
be followed while preparing the software requirement specification document. It includes the
purpose, scope, functional and non-functional requirements, and software and hardware requirements of
the project. In addition, it contains information about the environmental conditions
required, safety and security requirements, software quality attributes of the project, and so on.

Following are the features of a good SRS document:

• Correctness: User review is used to ensure the accuracy of the requirements stated in the
SRS. The SRS is said to be correct if it covers all the needs that are truly expected from the
system.

• Completeness: The SRS is complete if, and only if, it includes the following elements.

• Consistency: The SRS is consistent if, and only if, no subset of the individual requirements
described in it conflict.

• Modifiability: The SRS should be as modifiable as possible and should be capable of
quickly accommodating changes to the system. Modifications should be properly
indexed and cross-referenced.

• Verifiability: The SRS is verifiable when the specified requirements can be checked by a
cost-effective process to determine whether the final software meets those requirements.
The requirements are verified with the help of reviews.

• Traceability: The SRS is traceable if the origin of each of the requirements is clear and if
it facilitates the referencing of each requirement in future development or enhancement
documentation.

• Unambiguousness: The SRS is unambiguous when every stated requirement has only one
interpretation. This means that each element is uniquely interpreted. If a
term is used with multiple meanings, the requirements document should define its
meaning in the SRS so that it is clear and simple to understand.

• Testability: An SRS should be written in such a way that it is simple to generate test
cases and test plans from the document.

Document Conventions
The “Times New Roman” font is used throughout the document. Bold letters are used for headings
and regular letters for content.

2.1.1 PURPOSE AND SCOPE:

PURPOSE:
The main objective of this project is to help students save the time and money that they
would otherwise spend at education consultancy firms. It also helps them limit their
applications to a small number by suggesting the universities where they have the
best chance of securing admission, thus saving more money on application fees.

SCOPE:
• Providing access to all users who have a valid user ID and password.

• The user can upload a dataset after data pre-processing. The system can then predict the chance of
admission to particular universities.

2.1.2 GLOSSARY:

TERM    DEFINITION
HTTP    Hypertext Transfer Protocol
UML     Unified Modeling Language
URL     Uniform Resource Locator
HTML    Hypertext Markup Language
CSS     Cascading Style Sheets
RF      Random Forest
LR      Logistic Regression
MAE     Mean Absolute Error
MSE     Mean Square Error
OS      Operating System

2.1.3 REFERENCES:

[1] Acharya MS, Armaan A, Antony AS (2019) A comparison of regression models for prediction
of graduate admissions.
[2] Gupta N, Sawhney A, Roth D (2016) Will I get in? modeling the graduate admission process for
American universities. In: 2016 IEEE 16th international conference on data mining workshops
(ICDMW). IEEE
[3] Mishra, S. and Sahoo, S. (2016). A Quality Based Automated Admission
System for Educational Domain, pp. 221–223, International conference on Signal Processing,
Communication, Power and Embedded System (SCOPES)- 2016.

2.1.4 OVERVIEW OF THE DOCUMENT

The main objective of this project is to help students save the time and money that they
would otherwise spend at education consultancy firms. The project establishes a machine learning
model that takes into account parameters such as GRE Score, TOEFL Score, University Rating,
Statement of Purpose, Letter of Recommendation strength, undergraduate GPA and research
experience. After getting all the inputs, it predicts the chance of admission.

2.2 REQUIREMENT SPECIFICATIONS:

2.2.1 EXTERNAL INTERFACE REQUIREMENTS

This section describes the software interfaces, that is, how software programs communicate with each other or
with users, whether in the form of a language, code, or messages. Examples
include shared memory and data streams. These requirements cover user interfaces (the interaction logic
between software and user), screen layouts, buttons, the functions on every screen, hardware interfaces
(the devices for which the software is created), and other relevant particulars.
Software interfaces such as the frontend and backend stack and the database management system must also
be included.

USER INTERFACES:

This section describes how the user interface works and how it is displayed. The user interface is
the part of the software designed to give the user insight into the
software. The UI provides the fundamental platform for human-computer interaction.

• Frontend - HTML, CSS.


• Backend - Python
• Data Base - SQLite

HARDWARE REQUIREMENTS:
The Collection of internal electronic circuits and external physical devices used in building a
computer is called Hardware.

The minimum hardware requirement specification for developing this project is as follows:

• Processor - Intel i3 Processor

• Solid State Drive - 512GB

• RAM - 8 GB (min)

• Keyboard - Standard Windows Keyboard

• Mouse - Two or Three Button Mouse

• Monitor - Any

SOFTWARE REQUIREMENTS:
A set of programs associated with the operation of a computer is called software. Software is the
part of the computer system which enables the user to interact with several physical hardware devices.
The minimum software requirement specifications for developing this project are as follows:

• Operating System - Windows 10 or 11 (64-bit).

• Programming Language - Python

• IDE - PyCharm, Visual Studio Code.

• Framework - Django

• Designing - HTML, CSS, JavaScript.

2.2.2 FUNCTIONAL REQUIREMENTS:

These are the requirements that the end user specifically demands as basic facilities that the
system should offer. All these functionalities need to be necessarily incorporated into the system as a
part of the contract. These are represented or stated in the form of input to be given to the system, the
operation performed and the output expected. They are basically the requirements stated by the user
which one can see directly in the final product, unlike the non-functional requirements.

2.2.3 USE CASE DIAGRAM:


Use case Diagrams represent the functionality of the system from a user’s point of view. Use
cases are used during requirements elicitation and analysis to represent the functionality of the
system. Use cases focus on the behaviour of the system from external point of view.

Actors are external entities that interact with the system. Examples of actors include users and admin.

Name of the Use Case: User Login

Description:
Registered users can log in to the system. On a
successful login, the user is directed to the main home page. If the user enters invalid
information, he is asked to check the entered information.

Pre-Condition: Each user must have a valid user id and password.

Post Condition: Home Page will be displayed.

Flow of events:

• Invoke the Login page.

• Enter the valid User ID and Password.

• Click on Login button to access home page.


Name of the Use Case: Registration

Description:
• Every new user can register by clicking on the “register now” link.
• Every new user needs to register in the system with a unique name and email.
• The user enters the details in the registration form according to the required fields.
• The fields include:
User Name
Password
Email
Mobile number
Locality
Address
City
State

Post condition: Registration page is displayed.

Flow of events:

• Invoke the Login page.

• Click on the “register now” link to access the Registration page.


Name of use case: Home page

Description:
When the user accesses the website, the user is
redirected to the home page, which has a detailed description of
the website and its features.
Pre-Condition: Each user must have a valid user id and password.

Post Condition: Home Page will be displayed.

Name of use case: Upload

Description: After logging in successfully, the user can
upload a dataset into the dataset column.

Pre-Condition: Each user must have a valid user id and password.

Post Condition: The registered user can upload a dataset.

Name of use case: Machine Learning

Description:

The dataset is subjected to machine learning classifiers, and the results are
calculated and displayed. The user can select the desired split ratio and choose
the required model to view its results on the dataset.

Pre-Condition: Each user must have a valid user id and password.

Post condition: Required Page will be displayed.

Name of use case: Dataset View

Description: The user can upload a dataset and view the
uploaded dataset.

Pre-Condition: Each user must have a valid user id and password.

Post condition: Required Page will be displayed.

Name of use case: Prediction

Description: The user can predict outcomes from the system.


Pre-Condition: Each user must have a valid user id and password.

Post condition: Required Page will be displayed.

Name of the Use Case: Security

Description:
Security is provided by requiring a valid username and password.
A security question is asked whenever the user forgets the password.

Pre-Condition: Each user must have a valid user id and password.

Post condition: Required Page will be displayed.

Flow of events:

• Whenever the user forgets the password, the user is asked a security question.
• If the user does not provide the correct user name and password, the user cannot enter
the system.

Name of the Use Case: Logout

Description:
After using the information available on this site, the user
logs out of the system.

Pre-Condition: Each user must have a valid user id and password.

Flow of events: When the user logs out of the system, the user is returned to the login page.

Name of the Use Case: Admin login

Description:
The admin can log in to the system. On a successful
login, the admin is directed to the main home page. If the admin enters invalid information,
he is asked to check the entered information.

Pre-Condition: Each admin must have a valid user id and password.

Post Condition: Home Page will be displayed.

Flow of events:

• Invoke the Login page.

• Enter the valid User ID and Password.

• Click on Login button to access home page.


Name of the Use Case: Admin Activate the User

Description:
The admin can activate registered users. Only after
activation can a user log in to the system.

Name of the Use Case: Admin view user details

Description:
The admin can view the details that users have entered into the system.
The admin can view the overall data in the browser.

Name of the Use Case: Results

Description:
When the admin clicks Results on the web page, the calculated
Mean Square Error (MSE), Mean Absolute Error (MAE), Accuracy and F1-Score for each of the
algorithms are displayed.

2.2.4 SYSTEM SPECIFIC MODULES:


• User

• Admin

• Data Preprocessing

• Machine Learning

MODULES DESCRIPTION:

User:
The user registers first. Registration requires a valid email and mobile number for
further communication. Once the user has registered, the admin can activate the account, and
only then can the user log in to the system.
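
The register, activate and login sequence described above can be illustrated with a small in-memory sketch. It is not the project's Django code. The class names (Account, UserRegistry), the method names, the PBKDF2 password hashing and the sample user are assumptions chosen only to show that login is refused until the admin has activated the account.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class Account:
    email: str
    salt: bytes
    password_hash: bytes
    is_active: bool = False  # set to True only by the admin


class UserRegistry:
    def __init__(self):
        self._accounts = {}

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, username, email, password):
        """Create an inactive account; user name and email must both be unique."""
        if username in self._accounts:
            raise ValueError("user name already taken")
        if any(a.email == email for a in self._accounts.values()):
            raise ValueError("email already registered")
        salt = os.urandom(16)
        self._accounts[username] = Account(email, salt, self._hash(password, salt))

    def activate(self, username):
        """Admin action: allow a registered user to log in."""
        self._accounts[username].is_active = True

    def login(self, username, password):
        """Return True only for an activated account with the correct password."""
        account = self._accounts.get(username)
        if account is None or not account.is_active:
            return False
        candidate = self._hash(password, account.salt)
        return hmac.compare_digest(account.password_hash, candidate)


registry = UserRegistry()
registry.register("asha", "asha@example.com", "s3cret")
before = registry.login("asha", "s3cret")  # False: not yet activated
registry.activate("asha")
after = registry.login("asha", "s3cret")   # True: activated by the admin
```

In a Django application the same gate would normally be expressed through the framework's own user model and authentication views rather than a hand-written registry.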

The user can upload a dataset whose columns match the expected dataset columns. For algorithm execution, the data must
be in float format. The Graduate Admission dataset is used here for testing purposes. The user can also add
new data to the existing dataset through the Django application.

The user can click Classification on the web page to calculate the Mean Absolute Error
(MAE), Mean Square Error (MSE), Accuracy and R2-Score for the algorithms.
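
For reference, the three regression metrics named above follow from their standard definitions and can be sketched in plain Python as shown below. The function names and the three sample values are illustrative assumptions. Accuracy is left out because the text does not define how it is computed for a numeric target.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 minus the ratio of the residual sum of squares to the total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical chances of admission: actual values and model predictions.
actual = [0.50, 0.70, 0.90]
predicted = [0.60, 0.70, 0.80]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
r2 = r2_score(actual, predicted)
print(f"MAE={mae:.4f} MSE={mse:.4f} R2={r2:.2f}")
```

Lower MAE and MSE and an R2-Score closer to 1 indicate predictions that are closer to the actual chances of admission.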

The user can click Prediction on the web page, enter the required details and view the
predicted result.

Admin:
The admin can log in with his login details. The admin can activate registered users; only after
activation can a user log in to the system. The admin can view the overall data in the browser.
When the admin clicks Results on the web page, the calculated Mean Square Error (MSE), Mean Absolute
Error (MAE), Accuracy and F1-Score for each of the algorithms are displayed.
Once all the algorithms have finished executing, the admin can see the overall accuracy on the web page.

Data Preprocessing:
A dataset can be viewed as a collection of data objects, which are also called records,
points, vectors, patterns, events, cases, samples, observations, or entities.
Data objects are described by a number of features that capture the basic characteristics of an object,
such as the mass of a physical object or the time at which an event occurred. Features are often
called variables, characteristics, fields, attributes, or dimensions.
The data preprocessing in this project uses techniques such as removal of noise in the data, removal
of records with missing information, modification of default values where relevant, and grouping of attributes for prediction
at various levels.
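
A minimal sketch of this kind of cleaning is given below, using only the standard library csv module. The function name load_clean_rows, the sample columns and the rule of dropping any row with a missing or non-numeric value are assumptions made for illustration. The sketch also converts every value to float, which the User module requires for algorithm execution.

```python
import csv
import io


def load_clean_rows(csv_text):
    """Parse CSV text, drop incomplete or non-numeric rows, and convert values to float."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for raw in reader:
        # Short rows yield None values and extra cells land under the key None.
        cleaned = {k.strip(): (v or "").strip() for k, v in raw.items() if k is not None}
        if any(value == "" for value in cleaned.values()):
            continue  # drop rows with missing information
        try:
            rows.append({k: float(v) for k, v in cleaned.items()})
        except ValueError:
            continue  # drop rows that contain non-numeric noise
    return rows


# Hypothetical sample: the second row has a missing value, the third a noisy one.
sample = (
    "GRE Score,TOEFL Score,CGPA,Chance of Admit\n"
    "337,118,9.65,0.92\n"
    "324,,8.87,0.76\n"
    "316,104,abc,0.72\n"
    "322,110,8.67,0.80\n"
)

rows = load_clean_rows(sample)
print(len(rows))  # 2
```

Only the first and last sample rows survive, each as a dictionary of floats keyed by column name.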

Machine learning:

Based on the split criterion, the cleansed data is split into 60% training and 40% test data. The
dataset is then subjected to machine learning classifiers such as Logistic Regression (LR), Support
Vector Machine (SVM), and Random Forest (RF). The accuracy and F1-score of each classifier are
calculated and displayed in the results. The classifier that achieves the highest accuracy is
taken as the best classifier.
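A minimal sketch of the 60/40 split and of choosing the best classifier by accuracy is given below. It uses only the Python standard library; the helper names, the fixed seed, and the label and prediction lists are hypothetical and stand in for the outputs of real trained models.

```python
import random


def split_dataset(rows, train_fraction=0.6, seed=42):
    """Shuffle the rows and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_classifier(scores):
    """Return the name of the classifier with the highest score."""
    return max(scores, key=scores.get)


train, test = split_dataset(range(500))

# Hypothetical test labels and predictions from three classifiers.
y_test = [1, 0, 1, 1, 0, 1, 0, 0]
predictions = {
    "LR": [1, 0, 1, 0, 0, 1, 0, 1],
    "SVM": [1, 0, 0, 0, 0, 1, 1, 1],
    "RF": [1, 0, 1, 1, 0, 1, 0, 1],
}
scores = {name: accuracy(y_test, y_pred) for name, y_pred in predictions.items()}
winner = best_classifier(scores)
```

With these made-up predictions the random forest has the highest accuracy, so it would be reported as the best classifier.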

2.2.5 OTHER NON-FUNCTIONAL REQUIREMENTS.

2.2.5.1 NON-FUNCTIONAL REQUIREMENTS:

These are the quality constraints that the system must satisfy according to the project
contract. The priority or extent to which these factors are implemented varies from one project to
another. They are also called non-behavioural requirements.

They basically deal with issues like:

• Portability

• Security

• Maintainability

• Performance

• Reusability

• Flexibility

Portability:

The application can be used on any operating system. Portability saves time.

Security:

The web application is secured by a registration page and a login page, and a security question is
asked whenever a user forgets the password.

Maintainability:

The system can be maintained easily because it does not require specialized skills. The user only
needs to input the files to get the desired output.

Usability:

Usability is how easily the system supports the user's interactions with the inputs and outputs of the
application. The website is user-friendly, and its features make it easy to use.

Availability:

Our web application is available to any user who provides the data and wants to predict the chance of
getting admission to a university.

Throughput:

The total time taken by the system to take the input, process it, and produce output based on the
given input.

CONCLUSION:
This SRS has given the details of the application that needs to be built.

CHAPTER 3

ANALYSIS
3.1 INTRODUCTION:

Unified Modelling Language


The Unified Modelling Language allows the software engineer to express an analysis model
using a modelling notation that is governed by a set of syntactic, semantic, and pragmatic rules.

UML stands for Unified Modelling Language. UML is a standardized general-purpose
modelling language in the field of object-oriented software engineering. The standard is managed,
and was created by, the Object Management Group.

The goal is for UML to become a common language for creating models of object-oriented
computer software.

The UML is a language for

• Visualizing
• Specifying
• Constructing
• Documenting

Applications
The UML is intended primarily for software-intensive systems.

It has been used effectively for such domains as

• Enterprise information systems
• Banking and financial services
• Telecommunications
• Defense / aerospace
• Retail
• Medical electronics
• Scientific
• Distributed Web-based services

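To understand the UML, you need to form a conceptual model of the language, and this
requires learning three major elements: the UML's basic building blocks, the rules that dictate how
those building blocks may be put together, and some common mechanisms that apply throughout the UML.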

Building Blocks of the UML
The vocabulary of the UML encompasses three kinds of building blocks:

• Things
• Relationships
• Diagrams

Things are the abstractions that are first-class citizens in a model; relationships tie these things
together; diagrams group interesting collections of things.

Things in the UML


There are four kinds of things in the UML

• Structural things
• Behavioural things
• Grouping things
• Annotational things

Structural Things

Nouns that depict the static parts of a model are termed structural things. They represent
the physical and conceptual components. They include the class, object, interface, node, collaboration,
component, and use case.

Class:

A class is a set of similar things that outlines the functionality and properties of an object.
It can also represent an abstract class whose functionality is not defined.

Object:

An object is an individual instance that describes the behaviour and the functions of a system. The
notation of the object is similar to that of the class; the only difference is that the object name is
always underlined.

Interface:

An interface is a set of operations that describes the functionality of a class; these operations are
implemented whenever the interface is implemented.

Collaboration:

It represents the interaction between things that is carried out to meet a goal. It is symbolized as
a dotted ellipse with its name written inside it.

Use case:

The use case is a core concept of object-oriented modelling. It portrays a set of actions executed
by a system to achieve a goal.

Actor:

It comes under the use case diagrams. It is an object that interacts with the system, for example,
a user.

Component:

It represents the physical part of the system.

Node:

A physical element that exists at run time.

Behavioural Things

They are the verbs that encompass the dynamic parts of a model. They depict the behaviour of a
system. They include state machines, activities, and interactions.

State Machine:

It defines the sequence of states that an entity goes through during its lifetime. It keeps a record
of the several distinct states of a system component.

Interaction:

It is used to envision the flow of messages between several components in a system.

Grouping Things

It is a mechanism that binds the elements of the UML model together. In UML, the package is the
only thing used for grouping.

Package:

A package is the only thing that is available for grouping behavioural and structural things.

Annotational Things

It is a mechanism that captures the remarks, descriptions, and comments of UML model
elements. In UML, a note is the only annotational thing.

Note:

It is used to attach the constraints, comments, and rules to the elements of the model. It is a
kind of yellow sticky note.

Relationships

Relationships illustrate the meaningful connections between things. They show the association
between entities and define the functionality of an application.

There are four types of relationships given below:

Dependency:

Dependency is a kind of relationship in which a change in the target element affects the source
element, or simply we can say the source element is dependent on the target element. It is one of the
most important notations in UML. It depicts the dependency from one entity to another.

It is denoted by a dashed line with an arrow at one end.

Association:

An association is a set of links that connects the entities of the UML model. It tells how many
elements are actually taking part in forming that relationship.

It is denoted by a solid line between the elements, optionally with arrowheads to show the direction
in which the relationship can be navigated.

Generalization:

It portrays the relationship between a general thing (a parent class or superclass) and a specific
kind of that thing (a child class or subclass). It is used to describe the concept of inheritance.

It is denoted by a solid line with a hollow arrowhead at one end, pointing to the parent.

Realization:

It is a semantic kind of relationship between two things, where one defines the behaviour to
be carried out, and the other one implements the mentioned behaviour. It exists in interfaces.

It is denoted by a dashed line with a hollow arrowhead at one end.
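The generalization and realization relationships described above map directly onto code. The following Python sketch is only an illustration; the class names (`Predictor`, `ThresholdPredictor`, `Person`, `Admin`) and the CGPA threshold are hypothetical and do not come from the project.

```python
from abc import ABC, abstractmethod


class Predictor(ABC):
    """Interface: declares the behaviour without implementing it."""

    @abstractmethod
    def predict(self, cgpa: float) -> float:
        ...


class ThresholdPredictor(Predictor):
    """Realization: implements the behaviour declared by Predictor."""

    def predict(self, cgpa: float) -> float:
        return 1.0 if cgpa >= 8.0 else 0.0


class Person:
    """General thing (parent class or superclass)."""

    def __init__(self, name: str):
        self.name = name

    def role(self) -> str:
        return "person"


class Admin(Person):
    """Generalization: a specific kind of Person (child class)."""

    def role(self) -> str:
        return "admin"


admin = Admin("root")
model = ThresholdPredictor()
```

Here `Admin` inherits `name` from `Person` and overrides `role`, while `Predictor` cannot be instantiated until a class such as `ThresholdPredictor` implements `predict`.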

The UML diagrams must include the following:

• Class diagram
• Interaction Diagram
• Use case Diagram
• State Diagram
• Activity Diagram
• Component Diagram
• Deployment Diagram

3.2 USE CASE DIAGRAM:


A use case diagram is a diagram that shows a set of use cases, actors, and their relationships. A use
case diagram is used to represent the dynamic behaviour of a system. It encapsulates the system's
functionality by incorporating use cases, actors, and their relationships. It models the tasks, services,
and functions required by a system or subsystem of an application. It depicts the high-level
functionality of a system and also shows how the user interacts with the system.

In UML there are five diagrams available to model the dynamic nature of a system, and the use case
diagram is one of them. Because the use case diagram is dynamic in nature, there must be some
internal or external agents that take part in the interaction. These internal and external agents are
known as actors. A use case diagram consists of actors, use cases, and their relationships. The diagram
is used to model the system or subsystem of an application. A single use case diagram captures a
particular functionality of a system.

1.USER

In the use case diagram (Fig 3.2), the system and the user are the actors, and Register, Login,
Machine learning, Dataset view, Prediction, Predicting results, and Logout are the use cases.

Registration:

Every new user needs to register themselves in the system with a unique name and email.

Login:

Registered users can log in to the system.

Machine learning:

The user can select the desired split ratio and can choose the required model to view the result
from the dataset.

Dataset View:

After logging in successfully, the user can view their uploaded dataset.

Prediction:

After the dataset is uploaded, the web application transforms the data into an understandable
form. The user can then obtain predicted outcomes from the system.

2.ADMIN

Login:

The registered admin can log in to the system.

Activate:

The admin can activate registered users. Only after the admin activates an account can the user
log in to the system.
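The activation rule above, where a registered user cannot log in until the admin activates the account, can be sketched as follows. This is a plain-Python illustration rather than the project's Django code; the `Account` and `AccountRegistry` names are hypothetical, and the password is kept in plain text only to keep the example short (a real application would store a hash).

```python
from dataclasses import dataclass


@dataclass
class Account:
    login_id: str
    password: str            # illustration only; store a hash in practice
    is_active: bool = False  # new registrations start inactive


class AccountRegistry:
    def __init__(self):
        self._accounts = {}

    def register(self, login_id, password):
        """Create an inactive account for a new user."""
        if login_id in self._accounts:
            raise ValueError("login id already registered")
        self._accounts[login_id] = Account(login_id, password)

    def activate(self, login_id):
        """Called by the admin to enable a registered user."""
        self._accounts[login_id].is_active = True

    def login(self, login_id, password):
        """Succeed only for activated accounts with a matching password."""
        account = self._accounts.get(login_id)
        return bool(account and account.is_active and account.password == password)


registry = AccountRegistry()
registry.register("raju", "secret")
before_activation = registry.login("raju", "secret")
registry.activate("raju")
after_activation = registry.login("raju", "secret")
```

The same login attempt fails before activation and succeeds afterwards, which is the behaviour the Activate use case describes.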

User details:

The admin can view the details that users have entered into the system. The admin can view the
overall data in the browser.

Results:

The admin can click Results on the web page.

FIG 3.2 USE CASE DIAGRAM

3.3 INTERACTION DIAGRAM:


An interaction diagram shows an interaction, consisting of a set of objects and their relationships,
including the messages that may be dispatched among them.

A sequence diagram is an interaction diagram that emphasizes the time ordering of messages.
Graphically, a sequence diagram is a table that shows objects arranged along the x-axis and messages,
ordered in increasing time, along the y-axis.

A Collaboration is a society of classes, interfaces, and other elements that work together to provide
some cooperative behavior that’s bigger than the sum of all its parts.

3.3.1 SEQUENCE DIAGRAM:
A sequence diagram is a diagram that shows object interactions arranged in time sequence. In
particular, it shows the objects participating in the interaction and the sequence of messages exchanged.

It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event
diagrams, event scenarios, and timing diagrams.

A sequence diagram shows, as parallel vertical lines (lifelines), different processes or
objects that live simultaneously, and, as horizontal arrows, the messages exchanged between them.

Sequence diagrams commonly contain the following:

➢ Objects
➢ Links
➢ Messages
Like all other diagrams, sequence diagrams may contain notes and constraints.

FIG 3.3.1 SEQUENCE DIAGRAM

The user registers with a valid email ID and password. Once the user has registered, the admin can
activate the user.

The user can select a dataset for preprocessing with techniques such as removal of noise in the data,
removal of missing values, and modification of default values. The machine learning models, such as
random forest and CatBoost, are then applied and a confusion matrix is produced for each. The user adds
data in the prediction form to generate a prediction, and the admin can view the prediction results.

3.3.2 COLLABORATION DIAGRAMS:


A collaboration diagram, also called a communication diagram or interaction diagram, is an
illustration of the relationships and interactions among software objects in the Unified Modelling
Language.

A sophisticated modelling tool can easily convert a collaboration diagram into a sequence diagram
and vice versa. Hence, the elements of a collaboration diagram are essentially the same as those of
a sequence diagram.

Collaboration diagrams commonly contain the following:

• Objects
• Links
• Messages

FIG 3.3.2 COLLABORATION DIAGRAMS

After the dataset is uploaded, the system requires some input parameters: the split size and the
machine learning model for analysing the uploaded data. These messages are exchanged between
the user and the system, and the diagram represents them as an organized system.

3.4 STATE DIAGRAM:


The state machine diagram, also called the state chart or state transition diagram, shows
the order of states undergone by an object within the system. It captures the software system's
behavior. It models the behavior of a class, a subsystem, a package, or a complete system.

It turns out to be an efficient way of modeling the interactions and collaborations between external
entities and the system. It models event-based systems to handle the state of an object. It also defines
several distinct states of a component within the system. Each object or component has a specific state.

FIG 3.4: STATE DIAGRAM

The user and the admin log in with a unique email ID and password. If the user logs in successfully,
the dataset is loaded; otherwise the user must log in again. The user adds data for processing, and the
admin views the user details and activates users. Finally, the results are viewed: the predicted
percentage chance of getting admission into the university.
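The login-and-predict flow described above can be written as a small state machine. The sketch below is an assumption about how the states might be named; the state and event names are hypothetical and are not read from Fig 3.4.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("logged_out", "login_ok"): "dataset_loaded",
    ("logged_out", "login_failed"): "logged_out",
    ("dataset_loaded", "add_data"): "processing",
    ("processing", "view_results"): "results_shown",
    ("results_shown", "logout"): "logged_out",
}


def next_state(state, event):
    """Return the state reached from `state` when `event` occurs."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state!r}"
        ) from None


def run(events, state="logged_out"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


final_state = run(["login_failed", "login_ok", "add_data", "view_results"])
```

A failed login leaves the object in the `logged_out` state, so the user has to log in again, while any event that is not listed for the current state is rejected.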

3.5 ACTIVITY DIAGRAM:
An activity diagram shows the flow from activity to activity. An activity is an ongoing
non-atomic execution within a state machine.

Activities ultimately result in some action, which is made up of executable atomic computations that
result in a change in state of the system or the return of a value.
An activity diagram shows the overall flow of control. Activity diagrams are constructed from a
limited repertoire of shapes, connected with arrows.

The most important shape types:

• Rounded rectangles represent activities

• Diamonds represent decisions

• A black circle represents the start (initial state) of the workflow, an encircled black circle
represents the end (final state).

Activity diagrams commonly contain:

• Activity states and action states
• Transitions
• Objects

FIG 3.5: ACTIVITY DIAGRAM

Our activity diagram starts from the initial state. From the initial state, the flow divides into two
activities, one for the admin and one for the user. A solid line with an arrow represents the direction
of flow of the activities.

The arrow points in the direction of the progressing activities. In the system, the dataset is split and
the model is trained. After training, the final activity is to predict results. The user registers or
logs in on the web page with his or her credentials and uploads the dataset file. After uploading the
file, the user selects the split size and the model.

After the data is analysed, the flow again divides into two activities: uploading the data and testing
the data. Once the data has been analysed using the training data (60%) and the testing data (40%), the
user enters the data for prediction. The user then views the prediction; the predicted results are
given by the system. After viewing the results, the user logs out.

CHAPTER 4
DESIGN

4.1 ARCHITECTURE

FIG 4.1: SYSTEM ARCHITECTURE

In User registration, every new user needs to register themselves in the system with a unique name
and e-mail. After registration the registered users can log into the system. After logging in
successfully, the user can upload and view their dataset.

Collecting data for training the ML model is the basic step in the machine learning pipeline. The
predictions made by ML systems can only be as good as the data on which they have been trained.

Data cleaning techniques, which may be manual or automated, remove data that was incorrectly added or
classified. The system takes the dataset, splits it, and trains the model to generate results. The
system then delivers the predicted results, which are displayed to the user.

4.2 DATA FLOW DIAGRAM:


1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to
represent a system in terms of the input data to the system, the various processing carried out on
this data, and the output data generated by the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is used to model
the system components. These components are the system process, the data used by the process,
an external entity that interacts with the system, and the information flows in the system.
3. The DFD shows how information moves through the system and how it is modified by a series
of transformations. It is a graphical technique that depicts information flow and the
transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction. A DFD may be partitioned
into levels that represent increasing information flow and functional detail.

FIG 4.2: DATA FLOW DIAGRAM

Project flow is a convenient way to define and plan projects. It helps link project budget and schedule
to project activities and tasks.
Traditionally it is designed in the form of a chart or diagram which is a great tool to visually represent
how a project is supposed to produce and deploy its product. Simple tree-like lists or hierarchies of
project activities are also used to map out and depict project flow.

4.3 CLASS DIAGRAM:


The class diagram is the main building block in object-oriented modelling. It is used
both for general conceptual modelling of the structure of the application, and for detailed modelling,
translating the models into programming code.

The classes in a class diagram represent both the main objects and interactions in the
application and the objects to be programmed. In the class diagram these classes are represented with
boxes which contain three parts:

• The upper part holds the name of the class
• The middle part contains the attributes of the class, and
• The bottom part gives the methods or operations the class can take or undertake

A class diagram is an illustration of the relationships and source code dependencies among classes
in the Unified Modelling Language.

In this context, a class defines the methods and variables in an object, which is a specific entity in a
program or the unit of code representing that entity.

FIG 4.3: CLASS DIAGRAM

In the above class diagram, admin, user, data pre-process, and models are the classes. Each
class has a collection of objects, and each object has attributes, methods, and a set of
behaviours. In the User class, login and password are the attributes, and load dataset, pre-process,
recommendations, etc. are the operations.
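The three parts of a class box (name, attributes, operations) correspond closely to a class definition in code. The sketch below models the User class in Python under assumptions: the method bodies and sample data are hypothetical, and only the attribute and operation names follow the description above.

```python
class User:
    """Class name: User. Attributes: login, password. Operations: methods below."""

    def __init__(self, login, password):
        self.login = login        # attribute
        self.password = password  # attribute
        self.dataset = []

    def load_dataset(self, rows):
        """Operation: store the uploaded rows and report how many were loaded."""
        self.dataset = list(rows)
        return len(self.dataset)

    def preprocess(self):
        """Operation: drop rows that contain missing (None) values."""
        self.dataset = [row for row in self.dataset if None not in row]
        return self.dataset


# Two objects of the same class, each with its own attribute values.
alice = User("alice", "pw1")
bob = User("bob", "pw2")
alice.load_dataset([(320, 8.9), (None, 7.5)])
alice.preprocess()
```

Both `alice` and `bob` share the operations defined by the class but hold separate state, which is what the class diagram means by a class having a collection of objects.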

4.4 COMPONENT DIAGRAM:


A component diagram is used to break down a large object-oriented system into smaller
components, so as to make them more manageable. It models the physical view of a system, such as
the executables, files, libraries, etc. that reside within a node.
A component is a single unit of the system, which is replaceable and executable. The implementation
details of a component are hidden, and it necessitates an interface to execute a function.
A component contains a set of collaborating classes. Each class within a component has been fully
elaborated to include all attributes and operations that are relevant to its implementation. As part of
the design elaboration, all interfaces (messages) that enable the classes to communicate and
collaborate with other design classes must also be defined.

The component diagram has two components:

• The system component

• The user component

1.System component:

The system component takes the dataset, splits it, trains the model, and predicts the results.

2.User component:

In the user component, the user registers or logs in by giving credentials such as name, email ID,
phone number, locality, address, city, and state. The user uploads the file, selects the split size,
and selects the model.

After viewing the results produced by the system, the user logs out.

FIG 4.4: COMPONENT DIAGRAM

4.5 DEPLOYMENT DIAGRAM:


A deployment diagram is a diagram that shows the configuration of run time processing nodes
and the components that live on them.

Graphically, a deployment diagram is a collection of vertices and arcs.

Contents

Deployment diagrams commonly contain the following things:

• Nodes
• Dependency and association relationships

Like all other diagrams, deployment diagrams may contain notes and constraints. Deployment
diagrams may also contain components, each of which must live on some node, and packages or
subsystems, both of which are used to group elements of your model into larger chunks.

FIG 4.5: DEPLOYMENT DIAGRAM

4.6 USER INTERFACE DESIGN FOR WELCOME PAGE:

4.6.1 USER INTERFACE DESIGN FOR USER REGISTRATION:

ADMISSION INTO A UNIVERSITY Home User Admin Registration

User register form


User name
Login id
Password
Mobile
Email
Locality
City
State
Address

Register

This is the design for user registration. The user fills in the fields on the registration page: user
name, login id, password, mobile, email, locality, city, state, and address.

4.6.2 USER INTERFACE DESIGN FOR USER HOME PAGE:

ADMISSION INTO A UNIVERSITY Home User Admin Registration

User Login Form

Enter Login Id

Password

LOGIN RESET

This is the design for the user to log in to his or her account after registration. The login page has
fields for the user to enter, namely the login id and password.
