Student Grade Prediction Python Full Document

The document summarizes a student grade prediction system that uses the C4.5 decision tree algorithm. It discusses collecting student academic data like tests and course grades and using C4.5 to analyze the data and predict students' final grades and grade point averages. The system aims to help schools, colleges, and other educational institutions by accurately predicting student performance based on past academic results.

Uploaded by

Sai Machavarapu

CHAPTER-1

INTRODUCTION
STUDENT GRADE PREDICTION USING C4.5 ALGORITHM

ABSTRACT
A system is designed to predict a student's final grade based on the grades
scored during previous courses and years. To predict the grade, the system needs
data to analyze: the input is the student's basic information and previous
academic records, from which the grade is predicted.
The system generates a report containing the grade prediction produced
by the C4.5 algorithm. This system can be used in schools, colleges and other
educational institutes.
CHAPTER-2
SYSTEM ANALYSIS
EXISTING SYSTEM
 Assessing student performance is essential in an educational environment;
it is influenced by many qualitative attributes, such as student identity,
gender, age, specialty, lower-class grade, higher-class grade, extra
knowledge or skills, resources, attendance, time spent studying, internal
class-test grade, seminar performance, lab work, quizzes, and overall
previous-semester exam marks, which are included when forming the data set.
 The existing system cannot represent students' performance grade-wise.

DISADVANTAGES

 It cannot discriminate among levels of student performance.

 There is no grade prediction for the student.

PROPOSED SYSTEM

The proposed system overcomes the limitations of the existing system: students'
performance is represented in terms of grades. Using the proposed system, the
user collects data about each student. In our project, we mainly consider the
subject grades a student obtained in previous semesters.

ADVANTAGES

 The student's grade is predicted.

 Accurate and efficient.

 User-friendly.
MODULES DESCRIPTION

 Admin
 Grade and SGPA prediction
 Graph

ADMIN

The admin collects each individual student's data based on two parameters:
internal marks and external marks.

GRADE AND SGPA PREDICTION

Based on the data collected by the admin, the student's grade and SGPA are
predicted.
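The grade and SGPA computation can be sketched in Python. The grade boundaries, credit weights, and the 30/70 internal/external mark split below are illustrative assumptions, not the system's actual rules:

```python
# Hypothetical sketch of the Grade and SGPA prediction step.
# Grade boundaries and credits are illustrative assumptions.

def letter_grade(internal, external):
    """Map total marks (out of 100) to a letter grade and grade points."""
    total = internal + external
    if total >= 90: return "O", 10
    if total >= 80: return "A", 9
    if total >= 70: return "B", 8
    if total >= 60: return "C", 7
    if total >= 50: return "D", 6
    return "F", 0

def sgpa(subjects):
    """SGPA = sum(credits * grade points) / sum(credits)."""
    pts = sum(c * letter_grade(i, e)[1] for c, i, e in subjects)
    creds = sum(c for c, _, _ in subjects)
    return round(pts / creds, 2)

# Each subject: (credits, internal marks, external marks)
marks = [(4, 28, 65), (3, 25, 50), (3, 22, 40)]
print(sgpa(marks))  # 8.5
```

The same two inputs the admin collects (internal and external marks) drive both the grade and the SGPA here.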

GRAPH

After the admin has collected the data and the grade and SGPA have been
predicted, a graph is generated based on the grades.
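As a sketch, the grade-wise counts that feed the graph can be computed with the standard library; the grade list is illustrative, and the plotting call shown in the comment assumes matplotlib is available:

```python
from collections import Counter

# Illustrative list of predicted grades for a class of students.
predicted_grades = ["A", "B", "A", "O", "B", "C", "A"]

# Count how many students fall into each grade.
counts = Counter(predicted_grades)
print(dict(counts))

# A bar chart could then be drawn from these counts, e.g.:
#   import matplotlib.pyplot as plt
#   plt.bar(counts.keys(), counts.values())
#   plt.show()
```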

ARCHITECTURE DIAGRAM

(Architecture: the user collects and stores student data; the system predicts
the student's grade and the student's SGPA.)
C4.5 DECISION TREE
 C4.5 is an algorithm used to generate a decision tree developed by Ross
Quinlan. It is an extension of Quinlan's earlier ID3 algorithm. The decision
trees generated by C4.5 can be used for classification, and for this reason, it
is often referred to as a statistical classifier.
 Handles both continuous and discrete attributes - In order to handle
continuous attributes, C4.5 creates a threshold and then splits the list into
those whose attribute value is above the threshold and those that are less
than or equal to it.
 Handles training data with missing attribute values - C4.5 allows attribute
values to be marked as ? for missing. Missing attribute values are simply not
used in gain and entropy calculations.
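The gain-ratio criterion C4.5 uses to choose a split can be sketched in pure Python; the toy data set and attribute layout below are illustrative assumptions, not the project's actual data:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, attr_index, labels):
    """Information gain of splitting on attr_index, divided by split info."""
    base = entropy(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    n = len(labels)
    gain = base - sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[attr_index] for row in rows])
    return gain / split_info if split_info else 0.0

# (previous-semester grade, attendance) -> pass/fail (toy data)
rows = [("A", "high"), ("A", "low"), ("C", "high"), ("C", "low")]
labels = ["pass", "pass", "fail", "fail"]
print(gain_ratio(rows, 0, labels))  # splitting on grade separates perfectly
```

For a continuous attribute, C4.5 would additionally try candidate thresholds and evaluate the same criterion on the two resulting groups.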
CHAPTER-3
REQUIREMENT
PRELIMINARY INVESTIGATION

The development of a project starts from an initial idea; here, a system that
predicts students' grades from their past academic results. When the idea is
approved by the organization and our project guide, the first activity, i.e.
preliminary investigation, begins. The activity has three parts:

 Request Clarification

 Feasibility Study

 Request Approval

REQUEST CLARIFICATION

After the request is approved by the organization and the project guide, and an
investigation is taken up, the project request must be examined to determine
precisely what the system requires.

Here our project is basically meant for users within the organization, whose
systems can be interconnected by a Local Area Network (LAN). In today's busy
schedule, people expect everything to be provided in a ready-made manner. Taking
into consideration the vast use of the network in day-to-day life, the
corresponding development of the portal came into existence.
FEASIBILITY ANALYSIS

An important outcome of preliminary investigation is the determination that the
system request is feasible. This is possible only if it is feasible within the
limited resources and time. The different feasibilities that have to be
analyzed are:

 Operational Feasibility
 Economic Feasibility
 Technical Feasibility

Operational Feasibility
Operational feasibility deals with the study of the prospects of the system to
be developed. This system operationally eliminates all the tensions of the
admin and helps him in effectively tracking the project's progress. This kind
of automation will surely reduce the time and energy previously consumed in
manual work. Based on the study, the system is proved to be operationally
feasible.

Economic Feasibility

Economic feasibility, or cost-benefit analysis, is an assessment of the
economic justification for a computer-based project. As the hardware was
installed from the beginning and serves many purposes, the hardware cost of the
project is low. Since the system is network based, any number of employees
connected to the LAN within the organization can use this tool at any time.
The Virtual Private Network is to be developed using the existing resources of
the organization, so the project is economically feasible.
Technical Feasibility
According to Roger S. Pressman, technical feasibility is the assessment of the
technical resources of the organization. The organization needs compatible
machines with a graphical web browser connected to the Internet and intranet.
The system is developed for a platform-independent environment using Python and
its supporting libraries. The technical feasibility study has been carried out;
the system is technically feasible for development and can be developed with
the existing facilities.

SYSTEM REQUIREMENTS

HARDWARE REQUIREMENTS

 Processor - Intel Core i3
 RAM - 4 GB (minimum)
 Hard Disk - 320 GB

SOFTWARE REQUIREMENTS

 Operating System : Windows 7


 Coding Language : Python

1) pip

2) json

3) chefboost

4) pandas

5) numpy
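The third-party packages above could be recorded in a requirements.txt file (pip itself is the installer and json ships with Python, so neither needs to be listed; the package names are the assumed PyPI ones):

```text
chefboost
pandas
numpy
```

Installing is then a single command: pip install -r requirements.txt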
CHAPTER-4
DESIGN
Introduction to UML
UML is a method for describing the system architecture in detail using a
blueprint. UML represents a collection of best engineering practices that have
proven successful in the modeling of large and complex systems. The UML is a
very important part of developing object-oriented software and the software
development process. The UML uses mostly graphical notations to express the
design of software projects. Using the UML helps project teams communicate,
explore potential designs, and validate the architectural design of the
software.
Use Case Diagram
A use case diagram represents the functionality of the system. Use cases focus
on the behavior of the system from an external point of view. Actors are
external entities that interact with the system.

Use cases:

A use case describes a sequence of actions that provide something of measurable
value to an actor and is drawn as a horizontal ellipse.

Actors:
An actor is a person, organization, or external system that plays a role in one or
more interactions with the system.
System boundary boxes (optional):
A rectangle, called the system boundary box, is drawn around the use cases to
indicate the scope of the system. Anything within the box represents
functionality that is in scope, and anything outside the box is not.

Four relationships among use cases are used often in practice.


Include:
In one form of interaction, a given use case may include another. Include is a
directed relationship between two use cases, implying that the behavior of the
included use case is inserted into the behavior of the including use case.
The first use case often depends on the outcome of the included use case. This
is useful for extracting truly common behaviors from multiple use cases into a
single description. The notation is a dashed arrow from the including to the
included use case, with the label "«include»". There are no parameters or
return values. To specify the location in a flow of events at which the base
use case includes the behavior of another, you simply write "include" followed
by the name of the use case you want to include.
Extend:
In another form of interaction, a given use case (the extension) may extend
another. This relationship indicates that the behavior of the extension use case may
be inserted in the extended use case under some conditions. The notation is a
dashed arrow from the extension to the extended use case, with the label
"«extend»". Modelers use the «extend» relationship to indicate use cases that are
"optional" to the base use case.
Generalization:
In the third form of relationship among use cases, a generalization/specialization
relationship exists. A given use case may have common behaviors, requirements,
constraints, and assumptions with a more general use case. In this case,
describe them once, and deal with them in the same way, describing any
differences in the specialized cases. The notation is a solid line ending in a
hollow triangle drawn from the specialized to the more general use case
(following the standard generalization notation).
Associations:
Associations between actors and use cases are indicated in use case diagrams by
solid lines. An association exists whenever an actor is involved with an interaction
described by a use case. Associations are modeled as lines connecting use cases
and actors to one another, with an optional arrowhead on one end of the line.
The arrowhead is often used to indicate the direction of the initial invocation
of the relationship or to indicate the primary actor within the use case.

Identified Use Cases

The "user model view" encompasses a problem and solution from the
perspective of those individuals whose problem the solution addresses. The view
presents the goals and objectives of the problem owners and their requirements
of the solution. This view is composed of "use case diagrams". These diagrams
describe the functionality provided by a system to external interactors. These
diagrams contain actors, use cases, and their relationships.

Class Diagram
Class-based Modeling, or more commonly class-orientation, refers to the style of
object-oriented programming in which inheritance is achieved by defining classes
of objects; as opposed to the objects themselves (compare Prototype-based
programming).

The most popular and developed model of OOP is a class-based model, as opposed
to an object-based model. In this model, objects are entities that combine state (i.e.,
data), behavior (i.e., procedures, or methods) and identity (unique existence among
all other objects). The structure and behavior of an object are defined by a class,
which is a definition, or blueprint, of all objects of a specific type. An object must
be explicitly created based on a class and an object thus created is considered to be
an instance of that class. An object is similar to a structure, with the addition of
method pointers, member access control, and an implicit data member which
locates instances of the class (i.e. actual objects of that class) in the class hierarchy
(essential for runtime inheritance features).

Sequence Diagram

A sequence diagram in the Unified Modeling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in
what order. It is a construct of a Message Sequence Chart.

Sequence diagrams are sometimes called event diagrams, event scenarios,
and timing diagrams. A sequence diagram shows, as parallel vertical lines
(lifelines), different processes or objects that live simultaneously, and, as
horizontal arrows, the messages exchanged between them, in the order in which
they occur. This allows the specification of simple runtime scenarios in a graphical
manner. If the lifeline is that of an object, it demonstrates a role. Note that leaving
the instance name blank can represent anonymous and unnamed instances. In order
to display interaction, messages are used. These are horizontal arrows with the
message name written above them. Solid arrows with full heads are synchronous
calls, solid arrows with stick heads are asynchronous calls and dashed arrows with
stick heads are return messages. This definition is true as of UML 2, considerably
different from UML 1.x.

Activation boxes, or method-call boxes, are opaque rectangles drawn on top of
lifelines to represent that processes are being performed in response to the
message (Execution Specifications in UML).

Objects calling methods on themselves use messages and add new activation
boxes on top of any others to indicate a further level of processing. When an object
is destroyed (removed from memory), an X is drawn on top of the lifeline, and the
dashed line ceases to be drawn below it (this is not the case in the first example
though). It should be the result of a message, either from the object itself, or
another.

A message sent from outside the diagram can be represented by a message
originating from a filled-in circle (a found message in UML) or from a border
of the sequence diagram (a gate in UML).

COLLABORATION DIAGRAM:
A Sequence diagram is dynamic, and, more importantly, is time ordered. A
Collaboration diagram is very similar to a Sequence diagram in the purpose it
achieves; in other words, it shows the dynamic interaction of the objects in a
system. A distinguishing feature of a Collaboration diagram is that it shows the
objects and their association with other objects in the system apart from how they
interact with each other. The association between objects is not represented in a
Sequence diagram.

A collaboration diagram is easily represented by modeling objects in a system
and representing the associations between the objects as links. The interaction
between the objects is denoted by arrows. To identify the sequence of
invocation of these objects, a number is placed next to each of these arrows.

Defining a Collaboration Diagram:


A sophisticated modeling tool can easily convert a collaboration diagram into a
sequence diagram and vice versa. Hence, the elements of a collaboration diagram
are essentially the same as those of a sequence diagram.

Activity Diagram

Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and concurrency. In
the Unified Modeling Language, activity diagrams can be used to describe the
business and operational step-by-step workflows of components in a system. An
activity diagram shows the overall flow of control.

Activity diagrams are constructed from a limited repertoire of shapes,
connected with arrows. The most important shape types are:

 rounded rectangles represent activities;
 diamonds represent decisions;
 bars represent the start (split) or end (join) of concurrent activities;
 a black circle represents the start (initial state) of the workflow;
 an encircled black circle represents the end (final state).
Arrows run from the start towards the end and represent the order in which
activities happen. However, the join and split symbols in activity diagrams only
resolve this for simple cases; the meaning of the model is not clear when they are
arbitrarily combined with the decisions or loops.

State Chart Diagram

Objects have behaviors and states. The state of an object depends on its current
activity or condition. A state chart diagram shows the possible states of the object
and the transitions that cause a change in state. A state diagram, also called a state
machine diagram or state chart diagram, is an illustration of the states an object can
attain as well as the transitions between those states in the Unified Modeling
Language. A state diagram resembles a flowchart in which the initial state is
represented by a large black dot and subsequent states are portrayed as boxes with
rounded corners.

(State chart: register → login → view & update internal marks → view & update
external marks → view & update lab marks → predict student grade → predict
student SGPA → visualize grade-wise graph)

There may be one or two horizontal lines through a box, dividing it into stacked
sections. In that case, the upper section contains the name of the state, the middle
section (if any) contains the state variables and the lower section contains the
actions performed in that state. If there are no horizontal lines through a box, only
the name of the state is written inside it. External straight lines, each with an arrow
at one end, connect various pairs of boxes. These lines define the transitions
between states. The final state is portrayed as a large black dot with a circle around
it. Historical states are denoted as circles with the letter H inside.

Component Diagram
COMPONENT LEVEL CLASS DESIGN

This chapter discusses the portion of the software development process where
the design is elaborated and the individual data elements and operations are
designed in detail. First, different views of a "component" are introduced.
Guidelines for the design of object-oriented and traditional (conventional)
program components are then presented.

What is a Component?

This section defines the term component and discusses the differences between
object-oriented, traditional, and process-related views of component-level
design. The Object Management Group (OMG) UML specification defines a component
as "… a modular, deployable, and replaceable part of a system that encapsulates
implementation and exposes a set of interfaces."

An Object Oriented View

A component contains a set of collaborating classes. Each class within a
component has been fully elaborated to include all attributes and operations
that are relevant to its implementation. As part of the design elaboration, all
interfaces (messages) that enable the classes to communicate and collaborate
with other design classes must also be defined. To accomplish this, the
designer begins with the analysis model and elaborates analysis classes (for
components that relate to the problem domain) and infrastructure classes
(components that provide support services for the problem domain).

(Component diagram: user ↔ database)
Deployment Diagram:

Deployment diagrams are used to visualize the topology of the physical
components of a system, where the software components are deployed.

So deployment diagrams are used to describe the static deployment view of a
system. Deployment diagrams consist of nodes and their relationships.

Purpose:

The name deployment itself describes the purpose of the diagram. Deployment
diagrams are used for describing the hardware components on which software
components are deployed. Component diagrams and deployment diagrams are closely
related.

Component diagrams are used to describe the components, and deployment
diagrams show how they are deployed on hardware.

UML is mainly designed to focus on software artifacts of a system. But these two
diagrams are special diagrams used to focus on software components and hardware
components.

So most of the UML diagrams are used to handle logical components but
deployment diagrams are made to focus on hardware topology of a system.
Deployment diagrams are used by the system engineers.

The purpose of deployment diagrams can be described as:

 Visualize the hardware topology of a system.
 Describe the hardware components used to deploy software components.
 Describe runtime processing nodes.

How to draw a Deployment Diagram?

A deployment diagram represents the deployment view of a system. It is related
to the component diagram, because the components are deployed using the
deployment diagrams. A deployment diagram consists of nodes; nodes are nothing
but the physical hardware used to deploy the application.

Deployment diagrams are useful for system engineers. An efficient deployment
diagram is very important because it controls the following parameters:

 Performance

 Scalability

 Maintainability

 Portability

So before drawing a deployment diagram, the following artifacts should be
identified:

 Nodes

 Relationships among nodes

The following deployment diagram is a sample to give an idea of the deployment
view of an order management system. Here we have shown nodes as:

 Monitor
 Modem

 Caching server

 Server

The application is assumed to be a web-based application deployed in a
clustered environment using server 1, server 2 and server 3. The user connects
to the application using the Internet. Control flows from the caching server to
the clustered environment.

So the following deployment diagram has been drawn considering all the points
mentioned above:

INPUT AND OUTPUT DESIGN

INPUT DESIGN

The input design is the link between the information system and the user. It
comprises developing specifications and procedures for data preparation, the
steps necessary to put transaction data into a usable form for processing. This
can be achieved by having the computer read data from a written or printed
document, or by having people key the data directly into the system. The design
of input focuses on controlling the amount of input required, controlling
errors, avoiding delay, avoiding extra steps and keeping the process simple.
The input is designed in such a way that it provides security and ease of use
while retaining privacy. Input design considered the following things:

 What data should be given as input?
 How should the data be arranged or coded?
 The dialog to guide the operating personnel in providing input.
 Methods for preparing input validations, and steps to follow when errors
occur.

OBJECTIVES

 Input design is the process of converting a user-oriented description of the
input into a computer-based system. This design is important to avoid errors
in the data input process and to show the correct direction to the management
for getting correct information from the computerized system.
 It is achieved by creating user-friendly screens for data entry that can
handle large volumes of data. The goal of designing input is to make data
entry easier and free from errors. The data entry screen is designed in such a
way that all data manipulations can be performed. It also provides
record-viewing facilities.
 When data is entered, it is checked for validity. Data can be entered with
the help of screens. Appropriate messages are provided as needed, so that the
user is not left in a maze. Thus the objective of input design is to create an
input layout that is easy to follow.
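A minimal sketch of such input validation for the marks-entry screen, assuming an illustrative 30/70 internal/external mark split (not necessarily the system's real ranges):

```python
# Hypothetical validation for the marks-entry screen.
# The mark ranges (0-30 internal, 0-70 external) are assumptions.

def validate_marks(internal, external):
    """Return a list of error messages; an empty list means valid input."""
    errors = []
    if not (0 <= internal <= 30):
        errors.append("Internal marks must be between 0 and 30.")
    if not (0 <= external <= 70):
        errors.append("External marks must be between 0 and 70.")
    return errors

print(validate_marks(25, 60))   # []  (valid input)
print(validate_marks(40, 80))   # two error messages
```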

OUTPUT DESIGN
A quality output is one which meets the requirements of the end user and
presents the information clearly. In any system, the results of processing are
communicated to the users and to other systems through outputs. In output
design, it is determined how the information is to be displayed for immediate
need, as well as the hard-copy output. It is the most important and direct
source of information for the user. Efficient and intelligent output design
improves the system's relationship with the user and helps in decision-making.

 Designing computer output should proceed in an organized, well-thought-out
manner; the right output must be developed while ensuring that each output
element is designed so that people will find the system easy to use. When
analysts design computer output, they should identify the specific output that
is needed to meet the requirements.
 Select methods for presenting information.
 Create document, report, or other formats that contain information produced
by the system.

The output form of an information system should accomplish one or more of the
following objectives:

 Convey information about past activities, current status, or projections of
the future.
 Signal important events, opportunities, problems, or warnings.
 Trigger an action.
 Confirm an action.
CHAPTER-5
IMPLEMENTATION
FUNCTIONAL REQUIREMENTS

In software engineering, a functional requirement defines a function of a
software system or its component. A function is described as a set of inputs,
the behavior, and outputs. Functional requirements may be
calculations, technical details, data manipulation and processing and other specific
functionality that define what a system is supposed to accomplish. Behavioral
requirements describing all the cases where the system uses the functional
requirements are captured in use cases. Generally, functional requirements are
expressed in the form “system shall do <requirement>”. The plan for
implementing functional requirements is detailed in the system design. In
requirements engineering, functional requirements specify particular results of a
system. Functional requirements drive the application architecture of a system. A
requirements analyst generates use cases after gathering and validating a set of
functional requirements. The hierarchy of functional requirements is:
user/stakeholder request -> feature -> use case -> business rule.

In this project, the functional requirements cover the technical details, data
manipulation, and other specific functionality needed to provide information to
the user.

Non Functional Requirements

In systems engineering and requirements engineering, a non-functional
requirement is a requirement that specifies criteria that can be used to judge
the operation of a system, rather than specific behaviors.

The project's non-functional requirements include the following:

 Updating work status.
 Problem resolution.
 Error occurrence in the system.
 Customer requests.

Availability: A system's "availability" or "uptime" is the amount of time that
it is operational and available for use. As our system may be used by thousands
of users at any time, it must always be available. If there are any updates,
they must be performed within a short interval of time without interrupting the
normal services made available to the users.

Efficiency: Specifies how well the software utilizes scarce resources: CPU
cycles, disk space, memory, bandwidth, etc. All of the above-mentioned
resources can be used effectively by performing most of the validations on the
client side and reducing the workload on the server.

Flexibility: If the organization intends to increase or extend the
functionality of the software after it is deployed, that should be planned from
the beginning; it influences choices made during the design, development,
testing and deployment of the system. New modules can be easily integrated into
our system without disturbing the existing modules or modifying the logical
database schema of the existing applications.

Portability: Portability specifies the ease with which the software can be
installed on all necessary platforms, and the platforms on which it is expected
to run. By using the appropriate interpreter versions released for different
platforms, our project can be operated on any operating system and hence can be
considered highly portable.

Scalability: Software that is scalable has the ability to handle a wide variety of
system configuration sizes. The nonfunctional requirements should specify the
ways in which the system may be expected to scale up (by increasing hardware
capacity, adding machines etc.). Our system can be easily expandable. Any
additional requirements such as hardware or software which increase the
performance of the system can be easily added. An additional server would be
useful to speed up the application.
Integrity: Integrity requirements define the security attributes of the system,
restricting access to features or data to certain users and protecting the privacy of
data entered into the software. Certain features access must be disabled to normal
users such as adding the details of files, searching etc which is the sole
responsibility of the server. Access can be disabled by providing appropriate logins
to the users for only access.

Usability: Ease-of-use requirements address the factors that constitute the capacity
of the software to be understood, learned, and used by its intended users. Hyper
links will be provided for each and every service the system provides through
which navigation will be easier. A system that has high usability coefficient makes
the work of the user easier.

Performance: The performance constraints specify the timing characteristics of
the software.
CHAPTER-6
TECHNOLOGIES
About Python
Python is an open source, high-level programming language developed by Guido
van Rossum in the late 1980s and presently administered by Python Software
Foundation. It came from the ABC language that he helped create early on in his
career. Python is a powerful language that you can use to create games, write
GUIs, and develop web applications.

Python is a high-level language: reading and writing code in Python is much
like reading and writing regular English statements. Because they are not
written in machine-readable language, Python programs need to be processed
before machines can run them. Python is an interpreted language. This means
that every time a program is run, its interpreter runs through the code and
translates it into machine-readable byte code.
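This compile-then-interpret behaviour can be observed from within Python itself, using the standard library's dis module to inspect the byte code of a small function:

```python
import dis

def add(a, b):
    return a + b

# By the time the function object exists, Python has already compiled
# it to byte code; dis disassembles that byte code for inspection.
dis.dis(add)

# The raw byte code lives on the function's code object.
print(type(add.__code__.co_code))  # <class 'bytes'>
```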

Python is an object-oriented language that allows users to manage and control data
structures or objects to create and run programs. Everything in Python is, in fact,
first class. All objects, data types, functions, methods, and classes take equal
position in Python. Programming languages are created to satisfy the needs of
programmers and users for an effective tool to develop applications that impact
lives, lifestyles, economy, and society. They help make lives better by increasing
productivity, enhancing communication, and improving efficiency. Languages die
and become obsolete when they fail to live up to expectations and are replaced and
superseded by languages that are more powerful. Python is a programming
language that has stood the test of time and has remained relevant across industries
and businesses and among programmers, and individual users. It is a living,
thriving, and highly useful language that is highly recommended as a first
programming language for those who want to dive into and experience
programming.

Advantages of Using Python

Here are reasons why you would prefer to learn and use Python over other high-level languages:

Readability

Python programs use clear, simple, and concise instructions that are easy to read
even by those who have no substantial programming background. Programs
written in Python are, therefore, easier to maintain, debug, or enhance.
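A short illustration of the readability described above: computing the average of a student's passing marks reads almost like the English description of the task (the marks below are invented for the example):

```python
# Marks for one student; compute the average of the passing scores.
marks = [78, 42, 91, 65, 30]
passing = [m for m in marks if m >= 50]
average = sum(passing) / len(passing)
print(average)  # 78.0
```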

Higher productivity

Code written in Python is considerably shorter, simpler, and less verbose than in other high-level programming languages such as Java and C++. In addition, Python has well-designed built-in features and a standard library, as well as access to third-party modules and source libraries. These features make programming in Python more efficient.

Less learning time

Python is relatively easy to learn. Many find Python a good first language for
learning programming because it uses simple syntax and shorter codes. Python
works on Windows, Linux/UNIX, Mac OS X, other operating systems and small
form devices. It also runs on microcontrollers used in appliances, toys, remote
controls, embedded devices, and other similar devices.

Installing Python in Windows

To install Python, first download the installation package of your preferred version from https://www.python.org/downloads/. On this page, you will be asked to choose between the two latest versions for Python 2 and 3: Python 3.5.1 and Python 2.7.11. Alternatively, if you are looking for a specific release, you can scroll down the page to find download links for earlier versions. You would normally opt to download the latest version of Python 3 (3.5.1, released on December 7, 2015), but you may instead opt for the latest version of Python 2 (2.7.11). Your choice will usually depend on which version is most usable for your project. While Python 3 is the present and future of the language, issues such as third-party utility or compatibility may require you to download Python 2.
3.4.3 PyCharm

PyCharm is the most popular IDE for Python. It is created by the Czech company JetBrains, which focuses on building integrated development environments for web development languages such as JavaScript and PHP. PyCharm offers some of its best features to users and developers in the following aspects:

• Code completion and inspection.

• Advanced debugging.
• Support for web programming and frameworks such as Django and Flask.

Features of PyCharm

Besides, a developer will find PyCharm comfortable to work with because of the
features mentioned below −

Code Completion

PyCharm enables smoother code completion, whether for a built-in or an external package.

SQLAlchemy as Debugger

You can set a breakpoint, pause in the debugger, and see the SQL representation of the user expression for SQL language code.

Git Visualization in Editor

When coding in Python, queries are normal for a developer. You can check the last commit easily in PyCharm, as it has blue sections that mark the difference between the last commit and the current state.

Code Coverage in Editor

You can run .py files outside the PyCharm editor as well, viewing code-coverage details elsewhere in the project tree, in the summary section, etc.

Package Management
All the installed packages are displayed with proper visual representation. This
includes list of installed packages and the ability to search and add new packages.

Local History

Local History keeps track of changes in a way that complements a version-control system like Git. Local history in PyCharm gives complete details of what needs to be rolled back and what is to be added.

Refactoring

Refactoring is the process of renaming one or more files at a time and PyCharm
includes various shortcuts for a smooth refactoring process.

3.4.4 Wamp Server

WAMPs are packages of independently created programs installed on computers that use a Microsoft Windows operating system. Apache is a web server. MySQL is an open-source database. PHP is a scripting language that can manipulate information held in a database and generate web pages dynamically each time content is requested by a browser. Other programs may also be included in a package, such as phpMyAdmin, which provides a graphical user interface for the MySQL database manager, or the alternative scripting languages Python or Perl.
Installation on Windows

Visit https://www.python.org/downloads/ to download the latest release of Python. In this process, we will install Python 3.6.7 on our Windows operating system.

Double-click the downloaded executable file; the following window will open. Select Customize installation and proceed.


Jupyter notebook

The Jupyter Notebook is an open-source web application that you can use to create and share documents that contain live code, equations, visualizations, and text. Jupyter Notebook is maintained by the people at Project Jupyter.

Jupyter Notebooks are a spin-off from the IPython project, which used to have an IPython Notebook project of its own. The name Jupyter comes from the core programming languages it supports: Julia, Python, and R. Jupyter ships with the IPython kernel, which allows you to write your programs in Python, but there are currently over 100 other kernels that you can also use.

The Jupyter Notebook is not included with Python, so if you want to try it out, you will need to install Jupyter.

There are many distributions of the Python language; for the purposes of installing Jupyter Notebook, the most popular is CPython, the reference version of Python available from the python.org website. It is also assumed that you are using Python.

Installation steps for python

 Download the latest Python 3.x version.

 Open the executable file and check "Add Python 3.x to PATH". Then click the Install Now button. It will show the installation progress.

 After the installation completes, click the Close button to finish.

 Now Python 3.x is installed. Open the command prompt and type python -V to verify.

PIP

PIP is a package manager for Python packages or modules. If your Python version is 3.4 or later, PIP is included by default. Downloading a package using PIP is very easy: open the command-line interface and tell PIP to download the package. Once the package is installed, it is ready to use.

Ex: pip install pandas

JSON

Python has a built-in package called json, which can be used to work with JSON data. JSON is a syntax for storing and exchanging data. It is text, written with JavaScript Object Notation. We can convert a JSON string to Python objects and vice versa.

Ex: json.dumps()
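A short round trip with the built-in json module, serializing a Python object to JSON text with dumps() and parsing it back with loads() (the student record below is invented for the example):

```python
import json

# A Python dict holding one student's record.
student = {"name": "Ravi", "marks": [78, 85, 91], "passed": True}

# dumps() serializes the object to a JSON string.
text = json.dumps(student)

# loads() parses the JSON string back into Python objects.
restored = json.loads(text)

print(text)
print(restored["marks"])  # [78, 85, 91]
```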

Chefboost

Chefboost is a lightweight decision tree framework with gradient boosting and random forest support, including the regular C4.5, ID3, CART, and regression tree algorithms with categorical-feature support. It takes only a few lines of code to build decision trees with Chefboost.

Basically, we just need to pass the dataset as a pandas data frame along with the tree configuration after importing Chefboost, with the target label in the rightmost column. In addition, Chefboost handles both numeric and nominal features and target values, in contrast to some of its alternatives.
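The splitting criterion that distinguishes C4.5 from ID3 (gain ratio rather than raw information gain) can be sketched in a few lines of plain Python. This is an illustration of the criterion itself, not Chefboost's internal code, and the toy attendance data is invented for the example:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gain_ratio(rows, attr_index, labels):
    # C4.5 criterion: information gain divided by split information.
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    gain = entropy(labels) - remainder
    split_info = entropy([row[attr_index] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Toy data: attribute 0 = attendance, label = final grade band.
rows = [("high",), ("high",), ("low",), ("low",)]
labels = ["pass", "pass", "fail", "fail"]
print(gain_ratio(rows, 0, labels))  # 1.0: attendance splits perfectly
```

C4.5 chooses the attribute with the highest gain ratio at each node, which penalizes attributes that split the data into many small partitions.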

Pandas
Pandas is an open-source library for data manipulation in Python. The Pandas library is built on top of NumPy, meaning Pandas needs NumPy to operate. Pandas provides an easy way to create, manipulate, and wrangle data, and it is also an elegant solution for time-series data.

• It easily handles missing data.

• It provides an efficient way to slice the data.
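A small sketch of the two points above, handling missing data and slicing, on an illustrative marks table (the column names and values are invented for the example):

```python
import pandas as pd

# Illustrative student marks; one value is missing (None becomes NaN).
df = pd.DataFrame({
    "student": ["A", "B", "C"],
    "test1": [78, None, 91],
    "test2": [70, 88, 85],
})

# Missing data is easy to handle: fill NaN with the column mean.
df["test1"] = df["test1"].fillna(df["test1"].mean())

# Slicing: select the marks columns for the first two students.
subset = df.loc[0:1, ["test1", "test2"]]
print(subset)
```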

NumPy

NumPy is a Python package whose name stands for 'Numerical Python'. It is the core library for scientific computing and contains a powerful n-dimensional array object. A NumPy array can also be used as an efficient multi-dimensional container for generic data.

Python NumPy array vs. list:

1) Less memory
2) Fast
3) Convenient
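The "less memory" point can be checked directly: a NumPy array stores raw numeric values in one contiguous block, while a Python list stores pointers to separately allocated int objects (the element count below is arbitrary):

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n)

# A list holds n pointers, each to a separately allocated int object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array holds n machine integers in one contiguous block.
array_bytes = np_array.nbytes

print(list_bytes, array_bytes)  # the array is several times smaller
```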
CHAPTER-7
CODE

CHAPTER-8
TESTING
SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, subassemblies, assemblies, and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of test; each test type addresses a specific testing requirement.
TYPES OF TESTS

Unit testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application, done after the completion of an individual unit and before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at the component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results.
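As an illustration, a unit test for a hypothetical grade-banding helper might look like the sketch below. The function, its name, and its thresholds are invented for this example and are not taken from the project code:

```python
import unittest

def grade_band(sgpa):
    # Hypothetical mapping from SGPA to a grade band.
    if not 0.0 <= sgpa <= 10.0:
        raise ValueError("SGPA must be between 0 and 10")
    if sgpa >= 9.0:
        return "A"
    if sgpa >= 7.5:
        return "B"
    return "C"

class TestGradeBand(unittest.TestCase):
    def test_valid_inputs_produce_valid_outputs(self):
        self.assertEqual(grade_band(9.2), "A")
        self.assertEqual(grade_band(8.0), "B")
        self.assertEqual(grade_band(5.0), "C")

    def test_invalid_input_is_rejected(self):
        # An out-of-range SGPA must be rejected, not silently banded.
        with self.assertRaises(ValueError):
            grade_band(11.0)

if __name__ == "__main__":
    unittest.main(exit=False)
```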

Integration testing
Integration tests are designed to test integrated software components to determine
if they actually run as one program. Testing is event driven and is more concerned
with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.

Functional test
Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system
documentation, and user manuals.

Functional testing is centered on the following items:

Valid Input : identified classes of valid input must be accepted.

Invalid Input : identified classes of invalid input must be rejected.

Functions : identified functions must be exercised.

Output : identified classes of application outputs must be exercised.

Systems/Procedures: interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on requirements, key functions, or special test cases. In addition, systematic coverage pertaining to identified business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of current tests is determined.

System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.

White Box Testing


White Box Testing is testing in which the software tester has knowledge of the inner workings, structure, and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black-box level.

Black Box Testing


Black Box Testing is testing the software without any knowledge of the inner workings, structure, or language of the module being tested. Black-box tests, like most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.

Unit Testing:

Unit testing is usually conducted as part of a combined code and unit test phase of
the software lifecycle, although it is not uncommon for coding and unit testing to
be conducted as two distinct phases.

Test strategy and approach


Field testing will be performed manually and functional tests will be written
in detail.
Test objectives

 All field entries must work properly.


 Pages must be activated from the identified link.
 The entry screen, messages and responses must not be delayed.

Features to be tested

 Verify that the entries are of the correct format


 No duplicate entries should be allowed
 All links should take the user to the correct page.
Integration Testing
Software integration testing is the incremental integration testing of two or more integrated software components on a single platform to produce failures caused by interface defects. The task of the integration test is to check that components or software applications (e.g. components in a software system or, one step up, software applications at the company level) interact without error.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
SYSTEM TESTING

TESTING METHODOLOGIES
The following are the Testing Methodologies:

o Unit Testing.
o Integration Testing.
o User Acceptance Testing.
o Output Testing.
o Validation Testing.

Unit Testing

Unit testing focuses verification effort on the smallest unit of software design, the module. Unit testing exercises specific paths in a module's control structure to ensure complete coverage and maximum error detection. This test focuses on each module individually, ensuring that it functions properly as a unit; hence the name Unit Testing.

During this testing, each module is tested individually and the module
interfaces are verified for the consistency with design specification. All important
processing path are tested for the expected results. All error handling paths are also
tested.

Integration Testing

Integration testing addresses the issues associated with the dual problems of verification and program construction. After the software has been integrated, a set of high-order tests are conducted. The main objective in this testing process is to take unit-tested modules and build a program structure that has been dictated by design.

The following are the types of Integration Testing:

1. Top-Down Integration

This method is an incremental approach to the construction of program structure. Modules are integrated by moving downward through the control hierarchy, beginning with the main program module. The modules subordinate to the main program module are incorporated into the structure in either a depth-first or breadth-first manner.

In this method, the software is tested from the main module, and individual stubs are replaced as the test proceeds downwards.

2. Bottom-up Integration

This method begins construction and testing with the modules at the lowest level in the program structure. Since the modules are integrated from the bottom up, the processing required for modules subordinate to a given level is always available, and the need for stubs is eliminated. The bottom-up integration strategy may be implemented with the following steps:
 The low-level modules are combined into clusters that perform a specific software sub-function.
 A driver (i.e. the control program for testing) is written to coordinate test-case input and output.
 The cluster is tested.
 Drivers are removed and clusters are combined, moving upward in the program structure.
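The driver in the steps above can be sketched as a small control program that feeds test cases to a low-level cluster and checks its outputs. The SGPA function here is a stand-in for a real low-level module, and the test cases are invented for the example:

```python
# Stand-in for a low-level cluster: computes SGPA from grade points.
def compute_sgpa(grade_points, credits):
    total_credits = sum(credits)
    return sum(g * c for g, c in zip(grade_points, credits)) / total_credits

def driver():
    # The driver coordinates test-case input and expected output.
    cases = [
        (([8, 9], [4, 4]), 8.5),
        (([10, 6], [3, 1]), 9.0),
    ]
    for (points, credits), expected in cases:
        result = compute_sgpa(points, credits)
        assert abs(result - expected) < 1e-9, (points, credits, result)
    return len(cases)

print(driver(), "cases passed")
```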

The bottom-up approach tests each module individually; each module is then integrated with a main module and tested for functionality.

OTHER TESTING METHODOLOGIES

User Acceptance Testing

User Acceptance of a system is the key factor for the success of any system.
The system under consideration is tested for user acceptance by constantly keeping
in touch with the prospective system users at the time of developing and making
changes wherever required. The system developed provides a friendly user
interface that can easily be understood even by a person who is new to the system.
Output Testing

After performing validation testing, the next step is output testing of the proposed system, since no system can be useful if it does not produce the required output in the specified format. The outputs generated or displayed by the system under consideration are tested by asking the users about the format they require. Hence the output format is considered in two ways: one is on screen and the other is in printed format.

Validation Checking

Validation checks are performed on the following fields.

Text Field:

The text field can contain only a number of characters less than or equal to its size. The text fields are alphanumeric in some tables and alphabetic in others. An incorrect entry always flashes an error message.

Numeric Field:

The numeric field can contain only numbers from 0 to 9. An entry of any other character flashes an error message. The individual modules are checked for accuracy and for what they have to perform. Each module is subjected to a test run along with sample data. The individually tested modules are integrated into a single system. Testing involves executing the program with real data; the existence of any program defect is inferred from the output. The testing should be planned so that all the requirements are individually tested. A successful test is one that brings out the defects for inappropriate data and produces output revealing the errors in the system.
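The field checks above can be sketched as simple validators. The function names, field sizes, and sample values are invented for this illustration:

```python
def valid_text_field(value, size, alphabetic_only=False):
    # Text field: at most `size` characters; alphanumeric,
    # or alphabetic only for tables that require it.
    if len(value) > size:
        return False
    return value.isalpha() if alphabetic_only else value.isalnum()

def valid_numeric_field(value):
    # Numeric field: only the digits 0 to 9 are accepted.
    return value.isdigit()

print(valid_text_field("Ravi", 10, alphabetic_only=True))  # True
print(valid_numeric_field("2023"))   # True
print(valid_numeric_field("20a3"))   # False: an error message is flashed
```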

Preparation of Test Data

The above testing is done by taking various kinds of test data. Preparation of test data plays a vital role in system testing. After preparing the test data, the system under study is tested using it. While testing the system with test data, errors are again uncovered and corrected using the above testing steps, and the corrections are noted for future use.

Using Live Test Data:

Live test data are those that are actually extracted from organization files. After a system is partially constructed, programmers or analysts often ask users to key in a set of data from their normal activities. The systems person then uses this data as a way to partially test the system. In other instances, programmers or analysts extract a set of live data from the files and enter it themselves.

It is difficult to obtain live data in sufficient amounts to conduct extensive testing. And although it is realistic data that will show how the system will perform for the typical processing requirement (assuming that the live data entered are in fact typical), such data generally will not test all the combinations or formats that can enter the system. This bias toward typical values then does not provide a true systems test and in fact ignores the cases most likely to cause system failure.
Using Artificial Test Data:

Artificial test data are created solely for test purposes, since they can be generated to test all combinations of formats and values. In other words, the artificial data, which can quickly be prepared by a data-generating utility program in the information systems department, make possible the testing of all logic and control paths through the program.

The most effective test programs use artificial test data generated by persons other
than those who wrote the programs. Often, an independent team of testers
formulates a testing plan, using the systems specifications.
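A data-generating utility of the kind described can be sketched with the standard library's random module. The field layout below is illustrative, not the project's actual schema:

```python
import random

def generate_records(n, seed=0):
    # Artificial test data covering a spread of formats and values,
    # deliberately including the boundary marks 0 and 100.
    rng = random.Random(seed)
    records = []
    for i in range(n):
        records.append({
            "roll_no": f"S{i:04d}",
            "internal": rng.choice([0, 100, rng.randint(1, 99)]),
            "external": rng.randint(0, 100),
        })
    return records

data = generate_records(5)
print(data[0]["roll_no"])  # S0000
print(all(0 <= r["internal"] <= 100 for r in data))  # True
```

Seeding the generator makes the artificial data reproducible, so a failing case can be regenerated exactly when debugging.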

The developed package has satisfied all the requirements specified in the software requirement specification and was accepted.

USER TRAINING

Whenever a new system is developed, user training is required to educate users about the working of the system so that it can be put to efficient use by those for whom it has been primarily designed. For this purpose, the normal working of the project was demonstrated to the prospective users. Its working is easily understandable, and since the expected users are people who have good knowledge of computers, the system is very easy to use.

MAINTENANCE

This covers a wide range of activities including correcting code and design errors.
To reduce the need for maintenance in the long run, we have more accurately
defined the user’s requirements during the process of system development.
Depending on the requirements, this system has been developed to satisfy the
needs to the largest possible extent. With development in technology, it may be
possible to add many more features based on the requirements in future. The
coding and designing is simple and easy to understand which will make
maintenance easier.

TESTING STRATEGY:

A strategy for system testing integrates system test cases and design techniques into a well-planned series of steps that results in the successful construction of software. The testing strategy must incorporate test planning, test case design, test execution, and the resultant data collection and evaluation. A strategy for software testing must accommodate low-level tests that are necessary to verify that a small source code segment has been correctly implemented, as well as high-level tests that validate major system functions against user requirements.

Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design, and coding. Testing represents an interesting anomaly for the software. Thus, a series of tests are performed on the proposed system before it is ready for user acceptance testing.

SYSTEM TESTING:

Software once validated must be combined with other system elements (e.g.
Hardware, people, database). System testing verifies that all the elements are
proper and that overall system function performance is achieved. It also tests to
find discrepancies between the system and its original objective, current
specifications and system documentation.
UNIT TESTING:

In unit testing, different modules are tested against the specifications produced during the design for the modules. Unit testing is essential for verification of the code produced during the coding phase, and hence the goal is to test the internal logic of the modules. Using the detailed design description as a guide, important control paths are tested to uncover errors within the boundary of the modules. This testing is carried out during the programming stage itself. In this type of testing step, each module was found to be working satisfactorily as regards the expected output from the module.

In due course, the latest technology advancements will be taken into consideration. As part of the technical build-up, many components of the system will be generic in nature so that future projects can either use or interact with them. The future holds a lot to offer to the development and refinement of this project.
CHAPTER-9
SCREENSHOTS
DATASET
CHAPTER-10
CONCLUSION
CONCLUSION
The system shows the potential of data mining in higher education. It was used in particular to improve students' performance and to detect early predictors of their final SGPA. Accuracy was obtained by performing classification using a decision tree. We utilized the classification technique, the decision tree in particular, to predict students' final SGPA based on their grades in previous courses.
REFERENCES

1. Mustafa Agaoglu, "Predicting Instructor Performance Using Data Mining Techniques in Higher Education," IEEE Access, Volume 4, 2016.

2. Tripti Mishra, Dr. Dharminder Kumar, and Dr. Sangeeta Gupta, "Mining Students' Data for Performance Prediction," Fourth International Conference on Advanced Computing & Communication Technologies, 2014.

3. Cristóbal Romero, "Educational Data Mining: A Review of the State of the Art," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 40, No. 6, November 2010.

4. Carlos Márquez Vera, Cristóbal Romero Morales, and Sebastián Ventura Soto, "Predicting of school failure and dropout by using data mining techniques," The IEEE Journal of Latin-American Learning Technologies (IEEE-RITA), Vol. 8, No. 1, pp. 7-14, Feb 2013.

5. R.S.J.D. Baker and K. Yacef, "The State of Educational Data Mining in 2009: A Review and Future Visions," Journal of Educational Data Mining, Vol. 1, No. 1, 2009.
