
Industrial Report On Banking System

This document is an industrial training report submitted by Bilal Shahid to the Department of Computer Science and Engineering at Babu Banarasi Das Engineering College in Lucknow, India. It discusses the development of a banking credit system module using machine learning to maximize bank profits by minimizing losses from risky loans. The module analyzes past banking data to build predictive models using logistic regression and decision trees that can help determine if a loan applicant is a good or bad credit risk.

INDUSTRIAL TRAINING REPORT

BANKING CREDIT SYSTEM

Submitted in partial fulfillment of the requirements for the award of the
Degree of Bachelor of Technology in Computer Science and Engineering

Submitted By:

Name: Bilal Shahid


University Roll No.: 1650810043

SUBMITTED TO:

Department of Computer Science and Engineering


BABU BANARASI DAS ENGINEERING COLLEGE
LUCKNOW
DECLARATION
I hereby declare that the Industrial Training Report entitled "Banking Credit System"
is an authentic record of my own work, carried out as a requirement of Industrial
Training during the period from 10 June 2019 to 20 July 2019 for the award of the
degree of B.Tech. (CSE), Babu Banarasi Das Engineering College, Lucknow, under the
guidance of Abhijeet Mishra.

(Signature of student)

Bilal Shahid

1650810043

Date: 17-10-2019

ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of this project would
be incomplete without mentioning the people who made it possible, and without whose
constant guidance and encouragement our efforts would have been in vain. We
consider ourselves privileged to express gratitude and respect towards all those who
guided us through the completion of this project.
We convey our thanks to our project guide, Mr. Abhijeet Mishra, for providing
encouragement, constant support and guidance, which were of great help in completing
this project successfully.
ETL LABS
ETL is a professional and educational institute for programming and basic concepts in
computer education. The institute focuses on different programming languages and
offers a wide range of language courses, in a continuous effort to meet the needs of
programmers and of candidates for all types of competitive computer exams, and to
improve students' overall personality. These courses are designed to systematically
equip students with the knowledge and techniques required for these exams and to help
them progress in their careers.
The directors of ETL Labs Private Limited are Amit Singh and Ankit Kumar.
ETL provides all types of computer training, such as two-month summer training and
internships for B.Tech students. BCA, MSc (IT) and MCA final-year projects are also
carried out at our center. We teach Java, .NET, PHP, Web Designing, VB.NET, ASP.NET,
C, C++, Website Creation and Live Projects.
ETL Labs Private Limited is an Indian non-government company. It is a private
company and is classified as a 'company limited by shares'.
TABLE OF CONTENTS

S.NO CONTENT

1. INTRODUCTION

2. OBJECTIVE

3. BACKGROUND

4. BENEFITS

5. KEYWORDS

6. TOOLS & TECHNOLOGY USED

7. DATA DESCRIPTION

8. PACKAGES

9. LOGISTIC REGRESSION

10. DECISION TREE

11. SNAPSHOTS

12. CONCLUSION

13. REFERENCES
Introduction
This project was built from the data sets of a banking department, with the
aim of maximizing the profit of the banking system.
Our main motive is to maximize the bank's profit and reduce its losses. To
do so, we build a machine learning module that provides a decision
framework for the bank.
The module is trained on data from previous years. It helps decide whether
to approve a loan application by classifying the applicant as a good or a
bad credit risk, which reduces the losses incurred when applicants fail to
repay their loans.

“The main objective of this project is to remove the deficiencies that the
banking system has faced over the last few years.”

OBJECTIVE
• The main objective of this project is the “minimization of risk and
maximization of profit on behalf of the bank”, based on previous
data sets of the banking system.
• Profit can be raised by increasing per-unit revenue, decreasing
per-unit cost, or a mix of both.
• Machine learning approaches have been reported to be more accurate
and more reliable than some traditional classification models.
BACKGROUND
A bank offers many services, but most of them are related to credit, for example
business loans, checking accounts, payment services and cash management. One of the
financial services that contributes greatly to bank revenues is lending. The loans that
banks extend act as financial solutions for their clients, and in return the clients are
responsible for paying back the principal and the interest. When customers are
creditworthy and capable of repaying their debts, lending is profitable for the bank.
However, this is not always the case, since there is a risk that customers cannot
fulfill their loan obligations. This risk can turn performing loans into non-performing
loans (NPLs) or, worse, impairment losses. Bad debts, another term for NPLs, have a
negative impact on bank performance, profits and reputation. Even though banks are
exposed to many types of risk, credit risk is considered to have by far the most
influence on financial performance. For these reasons, risk control in credit
activities is a critical issue in the banking industry, and it requires bank managers
and experts to come up with solutions that can minimize credit risk and bad debts.
This report takes into account theories relating to credit risk management and a case
study of a bank, Nordic bank. As one of the economic entities in the commercial
banking sector, the case bank also has great concern for the topic and wants to
understand the level of credit risk associated with its borrowers, and what it can do
to protect its capital. Later in this report, we look deeper into the case bank's
organization to identify the different approaches that the financial institution
currently uses to control credit risk before and during the lending period. Finally,
based on its findings, the report aims to provide valuable recommendations to improve
the case bank's ability to control credit risk.

BENEFITS
This report provides information for people who are interested in the banking industry
as well as in the topic of credit risk. It describes different approaches used to
tackle the credit risk issue. The case bank is in the Middle East, so businessmen from
foreign countries can learn from this report how a typical credit risk management
system works. For the case bank, the report's outcome supports the credit institution
in improving its internal credit risk control. The comparison between the methods the
case bank currently uses and the knowledge base can generate additional models that
the bank can apply. From the report's results, we give recommendations to develop the
case bank's credit risk management further. Regarding our field of specialization and
career, we expect to gain an insight into how banks protect themselves from high-risk
loans. In addition, while preparing this report, we can practice and improve skills in
various areas such as communication, data analysis and decision making, which will
support us in our future profession.
KEYWORDS
• Machine learning
• Profit maximization
• Banking System
• Minimize losses, maximize profit
• Lower Risk
• Data Processing
• Module Based on Machine Learning
• Libraries (Pandas, NumPy, Matplotlib, Scikit-learn, Seaborn, Plotly)
• Algorithms (Logistic Regression, Decision Tree)
• Data analytics

TOOLS & TECHNOLOGY USED


Technology: Machine Learning through Python.
Tool: Jupyter Notebook.

In this project we use machine learning through the Python language, run in a
Jupyter Notebook.

Python:
Python is an interpreted, high-level, general-purpose programming
language. Created by Guido van Rossum and first released in 1991, Python's design
philosophy emphasizes code readability with its notable use of significant whitespace.
Python Features:
Python provides many features, which are listed below.

1) Easy to Learn and Use
2) Expressive Language
3) Interpreted Language
4) Cross-platform Language
5) Free and Open Source
6) Object-Oriented Language
7) Extensible
8) Large Standard Library
9) GUI Programming Support
10) Integrated

Machine Learning:
Machine learning is a type of artificial intelligence (AI) that
provides computers with the ability to learn without being explicitly programmed.
Machine learning focuses on the development of computer programs that can
change their behavior when exposed to new data.

DATA DESCRIPTION
The dataset contains 1000 entries, where each entry represents a person who takes
credit from a bank. Each person is classified as a good or a bad credit risk according
to a set of attributes.
There are two types of variables used in this report:

1. Numeric variables

2. Categorical variables

Numeric variables: The values of a numeric variable are numbers.

Categorical variables: The values of a categorical variable are selected from a small
group of categories. Examples are gender (male or female) and marital status (married,
never married, divorced).

The categorical variables in the dataset are as follows:


1. Sex (male, female)
2. Job (1-unskilled and non-resident, 2-unskilled and resident, 3-skilled, 4-highly
skilled)
3. Housing (own, rent or free)
4. Purpose (new car, used car, furniture, radio/TV, domestic appliances, repairs,
education, business, vacation, retraining)
5. Risk (target variable: good or bad credit risk)
6. Guarantors (none, co-applicant, guarantor)

The numeric variables in the dataset are as follows:

1. Age
2. Checking Account
3. Credit Amount
4. Duration

Packages
Pandas:
Pandas is one of the most popular Python libraries for data analysis. It
offers highly optimized performance, with back-end source code written in
C or Python.

Numpy:
NumPy is not another programming language but a Python extension
module. It provides fast and efficient operations on arrays of homogeneous data.
NumPy extends Python into a high-level language for manipulating numerical data,
similar to MATLAB.

Matplotlib:
Matplotlib tries to make easy things easy and hard things possible. You can
generate plots, histograms, power spectra, bar charts, errorcharts, scatterplots, etc.

Seaborn:
Seaborn is a Python data visualization library based on matplotlib. It
provides a high-level interface for drawing attractive and informative statistical
graphics.

Scikit learn:
A Python library that contains efficient implementations of various
machine learning algorithms.
It offers simple and efficient tools for data mining and data analysis that are
accessible to everybody and reusable in various contexts.

Plotly:
Plotly is also a plotting library; it creates interactive plots.
Logistic Regression
Logistic Regression is a Machine Learning classification algorithm that is used to
predict the probability of a categorical dependent variable. In logistic regression, the
dependent variable is a binary variable that contains data coded as 1 (yes, success,
etc.) or 0 (no, failure, etc.).

Advantages:

• It does not require high computational power.

• It is easily interpretable.

• It is widely used by data analysts and data scientists.

• It is very easy to implement.

• It does not strictly require feature scaling, although scaling can help when
regularization is used.

• It provides a probability score for observations.

Disadvantages:
• It does not handle a large number of categorical features/variables well.

• It is vulnerable to overfitting.

• It cannot solve non-linear problems directly, which is why non-linear features
have to be transformed first.

• It does not perform well with independent (X) variables that are not
correlated with the target (Y) variable.
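
As an illustration of the probability score mentioned above, here is a minimal sketch using scikit-learn. The data is synthetic and generated inside the example, so it is only a stand-in for the credit dataset; the 30% test split, the variable names and the random seeds are assumptions, and the sketch does not reproduce the report's actual notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data: two numeric features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
noise = rng.normal(scale=0.5, size=1000)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)  # 1 = "bad risk" here

# Hold out 30% of the rows for testing (assumed split, see lead-in).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LogisticRegression()
model.fit(X_train, y_train)

# Column 1 of predict_proba holds the estimated probability of class 1.
bad_risk_probability = model.predict_proba(X_test)[:, 1]
test_accuracy = model.score(X_test, y_test)
```

A bank could compare `bad_risk_probability` against a threshold of its own choosing instead of relying on the default 0.5 cut-off used by `predict`.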

Decision Tree
Decision tree learning is a supervised learning method commonly used in machine
learning. The goal is to create a model that predicts the value of a target variable
based on several input variables.

A decision tree is a flowchart-like tree structure in which each internal node
represents a feature (or attribute), each branch represents a decision rule, and each
leaf node represents an outcome. The topmost node in a decision tree is known as the
root node.
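
The structure described above can be written out by hand as nested conditions. The sketch below is a toy tree, not one learned from the data: the function name, the two features it tests and both thresholds are hypothetical.

```python
def classify_applicant(duration_months, credit_amount):
    """Toy hand-written decision tree with hypothetical thresholds."""
    # Root node: tests the "duration" feature.
    if duration_months <= 24:
        # Internal node: tests the "credit amount" feature.
        if credit_amount <= 5000:
            return "good"  # leaf node
        return "bad"  # leaf node
    return "bad"  # leaf node
```

Each `if` is a node, each outcome of the comparison is a branch, and each `return` is a leaf. A learning algorithm chooses the features and thresholds automatically instead of having them fixed by hand.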
Advantages:

• Simple to understand and to interpret. Trees can be visualised.
• The cost of using the tree (i.e., predicting data) is logarithmic in the
number of data points used to train the tree.
• Able to handle both numerical and categorical data. Other techniques are
usually specialised in analysing datasets that have only one type of
variable.
• Able to handle multi-output problems.
• Possible to validate a model using statistical tests. That makes it possible
to account for the reliability of the model.
• Performs well even if its assumptions are somewhat violated by the true
model from which the data were generated.

Disadvantages:

• Decision-tree learners can create over-complex trees that do not generalise
the data well. This is called overfitting. Mechanisms such as pruning,
setting the minimum number of samples required at a leaf node or setting
the maximum depth of the tree are necessary to avoid this problem.
• Decision trees can be unstable because small variations in the data might
result in a completely different tree being generated. This problem is
mitigated by using decision trees within an ensemble.
• There are concepts that are hard to learn because decision trees do not
express them easily, such as XOR, parity or multiplexer problems.
• Decision tree learners create biased trees if some classes dominate. It is
therefore recommended to balance the dataset prior to fitting the
decision tree.
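
Two of the remedies listed above, limiting tree growth and compensating for a dominant class, map onto parameters of scikit-learn's `DecisionTreeClassifier`. The sketch below uses synthetic, imbalanced data generated in place; the parameter values are assumptions picked for illustration, not tuned settings from the report.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data: class 1 is the minority.
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

tree = DecisionTreeClassifier(
    max_depth=3,              # cap the depth to limit over-complex trees
    min_samples_leaf=20,      # require at least 20 training rows per leaf
    class_weight="balanced",  # reweight classes instead of resampling
    random_state=0,
)
tree.fit(X, y)

depth = tree.get_depth()
n_leaves = tree.get_n_leaves()
```

`class_weight="balanced"` is used here as an alternative to physically rebalancing the dataset; resampling before fitting is another option.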

Snapshots
With a 30% split of the data, we increased our profit up to three times.
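
To make the idea of profit from lending decisions concrete, here is a minimal sketch of how profit could be computed from a set of approve/reject decisions. The function name, the gain and loss amounts and the five sample applicants are all hypothetical, and the numbers do not reproduce the result reported above.

```python
def total_profit(actual_good, approved, gain_per_good_loan, loss_per_bad_loan):
    """Sum the profit over applicants under a simple, assumed payoff model."""
    profit = 0
    for is_good, is_approved in zip(actual_good, approved):
        if not is_approved:
            continue  # a rejected application earns and loses nothing here
        profit += gain_per_good_loan if is_good else -loss_per_bad_loan
    return profit


# Hypothetical applicants: True means the applicant would repay the loan.
actual_good = [True, True, False, True, False]

approve_everyone = [True, True, True, True, True]
model_decisions = [True, True, False, True, True]  # one bad loan still slips through

baseline = total_profit(actual_good, approve_everyone, 100, 200)
with_model = total_profit(actual_good, model_decisions, 100, 200)
```

Under this payoff model a wrongly approved bad loan costs more than a good loan earns, so avoiding even one bad loan changes the total noticeably.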
Conclusion
The main purpose of this project is to maximize the profit of the banking system and
to remove the deficiencies faced in the previous few years. Using data from previous
years, we can mine the data according to our needs and obtain results for different
situations. Machine learning libraries and tools help with the data mining, and by
plotting the data as graphs we can easily classify it according to our needs.
The main strength of this project is that it helps us understand the data and the
deficiencies encountered with previous customers, so that we can easily see which
option is better for maximizing profit.
In this era, banks should simultaneously enrich their statistical techniques in
order to accommodate the increased availability of data and to exploit all possible
dimensions of the information collected.
“It’s a statistical technique to improve the banking system on the basis of previous
data and maximize the profit”
REFERENCES
• https://www.kaggle.com/startupsci/titanic-data-science-solutions (model
analysis and visualization; data analytics and machine learning tools)

• https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8
• https://www.geeksforgeeks.org/introduction-machine-learning-using-python/
• https://www.datacamp.com/community/tutorials/decision-tree-classification-python
