Mini Project Documentation (CAPTCHA)
On
“CAPTCHA Recognition And Analysis Using Custom Based CNN
Model – Capsecure”
Submitted in partial fulfillment of the requirement for the award of the
degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING (DATA SCIENCE)
Submitted
By
B.K.DURGA PRASAD REDDY (215D1A6710)
UNDER THE GUIDANCE
OF
Mr.B.RAGHUPATHI
Associate Professor
Department of CSD
DECLARATION
I would like to express my sincere gratitude to my project guide, Mr.
B. RAGHUPATHI, who has guided and supported me through every stage of the
project.
I would also like to express my profound thanks to all those who helped me make
this project a success.
Finally, but most importantly, I thank my parents and siblings for their
much-needed moral support; I owe everything to them.
BY
CONTENTS
1. INTRODUCTION
2. LITERATURE SURVEY
2.1 Methodologies
3. SYSTEM REQUIREMENTS
4. SYSTEM DESIGN
4.1 Architecture
5. SOFTWARE ENVIRONMENT
5.1 Python
5.2 Django
6. IMPLEMENTATION
7. SYSTEM TESTING
8. SCREEN SHOTS
9. CONCLUSION
10. FUTURE ENHANCEMENT
11. REFERENCES
1.INTRODUCTION
CAPTCHA stands for Completely Automated Public Turing test to tell
Computers and Humans Apart. It is an automated test for distinguishing
between bots and humans, and it is widely used in applications such as
network protection and information security. As internet technology evolves,
network security issues continue to grow. CAPTCHAs are used on the internet to
guard web services against various cyber security threats, attacks, and other
intrusions.
Nowadays, many mainstream websites use CAPTCHAs. Specifically, the attacks
in question arise when computer programs take the place of human users.
Such programs automate services in order to send large volumes of unwanted
email, access databases, or influence online polls. CAPTCHA
recognition, in conjunction with artificial intelligence and image processing, plays a
key role in the development of artificial intelligence technology, and it can also be
applied to handwriting recognition, license plate recognition, optical
character recognition, and so on.
The mainstream CAPTCHA formats include numerical or alpha-numerical
strings, voice, and picture sets. Text-based CAPTCHAs are the most commonly
used in practice; the accompanying figure shows some examples of
alpha-numerical CAPTCHAs and their types. There are many
methods for making CAPTCHAs more complex by adding
noise and disturbances to them. The security of CAPTCHA schemes can
be enhanced by adding various types of noise, for instance, zig-zag lines drawn over
an alpha-numeric CAPTCHA; the accompanying figure also shows some examples
of CAPTCHAs distorted with noise. Apart from text-based CAPTCHAs, image-based
CAPTCHAs are also becoming popular. These CAPTCHAs present sample images of
various objects such as traffic signals, vehicles, street signs, landscapes, or statues, and
ask the user to identify a particular object among the displayed images. An example of
an image-based CAPTCHA is depicted in the accompanying figure.
Traditional CAPTCHA recognition involves the following major stages: image
preprocessing, character segmentation, character recognition, feature extraction, and
post-processing. The accuracy and efficiency of these models
depend on the segmentation and feature extraction stages. The present work focuses
on the widely used alpha-numeric image CAPTCHA, which is
composed of random English letters and numerals and is not difficult to
generate. It is also worth mentioning a related application of
CAPTCHA recognition, namely OCR (Optical Character Recognition). Although
existing OCR algorithms are very robust, they still have shortcomings in
recognizing varied handwritten scripts or degraded writing.
In recent years, the convolutional neural network (CNN), a kind of deep neural
network, has shown excellent performance in image recognition compared to
traditional methods. The conventional approach of extracting pixel points and
template matching can only recognize simple CAPTCHAs. Compared to traditional
pattern recognition methods, the main advantage of a CNN is that it learns features
automatically, without manual feature design. The extracted image features have
strong expressive ability, which reduces the reliance on hand-crafted data
preprocessing. Although CNNs have achieved promising results, the literature is still
thin on their effectiveness against complex CAPTCHAs.
CNN-based algorithms have subsequently been proposed to identify CAPTCHAs.
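To make concrete what a CNN layer computes when it extracts features, the following minimal sketch slides one 3x3 filter over a tiny grayscale image. It is an illustration only, not the CAP-SECURE implementation: the function name conv2d_valid, the kernel VERTICAL_EDGE, and the toy image are all hypothetical, and in a real CNN the kernel values are learned during training rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch under the kernel.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical hand-picked kernel that responds to dark-to-bright vertical edges.
VERTICAL_EDGE = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# Toy 4x5 image: dark (0) on the left, bright (1) on the right.
IMAGE = [[0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1]]

feature_map = conv2d_valid(IMAGE, VERTICAL_EDGE)
print(feature_map)  # strong response where the edge lies, zero in the flat region
```

The feature map is large where the patch contains the edge and zero where the image is uniform, which is the sense in which a filter "detects" a feature.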
This paper focuses on a CNN-based image CAPTCHA recognition model
called CAP-SECURE. The image CAPTCHAs consist of random numerals and
English letters. They are easy to create and difficult to crack by brute force. This
research addresses the problem of CAPTCHA recognition in order to expose
the weaknesses and vulnerabilities of CAPTCHAs. The aim is to improve
CAPTCHA technology so that more robust CAPTCHAs can be generated and
websites are not fooled by bots and automated attacks.
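The claim that such CAPTCHAs are easy to create but hard to brute-force can be checked with a little arithmetic. The sketch below is illustrative and assumes a 62-symbol alphabet (upper-case and lower-case English letters plus digits) and a length of 5 characters; neither value is stated in this report, and the helper names are hypothetical.

```python
import secrets
import string

# Assumed alphabet: 26 lower-case + 26 upper-case letters + 10 digits = 62 symbols.
ALPHABET = string.ascii_letters + string.digits


def random_captcha_text(length=5):
    """Return a random alpha-numeric string of the given (assumed) length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def search_space(length, alphabet_size=len(ALPHABET)):
    """Number of distinct strings a blind brute-force guesser must consider."""
    return alphabet_size ** length


sample = random_captcha_text(5)
combos = search_space(5)  # 62 ** 5
print(sample, combos)
```

Under these assumptions there are 62^5 = 916,132,832 possible strings, so blind guessing is impractical, whereas a recognition model that reads the image sidesteps the search space entirely.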
The outline of the paper is as follows: Section II reviews the relevant
literature on developments in the field of CAPTCHA. Section III
introduces the details of our proposed method. The experimental results are
discussed in Section IV. Finally, some concluding remarks are presented in Section
V.
1.2 Objectives
1.3 Scope
INDIVIDUAL CHARACTERS.
CAPTCHAS.
PROPOSED METHOD.
TOD-CNN:
REGION-BASED CNNS:
DICTIONARY-BASED METHODS:
ADVANTAGES OF CAP-SECURE:
ROBUST CAPTCHAS.
DISADVANTAGES:
CAPTCHA GENERATORS.
➢ DOES NOT EXPLICITLY HANDLE VARIABLE LENGTH CAPTCHAS.
2. LITERATURE SURVEY
systems, such as Pass Points, that often leads to weak password choices. CARP is not
a panacea, but it offers reasonable security and usability and appears to fit well with
some practical applications for improving online security.
Bostik, Ondrej, and Jan Klecka. “Recognition of CAPTCHA characters
by supervised machine learning algorithms.” IFAC-PapersOnLine 51, no. 6
(2018), pp. 208–213.
The focus of this paper is to compare several common machine learning
classification algorithms for Optical Character Recognition of CAPTCHA codes. The
main part of the research is a comparative study of Neural Networks, k-Nearest
Neighbour, Support Vector Machines and Decision Trees implemented in the
MATLAB computing environment. The success rates achieved by all analyzed
algorithms exceed 89%. The main difference between the algorithms lies in
their learning times. Based on these findings, it is possible to choose the right
algorithm for a particular task.
Yousef, Mohamed, Khaled F. Hussain, and Usama S. Mohammed. “Accurate,
data-efficient, unconstrained text recognition with convolutional neural
networks.” arXiv preprint arXiv:1812.11894 (2018).
Unconstrained text recognition is an important computer vision task, featuring
a wide variety of different sub-tasks, each with its own set of challenges. One of the
biggest promises of deep neural networks has been the convergence and automation
of feature extractors from input raw signals, allowing for the highest possible
performance with minimum required domain knowledge. To this end, we propose a
data-efficient, end-to-end neural network model for generic, unconstrained text
recognition. In our proposed architecture we strive for simplicity and efficiency
without sacrificing recognition accuracy. Our proposed architecture is a fully
convolutional network without any recurrent connections trained with the CTC loss
function. Thus it operates on arbitrary input sizes and produces strings of arbitrary
length in a very efficient and parallelizable manner. We show the generality and
superiority of our proposed text recognition architecture by achieving state of the art
results on seven public benchmark datasets, covering a wide spectrum of text
recognition tasks, namely: Handwriting Recognition, CAPTCHA recognition, OCR,
License Plate Recognition, and Scene Text Recognition. Our proposed architecture
has won the ICFHR2018 Competition on Automated Text Recognition.
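The reason a CTC-trained network can emit strings of arbitrary length is its decoding rule: the network predicts one label per frame, consecutive repeats are merged, and a special blank label is dropped. The sketch below shows only this greedy (best-path) collapse step in plain Python; it is not the cited authors' code, and the blank symbol "-" and the function name are assumptions made for illustration.

```python
BLANK = "-"  # hypothetical symbol standing in for the CTC blank label


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse per-frame labels: merge consecutive repeats, then drop blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it differs from the previous frame
        # and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return "".join(decoded)


# Ten frames collapse to a five-character string; the blank between the
# two runs of "l" is what allows a genuinely doubled letter to survive.
print(ctc_greedy_decode(list("hh-e-ll-lo")))
```

Because the number of frames only bounds the output length from above, the same network can read CAPTCHAs with different numbers of characters.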
2.1 Methodologies
• Split dataset into training, validation, and test sets for performance
evaluation.
• Evaluate model using metrics such as accuracy, precision, recall, and F1-
score.
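The two steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the project's code: the 70/15/15 split ratio, the fixed seed, and the function names are assumptions, and precision, recall, and F1 are macro-averaged over the character classes.

```python
import random


def split_dataset(samples, train_fraction=0.7, val_fraction=0.15, seed=0):
    """Shuffle and split into training, validation, and test lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = round(len(items) * train_fraction)
    n_val = round(len(items) * val_fraction)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


train_set, val_set, test_set = split_dataset(range(100))
scores = macro_metrics(["a", "a", "b", "b"], ["a", "b", "b", "b"])
print(len(train_set), len(val_set), len(test_set), scores)
```

For CAPTCHA recognition the labels would typically be individual characters, so these per-character metrics complement the stricter whole-string accuracy.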
3. SYSTEM REQUIREMENTS
REQUIREMENT SPECIFICATION :
Functional Requirements
➢ Python
❖ System : Intel i7
❖ Ram : 8GB.
➢ CPU:
➢ GPU:
• Alternatively, NVIDIA GTX series (GTX 1660 Ti, GTX 1080 Ti) for smaller
models
➢ RAM:
• Minimum 16GB RAM (32GB recommended for large models and datasets)
➢ Storage:
➢ Motherboard:
• Supports multiple PCIe slots for GPUs (if using multiple GPUs)
➢ Cooling System:
• Air cooling (e.g., Noctua NH-D15) or liquid cooling for CPU and GPU
➢ Networking:
• Stable internet connection (100 Mbps or higher) for dataset downloads and
cloud services
➢ Display:
➢ Peripherals:
➢ Cloud Setup:
• Cloud services (e.g., AWS EC2 P3 instances, Google Cloud AI Platform) for
scaling
1.Programming Language
➢ Python: Primary language for developing and training the CNN model.
➢ Pillow (PIL): For image manipulation tasks like cropping, rotating, and
converting formats.
6. Model Optimization
➢ CUDA: For enabling GPU acceleration (if using NVIDIA GPUs for model
training).
8. Version Control
9. Package Management
11. Deployment
➢ Flask/Django: For creating a web application that can serve the model's
predictions.
4. SYSTEM DESIGN
4.1 Architecture:
SEQUENCE DIAGRAM:
ACTIVITY DIAGRAM:
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that they
can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
process.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.
DATA FLOW DIAGRAM:
1. The DFD is also called a bubble chart. It is a simple graphical formalism that
can be used to represent a system in terms of the input data to the system, the various
processing carried out on this data, and the output data generated by the
system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is
used to model the system components. These components are the system
process, the data used by the process, the external entities that interact with the
system, and the information flows in the system.
3. The DFD shows how information moves through the system and how it is
modified by a series of transformations. It is a graphical technique that depicts
information flow and the transformations that are applied as data moves from
input to output.
4. A DFD may be used to represent a system
at any level of abstraction. A DFD may be partitioned into levels that represent
increasing information flow and functional detail.
Input Design
The input design is the link between the information system and the user. It
comprises developing the specifications and procedures for data preparation, that is,
the steps necessary to put transaction data into a usable form for processing. This can be
achieved by instructing the computer to read data from a written or printed document,
or by having people key the data directly into the system. The design
of input focuses on controlling the amount of input required, controlling errors,
avoiding delay, avoiding extra steps, and keeping the process simple. The input is
designed so that it provides security and ease of use while retaining
privacy. Input design considered the following things:
Output Design
A quality output is one that meets the requirements of the end user and
presents the information clearly. In any system, the results of processing are
communicated to the users and to other systems through outputs. In output design it is
determined how the information is to be displayed for immediate need, as well as the
hard copy output. It is the most important and direct source of information for the user.
Efficient and intelligent output design improves the system’s relationship with the user
and supports decision-making.
Feasibility Study:
The feasibility of the project is analyzed in this phase and business proposal is
put forth with a very general plan for the project and some cost estimates. During
system analysis the feasibility study of the proposed system is to be carried out. This
is to ensure that the proposed system is not a burden to the company. For feasibility
analysis, some understanding of the major requirements for the system is essential.
➢ ECONOMICAL FEASIBILITY
➢ TECHNICAL FEASIBILITY
➢ SOCIAL FEASIBILITY
1.Economical Feasibility :
This study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can pour into the
research and development of the system is limited, so the expenditures must be justified.
The developed system is well within the budget; this was achieved because
most of the technologies used are freely available. Only the customized products had
to be purchased.
2.Technical Feasibility:
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on
the available technical resources, since this would in turn place high demands on the
client. The developed system must have modest requirements, as only minimal or no
changes are required for implementing this system.
3.Social Feasibility
This aspect of the study checks the level of acceptance of the system by the
user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, but must instead accept it as a necessity. The
level of acceptance by users depends solely on the methods that are employed to
educate users about the system and to make them familiar with it. Their level of
confidence must be raised so that they are also able to offer constructive criticism,
which is welcomed, as they are the final users of the system.
5.SOFTWARE ENVIRONMENT
5.1 Python
$ python
>>>
Type the following text at the Python prompt and press Enter −
>>> print("Hello, Python!")
In Python 3, print is a function and must be called with parentheses, as shown
above. In Python 2 (for example, version 2.4.3), print was a statement and could also
be written as print "Hello, Python!". Either way, this produces the following result −
Hello, Python!
Invoking the interpreter with a script parameter begins execution of the script
and continues until the script is finished. When the script is finished, the interpreter is
no longer active.
Let us write a simple Python program in a script. Python files have the extension
.py. Type the following source code in a test.py file −
print("Hello, Python!")
We assume that you have Python interpreter set in PATH variable. Now, try to
run this program as follows −
$ python test.py
Hello, Python!
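Let us try another way to execute a Python script. Here is the modified test.py
file −
#!/usr/bin/python
print("Hello, Python!")
We assume that the Python interpreter is available in the /usr/bin directory. Make the
file executable and run it as follows −
$ chmod +x test.py
$ ./test.py
Hello, Python!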
Python Identifiers
• Class names start with an uppercase letter. All other identifiers start with a
lowercase letter.
• Starting an identifier with a single leading underscore indicates that the
identifier is private.
• Starting an identifier with two leading underscores indicates a strongly private
identifier.
• If the identifier also ends with two trailing underscores, the identifier is a
language-defined special name.
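The naming rules above can be illustrated with a short sketch. Note that these are conventions rather than access control: Python does not prevent access to a name with a single leading underscore, and two leading underscores only trigger name mangling inside a class. The class CaptchaSample and its attributes are hypothetical names chosen for the example.

```python
class CaptchaSample:                 # class name starts with an uppercase letter
    def __init__(self, text):        # __init__ is a language-defined special name
        self.text = text             # ordinary (public) attribute
        self._source = "demo"        # single underscore: private by convention only
        self.__secret = text[::-1]   # double underscore: stored under a mangled name


sample = CaptchaSample("aB3x")

print(sample._source)                  # still accessible; the underscore is a hint
print(hasattr(sample, "__secret"))     # the unmangled name does not exist
print(sample._CaptchaSample__secret)   # the mangled name does
```

The mangled attribute shows that "strongly private" identifiers are renamed, not hidden, so the convention discourages accidental access rather than forbidding it.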
Reserved Words
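The following list shows the Python keywords. These are reserved words and
you cannot use them as constant, variable, or any other identifier names. All the
keywords in this list contain lowercase letters only −
and, assert, break, class, continue, def, del, elif, else, except, exec, finally,
for, from, global, if, import, in, is, lambda, not, or, pass, print, raise, return,
try, while, with, yield
This is the Python 2 list. In Python 3, print and exec are built-in functions rather
than keywords, and Python 3 adds keywords such as True, False, None, and nonlocal.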
Python provides no braces to indicate blocks of code for class and function
definitions or flow control. Blocks of code are denoted by line indentation, which is
rigidly enforced. The number of spaces in the indentation is variable, but all
statements within the block must be indented the same amount.
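For example −
if True:
    print("True")
else:
    print("False")
However, the following block generates an error, because the last line is not
indented consistently with the rest of its block −
if True:
    print("Answer")
    print("True")
else:
    print("Answer")
  print("False")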
Thus, in Python all the continuous lines indented with same number of spaces
would form a block. The following example has various statement blocks −
Note − Do not try to understand the logic at this point of time. Just make sure you
understood various blocks even if they are without braces.
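#!/usr/bin/python3
import sys

file_finish = "end"
file_name = input("Enter filename: ")
if len(file_name) == 0:
    print("Next time please enter something")
    sys.exit()

try:
    # open the file for writing
    file = open(file_name, "w")
except IOError:
    print("There was an error writing to", file_name)
    sys.exit()

print("Enter '" + file_finish + "' when finished")
while True:
    file_text = input("Enter text: ")
    if file_text == file_finish:
        break
    file.write(file_text)
    file.write("\n")
file.close()

try:
    # reopen the same file for reading
    file = open(file_name, "r")
except IOError:
    print("There was an error reading file")
    sys.exit()
file_text = file.read()
file.close()
print(file_text)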
Multi-Line Statements
Statements in Python typically end with a new line. Python does, however,
allow the use of the line continuation character (\) to denote that the line should
continue. For example −
total = item_one + \
item_two + \
item_three
Statements contained within the [], {}, or () brackets do not need to use the
line continuation character. For example −
days = ['Monday', 'Tuesday', 'Wednesday',
        'Thursday', 'Friday']
Quotation in Python
• Python accepts single ('), double (") and triple (''' or """) quotes to denote
string literals, as long as the same type of quote starts and ends the string.
• The triple quotes are used to span the string across multiple lines. For
example, all the following are legal −
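word = 'word'
sentence = "This is a sentence."
paragraph = """This is a paragraph. It is
made up of multiple lines and sentences."""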
Comments in Python
A hash sign (#) that is not inside a string literal begins a comment. All
characters after the # and up to the end of the physical line are part of the comment
and the Python interpreter ignores them.
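#!/usr/bin/python
# First comment
print("Hello, Python!")  # second comment
This produces the following result −
Hello, Python!
You can type a comment on the same line after a statement or expression −
name = "Python"  # This is again a comment
You can comment multiple lines as follows −
# This is a comment.
# This is a comment, too.
A triple-quoted string that is not assigned to anything has no effect at run time, so it
can also serve as a multi-line comment −
'''
This is a multi-line
comment.
'''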
The following line of the program displays the prompt, the statement saying
“Press the enter key to exit”, and waits for the user to take action −
#!/usr/bin/python3
input("\n\nPress the enter key to exit.")
• Here, "\n\n" is used to create two new lines before displaying the actual line.
Once the user presses the key, the program ends. This is a nice trick to keep a
console window open until the user is done with an application.
• The semicolon ( ; ) allows multiple statements on a single line, provided that
none of the statements starts a new code block. Here is a sample snippet using the
semicolon −
import sys; x = 'foo'; sys.stdout.write(x + '\n')
• A group of individual statements, which make a single code block are called
suites in Python. Compound or complex statements, such as if, while, def, and
class require a header line and a suite.
• Header lines begin the statement (with the keyword) and terminate with a colon (
: ) and are followed by one or more lines which make up the suite. For example
−
if expression :
    suite
elif expression :
    suite
else :
    suite
Many programs can be run to provide you with some basic information about
how they should be run. Python enables you to do this with -h −
$ python -h
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
You can also program your script in such a way that it should accept various
options. Command Line Arguments is an advanced topic and should be studied a bit
later once you have gone through rest of the Python concepts.
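As a preview of how a script can accept its own options, the following sketch uses the standard-library argparse module. The option names (image, --length, --verbose) are hypothetical and are not part of this project's code; the argument list is passed explicitly so that the example does not depend on how the script is launched.

```python
import argparse


def build_parser():
    """Build a small command-line interface (illustrative option names)."""
    parser = argparse.ArgumentParser(
        description="Recognize a CAPTCHA image (illustrative CLI).")
    parser.add_argument("image", help="path to the CAPTCHA image")
    parser.add_argument("--length", type=int, default=5,
                        help="expected number of characters")
    parser.add_argument("--verbose", action="store_true",
                        help="print extra diagnostics")
    return parser


# Equivalent to running: python script.py sample.png --length 6
args = build_parser().parse_args(["sample.png", "--length", "6"])
print(args.image, args.length, args.verbose)
```

A parser built this way also answers -h automatically, printing a usage message in the same style as the interpreter's own help.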
Python Lists
• The list is a most versatile datatype available in Python which can be written as a
list of comma-separated values (items) between square brackets. Important thing
about a list is that items in a list need not be of the same type.
• Creating a list is as simple as putting different comma-separated values between
square brackets. For example −
list2 = [1, 2, 3, 4, 5]
• Similar to string indices, list indices start at 0, and lists can be sliced,
concatenated and so on.
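The list operations described above can be tried directly. The values below are illustrative.

```python
# Items in a list need not be of the same type.
mixed = ["physics", "chemistry", 1997, 2000]
numbers = [1, 2, 3, 4, 5]

first_item = mixed[0]        # indices start at 0
middle = numbers[1:4]        # slice: items at indices 1, 2, and 3
combined = mixed + numbers   # concatenation builds a new list
numbers[0] = 10              # lists are mutable, so items can be replaced

print(first_item, middle, combined, numbers)
```

Because concatenation creates a new list, the later assignment to numbers[0] does not change combined.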
• A tuple is a sequence of immutable Python objects. Tuples are sequences, just
like lists. The differences between tuples and lists are, the tuples cannot be
changed unlike lists and tuples use parentheses, whereas lists use square
brackets.
• Creating a tuple is as simple as putting different comma-separated values.
Optionally, you can put these comma-separated values between parentheses.
For example −
tup2 = (1, 2, 3, 4, 5)
➢ The empty tuple is written as two parentheses containing nothing −
tup1 = ()
➢ To write a tuple containing a single value you have to include a comma, even
though there is only one value −
tup1 = (50,);
➢ Like string indices, tuple indices start at 0, and they can be sliced, concatenated,
and so on.
➢ To access values in tuple, use the square brackets for slicing along with the index
or indices to obtain value available at that index. For example −
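#!/usr/bin/python
tup1 = ('physics', 'chemistry')
tup2 = (1, 2, 3, 4, 5, 6, 7)
print("tup1[0]:", tup1[0])
print("tup2[1:5]:", tup2[1:5])
When the above code is executed, it produces the following result −
tup1[0]: physics
tup2[1:5]: (2, 3, 4, 5)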
Updating Tuples
To access dictionary elements, you can use the familiar square brackets along
with the key to obtain its value. Following is a simple example −
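#!/usr/bin/python
dict = {'Name': 'Zara', 'Age': 7}
print("dict['Name']:", dict['Name'])
print("dict['Age']:", dict['Age'])
When the above code is executed, it produces the following result −
dict['Name']: Zara
dict['Age']: 7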
If we attempt to access a data item with a key, which is not part of the dictionary,
we get an error as follows −
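#!/usr/bin/python
dict = {'Name': 'Zara', 'Age': 7}
print("dict['Alice']:", dict['Alice'])
When the above code is executed, it produces the following result −
KeyError: 'Alice'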
Updating Dictionary
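You can update a dictionary by adding a new entry or by modifying an existing entry,
as shown below −
#!/usr/bin/python
dict = {'Name': 'Zara', 'Age': 7}
dict['Age'] = 8  # update the existing entry
print("dict['Age']:", dict['Age'])
When the above code is executed, it produces the following result −
dict['Age']: 8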
• You can either remove individual dictionary elements or clear the entire contents
of a dictionary. You can also delete entire dictionary in a single operation.
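• To explicitly remove an entire dictionary, just use the del statement. Following is
a simple example. It uses the name student rather than dict, so that deleting the
variable does not expose the built-in dict type −
#!/usr/bin/python
student = {'Name': 'Zara', 'Age': 7}
del student['Name']  # remove the entry with key 'Name'
student.clear()      # remove all entries
del student          # delete the entire dictionary
print("student['Age']:", student['Age'])
This raises an exception, because after del student the dictionary no longer exists −
NameError: name 'student' is not defined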
• Dictionary values have no restrictions. They can be any arbitrary Python object,
either standard objects or user-defined objects. However, same is not true for the
keys.
• There are two important points to remember about dictionary keys –
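➢ (a) More than one entry per key is not allowed, which means no duplicate keys are
allowed. When duplicate keys are encountered during assignment, the last assignment
wins. For example −
#!/usr/bin/python
dict = {'Name': 'Zara', 'Age': 7, 'Name': 'Manni'}
print("dict['Name']:", dict['Name'])
When the above code is executed, it produces the following result −
dict['Name']: Manni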
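➢ (b) Keys must be immutable, which means you can use strings, numbers, or tuples
as dictionary keys, but something like ['key'] is not allowed. Following is a simple
example −
#!/usr/bin/python
dict = {['Name']: 'Zara', 'Age': 7}
print("dict['Name']:", dict['Name'])
When the above code is executed, it produces the following result −
TypeError: unhashable type: 'list'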
➢ Tuples are immutable which means you cannot update or change the values of
tuple elements. You are able to take portions of existing tuples to create new
tuples as the following example demonstrates −
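#!/usr/bin/python
tup1 = (12, 34.56)      # illustrative values
tup2 = ('abc', 'xyz')
# The following action is not valid for tuples:
# tup1[0] = 100
# So let us create a new tuple instead
tup3 = tup1 + tup2
print(tup3)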
• Removing individual tuple elements is not possible. There is, of course, nothing
wrong with putting together another tuple with the undesired elements discarded.
• To explicitly remove an entire tuple, just use the del statement. For example −
#!/usr/bin/python
tup = ('physics', 'chemistry', 1997, 2000)
print(tup)
del tup
print(tup)
This produces the following result. Note that an exception is raised, because after
del tup the tuple no longer exists −
('physics', 'chemistry', 1997, 2000)
NameError: name 'tup' is not defined
5.2 DJANGO
Create a Project
Whether you are on Windows or Linux, open a terminal or a cmd prompt,
navigate to the place where you want your project to be created, and then run −
$ django-admin startproject myproject
This will create a "myproject" folder with the following structure −
myproject/
   manage.py
   myproject/
      __init__.py
      settings.py
      urls.py
      wsgi.py
The “myproject” folder is just your project container; it contains two
elements −
manage.py − This file is a kind of project-local django-admin for
interacting with your project via the command line (start the development server, sync
the db...). To get a full list of commands accessible via manage.py, you can run −
$ python manage.py help
The “myproject” subfolder − This folder is the actual Python package of your
project. It contains four files −
__init__.py − Tells Python to treat this folder as a package.
settings.py − As the name indicates, your project settings.
urls.py − All links of your project and the functions to call. A kind of ToC of your
project.
wsgi.py − Needed if you deploy your project over WSGI.
DEBUG = True
This option lets you set if your project is in debug mode or not. Debug mode
lets you get more information about your project's error. Never set it to ‘True’ for a
live project. However, this has to be set to ‘True’ if you want the Django light server
to serve static files. Do it only in the development mode.
DATABASES = {
   'default': {
      'ENGINE': 'django.db.backends.sqlite3',
      'NAME': 'database.sql',
      'USER': '',
      'PASSWORD': '',
      'HOST': '',
      'PORT': '',
   }
}
The database is set in the DATABASES dictionary. The example above is for the SQLite
engine. As stated earlier, Django also supports −
MySQL (django.db.backends.mysql)
PostgreSQL (django.db.backends.postgresql; named django.db.backends.postgresql_psycopg2 in older releases)
MongoDB (django_mongodb_engine, a third-party backend)
• Before setting any new engine, make sure you have the correct db driver
installed.
• You can also set others options like: TIME_ZONE, LANGUAGE_CODE,
TEMPLATE…
Now that your project is created and configured, make sure it is working −
$ python manage.py runserver
You will get something like the following on running the above command −
Validating models...
0 errors found
Create an Application
We assume you are in your project folder, that is, the main “myproject” folder, the
same folder as manage.py −
$ python manage.py startapp myapp
You have just created the myapp application and, as with the project, Django creates a
“myapp” folder with the application structure −
myapp/
__init__.py
admin.py
models.py
tests.py
views.py
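__init__.py − Makes sure Python handles this folder as a package.
admin.py − This file helps you make the app modifiable in the admin interface.
models.py − This is where all the application models are stored.
tests.py − This is where your unit tests are.
views.py − This is where your application views are.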
At this stage we have our "myapp" application, now we need to register it with
our Django project "myproject". To do so, update INSTALLED_APPS tuple in the
settings.py file of your project (add your app name) −
INSTALLED_APPS = (
   'django.contrib.admin',
   'django.contrib.auth',
   'django.contrib.contenttypes',
   'django.contrib.sessions',
   'django.contrib.messages',
   'django.contrib.staticfiles',
   'myapp',
)
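myapp/forms.py
from django import forms

class LoginForm(forms.Form):
   username = forms.CharField(max_length = 100)
   password = forms.CharField(widget = forms.PasswordInput())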
As seen above, the field type can take "widget" argument for html rendering;
in our case, we want the password to be hidden, not displayed. Many others widget
are present in Django: DateInput for dates, CheckboxInput for checkboxes, etc.
There are two kinds of HTTP requests, GET and POST. In Django, the request
object passed as parameter to your view has an attribute called "method" where the
type of the request is set, and all data passed via POST can be accessed via the
request.POST dictionary.
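from django.shortcuts import render
from myapp.forms import LoginForm

def login(request):
   username = "not logged in"

   if request.method == "POST":
      # Bind the posted data to the form
      MyLoginForm = LoginForm(request.POST)

      if MyLoginForm.is_valid():
         username = MyLoginForm.cleaned_data['username']
   else:
      MyLoginForm = LoginForm()

   return render(request, 'loggedin.html', {"username": username})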
➢ The view will display the result of the login form posted through the
loggedin.html.
➢ To test it, we will first need the login form template. Let's call it login.html.
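<html>
   <body>
      {# assumes a URL pattern named 'login' that points to the login view #}
      <form name = "form" action = "{% url 'login' %}" method = "POST">
         {% csrf_token %}
         <div>
            <center>
               <input type = "text" placeholder = "Username" name = "username" />
            </center>
         </div>
         <br>
         <div>
            <center>
               <input type = "password" placeholder = "Password" name = "password" />
            </center>
         </div>
         <br>
         <div>
            <center>
               <button type = "submit" value = "Login">
                  <strong>Login</strong>
               </button>
            </center>
         </div>
      </form>
   </body>
</html>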
The template will display a login form and post the result to our login view
above. You have probably noticed the tag in the template, which is just to prevent
Cross-site Request Forgery (CSRF) attack on your site.
{% csrf_token %}
Once we have the login template, we need the loggedin.html template that will
be rendered after form treatment.
<html>
   <body>
      You are : <strong>{{username}}</strong>
   </body>
</html>
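Now we just need our pair of URLs to get started (myapp/urls.py). Note that patterns()
was removed in Django 1.10; newer projects define urlpatterns as a plain list of
path() or re_path() entries instead −
urlpatterns = patterns('myapp.views',
   url(r'^connection/', TemplateView.as_view(template_name = 'login.html')),
   url(r'^login/', 'login', name = 'login'))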
Setting Up Sessions
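In Django, sessions are enabled in your project settings.py by adding entries to the
MIDDLEWARE setting (named MIDDLEWARE_CLASSES before Django 1.10) and to
INSTALLED_APPS. The middleware setting should contain −
'django.contrib.sessions.middleware.SessionMiddleware'
And INSTALLED_APPS should contain −
'django.contrib.sessions'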
Let us create a simple example to see how to create and save sessions. We built
a simple login system earlier (see the Django form processing chapter and the Django
Cookies Handling chapter). Let us save the username so that, unless you have signed out,
you will not see the login form when accessing the login page. In other words, let us make
the login system more secure by keeping this data
on the server side instead of in a browser cookie.
For this, first let us change our login view to save the username on the server side −
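def login(request):
   username = 'not logged in'

   if request.method == 'POST':
      MyLoginForm = LoginForm(request.POST)

      if MyLoginForm.is_valid():
         username = MyLoginForm.cleaned_data['username']
         # Store the username in the server-side session
         request.session['username'] = username
   else:
      MyLoginForm = LoginForm()

   return render(request, 'loggedin.html', {"username": username})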
Then let us create formView view for the login form, where we won’t display the
form if cookie is set −
def formView(request):
   if 'username' in request.session:
      username = request.session['username']
      return render(request, 'loggedin.html', {"username": username})
   else:
      return render(request, 'login.html', {})
Now let us change the urls.py file so that the URL pairs with our new view −
urlpatterns = patterns('myapp.views',
   url(r'^connection/', 'formView', name = 'loginform'),
   url(r'^login/', 'login', name = 'login'))
When accessing /myapp/connection, you will get to see the following page
6.IMPLEMENTATION
MODULES:
• User
• Admin
• Machine Learning
MODULES DESCRIPTION:
User:
The user must first register. Registration requires a valid email address
and mobile number for further communication. Once the user has registered, the admin
can activate the account, after which the user can log in to the system.
The user can upload a dataset whose columns match those expected by the application.
For algorithm execution, the data must be in int or float format. Here we used the
Adacel Technologies Limited dataset for testing purposes. The user can also add new
data to an existing dataset through the Django application. When the user clicks Data
Preparations on the web page, the data cleaning process starts, and the
cleaned data and the corresponding graph are displayed.
Admin:
The admin logs in with his login details and can activate registered
users; only after activation can a user log in to the system. The admin can view
the users and the overall data in the browser, and can load the data. The admin can
also view the training data list and the test data list, and can view the forecast
results.
Machine learning:
Based on the split criterion, the cleaned data is split into 80% training and
20% test data, and the dataset is then passed to a machine learning model based on
Natural Language Processing (NLP). Sentiment analysis is performed by fine-tuning
auto-encoding models such as BERT and ALBERT to achieve a comprehensive
understanding of public sentiment. Finally, we analyzed the results of our experiment
and methodology using the contextual information and verified the insights.
7.SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way to
check the functionality of components, sub-assemblies, assemblies, and/or a finished
product. It is the process of exercising software with the intent of ensuring that the
software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a
specific testing requirement.
Unit Testing
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test strategy and approach
Field testing will be performed manually and functional tests will be written in
detail.
Test Objectives
• All field entries must work properly.
• Pages must be activated from the identified link.
• The entry screen, messages and responses must not be delayed.
Features to be tested
• Verify that the entries are of the correct format
• No duplicate entries should be allowed
• All links should take the user to the correct page.
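Two of the checks listed above (correct entry format and no duplicate entries) can be automated with Python's standard unittest module. The sketch below is illustrative: is_valid_email and register are hypothetical stand-ins for the project's registration logic, and the email pattern is deliberately simple.

```python
import re
import unittest

# Deliberately simple pattern: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value):
    """Return True when the entry has the expected email format."""
    return bool(EMAIL_PATTERN.match(value))


def register(users, email):
    """Add email to the set of users; reject bad formats and duplicates."""
    if not is_valid_email(email) or email in users:
        return False
    users.add(email)
    return True


class RegistrationTests(unittest.TestCase):
    def test_correct_format_is_accepted(self):
        self.assertTrue(register(set(), "user@example.com"))

    def test_wrong_format_is_rejected(self):
        self.assertFalse(register(set(), "not-an-email"))

    def test_duplicate_entry_is_rejected(self):
        users = set()
        self.assertTrue(register(users, "user@example.com"))
        self.assertFalse(register(users, "user@example.com"))


# Run the tests programmatically so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegistrationTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method covers one item from the list, so a failure points directly at the feature that broke.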
Integration Testing
Software integration testing is the incremental testing of two or more
integrated software components on a single platform to expose failures caused
by interface defects.
The task of the integration test is to check that components or software
applications (for example, components in a software system or, one step up,
software applications at the company level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system meets the
functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
7.2 Test Cases
No. | Test case | Expected result | Result | Remarks
1 | User Register | User registration completes successfully. | Pass | If the user email already exists, registration fails.
6 | Admin can activate the registered users | Admin can activate the registered user ID. | Pass | If the user ID is not found, the user cannot log in.
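The two test cases listed above can be expressed as executable checks. The sketch below uses a hypothetical in-memory UserRegistry class as a stand-in for the application's user model; the class, its method names, and the sample credentials are assumptions for illustration only and do not reproduce the project's Django views.

```python
class UserRegistry:
    """Hypothetical in-memory stand-in for the application's user storage."""

    def __init__(self):
        # Maps email -> record. A real application should store password hashes.
        self._users = {}

    def register(self, email, password):
        """Register a new user; fail if the email already exists."""
        if email in self._users:
            return False
        self._users[email] = {"password": password, "active": False}
        return True

    def activate(self, email):
        """Admin action: activate a registered user; fail if not found."""
        user = self._users.get(email)
        if user is None:
            return False
        user["active"] = True
        return True

    def login(self, email, password):
        """Allow login only for activated users with a matching password."""
        user = self._users.get(email)
        return bool(user and user["active"] and user["password"] == password)


registry = UserRegistry()

# Test case 1: registration succeeds once and fails for an existing email.
first_attempt = registry.register("user@example.com", "secret")
second_attempt = registry.register("user@example.com", "secret")

# Test case 6: login works only after the admin activates the user.
login_before = registry.login("user@example.com", "secret")
activated = registry.activate("user@example.com")
login_after = registry.login("user@example.com", "secret")
unknown_activation = registry.activate("missing@example.com")

print(first_attempt, second_attempt, login_before, activated, login_after)
```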
8.SCREEN SHOTS
Home Page
Registration Form
Accuracy Score
Training model
Result
9.CONCLUSION
10.FUTURE ENHANCEMENT
➢ Broaden its Appetite: Train the model on a wider range of CAPTCHAs, from
text-based classics to image-based challenges, even venturing into audio and
video CAPTCHAs. Stay ahead of the curve by tackling the evolving landscape
of CAPTCHAs, including interactive, behavioral, and game-based variants.
➢ Sharpen its Skills: Explore more advanced CNN architectures or consider
hybrid approaches with RNNs or transformers for improved feature extraction
and analysis. Ensemble learning can further boost accuracy and robustness.
➢ Fortify its Armor: Train on a vast and diverse dataset of CAPTCHAs from
various sources, and employ data augmentation techniques to make the model
resilient to real-world variations. Adversarial training can prepare it for targeted
attacks.
➢ Bridge the Real World: Optimize the model for speed and resource efficiency,
enabling smooth integration with web applications even on devices with limited
capabilities. Develop tools and APIs for hassle-free deployment on CAPTCHA-
protected websites.
➢ Peer into the Future: Research the generation of inherently attack-resistant
CAPTCHAs and explore alternative authentication methods that offer robust
security without sacrificing user experience.
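As one illustration of the ensemble learning idea mentioned above, the predictions of several models can be combined by a per-character majority vote. This is a minimal sketch, assuming each model outputs a CAPTCHA string of the same length; the function name ensemble_vote and the sample predictions are hypothetical and no such component is described in the current system.

```python
from collections import Counter


def ensemble_vote(predictions):
    """Combine equal-length predicted strings by majority vote per position."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    # zip(*predictions) groups the characters predicted for each position.
    # On a tie, the character from the earliest model in the list wins.
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*predictions)
    )


# Hypothetical outputs of three models for the same CAPTCHA image.
model_outputs = ["A7K9", "A7X9", "B7K9"]
combined = ensemble_vote(model_outputs)
print(combined)  # A7K9
```

Voting per position lets the ensemble recover a correct string even when each individual model misreads a different character.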
11. REFERENCES
1. M. A. Kouritzin, F. Newton, and B. Wu, “On random field completely
automated public turing test to tell computers and humans apart genera- tion,”
IEEE Transactions on Image Processing, vol. 22, no. 4, pp. 1656 – 1666, 2013.
2. B. B. Zhu, J. Yan, G. Guanbo Bao, M. Maowei Yang, and N. Ning Xu,
“CAPTCHA as graphical passwords-a new security primitive based on hard AI
problems,” IEEE Transactions on Information Forensics and Security, vol. 9,
no. 6, pp. 891 – 904, 2014.
3. Bostik, Ondrej, and Jan Klecka. “Recognition of CAPTCHA characters by
supervised machine learning algorithms.” IFAC-PapersOnLine 51, no. 6
(2018), pp. 208 – 213.
4. Yousef, Mohamed, Khaled F. Hussain, and Usama S. Mohammed. “Ac- curate,
data-efficient, unconstrained text recognition with convolutional neural
networks.” arXiv preprint arXiv:1812.11894 (2018).
5. Sivakorn, Suphannee, Jason Polakis, and Angelos D. Keromytis. “I’m robot:
deep learning to break semantic image captchas.” In 2016 IEEE European
Symposium on Security and Privacy (Euro S&P), 2016, pp. 388 – 403.
6. Von Ahn, Luis, Benjamin Maurer, Colin McMillen, David Abraham, and
Manuel Blum. “recaptcha: Human-based character recognition via web security
measures.” Science 321, no. 5895, 2008, pp. 1465 – 1468.
7. M. Belk, C. Fidas, P. Germanakos, and G. Samaras, “Do human cognitive
differences in information processing affect preference and performance of
CAPTCHA?” International Journal of Human-Computer Studies, vol. 84,
2015, pp. 1 – 18.
8. I. J. Goodfellow, Y. Bulatov, J. Ibarz, S. Arnoud and V. Shet, “Multi-digit
Number Recognition from Street View Imagery using Deep Convolutional
Neural Networks,” Computer Science, 2013.
9. M. Jaderberg, A. Vedaldi, and A. Zisserman, “Deep features for text spotting,”
European conference on computer vision, Springer, Cham, 2014, pp. 512 –
528.
10. K. Chellapilla and P. Y. Simard, “Using machine learning to break visual
human interaction proofs (HIPs),” Advances in Neural Information Processing
Systems, 2004, pp. 265 – 272.
11. J. Yan and A. S. El Ahmad, “Breaking visual CAPTCHAs with naive pattern
recognition algorithms,” in Proceedings of the 23rd Annual Computer Security
Applications Conference, Miami Beach, FL, USA, December 2007, pp. 279 –
291.
12. H. Gao, J. Yan, F. Cao et al., “A simple generic attack on text captchas,” in
Proceedings of the Network & Distributed System Security Symposium, San
Diego, CA, USA, February 2016, pp. 220 – 232.
13. G. Mori and J. Malik, “Recognizing objects in adversarial clutter: Breaking a
visual CAPTCHA,” Computer Vision and Pattern Recognition, Proc. IEEE
Computer Society Conference, Vol. 1, IEEE, 2003, pp.I-I.