Cyberspace News Prediction of Text and Image

The document discusses fake news detection in both text and images on the internet. It begins by explaining that people increasingly consume news online due to low costs and easy sharing. However, the spread of fake news online can seriously impact society. Existing models only detect fake news in text or images separately, so the proposed model aims to detect fake news in both text and image streams using machine learning techniques. If news is predicted to be fake, a report will be generated and sent to the authorities to stop the fake news from spreading further. The goal is to create a safer online experience where people can verify news before believing or sharing it.


CHAPTER 1

INTRODUCTION

Nowadays, people spend a great deal of time on the Internet (cyberspace) and consume news there. News spreads rapidly in cyberspace because of its low cost, easy access, and easy sharing, which has led people to consume news online rather than fetching it from television or newspapers. The widespread circulation of fake news has a serious negative impact on society and individuals. Fake-news detection in cyberspace has therefore prompted tremendous research all over the world, yet predicting with exact accuracy is difficult because the content of false news is diverse in topic, and the data produced by people consuming news online is equally diverse and hard to predict. This model is a solution to these fast-growing problems of fake news in cyberspace. Datasets are trained with various machine learning techniques such as data pre-processing, feature selection, and self-consistency, all implemented with natural language processing in Python. We detect both forms of fake news, i.e., both text and image streams. Once content is predicted to be fake, a report is generated and immediately redirected to the authorized page (cybercrime department), flagging the seriousness of the news so that appropriate action can be taken. Through this we aim to bring a safe and trustworthy cyberspace experience to the people who rely on it: they can verify news before believing it or forwarding it to others.

1.1. SCOPE
People spend a great deal of time on the Internet (cyberspace) and consume news there. News spreads rapidly in cyberspace because of its low cost, easy access, and easy sharing, which has led people to consume news online rather than fetching it from television or newspapers. The widespread circulation of fake news has a serious negative impact on society and individuals, and fake-news detection in cyberspace has prompted tremendous research all over the world because the content of false news is diverse in topic. The data produced by people consuming news in cyberspace is diverse and difficult to predict, so it is very important to classify whether a piece of news is fake or not.

1.2 MOTIVATION OF THE PROJECT
Our model is a solution to these fast-growing problems of fake news in cyberspace. Datasets are trained with various machine learning techniques such as data pre-processing, feature selection, and self-consistency, all implemented with natural language processing in Python. We detect both forms of fake news, i.e., both text and image streams. Once content is predicted to be fake, a report is generated and immediately redirected to the authorized page (cybercrime department), flagging the seriousness of the news so that appropriate action can be taken. Through this we aim to bring a safe and trustworthy cyberspace experience to the people who rely on it: they can verify news before believing it or forwarding it to others.

1.3 KEYWORDS
Cyberspace, fake-news, text and image, Logistic regression classifier, self-consistency algorithm,
report, redirect.

1.4 OBJECTIVE OF THE PROJECT

A. Existing System:
Recent research presents the prediction of fake news using URL- and tweet-based text features. Semantic features are extracted from tweet content to find sentiment scores and opinion words rather than statistical features. These text-only prediction approaches are not effective given today's increasing cyberspace traffic, and extracting semantic features from text is not easy since it depends on text mining. Nowadays, images in cyberspace are popular in the form of posts, memes, and the like, alongside textual descriptions, yet image-based news prediction is still at a basic level of research.

Disadvantages
• Not much research has been done on this topic.

B. Proposed system:
Therefore, false-news prediction in cyberspace is attracting tremendous attention. The issue of fake-news prediction in cyberspace is both challenging and relevant, as fake news spreads in various streams such as text, audio, video, and images. This model processes text and images together through an interactive Application Programming Interface (API): text by applying a logistic regression classifier and images by applying a self-consistency algorithm. The Natural Language Toolkit (NLTK) is used for the implementation in Python. Once news is predicted to be fake, a report is redirected to the authorized website (cybercrime department) so that the immediate actions needed to stop the news from spreading can be taken.
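The text branch described above can be sketched in a few lines. This is a minimal illustration assuming a scikit-learn implementation: the four inline articles and their 0 = real / 1 = fake labels are invented toy data, not the project's dataset, and the image self-consistency branch is not shown.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training articles; stand-ins for the real news dataset.
texts = [
    "government confirms new policy in official statement",
    "scientists publish peer reviewed study on vaccines",
    "shocking secret cure doctors do not want you to know",
    "celebrity spotted with aliens in leaked photo",
]
labels = [0, 0, 1, 1]  # 0 = real, 1 = fake (toy labels)

vectorizer = TfidfVectorizer(stop_words="english")  # text -> TF-IDF features
X = vectorizer.fit_transform(texts)

clf = LogisticRegression()  # the text-branch classifier
clf.fit(X, labels)

pred = clf.predict(vectorizer.transform(["secret cure in leaked photo"]))[0]
print("fake" if pred == 1 else "real")
```

In a real deployment the classifier would be trained on a full labelled corpus and wrapped behind the API, with the fake prediction triggering the report described above.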

Advantages:
• It processes both text and images for detecting fake news.
• A report is generated as soon as news is predicted to be fake.

C. Modules Description

Pandas: pandas is an open-source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Numpy: NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays. It is the fundamental package for scientific computing with Python.

MatPlotLib: matplotlib.pyplot is a plotting library used for 2D graphics in the Python programming language. It can be used in Python scripts, the shell, web application servers, and other graphical user interface toolkits.

Tensorflow: TensorFlow is a free and open-source software library for machine learning. It can be used across a range of tasks but has a particular focus on training and inference of deep neural networks. TensorFlow is a symbolic math library based on dataflow and differentiable programming.

Keras: Keras is a high-level neural networks library written in Python, which makes it extremely simple and intuitive to use. It works as a wrapper to low-level libraries such as TensorFlow or Theano.
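As a quick illustration of how the first three libraries fit together: NumPy supplies the array, pandas wraps it for analysis, and matplotlib would chart it. The article IDs and credibility scores below are invented purely for the example.

```python
import numpy as np
import pandas as pd

# NumPy array of (made-up) credibility scores, wrapped in a DataFrame.
scores = np.array([0.91, 0.87, 0.95, 0.79])
df = pd.DataFrame({"article_id": [1, 2, 3, 4], "credibility": scores})

mean_score = df["credibility"].mean()  # average credibility across articles
print(round(mean_score, 2))

# matplotlib.pyplot would plot it, e.g.:
# import matplotlib.pyplot as plt
# plt.bar(df["article_id"], df["credibility"]); plt.show()
```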

CHAPTER 2

LITERATURE REVIEW

Social networking sites engage millions of users around the world. The users' interactions with
these social sites, such as Twitter and Facebook have a tremendous impact and occasionally
undesirable repercussions for daily life. The prominent social networking sites have turned into a
target platform for the spammers to disperse a huge amount of irrelevant and deleterious
information. Twitter, for example, has become one of the most extravagantly used platforms of
all times and therefore allows an unreasonable amount of spam. Fake users send undesired
tweets to users to promote services or websites that not only affect legitimate users but also
disrupt resource consumption. Moreover, the possibility of spreading invalid information to
users through fake identities has increased, resulting in the circulation of harmful content.
Recently, the detection of spammers and identification of fake users on Twitter has become a
common area of research in contemporary online social networks (OSNs). In this paper, we
perform a review of techniques used for detecting spammers on Twitter. Moreover, a taxonomy
of the Twitter spam detection approaches is presented that classifies the techniques based on
their ability to detect: (i) fake content, (ii) spam based on URL, (iii) spam in trending topics, and
(iv) fake users. The presented techniques are also compared based on various features, such as
user features, content features, graph features, structure features, and time features. We are
hopeful that the presented study will be a useful resource for researchers to find the highlights of
recent developments in Twitter spam detection on a single platform.
With the increased popularity of online social networks, spammers find these platforms easily
accessible to trap users in malicious activities by posting spam messages. In this work, we have
taken Twitter platform and performed spam tweets detection. To stop spammers, Google Safe
Browsing and Twitter's BotMaker tools detect and block spam tweets. These tools can block
malicious links; however, they cannot protect the user in real time as early as possible. Thus,
industries and researchers have applied different approaches to make a spam-free social network
platform. Some of them are only based on user-based features while others are based on tweet
based features only. However, there is no comprehensive solution that can consolidate tweet's
text information along with the user based features. To solve this issue, we propose a framework
which takes the user and tweet based features along with the tweet text feature to classify the
tweets. The benefit of using tweet text feature is that we can identify the spam tweets even if the
spammer creates a new account which was not possible only with the user and tweet based
features. We have evaluated our solution with four different machine learning algorithms namely
- Support Vector Machine, Neural Network, Random Forest and Gradient Boosting. With Neural
Network, we are able to achieve an accuracy of 91.65% and surpassed the existing solution [1]
by approximately 18%.
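The four-algorithm comparison described above can be sketched roughly as follows. The synthetic feature matrix and the toy "spam if the URL ratio is high" rule are stand-ins for the real user/tweet features, so the accuracies printed here say nothing about the paper's 91.65% result.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Toy features per account: [followers, following, tweet_count, url_ratio]
X = rng.rand(200, 4)
y = (X[:, 3] > 0.5).astype(int)  # toy rule: spam if the URL ratio is high

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "Support Vector Machine": SVC(),
    "Neural Network": MLPClassifier(max_iter=2000, random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}
results = {}
for name, model in models.items():
    results[name] = model.fit(X_tr, y_tr).score(X_te, y_te)  # test accuracy
    print(f"{name}: {results[name]:.2f}")
```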

CHAPTER 3
SYSTEM ANALYSIS

• FEASIBILITY REPORT
Preliminary investigation examines project feasibility: the likelihood that the system will be useful to
the organization. The main objective of the feasibility study is to test the technical, operational,
and economic feasibility of adding new modules and debugging the old running system. Any
system is feasible given unlimited resources and infinite time. The following aspects are considered in the
feasibility study portion of the preliminary investigation:

 Technical Feasibility
 Economic Feasibility
 Operational Feasibility

A. TECHNICAL FEASIBILITY:
In the feasibility study, the first step is for the organization or company to decide what
technologies are suitable for development, considering the existing system.
This application uses technologies such as Visual Studio 2008 and SQL Server 2005.
These are free software that can be downloaded from the web.

B. OPERATIONAL FEASIBILITY:
Not only must an application make economic and technical sense, it must also make operational
sense.

Issues to consider when determining the operational feasibility of a project:

Operations Issues
 What tools are needed to support operations?
 What skills will operators need to be trained in?
 What processes need to be created and/or updated?
 What documentation does operations need?

Support Issues
 What documentation will users be given?
 What training will users be given?
 How will change requests be managed?
Very often you will need to improve the existing operations, maintenance, and support
infrastructure to support the operation of the new application that you intend to develop. To
determine what the impact will be you will need to understand both the current operations and
support infrastructure of your organization and the operations and support characteristics of your
new application.
To operate this application, the user does not need any technical knowledge of the technologies
used to develop it (ASP.NET, C#.NET). The application provides a rich user interface through
which the user can carry out operations in a flexible manner.

C. ECONOMIC FEASIBILITY:

Economic feasibility refers to the benefits or outcomes we derive from the product compared to the total
cost of developing it. If the benefits are more or less the same as with the older system, then it is
not feasible to develop the product.
In the present system, the development of the new product greatly enhances the accuracy
of the system and cuts short delays in processing. Errors can be greatly reduced while at the same
time providing a high level of security. No additional equipment is needed except memory of the
required capacity, and no money need be spent on client-side maintenance because the database
used is web-enabled.

CHAPTER 4

SYSTEM REQUIREMENT SPECIFICATIONS

A Software Requirements Specification (SRS) – a requirements specification for a software
system – is a complete description of the behavior of a system to be developed. It includes a set
of use cases that describe all the interactions the users will have with the software. In addition to
use cases, the SRS also contains non-functional requirements. Non-functional requirements are
requirements which impose constraints on the design or implementation (such as performance
engineering requirements, quality standards, or design constraints).

System requirements specification: a structured collection of information that embodies the
requirements of a system. A business analyst, sometimes titled system analyst, is responsible for
analyzing the business needs of clients and stakeholders to help identify business problems
and propose solutions. Within the systems development life cycle domain, the business analyst typically performs a
liaison function between the business side of an enterprise and the information technology
department or external service providers. Projects are subject to three sorts of requirements:
 Business requirements describe in business terms what must be delivered or accomplished
to provide value.

 Product requirements describe properties of a system or product (which could be one of
several ways to accomplish a set of business requirements).
 Process requirements describe activities performed by the developing organization. For
instance, process requirements could specify specific methodologies that must be followed,
and constraints that the organization must obey.
Product and process requirements are closely linked. Process requirements often specify the
activities that will be performed to satisfy a product requirement. For example, a maximum
development cost requirement (a process requirement) may be imposed to help achieve a
maximum sales price requirement (a product requirement); a requirement that the product be
maintainable (a product requirement) often is addressed by imposing requirements to follow
particular development styles.

A. Functional Requirement

a. Software Requirements
OS : Windows
Python IDE : Python 3.x and above
Tools : Jupyter Notebook, Anaconda 3.5
Setuptools and pip to be installed for 3.6.x and above

b. Hardware Requirements
RAM : 4 GB and higher
Processor : Intel i3 and above
Hard Disk : 500 GB minimum

B. NON FUNCTIONAL REQUIREMENTS

 Secure access of confidential data (user’s details). SSL can be used.
 24 X 7 availability.
 Better component design to get better performance at peak time
 Flexible service based architecture will be highly desirable for future extension

C. SDLC Methodologies
SDLC MODEL
The Software Development Lifecycle (SDLC) described here applies to small to medium database application
development efforts. This project uses an iterative development lifecycle, where components of the
application are developed through a series of tight iterations. The first iteration focuses on very
basic functionality, with subsequent iterations adding new functionality to the previous work
and/or correcting errors identified for the components in production.
The six stages of the SDLC are designed to build on one another, taking outputs from the
previous stage, adding additional effort, and producing results that leverage the previous effort
and are directly traceable to the previous stages. During each stage, additional information is
gathered or developed, combined with the inputs, and used to produce the stage deliverables. It is
important to note that the additional information is restricted in scope; new ideas that would take
the project in directions not anticipated by the initial set of high-level requirements, or features
that are out of scope, are preserved for later consideration.
Too many software development efforts go awry when development team and customer
personnel get caught up in the possibilities of automation. Instead of focusing on high-priority
features, the team can become mired in a sea of nice-to-have features that are not essential to
solving the problem but are in themselves highly attractive. This is the root cause of a large
percentage of failed and/or abandoned development efforts and is the primary reason the
development team utilizes the iterative model.

Roles and Responsibilities of PDR AND PER

The iterative lifecycle specifies two critical roles that act together to clearly communicate project
issues and concepts between the end-user community and the development team.

Primary End-user Representative (PER)

The PER is a person who acts as the primary point of contact and principal approver for the end-
user community. The PER is also responsible for ensuring that appropriate subject matter experts
conduct end-user reviews in a timely manner.

PER-PDR Relationship
The PER and PDR are the brain trust for the development effort. The PER has the skills and
domain knowledge necessary to understand the issues associated with the business processes to
be supported by the application and has a close working relationship with the other members of
the end-user community. The PDR has the same advantages regarding the application
development process and the other members of the development team. Together, they act as the
concentration points for knowledge about the application to be developed.

The objective of this approach is to create the close relationship that is characteristic of a
software project with one developer and one end-user. In essence, this approach takes the "pair
programming" concept from Agile methodologies and extends it to the end-user community.
While it is difficult to create close relationships between the diverse members of an end-user
community and a software development team, it is much simpler to create a close relationship
between the lead representatives of each group.
When multiple end-users are placed into a relationship with multiple members of a development
team, communication between the two groups degrades as the number of participants grows. In
this model, members of the end-user community may communicate with members of the
development team as needed, but it is the responsibility of all participants to keep the PER and
PDR apprised of the communications. For example, this allows the PER and PDR to resolve
conflicts that arise when two different end-users communicate different requirements for the
same application feature to different members of the development team.

CHAPTER 5
SYSTEM DESIGN

a. UML DIAGRAMS
The Unified Modeling Language (UML) is used to specify, visualize, modify, construct and
document the artifacts of an object-oriented software intensive system under development. UML
offers a standard way to visualize a system's architectural blueprints, including elements such as:
 actors
 business processes
 (logical) components
 activities
 programming language statements
 database schemas, and
 Reusable software components.

UML combines best techniques from data modeling (entity relationship diagrams), business
modeling (work flows), object modeling, and component modeling. It can be used with all
processes, throughout the software development life cycle, and across different implementation
technologies. UML has synthesized the notations of the Booch method, the Object-modeling
technique (OMT) and Object-oriented software engineering (OOSE) by fusing them into a
single, common and widely usable modeling language. UML aims to be a standard modeling
language which can model concurrent and distributed systems.

Use Case Diagram:

The use case diagram shows the user driving the pipeline: import modules → text dataset →
preprocess → stop word removal → lemmatization → splitting dataset → ML algorithm →
prediction.
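The pipeline steps above can be sketched end to end in a few lines. The report uses NLTK, but to keep this sketch self-contained (no corpus downloads) a small hand-written stop-word list and a crude suffix-stripping stand-in for lemmatization are used instead, and the documents and labels are invented toy data.

```python
import re

from sklearn.model_selection import train_test_split

# Hand-written stop-word list; a stand-in for NLTK's stopwords corpus.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and"}

def preprocess(text):
    tokens = re.findall(r"[a-z]+", text.lower())         # clean and tokenize
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [t.rstrip("s") for t in tokens]               # crude lemmatization stand-in

# Toy documents standing in for the text dataset (0 = real, 1 = fake).
docs = [
    "The markets are up in early trading",
    "Miracle pills cure everything overnight",
    "The council approves the new budget",
    "Secret photos prove the moon is hollow",
]
labels = [0, 1, 0, 1]

cleaned = [" ".join(preprocess(d)) for d in docs]
X_train, X_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.25, random_state=42)    # splitting dataset

print(preprocess("The markets are up today."))
print(len(X_train), len(X_test))
```

The cleaned, split text would then be vectorized and fed to the ML algorithm for fitting and prediction, as in the later stages of the diagram.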

Sequence diagram:

Sequence diagrams represent the objects participating in the interaction horizontally and time
vertically. A use case is a kind of behavioral classifier that represents a declaration of an offered
behavior. Each use case specifies some behavior, possibly including variants, that the subject can
perform in collaboration with one or more actors. Use cases define the offered behavior of the
subject without reference to its internal structure. These behaviors, involving interactions
between the actor and the subject, may result in changes to the state of the subject and
communications with its environment. A use case can include possible variations of its basic
behavior, including exceptional behavior and error handling.

The sequence diagram shows the user, text dataset, preprocess, stop word removal,
lemmatization, splitting dataset, ML algorithm, and prediction objects exchanging the following
messages in order:

1 : import()
2 : process()
3 : feature extract()
4 : feature scaling()
5 : Train-Test split()
6 : Fitting()
7 : Predict()
8 : result()

Collaboration Diagram:

The collaboration diagram arranges the same objects (text dataset, preprocess, stop word
removal, lemmatization, splitting dataset, ML model, prediction) around the user, with the
messages 1 : import(), 2 : process(), 3 : feature extract(), 4 : feature scale(), 5 : partitioning(),
6 : fit(), 7 : predict(), and 8 : result() numbered along the links.

Class Diagram:

The class diagram contains the classes User, Text dataset (+Load()), Preprocessing
(+process()), Stop word (+extract()), Splitting (+scale()), ML Model, and Predict (+predict()).

Component Diagram

The component diagram shows the DataSet, server, algorithms, and modules components.

State Diagram:

The state diagram moves through the states: text dataset → preprocess → stop word removal →
lemmatization → train-test split → ML model → prediction.

Activity Diagram

The activity diagram follows the flow: text dataset → preprocess → stop word removal → data
splitting → ML model → prediction.

CHAPTER 6
TECHNOLOGY DESCRIPTION AND IMPLEMENTATION

Introduction To Python Framework: Introduction to Django

This book is about Django, a Web development framework that saves you time and makes Web
development a joy. Using
Django, you can build and maintain high-quality Web applications with minimal fuss. At its best,
Web development is an exciting, creative act; at its worst, it can be a repetitive, frustrating
nuisance. Django lets you focus on the fun stuff — the crux of your Web application — while
easing the pain of the repetitive bits. In doing so, it provides high-level abstractions of common
Web development patterns, shortcuts for frequent programming tasks, and clear conventions for
how to solve problems. At the same time, Django tries to stay out of your way, letting you work
outside the scope of the framework as needed. The goal of this book is to make you a Django
expert. The focus is twofold. First, we explain, in depth, what Django does and how to build
Web applications with it. Second, we discuss higher-level concepts where appropriate, answering
the question “How can I apply these tools effectively in my own projects?” By reading this book,
you’ll learn the skills needed to develop powerful Web sites quickly, with code that is clean and
easy to maintain.
What Is a Web Framework?
Django is a prominent member of a new generation of Web frameworks. So what exactly does
that term mean? To answer that question, let’s consider the design of a Web application written
using the Common Gateway Interface (CGI) standard, a popular way to write Web applications
circa 1998. In those days, when you wrote a CGI application, you did everything yourself — the
equivalent of baking a cake from scratch. For example, here’s a simple CGI script, written in
Python, that displays the ten most recently published books from a database:
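The original script appears in the book as a figure and is not reproduced here; the sketch below follows the description in the next paragraph (the "Content-Type" header and blank line, introductory HTML, a query for the latest ten books, an unordered list, closing HTML, and closing the connection). The standard-library sqlite3 module and an in-memory table stand in for the book's MySQL database, and the table and column names are illustrative.

```python
#!/usr/bin/env python
import sqlite3  # stand-in for the MySQL driver used in the book's example

print("Content-Type: text/html\n")  # the CGI header, followed by a blank line
print("<html><head><title>Books</title></head><body><h1>Books</h1><ul>")

connection = sqlite3.connect(":memory:")  # in-memory stand-in database
cursor = connection.cursor()
cursor.execute("CREATE TABLE books (name TEXT, pub_date TEXT)")
cursor.executemany("INSERT INTO books VALUES (?, ?)",
                   [("Book %d" % i, "2024-01-%02d" % i) for i in range(1, 13)])

cursor.execute("SELECT name FROM books ORDER BY pub_date DESC LIMIT 10")
latest = [name for (name,) in cursor.fetchall()]  # the ten newest books
for name in latest:
    print("<li>%s</li>" % name)  # one list item per book

print("</ul></body></html>")
connection.close()
```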

This code is straightforward. First, it prints a “Content-Type” line, followed by a blank
line, as required by CGI. It prints some introductory HTML, connects to a database and executes
a query that retrieves the latest ten books. Looping over those books, it generates an HTML
unordered list. Finally, it prints the closing HTML and closes the database connection.
With a one-off dynamic page such as this one, the write-it-from-scratch approach isn’t
necessarily bad. For one thing, this code is simple to comprehend — even a novice developer can
read these 16 lines of Python and understand all it does, from start to finish. There’s nothing else
to learn; no other code to read. It’s also simple to deploy: just save this code in a file called
latestbooks.cgi, upload that file to a Web server, and visit that page with a browser. But as a Web
application grows beyond the trivial, this approach breaks down, and you face a number of
problems:
 Should a developer really have to worry about printing the "Content-Type" line and
remembering to close the database connection? This sort of boilerplate reduces programmer
productivity and introduces opportunities for mistakes. These setup- and teardown-related tasks
would best be handled by some common infrastructure.
 What happens when this code is reused in multiple environments, each with a separate database
and password? At this point, some environment-specific configuration becomes essential.
 What happens when a Web designer who has no experience coding Python wishes to redesign
the page? Ideally, the logic of the page — the retrieval of books from the database — would be
separate from the HTML display of the page, so that a designer could edit the latter without
affecting the former.
 These problems are precisely what a Web framework intends to solve. A Web framework
provides a programming infrastructure for your applications, so that you can focus on writing
clean, maintainable code without having to reinvent the wheel. In a nutshell, that’s what Django
does.

Python
What Is A Script?
Up to this point, I have concentrated on the interactive programming capability of Python. This
is a very useful capability that allows you to type in a program and have it executed
immediately in an interactive mode.
Scripts are reusable
Basically, a script is a text file containing the statements that comprise a Python program. Once
you have created the script, you can execute it over and over without having to retype it each
time.
Scripts are editable
Perhaps, more importantly, you can make different versions of the script by modifying the
statements from one file to the next using a text editor. Then you can execute each of the
individual versions. In this way, it is easy to create different programs with a minimum amount
of typing.
You will need a text editor
Just about any text editor will suffice for creating Python script files.
You can use Microsoft Notepad, Microsoft WordPad, Microsoft Word, or just about any word
processor if you want to.
Difference between a script and a program
Script:
Scripts are distinct from the core code of the application, which is usually written in a different
language, and are often created or at least modified by the end-user. Scripts are often interpreted
from source code or byte code, whereas the applications they control are traditionally compiled
to native machine code.

Program:
A program has an executable form that the computer can use directly to execute the
instructions. The same program in its human-readable source code form is the form from which
executable programs are derived (e.g., compiled).

Python
What is Python? Chances are you are asking yourself this. You may have found this book because
you want to learn to program but don't know anything about programming languages. Or you
may have heard of programming languages like C, C++, C#, or Java and want to know what
Python is and how it compares to those "big name" languages. Hopefully I can explain it for you.

Python concepts
If you're not interested in the hows and whys of Python, feel free to skip to the next chapter. In
this chapter I will try to explain to the reader why I think Python is one of the best languages
available and why it's a great one to start programming with.
• Open source general-purpose language.
• Object Oriented, Procedural, Functional
• Easy to interface with C/ObjC/Java/Fortran
• Easy-ish to interface with C++ (via SWIG)
• Great interactive environment

Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is


designed to be highly readable. It uses English keywords frequently whereas other languages
use punctuation, and it has fewer syntactical constructions than other languages.
 Python is Interpreted − Python is processed at runtime by the interpreter. You do not need to
compile your program before executing it. This is similar to PERL and PHP.
 Python is Interactive − You can actually sit at a Python prompt and interact with the interpreter
directly to write your programs.
 Python is Object-Oriented − Python supports Object-Oriented style or technique of
programming that encapsulates code within objects.

 Python is a Beginner's Language − Python is a great language for the beginner-level
programmers and supports the development of a wide range of applications from simple text
processing to WWW browsers to games.

History of Python
Python was developed by Guido van Rossum in the late eighties and early nineties at the
National Research Institute for Mathematics and Computer Science in the Netherlands.
Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.
Python is copyrighted. Like Perl, Python source code is now available under the GNU General
Public License (GPL).
Python is now maintained by a core development team at the institute, although Guido van
Rossum still holds a vital role in directing its progress.
Python Features
Python's features include −
• Easy-to-learn − Python has few keywords, a simple structure, and a clearly defined syntax. This
allows the student to pick up the language quickly.
• Easy-to-read − Python code is clearly defined and readable.
• Easy-to-maintain − Python's source code is fairly easy to maintain.
• A broad standard library − The bulk of Python's library is very portable and cross-platform
compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows interactive testing
and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has the same interface
on all platforms.
• Extendable − You can add low-level modules to the Python interpreter. These modules enable
programmers to add to or customize their tools to be more efficient.
• Databases − Python provides interfaces to all major commercial databases.
• GUI Programming − Python supports GUI applications that can be created and ported to many
system calls, libraries and windowing systems, such as Windows MFC, Macintosh, and the X
Window System of Unix.
• Scalable − Python provides a better structure and support for large programs than shell scripting.

Apart from the above-mentioned features, Python has a big list of good features, a few of which
are listed below −
• It supports functional and structured programming methods as well as OOP.
• It can be used as a scripting language or can be compiled to byte-code for building large
applications.
• It provides very high-level dynamic data types and supports dynamic type checking.
• It supports automatic garbage collection.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.

Dynamic vs Static Types
Python is a dynamically typed language. Many other languages, such as C/C++ and Java, are
statically typed. A statically typed language requires the programmer to explicitly tell the
computer what type of “thing” each data value is.
For example, in C if you had a variable that was to contain the price of something, you would
have to declare the variable as a “float” type.
This tells the compiler that the only data that can be used for that variable must be a floating
point number, i.e. a number with a decimal point.
If any other data value was assigned to that variable, the compiler would give an error when
trying to compile the program.
Python, however, doesn’t require this. You simply give your variables names and assign values
to them. The interpreter takes care of keeping track of what kinds of objects your program is
using. This also means that you can change the size of the values as you develop the program.
Say you have another decimal number (a.k.a. a floating point number) you need in your program.
With a statically typed language, you have to decide the memory size the variable can take when
you first initialize it. A double is a floating point value that can handle a much larger
number than a normal float (the actual memory sizes depend on the operating environment).
If you declare a variable to be a float but later on assign a value that is too big to it, your
program will fail; you will have to go back and change that variable to be a double.
With Python, it doesn’t matter. You simply give it whatever number you want and Python will
take care of manipulating it as needed. It even works for derived values.
For example, say you are dividing two numbers. One is a floating point number and one is an
integer. Python realizes that it's more accurate to keep track of decimals, so it automatically
calculates the result as a floating point number.
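The contrast above can be sketched in a few lines of Python. This is an illustrative snippet (the variable names are my own, not from the report):

```python
# Dynamic typing: the same name can be rebound to values of different
# types, and the interpreter keeps track of the type for you.
price = 10            # an integer
print(type(price))    # <class 'int'>

price = 10.99         # now a float; no declaration or resizing needed
print(type(price))    # <class 'float'>

# Derived values: dividing an integer by a float yields a float.
result = 7 / 2.0
print(result)         # 3.5
```

In C, the equivalent rebinding from int to float would require a new declaration or a cast; here the interpreter handles it.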

Variables
Variables are nothing but reserved memory locations to store values. This means that when you
create a variable you reserve some space in memory.
Based on the data type of a variable, the interpreter allocates memory and decides what can be
stored in the reserved memory. Therefore, by assigning different data types to variables, you can
store integers, decimals or characters in these variables.

Standard Data Types


The data stored in memory can be of many types. For example, a person's age is stored as a
numeric value and his or her address is stored as alphanumeric characters. Python has various
standard data types that are used to define the operations possible on them and the storage
method for each of them.
Python has five standard data types −
• Numbers
• String
• List
• Tuple
• Dictionary

Python Numbers
Number data types store numeric values. Number objects are created when you assign a value to
them.
Python Strings
Strings in Python are identified as contiguous sets of characters enclosed in quotation marks.
Python allows either pairs of single or double quotes. Subsets of strings can be taken using the
slice operator ([ ] and [:]), with indexes starting at 0 at the beginning of the string and
working from -1 at the end.
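A short example of the slice operator on strings (the sample text is my own):

```python
s = "Hello, Python"
print(s[0])      # H       (index 0 is the first character)
print(s[0:5])    # Hello   (slice from 0 up to, but not including, 5)
print(s[-6:])    # Python  (negative indexes count from the end)
```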
Python Lists
Lists are the most versatile of Python's compound data types. A list contains items separated by
commas and enclosed within square brackets ([]). To some extent, lists are similar to arrays in C.
One difference between them is that the items belonging to a list can be of different data types.
The values stored in a list can be accessed using the slice operator ([ ] and [:]), with indexes
starting at 0 at the beginning of the list and working from -1 at the end. The plus (+) sign is the
list concatenation operator, and the asterisk (*) is the repetition operator.
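A brief sketch of these list operations (the sample values are invented):

```python
nums = [1, 2.5, "three"]   # items of different types in one list
print(nums[0])             # 1
print(nums[1:])            # [2.5, 'three']
print(nums + [4])          # concatenation: [1, 2.5, 'three', 4]
print([0] * 3)             # repetition: [0, 0, 0]
```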

Python Tuples
A tuple is another sequence data type that is similar to the list. A tuple consists of a number of
values separated by commas. Unlike lists, however, tuples are enclosed within parentheses.
The main differences between lists and tuples are: Lists are enclosed in brackets ( [ ] ) and their
elements and size can be changed, while tuples are enclosed in parentheses ( ( ) ) and cannot be
updated. Tuples can be thought of as read-only lists.
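The read-only behaviour can be demonstrated directly (the sample tuple is invented):

```python
point = (3, 4)
print(point[0])     # 3  (reading works just like a list)

try:
    point[0] = 99   # tuples cannot be updated
except TypeError as e:
    print("update failed:", e)
```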

Python Dictionary
Python's dictionaries are a kind of hash-table type. They work like the associative arrays or
hashes found in Perl and consist of key-value pairs. A dictionary key can be almost any Python
type, but is usually a number or a string. Values, on the other hand, can be any arbitrary Python
object. Dictionaries are enclosed by curly braces ({ }) and values can be assigned and accessed
using square brackets ([]).
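A minimal dictionary example (the keys and values are invented):

```python
person = {"name": "Ada", "age": 36}
print(person["name"])         # Ada  (access through square brackets)

person["age"] = 37            # assignment through square brackets
print(sorted(person.keys()))  # ['age', 'name']
```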

Different modes in python


Python has two basic modes: normal and interactive. The normal mode is the mode where the
scripted and finished .py files are run in the Python interpreter. Interactive mode is a command
line shell which gives immediate feedback for each statement, while running previously fed
statements in active memory. As new lines are fed into the interpreter, the fed program is
evaluated both in part and in whole.

Python libraries
1. Requests. The most famous HTTP library, written by Kenneth Reitz. It's a must-have for every
Python developer.
2. Scrapy. If you are involved in webscraping then this is a must have library for you. After using
this library you won’t use any other.
3. wxPython. A GUI toolkit for python. I have primarily used it in place of tkinter. You will
really love it.
4. Pillow. A friendly fork of PIL (Python Imaging Library). It is more user friendly than PIL and
is a must have for anyone who works with images.
5. SQLAlchemy. A database library. Many love it and many hate it. The choice is yours.
6. BeautifulSoup. I know it's slow, but this XML and HTML parsing library is very useful for
beginners.
7. Twisted. The most important tool for any network application developer. It has a very
beautiful api and is used by a lot of famous python developers.
8. NumPy. How can we leave out this very important library? It provides advanced math
functionality to Python.
9. SciPy. When we talk about NumPy, we have to talk about SciPy. It is a library of
algorithms and mathematical tools for Python and has drawn many scientists to the
language.
10. Matplotlib. A numerical plotting library. It is very useful for any data scientist or any data
analyzer.
11. Pygame. Which developer does not like to play games and develop them? This library will
help you achieve your goal of 2d game development.
12. Pyglet. A 3d animation and game creation engine. This is the engine in which the
famous python port of minecraft was made
13. PyQT. A GUI toolkit for python. It is my second choice after wxpython for developing
GUI’s for my python scripts.
14. PyGtk. Another python GUI library. It is the same library in which the famous Bittorrent
client is created.
15. Scapy. A packet sniffer and analyzer for python made in python.
16. Pywin32. A python library which provides some useful methods and classes for interacting
with windows.
17. NLTK. Natural Language Toolkit − I realize most people won't be using this one, but it's
generic enough. It is a very useful library if you want to manipulate strings, but its capacity
goes beyond that. Do check it out.
18. Nose. A testing framework for python. It is used by millions of python developers. It is a
must have if you do test driven development.
19. SymPy. SymPy can do algebraic evaluation, differentiation, expansion, complex numbers,
etc. It is contained in a pure Python distribution.
20. IPython. I just can’t stress enough how useful this tool is. It is a python prompt on steroids. It
has completion, history, shell capabilities, and a lot more. Make sure that you take a look at it.

Numpy
NumPy’s main object is the homogeneous multidimensional array. It is a table of elements
(usually numbers), all of the same type, indexed by a tuple of positive integers. In NumPy
dimensions are called axes. The number of axes is rank.
• Offers Matlab-ish capabilities within Python
• Fast array operations
• 2D arrays, multi-D arrays, linear algebra etc.
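As a quick illustration of these points, a minimal NumPy sketch (assuming NumPy is installed; the array values are arbitrary):

```python
import numpy as np

# A 2-D array: two axes, so its rank (number of axes) is 2.
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.ndim)    # 2
print(a.shape)   # (2, 3)

# A fast elementwise array operation, with no explicit loop.
print((a * 2).tolist())  # [[2, 4, 6], [8, 10, 12]]
```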

Matplotlib
• High quality plotting library.

Python class and objects


These are the building blocks of OOP. Class creates a new object. This object can be anything,
whether an abstract data concept or a model of a physical object, e.g. a chair. Each class has
individual characteristics unique to that class, including variables and methods. Classes are very
powerful and currently “the big thing” in most programming languages. Hence, there are several
chapters dedicated to OOP later in the book. The class is the most basic component of object-
oriented programming. Previously, you learned how to use functions to make your program do
something. Now will move into the big, scary world of Object-Oriented Programming (OOP). To
be honest, it took me several months to get a handle on objects. When I first learned C and C++,
I did great; functions just made sense for me. Having messed around with BASIC in the early
’90s, I realized functions were just like subroutines so there wasn’t much new to learn. However,
when my C++ course started talking about objects, classes, and all the new features of OOP, my
grades definitely suffered. Once you learn OOP, you’ll realize that it’s actually a pretty powerful
tool. Plus many Python libraries and APIs use classes, so you should at least be able to
understand what the code is doing. One thing to note about Python and OOP: it's not mandatory
to use objects in your code; code in whatever way works best for you. Maybe you don't need a
full-blown class with initialization code and methods to just return a calculation. With Python,
you can get as technical as you want. As you've already seen, Python can do just fine with
functions. Unlike languages such as Java, you aren't tied down to a single way of doing things;
you can mix functions and classes as necessary in the same program. This lets you build your
code however you see fit. Objects are an encapsulation of variables and functions into a single
entity. Objects get their variables and functions from classes. Classes are essentially a template
to create your objects.

Here’s a brief list of Python OOP ideas:


• The class statement creates a class object and gives it a name. This creates a new namespace.
• Assignments within the class create class attributes. These attributes are accessed by qualifying
the name using dot syntax: ClassName.Attribute.
• Class attributes export the state of an object and its associated behavior. These attributes are
shared by all instances of a class.
• Calling a class (just like a function) creates a new instance of the class.
This is where the multiple copies part comes in.
• Each instance gets ("inherits") the default class attributes and gets its own namespace. This
prevents instance objects from overlapping and confusing the program.
• Using the term self identifies a particular instance, allowing for per-instance attributes. This
allows items such as variables to be associated with a particular instance.
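The ideas in the list above can be illustrated with a small sketch (the Chair class and its attributes are invented for illustration):

```python
class Chair:
    legs = 4                    # class attribute, shared by all instances

    def __init__(self, colour):
        self.colour = colour    # per-instance attribute, attached via self

# Calling the class like a function creates independent instances.
a = Chair("red")
b = Chair("blue")
print(Chair.legs)           # 4  (qualified as ClassName.Attribute)
print(a.colour, b.colour)   # red blue  (each instance has its own namespace)
```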
Inheritance

First off, classes allow you to modify a program without really making changes to it. To
elaborate, by subclassing a class, you can change the behavior of the program by simply adding
new components to it rather than rewriting the existing ones. As we've seen, an instance
of a class inherits the attributes of that class. However, classes can also inherit attributes from
other classes. Hence, a subclass inherits from a superclass, allowing you to make a generic
superclass that is specialized via subclasses. The subclasses can override the logic in the
superclass, allowing you to change the behavior of your classes without changing the superclass
at all.
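A minimal sketch of subclassing and overriding (the class names are invented for illustration):

```python
class Animal:
    def speak(self):
        return "..."

class Dog(Animal):       # Dog inherits from the generic superclass
    def speak(self):     # and overrides its behaviour
        return "Woof"

print(Animal().speak())           # ...
print(Dog().speak())              # Woof
print(isinstance(Dog(), Animal))  # True: a Dog is still an Animal
```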
Operator Overloads
Operator overloading simply means that objects that you create from classes can respond to
actions (operations) that are already defined within Python, such as addition, slicing, printing,
etc. Even though these actions can be implemented via class methods, using overloading ties the
behavior closer to Python’s object model and the object interfaces are more consistent to
Python’s built-in objects, hence overloading is easier to learn and use. User-made classes can
override nearly all of Python’s built-in operation methods
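For instance, a class can overload + and printing by defining __add__ and __repr__; this toy Vector class is an illustrative sketch, not from the report:

```python
class Vector:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):          # overloads the + operator
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):                # overloads printing
        return f"Vector({self.x}, {self.y})"

print(Vector(1, 2) + Vector(3, 4))     # Vector(4, 6)
```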

Exceptions
I’ve talked about exceptions before, but now I will talk about them in depth. Essentially,
exceptions are events that modify a program’s flow, either intentionally or due to errors. They are
special events that can occur due to an error, e.g. trying to open a file that doesn’t exist, or when
the program reaches a marker, such as the completion of a loop. Exceptions, by definition, don’t
occur very often; hence, they are the "exception to the rule" and a special class has been created
for them. Exceptions are everywhere in Python. Virtually every module in the standard Python
library uses them, and Python itself will raise them in a lot of different circumstances.
Here are just a few examples:
• Accessing a non-existent dictionary key will raise a KeyError exception.
• Searching a list for a non-existent value will raise a ValueError exception.
• Calling a non-existent method will raise an AttributeError exception.
• Referencing a non-existent variable will raise a NameError exception.
• Mixing datatypes without coercion will raise a TypeError exception.
One use of exceptions is to catch a fault and allow the program to continue working; we have
seen this before when we talked about files. This is the most common way to use exceptions.
When programming with the Python command line interpreter, you don’t need to worry about
catching exceptions. Your program is usually short enough to not be hurt too much if an
exception occurs.
Plus, having the exception occur at the command line is a quick and easy way to tell if your code
logic has a problem. However, if the same error occurred in your real program, it will fail and
stop working. Exceptions can be created manually in the code by raising an exception. A raised
exception operates exactly like a system-caused exception, except that the programmer is
triggering it on purpose. This can be for a number of reasons. One of the benefits of using
exceptions is that, by
their nature, they don’t put any overhead on the code processing. Because exceptions aren’t
supposed to happen very often, they aren’t processed until they occur. Exceptions can be thought
of as a special form of the if/elif statements. You can realistically do the same thing with if
blocks as you can with exceptions. However, as already mentioned, exceptions aren’t processed
until they occur, whereas if blocks are processed all the time. Proper use of exceptions can help
the performance of your program: the more infrequently an error occurs, the better off you are
using exceptions, because if blocks require Python to test extra conditions every time before
continuing. Exceptions also make code management easier: if your programming logic is mixed
in with error-handling if statements, it can be difficult to read, modify, and debug your program.
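A small sketch of catching a fault and continuing (the data values are invented):

```python
data = {"a": 1}

try:
    value = data["missing"]    # would raise KeyError
except KeyError:
    value = 0                  # catch the fault and keep working
print(value)                   # 0

try:
    [1, 2, 3].index(99)        # would raise ValueError
except ValueError:
    print("99 is not in the list")
```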
User-Defined Exceptions
I won’t spend too much time talking about this, but Python does allow a programmer to
create their own exceptions. You probably won’t have to do this very often, but it’s nice to have
the option when necessary. However, before making your own exceptions, make sure there isn’t
one of the built-in exceptions that will work for you. They have been "tested by fire" over the
years and not only work effectively, they have been optimized for performance and are bug-free.
Making your own exceptions involves object-oriented programming, which will be covered in
the next chapter. To make a custom exception, the programmer determines which base exception
to use as the class to inherit from; e.g. an exception for negative numbers or one for imaginary
numbers would probably fall under the ArithmeticError class. To make a custom exception,
simply inherit from the chosen base exception and define what it will do.
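A sketch of such a custom exception (the NegativeNumberError name and the square_root helper are invented for illustration):

```python
class NegativeNumberError(ArithmeticError):
    """Raised when a negative number is not allowed."""

def square_root(x):
    if x < 0:
        raise NegativeNumberError(f"cannot take the square root of {x}")
    return x ** 0.5

print(square_root(9))          # 3.0
try:
    square_root(-4)
except NegativeNumberError as e:
    print("caught:", e)
```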
Python modules
Python allows us to store our code in files (also called modules). This is very useful for more
serious programming, where we do not want to retype a long function definition from the very
beginning just to change one mistake. In doing this, we are essentially defining our own
modules, just like the modules defined already in the Python library.
To support this, Python has a way to put definitions in a file and use them in a script or in an
interactive instance of the interpreter. Such a file is called a module; definitions from a module
can be imported into other modules or into the main module.
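A minimal sketch of defining and importing a module (the file is written to a temporary directory; the module name mymodule and its greet function are invented):

```python
import os
import sys
import tempfile

# Write a tiny module to a file, then import it like any other module.
src = "def greet(name):\n    return 'Hello, ' + name\n"

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "mymodule.py"), "w") as f:
        f.write(src)
    sys.path.insert(0, d)      # make the directory importable
    import mymodule            # the module's definitions are now available
    print(mymodule.greet("world"))  # Hello, world
    sys.path.remove(d)
```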

Testing code
As indicated above, code is usually developed in a file using an editor. To test the code, import it
into a Python session and try to run it. Usually there is an error, so you go back to the file, make
a correction, and test again. This process is repeated until you are satisfied that the code works.
The entire process is known as the development cycle. There are two types of errors that you will
encounter. Syntax errors occur when the form of some command is invalid.
This happens when you make typing errors such as misspellings, or call something by the wrong
name, and for many other reasons. Python will always give an error message for a syntax error.

Functions in Python

It is possible, and very useful, to define our own functions in Python. Generally speaking, if you
need to do a calculation only once, then use the interpreter. But when you or others need to
perform a certain type of calculation many times, then define a function.
You use functions in programming to bundle a set of instructions that you want to use
repeatedly or that, because of their complexity, are better self-contained in a sub-program
and called when needed. That means that a function is a piece of code written to carry out a
specified task.
To carry out that specific task, the function might or might not need multiple inputs. When
the task is carried out, the function can or cannot return one or more values.

Broadly, there are three kinds of functions in Python: built-in functions such as help(), min()
and print(); user-defined functions created with the def keyword; and anonymous functions
created with the lambda keyword.
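A brief sketch of all three (the average function and the variable names are invented):

```python
# A user-defined function, created with def.
def average(values):
    """Return the arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)

print(average([2, 4, 6]))   # 4.0

# A built-in function, always available.
print(min(2, 4, 6))         # 2

# An anonymous function, created with lambda.
double = lambda x: x * 2
print(double(21))           # 42
```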

Python Namespace
Generally speaking, a namespace (sometimes also called a context) is a naming system for
making names unique to avoid ambiguity. Everybody knows a namespacing system from daily
life, i.e. the naming of people by first name and family name (surname).
An example is a network: each network device (workstation, server, printer, ...) needs a unique
name and address. Yet another example is the directory structure of file systems.
The same file name can be used in different directories, the files can be uniquely accessed via the
pathnames.
Many programming languages use namespaces or contexts for identifiers. An identifier defined
in a namespace is associated with that namespace.
This way, the same identifier can be independently defined in multiple namespaces. (Like the
same file names in different directories) Programming languages, which support namespaces,
may have different rules that determine to which namespace an identifier belongs.
Namespaces in Python are implemented as Python dictionaries; this means each is a mapping
from names (keys) to objects (values). The user doesn't have to know this to write a Python
program or to use namespaces.
Some namespaces in Python:
• the global names of a module
• the local names in a function or method invocation
• the built-in names: this namespace contains built-in functions (e.g. abs(), len(), ...) and
built-in exception names
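A short sketch of these namespaces in action (the variable names are invented):

```python
import builtins

x = "global"          # lives in the module's global namespace

def show():
    x = "local"       # a separate name in the function's local namespace
    return x

print(show())                    # local
print(x)                         # global  (the global name is untouched)
print(hasattr(builtins, "abs"))  # True: abs() lives in the built-in namespace
```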
Garbage Collection
Garbage Collector exposes the underlying memory management mechanism of Python, the
automatic garbage collector. The module includes functions for controlling how the collector
operates and to examine the objects known to the system, either pending collection or stuck in
reference cycles and unable to be freed.
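A small sketch of the gc module freeing a reference cycle (illustrative only):

```python
import gc

# Two lists that refer to each other form a reference cycle, which
# reference counting alone cannot free; the cycle collector can.
a, b = [], []
a.append(b)
b.append(a)
del a, b

unreachable = gc.collect()   # run a full collection now
print("collector enabled:", gc.isenabled())
print("unreachable objects found:", unreachable)
```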
Python XML Parser
XML is a portable, open source language that allows programmers to develop applications that
can be read by other applications, regardless of operating system and/or developmental language.
What is XML? The Extensible Markup Language (XML) is a markup language much like HTML
or SGML.
This is recommended by the World Wide Web Consortium and available as an open standard.
XML is extremely useful for keeping track of small to medium amounts of data without
requiring a SQL-based backbone.
XML Parser Architectures and APIs The Python standard library provides a minimal but useful
set of interfaces to work with XML.
The two most basic and broadly used APIs to XML data are the SAX and DOM interfaces.
Simple API for XML (SAX): Here, you register callbacks for events of interest and then let the
parser proceed through the document. This is useful when your documents are large or you have
memory limitations; the parser reads the file from disk as it parses, and the entire file is never
stored in memory.
Document Object Model (DOM) API: This is a World Wide Web Consortium recommendation
wherein the entire file is read into memory and stored in a hierarchical, tree-based form to
represent all the features of an XML document.
SAX obviously cannot process information as fast as DOM can when working with large files.
On the other hand, using DOM exclusively can really kill your resources, especially if used on a
lot of small files.
SAX is read-only, while DOM allows changes to the XML file. Since these two different APIs
literally complement each other, there is no reason why you cannot use them both for large
projects.
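Both interfaces ship in the standard library; this sketch (the sample XML is invented) shows DOM and SAX side by side:

```python
import xml.sax
from xml.dom.minidom import parseString

# DOM: the whole document is parsed into an in-memory tree.
doc = parseString("<library><book title='Python 101'/></library>")
books = doc.getElementsByTagName("book")
print(len(books))                      # 1
print(books[0].getAttribute("title"))  # Python 101

# SAX: register callbacks and let the parser stream through the document.
class BookCounter(xml.sax.ContentHandler):
    def __init__(self):
        self.count = 0

    def startElement(self, name, attrs):   # fired once per opening tag
        if name == "book":
            self.count += 1

handler = BookCounter()
xml.sax.parseString(b"<library><book/><book/></library>", handler)
print(handler.count)                   # 2
```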
Python Web Frameworks
A web framework is a code library that makes a developer's life easier when building reliable,
scalable and maintainable web applications.
Why are web frameworks useful?
Web frameworks encapsulate what developers have learned over the past twenty years while
programming sites and applications for the web. Frameworks make it easier to reuse code for
common HTTP operations and to structure projects so other developers with knowledge of the
framework can quickly build and maintain the application.
Common web framework functionality
Frameworks provide functionality in their code or through extensions to perform common
operations required to run web applications. These common operations include:
1. URL routing
2. HTML, XML, JSON, and other output format templating
3. Database manipulation
4. Security against Cross-site request forgery (CSRF) and other attacks
5. Session storage and retrieval
Not all web frameworks include code for all of the above functionality. Frameworks fall on the
spectrum from executing a single use case to providing every known web framework feature to
every developer. Some frameworks take the "batteries-included" approach where everything
possible comes bundled with the framework while others have a minimal core package that is
amenable to extensions provided by other packages.
Comparing web frameworks
There is also a repository called compare-python-web-frameworks where the same web
application is being coded with varying Python web frameworks, templating engines and
object-relational mappers.
Web framework resources
• When you are learning how to use one or more web frameworks it's helpful to have an idea of
what the code under the covers is doing.
• Frameworks is a really well done short video that explains how to choose between web
frameworks. The author has some particular opinions about what should be in a framework. For
the most part I agree, although I've found sessions and database ORMs to be a helpful part of a
framework when done well.
• What is a web framework? is an in-depth explanation of what web frameworks are and their
relation to web servers.
• Django vs Flask vs Pyramid: Choosing a Python web framework contains background
information and code comparisons for similar web applications built in these three big Python
frameworks.
• This fascinating blog post takes a look at the code complexity of several Python web
frameworks by providing visualizations based on their code bases.
• Python's web frameworks benchmarks is a test of the responsiveness of a framework with
encoding an object to JSON and returning it as a response as well as retrieving data from the
database and rendering it in a template. There were no conclusive results but the output is fun to
read about nonetheless.
• What web frameworks do you use and why are they awesome? is a language-agnostic Reddit
discussion on web frameworks. It's interesting to see what programmers in other languages like
and dislike about their suite of web frameworks compared to the main Python frameworks.
• This user-voted question & answer site asked "What are the best general purpose Python web
frameworks usable in production?". The votes aren't as important as the list of the many
frameworks that are available to Python developers.
Web frameworks learning checklist
1. Choose a major Python web framework (Django or Flask are recommended) and stick with it.
When you're just starting it's best to learn one framework first instead of bouncing around trying
to understand every framework.
2. Work through a detailed tutorial found within the resources links on the framework's page.
CODING
CHAPTER 7
TESTING AND TEST CASES
7.1 INTRODUCTION TO TESTING
Software testing is a critical element of software quality assurance and represents the ultimate
review of specification, design and coding. The increasing visibility of software as a system
element, and the attendant costs associated with software failure, are motivating factors for
well-planned, thorough testing. Testing is the process of executing a program with the intent of
finding an error. The design of tests for software and other engineered products can be as
challenging as the initial design of the product itself.
There are basically two types of testing approaches.
One is Black-Box testing − knowing the specified function that a product has been designed to
perform, tests can be conducted that demonstrate each function is fully operational.
The other is White-Box testing − knowing the internal workings of the product, tests can
be conducted to ensure that the internal operation of the product performs according to
specifications and that all internal components have been adequately exercised.
White box and Black box testing methods have been used to test this package. The entire
loop constructs have been tested for their boundary and intermediate conditions. The test
data was designed with a view to check for all the conditions and logical decisions.
Error handling has been taken care of by the use of exception handlers.

7.2 TESTING STRATEGIES:


Testing is a set of activities that can be planned in advance and conducted systematically. A
strategy for software testing must accommodate low-level tests that are necessary to verify that
a small source code segment has been correctly implemented, as well as high-level tests that
validate major system functions against customer requirements.
Software testing is one element of verification and validation. Verification refers to the set of
activities that ensure that software correctly implements a specific function. Validation refers to
a different set of activities that ensure that the software that has been built is traceable to
customer requirements.
The main objective of software testing is to uncover errors. To fulfill this objective, a series of
test steps (unit, integration, validation and system tests) is planned and executed. Each test step
is accomplished through a series of systematic test techniques that assist in the design of test
cases. With each testing step, the level of abstraction with which the software is considered is
broadened.
Testing is the only way to assure the quality of software, and it is an umbrella activity rather than
a separate phase. It is an activity to be performed in parallel with the software effort, and one
that consists of its own phases of analysis, design, implementation, execution and maintenance.

UNIT TESTING:
This testing method considers a module as a single unit and checks how the unit interfaces and
communicates with other modules, rather than getting into details at the statement level. Here
the module is treated as a black box, which takes some input and generates output. Outputs for
a given set of input combinations are pre-calculated and compared with those generated by the
module.
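As an illustration, a black-box unit test in Python's unittest framework might look like this (the add function and its expected outputs are invented for the example):

```python
import io
import unittest

def add(a, b):
    return a + b

class TestAdd(unittest.TestCase):
    # Treat add() as a black box: known inputs, pre-calculated outputs.
    def test_integers(self):
        self.assertEqual(add(2, 3), 5)

    def test_strings(self):
        self.assertEqual(add("uni", "t"), "unit")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
print("tests run:", result.testsRun)          # tests run: 2
print("all passed:", result.wasSuccessful())  # all passed: True
```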
SYSTEM TESTING:
Here all the pre-tested individual modules are assembled to create the larger system, and tests
are carried out at the system level to make sure that all modules work in sync with each other.
This testing methodology helps in making sure that all modules which run perfectly when
checked individually also run in cohesion with the other modules. For this testing we create test
cases to check all modules once, and then generate combinations of test paths throughout the
system to make sure that no path leads to chaos.

INTEGRATED TESTING
Testing is a major quality control measure employed during software development. Its basic
function is to detect errors. Sub-functions, when combined, may not produce the desired results,
and global data structures can present problems. Integration testing is a systematic technique
for constructing the program structure while conducting tests to uncover errors associated with
interfacing; the objective is to take unit-tested modules and build a program structure that has
been dictated by the design. In non-incremental integration, all the modules are combined in
advance and the program is tested as a whole; when errors appear, isolating their causes is
difficult. In incremental testing, the program is constructed and tested in small segments, where
errors are more easily isolated and corrected.
Different incremental integration strategies are top – down integration, bottom – up integration,
regression testing.
TOP-DOWN INTEGRATION TEST
Modules are integrated by moving downward through the control hierarchy, beginning with the
main program. Subordinate modules are incorporated into the structure in either a breadth-first
or depth-first manner. This process is done in five steps:
• The main control module is used as a test driver, and stubs are substituted for all modules
directly subordinate to the main program.
• Depending on the integration approach selected, subordinate stubs are replaced one at a time
with actual modules.
• Tests are conducted as each module is integrated.
• On completion of each set of tests, another stub is replaced with the real module.
 Regression testing may be conducted to ensure that new errors have not been introduced.
This process continues from step 2 until the entire program structure is built. In the top-down integration strategy, decision making occurs at the upper levels of the hierarchy and is therefore encountered first. If major control problems do exist, early recognition is essential.
If depth-first integration is selected, a complete function of the software may be implemented and demonstrated early.
Some problems occur when processing at low levels of the hierarchy is required to adequately test the upper levels: stubs replace the low-level modules at the beginning of top-down testing, so no significant data flows upward in the program structure.
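The stub-then-replace cycle above can be sketched in a few lines. This is an illustrative example, not the project's actual code: the module names and the use of `unittest.mock` stubs are assumptions.

```python
# Hypothetical sketch of top-down integration testing: the main control
# module is tested first while its subordinate modules are stubbed out,
# then a stub is swapped for a real module and the tests are re-run.
from unittest import mock

def classify_article(text, preprocess, predict_label):
    """Main control module: delegates to two subordinate modules."""
    cleaned = preprocess(text)
    return predict_label(cleaned)

# Steps 1-3: subordinate modules replaced by stubs so the top level
# can be exercised before the lower levels exist.
stub_preprocess = mock.Mock(return_value="cleaned text")
stub_predict = mock.Mock(return_value="fake")

assert classify_article("raw text", stub_preprocess, stub_predict) == "fake"
stub_preprocess.assert_called_once_with("raw text")

# Step 4: replace one stub with a real (simple) implementation and re-test.
def real_preprocess(text):
    return text.lower().strip()

assert classify_article("  RAW TEXT ", real_preprocess, stub_predict) == "fake"
```

The same pattern repeats for each remaining stub, with regression tests (step 5) run after every replacement.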
BOTTOM-UP INTEGRATION TEST
Bottom-up integration begins construction and testing with atomic modules. Because modules are integrated from the bottom up, the processing required of modules subordinate to a given level is always available, and the need for stubs is eliminated. The following steps implement this strategy:
 Low-level modules are combined into clusters that perform a specific software sub-function.
 A driver is written to coordinate test-case input and output.
 The cluster is tested.
 Drivers are removed, and clusters are combined moving upward in the program structure.
As integration moves upward, the need for separate test drivers lessens.
If the top levels of the program structure are integrated top-down, the number of drivers can be reduced substantially and the integration of clusters is greatly simplified.
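The steps above can be sketched with a small, purely illustrative cluster: two low-level text-cleaning functions tested together through a driver (the function names are assumptions, not the project's modules).

```python
# Hypothetical sketch of bottom-up integration: a low-level cluster
# (two small text-cleaning functions) is tested via a driver that
# feeds inputs and checks outputs. No stubs are needed because
# nothing above the cluster is involved yet.
def remove_punctuation(text):
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())

def tokenize(text):
    return text.lower().split()

def driver():
    """Coordinates test-case input and output for the cluster."""
    cases = [
        ("Breaking News!!!", ["breaking", "news"]),
        ("Fake? Or real?", ["fake", "or", "real"]),
    ]
    for raw, expected in cases:
        assert tokenize(remove_punctuation(raw)) == expected

driver()
# Once the cluster passes, the driver is removed and the cluster is
# combined with the modules above it in the program structure.
```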

REGRESSION TESTING
Each time a new module is added as a part of integration as the software changes. Regression
testing is an actually that helps to ensure changes that do not introduce unintended behavior as
additional errors.
Regression testing maybe conducted manually by executing a subset of all test cases or using
automated capture play back tools enables the software engineer to capture the test case and
results for subsequent playback and compression. The regression suit contains different classes
of test cases.
A representative sample to tests that will exercise all software functions.

 Additional tests that focus on software functions that are likely to be affected by the change.
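Selecting that second class of tests can be sketched as a simple lookup from changed functions to the test cases that touch them. The registry, test names, and function tags below are illustrative assumptions, not the project's actual suite.

```python
# Hypothetical sketch of regression-subset selection: after a change,
# replay the tests that touch any changed function (in addition to the
# representative sample that exercises everything).
REGRESSION_SUITE = {
    "test_preprocess_basic": {"functions": {"preprocess"}},
    "test_predict_text": {"functions": {"predict"}},
    "test_predict_image": {"functions": {"predict", "image_features"}},
    "test_report_generation": {"functions": {"report"}},
}

def select_tests(changed_functions):
    """Return the regression tests touching any changed function."""
    return sorted(
        name for name, meta in REGRESSION_SUITE.items()
        if meta["functions"] & changed_functions
    )

# After modifying the prediction code, only prediction-related tests
# need replaying alongside the representative sample.
print(select_tests({"predict"}))
# → ['test_predict_image', 'test_predict_text']
```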

7.3 IMPLEMENTATION
Implementation is the process of converting a new or revised system design into an operational one.
There are three types of Implementation:
 Implementation of a computer system to replace a manual system. The problems encountered
are converting files, training users, and verifying printouts for integrity.
 Implementation of a new computer system to replace an existing one. This is usually a difficult
conversion. If not properly planned there can be many problems.
 Implementation of a modified application to replace an existing one using the same computer.
This type of conversion is relatively easy to handle, provided there are no major changes in the
files.
Implementation in the Generic tool project is done in all modules. In the first module, user-level identification is performed: every user is checked to determine whether they are genuine before they access the database, and a session is generated for the user. Illegal access of any form is strictly avoided.
In the table creation module, tables are created with user-specified fields, and the user can create many tables at a time. The user may specify conditions, constraints, and calculations in the creation of tables. The Generic code maintains the user requirements throughout the project.
In the updating module, the user can update, delete, or insert a new record in the database. This is a very important module in the Generic code project. The user has to specify a field value in the form, and the Generic tool automatically fills in the whole record for that particular field value.

| SNO | Test Case Title | Pre-requisites | Action | Expected Result | Test Result (Pass/Fail) |
| --- | --- | --- | --- | --- | --- |
| 1 | Software requirements | Python version 3.6.4 | python --version (checking the version) | Python 3.6.4 present in your system | Pass |
| 2 | IDE requirements | Jupyter Notebook | CMD: jupyter notebook | Jupyter file should run on the localhost | Pass |
| 3 | Packages needed | pandas, numpy, seaborn, scikit-learn, nltk, matplotlib | List the installed packages | All packages should import | Pass |
| 4 | Import the dataset | Dataset (uploaded into Jupyter) | Import the dataset using pandas | Dataset found and shown in the ipynb file | Pass |
| 5 | Import seaborn and matplotlib for dataset visualization | seaborn, matplotlib | pip install seaborn matplotlib | Visualization of the dataset | Pass |
| 6 | Importing a classification algorithm from scikit-learn | scikit-learn | pip install scikit-learn | scikit-learn found | Pass |
| 7 | Building the fake news classification model | ML model library | Create an object for that library | Work with that model | Pass |
| 8 | Predict the new outcome using .predict | Data should be in numpy format | e.g. Model.predict(Data) | Predicted the label | Pass |
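Test cases 7 and 8 can be sketched in a few lines of scikit-learn. The tiny in-line dataset and the use of TF-IDF features here are illustrative assumptions; the real project trains a logistic regression model on the imported news dataset.

```python
# Minimal sketch of test cases 7 and 8: build a fake-news classifier
# object from scikit-learn and call .predict() on new data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative training data (NOT the project's dataset).
texts = [
    "scientists publish peer reviewed climate study",
    "government releases official budget figures",
    "miracle cure doctors dont want you to know",
    "shocking secret celebrity hoax revealed click now",
]
labels = ["real", "real", "fake", "fake"]

# Test case 7: create the model object and fit it.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Test case 8: predict the label of a new outcome with .predict().
print(model.predict(["shocking miracle hoax revealed"]))
```

With the pipeline, raw strings can be passed to `.predict()` directly; the vectorizer converts them to the numeric format the classifier expects.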

CHAPTER 8
INPUT AND OUTPUT DESIGN
8. INPUT AND OUTPUT DESIGN
8.1 INPUT AND OUTPUT
The following are the project's inputs and outputs.

Inputs:
 Importing all required packages such as numpy, pandas, matplotlib, scikit-learn, and the required
machine learning algorithm packages.
 Setting the dimensions of visualization graph.

 Downloading and importing the dataset and converting it to a data frame.

Outputs:
 Preprocessing the imported data frame by imputing nulls with related information.
 Displaying the cleaned outputs.
 After applying the machine learning algorithms, the model gives good results and visualization plots.
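The first output step, imputing nulls, can be sketched with pandas. The column names and the mode-based fill strategy below are illustrative assumptions, not the project's actual schema.

```python
# Hypothetical sketch of the preprocessing output step: impute nulls
# in the imported data frame with related information (here, the most
# frequent value of the column).
import pandas as pd

df = pd.DataFrame({
    "title": ["Budget passed", None, "Miracle cure found"],
    "label": ["real", "real", "fake"],
})

# Impute missing titles with the column's most frequent value.
df["title"] = df["title"].fillna(df["title"].mode().iloc[0])

print(int(df.isnull().sum().sum()))  # remaining null count
```

After this step the data frame is "cleaned" in the sense used above: no nulls remain, so it is safe to feed into visualization and modeling.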

INPUT DESIGN

Input design is a part of the overall system design. The main objectives during input design are as given below:
 To produce a cost-effective method of input.
 To achieve the highest possible level of accuracy.
 To ensure that the input is acceptable and understood by the user.

OUTPUT DESIGN

Outputs from computer systems are required primarily to communicate the results of processing to users. They are also used to provide a permanent copy of the results for later consultation. The various types of outputs in general are:
 External outputs, whose destination is outside the organization.
 Internal outputs, whose destination is within the organization; these are the user's main interface with the computer.
 Operational outputs, whose use is purely within the computer department.
 Interface outputs, which involve the user in communicating directly with the computer.
The outputs needed to be generated both as hard copy and as queries to be viewed on the screen. Keeping these outputs in view, the output format is taken from the outputs currently obtained after manual processing. A standard printer is to be used as the output medium for hard copies.

OUTPUT SCREENS

CHAPTER 9
CONCLUSION & FUTURE ENHANCEMENTS

9.1. CONCLUSION

The consumption of news in cyberspace is increasing day by day compared with traditional media. Due to its growing popularity and user-friendly access, it leaves a huge impact on individuals and society. Therefore, in this model we have found a way to detect fake news in both text and image form by using the logistic regression model. By redirecting the fake news to the authorized website (the cybercrime department), we create a high social impact and distinctly reduce the spread of false news.

9.2. FUTURE ENHANCEMENTS

This model can be further extended in future work to detect fake news in audio and video streams, and the approach can be commercialized for other applications. We can also try to implement this with the help of deep learning algorithms, which should help achieve somewhat higher accuracy.

Page | 52
BIBLIOGRAPHY
For software installation:
https://www.anaconda.com/download/
https://www.python.org/downloads/release/python-360/

References:

[1] Faiza Masood, Ghana Ammad, Ahmad Almogren, Assad Abbas, Hasan Ali Khattak, Ikram
Ud Din, Mohsen Guizani and Mansour Zuair, “Spammer Detection and Fake User Identification
on Social Networks,” IEEE Trans. Inf. Translations and content mining, vol. 7, pp. 2169- 3536,
2019.
[2] Himank Gupta, Mohd. Saalim Jamal, Sreekanth Madisetty and Maunendra Sankar Desarkar,
“A framework for realtime spam detection in Twitter,” IEEE Int. Conf. Communication Systems
and networks, pp. 2155-2509, 2018.
[3] K. Sakthidasan, G. Srinithya and V. Nagarajan, “Enhanced Edge Preserving Restoration for 3D Images Using Histogram Equalization Technique,” International Journal of Electronic Communications Engineering Advanced Research, vol. 2, SP-1, pp. 40-44, Feb. 2014.
[4] S. Kwon, M. Cha, K. Jung, W. Chen and Y. Wang, “Prominent features of rumor propagation
in online social media,” IEEE Int. Conf. Data Mining, pp. 1103–1108, 2013.
[5] Hadeer Ahmed, Issa Traore and Sherif Saad, “Detection of Online Fake News Using N-Gram
Analysis and Machine Learning Techniques,” Springer, pp. 127–138, 2017.
[6] K. Wu, S. Yang, and K. Q, “False rumors detection on sina weibo by propagation structures,”
IEEE Int. Conf. Data Engineering, 2015.
[7] S. Sun, H. Liu, J. He, and X. Du, “Detecting event rumors on sina weibo automatically,” Web
Technologies and Applications, Springer, pp. 120– 131, 2013.
[8] Zhiwei Jin, Juan Cao, Yongdong Zhang, Jianshe Zhou and Qi Tian, “Novel Visual and Statistical Image Features for Microblogs News Verification,” IEEE Trans. Multimedia, pp. 1520-9210, 2016.

