DETECTING MALICIOUS SOCIAL BOTS
ABSTRACT
Malicious social bots generate fake tweets and automate their social relationships, either by
pretending to be followers or by creating multiple fake accounts with malicious activities.
Moreover, malicious social bots post shortened malicious URLs in tweets in order to
redirect the requests of online social networking participants to malicious servers. Hence,
distinguishing malicious social bots from legitimate users is one of the most important tasks
in the Twitter network. To detect malicious social bots, extracting URL-based features (such
as URL redirection, frequency of shared URLs, and spam content in URLs) consumes less
time than extracting social graph-based features (which rely on the social
interactions of users). Furthermore, malicious social bots cannot easily manipulate URL
redirection chains. In this article, a learning automata-based malicious social bot detection
(LA-MSBD) algorithm is proposed by integrating a trust computation model with URL-based
features for identifying trustworthy participants (users) in the Twitter network. The proposed
trust computation model contains two parameters, namely, direct trust and indirect trust.
Moreover, the direct trust is derived from Bayes’ theorem, and the indirect trust is derived from
the Dempster–Shafer theory (DST) to determine the trustworthiness of each participant
accurately. Experimentation has been performed on two Twitter data sets, and the results
illustrate that the proposed algorithm achieves improvement in precision, recall, F-measure,
and accuracy compared with existing approaches for MSBD.
INTRODUCTION:
What is Machine Learning?
Machine learning is a family of computer algorithms that can learn from examples through
self-improvement without being explicitly coded by a programmer. Machine learning is a part
of artificial intelligence that combines data with statistical tools to predict an output, which
can be used to produce actionable insights.
The breakthrough comes with the idea that a machine can singularly learn from the data (i.e.,
example) to produce accurate results. Machine learning is closely related to data mining and
Bayesian predictive modelling. The machine receives data as input and uses an algorithm to
formulate answers.
A typical machine learning task is to provide a recommendation. For those who have a
Netflix account, all recommendations of movies or series are based on the user's historical
data. Tech companies use unsupervised learning to improve the user experience with
personalized recommendations.
Machine learning is also used for a variety of tasks such as fraud detection, predictive
maintenance, portfolio optimization, and task automation.
Machine learning is designed to overcome the limitations of hand-written rules. The machine
learns how the input and output data are correlated and writes a rule itself. The programmers do
not need to write new rules each time there is new data; the algorithms adapt in response to new
data and experience to improve their efficacy over time.
For instance, suppose the machine is trying to understand the relationship between an individual's
wage and the likelihood of going to a fancy restaurant. If the machine finds a positive relationship
between wage and going to a high-end restaurant, this is the model inferring.
When the model is built, it is possible to test how well it performs on never-seen-before data. The
new data are transformed into a feature vector, passed through the model, and a prediction is
produced. This is the beautiful part of machine learning: there is no need to update the rules or
retrain the model. You can use the previously trained model to make inferences on new data.
The life of Machine Learning programs is straightforward and can be summarized in the
following points:
1. Define a question
2. Collect data
3. Visualize data
4. Train algorithm
5. Test the Algorithm
6. Collect feedback
7. Refine the algorithm
8. Loop 4-7 until the results are satisfying
9. Use the model to make a prediction
Once the algorithm gets good at drawing the right conclusions, it applies that knowledge to
new sets of data.
Machine learning can be grouped into two broad learning tasks: supervised and unsupervised.
There are other paradigms as well, such as reinforcement learning, discussed later in this document.
Supervised learning
An algorithm uses training data and feedback from humans to learn the relationship of given
inputs to a given output. For instance, a practitioner can use marketing expense and weather
forecast as input data to predict the sales of cans.
You can use supervised learning when the output data is known. The algorithm will then predict
outputs for new data.
There are two categories of supervised learning:
• Classification task
• Regression task
Classification
Imagine you want to predict the gender of a customer for a commercial. You would start by
gathering data on the height, weight, job, salary, purchasing basket, etc. from your customer
database. You know the gender of each of your customers; it can only be male or female. The
objective of the classifier is to assign a probability of being a male or a female (i.e., the
label) based on the information (i.e., the features you have collected). Once the model has learned
how to recognize male or female, you can use new data to make a prediction. For instance,
you have just received new information from an unknown customer and want to know whether it
is a male or female. If the classifier predicts male = 70%, the algorithm is 70% sure that this
customer is a male and 30% sure that it is a female. A minimal sketch of such a classifier is shown below.
The label can have two or more classes. The above machine learning example has only two
classes, but if a classifier needs to predict objects, there can be dozens of classes (e.g., glass, table,
shoes; each object represents a class).
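As a small illustration of such a classifier (not taken from the report), the sketch below trains a logistic regression model on made-up customer features with scikit-learn; the feature values and column meanings are assumptions for demonstration only.

from sklearn.linear_model import LogisticRegression

# Each row: [height_cm, weight_kg, salary_k]; labels: 1 = male, 0 = female (toy data)
X_train = [[180, 82, 55], [165, 60, 48], [175, 77, 60], [158, 52, 45]]
y_train = [1, 0, 1, 0]

clf = LogisticRegression()
clf.fit(X_train, y_train)

# Probability that a new, unknown customer is female (class 0) or male (class 1)
new_customer = [[170, 70, 50]]
print(clf.predict_proba(new_customer))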
Regression
When the output is a continuous value, the task is a regression. For instance, a financial analyst
may need to forecast the value of a stock based on a range of features like equity, previous stock
performance, and macroeconomic indices. The system will be trained to estimate the price of the
stocks with the lowest possible error.
Unsupervised learning
In unsupervised learning, an algorithm explores input data without being given an explicit
output variable (e.g., it explores customer demographic data to identify patterns).
You can use it when you do not know how to classify the data and you want the algorithm to
find patterns and classify the data for you.
Machine learning (ML) is the study of computer algorithms that improve automatically
through experience. It is seen as a part of artificial intelligence. Machine learning algorithms
build a model based on sample data, known as "training data", in order to make predictions or
decisions without being explicitly programmed to do so. Machine learning algorithms are used
in a wide variety of applications, such as email filtering and computer vision, where it is
difficult or unfeasible to develop conventional algorithms to perform the needed tasks.
Overview
Machine learning involves computers discovering how they can perform tasks without being
explicitly programmed to do so. It involves computers learning from data provided so that they
carry out certain tasks. For simple tasks assigned to computers, it is possible to program
algorithms telling the machine how to execute all steps required to solve the problem at hand;
on the computer's part, no learning is needed. For more advanced tasks, it can be challenging
for a human to manually create the needed algorithms. In practice, it can turn out to be more
effective to help the machine develop its own algorithm, rather than having human
programmers specify every needed step.
Tom M. Mitchell provided a widely quoted, more formal definition of the algorithms studied
in the machine learning field: "A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P if its performance at tasks in T, as
measured by P, improves with experience E." This definition of the tasks in which machine
learning is concerned offers a fundamentally operational definition rather than defining the
field in cognitive terms. This follows Alan Turing's proposal in his paper "Computing
Machinery and Intelligence", in which the question "Can machines think?" is replaced with
the question "Can machines do what we (as thinking entities) can do?".
Modern-day machine learning has two objectives: one is to classify data based on models
which have been developed, and the other is to make predictions for future outcomes based
on these models. A hypothetical algorithm specific to classifying data may use computer vision
of moles coupled with supervised learning in order to train it to classify cancerous moles,
whereas a machine learning algorithm for stock trading may inform the trader of future
potential predictions.
Artificial intelligence
[Figure: machine learning as a subfield of AI, or AI and machine learning as partially overlapping fields]
As a scientific endeavour, machine learning grew out of the quest for artificial intelligence. In
the early days of AI as an academic discipline, some researchers were interested in having
machines learn from data. They attempted to approach the problem with various symbolic
methods, as well as what was then termed "neural networks"; these were mostly perceptrons
and other models that were later found to be reinventions of the generalized linear models of
statistics. Probabilistic reasoning was also employed, especially in automated medical
diagnosis.
However, an increasing emphasis on the logical, knowledge-based approach caused a rift
between AI and machine learning. Probabilistic systems were plagued by theoretical and
practical problems of data acquisition and representation. By 1980, expert systems had come
to dominate AI, and statistics was out of favor. Work on symbolic/knowledge-based learning
did continue within AI, leading to inductive logic programming, but the more statistical line
of research was now outside the field of AI proper, in pattern recognition and information
retrieval. Neural networks research had been abandoned by AI and computer science around
the same time. This line, too, was continued outside the AI/CS field, as "connectionism", by
researchers from other disciplines including Hopfield, Rumelhart and Hinton. Their main
success came in the mid-1980s with the reinvention of backpropagation.
Machine learning (ML), reorganized as a separate field, started to flourish in the 1990s. The
field changed its goal from achieving artificial intelligence to tackling solvable problems of a
practical nature. It shifted focus away from the symbolic approaches it had inherited from AI,
and toward methods and models borrowed from statistics and probability theory.
As of 2020, many sources continue to assert that machine learning remains a subfield of AI.
The main disagreement is whether all of ML is part of AI, as this would mean that anyone
using ML could claim they are using AI. Others hold the view that not all of ML is part of AI,
and that only an 'intelligent' subset of ML is part of AI.
The question of what the difference is between ML and AI is answered by Judea Pearl in The
Book of Why: ML learns and predicts based on passive observations, whereas AI
implies an agent interacting with the environment to learn and take actions that maximize its
chance of successfully achieving its goals.
Data mining
Machine learning and data mining often employ the same methods and overlap significantly,
but while machine learning focuses on prediction, based on known properties learned from the
training data, data mining focuses on the discovery of (previously) unknown properties in the
data (this is the analysis step of knowledge discovery in databases). Data mining uses many
machine learning methods, but with different goals; on the other hand, machine learning also
employs data mining methods as "unsupervised learning" or as a preprocessing step to improve
learner accuracy. Much of the confusion between these two research communities (which do
often have separate conferences and separate journals, ECML PKDD being a major exception)
comes from the basic assumptions they work with: in machine learning, performance is usually
evaluated with respect to the ability to reproduce known knowledge, while in knowledge
discovery and data mining (KDD) the key task is the discovery of previously unknown
knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised)
method will easily be outperformed by other supervised methods, while in a typical KDD task,
supervised methods cannot be used due to the unavailability of training data.
Optimization
Machine learning also has intimate ties to optimization: many learning problems are
formulated as minimization of some loss function on a training set of examples. Loss functions
express the discrepancy between the predictions of the model being trained and the actual
problem instances (for example, in classification, one wants to assign a label to instances, and
models are trained to correctly predict the pre-assigned labels of a set of examples).
Generalization
The difference between optimization and machine learning arises from the goal of
generalization: while optimization algorithms can minimize the loss on a training set, machine
learning is concerned with minimizing the loss on unseen samples. Characterizing the
generalization of various learning algorithms is an active topic of current research, especially
for deep learning algorithms.
Statistics
Machine learning and statistics are closely related fields in terms of methods, but distinct in
their principal goal: statistics draws population inferences from a sample, while machine
learning finds generalizable predictive patterns. According to Michael I. Jordan, the ideas of
machine learning, from methodological principles to theoretical tools, have had a long pre-
history in statistics. He also suggested the term data science as a placeholder to call the overall
field.
Leo Breiman distinguished two statistical modelling paradigms: data model and algorithmic
model, wherein "algorithmic model" means more or less the machine learning algorithms like
Random Forest.
Some statisticians have adopted methods from machine learning, leading to a combined field
that they call statistical learning.
Theory
A core objective of a learner is to generalize from its experience. Generalization in this context
is the ability of a learning machine to perform accurately on new, unseen examples/tasks after
having experienced a learning data set. The training examples come from some generally
unknown probability distribution (considered representative of the space of occurrences) and
the learner has to build a general model about this space that enables it to produce sufficiently
accurate predictions in new cases.
The computational analysis of machine learning algorithms and their performance is a branch
of theoretical computer science known as computational learning theory. Because training sets
are finite and the future is uncertain, learning theory usually does not yield guarantees of the
performance of algorithms. Instead, probabilistic bounds on the performance are quite
common. The bias–variance decomposition is one way to quantify generalization error.
For the best performance in the context of generalization, the complexity of the hypothesis
should match the complexity of the function underlying the data. If the hypothesis is less
complex than the function, then the model has underfitted the data. If the complexity of the
model is increased in response, then the training error decreases. But if the hypothesis is too
complex, then the model is subject to overfitting and generalization will be poorer.
In addition to performance bounds, learning theorists study the time complexity and feasibility
of learning. In computational learning theory, a computation is considered feasible if it can be
done in polynomial time. There are two kinds of time complexity results. Positive results show
that a certain class of functions can be learned in polynomial time. Negative results show that
certain classes cannot be learned in polynomial time.
Approaches
Types of learning algorithms
The types of machine learning algorithms differ in their approach, the type of data they input
and output, and the type of task or problem that they are intended to solve.
Supervised learning
[Figure: a support vector machine is a supervised learning model that divides the data into regions separated by a linear boundary; here, the linear boundary divides the black circles from the white.]
Supervised learning algorithms build a mathematical model of a set of data that contains both
the inputs and the desired outputs. The data is known as training data, and consists of a set of
training examples. Each training example has one or more inputs and the desired output, also
known as a supervisory signal. In the mathematical model, each training example is
represented by an array or vector, sometimes called a feature vector, and the training data is
represented by a matrix. Through iterative optimization of an objective function, supervised
learning algorithms learn a function that can be used to predict the output associated with new
inputs. An optimal function will allow the algorithm to correctly determine the output for
inputs that were not a part of the training data. An algorithm that improves the accuracy of its
outputs or predictions over time is said to have learned to perform that task.
Types of supervised learning algorithms include active learning, classification and regression.
Classification algorithms are used when the outputs are restricted to a limited set of values,
and regression algorithms are used when the outputs may have any numerical value within a
range. As an example, for a classification algorithm that filters emails, the input would be an
incoming email, and the output would be the name of the folder in which to file the email.
Similarity learning is an area of supervised machine learning closely related to regression and
classification, but the goal is to learn from examples using a similarity function that measures
how similar or related two objects are. It has applications in ranking, recommendation systems,
visual identity tracking, face verification, and speaker verification.
Unsupervised learning
Unsupervised learning algorithms take a set of data that contains only inputs, and find structure
in the data, like grouping or clustering of data points. The algorithms, therefore, learn from
test data that has not been labelled, classified or categorized. Instead of responding to feedback,
unsupervised learning algorithms identify commonalities in the data and react based on the
presence or absence of such commonalities in each new piece of data. A central application of
unsupervised learning is in the field of density estimation in statistics, such as finding the
probability density function, though unsupervised learning also encompasses other domains
involving summarizing and explaining data features.
Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that
observations within the same cluster are similar according to one or more predesignated
criteria, while observations drawn from different clusters are dissimilar. Different clustering
techniques make different assumptions on the structure of the data, often defined by some
similarity metric and evaluated, for example, by internal compactness, or the similarity
between members of the same cluster, and separation, the difference between clusters. Other
methods are based on estimated density and graph connectivity.
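As a hedged illustration of cluster analysis (not part of the original text), the sketch below groups a handful of synthetic 2-D points with k-means from scikit-learn.

import numpy as np
from sklearn.cluster import KMeans

# Two visually obvious groups of points
points = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
                   [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # estimated centre of each cluster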
Semi-supervised learning
Semi-supervised learning falls between unsupervised learning (without any labelled training
data) and supervised learning (with completely labelled training data). Some of the training
examples are missing training labels, yet many machine-learning researchers have found that
unlabelled data, when used in conjunction with a small amount of labelled data, can produce
a considerable improvement in learning accuracy.
In weakly supervised learning, the training labels are noisy, limited, or imprecise; however,
these labels are often cheaper to obtain, resulting in larger effective training sets.
Reinforcement learning
Reinforcement learning is an area of machine learning concerned with how software agents
ought to take actions in an environment so as to maximize some notion of cumulative reward.
Due to its generality, the field is studied in many other disciplines, such as game theory, control
theory, operations research, information theory, simulation-based optimization, multi-agent
systems, swarm intelligence, statistics and genetic algorithms. In machine learning, the
environment is typically represented as a Markov decision process (MDP). Many
reinforcement learning algorithms use dynamic programming techniques. Reinforcement
learning algorithms do not assume knowledge of an exact mathematical model of the MDP,
and are used when exact models are infeasible. Reinforcement learning algorithms are used in
autonomous vehicles or in learning to play a game against a human opponent.
Self-learning
Self-learning as a machine learning paradigm was introduced in 1982 along with a neural
network capable of self-learning, named crossbar adaptive array (CAA). It is learning with
no external rewards and no external teacher advice. The CAA self-learning algorithm
computes, in a crossbar fashion, both decisions about actions and emotions (feelings) about
consequence situations. The system is driven by the interaction between cognition and
emotion. The self-learning algorithm updates a memory matrix W = ||w(a,s)|| such that in each
iteration it executes the following machine learning routine:
1. In situation s perform action a.
2. Receive the consequence situation s'.
3. Compute the emotion of being in the consequence situation, v(s').
4. Update the crossbar memory: w'(a,s) = w(a,s) + v(s').
Feature learning
Several learning algorithms aim at discovering better representations of the inputs provided
during training. Classic examples include principal components analysis and cluster analysis.
Feature learning algorithms, also called representation learning algorithms, often attempt to
preserve the information in their input but also transform it in a way that makes it useful, often
as a pre-processing step before performing classification or predictions. This technique allows
reconstruction of the inputs coming from the unknown data-generating distribution, while not
being necessarily faithful to configurations that are implausible under that distribution. This
replaces manual feature engineering, and allows a machine to both learn the features and use
them to perform a specific task.
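A minimal sketch of representation learning with principal component analysis is shown below, assuming scikit-learn is available; the synthetic data and the choice of two components are illustrative assumptions.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.RandomState(0).normal(size=(100, 3))   # 100 samples, 3 raw features
X[:, 2] = X[:, 0] + 0.1 * X[:, 1]                    # third feature is nearly redundant

pca = PCA(n_components=2)             # learn a compact 2-D representation
X_reduced = pca.fit_transform(X)      # transformed inputs, usable as a pre-processing step
print(pca.explained_variance_ratio_)  # share of information each component preserves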
Anomaly detection
In data mining, anomaly detection, also known as outlier detection, is the identification of rare
items, events or observations which raise suspicions by differing significantly from the
majority of the data. Typically, the anomalous items represent an issue such as bank fraud, a
structural defect, medical problems or errors in a text. Anomalies are referred to as outliers,
novelties, noise, deviations and exceptions.
In particular, in the context of abuse and network intrusion detection, the interesting objects
are often not rare objects, but unexpected bursts of activity. This pattern does not adhere to
the common statistical definition of an outlier as a rare object, and many outlier detection
methods (in particular, unsupervised algorithms) will fail on such data unless it has been
aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the micro-
clusters formed by these patterns.
Three broad categories of anomaly detection techniques exist. Unsupervised anomaly
detection techniques detect anomalies in an unlabelled test data set under the assumption that
the majority of the instances in the data set are normal, by looking for instances that seem to
fit least to the remainder of the data set. Supervised anomaly detection techniques require a
data set that has been labelled as "normal" and "abnormal" and involves training a classifier
(the key difference to many other statistical classification problems is the inherently
unbalanced nature of outlier detection). Semi-supervised anomaly detection techniques
construct a model representing normal behaviour from a given normal training data set and
then test the likelihood of a test instance to be generated by the model.
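The sketch below is one hedged example of unsupervised anomaly detection using an isolation forest from scikit-learn; the request counts are invented and the contamination setting is an assumption.

from sklearn.ensemble import IsolationForest

# Mostly "normal" observations plus one obvious outlier
requests_per_minute = [[20], [22], [19], [21], [23], [20], [500]]

detector = IsolationForest(contamination=0.15, random_state=0)
labels = detector.fit_predict(requests_per_minute)
print(labels)  # 1 = normal, -1 = anomaly; the burst of 500 should be flagged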
Robot learning
In developmental robotics, robot learning algorithms generate their own sequences of learning
experiences, also known as a curriculum, to cumulatively acquire new skills through self-
guided exploration and social interaction with humans. These robots use guidance mechanisms
such as active learning, maturation, motor synergies and imitation.
Association rules
Association rule learning is a rule-based machine learning method for discovering
relationships between variables in large databases. It is intended to identify strong rules
discovered in databases using some measure of "interestingness".
Rule-based machine learning is a general term for any machine learning method that identifies,
learns, or evolves "rules" to store, manipulate or apply knowledge. The defining characteristic
of a rule-based machine learning algorithm is the identification and utilization of a set of
relational rules that collectively represent the knowledge captured by the system. This is in
contrast to other machine learning algorithms that commonly identify a singular model that
can be universally applied to any instance in order to make a prediction. Rule-based machine
learning approaches include learning classifier systems, association rule learning, and artificial
immune systems. Based on the concept of strong rules, Rakesh Agrawal, Tomasz Imieliński
and Arun Swami introduced association rules for discovering regularities between products in
large-scale transaction data recorded by point-of-sale (POS) systems in supermarkets. For
example, the rule {onions, potatoes} ⇒ {burger} found in
the sales data of a supermarket would indicate that if a customer buys onions and potatoes
together, they are likely to also buy hamburger meat. Such information can be used as the basis
for decisions about marketing activities such as promotional pricing or product placements. In
addition to market basket analysis, association rules are employed today in application areas
including Web usage mining, intrusion detection, continuous production, and bioinformatics.
In contrast with sequence mining, association rule learning typically does not consider the
order of items either within a transaction or across transactions.
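To make the support and confidence behind such rules concrete, the toy computation below evaluates the {onions, potatoes} ⇒ {burger} rule on a few invented transactions in plain Python; it is illustrative only, not a full association-rule miner.

transactions = [
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "beer"},
    {"onions", "potatoes"},
    {"milk", "bread"},
]

antecedent = {"onions", "potatoes"}
rule_items = antecedent | {"burger"}

# support: fraction of transactions containing every item of the rule
support = sum(rule_items <= t for t in transactions) / len(transactions)
# confidence: of the transactions containing the antecedent, how many also contain burger
confidence = (sum(rule_items <= t for t in transactions) /
              sum(antecedent <= t for t in transactions))
print(support, confidence)  # 0.5 and about 0.67 for this toy data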
Learning classifier systems (LCS) are a family of rule-based machine learning algorithms that
combine a discovery component, typically a genetic algorithm, with a learning component,
performing either supervised learning, reinforcement learning, or unsupervised learning. They
seek to identify a set of context-dependent rules that collectively store and apply knowledge
in a piecewise manner in order to make predictions.
Inductive logic programming (ILP) is an approach to rule-learning using logic programming
as a uniform representation for input examples, background knowledge, and hypotheses.
Given an encoding of the known background knowledge and a set of examples represented as
a logical database of facts, an ILP system will derive a hypothesized logic program that entails
all positive and no negative examples. Inductive programming is a related field that considers
any kind of programming language for representing hypotheses (and not only logic
programming), such as functional programs.
Inductive logic programming is particularly useful in bioinformatics and natural language
processing. Gordon Plotkin and Ehud Shapiro laid the initial theoretical foundation for
inductive machine learning in a logical setting. Shapiro built the first implementation (Model
Inference System) in 1981: a Prolog program that inductively inferred logic programs from
positive and negative examples. The term inductive here refers to philosophical induction,
suggesting a theory to explain observed facts, rather than mathematical induction, proving a
property for all members of a well-ordered set.
Models
Performing machine learning involves creating a model, which is trained on some training data
and then can process additional data to make predictions. Various types of models have been
used and researched for machine learning systems.
Artificial neural networks
Artificial neural networks (ANNs) are computing systems vaguely inspired by the biological
neural networks that constitute animal brains. The original goal of the ANN approach was to
solve problems in the same way that a human
brain would. However, over time, attention moved to performing specific tasks, leading to
deviations from biology. Artificial neural networks have been used on a variety of tasks,
including computer vision, speech recognition, machine translation, social network filtering,
playing board and video games and medical diagnosis.
Deep learning consists of multiple hidden layers in an artificial neural network. This approach
tries to model the way the human brain processes light and sound into vision and hearing.
Some successful applications of deep learning are computer vision and speech recognition.
Decision trees
Decision tree learning uses a decision tree as a predictive model to go from observations about
an item (represented in the branches) to conclusions about the item's target value (represented
in the leaves). It is one of the predictive modeling approaches used in statistics, data mining,
and machine learning. Tree models where the target variable can take a discrete set of values
are called classification trees; in these tree structures, leaves represent class labels and
branches represent conjunctions of features that lead to those class labels. Decision trees where
the target variable can take continuous values (typically real numbers) are called regression
trees. In decision analysis, a decision tree can be used to visually and explicitly represent
decisions and decision making. In data mining, a decision tree describes data, but the resulting
classification tree can be an input for decision making.
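As a brief, hedged illustration (not from the report), the snippet below fits a shallow decision tree to the classic Iris data set with scikit-learn and prints the learned branch-and-leaf rules.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Branches are feature tests, leaves are class labels
print(export_text(tree, feature_names=iris.feature_names))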
Regression analysis
Regression analysis encompasses a large variety of statistical methods to estimate the
relationship between input variables and their associated features. Its most common form is
linear regression, where a single line is drawn to best fit the given data according to a
mathematical criterion such as ordinary least squares. The latter is often extended by
regularization methods to mitigate overfitting and bias, as in ridge regression.
When dealing with non-linear problems, go-to models include polynomial regression (for
example, used for trendline fitting in Microsoft Excel), logistic regression (often used in
statistical classification) or even kernel regression, which introduces non-linearity by taking
advantage of the kernel trick to implicitly map input variables to a higher-dimensional space.
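A short sketch of ordinary least squares versus ridge regression on synthetic data follows, assuming scikit-learn; the data-generating line y ≈ 3x + 2 is an assumption made for the example.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=1.0, size=50)   # noisy line y ≈ 3x + 2

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)   # regularized variant that mitigates overfitting

print(ols.coef_, ols.intercept_)     # recovered slope and intercept (close to 3 and 2)
print(ridge.coef_, ridge.intercept_)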
Bayesian networks
[Figure: a simple Bayesian network. Rain influences whether the sprinkler is activated, and both rain and the sprinkler influence whether the grass is wet.]
A Bayesian network, belief network, or directed acyclic graphical model is a probabilistic
graphical model that represents a set of random variables and their conditional independence
with a directed acyclic graph (DAG). For example, a Bayesian network could represent the
probabilistic relationships between diseases and symptoms. Given symptoms, the network can
be used to compute the probabilities of the presence of various diseases. Efficient algorithms
exist that perform inference and learning. Bayesian networks that model sequences of
variables, like speech signals or protein sequences, are called dynamic Bayesian networks.
Generalizations of Bayesian networks that can represent and solve decision problems under
uncertainty are called influence diagrams.
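To make the rain–sprinkler–grass example concrete, the sketch below performs inference by brute-force enumeration in plain Python; the conditional probability values are assumed for illustration and are not given in the text.

# P(rain), P(sprinkler | rain) and P(wet | sprinkler, rain) -- assumed values
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
P_wet = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8, (False, False): 0.0}

# P(rain | grass is wet) = P(rain, wet) / P(wet), summing over the sprinkler state
joint_wet = 0.0
joint_rain_wet = 0.0
for rain in (True, False):
    for sprinkler in (True, False):
        p = P_rain[rain] * P_sprinkler[rain][sprinkler] * P_wet[(sprinkler, rain)]
        joint_wet += p
        if rain:
            joint_rain_wet += p

print(joint_rain_wet / joint_wet)  # posterior probability of rain given wet grass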
Genetic algorithms
A genetic algorithm (GA) is a search algorithm and heuristic technique that mimics the process
of natural selection, using methods such as mutation and crossover to generate new genotypes
in the hope of finding good solutions to a given problem. In machine learning, genetic
algorithms were used in the 1980s and 1990s. Conversely, machine learning techniques have
been used to improve the performance of genetic and evolutionary algorithms.
Training models
Machine learning models usually require a lot of data in order to perform well. When training
a machine learning model, one typically needs to collect a large, representative sample of data
from a training set. Data from the training set can be as varied as a corpus of text, a collection
of images, or data collected from individual users of a service. Overfitting is something to
watch out for when training a machine learning model. Trained models derived from biased
data can result in skewed or undesired predictions; algorithmic bias is a potential result of
data not being fully prepared for training.
Federated learning
Federated learning is an adapted form of distributed artificial intelligence for training machine
learning models that decentralizes the training process, allowing users' privacy to be
maintained by not needing to send their data to a centralized server. This also increases
efficiency by decentralizing the training process to many devices. For example, Gboard uses
federated machine learning to train search query prediction models on users' mobile phones
without having to send individual searches back to Google.
SYSTEM STUDY
Feasibility Study
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with
a very general plan for the project and some cost estimates. During system analysis, the
feasibility study of the proposed system is carried out. This is to ensure that the
proposed system is not a burden to the company. For feasibility analysis, some
understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are:
• Economic feasibility
• Technical feasibility
• Social feasibility
Economic Feasibility
This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can pour into the research and
development of the system is limited, so the expenditures must be justified. The developed
system is well within the budget, and this was achieved because most of the technologies used
are freely available; only the customized products had to be purchased.
Technical Feasibility
This study is carried out to check the technical feasibility, that is, the technical requirements
of the system. Any system developed must not place a high demand on the available technical
resources, as this would lead to high demands being placed on the client. The developed
system must have modest requirements, as only minimal or no changes are required for
implementing this system.
Social Feasibility
This aspect of the study is to check the level of acceptance of the system by the user. This includes
the process of training the user to use the system efficiently. The user must not feel threatened
by the system, but must instead accept it as a necessity. The level of acceptance by the users solely
depends on the methods that are employed to educate the user about the system and to make
the user familiar with it. The user's level of confidence must be raised so that he or she is also
able to offer constructive criticism, which is welcomed, as he or she is the final user of the system.
SYSTEM ANALYSIS
EXISTING SYSTEM:
➢ The existing malicious URL detection approaches are based on DNS information and
lexical properties of URLs. The malicious social bots use URL redirections in order to
avoid detection.
➢ Besel et al. analysed a social botnet attack on Twitter. The authors showed that
social bots use URL shortening services and URL redirection in order to redirect users
to malicious web pages.
➢ Echeverria and Zhou presented methods to detect, retrieve, and analyse botnets over
thousands of users to observe the social behaviour of bots.
➢ Dorri et al. proposed a social bot hunter model based on user behavioural features,
such as the follower ratio, the number of URLs, and the reputation score.
➢ M. Agarwal et al. designed a trust model to detect malicious activities in an OSN.
Disadvantages Of Existing System:
➢ The malicious social bots can manipulate profile features, such as the hashtag ratio, follower
ratio, URL ratio, and the number of retweets. The malicious social bots can also
manipulate tweet-content features, such as sentimental words, emoticons, and the most
frequent words used in the tweets, by manipulating the content of each tweet. The social
relationship-based features are highly robust because the malicious social bots cannot
easily manipulate the social interactions of users in the Twitter network; however,
extracting such social graph-based features is time-consuming.
➢ The existing approaches rely on statistical features instead of analysing the social
behaviour of users. Moreover, these approaches are not highly robust in detecting
temporal data patterns with noisy data (i.e., where the data is biased with untrustworthy
or fake information), because the behaviour of malicious bots changes over time in order
to avoid detection.
PROPOSED SYSTEM:
➢ In the proposed system, the malicious behaviour of participants is analysed by
considering features extracted from the posted URLs (in the tweets), such as URL
redirection, frequency of shared URLs, and spam content in URL, to distinguish
between legitimate and malicious tweets. To protect against the malicious social bot
attacks, our proposed LA-based malicious social bot detection (LA-MSBD) algorithm
integrates a trust computational model with a set of URL-based features for the detection
of malicious social bots.
➢ In the proposed system, we analyse the malicious behaviour of a participant by
considering URL-based features, such as URL redirection, the relative position of the URL,
the frequency of shared URLs, and spam content in the URL.
➢ In the proposed system, we evaluate the trustworthiness of tweets (posted by each
participant) by using Bayesian learning and the Dempster–Shafer theory (DST).
➢ Also, we design the system by integrating a trust model with a set of URL-based
features.
Advantages Of Proposed System:
➢ The proposed system helps to detect malicious social bots accurately.
➢ The experimental results illustrate that our proposed system gives better performance
compared with conventional machine learning algorithms in terms of precision.
➢ The precision value obtained for The Fake Project data set is better than that for the Social
Honeypot data set, because the Social Honeypot data set contains more noisy and
untrustworthy information in its user content features than The Fake Project data set.
➢ The proposed system achieves the highest precision level. This is due to the fact that the
proposed system executes for a finite set of learning actions to update the action
probability value and achieves the advantages of incremental learning. Hence, the LA
model with a trust component identifies the malicious tweets that are posted by
malicious social bots.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
SOFTWARE REQUIREMENTS:
SYSTEM DESIGN
SYSTEM ARCHITECTURE:
1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be
used to represent a system in terms of the input data to the system, the various processing
carried out on this data, and the output data generated by this system.
2. The data flow diagram (DFD) is one of the most important modelling tools. It is used to
model the system components. These components are the system process, the data used
by the process, the external entities that interact with the system, and the information
flows in the system.
3. The DFD shows how information moves through the system and how it is modified by
a series of transformations. It is a graphical technique that depicts information flow and
the transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction and may be
partitioned into levels that represent increasing information flow and functional detail.
[System architecture / data flow: Input data → Preprocessing → Training dataset → Feature extraction → Classification (Spam / No Spam)]
UML DIAGRAMS
UML stands for Unified Modelling Language. UML is a standardized general-purpose
modelling language in the field of object-oriented software engineering. The standard is
managed, and was created by, the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented
computer software. In its current form, UML comprises two major components: a meta-model
and a notation. In the future, some form of method or process may also be added to, or
associated with, UML.
The Unified Modelling Language is a standard language for specifying, visualizing,
constructing, and documenting the artifacts of a software system, as well as for business
modelling and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful in
the modelling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software
development process. The UML uses mostly graphical notations to express the design of
software projects.
Goals:
[Use case diagram: User → Input data → Preprocessing → Training → Classification]
CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modelling Language (UML) is a type
of static structure diagram that describes the structure of a system by showing the system's
classes, their attributes, operations (or methods), and the relationships among the classes. It
explains which class contains information.
[Class diagram: Input and Output classes]
SEQUENCE DIAGRAM:
A sequence diagram in Unified Modelling Language (UML) is a kind of interaction diagram
that shows how processes operate with one another and in what order. It is a construct of a
Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event
scenarios, and timing diagrams.
[Sequence diagram: give input → perform preprocessing]
ACTIVITY DIAGRAM:
Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modelling Language,
activity diagrams can be used to describe the business and operational step-by-step workflows
of components in a system. An activity diagram shows the overall flow of control.
[Activity diagram: Input dataset → Preprocessing → Training]
SOFTWARE ENVIRONMENT
PYTHON:
Python is a high-level, interpreted, interactive and object-oriented scripting language. Python
is designed to be highly readable. It uses English keywords frequently, whereas other
languages use punctuation, and it has fewer syntactical constructions than other languages.
• Python is Interpreted − Python is processed at runtime by the interpreter. You do not
need to compile your program before executing it. This is similar to PERL and PHP.
• Python is Interactive − You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
• Python is Object-Oriented − Python supports Object-Oriented style or technique of
programming that encapsulates code within objects.
• Python is a Beginner's Language − Python is a great language for beginner-level
programmers and supports the development of a wide range of applications, from simple
text processing to WWW browsers to games.
History of Python
Python was developed by Guido van Rossum in the late eighties and early nineties at the
National Research Institute for Mathematics and Computer Science in the Netherlands.
Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.
Python is copyrighted. Like Perl, Python source code is available under an open-source
license (the Python Software Foundation License, which is GPL-compatible).
Python is now maintained by a core development team, with Guido van Rossum having long
held a vital role in directing its progress.
Python Features
Python's features include −
• Easy-to-learn − Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
• Easy-to-read − Python code is more clearly defined and visible to the eyes.
• Easy-to-maintain − Python's source code is fairly easy-to-maintain.
• A broad standard library − The bulk of Python's library is very portable and cross-
platform compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
• Extendable − You can add low-level modules to the Python interpreter. These modules
enable programmers to add to or customize their tools to be more efficient.
• Databases − Python provides interfaces to all major commercial databases.
• GUI Programming − Python supports GUI applications that can be created and ported
to many system calls, libraries, and windowing systems, such as Windows MFC,
Macintosh, and the X Window System of Unix.
• Scalable − Python provides a better structure and support for large programs than shell
scripting.
Apart from the above-mentioned features, Python has a big list of good features; a few are
listed below −
• It supports functional and structured programming methods as well as OOP.
• It can be used as a scripting language or can be compiled to byte-code for building large
applications.
• It provides very high-level dynamic data types and supports dynamic type checking.
• It supports automatic garbage collection.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
Python is available on a wide variety of platforms including Linux and Mac OS X. Let's
understand how to set up our Python environment.
Getting Python
The most up-to-date and current source code, binaries, documentation, news, etc., is available
on the official website of Python https://www.python.org.
Windows Installation
Here are the steps to install Python on Windows machine.
• Open a Web browser and go to https://www.python.org/downloads/.
• Follow the link for the Windows installer python-XYZ.msi file, where XYZ is the
version you need to install.
• To use this installer python-XYZ.msi, the Windows system must support Microsoft
Installer 2.0. Save the installer file to your local machine and then run it to find out if
your machine supports MSI.
• Run the downloaded file. This brings up the Python install wizard, which is really easy
to use. Just accept the default settings, wait until the install is finished, and you are done.
The Python language has many similarities to Perl, C, and Java. However, there are some
definite differences between the languages.
First Python Program
Let us execute programs in different modes of programming.
Interactive Mode Programming
Invoking the interpreter without passing a script file as a parameter brings up the following
prompt −
$ python
Python 2.4.3 (#1, Nov 11 2010, 13:34:43)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Type the following text at the Python prompt and press Enter −
>>> print "Hello, Python!"
If you are running a newer version of Python, you need to use the print statement with
parentheses, as in print("Hello, Python!"). However, in Python version 2.4.3, this produces
the following result −
Hello, Python!
Script Mode Programming
Let us write a simple Python program in a script. Type the following source code in a
test.py file −
print "Hello, Python!"
We assume that you have the Python interpreter set in the PATH variable. Now, try to run this
program as follows −
$ python test.py
Flask Framework:
Flask is a web application framework written in Python. It is developed by Armin Ronacher,
who leads an international group of Python enthusiasts named Pocco. Flask is based on the
Werkzeug WSGI toolkit and the Jinja2 template engine, both of which are Pocco projects. The
HTTP protocol is the foundation of data communication in the World Wide Web. Different
methods of data retrieval from a specified URL are defined in this protocol.
The following list summarizes the different HTTP methods −
1. GET − Sends data in unencrypted form to the server. The most common method.
2. HEAD − Same as GET, but without the response body.
3. POST − Used to send HTML form data to the server. Data received by the POST method is
not cached by the server.
4. PUT − Replaces all current representations of the target resource with the uploaded content.
5. DELETE − Removes all current representations of the target resource given by a URL.
By default, a Flask route responds to GET requests. However, this preference can be
altered by providing the methods argument to the route() decorator.
In order to demonstrate the use of the POST method in URL routing, first let us create an HTML
form and use the POST method to send form data to a URL.
Save the following script as login.html −
<html>
<body>
<form action="http://localhost:5000/login" method="post">
<p>Enter Name:</p>
<p><input type="text" name="nm" /></p>
<p><input type="submit" value="submit" /></p>
</form>
</body>
</html>
After the development server starts running, open login.html in the browser, enter a name in
the text field, and click Submit. The form data posted to the '/login' URL is collected on the
server side with −
user = request.form['nm']
It is passed to the '/success' URL as a variable part, and the browser displays a welcome
message in the window.
Change the method parameter to 'GET' in login.html and open it again in the browser. The
data is now received on the server by the GET method, and the value of the 'nm' parameter is
obtained by −
user = request.args.get('nm')
Here, args is a dictionary object containing pairs of form parameters and their corresponding
values. The value corresponding to the 'nm' parameter is passed on to the '/success' URL as
before. A minimal Flask application tying these pieces together is sketched below.
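The following sketch shows one plausible way to wire these routes together in Flask; the route names and the 'nm' field follow the text above, while the remaining details (the redirect to '/success', debug mode) are illustrative rather than the project's exact code.

from flask import Flask, request, redirect, url_for

app = Flask(__name__)

@app.route('/success/<name>')
def success(name):
    return 'welcome %s' % name             # message shown after a successful submit

@app.route('/login', methods=['POST', 'GET'])
def login():
    if request.method == 'POST':
        user = request.form['nm']          # form data sent by the POST method
    else:
        user = request.args.get('nm')      # query-string data sent by the GET method
    return redirect(url_for('success', name=user))

if __name__ == '__main__':
    app.run(debug=True)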
What is Python?
Python is a popular programming language. It was created in 1991 by Guido van Rossum.
It is used for:
• web development (server-side),
• software development,
• mathematics,
• system scripting.
What can Python do?
• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software
development.
Why Python?
• Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).
• Python has a simple syntax similar to the English language.
• Python has syntax that allows developers to write programs with fewer lines than some
other programming languages.
• Python runs on an interpreter system, meaning that code can be executed as soon as it
is written. This means that prototyping can be very quick.
• Python can be treated in a procedural way, an object-oriented way or a functional way.
Good to know
• The most recent major version of Python is Python 3, which we shall be using in this
tutorial. However, Python 2, although not being updated with anything other than
security updates, is still quite popular.
• In this tutorial Python will be written in a text editor. It is possible to write Python in an
Integrated Development Environment, such as Thonny, Pycharm, Netbeans or Eclipse
which are particularly useful when managing larger collections of Python files.
Python Syntax compared to other programming languages
• Python was designed for readability, and has some similarities to the English language
with influence from mathematics.
• Python uses new lines to complete a command, as opposed to other programming
languages which often use semicolons or parentheses.
• Python relies on indentation, using whitespace, to define scope; such as the scope of
loops, functions and classes. Other programming languages often use curly-brackets for
this purpose.
Python Install
Many PCs and Macs will have python already installed.
To check if you have Python installed on a Windows PC, search in the start bar for Python
or run the following on the command line (cmd.exe):
python --version
To check if you have Python installed on Linux or Mac, open the command line (Linux) or
the Terminal (Mac) and type:
python --version
If you find that you do not have python installed on your computer, then you can
download it for free from the following website: https://www.python.org/
Python Quickstart
Python is an interpreted programming language; this means that as a developer you
write Python (.py) files in a text editor and then put those files into the Python interpreter
to be executed.
The way to run a Python file is like this on the command line:
C:\Users\Your Name>python helloworld.py
Let's write our first Python file, called helloworld.py, which can be done in any text editor.
helloworld.py
print("Hello, World!")
Simple as that. Save your file. Open your command line, navigate to the directory where
you saved your file, and run:
C:\Users\Your Name>python helloworld.py
The output should read:
Hello, World!
Congratulations, you have written and executed your first Python program.
To test a short amount of code, it is sometimes quickest and easiest not to write the code in a
file, because Python itself can be run as a command line. Type the following on the Windows,
Mac or Linux command line:
C:\Users\Your Name>python
From there you can write any Python, including our hello world example from earlier in
the tutorial:
C:\Users\Your Name>python
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hello, World!")
Hello, World!
Whenever you are done in the python command line, you can simply type the following to
quit the python command line interface:
exit()
Python Indentations
Where in other programming languages the indentation in code is for readability only, in
Python the indentation is very important.
Python uses indentation to indicate a block of code.
Example
if 5 > 2:
  print("Five is greater than two!")
Python will give you an error if you skip the indentation:
Example
if 5 > 2:
print("Five is greater than two!")
Comments
Python has commenting capability for the purpose of in-code documentation.
Comments start with a #, and Python will render the rest of the line as a comment:
Example
Comments in Python:
#This is a comment.
print("Hello, World!")
Docstrings
Python also has extended documentation capability, called docstrings.
Docstrings can be one line, or multiline.
Python uses triple quotes at the beginning and end of the docstring:
Example
Docstrings are also comments:
"""This is a
multiline docstring."""
print("Hello, World!")
IMPLEMENTATION
MODULES:
❖ Data Collection
❖ Dataset
❖ Data Preparation
❖ Model Selection
❖ Analyse and Prediction
❖ Accuracy on test set
❖ Saving the Trained Model
❖ Database connecting using MySQL
MODULES DESCRIPTION:
Data Collection:
This is the first real step towards the development of a machine learning model: collecting
data. This is a critical step that determines how good the model will be; the more and better
data we get, the better our model will perform.
There are several techniques to collect the data, such as web scraping and manual intervention.
The data set for the detection of malicious social bots was taken from Kaggle and other sources.
Dataset:
The dataset consists of 969,812 individual records. There are 3 columns in the dataset, which are described below:
1. Id: a unique identifier for each record
2. Label: the class label (Malicious / Not Malicious)
3. URL: the URL content of the tweet
A short sketch of loading and inspecting such a dataset is given below.
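As a minimal sketch (the CSV file name bots_dataset.csv and the exact column names are hypothetical placeholders, since the report does not specify them), the dataset could be loaded and inspected with pandas:
import pandas as pd

# Hypothetical file name; the report only states the data came from Kaggle and other sources
df = pd.read_csv("bots_dataset.csv")

print(df.shape)                    # expected: (969812, 3)
print(df.columns.tolist())         # e.g. ['id', 'label', 'url']
print(df["label"].value_counts())  # distribution of Malicious / Not Malicious labels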
Data Preparation:
We will transform the data by getting rid of missing values and removing some columns. First, we will create a list of column names that we want to keep or retain.
Next, we drop or remove all columns except for the columns that we want to retain.
Finally, we drop or remove the rows that have missing values from the data set.
Steps to follow (a short code sketch of these steps is given after the list):
1. Removing extra symbols
2. Removing punctuations
3. Removing the Stopwords
4. Stemming
5. Tokenization
6. Feature extractions
7. Count Vectorizer
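A minimal sketch of how these preparation steps might look in Python (assuming NLTK and scikit-learn are available; the helper name clean_text and the example strings are illustrative, not taken from the report):
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer

nltk.download("stopwords", quiet=True)   # needed once for the stop-word list

stop_words = set(stopwords.words("english"))
stemmer = PorterStemmer()

def clean_text(text):
    # Lower-case, strip extra symbols and punctuation, drop stopwords, stem the tokens
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    tokens = text.split()                                   # simple tokenization
    tokens = [stemmer.stem(t) for t in tokens if t not in stop_words]
    return " ".join(tokens)

# Illustrative usage on two made-up tweet strings
cleaned = [clean_text(t) for t in ["Free prize!!! click http://bit.ly/xyz NOW",
                                   "Reading a nice article about Python"]]

# Feature extraction: bag-of-words counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(cleaned)
print(vectorizer.get_feature_names_out())
print(X.toarray())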
Model Selection:
We used the logistic regression algorithm.
Logistic regression is a classification algorithm used when the value of the target variable is categorical in nature. Logistic regression is most commonly used when the data in question has a binary output, i.e. when it belongs to one class or another, or is either a 0 or a 1.
Remember that classification tasks have discrete categories, unlike regression tasks.
Here, with the idea of using a regression-style model to solve a classification problem, we naturally raise the question of whether we can draw a hypothesis function that fits the binary dataset. For simplicity, we only consider the binary classification problem.
The answer is that you will have to use a type of function, different from linear functions,
called a logistic function, or a sigmoid function.
(Note: Here’s something important to remember: although the algorithm is called “Logistic
Regression”, it is, in fact, a classification algorithm, not a regression algorithm. This can be
confusing at first, but just try to remember it.)
The sigmoid function (logistic function) resembles an "S"-shaped curve when plotted on a graph. It takes any real-valued input and "squishes" it into the range between 0 and 1, towards the margins at the top and bottom, so that the result can be labelled as 0 or 1. It is defined as
y = 1 / (1 + e^(-x))
What is the variable e in this instance? The e represents the exponential constant, and it has a value of approximately 2.71828.
This gives a value y that is extremely close to 0 if x is a large negative value and close to 1 if x is a large positive value. After the output of a typical linear function of the inputs has been squeezed towards 0 or 1 in this way, the inputs can be put into distinct categories.
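A minimal, self-contained sketch of this model-selection step with scikit-learn (the toy texts, labels and split settings below are illustrative assumptions, not the project's real data):
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data standing in for the cleaned tweet/URL text and its labels
texts = ["free prize click now http bit ly xyz", "nice article about python",
         "win money fast http t co abc", "meeting friends for coffee",
         "claim your reward http spam example", "watching a documentary tonight"]
labels = [1, 0, 1, 0, 1, 0]          # 1 = malicious, 0 = not malicious

X = CountVectorizer().fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.33, random_state=42, stratify=labels)

model = LogisticRegression(max_iter=1000)   # sigmoid-based binary classifier
model.fit(X_train, y_train)
y_pred = model.predict(X_test)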
Accuracy on test set:
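The report does not reproduce the evaluation code; a hedged sketch of how the test-set accuracy, together with precision, recall and F-measure, could be computed with scikit-learn (reusing y_test and y_pred from the sketch above):
from sklearn.metrics import accuracy_score, classification_report

print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=["not malicious", "malicious"]))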
Saving the Trained Model:
Once you're confident enough to take your trained and tested model into a production-ready environment, the first step is to save it into a .h5 or .pkl file using a library like pickle.
Make sure you have pickle available in your environment (it is part of the Python standard library).
Next, let's import the module and dump the model into a .pkl file.
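A minimal sketch of this step (the file name model.pkl is an assumed placeholder):
import pickle

# Persist the trained model (e.g. the LogisticRegression object from the earlier sketch)
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Later, load it back, for example inside the web application
with open("model.pkl", "rb") as f:
    loaded_model = pickle.load(f)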
Database connecting using MySQL:
Import MySQLdb.
Create a connection function to run our code. Here we specify where we're connecting to, the
user, the user's password, and then the database that we want to connect to.
As a note, we use "localhost" as our host. This just means we'll use the same server that this
code is running on. You can connect to databases remotely as well, which can be pretty neat.
To do that, you would connect to a host by their IP, or their domain. To connect to a database
remotely, you will need to first allow it from the remote database that will be
accessed/modified.
Next, let's go ahead and edit our __init__.py file, adding a register function. For now we'll
keep it simple, mostly just to test our connection functionality.
We allow for GET and POST, but aren't handling it just yet.
We're going to just try to run the imported connection function, which returns c and conn (cursor and connection objects). If the connection is successful, we just have the page say okay; otherwise it will output the error.
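A minimal sketch of the connection helper and the test register route described above (the database name, credentials and the dbconnect module name are hypothetical placeholders, not taken from the report):
# dbconnect.py -- hypothetical helper module
import MySQLdb

def connection():
    # Connect to the local MySQL server; credentials and database name are placeholders
    conn = MySQLdb.connect(host="localhost",
                           user="root",
                           passwd="password",
                           db="socialbots")
    c = conn.cursor()
    return c, conn

# __init__.py -- a /register route that, for now, only tests the connection
from flask import Flask, request
from dbconnect import connection

app = Flask(__name__)

@app.route("/register", methods=["GET", "POST"])
def register():
    try:
        c, conn = connection()   # cursor and connection objects
        return "okay"
    except Exception as e:
        return str(e)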
INPUT DESIGN
The input design is the link between the information system and the user. It comprises developing the specifications and procedures for data preparation, that is, the steps necessary to put transaction data into a usable form for processing. This can be achieved by having the computer read data from a written or printed document, or by having people key the data directly into the system. The design of input focuses on controlling the amount of input required, controlling errors, avoiding delay, avoiding extra steps and keeping the process simple. The input is designed in such a way that it provides security and ease of use while retaining privacy. Input design considered the following things:
OBJECTIVES
1. Input design is the process of converting a user-oriented description of the input into a computer-based system. This design is important to avoid errors in the data input process and to show the correct direction to the management for getting correct information from the computerized system.
2. It is achieved by creating user-friendly screens for data entry that can handle large volumes of data. The goal of designing input is to make data entry easier and free from errors. The data entry screen is designed in such a way that all data manipulations can be performed. It also provides record-viewing facilities.
3. When the data is entered, it is checked for validity. Data can be entered with the help of screens. Appropriate messages are provided as and when needed, so that the user is never left confused. Thus, the objective of input design is to create an input layout that is easy to follow.
OUTPUT DESIGN
A quality output is one which meets the requirements of the end user and presents the information clearly. In any system, the results of processing are communicated to the users and to other systems through outputs. In output design, it is determined how the information is to be displayed for immediate need, as well as the hard-copy output. It is the most important and direct source of information to the user. Efficient and intelligent output design improves the system's relationship with the user and helps in user decision-making.
1. Designing computer output should proceed in an organized, well-thought-out manner; the right output must be developed while ensuring that each output element is designed so that people will find the system easy and effective to use. When analysts design computer output, they should identify the specific output that is needed to meet the requirements.
2. Create documents, reports, or other formats that contain information produced by the system. The output form of an information system should accomplish one or more of the following objectives.
SCREEN SHOTS
CODING
1. USER.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="../static/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<!-- Template Main CSS File -->
<link href="../static/css/style.css" rel="stylesheet">
<!-- =======================================================
* Template Name: Groovin - v4.0.1
* Template URL: https://bootstrapmade.com/groovin-free-bootstrap-theme/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body>
<!-- ======= Header ======= -->
<header id="header" class="fixed-top d-flex align-items-center">
<div class="container d-flex align-items-center justify-content-between">
<h1 class="logo"><a href="index.html">Malicious</a></h1>
<!-- Uncomment below if you prefer to use an image logo -->
<!-- <a href="index.html" class="logo"><img src="assets/img/logo.png" alt=""
class="img-fluid"></a>-->
<nav id="navbar" class="navbar">
<ul>
<li><a class="nav-link scrollto " href="{{
url_for('profile')}}">profile</a></li>
<li><a class="nav-link scrollto " href="{{ url_for('prediction')}}">Tweets</a></li>
<li><a class="nav-link scrollto " href="{{
url_for('users')}}">Timeline</a></li>
<li><a class="nav-link scrollto " href="{{ url_for('index')}}">Logout</a></li>
</ul>
<i class="bi bi-list mobile-nav-toggle"></i>
</nav><!-- .navbar -->
</div>
</header><!-- End Header -->
<!-- ======= Hero Section ======= -->
<section id="hero">
<div class="hero-container">
<div id="heroCarousel" data-bs-interval="5000" class="carousel slide carousel-fade"
data-bs-ride="carousel">
<ol class="carousel-indicators" id="hero-carousel-indicators"></ol>
<div class="carousel-container">
<div class="carousel-content">
<h2 class="animate__animated animate__fadeInDown">Detection of Malicious
Social Bots Using machine Learning</h2>
<div>
</div>
</div>
</div>
</div>
<!-- Slide 2 -->
<!-- Slide 3 -->
</div>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2>tweet</h2>
<div id="fields">
<center>
<table>
{% for user in userDetails %}
<tr>
<td> <h3>Tweet :</h3> <font
style="color: #c43c35;font-size:25px;">@{{user[1]}}</font> </td>
</tr>
<tr><td></td></tr>
<tr><td></td></tr>
<tr><td></td></tr>
<tr><td></td></tr>
<tr><td></td></tr>
<tr>
<td>
<textarea readonly="" style="width:
400px; height: 90px; border-color: white; color: black">{{user[9]}} </textarea>
</td>
</tr><br>
{% endfor %}
</table>
</center>
</div>
</div>
</section>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
2. PROFILE.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="../static/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<!-- Template Main CSS File -->
<link href="../static/css/style.css" rel="stylesheet">
<!-- =======================================================
* Template Name: Groovin - v4.0.1
* Template URL: https://bootstrapmade.com/groovin-free-bootstrap-theme/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body>
<!-- ======= Header ======= -->
<header id="header" class="fixed-top d-flex align-items-center">
<div>
</div>
</div>
</div>
</div>
<!-- Slide 2 -->
<!-- Slide 3 -->
</div>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2> Your Profile details</h2>
<div id="fields">
<form class="form-horizontal">
<div class="row">
<div class="span4">
<div class="control-group">
<div class="controls">
<label style="color:black"><b>Your name :</b></label>
<input class="span4" readonly="" value="{{account[1]}}">
</div>
</div>
<div class="control-group">
</br>
<div class="controls">
<label style="color:black"><b>Your email :</b></label>
3. PREVIEW.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="../static/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<!-- Template Main CSS File -->
<link href="../static/css/style.css" rel="stylesheet">
<!-- =======================================================
* Template Name: Groovin - v4.0.1
* Template URL: https://bootstrapmade.com/groovin-free-bootstrap-theme/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body>
<!-- ======= Header ======= -->
<header id="header" class="fixed-top d-flex align-items-center">
<div class="container d-flex align-items-center justify-content-between">
<h1 class="logo"><a href="index.html">Malicious</a></h1>
<!-- Uncomment below if you prefer to use an image logo -->
<!-- <a href="index.html" class="logo"><img src="assets/img/logo.png" alt=""
class="img-fluid"></a>-->
<nav id="navbar" class="navbar">
<ul>
<li><a class="nav-link scrollto active" href="{{ url_for('index')}}">Home</a></li>
<li><a class="nav-link scrollto" href="#services">Abstract</a></li>
<li><a class="nav-link scrollto " href="{{
url_for('upload')}}">upload</a></li>
</ul>
<i class="bi bi-list mobile-nav-toggle"></i>
</nav><!-- .navbar -->
</div>
</header><!-- End Header -->
<!-- ======= Hero Section ======= -->
<section id="hero">
<div class="hero-container">
<div id="heroCarousel" data-bs-interval="5000" class="carousel slide carousel-fade"
data-bs-ride="carousel">
<ol class="carousel-indicators" id="hero-carousel-indicators"></ol>
<div class="carousel-inner" role="listbox">
<!-- Slide 1 -->
<div class="carousel-item active" style="background: url(../static/img/mal6.jpg);">
<div class="carousel-container">
<div class="carousel-content">
<style>
#loading {
background: url('../static/ajax-loader.gif') no-repeat center center;
position: absolute;
top: 0;
left: 99;
height: 100%;
width: 90%;
z-index: 9999999;
}
</style>
</head>
<body id="page-top">
<!-- Navigation -->
<!-- Contact Section -->
<section class="page-section" id="contact">
<div class="container">
<br>
<br>
<!-- Contact Section Heading -->
<h2 class="text-center text-uppercase text-secondary mb-0"> </h2>
alert("Training finished!");
window.location = "{{url_for('register')}}";
});
}
</script>
</body>
</html>
</div>
</div>
</section>
<footer id="footer">
<div class="footer-top">
<div class="container">
<div class="row">
<div class="col-lg-3 col-md-6">
<div class="footer-info">
</div>
</div>
<div class="col-lg-2 col-md-6 footer-links">
</div>
<div class="col-lg-3 col-md-6 footer-links">
</div>
<div class="col-lg-4 col-md-6 footer-newsletter">
</div>
</div>
</div>
</div>
<div class="container">
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
4. USERDETAIL.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
padding-bottom: 12px;
text-align: left;
background-color: #1DA1F2;
color: white;
}
</style>
</head>
<body>
<!-- ======= Header ======= -->
<header id="header" class="fixed-top d-flex align-items-center">
<div class="container d-flex align-items-center justify-content-between">
<h1 class="logo"><a href="index.html">Malicious</a></h1>
<!-- Uncomment below if you prefer to use an image logo -->
<!-- <a href="index.html" class="logo"><img src="assets/img/logo.png" alt=""
class="img-fluid"></a>-->
<nav id="navbar" class="navbar">
<ul>
<li><a class="nav-link scrollto" href="{{ url_for('index')}}">Home</a></li>
<li><a class="nav-link scrollto" href="{{ url_for('userdetail')}}">Register
details</a></li>
<li><a class="nav-link scrollto " href="{{ url_for('admin')}}">Full
details</a></li>
<li><a class="nav-link scrollto " href="{{ url_for('user')}}">Analysis</a></li>
</ul>
<i class="bi bi-list mobile-nav-toggle"></i>
</nav><!-- .navbar -->
</div>
</header><!-- End Header -->
<!-- ======= Hero Section ======= -->
<section id="hero">
<div class="hero-container">
</div>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<div class="section-title">
<h2>Register details</h2>
<table id="customers" style="margin-right: 300px">
<tr>
<th>user_id</th>
<th>user_name</th>
<th>Email</th>
<th>password</th>
</tr>
</tr>
{% endfor %}
</table>
</div>
</div>
</section>
5. USER.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<div class="section-title">
<h2>Analysis</h2>
<link href="../static/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/css/jumbotron-narrow.css" rel="stylesheet">
<script src="../static/js/jquery-1.11.2.js"></script>
<script src="../static/js/Chart.min.js"></script>
</head>
<body>
<div class="container">
<div class="header">
</div>
<canvas id="chart" width="600" height="400"></canvas>
<footer class="footer">
<script>
datasets : [
{
label: '{{legend}}',
fillColor: "rgba(151,187,205,0.2)",
strokeColor: "rgba(151,187,205,1)",
pointColor: "rgba(151,187,205,1)",
pointStrokeColor: "#fff",
pointHighlightFill: "#fff",
pointHighlightStroke: "rgba(151,187,205,1)",
bezierCurve : false,
data : [{% for value in values %}
{{value}},
{% endfor %}]
}]
}
Chart.defaults.global.animationSteps = 50;
Chart.defaults.global.tooltipYPadding = 16;
Chart.defaults.global.tooltipCornerRadius = 0;
Chart.defaults.global.tooltipTitleFontStyle = "normal";
Chart.defaults.global.tooltipFillColor = "rgba(0,0,0,0.8)";
Chart.defaults.global.animationEasing = "easeOutBounce";
Chart.defaults.global.responsive = false;
Chart.defaults.global.scaleLineColor = "black";
Chart.defaults.global.scaleFontSize = 16;
// get bar chart canvas
var ctx = document.getElementById("chart").getContext("2d");
steps = 10
max = 40
var BarChartDemo = new Chart(ctx).Bar(chartData, {
scaleOverride: true,
scaleSteps: steps,
scaleStepWidth: Math.ceil(max / steps),
scaleStartValue: 0,
scaleShowVerticalLines: true,
scaleShowGridLines : true,
barShowStroke : true,
scaleShowLabels: true,
bezierCurve: false,
});
</script>
</body>
</html>
</div>
</div>
</section>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
6. ADMIN.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="../static/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<!-- Template Main CSS File -->
<link href="../static/css/style.css" rel="stylesheet">
<!-- =======================================================
* Template Name: Groovin - v4.0.1
* Template URL: https://bootstrapmade.com/groovin-free-bootstrap-theme/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
<style>
#customers {
font-family: "Trebuchet MS", Arial, Helvetica, sans-serif;
font-size: 20px;
border-collapse: collapse;
width: 100%;
}
#customers td, #customers th {
border: 1px solid #ddd;
padding: 15px;
}
#customers th {
padding-top: 12px;
padding-bottom: 12px;
text-align: left;
background-color: #1DA1F2;
color: white;
}
</style>
</head>
<body>
<!-- ======= Header ======= -->
<header id="header" class="fixed-top d-flex align-items-center">
<div class="container d-flex align-items-center justify-content-between">
<h1 class="logo"><a href="index.html">Malicious</a></h1>
<!-- Uncomment below if you prefer to use an image logo -->
<!-- <a href="index.html" class="logo"><img src="assets/img/logo.png" alt=""
class="img-fluid"></a>-->
<nav id="navbar" class="navbar">
<ul>
<li><a class="nav-link scrollto" href="{{ url_for('index')}}">Home</a></li>
<li><a class="nav-link scrollto" href="{{ url_for('userdetail')}}">Register
details</a></li>
<li><a class="nav-link scrollto " href="{{ url_for('admin')}}">Full
details</a></li>
<li><a class="nav-link scrollto " href="{{ url_for('user')}}">Analysis</a></li>
</ul>
<i class="bi bi-list mobile-nav-toggle"></i>
</nav><!-- .navbar -->
</div>
</header><!-- End Header -->
<!-- ======= Hero Section ======= -->
<section id="hero">
<div class="hero-container">
<div id="heroCarousel" data-bs-interval="5000" class="carousel slide carousel-fade"
data-bs-ride="carousel">
<ol class="carousel-indicators" id="hero-carousel-indicators"></ol>
<div class="carousel-inner" role="listbox">
<!-- Slide 1 -->
<div class="carousel-item active" style="background: url(../static/img/mal7.jpg);">
<div class="carousel-container">
<div class="carousel-content">
<h2 class="animate__animated animate__fadeInDown">Detection of Malicious
Social Bots Using machine Learning</h2>
<div>
</div>
</div>
</div>
</div>
<!-- Slide 2 -->
<!-- Slide 3 -->
</div>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2>Full details users</h2>
<center>
<table id="customers" style="margin-right: 300px">
<tr>
<th>user_id</th>
<th>user_name</th>
<th>Email</th>
<th>tweets</th>
<th> prediction</th>
<th> status</th>
<th> action</th>
</tr>
{% for admin in userDetails %}
<tr>
<form action="{{ url_for('blockUser')}}" method="post" autocomplete="off">
<td> <input name="fid" size= "5" style="color:blue;border:none"
value="{{admin[0]}}" readonly /> </td>
<td> <p style="color:blue">{{admin[1]}} </p></td>
<td> <p style="color:blue">{{admin[2]}}</p> </td>
<td> <p style="color:blue">{{admin[9]}}</p> </td>
<td> <p style="color:blue">{{admin[7]}} </p></td>
<td> <p style="color:blue">{{admin[4]}} </p></td>
<td><button type="submit" id="Geeks" class="btn btn-danger">Block</button></td>
</form>
</tr>
{% endfor %}
</table>
</center>
</div>
</div>
</section>
7. PREDECTION.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<section id="hero">
<div class="hero-container">
<div id="heroCarousel" data-bs-interval="5000" class="carousel slide carousel-fade"
data-bs-ride="carousel">
<ol class="carousel-indicators" id="hero-carousel-indicators"></ol>
<div class="carousel-inner" role="listbox">
<!-- Slide 1 -->
<div class="carousel-item active" style="background: url(../static/img/mal6.jpg);">
<div class="carousel-container">
<div class="carousel-content">
<h2 class="animate__animated animate__fadeInDown">Detection of Malicious
Social Bots Using machine Learning</h2>
<div>
</div>
</div>
</div>
</div>
<!-- Slide 2 -->
<!-- Slide 3 -->
</div>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2>Tweet</h2>
<body>
<div class="login">
<h1 class="cover-heading" ></h1>
</div>
</div>
<div class="container">
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
8. INDEX.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
</div>
</div>
<!-- Slide 2 -->
<div class="carousel-item" style="background: url(../static/img/mal5.png);">
<div class="carousel-container">
<div class="carousel-content">
<h2 class="animate__animated animate__fadeInDown">Detection of Malicious
Social Bots Using Learning</h2>
<p class="animate__animated animate__fadeInUp">Automata With URL Features
in Twitter Network</p>
<div>
</div>
</div>
</div>
</div>
<!-- Slide 3 -->
</div>
<a class="carousel-control-prev" href="#heroCarousel" role="button" data-bs-
slide="prev">
<span class="carousel-control-prev-icon bi bi-chevron-left" aria-
hidden="true"></span>
</a>
<a class="carousel-control-next" href="#heroCarousel" role="button" data-bs-
slide="next">
<span class="carousel-control-next-icon bi bi-chevron-right" aria-
hidden="true"></span>
</a>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2>Abstract</h2>
<p style="color:black">Malicious social bots generate fake tweets and
automate their social relationships either by pretending like a
follower or by creating multiple fake accounts with malicious
activities. Moreover, malicious social bots post shortened malicious URLs in the tweet in
order to redirect the requests of
online social networking participants to some malicious servers.
Hence, distinguishing malicious social bots from legitimate users
is one of the most important tasks in the Twitter network.
To detect malicious social bots, extracting URL-based features
(such as URL redirection, frequency of shared URLs, and spam
content in URL) consumes less amount of time in comparison
with social graph-based features (which rely on the social interactions of users).
Furthermore, malicious social bots cannot easily
manipulate URL redirection chains. In this article, a learning
automata-based malicious social bot detection (LA-MSBD) algorithm is proposed by
integrating a trust computation model with
URL-based features for identifying trustworthy participants
(users) in the Twitter network. The proposed trust computation model contains two
parameters, namely, direct trust and
indirect trust. Moreover, the direct trust is derived from Bayes’
theorem, and the indirect trust is derived from the Dempster–
Shafer theory (DST) to determine the trustworthiness of each
participant accurately. Experimentation has been performed on
two Twitter data sets, and the results illustrate that the proposed
algorithm achieves improvement in precision, recall, F-measure,
and accuracy compared with existing approaches for MSBD.</p>
</div>
</div>
</section>
<footer id="footer">
<div class="footer-top">
<div class="container">
<div class="row">
<div class="col-lg-3 col-md-6">
<div class="footer-info">
</div>
</div>
<div class="col-lg-2 col-md-6 footer-links">
</div>
<div class="col-lg-3 col-md-6 footer-links">
</div>
<div class="col-lg-4 col-md-6 footer-newsletter">
</div>
</div>
</div>
</div>
<div class="container">
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
-------------------------------
9. LOGINADMIN.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="../static/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
<!-- Template Main CSS File -->
<link href="../static/css/style.css" rel="stylesheet">
<!-- =======================================================
<div class="carousel-container">
<div class="carousel-content">
<h2 class="animate__animated animate__fadeInDown">Detection of Malicious
Social Bots Using machine Learning</h2>
<div>
</div>
</div>
</div>
</div>
<!-- Slide 3 -->
</div>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2> Admin login</h2>
<head>
<style>
body {
background-image: url("../static/images/email3.jpg");
background-color: #cccccc;
}
</style>
<script>
addEventListener("load", function () {
setTimeout(hideURLbar, 0);
}, false);
function hideURLbar() {
window.scrollTo(0, 1);
}
function login(){
var uname = document.getElementById("uname").value;
var pwd = document.getElementById("pwd").value;
if(uname == "admin" && pwd == "admin")
{
alert("Login Success!");
window.location = "{{url_for('userdetail')}}";
return false;
}
else
{
alert("Invalid Credentials!")
}
}
</script>
</head>
<body id="page-top">
<!-- Portfolio Section -->
<section class="page-section portfolio" id="portfolio">
<br>
<br>
<!-- Portfolio Section Heading -->
<!-- Icon Divider -->
<!-- Portfolio Grid Items -->
<div class="row">
<!-- Portfolio Item 1 -->
<div class="col-md-6 col-lg-4" style="margin-left:380px">
<div class="control-group">
</div>
</div>
</section>
<footer id="footer">
<div class="footer-top">
<div class="container">
<div class="row">
<div class="col-lg-3 col-md-6">
<div class="footer-info">
</div>
</div>
<div class="col-lg-2 col-md-6 footer-links">
</div>
<div class="col-lg-3 col-md-6 footer-links">
</div>
<div class="col-lg-4 col-md-6 footer-newsletter">
</div>
</div>
</div>
</div>
<div class="container">
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
10. UPLOAD.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<!-- Google Fonts -->
<link
href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,
700i|Roboto:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,5
00i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="../static/vendor/animate.css/animate.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="../static/vendor/bootstrap-icons/bootstrap-icons.css" rel="stylesheet">
<link href="../static/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="../static/vendor/glightbox/css/glightbox.min.css" rel="stylesheet">
<link href="../static/vendor/swiper/swiper-bundle.min.css" rel="stylesheet">
</div>
</div>
</div>
</div>
<div class="container">
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
11. REGISTER.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Malicious Social Bots </title>
</div>
</div>
</section><!-- End Hero -->
<main id="main">
<section id="services" class="services">
<div class="container">
<div class="section-title">
<h2></h2>
<head>
</head>
<body id="page-top">
<!-- Portfolio Section -->
<section class="page-section portfolio" id="portfolio">
<div class="container">
<br>
<br>
<!-- Portfolio Section Heading -->
<div class="login-title text-center">
<div class="row">
</div>
</div>
<!-- Icon Divider -->
<div class="divider-custom">
<div class="divider-custom-line"></div>
<div class="divider-custom-icon">
<i class="fas fa-star"></i>
</div>
<div class="divider-custom-line"></div>
</div>
<!-- Portfolio Grid Items -->
<div class="row">
<h2>Register</h2>
<form action="{{ url_for('register') }}" method="post" >
<!-- Portfolio Item 1 -->
<div class="col-md-6 col-lg-4" style="margin-left:380px">
<div class="control-group">
<!-- Username -->
<div class="controls">
<label style="color:black"><b>Username :</b></label>
<input type="text" name="username" placeholder="Username"
required>
</div>
</div>
</br>
<div class="control-group">
<!-- Password-->
<div class="controls">
<label style="color:black"><b>Email ID : </b></label>
<input type="email" name="email" placeholder="Email" required>
</div>
</div>
</br>
<div class="form-group">
<!-- Password-->
<label style="color:black"><b>Password :</b></label>
<input type="password" name="password"
placeholder="Password" required>
</div>
<div class="col-md-6 col-lg-4" style="margin-left:-100px">
<div class="control-group">
<!-- Button -->
<br>
<div class="controls">
<div class="msg">{{ msg }}</div>
<input type="submit" class="btn btn-success" value="submit"
style="margin-left: 240px" >
</div>
</div>
</div>
</div>
</div>
</form>
</div>
</div>
</section>
</body>
</div>
</div>
</section>
<footer id="footer">
<div class="footer-top">
<div class="container">
<div class="row">
<div class="col-lg-3 col-md-6">
<div class="footer-info">
</div>
</div>
<div class="col-lg-2 col-md-6 footer-links">
</div>
<div class="col-lg-3 col-md-6 footer-links">
</div>
<div class="col-lg-4 col-md-6 footer-newsletter">
</div>
</div>
</div>
</div>
<div class="container">
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-center"><i
class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
12. LOGIN.HTML
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<!-- Password-->
<div class="controls">
<label style="color:black"><b>Password :</b></label>
<input type="password" name="password"
placeholder="Password" required>
</div>
</div>
<div class="col-md-6 col-lg-4" style="margin-left:-
120px">
<div class="control-group">
<!-- Button -->
<br>
<div class="controls">
<div class="msg">{{ msg }}</div>
<input type="submit" class="btn btn-success"
value="Login" style="margin-left: 240px" >
</div>
</div>
</div>
</div>
</div>
</form>
</div>
</div>
</section>
{% with messages = get_flashed_messages() %}
{% if messages %}
<script>
var messages = {{ messages | safe }};
<div class="copyright">
</div>
<div class="credits">
</div>
</footer>
<a href="#" class="back-to-top d-flex align-items-center justify-content-
center"><i class="bi bi-arrow-up-short"></i></a>
<!-- Vendor JS Files -->
<script src="../static/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="../static/vendor/glightbox/js/glightbox.min.js"></script>
<script src="../static/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="../static/vendor/php-email-form/validate.js"></script>
<script src="../static/vendor/purecounter/purecounter.js"></script>
<script src="../static/vendor/swiper/swiper-bundle.min.js"></script>
<!-- Template Main JS File -->
<script src="../static/js/main.js"></script>
</body>
</html>
--------------------------
SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests; each test type addresses a specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application; it is done after the completion of an individual unit and before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at the component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results.
Integration testing
Integration tests are designed to test integrated software components to determine if they
actually run as one program. Testing is event driven and is more concerned with the basic
outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.
Functional test
Functional tests provide systematic demonstrations that functions tested are available as
specified by the business and technical requirements, system documentation, and user
manuals.
Functional testing is centred on the following items:
Valid Input : identified classes of valid input must be accepted.
Invalid Input : identified classes of invalid input must be rejected.
Functions : identified functions must be exercised.
Output : identified classes of application outputs must be exercised.
System Test
System testing ensures that the entire integrated software system meets requirements. It tests
a configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions
and flows, emphasizing pre-driven process links and integration points.
Test objectives
➢ All field entries must work properly.
➢ Pages must be activated from the identified link.
➢ The entry screen, messages and responses must not be delayed.
Features to be tested
User Acceptance Testing is a critical phase of any project and requires significant participation
by the end user. It also ensures that the system meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
CONCLUSION
This article presents an LA-MSBD algorithm that integrates a trust computational model with a set of URL-based features for MSBD. In addition, we evaluate the trustworthiness of tweets (posted by each participant) by using Bayesian learning and DST. Moreover, the proposed LA-MSBD algorithm executes a finite set of learning actions to update its action probability value (i.e., the probability of a participant posting malicious URLs in its tweets). The proposed LA-MSBD algorithm achieves the advantages of incremental learning. Two Twitter data sets are used to evaluate the performance of the proposed LA-MSBD algorithm. The experimental results show that the proposed LA-MSBD algorithm achieves up to 7% improvement in accuracy compared with other existing algorithms. For The Fake Project and Social Honeypot data sets, the proposed LA-MSBD algorithm has achieved precisions of 95.37% and 91.77% for MSBD, respectively. Furthermore, as a future research challenge, we would like to investigate the dependence among the features and its impact on MSBD.
LITERATURE SURVEY
2) Adaptive deep Q-learning model for detecting social bots and influential users in
online social networks
by the social bots) based on tweets and the users' interactions. The experimentation using datasets collected from the Twitter network illustrates the efficacy of the proposed model.
3) Fluxing botnet command and control channels with URL shortening services
This paper compares machine learning techniques for detecting malicious webpages. The conventional method of detecting malicious webpages is to go through a blacklist and check whether a webpage is listed. A blacklist is a list of webpages which are classified as malicious from a user's point of view. These blacklists are created by trusted organizations and volunteers and are then used by modern web browsers such as Chrome, Firefox, Internet Explorer, etc. However, blacklisting is ineffective because of the frequently changing nature of webpages, the growing number of webpages that poses scalability issues, and the crawlers' inability to visit intranet webpages that require computer operators to log in as authenticated users. This paper therefore uses alternative and novel approaches by applying machine learning algorithms to detect malicious webpages. Three supervised machine learning techniques (K-Nearest Neighbor, Support Vector Machine and Naive Bayes Classifier) and two unsupervised machine learning techniques (K-Means and Affinity Propagation) are employed. Please note that K-Means and Affinity Propagation have not been applied to the detection of malicious webpages by other researchers. All of these machine learning techniques have been used to build predictive models to analyze a large number of malicious and safe webpages. The webpages were downloaded by a concurrent crawler taking advantage of gevent. They were then parsed, and various features such as content, URL and screenshots of the webpages were extracted to feed into the machine learning models. Computer simulation results have produced an accuracy of up to 98% for the supervised techniques and a silhouette coefficient of close to 0.96 for the unsupervised techniques. These predictive models have been applied in a practical context whereby Google Chrome can harness the predictive capabilities of the classifiers, which have the advantages of both the lightweight and the heavyweight classifiers.
REFERENCES
[1] P. Shi, Z. Zhang, and K.-K.-R. Choo, “Detecting malicious social bots based on clickstream
sequences,” IEEE Access, vol. 7, pp. 28855–28862, 2019.
[2] G. Lingam, R. R. Rout, and D. V. L. N. Somayajulu, “Adaptive deep Q-learning model for
detecting social bots and influential users in online social networks,” Appl. Intell., vol. 49, no.
11, pp. 3947–3964, Nov. 2019.
[4] S. Lee and J. Kim, “Fluxing botnet command and control channels with URL shortening
services,” Comput. Commun., vol. 36, no. 3, pp. 320–332, Feb. 2013.
[5] S. Madisetty and M. S. Desarkar, “A neural network-based ensemble approach for spam
detection in Twitter,” IEEE Trans. Comput. Social Syst., vol. 5, no. 4, pp. 973–984, Dec. 2018.
[6] H. B. Kazemian and S. Ahmed, “Comparisons of machine learning techniques for detecting
malicious webpages,” Expert Syst. Appl., vol. 42, no. 3, pp. 1166–1177, Feb. 2015.
[7] H. Gupta, M. S. Jamal, S. Madisetty, and M. S. Desarkar, “A framework for real-time spam
detection in Twitter,” in Proc. 10th Int. Conf. Commun. Syst. Netw. (COMSNETS), Jan. 2018,
pp. 380–383.
[8] T. Wu, S. Liu, J. Zhang, and Y. Xiang, “Twitter spam detection based on deep learning,” in
Proc. Australas. Comput. Sci. Week Multiconf. (ACSW), 2017, p. 3.
[10] G. Yan, “Peri-watchdog: Hunting for hidden botnets in the periphery of online social
networks,” Comput. Netw., vol. 57, no. 2, pp. 540–555, Feb. 2013.
[11] D. Canali, M. Cova, G. Vigna, and C. Kruegel, “Prophiler: A fast filter for the large-scale
detection of malicious Web pages,” in Proc. 20th Int. Conf. World Wide Web (WWW), 2011,
pp. 197–206.
[12] A. K. Jain and B. B. Gupta, “A machine learning based approach for phishing detection
using hyperlinks information,” J. Ambient Intell. Hum. Comput., vol. 10, no. 5, pp. 2015–
2028, May 2019.
[13] C. Chen, J. Zhang, X. Chen, Y. Xiang, and W. Zhou, “6 million spam tweets: A large
ground truth for timely Twitter spam detection,” in Proc. IEEE Int. Conf. Commun. (ICC),
Jun. 2015, pp. 7065–7070.
[14] Z. Chu, S. Gianvecchio, H. Wang, and S. Jajodia, “Detecting automation of Twitter
accounts: Are you a human, bot, or cyborg?” IEEE Trans. Dependable Secure Comput., vol.
9, no. 6, pp. 811–824, Nov. 2012.
[15] C. Chen, Y. Wang, J. Zhang, Y. Xiang, W. Zhou, and G. Min, “Statistical features-based
real-time detection of drifted Twitter spam,” IEEE Trans. Inf. Forensics Security, vol. 12, no.
4, pp. 914–925, Apr. 2017.
[16] N. Rndic and P. Laskov, “Practical evasion of a learning-based classifier: A case study,”
in Proc. IEEE Symp. Secur. Privacy, May 2014, pp. 197–211.
[19] C.-M. Chen, D. J. Guan, and Q.-K. Su, “Feature set identification for detecting suspicious
URLs using Bayesian classification in social networks,” Inf. Sci., vol. 289, pp. 133–147, Dec.
2014.
[22] K. Lee, B. D. Eoff, and J. Caverlee, “Seven months with the devils: A long-term study of
content polluters on Twitter,” in Proc. ICWSM, 2011, pp. 1–8.
[23] C. Besel, J. Echeverria, and S. Zhou, “Full cycle analysis of a large scale botnet attack on
Twitter,” in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining (ASONAM), Aug.
2018, pp. 170–177.
[24] J. Echeverria and S. Zhou, "Discovery, retrieval, and analysis of the 'Star Wars' botnet in Twitter," in Proc. IEEE/ACM Int. Conf. Adv. Social Netw. Anal. Mining (ASONAM), 2017, pp. 1–8.
[26] M. Agarwal and B. Zhou, “Using trust model for detecting malicious activities in Twitter,”
in Proc. Int. Conf. Social Comput., Behav.-Cultural Modeling, Predict. Springer, 2014, pp.
207–214.
[28] C. Yang, R. Harkreader, and G. Gu, “Empirical evaluation and new design for fighting
evolving Twitter spammers,” IEEE Trans. Inf. Forensics Security, vol. 8, no. 8, pp. 1280–1293,
Aug. 2013.
[30] S. Lee and J. Kim, “WarningBird: A near real-time detection system for suspicious URLs
in Twitter stream,” IEEE Trans. Dependable Secure Comput., vol. 10, no. 3, pp. 183–195, May
2013.
[31] D. R. Patil and J. B. Patil, “Malicious URLs detection using decision tree classifiers and
majority voting technique,” Cybern. Inf. Technol., vol. 18, no. 1, pp. 11–29, Mar. 2018.
[32] H. Guo, S. Li, B. Li, Y. Ma, and X. Ren, "A new learning automata-based pruning method to train deep neural networks," IEEE Internet Things J., vol. 5, no. 5, pp. 3263–3269, Oct. 2018.
[34] A. Moayedikia, K.-L. Ong, Y. L. Boo, and W. G. S. Yeoh, “Task assignment in microtask
crowdsourcing platforms using learning automata,” Eng. Appl. Artif. Intell., vol. 74, pp. 212–
225, Sep. 2018.
[35] G. Lingam, R. R. Rout, and D. Somayajulu, "Learning automata-based trust model for user recommendations in online social networks," Comput. Electr. Eng., vol. 66, pp. 174–188, Feb. 2018.
[36] Manju, S. Chand, and B. Kumar, “Target coverage heuristic based on learning automata
in wireless sensor networks,” IET Wireless Sensor Syst., vol. 8, no. 3, pp. 109–115, Jun. 2018.
[37] Q. Sang, Z. Lin, and S. T. Acton, “Learning automata for image segmentation,” Pattern
Recognit. Lett., vol. 74, pp. 46–52, Apr. 2016.
[38] F. Morstatter, J. Pfeffer, H. Liu, and K. M. Carley, “Is the sample good enough? comparing
data from twitter’s streaming API with Twitter’s firehose,” in Proc. ICWSM, 2013, pp. 1–9.
[39] A. Neumann, J. Barnickel, and U. Meyer, “Security and privacy implications of url
shortening services,” in Proc. Workshop Web 2.0 Secur. Privacy, 2010, pp. 1–31.
[41] K. Hans, L. Ahuja, and S. K. Muttoo, “Detecting redirection spam using multilayer
perceptron neural network,” Soft Comput., vol. 21, no. 13, pp. 3803–3814, Jul. 2017.
[44] W. Li and H. Song, “ART: An attack-resistant trust management scheme for securing
vehicular ad hoc networks,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 4, pp. 960–969, Apr.
2016.
[45] Z. Wei, H. Tang, F. R. Yu, M. Wang, and P. Mason, “Security enhancements for mobile
ad hoc networks with trust management using uncertain reasoning,” IEEE Trans. Veh.
Technol., vol. 63, no. 9, pp. 4647–4858, 2014.
[46] L. Zhao, T. Hua, C.-T. Lu, and I.-R. Chen, “A topic-focused trust model for Twitter,”
Comput. Commun., vol. 76, pp. 1–11, Feb. 2016.
[47] A. Rezvanian, M. Rahmati, and M. R. Meybodi, “Sampling from complex networks using
distributed learning automata,” Phys. A, Stat. Mech. Appl., vol. 396, pp. 224–234, Feb. 2014.
[48] H. Huang, X. Wei, and Y. Zhou, “Twin support vector machines: A survey,”
Neurocomputing, vol. 300, pp. 34–43, Jul. 2018.
[49] A. A. Heidari, H. Faris, I. Aljarah, and S. Mirjalili, “An efficient hybrid multilayer
perceptron neural network with grasshopper optimization,” Soft Comput., vol. 23, no. 17, pp.
7941–7958, Sep. 2019.
[50] M. C. Simmonds and J. P. Higgins, “A general framework for the use of logistic regression
models in meta-analysis,” Stat. Methods Med. Res., vol. 25, no. 6, pp. 2858–2877, Dec. 2016.
[51] Y. Zhou and G. Qiu, “Random forest for label ranking,” Expert Syst. Appl., vol. 112, pp.
99–109, Dec. 2018.