2022 A Quantum-Inspired Classifier For Early Web Bot Detection
17, 2022
Abstract— This paper introduces a novel approach, inspired by the principles of Quantum Computing, to address web bot detection in terms of real-time classification of an incoming data stream of HTTP request headers, in order to ensure the shortest decision time with the highest accuracy. The proposed approach exploits the analogy between the intrinsic correlation of two or more particles and the dependence of each HTTP request on the preceding ones. Starting from the a-posteriori probability of each request to belong to a particular class, it is possible to assign a Qubit state representing a combination of the aforementioned probabilities for all available observations of the time series. By leveraging the underlying mathematical details of superposition and entanglement on specific subsequences, it is possible to devise a measure of membership to each class, thus enabling the system to take a reliable decision when a sufficient level of confidence is met or to continue with additional observations. The results reported in this paper objectively show the effectiveness of our quantum-inspired algorithm, which outperforms other state-of-the-art approaches, including our own one based on the Sequential Probability Ratio Test.

Index Terms— Quantum-inspired computing, bot detection, sequential classification, early decision, multinomial classification, multivariate sequence classification.

Manuscript received December 11, 2021; revised April 9, 2022; accepted April 12, 2022. Date of publication April 25, 2022; date of current version May 6, 2022. This work was supported by the ICT COST Action IC1406 High-Performance Modelling and Simulation for Big Data Applications (cHiPSet). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Alexey Vinel. (Corresponding author: Alberto Cabri.)
Alberto Cabri and Stefano Rovetta are with the Department of Informatics, Bioengineering, Robotics, and Systems Engineering (DIBRIS), University of Genoa, 16146 Genoa, Italy (e-mail: [email protected]; [email protected]).
Francesco Masulli is with the Department of Informatics, Bioengineering, Robotics, and Systems Engineering (DIBRIS), University of Genoa, 16146 Genoa, Italy, and also with the Sbarro Institute for Cancer Research and Molecular Medicine, Temple University, Philadelphia, PA 19122 USA (e-mail: [email protected]).
Grażyna Suchacka is with the Institute of Informatics, University of Opole, 45-040 Opole, Poland (e-mail: [email protected]).
Digital Object Identifier 10.1109/TIFS.2022.3170237

I. INTRODUCTION

In the era of Big Data, huge volumes of varied data are collected at high velocity in several contexts, posing new challenges concerning timely recognition of anomalous or critical events.

Whenever event data are indexed on time, the relevant dataset represents a time series where each observation is somehow related to its temporal neighbors. Being able to automatically classify a sequence is a highly valuable task and even more important is the ability to label a time series with the fewest possible observations [1], [2].

Many time series applications consider classification accuracy as the essential point and no particular importance is given to the speed of decision. An example of such tasks is forgery detection on signatures, where on-the-fly (OTF) classification is not required whereas high accuracy is a crucial performance metric [3].

Conversely, timely decisions are an essential feature on an extrusion line in order to detect and amend possible defects before the product integrity gets compromised [4].

These simple considerations denote the dual aspects of time series analysis, which lead to the selection of different approaches to deal with the various problems. As reported in [5], the approaches can be categorized as offline, whenever a complete sequence should be analyzed before labeling, or online (also known as on-the-fly), if a decision must be made as soon as possible, based on incoming observations.

The latter is commonly known as early classification of time series [1]. Examples of such challenging problems can be found in various industrial scenarios, as shown in Table I, often related to the processing of data streams from connected devices or sensors (Internet of Things), which enable harvesting huge amounts of data, most frequently as a sequence of correlated observations or measures. Even video sources can be treated as a sequence of time related events, where each event is associated with a single video frame.

In all those cases, such as the ones listed in Table I, measures are collected over time and need to be analyzed in a timely manner to extract useful information about potentially critical conditions.

Time series classification models usually target the recognition rate as their main goal, but this is not sufficient for early classification or prediction, where earliness of decision becomes a mandatory key performance indicator.

A sequence of events that, for whatever reason, may end up compromising a piece of equipment should be detected in the shortest possible time, as any delay could cause damage and unnecessary costs [6].

This paper addresses the problem of on-the-fly early classification for online data streams, where data are usually statistically dependent and inherently correlated over time, as in the case of web bot detection, a highly critical task in cyber-security applications, where we need to distinguish automatic web robots from human users.

Moreover, we aim at labeling a temporal sequence of events using the smallest number of observations. The task is therefore an early decision problem, based on an incomplete set
1556-6021 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: National Institute of Technology. Downloaded on July 29,2024 at 10:06:25 UTC from IEEE Xplore. Restrictions apply.
CABRI et al.: QUANTUM-INSPIRED CLASSIFIER FOR EARLY WEB BOT DETECTION 1685
TABLE I
EXAMPLE OF EARLY CLASSIFICATION PROBLEMS FOR TIME SERIES
of events that requires OTF evaluation and stretches over an undefined time horizon. A critical aspect is finding the optimal trade-off between decision speed, defined in relation to the number of observations required by the trained system to take a decision, and classification accuracy, which are conflicting constraints.

To this aim, we present a new method for early classification of online data streams, inspired by the principles of quantum computing, able to classify a series of HTTP requests with outstanding accuracy and very effective in early decision making without any knowledge of the sequences' time horizon.

Please consider that no physical interpretation of quantum theory is implied by our algorithm despite the analogical adoption of the underlying mathematical details. The proposed approach is completely myopic and no delay cost estimate is required to force early decision because it leverages the intrinsic structure of data to propose a class label.

One important remark is that, to the best of our knowledge, no public datasets are available for bot detection, making it difficult to compare the presented results to other relevant studies; hence, the SPRT approach, originally discussed in [7], has been compared with the quantum-inspired algorithm to confirm its efficacy both in terms of classification metrics and decision time.

The remainder of this paper is organized as follows: Section II presents the state of the art on possible approaches to early data stream classification; Section III introduces the theoretical background on quantum computing, which is required to understand the proposed method; Section IV illustrates the validation process of the proposed method using synthetic data; Section VI describes the test problem that has been used to verify this novel approach while Section VII presents the structure of the dataset used for bot detection and the relevant features. Section VIII describes its application to the chosen classification problem, regarding the analysis of web traffic logs of a real e-commerce portal; in Section IX the experimental results are reported and commented; lastly, Section X offers concluding remarks and cues for extending the research and the possible areas of future application.

II. STATE OF THE ART

Monitoring natural and industrial processes often produces massive volumes of sequential data (data streams), usually indexed over time.

Several methods are available for modeling sequential data, but statistical models, such as ARMA or ARIMA [8], [9], aimed at time series prediction, assume the linearity of the data model, which means that the time series is either stationary or convertible into stationary. Most often, time series are non-stationary because their statistical properties vary over time and thus require data models built on training data [10], such as Artificial Neural Networks (ANN) [9].

Often, machine learning techniques are not suitable for sequential data because these algorithms disregard the statistical structure of a time series and are sensitive to noise, which is always present in data streams.

Many effective time series classification approaches are available in literature [2], [11], but they are not suitable for early decision: it is worth underlining that early decision is a task for analyzing data streams collected in real time and locating the earliest event that supports a reliable decision, according to a given cost function, from an incomplete set of temporally related data. It is an example of optimal stopping theory [12] because a given action is taken from sequential observations of a random variable, according to misclassification or delay costs.

The authors of [13] present a time series classification strategy from incomplete information, introducing the notion of reliability as the probability required when labeling an incomplete time series as if it were the complete data stream. As an alternative for sequential binary classification, the authors also refer to SPRT [14], which is a Bayes-optimal approach, but highlight the greedy connotation of this probabilistic model, where new observations have no impact on the cumulative log-likelihood calculated from previous ones.

SPRT has also been successfully used in [7] as a probability integrator, with reject option, on the same bot detection task proposed in this paper; it outperforms a real time binomial classification approach, presented in [15], that relies on a first-order Discrete Time Markov Chain (DTMC) [16], [17] to estimate the class conditional probability according to the likelihoods of the initial state and the following transition patterns.

In [18], the authors address early classification for some time-sensitive applications in healthcare by means of an effective 1-Nearest Neighbor (1NN) classifier, whose major advantage is that it needs no feature selection, pre-processing, training, or configuration parameters.
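As a point of reference for the SPRT baseline recalled in this section, the following minimal Python sketch implements Wald's sequential test for a binary decision over a stream of per-observation log-likelihood ratios. It is an illustration under stated assumptions, not the implementation used in [7] or [14]: the function name `sprt`, the error rates `alpha` and `beta`, and the toy stream are hypothetical.

```python
import math

def sprt(log_likelihood_ratios, alpha=0.01, beta=0.01):
    """Wald's SPRT over a stream of values log(p1(x) / p0(x)).

    Returns (decision, n_used): decision is 1 (accept H1), 0 (accept H0),
    or None when the stream ends before either bound is crossed.
    alpha and beta are the target false-positive / false-negative rates
    (hypothetical defaults chosen only for this sketch).
    """
    upper = math.log((1.0 - beta) / alpha)   # crossing it accepts H1
    lower = math.log(beta / (1.0 - alpha))   # crossing it accepts H0
    total = 0.0
    n = 0
    for n, llr in enumerate(log_likelihood_ratios, start=1):
        total += llr                         # cumulative log-likelihood ratio
        if total >= upper:
            return 1, n
        if total <= lower:
            return 0, n
    return None, n                           # reject option: undecided

# Toy stream: each observation is four times more likely under H1 than H0.
llr = math.log(0.8 / 0.2)
decision, n_used = sprt([llr] * 10)
```

The `None` outcome plays the role of the reject option: the test simply asks for more observations until one of the two bounds is crossed.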
1686 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 17, 2022
In [6], early classification is made by means of probabilistic classifiers, named Early Classification framework based on class Discriminativeness and Reliability of Predictions (ECDIRE), that learn the timestamps when accuracy begins to exceed class defined thresholds. The predictions are released only when timestamps match the learned values. It focuses on a set of time series of equal length, but ECDIRE can be utilized on variable or unknown length sequences with few minor changes.

Early odor identification by means of electronic nose sensors is addressed in [19], where the authors analyze subsequent signal chunks collected at the sensors to feed an ensemble of serially connected classifiers, with a reject option, and assign a class label when sufficient confidence is attained.

Most early classification approaches in literature, such as [2], [20], work on univariate time series and need the entire sequence upfront. The approaches for multivariate sequences become more complex because the distance measures must be able to express the correlation among features [21].

Multivariate time series cannot be treated as a collection of univariate ones, because there exists a hidden relationship among features that holds important information for the representation of real processes.

In order to leverage the correlation property in multivariate time series, [22], [23] propose Correlation Based Dynamic Time Warping (CBDTW), which creates a non-overlapping segmentation of a time series by means of:
• Principal Component Analysis (PCA) based similarity measures to segment an unclassified sequence;
• a cost function to map each chunk to a non-negative real number and DTW distance to train the classifier.

Statistical analysis drives an interesting adaptive non-myopic approach [24] that requires the entire sequence to be available upfront and considers a penalty factor, similarly to [19], related to decision delay and a misclassification cost to balance quality of prediction and speed of decision.

Another early classification model suitable for multivariate time series is presented in [25] on biomedical data, specifically in multivariate gene expression. This hybrid approach binds a generative Hidden Markov Model (HMM) [26], that exploits dependencies among observations on temporal segments, and a Support Vector Machine (SVM) [27] for efficient discrimination of sequences.

A totally different approach to early classification of biomedical multivariate time series based on shapelets is proposed in [28]. The method, named Multivariate Shapelet Detection (MSD), can achieve highly accurate classification rates analyzing up to 64% of each test sequence.

The strategy proposed in [29] looks for sub-concepts or sub-clusters that characterize the same class label. The feature variables are independently scanned to uncover the inner structure of the multivariate time series (MTS) by means of core shapelets eligible for the classifier.

In [54], the authors report various quantum algorithms that are equivalent to classical machine learning but use quantum optimization to accelerate the training process or target binary classification problems such as Quantum SVM [55] or Quantum PCA [56]. They also propose an interesting Quantum Neural Network (QNN) for time series prediction and modeling.

A true quantum algorithm for time series classification is proposed in [57] where the authors make use of quantum computing by formulating the reconstruction task as a quadratic unconstrained binary optimization (QUBO) problem, although not quantum-inspired.

To the best of our knowledge, only a very limited number of quantum-inspired classification methods are available, mainly focused on binary problems.

Binary classification is the objective of a very recent quantum-inspired method, proposed by [30], that applies quantum formalism to classical computational problems, confirming a growing interest in the topic and its promising outcomes. A binary classifier is used to solve the quantum state discrimination problem introduced by Helstrom [31] considering that multiple copies of a quantum state can provide more information than the state itself. This supervised algorithm, tested on real-world and simulated binomial datasets from the Penn Machine Learning Benchmark repository [32], outperforms, on average, all the most frequently used classifiers.

Another approach, described in [33], might look similar to the one in this paper: it estimates the density operators for each class and applies projective measurement on quantum states to label each data element. However, it does not address time series, nor does it exploit entanglement in classification, which confirms the innovative nature of our work.

The algorithms analyzed so far propose several possible approaches to early time series classification, but are either too specific for particular tasks or present some limitations with regard to the number of features in the input stream or the number of classes in the target or require that the whole time series be available upfront. Our proposal overcomes the aforesaid limitations by introducing a real-time classification approach that, in principle, works with any number of features and classes to determine a reliable decision at the earliest moment in time, never considering the complete sequence.

III. THE QUANTUM CLASSIFIER

Quantum computing applies quantum-mechanical principles to data processing [34].

Those fundamental principles are:
• Superposition, which results from the linearity of the solutions of Schrödinger's equation. Adding together multiple quantum states determines another valid state and, conversely, any quantum state can be split up as a sum of any number of valid states.
• Entanglement, which occurs when the state of a composite system cannot be written as a product of states of its component systems [53]. Entangled particles can express a stronger connection than their classical analogues.

The quantum bit or qubit is a two-state quantum system that can be in a superposition of state 0 and 1 at the same time, unlike classical bits.

The quantum equivalent of classical 0 and 1 logic states is defined by the basis states of a qubit, which can be represented
in ket notation by the following column vectors [35]:

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

The state vectors form an orthonormal basis, hence their inner products $\langle x|y\rangle$ are:

$$\langle 0|0\rangle = \langle 1|1\rangle = 1 \quad \text{and} \quad \langle 0|1\rangle = \langle 1|0\rangle = 0,$$

where the bra operator $\langle x|$ is the conjugate transpose of the ket, defined as $\langle x| = |x\rangle^{\dagger}$.

A pure qubit state $|\psi\rangle$ can be expressed as a superposition of the basis states

$$|\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle, \tag{1}$$

where $\alpha$ and $\beta$, termed probability amplitudes, are usually complex numbers such that $|\alpha|^2$ and $|\beta|^2$ represent the probability that, after a measurement, the state $|\psi\rangle$ is detected in the state $|0\rangle$ or $|1\rangle$ respectively, thus leading to the normalization condition

$$|\alpha|^2 + |\beta|^2 = 1. \tag{2}$$

As sequential data streams are generally characterized by an intrinsic correlation among nearby samples, entanglement becomes a fundamental property to enforce the interrelationship among observations of a time series.

By definition, a state is considered entangled if it is not separable into its fundamental parts, that is, two distinct particles of a system are entangled if one cannot be described without considering the other. Moreover, they can be entangled even if separated by considerable distance [37].

As an example,

$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|00..0\rangle + |11..1\rangle\right)$$

represents n entangled qubits in equal superposition, or CatState; in the example, states $|00..0\rangle$ and $|11..1\rangle$ have equal probabilities $\left|\frac{1}{\sqrt{2}}\right|^2 = \frac{1}{2}$. The above state is not separable because it is impossible to write it as a tensor product.

The term CatState refers to the quantum superposition of two macroscopically distinct states and is derived from the hypothetical Schrödinger's cat experiment.

The behavior of a physical system can be described by a general framework defined by four postulates of quantum mechanics. Two postulates are related to superposition and measurement principles, whereas the third one describes the evolution of a closed quantum system in terms of the Schrödinger equation. Finally, the fourth one describes the admissible states for composing two or more subsystems and asserts that the state space of a composite quantum system is the tensor product (symbol ⊗) of the state space of its components [38].

If $|\psi_1\rangle \ldots |\psi_n\rangle$ describe the state of n isolated quantum systems, the state of the composite system is

$$|\psi\rangle = |\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle.$$

The last aspect to consider is how to measure the probabilities of each basis state from the resulting composite state: in a real quantum system, the measurement process alters its state, which turns into the pure state corresponding to the outcome of measurement. It can be regarded as an interface between the quantum and the classical domains, being the only way to extract useful information from a quantum system [38].

According to the third postulate of quantum mechanics, a collection of measurement operators acting linearly on the state space of the system can be used to measure a quantum state: this is commonly termed projective measurement.

If a system can have M possible valid outcomes, a set of $\{P_m : m \in M\}$ operators can be identified in order to obtain the probability of measuring m from the system state $|\psi\rangle$, which is

$$p(m) = \langle\psi|\, P_m^{\dagger} P_m \,|\psi\rangle. \tag{3}$$

The operators satisfy the completeness relation $\sum_{m \in M} P_m^{\dagger} P_m = I$, which ensures that all probabilities add up to 1, as per:

$$\sum_{m \in M} p(m) = \sum_{m \in M} \langle\psi|\, P_m^{\dagger} P_m \,|\psi\rangle = \langle\psi|\, I \,|\psi\rangle = 1.$$

For the two basis states $|0\rangle$ or $|1\rangle$, measurement is performed through the projectors $P_0 = |0\rangle\langle 0|$ or $P_1 = |1\rangle\langle 1|$ respectively, gathering the probabilities $p_0$ and $p_1$.

Therefore, the probability $p_0$ of a qubit being in state $|0\rangle$ can be obtained through projective measurement by the following equation

$$p_0 = \langle\psi|\, P_0 \,|\psi\rangle. \tag{4}$$

Alternatively, whenever the post-measurement state is not significant, it is possible to define a density operator that describes the whole system [38]

$$\rho = \sum_i P_i\, |\psi_i\rangle\langle\psi_i|, \tag{5}$$

with the following constraints:
1) Trace condition: $\mathrm{Tr}(\rho) = 1$,
2) Positivity condition: $\rho$ is a positive operator.

The trace is a linear operator, hence in the case of a two-state quantum system, the trace condition can be expanded as

$$\mathrm{Tr}(\rho) = \mathrm{Tr}\!\left(\sum_{i=0}^{1} P_i\, |\psi_i\rangle\langle\psi_i|\right) = \mathrm{Tr}(P_0\, |\psi\rangle\langle\psi|) + \mathrm{Tr}(P_1\, |\psi\rangle\langle\psi|),$$

which leads to the generalized probability $p_i$ of state $|i\rangle$, expressed by

$$p_i = \mathrm{Tr}(P_i\, |\psi\rangle\langle\psi|). \tag{6}$$
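The definitions of this section can be checked numerically. The short NumPy sketch below builds the basis kets, a normalized superposition as in (1), the projectors, and compares the probability obtained from the expectation value (4) with the one obtained from the trace form (6). It also builds a CatState and, as an added illustration that is not part of the paper, uses the matrix rank of the reshaped two-qubit amplitude vector (its Schmidt rank) to tell a product state from an entangled one. All function names are hypothetical and chosen only for this sketch.

```python
import numpy as np

# Computational basis states |0> and |1> as vectors.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

def qubit(alpha, beta):
    """Return the normalized pure state alpha|0> + beta|1>, as in Eq. (1)."""
    psi = alpha * ket0 + beta * ket1
    return psi / np.linalg.norm(psi)

def projector(ket):
    """Projector |k><k| built from a basis ket."""
    return np.outer(ket, ket.conj())

def prob_expectation(psi, P):
    """p = <psi| P |psi>, as in Eq. (4)."""
    return float(np.real(np.vdot(psi, P @ psi)))

def prob_trace(psi, P):
    """p = Tr(P |psi><psi|), as in Eq. (6)."""
    rho = np.outer(psi, psi.conj())
    return float(np.real(np.trace(P @ rho)))

def cat_state(n):
    """(|00..0> + |11..1>) / sqrt(2) for n qubits."""
    psi = np.zeros(2 ** n)
    psi[0] = psi[-1] = 1.0 / np.sqrt(2.0)
    return psi

def schmidt_rank(two_qubit_state):
    """Rank 1 means a product state, rank 2 an entangled two-qubit state."""
    return int(np.linalg.matrix_rank(two_qubit_state.reshape(2, 2)))

psi = qubit(np.sqrt(0.3), np.sqrt(0.7))
P0, P1 = projector(ket0), projector(ket1)
p0, p1 = prob_expectation(psi, P0), prob_expectation(psi, P1)

product_state = np.kron(psi, psi)   # composite state via the tensor product
entangled_state = cat_state(2)      # cannot be written as a tensor product
```

In this sketch `p0 + p1` equals 1 up to rounding, the two probability formulas agree, and `schmidt_rank` returns 1 for `product_state` and 2 for `entangled_state`.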
TABLE III
NUMBER OF SESSIONS PER CLASS IN SYNTHETICALLY GENERATED DATASETS

by generating a random number and reading the associated

the overall classification scores, which tend to flatten for peep values greater than or equal to 4.

As an exception, in the binomial case, it is possible to compute the entangled states with a simpler procedure independent of peep. With more than three classes, experimental evidence shows that accuracy reaches its upper limit before exceeding the greatest bearable peep value, which was at the upper limit of 10 on our machine.

• minimize the number of observations required to make a
TABLE IV
PARAMETERS AND SUMMARY INDICATORS LOGGED ON GRID SEARCH

TABLE V
CLASSIFICATION RESULTS FOR 10,000 SESSIONS WITH 3 CLASSES (INCLUDING UNDECIDED SESSIONS)
observations might become available for undecided sessions and sooner or later make a reliable decision.

With the same settings, the classifier performance can also be assessed on an increasing number of classes, easily generated with our tool. The metrics reported in Table VI share DT = 0.995 and PEEP = 4 as common settings.

The ADS indicator is defined as the average, over the total number of sequences N, of decision timestamps $t_i$ weighted by the number of sequences classified at a given instant $n_i$, that is:

$$\mathrm{ADS} = \frac{1}{N} \sum_{i=1}^{N} t_i \cdot n_i.$$

Even if the number of undecided sessions reduces the overall accuracy, its value stays steady above 97%, with very few classification errors in the binary case. If we had not considered unclassified streams, as if we could observe more events to support a trustworthy decision, we could ideally reach 100% accuracy for three and four classes and 99.98% for the binomial case respectively, with as many as 27 observations analyzed in the single worst case.

Moreover, seventy percent of classified sessions are correctly labeled within the fifth observation and QEMC needs only 8 steps to classify ninety percent.

According to specific goals of the classification task, it is possible to tune the threshold to favor either accuracy or LDS, given that in all cases the ADS indicator denotes high classification speed on average.

This initial experimental session pointed out an intrinsic limitation of the proposed quantum-inspired approach, presumably due to the hardware specifications of our machine. Basically, in addition to the exponential complexity related to sequence length, the number of classes also represents a sort of barrier hampering the adoption of QEMC.

On the test machine, whose technical specifications are reported at the beginning of this section, up to 10 classes could be detected simultaneously without compromising overall system performance: alternative hierarchical approaches are possible but major changes to the proposed classification architecture are required to support two or more levels of refinement. For instance, if we were to predict possible component failures on a cyber-physical system, it would be possible to implement a first classification level capable of discriminating among the potentially affected subsystems and then pass only the involved data streams to a specialized classifier that is fine tuned for the given subsystem.

In principle, this hierarchical approach makes it possible to cope with multinomial classification problems of any size, even on edge computers with extremely limited resources.

VI. THE BOT DETECTION PROBLEM

The application area on which we focused our experiments is cyber-security and specifically web robot detection from HTTP request server logs [5], [45], [46], similarly to the work of [47]–[49].
TABLE VI
CLASSIFICATION RESULTS FOR 10,000 SESSIONS WITH 2-3-4 CLASSES
As evidenced in preceding sections, the multinomial version of our algorithm is a generalization of the binary approach, originally designed for bot classification from real-time HTTP traffic data at the web server, and uses the same dataset for the experimental part in order to compare the results.

It is an early decision, multivariate, sequential classification task on a non-stationary data stream.

Web robots, or simply bots, are software programs capable of autonomously executing specific tasks over the internet, whose aim can be either good or malicious [50], [51].

These autonomous agents are pervading the net and many bots have useful purposes, such as search engine crawlers or price comparers, but some others have malicious goals, like stealing sensitive data, injecting malware or executing other harmful activities, and therefore must be identified as soon as possible to reduce their negative effects.

Usually, bots are detected through offline analysis of web server logs because it allows for a deeper understanding of their behavioral model, thus highlighting the crawling differences between humans and robots [52]. Nevertheless, it would be helpful to enable web servers to tell robots and humans apart in real time and implement specific management policies that ensure the best user experience.

Concerning real time detection, to the best of our knowledge, two methods require special attention and therefore will be analyzed in detail and compared to the present quantum-inspired algorithm. The first method, described in [15], is based on transition maps and hidden Markov models, whereas the second one leverages Wald's Sequential Probability Ratio Test (SPRT) to gather information from subsequent events and eventually make a decision [7].

The solution proposed by Doran and Gokhale in [15] is an integrated method for real time and offline web robot detection that analyzes the differences between human and software visitors in the resource request patterns, considered time invariant by the authors, and imposes a minimum number of events to be observed before deciding.

Some basic concepts have to be defined for a common understanding of the remainder of this document:
• a session, according to common practice, is a series of requests pertaining to the same IP address and user agent string, separated by a time gap shorter than thirty minutes;
• a request pattern is the ordered sequence of resource requests received at the web server during a session.

Though humans and robots request different specific resources during each visit, it is not possible to characterize visitors by the mere list of requested resources. Conversely, the order in which resource files are accessed by a human visitor is inherently different from that of crawling algorithms, which are unlikely to exhibit human-like behaviors.

These considerations led the authors of [15] to define a sensible taxonomy of possible resource file types, organized into 9 more general aggregations, whence they derived a semantic representation of all resource request patterns, which is capable of expressing the differences between humans and robots.

VII. THE DATASET FOR BOT DETECTION

The dataset used to test the proposed algorithm has been already utilized for [7], [46] to compare DTMC versus SPRT and contains the sequences of HTTP request headers from many different working sessions.

Each session has been manually labeled as bot (label 1) or human (label 0) generated and the classifier tries to take a reliable decision before the session ends or labels the session as undecided. Appropriate actions can then be taken on the undecided sessions according to the specific task objectives.

In order to apply the different classification models to the same bot dataset, no feature selection policy is implemented and all available features are considered, but proper pre-processing transformations are needed on the original features depending on their type.

The features, as shown in Table VII, can be divided into three categories, each requiring different pre-processing actions:
• numerical features (N) are standardized by subtracting the mean and scaling to unit variance;
• categorical features (C) are transformed into the corresponding one-hot encoding;
• boolean features (B) are simply translated to their numerical equivalent: 0 for False and 1 for True.

After each feature has been transformed as explained above, each HTTP request is represented as a 25-feature vector and the corresponding session becomes a series of time related vectors.

The entire dataset contains 13,395 sessions for a total number of 1,397,838 HTTP requests. The session breakdown is as follows: 6,190 sessions are labeled as bots, 7,200 can be associated with human activities and 5 sessions were excluded because it was not possible to assign them to any class with sufficient confidence.

Finally, the dataset was prepared for a 10-fold cross-validation training by manually partitioning the sessions into ten roughly balanced subsets, each consisting of 619 and 720 sessions for bot and human classes respectively.

The good balance between bot and human sessions implies that either accuracy or F1 score can be indifferently selected as representative metrics to evaluate the performance of the proposed algorithm.
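The three pre-processing rules stated for numerical (N), categorical (C) and boolean (B) features can be sketched with NumPy as follows. This is a minimal illustration, not the authors' pipeline: the helper names and the three toy columns are hypothetical, since the actual feature list of Table VII is not reproduced here.

```python
import numpy as np

def standardize(column):
    """Numerical feature (N): subtract the mean, scale to unit variance."""
    x = np.asarray(column, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def one_hot(column):
    """Categorical feature (C): one column per category, sorted by name."""
    categories = sorted(set(column))
    index = {c: j for j, c in enumerate(categories)}
    out = np.zeros((len(column), len(categories)))
    for i, value in enumerate(column):
        out[i, index[value]] = 1.0
    return out

def boolean_to_number(column):
    """Boolean feature (B): 0 for False and 1 for True."""
    return np.asarray(column, dtype=bool).astype(float)

def preprocess(numerical, categorical, boolean):
    """Stack all transformed columns; each row is one HTTP request."""
    parts = [standardize(c)[:, None] for c in numerical]
    parts += [one_hot(c) for c in categorical]
    parts += [boolean_to_number(c)[:, None] for c in boolean]
    return np.hstack(parts)

# Hypothetical toy columns for three requests of one session.
X = preprocess(
    numerical=[[100, 200, 300]],          # e.g. a size-like quantity
    categorical=[["GET", "POST", "GET"]], # e.g. a method-like field
    boolean=[[True, False, True]],        # e.g. a flag
)
```

In a cross-validation setting, the mean and variance used by `standardize`, and the category list used by `one_hot`, would normally be estimated on the training folds only and then reused on the held-out fold; the sketch computes them on the data it is given for brevity.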
TABLE VII
FEATURES LIST OF THE FEATURES AVAILABLE FOR MODEL TRAINING BEFORE PRE-PROCESSING
VIII. THE SOFTWARE MODEL FOR BOT CLASSIFICATION
A. The Two-Stage Classification Model
The classification model can be ideally divided into two logical stages. The first stage, built upon a deep neural network, is responsible for learning the classification model and assigning an a-posteriori conditional class probability estimate to each individual HTTP request, independently of any other entry of the training sequences. If needed, it can be replaced by any classifier that better fits the available data size to produce the aforesaid probability estimates: in the present case, the multi-layer perceptron was selected as the best option among the models we tested.
The second stage is based on the quantum-inspired entangled classifier described in section III, designed for a two-class setting. It is noteworthy that, even if the problem is intrinsically binary, the classification outcome of the quantum module is three-state valued because a session might end before the system can take any reliable decision. Those sessions are then provisionally labeled as undecided and can be either neglected or included in the computation of the performance metrics, slightly affecting the overall results.
B. Stage 1: Probability Estimation
The neural network implements supervised learning, setting aside a fraction of the dataset for model validation and using the remaining part for training with 10-fold cross-validation. The neural network is based on the MLPClassifier of the scikit-learn toolkit [42] and is designed as a sigmoid output unit on top of two 50-unit hidden layers with ReLU activation function. This configuration has heuristically proved to be the most effective among those tested for the dataset under examination. The terminal sigmoid layer has been selected because its output is a real number constrained between zero and one and can therefore be interpreted by the cascade stage as a probability estimate for the relevant class. In the generalized approach for N classes, the output layer is composed of N Softmax units that calculate probabilities whose sum is always 1.
C. Stage 2: The Quantum Classifier Module
The second stage is the Quantum Entangled Multinomial Classifier proposed in section III for the binomial setting.
The reference orthonormal basis is defined as:
$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
Let $x_t$ be a sequence of HTTP request samples associated with a specific session and $y \in \{0, 1\}$ be the relevant ground truth, which is the same for every request of a session. The probability of request $i$ being bot or human generated is computed by means of the Multi-Layer Perceptron at stage one and is stored in $p_i^k$, $k \in \{0, 1\}$.
As explained earlier in this document, quantum entanglement can be used to express a higher level of correlation among quantum states. Since every request in a session belongs to the same class across the whole sequence, and the requests are reasonably correlated because they are generated by the same entity, it is sensible to hypothesize that quantum entanglement is capable of capturing and exposing the intrinsic correlation within each session.
The probabilities of the two classes, estimated by the neural network, can be used to build a quantum entangled representation of all subsequent requests in a session. The multi-layer perceptron classifier does not capture temporal information; here we use it to assign the likelihood of each individual sample to belong to either class. As the request order in each sequence is preserved to reflect the web navigation pattern, QEMC deals with correlation by means of entanglement.
As expressed by (1), the probabilities of the $i$-th observation in the sequence of length $T$ can be linked to the two basis states $|0\rangle$ and $|1\rangle$, hence it is possible to compute $\alpha_i$ and $\beta_i$ as
$$\alpha_i = \sqrt{p_i^0}, \qquad \beta_i = \sqrt{p_i^1} \tag{10}$$
and then create the $T$-qubit separable states $|\psi_0\rangle$ and $|\psi_1\rangle$, according to (3), from
$$\begin{aligned} |\psi_0\rangle &= \alpha\,|00\ldots0\rangle = \alpha_0|0\rangle \otimes \alpha_1|0\rangle \otimes \ldots \otimes \alpha_{T-1}|0\rangle \\ |\psi_1\rangle &= \beta\,|11\ldots1\rangle = \beta_0|1\rangle \otimes \beta_1|1\rangle \otimes \ldots \otimes \beta_{T-1}|1\rangle \end{aligned} \tag{11}$$
The entangled state represented by a stream of n requests can then be expressed as the superposition of the two states from (11):
$$|\psi\rangle = |\psi_0\rangle + |\psi_1\rangle \tag{12}$$
In order to tell whether the current sample is due to a bot or a human, it is necessary to measure, from the entangled
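As a small numerical illustration of (10) and (11), the sketch below turns per-request class probabilities into the amplitudes α_i and β_i and multiplies them into the overall amplitudes α and β of |00…0⟩ and |11…1⟩. The probability values are made up, the function names are hypothetical, and identifying α and β with the products of the per-request amplitudes is our reading of the tensor products in (11). The measurement step that the paper applies afterwards is not reproduced here.

```python
import math


def request_amplitudes(p0, p1):
    """Eq. (10): alpha_i = sqrt(p0_i) and beta_i = sqrt(p1_i) for each request i."""
    alphas = [math.sqrt(p) for p in p0]
    betas = [math.sqrt(p) for p in p1]
    return alphas, betas


def sequence_amplitudes(p0, p1):
    """Amplitudes of |00...0> and |11...1> in Eq. (11), taken here as the
    products of the per-request amplitudes (an assumption of this sketch)."""
    alphas, betas = request_amplitudes(p0, p1)
    return math.prod(alphas), math.prod(betas)


# Made-up stage-one outputs for a session of T = 3 requests (class 1 vs class 0).
p1 = [0.9, 0.8, 0.7]
p0 = [1.0 - p for p in p1]

alpha, beta = sequence_amplitudes(p0, p1)

# The squared amplitudes equal the products of the per-request probabilities.
print(alpha ** 2, math.prod(p0))
print(beta ** 2, math.prod(p1))
```

Because each factor is a square root of a probability, the amplitude of the class that is consistently favored by stage one shrinks much more slowly along the sequence than the amplitude of the other class.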
CABRI et al.: QUANTUM-INSPIRED CLASSIFIER FOR EARLY WEB BOT DETECTION 1693
TABLE VIII
EXPERIMENTAL SCENARIOS
Fig. 2. Scenario A - Pareto front analysis: classification accuracy versus length of the decision sequence at variable values of grade.
Fig. 3. Accuracy vs decision step: classification accuracy achieved versus the number of requests analyzed to make a decision.
At grade 0.4, the ACC value is 0.9585, the highest for the setting, but the number of unclassified sessions is 50, which is extremely high compared to SPRT, and LDS is 10. Conversely, at grade 2.6, accuracy is only slightly lower than in the previous case (ACC = 0.9512, $\Delta = 0.0073$) but LDS is exactly the same as in SPRT and the number of unclassified sessions drops to zero. Nevertheless, in both cases, classification accuracy is greater than in SPRT (worst case $\Delta = 0.009$).
C. Scenario B
This scenario evaluates the classification results with regard to variable threshold values on 70% of sessions used for validation with grade set at 0.5, which is the default value for QEMC.
The best results for SPRT are achieved with lower and upper thresholds set to the logarithm of 0.1 and 0.85 respectively; in this configuration, ACC is 0.9205, TUC is 8 and LDS is 3. The metrics for QEMC at the best thresholds for SPRT are slightly better in accuracy (0.9302), which means that the overall number of correctly classified sessions is greater, but it might take longer to reach a decision (LDS = 5), even if in both cases 90% of the sessions are classified at the first step, and the number of unclassified sessions is almost double (TUC = 15).
For the current setting, figure 3 visualizes the rate of correctly classified sessions for the two methods: SPRT identifies a greater percentage at the first two requests but no great improvement is achieved on the third and last step. Conversely, QEMC takes over at the third request and the overall performance is nearly 1% better than SPRT.
The best threshold pair for QEMC is 0.25 for the lower and 0.95 for the upper threshold where, despite even greater values of LDS (7) and TUC (23), the accuracy rises appreciably to 0.9527 ($\Delta = 0.0322$ versus best SPRT) and 90% of the sessions are classified within the second step. For these threshold values, the accuracy of SPRT is slightly lower (0.9204) than the best case, but the number of unclassified sessions decreases to 4 while maintaining the same LDS value.
D. Scenario C
The third scenario compares the performance indicators when varying grade in the threshold setting that is best for SPRT and with peep = 6, which should improve the accuracy of QEMC by considering more samples in the decision process. Even in this case the results for SPRT are unchanged (ACC is 0.9205, TUC is 8 and LDS is 3) because the peep mechanism only applies to QEMC, which conversely improves its classification performance depending on the Pareto optimal values of grade.
The optimal value to maximize accuracy is 0.2, as shown in figure 5, where accuracy is 0.9589, a bit higher ($\Delta = 0.0004$) than in Scenario A with peep at 4, showing that it is possible to achieve better classification rates by considering more samples. This is paid for in terms of LDS, which grows to 15, TUC, which spikes to 91, and the number of steps required to classify 90% of the sessions, which becomes 3.
On the other hand, the optimal value of grade to minimize LDS is 2.4, which not only requires at most 2 samples to take a reliable decision but also brings the total number of unclassified sessions to zero. The good point here is that accuracy is only $10^{-4}$ worse than for SPRT, with only 1 request needed to classify 90% of the sessions in both cases.
The three scenarios proposed above are representative of the various combinations of post-training hyper-parameters and expose both the pros and cons of the novel quantum-inspired approach.
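The trade-off discussed in Scenario C (higher accuracy against a longer decision sequence) is a two-objective problem, so candidate grade values can be screened with a simple Pareto filter. The sketch below is a generic non-dominated filter, not the authors' analysis code. The first two points use the figures quoted in the text for grade 0.2 and grade 2.4 (taking the latter's accuracy as 0.9204, that is 10^-4 below the SPRT value of 0.9205), while the third point is invented purely to show a dominated configuration; the `Config` record and function names are hypothetical.

```python
from typing import NamedTuple


class Config(NamedTuple):
    grade: float
    acc: float  # classification accuracy, to be maximized
    lds: int    # LDS value reported in the text, to be minimized


def dominates(a: Config, b: Config) -> bool:
    """True if `a` is at least as good as `b` on both objectives and
    strictly better on at least one of them."""
    no_worse = a.acc >= b.acc and a.lds <= b.lds
    strictly_better = a.acc > b.acc or a.lds < b.lds
    return no_worse and strictly_better


def pareto_front(configs):
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


candidates = [
    Config(grade=0.2, acc=0.9589, lds=15),  # accuracy-optimal point in Scenario C
    Config(grade=2.4, acc=0.9204, lds=2),   # LDS-optimal point in Scenario C
    Config(grade=1.0, acc=0.9300, lds=15),  # hypothetical, dominated by grade 0.2
]

front = pareto_front(candidates)
print([c.grade for c in front])
```

Neither of the two reported points dominates the other, which is why the choice between them depends on whether accuracy or decision speed matters more for the deployment at hand.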
Fig. 4. Scenario C - accuracy vs grade: classification accuracy at increasing values of grade.
Fig. 5. Scenario C - Pareto front analysis: classification accuracy versus length of the decision sequence at variable values of grade.
Classification accuracy for QEMC can be appreciably boosted by properly selecting the peep and grade values, at the same threshold conditions, by means of Pareto analysis. Moreover, the same parameters can be used to achieve particular objectives, such as zero unclassified sessions or a shorter decision sequence, while preserving the performance indicators that, in the worst case, are fairly equal. In fact, by tuning peep and grade, it is possible to increase the convergence speed of the classification algorithm and reduce the number of requests needed to take a decision to even fewer than SPRT.
It is worth noting that a reduction in the training size of the dataset has a smaller impact on classification accuracy for QEMC than for SPRT, in the same setting: experimental evidence shows that, with a validation ratio of 50%, accuracy is 0.9573 for QEMC and 0.9421 for SPRT whereas, with 30% of the sessions used for training, the relevant values are 0.9535 and 0.9204 respectively. Hence $\Delta_{QEMC} = 0.0038$ and $\Delta_{SPRT} = 0.0217$, which is nearly 6 times greater than the former.
Another important consideration is related to the peep value: the adoption of such a mechanism is imposed by the degradation of computational performance on long sequences when the decision process needs to consider many requests to meet the desired confidence level. However, regardless of the length of a session, the number of samples that have to be taken into
X. CONCLUSION
In the present paper, we analyzed the general structure of a temporal sequence of data and pointed out the benefits of real time classification of non-stationary data streams, underlining its application in cyber-security with on-the-fly bot detection.
We introduced QEMC, a novel quantum-inspired multinomial classifier for early detection of significant events on time series, which has been validated in a synthetic experimental setting to confirm the motivating results obtained with its binary version applied to bot detection.
The proposed technique relies on superposition and entanglement to integrate the class probability of each individual event in the time series, estimated by an upstream stage, and produce an overall score, with reject option, capable of supporting trustworthy decisions even in case of a limited number of events.
Our method has been successfully compared with another effective bot detection approach, namely SPRT, and its results have been analyzed with reference to the contrasting objectives of classification accuracy, number of undecided sessions and speed of decision.
The extensive experimental studies, tested on traffic streams from an actual Polish e-commerce server, showed that SPRT is able to detect, in real time, over 90% of all bots and is especially powerful given a very limited number of observations, although it requires no minimum quantity of HTTP requests to be observed before making a decision.
Nonetheless, our innovative quantum-inspired multinomial classifier for early detection of significant events on time series can produce better overall scores and is similarly capable of supporting trustworthy decisions even in case of a limited number of events, both in the binary and in the multinomial setting.
The results were analyzed with reference to the contrasting objectives of classification accuracy, number of undecided sessions and speed of decision, showing that the proposed quantum-inspired algorithm, in our opinion, natively covers an area of application (non-stationary data stream classification) that so far has not yet found reliable and well-performing approaches.
This paper demonstrates the effectiveness of the proposed algorithm that, compared to other approaches, was proven to outperform not only SPRT but also, by transitive property, other very powerful state-of-the-art techniques.
Moreover, the proposed approach represents a complete real time classification framework for a critical application, such as bot detection, and can easily be integrated, as a plug-in, in a web architecture.
With regard to the methods analyzed in section II, some additional notes are worth reporting to highlight the advantages and disadvantages of the current implementation of the new approach:
1) QEMC tolerates non-standardized numerical features, whereas standardization is usually considered a mandatory transformation for machine learning tasks;
2) with QEMC, it is possible to dramatically reduce the number of training sequences with no significant decrease of classification scores;
3) in the current configuration of the classification framework, solutions are not interpretable, therefore some areas of application might be precluded to QEMC;
4) no estimate of the reliability of decisions is currently available in QEMC;
5) dependencies on the grade parameter have not yet been explored in depth, but could open the way to a fuzzy flavor of the classifier.
In our opinion, considering the interesting results achieved with this initial formulation of QEMC, the last three items represent promising areas of investigation, where near-future research should be directed. We also believe that the proposed algorithm might open the way to new approaches for time series prediction and clustering, but so far we do not envisage any appreciable evolution.
Replacement of the ANN with explainable ways to compute the probability estimates of observations might also open new perspectives for the quantum-inspired technique, especially if accompanied by a measure of decision reliability.
ACKNOWLEDGMENT
The authors would like to thank Paolo Solinas for his valuable support in reviewing some technical aspects of quantum theory.
REFERENCES
[1] T. Santos and R. Kern, “A literature survey of early time series classification and deep learning,” in Proc. SAMI iKNOW, 2016, pp. 1–7.
[2] Z. Xing, J. Pei, and E. Keogh, “A brief survey on sequence classification,” ACM SIGKDD Explorations Newslett., vol. 12, no. 1, pp. 40–48, Jun. 2010.
[3] A. Hassaïne and S. Al-Maadeed, “An online signature verification system for forgery and disguise detection,” in Proc. Neural Inf. Process., vol. 7666. Berlin, Germany: Springer, 2012, pp. 552–559.
[4] A. Oleff, B. Küster, M. Stonis, and L. Overmeyer, “Process monitoring for material extrusion additive manufacturing: A state-of-the-art review,” Prog. Additive Manuf., vol. 6, no. 4, pp. 705–730, May 2021.
[5] S. Rovetta, A. Cabri, F. Masulli, and G. Suchacka, “Bot or not? A case study on bot recognition from web session logs,” in Quantifying Processing Biomedical and Behavioral Signals, vol. 103. Cham, Switzerland: Springer, 2019, pp. 197–206.
[6] U. Mori, A. Mendiburu, E. Keogh, and J. A. Lozano, “Reliable early classification of time series based on discriminating the classes over time,” Data Mining Knowl. Discovery, vol. 31, no. 1, pp. 233–263, Jan. 2017.
[7] G. Suchacka, A. Cabri, S. Rovetta, and F. Masulli, “Efficient on-the-fly web bot detection,” Knowl.-Based Syst., vol. 223, Jul. 2021, Art. no. 107074.
[8] P. J. Brockwell and R. A. Davis, Eds., Introduction to Time Series and Forecasting (Springer Texts in Statistics). New York, NY, USA: Springer, 2002.
[9] R. Adhikari and R. K. Agrawal, “An introductory study on time series modeling and forecasting,” Feb. 2013, arXiv:1302.6613.
[10] G. P. Zhang, “Time series forecasting using a hybrid ARIMA and neural network model,” Neurocomputing, vol. 50, pp. 159–175, Jan. 2003.
[11] S. Laxman and P. S. Sastry, “A survey of temporal data mining,” Sadhana, vol. 31, no. 2, pp. 173–198, Apr. 2006.
[12] G. Peskir and A. N. Shiriaev, Optimal Stopping and Free-Boundary Problems (Lectures in Mathematics ETH Zürich). Boston, MA, USA: Birkhäuser-Verlag, 2006.
[13] N. Parrish, H. S. Anderson, M. R. Gupta, and D. Y. Hsiao, “Classifying with confidence from incomplete information,” J. Mach. Learn. Res., vol. 14, no. 1, pp. 3561–3589, Dec. 2013.
[14] A. Wald, “Sequential tests of statistical hypotheses,” Ann. Math. Statist., vol. 16, no. 2, pp. 117–186, Jun. 1945.
[15] D. Doran and S. S. Gokhale, “An integrated method for real time and offline web robot detection,” Expert Syst., vol. 33, no. 6, pp. 592–606, Dec. 2016.
[16] F. Biagini and M. Campanino, “Discrete time Markov chains,” in Elements Probability and Statistics, vol. 98. Cham, Switzerland: Springer, 2016, pp. 81–87.
[17] N. Privault, Understanding Markov Chains: Examples and Applications (Springer Undergraduate Mathematics Series), 2nd ed. Singapore: Springer, 2018.
[18] Z. Xing, J. Pei, and P. S. Yu, “Early classification on time series,” Knowl. Inf. Syst., vol. 31, no. 1, pp. 105–127, Apr. 2012.
[19] N. Hatami and C. Chira, “Classifiers with a reject option for early time-series classification,” Dec. 2013, arXiv:1312.3989.
[20] Z. Xing, J. Pei, and P. S. Yu, “Early prediction on time series: A nearest neighbor approach,” in Proc. 21st Int. Conf. Artif. Intell. (IJCAI), 2009, pp. 1297–1302.
[21] H. Anderson, N. Parrish, and M. R. Gupta, “Early time-series classification with reliability guarantee,” Sandia National Lab, Albuquerque, NM, USA, Tech. Rep. SAND2012-7379C 480398, 2012.
[22] Z. Bankó and J. Abonyi, “Correlation based dynamic time warping,” in Proc. 8th Int. Symp. Hung. Researchers Comput. Intell. Inf., 2007.
[23] Z. Bankó and J. Abonyi, “Correlation based dynamic time warping of multivariate time series,” Expert Syst. Appl., vol. 39, no. 17, pp. 12814–12823, Dec. 2012.
[24] A. Dachraoui, A. Bondu, and A. Cornuéjols, “Early classification of time series as a non myopic sequential decision making problem,” in Machine Learning and Knowledge Discovery in Databases, vol. 9284. Cham, Switzerland: Springer, 2015, pp. 433–447.
[25] M. F. Ghalwash, D. Ramljak, and Z. Obradović, “Early classification of multivariate time series using a hybrid HMM/SVM model,” in Proc. IEEE Int. Conf. Bioinform. Biomed., Philadelphia, PA, USA, Oct. 2012, pp. 1–6.
[26] L. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
[27] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer Series in Statistics), 2nd ed. New York, NY, USA: Springer, 2009.
[28] M. F. Ghalwash and Z. Obradovic, “Early classification of multivariate temporal observations by extraction of interpretable shapelets,” BMC Bioinf., vol. 13, no. 1, p. 195, Dec. 2012.
[29] G. He, Y. Duan, R. Peng, X. Jing, T. Qian, and L. Wang, “Early classification on multivariate time series,” Neurocomputing, vol. 149, pp. 777–787, Feb. 2015.
[30] G. Sergioli, R. Giuntini, and H. Freytes, “A new quantum approach to binary classification,” PLoS ONE, vol. 14, no. 5, May 2019, Art. no. e0216224.
[31] C. W. Helstrom, “Quantum detection and estimation theory,” J. Statist. Phys., vol. 1, no. 2, pp. 231–252, 1969.
[32] R. S. Olson, W. L. Cava, P. Orzechowski, R. J. Urbanowicz, and J. H. Moore, “PMLB: A large benchmark suite for machine learning evaluation and comparison,” Mar. 2017, arXiv:1703.00512.
[33] P. Tiwari and M. Melucci, “Towards a quantum-inspired binary classifier,” IEEE Access, vol. 7, pp. 42354–42372, 2019.
[34] E. Rieffel and W. Polak, Quantum Computing: A Gentle Introduction (Scientific and Engineering Computation). Cambridge, MA, USA: MIT Press, 2011.
[35] V. Moret-Bonillo, “Can artificial intelligence benefit from quantum computing?” Prog. Artif. Intell., vol. 3, no. 2, pp. 89–105, Mar. 2015.
[36] A. Ekert, P. M. Hayden, and H. Inamori, “Basic concepts in quantum computation,” in Coherent Atomic Matter Waves, vol. 72, R. Kaiser, C. Westbrook, and F. David, Eds. Berlin, Germany: Springer, 2001, pp. 661–701.
[37] E. G. Rieffel and W. Polak, “An introduction to quantum computing for non-physicists,” Jan. 1998, arXiv:quant-ph/9809016.
[38] E. B. Guedes, F. M. de Assis, and R. A. C. Medeiros, “Fundamentals of quantum information processing,” in Quantum Zero-Error Information Theory. Cham, Switzerland: Springer, 2016, pp. 7–26.
[39] G. Van Rossum and F. L. Drake, Jr., Python Reference Manual. Amsterdam, The Netherlands: Centrum voor Wiskunde en Informatica Amsterdam, 1995.
[40] C. R. Harris et al., “Array programming with numpy,” Nature, vol. 585, no. 7825, pp. 357–362, Sep. 2020.
[41] J. D. Hunter, “Matplotlib: A 2D graphics environment,” Comput. Sci. Eng., vol. 9, no. 3, pp. 90–95, May 2007.
[42] F. Pedregosa et al., “Scikit-learn: Machine learning in Python,” J. Mach. Learn. Res., vol. 12, pp. 2825–2830, Oct. 2012.
[43] W. McKinney, “Data structures for statistical computing in Python,” in Proc. 9th Python Sci. Conf., Austin, TX, USA, vol. 445, 2010, pp. 51–56.
[44] V. Pareto, Manuel d’économie Politique. Geneva, Switzerland: Librairie Droz, 1981.
[45] G. Suchacka, “Analysis of aggregated bot and human traffic on e-commerce site,” in Proc. Conf. Comput. Sci. Inf. Syst., Sep. 2014, pp. 1123–1130.
[46] A. Cabri, G. Suchacka, S. Rovetta, and F. Masulli, “Online web bot detection using a sequential classification approach,” in Proc. IEEE 20th Int. Conf. High Perform. Comput. Commun., IEEE 16th Int. Conf. Smart City, IEEE 4th Int. Conf. Data Sci. Syst. (HPCC/SmartCity/DSS), Jun. 2018, pp. 1536–1540.
[47] A. Lagopoulos, G. Tsoumakas, and G. Papadopoulos, “Web robot detection in academic publishing,” Nov. 2017, arXiv:1711.05098.
[48] A. Stassopoulou and M. D. Dikaiakos, “Web robot detection: A probabilistic reasoning approach,” Comput. Netw., vol. 53, no. 3, pp. 265–278, Feb. 2009.
[49] P.-N. Tan and V. Kumar, “Discovery of web robot sessions based on their navigational patterns,” Data Mining Knowl. Discovery, vol. 6, no. 1, pp. 9–35, 2002.
[50] I. Zeifman. (Jan. 2017). Bot Traffic Report 2016. [Online]. Available: https://www.incapsula.com/blog/bot-traffic-report-2016.html
[51] G. Buehrer, J. Stokes, K. Chellapilla, and J. Platt, “Classification of automated web traffic,” in Weaving Services and People on the World Wide Web, Berlin, Germany: Springer-Verlag, Jan. 2009.
[52] G. Suchacka and M. Sobkow, “Detection of internet robots using a Bayesian approach,” in Proc. IEEE 2nd Int. Conf. Cybern. (CYBCONF), Jun. 2015, pp. 365–370.
[53] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information. Cambridge, U.K.: Cambridge Univ. Press, 2010, p. 96.
[54] D. Emmanoulopoulos and S. Dimoska, “Quantum machine learning in finance: Time series forecasting,” Feb. 2022, arXiv:2202.00599.
[55] P. Rebentrost, M. Mohseni, and S. Lloyd, “Quantum support vector machine for big data classification,” Phys. Rev. Lett., vol. 113, no. 13, Sep. 2014, Art. no. 130503.
[56] S. Lloyd, M. Mohseni, and P. Rebentrost, “Quantum principal component analysis,” Nature Phys., vol. 10, no. 9, pp. 631–633, Jul. 2014.
[57] S. Yarkoni, A. Kleshchonok, Y. Dzerin, F. Neukart, and M. Hilbert, Semi-Supervised Time Series Classification Method for Quantum Computing (Quantum Machine Intelligence), New York, NY, USA: Springer, Apr. 2021.
Alberto Cabri received the degree in electronic engineering from the University of Genoa, Italy, in 1992, and the Ph.D. degree in computer science and systems engineering. He is currently a qualified Teacher of computer science with the Public Secondary Schools, Genoa, Italy. He is also a Professional Engineer with the University of Genoa. His research focuses on machine learning and he has developed an innovative quantum inspired algorithm for multivariate time series classification.
Francesco Masulli (Senior Member, IEEE) is currently a Full Professor of computer science with the University of Genoa, Italy, and an Adjunct Professor with Temple University, Philadelphia, PA, USA. He held visiting positions at Radboud University, Nijmegen, The Netherlands; the International Computer Science Institute, Berkeley, CA, USA; and the I3S Laboratory, University of Nice Sophia Antipolis, France. He is the author of more than 250 papers in machine learning, neural networks, clustering, fuzzy systems, and their applications. He serves as the Chair for IEEE Italy Section Computational Intelligence Society Chapter.
Stefano Rovetta (Senior Member, IEEE) is currently an Associate Professor of computer science with the University of Genova, Italy. He has authored more than 170 scientific articles in machine learning, neural networks, clustering, fuzzy systems, and bioinformatics. He is a member of the Italian Neural Network Society, the European Neural Network Society, and the European Society for Fuzzy Logic and Technology. He received the 2008 Pattern Recognition Society Award. He was the chair of international conferences.
Grażyna Suchacka (Senior Member, IEEE) received the M.Sc. degree in computer science, the M.Sc. degree in management, and the Ph.D. degree (Hons.) in computer science from the Wrocław University of Science and Technology, Poland. She is currently an Assistant Professor with the Institute of Informatics, University of Opole, Poland. Her research interests include data analysis and modeling, data mining, machine learning, and quality of web service with special regard to bot detection and electronic commerce support.