Designing quantum multi-category classifier from the perspective of brain processing information

Xiaodong Ding; Jinchen Xu; Zhihui Song; Yifan Hou; Zheng Shan

doi:10.1088/2632-2153/ad7570

1. Introduction

The complexity of the real world and the diversity of data [1, 2] have made multi-category classification problems increasingly important in AI. For example, information security [3], image classification [4], and disease diagnosis [5] are typical examples of multi-class classification problems. Solving such problems allows us to more effectively utilize machine learning techniques to address the complex and changing classification challenges in the real world. However, multi-category classification is more complex than binary classification, and models require more powerful classification capabilities and more complex decision boundaries. This undoubtedly poses a challenge in solving multi-classification problems. Traditional multi-class classification methods are usually based on deep learning frameworks [6, 7] which require large amounts of data, computational resources, and storage space to complete training. In the current context, researchers have begun to explore in depth the application of special quantum properties to multi-category classification problems. To address this challenge, the latest research has proposed two innovative solutions: one is the quantum convolutional neural network [8] based on purely variational quantum circuits, and the other is the quantum K-nearest-neighbour algorithm [9] using a divide-and-conquer strategy, both of which aim to enhance the classification effect through the unique advantages of quantum computing. At the same time, researchers are beginning to focus on how to exploit the special quantum properties of quantum neural networks (QNNs) to solve multiclass classification problems, tapping into their potential value in the field.

QNNs [10, 11] are based on the principle of superposition of states in quantum theory, and have a powerful parallel processing capability to handle large-scale datasets. QNNs utilize the properties of entanglement, superposition, and unitary evolution among quantum states to exhibit a data processing capability that is unmatched by classical neural networks. Currently, QNNs are mainly quantum–classical hybrid algorithms that combine classical neural networks with quantum algorithms when dealing with multi-category classification problems [12, 13]. Meanwhile, some quantum algorithms such as quantum machine learning [14], SQFNN [15], and QuantumFlow [16] have also been proposed for solving multi-category classification problems. However, the classification accuracy of these methods still falls short compared to traditional deep learning methods. Meanwhile, these models do not provide practical solutions to poor scalability and indicate that they will be addressed in future research. The poor scalability will limit the application scope and practical application effect of the models.

To solve the problems of low classification accuracy and poor scalability of existing QNN models in solving multi-category classification problems, we start from the feature that each brain region of the brain has different structures, combine the principle that human-specific sensation and intuition enter the brain as input information and mobilize neurons in multiple brain regions in real time for collaborative processing [17, 18], and make use of the brain's attentional mechanism [19, 20] to integrate the processed results to make adaptive responses. This inspired us to design a quantum multi-category classifier model. The model exhibits better scalability by adjusting the number of heterogeneous QNN clusters and the number of stacked layers of parameterized quantum circuits (PQCs). The heterogeneous QNN clusters process the information simultaneously, which adequately simulates the brain's ability to utilize multiple brain regions to work together and maintain model robustness. By utilizing the brain's attentional mechanism to integrate the processing results of the heterogeneous QNN clusters, the classification accuracy is improved. We have designed a quantum classifier inspired by the subtle mechanisms of the human brain for processing information. We aspire to achieve the goal of solving current and future increasingly complex classification problems more efficiently, intelligently, and accurately through this innovative design. With the rapid development of technology, we are facing diverse and increasingly complex classification challenges such as image recognition and natural language processing. These challenges require us to deal with huge datasets and complex patterns, and traditional classification methods have gradually become inadequate. Therefore, cutting from the perspective of the human brain processing information and combining the intrinsic connection between quantum thinking and the human brain, we design quantum classifiers not only to cope with these complex problems but also to improve the robustness and scalability of the models. In conclusion, the quantum classifier we designed is an in-depth exploration and reference to the mechanism of the human brain processing information, and it represents our new attempt and breakthrough in solving complex classification problems. We conduct classification simulation experiments on different datasets, and the results show that our proposed classifier exhibits strong robustness and scalability. Its classification accuracy improves by up to about 5% compared to other quantum multiclassification algorithms for different data subsets of MNIST.

2. Method

2.1. Quantum multi-category classifier modeling framework

Our multi-category classifier model is carefully designed from the perspective of the brain processing information. We are deeply inspired by the process of information processing in the brain, which requires the mobilization of neurons from multiple brain regions to work together in real-time. Each brain region possesses a unique morphological structure, and when information is received, they will participate in the processing at the same time, forming a kind of information processing network across brain regions [21, 22]. Finally, these brain regions integrate the processing results and adjust accordingly [23, 24]. This heuristic process provides a unique perspective for us to design our model. We designed a novel model that simulates the process of the brain processing information and reacting accordingly in a simplified manner for efficiently solving multi-category classification problems. The core of this model utilizes a set of heterogeneous QNN clusters, each of which simulates a different brain region of the brain, and the model runs the heterogeneous QNN clusters executing in parallel on different quantum computers. When the model receives information, these heterogeneous clusters of QNNs will participate in processing at the same time, as if multiple brain regions in the brain work together. To further simulate the working mechanism of the brain, our model introduces an attention mechanism. This mechanism allows the model to focus on the key parts of the processing results and thus make the final classification decision based on these results. This is like when a person sees a wolf in the forest, the brain will quickly process the information of this environmental scene and make the 'Wolf, run!' response. The analogy of the whole process is shown in figure 1. We chose to use a heterogeneous cluster of QNNs to model different brain regions for the following reasons.

Figure 1. Refer to the following caption and surrounding text. — **Figure 1.** When a wolf is seen in the forest, the brain calls different brain regions to work together and integrate the information to make the response of 'Wolf, run'. We use a heterogeneous cluster of QNNs to simulate the different brain regions, the data is processed by the heterogeneous cluster of QNNs, and the attention mechanism is utilized to produce the final result.
Download figure:
Standard image High-resolution image

First, it has been shown that the brain's processing of information may be related to quantum consciousness or quantum thinking [25, 26], which makes quantum systems the microscopic basis of all physical processes, and should likewise be the basis of the brain's information processing. The role of the brain stems from complex system dynamics that go beyond what can be described by traditional artificial neural networks [27, 28]. In contrast, quantum systems have similar dynamics to biological neural networks [29, 30], and therefore, QNNs can better simulate the information processing process of the brain. Secondly, the concept of quantum system goes beyond the division of matter concerning particle-wave interactions as well as fields, which integrally manifests itself as an indivisible parallel distributed processing system [31, 32], and is one of the more suitable candidates to be used to describe the holistic nature of the brain's action. Further, with the development of computer hardware, as the line widths of integrated circuits continue to shrink, Moore's Law [33] will fail and quantum effects will emerge and begin to affect the normal motion of electrons. At this point, we will have to involve quantum computing. At the same time, the quantum promotion of neural computing is also an inevitable trend in the development of neural networks, as there are certain essential connections between the two.

In summary, it is very appropriate to use heterogeneous clusters of QNNs to simulate different brain regions of the brain. They not only have the potential to simulate the information processing process in the brain but also can overcome the limitations of traditional neural networks in handling complex tasks. This is an important insight for the future use of QNNs to solve practical problems. In the following, we describe our proposed model in detail from three important processes: the design of heterogeneous QNN clusters, the processing of information by the model, and the integration and classification realization of the processing results, respectively.

2.2. Heterogeneous QNN swarm design

QNN is a neural network model for computation based on quantum mechanical principles. As early as 1995, Subhash Kak and Ron Chrisley presented their ideas about quantum neural computation [34], where they argued that quantum effects play a role in cognitive functions. Currently, most QNNs are developed as feed-forward neural networks [10], which have a similar structure to classical neural networks. Such networks replace classical neurons by using qubits to produce neurons that can be superimposed in 'excited' and 'resting' states. Each layer of qubits evaluates the information received and passes the output to the next layer, which ultimately leads to the final layer of the QNN. The key difference between a QNN and a classical neural network is how information is passed between the layers. In a classical neural network, the results of operations in the previous layer are copied to the next layer. However, in QNNs, the copying operation cannot be accomplished due to the limitation of the no-cloning theorem of quantum mechanics [35]. To solve this problem, one solution is to replace the classical output method with a unitary matrix. This unitary matrix extends the output of the current layer into the next layer, and the whole process does not require a copy operation, which meets the quantum operation requirement of reversibility. Similar to other quantum machine learning algorithms, the information is first encoded into a quantum state in Hilbert space through a quantum state preparation procedure of feature mapping. The feature mapping procedure does not require training or optimization. Once the information is encoded into a quantum state, a model called a PQC is applied. This model contains parameterized quantum gates operation optimized for a specific task, which acts on the quantum state of the information so that its output is expanded layer by layer to the final layer. The final output is obtained by measuring the quantum circuit. Before the measurements are passed to the loss function, they are usually converted into labels or predictions by classical post-processing. The framework of the whole model is shown in figure 1. The framework diagram of the QNN is as follows: we elaborate the design of the heterogeneous QNN clusters from three aspects, namely, quantum encoding circuit, PQCs, and quantum circuit measurement, respectively.

2.2.1. Quantum encoding circuit

The core task of a quantum encoding circuit is to map classical information from vector space features to quantum states in Hilbert space. In the realization process, we mainly use quantum encoding to encode information into the amplitude or phase of qubits. Currently, the main encoding methods [36] include base encoding, rotational encoding, amplitude encoding, repeated amplitude encoding, and coherent state encoding. The choice of which encoding method to implement feature mapping usually depends on the experience of the model designer, the characteristics of the original data, the number of bits in the quantum computer, and the decoherence time of the quantum system [37–39]. To meet the requirements of the validation experiments we designed, for different data types, we chose rotational encoding or amplitude encoding to realize feature mapping according to the characteristics of the experimental dataset. Examples of amplitude encoding and rotational encoding are shown as A in figure 2.

Figure 2. Refer to the following caption and surrounding text. — **Figure 2.** The core part of the QNN framework consists of three main parts, which are shown in B. The first part is feature mapping. This stage is generally realized using quantum encoding to map classical information features to quantum states in Hilbert space. In this paper, depending on the requirements of the validation experiments we have designed, we have used either rotational encoding or amplitude encoding to realize this step, which is shown in A. The second part is a PQC consisting of trainable quantum gates with parameters, which is shown in C. These quantum gates will continuously adjust their parameters during the training process to find the optimal combinations for the precise evolution of the quantum states. The third part is the measurement of the quantum circuit. This step converts the outputs of the quantum circuits into classical outputs so that we can evaluate the similarity of the outputs to the target expectation. The QNN framework can efficiently convert classical data into quantum states with parameter optimization and measurement through the above three steps, and ultimately achieve quantum processing of classical data.
Download figure:
Standard image High-resolution image

2.2.2. PQCs

PQCs [40–42], as an important component of QNNs, usually consist of sequences of quantum gates with parameters and sequences of multi-qubit controlled gates without parameters. PQCs with parameterized quantum gates are trainable, and multi-qubit controlled gates are mainly accomplished to generate entanglement among different qubits and are untrainable. Parameterized quantum gates are generally chosen ${\textrm{e}^{- \textrm{i}\theta G/2}}(G = \{X, Y\} )$ and two-bit gates $U = {\textrm{e}^{\textrm{i}\theta (Y \otimes Y)}}$ and ${U_1} = {\textrm{e}^{\textrm{i}\theta (Z \otimes Z)}}$ form the basis of parameterized quantum gates, and $X,Y,Y \otimes Y,Z \otimes Z$ is unitary matrix, which meets the requirements of quantum system operation and evolution. In a heterogeneous cluster of QNNs, the uniqueness of each QNN is mainly derived from the different PQCs. The diversity of these circuits is realized by choosing different parameterized quantum gates in the design as well as making different combinations.

2.2.3. Quantum circuit measurement

There are various ways to measure qubits, and according to the model we designed, we used a special measurement method to integrate with the whole idea of classification. The process of this measurement method is to measure each qubit state on a standard basis (Z-basis). The process of measurement on the standard basis involves projecting a qubit from a superposition state to the standard basis states $\left| 0 \right\rangle$ and $\left| 1 \right\rangle$ . The measurement causes the state of the qubit to collapse to a fixed standard basis state $\left| 0 \right\rangle$ or $\left| 1 \right\rangle$ . We record the probability of the state collapsing to $\left| 0 \right\rangle$ and $\left| 1 \right\rangle$ , and then subtract the absolute value of the sum of the probabilities of the state collapsing to $\left| 0 \right\rangle$ from the sum of the probabilities of the state collapsing to $\left| 1 \right\rangle$ . Let the QNN consist of m qubits, then according to our quantum circuit measurement method, the output of the QNN after processing the information is $M = \left| {\sum\nolimits_{j = 0}^m {{p_j}(1)} {- }\sum\nolimits_{j = 0}^m {{p_j}(0)} } \right|$ .

2.3. Processing of information

The processing of information can be viewed as the continuous optimization of the parameters $\vec{\theta}$ in the QNN and ultimately makes the output result of the model the same as the expected value or the error between the two is within the allowable range, the process can be viewed as the training process of the model, and we mainly realize it through the back-propagation method, the core of which is the computation of the parameter gradient, according to which to achieve the rapid convergence of the QNN model. The gradients of the parameters in the model are mainly estimated using the shift rule of quantum circuits [43]. The parameter $\vec{\theta}$ ( $\vec{\theta} = (\theta {}_0,\theta {}_1, \ldots ,\theta {}_{n - 2},\theta {}_{n - 1})$ ) in the model is used as the training parameter, and the general process is to parameterize the quantum circuit $W(\vec{\theta} )$ to evolve the input quantum state $\left| {{\varphi _\textrm{in}}} \right\rangle$ into the desired output quantum state $\left| {{\varphi _\textrm{out}}} \right\rangle$ , i.e. $\left| {{\varphi _\textrm{out}}} \right\rangle = W(\vec{\theta} )\left| {{\varphi _\textrm{in}}} \right\rangle$ . The main principle of parameter optimization is that the parameter $\vec{\theta} = (\theta {}_0,\theta {}_1, \ldots ,\theta {}_{n - 2},\theta {}_{n - 1})$ corresponds to the objective value of: $f(\vec{\theta} ) =$ $\left\langle {{\varphi _\textrm{in}}} \right|{W^\dagger }(\vec{\theta} )MW(\vec{\theta} )\left| {{\varphi _\textrm{in}}} \right\rangle$ . The loss function can be expressed by the following formula: $C(\vec{\theta} ) =$ $\sum\nolimits_{i = 0}^{n - 1} {f(\left\langle {{\varphi _\textrm{in}}} \right|{U^\dagger }({x_i}){W^\dagger }(\vec{\theta} ){y_i}W(\vec{\theta} )U({x_i})\left| {{\varphi _\textrm{in}}} \right\rangle ,{L_i})}$ . Estimation of the parameter gradient according to the shift method can be realized by the formula: $\frac{{\partial C({\theta _i})}}{{\partial {\theta _i}}} = \left[ {C({\theta _i} + \frac{\pi }{4}) - C({\theta _i} - \frac{\pi }{4})} \right]$ .

2.4. Results integration and classification realization

As we all know, attention is the reaction of perception according to the brain after processing information, mainly focusing on one stimulus, thought, or behavior, while ignoring others, i.e. the brain only goes to respond to the stimulus information that it feels important [44, 45]. Analogous to this principle, we adopt the following approach for the integration and classification implementation of the processing results of heterogeneous QNNs. Assuming that the heterogeneous QNN swarm consists of k heterogeneous QNNs, the output results of the whole multi-category classifier are ${M_1},{M_2},{\ldots},{M_k}$ , according to the principle that the brain only responds to important stimuli in the end, we use one-hot [46, 47] function to encode the real labels into a hotspot format. The hotspot format is a method for converting categorical variables into a machine-readable format. Specifically, the hotspot format represents each category by creating a binary vector whose length is equal to the total number of categories, where the position of one element representing a particular category is set to 1 and the rest of the positions are set to 0. For example, for a five-category problem, the Categorical Variables can be converted to the hotspot format as [1,0,0,0,0], [0,1,0,0,0], [0,0,1,0,0], [0, 0,0,1,0], [0,0,0,0,1]. So that we can simulate the process and can focus the final result to the point where the component is 1. The specific process is to take the output of the multiclass classifier ${M_1},{M_2},{\ldots},{M_k}$ by Softmax [48, 49] function after the Softmax function, which will be used as the final output, through the above processing, it can be simulated with the brain's attentional mechanism to form a simulation, and the final classification decision is decided by the output with the largest probability. Such an encoding not only accurately represents the information of each category, and enables the model to focus more on learning the details of classification during the training process. Loss function we choose the cross-entropy loss, Softmax function maps the outputs of the multi-category classifier ${M_1},{M_2},{\ldots},{M_n}$ into a vector $\hat y$ , $\hat y = Soft\max ({M_1},{M_2},{\ldots},{M_n})$ , which we can think of as the estimated conditional probability of each category for an arbitrary sample x. Suppose the whole dataset $\{X,Y\}$ has m samples, where the sample with index i is composed of the feature vector ${x^i}$ and the corresponding unique label vector ${y^i}$ . The corresponding true label vector y for any x is the true label vector. Then for any x corresponding to the true label y and the result $\hat y$ predicted by the classifier, we define the cross-entropy loss function as: $l(y,\hat y) = - \sum\nolimits_{i = 1}^n {{y_i}\log {{\hat y}_i}}$ . In this way, the loss function can be written as: $l(y,\hat y) = - \sum\nolimits_{i = 1}^n {{y_i}\log {{\hat y}_i}}$ . According to the definition of Softmax, the loss function $l(y,\hat y) = - \sum\nolimits_{i = 1}^n {{y_i}\log {{\hat y}_i}}$ can be expanded as: $l(y,\hat y) = - \sum\nolimits_{i = 1}^n {{y_i}\log \frac{{{\textrm{e}^{{o_i}}}}}{{\sum\nolimits_{c = 1}^n {{\textrm{e}^{{o_c}}}} }}} = \sum\nolimits_{i = 1}^n {\log \sum\nolimits_{c = 1}^n {{\textrm{e}^{{o_c}}}} - \sum\nolimits_{i = 1}^n {{y_i}{o_i}} } = \log \sum\nolimits_{c = 1}^n {{\textrm{e}^{{o_c}}}} - \sum\nolimits_{i = 1}^n {{y_i}{o_i}}$ . The loss function is derived for any prediction ${o_i}$ : $\frac{{\partial l(y,\hat y)}}{{\partial {o_i}}} = \frac{{{\textrm{e}^{{o_i}}}}}{{\sum\nolimits_{c = 1}^n {{\textrm{e}^{{o_c}}}} }} - {y_i} = Soft\max {(o)_i} - {y_i}$ , and from the above equation, the derivative is the difference between the final output of the multiclass classifier and the true value. In this way, the integration of the results and the classification realization is achieved.

Relevant studies [50] show that QNNs exhibit significant advantages and broad application prospects compared with classical computing. Through theoretical derivation, study [38] has explored the computational advantages of QNNs in-depth, while cutting-edge studies [51, 52] have further elucidated the superiority of QNNs over classical neural networks in terms of modeling capabilities. In particular, QNNs are capable of achieving higher effective dimensions and faster reduction of loss values during training, leading to a better fit for the data. Thus, QNNs provide new ways to solve the memory bottleneck problem in classical machine learning. It is based on QNNs that our model is designed. Through quantum encoding techniques, we map the data to Hilbert space and use parametric quantum circuits to achieve efficient information processing. In QNNs, parameters play a key role in regulating the states of qubits, and by optimizing these parameters, we can significantly improve the performance of the model. Compared with classical neural networks, QNNs have the advantages of fewer parameters, lower computational resource requirements, faster training speed, and less risk of overfitting. This is mainly due to the superposition and entanglement properties of qubits, which enable QNNs to express complex functional relationships with fewer parameters. Particularly, in the data encoding stage, we use amplitude encoding for data with large feature dimensions. Our model adopts a multiplexed QNN architecture, with each channel containing m parameters, consisting of a total of p channels, and the total number of parameters is $m*p$ . In terms of qubit requirements, the number of qubits required for amplitude encoding is $p*\log n$ if a parallel structure is used, and only $\log n$ qubits are required if a non-parallel structure is used. In summary, our model takes full advantage of QNNs, which not only perform well in terms of parameters and computational resources but also have a significant advantage in processing high-dimensional feature data. This makes our model show great potential in classification task solving and opens up a new path for future development in the field of machine learning.

3. Experimental design and results

To fully validate the accuracy and effectiveness of our model, we have selected several real-world classification problems for our experiments. All experiments are conducted based on publicly available datasets, and a desktop computer equipped with Intel(R) Core(TM) i7-8700 K CPU @ 3.70GHz processor and 32.0GB RAM is used as the hardware environment. In the experimental setup of this paper, CategoricalCrossentropy is chosen for the loss function, Adam is chosen for the optimizer, the batch size is 64, and the learning rate is taken to be 0.001 by default. These hyperparameters can be set according to one's own practical needs. For example, the optimizer can choose Adam, SGD, RMSprop, and Nadam, and the loss function can choose MSLE, MAE, BinaryCrossentropy, etc. The PQC of the five-way heterogeneous QNN consists of a stack of ${R_x} + {R_x},{R_x} + {R_y},{R_y} + {R_x},{R_y} + {R_y}, U + {U_1}$ plus CNOT respectively. Five different PQCs are shown in C of figure 2, representing Layer 1 through Layer 5. Experimenters can design QNNs with different structures to build the model according to the specific data characteristics when using the model we provided for real problem-solving.

Experiment 1. Whether the model exhibits superior performance in terms of classification accuracy. For this purpose, we chose the challenging MNIST public dataset and validated it in simulations with the help of the quantum simulator TensorCircuit. The reason for choosing the MNIST public dataset is that it has become one of the benchmarks used by researchers in the fields of Machine Learning, Machine Vision, Artificial Intelligence, and Deep Learning to measure the accuracy of classification algorithms. To better complete the integration of the final results and classification implementation, the labels are first preprocessed, the preprocessing process is based on the number of categories, and the labels are mapped to 0 to 4 respectively, as the second classification, the labels are mapped to 0,1; if it is the third classification, the labels are mapped to 0,1,2; and so on; according to the principle of one-hot encoding and the brain's attentional mechanism, one of the one-hot encode component of 1 is the processing result of focusing attention on the information in this category. The dimension of the data in the MNIST public dataset is extended from 28*28 to 32*32 to meet the requirement of amplitude encoding. We design a model consisting of five-way heterogeneous QNNs to accomplish two to five classifications of subsets (3,6), (3,8), (3,9), (0,3,9), (1,3,6), (0,3,6,9), (0,1,3,6,9), (0,1,2,3,4) of the MNIST dataset. First, we need to map the labels of the subsets, e.g. for a subset (3,6) we map its label to (0,1) so that the result after one-hot encoding is 10 000 and 01 000, for a subset (0,1,3,6,9) we map its label to (0,1,2,3,4) and the result after one-hot encoding is 10 000 01 000, 00 100 00 01 000 001. The classification results are compared with the method mentioned in QNN for [15] and the classification accuracy on different classifiers is shown in the table 1.

Table 1. Performance of SMP, SQFNN, QuantumFlow, MLP and Our method on classification accuracy of MNIST data subsets (3, 6),(8),(9),(0, 3, 9),(1, 3, 6),(0, 3, 6, 9),(0, 1, 3, 6, 9),(0, 1, 3, 6, 9),(0, 1, 2, 3, 4).

Dataset	SMP	SQFNN	QuantumFlow	MLP	Our method
(3,6)	97.97	97.89	97.63	98.79	98.58
(3,8)	86.98	89.66	87.20	93.71	95.97
(3,9)	93.66	95.12	95.56	95.13	97.23
(0,3,9)	92.16	91.98	90.39	94.35	97.30
(1,3,6)	90.69	91.59	92.28	96.38	98.49
(0,3,6,9)	95.95	96.03	93.62	97.84	96.07
(0,1,3,6,9)	95.34	95.22	92.67	97.97	96.58
(0,1,2,3,4)	94.57	94.05	90.26	97.93	96.26

From the validation results of the simulations, it can be seen that the classification accuracy of our proposed model exceeds that of other quantum classification methods by up to about 5% on different subsets of the MNIST dataset, making it the state-of-the-art simulation result, and exceeds that of the classical binary-weighted MLP algorithm(a classical computer multilevel perceptron with binary weights) on some subsets of the data ((3,8),(3,9),(0,3,9),(1,3,6)).

Experiment 2. Whether the model has excellent robustness. Our model has robustness. Even if one or several QNNs fail to work properly during information processing, or the corresponding quantum computer fails, the model is still able to continue to carry out the normal classification task. To better verify the robustness of the multiclass classifier, we assume the following cases: for the two-classification problem, three QNNs cannot work or the corresponding quantum computer fails; for the three-classification problem, two QNNs cannot work or the quantum computer fails; for the four-classification problem, one QNN cannot work or the quantum computer fails. In this case, we only need to modify the corresponding labeling mappings so that only a few heterogeneous QNNs in the model corresponding to a few classes of problems work. For example, for the two-classification problem of (3,6), we can map its labels to 10 and 01, which correspond to two working QNNs, respectively; for the three-classification problem of (0,3,9), we can map its labels to 100, 010 and 100, which correspond to three working QNNs, respectively. In this way, we can verify the robustness and reliability of the model in case of partial component failure. The classification results are compared with the method mentioned in QNN for [15] for the subsets of MNIST dataset (3,6), (3,8), (3,9), (0,3,9), (1,3,6), (0,3,6,9) on different classifiers with the classification accuracies are shown in the table 2.

Table 2. One or more QNN do not work properly during information processing or the corresponding quantum computer fails and the model continues to perform the normal classification task. Comparison of performance with SMP, SQFNN, QuantumFlow, MLP in terms of classification accuracy on subsets of MNIST data (3, 6),(8),(9),(0, 3, 9),(1, 3, 6),(0, 3, 6, 9).

Dataset	SMP	SQFNN	QuantumFlow	MLP	Our method
(3,6)	97.97	97.89	97.63	98.79	98.83
(3,8)	86.98	89.66	87.20	93.71	95.56
(3,9)	93.66	95.12	95.56	95.13	97.18
(0,3,9)	92.16	91.98	90.39	94.35	97.43
(1,3,6)	90.69	91.59	92.28	96.38	98.23
(0,3,6,9)	95.95	96.03	93.62	97.84	96.90

As can be seen from the validation results of the experiments, combined with Experiment 1, as long as the number of channels of the heterogeneous QNN cluster included in the model is greater than or equal to the number of categories to be categorized, the model can complete the categorization task properly, and the categorization accuracy is still high. This point fully verifies the strong robustness of our proposed model.

Experiment 3. Whether the model exhibits good scalability. Good scalability includes the scalability of the model in dealing with different subsets of the same dataset with different numbers of categories, the scalability of the same model on different datasets, and the scalability of the model size. In the following, we validate each of the three aspects. It can also be demonstrated in Experiment 1 that our model can be used to deal with classification problems where the number of classification categories is lower than the number of heterogeneous neural network channels. The model we designed can handle the classification problem on a subset of different numbers of categories without any adjustment, and the classification results are better. To better validate the scalability of the model. 1. We directly validate the designed model in Experiment 1 by extending it to the Fashion MNIST dataset. Fashion MNIST acts as an easy replacement for the classical MNIST dataset, and Fashion MNIST will be more challenging than regular MNIST handwritten data. 2. We validate the model by extending the heterogeneous QNN cluster of PQCs in the layer from the original 3 layers to 5 layers at the same time, to expand the model scale. Still, finish the two to five classification of subsets (3,6),(3,8), (3,9), (0,3,9), (1,3,6), (0,3,6,9), (0,1,3,6,9),(0,1,2,3,4), of Fashion MNIST dataset. The effect is shown in the table 3.

Table 3. The performance of our method setup PQCs with layers 3 and 5 in terms of classification accuracy on subsets of Fashion MNIST data (3, 6), (3,8), (3,9), (0, 3, 9), (1, 3, 6), (0, 3, 6, 9), (0, 1, 3, 6, 9), (0, 1, 3, 6, 9), (0,1,2,3,4).

Dataset	Our method(Layer = 3)	Our method(Layer = 5)
(3,6)	89.85	92.30
(3,8)	98.55	98.65
(3,9)	99.90	99.90
(0,3,9)	93.67	95.13
(1,3,6)	89.57	91.57
(0,3,6,9)	85.65	86.87
(0,1,3,6,9)	85.54	87.16
(0,1,2,3,4)	82.02	83.74

Experiments show that our model can be directly scaled to other datasets for training and completes the classification problem well. Depending on the difficulty of the dataset, the whole model can be scaled up by expanding the number of layers of the PQCs to meet the practical needs. As the model size increases, the classification accuracy on all subsets is improved.

To further verify the generalization ability of the model, we still use the model in 1 to test it on the public datasets cancer, iris, and wine provided by sklearn. according to the data features, the cancer dataset and wine dataset still use amplitude encoding, and the iris dataset has only four features, and we use rotational encoding. The effect is shown in the table 4.

Table 4. The performance of our method setup parameterizing the number of quantum circuit layers as 5 and 8 in terms of classification accuracy on the publicly available dataset Breast Cancer, Iris,Wine.

Dataset	Our method(Layer = 5)	Our method(Layer = 8)
Cancer	93.86	94.74
Iris	90.00	96.67
Wine	83.80	96.92

According to the results of simulation validation, the model we designed can complete the classification problems on different datasets while maintaining high classification accuracy. The model has excellent scalability and can complete the corresponding classification problems by adjusting the number of heterogeneous QNN clusters in the model and changing the number of stacked layers of PQCs according to the number of categories and data characteristics of multiple categories without the need to redesign the model.

In summary, through simulation validation on the quantum simulator TensorCircuit, we demonstrate that our model performs well in terms of classification accuracy, outperforming other quantum classification methods by about 5% or so, and even exceeding the classical MLP algorithm in some data subsets. In addition, our model has strong robustness, and even if some of the heterogeneous QNNs have problems, the model can still operate normally. Finally, our model has excellent scalability, not only can it be directly ported to other datasets for normal operation, but also can satisfy the classification accuracy by simply adjusting the number of layers of the heterogeneous QNN according to the data features. These advantages make our model a powerful tool for solving multi-category classification problems.

4. Discussion

The complexity of the real world determines that multi-category classification is an important problem in the field of machine learning. Most of the traditional multi-category classification methods are based on deep learning, which requires a large amount of data and computational resources and also suffers from problems such as overfitting. In contrast, quantum machine learning, as a product of the combination of quantum computing and machine learning, has advantages such as accelerating computation and reducing overfitting. Among them, QNNs play an important role in the field of quantum machine learning. However, at this stage, QNNs have also been proposed for solving multi-class classification problems, but the classification effect and scalability are relatively poor. We designed a quantum multi-category classifier inspired by the process of the brain processing information. To the best of our knowledge, we have designed a quantum multi-category classifier from this perspective for the first time, and the model has outstanding performance in solving realistic problems using QNNs. Finally, we perform classification simulations on different datasets to verify that our proposed model shows strong robustness and scalability, and the classification accuracy on MNIST data far exceeds that of other quantum multiclassification methods, making it the state-of-the-art simulation result. In conclusion, our work's idea of designing methods to solve practical problems from the perspective of the process of the brain processing information is of scientific revelation to other fields to solve practical problems, and at the same time proposes new ideas and references for solving practical problems by utilizing the advantages of QNNs. Although our model has achieved remarkable results on quantum simulators, it still faces challenges when dealing with large-scale data. Firstly, the computational resources required by the quantum simulator increase dramatically as the data size increases, especially when dealing with high-dimensional data, where its computational power may be severely limited. Second, as the complexity of the model increases, the difficulty of maintaining high-precision quantum state simulations also increases, and numerical errors and computational instability may lead to a decrease in the accuracy of the simulation results. When turning to real quantum computers, we face some of the same challenges. Data encoding and limitations on the number of qubits are among the main issues, which directly affect the amount of data that can be processed and the size of the algorithms. At the same time, the depth of the quantum lines and hardware noise also have a significant impact on the performance of the algorithm. While deeper lines can improve the expressive power of the model, they also increase the risk of error propagation and computational complexity. As for parallel work on quantum computers, despite the theoretical potential to increase the overall computational power, in reality, it faces many challenges such as synchronization, error rates, and quantum state transport, which are currently difficult to achieve. Despite these challenges, we are confident in our model. With the rapid development of quantum technology, the improvement of quantum hardware performance, and the improvement of software tools, we believe that these limitations will be gradually overcome. The potential of our model in dealing with practical classification problems will be further exploited, opening up broader prospects for future quantum computing applications.

5. Conclusion

The brain, with its remarkable parallel processing capability and adaptive learning properties, has provided us with inspiration for solving complex classification problems. Inspired by this, we put special emphasis on the parallel processing of information and attention mechanisms when designing quantum classifiers, driving us to create quantum classifiers that can deeply understand and process complex information. By simply simulating the subtle process of the brain in processing information, quantum classifiers can reveal the intrinsic laws and patterns of the data more accurately, thus achieving more efficient and precise solutions to complex classification problems. After rigorous simulation verification, our proposed quantum classifier demonstrates excellent robustness and scalability, and its classification accuracy significantly outperforms other quantum multi-classification methods, making it the state-of-the-art simulation result. This means that even when faced with complex environments such as noise, interference, or data changes, our quantum classifier still maintains stable performance and shows strong adaptability. Meanwhile, with the surge of data volume and the increasing complexity of the task, the scale of the QNN can be scaled up by changing the single-layer structure and the number of stacked layers of the PQCs, and the scale of the model can be scaled up by adding differently structured QNNs. By scaling up, the model provides powerful support for dealing with larger-scale problems. This is not only a reflection of our pursuit of technological progress, but also a profound extension and new expansion of human intelligence. The quantum classifier we designed is not only an in-depth exploration of and reference to the mechanism of information processing in the human brain but also a bold attempt and breakthrough in solving complex classification problems. Looking ahead, the quantum classifiers we designed will continue to serve as an in-depth exploration of and reference to the information processing mechanism of the human brain, leading us to achieve breakthroughs in solving complex classification problems. We firmly believe that with the continuous progress and development of quantum computing technology, our quantum classifiers will play an even more important role in future technological progress and social development, and bring more innovation and value to mankind.

Acknowledgments

The authors acknowledge the financial support from Major Science and Technology Projects in Henan Province,China, Grant No.:221100210600.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).

Conflict of interest

The authors have no conflicts to disclose.

Author contributions

Xiaodong Ding and Jinchen Xu conceived the idea and the experiments. Zheng Shan supervised the work and improved the idea and experiment design. Xiaodong Ding conducted the experiments. Zhihui Song, Yifan Hou analyzed the results. All authors reviewed the manuscript.

Dates