Cheng 2013

This document summarizes a study that used machine learning techniques and chemometric analysis to evaluate Bupleuri Radix (Chaihu), a traditional Chinese herbal medicine, through high-performance thin-layer chromatography (HPTLC). 64 Chaihu samples from different Bupleurum species were analyzed by HPTLC to assess bioactive components. Image processing and classification algorithms like SVM and neural networks were used to construct prediction models from HPTLC image features. Ensemble feature selection combined with classifiers enhanced discrimination. Experimental results showed that commercial samples could be readily distinguished and classified.

Analytical Methods
PAPER
Published on 19 September 2013. Downloaded by Christian Albrechts Universitat zu Kiel on 27/10/2014 12:12:59.

Combination of effective machine learning techniques and chemometric analysis for evaluation of Bupleuri Radix through high-performance thin-layer chromatography

Cite this: Anal. Methods, 2013, 5, 6325

Xiaoping Cheng (a), Hongmin Cai* (a), Ping He (b), Yue Zhang (c) and Runtiao Tian (d)

Chaihu (Bupleuri Radix), the root of Bupleurum chinense and B. scorzonerifolium, is a traditional Chinese herbal medicine authenticated in the Chinese Pharmacopoeia. There are also several variations available from local herbal markets, for example, the roots of B. falcatum, B. bicaule, and B. marginatum var. stenophyllum. In the current study, we collected 64 Chaihu samples, including 33 authenticated samples and 31 commercial samples. Test solutions of all the samples were analysed by high-performance thin-layer chromatography (HPTLC) to assess the principal bio-active components (saikosaponins). The HPTLC fluorescent images acquired were analyzed by sophisticated image processing techniques for comprehensive quantification. High dimensional features for both gray-scale and true color images were constructed for the raw images. Classical classification algorithms, including naive Bayes, Support Vector Machine (SVM), K-nearest neighbors, neural network and logistic regression, were used to construct prediction models. To gain an insight into the principal components while evaluating the Chaihu sample, feature selection and ensemble feature selection methods were further combined with the classifiers to enhance the discrimination power. Ensemble feature selection was shown to achieve superior performance. Experimental results demonstrated that the roots of Chaihu from different species of the genus Bupleurum could be readily distinguished so that commercial samples could be easily classified.

Received 9th July 2013
Accepted 19th September 2013
DOI: 10.1039/c3ay41132j
www.rsc.org/methods

(a) School of Computer Science and Engineering, South China University of Technology, Guangdong, P.R. China. E-mail: [email protected]
(b) Division of Science and Technology, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China
(c) School of Zhuhai, Jinan University, Zhuhai, China
(d) ChemMind Technologies Co., Ltd., Beijing, China

This journal is © The Royal Society of Chemistry 2013. Anal. Methods, 2013, 5, 6325–6330

1 Introduction

Chaihu (Bupleuri Radix), the root of Bupleurum chinense and B. scorzonerifolium, is a commonly used Chinese herbal medicine and has been officially recorded in all the editions of the Chinese Pharmacopoeia. In clinical practice in China, it is used in the treatment of fevers, the common cold, headaches, and liver disorders. It has also been employed to increase sweating, to prevent kidney problems, as a liver tonic, and as a spleen and stomach tonic [1]. It has been recognized that the principal bioactive components of Chaihu are saikosaponins [2], with saikosaponins a and d being the dominant ones. In addition to the authentic species of Chaihu such as B. chinense, B. scorzonerifolium and B. falcatum, there are more than 20 other species of the genus Bupleurum which are also habitually utilized as Chaihu in China's markets [3,4]. However, most of the variants possess inferior qualities to those of authenticated Chaihu [5]. Therefore, an automatic and effective method is a necessity to evaluate the quality of these different species of Chaihu. It has been shown in refs. 6 and 7 that HPTLC [8] is more effective in assessing the therapeutic power of herbal medicines than conventional methods, such as high-performance liquid chromatography (HPLC) [9] and thin-layer chromatography (TLC). It allows for higher resolution and thus more accurate quantitative measurements than TLC. With the aid of modern techniques in artificial intelligence, automatic quality assessment systems for herbal medicine with high accuracy have recently been reported [10,11].

In the current study, thirty-three lots of authenticated samples and thirty-one lots of commercial Chaihu were collected and analyzed by HPTLC to get a pictorial description. The purpose of this study was to design and assess an effective discrimination system based on classical machine learning techniques to achieve fully automatic quality evaluation. To achieve this goal, more than 500 attributes were calculated from HPTLC fluorescence images to quantify their chemical profiles comprehensively. Classical classification tools, including naive Bayes (NB) [12], support vector machine (SVM) [13], K-nearest neighbors (KNN) [14], neural network (RBF-NN) [15] and logistic regression [16], were used to discriminate the authentic Chaihu from fake ones [17]. To
further enhance the discrimination power, ensemble feature selection and various feature selection mechanisms were combined with various classifiers. Extensive experiments are reported and analyzed on the performance of the combination of classification tools and HPTLC. The current study demonstrates that the combination of advanced machine learning techniques and HPTLC can assess the quality of different species of Chaihu in an accurate and effective way.

2 Experimental

2.1 HPTLC experimental sample

Sixty-four batches of Chaihu samples were collected from different herbal markets or harvested from various habitats. Among them, thirty-one samples, including B. chinense, B. scorzonerifolium, B. falcatum, B. longiradiatum, B. bicaule and B. marginatum var. stenophyllum, were authenticated by botanists Prof. Z. D. Wang of Henan Science & Technology University, China and Prof. D. Q. Wang of Anhui University of Traditional Chinese Medicine, China.

2.2 HPTLC experiment setup

The chemical reagents for the experiment were obtained from the Guangzhou Chemical Reagent Factory (Guangzhou, China). Chemical reference standards of saikosaponin a and saikosaponin d were provided by the National Institute for the Control of Pharmaceutical and Biological Products (Beijing, China). Chemical references of saikosaponin c, saikosaponin f and saikosaponin b2 were provided by Henan College of Traditional Chinese Medicine, China.

The experimental procedure for the preparation of the HPTLC fingerprint was as follows:

(1) Preparation of sample solution: A 0.3 g portion of powdered herb was added to 20 mL of solution of 0.5% pyridine in methanol to prevent the degradation of saikosaponins a and d. The mixture was refluxed twice in a water bath at 80 °C for 30 minutes and filtered afterwards. The filtrate was evaporated to dryness in a fume cupboard and reconstituted in 3 mL of water before the suspension was applied to a C18 cartridge. After elution with 10 mL of 30% methanol and 20 mL of 80% methanol, successively, the 80% methanol fraction was evaporated to dryness and the residue was dissolved in 2 mL of methanol. The solution was subsequently filtered through a 0.45 μm membrane filter before analysis.

(2) Preparation of reference solution: A 5 mg portion of each saikosaponin reference was dissolved in 5 mL of methanol.

(3) HPTLC chromatographic condition: The sample solutions were applied band-wise via an ATS4 auto-sampler (CAMAG, Muttenz, Switzerland) onto a commercial 20 cm × 10 cm pre-coated HPTLC Silica gel 60-plate (Merck). The sample plate was placed into a desiccator with phosphorus pentoxide and dried under vacuum for 2 hours before development. Fifteen milliliters of mobile phase consisting of dichloromethane–ethyl acetate–methanol–water (30 : 40 : 15 : 3, v/v/v/v) was added into a twin-trough chamber, to saturate it for 15 minutes. The plate in the chamber was developed upward over a path of 8 cm and sprayed with DMAB reagent and heated at 105 °C on a TLC plate heater (CAMAG) until the colour of the saponins was distinct. The fluorescent images were examined at 365 nm by using a UV viewer cabinet (CAMAG). The images were captured by a Digistore 2 documentation system (CAMAG). The excitation wavelength was 366 nm in the reflection mode and the exposure time was 3 seconds.

A sample image obtained following the aforementioned procedures is shown in Fig. 1(a).

3 Pattern analysis for HPTLC

To obtain an effective discrimination system with machine learning techniques, pattern analysis was essential for our study. These procedures are depicted in Fig. 2.

3.1 HPTLC fingerprint images preprocessing

The raw HPTLC fluorescence images have to be preprocessed to standardize the data in order to prevent any side-effects arising from the experiment, such as image shifting and nonuniform lighting. An example is shown in Fig. 1. The proposed preprocessing method consists of two steps. In the first step, the raw images are converted into gray-scale or true color images, and a noise suppression scheme is applied to enhance the image quality. This step facilitates feature extraction and is used for quantification of the image. In the second step, the denoised images are aligned manually so that the tested images are the same size. The head and tail portions of each standardized image were removed as the images were imperfect because of nonuniform lighting.

Fig. 1 Demonstration of preprocessing of HPTLC fingerprint images. (a) A raw HPTLC fingerprint image; (b) gray scale transformed image with histogram equalization; (c) image after alignment to have uniform size.

3.2 HPTLC image feature construction

3.2.1 Feature calculation. Since the intensity of the preprocessed images shows band-wise variations, their averaged intensity profile could be used to quantify the variation. To be specific, the peak and valley values along the curve at particular positions were estimated as feature values. In Fig. 3, the detected peaks and valleys are plotted as stars. In addition, each
bar in the tested image is made up of about 12 pixels and thus the statistical characteristics of each bar were also calculated to serve as feature values. In our experiment, a bar area was defined to be a block with a width of 12 pixels.

Fig. 2 The framework for pattern analysis.

Fig. 3 Peaks and valleys of gray intensities were estimated from the preprocessed image. Red and green stars were used to differentiate the peaks and the valleys for the gray values. The relative position and the amplitude of the peaks and the valleys were recorded to serve as features for quantification of the HPTLC image.

3.2.2 Feature selection. Based on the aforementioned feature calculation method, four feature sets were obtained. The first feature set had a dimension of 504 containing all of the features extracted. The second feature set had a dimension of 258 after removing the statistical characterization of the true color image. The third feature set had a dimension of 176 after removing the bar characterization of both the gray and color images. The fourth with 53 features contained only the characterization of the gray images, including the numbers and densities of both the peaks and valleys, and the statistical characterization of the chromatographic bands.

The obtained feature sets were then processed by classical feature selection schemes to enhance the performance of the classifier by removing redundant features. Four standard methods of feature selection were used in our experiments because of their good performance. The four feature selection methods include Correlation-based Feature Selection (CFS) [18], Chi-square feature evaluation [19], Gain Ratio (GR) attribute evaluation [20] and the RELIEF feature selection method [21]. A short summary follows of the attributes of the tested selection schemes:

CFS evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them.

Chi-square based feature selection evaluates the worth of an attribute by computing the value of the chi-squared statistic with respect to the class.

GR attribute evaluation attempts to evaluate the feature importance by computing an information gain measure given by a feature subset candidate.

The RELIEF algorithm weights features iteratively by adjusting feature importance according to their ability to discriminate between neighboring patterns by maximizing an expected margin through scaling of features.

During our experiments, the chosen feature subsets were those which provided the most satisfactory results.

3.2.3 Ensemble feature selection. To make use of the merits characterized by different features, an ensemble feature selection technique was conducted to find out the compact but rich information conveyed by feature subsets. Ensemble feature selection [22,23] is a popular supervised machine learning method. It first generates an ensemble of classifiers including several base classifiers and then integrates these class predictions by a voting strategy. Traditional feature selection algorithms have the goal of finding out the best feature subset which is bound by both the learning task and the selected inductive learning algorithm, while the task of ensemble feature selection has an additional goal of finding out a set of feature subsets which will promote disagreement among the base classifiers. Ho [22] has shown that simple random selection of feature subsets may be an effective technique for ensemble feature selection. In this method, one randomly selects N* < N features from the N-dimensional training set T and obtains a new N*-dimensional random subspace of the original N-dimensional feature space.

Table 1 Experimental results after various classifiers on feature subset with/without PCA processing. The best results for each scheme are highlighted in italics. The overall performance of the second feature subset was the best. Processing by PCA did not enhance the classification performance obviously

All classifier columns report classification accuracy (%).

Feature subset (#features)   PCA processing   NB      SMO     RBF-NN   KNN     Logistic   Average
I (504)                      No               79.69   93.75   82.81    85.94   85.94      85.62
I (504)                      Yes              82.81   84.38   85.94    87.5    81.25      84.38
II (258)                     No               82.81   89.06   90.63    92.19   87.5       88.49
II (258)                     Yes              82.81   87.5    92.19    85.94   90.63      87.81
III (176)                    No               75      89.06   81.25    87.5    81.25      82.81
III (176)                    Yes              82.81   84.38   87.5     85.94   87.5       85.63
IV (53)                      No               81.25   85.94   87.5     85.94   76.56      83.44
IV (53)                      Yes              81.25   84.38   81.25    90.63   76.56      82.81
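The figures in Table 1 can be read more concretely with a little arithmetic. The sketch below assumes that each accuracy is the fraction of the 64 collected samples that was classified correctly (an assumption, not something the table states) and recomputes the row means from the five classifier columns. The names `TABLE_1`, `correct_count` and `row_mean` are illustrative.

```python
# Table 1 rows as printed: (feature subset, PCA?, accuracies in % for
# NB, SMO, RBF-NN, KNN, Logistic, and the printed Average column).
TABLE_1 = [
    ("I", "No", [79.69, 93.75, 82.81, 85.94, 85.94], 85.62),
    ("I", "Yes", [82.81, 84.38, 85.94, 87.5, 81.25], 84.38),
    ("II", "No", [82.81, 89.06, 90.63, 92.19, 87.5], 88.49),
    ("II", "Yes", [82.81, 87.5, 92.19, 85.94, 90.63], 87.81),
    ("III", "No", [75, 89.06, 81.25, 87.5, 81.25], 82.81),
    ("III", "Yes", [82.81, 84.38, 87.5, 85.94, 87.5], 85.63),
    ("IV", "No", [81.25, 85.94, 87.5, 85.94, 76.56], 83.44),
    ("IV", "Yes", [81.25, 84.38, 81.25, 90.63, 76.56], 82.81),
]

N_SAMPLES = 64  # total number of Chaihu samples reported in the study


def correct_count(accuracy_percent, n=N_SAMPLES):
    """Nearest whole number of correctly classified samples (assumes accuracy = correct / n)."""
    return round(accuracy_percent * n / 100)


def row_mean(accuracies):
    """Arithmetic mean of the per-classifier accuracies in one row."""
    return sum(accuracies) / len(accuracies)


for subset, pca, accs, printed in TABLE_1:
    counts = [correct_count(a) for a in accs]
    print(subset, pca, counts, f"mean={row_mean(accs):.2f}", f"printed={printed}")
```

Under this assumption, for example, 93.75% corresponds to 60 of 64 samples. The recomputed means agree with the printed Average column to within 0.06 percentage points.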

Table 2 Experimental results of combinational performance of feature selection method with various classifiers. The best results for each scheme are highlighted in italics. The overall accuracy of the second feature subsets was far better than that of the other feature subsets

All classifier columns report classification accuracy (%).

Feature set   Method    #Features   NB      SMO     RBF-NN   KNN     Logistic   Average
I             CFS       134         78.13   82.81   84.38    82.81   76.56      80.94
I             Chi       304         75      87.5    81.25    85.94   79.69      81.88
II            GR        20          93.75   85.94   90.63    95.31   89.06      90.94
II            ReliefF   15          87.5    84.38   90.63    92.19   82.81      87.5
III           CFS       23          78.13   82.81   84.38    85.94   84.38      83.13
III           FSE       22          78.13   84.38   82.81    85.94   85.94      83.44
IV            GR        9           79.69   87.5    84.38    84.38   81.25      83.44
IV            ReliefF   12          81.25   81.25   85.94    82.81   76.56      81.56
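As a rough illustration of one of the filter methods listed in Table 2, the chi-squared statistic of each attribute with respect to the class can be computed and used to rank attributes. The sketch below is a minimal, assumption-laden version: it expects features that are already discrete (here hypothetical present/absent indicators), whereas the continuous image features of the study would first need discretization, and the function names are illustrative rather than taken from any toolkit.

```python
from collections import Counter


def chi_squared(feature_values, labels):
    """Chi-squared statistic between one discrete feature and the class label."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    class_counts = Counter(labels)
    joint_counts = Counter(zip(feature_values, labels))
    stat = 0.0
    for f, n_f in feature_counts.items():
        for c, n_c in class_counts.items():
            expected = n_f * n_c / n          # count expected under independence
            observed = joint_counts.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def rank_features(rows, labels):
    """Score every feature column and return (indices sorted best-first, scores)."""
    n_features = len(rows[0])
    scores = [chi_squared([r[j] for r in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: feature 0 follows the class exactly, the others do not.
rows = [(1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 1, 1), (0, 0, 0), (0, 1, 1)]
labels = [1, 1, 1, 0, 0, 0]

order, scores = rank_features(rows, labels)
top_k = order[:1]  # keep the single best-scoring feature
print(order, [round(s, 3) for s in scores], top_k)
```

A selected subset such as `top_k` would then be passed to the classifier of choice; the number of features to keep is a tuning decision.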

Table 3 Experimental results of the performance of ensemble feature selection technology combined with the Libsvm tool for the four feature sets. The overall accuracy of the second feature set is slightly superior to that of the other sets

Feature set (#features)   #Features   Classification accuracy (%)
I (504)                   40          87.5 ± 3.1
II (258)                  30          95.3 ± 1.6
III (176)                 30          90.6 ± 3.1
IV (53)                   20          93.8 ± 3.1

Fig. 4 Experimental results of the classification accuracy of different classifier schemes for different feature sets. Based on different technology for feature selection, the best performance for each classifier was chosen. The ensemble feature selection method combined with the Libsvm tool achieved significant results.

This strategy is repeated k times to build k feature subsets which are then used to construct k base classifiers. Finally, an integrated decision rule is obtained by combining several base classifiers. The integration of an ensemble of classifiers has often been shown to achieve higher accuracy than the most accurate base classifier alone in different real-world tasks. In this study, the ensemble feature selection technique consists of two steps: learning and integration. In the learning phase, based on the simple random selection method and the Libsvm [24] tool, an ensemble of k base classifiers b1, b2, ..., bk is generated. In the integration phase, one achieves class predictions via an ensemble of the base classifiers. This is achieved through three steps: (1) elimination of base classifiers with low classification accuracy; (2) removal of those base classifiers that have identical predictions; (3) integration of the base classifiers with the criterion:

$$\tilde{p} = F\left(\sum u_k v_m\right) \qquad (1)$$

where $\tilde{p}$ is the predicted result after integration, $v_m$ is the predicted result of the base classifier $b_m$, and $u_k$ is its prediction accuracy. The piecewise function $F(\cdot)$ is a threshold (step) function, defined by

$$F(x) = \begin{cases} 1, & x > 0.5 \\ 0, & \text{otherwise} \end{cases}$$

Therefore, the integration scheme gives more weight to base classifiers with high prediction accuracy.

3.3 Classification schemes

The classification was conducted by using classical techniques including naive Bayes [12], support vector machine (SVM) [13] via SMO [25] optimization and Libsvm [24], K-nearest neighbors (KNN) [14], radial basis function neural network (RBF-NN) [15] and logistic regression [16]. For all of the classifiers used in this study, hyper-parameters were estimated via 10-fold cross validation. The coding and simulation experiments were conducted within Matlab (Mathworks Co., Ltd., Boston, MA, USA) and Weka (http://www.cs.waikato.ac.nz/ml/weka/), an open source software for data mining. To eliminate the statistical variations, we conducted ten experiments independently on each feature set and averaged the classification accuracy to determine the performance of a particular feature set. The classifiers used are briefly described below.

3.3.1 Naive Bayes (NB) classifier. The naive Bayes classifier [12] is based on Bayes' theorem and is particularly suited when the dimensionality of the inputs is high. Despite its simplicity, naive Bayes can often outperform many sophisticated classification methods [26]. Depending on the precise nature of the probability model, naive Bayes classifiers can be trained very efficiently in a supervised learning setting.

3.3.2 Support vector machine (SVM). Support vector machine (SVM) [13] is a well-known machine learning technique whose main aim is to construct a separating hyperplane by maximizing a distance margin. For the linearly inseparable problem, the observed sample is first converted into a highly
dimensional feature space by using a non-linear mapping algorithm to make it linearly separable. In this paper, two improved training algorithms for SVM were used, namely the Sequential Minimal Optimization (SMO) [25] classifier and Libsvm [24].

SMO helps to accelerate the solving procedure by breaking a large quadratic programming (QP) problem down into a series of smaller QP problems. SMO improves the scaling and reduces computation time significantly by utilizing the smallest possible QP problems.

In ensemble feature selection, Libsvm is widely used as an efficient SVM tool. There are two steps involved in using Libsvm: (1) the dataset is trained to obtain a model; (2) the model is used to predict the information for the testing dataset. In this paper, a polynomial kernel was used.

3.3.3 K-Nearest neighbors (KNN) classifier. The K-nearest neighbors algorithm [14] was implemented for pattern recognition by using the so-called weighted vote formula to predict the herbal species based on the spatial distances between observation and target vectors.

3.3.4 Radial basis function neural network (RBF-NN) classifier. A neural network based on radial basis functions [15] is an efficient feed-forward neural network [27]. It has best-approximation and global-optimum properties, which other feed-forward networks do not have. It has a simple structure and fast training speed.

3.3.5 Logistic classifier. Logistic [16] is a classifier for building and using a multinomial logistic regression model with a ridge estimator.

3.3.6 Principal component analysis (PCA) [28]. Principal component analysis (PCA) is widely adopted as a preprocessing procedure which uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables, called principal components. In many analyses, the number of principal components which account for most of the variance in the observed variables is significantly less than that of the original variables [29]. In our study, the correlations among the features of the HPTLC images are high and thus PCA is expected to achieve good performance by reducing the redundant features and improving classification performance.

4 Experimental results

In this section, we demonstrate the performance of the various classifiers on the four feature subsets by combining different feature selection with ensemble feature selection methods. The purpose of this section is to show that the fully automated classification models can achieve high accuracy in discrimination of authentic Chaihu samples from fake ones when the raw images were characterized by accurate quantitative measurements.

In the first experiment, the four feature sets were processed independently by PCA to get an economic representation by discarding 5% of the least informative components. The resulting feature representation was then combined with various classifiers to evaluate their discrimination power. The main experimental results are summarized in Table 1. The overall performance of the second feature set was superior to the other feature sets as it reached an accuracy of approximately 90%. Since the color information was omitted in the second feature set, the results imply that removal of the color information can enhance the discrimination power. The possible reason is that the color information may be inaccurate due to imperfect imaging, such as non-uniform lighting. A similar observation was made in the second experiment, in which the classification performance of the first feature set was vastly inferior to the latter ones. Another point worth noting is that the classification accuracy after PCA did not show obvious improvement over that without PCA preprocessing, possibly because of medium feature dimensions.

In the second experiment, various methods for feature selection were first applied to the four feature subsets. The resulting feature subset was then fed into various classification algorithms, including naive Bayes, SVM, RBF-NN, KNN and logistic classifiers. The averaged performance of each classifier on the different feature subsets is summarized in Table 2. The performance of the second feature set was the best in most of the cases. The results were similar to the first experiment. Furthermore, the performance of the classifiers increased markedly after feature selection. For example, the naive Bayes classifier reached an accuracy of 93.75% in comparison with 82.81% which was achieved without feature selection. The accuracy of KNN reached 95.31% with feature selection, while 92.19% was achieved without it. This good performance implies that high accuracy can be obtained by removing redundant information in the feature set.

In the third experiment, the ensemble feature selection method with the base classifier of SVM was tested on the four feature sets and achieved remarkable results. In our experiment, fifteen base classifiers were first constructed for randomly selected features. Extensive experiments were conducted to search for a feature subset with good performance. The performance of the classifier was evaluated via 10-fold cross-validation. Experimental results are summarized in Table 3. The second and fourth feature sets produced the optimal and suboptimal results, respectively. Similar to the previous two experiments, classification of the second feature set was the best and an accuracy of 95.3% was achieved.

In order to compare the performance of the different classifiers via different feature selection technologies, Fig. 4 was plotted. As shown in Fig. 4, the ensemble feature selection method combined with the Libsvm tool achieved a significantly superior accuracy for classification in comparison with the other methods.

5 Conclusion

HPTLC has been shown to be promising for the development of chromatographic fingerprint profiling methods to determine complex herb extracts. The pictorial nature of an HPTLC image provides extra intuitively visible measurements for assessing its chemical characteristics. However, quantitative image analysis of HPTLC remains open as well as its clinical potential. Besides,
various contents of saikosaponins among different samples of Chaihu species were observed, which calls for not only assessing the clinical quality by analyzing the multiple marker components individually but also recognizing the entire fingerprint pattern for consistency assurance and authentication purposes.

In the current study, various techniques for machine learning and image analysis were combined to evaluate the chemical quality of Bupleuri Radix through HPTLC. Four inherent feature subsets were first derived to quantify the pictorial characteristics of the HPTLC image. In order to test the discrimination potential of the derived features, various standard machine learning schemes were used. Various feature selection methods, including filter schemes and ensemble schemes combined with advanced classifiers, were carried out to assess the fingerprint pattern. Experimental results have confirmed the high accuracy in discriminating various samples of Chaihu species. This study has revealed a promising way for classifying the intrinsic inconsistency of herbal quality when the distribution of principal ingredients in the herb varies from one batch to another.

Acknowledgements

This work was supported by NSFC under award numbers 60902076 and 61372141, and the Fundamental Research Funds for the Central Universities under award number 2013ZM0079.

References

1 C. P. Commission, Pharmacopoeia of the People's Republic of China, Chemical Industry Press, 2011, vol. 1, pp. 196–197.
2 Z. H. Su, S. Q. Li and G. A. Zou, J. Pharm. Biomed. Anal., 2011, 55, 533–539.
3 J. P. Committee, The Japanese Pharmacopoeia, Ministry of Health, Japan Tokyo, 2000, vol. 1, pp. 876–878.
4 C. P. Commission, Pharmacopoeia of the People's Republic of China, Peoples Health Publishing House, 1963, vol. 1, pp. 237–238.
5 P. Xiao, Modern Chinese Materia Medica, Chemical Industry Press, 2002, vol. 1, pp. 784–785.
6 S. B. Chen, H. P. Liu and R. T. Tian, J. Chromatogr., A, 2006, 2, 114–119.
7 R. T. Tian, P. S. Xie and H. P. Liu, J. Chromatogr., A, 2009, 18, 2150–2155.
8 A. Zlatkis and R. Kaiser, HPTLC: High Performance Thin-Layer Chromatography, Elsevier, 1977, vol. 6, pp. 95–126.
9 J. Tamaoka and K. Komagata, FEMS Microbiol. Lett., 1984, 25, 125–128.
10 P. Torrione, K. D. Morton and L. Collins, Chemometrics and Machine Learning for Spectral Analysis, Optical Society of America, 2012, vol. 1, pp. 3–10.
11 Y. Q. Wang, H. X. Yan, R. Guo and F. F. Li, Int. J. Data Min. Bioinf., 2011, 5, 369–382.
12 G. John and P. Langley, Estimating Continuous Distributions in Bayesian Classifiers, Morgan Kaufmann, 1995, vol. 3, pp. 338–345.
13 C. Cortes and V. Vapnik, Mach. Learn., 1995, 3, 273–297.
14 D. W. Aha, D. Kibler and M. K. Albert, Instance-Based Learning Algorithms, Springer, Netherlands, 1991, vol. 1, pp. 37–66.
15 A. Guillén, I. Rojas and González, Neural Process. Lett., 2007, 25, 209–225.
16 S. L. Cessie and J. C. V. Houwelingen, J. Appl. Stat., 1992, 2, 191–201.
17 Q. Huang, Y. Zhuang, X. B. Qiao and X. J. Xu, Acta Phys.-Chim. Sin., 2007, 23, 1141–1145.
18 A. L. Blum and P. Langley, Artif. Intell., 1997, 97, 245–271.
19 H. Liu and R. Setiono, Chi2: Feature Selection and Discretization of Numeric Attributes, IEEE Computer Society, Herndon, Virginia, 1995, vol. 2, pp. 388–391.
20 J. W. Han and M. Kamber, Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems), Morgan Kaufmann, 1st edn, 2000, vol. 2, pp. 179–220.
21 K. Kira and L. A. Rendell, A Practical Approach to Feature Selection, Morgan Kaufmann Publishers Inc., 1992, vol. 2, pp. 249–256.
22 T. K. Ho, IEEE Trans. Pattern Anal. Mach. Intell., 1998, 20, 832–844.
23 A. Tsymbal, S. Puuronen and D. W. Patterson, Inf. Fusion, 2003, 4, 87–100.
24 C. C. Chang and C. J. Lin, ACM Trans. Intell. Syst. Technol., 2011, 2, 1.
25 Advances in Kernel Methods, ed. B. Schölkopf, C. J. C. Burges and A. J. Smola, MIT Press, Cambridge, MA, USA, 1999, vol. 2, pp. 185–208.
26 H. Wang, A Computerized Diagnostic Model Based on Naive Bayesian Classifier in Traditional Chinese Medicine, IEEE Computer Society, 2008, vol. 1, pp. 474–477.
27 K. Priddy and P. Keller, Artificial Neural Networks: An Introduction, Society of Photo Optical, 2005, vol. 2, pp. 205–234.
28 H. Abdi and L. J. Williams, Wiley Interdiscip. Rev.: Comput. Stat., 2010, 2, 433–459.
29 C. Y. Wang, Z. Y. Chen, C. G. Wu and Y. C. Liang, Medicine Composition Analysis Based on PCA and SVM, Springer, 2005, vol. 9, pp. 1226–1230.

