International Journal of Mineral Processing 155 (2016) 140–146

Contents lists available at ScienceDirect

International Journal of Mineral Processing

journal homepage: www.elsevier.com/locate/ijminpro

Explaining relationships among various coal analyses with coal grindability index by Random Forest
S.S. Matin a, James C. Hower b, L. Farahzadi c, S. Chehreh Chelgani d,⁎
a Department of Environment and Energy, Science and Research Branch, Islamic Azad University, Tehran, Iran
b Center for Applied Energy Research, University of Kentucky, 2540 Research Park Drive, Lexington, KY 40511, USA
c Architecture Faculty, Dr. Shariaty Technical College, Tehran, Iran
d Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA

Article info

Article history:
Received 15 March 2016
Received in revised form 20 August 2016
Accepted 31 August 2016
Available online 1 September 2016

Keywords:
Hardgrove Grindability Index
Random Forest
Variable importance
Proximate analysis
Ultimate analysis
Petrography

Abstract

Application of Random Forest (RF) via variable importance measurements (VIMs) and prediction is a new data mining model, not yet widespread in the applied science and engineering fields. In this study, the VIMs (proximate and ultimate analysis, petrography) processed by RF models were used for the prediction of Hardgrove Grindability Index (HGI) based on a wide range of Kentucky coal samples. VIMs, coupled with Pearson correlation, through various analyses indicated that total sulfur, liptinite, and vitrinite maximum reflectance (Rmax) are the most important variables for the prediction of HGI. These effective predictors have been used as inputs for the prediction of HGI by a RF model. Results indicated that the RF model can model HGI quite satisfactorily when the R2 = 0.90 and 99% of predicted HGIs had less than 4 HGI unit error in the testing stage. According to the result, by providing nonlinear VIMs as well as an accurate prediction model, RF can be further employed as a reliable and accurate technique for the evaluation of complex relationships in coal processing investigations.
© 2016 Elsevier B.V. All rights reserved.

⁎ Corresponding author.
E-mail address: [email protected] (S.C. Chelgani).
http://dx.doi.org/10.1016/j.minpro.2016.08.015
0301-7516/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

The U.S. Energy Information Administration (EIA) estimated that total coal production will increase by 27 million short tons in 2017 (EIA, 2016). Increasing demand for high-purity coal, as well as growing awareness of environmental pollution, is associated with coal consumption. Therefore, technologies that make coal significantly cleaner have been considered to ensure that it plays a part in our future clean energy. Ultimately, to liberate and finally remove coal impurities (such as mineral matter), coal particles have to be comminuted to fine particles (in the size range of several microns) (Sengupta, 2002).

Comminution (crushing and grinding or pulverizing), as an essential step in coal treatment, is often the greatest energy consumer in coal washing plants (Sengupta, 2002). Therefore, understanding the behavior of coal through the comminution process is important, and having more information on the subject could be effective for control and optimization of other treatment processes (combustion, gasification, carbonization, etc.) (Bhattacharya et al., 1998; Sengupta, 2002; Vuthaluru et al., 2003; Lee et al., 2003).

Grindability measurement of coal can demonstrate the above-mentioned aspects, since coal grindability, as an essential physical property of coal, reflects its relative hardness, tenacity, and fracture (effective parameters on comminution performance). Coal grindability is influenced by coal rank, petrography and mineral matter (Hower et al., 1987; Hower and Wild, 1988; Conroy, 1994; Barton et al., 1994; Bailey and Hodson, 1994; Hower, 1998; Rubiera et al., 1999; Bhattacharya et al., 1998; Sengupta, 2002; Vuthaluru et al., 2003; Trimble and Hower, 2003). Grindability of coal is usually measured by the Hardgrove Grindability Index (HGI) (based on the standard test method ASTM D 409-71) (ASTM, 1971; Lee et al., 2003). The result of the HGI test is the most effective parameter in designing a coal mill for power plants. HGI is also used as a predictive index to estimate the performance capacity of industrial pulverizers in power station boilers (a lower HGI will require a greater energy input and more time to reach the desired size for pulverized-fuel combustion) (Mackowsky and Abramski, 1943; Peters et al., 1962; Hower and Lineberry, 1988; Hower, 1998; Vuthaluru et al., 2003; Peisheng et al., 2005).

Although the HGI test is not costly (albeit time consuming), inherent limitations of the test can make HGI determination rather difficult. Some of the difficulties in an HGI determination are: limitations of the developed methodology; various types of HGI machine and differences in the grinding bowl and its material composition; different stages to get the required size; differences in sample preparation; and reliability, repeatability and reproducibility of the test. These difficulties could be due to heterogeneous properties of coal samples such as coal rank, maceral and microlithotype distribution, and mineral matter (Xuexin, 2001; Sengupta, 2002). To overcome these problems many researchers have investigated the prediction of HGI

based on various coal analyses: common coal analyses (proximate and ultimate analysis), petrography, and vitrinite maximum reflectance (Rmax), by using regression and soft computing methods [artificial neural networks (ANNs), genetic algorithm (GA), neuro-fuzzy (such as ANFIS)] (Hower and Wild, 1988; Peisheng et al., 2005; Jorjani et al., 2008a, 2008b; Chehreh Chelgani et al., 2008; Chehreh Chelgani et al., 2011a, 2011b; Chehreh Chelgani and Makaremi, 2013).

These developed models (regression and soft computing) depend upon the quality of the input data used to generate the models, and are used to yield promising descriptive results. The essential point is that variation in a parameter such as HGI cannot be understood without a thorough knowledge of the fundamental coal properties. In addition, in many cases, a variable would be a relatively strong contributor to HGI, but would have a stronger correlation to the other variables that were influencing HGI. Including these variables inflates the correlation (R2) of the model, but does not necessarily mean that the model describes HGI more accurately. Therefore, before developing complex models, caution has to be used in selecting variables, by studying the great inter-dependence between coal properties and then HGI (Trimble and Hower, 2003; Hower, 2006). Generally, regression and soft computing methods are capable of capturing complex relationships among large numbers of variables to predict a target, but they do not necessarily give any particular insight into the interrelationships among inputs and target variables. This major problem led to the development of so-called variable importance measures (VIMs), which can be used to identify the individual effects of explanatory variables (Auret and Aldrich, 2012).

A recently developed method, Random Forests (RFs), can overcome this drawback by providing an attractive addition to nonlinear approximation of statistical relationships among inputs and outputs (Breiman, 2001; Strobl et al., 2008; Archer and Kimes, 2008; Hallett et al., 2014). RF, as an ensemble of multiple decision trees, is a type of high-dimensional, non-parametric predictive model consisting of a collection of classification or regression trees (Breiman, 2001). RFs have also been successfully applied to various prediction models within the last decades, and in this short period of time they have become a major data analysis tool which performs well in comparison with many standard methods (Díaz-Uriarte and Alvarez de Andrés, 2006; Heidema et al., 2006). RF models have several advantages over other statistical modeling techniques: they are able to deal with missing values and high-dimensional data, identify complex interactions between variables and the most important variables (VIMs), predict with high accuracy (low-bias models and low variation in results), and they are robust against over-fitting (Hopwood et al., 1994; Díaz-Uriarte and Alvarez de Andrés, 2006; Biau et al., 2008; Archer and Kimes, 2008). Although RF models are widely used in various fields (the RF method should be considered by well-informed experts in the field) (Auret and Aldrich, 2012; Biau et al., 2008; Archer and Kimes, 2008; Hallett et al., 2014; Chehreh Chelgani et al., 2016a), to our knowledge they are rarely used to explore interrelationships among coal properties or for predictions (Matin and Chehreh Chelgani, 2016; Chehreh Chelgani et al., 2016b).

The aim of the present investigation is to assess the properties of over 900 coal samples from Kentucky, USA, in order to estimate the HGI with the most important parameters based on ultimate and proximate analysis, oxides, and petrographic analysis of samples by using RF. To the best of our knowledge, no tree- or RF-based methods have been proposed for the estimation of coal grindability.

2. Materials and methods

2.1. Experimental data

A soft computing model for HGI prediction requires a robust database that covers a wide variety of coal types. Such a model will be capable of predicting HGI with a high degree of accuracy. Data used to test the proposed approaches are from studies conducted at the University of Kentucky Center for Applied Energy Research. Samples were prepared from Western and Eastern Kentucky Southwest, Hazard and Big Sandy coals. A total of more than 900 sets of data were used. The results of various analyses (input variables for HGI prediction) and their representative HGIs are shown in the supplementary database. Analyses were performed according to the standard ASTM test methods (ASTM D 409-71: Hardgrove, ASTM D3172: Proximate, and ASTM D3176: Ultimate analyses). For petrology, all samples were previously prepared as particulate pellets.

2.2. Random Forest

2.2.1. Variable importance measurements (VIMs)

RF methods, aside from accurate prediction, have another extremely useful output: variable importance measures (VIMs) (Breiman, 2001; Svetnik et al., 2003; Liaw and Wiener, 2002; Bylander, 2002). VIMs for RFs have been receiving increased attention as a means of variable selection in many non-parametric regression tasks (Wang et al., 2016). VIMs provide insight into the interactions between predictors and, through relationships between a target and predictors computed by a group of trees, indicate which variables have a significant effect on the target (Hallett et al., 2014). The most popular and advanced VIM available in RFs is the permutation accuracy importance (PAI) measure (Strobl et al., 2007; Hapfelmeier et al., 2014). For variable selection purposes, the main advantage of the PAI in RF as compared to other tree-based methods is that it covers the impact of each predictor variable individually as well as in multivariate interactions with other predictor variables (Strobl et al., 2007). PAI is broadly applicable and unbiased through the consideration of multivariate interactions among variables (Breiman, 2001; Strobl et al., 2007).

In PAI for VIMs, the accuracy on the “out of bag” (OOB) dataset (computations based on observations that were not part of the sample used for constructing the respective tree) is always applied to evaluate the performance. OOB achieves higher accuracy with low bias and variance than other tree-structured algorithms (Kulkarni and Sinha, 2013). The OOB data can be permuted without the need to train new forests (Breiman and Cutler, 2003; Archer and Kimes, 2008). In summary, the computation of the PAI consists of the following steps:

1) Calculating the mean square error (MSE) of a decision tree,
2) Permuting the values of an explanatory variable in the OOB observations,
3) Recalculating the OOB MSE of that decision tree,
4) Calculating the difference between the MSE values calculated in steps 1 and 3, and
5) Repeating the above steps for each decision tree and using the average difference over all trees as the overall importance score (Strobl et al., 2008; Hapfelmeier et al., 2014; Wang et al., 2016).

The reference implementation of PAI is available in the “R” software package for statistical computing, which has been used in this study (https://www.r-project.org/). VIM is determined based on the “IncNodePurity”. The IncNodePurity parameter of the RF is averaged over all nodes in all trees in the forest.

2.2.2. Prediction by RF

As mentioned, RFs are broadly used in many investigations for prediction of complex models (complicated relationships). In prediction by RF, the model combines a number of trees by taking the same number of bootstrap samples (random samples of the original data, drawn with replacement and with the same length) from the database, and building a tree based on each bootstrap sample (Hallett et al., 2014). The procedure of taking a bootstrap sample from the original training data to establish the training dataset for each tree is called bagging of decision trees (Archer and Kimes, 2008; Wang et al., 2016). For prediction, an estimated label is provided by the average over all trees
(Strobl et al., 2008; Hapfelmeier et al., 2014). Through the bagging procedure, from the learning set (L) with size N (as an improved training set for each new tree), a bootstrap dataset L(θ) with size n is taken. Each predictive tree T_L(θ) relies on the random vector θ, which denotes the bagged samples from the main training set L. The final predictor f is the average over all trees (with y′_η the estimated response for sample X_η, where K is the size of the ensemble) (Auret and Aldrich, 2012):

$$\text{Regression:}\quad y'_{\eta} = f(X_{\eta}) = \frac{1}{K}\sum_{k=1}^{K} T_{L(\theta_k)}(X_{\eta}) \qquad (1)$$

In RF models, during tree-building, high inter-correlation among predictors leads to bias in predictions (Nicodemus et al., 2010). In other words, RF models are sensitive to the inter-correlation structure of the predictor variables (Archer and Kimes, 2008; Strobl et al., 2008; Nicodemus and Malley, 2009). In this study, Pearson correlation (r) has been used among selected variables (by VIMs) to avoid this possible bias during predictions. Pearson correlation (inter-correlation) is a measure of linear relationships among all input and output variables, and ranges from −1 to +1. The sign of the correlation shows the direction of the relationship, and its absolute value demonstrates the strength. Larger absolute values indicate stronger interdependence. Generally, an r value higher than 0.5 (or lower than −0.5) indicates a strong relationship between two variables (SPSS, 2004). To build a soft computing model, approximately 80% of the samples in the entire database are usually randomly selected for the training step and 20% of the data for the testing phase of the model (Soles et al., 1998; Voyant et al., 2010; Sadighi and Mohaddecy, 2013).

3. Results and discussions

Because of the inherent provincialism in coals (heterogeneous characteristics of coals such as the age, macerals, and mineral matter), a single analysis (such as just proximate or ultimate analysis) cannot represent the whole structure of coal, and it is difficult to predict the HGI based on some basic coal quality, such as proximate analysis. Therefore, it was recommended to build an accurate model by a combination of various coal characteristics (Hower, 2006). In this study VIMs have been applied to various analyses for variable selection, to decrease bias and find the best predictors to build an accurate model for the prediction of HGI.

3.1. Proximate and oxide analysis parameters

Proximate analysis [moisture (M), volatile matter (VM), ash (A), and fixed carbon (FC)] of coal using a simple muffle furnace is comparatively easy, cheap, and fast (Trimble and Hower, 2003). Based on the ASTM standard for proximate analysis, fixed carbon (FC) is calculated by difference from the other variables [FC% = 100 − (moisture + volatile matter + ash)], incorporating the errors of the other laboratory tests. Therefore, it is not necessary to use all four parameters since, by definition, the four parameters are a closed system, adding to 100% (Hower, 2006). There are a number of equations and soft computing models developed for the prediction of HGI based on the proximate analysis (Hower et al., 1987; Hower and Wild, 1988; Sengupta, 2002; Peisheng et al., 2005; Jorjani et al., 2008a, 2008b; Chehreh Chelgani et al., 2008; Chehreh Chelgani et al., 2011a, 2011b; Chehreh Chelgani and Makaremi, 2013).

In this study, VIM of proximate variables by RF indicates that moisture, as a rank parameter, has the highest importance and VM has the lowest importance among the proximate analysis variables for the HGI prediction (Fig. 1). These results are in good agreement with theoretical studies which showed that VM is not a particularly good parameter for HGI prediction in the bituminous rank and that moisture has a positive contribution to HGI (Vuthaluru et al., 2003; Hower, 2006). According to the VIM result (Fig. 1), ash and moisture have been selected from the proximate analysis parameters. In addition, recent investigations have shown that the major ash oxide contents (Al2O3, Fe2O3, TiO2, Na2O, SiO2, K2O, CaO, MgO, and P2O5) of coal samples are better predictors than ash for the estimation of HGI (Jorjani et al., 2008a, 2008b; Chehreh Chelgani et al., 2008, 2011a, 2011b; Chehreh Chelgani and Makaremi, 2013). VIM has been used to examine the effect of oxides in comparison with ash. Results (Fig. 1) indicate that Al2O3 has a stronger effect on HGI in comparison with ash. In this regard, it was reported that coal samples with high silicate content are harder, making them more difficult to fracture through the grinding zone (Ural and Akyıldız, 2004). Aside from Al2O3, TiO2 and Fe2O3 show a strong relationship with HGI (Fig. 1). The inter-correlations among oxides (Table 1) show a high inter-correlation between Fe2O3 and Al2O3 (r: −0.77) [they also have strong inter-correlations with TS: Fe2O3 (0.74) and Al2O3 (−0.75)]; therefore, Al2O3, which shows higher importance, can also represent the effect of Fe2O3 (as mentioned, RF models are sensitive to the inter-correlations of the predictors, so Fe2O3 would not be a good predictor in the presence of Al2O3). According to these results, TiO2, Al2O3, and moisture have been selected as predictors within proximate and oxides analyses for the HGI prediction by RF.

Fig. 1. The variable importance measurements for proximate and oxides analyses for HGI prediction obtained by Random Forest (significance to weakly important variables).

Table 1
Inter-item correlation matrix for input variables and HGI.

Variables     HGI     Moisture  Total sulfur  Fe2O3   TiO2    Al2O3   Liptinite  Rmax
HGI           1.0     0.10      0.39          0.25    −0.30   −0.34   −0.55      −0.15
Moisture      0.10    1.0       0.34          0.27    −0.31   −0.39   −0.31      −0.81
Total sulfur  0.39    0.34      1.0           0.74    −0.50   −0.75   −0.43      −0.61
Fe2O3         0.25    0.24      0.74          1.0     −0.58   −0.77   −0.31      −0.42
TiO2          −0.30   −0.31     −0.50         −0.58   1.0     0.49    0.53       0.41
Al2O3         −0.34   −0.39     −0.75         −0.77   0.49    1.0     0.31       0.58
Liptinite     −0.55   −0.31     −0.43         −0.31   0.53    0.31    1.0        0.34
Rmax          −0.15   −0.81     −0.61         −0.42   0.41    0.58    0.34       1.0
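
The inter-correlation screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `pearson_r` and `screen_predictors`, the cut-off of 0.75, and the ranking order of the oxides are hypothetical choices made for the example, while the three correlation values are taken from Table 1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def screen_predictors(ranked, corr, cutoff):
    """Walk through predictors from most to least important and keep one
    only if its absolute correlation with every kept predictor is below
    the cutoff."""
    kept = []
    for name in ranked:
        if all(abs(corr[frozenset((name, k))]) < cutoff for k in kept):
            kept.append(name)
    return kept


# Oxide inter-correlations copied from Table 1.
table1 = {
    frozenset(("Al2O3", "Fe2O3")): -0.77,
    frozenset(("Al2O3", "TiO2")): 0.49,
    frozenset(("Fe2O3", "TiO2")): -0.58,
}

# Assumed importance order (most important first); the text only states
# that Al2O3 ranks above Fe2O3.
ranked_oxides = ["Al2O3", "TiO2", "Fe2O3"]

# The 0.75 cut-off is an assumption chosen for this example.
selected = screen_predictors(ranked_oxides, table1, cutoff=0.75)

# pearson_r on made-up, perfectly linear data.
r_demo = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```

With these inputs, `selected` keeps Al2O3 and TiO2 and drops Fe2O3, which mirrors the choice described in the text.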

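As a rough illustration of the five permutation accuracy importance (PAI) steps listed in Section 2.2.1, the sketch below applies them to toy data. It is not the R implementation used in this study: the two "trees" are hypothetical hand-written functions, the OOB index lists and data are invented, and a cyclic shift stands in for a random permutation so that the result is reproducible.

```python
def oob_mse(tree, rows, targets):
    """Mean square error of one tree on its out-of-bag rows."""
    return sum((tree(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permute_column(rows, j):
    """Copy rows with column j cyclically shifted by one position.
    A real PAI run would use a random permutation here."""
    column = [r[j] for r in rows]
    shifted = column[1:] + column[:1]
    return [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, shifted)]


def permutation_importance(trees, oob_indices, X, y, j):
    """Average increase in OOB MSE over all trees when column j is permuted."""
    differences = []
    for tree, idx in zip(trees, oob_indices):
        rows = [X[i] for i in idx]
        targets = [y[i] for i in idx]
        base = oob_mse(tree, rows, targets)        # step 1
        permuted = permute_column(rows, j)         # step 2
        after = oob_mse(tree, permuted, targets)   # step 3
        differences.append(after - base)           # step 4
    return sum(differences) / len(differences)     # step 5


# Toy data (not from the study): the target depends on column 0 only.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0], [5.0, 7.0], [6.0, 2.0]]
y = [2.0 * row[0] for row in X]

# Hypothetical stand-ins for fitted regression trees.
trees = [lambda r: 2.0 * r[0], lambda r: 2.0 * r[0] + 1.0]
oob_indices = [[0, 2, 4], [1, 3, 5]]

importance_x0 = permutation_importance(trees, oob_indices, X, y, 0)
importance_x1 = permutation_importance(trees, oob_indices, X, y, 1)
```

Because neither stand-in tree reads column 1, permuting it leaves the OOB error unchanged and its importance is zero, while permuting column 0 raises the error.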
3.2. Ultimate analysis parameters

Various investigations indicated that ultimate analysis members [carbon (C), oxygen (O) (by difference), nitrogen (N), hydrogen (H), and sulfur (total sulfur, TS)] are better predictors for HGI than the proximate analysis parameters, and concluded that ultimate-analysis-based models for HGI prediction are superior to proximate-analysis-based ones in terms of accuracy (Jorjani et al., 2008a, 2008b; Chehreh Chelgani et al., 2008; Chehreh Chelgani et al., 2011a, 2011b; Chehreh Chelgani and Makaremi, 2013). Oxygen (O) is calculated from the other variables [oxygen = 100 − (hydrogen + nitrogen + total sulfur + carbon)]; therefore, O gathers all errors of the other element analyses. For the same reason as for FC, O from the ultimate analysis was not considered as a predictor for the HGI modeling. The VIM result (Fig. 2) shows that TS has the highest importance among the elements for the HGI prediction. This result is in good agreement with results of previous investigations which indicated that, among ultimate parameters, TS has the highest contribution to the prediction of HGI (Hower and Wild, 1988; Chehreh Chelgani et al., 2008). Based on these results, TS was selected from ultimate analysis for the modeling of HGI.

Fig. 2. The variable importance measurements for ultimate analysis for HGI prediction obtained by Random Forest.

3.3. Petrography analysis parameters

Petrographic investigations from a number of studies indicated a relationship between maceral content and HGI. These investigations have reported that petrographic [macerals and vitrinite maximum reflectance (Rmax)] influences on grindability are more important than other variable influences (ultimate and proximate analyses) (Trimble and Hower, 2003; Vuthaluru et al., 2003). Therefore, macerals and maceral associations play an essential role in HGI modeling. VIM studies of group macerals (vitrinite, liptinite, and inertinite) and Rmax from Kentucky coal indicated that liptinite and Rmax have the highest importance for the HGI determination (Fig. 3). These observations are in agreement with results reported in the literature (Hower and Wild, 1988; Hower and Wild, 1994; Chehreh Chelgani et al., 2008; Hansen and Hower, 2014), which noted that liptinite is a tough constituent in the structure of coal and contributes to the toughness and resistance to grinding (Trimble and Hower, 2003; Hansen and Hower, 2014). Trimble and Hower (2003) reported the importance of liptinite-rich in controlling the HGI of coals. Also, previous investigations indicated that HGI decreases with the increase in Rmax (as a coal rank parameter) (Jorjani et al., 2008a, 2008b; Chehreh Chelgani et al., 2008; Chehreh Chelgani et al., 2011a, 2011b; Chehreh Chelgani and Makaremi, 2013). Based on these results, liptinite and Rmax were chosen for the final combination model.

Fig. 3. The variable importance measurements for petrography analyses for HGI prediction obtained by Random Forest.

Fig. 4. The variable importance measurements for a combination of variables for HGI prediction obtained by Random Forest.

3.4. HGI prediction

From VIM studies on various tests, TiO2, Al2O3, moisture, total sulfur, liptinite, and Rmax were selected for the prediction of HGI by RF. VIM within the combination of variables shows that liptinite, Rmax, and total sulfur have the highest, and TiO2 has the lowest, importance among the variables, so TiO2 was omitted from the list of predictors (Fig. 4). Furthermore, Table 1 shows that there are high negative inter-correlations between moisture and Rmax (r: −0.81) and also between Al2O3 and TS (r: −0.75). These results are not surprising, since moisture decreases as rank (Rmax) increases. Therefore, to prevent bias in HGI prediction, moisture and Al2O3 were omitted from the list of predictors. Hower and Wild (1988) examined the sensitivity of HGI to change in coal rank and petrographic composition for 473 high volatile bituminous Kentucky and adjacent state (Illinois, Indiana and Virginia) coal samples. Using regression (Eq. (2)) they found that HGI values of samples
could be predicted on the basis of liptinite, Rmax, and total sulfur content (with an R2 of 0.64 and a standard error of 4.31 HGI units) (Hower and Wild, 1988). Thus, based on VIM results, liptinite, Rmax and TS were selected as inputs for the prediction of HGI.

$$\mathrm{HGI} = 37.41 - 10.22\,\ln(\mathrm{liptinite}) + 28.18\,R_{\max} + 0.64\,S_{\mathrm{total}} \qquad (2)$$

From the total database used in the modeling, 760 samples were randomly selected for the training phase and 169 data points for the testing phase of the RF model. The training process for HGI prediction was stopped after the generation of 1000 trees (forest), when the RF model met the minimum error (in other words, the OOB error stabilized after 1000 trees) (Fig. 5). The R2 value in training stage was 0.90 (Fig. 6). The RF model was evaluated with the database for the test stage (169 samples; the test set can determine how good a RF model is). Results show that the model could estimate the output quite satisfactorily (Fig. 7). These results show high accuracy of the RF model for the HGI prediction. It was reported that the limits of HGI determination by ASTM are 2 to 3 HGI units, and for certain types of coals, the reproducibility has exceeded 5 units (Sengupta, 2002). RF results (Table 2) show that 60% of samples have errors of less than 2 HGI units, while 99% of samples showed differences of less than 4 HGI units. According to these significant results, it can be concluded that the proposed Random Forest procedures yield significant variable importance and predictions of HGI. These results suggest that RF, with its advantages over other soft computing methods, can be used for the evaluation of other complex parameters in coal processing.

Fig. 5. Number of trees (forest) in training stage to meet the minimum error, and the corresponding correlation coefficient (R2).

Fig. 6. Predicted HGI by RF versus actual measured HGI in testing process.

Fig. 7. Distribution of difference between actual HGI and predicted RF model for the testing step.

Random forest (RF) was used to select important variables (VIM) and predict HGI.
VIM by RF through various coal analyses satisfactorily selected the best HGI predictors.
RF model indicated that it can quite accurately predict HGI by selected variables.
Results recommend that RF can be applied to other complex relationships in coal geology.

Table 2
HGI prediction (RF model output) deviations from their actual values.

HGI deviation from target (HGI unit)   Less than 2   Less than 4   MSE of errors
Test                                   60%           99%           3.54
Train                                  45%           83%           8.56
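
The summary statistics reported in Table 2 can be computed from paired measured and predicted HGI values, as in the following sketch. The function name `deviation_summary` and the five sample values are made up for illustration; they are not data from this study.

```python
def deviation_summary(actual, predicted, limits=(2.0, 4.0)):
    """Return the share of samples whose absolute error is below each
    limit (in HGI units) and the mean square error of the predictions."""
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    shares = {
        limit: sum(1 for e in errors if abs(e) < limit) / n
        for limit in limits
    }
    mse = sum(e * e for e in errors) / n
    return shares, mse


# Hypothetical measured and predicted HGI values (not from the database).
measured = [45.0, 50.0, 55.0, 60.0, 48.0]
predicted = [46.0, 53.0, 54.5, 61.5, 43.0]

shares, mse = deviation_summary(measured, predicted)
```

For these invented values, three of the five errors are below 2 HGI units and four are below 4 HGI units; a real evaluation would pass the measured and predicted HGIs of the test set instead.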
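
Eq. (2), the regression reported by Hower and Wild (1988), can be evaluated directly. In the sketch below the function name `hgi_hower_wild` and the input values are hypothetical, and the inputs are assumed to be expressed in percent, which the text does not state.

```python
import math


def hgi_hower_wild(liptinite, rmax, total_sulfur):
    """Evaluate Eq. (2):
    HGI = 37.41 - 10.22 ln(liptinite) + 28.18 Rmax + 0.64 S_total.
    Inputs are assumed to be in percent (an assumption of this sketch)."""
    if liptinite <= 0:
        raise ValueError("liptinite must be positive for the logarithm")
    return (37.41
            - 10.22 * math.log(liptinite)
            + 28.18 * rmax
            + 0.64 * total_sulfur)


# Hypothetical coal: 1% liptinite, Rmax of 0.5%, 1% total sulfur.
example_hgi = hgi_hower_wild(1.0, 0.5, 1.0)

# Same coal with ten times more liptinite, to show the sign of that term.
liptinite_rich_hgi = hgi_hower_wild(10.0, 0.5, 1.0)
```

Since the liptinite coefficient is negative, the liptinite-rich example yields a lower HGI, consistent with the discussion of liptinite as a tough constituent.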

4. Conclusion Chehreh Chelgani, S., Hart, B., Grady, W.C., Hower, J.C., 2011b. Study relationship between
inorganic and organic coal analysis with gross calorific value by multiple regression
and ANFIS. International Journal of Coal Preparation and Utilization 31 (1), 9–19.
This investigation demonstrates the capability of Random Forest Chehreh Chelgani, S., Matin, S.S., Hower, J.C., 2016a. Explaining relationships between
method as a tree-based model which can be used for selection of vari- coke quality index and coal properties by Random Forest method. Fuel 182, 754–760.
Chehreh Chelgani, S., Matin, S.S., Makaremi, S., 2016b. Modeling of free swelling index
ables based on their importance as predictors, and also for the predic- based on variable importance measurements of parent coal properties by random
tion of Hardgrove Grindability Index. The variable importance forest method. Measurement 94, 416–422.
measurements by RF can provide useful insight into the relationship be- Conroy, A., 1994. Impact of Coal Quality on Grinding Characteristics, Combustion News.
Australian Combustion Technology Centre Company Publication, pp. 1–4 (August).
tween predictors and the target, more over it has several advantages: it Díaz-Uriarte, R., Alvarez de Andrés, S., 2006. Gene selection and classification of microar-
can determine the impact of each predictor variable individually as well ray data using random forest. BMC Bioinformatics 7, 3.
as in multivariate interactions with other predictors, it guarantees unbi- Energy Information Administration (US), 2016. Independent statistics and analysis. Short-
Term Energy Outlook (STEO) (June).
ased evaluations, and also it is broadly applicable. In this paper, for the
Hallett, M.J., Fan, J.J., Su, X.G., Levine, R.A., Nunn, M.E., 2014. Random forest and variable
first time, performance variable importance measurements of Random importance rankings for correlated survival data, with applications to tooth loss.
Forest for the selection of variables through proximate and ultimate Stat. Model. 14 (6), 523–547.
analyses, and petrographic analyses of coal samples were examined to Hansen, A.E., Hower, J.C., 2014. Notes on the relationship between microlithotype compo-
sition and Hardgrove grindability index for rank suites of Eastern Kentucky (Central
generate a solid realistic HGI predictive model. VIMs by RF showed the Appalachian) coals. Int. J. Coal Geol. 131, 109–112.
best arrangement for the HGI prediction is the set of liptinite, Rmax, Hapfelmeier, A., Hothorn, T., Ulm, K., Strobl, C., 2014. A new variable importance measure
and TS as the input set. Results indicated that the RF model can model for random forests with missing data. Stat. Comput. 24, 21–34.
Heidema, A.G., Boer, J.M.A., Nagelkerke, N., Mariman, E.C.M., Vander, A.D.L., Feskens,
HGI quite satisfactorily with correlation coefficient R2 = 0.90. The dif- E.J.M., 2006. The challenge for genetic epidemiologists: how to analyze large numbers
ferences between actual and RF predicted HGIs for 99% of samples of SNPs in relation to complex diseases. BMC Genet. 7, 23.
were less than 4 HGI unit. These results reveal that the RF model can Hopwood, W., Mckeown, J.C., Mutchler, J.F., 1994. A reexamination of auditor versus
model accuracy within the context of the going-concern opinion decision. Contemp.
be used as a powerful model to assess the importance variables to Account. Res. 10, 409–431.
have a remarkable HGI prediction. In order to continue and expand Hower, J.C., 1998. Interrelationship of coal grinding properties and coal petrology. Miner.
the knowledge base surrounding the modeling and application of RF Metall. Process. 15 (3), 1–16.
Hower, J.C., 2006. Letter to the editor, discussion: prediction of grindability with multivar-
in the coal preparation field, future works could be carried out on iable regression and neural network in Chinese coal. Fuel 85, 1307–1308.
modeling of other complex factors of coal preparation such as Free Hower, J.C., Lineberry, G.T., 1988. The interface of coal lithology and coal cutting:
Swelling Index. study of breakage characteristics of selected Kentucky coals. Journal of Coal
Quality 7, 88–95.

Appendix A. Supplementary data

Supplementary data to this article can be found online at
http://dx.doi.org/10.1016/j.minpro.2016.08.015.
Hardgrove Grindability Index for selected Kentucky coals. Int. J. Coal Geol. 7, 227–244.
Jorjani, E., Hower, J.C., Cherhreh Chelgani, S., Shirazi, M.A., Mesroghli, S., 2008a. Studies of
relationship between petrography and elemental analysis with grindability for Ken-
References tucky coals. Fuel 87, 707–713.
Jorjani, E., Mesroghli, S., Chehreh Chelgani, S., 2008b. Prediction of operational parameters
Archer, K.J., Kimes, R.V., 2008. Empirical characterization of random forest variable impor- effect on coal flotation using artificial neural network. Int. J. Miner. Metall. Mater. 15
tance measures. Computational Statistics & Data Analysis 52, 2249–2260. (5), 528–533.
ASTM D 409-71, 1971. Standard test method for grindability of coal by the Hardgrove- Kulkarni, V.Y., Sinha, P.K., 2013. Random forest classifiers: a survey and future research di-
machine method. American Society for Testing and Materials, Standard on Coal and rections. Int. J. Adv. Comput. 36 (1), 1144–1153.
Coke, p. 220. Lee, J.M., Kim, J.S., Kim, J.J., 2003. Comminution characteristics of Korean anthracite in a
ASTM D3172, 2013. Standard Practice for Proximate Analysis of Coal and Coke. ASTM In- CFB reactor. Fuel 82, 1349–1357.
ternational, West Conshohocken, PA 19428-2959, United States, pp. 1–2. Liaw, A., Wiener, M., 2002. Classification and regression by random forest. R News 2 (3),
ASTM D3176, 2015. Standard Practice for Ultimate Analysis of Coal and Coke. ASTM Inter- 18–22.
national, United States, pp. 1–2. Mackowsky, M.T., Abramski, C., 1943. Kohlenpetrographische Untersuchengsmethoden
Auret, L., Aldrich, C., 2012. Interpretation of nonlinear relationships between process var- und ihre praktische Anwendung. Feuerungstechnik 31 (3), 49–64.
iables by use of random forests. Miner. Eng. 35, 27–42. Matin, S.S., Chehreh Chelgani, S., 2016. Estimation of coal gross calorific value based on
Bailey, J.G., Hodson, A., 1994. The effect of coal grindability on pulverised fuel combustion. various analyses by random forest. Fuel 177, 274–278.
Proc. 6th Australian Coal Science Conference. Newcastle, 17–19 October. AIE, Nicodemus, K.K., Malley, J.D., 2009. Predictor correlation impacts machine learning algo-
Australia, pp. 40–47. rithms: implications for genomic studies. Bioinformatics 25 (15), 1884–1890.
Barton, W.A., Condie, D.J., Lynch, L.J., 1994. Coal grindability: relationships with coal com- Nicodemus, K.K., Malley, J.D., Strobl, C., Ziegler, A., 2010. The behaviour of random forest
position and structure. Proc. 6th Australian Coal Science Conference, Newcastle, permutation based variable importance measures under predictor correlation. BMC
17–19 October. AIE, Australia, pp. 55–64. Bioinformatics 11 (110), 1–13.
Bhattacharya, S., Anand, V., Banerjee, P., 1998. Estimation of grindability from sink-float Peisheng, L., Youhui, X., Dunxi, Y., Xuexin, S., 2005. Prediction of grindability with multi-
test data for two different coals. Int. J. Miner. Process. 53, 99–106. variable regression and neural network in Chinese coal. Fuel 84, 2384–2388.
Biau, G., Devroye, L., Lugosi, G., 2008. Consistency of random forests and other averaging Peters, J.T., Schapiro, N., Gray, R.J., 1962. Know your coal. Transactions of the American In-
classifiers. J. Mach. Learn. Res. 9, 2015–2033. stitute of Mining and Metallurgical Engineers 223, 1–6.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32. Rubiera, F., Arenillas, A., Fuente, E., Miles, N., Pis, J.J., 1999. Effect of the grinding behaviour
Breiman, L., Cutler, A., 2003, Manual on setting up, using, and understanding random for- of coal blends on coal utilisation for combustion. Powder Technol. 105, 351–356.
ests v4.0. bftp://ftp.stat.berkeley.edu/pub/users/breiman/Using_random_forests_v4. Sadighi, S., Mohaddecy, R.S., 2013. Predictive modeling for an industrial naphta
0.pdfN. bftp://ftp.stat.berkeley.edu/pub/users/breiman/Using_random_forests_v4.0. performing plant using artificial nueral network with recurrent layers. International
pdfN. Journal of Technology 2, 102–111.
Bylander, T., 2002. Estimating generalization error on two-class data sets using out-of-bag Sengupta, A.N., 2002. An assessment of grindability index of coal. Fuel Process. Technol.
estimates. Mach. Learn. 48, 287–297. 76, 1–10.
Chehreh Chelgani, S., Makaremi, S., 2013. Explaining the relationship between common Soles, D., Corra, D., Osorio, F.S., Wolf, D.F., 1998. 3D Vision-based Autonomous Navigation
coal analyses and Afghan coal parameters using statistical modeling methods. Fuel System Using ANN and Kinect Sensor. pp. 305–314.
Process. Technol. 110, 79–85. SPSS, 2004, Version 13, SPSS Inc., Help Files.
Chehreh Chelgani, S., Hower, J.C., Jorjani, E., Mesroghli, S., Bagherieh, A.H., 2008. Predic- Strobl, C., Boulesteix, A.L., Zeileis, A., Hothorn, T., 2007. Bias in random forest variable im-
tion of coal grindability based on petrography, proximate and ultimate analysis portance measures: illustrations, sources and a solution. BMC Bioinformatics 8 (25),
using multiple regression and artificial neural network models. Fuel Process. Technol. 1–21.
89, 13–20. Strobl, C., Boulesteix, A.L., Kneib, T., Augustin, T., Zeileis, A., 2008. Conditional variable im-
Chehreh Chelgani, S., Dehghan, F., Hower, J.C., 2011a. Estimation of some coal parameters portance for random forests. BMC Bioinformatics 9 (307), 1–11.
depending on petrographic and inorganic analyses by using genetic algorithm and Svetnik, V., Liaw, A., Tong, C., Culberson, J.C., Sheridan, R.P., Feuston, B.P., 2003. Random
adaptive neuro-fuzzy inference systems. Energy Exploration & Exploitation 29 (4), forest: a classification and regression tool for compound classification and QSAR
479–494. modeling. J. Chem. Inf. Comput. Sci. 43, 1947–1958.
146 S.S. Matin et al. / International Journal of Mineral Processing 155 (2016) 140–146

Trimble, A.S., Hower, J.C., 2003. Studies of the relationship between coal petrology and Vuthaluru, H.B., Brooke, R.J., Zhang, D.K., Yan, H.M., 2003. Effects of moisture and coal
grinding properties. Int. J. Coal Geol. 54, 253–260. blending on Hardgrove Grindability Index of Western Australian coal. Fuel Process.
Urala, S., AkyVldVz, M., 2004. Studies of the relationship between mineral matter and Technol. 81, 67–76.
grinding properties for low-rank coals. Int. J. Coal Geol. 60, 81–84. Wang, H., Yang, F., Luo, Z., 2016. An experimental study of the intrinsic stability of random
Voyant, C., Muselli, M., Paoli, C., Nivet, M.L., Poggi, P., 2010. Predictability of PV power grid forest variable importance measures. BMC Bioinformatics 17 (60), 1–18.
performance on insular sites without weather stations: use of artificial neural net- Xuexin, S., 2001. Combustion Experiment Technology and Method for Coal Fired Furnace.
works. SUBJECT 5: PV SYSTEMS, Subsection: 5.1 PV Power Plants, pp. 1–4. China Electricity and Power Press, Beijing, p. 286.
