Machine Learning Approaches for Predicting Concrete Compressive Strength
Article in Journal of Advanced Research in Civil and Environmental Engineering · May 2024
1 author:
Jyoti Thapa
All content following this page was uploaded by Jyoti Thapa on 18 June 2024.
INFO
E-mail Id: [email protected]
Orcid Id: https://orcid.org/0009-0000-0963-0955
How to cite this article: Thapa J. Machine Learning Approaches for Predicting Concrete Compressive Strength. J Adv Res Civil Envi Engr. 2024; 11(1): 09-20.
Date of Submission: 2024-02-22
Date of Acceptance: 2024-03-11
ABSTRACT
Concrete compressive strength (CS) plays a crucial role in infrastructure development, and accurate, timely prediction of CS is essential for optimising the performance of structural components. In this study, 776 experimental data records were collected from past research and analysed with different machine learning (ML) techniques to evaluate the applicability of ML approaches to forecasting concrete strength. The predictive performance of the regression models was compared using several statistical parameters. The random forest (RF) regression model showed the best CS prediction capability, with an R-squared value of 0.91, followed by k-nearest neighbors (KNN), support vector machine (SVM), and decision tree (DT) with 0.88, 0.84, and 0.78, respectively. This research therefore establishes that the ML approach has a good capacity to forecast concrete CS from a real database. Such prediction approaches can be integrated into the construction industry for timely prediction of the CS of concrete with high precision and efficiency.
Keywords: Compressive strength, Concrete, Machine Learning, Regression Model
ISSN: 2393-8307
Thapa J
11 J. Adv. Res. Civil Envi. Engr. 2024; 11(1)
Figure 1. Schematic Flowchart for Forecasting Concrete Compressive Strength Using ML Techniques
Correlation Analysis
In machine learning approaches, establishing and visualising multicollinearity is part of the data preprocessing phase. It gives a first idea of the relationship between the selected input parameters and the target variable, which is illustrated in Figure 2. The correlation coefficient ranges from -1 to 1: a value near +1 indicates a strong positive correlation between two parameters, and a value near -1 indicates a strong negative correlation. Figure 2 shows that the target variable CS has a positive correlation with curing time (T), cement (C), and superplasticizer (SP), but a negative correlation with water (W), sand (S), and coarse aggregate (CA). This relationship indicates that an increase in cement and curing time increases the concrete CS, whereas an increase in sand, coarse aggregate, and water content decreases it. In addition, a high positive correlation of 0.65 exists between FA and SP, and a high negative correlation of magnitude 0.75 exists between FA and C. This analysis indicates that all these parameters are correlated and play a crucial role in forecasting concrete compressive strength.
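As an illustration of the correlation analysis described above, the following minimal sketch computes a Pearson correlation matrix with pandas. The data are synthetic and hypothetical (they are not the 776 records used in the study); only the column symbols C, W, T, and CS follow the text, and the toy strength formula is an assumption chosen so that the signs resemble those reported.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Hypothetical mix data, NOT the study's database.
df = pd.DataFrame({
    "C": rng.uniform(150, 500, n),   # cement, kg/m3
    "W": rng.uniform(140, 230, n),   # water, kg/m3
    "T": rng.integers(1, 91, n),     # curing time, days
})

# Toy strength (assumption): rises with cement and age, falls with water.
df["CS"] = (0.08 * df["C"] - 0.2 * df["W"]
            + 6.0 * np.log(df["T"]) + 40 + rng.normal(0, 2, n))

# Pearson correlation matrix; every entry lies in [-1, 1].
corr = df.corr(method="pearson")

# Correlation of each input with the target, strongest positive first.
print(corr["CS"].drop("CS").sort_values(ascending=False).round(2))
```

A heatmap of `corr` would give a figure of the same kind as Figure 2.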
same scale for all selected features is termed scaling or rescaling of datasets. In this study, the selected features are on different scales: in Table 1, CS ranges from 6.27 MPa to 79.9 MPa, whereas S ranges from 0 to 1820 kg/m3. A uniform scale is therefore essential to present these datasets. Min-max normalization was used to bring all selected features into the same range. The selected datasets were rescaled using the following formula:
Xn = (X - Xmin)/(Xmax - Xmin) (1)
where X is the actual value of a selected feature, Xn is the rescaled value, and Xmin and Xmax are the minimum and maximum values of that feature.
datasets, 5-fold cross-validation shows high accuracy as compared to others. Therefore, in this study, 5-fold cross-validation was used to select the optimal hyperparameters and to train the selected regression models. It was applied to each regression model during the training and testing phase. First, the selected database was randomly split into five folds. The selected ML regression model was then trained on four folds, and the remaining fold was used for testing. The validation result is reported as the average over the five folds. In this process, the outputs of the trained and tested model show high accuracy and better prediction performance. The best hyperparameters determined in this way for each ML model are presented in Table 2.
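The min-max rescaling of Eq. (1) and the 5-fold cross-validation procedure can be sketched together with scikit-learn. This is a minimal illustration under stated assumptions, not the author's script: the data are synthetic, the candidate grid for `C` is hypothetical, and placing the scaler inside a `Pipeline` (so that Xmin and Xmax are learned from the training folds only) is a design choice of this sketch that the paper does not describe.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

rng = np.random.default_rng(42)

# Synthetic features on very different scales (hypothetical data).
X = rng.uniform(0, 1, size=(150, 4)) * np.array([500.0, 250.0, 1800.0, 90.0])
y = 0.1 * X[:, 0] - 0.1 * X[:, 1] + 0.2 * X[:, 3] + rng.normal(0, 1.0, 150)

# MinMaxScaler applies Xn = (X - Xmin) / (Xmax - Xmin) per feature.
pipe = Pipeline([
    ("scale", MinMaxScaler()),
    ("svr", SVR(kernel="rbf", gamma="scale")),
])

# Hypothetical candidate values for the SVR penalty parameter C.
param_grid = {"svr__C": [1.0, 10.0, 50.0]}

# Random split into five folds: train on four, test on the fifth, then average.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="r2")
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

`search.best_score_` is the mean R-squared over the five held-out folds for the best candidate, which corresponds to the averaged validation result described above.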
Table 2. Hyperparameters in Selected ML Regression Model
Model | Optimal Hyperparameter
SVM | 'C': 50.0, 'gamma': 'scale', 'kernel': 'rbf'
DT | 'criterion': 'friedman_mse', 'min_samples_split': 2, 'max_depth': 15, 'min_samples_leaf': 1, 'splitter': 'random', 'max_features': None
KNN | 'n_neighbors': 5, 'p': 1, 'weights': 'distance'
RF | 'n_estimators': 100, 'max_depth': 20, 'min_samples_leaf': 1, 'max_features': 'sqrt', 'min_samples_split': 2
Statistical Analysis of Selected Model
In regression analysis, statistical indices are commonly employed to assess the effectiveness of regression models and to compare the prediction performance of different models. In this research, the following widely used equations are applied to determine and compare the performance of the selected ML regression techniques. In these equations, the actual values are denoted by yia, the corresponding predicted values by yip, and n is the total number of data points used in the ML models.
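The equations themselves are not reproduced in this text, so the sketch below uses the standard textbook definitions of the indices named in the results tables; these definitions are assumptions rather than the paper's exact formulas. In particular, RRMSE is taken here as RMSE divided by the mean of the actual values, and the signed MRE is left out because its definition cannot be recovered from the text. The function name and the four sample values are hypothetical.

```python
import math

def regression_metrics(actual, predicted):
    """Return common regression indices for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    mean_e = sum(errors) / n

    ss_res = sum(e * e for e in errors)               # residual sum of squares
    ss_tot = sum((a - mean_a) ** 2 for a in actual)   # total sum of squares
    var_e = sum((e - mean_e) ** 2 for e in errors) / n
    var_a = ss_tot / n

    mse = ss_res / n
    rmse = math.sqrt(mse)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "VAF": (1.0 - var_e / var_a) * 100.0,         # variance accounted for, %
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": rmse,
        "RRMSE": rmse / mean_a,                       # assumed normalisation
        "MAPE": sum(abs(e / a) for e, a in zip(errors, actual)) / n * 100.0,
    }

# Hypothetical strengths in MPa, for illustration only.
metrics = regression_metrics([20.0, 30.0, 40.0, 50.0], [22.0, 28.0, 41.0, 49.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions RMSE is the square root of MSE, which agrees with the value pairs reported for the individual models (for example 27.3 and 5.22 for SVR).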
Figure 4. Comparison Between Actual and Forecasted Outcomes in the SVR Model
Figure 5. Relationship Between Predicted Outcomes of the SVR Model with Actual Value of Concrete CS
The statistical outputs of this model are presented in Table 3. The SVR model yields R2 and VAF values of 0.84 and 83.7%, respectively. The MAE, MRE, and MSE are 3.59, -0.22, and 27.3, and the RMSE, RRMSE, and MAPE are 5.22, 0.16, and 11.38, respectively. These results show that the SVR model has a good capability to forecast the concrete CS.
Decision Tree (DT)
In supervised machine learning, the decision tree (DT) is a non-parametric approach. It can be applied to regression problems as well as to classification tasks, and its results are simple to interpret and visualise. In this method, each internal node represents a decision on a feature, each branch represents an outcome of that decision, and each leaf node represents a predicted output. The hyperparameters of the DT model were optimised on the datasets selected for this study and are presented in Table 2.
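The node, branch, and leaf structure described above can be inspected on a fitted tree. The following minimal sketch builds a scikit-learn `DecisionTreeRegressor` with the DT settings listed in Table 2. The training data are synthetic and hypothetical, and `random_state` is an assumption added only to make the random splitter reproducible.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic, already-rescaled features (hypothetical data).
X = rng.uniform(0, 1, size=(200, 3))
y = 30 * X[:, 0] - 15 * X[:, 1] + 10 * np.log1p(X[:, 2]) + rng.normal(0, 1, 200)

dt = DecisionTreeRegressor(
    criterion="friedman_mse",
    splitter="random",
    max_depth=15,
    min_samples_split=2,
    min_samples_leaf=1,
    max_features=None,
    random_state=0,  # assumption, not listed in Table 2
)
dt.fit(X, y)

# Internal nodes hold the feature decisions; leaves hold the predicted values.
n_leaves = dt.get_n_leaves()
n_internal = dt.tree_.node_count - n_leaves
print("depth:", dt.get_depth(), "internal nodes:", n_internal, "leaves:", n_leaves)
```

Because every internal node has exactly two branches, the number of internal nodes is always one less than the number of leaves.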
Figure 7. Relationship Between Predicted Outcomes of the DT Model with Actual Value of Concrete CS
The relationship between the actual values and the forecasted outcomes of concrete CS in the DT regression model is illustrated in Figure 6. This comparison indicates that the DT model has satisfactory accuracy in forecasting the concrete CS. The correlation between predicted and actual values is presented in Figure 7, which shows that the R-squared value of this regression model is 0.78, with a Pearson's r value of 0.89. These results indicate that the prediction capability of this regression model is within acceptable limits.
Table 4 lists the statistical evaluation indices of the DT model. The model yields R2 and VAF values of 0.78 and 78.4%, respectively. The MAE, MRE, and MSE are 3.75, -1.78, and 34.53, and the RMSE, RRMSE, and MAPE are 5.87, 0.18, and 13.38, respectively. These results show that the concrete CS forecasting ability of the DT model is within acceptable limits.
Figure 8. Comparison Between Actual and Forecasted Outcomes in the KNN Model
Figure 9. Relationship Between Predicted Outcomes of the KNN Model with the Actual Value of Concrete CS
Figure 11. Relationship Between Predicted Outcomes of the RF Model with Actual Value of Concrete CS
Random Forest (RF)
Random forest (RF) is a versatile and powerful supervised ML technique. It can be applied to regression problems as well as to classification tasks. It belongs to the family of decision tree-based methods and is known for its high accuracy and robustness. The technique combines the predictions of many decision trees using an ensemble approach, which gives it good prediction capability and control over over-fitting. In addition, the method can be used to handle missing values, outliers, and high-dimensional datasets. In this study, the optimal hyperparameters of the RF model were determined on the selected datasets and are presented in Table 2.
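To make the ensemble idea concrete, the sketch below fits a scikit-learn `RandomForestRegressor` with the RF settings from Table 2 and checks that the forest's prediction equals the average of its individual trees' predictions. The data are synthetic and hypothetical, and `random_state` is an assumption added for reproducibility.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)

# Synthetic, already-rescaled features (hypothetical data).
X = rng.uniform(0, 1, size=(300, 5))
y = 40 * X[:, 0] - 20 * X[:, 1] + 15 * X[:, 2] * X[:, 3] + rng.normal(0, 1, 300)

rf = RandomForestRegressor(
    n_estimators=100,
    max_depth=20,
    min_samples_leaf=1,
    max_features="sqrt",
    min_samples_split=2,
    random_state=0,  # assumption, not listed in Table 2
)
rf.fit(X, y)

X_new = rng.uniform(0, 1, size=(10, 5))

# Each tree votes with its own estimate; the forest reports the mean.
tree_preds = np.stack([tree.predict(X_new) for tree in rf.estimators_])
manual_average = tree_preds.mean(axis=0)

print(np.allclose(manual_average, rf.predict(X_new)))
```

Averaging over trees that were grown on different bootstrap samples and feature subsets is what reduces the variance of a single deep tree.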
The relationship between the actual values and the forecasted outcomes of concrete CS in the RF regression model is illustrated in Figure 10. The correlation between predicted and actual values is presented in Figure 11, which shows that the R-squared value of this regression model is 0.91, with a Pearson's r value of 0.97. These results show that the RF model has a good capability to forecast the concrete CS. The figures also show that the forecasted outcomes of this model fit the actual concrete strength values well.
Table 6 lists the statistical evaluation indices of the RF model. The model yields R2 and VAF values of 0.91 and 91.22%, respectively. The MAE, MRE, and MSE are 2.66, -0.83, and 14.7, and the RMSE, RRMSE, and MAPE are 3.83, 0.12, and 9.14, respectively. These results show that the RF model has good capability for forecasting the CS of concrete.
Comparison of Results
In the previous sections, the concrete CS predictions of the different ML models were analysed and compared with the actual compressive strength results tested in the laboratory. The prediction performance of the selected regression models is acceptable, and the models are applicable in real construction projects. However, selecting the best prediction model is very important for researchers and construction practitioners who will use it to predict the CS of concrete and to account for it in structural integrity. The outputs of the SVM, DT, KNN, and RF regression models are therefore compared using different statistical indicators, and the final model for forecasting the concrete CS is selected based on this performance. The comparison between the four selected regression models is presented in Table 7. In this comparison, four weightings (1, 2, 3, and 4) are first defined, from the lowest to the highest performance. For each statistical indicator, the outputs of the models are compared and ranked with the respective weightage. The weightages of each model are then added, and the models are compared by their total weightage. Table 7 presents the overall performance of all models with their performance weightage. It shows that the RF regression model has the highest weightage, followed by KNN, SVM, and DT. In this study, the RF regression model therefore shows better prediction performance than the other regression models. In addition, based on the R-squared and VAF values, the RF regression model displays a better capacity to forecast concrete CS than the other regression models.
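The weighting scheme described above can be sketched as a rank-sum calculation. The example uses the indices reported in this paper for the SVR, DT, and RF models; the KNN indices and Table 7 itself are not reproduced in this text, so KNN is left out and the weights run from 1 to 3 instead of 1 to 4. The signed MRE is also omitted because "better" is ambiguous for a signed quantity. The function name and the tie handling are assumptions of this sketch.

```python
# Indices as reported in the text for three of the four models.
reported = {
    "SVR": {"R2": 0.84, "VAF": 83.7, "MAE": 3.59, "MSE": 27.3,
            "RMSE": 5.22, "RRMSE": 0.16, "MAPE": 11.38},
    "DT": {"R2": 0.78, "VAF": 78.4, "MAE": 3.75, "MSE": 34.53,
           "RMSE": 5.87, "RRMSE": 0.18, "MAPE": 13.38},
    "RF": {"R2": 0.91, "VAF": 91.22, "MAE": 2.66, "MSE": 14.7,
           "RMSE": 3.83, "RRMSE": 0.12, "MAPE": 9.14},
}

# For these two indices a larger value is better; for the rest, smaller is better.
HIGHER_IS_BETTER = {"R2", "VAF"}

def total_weightage(scores):
    """Rank models per index (worst = 1) and sum the weights per model."""
    totals = {model: 0 for model in scores}
    metric_names = next(iter(scores.values())).keys()
    for metric in metric_names:
        # Order the models from worst to best for this index.
        worst_first = sorted(
            scores,
            key=lambda model: scores[model][metric],
            reverse=metric not in HIGHER_IS_BETTER,
        )
        for weight, model in enumerate(worst_first, start=1):
            totals[model] += weight
    return totals

totals = total_weightage(reported)
for model, weight in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(model, weight)
```

For these three models the ordering is RF, then SVR, then DT, which agrees with the ranking stated above.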
Table 7. Comparison Between Statistical Indices of Selected Regression Models
Conclusion
In this study, 776 laboratory-tested data records of concrete compressive strength were collected from different research papers. These data were comprehensively analysed using four machine learning approaches, namely the SVM, DT, KNN, and RF regression models. A comprehensive assessment of concrete CS was conducted based on these previously published datasets. All regression models showed good prediction capabilities. Thus, this study reveals that the ML approach can be applied to forecast the CS of concrete with high accuracy, less computational time, and very limited experimental effort. The following key conclusions emerged from this study.
The concrete CS is positively affected by the amount of cement and superplasticizer and by the curing period. In contrast, higher amounts of water, sand, coarse aggregate, and fly ash in concrete reduce its CS.
The RF regression model has the highest R-squared value, 0.91, and the lowest error, followed by KNN, SVM, and DT with R-squared values of 0.88, 0.84, and 0.78, respectively. Based on the comparison of all statistical indicators, the RF regression model has the highest weightage, followed by the KNN, SVM, and DT regression models.
All selected regression models have good prediction performance for concrete compressive strength. These approaches are therefore beneficial for the timely prediction of concrete compressive strength in real construction projects.
Among the selected regression models, the RF regression model has the best prediction performance, with the highest accuracy and the smallest error between the actual and forecasted outcomes. The RF regression model can therefore be applied in future research and in predicting the concrete CS during real construction. Moreover, it reduces the testing time and the cost of the project accordingly.
Acknowledgement
The author is grateful to the researchers whose datasets have been invaluable to the completion of this research work. Their accurate collection and documentation have significantly contributed to the depth of this research work.
Source of Funding: None
Conflict of Interest: None
There are no known conflicts of interest with the publication of this research article.