Author Contributions
Conceptualization, S.D. and K.G.; Data curation, S.D.; Formal analysis, S.D.; Investigation, S.D.; Methodology, S.D. and J.C.; Project administration, K.G. and J.C.; Resources, J.C.; Software, S.D.; Supervision, K.G.; Validation, J.C.; Visualization, S.D.; Writing—original draft, S.D. and K.G.; Writing—review & editing, K.G. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The work in this study is covered under the ASU Knowledge Enterprise Development IRB protocol titled "Learner Effects in ALEKS" (STUDY00007974).
Informed Consent Statement
Not applicable.
Data Availability Statement
Restrictions apply to the availability of these data. Data were obtained from EdPlus and are available from the authors with the permission of EdPlus.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Rolfe, V. A Systematic Review of the Socio-Ethical Aspects of Massive Online Open Courses. Eur. J. Open Distance E-Learn. 2015, 18, 52–71.
- Kumar, J.A.; Al-Samarraie, H. An Investigation of Novice Pre-University Students’ Views towards MOOCs: The Case of Malaysia. Ref. Libr. 2019, 60, 134–147.
- Nagrecha, S.; Dillon, J.Z.; Chawla, N.V. MOOC dropout prediction: Lessons learned from making pipelines interpretable. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, 3–7 April 2017; pp. 351–359.
- Qiu, J.; Tang, J.; Liu, T.X.; Gong, J.; Zhang, C.; Zhang, Q.; Xue, Y. Modeling and Predicting Learning Behavior in MOOCs. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, San Francisco, CA, USA, 22–25 February 2016; pp. 93–102.
- Dalipi, F.; Imran, A.S.; Kastrati, Z. MOOC dropout prediction using machine learning techniques: Review and research challenges. In Proceedings of the IEEE Global Engineering Education Conference, Santa Cruz de Tenerife, Spain, 17–20 April 2018; pp. 1007–1014.
- Kim, T.-D.; Yang, M.-Y.; Bae, J.; Min, B.-A.; Lee, I.; Kim, J. Escape from infinite freedom: Effects of constraining user freedom on the prevention of dropout in an online learning context. Comput. Hum. Behav. 2017, 66, 217–231.
- Shah, D. By the Numbers: MOOCs in 2018. Class Central, 2018. Available online: https://www.classcentral.com/report/mooc-stats-2018/ (accessed on 16 December 2018).
- Feng, W.; Tang, J.; Liu, T.X. Understanding Dropouts in MOOCs. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 517–524.
- Hellas, A.; Ihantola, P.; Petersen, A.; Ajanovski, V.V.; Gutica, M.; Hynninen, T.; Knutas, A.; Leinonen, J.; Messom, C.; Liao, S.N. Predicting academic performance: A systematic literature review. In Proceedings of the Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education, Larnaca, Cyprus, 2–4 July 2018; pp. 175–199.
- Baker, R.S.; Yacef, K. The state of educational data mining in 2009: A review and future visions. J. Educ. Data Min. 2009, 1, 3–17.
- West, D.M. Big Data for Education: Data Mining, Data Analytics, and Web Dashboards. Gov. Stud. Brook. 2012, 4, 1–10.
- Lara, J.A.; Lizcano, D.; Martínez, M.A.; Pazos, J.; Riera, T. A system for knowledge discovery in e-learning environments within the European Higher Education Area – Application to student data from Open University of Madrid, UDIMA. Comput. Educ. 2014, 72, 23–36.
- Chakraborty, B.; Chakma, K.; Mukherjee, A. A density-based clustering algorithm and experiments on student dataset with noises using Rough set theory. In Proceedings of the IEEE International Conference on Engineering and Technology, Coimbatore, India, 17–18 March 2016; pp. 431–436.
- Chauhan, N.; Shah, K.; Karn, D.; Dalal, J. Prediction of student’s performance using machine learning. In Proceedings of the 2nd International Conference on Advances in Science & Technology, Mumbai, India, 8–9 April 2019.
- Salloum, S.A.; Alshurideh, M.; Elnagar, A.; Shaalan, K. Mining in Educational Data: Review and Future Directions. In Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020), Cairo, Egypt, 8–10 April 2020.
- Al-Shabandar, R.; Hussain, A.; Laws, A.; Keight, R.; Lunn, J.; Radi, N. Machine learning approaches to predict learning outcomes in Massive open online courses. In Proceedings of the International Joint Conference on Neural Networks, Anchorage, AK, USA, 14–19 May 2017; pp. 713–720.
- Baker, R.S.J.D.; Siemens, G. Educational Data Mining and Learning Analytics. In Cambridge Handbook of the Learning Sciences, 2nd ed.; Sawyer, R.K., Ed.; Cambridge University Press: New York, NY, USA, 2014; pp. 253–274.
- Fiaidhi, J. The Next Step for Learning Analytics. IT Prof. 2014, 16, 4–8.
- Gašević, D.; Rose, C.; Siemens, G.; Wolff, A.; Zdrahal, Z. Learning Analytics and Machine Learning. In Proceedings of the Fourth International Conference on Learning Analytics and Knowledge, Indianapolis, IN, USA, 24–28 March 2014; pp. 287–288.
- Liyanagunawardena, T.R.; Parslow, P.; Williams, S. Dropout: MOOC participants’ perspective. In Proceedings of the EMOOCs 2014, the Second MOOC European Stakeholders Summit, Lausanne, Switzerland, 10–12 February 2014; pp. 95–100.
- Jayaprakash, S.M.; Moody, E.W.; Lauría, E.J.M.; Regan, J.R.; Baron, J.D. Early alert of academically at-risk students: An open source analytics initiative. J. Learn. Anal. 2014, 1, 6–47.
- Márquez-Vera, C.; Cano, A.; Romero, C.; Noaman, A.Y.M.; Fardoun, H.M.; Ventura, S. Early dropout prediction using data mining: A case study with high school students. Expert Syst. 2016, 33, 107–124.
- Palmer, S. Modelling engineering student academic performance using academic analytics. Int. J. Eng. Educ. 2013, 29, 132–138.
- Papamitsiou, Z.; Economides, A. Learning analytics and educational data mining in practice: A systematic literature review of empirical evidence. Educ. Technol. Soc. 2014, 17, 49–64.
- Peña-Ayala, A. Educational data mining: A survey and a data mining-based analysis of recent works. Expert Syst. Appl. 2014, 41, 1432–1462.
- Zacharis, N.Z. A multivariate approach to predicting student outcomes in web-enabled blended learning courses. Internet High. Educ. 2015, 27, 44–53.
- Cen, L.; Ruta, D.; Powell, L.; Hirsch, B.; Ng, J. Quantitative approach to collaborative learning: Performance prediction, individual assessment, and group composition. Int. J. Comput. Collab. Learn. 2016, 11, 187–225.
- Mueen, A.; Zafar, B.; Manzoor, U. Modeling and Predicting Students’ Academic Performance Using Data Mining Techniques. Int. J. Mod. Educ. Comput. Sci. 2016, 8, 36.
- Huang, S.; Fang, N. Predicting student academic performance in an engineering dynamics course: A comparison of four types of predictive mathematical models. Comput. Educ. 2013, 61, 133–145.
- Marbouti, F.; Diefes-Dux, H.; Madhavan, K. Models for early prediction of at-risk students in a course using standards-based grading. Comput. Educ. 2016, 103, 1–15.
- Tempelaar, D.; Rienties, B.; Giesbers, B. In search for the most informative data for feedback generation: Learning analytics in a data-rich context. Comput. Hum. Behav. 2015, 47, 157–167.
- Kizilcec, R.F.; Pérez-Sanagustín, M.; Maldonado, J.J. Self-regulated learning strategies predict learner behavior and goal attainment in Massive Open Online Courses. Comput. Educ. 2017, 104, 18–33.
- Kuzilek, J.; Hlosta, M.; Herrmannova, D.; Zdrahal, Z.; Wolff, A. OU Analyse: Analysing at-risk students at the Open University. Learn. Anal. Rev. 2015, 8, 1–16.
- Wolff, A.; Zdrahal, Z.; Nikolov, A.; Pantucek, M. Improving retention: Predicting at-risk students by analysing clicking behaviour in a virtual learning environment. In Proceedings of the Third Conference on Learning Analytics and Knowledge, Leuven, Belgium, 8–12 April 2013.
- Hlosta, M.; Herrmannova, D.; Vachova, L.; Kuzilek, J.; Zdrahal, Z.; Wolff, A. Modelling student online behaviour in a virtual learning environment. arXiv 2018, arXiv:1811.06369.
- Cui, Y.; Chen, F.; Shiri, A. Scale up predictive models for early detection of at-risk students: A feasibility study. Inf. Learn. Sci. 2020, 121, 97–116.
- Soffer, T.; Cohen, A. Students’ engagement characteristics predict success and completion of online courses. J. Comput. Assist. Learn. 2019, 35, 378–389.
- Winne, P. Improving Measurements of Self-Regulated Learning. Educ. Psychol. 2010, 45, 267–276.
- Sha, L.; Looi, C.-K.; Chen, W.; Zhang, B. Understanding mobile learning from the perspective of self-regulated learning. J. Comput. Assist. Learn. 2012, 28, 366–378.
- Tang, J.; Xie, H.; Wong, T. A Big Data Framework for Early Identification of Dropout Students in MOOC. In Technology in Education. Technology-Mediated Proactive Learning; Springer: Berlin/Heidelberg, Germany, 2015.
- Amnueypornsakul, B.; Bhat, S.; Chinprutthiwong, P. Predicting attrition along the way: The UIUC model. In Proceedings of the EMNLP 2014 Workshop on Analysis of Large Scale Social Interaction in MOOCs, Doha, Qatar, 25–29 October 2014; pp. 55–59.
- Baker, R.; Evans, B.; Li, Q.; Cung, B. Does Inducing Students to Schedule Lecture Watching in Online Classes Improve Their Academic Performance? An Experimental Analysis of a Time Management Intervention. Res. High. Educ. 2018, 60, 521–552.
- Cicchinelli, A.; Veas, E.; Pardo, A.; Pammer-Schindler, V.; Fessl, A.; Barreiros, C.; Lindstädt, S. Finding traces of self-regulated learning in activity streams. In Proceedings of the 8th International Conference on Learning Analytics and Knowledge, Sydney, NSW, Australia, 7–9 March 2018; pp. 191–200.
- Lim, J.M. Predicting successful completion using student delay indicators in undergraduate self-paced online courses. Distance Educ. 2016, 37, 317–332.
- Park, J.; Denaro, K.; Rodriguez, F.; Smyth, P.; Warschauer, M. Detecting changes in student behavior from clickstream data. In Proceedings of the 7th International Learning Analytics & Knowledge Conference, Vancouver, BC, Canada, 13–17 March 2017.
- Gašević, D.; Dawson, S.; Rogers, T.; Gasevic, D. Learning analytics should not promote one size fits all: The effects of instructional conditions in predicting academic success. Internet High. Educ. 2016, 28, 68–84.
- Bozkurt, A.; Yazıcı, M.; Aydin, I.E. Cultural diversity and its implications in online networked learning spaces. In Research Anthology on Developing Effective Online Learning Courses; Information Resources Management Association, Ed.; IGI Global: Hershey, PA, USA, 2018; pp. 56–81.
- Baker, R.S.; Inventado, P.S. Educational Data Mining and Learning Analytics. In Learning Analytics; Springer: New York, NY, USA, 2014; pp. 61–75.
- Kőrösi, G.; Farkas, R. MOOC performance prediction by deep learning from raw clickstream data. In Proceedings of the International Conference on Advances in Computing and Data Sciences, Valletta, Malta, 24–25 April 2020; pp. 474–485.
- Hung, J.L.; Wang, M.C.; Wang, S.; Abdelrasoul, M.; Li, Y.; He, W. Identifying at-risk students for early interventions—A time-series clustering approach. IEEE Trans. Emerg. Top. Comput. 2015, 5, 45–55.
- Akçapınar, G.; Hasnine, M.N.; Majumdar, R.; Flanagan, B.; Ogata, H. Developing an early-warning system for spotting at-risk students by using eBook interaction logs. Smart Learn. Environ. 2019, 6, 4.
- Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46.
- Adnan, M.; Habib, A.; Ashraf, J.; Mussadiq, S.; Raza, A.A.; Abid, M.; Bashir, M.; Khan, S.U. Predicting at-Risk Students at Different Percentages of Course Length for Early Intervention Using Machine Learning Models. IEEE Access 2021, 9, 7519–7539.
- Costa, E.B.; Fonseca, B.; Santana, M.A.; De Araújo, F.F.; Rego, J. Evaluating the effectiveness of educational data mining techniques for early prediction of students’ academic failure in introductory programming courses. Comput. Hum. Behav. 2017, 73, 247–256.
- Cano, A.; Leonard, J.D. Interpretable Multiview Early Warning System Adapted to Underrepresented Student Populations. IEEE Trans. Learn. Technol. 2019, 12, 198–211.
- Burgos, C.; Campanario, M.L.; De La Peña, D.; Lara, J.A.; Lizcano, D.; Martínez, M.A. Data mining for modeling students’ performance: A tutoring action plan to prevent academic dropout. Comput. Electr. Eng. 2018, 66, 541–556.
- Gupta, S.; Sabitha, A.S. Deciphering the attributes of student retention in massive open online courses using data mining techniques. Educ. Inf. Technol. 2018, 24, 1973–1994.
- Praveena, M.; Jaiganesh, V. A Literature Review on Supervised Machine Learning Algorithms and Boosting Process. Int. J. Comput. Appl. 2017, 169, 32–35.
- Eranki, K.L.; Moudgalya, K.M. Evaluation of web based behavioral interventions using spoken tutorials. In Proceedings of the 2012 IEEE Fourth International Conference on Technology for Education, Hyderabad, India, 18–20 July 2012; pp. 38–45.
- Kanungo, T.; Mount, D.; Netanyahu, N.; Piatko, C.; Silverman, R.; Wu, A. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
- Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P. The KDD process for extracting useful knowledge from volumes of data. Commun. ACM 1996, 39, 27–34.
- Saa, A.A. Educational Data Mining & Students’ Performance Prediction. Int. J. Adv. Comput. Sci. Appl. 2016, 7, 212–220.
- Canfield, W. ALEKS: A Web-based intelligent tutoring system. Math. Comput. Educ. 2001, 35, 152.
- Craig, S.D.; Hu, X.; Graesser, A.C.; Bargagliotti, A.E.; Sterbinsky, A.; Cheney, K.R.; Okwumabua, T. The impact of a technology-based mathematics after-school program using ALEKS on student’s knowledge and behaviors. Comput. Educ. 2013, 68, 495–504.
- Fei, M.; Yeung, D.Y. Temporal models for predicting student dropout in massive open online courses. In Proceedings of the 2015 IEEE International Conference on Data Mining Workshop, Atlantic City, NJ, USA, 14–17 November 2015; pp. 256–263.
- Nanopoulos, A.; Alcock, R.; Manolopoulos, Y. Feature-based classification of time-series data. Int. J. Comput. Res. 2001, 10, 49–61.
- Doane, D.P.; Seward, L.E. Measuring skewness: A forgotten statistic? J. Stat. Educ. 2011, 19, 1–18.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Alamri, A.; Alshehri, M.; Cristea, A.; Pereira, F.D.; Oliveira, E.; Shi, L.; Stewart, C. Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week’s Activities. In Intelligent Tutoring Systems; Coy, A., Hayashi, Y., Chang, M., Eds.; Springer: Cham, Switzerland, 2019; pp. 163–173.
- Archer, K.J.; Kimes, R.V. Empirical characterization of random forest variable importance measures. Comput. Stat. Data Anal. 2008, 52, 2249–2260.
- Bradley, A.P. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 1997, 30, 1145–1159.
- Algarni, A. Data Mining in Education. Int. J. Adv. Comput. Sci. Appl. 2016, 7, 456–461.
- Goutte, C.; Gaussier, E. A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. In Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, Vol. 3408; Losada, D.E., Fernández-Luna, J.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; pp. 345–359.
- Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874.
- Sharkey, M.; Sanders, R. A process for predicting MOOC attrition. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 50–54.
- Bulathwela, S.; Pérez-Ortiz, M.; Lipani, A.; Yilmaz, E.; Shawe-Taylor, J. Predicting Engagement in Video Lectures. In Proceedings of the International Conference on Educational Data Mining, Ifrane, Morocco, 10–13 July 2020.
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
- Wang, J.; Xu, M.; Wang, H.; Zhang, J. Classification of imbalanced data by using the SMOTE algorithm and locally linear embedding. In Proceedings of the 2006 8th International Conference on Signal Processing, Guilin, China, 16–20 November 2006; Volume 3.
- Hong, B.; Wei, Z.; Yang, Y. Discovering learning behavior patterns to predict dropout in MOOC. In Proceedings of the 12th International Conference on Computer Science and Education, Houston, TX, USA, 22–25 August 2017; pp. 700–704.
- Wang, L.; Wang, H. Learning behavior analysis and dropout rate prediction based on MOOCs data. In Proceedings of the 2019 10th International Conference on Information Technology in Medicine and Education (ITME), Qingdao, China, 23–25 August 2019; pp. 419–423.
- Yousef, A.M.F.; Sumner, T. Reflections on the last decade of MOOC research. Comput. Appl. Eng. Educ. 2021, 29, 648–665.
- Aldowah, H.; Al-Samarraie, H.; Alzahrani, A.I.; Alalwan, N. Factors affecting student dropout in MOOCs: A cause and effect decision-making model. J. Comput. High. Educ. 2019, 32, 429–454.
Figure 1. Methodology. The three components are data handling, machine learning modeling, and model evaluation; the steps within each component are explained in this section.
Figure 2. The cumulative rate of learning progression of each student (left) and the rate of change in learning progression of each student (right).
Figure 3. The Data Preprocessing Method: Sequence Classification.
Figure 4. The Graph of Student Learning, expressed as topics mastered over time.
Figure 5. Correlation Matrix of Features.
Figure 6. Feature Importance Plot.
Figure 7. Correlation Matrix of Features after Feature Selection.
Figure 8. The ROC of the Model.
Figure 9. (a) Accuracy of the Model on Different Days; (b) Precision of the Model on Different Days; (c) Recall of the Model on Different Days; (d) F1-score of the Model on Different Days.
Figure 10. The SHAP Summary Plot for this Prediction Model.
Figure 11. SHAP Force Plot 1.
Figure 12. SHAP Force Plot 2.
Table 1. Distribution of Students in the Course.

| Class | Number of Students | Percentage |
|---|---|---|
| Complete | 396 | 12.50% |
| Dropout | 2,776 | 87.50% |
Table 2. Attributes in Dataset.

| Attribute | Description |
|---|---|
| Student ID | Student primary key |
| time_and_topics | The time taken and the topics mastered for a day |
| topics_mastered | The topics mastered for a day |
| topics_practiced | The topics practiced by the student for a day |
| time_spent | The time spent by the student for a day |
Table 3. Features Engineered in this Research.

- Average
- Standard Deviation
- Variance
- Skew
- Kurtosis
- Moving average with window size 2
- Moving average with window size 3
- Moving average with window size 4
- Overall Trajectory
- Final Trajectory
- Days in consideration
Table 4. A Sample Feature Table.

| Mov Avg 2 | Mov Avg 3 | Mov Avg 4 | Skew | Overall Trajectory | Final Trajectory | Average | Standard Deviation | Variance | Kurtosis | Day |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | −3 | 1 |
| 0 | 0 | 0 | 0 | 1.5707 | 1.5707 | 0 | 0 | 0 | −3 | 2 |
| 2 | 1.3333 | 0 | 0.707 | 1.5707 | 1.5707 | 1.3333 | 1.8856 | 3.5555 | −1.5 | 3 |
| 3.6055 | 2.4037 | 1.5 | 0.493 | 1.5707 | 0.4636 | 1.5 | 1.6583 | 2.75 | −1.3719 | 4 |
| 5.0249 | 4.3843 | 3.1324 | 0.152 | 1.5707 | 1.1902 | 2.2 | 2.0396 | 4.16 | −1.6268 | 5 |
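The exact implementation of the Table 3 features is not given in this section. The sketch below, assuming NumPy/SciPy and one student's daily topics-mastered counts, is a best-effort reading: the population moments (skew, standard deviation, variance, kurtosis) reproduce the day-3 row of Table 4, while the trajectory formulas are assumptions (Table 4's 1.5707 = π/2 values suggest angles of the learning curve). Function and key names such as `daily_features` are illustrative, not the authors' code.

```python
# A minimal sketch of the per-day feature engineering in Tables 3 and 4.
# Trajectory definitions are assumed, not taken from the paper.
import numpy as np
from scipy.stats import skew


def excess_kurtosis(x: np.ndarray) -> float:
    """Population excess kurtosis; a constant series yields -3, as in Table 4."""
    m2 = np.mean((x - x.mean()) ** 2)
    m4 = np.mean((x - x.mean()) ** 4)
    return float(m4 / m2**2 - 3.0) if m2 > 0 else -3.0


def daily_features(topics: np.ndarray, day: int) -> dict:
    """Cumulative features over days 1..day of one student's daily topic counts."""
    x = topics[:day].astype(float)
    return {
        "mov_avg_2": float(x[-2:].mean()) if day >= 2 else 0.0,
        "mov_avg_3": float(x[-3:].mean()) if day >= 3 else 0.0,
        "mov_avg_4": float(x[-4:].mean()) if day >= 4 else 0.0,
        "skew": float(skew(x)) if x.std() > 0 else 0.0,  # population skew, as in Table 4
        "overall_trajectory": float(np.arctan2(x.sum(), day)),  # assumed definition
        "final_trajectory": float(np.arctan2(x[-1], 1.0)),      # assumed definition
        "average": float(x.mean()),
        "std_dev": float(x.std()),  # ddof=0 (population), matching Table 4
        "variance": float(x.var()),
        "kurtosis": excess_kurtosis(x),
        "day": day,
    }
```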
Table 5. Target Values.

| Target Value | Number of Data Points |
|---|---|
| 0 | 39,529 |
| 1 | 2,776 |
Table 6. Target Values after SMOTE.

| Target Value | Number of Data Points |
|---|---|
| 0 | 39,529 |
| 1 | 39,529 |
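Tables 5 and 6 show the class balance before and after SMOTE (Chawla et al.). The following is a minimal sketch of that rebalancing step, assuming the imbalanced-learn implementation; the synthetic feature matrix and the `random_state` value are placeholders, not values from the paper.

```python
# A minimal sketch of the SMOTE rebalancing behind Tables 5 and 6,
# assuming the imbalanced-learn package; data below is a synthetic
# placeholder with the Table 5 imbalance (39,529 vs. 2,776 points).
from collections import Counter

import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
X = rng.normal(size=(42305, 11))  # 11 features, as in Table 4
y = np.zeros(42305, dtype=int)
y[:2776] = 1                      # 1 = dropout, matching Tables 1 and 5

X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X, y)
print(Counter(y_resampled))       # both classes at 39,529, as in Table 6
```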
Table 7. Random Forest Model Specifications.

| Argument | Value | Specification |
|---|---|---|
| n_estimators | 1000 | Number of trees |
| max_features | auto | sqrt (number of features) |
| random_state | 42 | Controls the randomness |
| criterion | gini | Gini impurity |
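Table 7 reads as a scikit-learn `RandomForestClassifier` configuration; a sketch under that assumption follows. Note that `max_features="auto"` was the legacy classifier alias for `sqrt(number of features)` and is spelled `"sqrt"` in current scikit-learn releases. The 25% hold-out is an inference from Table 8 (9883 + 9882 = 19,765 ≈ 25% of the 79,058 balanced points), not a choice stated in this section.

```python
# The Table 7 specification expressed as a scikit-learn classifier,
# trained on the SMOTE-balanced data from the previous sketch.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# A 25% hold-out reproduces the Table 8 test-set size (19,765 points);
# exact per-class counts depend on the split.
X_train, X_test, y_train, y_test = train_test_split(
    X_resampled, y_resampled, test_size=0.25, random_state=42
)

model = RandomForestClassifier(
    n_estimators=1000,    # number of trees
    max_features="sqrt",  # sqrt(number of features); "auto" in Table 7 is the legacy alias
    random_state=42,      # controls the randomness
    criterion="gini",     # Gini impurity
)
model.fit(X_train, y_train)
```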
Table 8. Testing Data Point Spread.

| Target Value | Number of Data Points |
|---|---|
| 0 | 9883 |
| 1 | 9882 |
Table 9. The Results of the Model Validation.

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 | 0.91 | 0.84 | 0.87 | 9883 |
| 1 | 0.85 | 0.91 | 0.88 | 9882 |
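Table 9 has the shape of scikit-learn's `classification_report`; the sketch below shows how such a breakdown, plus the AUC behind Figure 8's ROC curve, is typically produced, continuing from the fitted model above (names remain illustrative).

```python
# Evaluating the fitted model: per-class precision/recall/F1/support as in
# Table 9, and the AUC summarizing Figure 8's ROC curve.
from sklearn.metrics import classification_report, roc_auc_score

y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1 (dropout)
print("AUC:", roc_auc_score(y_test, y_score))
```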
Table 10. Input Values for SHAP Force Plot 1.

| Feature | Value |
|---|---|
| moving_average 2 | 0.7675 |
| skew | 0.7071 |
| overall trajectory | 0 |
| final trajectory | 1.5707 |
| average | 0.5117 |
| day | 3 |
Table 11. Input Values for SHAP Force Plot 2.

| Feature | Value |
|---|---|
| moving_average 2 | 35.2411 |
| skew | 0.3551 |
| overall trajectory | 0.0182 |
| final trajectory | 1.4272 |
| average | 5.7054 |
| day | 30 |
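Tables 10 and 11 list the feature values fed into the two SHAP force plots (Figures 11 and 12). A minimal sketch of generating such a plot with the `shap` library follows, continuing from the fitted forest above; the row index is illustrative, and the class-indexed return shape assumes the older `shap` API for tree ensembles.

```python
# A minimal sketch of producing the SHAP force plots in Figures 11 and 12,
# assuming the shap library and the random forest fitted above.
import shap

explainer = shap.TreeExplainer(model)
# Older shap API: for a binary classifier this returns one array per class;
# newer versions return a single array with a trailing class dimension.
shap_values = explainer.shap_values(X_test)

i = 0  # illustrative row index (e.g., an instance like Table 10's)
shap.force_plot(
    explainer.expected_value[1],  # base value for the positive (dropout) class
    shap_values[1][i],
    X_test[i],
    matplotlib=True,
)
```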
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.