DS in RME Using RM-Adi W
Scholarly Communication
Dr. Adi Wijaya, MKom

Contact:
• Telegram: @adiwjj
• Email: [email protected]
• Mobile/WhatsApp: 0838-789-19-456

I am a lecturer and research fellow at the Universitas Indonesia Maju (UIMA), Jakarta. I am also involved in IT projects pertaining to data science, data governance, enterprise architecture, and software development. I received a Doctorate in Electrical Engineering (2021) from the Dept. of Electrical and Information Engineering, Universitas Gadjah Mada. My research topics are information processing (including bibliometric analysis), data mining and machine learning, and health-related informatics, including brain-computer interfaces.

Certifications and training:
• App. & Use Case Pro & Master
• Machine Learning Pro & Master
• Data Engineering Pro
• R Programming
• Data Mining
• Data Scientist
• Project Management
• Software Testing
• SDLC
• Data Governance Foundation
• Certified ATLAS.ti Prof. Trainer Junior
• Project Management Essentials Certified
• Scrum Fundamentals Certified
Source:
Zieliński JS. (2017) New Informatics Tools in Data Management, The Xth SIGSAND/PLAYS EuroSymposium 2017
[Figure: Big Data, Big Data Analytics, and Data Science, with the labels Velocity, Variety, and Value]
[email protected] Lectures and Talks in Scholarly Communication
Data Mining in Big Data Analytics
Source:
Siguenza-Guzman, L., Saquicela, V., Avila-Ordóñez, E., Vandewalle, J., & Cattrysse, D. (2015).
Literature Review of Data Mining Applications in Academic Libraries. The Journal of Academic Librarianship, 41(4), 499–510.
Data Mining Process
CRISP-DM
Knowledge (Pattern/Model)
►Formula/Function (regression formula or function)
TRAVEL TIME = 0.48 + 0.6 DISTANCE + 0.34 TRAFFIC LIGHTS + 0.2 ORDERS
►Correlation level
►Rule
IF ips3=2.8 THEN lulustepatwaktu (if the third-semester GPA is 2.8, the student graduates on time)
►Cluster
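The formula and rule patterns above can be read as small functions. The following is a minimal sketch and not part of the original slides: the function names are assumptions, the parameter names are English renderings of the terms in the formula (distance, traffic lights, orders), and the rule is implemented exactly as written, as an equality test on ips3 with no ELSE branch.

```python
def travel_time(distance, traffic_lights, orders):
    """Regression pattern: 0.48 + 0.6*distance + 0.34*traffic_lights + 0.2*orders."""
    return 0.48 + 0.6 * distance + 0.34 * traffic_lights + 0.2 * orders


def graduates_on_time(ips3):
    """Rule pattern: IF ips3=2.8 THEN lulustepatwaktu.

    The rule has no ELSE branch, so any other value returns None
    (meaning that no rule fires).
    """
    return True if ips3 == 2.8 else None


if __name__ == "__main__":
    # Illustrative inputs only; they do not come from the slides.
    print(travel_time(distance=3, traffic_lights=2, orders=1))
    print(graduates_on_time(2.8))
```

A learned model of this kind is only a function from input attributes to an output, which is why the same pattern can be applied to new records once it has been found.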
Evaluation (Accuracy, Error, etc.)
► Estimation:
Error: Root Mean Square Error (RMSE), MSE, MAPE, etc.
► Prediction/Forecasting:
Error: Root Mean Square Error (RMSE), MSE, MAPE, etc.
► Classification:
Confusion Matrix: Accuracy
ROC Curve: Area Under Curve (AUC)
► Clustering:
Internal Evaluation: Davies–Bouldin index, Dunn index, etc.
External Evaluation: Rand measure, F-measure, Jaccard index, Fowlkes–Mallows index, Confusion matrix
► Association:
Lift Charts: Lift Ratio
Precision and Recall (F-measure)
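As an illustration of the error and accuracy measures listed above, here is a minimal sketch in plain Python. The function names and the sample values are assumptions made for this example; in practice a tool or library would normally compute these measures.

```python
import math


def mse(actual, predicted):
    """Mean squared error: the average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def accuracy(tp, tn, fp, fn):
    """Share of correct predictions in a binary confusion matrix."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Made-up values, used only to exercise the functions.
    actual = [10, 20, 30]
    predicted = [12, 18, 33]
    print(mse(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
    print(accuracy(tp=40, tn=50, fp=5, fn=5))
```

RMSE is expressed in the same unit as the target variable, while MAPE is unit-free, which is why both are often reported together for estimation and forecasting.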
Data Mining in Electronic Medical Record
Scripting/Programming
AutoML tools/packages/libraries:
► MLBox [Python]
► Auto-Sklearn [Python]
► Cloud AutoML [Google Cloud]
► TPOT
► Auto-Keras
► DataRobot
► BigML
► H2O AutoML
► RapidMiner AutoML
Image source:
https://medium.datadriveninvestor.com/everything-you-want-to-know-about-automated-machine-learning-pipeline-df9e44612ff
RapidMiner:
Visionaries quadrant
Completeness of vision: high
Ability to execute: medium
Alongside other popular free tools such as KNIME and H2O.ai
Training: read training data → pre-processing → modeling
Testing: read testing data → pre-processing → apply model → results
Note:
We train again (re-train) when we want to improve the model, for example by adding more training data, so that the model performs better.
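The training and testing flows, together with the re-training note, can be sketched in plain Python. This is only an illustration under assumptions: the slides build the process in RapidMiner, whereas here a one-feature threshold rule (a decision stump) stands in for the model, and the function names and the tiny data set are invented for the example.

```python
def preprocess(rows):
    """Pre-processing step: drop records whose feature value is missing."""
    return [(x, y) for x, y in rows if x is not None]


def train(rows):
    """Modeling step: pick the threshold t for the rule
    'predict 1 if x <= t else 0' that makes the fewest training errors."""
    best_t, best_err = None, None
    for t in sorted({x for x, _ in rows}):
        err = sum((1 if x <= t else 0) != y for x, y in rows)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def apply_model(threshold, rows):
    """Apply-model step: predict a label for every record."""
    return [1 if x <= threshold else 0 for x, _ in rows]


# Hypothetical (feature value, label) pairs, invented for illustration only.
training_data = [(20, 1), (25, 1), (None, 0), (30, 1), (40, 0), (50, 0), (60, 0)]
testing_data = [(22, 1), (35, 0), (55, 0)]

# Training flow: read training data -> pre-processing -> modeling.
model = train(preprocess(training_data))

# Testing flow: read testing data -> pre-processing -> apply model -> results.
predictions = apply_model(model, preprocess(testing_data))
print(model, predictions)

# Re-training: repeat the training flow with more training data.
more_training_data = training_data + [(35, 1), (38, 1)]
retrained_model = train(preprocess(more_training_data))
print(retrained_model)
```

The same pre-processing is applied in both flows on purpose: a model can only be applied reliably to data prepared in the same way as the data it was trained on.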
Dataset URL:
https://archive.ics.uci.edu/ml/datasets/Heart+failure+clinical+records
Sensitivity = 68.42%
Specificity = 92.68%
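Both figures are ratios taken from a binary confusion matrix. The sketch below shows the arithmetic. The counts are an assumption: they were chosen only because they round to the reported percentages, and the actual confusion matrix is not given in the text.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts that merely reproduce the reported percentages;
# they are not taken from the original results.
tp, fn, tn, fp = 13, 6, 38, 3

print(f"Sensitivity = {sensitivity(tp, fn):.2%}")
print(f"Specificity = {specificity(tn, fp):.2%}")
```

Sensitivity measures how many of the actual positive cases the model finds, and specificity measures how many of the actual negative cases it correctly leaves alone.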
Comparison with previous study
Ejection_fr = 25
Prediction 1
Decision = 1
Lectures and Talks in
Scholarly Communication
Thank you…