Fraud Detection in Banking Data Using Machine Learning
• Regularizing gradient boosting library for C++, Java, Python, R, Julia, Perl, Scala.
• Developed by Tianqi Chen as a research project within the Distributed (Deep) Machine Learning Community (DMLC).
• Initial version: a terminal application configured through a LIBSVM (Library for Support Vector Machines) configuration file.
• Boosting algorithm based on gradient boosted trees.
• Reduces overfitting by adding a regularization term to the training objective that penalizes tree complexity.
• Utilizes parallel and distributed computing for faster model creation (Huang, 2014).
• Employs a sparsity-aware algorithm that skips missing values when computing split gain and routes them down a learned default branch.
• Applied in finance, healthcare, and e-commerce for tasks such as fraud detection, churn prediction, and credit risk modeling.
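The regularization and sparsity-aware points above can be illustrated with a small sketch. It assumes the library described is XGBoost and uses the regularized split gain from its published formulation, 1/2 [G_L^2/(H_L + lambda) + G_R^2/(H_R + lambda) - G^2/(H + lambda)] - gamma, where G and H are sums of first and second derivatives of the loss. The function names (leaf_score, split_gain, best_split) and the toy data are hypothetical, chosen for illustration only; this is a simplified single-feature version in plain Python, not the library's actual implementation.

```python
from typing import List, Optional, Tuple


def leaf_score(g: float, h: float, lam: float) -> float:
    """Structure score G^2 / (H + lambda); lambda shrinks the score of small leaves."""
    return g * g / (h + lam)


def split_gain(gl: float, hl: float, gr: float, hr: float,
               lam: float, gamma: float) -> float:
    """Regularized gain of splitting a node into left (gl, hl) and right (gr, hr)."""
    parent = leaf_score(gl + gr, hl + hr, lam)
    children = leaf_score(gl, hl, lam) + leaf_score(gr, hr, lam)
    return 0.5 * (children - parent) - gamma  # gamma is the cost of one extra leaf


def best_split(x: List[Optional[float]], grad: List[float], hess: List[float],
               lam: float = 1.0, gamma: float = 0.0
               ) -> Tuple[float, Optional[float], Optional[bool]]:
    """Return (gain, threshold, default_left) for one feature.

    Only non-missing values (not None) are sorted and scanned. Rows with a
    missing value are tried on each side as a block, and the better side
    becomes the default direction. Returns (0.0, None, None) if no split
    has positive gain.
    """
    g_total, h_total = sum(grad), sum(hess)
    present = sorted((v, g, h) for v, g, h in zip(x, grad, hess) if v is not None)
    g_miss = g_total - sum(p[1] for p in present)
    h_miss = h_total - sum(p[2] for p in present)

    best: Tuple[float, Optional[float], Optional[bool]] = (0.0, None, None)
    gl = hl = 0.0
    for i in range(len(present) - 1):
        value, g, h = present[i]
        gl += g
        hl += h
        next_value = present[i + 1][0]
        if next_value == value:
            continue  # no threshold fits between equal values
        threshold = (value + next_value) / 2.0
        # Option 1: missing rows go right, so the left child holds only scanned rows.
        gain_right = split_gain(gl, hl, g_total - gl, h_total - hl, lam, gamma)
        # Option 2: missing rows go left together with the scanned rows.
        gain_left = split_gain(gl + g_miss, hl + h_miss,
                               g_total - gl - g_miss, h_total - hl - h_miss,
                               lam, gamma)
        for gain, default_left in ((gain_right, False), (gain_left, True)):
            if gain > best[0]:
                best = (gain, threshold, default_left)
    return best


# Toy data: one feature with a missing entry, squared-error gradients at prediction 0.
x = [1.0, 2.0, None, 3.0, 4.0]
grad = [0.0, 0.0, -1.0, -1.0, -1.0]
hess = [1.0, 1.0, 1.0, 1.0, 1.0]
gain, threshold, default_left = best_split(x, grad, hess, lam=1.0, gamma=0.0)
print(gain, threshold, default_left)
```

In this toy case the missing row has the same gradient as the rows with large feature values, so sending it right gives the higher gain. Raising gamma makes every candidate split less attractive, which is one way the regularization term limits tree growth.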
Random Forest