Literature Review 2.1 Current Anti-Money Laundering (AML) and Fraud Detection Systems

Uploaded by

anampiuhillary
Copyright
© © All Rights Reserved

Results and Evaluation

In the results chapter, machine learning model performance outcomes are presented with
detailed metrics such as precision, recall and AUC-ROC. Comparisons are made to illustrate
the benefits of the proposed framework over typical approaches.
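
As an illustration of the metrics named above, the following minimal sketch computes precision, recall
and AUC-ROC in plain Python. The labels, scores and the 0.5 decision threshold are invented for the
example and are not results from this project; the function names are likewise hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc_roc(y_true, scores):
    """AUC as the chance that a random fraud case outscores a random legitimate one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    # Ties between a fraud score and a legitimate score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented labels (1 = fraud) and model scores, for illustration only.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.05, 0.6, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

precision, recall = precision_recall(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc:.3f}")
```

Precision and recall depend on the chosen threshold, whereas AUC-ROC summarises ranking quality across
all thresholds, which is one reason the two kinds of metric are usually reported together.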

Conclusion and Recommendations

The final chapter summarises the key findings, discusses the project's contributions to fraud
detection, and offers recommendations for future research and real-world applications. It also
discusses the challenges faced in the project and potential improvements.

References and Appendices

The document concludes with a full reference list of all cited work, formatted in the Harvard
referencing style. Supplementary materials such as technical diagrams, code snippets and
additional datasets are included in appendices.

This structured approach gives the reader a clear and coherent path through the project, from
identification of the problem, through development of the solution, to its evaluation.

2. Literature Review
2.1 Current Anti-Money Laundering (AML) and Fraud Detection Systems
Systems for fraud detection and anti-money laundering (AML) are key to the integrity of
financial institutions, but they have many limitations. Conventional systems depend heavily
on rule-based frameworks that use thresholds and static rules to detect unusual
transactions. These systems are simple but inflexible, and cannot evolve with the changing
tactics of fraudsters. Agorbia-Atta and Atalor (2024) state that these systems frequently fail
to recognize increasingly sophisticated schemes because they rely on historical patterns
rather than behaviors, and so cannot keep up with new types of fraud. This shortcoming has
made traditional AML systems reactive rather than proactive, leaving financial institutions
vulnerable.
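
A minimal sketch of such a static rule set is given below. The threshold, the country codes and the
`Transaction` fields are hypothetical values chosen for illustration, not rules taken from any cited
system.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    account: str
    amount: float
    country: str


# Hypothetical static rules; real institutions define their own.
AMOUNT_THRESHOLD = 10_000.0
HIGH_RISK_COUNTRIES = {"XX", "YY"}


def flag_transaction(tx):
    """Return the names of the static rules that a transaction triggers."""
    reasons = []
    if tx.amount >= AMOUNT_THRESHOLD:
        reasons.append("amount_over_threshold")
    if tx.country in HIGH_RISK_COUNTRIES:
        reasons.append("high_risk_country")
    return reasons


transactions = [
    Transaction("A", 12_500.0, "GB"),  # flagged, although it may be legitimate
    Transaction("B", 9_900.0, "GB"),   # just under the threshold, so not flagged
    Transaction("B", 9_900.0, "GB"),   # repeated by the same account, still not flagged
    Transaction("C", 150.0, "XX"),     # flagged on country alone
]
flags = [flag_transaction(tx) for tx in transactions]
for tx, reasons in zip(transactions, flags):
    print(tx.account, tx.amount, reasons)
```

Because each rule inspects one transaction against a fixed value, account B's repeated payments just
below the threshold pass unnoticed, which illustrates the inflexibility described above.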

Another key problem is that these systems tend to generate very high false positive rates.
Analysts working with rule-based systems can spend inordinate amounts of time investigating
'suspicious' transactions that turn out to be entirely legitimate. According to Agorbia-Atta
and Atalor (2024), these inefficiencies burden compliance teams and also increase operational
costs. Mallidi and Zagabathuni (2021) reached similar conclusions: rule-based models fail on
complicated datasets in which, for example, fraudulent transactions are rare compared with
legitimate ones, leading to a large and unbalanced set of flagged cases.
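
The effect of class imbalance on alert quality can be shown with simple arithmetic. In the sketch
below, the transaction counts, the rule's recall and its false positive rate are all assumed figures,
not measurements from the cited studies.

```python
def alert_breakdown(n_fraud, n_legit, recall, false_positive_rate):
    """Return (true_alerts, false_alerts, precision) for a flagging rule."""
    true_alerts = n_fraud * recall
    false_alerts = n_legit * false_positive_rate
    total = true_alerts + false_alerts
    precision = true_alerts / total if total else 0.0
    return true_alerts, false_alerts, precision


# Assumed scenario: 1,000 fraudulent transactions among 1,000,000,
# a rule that catches 80% of fraud and wrongly flags 2% of legitimate activity.
true_alerts, false_alerts, precision = alert_breakdown(
    n_fraud=1_000, n_legit=999_000, recall=0.8, false_positive_rate=0.02
)
print(f"true alerts: {true_alerts:.0f}")
print(f"false alerts: {false_alerts:.0f}")
print(f"share of alerts that are real fraud: {precision:.1%}")
```

Under these assumed figures, fewer than 4 in every 100 alerts correspond to real fraud, even though the
false positive rate looks small, because legitimate transactions vastly outnumber fraudulent ones.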

Another major challenge is scalability. With exponential growth in transaction volumes,
traditional systems struggle to ingest data in real time. According to Rabhi and Berry (2024),
legacy frameworks lack the computational performance required by modern financial systems,
particularly in the context of big data and real-time constraints. As a result, fraud detection
is delayed, exposing businesses to the risk of financial loss.

Together, these limitations call for more adaptive and scalable solutions. This project
attempts to address these gaps by using machine learning and big data technologies to deliver
a framework that combines robust predictive performance with operational efficiency.

2.2 Big Data Analytics in Fraud Detection


The high volume, velocity and variety of financial transactions necessitate the use of big data
analytics in fraud detection systems. Traditional systems cannot process vast datasets in real
time, so fraud can go undetected. Big data processing frameworks such as Hadoop and Apache
Spark address this challenge by allowing distributed data storage and real-time analytics.
Rabhi and Berry (2024) explain that these platforms provide the computational power needed to
analyze large datasets and identify anomalies in real time, thereby strengthening fraud
detection capabilities.

Hadoop, with its distributed file system (HDFS), is well suited to storing large-scale
transaction data. Through batch processing it can aggregate historical data, which is critical
to detecting long-term fraud patterns. Apache Spark complements Hadoop by using in-memory
computing to provide real-time data processing. Combined, these tools offer a unified framework
for analysing historical and real-time financial data streams together.
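
The division of labour between a batch layer and a real-time layer can be sketched in plain Python.
The example below is not Hadoop or Spark code: a batch function stands in for the historical
aggregation and a generator stands in for the stream. The account data, the z-score rule and its
threshold of 3 are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_profiles(history):
    """Batch step: per-account mean and standard deviation of past amounts."""
    amounts = defaultdict(list)
    for account, amount in history:
        amounts[account].append(amount)
    return {acc: (mean(vals), pstdev(vals)) for acc, vals in amounts.items()}


def score_stream(events, profiles, z_threshold=3.0):
    """Streaming step: yield (account, amount, is_anomalous) one event at a time."""
    for account, amount in events:
        profile = profiles.get(account)
        if profile is None or profile[1] == 0:
            # No usable history: this sketch simply does not flag the event.
            yield account, amount, False
            continue
        mu, sigma = profile
        yield account, amount, abs(amount - mu) / sigma > z_threshold


# Invented historical data, as a batch job might read from distributed storage.
history = [("A", 100.0), ("A", 120.0), ("A", 80.0), ("A", 110.0),
           ("B", 5000.0), ("B", 5200.0)]
profiles = build_profiles(history)

# Invented incoming events, scored as they arrive.
incoming = [("A", 105.0), ("A", 900.0), ("B", 5100.0)]
results = list(score_stream(incoming, profiles))
for account, amount, anomalous in results:
    print(account, amount, "ANOMALY" if anomalous else "ok")
```

The profiles are computed once over all history, while each incoming event is scored with a constant
amount of work, which is the property that makes the combined design attractive at scale.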

Another strength of big data platforms is that they can be used together with machine learning
models. For example, Spark's MLlib library can run algorithms such as Random Forest on large
tabular data, and XGBoost can be run on Spark through its own integration. This combination
improves the speed and accuracy of fraud detection, because machine learning models can be
trained and deployed at scale without loss of performance.

While they have many benefits, big data platforms also pose challenges, specifically regarding
regulatory compliance and data privacy. The General Data Protection Regulation (GDPR) imposes
strict requirements on the storage and processing of sensitive financial data, demanding robust
security measures. According to Rabhi and Berry (2024), these concerns must be addressed so
that big data technologies for fraud detection are used ethically.

This project addresses the scalability and real-time processing limitations of current systems
by incorporating big data analytics into its framework, with the aim of providing fast and
accurate fraud detection.

2.3 Machine Learning Techniques


Traditional rule-based fraud systems are insufficient, and machine learning (ML) has emerged as
a powerful technology with the potential to solve many of the problems in fraud detection.
Static frameworks can detect obvious fraud schemes but cannot identify complex patterns or new,
evolving schemes, whereas ML algorithms learn patterns from data. Mallidi and Zagabathuni (2021)
claim that ML models are very good at finding fraudulent activity in financial datasets even
though fraud occurs very rarely in them.

Among the most widely used algorithms for fraud detection are Random Forest (RF) and XGBoost,
because of their flexibility and speed. Random Forest, an ensemble method, uses multiple
decision trees to classify transactions; it is robust against overfitting and copes well with
high-dimensional data. Mallidi and Zagabathuni (2021) showed that RF outperforms traditional
models in terms of precision and recall, which makes it suitable for datasets with imbalanced
classes.
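
The ensemble idea behind Random Forest can be sketched as follows. This is a deliberately simplified
illustration rather than a production implementation: each "tree" is a depth-one decision stump,
trained on a bootstrap sample with a random subset of features, and the forest predicts by majority
vote. The two features (amount, transactions in the last hour) and all data values are invented.

```python
import random
from collections import Counter


def majority(labels, default=0):
    """Most common label, or `default` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(rows, labels, features):
    """Find the single split (feature, threshold) with the fewest training errors."""
    best = None
    for f in features:
        for thr in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            left_label, right_label = majority(left), majority(right)
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, left_label, right_label)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each restricted to a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    k = max(1, int(n_features ** 0.5))  # common heuristic: sqrt of the feature count
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return majority(votes)


# Invented data: (amount, transactions in the last hour); label 1 = fraud.
rows = [(20.0, 1.0), (35.0, 2.0), (50.0, 1.0), (40.0, 3.0), (25.0, 2.0), (60.0, 1.0),
        (900.0, 9.0), (1200.0, 8.0), (800.0, 10.0)]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict(forest, (1500.0, 12.0)), predict(forest, (10.0, 1.0)))
```

Averaging many trees that each see a different sample and feature subset is what gives the method its
resistance to overfitting; a full implementation would grow deeper trees and re-sample features at
every split.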

XGBoost is a gradient boosting algorithm that builds decision trees sequentially, each one
correcting the errors of those before it, to minimize classification errors. It is well suited
to large financial datasets because of its computational efficiency and its capacity to capture
intricate relationships in the data. Results by Dhanawat (2022) also showed that XGBoost
performs well in anomaly detection, obtaining greater accuracy and lower false positive rates
than conventional approaches.
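
The sequential principle can be illustrated with a small sketch of generic gradient boosting. It is
not XGBoost itself, which adds regularisation and other refinements: here each round fits a depth-one
regression stump to the current residuals under squared error, on a single invented feature. The
learning rate, the number of rounds and the data are assumptions.

```python
def fit_stump(xs, residuals):
    """Best single split on one feature by squared error: (threshold, left, right)."""
    best = None
    for thr in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        left_value = sum(left) / len(left)  # left is never empty: thr is one of xs
        right_value = sum(right) / len(right) if right else 0.0
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_value, right_value)
    return best[1:]


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.3):
    """Start from the mean label, then repeatedly fit a stump to what is still wrong."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((thr, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= thr else right_value)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict_score(model, x):
    """Sum the base value and every stump's scaled correction."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * (lv if x <= thr else rv) for thr, lv, rv in stumps)


# Invented data: transaction amount and label (1 = fraud).
xs = [20.0, 35.0, 50.0, 40.0, 900.0, 1200.0, 800.0, 30.0]
ys = [0, 0, 0, 0, 1, 1, 1, 0]
model = fit_boosting(xs, ys)
print(round(predict_score(model, 1000.0), 3), round(predict_score(model, 25.0), 3))
```

Each round removes a fixed fraction of the remaining error, so the score for a large amount climbs
towards 1 and the score for a small amount falls towards 0 as rounds accumulate.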

Although these algorithms perform well in fraud detection, they need careful preprocessing to
handle imbalanced data. The Synthetic Minority Oversampling Technique (SMOTE) is an important
tool for addressing imbalance during training (Rai et al., 2024): it ensures that the minority
class, here fraudulent transactions, is sufficiently represented. In this way, the models do
not become biased towards the majority class and can better detect rare fraudulent activity.
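
The interpolation step at the heart of SMOTE can be sketched as below. This is a simplified version
written for illustration: it picks a random minority sample, chooses one of its k nearest minority
neighbours, and places a new point somewhere on the line between them. The function name, the choice
of k = 2 and the data are assumptions, and the features are left unscaled for brevity.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Invented minority class: (amount, transactions in the last hour) for fraud cases.
fraud = [(900.0, 9.0), (1200.0, 8.0), (800.0, 10.0)]
legit_count = 12  # assumed size of the majority class in the training set

new_points = smote_sketch(fraud, n_synthetic=legit_count - len(fraud))
balanced_fraud = fraud + new_points
print(len(balanced_fraud), "fraud samples after oversampling")
```

Oversampling of this kind is normally applied to the training split only, so that evaluation still
reflects the true class balance.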

In addition, ML models can be scaled. Integrated with big data platforms like Apache Spark,
these algorithms can process huge transaction datasets in real time, which makes them
indispensable to modern financial systems. Running ML tasks on Spark is convenient because
Spark's MLlib executes training and prediction in parallel, cutting down processing time.

Though promising, ML models also come with problems. Performance can be compromised by
overfitting, when a model learns noise rather than meaningful patterns. Moreover, implementing
ML algorithms demands heavy computational resources and domain expertise. Keskar (2020)
highlighted that choosing good hyperparameters and cross-validating carefully are crucial to
getting the best out of a model.
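
A minimal sketch of the cross-validation step is shown below. Stratification, which keeps the fraud
rate similar in every fold, is an assumption added here because it suits imbalanced data; it is not
a procedure attributed to Keskar (2020). The labels and the choice of four folds are invented.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_indices, test_indices) with each class spread evenly over k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)  # deal each class out round-robin
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


# Invented labels: 16 legitimate transactions and 4 fraudulent ones.
labels = [0] * 16 + [1] * 4
splits = list(stratified_k_fold(labels, k=4))
for train_idx, test_idx in splits:
    fraud_in_test = sum(labels[i] for i in test_idx)
    print(f"train={len(train_idx)} test={len(test_idx)} fraud in test={fraud_in_test}")
```

Each candidate hyperparameter setting would then be trained on every training split and scored on the
matching test split, with the average score used to compare settings.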

In this project, the strengths of Random Forest, XGBoost and other ML models are harnessed to
create a robust fraud detection framework. By combining these algorithms with big data
technologies, the proposed system aims to achieve high accuracy, scalability and efficiency in
detecting fraudulent transactions.
