Untitled Document
Motivation:
Problem statement:
This project addresses these gaps by developing an automated invoice processing system that
uses Machine Learning (ML) and PySpark to streamline validation and classification. The
system combines a rule engine for business rule validation with a Gradient Boosting model
for classification, with the aim of improving scalability, accuracy, and efficiency. Using
PySpark, the solution processes invoices from SharePoint folders in real time, reducing
manual effort while remaining adaptable to diverse invoice formats, and thereby improving
the reliability of automated invoice management.
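
To make the rule-engine idea concrete, the following is a minimal sketch in plain Python, not the project's actual implementation. The invoice field names (`invoice_number`, `total`, `currency`), the three rules, and the helper names `validate` and `to_features` are all hypothetical and chosen only for illustration. It shows one possible way to express business rules as named checks and to encode their outcomes as 0/1 flags that a classifier such as a Gradient Boosting model could consume as input features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical invoice record; the field names are illustrative assumptions.
Invoice = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[Invoice], bool]  # returns True when the invoice passes


# Illustrative business rules (assumed, not the project's real rule set).
RULES: List[Rule] = [
    Rule("has_invoice_number", lambda inv: bool(inv.get("invoice_number"))),
    Rule(
        "positive_total",
        lambda inv: isinstance(inv.get("total"), (int, float)) and inv["total"] > 0,
    ),
    Rule("known_currency", lambda inv: inv.get("currency") in {"USD", "EUR", "INR"}),
]


def validate(invoice: Invoice, rules: List[Rule] = RULES) -> Dict[str, bool]:
    """Run every rule and return a pass/fail flag keyed by rule name."""
    return {rule.name: rule.check(invoice) for rule in rules}


def to_features(results: Dict[str, bool]) -> List[int]:
    """Encode rule outcomes as 0/1 values, ordered by rule name."""
    return [int(results[name]) for name in sorted(results)]


if __name__ == "__main__":
    sample = {"invoice_number": "INV-001", "total": 120.5, "currency": "EUR"}
    outcome = validate(sample)
    print(outcome)
    print(to_features(outcome))
```

Keeping each rule as a small named function means rules can be added or removed without touching the validation loop. In a PySpark setting the same checks could in principle be applied per row, though that wiring is outside the scope of this sketch.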
Research Objectives