Problem Statement - Data Analytics

The Bajaj Finserv Health Datathon challenges participants to develop algorithms for detecting fraudulent insurance claim documents, focusing on various types of forgeries such as scribbling, digital manipulation, and whitener use. The project emphasizes creating a user-friendly interface for document uploads and visualizing detected forgeries while ensuring efficiency and scalability. Participants can earn bonus points for additional features like language analysis and reducing false positives, with the freedom to choose their technology stack and datasets.

Uploaded by

jaimaabharati102

Bajaj Finserv Health - Datathon

Problem Statement

Description

The challenge is to detect potentially fraudulent insurance claim documents, such as medical
invoices, prescriptions, and lab test reports, which are received from various providers and
customers. Fraudulent claimants exploit identical digital or printed templates with minor
modifications, making standard document comparison ineffective due to background variations.
To streamline processes and improve efficiency, we aim to develop a robust algorithm to identify
instances of potential fraud by automating background noise removal, enabling document
standardization and accurate content comparison.
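
As an illustration of background removal followed by content comparison, the sketch below binarizes two scans with Otsu's threshold and measures how many pixels differ in the resulting ink masks. This is only one possible approach, not a method prescribed by the problem statement. It assumes grayscale images given as lists of rows with values 0 to 255, already aligned to the same template and size; the function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level t that best separates pixels <= t from pixels > t."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels at or below the candidate threshold
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def ink_mask(image):
    """Binarize an image: True where a pixel is dark enough to count as ink."""
    threshold = otsu_threshold([p for row in image for p in row])
    return [[p <= threshold for p in row] for row in image]


def diff_ratio(image_a, image_b):
    """Fraction of pixels whose ink/background label differs between two scans."""
    mask_a, mask_b = ink_mask(image_a), ink_mask(image_b)
    total = sum(len(row) for row in mask_a)
    differing = sum(
        a != b for row_a, row_b in zip(mask_a, mask_b) for a, b in zip(row_a, row_b)
    )
    return differing / total


# Same content on differently shaded backgrounds, plus one tampered copy.
original = [[200, 200, 200, 200], [200, 30, 30, 200], [200, 200, 200, 200]]
rescanned = [[240, 240, 240, 240], [240, 60, 60, 240], [240, 240, 240, 240]]
tampered = [[240, 240, 240, 240], [240, 60, 60, 60], [240, 240, 240, 240]]

same_content = diff_ratio(original, rescanned)     # 0.0: only the background shade changed
changed_content = diff_ratio(original, tampered)   # one of twelve pixels differs
```

Real scans would also need alignment (deskewing, cropping) before a pixel-wise comparison like this is meaningful.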

The objective of this problem statement is to develop a comprehensive forgery detection
algorithm for printed or handwritten scanned documents and for digitally generated documents. The
tool will focus on detecting and highlighting four main types of document forgery: scribbling or
overwriting, digital forgery, data manipulation, and whitener-based manipulation.

Key Objectives

1. Participants can choose to work on one or more of the forgery types below.

• Scribbling or Overwriting Detection (20 Points)
o Develop an algorithm to detect regions where critical data, such as date,
customer name, amount, and invoice number, has been scribbled over or
overwritten on the document.
• Digital Forgery Detection (20 Points)
o Implement a mechanism to identify digitally edited or tampered regions
within the document. This includes detecting parts of the document edited
using image editing applications and identifying entirely digitally created
documents.
• Data Manipulation Detection (20 Points)
o Develop an algorithm to detect data manipulation, where certain parts of
the document have been added or removed, specifically focusing on critical
fields like amounts, dates, and other important information.
• Whitener Detection (20 Points)
o Implement an algorithm to detect areas where manipulation has occurred
using a whitener, aiming to identify portions of the document that have
been altered using correction fluids or similar methods.
• Any further type of forgery/tampering detection can fetch you 30 extra points.
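
To make the whitener item more concrete, here is a minimal sketch of one possible heuristic. It assumes that correction fluid shows up as patches that are both brighter and flatter (lower variance) than the surrounding paper, which will not hold for every scan. The block size and thresholds are illustrative values, and the function name is hypothetical.

```python
from statistics import mean, median, pvariance


def whitener_candidates(image, block=4, brightness_margin=20, max_variance=25.0):
    """Return top-left (row, col) of blocks that are brighter and flatter than the page.

    `image` is a grayscale page given as a list of rows with values 0 to 255.
    """
    # The median gray level is used as a rough estimate of the paper color.
    page_level = median([p for row in image for p in row])
    hits = []
    for r in range(0, len(image) - block + 1, block):
        for c in range(0, len(image[0]) - block + 1, block):
            values = [
                image[y][x]
                for y in range(r, r + block)
                for x in range(c, c + block)
            ]
            is_bright = mean(values) >= page_level + brightness_margin
            is_flat = pvariance(values) <= max_variance
            if is_bright and is_flat:
                hits.append((r, c))
    return hits


# An 8x8 page of off-white paper (210) with a 4x4 patch of pure white (250).
page = [[210] * 8 for _ in range(8)]
for y in range(4, 8):
    for x in range(4, 8):
        page[y][x] = 250

suspicious_blocks = whitener_candidates(page)  # [(4, 4)]
```

The flagged blocks would still need a second check (for example, whether text is missing or rewritten on top of them) before being reported as forgery.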

2. Visualization of Detected Forgeries (20 Points)

a. Create a visualization interface that highlights the detected regions of forgery,
their type and accuracy/confidence. The visualization should clearly distinguish
the types of forgery detected.
3. Efficiency and Scalability (20 Points): Ensure that the solution is efficient and scalable,
capable of processing various types of documents with differing complexities.

4. User-Friendly Interface (10 Points): Design an intuitive and user-friendly interface for
users to upload documents, view the detected forgeries, and access the visualized
results.
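
One way the detection results could be handed to the visualization interface is sketched below: each detection carries its forgery type, region, and confidence, and each type maps to its own color so the overlays stay distinguishable. The record layout, type names, and colors are assumptions made for illustration, not part of the problem statement.

```python
from dataclasses import dataclass

# Hypothetical RGB color per forgery type, so each type is drawn differently.
COLORS = {
    "scribbling": (255, 0, 0),
    "digital_forgery": (0, 0, 255),
    "data_manipulation": (255, 165, 0),
    "whitener": (0, 160, 0),
}
FALLBACK_COLOR = (128, 128, 128)  # used for any additional forgery type


@dataclass(frozen=True)
class Detection:
    kind: str          # forgery type, e.g. "whitener"
    box: tuple         # (x, y, width, height) in pixels
    confidence: float  # 0.0 to 1.0


def to_overlay(detection):
    """Convert a detection into drawing instructions a UI layer could render."""
    return {
        "box": detection.box,
        "color": COLORS.get(detection.kind, FALLBACK_COLOR),
        "label": f"{detection.kind} ({detection.confidence:.0%})",
    }


overlay = to_overlay(Detection("whitener", (40, 120, 90, 24), 0.87))
# overlay["label"] is "whitener (87%)"
```

Keeping the overlay as plain data means the same records can feed a web front end, a desktop viewer, or an annotated image export.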

Bonus Points (20 Points):

5. Language and Font Analysis: Implement a feature to detect inconsistencies in
language, font styles, or character sizes within the document, which may indicate
potential forgeries.
6. False Positive Reduction: Implement methods to minimize false positives in forgery
detection, ensuring high precision and reliability.
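
For the false positive item, a simple post-processing step could look like the sketch below. It drops low-confidence detections and, where several boxes overlap heavily, keeps only the most confident one. The thresholds are illustrative and would need tuning on labeled data; the function names and the (box, score) layout are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def filter_detections(detections, min_confidence=0.6, max_overlap=0.5):
    """Keep confident detections; among overlapping ones keep the most confident.

    `detections` is a list of (box, score) pairs.
    """
    confident = [d for d in detections if d[1] >= min_confidence]
    kept = []
    # Visit the strongest detections first so weaker duplicates are the ones dropped.
    for box, score in sorted(confident, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= max_overlap for kept_box, _ in kept):
            kept.append((box, score))
    return kept


raw = [
    ((10, 10, 50, 20), 0.9),    # strong detection
    ((12, 11, 50, 20), 0.7),    # near-duplicate of the first
    ((200, 40, 30, 30), 0.4),   # below the confidence threshold
    ((300, 300, 20, 20), 0.8),  # separate strong detection
]
cleaned = filter_detections(raw)
# cleaned keeps the first and last entries only
```

Raising `min_confidence` trades recall for precision, so the value should be chosen against a validation set rather than fixed in advance.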

7. Dataset:
a. Participants are encouraged to create their own dataset or use any publicly
available datasets for training and testing their forgery detection algorithms.
A few sample documents are shared for reference.

8. Technology Stack:
a. Participants are free to choose any open-source tools, programming languages,
frameworks, or libraries for development. The solution should be deployable on
a standard machine or as a web application.
9. Ethical Considerations:
a. Ensure that the solution adheres to ethical guidelines, respects privacy, and
does not promote harmful or unethical use.

Sample Input and Output

Input 1, Output 1, Input 2, Output 2 (sample document images, not reproduced here)
