Uid - Bda Report

The document outlines a project aimed at enhancing revenue for a health care insurance company through the analysis of customer behavior and competitor data using Big Data tools. It details objectives such as creating data pipelines, increasing revenue, and improving customer understanding, along with methodologies for data processing, cleaning, and visualization. The conclusion emphasizes the importance of collaborative approaches in healthcare policy while suggesting future enhancements for real-time data processing and broader applications beyond healthcare.

Uploaded by

soumyasathishbhat0924


HEALTH CARE INSURANCE ANALYSIS

CHAPTER 1

INTRODUCTION

A health care insurance company is struggling to grow its revenue and to understand its
customers, so it wants to use Big Data tools and a user interface ecosystem to analyze
competitor company data received from a variety of sources, namely web scraping and
third-party providers. This analysis will help the company track the behavior and condition
of customers so that it can customize offers that encourage them to buy insurance policies,
and calculate royalties for customers who bought policies in the past. This in turn is
expected to enhance its revenue.

Health care insurance analysis involves the examination and evaluation of various aspects
related to health insurance policies, coverage, costs, and outcomes. As the healthcare landscape
continues to evolve, understanding and analyzing health care insurance is crucial for
individuals, healthcare providers, insurance companies, and policymakers.

This examination includes evaluating coverage options, understanding policy terms, analyzing
premium costs, and considering factors such as deductibles and copayments. By delving into
these components, individuals and organizations can optimize their health insurance choices,
ensuring adequate coverage while managing overall healthcare expenses. This process is
crucial for making informed decisions that align with one's health needs and
financial considerations.

Dept of ISE, DSATM 2023-2024 Page 1


CHAPTER 2

OBJECTIVES

1. The goal of the project is to create data pipelines for the health care insurance company.
These pipelines will help the company form appropriate business strategies to enhance its
revenue by analyzing customer behavior and sending offers and royalties to the respective
customers.

2. Increase Revenue: Leverage competitor data and customer insights to tailor insurance
policies and offers effectively, resulting in revenue growth.

3. Enhance Customer Understanding: Gain a deeper understanding of customer behavior,
preferences, and health conditions to provide more relevant and personalized insurance
solutions.


CHAPTER 3

PROJECT ARCHITECTURE

Fig 3.1: Architecture of “HEALTH CARE INSURANCE ANALYSIS”


CHAPTER 4

PROBLEM STATEMENT

Problem 1- Data Pre-processing, Enrichment and Load into Database

● Parse and infer the schema of the ingested data, which arrives in XML and CSV formats.

● Perform general data cleaning steps: replace empty strings with actual NULLs, check data
types (including date formats) and correct or reject bad values, check file names, check
for empty files, and check for and reject malformed records.
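
The cleaning steps listed above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the column names (`customer_id`, `name`, `dob`), the expected date format, and the helper `clean_rows` are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime

EXPECTED_COLUMNS = ["customer_id", "name", "dob"]  # hypothetical schema
DATE_FORMAT = "%Y-%m-%d"                            # assumed date format


def clean_rows(csv_text):
    """Split raw CSV text into (accepted, rejected) lists of records."""
    accepted, rejected = [], []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        # Empty file or unexpected header: reject the whole file.
        raise ValueError("empty file or unexpected header")
    for fields in reader:
        # Malformed record check: wrong number of fields.
        if len(fields) != len(EXPECTED_COLUMNS):
            rejected.append(fields)
            continue
        # Replace empty strings with an actual NULL (None in Python).
        row = {k: (v.strip() or None) for k, v in zip(EXPECTED_COLUMNS, fields)}
        # Data type checks: the id must be an integer, the date must parse.
        try:
            row["customer_id"] = int(row["customer_id"])
            if row["dob"] is not None:
                row["dob"] = datetime.strptime(row["dob"], DATE_FORMAT).date()
        except (TypeError, ValueError):
            rejected.append(fields)
            continue
        accepted.append(row)
    return accepted, rejected


# Made-up sample: one valid row, one bad date, one short row, one empty date.
raw = "customer_id,name,dob\n1,Asha,1990-05-17\n2,,1988-13-40\n3,Ravi\n4,Meera,\n"
good, bad = clean_rows(raw)
print(len(good), "accepted;", len(bad), "rejected")
```

Rejected records are kept in a separate list rather than silently dropped, so they can be inspected or logged later.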

Problem 2 - Data Analysis (Spark/Hive)

Once we have made the data ready for analysis, we have to perform analysis on a batch
basis.

● Schema Design for SQL Database:

Fig 4.1 describes the schema design for the SQL database that stores the information of “HEALTH
CARE INSURANCE ANALYSIS”.


CHAPTER 5

METHODOLOGY

User Interface Design: We designed the front end using Python's tkinter toolkit in
Visual Studio Code.

DATASET CREATION: A dataset is a collection of data, and can also consist of a collection
of documents or files. A database is an organized collection of structured information,
or data, typically stored electronically in a computer system. A database is usually controlled
by a database management system (DBMS).

DATA CLEANING: Data cleaning is the process of fixing or removing incorrect, corrupted,
incorrectly formatted, duplicate, or incomplete data within a dataset.

LOADING TO DATABASE: Data loading refers to the "load" component of ETL (extract,
transform, load). After data is retrieved and combined from multiple sources, cleaned, and
formatted, it is loaded into a storage system, such as a cloud data warehouse or a
relational database.
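
The load step can be illustrated with a short sketch. The report does not show the target tables at this point, so the example below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the real RDBMS; the `customers` table, its columns, and the sample values are hypothetical.

```python
import sqlite3

# Cleaned records as they might come out of the cleaning step (made-up values).
records = [
    (1, "Asha", "1990-05-17"),
    (4, "Meera", None),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real RDBMS
conn.execute(
    "CREATE TABLE customers ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dob TEXT)"
)
# Parameterized bulk insert; None is stored as SQL NULL.
with conn:
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)

loaded = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
null_dobs = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE dob IS NULL"
).fetchone()[0]
print(loaded, "rows loaded,", null_dobs, "with a NULL date of birth")
```

Using the connection as a context manager commits the inserts together, so a failure partway through does not leave a half-loaded table.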

HIVE: Apache Hive is data warehouse software built on top of Hadoop. It supports the
analysis of large datasets through a SQL-like query language (HiveQL), which makes it a
popular choice for routine big data analysis.

DATA VISUALIZATION:

Data visualization is the representation of data through common graphics, such as charts,
plots, infographics, and even animations.


CHAPTER 6

CODE TEMPLATES

Data Processing

6.1 Conversion of raw data to processed data:

For each raw file, we checked for null values, duplicate values, and other parameters, and
then converted it into a processed dataset. Here are some code samples.
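
As an illustration of the null and duplicate checks described above, the following sketch counts missing values per column and removes fully identical records. It is not the project's original code: the field names, sample records, and the helpers `null_counts` and `drop_duplicates` are assumptions made for the example.

```python
from collections import Counter

# Made-up raw records; None marks a missing value.
raw_records = [
    {"patient_id": 101, "disease": "Diabetes", "city": "Pune"},
    {"patient_id": 102, "disease": None, "city": "Mysuru"},
    {"patient_id": 101, "disease": "Diabetes", "city": "Pune"},  # duplicate
    {"patient_id": 103, "disease": "Asthma", "city": None},
]


def null_counts(records):
    """Count missing values per column."""
    counts = Counter()
    for record in records:
        for column, value in record.items():
            if value is None:
                counts[column] += 1
    return dict(counts)


def drop_duplicates(records):
    """Keep the first occurrence of each fully identical record."""
    seen, unique = set(), []
    for record in records:
        # Sort by column name only, so None values are never compared.
        key = tuple(sorted(record.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


processed = drop_duplicates(raw_records)
print(null_counts(raw_records))
print(len(raw_records), "->", len(processed), "records after de-duplication")
```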


6.2 Processed Dataset

Some snippets of the processed dataset, which is further used to create the RDBMS


6.3 Hive and Sqoop

We used Sqoop to import the data from the RDBMS into Hive, where we perform the tasks
needed to produce the outputs.

Here is the HEALTHCARE_SYSTEM Database created in Hive.

6.4


6.5 Apache Spark

After uploading the data into HDFS, we connected Spark and analyzed the data with
Python. The results are produced in tabular form and are then used to visualize our
use cases.


CHAPTER 7

OUTPUT SCREENS

Use Case-1: User Interface Design for Health Care Insurance Analysis.

Use Case-2: Average Monthly premium for each subgroup
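
The aggregation behind this use case can be sketched in plain Python. The project itself runs the analysis in Spark/Hive; the snippet below only illustrates the group-and-average logic, and the subgroup identifiers and premium values are made up.

```python
from collections import defaultdict

# Hypothetical (subgroup, monthly premium) pairs.
premiums = [
    ("S101", 1200.0),
    ("S101", 1800.0),
    ("S102", 950.0),
    ("S103", 2000.0),
    ("S103", 1000.0),
    ("S103", 1500.0),
]

totals = defaultdict(lambda: [0.0, 0])  # subgroup -> [sum, count]
for subgroup, premium in premiums:
    totals[subgroup][0] += premium
    totals[subgroup][1] += 1

average_premium = {sg: total / count for sg, (total, count) in totals.items()}
for subgroup, avg in sorted(average_premium.items()):
    print(subgroup, round(avg, 2))
```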


Use Case-3: Number of people whose claim either got accepted or rejected.

Use Case-4: Which disease has the maximum number of claims
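
Use Cases 3 and 4 both reduce to counting records by a category. The sketch below shows that counting logic with Python's `collections.Counter`; the claim records and the status labels (`accepted`, `rejected`) are hypothetical, and the project's own analysis runs in Spark/Hive rather than in plain Python.

```python
from collections import Counter

# Hypothetical claim records: (disease, claim status).
claims = [
    ("Diabetes", "accepted"),
    ("Asthma", "rejected"),
    ("Diabetes", "accepted"),
    ("Malaria", "accepted"),
    ("Diabetes", "rejected"),
    ("Asthma", "accepted"),
]

# Use Case 3: number of claims per status.
status_counts = Counter(status for _, status in claims)
# Use Case 4: disease with the largest number of claims.
disease_counts = Counter(disease for disease, _ in claims)
top_disease, top_count = disease_counts.most_common(1)[0]

print(dict(status_counts))
print(top_disease, top_count)
```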


Use Case-5: Which company/group is most profitable
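
The report does not state how profitability is measured. As a sketch under one possible assumption, the example below defines a group's profit as premiums collected minus claim amounts paid out, and picks the group with the highest total. The group names and amounts are made up.

```python
from collections import defaultdict

# Hypothetical ledger rows: (group, premium collected, claim amount paid out).
ledger = [
    ("GroupA", 50000.0, 20000.0),
    ("GroupB", 80000.0, 75000.0),
    ("GroupA", 30000.0, 10000.0),
    ("GroupC", 40000.0, 5000.0),
]

profit = defaultdict(float)
for group, premium, claim_paid in ledger:
    # Assumed definition: profit = premiums collected - claims paid.
    profit[group] += premium - claim_paid

most_profitable = max(profit, key=profit.get)
print(most_profitable, profit[most_profitable])
```

A different definition of profit (for example, one that includes operating costs) would change only the line inside the loop.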

Use Case-6: Average monthly premium paid by each subgroup


CHAPTER 7
CONCLUSION

We collected data from various third-party sources, processed it, and used Big Data
tools to compute and visualize the necessary use cases. Based on this analysis, the
health care insurance company can create a new business strategy to acquire more
customers, increase engagement, and send offers. The project also fetches company and
customer details and provides easy access to information about customers.

Balancing affordability, inclusivity, and quality of services is crucial for crafting effective
healthcare policies that cater to diverse needs. As healthcare systems continue to evolve, a
collaborative approach involving policymakers, insurers, and healthcare providers is
necessary to address emerging challenges and enhance the overall effectiveness of health
insurance programs.


CHAPTER 8

FURTHER ENHANCEMENTS

This project has considerable scope for future work in this field. We developed it to meet
our client's requirements, but it can be generalized in the future. Given the required
resources, we could obtain more accurate results. There are various use cases that this
project can support. Some future directions are listed below:

● Real-time data can be used for real-time processing.

● The whole procedure can be automated so that data is processed as soon as it arrives
from the sources.

● The procedure can be generalized beyond the healthcare industry to other sectors, such
as cars and online education systems.


