Week 10
Problem Description
One of the challenges for pharmaceutical companies is to
understand the persistency of a drug as per the physician's
prescription. Poor persistency has a negative impact on
pharmacies across all categories: patients, physicians, and
administration. The data science team can analyze the dataset
and detect the factors that affect the primary outcome, which is
"persistency". By building a classification machine learning
model, we will be able to classify the records and find the
variables that affect the target variable, "Persistency Flag".
EDA performed on data
Dataset:
We checked the data and did not find any null values.
Unknown Values:
However, we found many “Unknown” values. We treated them as
null values and decided to remove them because they can affect
the results of our ML models.
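The "Unknown" handling described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the team's actual code: the rows, the column names, and the helper names `count_unknowns` and `drop_unknown_rows` are assumptions made up for the example.

```python
# Hypothetical rows standing in for the real dataset (illustrative only).
rows = [
    {"Gender": "Female", "Ethnicity": "Not Hispanic", "Ntm_Speciality": "Rheumatology"},
    {"Gender": "Male", "Ethnicity": "Unknown", "Ntm_Speciality": "Unknown"},
    {"Gender": "Female", "Ethnicity": "Hispanic", "Ntm_Speciality": "Unknown"},
    {"Gender": "Female", "Ethnicity": "Not Hispanic", "Ntm_Speciality": "Endocrinology"},
]


def count_unknowns(rows, token="Unknown"):
    """Count how often `token` appears in each column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value == token)
    return counts


def drop_unknown_rows(rows, token="Unknown"):
    """Keep only the rows in which no column holds `token`."""
    return [row for row in rows if token not in row.values()]


unknown_counts = count_unknowns(rows)
clean_rows = drop_unknown_rows(rows)
print(unknown_counts)
print(len(rows), "rows before,", len(clean_rows), "rows after")
```

Counting the "Unknown" entries per column before dropping rows shows how much data the removal costs, which is worth checking when many columns contain the placeholder.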
Outliers:
We have 460 outliers in the “Dexa_Freq_During_Rx” variable.
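The report does not state which rule produced the outlier count. One common choice is the 1.5 × IQR rule, sketched below as an assumption. The values are made up for illustration and the helper `iqr_outliers` is hypothetical; quartile conventions also differ between libraries, so counts near the fences can vary slightly.

```python
import statistics

# Hypothetical values standing in for Dexa_Freq_During_Rx (not real data).
dexa_freq = [0, 0, 0, 0, 1, 1, 2, 2, 3, 40]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


outliers = iqr_outliers(dexa_freq)
print(len(outliers), "outlier(s):", outliers)
```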
Skewed Data:
As seen here, since the tail is on the right side, the
“Dexa_Freq_During_Rx” variable has a right-skewed distribution.
Hence, we expect the mean value to be greater than the mode.
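The skewness claim can be checked numerically as well as visually. The sketch below compares the mean, median, and mode and computes a moment-based skewness coefficient, where a positive value indicates a right tail. The sample values and the helper `moment_skewness` are assumptions for illustration, not figures from the dataset.

```python
import statistics

# Hypothetical values standing in for Dexa_Freq_During_Rx (not real data).
dexa_freq = [0, 0, 0, 0, 1, 1, 2, 2, 3, 40]

mean_value = statistics.mean(dexa_freq)
median_value = statistics.median(dexa_freq)
mode_value = statistics.mode(dexa_freq)


def moment_skewness(values):
    """Population skewness: third central moment over std dev cubed."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


skewness = moment_skewness(dexa_freq)
print("mean:", mean_value, "median:", median_value, "mode:", mode_value)
print("right-skewed:", skewness > 0)
```

For a right-skewed, single-peaked variable the usual ordering is mode, then median, then mean, which is what the comparison makes visible.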
Demographics analysis:
Ethnicity:
Gender:
As the graph shows, there is a large imbalance between the
genders.
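The imbalance can also be quantified as the share of each category. This is a minimal sketch: the 9-to-1 split below is invented for illustration and says nothing about the real proportions, and `class_shares` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical labels; the real proportions come from the dataset.
gender = ["Female"] * 9 + ["Male"] * 1


def class_shares(labels):
    """Map each distinct label to its fraction of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


shares = class_shares(gender)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```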
NTM Speciality analysis:
General Practitioner, Rheumatology, Endocrinology, and
Oncology specialists prescribed the NTM Rx most often.
Risk Segment:
Final recommendations: