KPMG Data Analytics - Task 1
• Customer Demographic
• Customer Addresses
• Transaction data for the past 3 months
Furthermore, recommendations have been provided to avoid the reoccurrence of data quality issues
and improve the accuracy of the underlying data used to drive business decisions.
Recommendation: To construct meaningful variables for the model, the data has been cleaned so
that each value has a single representation. Additionally, gender records marked ‘U’ have been
replaced based on the distribution from the training dataset.
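The cleaning step above can be sketched roughly as follows. This is a minimal illustration, not the
exact procedure used: the spellings in GENDER_MAP, the function names and the sample values are all
hypothetical, and the known labels in the sample stand in for the training distribution.

```python
import random
from collections import Counter

# Hypothetical mapping from raw spellings to one canonical label (assumption).
GENDER_MAP = {
    "f": "Female", "femal": "Female", "female": "Female",
    "m": "Male", "male": "Male",
    "u": "U",
}

def normalise_gender(value):
    """Collapse different spellings of the same gender into one label."""
    return GENDER_MAP.get(value.strip().lower(), "U")

def impute_unknown(values, seed=0):
    """Replace 'U' by sampling from the distribution of the known labels."""
    counts = Counter(v for v in values if v != "U")
    if not counts:
        # Nothing to learn the distribution from, so leave the data unchanged.
        return list(values)
    labels = list(counts)
    weights = [counts[label] for label in labels]
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    return [v if v != "U" else rng.choices(labels, weights=weights)[0]
            for v in values]

# Made-up sample values, for illustration only.
raw = ["F", "Femal", "Male", "M", "U", "female", "U"]
cleaned = impute_unknown([normalise_gender(v) for v in raw])
```

Sampling in proportion to the known labels keeps the overall gender split roughly unchanged, which
is one reasonable reading of "based on the distribution".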
Different data types within a given field make it difficult to interpret results at a later stage.
Therefore, appropriate data transformations have been made to ensure a consistent data type for
each field.
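As a rough sketch of such a transformation, the snippet below coerces a field holding a mix of date
objects, ISO date strings and Excel serial numbers into plain dates. The choice of a date field, the
function name to_date and the sample values are assumptions made for illustration; the source does
not say which fields were affected.

```python
from datetime import date, datetime, timedelta

# Day 0 of the Excel 1900 date system (valid for serials after February 1900).
EXCEL_EPOCH = date(1899, 12, 30)

def to_date(value):
    """Convert a date, datetime, Excel serial number or ISO string to a date."""
    if isinstance(value, datetime):
        # Checked first because datetime is a subclass of date.
        return value.date()
    if isinstance(value, date):
        return value
    if isinstance(value, (int, float)):
        # Treat numbers as Excel serial day counts; drop any time-of-day part.
        return EXCEL_EPOCH + timedelta(days=int(value))
    return datetime.strptime(value.strip(), "%Y-%m-%d").date()

# Made-up sample of one field stored with three different types.
mixed = [date(2017, 1, 5), "2017-02-25", 42736]
dates = [to_date(v) for v in mixed]
```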
3. Customer Demographics
There are additional customer IDs in the transaction table and the customer address table that are
not present in the customer master table. This indicates that the data received may not be in sync
across tables, and missing records may skew the analysis results. Please refer to the Excel file
‘data_outliers.xlsx’ for the list of outliers between tables.
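A check of this kind can be sketched as a set difference between the ID columns. The function name
and the sample IDs below are hypothetical and only show the idea; they are not taken from the
actual tables.

```python
def ids_missing_from_master(master_ids, other_ids):
    """Return the IDs that appear in another table but not in the master table."""
    return sorted(set(other_ids) - set(master_ids))

# Made-up customer IDs, for illustration only.
master = [1, 2, 3, 4]
transactions = [1, 2, 2, 5]
addresses = [2, 3, 6, 7]

missing_in_transactions = ids_missing_from_master(master, transactions)
missing_in_addresses = ids_missing_from_master(master, addresses)
```

Using sets removes repeated IDs, so a customer with many transactions is reported only once.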
The next steps will continue with cleaning the data, modelling it, and analysing and transforming
it into an insightful report. Before that, it would be great to spend some time with your data SME
to ensure that all assumptions are aligned with Sprocket Central’s understanding.
Vandana Prajapati