Prasad Shinde Data Analytics Portfolio
Professional Background:
Module 1 Project Data Analytics Process Application in real life scenario:
Our task is to give an example of a real-life situation in which data analytics is
used and to link it with the data analytics process.
In conclusion, after a thorough analysis we were able to derive insights from the
data. Data that once looked useless became useful, and analyzing it helped us find
various issues among the courses.
Module 2 Project SQL Fundamentals: Instagram User Analytics
We are working with the Product team of Instagram and the Product Manager has
asked us to provide insights on the questions asked by the management team.
Tech-Stack Used:
MySQL by Oracle Corporation – Creating Database, Running SQL Queries to
get insights
Word by Microsoft Corporation – For creating a project report
Insights & Results:
A) Marketing:
1. We found the 5 oldest users of Instagram.
2. We found the users who have never posted a single photo on Instagram.
3. We identified the winner of the contest and provided their details to the team.
4. We identified and suggested the top 5 most commonly used hashtags on the platform.
5. We determined the day of the week on which most users register and provided
insights on when to schedule an ad campaign.
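The marketing questions above map onto a few short queries. The following is only a minimal sketch: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, and the table names, column names and sample rows are hypothetical, not taken from the project database.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, created_at TEXT);
CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
INSERT INTO users VALUES
  (1, 'asha',  '2016-05-02 10:00:00'),
  (2, 'bilal', '2016-05-09 11:30:00'),
  (3, 'chen',  '2017-01-05 09:15:00'),
  (4, 'dina',  '2017-03-14 18:45:00');
INSERT INTO photos VALUES (1, 1), (2, 1), (3, 3);
""")

# Five oldest users: earliest registration timestamps first.
oldest_users = conn.execute(
    "SELECT username FROM users ORDER BY created_at ASC LIMIT 5"
).fetchall()

# Users who never posted: no matching row in photos after a LEFT JOIN.
never_posted = conn.execute("""
    SELECT u.username FROM users u
    LEFT JOIN photos p ON p.user_id = u.id
    WHERE p.id IS NULL
    ORDER BY u.username
""").fetchall()

# Registrations per weekday. strftime('%w') is SQLite-specific
# (0 = Sunday ... 6 = Saturday); in MySQL, DAYNAME() plays a similar role.
signups_by_weekday = conn.execute("""
    SELECT strftime('%w', created_at) AS weekday, COUNT(*) AS signups
    FROM users
    GROUP BY weekday
    ORDER BY signups DESC, weekday
""").fetchall()

print(oldest_users, never_posted, signups_by_weekday)
```

The weekday with the highest count in the last query would be a candidate day for scheduling an ad campaign.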
B) Investor Metrics:
1. We provided how many times an average user posts on Instagram, calculated as
the total number of photos on Instagram divided by the total number of users.
2. We provided data on users (bots) who have liked every single photo on the site.
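The two investor metrics can be sketched in the same way. This is an illustrative example under assumptions: it runs on Python's built-in sqlite3 rather than MySQL, and the users, photos and likes tables with their columns and rows are hypothetical stand-ins for the project database.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE likes (user_id INTEGER, photo_id INTEGER,
                    PRIMARY KEY (user_id, photo_id));
INSERT INTO users VALUES (1, 'asha'), (2, 'bilal'), (3, 'chen'), (4, 'dina');
INSERT INTO photos VALUES (1, 1), (2, 1), (3, 3);
INSERT INTO likes VALUES (2, 1), (2, 2), (2, 3), (4, 1), (4, 3);
""")

# Average posts per user: total photos divided by total users,
# so users without any photo still count in the denominator.
avg_posts_per_user = conn.execute("""
    SELECT (SELECT COUNT(*) FROM photos) * 1.0 / (SELECT COUNT(*) FROM users)
""").fetchone()[0]

# Possible bots: accounts whose number of likes equals the number of photos.
# The composite primary key on likes rules out duplicate likes.
likely_bots = conn.execute("""
    SELECT u.username FROM users u
    JOIN likes l ON l.user_id = u.id
    GROUP BY u.id, u.username
    HAVING COUNT(*) = (SELECT COUNT(*) FROM photos)
""").fetchall()

print(avg_posts_per_user, likely_bots)
```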
Module 3 Project Advanced SQL: Operation Analytics & Investigating
Metric Spike
We are working as a Lead Data Analyst for a company like Microsoft and are
provided with different data sets & tables, from which we must derive insights
and answer the questions asked by different departments.
Tech-Stack Used:
MySQL by Oracle Corporation – Importing Database, Running SQL Queries
to get insights
Excel by Microsoft Corporation – For extracting & manipulating data
In conclusion, after a thorough analysis we were able to derive insights from the
data and plot various graphs using it. Data that once looked useless became useful
and helped us find insights on the job data that had been a burden for Microsoft.
Analyzing the data also helped us find various issues in the data provided.
Module 4 Project Statistics: Hiring Process Analytics
We are working as a lead Data Analyst for an MNC such as Google. The company
has provided us with the data records of its previous hirings and has asked
us to answer certain questions by making sense of that data.
Tech-Stack Used:
Excel by Microsoft Corporation – For extracting & manipulating data
4. We have drawn a bar graph to show the proportion of people working in different
departments.
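The project itself was done in Excel; purely as an illustration, the proportions behind such a bar graph could be computed as follows. The department labels and counts are made up for the sketch and are not the hiring data.

```python
from collections import Counter

def department_share(departments):
    """Return each department's share of all employees, in percent."""
    counts = Counter(departments)
    total = sum(counts.values())
    return {dept: round(100 * n / total, 1) for dept, n in counts.most_common()}

# Made-up department labels, for illustration only.
sample = ["Operations"] * 5 + ["Service"] * 3 + ["Sales"] * 2
shares = department_share(sample)

# Print a simple text bar chart, one '#' per 5 percentage points.
for dept, pct in shares.items():
    print(f"{dept:<12}{'#' * int(pct // 5)} {pct}%")
```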
In conclusion, after a thorough analysis we were able to derive insights from the
data and plot various graphs using it. Data that once looked useless became useful
and helped us find insights on the hiring data that had been a burden for Google.
Analyzing the data also helped us find various issues in the data provided.
Final Project-1: IMDB Movie Analysis
For our Final Project, we are provided with a dataset having various columns
of different IMDB Movies. We are required to frame the problem. For this task,
we need to define a problem we want to shed some light on.
Tech-Stack Used:
4. Best Directors:
We found the Best Directors by ratings as follows:
John Blanchard – 9.5
Frank Darabont – 9.3
Francis Ford Coppola – 9.2
John Stockwell – 9.1
Christopher Nolan – 9
Francis Ford Coppola – 9
Peter Jackson – 8.9
Sergio Leone – 8.9
Steven Spielberg – 8.9
Quentin Tarantino – 8.9
5. Popular genres:
We found popular Genres by ratings as follows:
Comedy - 9.5
Crime | Drama - 9.3
Crime | Drama - 9.2
Drama - 9.1
Drama - 9.1
Action - 9.1
Action | Crime | Drama | Thriller - 9
Crime | Drama - 9
Crime | Drama | Thriller - 9
Action | Adventure | Drama | Fantasy - 8.9
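The lists above rank individual entries, so the same director or genre can appear more than once. One alternative, shown here only as a sketch and not as the method used in the project, is to average the rating per director before ranking. The field names director_name and imdb_score and the sample rows are assumptions for illustration.

```python
from collections import defaultdict

def top_directors(movies, n=10):
    """Rank directors by their mean rating across all of their movies."""
    scores = defaultdict(list)
    for movie in movies:
        scores[movie["director_name"]].append(movie["imdb_score"])
    averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}
    # Highest average first; ties are broken alphabetically by name.
    return sorted(averages.items(), key=lambda item: (-item[1], item[0]))[:n]

# Made-up rows, for illustration only.
sample_movies = [
    {"director_name": "Director A", "imdb_score": 9.0},
    {"director_name": "Director A", "imdb_score": 8.0},
    {"director_name": "Director B", "imdb_score": 8.8},
    {"director_name": "Director C", "imdb_score": 7.5},
]
ranking = top_directors(sample_movies, n=2)
print(ranking)
```

Grouping first means a director with one outstanding film and several weaker ones ranks by the average, which answers a different question than a list of the highest-rated single movies.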
Final Project-2: Bank Loan Case Study
For our Final Project, we are provided with 3 datasets of bank loan details.
Loan-providing banks find it hard to give loans to people with insufficient or
non-existent credit history, and some consumers take advantage of this by
defaulting. Suppose we work for a consumer finance company which specialises in
lending various types of loans to urban customers. We have to use EDA to analyse
the patterns present in the data. This will ensure that the applicants capable of
repaying the loan are not rejected.
Tech-Stack Used:
Excel by Microsoft Corporation – For carrying out EDA on the datasets & Visualisation
Jupyter Notebook by Project Jupyter – For carrying out EDA & Visualisation
Word by Microsoft Corporation – For creating the project report
Observations:
-People in High Income Groups default less than those in Low Income Groups.
-Mid Age & Senior Age Groups default less across all income groups.
Recommended:
-It is safe to grant loans to Mid Age & Senior Age Groups with higher income.
-It is risky to grant loans to young people with low income.
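Observations of this kind rest on comparing the default rate between groups. A minimal sketch of that calculation follows; the column names (income_group, age_group, TARGET with 1 meaning a defaulter) and the sample rows are assumptions for illustration, not the bank datasets.

```python
from collections import defaultdict

def default_rate_by(records, keys, target="TARGET"):
    """Percentage of defaulters (target == 1) within each group of `keys`."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for row in records:
        group = tuple(row[k] for k in keys)
        totals[group] += 1
        defaults[group] += row[target]
    return {g: round(100 * defaults[g] / totals[g], 1) for g in totals}

# Made-up applicants, for illustration only.
sample = [
    {"income_group": "Low", "age_group": "Young", "TARGET": 1},
    {"income_group": "Low", "age_group": "Young", "TARGET": 0},
    {"income_group": "Low", "age_group": "Young", "TARGET": 1},
    {"income_group": "Low", "age_group": "Senior", "TARGET": 0},
    {"income_group": "High", "age_group": "Young", "TARGET": 0},
    {"income_group": "High", "age_group": "Senior", "TARGET": 0},
]
rates = default_rate_by(sample, ["income_group", "age_group"])
print(rates)
```

Passing a different pair of keys, for example family status and gender, would give the rates behind the other group comparisons.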
Charts: Family Status & Age Group; Family Status & Gender
Observations:
-Seniors, irrespective of family status, are less likely to default.
-Young people are more likely to default in all Family Statuses.
-Males are more likely to default than Females.
Recommended:
-It is safe to grant loans to the Senior Age group in all Family Statuses.
-It is risky to grant loans to Single, Separated & Civil Marriage clients of the Young Age group.
Charts: Credit Amount Group & Income Group; Credit Amount Group & Age Group
Observations:
-Across all Income Groups, clients with Medium Credit Amount default the most, followed by
Low & High Credit Amount.
-Young Age Groups with Medium & Low Credit Amount are the most likely to default.
Recommended:
-It is very risky to grant Medium & Low Credit Amount to the Young Age Group.
Charts: Educational Qualification & Gender; Profession & Gender
Observations:
-People with Higher Education default less & people with Lower Secondary education
default more.
-Unemployed & Maternity Leave clients default very often.
Recommended:
-It is safe to grant loans to people with Higher Education in all professions except Females &
Females with Maternity Leave.
Loan Application Status Relations
Current & Previous (application_data.csv & previous_application.csv):
Charts: Previous Loan status & Gender; Previous loan status & Client type
Observations:
-Among previously Refused & Unused offer applications, Males defaulted more.
-New clients with previously Unused offers default more.
Recommended:
-It is safe to provide loans to previously approved Females.
-It is risky to grant loans to clients whose previous applications were Refused or Unused offer.
Charts: Age Group & Previous loan status; Income Group & Previous Loan status
Observations:
-Young clients who were previously Refused default heavily.
-Senior clients default less irrespective of their previous loan status.
-Previously refused applicants in all Income Groups default heavily.
Recommended:
-It is safe to grant loans to Senior clients.
-It is less risky to grant loans to approved applicants in all Income Groups.
Charts: Portfolio & Previous loan status; External Source Score & Previous Loan status
Observations:
-Clients with previous applications for Cards & POS mostly default.
-Clients with previously refused applications for Cash default heavily.
-Clients with a low External Source Score default heavily.
Recommended:
-It is safe to grant loans in any portfolio to previously approved clients.
-It is highly risky to grant loans to clients with a poor external source score whose loan was
previously Refused, Unused offer or Cancelled.
We carried out EDA on the given datasets using Microsoft Excel and Jupyter Notebook
(Python), and the resulting answers helped us take data-driven decisions.
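Relating current and previous applications requires joining the two files on a shared client identifier. The sketch below illustrates one way to do that with plain Python; the column names SK_ID_CURR, TARGET and NAME_CONTRACT_STATUS and the sample rows are assumptions for illustration and should be checked against the actual CSV headers.

```python
from collections import defaultdict

def default_rate_by_previous_status(current, previous):
    """Join previous applications to current ones and compute the share of
    current defaulters (TARGET == 1) for each previous application status."""
    target_by_id = {row["SK_ID_CURR"]: row["TARGET"] for row in current}
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for prev in previous:
        target = target_by_id.get(prev["SK_ID_CURR"])
        if target is None:
            continue  # previous application without a matching current one
        status = prev["NAME_CONTRACT_STATUS"]
        totals[status] += 1
        defaults[status] += target
    return {s: round(100 * defaults[s] / totals[s], 1) for s in totals}

# Made-up rows, for illustration only.
current_rows = [
    {"SK_ID_CURR": 1, "TARGET": 1},
    {"SK_ID_CURR": 2, "TARGET": 0},
    {"SK_ID_CURR": 3, "TARGET": 0},
]
previous_rows = [
    {"SK_ID_CURR": 1, "NAME_CONTRACT_STATUS": "Refused"},
    {"SK_ID_CURR": 1, "NAME_CONTRACT_STATUS": "Approved"},
    {"SK_ID_CURR": 2, "NAME_CONTRACT_STATUS": "Approved"},
    {"SK_ID_CURR": 3, "NAME_CONTRACT_STATUS": "Approved"},
    {"SK_ID_CURR": 3, "NAME_CONTRACT_STATUS": "Refused"},
    {"SK_ID_CURR": 4, "NAME_CONTRACT_STATUS": "Refused"},
]
status_rates = default_rate_by_previous_status(current_rows, previous_rows)
print(status_rates)
```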
Final Project-3: XYZ Ads Airing Report
For our Final Project, we are provided with a dataset of TV ad airings covering
different brands, their products and their categories. The dataset includes the
network on which each ad aired, the network type (Cable/Broadcast) and the name of
the show during which it aired. It also records the daypart, the time zone and the
date & time of each airing, along with the Pod Position (the lower the number, the
more valuable the position), the duration for which the ad aired on screen,
Equivalent sales and the total amount spent on the ads aired.
Tech-Stack Used:
Excel by Microsoft Corporation – For analysing the data from the given dataset
Word by Microsoft Corporation – For creating the project report
1. What is Pod Position & does the Pod position number affect the amount spent
on Ads for a specific period of time by a company:
2. The share of various brands in TV airings and how it changed from Q1 to
Q4 in 2021:
- The following table shows the percentage share of spend by each brand
quarter-wise
- Maruti Suzuki has the highest share of money spent among all brands in
every quarter
- The following table shows the percentage share of ads played by each brand
quarter-wise
- Maruti Suzuki has the highest share of the number of ads played in all quarters
- The following table shows the percentage share of EQ units of each brand
quarter-wise
- Maruti Suzuki has the highest share of EQ units among all brands in every quarter
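The quarter-wise percentage shares were produced in Excel; as an illustration only, the same calculation could look like this in Python. The field names (quarter, brand, spend) and the sample figures are hypothetical and do not come from the airing dataset.

```python
from collections import defaultdict

def quarterly_share(rows, value_key):
    """Percentage share of `value_key` per brand within each quarter."""
    quarter_totals = defaultdict(float)
    brand_totals = defaultdict(float)
    for row in rows:
        quarter_totals[row["quarter"]] += row[value_key]
        brand_totals[(row["quarter"], row["brand"])] += row[value_key]
    return {
        key: round(100 * value / quarter_totals[key[0]], 1)
        for key, value in brand_totals.items()
    }

# Made-up airing rows, for illustration only.
sample = [
    {"quarter": "Q1", "brand": "Brand A", "spend": 600.0},
    {"quarter": "Q1", "brand": "Brand B", "spend": 400.0},
    {"quarter": "Q2", "brand": "Brand A", "spend": 300.0},
    {"quarter": "Q2", "brand": "Brand B", "spend": 900.0},
]
spend_share = quarterly_share(sample, "spend")
print(spend_share)
```

Calling the same function with a count of ads or an EQ units field as `value_key` would give the other two share tables.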
3. Competitive analysis for the brands and defined advertisement strategy of
different brands and how it differs across the brands:
Charts: dayparts (Daytime, Early Morning, Evening News, Late Fringe, Overnight, Prime Access, Prime Time, Weekend) by quarter, Q1 to Q4
Maruti Suzuki (Day parts vs Count of Ads vs Quarter)
Chart: count of ads per daypart (Daytime, Early Fringe, Early Morning, Evening News, Late Fringe, Overnight, Prime Access, Prime Time, Weekend), Q1 to Q4
The above table shows that Maruti Suzuki has the highest contribution of
EQ units in the first pod position among the top-5 Pod positions.
The above table shows that Maruti Suzuki's share of EQ units is the highest
on all days and significantly higher on weekends, which is not the case for
other brands.
The above table shows that Maruti Suzuki has the majority share of EQ units in
JAN, MAY, AUG and OCT.
Mahindra and Mahindra have majority of EQ units share in MAY & AUG.
4. Suggest a media plan to the CMO of Mahindra and Mahindra & which audience
should they target:
As we can see in the Competitive Analysis charts, 4 out of 6 brands aired the
majority of their ads in the Daytime daypart in the first quarter. Therefore,
it would be profitable to run the ads in this daypart.
The table shows that all brands have the majority of their EQ units share in
the first Pod Position in the first quarter. Therefore, it would be profitable
to run the digital ad in the first pod position.
In conclusion, after a thorough analysis we were able to derive insights from the
data and build various tables using it. Data that once looked useless became useful
and helped us find insights on the Car sales dataset. Analyzing the data helped us
find the pod positions and dayparts in which the ads should be run. We also suggested
a media plan to the CMO of Mahindra & Mahindra for targeting the potential audience.
Final Project-4: ABC Call Volume Trend
For our final project we are provided with a dataset of a Customer Experience (CX)
Inbound calling team covering 23 days. The data includes Agent_Name, Agent_ID,
Queue_Time [duration for which the customer has to wait before getting connected to
an agent], Time [time of day at which the customer made the call], Time_Bucket [a
time bucket provided for convenience], Duration [duration for which the customer and
the executive are on the call], Call_Seconds [the duration converted into seconds
for simplicity] and call status (Abandon, answered, transferred).
Excel by Microsoft Corporation – For carrying out EDA on the datasets & Visualisation
Word by Microsoft Corporation – For creating the project report
b. total volume/ number of calls coming in:
Time Bucket    Count of Call_Status
10_11          13313
11_12          14626
12_13          12652
13_14          11561
14_15          10561
15_16          9159
16_17          8788
17_18          8534
18_19          7238
19_20          6463
20_21          5505
9_10           9588
Grand Total    117988
Chart: Total Volume of calls (Count of Call_Status) by Time Bucket
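As a quick cross-check of the call volume table, the grand total, the busiest time bucket and each bucket's share of all calls can be recomputed from the per-bucket counts. The sketch below uses the counts reported above; the variable names are illustrative.

```python
# Calls per time bucket, as reported in the table above.
volumes = {
    "9_10": 9588, "10_11": 13313, "11_12": 14626, "12_13": 12652,
    "13_14": 11561, "14_15": 10561, "15_16": 9159, "16_17": 8788,
    "17_18": 8534, "18_19": 7238, "19_20": 6463, "20_21": 5505,
}

total_calls = sum(volumes.values())
peak_bucket = max(volumes, key=volumes.get)

# Share of the total call volume handled in each bucket, in percent.
share = {bucket: round(100 * n / total_calls, 1) for bucket, n in volumes.items()}

print(total_calls, peak_bucket, share[peak_bucket])
```

The recomputed total matches the Grand Total of 117988, and the 11_12 bucket carries the highest volume.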
Manpower required for each time bucket = Head count required / Time bucket
= 130.505 / 12 = 10.87
d. manpower plan required during each time bucket in a day:
Head count required = Minimum Agents required / 0.76 (Min. shrinkage count 0.76)
Manpower required for each time bucket = Head count required / Time bucket
= 0.336 / 12 = 0.028
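The two formulas above can be written as small helper functions. This is a sketch of the arithmetic exactly as stated in the text (division by 0.76, then an even split over 12 time buckets); the function and parameter names are illustrative.

```python
def head_count_required(min_agents, factor=0.76):
    """Head count = minimum agents required / 0.76, as stated in the text."""
    return min_agents / factor

def manpower_per_bucket(head_count, n_buckets=12):
    """Spread the head count evenly over the time buckets (12 in the text)."""
    return head_count / n_buckets

# Reproduce the two worked figures from the text.
print(round(manpower_per_bucket(130.505), 3))
print(round(manpower_per_bucket(0.336), 3))
```

An even split assumes the same staffing need in every bucket; weighting the head count by each bucket's share of call volume would be a different design choice and is not what the formulas above do.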
In conclusion, after a thorough analysis we were able to derive insights from the
data and build various tables using it. Data that once looked useless became useful
and helped us find insights on the Calls dataset. Analyzing the data helped us find
the average call duration, the volume of calls and the minimum number of agents
required in each time bucket.
Appendix:
1. Google Drive link for Module 1 of Data Analytics Process:
https://docs.google.com/presentation/d/1329p1Dc1swc4q8L13ailbW_veuiiIrxk/edit?usp=share_link&ouid=101056032751701574408&rtpof=true&sd=true
2. Google Drive link for Module 2 of SQL Fundamentals Instagram User Analytics:
https://drive.google.com/file/d/1Aqt4VjwbsZYXq_oYiWpyjCyiregLFQs/view?usp=share_link
3. Google Drive link for Module 3 of Advanced SQL Operation & Metric Analytics:
https://drive.google.com/file/d/1P9Kt2xT2hYml_1RTxdWXjEIlv43eC0v/view?usp=share_link
6. Google Drive link for Final Project-2 Bank Loan Case Study:
https://drive.google.com/file/d/1ZIzfXhONU5ntiate0dnwbZW4eir5tvT/view?usp=share_link
7. Google Drive link for Final Project-3 XYZ Ads Airing Report:
https://drive.google.com/file/d/1oyzPQcSTg8boDk7c4P3RvCPXLRz5X_8Z/view?usp=share_link
8. Google Drive link for Final Project-4 ABC Call Volume Trend:
https://drive.google.com/file/d/14TzUrGxkQLgSZ67Cl2WQgk3Rn2K7MCj/view?usp=share_link
Google Drive link for all Portfolio Projects:
https://drive.google.com/drive/folders/15A1EFbT-bxsl3SeEPFdiXPmcNSp8Uz7S?usp=sharing