Data Science

The document describes a data science internship project. It includes code to analyze a dataset containing transaction data from a retail store with 129 rows and 7 columns. Various visualizations are created using the code, including heatmaps, relational plots, distplots, scatterplots, bar charts, and countplots. Key observations are that paperclips have the highest sales, laptops have the highest price, and there are more female than male customers. The conclusion recommends selling more of the high-selling categories and increasing quantities of less common products.

Uploaded by

rupesh karanam

DATA SCIENCE INTERNSHIP PROJECT

Contributors:
Karanam Rupesh (Team lead)
Shivansh Srivastava

ACKNOWLEDGEMENT

It is with the utmost pleasure and excitement that we submit our project in partial
fulfilment of the requirements for the award of the Data Science internship. The project
is the result of the cumulative efforts, support, guidance, encouragement and inspiration
of many people, to whom we owe our sincere respect and gratitude. We convey special
thanks to our Project Guide, who guided us and encouraged us to deepen our knowledge
while working on this project and so improve its quality. We also express our
appreciation to HARIOM SINGH, HR at PR’S Software Services, who provided a friendly
environment that helped us build our skills during this project.

INDEX
1) CODE
2) VISUALIZATIONS
3) OBSERVATION
4) CONCLUSION

DATASET
We prepared the dataset manually with the following fields: transaction ID,
product, category, price, quantity, customer age and customer gender.
The dataset has 129 rows and 7 columns in total.
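
One way to confirm that a loaded table matches this description is to compare its columns and shape with what is expected. The sketch below is illustrative only: the three sample rows are invented, the column spellings are taken from the code later in this report, and 'TRANSACTION ID' is an assumed spelling that does not appear in that code.

```python
import pandas as pd

# Column names as used in the code of this report.
# 'TRANSACTION ID' is an assumed spelling, not confirmed by the report.
EXPECTED_COLUMNS = [
    "TRANSACTION ID", "PRODUCT", "CATEGORY", "PRICE",
    "QUANTITY", "CUSTOMER AGE", "CUSTOMER GENDER",
]


def missing_columns(frame: pd.DataFrame) -> list:
    """Return the expected columns that are absent from the frame."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


# Hypothetical sample rows, not the real project data.
sample = pd.DataFrame({
    "TRANSACTION ID": [1, 2, 3],
    "PRODUCT": ["item_a", "item_b", "item_c"],
    "CATEGORY": ["cat_x", "cat_y", "cat_x"],
    "PRICE": [10.0, 450.0, 5.0],
    "QUANTITY": [20, 1, 50],
    "CUSTOMER AGE": [25, 34, 41],
    "CUSTOMER GENDER": ["F", "M", "F"],
})

print(sample.shape)             # (rows, columns)
print(missing_columns(sample))  # empty list when every expected column is present
```

For the real dataset the same check would be run on df after loading, and df.shape would be expected to report 129 rows and 7 columns.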

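CODE
# Libraries used throughout this report
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset (reading .xlsx files needs an Excel engine such as openpyxl)
df = pd.read_excel('project data.xlsx')
df

df.shape                        # (rows, columns)
df.dtypes                       # data type of each column
df.head()                       # first 5 rows
df.tail()                       # last 5 rows
df.head(15)
df.tail(15)
df.columns                      # column names
df.PRODUCT                      # one column by attribute access
df.CATEGORY
df['PRICE'].head()
df['CUSTOMER AGE'].head(10)     # names containing spaces need bracket access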

df[['CUSTOMER AGE','CUSTOMER GENDER']]
OUTPUTS OF THE ABOVE CODE:

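df[['CUSTOMER AGE','CUSTOMER GENDER']].head(20)   # first 20 rows of two columns
df.loc[0,:]                                       # first row, all columns
df.isnull()                                       # True where a value is missing
df.isnull().sum()                                 # number of missing values per column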
OUTPUTS OF THE ABOVE CODE
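
Since the output screenshots are not reproduced here, the following minimal sketch shows what the missing-value check returns. It uses a small invented table with two deliberately missing values, not the project data.

```python
import numpy as np
import pandas as pd

# Hypothetical rows with two deliberately missing values.
sample = pd.DataFrame({
    "PRICE": [10.0, np.nan, 5.0],
    "CUSTOMER AGE": [25, 34, np.nan],
    "CUSTOMER GENDER": ["F", "M", "F"],
})

mask = sample.isnull()           # same shape as sample, True where a value is missing
missing_per_column = mask.sum()  # True counts as 1, so this counts missing values per column

print(mask)
print(missing_per_column)
```

A column whose count is zero has no missing values, so it needs no cleaning before plotting.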

A heatmap replaces numbers with colours, which makes patterns in a table of
values easier to spot than reading the raw numbers. Here the heatmap shows the
pairwise correlation between the numeric columns of the dataset.
# numeric_only=True (pandas 1.5+) leaves out text columns such as PRODUCT
correlation = df.corr(numeric_only=True)
sns.heatmap(correlation, xticklabels=correlation.columns,
            yticklabels=correlation.columns, annot=True)
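
To make clear what the heatmap colours encode, here is a minimal sketch of the correlation matrix itself. The four rows are invented for illustration; only the column names come from this report.

```python
import pandas as pd

# Hypothetical rows, not the project data.
sample = pd.DataFrame({
    "PRODUCT": ["item_a", "item_b", "item_c", "item_d"],
    "PRICE": [10.0, 400.0, 5.0, 150.0],
    "QUANTITY": [40, 1, 50, 3],
})

# numeric_only=True skips the text column PRODUCT.
correlation = sample.corr(numeric_only=True)
print(correlation)
```

Each cell is the Pearson correlation between two columns: the diagonal is 1 because every column correlates perfectly with itself, and a negative off-diagonal value means one column tends to fall as the other rises.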

Relational plots visualize the statistical relationship between data points.
Visualization matters because it lets us see trends and patterns in the data.
Understanding how the variables in a dataset relate to each other is called
statistical analysis.
sns.relplot(x='PRODUCT', y='CATEGORY', hue='QUANTITY', data=df)

A distplot is used for a univariate set of observations, that is, a single
variable, and visualizes its distribution as a histogram. We therefore choose
one particular column of the dataset.
# distplot is deprecated since seaborn 0.11; sns.histplot(df['PRICE'], kde=True) is a close replacement
sns.distplot(df['PRICE'])

sns.distplot(df['PRICE'],bins=5)
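
The bins argument sets how many equal-width intervals the price range is cut into. As a rough sketch of what the histogram counts, the same binning can be done directly with NumPy. The prices below are invented for illustration.

```python
import numpy as np

# Hypothetical prices, not the project data.
prices = np.array([5, 10, 12, 20, 45, 60, 150, 400, 900, 1000])

# Five equal-width bins between the minimum and maximum price.
counts, edges = np.histogram(prices, bins=5)

print(edges)   # six edges delimit five bins
print(counts)  # number of prices falling in each bin
```

With skewed prices like these, most values land in the first bin, which is why a small number of bins can hide detail at the low end of the range.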

sns.catplot(x='QUANTITY',kind = 'box', data = df)
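
A box plot summarizes a column by its quartiles. The sketch below computes those numbers by hand for an invented series, assuming the common convention that points more than 1.5 times the interquartile range beyond the box are drawn as outliers.

```python
import pandas as pd

# Hypothetical quantities with one extreme value.
quantity = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 50])

q1 = quantity.quantile(0.25)      # lower edge of the box
median = quantity.quantile(0.50)  # line inside the box
q3 = quantity.quantile(0.75)      # upper edge of the box
iqr = q3 - q1                     # interquartile range (height of the box)

# Assumed convention: whiskers stop at the furthest points within 1.5 * IQR of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = quantity[(quantity < lower_fence) | (quantity > upper_fence)]

print(q1, median, q3, iqr)
print(outliers.tolist())
```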

import plotly.graph_objects as go

# Line-and-marker plot of quantity for each product
fig = go.Figure(data=go.Scatter(x=df.PRODUCT, y=df.QUANTITY,
                                mode='lines+markers',
                                marker_color='orange', marker_size=20))
fig.update_layout(title='RELATION BETWEEN PRODUCT AND QUANTITY',
                  xaxis_title='PRODUCT', yaxis_title='QUANTITY')
fig.show()

The scatter plot above shows the relationship between the products and their
quantities in the retail store.

import plotly.graph_objects as go

fig = go.Figure()

# One trace for category and one for price, both plotted against product
fig.add_trace(go.Scatter(x=df.PRODUCT, y=df.CATEGORY, name='CATEGORY'))
fig.add_trace(go.Scatter(x=df.PRODUCT, y=df.PRICE, name='PRICE'))

fig.update_traces(mode='lines+markers', line_width=5, marker_size=10)
fig.update_layout(title="RELATION BETWEEN PRODUCT CATEGORY AND PRICE",
                  xaxis_title="PRODUCT", yaxis_title="CATEGORY/PRICE",
                  width=1000, height=500,
                  paper_bgcolor="LightSteelBlue", plot_bgcolor="green",
                  showlegend=True)
fig.update_xaxes(showgrid=True)
fig.update_yaxes(showgrid=True)
fig.show()

The scatter plot above shows how category and price each relate to product for
the items in the retail store.

import plotly.graph_objects as go

# Bar chart of price for each product
fig = go.Figure(go.Bar(x=df.PRODUCT, y=df.PRICE))
fig.update_layout(title='BARCHART', xaxis_title='PRODUCT', yaxis_title='PRICE')
fig.show()

The bar graph above shows the relationship between the products and their
prices in the retail store.

import plotly.graph_objects as go

# Bar chart of quantity for each product
fig = go.Figure(go.Bar(x=df.PRODUCT, y=df.QUANTITY))
fig.update_layout(title='BARCHART', xaxis_title='PRODUCT', yaxis_title='QUANTITY')
fig.show()

The bar graph above shows the relationship between the products and their
quantities in the retail store.
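
When a product appears in several transactions, it can help to add up its quantities before charting, so that each product has a single total. This is a sketch of that aggregation step on invented rows. It is not part of the original analysis, and the report does not say whether products repeat in the dataset.

```python
import pandas as pd

# Hypothetical transactions in which some products repeat.
sample = pd.DataFrame({
    "PRODUCT": ["item_a", "item_b", "item_a", "item_c", "item_b"],
    "QUANTITY": [10, 30, 5, 1, 20],
})

# Sum the quantity of each product, then order from largest to smallest.
total_quantity = (
    sample.groupby("PRODUCT")["QUANTITY"]
    .sum()
    .sort_values(ascending=False)
)

print(total_quantity)
```

The resulting index and values could then be passed as x and y to a bar chart in place of the raw columns.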

import plotly.graph_objects as go

# Bar chart of price against quantity
fig = go.Figure(go.Bar(x=df.QUANTITY, y=df.PRICE))
fig.update_layout(title='BARCHART', xaxis_title='QUANTITY', yaxis_title='PRICE')
fig.show()

The bar graph above shows the relationship between the quantity and the price
of the items in the retail store.

The countplot() method shows the number of observations in each categorical
bin using bars.

sns.countplot(x='CUSTOMER GENDER', data=df)
plt.title('Distribution of GENDER')
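
The bar heights in a count plot are simply the number of rows per category, which pandas can report directly. A minimal sketch on an invented column:

```python
import pandas as pd

# Hypothetical gender values, not the project data.
gender = pd.Series(["F", "M", "F", "F", "M"], name="CUSTOMER GENDER")

# Number of rows for each distinct value, largest first.
counts = gender.value_counts()

print(counts)
```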

sns.countplot(x='CUSTOMER AGE', hue='CUSTOMER GENDER', data=df)
plt.title('DISTRIBUTION OF CUSTOMER AGE BY CUSTOMER GENDER')
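
Adding hue splits each age bar by gender. The numbers behind such a grouped count plot can be sketched as a cross-tabulation. The rows below are invented for illustration.

```python
import pandas as pd

# Hypothetical customers, not the project data.
sample = pd.DataFrame({
    "CUSTOMER AGE": [25, 25, 30, 30, 30],
    "CUSTOMER GENDER": ["F", "M", "F", "F", "M"],
})

# Rows are ages, columns are genders, cells are the number of customers.
table = pd.crosstab(sample["CUSTOMER AGE"], sample["CUSTOMER GENDER"])

print(table)
```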

OBSERVATIONS:
1) Paperclips are the highest-selling category, followed by pens and then
envelopes.
2) The lowest-selling categories are monitors, large signs and small signs.
3) The laptop has the highest selling price of all items in the retail store.
The next-highest-priced items are the ficus and the monitor.
4) Female customers outnumber male customers in the retail store.
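
Rankings like those in observation 3 can also be read from the table rather than from a chart. The following is a sketch under stated assumptions: the product names and prices are placeholders, not the project data, and taking the maximum price per product is one possible choice when a product appears more than once.

```python
import pandas as pd

# Hypothetical products and prices, for illustration only.
sample = pd.DataFrame({
    "PRODUCT": ["item_a", "item_b", "item_c", "item_d", "item_b"],
    "PRICE": [10.0, 900.0, 300.0, 250.0, 900.0],
})

# One price per product, then the three most expensive products.
price_per_product = sample.groupby("PRODUCT")["PRICE"].max()
top_three = price_per_product.nlargest(3)

print(top_three)
```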

Conclusion:
1) The store owner would benefit from stocking more products from the Public
Areas category.
2) He should increase the quantity of jackets, smartphones, alarm clocks and
wall chairs, as these are fewer in number than the other products.
