The document provides information about nutritional data from McDonald's breakfast menu items. It includes: - A table showing nutritional information like calories, fat, cholesterol, sodium for 5 breakfast items. - Code to import libraries and read in a CSV file containing the nutritional data. - Details on the shape and dtypes of the dataframe containing 260 rows and 24 columns of nutritional data. - Descriptive statistics summarizing the distribution of values for each nutritional variable.

Uploaded by

kool

7/25/2021 temp-162719629831896177

In [2]: import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

%matplotlib inline

import seaborn as sns

pd.set_option('display.float_format', lambda x: '%.2f' % x)

import scipy.stats as stats

In [5]: import os

os.getcwd()

Out[5]: 'C:\\Users\\10000981'

https://htmtopdf.herokuapp.com/ipynbviewer/temp/37bd57e8b3a9d8799ffa791fc64779bf/DR GRACE- (24-7-2021).html?t=1627196308903 1/14



In [47]: df= pd.read_csv('Mcdonald .csv')

df.head().T

Out[47]:
                                               0                 1                 2                 3                 4
Category                               Breakfast         Breakfast         Breakfast         Breakfast         Breakfast
Item                                Egg McMuffin Egg White Delight  Sausage McMuffin  Sausage McMuffin  Sausage McMuffin
                                                                                              with Egg   with Egg Whites
Serving Size                      4.8 oz (136 g)    4.8 oz (135 g)    3.9 oz (111 g)    5.7 oz (161 g)    5.7 oz (161 g)
Calories                                     300               250               370               450               400
Calories from Fat                            120                70               200               250               210
Total Fat                                  13.00              8.00             23.00             28.00             23.00
Total Fat (% Daily Value)                     20                12                35                43                35
Saturated Fat                               5.00              3.00              8.00             10.00              8.00
Saturated Fat (% Daily Value)                 25                15                42                52                42
Trans Fat                                   0.00              0.00              0.00              0.00              0.00
Cholesterol                                  260                25                45               285                50
Cholesterol (% Daily Value)                   87                 8                15                95                16
Sodium                                       750               770               780               860               880
Sodium (% Daily Value)                        31                32                33                36                37
Carbohydrates                                 31                30                29                30                30
Carbohydrates (% Daily Value)                 10                10                10                10                10
Dietary Fiber                                  4                 4                 4                 4                 4
Dietary Fiber (% Daily Value)                 17                17                17                17                17
Sugars                                         3                 3                 2                 2                 2
Protein                                       17                18                14                21                21
Vitamin A (% Daily Value)                     10                 6                 8                15                 6
Vitamin C (% Daily Value)                      0                 0                 0                 0                 0
Calcium (% Daily Value)                       25                25                25                30                25
Iron (% Daily Value)                          15                 8                10                15                10


In [69]: df.describe()

Out[69]:
                                                                    Saturated
                                            Total Fat                     Fat                         Cholesterol
                     Calories                (% Daily   Saturated    (% Daily                            (% Daily
         Calories    from Fat   Total Fat      Value)         Fat      Value)   Trans Fat Cholesterol      Value)
count      260.00      260.00      260.00      260.00      260.00      260.00      260.00      260.00      260.00
mean       368.27      127.10       14.17       21.82        6.01       29.97        0.20       54.94       18.39
std        240.27      127.88       14.21       21.89        5.32       26.64        0.43       87.27       29.09
min          0.00        0.00        0.00        0.00        0.00        0.00        0.00        0.00        0.00
25%        210.00       20.00        2.38        3.75        1.00        4.75        0.00        5.00        2.00
50%        340.00      100.00       11.00       17.00        5.00       24.00        0.00       35.00       11.00
75%        500.00      200.00       22.25       35.00       10.00       48.00        0.00       65.00       21.25
max       1880.00     1060.00      118.00      182.00       20.00      102.00        2.50      575.00      192.00

8 rows × 21 columns

The table shows the count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%), and maximum for each numeric data column.
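As a small illustration of how to read such a summary, the sketch below runs describe() on a hypothetical five-row frame (the values are made up for the example, not the notebook's CSV) and reproduces the quartile rows with direct quantile() calls.

```python
import pandas as pd

# Hypothetical toy data standing in for a single nutrition column.
toy = pd.DataFrame({"Calories": [0, 210, 340, 500, 1880]})

summary = toy.describe()

# The 25%, 50% and 75% rows of describe() are ordinary quantiles.
q1 = toy["Calories"].quantile(0.25)
median = toy["Calories"].quantile(0.50)
q3 = toy["Calories"].quantile(0.75)

print(summary)
print("median:", median, "interquartile range:", q3 - q1)
```

A mean well above the median, as in this toy column, hints at a right-skewed distribution with a few large values.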

In [49]: df.shape

Out[49]: (260, 24)


In [51]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 260 entries, 0 to 259
Data columns (total 24 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   Category                       260 non-null    object
 1   Item                           260 non-null    object
 2   Serving Size                   260 non-null    object
 3   Calories                       260 non-null    int64
 4   Calories from Fat              260 non-null    int64
 5   Total Fat                      260 non-null    float64
 6   Total Fat (% Daily Value)      260 non-null    int64
 7   Saturated Fat                  260 non-null    float64
 8   Saturated Fat (% Daily Value)  260 non-null    int64
 9   Trans Fat                      260 non-null    float64
 10  Cholesterol                    260 non-null    int64
 11  Cholesterol (% Daily Value)    260 non-null    int64
 12  Sodium                         260 non-null    int64
 13  Sodium (% Daily Value)         260 non-null    int64
 14  Carbohydrates                  260 non-null    int64
 15  Carbohydrates (% Daily Value)  260 non-null    int64
 16  Dietary Fiber                  260 non-null    int64
 17  Dietary Fiber (% Daily Value)  260 non-null    int64
 18  Sugars                         260 non-null    int64
 19  Protein                        260 non-null    int64
 20  Vitamin A (% Daily Value)      260 non-null    int64
 21  Vitamin C (% Daily Value)      260 non-null    int64
 22  Calcium (% Daily Value)        260 non-null    int64
 23  Iron (% Daily Value)           260 non-null    int64
dtypes: float64(3), int64(18), object(3)
memory usage: 48.9+ KB

In [70]: df['Category'].value_counts()

Out[70]: Coffee & Tea 95

Breakfast 42

Smoothies & Shakes 28

Chicken & Fish 27

Beverages 27

Beef & Pork 15

Snacks & Sides 13

Desserts 7

Salads 6

Name: Category, dtype: int64

In [72]: df['Category'].nunique()

Out[72]: 9


In [73]: df['Category'].unique()

Out[73]: array(['Breakfast', 'Beef & Pork', 'Chicken & Fish', 'Salads',

'Snacks & Sides', 'Desserts', 'Beverages', 'Coffee & Tea',

'Smoothies & Shakes'], dtype=object)

There are 9 different menu categories available at McDonald's, which are listed below:
1. Breakfast
2. Beef & Pork
3. Chicken & Fish
4. Salads
5. Snacks & Sides
6. Desserts
7. Beverages
8. Coffee & Tea
9. Smoothies & Shakes

1. Plot graphically which food categories have the highest and lowest varieties.

In [132]: plt.figure(figsize=(16,8))

category_count = df.groupby('Category')['Item'].count().plot(kind='bar', color="pink")

plt.legend(labels = ['Total Items'])

plt.xlabel('Category')

plt.ylabel('Item')

category_count

Out[132]: <AxesSubplot:xlabel='Category', ylabel='Item'>

INFERENCE FROM GRAPH:

The graph shows that the category "Coffee & Tea" has the highest number of items (95) and the category "Salads" has the lowest (6), which matches the value_counts() output above.
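The same conclusion can be checked without reading the chart. The sketch below is only an illustration on a hypothetical six-row menu (the item names are invented for the example); it counts items per category and picks the largest and smallest groups.

```python
import pandas as pd

# Hypothetical toy menu; the notebook itself works on the full CSV.
toy = pd.DataFrame({
    "Category": ["Coffee & Tea", "Coffee & Tea", "Coffee & Tea",
                 "Breakfast", "Breakfast", "Salads"],
    "Item": ["Latte", "Mocha", "Iced Tea",
             "Egg McMuffin", "Hotcakes", "Side Salad"],
})

# Number of items in each category.
counts = toy["Category"].value_counts()

most = counts.idxmax()     # category with the most items
fewest = counts.idxmin()   # category with the fewest items

print(counts)
print("most varieties:", most, "| fewest varieties:", fewest)
```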


2. Which variables have outliers?

In [116]: plt.figure(figsize=(15,5))

plt.subplot(1,5,1)

df.boxplot(column='Calories')

plt.subplot(1,5,2)

df.boxplot(column='Total Fat')

plt.subplot(1,5,3)

df.boxplot(column='Saturated Fat')

plt.subplot(1,5,4)

df.boxplot(column='Cholesterol')

plt.subplot(1,5,5)

df.boxplot(column='Sodium')

Out[116]: <AxesSubplot:>

The data columns that have outliers are

1. Calories
2. Total Fat
3. Sodium
4. Cholesterol

and the data column that does not have outliers is

1. Saturated Fat
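Reading outliers off a boxplot can also be done numerically. The sketch below assumes the usual 1.5 x IQR whisker rule (the matplotlib default) and uses a hypothetical helper, columns_with_outliers, on made-up toy values; it is an illustration, not the method used in the notebook.

```python
import pandas as pd

def columns_with_outliers(frame, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] in each numeric column."""
    numeric = frame.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Comparisons align the per-column fences with the frame's columns.
    outside = (numeric < lower) | (numeric > upper)
    return outside.sum()

# Hypothetical toy values: one extreme calorie count, no extreme fat value.
toy = pd.DataFrame({
    "Calories": [210, 250, 300, 340, 400, 1880],
    "Saturated Fat": [1, 3, 5, 8, 10, 12],
})

result = columns_with_outliers(toy)
print(result[result > 0].index.tolist())
```

Columns with a count above zero are the ones whose boxplots would show points beyond the whiskers.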


In [124]: plt.figure(figsize=(15,5))

plt.subplot(1,5,1)

df.boxplot(column='Carbohydrates')

plt.subplot(1,5,2)

df.boxplot(column='Dietary Fiber')

plt.subplot(1,5,3)

df.boxplot(column='Sugars')

plt.subplot(1,5,4)

df.boxplot(column='Protein')

plt.subplot(1,5,5)

df.boxplot(column='Saturated Fat (% Daily Value)')

Out[124]: <AxesSubplot:>

The data columns that have outliers are

1. Carbohydrates
2. Sugars
3. Protein

and the data columns that do not have outliers are

1. Dietary Fiber
2. Saturated Fat (% Daily Value)


In [121]: plt.figure(figsize=(15,5))

plt.subplot(1,5,1)

df.boxplot(column='Total Fat (% Daily Value)')

plt.subplot(1,5,2)

df.boxplot(column='Vitamin A (% Daily Value)')

plt.subplot(1,5,3)

df.boxplot(column='Vitamin C (% Daily Value)')

plt.subplot(1,5,4)

df.boxplot(column='Sodium (% Daily Value)')

plt.subplot(1,5,5)

df.boxplot(column='Carbohydrates (% Daily Value)')

Out[121]: <AxesSubplot:>

The data columns that have outliers are

1. Total Fat (% Daily Value)
2. Vitamin A (% Daily Value)
3. Vitamin C (% Daily Value)
4. Sodium (% Daily Value)
5. Carbohydrates (% Daily Value)


In [122]: plt.figure(figsize=(15,5))

plt.subplot(1,5,1)

df.boxplot(column='Dietary Fiber (% Daily Value)')

plt.subplot(1,5,2)

df.boxplot(column='Cholesterol (% Daily Value)')

plt.subplot(1,5,3)

df.boxplot(column='Calcium (% Daily Value)')

plt.subplot(1,5,4)

df.boxplot(column='Iron (% Daily Value)')

Out[122]: <AxesSubplot:>

The data columns that have outliers are

1. Dietary Fiber (% Daily Value)
2. Cholesterol (% Daily Value)
3. Calcium (% Daily Value)
4. Iron (% Daily Value)

In [ ]: 4. Which category contributes the maximum % of Cholesterol in a diet (% daily value)?

In [127]: # Select the column before averaging so the text columns are never passed to mean()
df.groupby('Category')['Cholesterol (% Daily Value)'].mean().sort_values(ascending=False)

Out[127]: Category

Breakfast 50.95

Beef & Pork 28.93

Chicken & Fish 25.22

Salads 17.33

Smoothies & Shakes 14.71

Coffee & Tea 9.38

Snacks & Sides 6.23

Desserts 4.86

Beverages 0.19

Name: Cholesterol (% Daily Value), dtype: float64


INFERENCE:

On average per item, the category Breakfast contributes the maximum Cholesterol (% Daily Value) at 50.95. The second highest category is Beef & Pork (28.93) and the third highest is Chicken & Fish (25.22).
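Because the ranking above is based on a per-item average, it can help to look at the mean next to the total and the item count for each category. The sketch below is a minimal illustration on hypothetical toy rows, not the notebook's data.

```python
import pandas as pd

col = "Cholesterol (% Daily Value)"

# Hypothetical toy rows for three categories.
toy = pd.DataFrame({
    "Category": ["Breakfast", "Breakfast", "Beverages", "Beverages", "Salads"],
    col: [87, 95, 0, 0, 17],
})

# Mean, total and number of items per category, ranked by the mean.
by_category = toy.groupby("Category")[col].agg(["mean", "sum", "count"])
ranked = by_category.sort_values("mean", ascending=False)

top_category = ranked.index[0]
print(ranked)
print("highest average:", top_category)
```

A category with many items can have a large total but a modest mean, so the choice of aggregate changes what "contributes the maximum" means.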

In [ ]: 3. Which variables have the highest correlation? Plot them and find out the value?


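In [135]: # Restrict to numeric columns so the text columns (Category, Item, Serving Size) are excluded
corr = df.select_dtypes(include='number').corr(method='pearson')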

corr

Out[135]:
                                                                                             Saturated
                                                                     Total Fat                     Fat
                                              Calories                (% Daily   Saturated    (% Daily
                                  Calories    from Fat   Total Fat      Value)         Fat      Value)   Trans Fat Cholesterol
Calories                              1.00        0.90        0.90        0.90        0.85        0.85        0.52        0.60
Calories from Fat                     0.90        1.00        1.00        1.00        0.85        0.85        0.43        0.68
Total Fat                             0.90        1.00        1.00        1.00        0.85        0.85        0.43        0.68
Total Fat (% Daily Value)             0.90        1.00        1.00        1.00        0.85        0.85        0.43        0.68
Saturated Fat                         0.85        0.85        0.85        0.85        1.00        1.00        0.62        0.63
Saturated Fat (% Daily Value)         0.85        0.85        0.85        0.85        1.00        1.00        0.62        0.63
Trans Fat                             0.52        0.43        0.43        0.43        0.62        0.62        1.00        0.25
Cholesterol                           0.60        0.68        0.68        0.68        0.63        0.63        0.25        1.00
Cholesterol (% Daily Value)           0.60        0.68        0.68        0.68        0.63        0.63        0.25        1.00
Sodium                                0.71        0.85        0.85        0.85        0.58        0.59        0.19        0.62
Sodium (% Daily Value)                0.71        0.85        0.85        0.85        0.59        0.59        0.19        0.62
Carbohydrates                         0.78        0.46        0.46        0.46        0.59        0.59        0.46        0.27
Carbohydrates (% Daily Value)         0.78        0.46        0.46        0.46        0.59        0.59        0.46        0.27
Dietary Fiber                         0.54        0.58        0.58        0.58        0.35        0.36        0.05        0.44
Dietary Fiber (% Daily Value)         0.54        0.58        0.58        0.58        0.35        0.35        0.06        0.44
Sugars                                0.26       -0.12       -0.12       -0.12        0.20        0.20        0.33       -0.14
Protein                               0.79        0.81        0.81        0.81        0.60        0.61        0.39        0.56
Vitamin A (% Daily Value)             0.11        0.06        0.05        0.05        0.06        0.07        0.08        0.08
Vitamin C (% Daily Value)            -0.07       -0.09       -0.09       -0.09       -0.18       -0.18       -0.08       -0.08
Calcium (% Daily Value)               0.43        0.16        0.16        0.16        0.40        0.40        0.39        0.13
Iron (% Daily Value)                  0.64        0.74        0.73        0.74        0.58        0.58        0.33        0.65

21 rows × 21 columns


In [138]: plt.figure(figsize=(20,15))

sns.heatmap(corr,annot=True)

Out[138]: <AxesSubplot:>


In the correlation matrix, the highest values (1.00) are between Total Fat, Calories from Fat and Total Fat (% Daily Value), between Saturated Fat and Saturated Fat (% Daily Value), and between Cholesterol and Cholesterol (% Daily Value). The correlations between the other main variables are listed below.

Correlation of Calories:
Calories and Saturated Fat - 0.85
Calories and Cholesterol - 0.60
Calories and Sodium - 0.71
Calories and Protein - 0.79

Correlation of Fat:
Fat and Protein - 0.81
Fat and Sodium - 0.85
Fat and Iron - 0.73
Fat and Cholesterol - 0.68
Fat and Dietary Fiber - 0.58

Correlation of Cholesterol:
Cholesterol and Calories - 0.60
Cholesterol and Protein - 0.56
Cholesterol and Sodium - 0.62
Cholesterol and Iron - 0.65

Correlation of Sodium:
Sodium and Iron - 0.8
Sodium and Protein - 0.8
Sodium and Dietary Fiber - 0.6
Sodium and Fat - 0.85
Sodium and Calories - 0.71

Correlation of Dietary Fiber:
Dietary Fiber and Iron - 0.7
Dietary Fiber and Sodium - 0.6
Dietary Fiber and Fat - 0.58
Dietary Fiber and Calories - 0.54

Correlation of Protein:
Protein and Iron - 0.7
Protein and Dietary Fiber - 0.6
Protein and Sodium - 0.8
Protein and Fat - 0.81
Protein and Calories - 0.79
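Pairs like these can also be ranked programmatically instead of being read off the heatmap. The sketch below is one possible approach, shown on a hypothetical three-column toy frame: it keeps only the upper triangle of the correlation matrix (so each pair appears once and the diagonal is skipped) and sorts the remaining values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with three numeric columns.
toy = pd.DataFrame({
    "Total Fat": [13.0, 8.0, 23.0, 28.0, 23.0],
    "Calories from Fat": [120, 70, 200, 250, 210],
    "Sugars": [3, 3, 2, 2, 2],
})

corr = toy.corr(method="pearson")

# True strictly above the diagonal: each pair once, no self-correlations.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)

# Flatten to a Series indexed by (row variable, column variable).
pairs = corr.where(upper).stack().dropna().sort_values(ascending=False)

print(pairs)
print("most correlated pair:", pairs.index[0])
```

Sorting by pairs.abs() instead would also surface strong negative correlations.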

In [ ]: 5. Which item contributes maximum to the Sodium intake?

In [129]: df.groupby('Item')['Sodium'].mean().sort_values(ascending=False).head(20)

Out[129]: Item

Chicken McNuggets (40 piece) 3600

Big Breakfast with Hotcakes and Egg Whites (Large Biscuit) 2290

Big Breakfast with Hotcakes (Large Biscuit) 2260

Big Breakfast with Hotcakes and Egg Whites (Regular Biscuit) 2170

Big Breakfast with Hotcakes (Regular Biscuit) 2150

Chicken McNuggets (20 piece) 1800

Bacon Clubhouse Crispy Chicken Sandwich 1720

Big Breakfast with Egg Whites (Large Biscuit) 1700

Big Breakfast (Large Biscuit) 1680

Big Breakfast with Egg Whites (Regular Biscuit) 1590

Bacon Clubhouse Grilled Chicken Sandwich 1560

Big Breakfast (Regular Biscuit) 1560

Premium McWrap Chicken & Bacon (Crispy Chicken) 1540

Steak, Egg & Cheese Bagel 1510

Bacon, Egg & Cheese Bagel with Egg Whites 1480

Bacon, Egg & Cheese Bagel 1480

Premium McWrap Southwest Chicken (Crispy Chicken) 1480

Steak & Egg Biscuit (Regular Biscuit) 1470

Bacon Clubhouse Burger 1470

Quarter Pounder with Bacon & Cheese 1440

Name: Sodium, dtype: int64

INFERENCE:

The item that contributes the most to sodium intake is Chicken McNuggets (40 piece), with a Sodium value of 3600.
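When each item appears only once, the top entries can be found without grouping. The sketch below is a minimal illustration on a three-row toy frame whose values are copied from the outputs in this notebook; nlargest() and idxmax() are standard pandas calls.

```python
import pandas as pd

# Toy frame with three items and their Sodium values from the outputs above.
toy = pd.DataFrame({
    "Item": ["Chicken McNuggets (40 piece)",
             "Chicken McNuggets (20 piece)",
             "Egg McMuffin"],
    "Sodium": [3600, 1800, 750],
})

# The two rows with the largest Sodium values, largest first.
top = toy.nlargest(2, "Sodium")[["Item", "Sodium"]]

# The single item with the highest Sodium value.
top_item = toy.loc[toy["Sodium"].idxmax(), "Item"]

print(top)
print("highest sodium:", top_item)
```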


In [ ]: 6. Which 4 food items contain the most amount of Saturated Fat?

In [99]: df.groupby('Item')['Saturated Fat'].mean().sort_values(ascending=False).head(20)

Out[99]: Item

McFlurry with M&M’s Candies (Medium) 20.00

Big Breakfast with Hotcakes (Large Biscuit) 20.00

Chicken McNuggets (40 piece) 20.00

Frappé Chocolate Chip (Large) 20.00

Double Quarter Pounder with Cheese 19.00

Big Breakfast with Hotcakes (Regular Biscuit) 19.00

Big Breakfast (Large Biscuit) 18.00

Frappé Mocha (Large) 17.00

Frappé Chocolate Chip (Medium) 17.00

Big Breakfast (Regular Biscuit) 17.00

Frappé Caramel (Large) 17.00

Big Breakfast with Hotcakes and Egg Whites (Regular Biscuit) 16.00

Big Breakfast with Hotcakes and Egg Whites (Large Biscuit) 16.00

Steak & Egg Biscuit (Regular Biscuit) 16.00

Strawberry Shake (Large) 15.00

Sausage Biscuit with Egg (Large Biscuit) 15.00

Vanilla Shake (Large) 15.00

Chocolate Shake (Large) 15.00

Frappé Caramel (Medium) 15.00

Shamrock Shake (Large) 15.00

Name: Saturated Fat, dtype: float64

In [ ]: INFERENCE:

The four items that contain the most saturated fat (20.00 each) are listed below

1. McFlurry with M&M’s Candies (Medium)

2. Big Breakfast with Hotcakes (Large Biscuit)

3. Chicken McNuggets (40 piece)

4. Frappé Chocolate Chip (Large)
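Since these four items are tied, a fixed-size "top N" can silently drop tied rows. The sketch below illustrates this on a toy frame built from five values in the output above; keep="all" is a documented option of DataFrame.nlargest that retains every row tied with the cutoff.

```python
import pandas as pd

# Toy frame: the four tied items plus the next one from the output above.
toy = pd.DataFrame({
    "Item": ["McFlurry with M&M's Candies (Medium)",
             "Big Breakfast with Hotcakes (Large Biscuit)",
             "Chicken McNuggets (40 piece)",
             "Frappé Chocolate Chip (Large)",
             "Double Quarter Pounder with Cheese"],
    "Saturated Fat": [20.0, 20.0, 20.0, 20.0, 19.0],
})

# A plain top 3 returns exactly three rows and hides the fourth tied item.
top3_strict = toy.nlargest(3, "Saturated Fat")

# keep="all" returns every row tied with the third-largest value.
top3_with_ties = toy.nlargest(3, "Saturated Fat", keep="all")

print(len(top3_strict), "rows without ties kept")
print(len(top3_with_ties), "rows with ties kept")
```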
