DS100-1 WS 2.6 Enrico, DM


Worksheet 2.6
DS100-1
DATA VISUALIZATION IN PYTHON
APPLIED DATA SCIENCE
Name:

Enrico, Dionne Marc L.

Write code in a Jupyter notebook as required by the problems. Copy both the code and the output as a screen grab or screenshot and
paste them here. Be sure to apply the necessary customizations.

1 Import gdp_usa.csv. Use matplotlib to show the increase in GDP each year.
Code and Output

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('gdp_usa.csv', header=0)
df = df.rename(columns={"VALUE":"GDP"})
df.plot(x="DATE", y=["GDP"], kind="line", figsize=(10, 5))
plt.show()
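
The line plot above shows the GDP level over time. If the yearly increase itself is wanted, one option is to plot the difference between consecutive rows. The sketch below uses a small made-up table in place of gdp_usa.csv; the integer DATE values and the GDP numbers are assumptions for illustration only.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for gdp_usa.csv: illustrative values, not real GDP figures.
df = pd.DataFrame({
    "DATE": [2001, 2002, 2003, 2004, 2005],
    "GDP": [100.0, 104.0, 110.0, 118.0, 125.0],
})

# Year-over-year increase: each value minus the previous row's value.
df["INCREASE"] = df["GDP"].diff()

# The first row has no previous year, so its increase is NaN; drop it before plotting.
changes = df.dropna(subset=["INCREASE"])

ax = changes.plot(x="DATE", y="INCREASE", kind="bar", figsize=(10, 5), legend=False)
ax.set_xlabel("Year")
ax.set_ylabel("Change in GDP from previous year")
ax.set_title("Year-over-year GDP increase")
plt.show()
```

With the real file, the same diff() call would apply after the rename to "GDP", provided the rows are sorted by date.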

2 Use the previous import to prepare another plot (in red) showing only the years 2001 to 2010.
Code and Output

import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('gdp_usa.csv', header=0)
df = df[(df['DATE'] > 2000) & (df['DATE'] <= 2010)]
df.plot(x="DATE", y=["VALUE"], kind="line", figsize=(10, 5), color='red')
plt.show()
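
The filter above compares DATE with integers, which works when the column holds numeric years. If DATE instead holds date strings such as '2001-01-01' (an assumption; the file layout is not shown in this worksheet), the column can be parsed first and filtered by year. A minimal sketch with made-up rows:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical rows in an assumed 'YYYY-MM-DD' layout; values are illustrative, not real GDP.
df = pd.DataFrame({
    "DATE": ["1999-01-01", "2001-01-01", "2005-01-01", "2010-01-01", "2012-01-01"],
    "VALUE": [90.0, 100.0, 120.0, 150.0, 160.0],
})

# Convert the strings to datetimes so the year can be extracted.
df["DATE"] = pd.to_datetime(df["DATE"])

# between() is inclusive on both ends, so this keeps 2001 through 2010.
subset = df[df["DATE"].dt.year.between(2001, 2010)]

subset.plot(x="DATE", y="VALUE", kind="line", figsize=(10, 5), color="red")
plt.show()
```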

3 Import ‘degrees-women-usa.csv’. Prepare a 2x2 subplot to visualize the % degrees awarded to women yearly in
Agriculture (green), Business (yellow), Education (blue) and Psychology (red).
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
Degrees = pd.read_csv('degrees-women-usa.csv')

# Colors follow the problem statement:
# Agriculture (green), Business (yellow), Education (blue), Psychology (red)
plt.subplot(2, 2, 1)
plt.plot(Degrees['Year'], Degrees['Agriculture'], color='green')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Agriculture')

plt.subplot(2, 2, 2)
plt.plot(Degrees['Year'], Degrees['Business'], color='yellow')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Business')

plt.subplot(2, 2, 3)
plt.plot(Degrees['Year'], Degrees['Education'], color='blue')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Education')

plt.subplot(2, 2, 4)
plt.plot(Degrees['Year'], Degrees['Psychology'], color='red')
plt.xlabel('Year')
plt.ylabel('% Degrees')
plt.title('Psychology')
plt.tight_layout()
plt.show()
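
The four panels repeat the same calls, so the layout can also be built with plt.subplots and a loop. This is only a sketch of that alternative: the DataFrame below is a made-up stand-in for degrees-women-usa.csv, and the percentages in it are not real data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for degrees-women-usa.csv; the percentages are made up.
degrees = pd.DataFrame({
    "Year": [1970, 1980, 1990, 2000, 2010],
    "Agriculture": [4.0, 30.0, 33.0, 45.0, 49.0],
    "Business": [9.0, 37.0, 47.0, 50.0, 48.0],
    "Education": [74.0, 74.0, 78.0, 77.0, 79.0],
    "Psychology": [44.0, 65.0, 72.0, 77.0, 77.0],
})

# Field-to-color mapping taken from the problem statement.
colors = {
    "Agriculture": "green",
    "Business": "yellow",
    "Education": "blue",
    "Psychology": "red",
}

fig, axes = plt.subplots(2, 2, figsize=(10, 6))

# axes.flat walks the 2x2 grid row by row, matching subplot positions 1 to 4.
for ax, (field, color) in zip(axes.flat, colors.items()):
    ax.plot(degrees["Year"], degrees[field], color=color)
    ax.set_title(field)
    ax.set_xlabel("Year")
    ax.set_ylabel("% Degrees")

fig.tight_layout()
plt.show()
```

Keeping the colors in one dictionary makes it harder to assign a color to the wrong field.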

4 Import gapminder.csv. Use matplotlib to show how life expectancy varies with gdp_cap.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('gapminder.csv')
# The problem asks for matplotlib, with life expectancy as a function of GDP per capita
plt.scatter(data['gdp_cap'], data['life_exp'])
plt.xlabel('GDP per Capita')
plt.ylabel('Life Expectancy')
plt.title('Life Expectancy vs GDP per Capita')
plt.show()

5 Improve the previous visualization by including population.


Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('gapminder.csv')
sns.scatterplot(x='gdp_cap', y='life_exp', data=data, hue='population')
plt.xlabel('GDP per Capita')
plt.ylabel('Life Expectancy')
plt.title('Life Expectancy vs GDP per Capita')
plt.show()
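
Population can also be encoded as marker size instead of color, which gives a bubble chart. The sketch below assumes the column names used in the code above (gdp_cap, life_exp, population) and uses made-up rows in place of gapminder.csv; the 500-point maximum marker area is an arbitrary choice.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical rows using the column names from the code above; values are illustrative only.
data = pd.DataFrame({
    "gdp_cap": [1000.0, 5000.0, 20000.0, 40000.0],
    "life_exp": [55.0, 65.0, 75.0, 80.0],
    "population": [50e6, 200e6, 10e6, 80e6],
})

# Rescale population so the largest country gets a marker area of 500 points^2.
sizes = data["population"] / data["population"].max() * 500

# alpha makes overlapping bubbles visible through each other.
plt.scatter(data["gdp_cap"], data["life_exp"], s=sizes, alpha=0.6)
plt.xlabel("GDP per Capita")
plt.ylabel("Life Expectancy")
plt.title("Life Expectancy vs GDP per Capita (marker size = population)")
plt.show()
```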

6 Refer to the gapminder import. Use seaborn to show the relationship among all numeric data. Group the data according to
continent.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('gapminder.csv')
# pairplot draws every numeric column against every other; hue groups the points by continent
sns.pairplot(data, hue='cont')
plt.suptitle('Pairwise Relationships of Numeric Variables by Continent', y=1.02)
plt.show()

7 Import auto.csv. Use seaborn to show how horsepower varies with acceleration. We also want to see the distribution of
both variables in the same plot.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('auto.csv')
# jointplot combines a scatter plot with the marginal distribution of each variable
g = sns.jointplot(x='accel', y='hp', data=data)
g.set_axis_labels('Acceleration', 'Horsepower')
plt.suptitle('Horsepower vs Acceleration', y=1.02)
plt.show()

8 Using the auto dataset, prepare side-by-side plots showing the distribution of the number of cylinders. Use a swarm plot on
the left side and a strip plot on the right side.
Code and Output

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv('auto.csv')

# Left: swarm plot
plt.subplot(1, 2, 1)
sns.swarmplot(x='cyl', y='marker', data=data, size=0.6)
plt.xlabel('Number of Cylinders')
plt.ylabel('Cylinder Markers')
plt.title('Swarm Plot')

# Right: strip plot
plt.subplot(1, 2, 2)
sns.stripplot(x='cyl', y='marker', data=data)
plt.xlabel('Number of Cylinders')
plt.ylabel('Cylinder Markers')
plt.title('Strip Plot')
plt.tight_layout()
plt.show()
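
Seaborn's axes-level functions such as swarmplot and stripplot accept an ax argument, so the two panels can also be placed explicitly on axes created with plt.subplots. A minimal sketch, using a made-up table in place of auto.csv and hp on the y-axis purely for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-in for auto.csv; the 'cyl' and 'hp' values are made up.
data = pd.DataFrame({
    "cyl": [4, 4, 4, 6, 6, 8, 8, 8],
    "hp": [70, 85, 90, 105, 110, 150, 165, 180],
})

# One row, two columns; sharey keeps both panels on the same vertical scale.
fig, (ax_swarm, ax_strip) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Passing ax= tells seaborn which panel to draw into.
sns.swarmplot(x="cyl", y="hp", data=data, ax=ax_swarm)
sns.stripplot(x="cyl", y="hp", data=data, ax=ax_strip)

ax_swarm.set_title("Swarm Plot")
ax_strip.set_title("Strip Plot")
for ax in (ax_swarm, ax_strip):
    ax.set_xlabel("Number of Cylinders")

fig.tight_layout()
plt.show()
```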
