ProgrammingFinal Q1-3

1. The document scrapes COVID-19 data from the Worldometers website and converts it to a Pandas DataFrame.
2. It extracts data from an HTML table on the site, including country names, total cases, deaths, and tests.
3. The final DataFrame contains 231 countries, with columns for country, total cases, deaths, tests, and population.

Uploaded by

Pragya Wasan


26/02/2023, 21:09 PragyaWasan_ProgrammingFinal_Q1-3.ipynb - Colaboratory

Question 1: Web Scraping COVID19 data

import requests  # HTTP library for getting and posting content

import bs4 as bs  # BeautifulSoup4: pulls data out of HTML and XML,
# letting us query the markup for specific content

import pandas as pd  # Pandas, to convert the result into a DataFrame

import numpy as np

# a GET request will download the HTML webpage.
source = requests.get("https://www.worldometers.info/coronavirus/")
source

A response code of 200 means the request succeeded. We now parse the page with BeautifulSoup.

# beautifulsoup can parse HTML code

soup = bs.BeautifulSoup(source.content, features='html.parser')

type(soup)

bs4.BeautifulSoup

# Extract the world data table
table = soup.find('table', id='main_table_countries_today')

# Find all the rows in the table
rows = table.find_all('tr')

# Extract the column names from the table
headers = [th.text.strip() for th in table.find_all('th')]

# Initialize an empty list to hold the data for each row
data_rows = []

# Extract the data from each row
for tr in table.find_all('tr')[9:]:
    data_row = [td.text.strip() for td in tr.find_all('td')]
    if data_row[1]:
        data_rows.append(data_row)

# Create a Pandas DataFrame from the data
df = pd.DataFrame(data_rows, columns=headers)
df


# Country,Other TotalCases NewCases TotalDeaths NewDeaths TotalRecovered NewRecovered ActiveCases Serious,Cri

0 1 USA 105,172,692 1,144,461 102,605,010 1,423,221

1 2 India 44,686,017 530,769 44,153,099 2,149

2 3 France 39,612,797 164,877 39,377,523 70,397

3 4 Germany 38,111,063 167,812 37,661,500 +2,900 281,751

4 5 Brazil 37,023,465 698,933 36,185,975 138,557

... ... ... ... ... ... ... ... ... ...

234 Total: 67,971,613 1,350,389 66,209,615 411,609

235 Total: 13,987,204 26,076 13,826,303 134,825

236 Total: 12,791,231 258,576 12,074,558 458,097

237 Total: 721 15 706 0

238 Total: 679,610,846 +41,119 6,797,693 +167 652,466,051 +64,818 20,347,102

239 rows × 22 columns

# Filter out rows that contain "Total" in the country name
df = df[~df['Country,Other'].str.contains('Total')]
df

# Country,Other TotalCases NewCases TotalDeaths NewDeaths TotalRecovered NewRecovered ActiveCases Serious,C

0 1 USA 105,172,692 1,144,461 102,605,010 1,423,221

1 2 India 44,686,017 530,769 44,153,099 2,149

2 3 France 39,612,797 164,877 39,377,523 70,397

3 4 Germany 38,111,063 167,812 37,661,500 +2,900 281,751

4 5 Brazil 37,023,465 698,933 36,185,975 138,557

... ... ... ... ... ... ... ... ... ...

226 227 Vatican City 29 29 0

227 228 Western Sahara 10 1 9 0

228 229 MS Zaandam 9 2 7 0

229 230 Tokelau 5 5

230 231 China 503,302 5,272 379,053 118,977

231 rows × 22 columns

Now we have all 231 countries mentioned in the dataset.
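The extraction steps above can be sketched on a small inline HTML table, which avoids depending on the live Worldometers page. The table contents and the helper name `table_to_frame` are made up for illustration; only the `find` and `find_all` calls mirror the notebook.

```python
import bs4 as bs
import pandas as pd

# Hypothetical miniature of the page: one header row, two country rows, one "Total:" row.
HTML = """
<table id="main_table_countries_today">
  <thead><tr><th>#</th><th>Country,Other</th><th>TotalCases</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>USA</td><td>105,172,692</td></tr>
    <tr><td>2</td><td>India</td><td>44,686,017</td></tr>
    <tr><td></td><td>Total:</td><td>149,858,709</td></tr>
  </tbody>
</table>
"""


def table_to_frame(html, table_id):
    """Parse one HTML table into a DataFrame of strings (illustrative helper)."""
    soup = bs.BeautifulSoup(html, features="html.parser")
    table = soup.find("table", id=table_id)
    headers = [th.text.strip() for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.text.strip() for td in tr.find_all("td")]
        # Header rows have no <td> cells; skip them and any ragged rows.
        if len(cells) == len(headers):
            rows.append(cells)
    return pd.DataFrame(rows, columns=headers)


frame = table_to_frame(HTML, "main_table_countries_today")
# Drop the summary row, as the notebook does with str.contains('Total').
countries = frame[~frame["Country,Other"].str.contains("Total")]
print(countries)
```

Matching rows by cell count, rather than by a fixed slice such as `[9:]`, is one way to stay robust if the page adds or removes header rows; whether that suits the real page is an assumption.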

# Keep only the desired columns
df = df[['Country,Other', 'TotalCases', 'TotalDeaths', 'TotalTests', 'Population']]
# Rename columns
df = df.rename(columns={'Country,Other': 'Country', 'TotalCases': 'Cases', 'TotalDeaths': 'Deaths', 'TotalTests': 'Tests'})
df.head()

Country Cases Deaths Tests Population

0 USA 105,172,692 1,144,461 1,164,075,621 334,805,269

1 India 44,686,017 530,769 918,324,498 1,406,631,776

2 France 39,612,797 164,877 271,490,188 65,584,518

3 Germany 38,111,063 167,812 122,332,384 83,883,596

4 Brazil 37,023,465 698,933 63,776,166 215,353,593

# Set the index of the DataFrame to be the country name
df = df.set_index('Country')

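# Intended to drop rows with zero Deaths or Tests. The columns still hold strings here,
# so comparing them with the integer 0 removes nothing (the output below keeps all 231 rows);
# rows with empty values are dropped in the next cell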
df = df[(df['Deaths'] != 0) & (df['Tests'] != 0)]
df


Cases Deaths Tests Population

Country

USA 105,172,692 1,144,461 1,164,075,621 334,805,269

India 44,686,017 530,769 918,324,498 1,406,631,776

France 39,612,797 164,877 271,490,188 65,584,518

Germany 38,111,063 167,812 122,332,384 83,883,596

Brazil 37,023,465 698,933 63,776,166 215,353,593

... ... ... ... ...

Vatican City 29 799

Western Sahara 10 1 626,161

MS Zaandam 9 2

Tokelau 5 1,378

China 503,302 5,272 160,000,000 1,448,471,400

231 rows × 4 columns

numeric_cols = ['Cases', 'Deaths', 'Tests', 'Population']
# Replace empty strings with NaN values
df[numeric_cols] = df[numeric_cols].replace('', np.nan)
# Drop rows with NaN values in any of the columns
df = df.dropna(subset=numeric_cols, how='any')
df[numeric_cols] = df[numeric_cols].apply(lambda x: x.str.replace(',', '').astype(int))
df

/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3641: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-
self[k1] = value[k2]

Cases Deaths Tests Population

Country

USA 105172692 1144461 1164075621 334805269

India 44686017 530769 918324498 1406631776

France 39612797 164877 271490188 65584518

Germany 38111063 167812 122332384 83883596

Brazil 37023465 698933 63776166 215353593

... ... ... ... ...

Macao 3514 121 7850 667490

Saint Pierre Miquelon 3452 2 25400 5759

Wallis and Futuna 3427 7 20508 10982

Montserrat 1403 8 17762 4965

China 503302 5272 160000000 1448471400

212 rows × 4 columns
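The SettingWithCopyWarning above arises because `df` was produced by filtering another DataFrame and is then assigned into. A minimal sketch of one way to do the same cleaning on an explicit copy, using a few hand-typed rows in the scraped string format; the frame `raw` and the name `clean` are illustrative, and the zero filter is applied only after the conversion to numbers so that it can take effect.

```python
import pandas as pd

# Hypothetical rows in the same string format as the scraped table.
raw = pd.DataFrame(
    {
        "Country": ["USA", "Vatican City", "Tokelau", "China"],
        "Cases": ["105,172,692", "29", "5", "503,302"],
        "Deaths": ["1,144,461", "", "", "5,272"],
        "Tests": ["1,164,075,621", "", "", "160,000,000"],
        "Population": ["334,805,269", "799", "1,378", "1,448,471,400"],
    }
).set_index("Country")

numeric_cols = ["Cases", "Deaths", "Tests", "Population"]

# Work on an explicit copy so later assignments do not target a slice.
clean = raw.copy()
for col in numeric_cols:
    # Strip thousands separators; empty strings become NaN via errors="coerce".
    clean[col] = pd.to_numeric(
        clean[col].str.replace(",", "", regex=False), errors="coerce"
    )

# Drop rows with missing values, then rows with zero deaths or tests.
clean = clean.dropna(subset=numeric_cols)
clean = clean[(clean["Deaths"] != 0) & (clean["Tests"] != 0)].astype("int64")
print(clean)
```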

# Test per case
df['tests_per_case'] = df['Tests'] / df['Cases']
# Sort it
df = df.sort_values(by='tests_per_case', ascending=False)
df.head(20)


<ipython-input-89-716111094444>:2: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-

df['tests_per_case'] = df['Tests'] / df['Cases']

Cases Deaths Tests Population tests_per_case

Country

China 503302 5272 160000000 1448471400 317.900585

UAE 1051732 2349 199149334 10081785 189.353689

Turks and Caicos 6551 38 611527 39741 93.348649

Oman 399449 4628 25000000 5323993 62.586213

Bermuda 18799 160 1027337 61939 54.648492

Saudi Arabia 829287 9614 45135808 35844909 54.427247

Rwanda 133170 1468 5959042 13600464 44.747631

Denmark 3174969 8244 129190878 5834950 40.690438

Bhutan 62615 21 2303734 787941 36.792047

Austria 5901938 21872 211273524 9066710 35.797313

Spain 13763336 119380 471036328 46719142 34.223994

Sierra Leone 7760 126 259958 8306436 33.499742

Gabon 48981 306 1621909 2331533 33.113023

Tonga 16801 13 535009 107749 31.843878

Yemen 11945 2159 329592 31154867 27.592465

Hong Kong 2883157 13451 76127725 7604299 26.404294

Gibraltar 20423 111 534283 33704 26.160848

Niger 9931 312 254538 26083660 25.630651

Chad 7675 194 191341 17413580 24.930423

Mali 32997 743 795054 21473764 24.094736

The tests_per_case column measures the number of COVID-19 tests conducted for every confirmed case in each country. A high tests_per_case value indicates that a country is conducting a large number of tests relative to its number of confirmed cases. This could be a sign of proactive and widespread testing, which can help in identifying and isolating infected individuals and reducing the spread of the virus.

On the other hand, countries with a low tests_per_case value may not be conducting enough tests to identify and isolate all infected individuals. This could lead to a higher spread of the virus and potentially overwhelm the healthcare system.

Overall, the tests_per_case metric provides useful information on a country's testing strategy and can give insight into its approach to controlling the spread of COVID-19. However, it is important to consider other factors such as population density, demographics, and healthcare resources when evaluating a country's response to the pandemic.

Question 2: Exploring prediction algorithms on vaccination data.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mpl_dates
from matplotlib.dates import MonthLocator
import seaborn as sns

url = 'https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/vaccinations.csv'
df = pd.read_csv(url)
df


location iso_code date total_vaccinations people_vaccinated people_fully_vaccinated total_boosters daily_va

0 Afghanistan AFG 2021-02-22 0.0 0.0 NaN NaN

1 Afghanistan AFG 2021-02-23 NaN NaN NaN NaN

2 Afghanistan AFG 2021-02-24 NaN NaN NaN NaN

3 Afghanistan AFG 2021-02-25 NaN NaN NaN NaN

4 Afghanistan AFG 2021-02-26 NaN NaN NaN NaN

... ... ... ... ... ... ... ... ...

157689 Zimbabwe ZWE 2022-10-05 12219760.0 6436704.0 4750104.0 1032952.0

157690 Zimbabwe ZWE 2022-10-06 NaN NaN NaN NaN

157691 Zimbabwe ZWE 2022-10-07 NaN NaN NaN NaN

157692 Zimbabwe ZWE 2022-10-08 NaN NaN NaN NaN

157693 Zimbabwe ZWE 2022-10-09 12222754.0 6437808.0 4751270.0 1033676.0

157694 rows × 16 columns

# To ensure that the date column is a Pandas datetime object, we can use the to_datetime function
df['date'] = pd.to_datetime(df['date'])
df

location iso_code date total_vaccinations people_vaccinated people_fully_vaccinated total_boosters daily_va

0 Afghanistan AFG 2021-02-22 0.0 0.0 NaN NaN

1 Afghanistan AFG 2021-02-23 NaN NaN NaN NaN

2 Afghanistan AFG 2021-02-24 NaN NaN NaN NaN

3 Afghanistan AFG 2021-02-25 NaN NaN NaN NaN

4 Afghanistan AFG 2021-02-26 NaN NaN NaN NaN

... ... ... ... ... ... ... ... ...

157689 Zimbabwe ZWE 2022-10-05 12219760.0 6436704.0 4750104.0 1032952.0

157690 Zimbabwe ZWE 2022-10-06 NaN NaN NaN NaN

157691 Zimbabwe ZWE 2022-10-07 NaN NaN NaN NaN

157692 Zimbabwe ZWE 2022-10-08 NaN NaN NaN NaN

157693 Zimbabwe ZWE 2022-10-09 12222754.0 6437808.0 4751270.0 1033676.0

157694 rows × 16 columns

# Group by date and sum the number of vaccinations per date using the groupby and sum functions

df = df.groupby('date')['total_vaccinations'].sum().reset_index()
df

date total_vaccinations

0 2020-12-02 0.000000e+00

1 2020-12-03 0.000000e+00

2 2020-12-04 5.000000e+00

3 2020-12-05 4.000000e+00

4 2020-12-06 4.000000e+00

... ... ...

811 2023-02-21 4.567548e+10

812 2023-02-22 4.484567e+10

813 2023-02-23 4.508844e+10

814 2023-02-24 4.328079e+10

815 2023-02-25 4.145385e+10

816 rows × 2 columns
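Because `total_vaccinations` is a running total and many rows are NaN on days without a report (as in the Afghanistan and Zimbabwe rows above), a plain per-date sum counts a location only on the days it reports. That is one plausible reason the summed series falls from 4.567548e+10 on 2023-02-21 to 4.145385e+10 on 2023-02-25, which a cumulative count should not do. One possible remedy, sketched below on made-up numbers, is to forward-fill each location's last known total before summing. The source file may also contain aggregate rows such as a world total; the `AGGREGATES` set here is a hypothetical placeholder for whichever such names need excluding.

```python
import numpy as np
import pandas as pd

# Made-up cumulative totals; NaN marks a day without a report.
toy = pd.DataFrame(
    {
        "location": ["A", "A", "A", "B", "B", "B", "World", "World", "World"],
        "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"] * 3),
        "total_vaccinations": [10, np.nan, 30, 5, 8, np.nan, 15, 18, 38],
    }
)

# Hypothetical names of aggregate rows that would double-count countries.
AGGREGATES = {"World"}

countries = (
    toy[~toy["location"].isin(AGGREGATES)]
    .sort_values(["location", "date"])
    .copy()
)

# Carry each location's last reported running total forward over gaps.
countries["total_filled"] = countries.groupby("location")["total_vaccinations"].ffill()

naive = countries.groupby("date")["total_vaccinations"].sum()
filled = countries.groupby("date")["total_filled"].sum()
print(pd.DataFrame({"naive": naive, "filled": filled}))
```

In this toy data the naive sum dips on the second day because location A did not report, while the forward-filled sum never decreases.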

# Fill missing values with 0
df = df.fillna(0)
df


date total_vaccinations

0 2020-12-02 0.000000e+00

1 2020-12-03 0.000000e+00

2 2020-12-04 5.000000e+00

3 2020-12-05 4.000000e+00

4 2020-12-06 4.000000e+00

... ... ...

811 2023-02-21 4.567548e+10

812 2023-02-22 4.484567e+10

813 2023-02-23 4.508844e+10

814 2023-02-24 4.328079e+10

815 2023-02-25 4.145385e+10

816 rows × 2 columns

Plotting the vaccinations

# Set up plot
fig, ax = plt.subplots(figsize=(12, 6))
ax.set_title("Total Number of COVID-19 Vaccinations Globally")
ax.set_xlabel("Date")
ax.set_ylabel("Total Number of Vaccinations")

# Filter data for dates starting from December 2020 and ending on February 1, 2023
start_date = '2020-12-01'
end_date = '2023-02-01'
df_filtered = df[(df['date'] >= start_date) & (df['date'] <= end_date)]

# Plot data
ax.plot(df_filtered['date'], df_filtered['total_vaccinations'], linewidth=0.7, color='blue')

# Set x-axis limits to start from December 2020 and end on February 1, 2023
ax.set_xlim(left=mpl_dates.datestr2num(start_date), right=mpl_dates.datestr2num(end_date))

# Format x-axis
date_format = mpl_dates.DateFormatter('%Y-%m-%d')
ax.xaxis.set_major_formatter(date_format)

# Set tick locator for every month
locator = MonthLocator(bymonthday=1)
ax.xaxis.set_major_locator(locator)

fig.autofmt_xdate()

# Rotate x-axis labels
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")

# Show plot
plt.show();

Linear regression

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Create X and y variables for linear regression
X = df.index.values.reshape(-1, 1)
y = df['total_vaccinations'].values.reshape(-1, 1)

# Train the model
reg = LinearRegression().fit(X, y)

# Generate predictions
y_pred = reg.predict(X)

# Plot the data and the predicted values
fig, ax = plt.subplots(figsize=(10, 6))
plt.plot(df['date'], y_pred, label='Predicted')
plt.plot(df['date'], df['total_vaccinations'], label='True')

# Set the title and axis labels
plt.title('Total Number of Vaccinations over Time Globally')
plt.xlabel('Date')
plt.ylabel('Total Vaccinations')

# Format x-axis
date_format = mpl_dates.DateFormatter('%b %d, %Y')
ax.xaxis.set_major_formatter(date_format)
fig.autofmt_xdate()

# Add a legend
plt.legend()

# Show the plot
plt.show()

# Calculate the mean squared error of the model
mse = mean_squared_error(y, y_pred)
print('Mean squared error:', mse)

Mean squared error: 3.4034585523800904e+19
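The mean squared error above is computed on the same rows the model was fitted to, and it is in squared units, which makes a value like 3.4e+19 hard to interpret. A minimal sketch, on synthetic data rather than the vaccination series, of holding out the last part of a time-ordered series and reporting RMSE in the target's own units. The 80/20 split and the synthetic trend are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic, roughly linear series standing in for a daily running total.
rng = np.random.default_rng(0)
days = np.arange(100).reshape(-1, 1)
totals = 1_000.0 * days.ravel() + rng.normal(scale=500.0, size=100)

# Keep time order: fit on the first 80 days, evaluate on the last 20.
split = 80
X_train, X_test = days[:split], days[split:]
y_train, y_test = totals[:split], totals[split:]

reg = LinearRegression().fit(X_train, y_train)
y_pred = reg.predict(X_test)

# RMSE is the square root of MSE, so it is in the same units as the target.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(f"Held-out RMSE: {rmse:.1f}")
```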

South Korea analysis

from sklearn.linear_model import RidgeCV, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

df_sk = pd.read_csv(url)
# Convert date column to Pandas datetime object
df_sk['date'] = pd.to_datetime(df_sk['date'])

# Filter the data for South Korea
df_sk = df_sk[df_sk['location'] == 'South Korea']
df_sk


location iso_code date total_vaccinations people_vaccinated people_fully_vaccinated total_boosters daily_va

131235 South Korea KOR 2021-02-26 50357.0 39175.0 11358.0 2.0

131236 South Korea KOR 2021-02-27 52325.0 40919.0 11581.0 2.0

131237 South Korea KOR 2021-02-28 53465.0 41947.0 11688.0 2.0

131238 South Korea KOR 2021-03-01 55673.0 43881.0 11959.0 2.0

131239 South Korea KOR 2021-03-02 122214.0 108915.0 12270.0 3.0

... ... ... ... ... ... ... ... ...

131958 South Korea KOR 2023-02-19 NaN 44845635.0 44429115.0 NaN

131959 South Korea KOR 2023-02-20 NaN 44845775.0 44429290.0 NaN

131960 South Korea KOR 2023-02-21 NaN 44846025.0 44429485.0 NaN

131961 South Korea KOR 2023-02-22 NaN 44846315.0 44429716.0 NaN

131962 South Korea KOR 2023-02-23 NaN 44846559.0 44429953.0 NaN

728 rows × 16 columns

# Define train and test start and end dates
train_start_date = '2021-08-01'
train_end_date = '2021-09-30'
test_start_date = '2021-10-01'
test_end_date = '2021-10-08'

# Filter the data by train and test dates
train_df = df_sk[(df_sk['date'] >= train_start_date) & (df_sk['date'] <= train_end_date)]
test_df = df_sk[(df_sk['date'] >= test_start_date) & (df_sk['date'] <= test_end_date)]

# Extract the target variable (total vaccinations)
y_train = train_df['total_vaccinations'].values.reshape(-1, 1)

# Extract the features (days since the start of the train period)
X_train = (train_df['date'] - pd.to_datetime(train_start_date)).dt.days.values.reshape(-1, 1)

# Create a Linear Regression model and fit it to the training data
model = LinearRegression()
model.fit(X_train, y_train)

LinearRegression()

# Choose a range of regularization strengths to try
alphas = np.logspace(-5, 5, num=11)

# Create a Ridge regression model and fit it to the training data
model = RidgeCV(alphas=alphas, cv=5)
model.fit(X_train, y_train)

RidgeCV(alphas=array([1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01, 1.e+02,
       1.e+03, 1.e+04, 1.e+05]),
        cv=5)

# Create a pipeline with a StandardScaler and a Ridge model
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('ridge', Ridge())
])

# Define the parameter grid to search
param_grid = {
    'ridge__alpha': np.logspace(-3, 3, num=7),
}

# Perform a grid search using cross-validation

grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='neg_mean_squared_error')
grid_search.fit(X_train, y_train)

# Print the best parameters and the mean squared error of the best estimator
print('Best Parameters:', grid_search.best_params_)
print('MSE:', -grid_search.best_score_)
Best Parameters: {'ridge__alpha': 1.0}
MSE: 1425516378643.75
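With `cv=5`, scikit-learn splits the training rows into five consecutive folds, so some folds are validated on dates that precede part of their training data. For a time-ordered target, one alternative is `TimeSeriesSplit`, which always validates on rows that come after the training rows. The sketch below uses synthetic data; the fold count and alpha grid are illustrative choices, not tuned values.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 61 days of a steadily growing running total.
rng = np.random.default_rng(1)
X = np.arange(61).reshape(-1, 1)
y = 50_000.0 * X.ravel() + rng.normal(scale=10_000.0, size=61)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("ridge", Ridge()),
])
param_grid = {"ridge__alpha": np.logspace(-3, 3, num=7)}

# Each split trains on an initial stretch and validates on the days after it.
cv = TimeSeriesSplit(n_splits=5)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="neg_mean_squared_error")
search.fit(X, y)

print("Best alpha:", search.best_params_["ridge__alpha"])
print("CV RMSE:", np.sqrt(-search.best_score_))
```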

# Make predictions for the test set using the best estimator
best_estimator = grid_search.best_estimator_
X_test = (test_df['date'] - pd.to_datetime(train_start_date)).dt.days.values.reshape(-1, 1)
y_test_pred = best_estimator.predict(X_test)

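# Calculate the Root Mean Squared Error for the test set
y_test = test_df['total_vaccinations'].values.reshape(-1, 1)
# Note: this line overwrites the grid-search prediction above with the
# prediction of `model` (the RidgeCV fit), so the RMSE below is for RidgeCV
y_test_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_test_pred)
rmse = np.sqrt(mse)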

print('Test RMSE:', rmse)

Test RMSE: 648388.3421654612

The test RMSE (about 648,388) is less than 750,000.

# Plot the true and predicted values for the test set
fig, ax = plt.subplots(figsize=(10, 6))
plt.plot(test_df['date'], y_test, label='True')
plt.plot(test_df['date'], y_test_pred, label='Predicted')
plt.title('COVID-19 Vaccinations in South Korea')
plt.xlabel('Date')
plt.ylabel('Total Vaccinations')

# Format x-axis
date_format = mpl_dates.DateFormatter('%b %d, %Y')
ax.xaxis.set_major_formatter(date_format)
fig.autofmt_xdate()

plt.legend()
plt.show()

Question 3: Matplotlib for Data Visualization

# Load the data for Google and Yahoo stocks and NY temperature files
google_data = pd.read_csv("google.txt", delimiter='\t')
yahoo_data = pd.read_csv("yahoo.txt", delimiter='\t')
ny_temp_data = pd.read_csv("ny.txt", delimiter='\t')

google_data


Modified Julian Date Stock Value

0 55463 527.21

1 55462 513.48

2 55461 516.00

3 55460 513.46

4 55459 508.28

... ... ...

1532 53242 106.00

1533 53241 104.87

1534 53240 109.40

1535 53237 108.31

1536 53236 100.34

1537 rows × 2 columns

yahoo_data

Modified Julian Date Stock Value

0 55463 14.40

1 55462 14.17

2 55461 14.04

3 55460 14.18

4 55459 13.86

... ... ...

3634 50191 29.25

3635 50190 27.00

3636 50189 28.75

3637 50188 32.25

3638 50185 33.00

3639 rows × 2 columns

ny_temp_data

Modified Julian Date Max Temperature

0 48988 52

1 49019 38

2 49047 31

3 49078 66

4 49108 75

... ... ...

197 55044 81

198 55075 71

199 55105 56

200 55136 68

201 55166 48

202 rows × 2 columns
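The date column in all three files is a Modified Julian Date (MJD), a day count starting at midnight on 17 November 1858. A minimal sketch of converting those integers to calendar dates, which can help when checking the plotted range; the helper name `mjd_to_date` is made up for illustration.

```python
from datetime import date, timedelta

# Modified Julian Date counts days from midnight on 17 November 1858.
MJD_EPOCH = date(1858, 11, 17)


def mjd_to_date(mjd):
    """Convert an integer Modified Julian Date to a calendar date."""
    return MJD_EPOCH + timedelta(days=int(mjd))


# First and last Google rows shown above.
print(mjd_to_date(55463))
print(mjd_to_date(53236))
```

The same epoch could probably be passed to `pd.to_datetime(..., unit='D', origin=pd.Timestamp('1858-11-17'))` to convert a whole index at once, though that variant is not exercised here.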

# Set the date column as the index for each dataframe
google_data.set_index('Modified Julian Date', inplace=True)
yahoo_data.set_index('Modified Julian Date', inplace=True)
ny_temp_data.set_index('Modified Julian Date', inplace=True)

google_data


Stock Value

Modified Julian Date

55463 527.21

55462 513.48

55461 516.00

55460 513.46

55459 508.28

... ...

53242 106.00

53241 104.87

53240 109.40

53237 108.31

53236 100.34

1537 rows × 1 columns

# Create a figure and axis object
fig, ax1 = plt.subplots(figsize=(10, 6))

# Plot the Yahoo stock data
ax1.plot(yahoo_data.index, yahoo_data['Stock Value'], color='purple', label='Yahoo! Stock Value')

# Plot the Google stock data
ax1.plot(google_data.index, google_data['Stock Value'], color='green', label='Google Stock Value')

ax1.set_ylabel('Value (Dollars)', fontsize=14, color='purple')
ax1.tick_params(axis='y', labelcolor='green')

# Create a second y-axis on the right side of the plot
ax2 = ax1.twinx()

# Plot the NY temperature data
ax2.plot(ny_temp_data.index, ny_temp_data['Max Temperature'], color='blue', label='NY Mon. High Temp', linestyle='--')

ax2.set_ylabel('Temperature (°F)', fontsize=14, color='blue')

ax2.tick_params(axis='y', labelcolor='blue')

# Set the y-axis limits and labels
ax2.set_ylim(-150, 100)
ax2.set_yticks(range(-150, 101, 50))
ax2.set_yticklabels(['{:.0f}'.format(x) for x in range(-150, 101, 50)])

# Set the title, x-label, and y-label
ax1.set_title('New York Temperature, Google, and Yahoo!', fontsize=16, fontweight='bold')
ax1.set_xlabel('Date (MJD)', fontsize=14)

# Add a legend to the graph
lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines1 + lines2, labels1 + labels2, loc='center left', fontsize=12)

# Show the graph
plt.show()
