
Ventures Regression


QM Regression Python

Ventures, Catherine B.
BSIT 2-B
Regression in Python

● Create a new notebook in Kaggle

● Upload/Choose a dataset

● Import the Python Libraries

● Visualize Data

● Multiple Regression
import numpy as np
import pandas as pd
import seaborn as sns
import os

# List every file available under the Kaggle input directory
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

Importing Libraries and Warnings
#Importing the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# import warnings
import warnings
warnings.filterwarnings("ignore")
# We will use some methods from the sklearn module
from sklearn import linear_model
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split, cross_val_score
Load the dataset, store it in the variable df, and print the first rows
df = pd.read_csv("/kaggle/input/whr-2023-csv/WHR_2023.csv")

# head() returns the first 5 rows by default; pass a number, e.g. df.head(4), for a different count
df.head()
Output
Print the dimension of the dataframe
Print the correlation for each column
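The slides show these two steps without code. A minimal sketch of both, assuming pandas 1.5 or later: the small `demo` frame is a made-up stand-in for df so the snippet runs on its own, and its values are not taken from the WHR data.

```python
import pandas as pd

# Hypothetical stand-in for df (illustrative values, not from WHR_2023.csv)
demo = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "gdp_per_capita": [1.0, 2.0, 3.0, 4.0],
    "happiness_score": [2.0, 4.0, 6.0, 8.0],
})

# Dimension of the dataframe as (rows, columns)
print(demo.shape)

# Pairwise correlation of the numeric columns; text columns such as
# "country" are skipped because of numeric_only=True
corr = demo.corr(numeric_only=True)
print(corr)
```

On the real data the same calls would be `df.shape` and `df.corr(numeric_only=True)`.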
Setting Y and X variables
# Predictor (independent) variables
x = df[['gdp_per_capita', 'social_support', 'healthy_life_expectancy',
        'freedom_to_make_life_choices', 'generosity',
        'perceptions_of_corruption']]

# Target (dependent) variable
y = df['happiness_score']
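Create the distribution plot

# sns.distplot is deprecated in recent seaborn releases; histplot with kde=True draws a comparable plot
sns.histplot(df['happiness_score'], kde=True)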
Output
Printing Scatter Chart
sns.pairplot(df,
             x_vars=['gdp_per_capita', 'social_support', 'healthy_life_expectancy',
                     'freedom_to_make_life_choices', 'generosity',
                     'perceptions_of_corruption'],
             y_vars='happiness_score', height=4, aspect=1, kind='scatter')
Output
pairplot = sns.pairplot(df, x_vars=['gdp_per_capita'], y_vars='happiness_score',
                        height=3.8, aspect=1, kind='scatter')
# Overlay a fitted regression line on the scatter plot
for ax in pairplot.axes.flat:
    sns.regplot(x=df['gdp_per_capita'], y=df['happiness_score'], ax=ax,
                scatter=False, color='red')
plt.title("Relationship between Happiness Score and GDP Per Capita")
plt.show()
Output
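Create the correlation matrix and display it using heatmap
# numeric_only=True skips text columns, which would otherwise raise an error in pandas 2.x
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')

plt.show()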
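The outline lists Multiple Regression as the final step, and the imports already bring in LinearRegression, train_test_split, and the error metrics, but the slides stop before fitting a model. The following is one possible sketch of that step, not the author's own code. The synthetic `demo` frame, the 80/20 split, and `random_state=42` are assumptions made so the snippet runs without the Kaggle file; only the column names come from the slides.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

features = ['gdp_per_capita', 'social_support', 'healthy_life_expectancy',
            'freedom_to_make_life_choices', 'generosity',
            'perceptions_of_corruption']

# Synthetic stand-in for df (assumption): random predictors and a score
# built from them plus noise, so there is a linear signal to recover
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((120, len(features))), columns=features)
demo['happiness_score'] = (2.0 + demo[features].sum(axis=1)
                           + rng.normal(0, 0.1, 120))

x = demo[features]
y = demo['happiness_score']

# Hold out 20% of the rows to evaluate the fitted model
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42)

# Fit a multiple linear regression on the training rows
model = LinearRegression()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)

print("Intercept:", model.intercept_)
print("Coefficients:", dict(zip(features, model.coef_)))
print("MAE:", mean_absolute_error(y_test, y_pred))
print("MSE:", mean_squared_error(y_test, y_pred))
print("R^2:", model.score(x_test, y_test))
```

With the real dataset, x and y as defined earlier would replace the synthetic frame. Rows with missing values would likely need to be dropped or filled first, since LinearRegression does not accept NaN.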
“If debugging is the process of removing software bugs, then programming must be the process of putting them in.”

-Edsger Dijkstra, computer science pioneer
Thank You.
Ventures, Catherine B.
BSIT 2-B
