Chapter 10 Logistic Reg (Python)
Logistic Regression
⚫ Extends the idea of linear regression to situations where the outcome variable is categorical
q = number of predictors
The Fix: use the logistic response function (Equation 10.2 in the textbook)
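The logistic response function can be sketched in a few lines of Python. In its standard form it is p = 1 / (1 + e^-(β0 + β1x1 + ... + βqxq)), which keeps p strictly between 0 and 1 however large or small the linear part becomes. The function name `logistic_response` and the coefficient values below are hypothetical, chosen only for illustration.

```python
import math

def logistic_response(x, beta0, betas):
    """Return p = 1 / (1 + exp(-(beta0 + sum(beta_i * x_i))))."""
    linear = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients for q = 2 predictors (illustration only).
beta0, betas = -1.0, [0.8, -0.5]

# The linear part ranges from very negative to very positive,
# but the resulting p always stays inside (0, 1).
for x in ([-10.0, 10.0], [0.0, 0.0], [10.0, -10.0]):
    p = logistic_response(x, beta0, betas)
    print(x, round(p, 4))
```

When the linear part equals 0, the function returns exactly 0.5, which is why 0.5 is a natural default cutoff.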
Step 2: The Odds
The odds of an event are defined as:
Odds = p / (1 - p)    (eq. 10.3)
where p = probability of the event
eq. 10.5
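The link between the probability and the odds can be written out as a short derivation. This is the standard algebra for the logistic model, not a reproduction of the textbook's numbered equations, so the correspondence to specific equation numbers is an assumption.

```latex
\begin{align*}
p &= \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q)}} \\
\text{Odds} = \frac{p}{1 - p} &= e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q} \\
\log(\text{Odds}) &= \beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q
\end{align*}
```

The last line shows why the model is called a regression: the log of the odds (the logit) is a linear function of the q predictors.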
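import pandas as pd

# load the data and drop the ID and ZIP Code columns
bank_df = pd.read_csv('UniversalBank.csv')
bank_df.drop(columns=['ID', 'ZIP Code'], inplace=True)
# replace spaces in column names with underscores
bank_df.columns = [c.replace(' ', '_') for c in bank_df.columns]
# separate the outcome (Personal_Loan) from the predictors
y = bank_df['Personal_Loan']
X = bank_df.drop(columns=['Personal_Loan'])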
Fitting Model
from sklearn.model_selection import train_test_split

# partition data: 60% training, 40% validation
train_X, valid_X, train_y, valid_y = train_test_split(X, y,
                                                      test_size=0.4, random_state=1)
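The later code uses a fitted model named logit_reg, but the fitting call itself is not shown here. A minimal sketch of that step, assuming scikit-learn's LogisticRegression, might look like the following. It runs on synthetic stand-in data rather than UniversalBank.csv, and every `demo_` name, the column names, and the model settings are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical; NOT the UniversalBank data).
rng = np.random.default_rng(1)
demo_X = pd.DataFrame({'Income': rng.normal(size=500),
                       'Family': rng.normal(size=500)})
true_logit = -0.5 + 1.5 * demo_X['Income'] - 0.7 * demo_X['Family']
demo_y = pd.Series(rng.random(500) < 1 / (1 + np.exp(-true_logit))).astype(int)

# Same partition settings as in the slides: 40% validation, fixed seed.
demo_train_X, demo_valid_X, demo_train_y, demo_valid_y = train_test_split(
    demo_X, demo_y, test_size=0.4, random_state=1)

# Fit on the training partition only. Note: scikit-learn's default
# settings apply L2 regularization, so coefficients are slightly shrunk.
demo_logit_reg = LogisticRegression(max_iter=1000)
demo_logit_reg.fit(demo_train_X, demo_train_y)

print('intercept', demo_logit_reg.intercept_[0])
print(pd.DataFrame({'coeff': demo_logit_reg.coef_[0]},
                   index=demo_X.columns).transpose())
```

Fitting on the training partition and keeping the validation partition aside is what makes the later validation predictions an honest check of the model.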
       Education_Graduate  Education_Advanced/Professional
coeff            4.192204                         4.341697

AIC -709.1524769205962
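Because the log of the odds is linear in the predictors, exponentiating a coefficient gives the factor by which the odds are multiplied for a one-unit increase in that predictor, holding the others fixed. The sketch below applies this to the two coefficients printed above. It assumes these are dummy variables measured against an omitted reference education level, which the output suggests but does not state.

```python
import math

# Coefficients copied from the output above.
coefficients = {
    'Education_Graduate': 4.192204,
    'Education_Advanced/Professional': 4.341697,
}

# exp(coefficient) = multiplicative change in the odds
# (relative to the assumed reference category for a dummy variable).
odds_multipliers = {name: math.exp(b) for name, b in coefficients.items()}

for name, m in odds_multipliers.items():
    print(f'{name}: odds multiplied by about {m:.1f}')
```

A positive coefficient always corresponds to a multiplier above 1, and a negative coefficient to a multiplier between 0 and 1.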
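# predicted class and class probabilities for the validation set
logit_reg_pred = logit_reg.predict(valid_X)
logit_reg_proba = logit_reg.predict_proba(valid_X)
# collect actual outcome, p(0), p(1), and predicted class in one table
logit_result = pd.DataFrame({'actual': valid_y,
                             'p(0)': [p[0] for p in logit_reg_proba],
                             'p(1)': [p[1] for p in logit_reg_proba],
                             'predicted': logit_reg_pred})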
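For a two-class model, scikit-learn's predict returns the class with the larger probability, which is the same as applying a cutoff of 0.5 to p(1). The sketch below makes that rule explicit and shows how a different cutoff changes the classifications. The `classify` helper and the probability values are hypothetical and are not taken from the validation results.

```python
def classify(p1, cutoff=0.5):
    """Assign class 1 when the probability of class 1 exceeds the cutoff."""
    return 1 if p1 > cutoff else 0

# Hypothetical p(1) values (illustration only).
demo_p1 = [0.02, 0.47, 0.51, 0.93]

# Default cutoff of 0.5, matching what predict does for two classes.
demo_pred = [classify(p) for p in demo_p1]
print(demo_pred)

# A stricter cutoff labels fewer records as class 1.
demo_pred_strict = [classify(p, cutoff=0.9) for p in demo_p1]
print(demo_pred_strict)
```

Raising the cutoff trades fewer false positives for more missed positives, so the choice depends on the relative cost of each error.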