Natural Language Processing Assignment
PROBLEM STATEMENT:
SOURCE CODE:
PREPROCESSING:
1. TOKENIZATION:
2. STEMMING:
3. PUNCTUATION REMOVAL:
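The three preprocessing steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, the sample sentence, and the order in which the steps are applied are hypothetical, and `naive_stem` is a crude suffix stripper standing in for a real stemmer (such as a Porter stemmer), not the method used in the assignment.

```python
import string

def remove_punctuation(text):
    """Delete every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def naive_stem(token):
    """Strip one common English suffix (assumption: a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Apply punctuation removal, tokenization, and stemming in that order."""
    return [naive_stem(tok) for tok in tokenize(remove_punctuation(text))]

# Hypothetical sample sentence, not taken from the assignment data.
print(preprocess("Loving the new phones, totally worth it!"))
```

Removing punctuation before tokenizing keeps the whitespace split simple; a library tokenizer would handle contractions and hashtags more carefully.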
PROGRAM:
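from sklearn.linear_model import LogisticRegression

# The first 31962 rows of the bag-of-words matrix are the training set; the remaining rows are the test set.
train_bow = bow[:31962, :]
test_bow = bow[31962:, :]
lreg = LogisticRegression()
# xtrain_bow and ytrain are assumed to come from a train/validation split of train_bow and its labels.
lreg.fit(xtrain_bow, ytrain)  # training the model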
Output: 0.53
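The program slices a matrix named `bow` that this section never defines. As a rough sketch of what a bag-of-words matrix contains, the plain-Python example below builds one from already tokenized documents and splits it by row position, the same way `bow[:31962, :]` and `bow[31962:, :]` do. The function `build_bow`, the toy documents, and the split point `n_train` are all hypothetical; the assignment presumably uses a library vectorizer and a sparse matrix instead.

```python
from collections import Counter

def build_bow(tokenized_docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts vocabulary[j] in document i."""
    vocabulary = sorted({tok for doc in tokenized_docs for tok in doc})
    matrix = []
    for doc in tokenized_docs:
        counts = Counter(doc)  # a missing term counts as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix

# Toy documents (assumption: three tiny, already tokenized texts).
docs = [["good", "day", "good"], ["bad", "day"], ["good", "mood"]]
vocabulary, bow_rows = build_bow(docs)

# Split by row position, mirroring the train/test slicing on a toy scale.
n_train = 2
train_rows = bow_rows[:n_train]
test_rows = bow_rows[n_train:]

print(vocabulary)
print(train_rows)
print(test_rows)
```

Each row is one document and each column is one vocabulary term, so the row-wise split only works if training and test documents were stacked in that order before vectorizing.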
SCORE CALCULATION:
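test_pred = lreg.predict_proba(test_bow)  # class probabilities for the test set
test_pred_int = test_pred[:,1] >= 0.3  # label 1 if the positive-class probability is at least 0.3
test_pred_int = test_pred_int.astype(int)  # np.int was removed in NumPy 1.24; use the built-in int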
test['label'] = test_pred_int
submission = test[['id','label']]
submission.to_csv('sub_lreg_bow.csv', index=False)
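The thresholding step above can be illustrated without NumPy, together with one way a score might be computed from the resulting labels. This is a sketch under assumptions: the section does not name the metric it reports, so the F1 score is used here only as a plausible example, and the probabilities and true labels are made-up toy values, not results from the assignment.

```python
def threshold_predictions(probabilities, threshold=0.3):
    """Map each positive-class probability to 1 if it is >= threshold, else 0."""
    return [int(p >= threshold) for p in probabilities]

def f1(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy values (assumption): positive-class probabilities and matching true labels.
probs = [0.10, 0.35, 0.80, 0.29, 0.30]
labels = [0, 0, 1, 1, 1]

preds = threshold_predictions(probs)
print(preds)
print(f1(labels, preds))
```

A threshold below 0.5 labels more rows as positive, which trades some precision for recall; that trade is often worthwhile when the positive class is rare.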