Project 1

The document provides guidance for building a prediction model to predict whether consumers will dispute complaints using consumer complaint data. It notes that the goal is to build a model with an AUC score of at least 0.54 using the provided training and test datasets. Several feature engineering suggestions are provided, such as creating features from date columns and the complaint narrative text. The submission must match the sample submission format and number of rows to be graded correctly.

Uploaded by

SURJIT VERMA

Consumer Complaints Resolution

Consumer complaint resolution is important to any business. In this case we have been
given details of consumer complaints, along with whether the consumer disputed the
conclusion. If we can predict this, consumers who are likely to dispute can be given more
attention, both in how their complaints are handled and in how convincingly the final
conclusions are conveyed to them.

Your target here is to build a prediction model for the column "Consumer disputed".

Training Data = 'Consumer_Complaints_train.csv'

Test Data = 'Consumer_Complaints_test_share.csv'

All the column names are self-explanatory. You need to build your model on the train data.
The test data doesn't have the response column 'Consumer disputed'; you need to predict
those values and submit them in CSV format. The submission CSV should resemble the file

Sample Submission = 'sample_submission.csv'

Column names and value types should match exactly. The number of rows in the submission
CSV should also be exactly the same as in the test data. If this is not taken care of, your
submission will not be graded.
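The layout requirement above can be checked before uploading. The following is a minimal sketch, not part of the course material: the helper name `check_submission` and the toy column names are assumptions made for illustration, and the real column names should be taken from 'sample_submission.csv'.

```python
import pandas as pd


def check_submission(submission, sample, n_test_rows):
    """Return a list of layout problems; an empty list means the layout matches."""
    problems = []
    if list(submission.columns) != list(sample.columns):
        problems.append(
            f"column names differ: {list(submission.columns)} vs {list(sample.columns)}"
        )
    else:
        # Only compare value types when the column names already agree.
        for col in sample.columns:
            if submission[col].dtype != sample[col].dtype:
                problems.append(
                    f"dtype of {col!r} is {submission[col].dtype}, "
                    f"expected {sample[col].dtype}"
                )
    if len(submission) != n_test_rows:
        problems.append(f"{len(submission)} rows, expected {n_test_rows}")
    return problems


# Toy frames with hypothetical column names (assumption, not the real headers).
sample = pd.DataFrame({"Complaint ID": [1, 2], "Consumer disputed": ["No", "Yes"]})
good = pd.DataFrame({"Complaint ID": [7, 8, 9], "Consumer disputed": ["No", "No", "Yes"]})
short = good.iloc[:2]
renamed = good.rename(columns={"Consumer disputed": "disputed"})

print(check_submission(good, sample, n_test_rows=3))
print(check_submission(short, sample, n_test_rows=3))
print(check_submission(renamed, sample, n_test_rows=3))
```

In practice `sample` would come from `pd.read_csv('sample_submission.csv')` and `n_test_rows` from the length of the test data.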

Your submission should have an AUC score of at least 0.54 for you to pass. The better you
do, the further you move up the leaderboard for this particular project.

A few suggestions before you begin:
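To make the 0.54 threshold concrete, here is a small sketch of what AUC measures: the probability that a randomly chosen disputed complaint receives a higher score than a randomly chosen undisputed one, with ties counted as half. The function name `auc_score` is an assumption for illustration. The pairwise loop is far too slow for a large dataset, where `sklearn.metrics.roc_auc_score` would normally be used on a held-out validation split.

```python
def auc_score(y_true, y_score):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    y_true holds 0/1 labels, y_score holds predicted scores or probabilities.
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Three of the four (positive, negative) pairs are ordered correctly.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
# A model that gives every row the same score carries no ranking information.
print(auc_score([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))
```

A score of 0.5 corresponds to random ranking, so the pass mark of 0.54 asks for only a modest improvement over chance.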

- Do not use date columns as is; use them to create other features. For example: which
  month of the year the complaint was filed, whether it was filed in the first or last week
  of the month, or how long it took between the complaint being filed and the data being
  sent to the company. These are just ideas; feel free to make any other features out of
  these. You can convert string/object type columns to datetime data using
  pd.to_datetime. Create cyclic features for date components.
- You can handle Consumer Complaint Narrative creatively. See if you can create some good
  features from this column containing text data. [tfidf ?]
- It doesn't make sense to use Consumer ID as a predictor.
- While parameter tuning with grid search or randomised search you will get a CV
  performance score. That will usually be close to what you might score on the test data,
  but it is not guaranteed to be close, especially if your model is overfitting.
- Before removing NAs from the data, check whether there are columns which have too many
  NaNs. Decide whether you need to impute those values or drop the column altogether
  before you start removing NA observations from the entire data.
- If you create any new features on your training data or modify features in it, you will
  have to do the same for the test data in order to use the model you built for making
  predictions on the test data.
- It doesn't make sense to use ZIP codes as a numeric variable.
- It's a large dataset; it might take a lot of time to run.
- You can discuss anything on the QA forum, although threads which explicitly disclose too
  much information for a solution right away will be removed from the QA forum.
- Consider making features for the presence of NaNs itself.
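Several of the suggestions above (date-derived features, cyclic encoding, NaN indicators, ZIP codes as categories, dropping the ID, and applying the same steps to train and test) can be sketched in one function. This is only an illustration under stated assumptions: the column names in the constants below are guesses at the CSV headers, and the helper `add_features` is hypothetical, not the benchmark script.

```python
import numpy as np
import pandas as pd

# Assumed column names; replace them with the actual headers in the CSV files.
DATE_RECEIVED = "Date received"
DATE_SENT = "Date sent to company"
ZIP_COL = "ZIP code"
ID_COL = "Complaint ID"


def add_features(df, nan_flag_cols):
    """Build model features; call it with the same arguments on train and test."""
    out = df.copy()

    # Flag missing values before anything is imputed or dropped.
    for col in nan_flag_cols:
        out[col + "_missing"] = out[col].isna().astype(int)

    received = pd.to_datetime(out[DATE_RECEIVED])
    sent = pd.to_datetime(out[DATE_SENT])

    out["received_month"] = received.dt.month
    out["received_first_week"] = (received.dt.day <= 7).astype(int)
    # Last week = one of the final seven days of that calendar month.
    out["received_last_week"] = (
        (received.dt.days_in_month - received.dt.day) < 7
    ).astype(int)
    out["days_to_company"] = (sent - received).dt.days

    # Cyclic encoding so that December and January end up close together.
    angle = 2 * np.pi * (received.dt.month - 1) / 12
    out["month_sin"] = np.sin(angle)
    out["month_cos"] = np.cos(angle)

    # Treat ZIP codes as labels, not as quantities.
    out[ZIP_COL] = out[ZIP_COL].astype("string")

    # Raw dates and the ID are not used as predictors.
    return out.drop(columns=[DATE_RECEIVED, DATE_SENT, ID_COL])


# Toy rows standing in for the training data (values are made up).
train = pd.DataFrame({
    ID_COL: [1, 2],
    DATE_RECEIVED: ["2015-01-03", "2015-06-28"],
    DATE_SENT: ["2015-01-05", "2015-07-01"],
    ZIP_COL: ["07030", None],
    "Consumer complaint narrative": [None, "some complaint text"],
})
features = add_features(train, nan_flag_cols=["Consumer complaint narrative", ZIP_COL])
print(features.T)
```

Passing the list of columns to flag explicitly, instead of detecting NaNs per file, keeps the train and test feature sets identical even when a column happens to have no missing values in one of them. If the ID is needed for the submission file, keep a copy of it before calling the function.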

We have also uploaded a benchmark script. It gives you an AUC score on the test data that
is slightly less than what is required to pass the course. You can include your own ideas
to make better predictions and make submissions. You can make as many submissions as you
want if you want to move up the leaderboard. [We might ask you to submit the script which
was used to generate the submission at any time.]
