Module 3

This document discusses applications of machine learning in marketing and finance. It describes how machine learning can be used for credit card fraud detection, identity verification, and authentication. For fraud detection, both supervised and unsupervised learning are used to analyze transaction data and flag anomalous purchases. Supervised learning builds models from historical labeled data to identify patterns indicating fraud. Unsupervised learning detects outliers compared to normal user behavior. Machine learning allows for faster, more scalable, and efficient fraud analysis compared to human review of each transaction. However, limitations include lack of transparency in how models reach decisions and potential bias if the training data is imbalanced.

Uploaded by

ishaan

AI Applications in Marketing & Finance

Interview with Apoorv Saxena

Kartik Hosanagar, Professor of Operations, Information and Decisions

Machine Learning in Finance: Fraud Detection

Credit Card Fraud

Fraud Occurs → Customer Impacted → Dispute Required → Card Replaced

Content/quotes from “Prediction Machines: The Simple Economics of Artificial Intelligence” by Ajay Agrawal, Avi Goldfarb, and Joshua Gans
Images: https://visualmodo.com/6-security-tips-protect-ecommerce-site/, https://icons8.com/icons/set/phone, https://www.vecteezy.com/vector-art/383180-illustration-of-scissors-cutting-a-credit-card
Credit Card Fraud

Fraud Occurs → ML Detects → Customer Impact Avoided → Card Replaced

Early fraud detection with ML can help prevent fraud and save banks a lot of money
Content/quotes from: “A Human’s Guide to Machine Intelligence” by Kartik Hosanagar
Images: https://visualmodo.com/6-security-tips-protect-ecommerce-site/, https://medium.com/@fenjiro/data-mining-for-banking-loan-approval-use-case-e7c2bc3ece3, https://icons8.com/icons/set/phone, https://www.vecteezy.com/vector-art/383180-illustration-of-scissors-cutting-a-credit-card
ML Model Accuracy is Crucial

• False Negatives occur when a business does not detect a transaction as fraudulent and allows the fraudster to make a purchase
  • The actual cardholder discovers the charge, disputes it, and is usually repaid by the bank
  • The merchant/bank is responsible for both the cost of the item sold to the fraudster and the associated dispute fees
• False Positives occur when a transaction is flagged as fraudulent and blocked, although the potential purchase was actually not fraud
  • Has an indirect impact by causing reputational damage (this customer may not return; others may hear about people being blocked)
  • Directly impacts gross profits (loss of this purchase)

Content/quotes from: https://stripe.com/radar/guide
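
To make the two error types concrete, here is a minimal Python sketch that labels a single decision and attaches a rough direct cost to it. The `Decision` record, the 30% gross margin, and the $15 dispute fee are hypothetical values chosen only for illustration; they do not come from the slides or from Stripe, and the indirect reputational cost of a false positive is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    amount: float    # purchase amount in dollars
    is_fraud: bool   # ground truth, learned later (e.g. through a dispute)
    blocked: bool    # what the model decided at purchase time


def error_type(d: Decision) -> str:
    """Label a decision, treating 'blocked as fraud' as the positive class."""
    if d.blocked:
        return "TP" if d.is_fraud else "FP"
    return "FN" if d.is_fraud else "TN"


def direct_cost(d: Decision, margin: float = 0.30, dispute_fee: float = 15.0) -> float:
    """Rough direct cost under the hypothetical margin and dispute fee."""
    kind = error_type(d)
    if kind == "FN":
        # Fraud went through: cost of the item sold plus the dispute fee.
        return d.amount * (1 - margin) + dispute_fee
    if kind == "FP":
        # Legitimate customer blocked: gross profit on this purchase is lost.
        return d.amount * margin
    return 0.0


missed_fraud = Decision(amount=100.0, is_fraud=True, blocked=False)
blocked_customer = Decision(amount=100.0, is_fraud=False, blocked=True)
print(error_type(missed_fraud), direct_cost(missed_fraud))
print(error_type(blocked_customer), direct_cost(blocked_customer))
```

Under these assumed numbers a missed fraud costs more than a wrongly blocked purchase, but the balance depends entirely on the margin, the fee, and the reputational effects left out here.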

Machine Learning Opportunity

• Both supervised and unsupervised learning are used in fraud detection

Supervised Learning
• Training occurs by using a large dataset with the details of individual transactions provided and with each transaction tagged as fraud or not
• From this, the model learns the unique patterns of fraud

Unsupervised Learning
• Anomaly detection can compare new transactions with prior ones to detect outliers
• This can help identify fraud that doesn’t necessarily fit a previously identified pattern (e.g. a new type of fraud)

Content/quotes from: https://stripe.com/radar/guide & https://www.fico.com/blogs/5-keys-using-ai-and-machine-learning-fraud-detection
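
As a rough illustration of the unsupervised idea (comparing a new transaction with prior ones), the sketch below flags an amount that sits far from a customer's own history. A simple z-score rule is used here only as a stand-in; the slides do not say which anomaly detector is used in practice, and the threshold of 3 standard deviations and the sample amounts are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(prior_amounts, new_amount, threshold=3.0):
    """Flag new_amount if it lies more than `threshold` sample standard
    deviations away from the mean of the customer's prior amounts."""
    if len(prior_amounts) < 2:
        return False  # not enough history to judge
    mu = mean(prior_amounts)
    sigma = stdev(prior_amounts)
    if sigma == 0:
        return new_amount != mu  # any deviation from a constant history
    return abs(new_amount - mu) / sigma > threshold


# Hypothetical purchase history for one cardholder, in dollars.
history = [12.5, 30.0, 18.2, 25.0, 22.4, 15.9]
print(is_anomalous(history, 27.0))   # a typical amount
print(is_anomalous(history, 950.0))  # an exceptionally high amount
```

No fraud labels are needed here, which is why this kind of check can catch a new type of fraud that a supervised model has never seen.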

Supervised Learning Example

Requires input data for training (usually historical data of two types):
• “Properties that can be ‘read off’ a single credit card payment”
  • Country the card was issued in, IP address of payment, user’s email domain, etc.
• Behavioral data (provides “some of the most predictive signals”)
  • Number of countries the card was used in recently
Supervised Learning Example (cont.)

Sample Training Data*

* This training data is very limited in order to provide a simple example. To build an accurate model you would need millions of rows as well as additional columns.
Content/quotes from: https://stripe.com/radar/guide
Supervised Learning Example (cont.)

Produces an output model, such as the following decision tree
• The tree answers: “of transactions in our data set with properties similar to the transaction we’re examining now, what fraction were actually fraudulent?”
• “The machine learning part is concerned with the construction of the tree - what questions do we ask, in what order, to maximize the chances that we can distinguish between the two classes accurately?”

Sample Output*
* This decision tree is based on the same limited data from the previous slide
Content/quotes from: https://stripe.com/radar/guide
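
The question a tree leaf answers ("what fraction were actually fraudulent?") can be sketched in a few lines of Python. The rows in `history` and the `leaf` condition are hypothetical and are not the slide's sample training data; choosing which questions to ask, and in what order, is the learning step and is not shown here.

```python
def fraud_fraction(transactions, is_similar):
    """Share of matching historical transactions that were fraudulent,
    or None when no historical transaction matches."""
    similar = [t for t in transactions if is_similar(t)]
    if not similar:
        return None
    return sum(1 for t in similar if t["fraud"]) / len(similar)


# Hypothetical labeled history (each row tagged as fraud or not).
history = [
    {"amount": 35.0, "country": "CA", "fraud": True},
    {"amount": 50.0, "country": "CA", "fraud": True},
    {"amount": 42.0, "country": "US", "fraud": False},
    {"amount": 12.0, "country": "US", "fraud": False},
    {"amount": 80.0, "country": "US", "fraud": True},
]


def leaf(t):
    """One path through a tree: 'amount > 20?' then 'country == US?'."""
    return t["amount"] > 20 and t["country"] == "US"


print(fraud_fraction(history, leaf))
```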
Supervised Learning Example (cont.)

• Supervised Learning: From this (very limited) data, the model would learn a unique pattern of fraud
  • If >$20 & from Canada, 100% chance of fraud
  • If <$20 & from >2 countries, 100% chance of fraud
  • If >$20 & not from Canada, or <$20 & from <2 countries, not sure
• Unsupervised Learning: Detecting transactions that appear like anomalies
  • A transaction is for an exceptionally high amount + in a country where this person has not transacted before + in the past, foreign transactions were preceded by a flight purchase to that country, unlike this time = Anomaly

Content/quotes from: https://stripe.com/radar/guide
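
The three supervised rules listed on this slide can be written out directly, as in the sketch below. The function and parameter names are hypothetical, "from Canada" is assumed to mean a country code of "CA" on the transaction, and the boundary cases the slide leaves open (exactly $20, exactly 2 countries) are treated as "not sure".

```python
def fraud_estimate(amount, country, countries_used_recently):
    """Return the estimated fraud probability from the slide's toy rules,
    or None when the rules say 'not sure'."""
    if amount > 20 and country == "CA":
        return 1.0
    if amount < 20 and countries_used_recently > 2:
        return 1.0
    # Everything else, including the boundary cases the slide does not
    # specify, is left undecided.
    return None


print(fraud_estimate(45.0, "CA", 1))
print(fraud_estimate(8.0, "US", 3))
print(fraud_estimate(45.0, "US", 1))
```

With so few rows the "100%" leaves are an artifact of the tiny sample; a tree built on millions of rows would report fractions between 0 and 1.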

Advantages of ML for Fraud

Speed
• Algorithms can quickly process a large volume of transactions
• This is important for fraud since a decision is needed in real time

Scale
• A challenge for humans, but algorithms improve as the amount of data increases

Efficiency
• Machine learning algorithms are better than humans at repetitive tasks

Content/quotes from: https://marutitech.com/machine-learning-fraud-detection/
Limitations of ML for Fraud

Transparency
• Algorithms can’t always explain why someone was blocked
• Hard to catch issues with the model if it isn’t well understood
• Also an ethical problem if biases go undetected (to be discussed in Module 4)

Data Volume
• Smaller companies may not have enough training data
• Algorithm accuracy might be lower as a result

Content/quotes from: https://marutitech.com/machine-learning-fraud-detection/

Machine Learning in Finance: Additional Applications

Identity Verification & Authentication

• ML can improve security beyond detecting fraud patterns: it also provides new methods of identity verification

Traditional Verification
• Passwords
• PINs

ML-Based Verification
• Biometric authentication using facial and voice recognition technologies

• One biometric use case would be when new accounts are opened and customers need to provide multiple forms of ID
  • Customers could instead provide “selfies” or voice prints; facial recognition and voice recognition technologies can then verify identity based on the images/audio provided
  • ATMs in China are starting to use face recognition
Content/quotes from “Section 2: Known Applications of AI”
Identity Verification & Authentication (cont.)

• Biometric authentication can also occur continuously and without intruding into the customer experience - it involves verifying customers’ identities while they are already engaging with the bank through mobile apps
  • E.g. AI can detect unique biometric patterns of individual customers:
    • How the person naturally holds a mobile device
    • How the person taps the screen

Content/quotes from “Section 2: Known Applications of AI”

Identity Verification & Authentication (cont.)

Key Benefits & Limitations of ML for Identity Verification

Benefits
• Improved security, potentially without creating a cumbersome experience for customers

Limitations
• Not foolproof - attackers could still access biometric identifiers and pose as customers
• However, it can still deter attackers

Content/quotes from “Section 2: Known Applications of AI”

Loan & Insurance Underwriting

• ML can detect patterns between consumer data and loan or insurance outcomes, and use this to predict the outcomes of particular applicants
  • E.g., Supervised learning can be used by providing a training dataset with historical data on consumers and their lending/insurance results
    • Consumer data: age, income, employment, etc.
    • Lending/insurance results: repaying loans on time vs. defaulting

Content/quotes from: “Section 2: Known Applications of AI”

Additional Content from: https://emerj.com/ai-sector-overviews/machine-learning-in-finance/
Loan & Insurance Underwriting (cont.)

Key Benefits & Limitations of Loans/Insurance Models

Benefits
• Could reduce processing time
• Potential for “increasing loan volume & reducing risk…[by] using more diverse data as well as data with weaker signals.”

Limitations
• Algorithm could be biased and could perpetuate historical discrimination
• Companies need to make sure their algorithms don’t discriminate (discussed further in Module 4)

Content/quotes from: “Section 2: Known Applications of AI”

Additional Content from: https://emerj.com/ai-sector-overviews/machine-learning-in-finance/
Predicting Customer Churn

• Banks want to retain customers/prevent churn and can apply ML to this goal
• In much the same way as with fraud models, the customer data that banks
have can be used to “create churn models based on customer attributes or
features of those who did or did not churn for another competitor”
Key Benefits & Limitations of Churn Models

Benefits
• Predictions from churn models are actionable because knowing in advance which customers might churn allows banks to make extra efforts to improve those customers’ satisfaction

Limitations
• Predictions about who might churn don’t necessarily provide insight into what is causing them to leave and how best to retain them

Content/quotes from “Section 2: Known Applications of AI”
Three Additional Examples of ML in Finance

Customer Experience
• Conversational AI platforms are being used to service customers via chat or
over the phone to improve responsiveness and reduce costs
Personal Finance
• Personalized portfolios
Financial Forecasting
• Ability to predict company financials or budgeting needs in the future

Content/quotes for financial forecasting from “Section 2: Known Applications of AI”

Content/quotes for customer experience and personal finance section from https://www.alacriti.com/machine-learning-in-financial-services-potential-applications/
AI Applications in Marketing and Finance
Introduction

Michael R. Roberts, The William H. Lawrence Professor of Finance

Finance, Data, and Technology

• Finance has long been:

• Technology oriented
• Data oriented
• Model oriented
Example Applications

• Portfolio management
• Algorithmic trading
• Fraud detection
• Customer retention
• Returns forecasting
• Earnings forecasting
• Credit analysis
What Are We Going to Do?

• Focus on an application
• Corporate credit risk
• Emphasize process
• Scientific method
• Data science workflow
• Emphasize economics
• Avoid common pitfalls with models
• Illustrate stylized machine learning problem
• Imputing credit ratings
How Are We Going to Do it?

• Informal delivery
• Unscripted
• Dynamic
• Working together at computer
• Thought process is important
Goals

1. Emphasize importance of
• Process
• Data
• Economic and institutional details

2. De-emphasize importance of complexity

• Black box

Balance 1 and 2
Process: Scientific Method

“If it (theory) disagrees with experiment, it’s wrong. In that simple statement is the key to science.”
— RICHARD FEYNMAN
Scientific Method

1. Clearly articulate a specific question

2. Guess an answer (hypothesize)
3. Identify empirical implications of guess
4. Compare implications with data
Process: Data Science Workflow

Data Science Workflow

1. Acquisition and verification
2. Preparation
3. Analysis
4. Communication*
Corporate Credit Risk

Bond Markets
Syndicated Lending
Corporate Credit Risk

• What is it?
• Inability of firms to repay financial obligations
• Why it’s important
• Affects availability and price of credit
• For whom is it important?
• Investors
• Employees
• Customers
• Suppliers
• Taxpayers
Outline

• Quantify and assess

• Examples
• Stylized ML example
• Predicting credit ratings
• Extensions
Credit Risk - KPIs

Credit Risk - Credit Ratings

Credit Ratings
Credit Risk - Credit Ratings Prediction

Task

• Develop model to distinguish between:

• Investment-grade
• Speculative-grade
• What is success?
Credit Risk - Data

Data Science Workflow

• Data acquisition and verification

• Wharton Research Data Services (WRDS)
• S&P Compustat database
• Sample
• 10,540 observations
• 1995 to 2016
• 1,400 firms
Data Science Workflow

• Data preparation
• EDA
Credit Risk - Model Prep

Model Prep

• Y = f(x1, x2, …, xk)

• Y = outcome variable = 1 if investment grade, 0 otherwise
• (x1, x2, …, xk) = model inputs, predictors, explanatory variables, etc.
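
One way to construct the outcome variable Y is sketched below. It assumes the ratings are S&P-style long-term letter ratings, for which investment grade conventionally means BBB- or better; the slides do not show how the label was actually built, so the scale list and function name here are illustrative.

```python
# S&P-style long-term rating scale, ordered from strongest to weakest.
SP_SCALE = [
    "AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
    "BBB+", "BBB", "BBB-",
    "BB+", "BB", "BB-", "B+", "B", "B-",
    "CCC+", "CCC", "CCC-", "CC", "C", "D",
]
CUTOFF = SP_SCALE.index("BBB-")  # last investment-grade notch


def investment_grade(rating: str) -> int:
    """Return Y = 1 if the rating is investment grade, 0 otherwise.
    Raises ValueError for a rating that is not on the scale above."""
    return 1 if SP_SCALE.index(rating) <= CUTOFF else 0


print([investment_grade(r) for r in ["AA", "BBB-", "BB+", "CCC"]])
```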
Redundancy?
Train-Test Split

• Should be done at the very beginning!
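
A minimal sketch of splitting before any other work is shown below, using only the Python standard library. The 80/20 proportion, the seed, and the helper name are assumptions for illustration; the point is that the test rows are set aside before exploration, input selection, or fitting, so nothing learned from them leaks into the model.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows reproducibly and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for firm-year observations
train, test = train_test_split(rows, test_fraction=0.2, seed=42)
print(len(train), len(test))
```

All later steps (EDA, choosing inputs, training) would then use `train` only, with `test` touched once at the end to score the model.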

Credit Risk - Model Training

Prediction

• Logit model - confusion matrix
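
For readers unfamiliar with the logit model, the sketch below shows how it turns inputs into a probability and then into a 0/1 prediction. The coefficients, intercept, input values, and the 0.5 threshold are made up for illustration; they are not the fitted values from the lecture.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def investment_grade_probability(x, coefs, intercept):
    """P(Y = 1 | x) under a logit model with the given coefficients."""
    z = intercept + sum(c * v for c, v in zip(coefs, x))
    return sigmoid(z)


def classify(x, coefs, intercept, threshold=0.5):
    """Predict 1 (investment grade) when the probability reaches the threshold."""
    return 1 if investment_grade_probability(x, coefs, intercept) >= threshold else 0


# Hypothetical coefficients for two inputs.
coefs, intercept = [0.8, -1.5], 0.2
x = [2.0, 0.4]
print(investment_grade_probability(x, coefs, intercept), classify(x, coefs, intercept))
```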

Prediction

• Logit model - probability confusion matrix

• Model score: 77.2%
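
The sketch below shows how a confusion matrix, its probability (share) version, and an overall score can be computed from predictions. It assumes that the "probability confusion matrix" means each cell divided by the total number of observations and that the "model score" is plain accuracy; the slides do not define either term, and the labels below are toy values rather than the lecture's data.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN with 1 (investment grade) as the positive class."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            counts["FN" if actual == 1 else "TN"] += 1
    return counts


def as_shares(counts):
    """Express each cell as a share of all observations."""
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def accuracy(counts):
    """Fraction of observations classified correctly."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
cm = confusion_counts(y_true, y_pred)
print(cm, as_shares(cm), accuracy(cm))
```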

Prediction

• Logit model - reduced inputs

• Current ratio, interest coverage, debt-to-ebitda, debt-to-assets

• Model score: 76.5% (77.2%)
• Important?
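
The four reduced inputs named above can be computed from financial statement items roughly as follows. Definitions vary across data providers (interest coverage, for example, is sometimes based on EBITDA rather than EBIT), so the formulas and argument names below are common textbook conventions and not necessarily the ones used in the lecture; the firm's figures are invented.

```python
def credit_ratios(current_assets, current_liabilities, ebit, ebitda,
                  interest_expense, total_debt, total_assets):
    """Compute four common credit ratios; denominators are assumed nonzero."""
    return {
        "current_ratio": current_assets / current_liabilities,
        "interest_coverage": ebit / interest_expense,
        "debt_to_ebitda": total_debt / ebitda,
        "debt_to_assets": total_debt / total_assets,
    }


# Hypothetical firm, figures in millions of dollars.
ratios = credit_ratios(
    current_assets=500.0, current_liabilities=250.0,
    ebit=120.0, ebitda=160.0, interest_expense=30.0,
    total_debt=400.0, total_assets=1000.0,
)
print(ratios)
```

A drop from 77.2% to 76.5% with only four inputs is the kind of small change the "Important?" question invites weighing against the simpler, easier-to-explain model.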
Additional Metrics

• Logit model - reduced inputs

• Precision = Probability of true positive conditional on positive prediction, 76.54%

• Recall = Probability of a true positive conditional on a positive outcome, 77.6%
• F1 = Harmonic mean of precision and recall, 77.1%
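
These three metrics follow directly from confusion-matrix counts, as the sketch below shows. The function names are illustrative; the final lines plug in the precision and recall reported on the slide to check that they are consistent with the reported F1.

```python
def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that were truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of truly positive outcomes that were predicted positive."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# Consistency check using the figures reported on the slide.
reported_f1 = f1_score(0.7654, 0.776)
print(round(reported_f1, 3))  # 0.771, matching the 77.1% above
```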
Thoughts

• Inspect
• (Probability) confusion matrix and model score
• Precision, recall, F1 score
• What matters depends on the goal set forth at the outset
Credit Risk - Models vs. Data

Alternative Models
Credit Risk - Error Analysis