Fairness in Machine Learning
Fernanda Viégas @viegasf
Martin Wattenberg @wattenberg
Google Brain
As AI touches high-stakes aspects of everyday life,
fairness becomes more important
How can an algorithm even be unfair?
Aren't algorithms beautiful, neutral pieces of mathematics?
The Euclidean algorithm (first described around 300 BCE), as illustrated in Geometry, plane, solid and spherical, Pierce Morton, 1847.
"Classic" non-ML problem: implicit cultural assumptions
Example: names are complex.
...
Brilliant, fun article.
Read it! :)
Patrick McKenzie
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
What's different with machine learning?
Algorithm, 300 BCE
Classical algorithms don’t rely on data
Algorithm, 2017 CE
ML systems rely on real-world data and
can pick up biases from data
Sometimes bias starts before an algorithm ever runs…
It can start with the data
A real-world example
Can you spot the bias?
Model can’t recognize mugs
with handle facing left
How can this lead to unfairness?
Word embeddings
"Distributed Representations of Words and Phrases and their Compositionality," Mikolov et al., 2013
Meaningful directions (word2vec)
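To make "meaningful directions" concrete, here is a minimal sketch with made-up 2-D vectors (real word2vec embeddings are learned from text and have hundreds of dimensions): analogies like king - man + woman ≈ queen are just vector arithmetic, and difference vectors such as woman - man pick out a "gender direction."

```python
# Toy illustration of "meaningful directions" in an embedding space.
# The 2-D vectors below are made up; real embeddings are learned.
import numpy as np

emb = {
    "man":   np.array([ 1.0, 0.0]),
    "woman": np.array([-1.0, 0.0]),
    "king":  np.array([ 1.0, 1.0]),
    "queen": np.array([-1.0, 1.0]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Analogy as vector arithmetic: king - man + woman ≈ queen
target = emb["king"] - emb["man"] + emb["woman"]
print(max(emb, key=lambda w: cosine(emb[w], target)))  # "queen" here

# A "gender direction" can be estimated from difference vectors.
gender_direction = emb["woman"] - emb["man"]
print(cosine(gender_direction, emb["queen"] - emb["king"]))  # ~1.0
```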
Can we "de-bias" embeddings?
Can we "de-bias" embeddings?
Bolukbasi et al.: this may be possible.
Idea: "collapse" dimensions corresponding
to key attributes, such as gender.
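A minimal sketch of the "collapse" idea, assuming we already have an estimated gender direction: project each word vector onto that direction and subtract the component. This shows only the projection step, on toy vectors; the full method in Bolukbasi et al. also equalizes word pairs.

```python
# Sketch of "collapsing" a gender direction (projection step only,
# on made-up toy vectors; not the full Bolukbasi et al. algorithm).
import numpy as np

def remove_direction(v, direction):
    """Return v with its component along `direction` projected out."""
    d = direction / np.linalg.norm(direction)
    return v - (v @ d) * d

he, she = np.array([1.0, 0.2]), np.array([-1.0, 0.2])
gender_direction = he - she              # estimated bias direction

engineer = np.array([0.8, 0.9])          # toy "occupation" vector
debiased = remove_direction(engineer, gender_direction)

print(debiased @ gender_direction)       # ~0: no gender component left
```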
How can we build systems that are fair?
First, we need to decide what we mean by “fair”...
Interesting fact:
You can't always get what you want in terms of “fairness”!
Fairness: you can't always get what you want!
COMPAS (from company called Northpointe)
● Estimates chances a defendant will be re-arrested
○ Issue: "rearrest" != "committed crime"
● Meant to be used for bail decisions
○ Issue: also used for sentencing
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
This conclusion came from applying COMPAS to historical arrest records.
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
Enter the computer scientists...
[Figure: risk-score categories ("Low-risk", "Medium-high") labeled FAIR; outcomes for defendants who "Did not reoffend" labeled UNFAIR.]
Black defendants who did not reoffend were more often labeled "high risk".
Unless the classifier is perfect, it can't be fair by every definition at once, due to different base rates.
https://www.propublica.org/article/bias-in-criminal-risk-scores-is-mathematically-inevitable-researchers-say
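A toy calculation (all numbers made up) of why this is mathematically inevitable: hold precision and the true positive rate equal across two groups with different base rates of re-arrest, and the false positive rates are forced to differ unless the classifier is perfect.

```python
# Toy numbers for the COMPAS tension: equal precision (PPV) and equal
# true positive rate (TPR) across groups with different base rates
# force unequal false positive rates.

def false_positive_rate(base_rate, tpr, ppv):
    tp = tpr * base_rate              # true positives per person
    fp = tp * (1 - ppv) / ppv         # implied by the precision
    return fp / (1 - base_rate)

tpr, ppv = 0.6, 0.7                   # held equal across both groups
for group, base_rate in [("A", 0.5), ("B", 0.3)]:
    print(group, round(false_positive_rate(base_rate, tpr, ppv), 3))
# A: 0.257, B: 0.110, unequal even though TPR and PPV match.
```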
Lessons learned
● Fairness of an algorithm depends in part on how it's used
● In fairness, you can't always get (everything) you want
○ Must make a careful choice of quantitative metrics
○ This involves case-by-case policy decisions
○ These tradeoffs affect human decisions too!
● Improvements to fairness may come with their own costs to other values we want to
retain (privacy, performance, etc.)
Can computer scientists do anything besides depress us?
Hardt, Price, Srebro (2016)
On forcing a threshold classifier to be "fair" by various
definitions:
● Group-unaware: same threshold for each group
● Demographic parity: same proportion of positive classifications
● "Equal opportunity": same proportion of true positives
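As a sketch of what each definition asks of a threshold classifier (the score distributions, group names, and thresholds below are assumptions, not the paper's data): a group-unaware rule uses one threshold everywhere; demographic parity would instead tune group-specific thresholds until the approval rates match; "equal opportunity" tunes them until the true positive rates match.

```python
# Sketch of the Hardt, Price & Srebro (2016) criteria for a threshold
# classifier, on simulated scores (distributions are made up).
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, base_rate, shift):
    """Scores are higher, on average, for people who would pay back (label 1)."""
    y = rng.random(n) < base_rate
    scores = np.where(y,
                      rng.normal(65 + shift, 15, n),
                      rng.normal(40 + shift, 15, n))
    return np.clip(scores, 0, 100), y

groups = {"blue": simulate(10_000, 0.6, 0), "orange": simulate(10_000, 0.4, -5)}

def report(thresholds):
    for g, (scores, y) in groups.items():
        approve = scores >= thresholds[g]
        positive_rate = approve.mean()   # equal across groups => demographic parity
        tpr = approve[y].mean()          # equal across groups => "equal opportunity"
        print(f"{g}: threshold={thresholds[g]} "
              f"positive_rate={positive_rate:.2f} TPR={tpr:.2f}")

# Group-unaware: one threshold for everyone.
report({"blue": 55, "orange": 55})
# Group-specific thresholds could instead be tuned so that either the
# positive rates (demographic parity) or the TPRs (equal opportunity) match.
```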
A visual explanation...
Attacking discrimination in ML mathematically
[Figure sequence: people plotted along a Credit Score axis (0-100), split into "Would default on loan" and "Would pay back loan"; example scores 23 and 79. A loan threshold sweeps across the scores, producing two kinds of mistakes: "Would pay back but isn't given a loan" and "Would default and is given a loan". The bank's profit (e.g. "Profit: 1.2800") changes as the threshold moves.]
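The demo is essentially sweeping this profit calculation over thresholds. A minimal sketch, assuming made-up payoffs (+1 for a repaid loan, -2 for a default) and a simulated score distribution:

```python
# Toy profit curve for the loan-threshold illustration. Payoffs and
# score data are assumptions, not figures from the talk.
import numpy as np

rng = np.random.default_rng(1)
would_repay = rng.random(10_000) < 0.5
scores = np.where(would_repay,
                  rng.normal(70, 12, 10_000),
                  rng.normal(35, 12, 10_000)).clip(0, 100)

GAIN, LOSS = 1.0, -2.0    # assumed payoff per repaid / defaulted loan

def profit(threshold):
    approved = scores >= threshold
    return (GAIN * (approved & would_repay).sum()
            + LOSS * (approved & ~would_repay).sum())

best = max(range(101), key=profit)
print(best, profit(best))  # profit-maximizing threshold for this toy data
```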
Multiple groups and multiple distributions
Demo
Case Study
Conversation AI /
Perspective API (Jigsaw / CAT / others)
False "toxic" positives
Comment Toxicity score
The Gay and Lesbian Film Festival starts today. 82%
Being transgender is independent of sexual orientation. 52%
A Muslim is someone who follows or practices Islam. 46%
How did this happen?
One possible fix
False positives - some improvement
Comment Old score New score
The Gay and Lesbian Film Festival starts today. 82% 1%
Being transgender is independent of sexual orientation. 52% 5%
A Muslim is someone who follows or practices Islam. 46% 13%
Overall AUC for old and new classifiers was very close.
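A sketch of how such a comparison might be run, with placeholder scores and labels since the real classifiers aren't available here: compute overall AUC for the old and new models and, ideally, break it down by identity term.

```python
# Placeholder evaluation: compare overall AUC for an "old" and "new"
# toxicity classifier. Scores and labels here are random stand-ins.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
labels = rng.integers(0, 2, 1_000)                 # 1 = actually toxic
old_scores = np.clip(labels * 0.6 + rng.random(1_000) * 0.5, 0, 1)
new_scores = np.clip(labels * 0.6 + rng.random(1_000) * 0.5, 0, 1)

print("old AUC:", roc_auc_score(labels, old_scores))
print("new AUC:", roc_auc_score(labels, new_scores))
# A per-identity-term breakdown (e.g. comments mentioning "gay", "Muslim")
# would catch the kind of false positives shown in the table above.
```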
A common objection...
● Our algorithms are just mirrors of the world. Not our fault if they reflect bias!
Some replies:
● If the effect is unjust, why shouldn't we fix it?
● Would you apply this same standard to raising a child?
Another objection
● Objection: people are biased and opaque too. Why should ML systems be any different?
○ True: this won't be easy
○ But we have a chance to do better with ML
What can you do?
1. Include diverse perspectives in design and development
2. Train ML models on comprehensive data sets
3. Test products with diverse users
4. Periodically re-evaluate and be alert to errors