Artificial Intelligence and Machine Learning
Machine Learning
What is Artificial Intelligence?
• You know what it is—computer programs that “think” or
otherwise act “intelligent”
• The Turing test?
• What is “machine learning” (ML)?
• It’s simply one technique for AI—throw a lot of data at a program
and let it figure things out
• What are “neural networks”?
• A currently popular technique for ML
How Does ML Work?
• Lots of complicated math
• Not the way human brains with human neurons work
• To us, it doesn’t matter—we’ll treat it as a black box with
certain properties
How ML Works
• You feed the program a lot of training data
• From this training data, the ML algorithm builds a model of
the input
• New inputs are matched against the model
• Examples: Google Translate, Amazon and Netflix’s
recommendation engines, speech and image recognition
• However—machine learning algorithms find correlations, not
causation
• It’s not always clear why ML makes certain connections
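A minimal sketch of the train, model, match loop described above, using a toy word-count spam filter. The function names `train` and `predict` and the four training messages are invented for illustration; real systems use far more data and far more elaborate models.

```python
from collections import Counter


def train(examples):
    """Build a model: per-label word counts from (text, label) pairs."""
    model = {}
    for text, label in examples:
        model.setdefault(label, Counter()).update(text.lower().split())
    return model


def predict(model, text):
    """Match a new input against the model: pick the label whose
    training words overlap most with the words in the new text."""
    words = text.lower().split()
    scores = {label: sum(counts[w] for w in words)
              for label, counts in model.items()}
    return max(scores, key=scores.get)


# Hypothetical training data, made up for this example.
training_data = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]

model = train(training_data)
print(predict(model, "claim your free money"))  # spam
print(predict(model, "agenda for lunch"))       # not spam
```

The model only records which words appeared alongside which label. It has no notion of why a message is spam, which is the correlation-versus-causation point in miniature.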
Correlation versus Causation
https://xkcd.com/552/
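One way to see the difference, as a rough sketch: two quantities that never influence each other can still be strongly correlated when both depend on a hidden third factor. All numbers below are invented, and `pearson` is a hand-written helper for the standard Pearson correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hidden common cause (hypothetical daily temperatures).
temperature = [10, 15, 20, 25, 30, 35]

# Both series are computed from temperature plus a little noise.
# Neither one is computed from the other.
ice_cream_sales = [2 * t + n for t, n in zip(temperature, [1, -1, 2, 0, -2, 1])]
sunburn_cases = [3 * t + n for t, n in zip(temperature, [-2, 1, 0, 2, -1, 1])]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation: {r:.3f}")  # a value close to 1
```

An algorithm that sees only the two series would happily use one to predict the other, even though banning ice cream would prevent no sunburns.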
Training Data
• Training data must represent the desired actual input space
• Ideally, the training records should be statistically
independent
• If you get the training data wrong, the output will be biased
• To understand or evaluate the behavior of an ML system, you
need the code and the data it was trained on
• “Algorithm transparency” alone won’t do it
Learning Styles
Supervised
• A human labels the training data according to some criteria, e.g., spam
or not spam
• The algorithm then “learns” what characteristics make items more like
spam or more like non-spam
Unsupervised
• Finds what items cluster together
• Useful for large datasets, where there is no ground truth, or where labels
don’t matter
• What counts is similarity
Supervised: Image Recognition
• Feed it lots of pictures of different things
• Label each one: a dog, a plane, a mountain, etc.
• Now feed it a new picture—it will find the closest match and
output the label
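The "closest match" step can be sketched as a one-nearest-neighbor lookup. This is only an illustration: the two-number feature vectors stand in for real image data, and the name `nearest_label` is hypothetical. Production image recognizers are far more elaborate.

```python
from math import dist


def nearest_label(labeled, query):
    """Return the label of the training item closest to the query."""
    _features, label = min(labeled, key=lambda item: dist(item[0], query))
    return label


# Hypothetical feature vectors; a real system would derive these from pixels.
labeled_pictures = [
    ((0.9, 0.1), "dog"),
    ((0.8, 0.2), "dog"),
    ((0.1, 0.9), "plane"),
    ((0.2, 0.8), "plane"),
    ((0.5, 0.5), "mountain"),
]

print(nearest_label(labeled_pictures, (0.85, 0.15)))  # dog
print(nearest_label(labeled_pictures, (0.0, 1.0)))    # plane
```

A lookup like this always outputs one of the labels it was trained on, even when the new picture shows something it has never seen.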
Unsupervised Learning
• Feed in lots of data without ground truth
• The algorithms find clusters of similar items; they can also
find outliers—items that don’t cluster with others
• They can also find probabilistic dependencies—if a certain
pattern of one set of variables is associated with the values
of another set, a prediction can be made about new items’
values for those variables
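A rough sketch of clustering and outlier detection without labels. It uses a simple distance-threshold grouping (a form of single-linkage clustering) chosen for brevity, not any particular algorithm named in these slides. The points and the `radius` value are invented.

```python
from math import dist


def cluster(points, radius):
    """Group points so that a point within `radius` of any member of a
    cluster joins that cluster (merging clusters it bridges)."""
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            close = any(dist(p, q) <= radius for q in c)
            (near if close else far).append(c)
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters


# Hypothetical unlabeled data: two tight groups and one stray point.
points = [(1, 1), (1.2, 0.9), (0.8, 1.1),
          (5, 5), (5.1, 4.8), (4.9, 5.2),
          (9, 0)]

clusters = cluster(points, radius=1.0)
outliers = [c[0] for c in clusters if len(c) == 1]

print(len(clusters))  # 3
print(outliers)       # [(9, 0)]
```

No ground truth is involved: the code never learns what the groups mean, only which items sit near each other.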
Uses of Machine Learning
Recommendation Engines
• To recommend things to you, Amazon, Netflix, YouTube, etc.,
do not need to know what you buy or watch
• Rather, they just need to know that people who liked X also
tended to like Y and Z.
• This is a classic example of unsupervised learning
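The "people who liked X also liked Y and Z" idea can be sketched with simple co-occurrence counts. This is an assumption-laden toy, not how Amazon, Netflix, or YouTube actually implement recommendations. The item names and the helpers `build_also_liked` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_also_liked(histories):
    """For each item, count how often every other item was liked
    by the same person."""
    also = {}
    for items in histories:
        for x, y in permutations(set(items), 2):
            also.setdefault(x, Counter())[y] += 1
    return also


def recommend(also, item, n=2):
    """Return up to n items most often liked together with `item`."""
    return [y for y, _count in also.get(item, Counter()).most_common(n)]


# Hypothetical histories: each inner list is one person's liked items.
histories = [
    ["X", "Y", "Z"],
    ["X", "Y"],
    ["X", "Y", "Z"],
    ["X", "Z", "W"],
    ["W", "V"],
]

also = build_also_liked(histories)
print(sorted(recommend(also, "X")))  # ['Y', 'Z']
```

The code never inspects what an item is or who a person is. It only counts which items show up together.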
An Amazon Recommendation
Finding Terrorists?
• There are very, very few terrorists
• Where are you going to find enough training data?
• Almost certainly, any features the real terrorists have in
common will be matched by very many other innocent
people
• The algorithms can’t distinguish them
• When humans do this, we call it profiling
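The arithmetic behind this argument can be sketched as a base-rate calculation. Every number below is an assumption chosen only to illustrate the shape of the problem, and `flagged_breakdown` is a hypothetical helper.

```python
def flagged_breakdown(population, actual_positives, hit_rate, false_alarm_rate):
    """Return (true hits, false alarms, share of flagged people who are real)."""
    true_hits = actual_positives * hit_rate
    false_alarms = (population - actual_positives) * false_alarm_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


# Assumed values, for illustration only.
true_hits, false_alarms, precision = flagged_breakdown(
    population=300_000_000,   # people screened
    actual_positives=1_000,   # real targets among them
    hit_rate=0.99,            # share of real targets the system flags
    false_alarm_rate=0.001,   # share of innocent people it flags
)

print(f"real targets flagged:    {true_hits:,.0f}")
print(f"innocent people flagged: {false_alarms:,.0f}")
print(f"share of flagged people who are real targets: {precision:.2%}")
```

Under these assumed numbers, roughly 300,000 innocent people are flagged alongside about 990 real targets, so well under 1% of the people flagged are actual targets, even with a detector that sounds very accurate.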
A Recent Facebook Patent
ML Doesn’t Always Work the Way We Want It To…
Some Examples
• Biased training data
• Microsoft Tay
• Recidivism risk
• Targeted advertising
• More…
Watch Out for Biased Training Data!
Training data that doesn’t represent actual data
• Google Photos misidentified two African-American men as
gorillas
• Jacky Alciné, a programmer and one of the people who were
misidentified: “I understand HOW this happens; the problem is moreso
on the WHY.”
• Likely cause: not enough dark-skinned faces in the training dataset
• Cultural biases by the trainers
• Mechanical Turk workers are often used for labeling
• False positives and false negatives
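False positives and false negatives can be counted by comparing a system's output against known correct answers. A minimal sketch, with made-up labels and a hypothetical helper `confusion_counts`:

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one 'positive' label."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["TP"] += 1   # correctly flagged
        elif p == positive:
            counts["FP"] += 1   # flagged, but should not have been
        elif a == positive:
            counts["FN"] += 1   # missed
        else:
            counts["TN"] += 1   # correctly left alone
    return counts


# Hypothetical ground truth and system output.
actual = ["spam", "spam", "ok", "ok", "ok", "spam"]
predicted = ["spam", "ok", "ok", "spam", "ok", "spam"]

print(confusion_counts(actual, predicted, positive="spam"))
# {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```

Computing these counts separately for each subgroup of the input is one way to notice that a system works well for some groups and poorly for others.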
Bias In, Bias Out
• Suppose you want an ML system to evaluate job applications
• You train it with data on your current employees
• The ML system will find applicants who “resemble” the
current work force
• If your current workforce is predominantly white males, the
ML system will select white male applicants and perpetuate
bias
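A toy sketch of how "resembles the current workforce" turns into a biased score. The attributes, the group names "A" and "B", and the scoring rule are all invented for illustration; no real hiring system is being described.

```python
from collections import Counter


def train_resemblance_model(employees):
    """Learn how common each (attribute, value) pair is among employees."""
    counts = Counter()
    for person in employees:
        counts.update(person.items())
    total = len(employees)
    return {pair: c / total for pair, c in counts.items()}


def score(model, applicant):
    """Average 'resemblance' of the applicant's attributes to the workforce."""
    return sum(model.get(pair, 0.0) for pair in applicant.items()) / len(applicant)


# Hypothetical current workforce: 80% group A, 20% group B, same qualification.
employees = ([{"group": "A", "degree": "yes"}] * 8
             + [{"group": "B", "degree": "yes"}] * 2)

model = train_resemblance_model(employees)

applicant_a = {"group": "A", "degree": "yes"}
applicant_b = {"group": "B", "degree": "yes"}

print(score(model, applicant_a) > score(model, applicant_b))  # True
```

Both applicants have identical qualifications. The only reason the scores differ is that the training data contained more people from group A.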
Microsoft Tay
• A Twitter “chatbot”
• Tay “talked” with people on Twitter
• What people tweeted to it became its training data
• It started sounding like a misogynist Nazi…
What Happened?
• People from 4chan and 8chan decided to troll it.
• With ML, vile Nazi garbage in, vile Nazi garbage out
• Microsoft didn’t appreciate just what people would try.
• “Sinders is critical of Microsoft and Tay, writing that
‘designers and engineers have to start thinking about codes
of conduct and how accidentally abusive an AI can be.’” (Ars
Technica)
Recidivism
• Several companies market “risk assessment tools” to law
enforcement and the judiciary
• Do they work? Do they exhibit impermissible bias?
• A ProPublica study found that one popular tool doesn’t work
and does show racial bias: Black defendants are more likely to be
flagged as likely reoffenders, and the predictions aren’t very
accurate anyway
What Happened?
• Inadequate evaluation of accuracy
• Using the program in ways not intended by the developers
• Proxy variables for race
• Using inappropriate variables, e.g., “arrests” rather than
“crimes committed”
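How a proxy variable works can be sketched with a tiny synthetic dataset. Everything here is invented: the groups "A" and "B", the neighborhoods, and the arrest flags. The model below is never shown the group column, yet its scores still differ by group because neighborhood is correlated with group.

```python
from collections import defaultdict

# Hypothetical records: (group, neighborhood, arrested).
# Note the label is "arrested", not "committed a crime".
records = [
    ("A", "north", 0), ("A", "north", 0), ("A", "north", 1), ("A", "south", 0),
    ("B", "south", 1), ("B", "south", 1), ("B", "south", 0), ("B", "north", 0),
]


def train_rate_by_neighborhood(records):
    """Learn an arrest rate per neighborhood; the group column is ignored."""
    totals, arrests = defaultdict(int), defaultdict(int)
    for _group, neighborhood, arrested in records:
        totals[neighborhood] += 1
        arrests[neighborhood] += arrested
    return {n: arrests[n] / totals[n] for n in totals}


def average_score(model, records, group):
    """Average risk score the model assigns to members of one group."""
    scores = [model[nbhd] for g, nbhd, _arrested in records if g == group]
    return sum(scores) / len(scores)


model = train_rate_by_neighborhood(records)
print(average_score(model, records, "A"))  # 0.3125
print(average_score(model, records, "B"))  # 0.4375
```

Removing a sensitive column from the input does not remove its influence when another column carries much of the same information.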
Hypertargeted Advertising
• It’s normal practice to target ads to the “right” audience
• ML permits very precise targeting—others can’t even see the
ads
• Used politically—some research says that YouTube’s
recommendation algorithms radicalize people
• Target managed to identify a pregnant 16-year-old—her
family didn’t even know
Target
• People habitually buy from the same stores
• They tend to switch only at certain times, e.g., when a baby
is born
• Target analyzed sales data to find leading indicators of
pregnancy
• They then sent coupons to women who showed those
indicators
• People found that creepy—so Target buried the coupons
among other, untargeted stuff that they didn’t really care if
you bought
Privacy and ML
• ML algorithms can act on you without knowing who you are
• ML algorithms can link disparate datasets to identify you, even
without common database keys
• ML algorithms can predict things about you
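Linking datasets that share no common key can be sketched with an exact match on a combination of ordinary attributes. This simplified example is not itself machine learning (real systems might use fuzzy or probabilistic matching), and every record, field name, and the `link` helper are hypothetical.

```python
def link(anonymous_rows, named_rows, keys):
    """Attach a name to each anonymous row whose combination of `keys`
    matches exactly one row in the named dataset."""
    index = {}
    for row in named_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])
    matches = {}
    for row in anonymous_rows:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches[row["record_id"]] = candidates[0]
    return matches


# Hypothetical "anonymized" records with no names and no shared ID.
anonymous = [
    {"record_id": 1, "zip": "10027", "birth_year": 1970, "sex": "F"},
    {"record_id": 2, "zip": "10027", "birth_year": 1985, "sex": "M"},
]

# Hypothetical public list that does include names.
named = [
    {"name": "Alice Example", "zip": "10027", "birth_year": 1970, "sex": "F"},
    {"name": "Bob Example", "zip": "10027", "birth_year": 1985, "sex": "M"},
    {"name": "Carl Example", "zip": "10027", "birth_year": 1985, "sex": "M"},
]

print(link(anonymous, named, keys=("zip", "birth_year", "sex")))
# {1: 'Alice Example'}
```

Record 2 stays unlinked only because two named people share its attributes. Adding one more attribute to `keys` could be enough to single one of them out.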