Week 1 Lec 1
Week 1: AI Risks
Improvement in AI capabilities
What is the current situation?
Hard to differentiate between AI and humans
How did we get here?
Scaling up algorithms
Scaling up data for training
Increasing computing capabilities
Not many predicted that we would have these advancements
Worry about AI overtaking humans
AI capabilities
Vision
Reinforcement Learning
Language
Multi-Paradigm
….
GANs 2014
Image generation
Professor teaching Responsible and Safe AI course at IIIT Hyderabad for 70+ students
Video generation
2019
April 2022
Oct 2022: "Tiny plant sprout coming out of land"; "Teddy bear running in New York city"
https://openai.com/index/sora/
Video games
2013: Pong and Breakout
2018: StarCraft, Dota 2
Strategy games
2016/17: AlphaGo
2022: Diplomacy
https://en.wikipedia.org/wiki/Diplomacy_(game)
Language based tasks
Text generation
Common-sense Q&A
Planning & strategic thinking
Language models
2011
GPT-2 (2019)
GPT-3 (2020)
Same approach as GPT-2, with about 100x the parameters
ChatGPT (2022)
Significant changes from GPT-3
Common-sense Q&A
Google’s PaLM model (2022)
AP exam
Planning & strategic thinking
Acting on instructions / plans
https://www.adept.ai/blog/act-1
https://arxiv.org/pdf/2307.07924.pdf
ChatGPT
Facts
Writing email
Writing code
And many more…..
Any use cases / experiences from your side?
Coding: GPT-3 with Codex LM
Codex is the model that powers GitHub Copilot
Math: AlphaTensor
https://deepmind.google/discover/blog/discovering-novel-algorithms-with-alphatensor/
Life Sciences: AlphaFold2
Predicting protein structure
GDT is a measure of similarity between two protein structures
https://en.wikipedia.org/wiki/Global_distance_test
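As a rough illustration of the GDT idea, the sketch below scores a predicted structure against a reference by the fraction of residues that fall within a few distance cutoffs. It assumes the two structures are already superimposed and uses the 1, 2, 4 and 8 Å cutoffs of the GDT_TS variant; the full GDT procedure also searches over superpositions, which this sketch skips. The function name `gdt_ts` and the toy coordinates are hypothetical.

```python
import math


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale for two pre-aligned structures.

    `model` and `reference` are equal-length lists of (x, y, z) coordinates,
    one per residue (for example, C-alpha atoms), in angstroms.
    Assumption: the structures are already superimposed.
    """
    if not model or len(model) != len(reference):
        raise ValueError("structures must be non-empty and of equal length")
    # Per-residue distance between predicted and reference positions.
    distances = [math.dist(p, q) for p, q in zip(model, reference)]
    # Fraction of residues within each cutoff, averaged over the cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)


# Toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
print(gdt_ts(model, reference))
```

A perfect prediction scores 100, and the score drops as more residues drift beyond the cutoffs.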
https://blog.google/technology/ai/google-deepmind-isomorphic-alphafold-3-ai-model/#life-molecules
https://blog.google/technology/research/google-ai-research-new-images-human-brain/
Similar systems / applications
Bard by Google - connected to the internet, Docs, Drive, Gmail
LLaMA by Meta - open source LLM
Bing Chat by Microsoft - integrates GPT with the internet
Copilot X by GitHub - integrates with VSCode to help you write code
HuggingChat - open source ChatGPT alternative
BLOOM by BigScience - multilingual LLM
OverflowAI by StackOverflow - LLM trained by StackOverflow
Poe by Quora - has chatbot personalities
YouChat - LLM powered by the search engine You.com
More in the list: Devin, GPT-4o
In summary
Most of the advancements came in 2022 and beyond
AI is now good at taking actions in complex environments, strategic thinking, and connecting to the real world
Activity #AICapabilities
Imagine the optimal collaboration between AI and humans across sectors like healthcare, education, environmental management, and more.
What innovations are necessary to achieve this?
What challenges could arise, and what potential risks might we face in this best-case scenario?
Drop your answers as a reply on the mailing list with the subject line “Activity #AICapabilities”
White House: Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence, Oct 2023
https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/
Deepfakes
https://www.youtube.com/watch?v=cQ54GDm1eL0
https://www.youtube.com/watch?v=enr78tJkTLE
Deepfakes: what goes on behind the scenes; go to Colab
https://colab.research.google.com/github/JaumeClave/deepfakes_first_order_model/blob/master/first_order_model_deepfakes.ipynb
Lip sync
https://bhaasha.iiit.ac.in/lipsync/example_upload1
Face recognition
https://youtu.be/jZl55PsfZJQ?si=3wD5xxRHgnD1p1fR
Weaponization
https://www.theguardian.com/world/2023/dec/01/the-gospel-how-israel-uses-ai-to-select-bombing-targets
Errors / Bias in algorithms
Errors in algorithms
https://techcrunch.com/2023/06/06/a-waymo-self-driving-car-killed-a-dog-in-unavoidable-accident/
https://www.theguardian.com/technology/2022/dec/22/tesla-crash-full-self-driving-mode-san-francisco
https://www.indiatoday.in/technology/news/story/robot-confuses-man-for-a-box-of-vegetables-pushes-him-to-death-in-factory-2460977-2023-11-09
What is going on? ☺
https://www.youtube.com/watch?v=lnyuIHSaso8&t=75s
More
https://economictimes.indiatimes.com/news/new-updates/man-gets-caught-in-deepfake-trap-almost-ends-life-among-first-such-cases-in-india/articleshow/105611955.cms
Malicious use: ChaosGPT
“empowering GPT with Internet and Memory to Destroy Humanity.”
https://decrypt.co/126122/meet-chaos-gpt-ai-tool-destroy-humanity
https://en.wikipedia.org/wiki/Tsar_Bomba
https://www.youtube.com/watch?v=kqfsuHsyJb8
Your list of AI risks?
What is an alignment problem?
https://www.youtube.com/watch?v=yWDUzNiWPJA
Misalignment?
https://www.ndtv.com/offbeat/ai-chatbot-goes-rogue-swears-at-customer-and-slams-company-in-uk-4900202
https://twitter.com/ashbeauchamp/status/1748034519104450874/
https://flowingdata.com/2023/11/03/demonstration-of-bias-in-ai-generated-images/
https://blog.google/products/gemini/gemini-image-generation-issue/
Any questions?
Risk sources / Taxonomy
Malicious use
AI race
Organizational risks
Rogue AIs
Malicious use
AI could be used to engineer new pandemics or for propaganda, censorship, and surveillance, or released to autonomously pursue harmful goals.
Malicious use: Bioterrorism
The ability to engineer a pandemic is rapidly becoming more accessible
The cost of gene synthesis is halving every 15 months
Benchtop DNA synthesis can help rogue actors create new biological agents with no safety measures
https://www.nature.com/articles/s42256-022-00465-9
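To get a feel for what halving every 15 months implies, the sketch below projects a cost forward under that rate. It assumes the trend holds exactly and indefinitely, which the slide does not claim; the starting cost of 100 units and the function name `projected_cost` are hypothetical.

```python
def projected_cost(initial_cost, months, halving_period_months=15.0):
    """Cost after `months` if it halves every `halving_period_months`.

    Assumption: a perfectly steady exponential decline.
    """
    if halving_period_months <= 0:
        raise ValueError("halving period must be positive")
    return initial_cost * 0.5 ** (months / halving_period_months)


# Hypothetical starting cost of 100 units, projected over a few horizons.
for months in (0, 15, 30, 60, 120):
    print(f"{months:3d} months: {projected_cost(100.0, months):.2f}")
```

Under this assumption the cost falls by a factor of 16 in five years (60 months is four halvings) and by a factor of 256 in ten.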
Malicious use: ChaosGPT
“empowering GPT with Internet and Memory to Destroy Humanity.”
https://decrypt.co/126122/meet-chaos-gpt-ai-tool-destroy-humanity
Persuasive AI
AIs will enable sophisticated personalized influence campaigns that may destabilize our shared sense of reality
AIs have the potential to increase the accessibility, success rate, scale, speed, stealth and potency of cyberattacks
Concentration of Power
If material control of AIs is limited to a few, it could represent the most severe economic and power inequality in human history.
Malicious use: Solutions
Improving biosecurity
Restricted access controls
Biological capabilities removed from general purpose AI
Use of AI for biosecurity
Restricting access to dangerous AI models
Controlled interactions
Developers to prove minimal risks
Technical research on anomaly detection
Holding AI developers liable for harms
AI race
Competition could push nations and corporations to rush AI development, relinquishing control to these systems.
AI race: Military
AI race: Corporate
AI race: Solutions
Safety regulations: self-regulation of companies, competitive advantage for safety-oriented companies
Data documentation: transparency & accountability
Meaningful human oversight: human supervision
AI for cyber defense: anomaly detection
International coordination: standards for AI development, robust verification & enforcement
Public control of general-purpose AIs
Organizational risks
Organizations developing advanced AI could cause catastrophic accidents, particularly if they put profits over safety
AIs could be accidentally leaked to the public or stolen by malicious actors, and organizations could fail to properly invest in safety research.
Rogue AIs: power seeking
Rogue AIs: Deception
Rogue AIs: Solutions
Solutions to these risks?
Solutions to Mentioned Risks
A Notional Decomposition of Risk
Exposure: extent to which elements (e.g., people, property, systems) are subjected or exposed to hazards
The Disaster Risk Equation
Withstand Hazards
Identify Hazards
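The equation on this slide did not survive extraction. A form commonly used in disaster-risk work, assumed here rather than taken from the slide, is Risk = Hazard x Exposure x Vulnerability. The sketch below treats each factor as a score between 0 and 1 to show how lowering any one factor lowers overall risk; the function name `risk_score` and the example numbers are hypothetical.

```python
def risk_score(hazard, exposure, vulnerability):
    """Multiplicative risk score with each factor scaled to [0, 1].

    Assumption: Risk = Hazard x Exposure x Vulnerability, a common
    notional decomposition, not a formula recovered from the slide.
    """
    factors = {
        "hazard": hazard,
        "exposure": exposure,
        "vulnerability": vulnerability,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return hazard * exposure * vulnerability


baseline = risk_score(hazard=0.5, exposure=0.5, vulnerability=0.5)
# Being better able to withstand hazards corresponds to lower vulnerability.
hardened = risk_score(hazard=0.5, exposure=0.5, vulnerability=0.25)
print(baseline, hardened)
```

Because the factors multiply, driving any single factor to zero removes the risk entirely in this notional model, which is why reducing exposure or vulnerability can matter as much as reducing the hazard itself.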
Example: Robot confuses man for veggies
https://www.indiatoday.in/technology/news/story/robot-confuses-man-for-a-box-of-vegetables-pushes-him-to-death-in-factory-2460977-2023-11-09
X-Risks
AI could someday reach human intelligence
[Figure: intelligence scale running from "Dumb person" to "Albert Einstein"]
The train won’t stop at human station
[Figure: intelligence scale running from "Dumb person" to "????"]
Models Are Not Always Honest
We can show models “know” the truth, but sometimes are not incentivized to output it.
Emergent capabilities are common
Power-seeking can be instrumentally incentivized
“One might imagine that AI systems with harmless goals will be harmless. This paper instead shows that intelligent systems will need to be carefully designed to prevent them from behaving in harmful ways.” ~ Omohundro
https://en.wikipedia.org/wiki/Steve_Omohundro
“By default, suitably strategic and intelligent agents, engaging in suitable types of planning, will have instrumental incentives to gain and maintain various types of power, since this power will help them pursue their objectives more effectively”
- Joseph Carlsmith, Is Power-seeking AI an Existential Risk?
Power-seeking can be explicitly incentivized
Stephen Hawking on AI Risk
“Unless we learn how to prepare for, and avoid, the potential risks, AI could be the worst event in the history of our civilization. It brings dangers, like powerful autonomous weapons, or new ways for the few to oppress the many. It could bring great disruption to our economy.”
Hillary Clinton on AI Risk
Alan Turing on AI Risk
Norbert Wiener on AI Risk
“There are very few examples of a more intelligent thing being controlled by a less intelligent thing,”
- Geoffrey Hinton
https://edition.cnn.com/videos/tv/2023/05/02/the-lead-geoffrey-hinton.cnn
Speculative Hazards and Failure Modes
Weaponized AI
Recently, it was shown that AI could generate potentially deadly chemical compounds
AI could be used to create autonomous weapons
Deep RL methods outperform humans in simulated aerial combat
What to do about weaponized AI?
Anomaly detection
Detect novel hazards such as novel biological phenomena
Detect malicious use and nation-state misuse
Systemic Safety (forecasting, ML for cyberdefense, cooperative AI)
Reduce probability of conflict
Policy
Out of scope for this course
Proxy Gaming
Future artificial agents could over-optimize and game faulty proxies, which could mean systems aggressively pursue goals and create a world that is distinct from what humans value
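A toy sketch of proxy gaming: an optimizer that only sees a measurable proxy picks a different option than one that could see what people actually value. The policies, scores, and field names below are hypothetical, chosen only to show the gap between the two objectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    proxy_score: float  # what the system is optimized for, e.g. clicks
    true_value: float   # what humans actually care about (hypothetical)


def best_by(policies, key):
    """Return the policy that maximizes the given scoring function."""
    return max(policies, key=key)


# Hypothetical candidate behaviours for a content recommender.
POLICIES = [
    Policy("balanced feed", proxy_score=0.6, true_value=0.8),
    Policy("clickbait feed", proxy_score=0.9, true_value=0.2),
    Policy("empty feed", proxy_score=0.1, true_value=0.4),
]

proxy_choice = best_by(POLICIES, key=lambda p: p.proxy_score)
human_choice = best_by(POLICIES, key=lambda p: p.true_value)

print("optimizing the proxy picks:", proxy_choice.name)
print("optimizing true value picks:", human_choice.name)
print("value lost to proxy gaming:",
      human_choice.true_value - proxy_choice.true_value)
```

The harder the system optimizes the proxy, the more the mismatch between proxy and true value matters, which is the concern the slide raises about future agents.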
Activity #AIRisks
Thank you for attending the class!!!
Ponnurangam.kumaraguru
/in/ponguru
ponguru
[email protected]