
The Existential Risks of AI Do Exist

In “Avengers: Age of Ultron”, the second installment of the popular Marvel series, “Iron Man” Tony Stark creates an advanced artificial intelligence system named Ultron, initially designed to protect the world. Ultron's interpretation of this mission takes a dark turn, however, when it concludes that humanity itself is the greatest threat to the planet's survival. Ultron then wages a catastrophic war on humanity and causes mass destruction before it is eventually defeated by the superheroes. This plot vividly portrays the potential consequences of creating AI systems that surpass human intelligence (super-intelligent AI) and act autonomously. Among all the ways a super-intelligent AI could harm humanity, one of the worst we can imagine is exactly what Ultron attempts in the movie: eradicating our existence. This is usually called the existential risk of AI (“X-risk” for short). In fact, the astounding performance of recently developed AI systems, for example LLMs (Large Language Models) like ChatGPT, serves as a wake-up call to researchers, developers and everyone else that these concerns are no longer mere fiction. But wait, is the X-risk of AI “real”? Or is it just a groundless worry?

In the example of Ultron, the existential risk arises from two key factors: misalignment and power-seeking. Misalignment in AI refers to the situation where an AI system's goals diverge from human values and its creators' intentions. Power-seeking means that such misaligned AI systems will seek power or control (over resources, for example) in order to achieve their goals (Carlsmith, 2022). The mere existence of misalignment is, of course, not sufficient for AIs to pose catastrophic threats to humanity, because an AI has to be capable enough before it can do real harm. What we care about is whether such capable AIs will be built and, most importantly, whether they will seek to take power from us.

There are indeed dissenting voices arguing against these worries. The most radical opponents attack the premise of the whole X-risk narrative, namely that super-intelligent AI systems can be built. They do not believe such systems are feasible and claim that these worries are either exaggerated or based on speculative scenarios (Heaven, 2023). Some argue that misalignment may not cause much trouble: even if an AI's goals diverge from those originally intended, they might not directly contradict human goals, so it does not follow that AI and humanity must fight to the death (Ambartsoumean & Yampolskiy, 2023). There are also optimists who dismiss the argument for X-risks by pinning their hopes on mitigation strategies for the problems AI systems cause. They believe rapid progress in AI technologies brings opportunities for developing safe AI systems that can counteract “bad” AIs, and that ongoing research into AI safety and alignment aims to ensure that AI systems remain aligned with human values and objectives (Turner, 2022). Besides, some argue that even if X-risks are real, they should not be the first priority, given other more urgent issues, especially regulatory issues concerning AI, that need to be dealt with.

So have we been making a fuss over nothing about the existential risks of AI? My answer is NO, both theoretically and empirically.

Theoretically, X-risk has long been a phantom haunting AI researchers and philosophers. The philosopher Bostrom (2012) raised the concern long ago that super-intelligent AIs would have an incentive to take control away from humanity, based on two theses he proposed: the Orthogonality Thesis and the Instrumental Convergence Thesis. The former claims that agents of arbitrarily high intelligence could pursue arbitrary final goals, rendering the axis of final goals and the axis of intelligence orthogonal. The latter says that there are common intermediate goals which are instrumental for realizing most final goals; examples include self-preservation, power-seeking, resource exploitation and so on (Bostrom, 2012). The Instrumental Convergence Thesis resonates with the so-called “basic AI drives” (Omohundro, 2008). Omohundro argues that some drives are fundamental to AI, such as survival drives (similar to self-preservation) and resource drives, and that the pursuit of such basic drives would expose humanity to the risk of catastrophe. These arguments answer the question of why we should still care about existential risks even though misalignment does not entail direct contradiction with human goals. At the same time, it is important to note that power-seeking is a significant instrumental goal for intelligent AI systems across many final goals.

Beyond these conceptual arguments, Hadshar (2023) built a database covering up-to-date empirical evidence for claims about existential risk from AI, including misalignment, power-seeking and other aspects related to X-risks. The author also interviewed several AI researchers about the strength of this evidence for existential risk from AI. Generally speaking, the empirical evidence is weaker than the philosophical arguments above. There is strong evidence that certain types of misalignment, for example goal misgeneralization, exist in current AI systems, whereas cases of power-seeking have so far been found less frequently. “One plausible explanation is that power-seeking behavior depends on a level of goal-directedness or capability in general which current models don’t yet have” (Hadshar, 2023). Another caveat is that current empirical evidence may not be reliable enough for predicting the future development of AI, because all of these judgements are made under great uncertainty.

It is also worth noting that the X-risk of AI has drawn a great deal of public attention and has sparked striking discussions among both non-experts and experts (Mandel, 2023). If one doubts the expertise of the general public, then the remarkable number of high-profile figures, including experts in the AI industry, who have signed the statement put out by the Center for AI Safety warning of catastrophic risks from AI speaks for itself (Center for AI Safety, 2024). The statement proclaims: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” Grace et al. (2024) also conducted a survey asking AI researchers about the future of AI. Not surprisingly, the researchers’ general credence that super-intelligent AIs will be invented and could cause existential risks is relatively high (Grace et al., 2024).

So what can we do? In response to those who argue against taking measures towards X-risks on grounds of priority, I would say that, at the very least, action must be taken. The X-risks of AI may not impose any immediate societal impact, but the potentially profound influence they could have on humanity makes them too consequential to overlook. Both short-term and long-term risks highlight the need for proactive risk assessment and management strategies. In the short term, this involves addressing immediate consequences through regulatory measures, ethical guidelines, and other governance considerations. In the long term, it requires ensuring that AI development aligns with human values and goals, through the development of safe AI technologies and international cooperation to mitigate the risks associated with advanced AI systems.

References

Ambartsoumean, V. M., & Yampolskiy, R. V. (2023). AI Risk Skepticism, A Comprehensive Survey. https://doi.org/10.48550/ARXIV.2303.03885

Bostrom, N. (2012). The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents. Minds and Machines, 22(2), 71–85. https://doi.org/10.1007/s11023-012-9281-3

Carlsmith, J. (2022). Is Power-Seeking AI an Existential Risk? https://doi.org/10.48550/ARXIV.2206.13353

Center for AI Safety. (2024). Statement on AI Risk. https://www.safe.ai/work/statement-on-ai-risk

Grace, K., Stewart, H., Sandkühler, J. F., Thomas, S., Weinstein-Raun, B., & Brauner, J. (2024). Thousands of AI Authors on the Future of AI.

Hadshar, R. (2023). A Review of the Evidence for Existential Risk from AI via Misaligned Power-Seeking. https://doi.org/10.48550/ARXIV.2310.18244

Heaven, W. D. (2023, June 19). How existential risk became the biggest meme in AI. MIT Technology Review. https://www.technologyreview.com/2023/06/19/1075140/how-existential-risk-became-biggest-meme-in-ai/

Mandel, D. R. (2023). Artificial General Intelligence, Existential Risk, and Human Risk Perception. https://doi.org/10.48550/ARXIV.2311.08698

Omohundro, S. M. (2008). The Basic AI Drives. In Artificial General Intelligence 2008: Proceedings of the First AGI Conference, 483–492.

Turner, A. M. (2022). On Avoiding Power-Seeking by Artificial Intelligence. https://doi.org/10.48550/ARXIV.2206.11831
