
HS1501 Artificial Intelligence and Society

Wong Tin Lok

based on materials by Yu Chien Siang

initial version prepared in conjunction with


Kang Joon Kiat and Lee Boon Chong

National University of Singapore


AY2023/24 Semester 2
Contents

1 Why care 4
1.1 What is Artificial Intelligence (AI)? . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 What can AI do nowadays? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Beating humans in Go . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Checking parking payment . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Writing job applications . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.4 And many other applications . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Does AI bring any problems? . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.1 AI won an art competition . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.2 Autonomous cars crashed . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.3 Fake videos circulated . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.4 AI Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Try this out! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Capabilities: language 13
2.1 Named entity recognition (NER) . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Sentiment analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Summarization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Information extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.5 Entity resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.6 Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.7 Speech recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.8 Speech synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.9 Natural language generation (NLG) . . . . . . . . . . . . . . . . . . . . . . . 20
2.10 Further applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.10.1 Chatbots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.10.2 Writing computer code . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.10.3 Producing music . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.10.4 Other examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.11 Current challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.12 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3 Capabilities: vision 26
3.1 Text recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Object recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Facial recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Action recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.5 Visual question answering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 Image/video generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.7 Image processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.8 Further applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.8.1 Reverse image search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3.8.2 Games, augmented reality (AR), virtual reality (VR), and the metaverse 38
3.8.3 Deepfakes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.9 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.10 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

4 Capabilities: robots 41
4.1 Sensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Navigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Physical activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.4 Interacting with other machines . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.5 Interacting with human . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.6 Robotic process automation (RPA) . . . . . . . . . . . . . . . . . . . . . . . . 46
4.7 Further applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.8 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.9 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

5 Use cases 50
5.1 Data analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.2 Manufacturing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.3 Agriculture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.3.1 Case study: Japanese cucumber farmer employing deep learning to sort
cucumbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.4 Education . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.4.1 Benefits of using AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.4.2 Concerns of using AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.4.3 Example of AI application: Duolingo . . . . . . . . . . . . . . . . . . . 54
5.5 Government . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.6 Green . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.7 Weather . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.8 Retail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.9 Information technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.10 Finance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.11 Insurance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.12 Human resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.13 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

6 Technical background 62
6.1 Artificial neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.1.1 The challenge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.1.2 Machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.1.3 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.1.4 Deep neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.2 Hardware acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.3 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.4 Open-source code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.5 Low-code development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.6 Cloud computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.7 Edge computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.8 Example: Transformer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.9 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

7 Challenges and issues 75
7.1 Abuse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
7.2 Malfunction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
7.2.1 Unexpected scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
7.2.2 Low-quality training data . . . . . . . . . . . . . . . . . . . . . . . . . 76
7.2.3 Poor engineering/programming choices . . . . . . . . . . . . . . . . . . 77
7.2.4 Preventing and handling failures . . . . . . . . . . . . . . . . . . . . . 77
7.3 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.3.1 Data poisoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.3.2 Evasion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
7.3.3 Inference attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
7.3.4 Defence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.4 Explainability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.4.1 Local Interpretable Model-agnostic Explanations (LIME) . . . . . . . 81
7.4.2 Layer-wise Relevance Propagation (LRP) . . . . . . . . . . . . . . . . 81
7.5 Data scarcity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
7.6 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

8 Economics 86
8.1 Cost reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
8.2 Productivity gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
8.3 Wealth distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
8.4 Job market . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
8.5 Leadership . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
8.6 Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

9 Ethics 92
9.1 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
9.2 Weapons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
9.3 Bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
9.4 Morals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
9.5 Social status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
9.6 Exercise: job loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

10 Governance 100
10.1 General principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
10.1.1 European Union (EU) . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
10.1.2 United States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
10.1.3 IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
10.1.4 Microsoft . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
10.1.5 Delicate issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
10.2 Privacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
10.2.1 General Data Protection Regulation (GDPR) in the EU . . . . . . . . 103
10.2.2 Personal Data Protection Act (PDPA) in Singapore . . . . . . . . . . 105
10.2.3 US Federal Trade Commission (FTC) probes . . . . . . . . . . . . . . 106
10.3 Liability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
10.4 Intellectual property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

11 Future 109
11.1 Capabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
11.2 Human–machine interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
11.3 Brain-inspired AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
11.4 Quantum computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
11.5 Society . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

HS1501 §1

Why care

We start the course with a definition of artificial intelligence, a showcase of its capabilities
and problems, and a number of natural questions about AI that we will look into in this
course.

1.1 What is Artificial Intelligence (AI)?


• Depending on the context, the term AI may refer to a system or a scientific discipline.
• The following definition given in Apr. 2019 by the High-Level Expert Group on Artificial
Intelligence set up by the European Commission covers both senses of the term. We
will see some of the technical terms used explained in §6.1.
Artificial intelligence (AI) systems are software (and possibly also hardware)
systems designed by humans that, given a complex goal, act in the physical or
digital dimension by perceiving their environment through data acquisition,
interpreting the collected structured or unstructured data, reasoning on the
knowledge, or processing the information, derived from this data and deciding
the best action(s) to take to achieve the given goal. AI systems can either
use symbolic rules or learn a numeric model, and they can also adapt their
behaviour by analysing how the environment is affected by their previous
actions.
As a scientific discipline, AI includes several approaches and techniques,
such as machine learning (of which deep learning and reinforcement learning
are specific examples), machine reasoning (which includes planning, schedul-
ing, knowledge representation and reasoning, search, and optimization), and
robotics (which includes control, perception, sensors and actuators, as well
as the integration of all other techniques into cyber-physical systems).
Source: High-Level Expert Group on Artificial Intelligence. “A definition of AI: Main capabilities and disciplines”. 8 Apr. 2019. https://ec.europa.eu/newsroom/dae/document.cfm?doc_id=60651. Last accessed: 2 Jan. 2024.

• We will also use the term AI to refer to AI systems collectively, in addition to using it
for a particular AI system and for the scientific discipline of AI.

• Programs that drive AI systems are often called AI models, or models for short.

1.2 What can AI do nowadays?
1.2.1 Beating humans in Go
An AI model called AlphaGo, developed by Google DeepMind, beat the 9-dan professional Lee Sedol at Go in 2016. It was the first time a computer Go program had beaten a 9-dan professional player without a handicap. This had been considered practically unachievable in view of the vast number of possible moves in the game of Go.

Image source: Axd, CC BY-SA 4.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Lee-sedol-alphago-divine-move.jpg.
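The scale of the problem can be seen from a back-of-envelope calculation. The figures below are commonly cited rough estimates (around 35 legal moves per position over roughly 80-move games for chess, versus around 250 moves per position over roughly 150-move games for Go), used here only to illustrate the orders of magnitude involved:

```python
# Rough game-tree size estimates (commonly cited approximate figures):
# chess: ~35 legal moves per position, games of ~80 moves;
# Go:    ~250 legal moves per position, games of ~150 moves.
chess_tree = 35 ** 80
go_tree = 250 ** 150

# The number of decimal digits indicates the order of magnitude.
print(f"chess: ~10^{len(str(chess_tree)) - 1} lines of play")
print(f"Go:    ~10^{len(str(go_tree)) - 1} lines of play")
```

The Go estimate exceeds even the square of the chess estimate, which is why search techniques that had worked for chess were considered hopeless for Go.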

1.2.2 Checking parking payment


Prof. Yu explains how AI can help automate the checking of parking payment in the video
below.

26 sec

1.2.3 Writing job applications


Here is a demonstration from Jan. 2024 of how Google’s AI-based Bard can help write cover
letters for job applications, using a job advertisement taken from the NUS Student Work
Scheme (NSWS) system and the advice from the NUS Centre for Future-ready Graduates
on crafting cover letters.

1.2.4 And many other applications
Andrew Ng: AI is the new electricity. It will transform every industry and cre-
ate huge economic value. Technology like supervised learning is automation on
steroids. It is very good at automating tasks and will have an impact on every
sector – from healthcare to manufacturing, logistics and retail.
Source: Catherine Jewell. “Artificial intelligence: the new electricity”. WIPO Magazine, Jun. 2019. https://www.wipo.int/wipo_magazine/en/2019/03/article_0001.html.

Andrew Ng is an active proponent of AI education, Chairman and Co-Founder of Coursera, and Adjunct Professor at Stanford University. He co-founded Google Brain in 2012 and was Chief Scientist at Baidu in 2014–2017.

1.3 Does AI bring any problems?


1.3.1 AI won an art competition

Image source: Jason M. Allen / Midjourney. “Théâtre d’Opéra Spatial”. https://commons.wikimedia.org/wiki/File:Th%C3%A9%C3%A2tre_d'Op%C3%A9ra_Spatial.webp.

The image above won first place in the digital art competition at the 2022 Colorado
State Fair. Jason M. Allen made it using an AI program called Midjourney, which generates
custom images from users’ text inputs in seconds. This sparked controversy over the role of AI
in art.
Sources: [1] Colorado State Fair. “2022 Fine Arts First, Second & Third”. 29 Aug. 2022. https://coloradostatefair.com/wp-content/uploads/2022/08/2022-Fine-Arts-First-Second-Third.pdf. [2] Rachel Metz. “AI won an art contest, and artists are furious”. CNN Business, 3 Sep. 2022. https://edition.cnn.com/2022/09/03/tech/ai-art-fair-winner-controversy/index.html.

1.3.2 Autonomous cars crashed


In May 2016, Joshua Brown was killed when his Tesla Model S, engaged in the “Autopilot”
mode, collided with a truck in Florida, USA. Prof. Yu discusses this and similar
accidents in the following video.

2 min 26 sec
Reference: Danny Yadron and Dan Tynan. “Tesla driver dies in first fatal crash while using autopilot mode”. The Guardian, Jul. 2016. https://www.theguardian.com/technology/2016/jun/30/tesla-autopilot-death-self-driving-car-elon-musk.

1.3.3 Fake videos circulated


In Mar. 2022 during the war between Russia and Ukraine, a deepfake video of Ukrainian
President Volodymyr Zelenskyy urging Ukrainians to put down their weapons circulated on
social media and was placed on a Ukrainian news website by hackers.

1 min 12 sec
Source: The Telegraph (@telegraph). “Deepfake video of Volodymyr Zelensky surrendering surfaces on social media”. YouTube, 17 Mar. 2022. https://youtu.be/X17yrEV5sl4.

1.3.4 AI Risk
In May 2023, many AI experts (and other notable figures) signed the following Statement on
AI risk.

Mitigating the risk of extinction from AI should be a global priority alongside
other societal-scale risks such as pandemics and nuclear war.
The signatories include:
• Geoffrey Hinton (Emeritus Professor of Computer Science at the University of Toronto,
awarded the Turing Award in 2018 for his work in AI);
• Yoshua Bengio (Professor of Computer Science at Université de Montréal, awarded the
Turing Award in 2018 for his work in AI);
• Demis Hassabis (CEO of Google DeepMind, which developed AlphaGo mentioned
in §1.2.1 above); and
• Sam Altman (CEO of OpenAI, which developed the GPT family of large language
models and the text-to-image AI model DALL-E).
Reference: Centre for AI Safety. “Statement on AI Risk”. https://www.safe.ai/statement-on-ai-risk. Last accessed: 5 Jan. 2024.

1.4 Reflection
• How good is current AI? How fast is it developing?
• How much does it cost to incorporate AI in our work nowadays?
– Can only the tech giants do this, or can individuals also afford to take advantage?
• How much technical knowledge is required of us to deploy AI to perform tasks that are
specific to our needs?
– Do we need to know coding for it?
• How much of the work we are now doing ourselves can already be automated by AI?
• What can we exploit AI for, as individuals and as organizations?
– Learning? Brainstorming? Making better decisions? Increasing productivity?
Saving time and money? Starting and running businesses? Earning money? Im-
proving life? Saving lives? Saving the earth?
– What are some limitations of current AI? How creative can it be?
• What should we look out for to increase our chances of success when deploying AI?
• Is AI really going to be everywhere?
• Will AI take away my job in the future?
– Which jobs are more easily replaced by AI?
– What will the role of AI be in the future job market?
• How will AI transform jobs, societies, businesses, economies, politics, etc.?
• How do AI-made products compare to traditional machine-made products and hand-
made products?
• In which directions is AI development heading currently?
• Will AI really be dangerous to us (humans)?

Answers to many of these questions may change as the technology evolves. Therefore, instead
of presenting one fixed set of answers, this course will guide us to our own set of answers.

1.5 Try this out!
In a paper published at the International Conference on Computer Vision 2023, researchers
from the University of Maryland, Adobe Research, and Carnegie Mellon University introduced
the use of rich text, i.e., text augmented with font, style, size, colour, and even
Internet information, to give the user of a text-to-image AI model more fine-grained control
over the output.
Here are some demonstrations from their paper, where the images on the left are gener-
ated using only textual information, and the images on the right take into account also the
augmented information.

Image source: Songwei Ge, Taesung Park, Jun-Yan Zhu, and Jia-Bin Huang. “Expressive Text-to-Image
Generation with Rich Text”. IEEE International Conference on Computer Vision (ICCV) 2023.

Try this out yourself on Hugging Face to see how helpful the use of rich text is in specifying
the image to be generated.

1. Open the “Expressive Text-to-Image Generation with Rich Text” page on Hugging Face
at https://huggingface.co/spaces/songweig/rich-text-to-image.
2. Type into the text box a textual description of an image you would like to generate.

3. Include more specific information using the buttons at the top of the text box.
4. Click the “Generate” button at the bottom, and wait for the process to end.
5. The model generates the required outputs in the box labelled “Rich-text” and in the
box labelled “Plain-text”. Compare the two.

6. Repeat the steps above with different inputs.


7. Evaluate the efficiency of the model and the quality of the outputs.

Here is a demonstration that is not from the authors of the AI.

Moral: trying out helps one understand an AI better.

HS1501 §2

Capabilities: language

AI is capable of analyzing, processing, and generating speech and text. Natural language
processing (NLP) is the study of techniques that enable computers to use written and
spoken human languages the way human beings can. Similar techniques can be used for
non-language applications, e.g., coding and music.
Here is a summary of the major NLP capabilities of AI nowadays.
• named entity recognition (NER): extracting the names of persons, places, compa-
nies, and more, and classifying them into predefined labels
• topic modelling: uncovering hidden topics from large collections of documents
• text categorization: sorting text into specific taxonomies
• text clustering: grouping text or documents based on similarities in content

• sentiment analysis: identifying, extracting, quantifying, and studying affective states


and subjective information
• summarization: generating a short version of the input document that retains the
important points

• information extraction: finding meaningful information in unstructured text


• entity resolution: identifying records in (internal or public) data sources that refer
to the same real-world entities, and identifying relationships between these records
• translation: turning text from one language to another while retaining the meanings

• speech recognition: converting speech to text


• speech synthesis: converting text to speech
• natural language generation (NLG): transforming data into human language
References: [1] William D. Eggers, Neha Malik, and Matt Gracie. “Using AI to unleash the power of unstructured government data”. Deloitte, 16 Jan. 2019. https://www2.deloitte.com/us/en/insights/focus/cognitive-technologies/natural-language-processing-examples-in-government-data.html. Last accessed: 20 Jan. 2024. [2] Multiple contributors. “Sentiment analysis”. Wikipedia. https://en.wikipedia.org/wiki/Sentiment_analysis. Last accessed: 20 Jan. 2024.

We will look at some of these in more detail, check out the current level of the technology,
explore a few applications, and discuss some challenges in this area.

2.1 Named entity recognition (NER)
See how good NER is these days by following the steps below.

1. Go to the “displaCy Named Entity Visualizer” at https://demos.explosion.ai/displacy-ent.
2. Enter (or copy-and-paste) some text in the text box.

3. Select the language of the text in the dropdown menu.

4. Tick the kinds of entities you want labelled.

5. Click the ✓ button.

6. The model labels the selected kinds of entities below.

Try out different inputs. Does it make (m)any mistakes?
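At its interface, an NER system maps text to labelled spans. The following toy Python sketch illustrates that input–output behaviour with a tiny hand-made lookup table; real systems such as the displaCy demo use trained models instead, and the names and labels here are chosen just for the example:

```python
# Toy named entity recognizer: looks up known names in a tiny hand-made
# gazetteer. This only illustrates the (text -> labelled spans) interface;
# trained NER models generalize to names they have never seen.
GAZETTEER = {
    "Singapore": "GPE",        # geopolitical entity
    "Google DeepMind": "ORG",
    "Lee Sedol": "PERSON",
}

def recognize_entities(text):
    """Return (entity, label, start position) triples found in the text."""
    entities = []
    for name, label in GAZETTEER.items():
        start = text.find(name)
        if start != -1:
            entities.append((name, label, start))
    return sorted(entities, key=lambda e: e[2])

print(recognize_entities("Lee Sedol played AlphaGo in Singapore? No, in Seoul."))
# -> [('Lee Sedol', 'PERSON', 0), ('Singapore', 'GPE', 28)]
```

Note that this toy version misses “Seoul” entirely because it is not in the gazetteer — exactly the brittleness that trained models are meant to overcome.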

2.2 Sentiment analysis


Sentiment analysis often refers to the detection of polarity (e.g., positive or negative), emotion
(e.g., angry, happy or sad), urgency, and intention (e.g., interested or not interested) in text
or speech.
Automatically analyzing customer feedback, such as opinions in survey responses and
social media conversations, using sentiment analysis allows brands to better understand their
customers, so that they can tailor products and services to meet their needs.
Reference: MonkeyLearn. “Sentiment Analysis: A Definitive Guide”. https://monkeylearn.com/sentiment-analysis/. Last accessed: 20 Jan. 2024.

See a demonstration of AI sentiment analysis by following the steps below.

1. Go to Lexalytics’s “NLP Demo” page at https://www.lexalytics.com/nlp-demo/.

2. Select a category you want demonstrated in the Industry Pack section.


3. Select a text sample in the category you want analyzed.
4. Click the “Show Analysis” button.
5. An analysis report is shown, where the words indicating sentiments are highlighted.

6. In the different tabs, one can also see an analysis of the degrees and the topics of the
sentiments.
7. Repeat the steps above with different selections and evaluate the results.

Now try sentiment analysis on your own text by following the steps below.

1. Open Lettria’s Customer Sentiment Analysis page on Hugging Face at https://huggingface.co/spaces/Lettria/customer-sentiment-analysis.

2. Clear the “Customer Review” box.


3. Enter a piece of text in the “Paragraph” box (or copy-and-paste there a review from
your favourite online restaurant guide).
4. Click the “Submit” button and wait for the process to end.

5. The model determines whether the sentiment is “POSITIVE”, “NEUTRAL” or “NEG-


ATIVE”.
6. Click the “Clear” button and repeat the steps above with a different input.
7. Evaluate the quality of the outputs.
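The simplest form of polarity detection can be sketched in a few lines of Python: count sentiment-bearing words from positive and negative word lists. This is only a toy illustration (the word lists are made up); the commercial analyzers demonstrated above use trained models rather than fixed lexicons:

```python
# Toy lexicon-based sentiment scorer: polarity is the count of positive
# words minus the count of negative words. Real analyzers use trained
# models; the word lists here are invented for illustration.
POSITIVE = {"good", "great", "delicious", "friendly", "excellent"}
NEGATIVE = {"bad", "slow", "cold", "rude", "terrible"}

def sentiment(text):
    words = text.lower().replace(",", " ").replace(".", " ").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"

print(sentiment("The food was delicious and the staff were friendly."))  # POSITIVE
print(sentiment("Service was slow and the soup arrived cold."))          # NEGATIVE
```

A fixed lexicon fails on negation (“not good”) and sarcasm, which is one reason modern sentiment analysis relies on machine learning instead.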

2.3 Summarization
Extractive models perform “copy-and-paste” operations: they select relevant phrases of the
input document and concatenate them to form a summary.
• They are quite robust since they use existing natural-language phrases that are taken
straight from the input, but they lack flexibility since they cannot use novel words
or connectors. They also cannot paraphrase.

Abstractive models generate a summary based on the actual “abstracted” content: they can
use words that were not in the original input.
• This gives them much more potential to produce fluent and coherent summaries, but
it is also a much harder problem, as the model must now generate coherent phrases
and connectors itself.
Reference: Romain Paulus. “Your TLDR by an ai: a Deep Reinforced Model for Abstractive Summarization”. Salesforce, 11 May 2017. https://blog.salesforceairesearch.com/your-tldr-by-an-ai-a-deep-reinforced-model-for-abstractive-summarization/. Last accessed: 20 Jan. 2024.
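The extractive approach can be sketched in a few lines: score each sentence by how frequent its words are in the whole document, then output the top-scoring sentences in their original order. This is a toy frequency-based sketch, not how any particular commercial summarizer works:

```python
# Minimal extractive summarizer: score each sentence by the average
# document-wide frequency of its words, then keep the top-k sentences in
# their original order. Pure "copy-and-paste": it cannot paraphrase,
# which is exactly the limitation of extractive models described above.
from collections import Counter

def extractive_summary(text, k=1):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(w.lower() for s in sentences for w in s.split())
    scored = [(sum(freq[w.lower()] for w in s.split()) / len(s.split()), i, s)
              for i, s in enumerate(sentences)]
    top = sorted(scored, reverse=True)[:k]
    return ". ".join(s for _, i, s in sorted(top, key=lambda t: t[1])) + "."

print(extractive_summary(
    "AI transforms industry. My cat naps. AI transforms work and industry."))
# -> "AI transforms industry."
```

The sentence whose words recur most across the document is selected verbatim; the off-topic sentence about the cat scores lowest and is dropped.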

Try out a commercial AI summarizer by following the steps below.

1. Open Intellexer’s “Summarizer” at http://esapi.intellexer.com/Summarizer.

2. Click into the “Load Text” tab.


3. Enter some text you want to summarize into the text box.
4. Indicate how long you want the summary to be, in terms of a percentage of the original
text or the number of sentences.

5. Click the “Summarize” button and wait for the model to load.
6. A summary is produced as requested.
7. Repeat the steps above with a different input and evaluate the quality of the outputs.

Here is a demonstration using the essay from https://www.channelnewsasia.com/commentary/ai-jobs-universal-basic-income-unemployment-support-3589421.

Is the summarization performed extractive or abstractive? (It is extractive.)

2.4 Information extraction


• AI NLP allows extraction of useful information from Big Data, i.e., extensive data sets,
such as the Internet, that are too large to be analyzed using traditional methods.
• For example, AI can analyze patent data to visualize the relationships between patents,
and thus help investors make more informed decisions, as described in the video by
Prof. Yu below.

3 min 33 sec

• Question-answering AI can generate answers to given questions by querying a knowl-
edge base.
• Closed-domain question answering deals only with questions under a specific domain,
e.g., medicine and law, while open-domain question answering deals with factual ques-
tions about nearly everything.

• Prof. Yu demonstrates closed-domain question answering in the video below.

2 min 30 sec

• Try out open-domain question-answering AI at https://www.perplexity.ai/ yourself. Does it give accurate answers to everyday questions? Does it give accurate answers
to academic questions?

2.5 Entity resolution


• Financial organizations and public-sector organizations can use entity resolution to
detect fraud, improve risk assessment, improve investigative outcomes, help ensure
compliance, improve customer insights, and reduce false positives and false negatives.
• The company Senzing developed software using AI that is capable of performing real-
time entity resolution.
Reference: Senzing. “Financial Services” and “Public Sector”. https://senzing.com/industries/financial-services/ and https://senzing.com/industries/public-sector/. Last accessed: 20 Jan. 2024.

• Watch Jeff Jonas, founder and CEO of Senzing, explain the challenges and further
applications of entity resolution. (There is also a demonstration using Singapore data.)

13 min 41 sec
Video source: Senzing (@senzinginc). “Real-Time AI for Entity Resolution”. YouTube, 5 Nov. 2019. https://youtu.be/FN-Vg57Y7JQ.
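The core task — deciding whether two records refer to the same real-world entity — can be illustrated with a toy Python sketch that normalizes names and requires a corroborating field. Production systems (such as Senzing’s) combine many more signals and handle far messier data; the records and matching rule below are invented purely for illustration:

```python
# Toy entity resolution: two customer records refer to the same person if
# their normalized names agree AND at least one other field corroborates.
# Real systems use many more signals; these records are invented.
def normalize(name):
    """Lowercase, strip periods, and sort name tokens (order-insensitive)."""
    return " ".join(sorted(name.lower().replace(".", "").split()))

def same_entity(rec_a, rec_b):
    if normalize(rec_a["name"]) != normalize(rec_b["name"]):
        return False
    return rec_a["phone"] == rec_b["phone"] or rec_a["email"] == rec_b["email"]

a = {"name": "Tan Wei Ming", "phone": "91234567", "email": "wm@example.com"}
b = {"name": "Wei Ming Tan", "phone": "91234567", "email": "weiming@example.org"}
c = {"name": "Wei Ming Tan", "phone": "81111111", "email": "other@example.org"}

print(same_entity(a, b))  # True:  name tokens match and phone corroborates
print(same_entity(a, c))  # False: same name tokens, but nothing corroborates
```

Requiring a corroborating field is what keeps the false-positive rate down: many distinct people share a name, which is why entity resolution is harder than simple string matching.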

2.6 Translation
• The classical approach to machine translation is rule-based, i.e., based entirely on
dictionaries and grammars. It requires a great amount of manual effort.
• Another approach is statistics-based: one picks out the most likely translation according
to some sample data given.
• In 2016, Google Translate started using translation models based on neural networks,
which give superior performance compared to statistics-based models.
References: [1] Quoc V. Le and Mike Schuster. “A Neural Network for Machine Translation, at Production Scale”. Google Research, 27 Sep. 2016. https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html. Last accessed: 20 Jan. 2024. [2] Yonghui Wu, et al. “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”. arXiv:1609.08144 [cs.CL], Sep./Oct. 2016.
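To see why the rule-based approach demands so much manual effort, here is a deliberate caricature of it in Python: pure word-by-word dictionary lookup, with a tiny invented French–English dictionary and no grammar or reordering rules at all:

```python
# Caricature of classical rule-based machine translation: word-by-word
# dictionary lookup. The dictionary is invented for illustration; real
# rule-based systems also need large grammars and reordering rules, which
# is why they require so much manual effort.
DICT_FR_EN = {"le": "the", "chat": "cat", "noir": "black", "dort": "sleeps"}

def translate_word_by_word(sentence):
    # Unknown words are flagged rather than translated.
    return " ".join(DICT_FR_EN.get(w, f"<{w}?>") for w in sentence.lower().split())

print(translate_word_by_word("Le chat noir dort"))
# -> "the cat black sleeps"
```

Even on this four-word sentence the output is ungrammatical English, because French places the adjective after the noun: every such pattern needs a hand-written reordering rule, whereas statistical and neural models learn these regularities from data.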

Here is an English translation of a Chinese poem by Ji Zhang by the AI-based machine translator at https://www.deepl.com/. Is the output acceptable, despite the fact that the input is probably not of an intended type?

Compare this machine translation with the following human translation by Yuanchong Xu.

At moonset cry the crows, streaking the frosty sky;
Dimly lit fishing boats ’neath maples sadly lie.
Beyond the city wall, from Temple of Cold Hill
Bells break the ship-borne roamer’s dream and midnight still.

2.7 Speech recognition


• AI Singapore, in collaboration with NUS and NTU, developed a speech recognition
engine called the Speech Lab, which is able to recognize conversations comprising words
from different languages, e.g., Singlish.
Reference: AI Singapore. “Speech Lab”. https://aisingapore.org/aiproducts/speech-lab/. Last
accessed: 20 Jan. 2024.

• See how well it works in the video below.

29 sec
Video source: AI Singapore (@AISingapore). “Speech Lab Product Demo”. YouTube, 12 Nov. 2019.
https://youtu.be/ZCqW7meCXFk.

2.8 Speech synthesis


• The WaveNet model, created by Google’s DeepMind in 2016, can generate realistic-sounding
human-like voices, better than what Google’s other speech synthesis systems produced
at that time.
• Listen to the demonstrations at the following webpage to judge for yourself how well
WaveNet performs.
Reference: Aäron van den Oord and Sander Dieleman. “WaveNet: A generative model for raw audio”.
DeepMind, 8 Sep. 2016. https://www.deepmind.com/blog/wavenet-a-generative-model-for-raw-audio.
Last accessed: 20 Jan. 2024.

2.9 Natural language generation (NLG)


• In §1.2.3, we saw an application of NLG in writing job applications.
• Other use cases include: suggesting replies to emails and complaints, and collating
audit findings.
• The complexity, the ambiguity, and the variety of expressions in human languages make
NLG challenging.
• The most powerful NLG AIs nowadays are large language models (LLMs): massive programs
that contain language information extracted from enormous amounts of data.

• Currently, the most popular type of LLMs is the so-called Transformer models, which
we will look at in more detail in §6.8.
Reference: Multiple contributors. “Natural language generation”. Wikipedia.
https://en.wikipedia.org/wiki/Natural_language_generation. Last accessed: 20 Jan. 2024.
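The idea that a language model “contains language information extracted from data” can be seen in miniature with a bigram model: record which word follows which in training text, then generate by repeatedly sampling a likely next word. LLMs do this at a vastly larger scale with learned neural networks rather than raw counts; the sketch below, with made-up training text, is illustrative only.

```python
import random
from collections import Counter, defaultdict

training_text = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat saw the dog ."
)

# Count bigrams: for each word, how often each next word follows it.
follows = defaultdict(Counter)
words = training_text.split()
for w, nxt in zip(words, words[1:]):
    follows[w][nxt] += 1

def generate(start, n_words, seed=0):
    """Generate text by sampling each next word according to bigram counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n_words):
        nxt_counts = follows[out[-1]]
        if not nxt_counts:
            break
        # sample the next word in proportion to its observed frequency
        out.append(rng.choice(list(nxt_counts.elements())))
    return " ".join(out)

print(generate("the", 5))
```

Every adjacent word pair in the output was seen in the training text, so short spans look fluent; the model has no grasp of meaning, which is also where hallucination in larger models comes from.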

Compare the quality of different LLMs by following the steps below.

1. Go to the “Chatbot Arena: Benchmarking LLMs in the Wild” page at
https://chat.lmsys.org/.

2. Click into the “Arena (side-by-side)” tab.


3. Select two LLMs you would like to compare from the dropdown menus.
4. Ask the chosen LLMs to generate some text to your liking by entering a prompt into
the “Enter your prompt and press ENTER” box. (Do not enter personal or private
information.)
5. Click the “Send” button.
6. Wait for the text to be generated in the boxes above.
7. Compare the quality of the results.

8. Try out different LLMs and different prompts.


9. Evaluate the results. How good are the LLMs in generating a coherent piece of text?
How about sustaining a meaningful conversation? Answering factual questions? Rea-
soning?

2.10 Further applications
2.10.1 Chatbots
• Chatbots are software programs that conduct written or spoken conversations in natural
languages.
• In Nov. 2022, OpenAI launched a free “preview” of its text chatbot called ChatGPT.
• ChatGPT can adapt to the style and the content of the prompt. This allows the user
to generate realistic and coherent continuations about a topic of their choosing.

• ChatGPT attracted widespread public interest and showed great potential. In Jan. 2023,
Microsoft extended its partnership with OpenAI through a multiyear, multibillion dol-
lar investment. While the free version of ChatGPT is based on an LLM called GPT-3.5,
Microsoft’s new Bing search engine now runs on OpenAI’s more capable GPT-4. Mi-
crosoft also started using such LLMs with its 365 apps.

• As a direct response to ChatGPT, Google released Bard, Meta released LLaMA, Baidu
released ERNIE bot, and Anthropic released Claude, all of which have ChatGPT-like
capabilities, in different capacities.
• These chatbots not only generate text, but also analyze sentiment, summarize text,
translate text, etc.

• People have since found numerous applications of these chatbots, e.g., brainstorming,
explaining complex topics, getting feedback or second opinion, polishing write-ups, and
rehearsing for interviews.
References: [1] Murray Shanahan. “Talking About Large Language Models”. arXiv:2212.03551 [cs.CL],
Dec. 2022/Feb. 2023. [2] Microsoft Corporate Blogs. “Microsoft and OpenAI extend partnership”.
Official Microsoft Blog, 23 Jan. 2023. https://blogs.microsoft.com/blog/2023/01/23/microsoftandopenaiextendpartnership/.
Last accessed: 20 Jan. 2024. [3] Yusuf Mehdi. “Reinventing search
with a new AI-powered Microsoft Bing and Edge, your copilot for the web”. Official Microsoft Blog,
7 Feb. 2023. https://blogs.microsoft.com/blog/2023/02/07/reinventing-search-with-a-new-ai-powered-microsoft-bing-and-edge-your-copilot-for-the-web/.
Last accessed: 20 Jan. 2024.
[4] Jared Spataro. “Introducing Microsoft 365 Copilot – your copilot for work”. Official Microsoft
Blog, 16 Mar. 2023. https://blogs.microsoft.com/blog/2023/03/16/introducing-microsoft-365-copilot-your-copilot-for-work/.
Last accessed: 20 Jan. 2024.

• Examples of currently available virtual voice agents include the Google Assistant, Ap-
ple’s Siri, and Amazon’s Alexa.

• Virtual voice agents work seamlessly with smart speakers, e.g., Google Nest (formerly
known as Google Home) and Amazon Echo.
• Through voice agents, one can control lights and devices, play music and videos, get
answers to questions, place orders, . . .

• Chatbots can also be used in shopping malls to provide concierge and navigation service
and to create smarter digital signage.
• Voice chatbots, e.g., Google’s Duplex and Amazon Connect, can now carry out real-
world tasks over the phone.

• Watch how well Duplex works in 2018 in the following demonstration.

1 min 55 sec
Video source: ZEM502 (@ZEM502). “Google Duplex Demo (Google I/O 2018)”. YouTube, 11 May 2018.
https://youtu.be/znNe4pMCsD4.

• Some chatbots can now be developed without code.


Source: LivePerson. “Conversation Builder”. https://www.liveperson.com/products/conversation-builder/.
Last accessed: 20 Jan. 2024.

2.10.2 Writing computer code


• Computer languages are languages. So some language models apply to them as well.
• For example, ChatGPT (discussed in §2.10.1) can generate and translate computer
code.

• Watch how this can be exploited to build apps in a low-code manner through Debuild.

47 sec
Video source: Sharif Shameem (@sharifshameem1227). “Debuild.co – Creating a Todo List”. YouTube,
3 Aug. 2020. https://youtu.be/WhPgZFsPLeE.

2.10.3 Producing music
• In addition to text-to-speech capabilities, WaveNet, discussed in §2.8 above, can also
be used to synthesize other audio signals such as music. The webpage linked there
contains some demonstrations of this kind towards the bottom.

• There are music “chatbots”. One example is A.I. Duet built by Yotam Mann and
friends at Google. You can try it out at https://aiexperiments.withgoogle.com/ai-duet/view/.

2.10.4 Other examples


• grammar checkers
• natural language database query

2.11 Current challenges


• As we saw in the demonstration in §2.7, current voice agents are still not very good at
speech recognition, especially when different languages and dialects are involved. Multilingual
text is also a challenge.
• ASEAN languages lack corpora. Singlish has the AI.SG corpus, but this is just a start
and is not yet enough.
• Currently, AI has no true understanding of language. For instance, this poses limits
to question answering abilities: AI may not understand questions that have complex
structure and may not find answers to slightly ambiguous questions.
• While AI can now retrieve information from the Internet/databases to answer open-
domain questions, many chatbots still have rather limited scopes.
• Sentiment analysis is affected by many parts of a text, each of which may have a different
meaning and implication.
• Many NLP systems require additional training and cannot work out of the box.
• LLMs currently take lots of computer power to train and are slow.
• Current LLMs may hallucinate, i.e., they may produce confident responses that do
not seem to be justified by the source data used to make the model. The result can
be plausible-sounding, but factually incorrect. The following is an example from the
chatbot at https://deepai.com/chat.

References: [1] Multiple contributors. “Question answering”. Wikipedia.
https://en.wikipedia.org/wiki/Question_answering. Last accessed: 20 Jan. 2024. [2] Kyle Wiggers. “Salesforce’s AI navigates
Wikipedia to find answers to complex questions”. VentureBeat, 25 Feb. 2020.
https://venturebeat.com/ai/salesforces-ai-navigates-wikipedia-to-find-answers-to-complex-questions/. Last
accessed: 20 Jan. 2024. [3] Multiple contributors. “Hallucination (artificial intelligence)”. Wikipedia.
https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence). Last accessed: 20 Jan. 2024.

2.12 Reflection
• We saw some current NLP capabilities of AI and how powerful/restricted they are.

• Will you start exploiting these capabilities in your work? If yes, then how? If no, then
why?
• Can you think of some creative applications of these capabilities?
• In your opinion, are these developments good or bad for you and for the society in
general?

HS1501 §3

Capabilities: vision

AI is capable of analyzing, processing, and generating images and videos.


Here is a summary of the major vision capabilities of AI nowadays.
• text recognition: converting images of text to machine text, aka optical character
recognition (OCR)
• object recognition: identifying the presence and the location of a desired object or
body within an image/video

• facial recognition: identifying and verifying a person using facial features in an image
or a video
• action recognition: identifying actions from a list of categories, together with the
moments at which they happen when the input is a video

• visual question answering: answering questions about an input image or video


• image/video generation
• image/video processing, e.g., enhancing the resolution of an image (i.e., super-
resolution), adding colours to black-and-white images (i.e., colourization), recolouring
an image, and transferring image/video style
We will look at these in more detail, check out the current level of the technology, explore a
few applications, and discuss some challenges in this area.

3.1 Text recognition


Traditional text recognition software works only on standard images of typewritten text in
a specific font, while AI text recognition software is much more flexible.
One of the standard tasks to complete when one starts writing code for AI is to recognize
handwritten digits 0–9. Try out the following code written for this particular task.

1. Open chrisjay’s simple-mnist-classification page on Hugging Face at
https://huggingface.co/spaces/chrisjay/simple-mnist-classification.

2. Draw a digit 0–9 in the “Sketch” box.


3. The model displays in the “Label” box what it thinks the digit is, with a level of
confidence.
4. Click the “×” button in the “Sketch” box and repeat the steps above with a different
(possibly non-digit) input.
5. Evaluate the quality of the outputs.
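Under the hood, any digit classifier maps a grid of pixels to the most likely label. A crude pure-Python stand-in for this idea is nearest-template matching on hand-drawn 5×5 bitmaps; this is an illustrative sketch, not the neural network the MNIST demo actually uses.

```python
# Tiny 5x5 binary "images" of the digits 0 and 1, written by hand;
# a real system learns from thousands of 28x28 scanned digits instead.
TEMPLATES = {
    0: ["01110",
        "10001",
        "10001",
        "10001",
        "01110"],
    1: ["00100",
        "01100",
        "00100",
        "00100",
        "01110"],
}

def classify(image):
    """Return the digit whose template agrees with the input on most pixels."""
    def score(template):
        return sum(t == p
                   for trow, prow in zip(template, image)
                   for t, p in zip(trow, prow))
    return max(TEMPLATES, key=lambda digit: score(TEMPLATES[digit]))

# A slightly sloppy "1": its second row differs from the stored template.
scribble = ["00100",
            "00100",
            "00100",
            "00100",
            "01110"]
print(classify(scribble))  # -> 1
```

Template matching breaks down as soon as digits are shifted, rotated, or written in a different style, which is exactly why learned models replaced it.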

Let us try out AI text recognition on real-life photos of various kinds.

1. Open Tomofumi Inoue’s EasyOCR page on Hugging Face at
https://huggingface.co/spaces/tomofi/EasyOCR.
2. Drop a text-bearing image into the “Input” box (or click the box to upload such an
image file).
3. Select the language(s) of the text in the input image.
4. Click the “Submit” button, and wait for the model to finish running.
5. The model displays in the “Output” boxes where it thinks the text is, and what it
thinks the text is in the image.
6. Click the “Clear” button and repeat the steps above with a different input.
7. Evaluate the quality of the outputs.

Here are some sample images you can use for the exercise.

Image sources: [1] Walter Lim from Singapore, CC BY 2.0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Directional_road_sign_along_Bukit_Timah_Road,_Singapore_-_20101128.jpg.
[2] MarcusObal, CC BY-SA 4.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Green_Grass.JPG.
[3] Clem Onojeghuo clemono2, CC0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Old_books_on_bookshelf._(Unsplash).jpg. [4] alex.ch, CC BY 2.0, via Wikimedia
Commons. https://commons.wikimedia.org/wiki/File:Lim_Seng_Tjoe_Lecture_Theatre,_National_University_of_Singapore_-_20070125.jpg.

CAPTCHA is a type of challenge–response test used in computing to determine whether
the user is human. (CAPTCHA stands for “Completely Automated Public Turing test to tell
Computers and Humans Apart”.) The most common type of CAPTCHA was first invented
in 1997, requiring the user to enter a sequence of letters or numbers in a distorted image like
the following.

Image source: A K Nain. “OCR model for reading Captchas”. Keras code examples, created 14 Jun. 2020,
last modified 26 Jun. 2020. https://keras.io/examples/vision/captcha_ocr/.

Specially made AI is now able to pass traditional CAPTCHA. Verify this by following
the steps below.

1. Open keras-io’s “OCR for CAPTCHA” page on Hugging Face at
https://huggingface.co/spaces/keras-io/ocr-for-captcha.

2. Drop one of the CAPTCHA images above into the “img path” box.
3. Click the “Submit” button.
4. The model displays in the “output” box the text in the input image.
5. Click the “Clear” button and repeat the steps above with a different input.

6. Evaluate the quality of the outputs.


7. Use tomofi’s EasyOCR (which we used above) on the same CAPTCHA images. Com-
pare the results.

Text recognition is an important part of digitization. It allows a large amount of data
to be input (and thus processed) automatically. This is often necessary in the running of a
Smart Nation.

3.2 Object recognition


There are a number of different types of object recognition.

• classification
• semantic segmentation
• object detection
• instance segmentation

Image source: Li, Johnson, and Yeung. “Lecture 11: Detection and Segmentation”. Course slides for CS231n:
Deep Learning for Computer Vision for 2017 at Stanford University.
http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf. Last accessed: 27 Jan. 2024.

A popular object recognition model is YOLO (You Only Look Once). It runs fast while
maintaining good accuracy, which is key in autonomous driving amongst other applications.
Watch Joseph Redmon, one of YOLO’s early developers, present and demonstrate YOLO in
the following TED talk from 2017.

7 min 37 sec
Source: TED. “How computers learn to recognize objects instantly | Joseph Redmon”. YouTube, 18 Aug. 2017.
https://youtu.be/Cgxsv1riJhI.
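Detectors in the YOLO family output many overlapping candidate boxes; duplicates are conventionally pruned with non-maximum suppression (NMS), which keeps the highest-scoring box and discards boxes that overlap it too much, as measured by intersection-over-union (IoU). A minimal sketch of this post-processing step, with made-up boxes and scores:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # keep a box only if it does not overlap an already-kept box too much
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]: the two overlapping boxes collapse to one
```

Speed matters here because NMS runs on every frame; YOLO’s real-time performance comes from doing detection in a single network pass before this cheap pruning step.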

Try out AI classification, semantic segmentation, and object detection on Hugging Face.

1. Open Detomo’s Universal Image Classification page on Hugging Face at
https://huggingface.co/spaces/Detomo/Image-Classification.
2. Drop an image into the “img” box (or click the box to upload an image file).

3. Click the “Submit” button, and wait for the model to finish running.
4. The model displays in the “output” box what it thinks is in the image.
5. Click the “Clear” button and repeat the steps above with a different input.

6. Evaluate the quality of the outputs.

1. Open ziplab’s “Stitched ViTs are Flexible Vision Backbones” page on Hugging Face at
https://huggingface.co/spaces/ziplab/snnetv2-semantic-segmentation.

2. Drop an image into the “Input Image” box (or click the box to upload an image file).
3. Click the “Run” button, and wait for the model to finish running.
4. The model displays in the “Segmentation Results” box a semantically segmented version
of the image.

5. Click the “Clear” button and repeat the steps above with a different input.
6. Evaluate the quality of the outputs.

1. Open shriarulmozhivarman’s “YOLOv7 Inference” page on Hugging Face at
https://huggingface.co/spaces/shriarul5273/Yolov7.
2. Drop an image into the “Input Image” box (or click the box to upload an image file).
3. Click the “Detect” button at the bottom, and wait for the model to finish running.

4. The model displays in the “Output Image” box the result.


5. Click the “×” button in the “Input Image” box and repeat the steps above with a different
input.
6. Evaluate the quality of the outputs.

Image sources: [1] Andrew Bogott, CC BY-SA 4.0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Ice_Kacang.png. [2] Walter Lim from Singapore, CC BY 2.0, via Wikimedia
Commons. https://commons.wikimedia.org/wiki/File:Directional_road_sign_along_Bukit_Timah_Road,_Singapore_-_20101128.jpg.
[3] Clem Onojeghuo clemono2, CC0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Old_books_on_bookshelf._(Unsplash).jpg.

Here are some sample applications of object recognition.


• service and exhibition sectors: queue counting, crowd control
• healthcare sector: reading x-ray films, detecting colour-tagged cancerous cells

• security: detecting (smuggled) weapons, drugs, intruders


• workplace safety: detecting faulty machine parts, violations to workplace safety proto-
cols
• satellite imagery: providing activity signals (e.g., how many cars are parked in an
outdoor carpark, how much light is on in an area), analysis of soil (via hyperspectral
sensing), forest fire detection, natural disaster damage assessment

3.3 Facial recognition


• Facial recognition is an advanced form of biometric authentication.

• Current AI systems, e.g., FaceNet developed by Google in 2015, are capable of verifying,
recognizing, and comparing faces, even when they are in different poses and under
different lighting, with high accuracy.
• Since the onset of the COVID-19 pandemic, researchers have developed AI systems for
recognizing faces with masks on.

• They can perform 1:1 verification (i.e., confirming the identity of a person) and 1:n
identification (i.e., finding the identity of a person).
Reference: James Clayton. “Facial recognition beats the Covid-mask challenge”. BBC, 25 Mar. 2021.
https://www.bbc.com/news/technology-56517033. Last accessed: 27 Jan. 2024.
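Systems like FaceNet map each face image to a numerical embedding vector such that photos of the same person land close together. Verification (1:1) then reduces to a distance threshold, and identification (1:n) to a nearest-neighbour search. A sketch with made-up 3-dimensional embeddings and a made-up threshold (real embeddings have hundreds of dimensions):

```python
import math

def distance(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Made-up embeddings; a real system would compute these from face
# images with a neural network such as FaceNet.
enrolled = {
    "Alice": (0.1, 0.9, 0.2),
    "Bob": (0.8, 0.1, 0.5),
}

def verify(probe, claimed_name, threshold=0.3):
    """1:1 verification: is the probe close enough to the claimed identity?"""
    return distance(probe, enrolled[claimed_name]) < threshold

def identify(probe):
    """1:n identification: return the nearest enrolled identity."""
    return min(enrolled, key=lambda name: distance(probe, enrolled[name]))

probe = (0.12, 0.88, 0.22)     # embedding of a new photo of Alice
print(verify(probe, "Alice"))  # -> True
print(identify(probe))         # -> Alice
```

The threshold trades off false accepts against false rejects, which is why deployments tune it per application (border control is stricter than phone unlock).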

Watch Prof. Yu talk about the current level of the technology and a couple of applications.

1 min 48 sec

We list here some other applications:


• secondary authentication for biometrics, mobile applications, e.g., in a car and at border
control points;

• detection of VIP or intruders for the security of events and organizations;


• identification of offenders, suspicious people, and victims;
• identification of celebrities in significant events;

• automatic indexing of image and video files for media and entertainment companies.

3.4 Action recognition


One way to recognize action is to first recognize pose, which describes the body’s position
in 3D space with a set of skeletal landmark points, such as shoulders and hips. Here is a
demonstration.

Image source: Delon. “Human Pose Estimation and Human Action Recognition: Experimenting for public
good”. Medium (DSAID GovTech), 13 Feb. 2020.
https://medium.com/dsaid-govtech/human-pose-estimation-and-human-action-recognition-experimenting-for-public-good-dabde16521b3.
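Once skeletal landmarks are available, simple geometry already supports basic action cues: for example, the elbow angle can be computed from the shoulder, elbow, and wrist points and thresholded to decide whether an arm is extended. A sketch with made-up 2D landmark coordinates (real pose estimators output such points per frame, often in 3D):

```python
import math

def angle_at(b, a, c):
    """Angle in degrees at point b, formed by the segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.degrees(math.acos(dot / norm))

# Made-up (x, y) landmark coordinates from a pose estimator.
shoulder, elbow, wrist = (0, 0), (1, 0), (2, 0)  # arm fully extended
print(angle_at(elbow, shoulder, wrist))          # -> 180.0

bent_wrist = (1, 1)                              # forearm points upward
print(angle_at(elbow, shoulder, bent_wrist))     # -> 90.0
```

Action recognition over time then amounts to tracking how such angles and positions evolve across video frames and matching the trajectory against known action patterns.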

Using pose information, one can further analyze action. Some applications include the
automation of tests and invigilation, the identification of offenders, and safety enforcement.
Watch Prof. Yu discuss some applications and the challenges involved in the video below.

6 min 22 sec

3.5 Visual question answering


The latest versions of ChatGPT and Google’s Bard can now accept images as input.
Check out how good current visual question-answering AI can be on Hugging Face by
following the steps below.

1. Open vikhyatk’s moondream page on Hugging Face at
https://huggingface.co/spaces/vikhyatk/moondream1.
2. Drop an image into the “Upload or Drag an Image” box.

3. Type a question about the image in the “Question” box.


4. Click the “Submit” button, and wait for the model to finish running.
5. The model displays an answer for the input image in the “Answer” box.

6. Repeat the steps above with a different input.


7. Evaluate the quality of the outputs.

Image source: Andrew Bogott, CC BY-SA 4.0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Ice_Kacang.png.

Visual question answering is useful, for example, for the blind and for automatic visual
data classification.

3.6 Image/video generation


Current AI is capable of generating high-quality images from text descriptions. We saw
in §1.3.1 that one such AI program, Midjourney, helped make an image that won an art
competition. Similar text-to-image programs include DALL·E (developed by OpenAI) and
Stable Diffusion. Try the latter on Hugging Face.

1. Open Stability AI’s Stable Diffusion 2.1 Demo page on Hugging Face at
https://huggingface.co/spaces/stabilityai/stable-diffusion.
2. By trial-and-error, find a text prompt that generates a bird that looks like one shown
below.

3. Evaluate the quality of the images generated and the difficulty in using Stable Diffusion
to make custom-made images.

While there are AIs that turn images into short video clips, generating videos from text
descriptions alone still seems a challenging task at present. Watch a demonstration of Google’s
video-generation AI Lumiere below.

1 min 54 sec
Source: Inbar Mosseri (inbarmosseri6223). “Lumiere”. YouTube, 23 Jan. 2024.
https://youtu.be/wxLr02Dz2Sc.

3.7 Image processing


Here is a cropped version of a photo from Choo Yut Shing on flickr (CC BY-NC-SA 2.0,
https://www.flickr.com/photos/25802865@N08/27294156973).

For the purpose of demonstration, I turned it into a black-and-white photo:

I used the Colourize Neural Filter in Adobe Photoshop (without manual adjustments) to
turn it back to a coloured photo. Compare the result with the original photo.

As another demonstration, I reduced the resolution of the original photo:

I used the Super Zoom Neural Filter in Adobe Photoshop to increase the resolution again.
Evaluate the result.

Try the same yourself using the following AI algorithms on Hugging Face. Compare the
results with what I got from Adobe Photoshop.
• modelscope. “old photo restoration”. https://huggingface.co/spaces/modelscope/old_photo_restoration.
• HuSusu. “Super Resolution with CNN”. https://huggingface.co/spaces/HuSusu/SuperResolution.
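To appreciate what AI super-resolution adds, it helps to see the classical baseline: plain upscaling only repeats (or interpolates) existing pixels, so no new detail can appear, whereas a trained model fills in plausible detail learned from data. A sketch of nearest-neighbour upscaling on a tiny made-up grayscale grid:

```python
def upscale_nearest(image, factor):
    """Classical upscaling: each pixel becomes a factor-by-factor block of copies."""
    out = []
    for row in image:
        big_row = []
        for pixel in row:
            big_row.extend([pixel] * factor)
        for _ in range(factor):
            out.append(list(big_row))  # copy so rows stay independent
    return out

tiny = [[0, 255],
        [255, 0]]  # a 2x2 checkerboard of grayscale values
print(upscale_nearest(tiny, 2))
```

The 4×4 result contains exactly the same information as the 2×2 input, just bigger; a neural filter such as Photoshop’s Super Zoom instead predicts new pixel values, which is why its output can look sharper but can also invent details that were never in the photo.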

3.8 Further applications
3.8.1 Reverse image search
Google Lens and TinEye let you search the Internet using an image for related information
or similar images.

Such technologies have numerous applications. Here are some samples.


• Online marketplaces, content hosting websites, property listings, etc. can use them to
facilitate searches, link posts, identify duplicate entries, detect fraud, and moderate
content.
• Social platforms and dating sites can use them to identify previously flagged images.
• Large-scale sorting and identification projects can use them for automation.
• Trademark offices can use them to speed up the process of identifying trademark
infringements and to facilitate new trademark registrations.
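A common building block behind duplicate-image detection is a perceptual hash: shrink the image, compare each pixel to the mean brightness, and pack the result into bits; near-duplicate images then differ in only a few bits (a small Hamming distance). A simplified average-hash sketch on ready-made tiny grayscale grids (real systems first resize the image, e.g. to 8×8, and use more robust variants):

```python
def average_hash(pixels):
    """Bit string with 1 wherever a pixel is at least the mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p >= mean else "0" for p in flat)

def hamming(h1, h2):
    """Number of bit positions where two hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))

img = [[200, 200, 10, 10],
       [200, 200, 10, 10]]
near_dup = [[210, 195, 12, 8],   # same picture, slightly re-encoded
            [205, 198, 11, 9]]
different = [[10, 200, 10, 200],
             [200, 10, 200, 10]]

print(hamming(average_hash(img), average_hash(near_dup)))   # -> 0: flagged as duplicate
print(hamming(average_hash(img), average_hash(different)))  # -> 4: unrelated
```

Because the hash survives small brightness and compression changes, a platform can index billions of hashes and find near-duplicates with cheap bit comparisons instead of comparing raw pixels.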

3.8.2 Games, augmented reality (AR), virtual reality (VR), and the
metaverse
• Image and video (and text) generation capabilities of AI are useful in generating content
for games.
• AI enables games to exhibit intelligent-like behaviour, which makes them more fun.

– Try out “Quick, Draw!” at https://quickdraw.withgoogle.com/.

– You will be asked to draw some specified objects within 20 seconds.


– See how well their AI is able to tell what you are drawing.

• In §5.1, we will see how AI can enhance gaming experience and engagement via per-
sonalization.
• The same applies more generally to augmented reality, virtual reality, and the metaverse.

– Augmented reality (AR) is an interactive environment that combines the real world
with computer-generated content.

Image source: sndrv, CC BY 2.0, via Wikimedia Commons,
https://commons.wikimedia.org/wiki/File:Augmented_Reality_flashmob.jpg.

– Virtual reality (VR) is an interactive environment designed to make the user feel
immersed in a virtual world.

Image source: European Space Agency, CC BY-SA 3.0 IGO, via Wikimedia Commons,
https://commons.wikimedia.org/wiki/File:Reality_check_ESA384313.jpg.

– A metaverse is essentially a virtual reality in which people can interact with one
another.
• Object recognition capabilities of AI allow the AR/VR/metaverse to respond to the
user’s environment in real time.

3.8.3 Deepfakes
Deepfakes are synthetic videos that have been digitally altered so that a person in them
appears (and sounds) to be someone else. We already saw one example involving Ukrainian
President Zelenskyy in §1.3.3. Here is a more light-hearted yet more realistic example
involving former US President Barack Obama (with some strong language).

1 min 12 sec
Source: BuzzFeedVideo. “You Won’t Believe What Obama Says In This Video! ”. YouTube, 17 Apr. 2018.
https://youtu.be/cQ54GDm1eL0.

While deepfakes can be used for slander and to spread misinformation, they can also
reduce costs in commercial applications, e.g., in generating a photorealistic model for clothes
or accessories, in shortening animation time for animated films, in improving computer-
generated imagery in movies.

3.9 Challenges
Watch Prof. Yu discuss some current challenges in AI vision applications.

6 min 21 sec

3.10 Reflection
• We saw some current vision capabilities of AI and how powerful/restricted they are.
• Which of these capabilities have you already encountered/used?
• What is the next vision capability you would like AI to have? Why?

• From a student’s point of view, is it good or bad if all the lecture videos are AI-
generated? Why?

HS1501 §4

Capabilities: robots

A robot is commonly understood to be a machine that can carry out complex tasks with no
or little human intervention. A robot can be a physical robot or a virtual software agent.
Physical robots may or may not look like humans.
Reference: Multiple contributors. “Robot”. Wikipedia. https://en.wikipedia.org/wiki/Robot. Last
accessed: 3 Feb. 2023.

Here is a summary of the major capabilities of AI robots nowadays.


• sensing: detecting the environment

• navigation: finding location and finding way


• physical activities: moving from one place to another, handling physical objects
• interacting with other machines

• interacting with humans


• robotic process automation (RPA): automating software processes
We will look at these in more detail, check out the current level of the technology, explore
a few applications, and discuss some challenges in this area.

4.1 Sensing
To react to the environment, one first needs to sense it. Here are some ways in which robots
can sense the environment.

• Weight sensor, contact sensor, accelerometer, gyrometer, . . .


• Sound perception
– We saw in §2.7 how well current AI is able to perceive sound.
• Vision

– We saw in §3 how well current AI is able to see.


– Depth cameras (e.g., Intel RealSense, SG$360–670 as of Feb. 2024) can achieve
3D vision.

– Hyperspectral sensors can detect “light” beyond the visible spectrum, e.g., infrared
radiation, and can thus be used to identify materials remotely through their spec-
tral signatures. Here is an image to show gas cloud detection using hyperspectral
imaging.

Image source: Editwiki1111, CC BY-SA 4.0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Hyperspectral_gas_leak_detection.png.

Listen to Prof. Yu talk about the numerous applications of hyperspectral sensors.

1 min 7 sec

• Radar, lidar, and ultrasonic sensors are ways to measure distances remotely using radio
waves, laser, and ultrasonic sound waves, respectively.
– Some radars work well even through clouds, fog, rain and snow.
– Lidar can also determine the 3D shape of an object remotely in high resolutions.
– The performance of ultrasonic sensors is not affected by the colour and the trans-
parency of the objects.
Here is an example of what one gets when combining depth video camera with lidar.

1 min 9 sec
Video source: Intel RealSense. “L515 LiDAR - Scanning with no motion blur”. YouTube, 19 Mar. 2021.
https://youtu.be/Kn25gKkpE6Q.
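Radar, lidar, and ultrasonic ranging all share the same time-of-flight arithmetic: a pulse travels to the object and back, so the distance is the propagation speed times half the round-trip time. A small sketch (the wave speeds are standard physical constants; the echo times are made up for illustration):

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, for radar and lidar
SPEED_OF_SOUND = 343.0          # m/s in air at about 20 degrees C, for ultrasonic sensors

def distance_from_echo(round_trip_seconds, wave_speed):
    """Distance to the target: the pulse covers the gap twice (out and back)."""
    return wave_speed * round_trip_seconds / 2

# An ultrasonic echo returning after 10 ms ...
print(distance_from_echo(0.010, SPEED_OF_SOUND))    # -> 1.715 m (approximately)
# ... versus a lidar pulse returning after 66.7 ns.
print(distance_from_echo(66.7e-9, SPEED_OF_LIGHT))  # about 10 m
```

The million-fold difference in wave speed explains the engineering trade-off: ultrasonic sensors get away with cheap microsecond timing but only work at short range, while lidar needs nanosecond-precision electronics to resolve centimetres.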

4.2 Navigation
• While outdoors, AI can use satellite signals for navigation via the Global Positioning
System (GPS).

• While indoors, AI can use Bluetooth signals, Wi-Fi signals, and mobile phone signals
for navigation.
• Google’s Visual Positioning System uses cameras for more accurate indoor and urban
navigation. Watch it identify key visual points in real time to tell the position and the
orientation of the camera in the video below.

24 sec
Video source: Road to VR. “Google ’Visual Positioning Service’ AR Tracking in Action”. YouTube,
18 May 2017. https://youtu.be/L6-KF0HPbS8.
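Satellite and beacon positioning reduce, in idealized 2D form, to trilateration: given distances to beacons at known positions, intersect the circles around them. Subtracting the circle equations pairwise cancels the quadratic terms and leaves a small linear system. A sketch assuming exact, noise-free distances (real GPS works in 3D and must also solve for the receiver's clock error):

```python
import math

def trilaterate(b1, b2, b3):
    """Each argument is ((x, y), distance). Returns the receiver position."""
    (x1, y1), r1 = b1
    (x2, y2), r2 = b2
    (x3, y3), r3 = b3
    # Subtracting circle equations leaves two linear equations
    # A*x + B*y = C and D*x + E*y = F, solved by Cramer's rule.
    A, B = 2 * (x2 - x1), 2 * (y2 - y1)
    C = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    D, E = 2 * (x3 - x1), 2 * (y3 - y1)
    F = r1**2 - r3**2 - x1**2 + x3**2 - y1**2 + y3**2
    det = A * E - B * D
    return ((C * E - B * F) / det, (A * F - C * D) / det)

# Receiver actually at (3, 4); measure distances to three known beacons.
beacons = [(0, 0), (10, 0), (0, 10)]
dists = [math.dist((3, 4), b) for b in beacons]
print(trilaterate(*zip(beacons, dists)))  # -> approximately (3.0, 4.0)
```

Indoor positioning from Bluetooth or Wi-Fi signal strengths follows the same geometry, except the distance estimates are noisy, so practical systems fit the position by least squares over many beacons rather than exactly intersecting three circles.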

4.3 Physical activities


Autonomous vehicles are essentially robots moving around on wheels. AI helps them pro-
cess data collected from sensors and determine the best course of action. Watch the first
autonomous car approved for public road testing in Singapore driving itself in 2015 in the
video below, and see what it can do.

1 min 28 sec
Video source: Smart Nation Singapore (@SmartNationSingapore). “Minister Balakrishnan rides in
Autonomous Vehicle developed by A*STAR/I2R”. YouTube, 12 Oct. 2015. https://youtu.be/cUDgTRxP4ks.

A more human way to move around is to walk, but walking is a complex task. In 2017,
Google’s DeepMind explored in simulation whether AI can learn to walk and to navigate
complex environments on its own. Watch the results below.

3 min 25 sec
Video source: Google DeepMind. “Emergence of Locomotion Behaviours in Rich Environments”. YouTube,
14 Jul. 2017. https://youtu.be/hx_bgoTF7bs.

In the real world, AI robots can learn to walk in a similar way. Watch the robot Atlas,
made by Boston Dynamics, walk and perform tasks in a complex environment.

1 min 20 sec
Video source: Boston Dynamics. “Atlas Gets a Grip | Boston Dynamics”. YouTube, 18 Jan. 2023.
https://youtu.be/-e1_QhJ1EhQ.

Robots do not need to walk the human way. They can move from one place to another
like some kind of animal, e.g., a dog. Watch the robot Sand Flea, also made by Boston
Dynamics, jump.

1 min 8 sec
Video source: Boston Dynamics. “Sand Flea Jumping Robot”. YouTube, 28 Mar. 2012.
https://youtu.be/6b4ZZQkcNEo.

As demonstrated, a robot does not need to stay on the ground. It can also swim or fly to
a destination. Robots that fly are often called drones. Watch the following promotion video
of a toy drone called Tello, which is made by Ryze Technology, to see what basic capabilities
drones have nowadays. As of Feb. 2024, a Tello costs about SG$150.

1 min 18 sec
Video source: Tello Drone (@tellodrone3300). “Say Hello to Tello”. YouTube, 8 Jan. 2018.
https://youtu.be/3S1Sq64dJuc.

AI can help drones in a number of ways.


• It frees the operator to focus on tasks other than flying, such as monitoring the sur-
roundings for hazards.
• It helps the drone to avoid obstacles and to navigate through tight spaces.

• It helps the drone to make decisions on its own, thus reacting quickly to situations.
• It helps save energy, and thus allows the drone to cover more ground.
Drones (when combined with AI) can be useful for aerial photography, military and rescue
actions, delivery, agriculture, and inspection purposes.
AI can also learn the complex task of handling delicate objects. Watch how well a
robotic gripper developed by the Singapore University of Technology and Design (SUTD)’s
Bio-Inspired Robotics and Design Laboratory can do this.

1 min 7 sec
Video source: SUTD Singapore University of Technology and Design. “This robotic hand can help you pick
your food items and plate your dish.” YouTube, 12 Jan. 2023. https://youtu.be/m2RFdUfwUBA.

4.4 Interacting with other machines


Robots can communicate with other machines (e.g., sensors) via local networks or via the
Internet. The high speed and the low latency of fifth-generation (5G) networks allow human
users to communicate with and, if needed, control (AI-operated) robots remotely in almost
real time.
Relatively simple robots can work together in a decentralized manner to exhibit rather
complex collective behaviour. Watch the Kilobots, developed by the Self-Organizing Systems
Research Group at Harvard University, demonstrate this.

3 min 44 sec
Video source: Deep Look (@KQEDDeepLook). “Can A Thousand Tiny Swarming Robots Outsmart Nature?
| Deep Look”. YouTube, 22 Jul. 2015. https://youtu.be/dDsmbwOrHJs.

4.5 Interacting with humans


With appropriate sensors, AI can help robots collaborate with humans. Collaborative robots,
or cobots for short, are designed to work directly and safely with humans in the same work-
place. They can boost productivity by allowing human workers to focus on more creativity-
and solution-oriented tasks. Watch how closely cobots can work with a human worker in the
demonstration below.

2 min 26 sec
Video source: Universal Robots (@UniversalRobotsVideo). “UR3: The world’s most flexible, light-weight
table-top cobot to work alongside humans”. YouTube, 17 Mar. 2015. https://youtu.be/jsZvhDbnfRo.

Robots are sometimes deployed to converse with humans, e.g., to provide information, to
diagnose patients, and to take care of people. In such cases, having a human-like appearance
makes the conversation more natural. Watch how realistic the humanoid robot Ameca,
developed by Engineered Arts, is at conversing and at making facial expressions.

1 min 54 sec
Video source: Engineered Arts (@EngineeredArtsLtd). “Ameca expressions with GPT3 / 4”. YouTube,
31 Mar. 2023. https://youtu.be/yUszJyS3d7A.

4.6 Robotic process automation (RPA)


• Robotic process automation (RPA) is the deployment of virtual agents, often referred
to as bots, to automate tasks in computer systems.

• It can automate mundane and tedious tasks, thus eliminating human error and increas-
ing productivity and efficiency.
• Common tasks that can be automated include copy-and-pasting data, moving files and
folders, filling out forms, scraping web pages, extracting data from documents, and
generating an automated response to an email.

• Typically, traditional RPA can handle only structured data (in some standardized for-
mat).
• AI enables RPA to also handle semi-structured and unstructured data (e.g., the Internet,
camera footage).

• AI can help in decision making and problem solving, thus lowering the level of human
intervention.
• AI enables the user to specify the process to be automated in natural language, similarly
to how one can generate computer code using natural language instructions in §2.10.2.

Watch a simple example of how AI can help in the automation of submission checking
below.

4 min 23 sec
Video source: IBM Technology (@IBMTechnology). “AI-embedded resilience inside RPA”. YouTube, 23 Jul. 2021.
https://youtu.be/EBEIVhkIW2w.

4.7 Further applications


“Roomba®”, from SG$470 as of Feb. 2024, made by iRobot Corporation, is a series of
cordless robot cleaners that can vacuum-clean and mop the floor autonomously and on
request.

Image source: Dwight Sipler, CC BY 2.0, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Gillie trying to avoid the Roomba %282166682851%29.jpg.

“Spot®”, about SG$99000 reportedly in 2020, built by Boston Dynamics, is a dog-like
robot that is capable of sensing and inspection in complicated terrain. It was deployed in
the Bishan–Ang Mo Kio Park to remind visitors of safe distancing measures in 2020.

Image source: Gin Tay. “As the robot is four-legged, it is able to navigate obstacles more effectively compared
to wheeled robots, making it suitable for different terrains”. The Straits Times, 9 May 2020. https:
//www.straitstimes.com/singapore/robot-reminds-visitors-about-safe-distancing-measures-in-bishan-ang-mo-kio-park.

“Pepper”, reportedly about SG$19000 including mandatory monthly subscription fees,
manufactured by SoftBank Robotics, is a humanoid robot designed to chat with people in a
friendly manner. DBS deployed Pepper to guide customers on how to use the Video Teller
Machines at its branch in Plaza Singapura in 2017.

Source: DBS. “DBS reimagines banking with lifestyle space for ”tech” generation”. 10 Nov. 2017. https:
//www.dbs.com/newsroom/DBS reimagines banking with lifestyle space for tech generation.

At the Consumer Electronics Show 2024, Samsung introduced a new version of the robot
“Ballie”, and LG unveiled the robot “Q9”. Both are mobile robots with built-in voice agents
that can move around the home on their own and perform tasks according to what they
sense. Ballie has a built-in projector that enables it to present content on walls, floors, and
ceilings. Q9 has two wheeled legs that allow it to traverse bumps on the floor.

References and image sources: [1] Samsung Electronics. “A Day in the Life With Ballie: An AI Companion
Robot for the Home”. 8 Jan. 2024. https://news.samsung.com/us/samsung-ballie-ai-companion-robot-
home-video-ces-2024/. Last accessed: 3 Feb. 2024. [2] LG Electronics. “LG ushers in ‘Zero Labour Home’
with its smart home AI agent at CES 2024”. 2 Jan. 2024. https://www.lg.com/sg/about-lg/press-and
-media/lg-ushers-in-zero-labour-home-with-its-smart-home-ai-agent-at-ces-2024/. Last accessed:
3 Feb. 2024.

4.8 Challenges
• To protect the safety and the privacy of people, governments have strict regulations on
where robots can operate.
• In unfamiliar situations, the behaviour of robots may be hard to predict.

• The technology for imitating emotions is still not mature.

Listen to Prof. Yu share his experience with drones.

2 min 15 sec

Listen to Prof. Yu discuss the issues of robots going out of control.

2 min 30 sec

4.9 Reflection
• We saw some current capabilities of AI robots and how powerful/restricted they are.
• Supposing that you are to get a physical AI robot for your home free-of-charge next
month, what capabilities would you like it to have and not have?

• The automation of which operations within the NUS using AI-powered RPA would be
most beneficial to you now?
• Do you think robots will be a threat to humans (like in the films and otherwise)? In
what sense?

HS1501 §5

Use cases

We saw in §2–§4 that AI can read, write, hear, speak, see, draw, and move. One last capability
of AI that we will see is “thinking”. This is highly useful in applications. We will study this
first, then delve deeper into the vast potential of adopting AI in a few selected sectors.
• manufacturing
• agriculture

• education
• government
• green
• weather

• retail
• information technology
• finance

• insurance
• human resources
Here are the main tasks for which AI is generally used nowadays.

Perception: Collecting past and current data.


Notification: Sending appropriate alerts and reminders based on the
data collected.
Suggestion: Recommending action based on past behaviors, human
input, and the data collected.
Automation: Enabling more and more work to be automated.
Prediction: Predicting future data based on the data collected.
Prevention: Identifying potential threats based on its predictions.
Situational awareness: Figuring out what we need to know.

Source: R “Ray” Wang. “Monday’s Musings: Understand The Spectrum Of Seven Artificial Intelligence
Outcomes”. 17 Sep. 2016. https://www.raywang.org/blog/2016-09/mondays-musings-understand-spectr
um-seven-artificial-intelligence-outcomes. Last accessed 10 Feb. 2024.

The adoption of AI is key to the ongoing “Fourth Industrial Revolution”, aka “Indus-
try 4.0”, which is characterized by the merging of the physical, the digital, and the biological
worlds.

5.1 Data analytics
• Data analytics refers to the process of converting raw data into actionable insights.
• Data analytics can be descriptive, diagnostic, predictive, and prescriptive, which are
respectively about finding out what is happening, why it happened, what will likely
happen, and what should be done.
• As we saw in §2 and §3, AI can extract information from raw data, e.g., websites and
camera footage, for further analysis.
• Just as AI recognizes objects in images, it can also recognize patterns, trends, and
relationships in raw data.
• AI can use the regular patterns/trends learnt to detect irregularities and to predict
what will likely happen.
• AI can use the relationships learnt to suggest causes of events.
• Altogether, AI can analyze the potential implications of different choices and recom-
mend the best course of action.
• One general application of data analytics is personalization: AI can tailor an experience
to the patterns it recognizes in the user and to the specified requirements.
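As a minimal sketch of how a learnt “regular pattern” can flag irregularities, the snippet below models the pattern as just the mean and spread of past data and applies a classic z-score test. The daily sales figures are made up for illustration; real AI systems learn far richer patterns.

```python
import statistics

def find_anomalies(history, new_values, threshold=3.0):
    """Flag values that deviate from the historical pattern by more than
    `threshold` standard deviations (a simple z-score test)."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return [x for x in new_values if abs(x - mean) > threshold * stdev]

# Hypothetical daily sales figures: the learnt "regular pattern" is just
# the mean and spread of the past data.
past_sales = [100, 98, 103, 101, 99, 102, 100, 97, 104, 100]
print(find_anomalies(past_sales, [101, 160, 99]))  # prints [160]
```

The same idea, scaled up to many variables and learnt models, underlies anomaly detection in fraud screening and equipment monitoring.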
Reference: AWS. “What Is Data Analytics?” https://aws.amazon.com/what-is/data-analytics/. Last
accessed: 10 Feb. 2024.

5.2 Manufacturing
Listen to Prof. Yu talk about the issues in traditional manufacturing plants that AI can help
solve.

3 min 30 sec
• A smart factory is a digitalized manufacturing facility that operates using connected
devices, machinery, and production systems that continuously collect and share data.
• Smart factories are typically equipped with an abundance of sensors to collect real-time
data into a centralized system on the cloud.
• With these sensors, AI systems can track critical key performance indicators (KPIs),
improve planning, and visually inspect parts in factories for patterns of imperfections,
cf. §3.2.
• By analyzing real-time data from sensors and other sources (cf. §5.1), AI can anticipate
and address potential issues proactively before they lead to breakdowns in physical
systems, instead of reacting to issues as they arise. This approach, known as predictive
maintenance, is also useful for general facilities.
• Predictive maintenance enables the user to tailor maintenance routines to each piece
of equipment, and thus optimize maintenance resources.

• AI systems can maintain a digital representation of (parts of) the factory that is updated
with real-time sensor data. This representation is sometimes called a digital twin, and
is useful in simulations and tests for improvements.
• Virtual reality (VR) technology is often useful in maintenance simulations, product
development, and worker training. As mentioned in §3.8.2 and §2.10.1, AI can help VR
systems generate content, track user movement, and execute verbal instructions.
• We saw in §4.3 and §4.5 that AI robotics can automate a large number of processes.
• AI-powered data analytics can be used to predict future demand, and thus manage the
supply chain, the storage, and the product distribution more efficiently.

• Here are a few common challenges faced by smart factories:


– insufficient failure data for analysis, as highly critical systems are not allowed to
fail;
– insufficient or wrong types of sensors;
– physical or financial constraints in the installation of sensors;
– preference of the status quo, as not all people share the same vision.
• As many other sectors can be viewed as manufacturing in a very broad sense, many of
these points apply outside of factories as well.
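The predictive-maintenance idea above can be sketched very simply: fit a trend line through past sensor readings and estimate when it will cross a failure limit. The vibration readings and the 8.0 mm/s limit below are made-up numbers, not real equipment specifications, and production systems use far richer models than a straight line.

```python
def predict_failure_time(times, readings, limit):
    """Fit a least-squares line through past sensor readings and estimate
    when the trend will cross `limit` (None if the trend is not rising)."""
    n = len(times)
    mt = sum(times) / n
    mr = sum(readings) / n
    slope = sum((t - mt) * (r - mr) for t, r in zip(times, readings)) / \
            sum((t - mt) ** 2 for t in times)
    if slope <= 0:
        return None
    intercept = mr - slope * mt
    return (limit - intercept) / slope

# Hypothetical vibration readings (mm/s) taken once per day, with an
# assumed failure limit of 8.0 mm/s.
days = [0, 1, 2, 3, 4]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]
print(predict_failure_time(days, vibration, 8.0))  # prints 12.0
```

Maintenance would then be scheduled shortly before the predicted crossing, rather than on a fixed calendar or after a breakdown.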

5.3 Agriculture
• By 2050, the world’s population is projected to reach 9.8 billion. Agricultural produc-
tion will need to increase by 70% to meet the growing demand for food.
• Farming faces a global labour crunch, partly because of the arduous physical labour
involved.
• AI robotics allows farmers to control the equipment without physically operating it.
• It also helps farmers plow and spray crops with enhanced precision.
• AI-powered predictive analytics help optimize operations, and hence improve produc-
tivity.
Reference: Food and Agriculture Organization of the United Nations. “2050: A third more mouths to feed”.
23 Sep. 2009. https://www.fao.org/newsroom/detail/2050- A- third- more- mouths- to- feed/. Last
accessed: 10 Feb. 2024.

5.3.1 Case study: Japanese cucumber farmer employing deep learning to sort cucumbers
• Makoto Koike’s parents have a cucumber farm, on which his mother used to spend up
to eight hours a day manually sorting their harvest into nine classes according to size,
thickness, colour, texture, shape, small scratches, etc.

• In 2016, Koike built the following cucumber sorter based on AI object recognition
technology to automate this sorting process.

47 sec
Video source: Kazunori Sato. “TensorFlow powered cucumber sorter by Makoto Koike”. YouTube,
4 Aug. 2016. https://youtu.be/4HCE1P-m1l8.

• His sorter used a Raspberry Pi 3 (SG$64 as of Feb. 2024) as the main controller,
an Arduino Micro (SG$34 as of Feb. 2024) for controlling the conveyor belt, and the
open-source (thus free-of-charge) AI software library TensorFlow for the code.
• In real use cases, the system’s accuracy was about 70%.
• The system took two to three days on a typical desktop computer to learn the nine
classes of cucumbers from 7000 low-resolution images.
• More computing power is needed to increase the accuracy.
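Koike’s sorter learnt its nine classes from images with a deep neural network in TensorFlow. As a much simpler illustration of classifying from labelled examples, the sketch below uses a nearest-neighbour rule on made-up cucumber measurements; the numbers and class names are invented, not from the real system.

```python
import math

# Made-up training data: (length cm, thickness cm, colour score) -> class.
# Illustrative only; Koike's real sorter learnt nine classes from 7000
# images with a neural network.
training = [
    ((30.0, 3.0, 0.9), "premium"),
    ((29.0, 2.8, 0.8), "premium"),
    ((22.0, 2.5, 0.6), "regular"),
    ((21.0, 2.4, 0.5), "regular"),
    ((15.0, 2.0, 0.3), "reject"),
]

def classify(sample):
    """Assign the class of the nearest labelled example (1-nearest-neighbour)."""
    return min(training, key=lambda ex: math.dist(ex[0], sample))[1]

print(classify((28.0, 2.9, 0.85)))  # prints premium
```

A neural network goes further by learning useful features (texture, scratches, shape) directly from raw pixels instead of hand-measured numbers.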
Source: Kazunori Sato (佐藤一憲). “How a Japanese cucumber farmer is using deep learning and TensorFlow”. Google Cloud
Blog, 1 Sep. 2016. https://cloud.google.com/blog/products/ai-machine-learning/how-a-japanese-cu
cumber-farmer-is-using-deep-learning-and-tensorflow. Last accessed: 10 Feb. 2024.

5.4 Education
• The incorporation of AI with education is transforming the modern educational land-
scape.
• Global Market Insights Inc. predicted that the AI-in-Education market size would grow
from US$4 billion in 2023 to US$20 billion in 2032, which corresponds to an annual
growth rate of about 20%.

Reference: Global Market Insights Inc. “AI in Education Market”. Report ID: GMI2639, Jan. 2023. https:
//www.gminsights.com/industry-analysis/artificial-intelligence-ai-in-education-market. Last
accessed: 10 Feb. 2024.

Listen to Prof. Yu talk about an emerging form of independent learning made possible
by the recent advancement of AI.

2 min 39 sec

5.4.1 Benefits of using AI


• AI automation makes education more widely available.
• Via real-time data analytics, AI can track each student’s progress and personalize the
learning experience to her/his goals and abilities by changing the speed, the materials,
and the order of topics.
• AI can help make learning more interesting and effective for students, e.g., through
gamification and augmented reality.
• With its RPA, NLP, and action recognition capabilities, AI can automate (partially)
time-consuming, tedious tasks, such as administrative record keeping, exam invigila-
tion, simple grading, and answering frequently asked questions, for teachers.

5.4.2 Concerns of using AI


• AI can collect data from students via usage and performance tracking for personal-
ization, but the data collected may be used for other purposes, e.g., to improve the
AI itself, and the place where the data is stored may be unsafe/unknown. This raises
privacy concerns.
• Schools with more resources, funding, and accessibility can expand AI-based learning
and the learning of AI more easily than other schools. This widens existing gaps
between schools.
• The adoption of AI may lead to a population that is generally not able to function
when technology is not available, e.g., in emergency situations.

5.4.3 Example of AI application: Duolingo


Duolingo, a mobile application for learning languages, uses the NLP and data analytics
capabilities of AI to
• analyze user responses in order to
– identify common grammar mistakes and make corrections and suggestions,
– keep the learning at the right level and the right pace for the user;
• analyze the sounds and patterns of the user’s speech, so as to provide targeted feedback
on the pronunciation;

• power the auto-suggest feature in free-form writing exercises;
• create voices for their characters, to make lessons more fun;
• deliver its high-stakes English test, in
– creating test items,
– assessing the language ability required for each test item,
– adaptively administering items for every test,
– grading test-takers’ answers (even for questions that require more substantial writ-
ing),
– synthesizing the answers and the grades into a final test score, and
– making the human review and proctoring stage more stable and efficient.
References: [1] Sophie Wodzak. “3 ways Duolingo improves education using AI”. 30 Mar. 2023. https:
//blog.duolingo.com/ai- improves- education/. [2] Duolingo. “The AI Behind Duolingo • Dr. Burr
Settles • Duocon 2021”. YouTube, 21 Aug. 2021. https://youtu.be/fnTZdZENRIk, 10 min 31 sec. [3] Burr
Settles and Geoffrey T. LaFlair. “The Duolingo English Test: AI-driven language assessment”. 30 Apr. 2020.
https://blog.duolingo.com/the-duolingo-english-test-ai-driven-language-assessment/.

5.5 Government
• Since Nov. 2014, the Singapore government has pursued a number of projects, many
of which are powered by AI, to transform Singapore into a digital-first Smart Nation
where technology is integrated seamlessly into the way people work, live, and play.
• Singpass, the National Digital Identity initiative, uses facial recognition as one of its
user authentication methods.
• Facial recognition is also used for immigration clearance at checkpoints.
• Many official procedures, e.g., invoicing and payment, are becoming paperless in Sin-
gapore. This digitalization makes data more accessible to AI systems.
• Chatbots are used to answer people’s queries and help people report municipal issues.
• Synapxe, the national HealthTech agency, formerly Integrated Health Information Sys-
tems (IHiS), aims to use AI and other technological solutions to improve healthcare
outcomes, e.g., using data analytics to detect early signs of chronic diseases, and AI
vision to check that patients take their medication correctly.
• The Smart Nation Sensor Platform uses sensors to collect essential data that can be
analysed to create smart solutions nationwide.
– It can help predict the consumption of public resources, and thus improve the
effectiveness of municipal services and save energy.
– With video analytics capabilities to automate the analysis of police camera footage,
police officers can respond faster to potential threats and follow up on incidents
of interest, thereby ensuring the safety and security of citizens.
• DataSpark, a subsidiary of SingTel Group, analyzes data from mobile phone networks
to track the movement of people. The resulting information can be used to
– coordinate traffic lights;
– show people in which direction to move in crowded areas and at big events to
maintain safety;
– optimize services;

– plan evacuation due to emergencies and terrorist attacks;
– discover choke points in urban planning; and
– understand foot traffic in shopping malls.
References: [1] Home Team Science & Technology Agency, Immigration & Checkpoints Authority. “Use Of
Iris And Facial Biometrics As The Primary Biometric Identifiers For Immigration Clearance At All Check-
points”. 28 Oct. 2023. https://www.ica.gov.sg/news-and-publications/newsroom/media-release/use-
of-iris-and-facial-biometrics-as-the-primary-biometric-identifiers-for-immigration-clearance
-at-all-checkpoints. Last accessed: 10 Feb. 2024. [2] Government Technology Agency. “VICA – Virtual
Intelligent Chat Assistant”. https://www.tech.gov.sg/products-and-services/vica/. Last accessed:
10 Feb. 2024. [3] IMDA. “Nationwide E-Invoicing Initiative”. https://www.imda.gov.sg/How-We-Can-Hel
p/nationwide-e-invoicing-framework. Last updated: 28 Jun. 2023. Last accessed: 10 Feb. 2024. [4] MAS.
“Singapore’s e-payment’s journey”. https://www.mas.gov.sg/-/media/MAS/Images/MAS-E-payment-Time
line-Infographic v7.pdf. Last accessed: 10 Feb. 2024. [5] Sherlyn Seah and Calvin Yang. “Singapore’s
health tech agency IHiS relaunches as Synapxe, taps artificial intelligence for better care”. CNA, 28 Jul. 2023.
https://www.channelnewsasia.com/singapore/singapore-health-tech-agency-ihis-relaunches-syn
apxe-taps-artificial-intelligence-ai-better-care-3661441. Last accessed: 10 Feb. 2024. [6] Data
Spark (@dataspark3321). “DataSpark Mobility Genome™– How it works video”. YouTube, 22 Aug. 2017.
https://youtu.be/iAi9jjZJkrs, 3 min 20 sec.

Listen to Prof. Yu reflect on the Lamppost-as-a-Platform trial, which was part of the
Smart Nation Sensor Network in Singapore.

2 min 56 sec

5.6 Green
• Jeppesen FliteDeck Advisor, a mobile application developed by Boeing, analyzes data
in real time to provide pilots with advisories for improving cruise fuel burn.
• AI algorithms are learning to predict flight delays, giving airports and airlines a better
shot at avoiding them.
• Analyses on real-time flight traffic updates produce an optimal speed, so that flights
can avoid the wasteful burning of fuel in holding patterns as they wait for their turn
to land.
• The company Verdigris analyzes sensor data to identify motor problems in buildings
that could be using excess energy and to verify energy efficiency upgrades.
• In 2016, DeepMind applied AI to analyze sensor data, reducing the amount of energy
Google’s data centres use for cooling by up to 40%.
References: [1] Boeing. “Jeppesen FliteDeck Advisor”. https://ww2.jeppesen.com/flight-and-fuel-opt
imization/flitedeck-advisor/. Last accessed: 10 Feb. 2024. [2] Eric Adams. “AI Wields the Power to
Make Flying Safer—and Maybe Even Pleasant”. Wired, 28 Mar. 2017. https://www.wired.com/2017/03/
ai-wields-power-make-flying-safer-maybe-even-pleasant/. Last accessed: 10 Feb. 2024. [3] Oliver
Wyman, Jérôme Bouchard, and Fabrice Villaumé. “New Technology May Help Airlines Cut Fuel Use And
Travel Time”. Forbes, 20 Jul. 2018. https://www.forbes.com/sites/oliverwyman/2018/07/20/new-
technology- may- help- airlines- cut- pricey- fuel- consumption- and- meet- environmental- regula
tions/?sh=58406f50f076. Last accessed: 10 Feb. 2024. [4] Verdigris. “Verdigris for Energy Efficiency”.
https://verdigris.co/energy- efficiency. Last accessed: 10 Feb. 2024. [5] Rich Evans and Jim
Gao. “DeepMind AI reduces energy used for cooling Google data centers by 40%”. The Keyword (Google).
https://blog.google/outreach-initiatives/environment/deepmind-ai-reduces-energy-used-for/.
Last accessed: 10 Feb. 2024.

5.7 Weather
• Through its large terrestrial camera network, the Helios platform uses AI vision to
detect the occurrence and impacts of weather in specific locations on critical ground
infrastructure (e.g., roads) that traditional weather sources typically struggle to identify.
• This makes available real-time, accurate local ground weather intelligence that is use-
ful in supporting weather forecasting, emergency response, vehicle safety, and other
weather-dependent decision making.
Source: NV5 Geospatial Solutions, Inc. “HELIOS® Ground Weather Analytics”. https://www.nv5geospat
ialsoftware.com/Products/Helios. Last accessed: 24 Jul. 2023.

5.8 Retail
• Facial recognition, speech analytics, and text analytics can be used to analyze the
sentiment, emotion, tone, and context behind customer behaviour. This helps identify
and predict customer patterns, and can thus improve product search.

• AI-powered data analytics can be used to create personalized product and service bun-
dle recommendations.
• AI object recognition algorithms can be used to automatically label images, say, in
online marketplaces.

• Chatbots can be used to automate communication with customers.


• The Magic Mirror, enabled by AR technologies, lets customers see themselves wearing
a chosen garment or accessory in a live video, thus producing a virtual dressing room
for brick-and-mortar as well as web stores.

Image source: Ikusuki, CC BY 2.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Virtual clothes trying %289507550174%29.jpg.

• Watch how AR can help enhance shopping experience in the following video.

1 min 35 sec
Video source: Dries Buytaert. “Shopping with augmented reality”. YouTube, 21 Jun. 2018. https:
//youtu.be/ZroFBG7-P7Q.

• A combination of AI vision and shelf-weight sensors allows the tracking of customers
and items as they move around a store, thus eliminating the need for cashiers. Cheers
has four such cashierless stores in Singapore as of Feb. 2024. Watch how they work in
the instructional video below.

Video source: NTUC FairPrice (@NTUCFairPriceSG). “Instructional Video for Cheers Unmanned
Store”. YouTube Shorts, 6 Oct. 2022. https://www.youtube.com/shorts/77Y6-w7i73A.

5.9 Information technology


• As computer infrastructures are getting more complex, it gets difficult (if not impossi-
ble) for humans to manage them manually. AI helps in the automation.
• We already saw in §2.10.2 how AI can be used to write code. In fact, AI can be used
to develop AI, as we will see in §6.5.
• In information technology operations, AI can analyze event logs and past data to pre-
dict and detect anomalies, diagnose issues, optimize operations, check regulatory con-
formance/compliance, etc. This area is known as AIOps.

• In cyber defence, AI can similarly use data analytics to detect attacks, identify dan-
gerous user behaviour, find system vulnerabilities, prioritize them, and facilitate fast
response.
• To keep up with AI-assisted cyberattackers, the adoption of AI in cyber defence becomes
more and more important.

• Listen to Prof. Yu talk about improvements in AI-powered cyber defence systems in


the video below.

1 min 30 sec

5.10 Finance
• RPA and chatbots can automate services and tasks, both at the frontend and at the
backend, e.g., to better engage customers and stakeholders, and to help build reputation
on social media platforms.

• We saw in §4.7 that DBS used the humanoid robot Pepper to guide customers.
• Combining language, vision, and data analytics capabilities, AI can automatically track
and extract useful information from a vast number of complex documents and other
sources, not only from transaction records and text, but also from charts, graphs,
drone/satellite images, blockchains, etc., online and offline.

– A blockchain is a kind of record (e.g., of transactions within a group of banks
under a special payment scheme) that can be maintained securely without a central
ledger and is available to all the parties involved.
– Blockchains can also contain other information that may be of interest to financial
institutions, e.g., identity information (via self-sovereign identity systems), own-
ership of digital assets (via non-fungible tokens (NFTs)), and Smart Contracts.
• By analyzing the information extracted, AI can
– detect anomalies, money laundering and fraud;
– make predictions for future performance, price trends and market trends (e.g., for
real estates, foreign exchange, equity, and commodity), so as to enable more in-
formed investment decisions;
– identify profitable assets;
– continuously assess the risks associated with a company by looking into geopolit-
ical factors, legal contracts, debts and liabilities, cybersecurity posture/maturity,
insider threats, technical architectures, international office structure, etc.;
– suggest causes of longer-term trends in asset prices;
– make recommendations, e.g., suggest a fair value for a merger or an acquisi-
tion based on patents, intellectual properties, employee sentiments, market ac-
ceptance/strength, monopoly power, etc.;

– fully automate investments that shift between different asset classes based on
changing market conditions and individual investment needs such as profit, risk
appetite, and liquidity aspects;
– adjust single client portfolios in real time to keep on track with clients’ selected
investment strategy; and
– ensure compliance.
• Currently, there are still challenges in
– providing more personalized and user-friendly financial advice, say, in the form of
videos, via a mobile app or the Internet;
– providing a natural-language customer interface with 3D displays;
– trading digital assets automatically;
– automating blockchain transactions and Smart Contracts;
– improving cybersecurity; and
– ensuring compliance with new legislation and industry standards, e.g., for the
protection of Critical Information Infrastructure (CII) and privacy.
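The tamper-evidence idea behind a blockchain can be sketched in a few lines: each block stores the hash of its predecessor, so altering any earlier record invalidates every later link. The transactions below are hypothetical, and real blockchains add digital signatures, consensus protocols, and much more.

```python
import hashlib
import json

def make_block(record, prev_hash):
    """Bundle a record with the hash of the previous block."""
    block = {"record": record, "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps((record, prev_hash), sort_keys=True).encode()
    ).hexdigest()
    return block

def chain_is_valid(chain):
    """Recompute every hash and check each link to its predecessor."""
    for i, block in enumerate(chain):
        expected = hashlib.sha256(
            json.dumps((block["record"], block["prev_hash"]),
                       sort_keys=True).encode()
        ).hexdigest()
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Hypothetical transactions between banks A, B, and C.
chain = [make_block("A pays B $10", "genesis")]
chain.append(make_block("B pays C $4", chain[-1]["hash"]))
print(chain_is_valid(chain))        # prints True
chain[0]["record"] = "A pays B $1"  # tampering with an old record...
print(chain_is_valid(chain))        # ...is detected: prints False
```

Because every party holds a copy of the chain and can run this check, no central ledger keeper is needed.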
Reference: Deloitte. “The expansion of Robo-Advisory in Wealth Management”. Aug. 2016. https://ww
w2.deloitte.com/content/dam/Deloitte/de/Documents/financial-services/Deloitte-Robo-safe.pdf.
Last accessed: 10 Feb. 2024.

Listen to Prof. Yu look into the future of banking, and reflect on some of the AI projects
pursued by the DBS bank.

4 min 6 sec

5.11 Insurance
• Chatbots and NLG technology can help promote products, automate application pro-
cesses, write documents, handle service requests, send payment reminders, register
claims, and provide updates on claims.
• Drones can help make risk surveys and site inspections faster and safer.
• RPA can help automate onboarding processes, policy renewal and invoicing.
• Data analytics can help suggest new products to be developed, match products and
customers, understand customers, recommend cross-selling options, assess risk, achieve
more accurate pricing, assess damage, and detect fraud.

5.12 Human resources


• Chatbots can help screen candidates and facilitate employee feedback.
• RPA can help manage reporting, tracking, and many other routine processes.

• Data analytics can help predict job-hopping and the job performance of new hires,
match applicants with job openings, select the best-fit employees for promotion, counter
fraud, manage compliance, and plan for job enhancement and training.

In Jun. 2018, DBS introduced an AI chatbot into its interview process.
Watch a demonstration below.
Watch a demonstration below.

2 min 36 sec
Video source: DBS. “Jim demo video”. YouTube, 17 Aug. 2018. https://youtu.be/WgmpLL5QvB8.

5.13 Reflection
• We saw how AI can and will transform and disrupt many sectors.
• It is likely that all sectors will be affected.

• There are some common themes in the transformations and the disruptions, e.g., pre-
diction and personalization.
• How is AI transforming/disrupting the sectors you would like to enter?
• How can you use AI to bring innovation to the sectors you would like to enter?

• Did the development of AI affect your career choices? If yes, then how?

HS1501 §6

Technical background

We have seen how powerful and how useful modern AI is. In this part, we study the key
factors behind the success of current AI. This will help us better understand the nature of
this technology. In the process, we will also explain some common terminology that one
would likely run into when reading about AI now.
• What are the key factors that enable AI to do what seemed impossible in the past?
– artificial neural networks
– hardware acceleration
– Big Data
• What are the key factors that enable AI to be so widely available today?
– open-source code
– low-code development
– cloud computing
– edge computing
At the end, we will demonstrate these using the transformer model, which is one of the most
influential AI models nowadays.

6.1 Artificial neural networks

Image source: Lollixzc, CC BY-SA 4.0, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:AI_hierarchy.svg.

6.1.1 The challenge
• In the traditional setting, some human programmers must give a computer each and
every instruction precisely to get it to perform a task.
• In such a setting, the complexity of the problems a computer can solve is limited by
the complexity of the instructions humans can comprehend precisely.
– For example, it is humanly impossible to describe precise rules which when followed
would allow a computer to tell correctly whether an arbitrary input is an image
of noodles or not, be it at the left or the right of the image, made from rice or
wheat, raw or cooked, in a soup or stir-fried, with a poached egg or wontons on
top, served in a hawker centre or in a restaurant, while excluding french fries and
beansprouts.

Image sources: [1] Tekkyy (english Wiki), Public domain, via Wikimedia Commons. https://commons.
wikimedia.org/wiki/File:Wanton_noodles.jpg. [2] Alpha, CC BY-SA 2.0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Curry_Laksa_-_Laksa_King_(2597729514).jpg. [3] Ocdp, CC0,
via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Nissin_Chicken_Ramen_002.jpg.
[4] N509FZ, CC BY-SA 4.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:
Gon_caau_ngau_ho_(20150222171214).JPG. [5] https://m.facebook.com/323068641198750/posts/
1202981213207484/. [6] Photo by CEphoto, Uwe Aranas, https://commons.wikimedia.org/wiki/File:
Dalian_Liaoning_China_Noodlemaker-01.jpg. [7] Popo le Chien, CC BY-SA 3.0, via Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:Rice_vermicelli.jpg. [8] Popo le Chien, CC BY-SA 3.0,
via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Fries_2.jpg. [9] cyclonebill, CC
BY-SA 2.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Stegte_gr%C3%B8ntsager_(6290846922).jpg.

6.1.2 Machine learning


• These days, one popular solution is to use a (generally simpler) algorithm to find, within
a specific class of (generally more complicated) programs, one that performs the given
task well.
– This approach is known as machine learning.
– The program to be found is called a model (the term we met in §1.1).
– The algorithmic process of finding a model is called training (from the point of
view of the machine learning algorithm) or learning (from the point of view of the
model).

– In machine learning, the specific class of programs from which a suitable model is
to be found has to be carefully chosen: if it is too small, then it may not contain a
program that can perform the given task well enough; if it is too big, then it may
be too hard to find a suitable program in it.
• Traditionally, there are three basic machine learning paradigms: supervised learning,
unsupervised learning, and reinforcement learning.
• In supervised learning, training is done using labelled examples, meaning that each
data point possesses an associated label to be learnt by the model: the training is then
essentially a process of finding a model that can produce labels sufficiently similar to
the given ones.

– For example, in object recognition, the training data can be a large number of
images, each of which is labelled “noodles” or “non-noodles”. The model then
learns a way to reproduce the labels by looking at the images without seeing the
labels.

• In unsupervised learning, training is done by identifying patterns in unlabelled data.


– For example, in data analytics, the model can use the transactions on an online
marketplace as training data to identify shopping patterns, according to which
users can be categorized.
• In reinforcement learning, a model learns an action policy through trial and error using
a reward/punishment system.
– For example, DeepMind’s virtual robot in §4.3 learnt to walk using forward progress
as reward.
• In all cases, a trained model is supposed to generalize, in the sense that it performs
well even on inputs that it has not seen during training.
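To make the supervised-learning picture concrete, here is a deliberately tiny sketch of one of the simplest supervised learners, a 1-nearest-neighbour classifier; the two-number feature vectors and the labels below are made up purely for illustration:

```python
# Training data: each example pairs a (made-up) feature vector with a label.
train = [
    ((0.9, 1.1), "noodles"),
    ((1.2, 0.8), "noodles"),
    ((5.0, 4.8), "non-noodles"),
    ((4.7, 5.2), "non-noodles"),
]

def predict(x):
    # Output the label of the closest training example.
    # A good model should generalize to inputs it has never seen.
    def dist2(example):
        features, _ = example
        return sum((a - b) ** 2 for a, b in zip(features, x))
    return min(train, key=dist2)[1]

print(predict((1.0, 1.0)))  # an unseen input close to the "noodles" examples
```

Training here is trivial (the model just stores the examples), but the shape of the task is the same as in real supervised learning: produce labels sufficiently similar to the given ones, even on unseen inputs.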

6.1.3 Neural networks

Image source: BrunelloN, CC BY-SA 4.0, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Example_of_a_deep_neural_network.png.

• Many AI models nowadays are based on deep neural networks.

• (Artificial) neural networks are a particular computational architecture inspired by
neural circuits in brains.
• A neural network typically has an input layer (e.g., containing the pixels in an image,
digitized as numbers), one or more hidden layers, and an output layer (e.g., indicating
whether the input is an image of noodles).

• A layer in a neural network is composed of nodes that are often referred to as neurons.
• Each neuron stores a number, which, except for the neurons in the input layer, is
calculated using a weighted sum of the numbers stored in the neurons in the previous
layer following the structure of the network.

• An activation function then determines what value the neuron passes on to the next
layer; a common choice is the Rectified Linear Unit (ReLU), which lets positive values
through unchanged and outputs zero otherwise.
• Before passing the weighted sum to the activation function at a neuron, one sometimes
adds a fixed number called a bias to the weighted sum to adjust the activation threshold.

• The weights and the biases are the parameters of the model; they do not vary with
the input.
• By varying these parameters, one gets a whole class of different programs of the same
neural network structure that have varying behaviours.

• A suitable set of parameters at each neuron is required for the neural network to give
the desired outputs.
• To train a neural network, one tunes the parameters to look for a program that performs
the desired task sufficiently well.

• The training is typically done by repeatedly running labelled examples (e.g., images that
are known to show noodles or non-noodles) through the neural network: by comparing
the outputs and the labels, one revises the parameters to decrease the error.
• One advantage of the neural network architecture is the availability of well-tested al-
gorithms to train it, e.g., Gradient Descent with back-propagation.

• In general, more complicated tasks require bigger neural networks in terms of the
number of parameters, but bigger neural networks take more time and more energy to
train and run.
• A family of mathematical results known as universal approximation theorems shows
that neural networks, when appropriately structured and trained, can in theory perform
any task on a digital computer sufficiently well.
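The computation at a single neuron described above can be sketched in a few lines of Python (a toy illustration; the inputs, weights, and bias are made-up numbers):

```python
def relu(z):
    # ReLU activation: positive values pass through; negatives become zero.
    return max(0.0, z)

def neuron(inputs, weights, bias):
    # Weighted sum of the previous layer's values, plus a bias,
    # followed by the activation function.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(z)

# Varying the parameters (weights and bias) changes the behaviour:
print(neuron([1.0, 2.0], weights=[0.5, -0.25], bias=0.125))  # 0.5 - 0.5 + 0.125 = 0.125
print(neuron([1.0, 2.0], weights=[-1.0, 0.25], bias=0.0))    # pre-activation -0.5; ReLU gives 0.0
```

Training amounts to adjusting `weights` and `bias` (across all neurons at once) until the network's outputs match the labels well enough.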

6.1.4 Deep neural networks

[Diagram: two neural networks, both mapping inputs to outputs — a shallower one with
fewer hidden layers, and a deeper one with more.]

• Deep neural networks are neural networks with multiple hidden layers.
• Deep learning is machine learning using deep neural networks.
• Deeper neural networks can perform more tasks, and are observed to generalize better,
than shallower neural networks with the same number of parameters.

• Since around 2009, deep learning has made major advances in solving problems that
had resisted the best attempts of the AI community for many years, e.g., in recogniz-
ing images and speech, predicting the activity of drug molecules, reconstructing brain
circuits, and understanding natural language.
References: [1] Yoshua Bengio. “Learning Deep Architectures for AI”. Foundations and Trends® in Machine
Learning, vol. 2, no. 1, pp. 1–127, 2009. [2] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. “Deep
learning”. Nature, vol. 521, pp. 436–444, 2015. [3] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton.
“ImageNet Classification with Deep Convolutional Neural Networks”. Communications of the ACM, vol. 60,
no. 6, Jun. 2017, pp. 84–90.

6.2 Hardware acceleration
• Training AI models for real-world applications typically requires massive amounts of
computation that would take impractically long to execute without specialized hardware.
• Central processing units (CPUs) of modern computers are designed to perform a small
number of general-purpose tasks at a time, very fast.
• Exploiting the similarities with calculations in processing graphics, graphics process-
ing units (GPUs) help speed up the training and the running of neural networks by
performing a large number of simple arithmetic operations in parallel.

Image source: ZMASLO, CC BY 3.0, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:NVIDIA_RTX_4090_Founders_Edition_-_Verpackung_(ZMASLO).png.

NVIDIA® GeForce RTX™ 4090 GPU, 16384 cores, 24 GB memory, Boost Clock 2.52 GHz,
starting at SG$2700 as of Feb. 2024.
• It was in the mid-2000s that GPUs started to become available for non-graphics use. It
took only a few years for researchers to start using GPUs to train neural networks, with
reported speed-ups ranging from 5- to 70-fold. Nowadays, GPUs are a popular hardware
accelerator for machine learning.
• There is now also hardware developed specifically for AI computations, e.g., Google’s
tensor processing units (TPUs).
Reference: Rajat Raina, Anand Madhavan, and Andrew Y. Ng. “Large-scale deep unsupervised learning
using graphics processors”. In Proceedings of the 26th Annual International Conference on Machine Learning
(ICML ’09), Association for Computing Machinery, New York, NY, USA, pp. 873–880, 2009.

6.3 Big Data


• We saw in §6.1.2 above that data are needed in training AI models.
• The growth of the Internet and the Internet of Things (IoT) has made more and more
data readily available for different kinds of training and analytics.
– The IoT refers to infrastructures where multiple physical devices, typically with
sensors or interactive components, exchange data with one another over the In-
ternet or another kind of communication network.
– Often, this exchange of data does not require human intervention.
– Example 1: in a smart car, tire pressure sensors send measurements to the car
dashboard, so that the driver can continuously monitor tire condition without
leaving the car.
– Example 2: a pacemaker can send heart activity to the user’s mobile phone.
– IoT systems bring data from the physical world into the digital world.

– IoT expanded quickly in recent years because of decreasing hardware costs, de-
creasing cost of digital communication, and increasing device proliferation, amongst
other reasons.
• Digitalization makes other types of data, e.g., business records and customer feedback,
more easily available for training and analytics too.
• All these are Big Data as defined in §2.4, i.e., they are extensive data sets that are too
large to be analyzed using traditional methods.
• Using the Big Data to train AI models is actually a way of extracting information from
the data.
• Big Data’s huge volume and variety are important in training AI models that perform
well.
• The high velocity at which Big Data are generated provides up-to-date data for training
AI, while AI provides a means to process Big Data at high velocity.
Reference: Malika Bakdi and Wassila Chadli. “Big Data: An Overview”. In Soraya Sedkaoui, Mounia
Khelfaoui, Nadjat Kadi, eds., Big Data Analytics, pp. 3–13. Apple Academic Press, 2022.

6.4 Open-source code

Image source: Open Source Initiative, “Logo Usage Guidelines”, 5 May 2023. https://opensource.org/logo-usage-guidelines/.

• AI researchers often make their research findings, code and data sets freely available,
e.g., on arXiv, GitHub, Papers With Code, and Hugging Face.
• This makes it easy and fast for people to build on others’ work, and modify others’
code for their own needs.
• Two popular software libraries/frameworks for programming AI algorithms are TensorFlow
(developed by Google Brain) and PyTorch (originally developed by Facebook AI (now
Meta AI), now part of the Linux Foundation).
• Both are open source, meaning in particular that the source code is widely and freely
available, and it may be redistributed and modified freely.
• Both allow computation on one or more CPUs and GPUs.
• Both can be used with Python, which is also open source, and is arguably the top
programming language for AI applications as of today.
Reference: Ian Pointer. “6 best programming languages for AI development”. InfoWorld, 20 Nov. 2019.
https://www.infoworld.com/article/3186599/. Last accessed: 17 Feb. 2024.

• The Open Source Initiative is working towards a definition of open-source AI that
takes into account the special nature of AI. A first version is scheduled to be released
in Oct. 2024.
Reference: Stefano Maffulli. “Open Source AI Definition: Where it stands and what’s ahead”. Voices
of Open Source, 7 Feb. 2024. https://blog.opensource.org/open-source-ai-definition-where-it-stands-and-whats-ahead/. Last accessed: 17 Feb. 2024.

6.5 Low-code development
• With TensorFlow and PyTorch, simple AI algorithms can be programmed with less
than ten lines of code.
Reference: Google Developers. “Hello World – Machine Learning Recipes #1”. YouTube, 31 Mar. 2016.
https://youtu.be/cKxRvEZd3Mw, 6 min 52 sec.
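The video above uses the scikit-learn library rather than TensorFlow or PyTorch; a sketch in its spirit (the fruit measurements and labels below are invented for illustration) trains a working classifier in about six lines:

```python
from sklearn import tree

# Each fruit is [weight in grams, texture: 0 = bumpy, 1 = smooth];
# the labels are 0 = apple, 1 = orange.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
labels = [0, 0, 1, 1]

clf = tree.DecisionTreeClassifier().fit(features, labels)
print(clf.predict([[160, 0]]))  # classify an unseen heavy, bumpy fruit
```

The point is not this particular classifier, but how little code the libraries demand: the data, a one-line training call, and a one-line prediction.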

• AI models for specific tasks, e.g., image, sound, and pose recognition, can be built with
a no-code graphical interface, from data collection and training to evaluation. Watch
how this is done with Google’s Teachable Machine in the video below.

2 min 8 sec
Video source: Google. “Teachable Machine 2.0: Making AI easier for everyone”. YouTube, 8 Nov. 2019.
https://youtu.be/T2qQGqZxkD0.

• There are platforms with graphical interfaces that automate the process of comparing
and implementing AI algorithms for specific applications. One example is DataRobot.
See what it is like in the video below.

1 min 31 sec
Video source: DataRobot. “DataRobot AI Platform [2018 Version - Update Available]”. YouTube,
16 Apr. 2018. https://youtu.be/RrbJLm6atwc.

• AI can now automate even the selection of AI algorithms and the pre-processing of
data. This capability is called automated machine learning (AutoML). Watch Google
CEO Sundar Pichai talk about it when they first made it possible in 2017.

1 min 11 sec
Video source: Elrashid Media : Tech-meetups-Startups-Hackathons (@elrashidmediatech-meetups-2861).
“Google #IO17 | Keynote | AutoML”. YouTube, 18 May 2017. https://youtu.be/92-DoDjCdsY.

6.6 Cloud computing
• The cloud refers to a distributed network of servers, accessible via the Internet, that
virtually delivers services such as software, hardware, and data storage.
• Installing and maintaining the hardware required to run complex AI on a commer-
cial scale can be prohibitively expensive, especially for Small and Medium Enterprises
(SMEs).

• Cloud computing services provide smaller companies with an affordable option to equip
themselves with powerful AI capabilities that drive their products.
• Here are the characteristics of cloud computing.
– On-demand self-service: the resources are available whenever the user wants
them.
– Broad network access: the resources are available through the Internet on
common consumer devices.
– Resource pooling: the resources are deployed to serve multiple users.
– Rapid elasticity: users can choose the amount and the type of resources they
get dynamically.
– Measured service: the user pays according to the amount of resources s/he
used.
– The service provider is responsible for the set-up, the management, the mainte-
nance, and the security of the software and the hardware resources, not the user.
Reference: Peter Mell and Timothy Grance. “The NIST Definition of Cloud Computing”. National
Institute of Standards and Technology, NIST Special Publication 800-145, Sep. 2011. https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-145.pdf.

• Watch how Amazon Web Service (AWS)’s cloud computing service can help businesses
in the video below.

3 min 1 sec
Video source: Amazon Web Services. “Introduction to AWS Lambda — Serverless Compute on Ama-
zon Web Services”. YouTube, 20 May 2015. https://youtu.be/eOBq_h4OJ4.

• AWS, Google Cloud, Microsoft’s cloud computing platform Azure, and Alibaba Cloud
all have services specific to AI applications.
• Google’s Colaboratory (Colab) allows users to run Python code with GPUs online, free
of charge.

6.7 Edge computing
• Edge computing refers to the idea of storing data and performing computations on
them at the edge, i.e., at or close to the data sources and the users, e.g., at a sensor, a
mobile phone, or more generally an IoT device.
• With the advance of technology, computing devices are getting smaller, cheaper, more
powerful, more power-efficient, and more flexible physically, to the extent that some
IoT devices are now able to run AI models locally.
• This creates a so-called Artificial Intelligence of Things (AIoT) system.
• Example 1 cont’d: an AI model learns what the optimal tire pressure to maintain is,
given the car model, the tire model, and the terrain frequently driven on, and makes
timely suggestions to the driver on when to pump the tires and to what pressure.
• Example 2 cont’d: the pacemaker monitors for irregular heart activity, warns the user
about them, and can send out an automated distress call via the mobile phone during
a heart attack.
• A number of small, power-efficient single-board computers can be used to deploy AI in
IoT devices, for example:
– Raspberry Pi (model 4B, from SG$63 as of Feb. 2024, 85 mm × 56 mm × 17 mm,
Power over Ethernet, 4-core CPU at 1.5GHz, from 1GB memory, originally de-
signed for educational use);

Image source: Michael H. („Laserlicht“), CC BY-SA 4.0, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:Raspberry_Pi_4_Model_B_-_Side.jpg.

– NVIDIA® Jetson Nano™ (Developer Kit, about SG$173 as of Feb. 2024, 100 mm×
80 mm × 29 mm, 5–10W power, 128-core GPU, 4-core CPU at 1.43GHz, 4GB
memory, designed for AI applications).

Image source: SparkFun Electronics, via Wikimedia Commons, https://commons.wikimedia.org/wiki/File:NVIDIA_Jetson_Nano_Developer_Kit_%2847616885631%29.jpg.

• There are also light-weight, low-power hardware accelerators that are suitable for edge
devices, for example:

– The Intel® Neural Compute Stick 2 (about SG$250 as of Feb. 2024, 72.5 mm ×
27 mm × 14mm, plug and play via USB) contains a hardware accelerator for AI
vision applications.
– One can add a TPU to a device via a USB port using Google Coral’s USB Accel-
erator (about SG$80 as of Feb. 2024, 65 mm × 30 mm × 8 mm, capable of performing
4 trillion operations per second using 2 W).
• Some smartphones nowadays are equipped with (co-)processors that are designed with
AI applications in mind: for example, the iPhone 15 has a 5-core GPU and a 16-core
Neural Engine, the HUAWEI P60 Pro has an Adreno GPU and a Qualcomm AI Engine,
and Google’s Pixel 8 has the Google Tensor G3.
References: [1] https://www.apple.com/sg/iphone-15/specs/, last accessed: 17 Feb. 2024.
[2] https://consumer.huawei.com/sg/phones/p60-pro/specs/, last accessed: 17 Feb. 2024.
[3] https://store.google.com/product/pixel_8, last accessed: 17 Feb. 2024.

• One can make trained AI models programmed in TensorFlow or PyTorch run on mobile
devices and on the web, for example:
– TensorFlow Lite enables one to run and retrain AI models written in TensorFlow
on mobile, microcontrollers and other edge devices.
• Fifth-generation (5G) cellular networks enable IoT and AIoT devices to communicate
with one another much faster, which is extremely important, for example, in autonomous
vehicles and remote surgeries.
• Here are some advantages of AIoT systems.
– The incorporation of AI allows IoT devices to perform a wider range of functions.
– Not having to send the data over to the cloud for processing can improve the speed
of the operation and save network bandwidth.
– Having the data stored and processed at the edge avoids the privacy and the
security issues of sending the data to the cloud for processing.
– There is sometimes no network connection, e.g., for drones flying in remote areas
or for robots working underground, in which case the AI must run on the device
itself.
• Here are some issues associated with the use of AIoT systems.
– The sharing of data, e.g., health data, amongst edge devices raises privacy and
security issues.
– As these systems are typically connected to the Internet, such data may even be
sent to the cloud without the user knowing.
– The reliance on AI in decision making increases the severity of malicious attacks.

Listen to Prof. Yu talk about the potentials of AIoT systems.

2 min 9 sec

6.8 Example: Transformer
• In 2017, Google Brain introduced the Transformer language model, which was shown
to outperform a number of other AI NLP models that had been in use.
• Many powerful chatbots mentioned in §2.9, e.g., OpenAI’s ChatGPT, Google’s Bard,
Meta’s LLaMA, and reportedly Baidu’s ERNIE bot, are based on Transformer models.
• Transformer models use neural networks.
• A special feature of Transformer models is that, when processing a word, the surround-
ing words are directly involved as well.
• Transformer models are generally pretrained only for simple tasks like predicting a
masked word or the next word/sentence in a given piece of text.
• This pretraining is self-supervised, in the sense that the training data come unlabelled
like in unsupervised learning, but labels are extracted from the data, which the model
then learns like in supervised learning.
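To make self-supervision concrete, here is a plain-Python sketch (not an actual transformer pipeline) of how next-word “labels” can be extracted automatically from unlabelled text:

```python
def next_word_examples(text):
    # The raw text arrives unlabelled; each (context, next word) pair
    # becomes a training example whose label comes from the data itself.
    tokens = text.split()
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

# Six tokens of raw text yield five labelled training examples:
for context, label in next_word_examples("the cat sat on the mat"):
    print(context, "->", label)
```

Real pretraining works the same way at vastly larger scale, with subword tokens instead of whitespace-split words, which is why no human labelling of the training corpus is needed.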
• Via transfer learning, the pretrained models can then be trained with smaller datasets
for more specific tasks, e.g., text categorization, named entity recognition, rudimentary
reading comprehension, question answering, summarization, and translations (between
natural languages and between programming languages).
• As a result, these pretrained models are sometimes called foundation models.
• Transformer-based large language models are mostly hosted on the cloud.
• Meta’s LLaMA is open source, but OpenAI’s ChatGPT is not, as of Feb. 2024.
• As of Feb. 2024, the basic version of ChatGPT is trained from the GPT-3.5 family of
transformer models, where the acronym GPT stands for “generative pretrained trans-
former”.
• More technical information is publicly available about GPT-3, a predecessor of GPT-
3.5.
– The GPT-3 model has 175 billion parameters and is 96 layers deep.
– The training data for GPT-3 consist of over 300 billion words (or, more precisely,
tokens).
– It is estimated that the training for GPT-3 would have taken 34 days if 1024 NVIDIA
A100 Tensor Core GPUs were used.
• Transformer models find applications also in protein structure prediction, speech recog-
nition, image classification, and video classification.
• A similar training method gives the popular diffusion model for images, which is trained
to remove noise added into images.
– The text-to-image programs DALL·E and Stable Diffusion mentioned in §3.6 are
both based on diffusion models.
References: [1] Jakob Uszkoreit. “Transformer: A Novel Neural Network Architecture for Language Under-
standing”. Google Research, 31 Aug. 2017. https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html. Last accessed: 17 Feb. 2024. [2] Craig S. Smith. “Battle Of The Bots: China’s ChatGPT
Comes Out Swinging To Challenge OpenAI”. Forbes, 24 Mar. 2023. https://www.forbes.com/sites/craigsmith/2023/03/24/battle-of-the-bots-baidus-ernie-comes-out-swinging-to-challenge-openai/. Last
accessed: 17 Feb. 2024. [3] Rick Merritt. “What Is a Transformer Model?”. NVIDIA blog, 25 Mar. 2022.
https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/. Last accessed: 17 Feb. 2024.
[4] Tom Brown, et al. “Language models are few-shot learners”. Advances in Neural Information Processing
Systems, vol. 33, pp. 1877–1901, 2020. [5] Deepak Narayanan, et al. “Efficient Large-Scale Language Model
Training on GPU Clusters Using Megatron-LM”. SC ’21: Proceedings of the International Conference for
High Performance Computing, Networking, Storage and Analysis, art. no. 58, pp. 1–15, Nov. 2021.

6.9 Reflection
• We saw a greatly simplified picture of how AI works under the hood, and the key
resources it relies on.
• Are you more interested in being a user or a developer of AI, or both?
• What would you most want to achieve using Google’s Teachable Machine we saw
in §6.5?
• Do you think current AI qualifies as being intelligent? Why?

HS1501 §7

Challenges and issues

We saw in §5 and §6 respectively what current AI technology can do to help people and
roughly how it works. In this part, we will look at a few issues that it brings, and some
current challenges in the adoption of AI.
• abuse: one can use AI for bad purposes, e.g., cyberattacks and political manipulation
• malfunction: AI may fail in various ways (e.g., giving wrong or biased outputs) for
various reasons (e.g., related to the training process or the training data)
• security: people may attack an AI to affect its performance or to steal data
• explainability: it has been hard to describe in a human-understandable way why an
AI gives a certain output
• privacy: AI enables and requires one to collect, process, and keep track of a huge
amount of personal data extensively; to be discussed in §9.1 when we look into ethics
• data scarcity: high-quality data may not be available for training

7.1 Abuse
We saw in §5 many ways in which one can use AI to benefit people. The same technology is
capable of causing harm to people too when used with ill intent. The power of AI makes the
resulting harm more severe and harder to avoid. Here are a few examples of how AI may be
abused.
• Deepfakes and natural language generation AI can be used to spread misinformation
and to manipulate public opinion. In §1.3.3, we gave an example in the war between
Russia and Ukraine.
• Deepfakes and natural language generation AI can also be used in impersonation, scams,
and social engineering attacks.
• AI robotics can be used to automate physical and cyber weapons. We will discuss more
about these in §9.2 when we look into ethics.
• Cyberattackers can use AI to help them in many ways.
– As we saw in §3.1, AI can break CAPTCHA using its vision capabilities.
– We saw in §2.10.2 that AI can write simple computer code. In particular, it can
help generate new malware.
– By analyzing human patterns, AI can act like humans to evade some network
defences.

– AI can discover network vulnerabilities by learning from the networks it has seen
before.
– Etc.
As demonstrated by the Cyber Grand Challenge organized in 2016 by the Defense
Advanced Research Projects Agency (DARPA) of the United States Department of
Defense, the associated technologies are already mature enough for real cyberwarfare,
in which computer systems automatically locate, exploit, and patch vulnerabilities.
Source: [1] Defense Advanced Research Projects Agency. “Cyber Grand Challenge (CGC) (Archived)”.
https://www.darpa.mil/program/cyber-grand-challenge. Last accessed: 3 Mar. 2024. [2] DARPAtv.
“DARPA Cyber Grand Challenge: Visualization Overview”. YouTube, 22 Jul. 2016. https://youtu.be/LEfejsqEucY, 2 min 21 sec.

7.2 Malfunction
AI sometimes makes mistakes. The mistakes can range from innocent to fatal. These can
be due to unexpected scenarios, low-quality training data, or poor engineering/programming
choices, amongst other reasons. Let us look at each of these causes one by one, and discuss
some good practices in preventing and handling failures.

7.2.1 Unexpected scenarios


We saw in §6.1.2 that AI learns from the data provided to train it. If the training data do
not cover a scenario that an AI encounters, then the AI may respond unpredictably. Here
are two examples.

• As described in §2.1, large language models may produce confident responses that do
not seem justified by the data used to train them, presumably when the training data
do not provide (enough) information on what they are asked to produce.
• In §4.8, Prof. Yu talked about a drone that failed to land safely on tall grass because
there was no tall grass in its training.
Such problems originate from the AI developers not being able to anticipate all the circum-
stances that the AI would run into and all the consequences that the AI outputs would entail.
We will discuss this more in §9.4.

7.2.2 Low-quality training data


Even when the scenarios are already anticipated by the developers, the training data used
may still be not fitted or not representative enough for the purposes. In this case, the bias
present in the training data leads to biased results. Here are two examples.

• An AI was used to assess which pneumonia patients have high risks. It was mostly
accurate, but erroneously classified patients with a history of asthma as low-risk. In
reality, such patients have higher rates of survival only because they were directly sent
to intensive care. This mistake was caused by the use of data that are not fitted for
the purpose.

• In 2015, a user reported that the Google Photos app misclassified two dark-skinned
people as “gorillas”, which echoes racist tropes. Google apologized for the incident.
Reportedly, as of 2023, Google Photos still does not classify any (gorilla or not) photo
as “gorillas” unless the word itself appears in the photo. One potential reason for the
incident is that the training data used did not contain enough photos of dark-skinned
people.

References: [1] Rich Caruana, et al. “Intelligible Models for HealthCare: Predicting Pneumonia Risk and
Hospital 30-day Readmission”. In Proceedings of the 21st ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD ’15). Association for Computing Machinery, New York, NY,
USA, pp. 1721–1730, 2015. [2] Anonymous. “(2015-06-03) Incident Number 16”. In S. McGregor, ed.,
Artificial Intelligence Incident Database. Responsible AI Collaborative, https://incidentdatabase.ai/cite/16. Last accessed: 3 Mar. 2024. [3] Nico Grant and Kashmir Hill. “Google’s Photo App Still Can’t Find
Gorillas. And Neither Can Apple’s”. The New York Times, 22 May 2023. https://www.nytimes.com/2023/05/22/technology/ai-photo-labels-google-apple.html. Last accessed: 3 Mar. 2024.

7.2.3 Poor engineering/programming choices


Examples of poor engineering choices include the use of wrong or insufficient types of sensors
or data. We saw in §1.3.2 a fatal accident in which AI drove a car into a truck whose colour
was similar to that of the sky. Had lidar been used in addition to the camera, the accident
would likely not have happened.
One common consequence of poor programming choices is overfitting, in which the AI
model learns the specifics of the training data instead of patterns that are generalizable to
unseen data.

One possible reason for overfitting is that the AI models used are too complex for the data
involved. Another possible reason is that the model is trained too much for the amount of
training data used.
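An extreme caricature of overfitting is a “model” that simply memorizes its training data (a toy sketch, not a real learning algorithm): it is perfect on inputs it has seen, and useless on everything else.

```python
# Training data the "model" has seen: (made-up) inputs and labels.
training_data = {
    (140, 1): "noodles",
    (170, 0): "non-noodles",
}

def memorizing_model(x):
    # Perfect recall on the training set, but no pattern learnt,
    # so even a nearly identical unseen input gets no sensible answer.
    return training_data.get(x, "unknown")

print(memorizing_model((140, 1)))  # seen during training: answers correctly
print(memorizing_model((141, 1)))  # unseen: fails to generalize
```

Real overfitted models fail less starkly, but in the same direction: they fit the quirks of the training data rather than patterns that carry over to unseen data.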
In reinforcement learning, a poor choice of a reward/punishment system may lead to the
AI behaving in undesirable ways. We will see more of this in §9.4.

7.2.4 Preventing and handling failures


Here are some ways to reduce the chances or the severity of an AI (and more generally a
computer system) making mistakes.

• human-in-the-loop: include humans to look over the system and to provide advice
when needed
• think failure: expect that systems will fail, and some unlikely event with huge impact
will happen; design safeguards and contingency plans accordingly

• backup plans: have different systems back up one another


• minimization of dependencies: where possible, keep some parts of the system running
even when others fail
• fail fast: detect problems early in the development cycle, e.g., by implementing system
and administrative procedures for faster reporting, and by carrying out testing alongside
development
• Gall’s law: do not build complex systems from scratch; instead, build them from
simpler systems that work
Know when and how to escalate issues quickly. When systems fail, stay calm, act quickly,
identify the root cause of the failure, remediate, and contain the damage, e.g., by reconfiguring
one system to fulfil another’s role. Learn from past incidents.
Listen to Prof. Yu talk about how Netflix improves the resilience of its video streaming service
in the video below.

4 min 26 sec

7.3 Security
People or agencies may attack an AI, e.g., to steal, modify, or destroy data, or to prevent the
system from functioning properly. Such attacks may be performed by insiders (e.g., laid-off
or disgruntled employees) or by state-funded, high-end espionage operations. They may target
individuals, companies, or critical information infrastructures (CIIs) such as hospitals, railway
systems, payment systems, power plants and networks. They can cause substantial financial
loss, disruption, and damage to reputation.
We will talk about three kinds of attacks on AI and discuss how to defend against such
attacks.

7.3.1 Data poisoning


Data poisoning refers to a kind of attack in which the training data are manipulated to affect
the behaviour of an AI negatively.
One such incident happened in 2016 to Tay, a chatbot developed by Microsoft to interact
with users on Twitter (now X) for entertainment purposes. Soon after its release, it started
making lewd and racist comments. Microsoft claimed that this was due to a “coordinated
attack” on Tay, which reportedly was programmed to go along with the tweets that it
read. In the end, Microsoft had to take Tay offline within 16 hours of its release.

References: [1] Amy Craft. “Microsoft shuts down AI chatbot after it turned into a Nazi”. CBS News,
25 Mar. 2016. https://www.cbsnews.com/news/microsoft-shuts-down-ai-chatbot-after-it-turned-into-racist-nazi/.
Last accessed: 3 Mar. 2024. [2] Peter Lee. “Learning from Tay’s introduction”. Official
Microsoft Blog, 25 Mar. 2016. https://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/.
Last accessed: 3 Mar. 2024.

7.3.2 Evasion
Sometimes it is possible to specially design an input, called an adversarial example, that
can trick an AI into producing wrong outputs. Some adversarial examples even look normal or
innocent to human eyes. Watch Prof. Yu present a few examples in the video below.

3 min 58 sec
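To make the idea concrete, here is a hedged toy sketch (the classifier, its weights, and the labels are all invented): for a simple linear classifier, shifting every feature slightly against the sign of its weight (the intuition behind the fast gradient sign method) flips the decision, even though no single feature changes by more than a small amount.

```python
# Toy evasion attack on a hand-made linear classifier.

w = [0.4, -0.7, 0.2, 0.9]        # classifier weights (invented)
b = -0.1

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x):
    return "cat" if score(x) > 0 else "dog"

x = [0.5, 0.1, 0.6, 0.4]         # an input the model labels "cat"

# Adversarial example: move each feature by a small eps against the
# sign of its weight (for a linear model, the gradient of the score).
eps = 0.25
x_adv = [xi - eps * (1 if wi > 0 else -1) for xi, wi in zip(x, w)]
```

Each feature moves by only 0.25, yet the classification flips, which is why such perturbed inputs can look innocent to human eyes.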

7.3.3 Inference attacks


• Inference attack refers to a kind of attack in which the attacker infers, from a system,
information that is not supposed to be accessible.
• One may be able to infer information about the training data (which may contain per-
sonal information) by interacting with an AI model (which is often publicly accessible).
– For a fictitious example, consider a research group that develops and makes avail-
able an AI model that recommends treatment for AIDS depending on age, weight,
medical history, race, etc.
– For training this AI model, this research group uses data from the country’s central
healthcare system.
– By suitably querying this AI model, an inference attacker may use public infor-
mation about a public figure, say, a politician or a celebrity, to infer whether
this public figure is in the training data, and thus whether this public figure has
been diagnosed with AIDS.
– This piece of information about the public figure may not be one that is intended
to be public.
• Such attacks are possible because AI models can remember very specific information
about their training data.
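One common form of this, membership inference, exploits exactly that memorization: models are often more confident on records they were trained on. The sketch below is a hedged toy illustration (every name and number is invented); an attacker who can only query the model thresholds the returned confidence to guess membership.

```python
# Toy membership inference sketch. The "observed" confidences stand in
# for what an attacker imagines seeing when querying a deployed model;
# no real model or data is involved.

observed = {
    "patient_A": 0.99,   # actually in the training data
    "patient_B": 0.98,   # actually in the training data
    "patient_C": 0.61,   # never seen by the model
}

THRESHOLD = 0.95  # the attacker would tune this on data of their own

def guess_member(record):
    """Guess that a record was a training member if confidence is high."""
    return observed[record] >= THRESHOLD

guesses = {r: guess_member(r) for r in observed}
```

Real attacks calibrate the threshold using "shadow models" trained on the attacker's own data, but the underlying signal is the confidence gap shown here.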

See how well the AI chatbot at https://deepai.org/chat remembers the lyrics of your
favourite song by asking it to tell you, for example, the line following any line given by you.

7.3.4 Defence
Here are a few cyber defence measures that are specific to AI.

• adversarial training: add adversarial examples to the training data and explicitly
label them by their correct classification
• defensive distillation: train the AI model (called the student model) using outputs
from another AI model (called the teacher model)

– The resulting student model is known to be less affected by small perturbations


and thus more resistant to evasion attacks.
– As a side remark, the teacher–student combination can also be used to make AI
models smaller, and thus easier to fit in resource-constrained edge devices.
Reference: Nicolas Papernot, et al. “Distillation as a Defense to Adversarial Perturbations Against
Deep Neural Networks”. In 2016 IEEE Symposium on Security and Privacy (SP), pp. 582–597, 2016.

• ensemble learning: use multiple AI models to perform the same task


– For example, to detect human presence, one can use both object recognition and
facial recognition.
– To subvert the entire system, one would then need to subvert all the constituent
models successfully.
– The failure of one but not all of the constituent models may indicate an attack.
– Ensemble learning also improves the accuracy of AI systems.
– It can also be used to counter data scarcity, as we will see in §7.5.
• data perturbation: add noise to individual data points before using them for training

– The noise added to an individual data point is substantial enough so that it is


hard to recover the original data point from a perturbed one.
– The noise added to different data points is coordinated in a way that does not
affect the accuracy of the trained AI model too much.
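The last measure, data perturbation, can be sketched as follows (a minimal, differential-privacy-flavoured sketch; the data and noise scale are invented): Laplace noise hides any single record, while the average over many perturbed records stays close to the true average.

```python
import math
import random

random.seed(0)  # for reproducibility of this sketch

# Invented individual data points, e.g. patient ages between 50 and 70.
true_values = [50 + (i % 21) for i in range(2000)]
SCALE = 5.0  # noise scale: larger means more privacy, less accuracy

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise by inverse-CDF from a uniform draw."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

# Perturb every record before it is used for training.
perturbed = [v + laplace_noise(SCALE) for v in true_values]

# Individual records are substantially distorted, but the aggregate
# statistic the model effectively learns from is barely affected.
true_mean = sum(true_values) / len(true_values)
noisy_mean = sum(perturbed) / len(perturbed)
```

Some individual records end up shifted by far more than the scale, so recovering an original value from a perturbed one is hard, yet the two means agree closely.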

7.4 Explainability
As we saw in §6.1.2, in machine learning, models are not coded by humans but are chosen
automatically by the algorithmic process of training. In fact, the models chosen are often too
large and too complicated for humans to comprehend. As a result, outputs produced
by current AI models typically do not come with human-comprehensible explanations of why

certain outputs are given. These explanations are important because they make it easier for
humans to trust the AI. They are useful in diagnosing malfunctions and in detecting attacks.
Additional effort is needed to make such explanations available. Methods to achieve this are
referred to as Explainable AI (XAI).
We will look into two such methods.

7.4.1 Local Interpretable Model-agnostic Explanations (LIME)


This method does not require any information about the internals of the model being
investigated. To explain an output, one feeds the model a series of inputs, each modified
from the original in a different small part, and reads off from the resulting outputs which
parts of the input were relevant in determining the output one would like explained.
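The idea can be sketched with a simplified, occlusion-style variant (the toy "model" and word weights below are invented, and real LIME goes further by fitting a local linear surrogate over many random perturbations):

```python
# Occlusion-style sketch of the LIME idea on a black-box text model.

def black_box_model(features):
    """Stand-in for an opaque model; returns a 'spam score'."""
    weights = {"free": 0.8, "money": 0.6, "hello": 0.05, "meeting": -0.4}
    return sum(weights.get(f, 0.0) for f in features)

email = ["hello", "free", "money", "meeting"]
base = black_box_model(email)

# Remove each word in turn and measure how much the output changes;
# a large drop means the word was important to the original output.
relevance = {
    word: base - black_box_model([f for f in email if f != word])
    for word in email
}

most_relevant = max(relevance, key=relevance.get)
```

Note that the model is queried only as a black box, which is what "model-agnostic" means in the name.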
Watch one demonstration from Prof. Yu of using LIME to explain an evasion attack in
the video below.

2 min 25 sec

7.4.2 Layer-wise Relevance Propagation (LRP)


This method requires access to the neural network being investigated. It works by
tracing an output from the output layer back to the input layer to see which parts of the
input contribute to it.
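On a tiny hand-made network (all weights invented), the back-tracing can be sketched with LRP's basic proportionality rule: each unit's relevance is redistributed to the layer below in proportion to each connection's contribution, so the relevance assigned to the inputs sums back to the output (the "conservation" property).

```python
# Minimal LRP-style relevance propagation through a 3-2-1 network.

x = [1.0, 2.0, 0.5]                          # input
W1 = [[0.5, -0.2], [0.3, 0.8], [-0.1, 0.4]]  # input -> hidden weights
W2 = [0.9, 0.6]                              # hidden -> output weights

def relu(v):
    return max(0.0, v)

# Forward pass.
h = [relu(sum(x[i] * W1[i][k] for i in range(3))) for k in range(2)]
y = sum(h[k] * W2[k] for k in range(2))

EPS = 1e-9  # small stabilizer, as in LRP's "epsilon rule"

# Relevance of each hidden unit: its share of the output.
Rh = [h[k] * W2[k] / (y + EPS) * y for k in range(2)]

# Redistribute each hidden unit's relevance back to the inputs.
Rx = [0.0, 0.0, 0.0]
for k in range(2):
    z = sum(x[i] * W1[i][k] for i in range(3))
    for i in range(3):
        Rx[i] += x[i] * W1[i][k] / (z + EPS) * Rh[k]
```

Here `Rx` plays the role of the heatmap in the demos below: larger values mark input components that contributed more to the output.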
Try it out to see how LRP can be used to present explanations by following the steps
below.

1. Open the “XAILab Demo: Explainable VQA” page by the Fraunhofer Institute for
Telecommunications at https://lrpserver.hhi.fraunhofer.de/visual-question-answering/.
2. Click a picture in #1.
3. Type in a question in #2.
4. Press the enter key.

5. Wait for the answer to appear in #3.


6. The areas relevant in producing the answer are shown in #4.
7. Try again with different pictures and different questions.

8. Evaluate the quality of the outputs.

Try to use LRP to investigate what an AI sees in an adversarial example.

1. Open the “Explainable AI Demos: Image Classification” page by the Fraunhofer Insti-
tute for Telecommunications at https://lrpserver.hhi.fraunhofer.de/image-classification.

2. At the bottom right-hand corner, select “Adversarial Attacks” in the drop-down list.
3. Choose one of the images on the right.
4. The page displays what the AI classifies the image to be, and a heatmap showing parts
of the image that contribute to this classification.

5. Compare the heatmap with what you expect the AI to focus on if it were to classify
the image correctly.
6. Try again with different images under “Adversarial Attacks”.
7. Compare the heatmaps with those for the images under “General Images”.

7.5 Data scarcity
• We saw in §6.1 and in §7.2.2 that machine learning typically requires a lot of training
data that are representative of the problem to produce a model that performs well.
• However, in practical situations, large amounts of such data sometimes are not acces-
sible or simply do not exist.
• Few-shot learning refers to learning tasks for which only a small amount of training
data is available.
• A number of methods can be used to counter the problem of data scarcity. Here are a
few examples.
– One can modify existing data to generate new data.
∗ This approach is known as data augmentation.
∗ For example, one can rotate, flip, crop, adjust the contrast of images for
training object recognition.
– One can train another model to generate training data as follows. A generator
model generates data. The generated data and real data are mixed and fed into
a discriminator model, which identifies whether the input is real or generated.

During training, the two models improve with each other, so that at the end the
generator model can generate realistic data for training.
∗ This generator–discriminator combination is known as a generative adversarial
network (GAN).
∗ For example, at the beginning of the COVID-19 pandemic, GANs were used
to produce synthetic lung CT scans and X-ray images for training. Here are
some synthetic X-ray images generated by a GAN.

Image source: Rutwik Gulakala, Bernd Markert and Marcus Stoffel. “Generative adversarial
network based data augmentation for CNN based detection of Covid-19”. Scientific Reports,
vol. 12, art. number 19186, 2022.

∗ As a side remark, GANs are very useful in generating realistic images for other
purposes too.
– One can re-train a trained model to adapt to a different context.
∗ This approach is known as transfer learning.
∗ For example, one can reuse the linguistic features learned by translation
models for more popular languages to obtain models for less popular ones.
– Use ensemble learning in which a few smaller neural networks are used instead of
one big neural network.
∗ The principle behind this approach is that bigger neural networks typically
require more training data to perform well.
∗ For example, instead of using one model to recognize images of ice kacang, one
can combine the use of a number of models that recognize images of shredded
ice, sweet corn, red beans, pink colour, inverted cone shape, etc., for which
training may be easier and more training data may be available.
• While there are ways to make a model work with a small amount of data, it is still
important to find more high-quality data to obtain the best results.
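The first approach above, data augmentation, can be sketched as follows. The 3×3 "image" is a stand-in for a real training image; each simple transform yields another labelled training example at no extra labelling cost.

```python
# Data augmentation sketch: flips and rotations of one tiny "image".

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror the rows top-to-bottom."""
    return img[::-1]

def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

# One labelled example becomes four.
augmented = [image,
             flip_horizontal(image),
             flip_vertical(image),
             rotate_90(image)]
```

For object recognition, the label usually survives such transforms (a flipped cat is still a cat), which is what makes this cheap multiplication of training data valid.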

7.6 Reflection
• We saw that, although AI can be very useful, it brings also many challenges and issues.
• A number of solutions are available to counter the existing problems, but these problems
are far from being completely solved, and new problems will likely arise with the rapid
advancement of AI.
• As a user, how worried are you about AI giving you wrong information?

• What measures would you take personally to protect yourself against the negative
effects of AI?
• Do you think that AI will do more good than bad to people?

HS1501 §8

Economics

[This chapter is to be updated before Week 9.]


The digital economy commonly refers to the body of economic activities powered by the
advance of digital technologies. As we saw in §5, AI plays a key role in it, transforming most
(if not all) businesses significantly. More generally, AI transforms societies, improving the
quality of peoples’ lives through better decision-making, personalized products/services, and
automation. This transformation may also cause fundamental changes in the labour force
structure. As Andrew Ng put it, these make AI the “new electricity”.
In this part, we will study in more detail the positive and the negative economic impacts
of AI.
• cost reduction: how one can use AI to save money
• productivity gain: how one can use AI to increase productivity and make money

• wealth distribution: how AI may widen the gap between the rich and the poor
• job market: which jobs will likely be replaced, in what ways the remaining jobs will
likely change, and what skills will likely remain valuable
At the end, we will discuss the type of leadership that would facilitate successful implemen-
tation of AI.

8.1 Cost reduction


We saw in §5 many examples in which one can use AI to save money. Let us extract from
there a few general ways in which AI can help do this.
• AI can automate many mundane manual tasks, thus reducing manpower costs.
• AI capabilities can help improve product reliability, thus reducing the costs to rectify
errors.

• Via data analytics and possibly automated sensor monitoring, AI helps one avoid wast-
ing money on what is irrelevant.
• AI can predict when maintenance is likely needed, thus reducing the cost of unexpected
downtime.
• AI can help identify frauds, thus reducing the financial loss they cause.

Meanwhile, the cost of implementing AI has been decreasing rapidly, due to improvements
in software and hardware, and more powerful pay-per-use cloud computing services.

86
• In §6, we had a glimpse of the cost of some of the software and the hardware one can
use to deploy AI.
• ARK Invest predicts in their Big Ideas 2023 report that the costs of AI training, AI
software, and AI hardware are going down by about 70% per year.
• As an example, they state that the cost of training a large language model to the GPT-3
level fell from US$4,600,000 in 2020 to US$450,000 in 2022.
• Although large language models are speculated to cost millions of US dollars to train,
many of them are available to individual users at a relatively affordable price nowadays.
For example, as of Oct. 2023, ChatGPT is available free-of-charge upon registration,
and the ChatGPT Plus plan costs only US$20 per month.
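As a quick sanity check of the figures above: a decline of about 70% per year means roughly 30% of the cost remains each year, so two years of decline should leave about 9% of the original cost.

```python
# Figures from the ARK Invest Big Ideas 2023 report cited above.
cost_2020 = 4_600_000        # US$ to train a GPT-3-level model in 2020
yearly_retention = 0.30      # a ~70% yearly decline leaves ~30% per year

# Two years of decline: 4.6M * 0.3 * 0.3, which is in the same
# ballpark as the reported US$450,000 for 2022.
cost_2022_est = cost_2020 * yearly_retention ** 2
```

The estimate comes out near US$414,000, consistent with the reported figure for an "about 70% per year" trend.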

8.2 Productivity gain


Saving money alone is not enough for successful business operations. One also needs to
actively make money and increase productivity. AI can help with these too, as we also saw
in §5. Let us extract from there a few general ways in which AI can help in these aspects.
• AI automation can help get many jobs done more efficiently.
– In particular, it can facilitate team collaboration by, e.g., providing timely notifi-
cations and summarizing meeting minutes.
– According to the Big Ideas 2023 report by ARK Invest, AI coding assistants
increased the output of software engineers 2-fold in 2022. They predict that this
increase will reach about 10-fold by 2030.
• AI scales up better with the complexity of a project because significantly less commu-
nication is involved, compared to what is required in a human team.
• AI automation can help one focus on the more meaningful tasks.
• AI automation and analytics allow consumer products, services, and recommendations
to be personalized on a large scale.
• Data analytics and generative AI power product breakthroughs and innovations.
– Generative AI refers to AI that is capable of generating new content, e.g., text
and images.
• Data analytics can spot trends, identify causes of events, and provide market insights
to help one plan strategically and make more-informed (financial and other) decisions.
• AI can help assign work to the most suitable people within an organization.
• AI-powered monitoring can help meet compliance and governance requirements.
Listen to Prof. Yu discuss in the following video how AI drives innovation and what possible
scenario this entails.

3 min 34 sec

8.3 Wealth distribution
• As we saw in the previous sections, strategic development and implementation of AI
can greatly help generate wealth.
• The entities that have the resources to develop AI would thus likely see rapid and
significant growth, while those that do not have such resources would likely lose their
competitiveness.

• For this reason, the advancement of AI may widen the existing gap between the rich
and the poor.
• AI itself can also be used to counter this increasing inequality by making basic services,
e.g., education and healthcare, more widely available.

• To stay competitive, it is important to prioritize AI development.


• As a result, resources for AI, such as AI hardware and Big Data, now become strategic
resources, for both companies and countries.

8.4 Job market


• In §5, we saw that all job sectors will likely be transformed and disrupted by AI.
• According to Frey and Osborne’s estimation in 2013, about 47% of total US employ-
ment is in the high-risk category, meaning that the associated occupations are potentially
automatable, perhaps by 2033.

• The scenarios developed by the McKinsey Global Institute in a report published in


late 2017 across 46 countries suggest that 75 million to 375 million workers (3 to 14 per-
cent of the global workforce) will need to switch occupational categories. Moreover, all
workers will need to adapt, as their occupations evolve alongside increasingly capable
machines.

• Occupations that involve complex perception and manipulation tasks, creative intelli-
gence tasks, social intelligence tasks, and high-level cognitive capabilities are unlikely to
be automated soon.
• Generalist occupations requiring knowledge of human heuristics, and specialist occu-
pations involving the development of novel ideas and artifacts, are thus the least sus-
ceptible to computerization.
• Examples of high-risk positions:
– workers in transportation and logistics occupations,
– many office and administrative support workers (e.g., record clerks, office assis-
tants, finance and accounting clerks),
– labour in production occupations,
– a substantial share of employment in services and sales occupations (e.g., cashiers,
counter and rental clerks, helpdesk staff, telemarketers, and food service workers),
– paralegals and legal assistants,
– jobs carried out in predictable settings (e.g., assembly-line workers, dishwashers,
food preparation workers, agricultural and other equipment operators),
– datacentre administrators,
– programmers,

– data analysts
• Examples of low-risk positions:
– chief executives and managers,
– most business and finance occupations,
– healthcare providers,
– most occupations in education,
– arts and media jobs (e.g., artists, performers, and entertainers),
– professionals (e.g., engineers, scientists, accountants, analysts, and lawyers),
– manual and service jobs in unpredictable environments (e.g., builders, home health
aides, and gardeners)
References: [1] Carl Benedikt Frey and Michael A. Osborne. “The Future of Employment: How susceptible
are jobs to computerisation?” Oxford Martin Programme on Technology and Employment, 17 Sep. 2013.
Working Paper. https://www.oxfordmartin.ox.ac.uk/publications/the-future-of-employment/. Last
accessed: 10 Oct. 2023. [2] McKinsey Global Institute. “Jobs lost, jobs gained: workforce transitions in a
time of automation”. McKinsey&Company, Dec. 2017. https://www.mckinsey.com/~/media/mckinsey/industries/public%20and%20social%20sector/our%20insights/what%20the%20future%20of%20work%20will%20mean%20for%20jobs%20skills%20and%20wages/mgi%20jobs%20lost-jobs%20gained report december%202017.pdf. Last accessed: 10 Oct. 2023.

• In addition to getting rid of or transforming jobs, AI also creates new jobs. Here are
two examples.

– prompt engineers, responsible for crafting text prompts for generative AI to create
desired outputs
– AI engineers, responsible for developing and training AI
• Education needs to adapt to this shift in the labour market.

– Knowledge- and technique-based education is now (even) less useful, while skill-
based education, emphasizing complex problem-solving skills, social skills, innova-
tion and learning skills, becomes increasingly important in securing employment.
• Despite the creation of new jobs, widespread automation brought about by AI may still
lead to a surplus of labour and thus, eventually, mass unemployment.

• One proposed solution is universal basic income, i.e., an amount of money regularly
provided by governments to maintain basic living standards without the citizens having
to work.
– Advantages:
∗ It is free money.
∗ It reduces societal stress due to the need to compete with rapidly developing
AI.
∗ It frees up humans to pursue arts, culture, hobbies, etc.
– Disadvantages:
∗ There is a risk of citizens becoming lazy and entitled.
∗ It reduces the urge for governments to create meaningful jobs for those who
want employment.
∗ It increases government spending.

We will discuss in §9.6 the ethical issues that the impact of AI on the labour market raises.

8.5 Leadership
In addition to funding and technical requirements such as training data, a good leader is key
to the success of AI projects. Here are some desirable qualities of leaders that are especially
important for AI projects.
• complex problem-solving skills, because AI problems are complex, deep, and broad

• knowledgeable: a correct understanding of what AI can and cannot do


• adaptability: ability to cope with the rapid developments in AI
• courage: willingness to push through changes driven by AI developments even when
they challenge entrenched processes and beliefs

• humility: eagerness and willingness to learn from anybody, including subordinates,


because the AI world is highly dynamic, and relevant knowledge can come from any-
where
In §7.2, we saw a number of technical reasons for AI failure. Listen to Prof. Yu discuss some
non-technical reasons for the failure of AI projects in the video below.

2 min 26 sec

Businesses that are unable to react to changes will very likely be displaced. Two notable
examples are
• camera film pioneer Kodak, which went bankrupt in 2012 after film cameras were re-
placed by digital cameras; and
• once world-leading mobile phone manufacturer Nokia, whose handset business was ac-
quired by Microsoft in 2013 after it failed to keep up with strong competitors including
Apple.
Listen to Prof. Yu explain in the following video the importance of learning, starting with
the new chairman of Nokia as an example.

1 min 28 sec

One company that managed to survive major disruptions is Amazon. It transformed from an
online bookstore into a technology company. Watch in the following video how Amazon uses
AI in some of its operations and services.

4 min 27 sec
Video source: CNN. “Amazon is using AI in almost everything it does”. YouTube, 5 Oct. 2018. https://youtu.be/2DtyjC0UxTw.

8.6 Reflection
• We saw that AI will likely change the economy significantly.
• One needs to act to survive the disruptions brought about by AI developments.
• Are you confident that you will be able to run projects involving AI applications suc-
cessfully?

• If necessary, do you think you can use AI to maintain a living for yourself and perhaps
also for your family?
• Do you think some non-AI businesses are worth preserving?

HS1501 §9

Ethics

[This chapter is to be updated before Week 10.]


As AI becomes increasingly powerful and increasingly involved in people’s lives, a number
of ethical issues have surfaced, and ethical considerations have become increasingly important
in the development of AI. Unfortunately, it is often hard to weigh the harms brought about
by AI against the benefits it brings.
In this part, we will see the major viewpoints on the ethical dilemmas originated from
the deployment of AI.
• privacy: when it is right to collect and use personal data
• weapons: whether the development of precision weapons should be encouraged
• bias: how AI can amplify hidden bias in people
• morals: what AI should do when people do not agree on what is right or wrong
• social status: to what extent AI can be considered human
We will end with an exercise consisting of a few ethical questions raised by AI-induced job
loss.
Disclaimer. It is not the intention of these notes to unequivocally state the “correct solu-
tions” to these dilemmas, for in the real world such correct solutions often do not exist.
The content here is meant to provide enough food for thought to spark inner debate in the
reader.

9.1 Privacy
• Most people’s data are already being aggressively collected by governments and corpo-
rations:
– shopping habits,
– Internet-browsing patterns,
– social-media presence,
– work-performance statistics,
– measurements by wearable electronics,
– video footage from security surveillance,
– etc.
• Due to the data-hungry nature of AI, data collection will likely accelerate as AI is
incorporated into more aspects of people’s lives.

• Moreover, the data collected is often shared, e.g., amongst IoT devices, with cloud
servers, or even amongst partner organizations.
• People are often not aware of (the purpose of) the collection and the sharing of their
own personal data.

• For example, the use of the fitness tracking app Strava was reported to have accidentally
revealed the locations of military bases and spy outposts around the world in 2017.
Reference: Alex Hern. “Fitness tracking app Strava gives away location of secret US army bases”. The
Guardian, 28 Jan. 2018. https://www.theguardian.com/world/2018/jan/28/fitness-tracking-app-gives-away-location-of-secret-us-army-bases.

Question. Should explicit/implicit consent be required for personal data collection and
sharing?

• Yes: there are people who do not want their personal data collected and shared (without
them knowing).
• Yes: it shows respect to the people involved, e.g., the citizens and the customers.
• No: data collection in public premises is reasonable, given the public nature of the
premises; data collection in private premises by the owner is also reasonable, given the
ownership of the premise.

• No: it should be understood that when one passes data over to a company (e.g., Google
or Microsoft), whether actively or passively, the company has control over the data.
• It depends: the need for consent is less if the data is collected anonymously.
• It depends: the need for consent is less if the data is collected for public good, e.g., for
security reasons.

Listen to Prof. Yu discuss the complexity of privacy issues and what advice he has about
putting data on the Internet in the video below.

5 min 4 sec

• AI training can now be performed without having to send the actual personal data over
to a central location.
• Instead, one performs a “mini-training” locally at the edge device, and has only the
“mini-training” results sent over.

• This is done in a way that individual personal data cannot be reconstructed from the
“mini-training” results, thus protecting the privacy of the users.
Reference: Lucy Bellwood and Scott McCloud. “Federated AI”. Google AI. https://federated.withgoogle.com/. Last accessed: 17 Oct. 2023.
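The "mini-training" idea described above can be sketched as follows (the devices, data, and tiny model are all invented; real federated learning averages neural-network weight updates over many rounds). Each device fits a model locally and transmits only its fitted parameter; the server averages the parameters and never sees the raw personal data.

```python
# Federated-learning sketch: local fits, central averaging.

# Private (x, y) data held on three devices; never transmitted.
device_data = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(1.5, 3.2), (3.0, 6.1)],
    [(0.5, 0.9), (2.5, 5.2)],
]

def local_fit(points):
    """Local 'mini-training': least-squares slope of y = w*x."""
    return sum(x * y for x, y in points) / sum(x * x for x, y in points)

# Each device sends one number (its fitted slope), not its data.
local_models = [local_fit(points) for points in device_data]

# The server aggregates by averaging, as in federated averaging.
global_model = sum(local_models) / len(local_models)
```

The underlying relationship in this toy data is roughly y ≈ 2x, and the averaged global model recovers it even though no individual data point ever left its device.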

9.2 Weapons
• AI can be used to automate, anonymize, and augment the precision of physical and
cyber warfare.
• AI is used in armed drones to enable them to navigate around obstacles, identify targets,
and maintain stability while discharging a firearm.
• According to public information as of Oct. 2023, the actual discharging of firearms is
so far controlled remotely by human operators.
• Nevertheless, as we saw in §3 and §4.3, current AI-equipped drones can already have
the capability to autonomously fly, decide where to point their sensors, identify ground
objects, and decide when to discharge firearms.
• Military drones have reportedly been used by both countries to attack targets in the
war between Russia and Ukraine since it started in 2022.
• The photo below shows a Turkish-made Bayraktar TB2 drone, a model that was de-
ployed by Ukraine to fight against Russia.

Image source: Ministry of Defence of Ukraine, CC BY 4.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:Bayraktar_TB2_of_UAF,_2020,_09.jpg.

References: [1] Jake Horton, Olga Robinson, and Daniele Palumbo. “What do we know about drone attacks
in Russia?” BBC, 1 Sep. 2023. https://www.bbc.com/news/world-europe-65475333. Last accessed:
19 Oct. 2023. [2] No named author. “How are ‘kamikaze’ drones being used by Russia and Ukraine?” BBC,
3 Jan. 2023. https://www.bbc.co.uk/news/world-62225830. Last accessed: 19 Oct. 2023. [3] Anna Konert
and Tomasz Balcerzak. “Military autonomous drones (UAVs) – from fantasy to reality. Legal and Ethical
implications”. Transportation Research Procedia, vol. 59, pp. 292–299, 2021.

Question. Should we allow AI to make human life-or-death decisions?


• Yes: the AI can base its decision on much more information than a human can com-
prehend.
• No: allowing robots to kill humans is a slippery slope we do not want to go down.
Question. Who should be held accountable if the AI makes a mistake?
• No one, because the AI made the decisions autonomously
• The person(s) who is using the AI
• The person(s) who developed the AI
Is this a justification to halt autonomous weapon development?
Question. Is higher precision (e.g., guided munitions that use facial recognition to identify
targets) good or bad?

• Good: higher precision means lower collateral damage and reduced impact on civilian
lives.

• Bad: the existence of such weapons is frightening because it makes people feel instinctively less safe.

Question. Should we use AI to target a country’s infrastructure in cyberwarfare?

• Yes: targeting support infrastructure is a non-lethal yet efficient way to gain advantage
in a war.

• No: indiscriminate obstruction/destruction of vital infrastructure harms civilians.

Question. Is it fair to send autonomous weapons to fight technologically weaker countries


that do not have similar weapons?

• Yes: it avoids unnecessary risks to the soldiers and their families.


• No: it is unfair for the technologically weaker country to have to risk human lives when
the other side does not.

9.3 Bias
• As we saw in §7.2.2, bias present in the training data of an AI leads to biased results.
• Bias in AI may also be caused by poor system design choices.
• If the AI is deployed in sensitive areas, such as job recruitment and medical diagnoses,
then biased decisions have the potential to ruin lives.
• Example 1: Amazon’s AI recruitment system
– Bias was discovered in the system in 2015.
– The system was trained on past résumés and hiring decisions.
– Due to the greater number of male applicants, more men than women were con-
sidered favourable hires in the training data.
– This made the AI system favour men over women.
Reference: Jeffrey Dastin. “Amazon scraps secret AI recruiting tool that showed bias against women”.
Reuters, 11 Oct. 2018. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G.

• Example 2: an algorithm widely used in US hospitals to allocate health care to patients


– It was found that the algorithm was less likely to refer black people than equally
sick white people to programmes that aim to improve care.
– The algorithm assigned risk scores to patients on the basis of total healthcare
costs accrued in one year, which seemed reasonable because higher healthcare
costs are generally associated with greater health needs.
– However, the fact that black people spend less money on healthcare than white
people does not mean that they are healthier.
– It is speculated that black people’s reduced access to care may be due to a
variety of reasons, ranging from distrust of the healthcare system to direct racial
discrimination by healthcare providers.
Reference: Heidi Ledford. “Millions of black people affected by racial bias in health-care algorithms”.
Nature, vol. 574, pp. 608–609, 2019.

• XAI is helpful in detecting bias in AI models.
• There are tools developed specifically to detect, examine, and mitigate bias in AI
models, e.g., Facebook (now Meta)’s Fairness Flow, and IBM’s AI Fairness 360.
• Listen to Prof. Yu talk about courses of action one can take to reduce bias when
implementing AI.

2 min 40 sec

• AI systems that recommend content to a viewer/reader according to what one likes may create
an echo chamber that amplifies existing bias in people.
• Get a glimpse of what such recommendations can do in the following trailer of The
Social Dilemma, a docudrama by Netflix.

2 min 34 sec
Video source: Netflix. “The Social Dilemma | Official Trailer | Netflix”. YouTube, 28 Aug. 2020.
https://youtu.be/uaaC57tcci0.
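As a toy illustration of the kind of check that bias-detection toolkits such as AI Fairness 360 perform, one can compute a simple group-fairness metric, such as disparate impact, over a model’s decisions. All numbers below are made up for illustration:

```python
# Toy illustration (hypothetical data): computing disparate impact,
# one of the group-fairness metrics that toolkits such as IBM's
# AI Fairness 360 report for a model's decisions.

def selection_rate(decisions):
    """Fraction of applicants the model marked as 'hire' (1)."""
    return sum(decisions) / len(decisions)

# Hypothetical model outputs: 1 = recommended for hire, 0 = rejected.
male_decisions = [1, 1, 1, 0, 1, 1, 0, 1]    # 6 of 8 recommended
female_decisions = [1, 0, 0, 1, 0, 0, 0, 1]  # 3 of 8 recommended

rate_m = selection_rate(male_decisions)
rate_f = selection_rate(female_decisions)

# Disparate impact: ratio of the disadvantaged group's selection rate
# to the advantaged group's. A common rule of thumb (the "four-fifths
# rule") flags values below 0.8 as potentially biased.
disparate_impact = rate_f / rate_m
print(f"male rate = {rate_m:.2f}, female rate = {rate_f:.2f}")
print(f"disparate impact = {disparate_impact:.2f}")  # 0.50 here, so flagged
```

Real toolkits compute many such metrics and also provide mitigation algorithms; this sketch only shows the basic idea of comparing outcomes across groups.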

9.4 Morals
As AI is typically trained to perform very specific tasks, it may not be able to consider the
wider impacts of its decisions, even if they would be common sense to humans. Listen to
Prof. Yu explain this with a couple of examples.

6 min 13 sec

• To prevent AI from maximizing success rate at the cost of contravening human morals,
one needs to program human morals into the AI.

• One theoretical example we saw was Asimov’s Three Laws of Robotics discussed by
Prof. Yu in §4.8.
• For a straightforward real-life example, a self-driving car should be programmed to
brake to avoid running over pedestrians, even when it means that it would consume
more fuel and reach the destination late.

• It becomes harder to anticipate more complex scenarios and program in the decisions.

Suppose you are driving down a highway and there are cars behind you. At this time, a hare
jumps out onto the road in front of you.

Question. Do you hard-brake to avoid hitting the hare, at the risk of causing an accident
with the cars behind?

• Any life is worth preserving.


• The hare is innocent.
• The drivers behind are expected to be able to react in time.

Question. Do you change direction abruptly to avoid hitting the hare and to prevent the
car behind from rear-ending you, at the risk of losing control of the car and hitting something
else?

• Any life, including human life, is worth preserving.


• Inflicting property damage on other drivers is unjust.

Question. Do you go on and run over the hare to avoid causing a car accident?

• The life of a single hare is worth less than the lives of humans which might be at risk
in an accident.
• Hares reproduce quickly anyway.

Question. What if the hare above were replaced by a human child?

• Research findings reveal that people in different countries may have rather different
views on the questions above.
Reference: Amy Maxmen. “Self-driving car dilemmas reveal that moral choices are not universal”.
Nature, vol. 562, pp. 469–470, 2018.

• Conceivably, different people in the same country may also have rather different views.

Question. When making driving decisions, who should decide which factors are more im-
portant and which are less?

• Passengers: their lives may actually be affected by the driving decisions.

• Local government: they can ensure that all self-driving cars adhere to the local norms.
• Car manufacturers: they know the cars best.

9.5 Social status
• We saw in §2.10.1 that current AI is already able to sustain human-style conversations.
• Current large language models can generate human-style text so well that arguably
they pass the so-called Turing Test.
• See what the Turing Test is in the video below.

1 min 54 sec
Video source: CNET. “What is the Turing Test?”. YouTube, 31 Mar. 2015. https://youtu.be/sXx-PpEBR7k.

• Reportedly, some people developed emotional attachments to an AI.


Reference: James Purtill. “Replika users fell in love with their AI chatbot companions. Then they lost
them”. ABC Science, 1 Mar. 2023. https://www.abc.net.au/news/science/2023-03-01/replika-users-fell-in-love-with-their-ai-chatbot-companion/102028196. Last accessed: 21 Oct. 2023.

Question. Should we avoid developing emotional attachments to an AI?

• Yes: developing emotional attachments to an AI detaches one further from human
society.
• No: if emotional attachments to something make one happier, then why not, be it to a
human, a machine, or an object?

Question. Should we avoid treating an AI as a human?

• Yes: AI is not human, and cannot be human.


• No: if an AI becomes more human than some human, then it is reasonable to treat it
as human.

Question. Should we accept that AI makes mistakes, as we do for humans?

• Yes: we should not expect that AI can make perfect decisions while humans cannot,
especially when it comes to complex matters such as predictions and medical diagnosis.
• No: AI has more computational power and can have more information available to it.

• The rapid advancement of AI makes the following question increasingly prominent: to
what extent should AI be treated as an independent entity, e.g., in terms of ownership,
responsibility, rights, liability and relationships?
• We will discuss these more in §10.3 and §10.4.

9.6 Exercise: job loss
As discussed in §8.4, widespread automation brought about by AI may lead to a radical
change in the structure of the labour force and, eventually, a surplus of manpower.

Question. Should we expect retrenched workers to re-train for new jobs created by AI?

• Yes: lifelong learning is key to a successful career. The new jobs will also be higher
value and pay more.
• No: not everyone has the educational or financial background to be re-trained. Jobs
involving oversight of AI tend to require a high degree of computer literacy. Consider
a worker in their 50s or 60s who attended vocational training after secondary school and
whose job is replaced by AI. Should one really expect them to catch up on 20 to 30 years
of technology?
• No: the cost of re-training exceeds the potential value such workers can bring by the
time they are sufficiently qualified.

Question. For a worker whose performance output is objectively inferior to an AI’s, do we
give them a token job or retrench them?

• Give them a token job: every person should have access to gainful employment.

• Retrench them: economic efficiency is essential for any organization.

Question. For people who have priorities beyond career, e.g., people who choose to have
modest career ambitions to have more time for family, passion projects, etc., should we take
away their jobs and force them to acquire AI skills?

• Yes: every person should aspire towards being a more useful person.
• No: we should respect other people’s choices.

HS1501 §10

Governance

[This chapter is to be updated before Week 11.]


As the adoption of AI becomes more widespread, e.g., in making important decisions
such as medical diagnosis, recruitment and mortgage approval, the consequences of it not
working in the intended ways become more serious. Governance can help control the risks.
However, too much governance may hinder innovation. People thus need to strike the right
balance. Moreover, legislation is currently slow to catch up with the rapid development of
AI, and there are areas upon which people do not generally agree. In this part, we will see
what people currently have, and what people still need to discuss, in this delicate pursuit.
• general principles, set separately by the US government, the EU, IBM, etc.
• privacy: General Data Protection Regulation (GDPR) in the EU, Personal Data Pro-
tection Act (PDPA) in Singapore, etc.

• liability: who is held accountable for the decisions and the mistakes that an AI makes
• intellectual property: who owns an output generated by an AI, and what one may
use to train an AI
As a side note, to ensure and monitor compliance with regulations, one can use AI to assist with
the repetitive, time-consuming, and labour-intensive task of poring through vast quantities
of documents across multiple organisations, as we saw in §5.9 and §5.10, for example.

10.1 General principles


In §4.8, we saw the three laws of robotics from Asimov’s science fiction, which seem equally
applicable to AI nowadays. In reality, several sets of compliance principles for AI use have
also been released by different organizations around the world. Let us see what some of these
do and do not contain.

10.1.1 European Union (EU)


In Apr. 2019, the High-Level Expert Group on Artificial Intelligence set up by the Euro-
pean Commission presented Ethics Guidelines for Trustworthy Artificial Intelligence. These
guidelines include the following.

• three (necessary but not sufficient) components of trustworthy AI:


1. lawful — respecting all applicable laws and regulations
2. ethical — respecting ethical principles and values

3. robust — from a technical perspective while also taking into account its social
environment
• seven key requirements to achieve trustworthy AI:
1. human agency and oversight: AI systems should empower human beings; at
the same time, proper oversight mechanisms need to be ensured.
2. technical robustness and safety: AI systems need to be resilient, secure, safe,
accurate, reliable and reproducible.
3. privacy and data governance: adequate data governance mechanisms must
also be put in place to ensure the quality and integrity of the data, legitimized
access to data, and protection of data.
4. transparency: the data, system and AI business models should be transparent,
through traceability mechanisms, accessible explanations, and the provision of
information about the AI system when it is used by humans.
5. diversity, non-discrimination and fairness: unfair bias must be avoided; AI
systems should be accessible to all, and involve all relevant stakeholders.
6. societal and environmental well-being: AI systems should benefit all human
beings, in terms of both their environmental and their societal impacts.
7. accountability: mechanisms should be put in place to ensure responsibility and
accountability for AI systems and their outcomes.
In Apr. 2021, the European Commission proposed the first regulatory framework for AI,
commonly called the EU AI Act.
• It establishes obligations for AI system providers and users depending on the risk level
of the AI.
• As of Oct. 2023, talks are still ongoing amongst countries in the European Council on
the final form of the law.
• The aim is to reach an agreement by the end of 2023.
References: [1] European Commission. “Ethics guidelines for trustworthy AI”. 8 Apr. 2019, last updated
17 Nov. 2022. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai.
Last accessed: 2 Jan. 2024. [2] European Parliament. “EU AI Act: first regulation on artificial intelligence”.
8 Jun. 2023, last updated 16 Jun. 2023. https://www.europarl.europa.eu/news/en/headlines/society/20230601STO93804/. Last accessed: 28 Oct. 2023.

10.1.2 United States


In Nov. 2020, the Executive Office of the President of the United States issued a memorandum
providing guidance to US federal agencies for regulation of AI applications. Here are the
principles for the stewardship of AI applications put forward in the document.
1. public trust in AI: promote reliable, robust, and trustworthy AI applications
2. public participation: provide ample opportunities for the public to participate in all
stages of the rule-making process
3. scientific integrity and information quality: leverage scientific and technical in-
formation and processes, and develop regulatory approaches in a manner that both
informs policy decisions and fosters public trust in AI
4. risk assessment and management: determine which risks are acceptable and which
are not
5. benefits and costs: select approaches that maximize net benefits

6. flexibility: pursue regulatory approaches that are not technology-specific and do not
impose mandates that would harm innovation
7. fairness and non-discrimination: ensure fairness and non-discrimination in accor-
dance with law
8. disclosure and transparency: inform human end users when and how AI is used, to
increase public trust in AI and to preserve the possibility of humans to make informed
decisions
9. safety and security: develop AI systems that are safe, secure, and operate as in-
tended, and prevent bad actors from using AI against a regulated entity
10. interagency coordination: ensure consistency and predictability of AI-related poli-
cies
Reference: Executive Office of the President of the United States. “Guidance for Regulation of Artificial Intel-
ligence Applications”. Memorandum for heads of executive departments and agencies, M-21-06, 17 Nov. 2020.

10.1.3 IBM
IBM has the following Trust and Transparency Principles for the use of AI (and other trans-
formative innovations).
• The purpose of AI is to augment human intelligence.
• Data and insights belong to their creator.
• New technology, including AI systems, must be transparent and explainable.
IBM also has the following five ethical focal areas for AI design and development.
1. accountability: AI designers and developers are responsible for considering AI design,
development, decision processes, and outcomes.
2. value alignment: AI should be designed to align with the norms and values of your
user group.
3. explainability: AI should be designed for humans to easily perceive, detect, and
understand its decision process.
4. fairness: AI must be designed to minimize bias and promote inclusive representation.
5. user data rights: AI must be designed to protect user data and preserve the user’s
power over access and uses.
References: [1] IBM. “IBM’s Principles for Trust and Transparency”. https://www.ibm.com/policy/trust-transparency-new/. Last accessed: 24 Oct. 2023. [2] IBM. “Everyday ethics for AI”. https://www.ibm.com/design/ai/ethics/everyday-ethics. Last accessed: 24 Oct. 2023.

10.1.4 Microsoft
Microsoft has six key principles for responsible AI.
• fairness: allocate opportunities, resources, or information in ways that are fair to the
humans who use it
• reliability and safety: make the system function well for people across different use
conditions and contexts, including ones it was not originally intended for
• privacy and security: protect the data in the system
• inclusiveness: design the system to be inclusive of people of all abilities

• transparency: help people avoid misunderstanding, misusing, or incorrectly estimat-
ing the capabilities of the system
• accountability: create oversight so that humans can be accountable and in control
Reference: Microsoft AI. “Principles and approach”. https://www.microsoft.com/en-us/ai/principles-and-approach. Last accessed: 28 Oct. 2023.

10.1.5 Delicate issues


• Regulating AI systems requires a sufficient degree of transparency about how they work.
• However, in making AI models transparent, trade secrets about how the AI model
works could be leaked.
• Development time is also increased by the need to incorporate XAI to increase trans-
parency.
• If XAI is mandated by regulation in one legal zone (e.g., the EU), then it could drive
developers away from that zone, leading to brain drain.

• The transparency requirements above are phrased carefully to avoid these problems,
but they do not solve these problems.

Question. How do you think governance would affect you as an AI user?

• For example, what do you think are things you can do with AI now but will eventually
not be allowed?

Question. Are there some general principles that are not listed above but that you think
people should keep in mind when developing and implementing AI?

• For example, should we allow AI to surpass human intelligence? To self-improve? To
kill humans?

Question. Given that ill-intentioned people do not follow rules and accidents are usually
unforeseen, to what extent do you think AI governance can help avoid disasters?

10.2 Privacy
We saw in §9.1 various privacy issues in AI development and implementation. Legislation
has been enacted to ensure that collected data (for AI and for other purposes) are used only
for their intended purposes and in ways that are safe for users. Fines are levied on entities
that contravene these laws.

10.2.1 General Data Protection Regulation (GDPR) in the EU


The GDPR was put into effect in 2018.
It outlines seven protection and accountability principles in relation to data processing.
1. lawfulness, fairness and transparency — processing must be lawful, fair, and
transparent to the data subject
2. purpose limitation — process data only for the legitimate purposes specified explic-
itly to the data subject when the data is collected

3. data minimization — collect and process only as much data as absolutely necessary
for the purposes specified
4. accuracy — keep personal data accurate and up to date
5. storage limitation — store personally identifying data only for as long as necessary
for the specified purpose
6. integrity and confidentiality — process data in such a way as to ensure appropriate
security, integrity, and confidentiality (e.g., by using encryption)
7. accountability — be able to demonstrate compliance with all of these principles

The GDPR explicitly recognizes eight privacy rights for data subjects:
1. the right to be informed
2. the right of access
3. the right to rectification
4. the right to erasure
5. the right to restrict processing
6. the right to data portability
7. the right to object
8. rights in relation to automated decision making and profiling.
Listen to Prof. Yu talk about the challenges in observing the right to erasure for AI
systems.

57 sec
• British Airways was fined GB£20M in 2020 for not taking sufficient security measures,
as demonstrated by a data breach that took place in 2018 in which its systems were
modified by attackers to harvest customers’ personal and credit card details.
• Hotel giant Marriott International was fined GB£18.4M in 2020 for not putting in place
appropriate measures to protect personal data, as demonstrated by a cyber attack from
2014 to 2018 in which hackers gained access to 339 million guest records.
References: [1] Ben Wolford. “What is GDPR, the EU’s new data protection law?” GDPR.EU. https://gdpr.eu/what-is-gdpr/. Last accessed: 28 Oct. 2023. [2] No author named. “British Airways fined
£20m over data breach”. BBC, 16 Oct. 2020. https://www.bbc.com/news/technology-54568784. Last
accessed: 28 Oct. 2023. [3] Carly Page. “Marriott Hit With £18.4 Million GDPR Fine Over Massive 2018
Data Breach”. Forbes, 30 Oct. 2020. https://www.forbes.com/sites/carlypage/2020/10/30/marriott-hit-with-184-million-gdpr-fine-over-massive-2018-data-breach/. Last accessed: 28 Oct. 2023.

10.2.2 Personal Data Protection Act (PDPA) in Singapore
The PDPA comprises various requirements governing the collection, use, disclosure and care
of personal data in Singapore. The main data protection rules came into force in 2014.

Image source: Personal Data Protection Commission. “Data Protection Obligations under the PDPA”. https://www.pdpc.gov.sg/-/media/Files/PDPC/PDF-Files/Resource-for-Organisation/Data-Protection-Obligations-under-the-PDPA.pdf. Last accessed: 28 Oct. 2023.

• Singapore Taekwondo Federation was fined SG$30K in 2018 for failing to make reasonable
security arrangements to prevent the unauthorised disclosure of minors’ NRIC numbers
on its website in 2017.

• Genki Sushi was fined SG$16K in 2019 for failing to put in place reasonable security
arrangements to protect personal data of its employees, as demonstrated by the data
being subjected to a ransomware attack in 2018.
• AIA was fined SG$10K in 2019 for failing to put in place reasonable security arrangements
in its letter-generation process, which led to 245 letters being sent to the wrong recipients
in 2017.
References: [1] Personal Data Protection Commission. “PDPA Overview”. https://www.pdpc.gov.sg/Overview-of-PDPA/The-Legislation/Personal-Data-Protection-Act. Last accessed: 28 Oct. 2023.
[2] Personal Data Protection Commission. “Breach of Protection Obligation by Singapore Taekwondo Federation”. 22 Jun. 2018. https://www.pdpc.gov.sg/all-commissions-decisions/2018/06/breach-of-protection-obligation-by-singapore-taekwondo-federation. Last accessed: 28 Oct. 2023. [3] Personal Data Protection Commission. “Breach of the Protection Obligation by Genki Sushi”. 2 Aug. 2019. https://www.pdpc.gov.sg/all-commissions-decisions/2019/08/breach-of-the-protection-obligation-by-genki-sushi.
Last accessed: 28 Oct. 2023. [4] Personal Data Protection Commission. “Breach of the Protection Obligation
by AIA”. 20 Jun. 2019. https://www.pdpc.gov.sg/all-commissions-decisions/2019/06/breach-of-the-protection-obligation-by-aia. Last accessed: 28 Oct. 2023.

10.2.3 US Federal Trade Commission (FTC) probes


Facebook (now Meta) was fined US$5 billion in 2019 by the FTC for
• repeatedly using deceptive disclosures and settings to undermine users’ privacy prefer-
ences, allowing the company to share users’ personal information with third-party apps
that were downloaded by the user’s Facebook “friends”, and
• not taking adequate steps to deal with apps that it knew were violating its platform
policies.
Reference: US Federal Trade Commission. “FTC Imposes $5 Billion Penalty and Sweeping New Privacy
Restrictions on Facebook”. 24 Jul. 2019. https://www.ftc.gov/news-events/news/press-releases/2019/07/ftc-imposes-5-billion-penalty-sweeping-new-privacy-restrictions-facebook. Last accessed:
28 Oct. 2023.

10.3 Liability
• In §7.2, we saw a number of ways AI can malfunction.
• There is currently no consensus on who should be legally or financially responsible
for the decisions and mistakes made by AI.
• For example, in the fatal accident from §1.3.2 where AI drove a car into a truck whose
colour was similar to that of the sky, it is not clear who should bear the consequences.

Question. Who should be held accountable for the decisions and mistakes made by AI?

• AI users: they are the people actually operating the AI and have the responsibility to
perform a final check on the outputs
– However, there is often little users can do to control the AI.

• AI developers: they are the people who made all the design and engineering decisions
for the AI, and they are responsible for informing the users of the limitation of the AI
– However, it is hard for them to guard against unintended uses of the AI.
• owners of training data: the AI learnt from the data they provide

– However, they may not even be aware that their data were used in training AI,
and they probably have no control over how the training is done.
• the AI itself: it is the entity that actually made the decision/mistake
– However, it is unclear to what extent the AI is an entity.

Maybe multiple parties should be held accountable.

10.4 Intellectual property


• As we saw in §1–§5, people can use AI to create new products, new ideas, inventions,
etc.
• As with liability, there is no consensus on who should own these creations.

Question. Who should own the intellectual properties created using AI?

• AI users: they created the intellectual properties, albeit using a tool that happens to
be AI
– However, arguably, AI did a significant part of the creative work.

• AI developers: they are the people who made all the design and engineering decisions
for the AI
– However, owning the AI does not immediately imply that they own everything
produced using the AI.
• owners of training data: the AI output is modelled upon what is used to train the AI

– However, they do not need to be involved in the creation process at all.


• the AI itself: it is the entity that actually created the work
– However, allowing AI to own intellectual property may implicitly force one to
recognize, perhaps unwillingly, that AI has a mind of its own.

Maybe multiple parties should own the intellectual property jointly.

• The following are some AI-generated images that combine the content of a photograph
(labelled A) with the style of several well-known artworks (in the bottom left corner of

each panel).

Image source: Leon A. Gatys, Alexander S. Ecker, Matthias Bethge. “A Neural Algorithm of Artistic
Style”. arXiv:1508.06576v2 [cs.CV], Sep. 2015.

• The images generated by AI are not exactly the same as the original inspirations, but
their art styles can be strikingly similar.
• Whether these count as plagiarism is still fiercely debated, with the legal discussion
largely centering on whether art generated by AI is transformative enough to be covered
under fair-use law.

HS1501 §11

Future

[This chapter is to be updated before Week 12.]


We have gathered in previous parts of this course a rough picture of the current status of
AI. In this final part, we review what people anticipate about future AI.
• capabilities: in which aspects people want AI to be better, e.g., common sense, emo-
tion, causal inference

• human–machine interfaces, e.g., headgears, brain–computer interfaces


• brain-inspired AI: innovating AI using our knowledge about the brain
• quantum computing: this new mode of computing can speed up computations re-
quired in AI implementation

• society: how AI will change the society

11.1 Capabilities
• In previous parts, we have seen AI’s capabilities in language, vision, robotics, and data
analytics, all of which are advancing at very high speeds.

• However, in many of these cases, the AI is trained for one and only one specific task,
e.g., playing Go or recognizing objects in an image. Such an AI is called Artificial Narrow
Intelligence (or Narrow AI, or ANI, or Weak AI ).
• Nowadays, some AI are capable of performing multiple tasks. One notable class of ex-
amples is the Transformer language models, which are capable of answering questions,
translating text, writing poems, summarizing text, etc.
• One grand aim of AI research is to develop a single AI that can perform any human
task, e.g., sensory perception, fine motor skills, navigation, causal inference, reasoning
about hypothetical situations and abstract concepts, problem solving using logic and
common sense, natural-language communication, creative work, social and emotional
engagement, planning, and continuous self-learning. Such an AI is called Artificial
General Intelligence (or General AI, or AGI, or Strong AI ).
• The latest Transformer language models, e.g., OpenAI’s GPT-4 released in Mar. 2023,
are already able to perform quite a number of these tasks.

– For example, GPT-4 is multi-modal, in the sense that it accepts both text and
image inputs, as demonstrated below.

– This also demonstrates the capability of GPT-4 to communicate in natural
language, understand social context, apply common sense, and infer cause.
– GPT-4 passed a simulated bar exam with a score around the top 10% of test
takers, and obtained a score of 700 out of 800 in a simulated SAT Math Test.
– However, GPT-4 still hallucinates, although less frequently compared to its pre-
decessors.
Reference: OpenAI. “GPT-4”. 14 Mar. 2023. https://openai.com/research/gpt-4. Last accessed:
4 Nov. 2023.

• It is highly debatable whether current AI can generate original ideas and be creative.
– Technically, AI learns from the data provided to it, and nothing else.
– So it is not able to generate ideas out of nowhere.

– However, like human creators, it is able to put together what it learnt in a way
that has not been done before.
– Here is an example created using Stable Diffusion.

Image source: Ugleh. “Spiral Town – different approach to qr monster”. Reddit, 10 Sep. 2023.
https://www.reddit.com/r/StableDiffusion/comments/16ew9fz/spiral_town_different_approach_to_qr_monster/.

– Arguably, this counts as creativity.


– What complicates the matter further is that the human user is involved in in-
structing the AI to create the work, e.g., by providing multiple rounds of carefully
crafted text instructions.
– It is less clear how much of the creativity involved lies with the human user, and
how much with the AI; cf. the image in §1.3.1 that won an art competition.
• AI that can grasp abstract concepts (e.g., university-level mathematics), understand
emotions, or explain cause and effect are still in early stages of development.

– Being able to grasp abstract concepts is one of the fundamental characteristics of
human intelligence.
– AI that can understand human emotions are sometimes referred to as emotion AI
or affective AI. It is important in facilitating human–AI interactions.
– AI that can explain cause and effect are sometimes referred to as causal AI. It is
important in making longer-term or hypothetical predictions.
• The following figure shows where various AI applications are in what the company
Gartner calls the Hype Cycle, and how long they take to reach the plateau of productivity,

according to the company in Jul. 2023.

Image source: Lori Perri. “What’s New in Artificial Intelligence from the 2023 Gartner Hype Cycle”.
Gartner, 17 Aug. 2023. https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2023-gartner-hype-cycle. Last accessed: 2 Nov. 2023.

• If AI research can achieve AGI, then it is conceivable that, with more effort, an AGI
can surpass human abilities. Such an AGI is called Artificial Superintelligence (or
Super AI, or ASI ). The point at which ASI is achieved is called the singularity.

11.2 Human–machine interfaces


• AI, say, with its vision capabilities and its capability to analyze sensor data, enables
humans to interact with machines in more natural ways.
• Some such ways are through virtual reality (VR), augmented reality (AR), and the
metaverse, which we saw already in §3.8.2.
• Reportedly, the company Neuralink is developing devices that allow computers to re-
ceive brain signals of users as input.
• Maybe these VR/AR headgears or these devices that connect human brains directly to
computers will become prevalent in the future; maybe some other new interface will.

11.3 Brain-inspired AI
• In a sense, Artificial Intelligence (AI) is intended to simulate human intelligence, and
human intelligence takes place in the brain. So it is natural to use knowledge about
the brain to develop AI.

• As we described them in §6.1, artificial neural networks resemble animal neural networks
in a number of ways.
– They both perform complex operations by combining various simpler operations.
– Each neuron passes on information to the next only when a certain threshold is
reached.
– This passing-on of information is modulated, using neurotransmitters in brains,
and using weights in artificial neural networks.
– Etc.
• At the same time, there are many ways in which artificial neural networks differ from
animal neural networks.
– Artificial neural networks are faster than animal neural networks, but they use
more energy.
– Artificial neural networks need much more data to learn well, compared to
animals.
– Animals can acquire new knowledge without losing what they already learnt, while
at present artificial neural networks typically have difficulty doing the same.
– Etc.
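The threshold-and-weight mechanism described above can be sketched with a single artificial neuron (a minimal illustration with arbitrary numbers, not how production neural networks are implemented):

```python
# Minimal sketch of one artificial neuron: inputs are combined into a
# weighted sum, and the neuron passes information on (fires) only when
# that sum clears a threshold. The weights play the modulating role
# that neurotransmitters play in biological neurons.

def neuron(inputs, weights, threshold):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0  # fire, or stay silent

# Arbitrary example values: the weighted sum is 0.85, which clears
# the threshold of 0.8, so the neuron fires.
print(neuron([0.5, 0.9, 0.2], [0.4, 0.7, 0.1], threshold=0.8))  # prints 1
```

Stacking many such neurons in layers, and learning the weights from data, is what an artificial neural network does.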
• Listen to Matthew Botvinick, AI expert and neuroscientist from Google DeepMind and
University College London, talk about how neuroscience helps AI research.

2 min 38 sec
Video source: Google DeepMind (@Google DeepMind). “Neuroscience and AI – Matt Botvinick”.
YouTube, 9 May 2018. https://youtu.be/uv4Hh3wDH14.

• Neuromorphic computing refers to computing approaches inspired by the structure and
the function of actual brains.
• While Artificial Neural Networks simulate how the brain works using software, it is
possible to do the same using hardware.
• One example is the TrueNorth chip produced by IBM in 2014.
• Another example is the Loihi chip developed by Intel. It was first released in 2018.
• Loihi works at low power levels and uses the so-called Spiking Neural Networks.
– Signals pass through usual Artificial Neural Networks instantaneously.
– This is different in a Spiking Neural Network, in which signals may be fired off at
different times to achieve different effects.
– This more closely resembles the brain.
Reference: Intel. “Taking Neuromorphic Computing to the Next Level with Loihi 2 Technology Brief”.
https://www.intel.com/content/www/us/en/research/neuromorphic-computing-loihi-2-technology-brief.html. Last accessed: 5 Nov. 2023.
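The timing point can be made concrete with a toy "leaky integrate-and-fire" simulation, a standard textbook simplification (this sketch is ours; it is not how Loihi actually works, and all numbers are made up for illustration):

```python
def simulate_spikes(input_currents, threshold=1.0, leak=0.5):
    # Membrane potential accumulates over time; a spike fires when it
    # crosses the threshold, so *when* inputs arrive matters, not just
    # their total amount.
    potential = 0.0
    spike_times = []
    for t, current in enumerate(input_currents):
        potential = potential * leak + current  # leak away, then integrate
        if potential >= threshold:
            spike_times.append(t)  # record the time of the spike
            potential = 0.0        # reset after firing
    return spike_times

# The same total input (1.4 units) spread thinly never causes a spike,
# while the same amount delivered as a burst does.
print(simulate_spikes([0.35, 0.35, 0.35, 0.35]))  # []
print(simulate_spikes([0.7, 0.7, 0.0, 0.0]))      # [1]
```

In other words, in a spiking network the timing of signals carries information, which is not the case in a usual artificial neural network.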

• Watch what Loihi can do in the following video.

1 min 31 sec
Video source: Intel Newsroom (@IntelNewsroom). “Combining Vision and Touch in Robotics Using
Intel Neuromorphic Computing”. YouTube, 15 Jul. 2020. https://youtu.be/tmDjoSIYtsY.

• Listen to Prof. Yu talk about why such neuromorphic chips have not worked out well
yet in the video below.

1 min 51 sec

• As people get a better understanding of how the brain works, conceivably more neuro-
morphic approaches to AI will emerge.

11.4 Quantum computing


• As we saw in §6.2 and §6.8, current AI technology requires heavy computation.
• A limit on computation power thus also limits how well AI can perform.
• In theory, so-called quantum computing techniques can speed up certain computations
by, in effect, performing a huge number of them in parallel.

• So quantum computers can help make better AI.


• However, quantum computing is still in its infancy, and it is developing very slowly.
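The "parallelism" claim can be made slightly more concrete with a toy classical calculation (an illustration only, not a real quantum program): a register of n qubits is described by 2**n amplitudes, so a single operation on the quantum state acts on all 2**n of them at once.

```python
import math

def uniform_superposition(n):
    # Applying a Hadamard gate to each of n qubits, starting from the
    # all-zero state, yields an equal superposition over all 2**n basis
    # states; each amplitude is 1/sqrt(2**n).
    amplitude = 1 / math.sqrt(2 ** n)
    return [amplitude] * (2 ** n)

state = uniform_superposition(3)  # 3 qubits
print(len(state))                 # 8 amplitudes
print(round(sum(a * a for a in state), 6))  # 1.0 (probabilities sum to 1)
```

Note that a classical simulation like this must store all 2**n amplitudes explicitly, which is exactly why large quantum computations are hard to simulate classically.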
• Here is a picture of the IBM Q System One quantum computer, which was first introduced in 2019.

Image source: IBM Research, CC BY 2.0, via Wikimedia Commons. https://commons.wikimedia.org/wiki/File:IBM Q system (Fraunhofer 2).jpg.

• It is still not clear when quantum computers can be put into practical use.

11.5 Society
• In the short run, AI is expected to play a key role in the so-called “Fifth Industrial
Revolution”, aka “Industry 5.0”, which focuses on human–machine collaboration, sus-
tainability, human-centredness, and the environment.
• In the long run, it is almost certain that AI will continue to transform our
society, but it is not entirely clear how.

Question. Now that AI is able to produce extremely realistic fake videos, how can we still
trust anything without seeing it in person?
Question. Without any trust in society beyond one’s personal reach, will society need
to work differently? If yes, then what are the differences?

Question. What social standing will AI-powered machines have eventually?


Question. Will AI technology evolve to a point where some machines are socially accepted
as independent participants in the society?

Question. Will AI technology evolve to a point where many (if not all) people are integrated
with a personalized AI assistant that enhances the person’s abilities and enables the person
to live virtually forever?

Question. Which of the following do you think will most likely be how humans (not) live
with AI in the future?

• AI augments humans, increases their productivity, automates the mundane tasks, and
leaves the non-automatable “interesting” tasks to humans.
• AI takes over all the human jobs completely and solves all the problems, while humans
can pursue anything according to their interests without having to worry about making
ends meet.

• People’s lives go on as usual, while AI becomes just another common item in people’s
toolboxes, like electricity, computers, and the Internet today.
• People get worried about the alarming advancement of AI, stop AI development com-
pletely, and never return to it.

• The very small number of people who are in control of AI and its development become
super-rich, while the remaining majority suffer in poverty and have no way to improve
their lives.
• AI surpasses human abilities, gains a mind of its own, goes out of control, and en-
slaves/kills/exiles all humans.

Different people may have different views on the questions above. Listen to what Elon
Musk thinks about future job markets in the video below, where he was interviewed by UK
Prime Minister Rishi Sunak at the AI Safety Summit in Nov. 2023. Musk is a businessman who has
been actively involved in many AI projects, including OpenAI, the Tesla Autopilot system,
Neuralink, and more recently xAI.

1 min 31 sec
Video source: The Telegraph. “Elon Musk tells Rishi Sunak ‘AI will mean people no longer need to work’”.
YouTube, 3 Nov. 2023. https://youtu.be/eeLGD5pegIM.

AI is now very actively developed and implemented by government agencies, research
institutes, universities, commercial companies, non-profit organizations, and even individuals.
This development is virtually unstoppable. We hope that this course provides you with some
basic information about AI that will help you survive better in this world where AI is more
and more widely adopted. To stay competitive, you will very likely need to keep up with
the rapid advancement of AI continuously. Be open to change, and keep learning.

