
Byeong Gi Lee

Understanding the Digital and AI Transformation

Byeong Gi Lee
Professor Emeritus
Seoul National University
Seoul, Korea (Republic of)

ISBN 978-981-96-0032-8    ISBN 978-981-96-0033-5 (eBook)
https://doi.org/10.1007/978-981-96-0033-5

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2025

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore

If disposing of this product, please recycle the paper.


Preface

Human society is now transitioning into a digital society. Just as the Industrial Revo-
lution led to a shift from agrarian society to industrial society, the Digital Revolution
is driving the transformation from industrial society to digital society. The digital
society has begun to sprint towards a digital and artificial intelligence (AI) society.
As human society transitions into a digital society, everything is changing to a
new paradigm. The digital transformation began in industry, fundamentally changing
corporations. It went beyond merely adopting digital technologies in the manufac-
turing sector, transforming business models, operations, organizational structures,
and decision-making processes, thereby innovating businesses. The digital transfor-
mation is spilling over into society, broadly changing human social activities. From
politics and economics to community life and even the daily routines of individuals, a
comprehensive shift toward digital concepts and digital methods is underway.
Transitioning to a digital society does not mean the disappearance of existing
industries, just as agriculture did not vanish with the transition to an industrial society.
It is simply a shift to a new paradigm. Just as farming transitioned from plows to
tractors in the industrial society, in the digital society it will evolve into farming
with self-driving tractors built on digital technology.
The start of human society’s transformation to a digital society occurred approx-
imately 50 years after the first emergence of digital concepts. The kickoff for digital
transformation was the digital conversion that started in the 1960s to realize the dream
of long-distance communication. Converting analog signals to digital not only made
long-distance communication possible but also enabled the comprehensive processing of
voice and video signals together with computer data. This provided the impetus for the
convergence of communications and computing. First, communications and computers
converged at the level of communication, establishing an internet-based communi-
cation platform. Subsequently, convergence occurred at the system level, along with
the emergence of smartphones, establishing a content platform based on operating
systems (OS). On this platform, open application marketplaces opened, attracting a
flood of applications, which in turn spurred the rapid growth of various application
platforms for search, social media, online commerce, and content sharing. This led
to the “ICT Big Bang,” marking the start of the Digital Revolution sweeping across


industries and society, propelling the digital transformation. This digital transforma-
tion is accelerating towards a digital and AI transformation with the emergence of
AI technologies like ChatGPT.
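The digital conversion described above rests on two operations, sampling and quantization. As a rough, self-contained sketch (not from this book; the `digitize` helper and its parameter names are illustrative assumptions), the idea can be expressed in a few lines of Python:

```python
import math

def digitize(signal, sample_rate, duration, levels=256):
    """Convert an 'analog' signal (a function of time t, in seconds,
    returning amplitudes in [-1, 1]) into a list of integer codes.

    Sampling picks discrete instants; quantization maps each sampled
    amplitude onto one of `levels` integer steps. The result is ordinary
    computer data, storable and transmittable like any other bytes.
    """
    codes = []
    for n in range(int(sample_rate * duration)):
        t = n / sample_rate                    # sampling instant
        x = signal(t)                          # analog amplitude
        q = round((x + 1) / 2 * (levels - 1))  # quantize [-1, 1] to an integer
        codes.append(min(max(q, 0), levels - 1))
    return codes

# A 1 kHz tone sampled at 8 kHz (the classic telephony rate) for 1 ms
tone = lambda t: math.sin(2 * math.pi * 1000 * t)
print(digitize(tone, 8000, 0.001))  # 8 integer codes spanning 0..255
```

Once voice is reduced to such integer codes, it can be mixed with text, video, and computer data on the same digital channel, which is precisely what made the convergence of communications and computing possible.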
Digital technologies at the heart of digital transformation are greatly impacting
human life today. Digital technology is bringing about changes in communication
methods, access and distribution of information, work and learning methods, and
lifestyles. Furthermore, the spearhead of digital technology, AI, is emerging and
starting to add new momentum to these changes. These changes are extending beyond
individual lifestyles to corporate activities and government operations, significantly
altering the way human society operates. Moreover, while there are positive changes
such as improved accessibility to information and services, enhanced social connec-
tivity, and industrial innovation, various issues are also emerging, including digital
divides, digital illiteracy, job loss, misinformation, fake news, and privacy concerns.
What is the wise way to live through this era of digital and AI transformation? It
is essential to understand the direction of change in our times and to act, socially and
personally, in alignment with these trends. To do so, one must first understand
what digital technologies are, what their characteristics are, and how they function. In
addition, it is necessary to comprehend digital platforms, the services they provide,
and the reciprocal obligations implicitly attached to these services. It is also crucial to
understand AI technology, how digital and AI technologies transform industries
and society, and what benefits and problems these changes entail.
This book is intended to help the reader find answers to such questions. By reading and
reflecting on this book, the reader will gain an understanding of the essence of
digital and AI transformation and find ways to live and act wisely in the digital age.
I am grateful to everyone who helped in the process of writing this book. My
thanks go to Profs. Bahk Saewoong, Shim Byunghyo, and Moon Byung-Ro of Seoul
National University, Profs. Kang Chung Gu and Lee Inkyu of Korea University, Prof.
Steven Whang of KAIST, Professor Emeritus Noh Seok Kyun of Yeungnam Univer-
sity, Executive Vice President Choi Sunghyun of Samsung Electronics, former Pres-
ident Choi DooWhan of POSCO DX, and Vice President Doh Youngjin of Hyundai
Motor Company. I also thank the Academy of Science of Republic of Korea for their
support in writing this book.

Seoul, Korea (Republic of) Byeong Gi Lee


Chronology of Digital and AI Technologies

Contents

1 Introduction to Digital and AI Transformation . . . . . . . . . . . . . . . . . . . . 1


2 Foundations of Digital Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Development of Communications and Computers . . . . . . . . . . . . . . 13
2.1.1 Wired Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.2 Wireless and Mobile Communications . . . . . . . . . . . . . . . . . 15
2.1.3 Computers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.4 Computer Communications . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.5 Broadcasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.6 Key Technological Elements . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2 Digital Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.1 Three Types of Signal Sources . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2 Analog and Digital Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2.3 Digital Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.4 Circuit Mode and Packet Mode . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Digital Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3.1 Digital Integration in Circuit Mode . . . . . . . . . . . . . . . . . . . . 25
2.3.2 Digital Integration in Packet Mode . . . . . . . . . . . . . . . . . . . . 26
2.3.3 Digital Integration in Wireless Communication . . . . . . . . . . 27
2.3.4 Internet Communication Platform . . . . . . . . . . . . . . . . . . . . . 30
2.4 Digital Convergence, ‘ICT Big Bang’ . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4.1 Digital Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.2 ‘ICT Big Bang’—OS Aspect . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.3 ‘ICT Big Bang’—Business Aspect . . . . . . . . . . . . . . . . . . . . 34
2.4.4 Foundation of Digital Transformation—A Summary . . . . . 35
3 Digital Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.1 Establishment of Digital Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2 Types of Digital Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.2.1 Communication Platform (Internet) . . . . . . . . . . . . . . . . . . . . 39
3.2.2 Content Platform (OS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2.3 Application Platform (Apps) . . . . . . . . . . . . . . . . . . . . . . . . . . 41


3.2.4 Examples of Platform Services . . . . . . . . . . . . . . . . . . . . . . . . 42


3.2.5 Content, Applications, and Services . . . . . . . . . . . . . . . . . . . 44
3.3 Digital Platform Companies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.1 Apple Inc. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.3.2 Google Inc. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3.3 Amazon Inc. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3.4 Meta Platforms (Facebook) . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4 Nature of Digital Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.4.1 Two-Sided Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.4.2 Network Externality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.4.3 Transaction Costs and Advertisers . . . . . . . . . . . . . . . . . . . . . 52
3.4.4 Attention Economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.4.5 Regulatory Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.5 Dysfunctions of Digital Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.5.1 Search Engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.5.2 Social Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.5.3 Personal Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.5.4 Lowest Price Guarantee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.5.5 Natural Monopoly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.6 Regulation of Digital Platform Companies . . . . . . . . . . . . . . . . . . . . 60
3.6.1 Methods of Regulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.6.2 EU’s Digital Markets Act (DMA) . . . . . . . . . . . . . . . . . . . . . 63
3.6.3 EU’s Digital Services Act (DSA) . . . . . . . . . . . . . . . . . . . . . . 66
4 Digital Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.1 5G/6G Mobile Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.2 Internet of Things (IoT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3 Cloud Computing, Edge Computing . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.4 Digital Virtual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.4.1 Augmented Reality (AR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.4.2 Virtual Reality (VR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.4.3 Metaverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.5 Digital Twin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.6 Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.6.1 Big Data Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.6.2 Bioinformatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7 Cybersecurity Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.8 Robots, Autonomous Driving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.8.1 Robots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.8.2 Robotic Process Automation (RPA) . . . . . . . . . . . . . . . . . . . . 86
4.8.3 Autonomous Vehicles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.8.4 Autonomous Drones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.9 Decentralized/Distributed Technology . . . . . . . . . . . . . . . . . . . . . . . . 90
4.9.1 Blockchain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.9.2 Cryptocurrency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

4.9.3 Web 3.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92


4.10 3D/4D Printing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.11 Quantum Computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.12 Artificial Intelligence (AI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5 Artificial Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.1 Human Intelligence and Artificial Intelligence . . . . . . . . . . . . . . . . . 102
5.1.1 Human Cognitive Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.1.2 Cognitive Process of AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1.3 Understanding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.1.4 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.2 Development of Artificial Intelligence . . . . . . . . . . . . . . . . . . . . . . . . 105
5.2.1 Early Stages of Artificial Intelligence . . . . . . . . . . . . . . . . . . 106
5.2.2 Neural Networks and Machine Learning . . . . . . . . . . . . . . . 106
5.2.3 Generative Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.2.4 Transformer Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.3 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.3.1 Search Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.3.2 Sorting Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.3.3 Optimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.4 Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.4.1 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.4.2 Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.4.3 Semi-supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.4.4 Reinforcement Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.5 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.5.1 Structure of Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . 116
5.5.2 Types of Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.5.3 Recurrent and Convolutional Neural Networks . . . . . . . . . 120
5.5.4 Learning Process of Neural Networks . . . . . . . . . . . . . . . . . . 123
5.5.5 Practical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.5.6 Deep Learning, Deep Neural Networks . . . . . . . . . . . . . . . . . 126
5.6 Transformer Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.6.1 Transformers Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.6.2 Self-attention Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.6.3 Comparison with RNN and CNN . . . . . . . . . . . . . . . . . . . . . . 132
5.6.4 Natural Language Processing . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.7 GPT and BERT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.7.1 Architectures of GPT and BERT . . . . . . . . . . . . . . . . . . . . . . 135
5.7.2 Applications of GPT and BERT . . . . . . . . . . . . . . . . . . . . . . . 137
5.7.3 ChatGPT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.8 Implementation of AI Cognitive Functions . . . . . . . . . . . . . . . . . . . . 140
5.8.1 Functions of Blocks in AI Systems . . . . . . . . . . . . . . . . . . . . 141
5.8.2 Complexity of AI Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.8.3 Number of Weight Parameters . . . . . . . . . . . . . . . . . . . . . . . . 145

5.8.4 Training of AI Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146


5.9 Challenges and Limitations of AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.9.1 Technological Challenges and Limitations . . . . . . . . . . . . . . 151
5.9.2 Social and Ethical Challenges . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.9.3 Risks and Threats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.9.4 Regulation and Governance . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.9.5 EU’s Artificial Intelligence Act (AIA) . . . . . . . . . . . . . . . . . . 160
6 Digital and AI Transformation of Industry . . . . . . . . . . . . . . . . . . . . . . . 163
6.1 Benefits of Digital Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.2 Drivers of Digital Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.3 Application of Digital Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.4 Digital Transformation in Traditional Industries . . . . . . . . . . . . . . . . 168
6.4.1 Operational Improvement in Traditional Industries . . . . . . . 169
6.4.2 The Case of Energy Company ENGIE . . . . . . . . . . . . . . . . . 169
6.4.3 The Case of John Deere & Company . . . . . . . . . . . . . . . . . . . 171
6.5 Digitalization in the Manufacturing Industry . . . . . . . . . . . . . . . . . . 172
6.5.1 ‘Industry 4.0’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.5.2 The Case of Steel Company POSCO . . . . . . . . . . . . . . . . . . . 175
6.5.3 The Case of Hyundai Motor Company . . . . . . . . . . . . . . . . . 177
6.6 Digital Transformation Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.7 AI Transformation of Industry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7 Digital and AI Transformation in Society . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.1 Society in the Digital Age . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
7.2 Culture in the Era of Digital Transformation . . . . . . . . . . . . . . . . . . . 187
7.3 Changes of Jobs in the Digital Age . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.4 Digital Divide, Digital Inclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.5 Education in the Digital Age . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.6 Politics in the Digital Age . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
7.7 Digital Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
7.7.1 Digital Surveillance Society . . . . . . . . . . . . . . . . . . . . . . . . . . 198
7.7.2 Surveillance Capitalism Society . . . . . . . . . . . . . . . . . . . . . . . 200
7.8 Digital Self-restraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.8.1 Fear of Disconnection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
7.8.2 Degraded Concentration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
7.8.3 Digital Escapism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
7.8.4 Digital Technostress . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
7.8.5 Confinement of Thoughts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
7.9 AI Transformation in Society . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

8 Challenges of Digital and AI Transformation . . . . . . . . . . . . . . . . . . . . . 211


8.1 Digital Revolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
8.2 Digital Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.3 Education, Digital Literacy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
8.4 Misinformation, Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.5 Personal Information, Personal Competence . . . . . . . . . . . . . . . . . . . 219
8.6 Role of Government . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
8.7 Era of AI, Age of AI Robots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Acronyms

2FA Two-Factor Authentication


ADAM Adaptive Moment Estimation
ADSL Asymmetric Digital Subscriber Line
AI Artificial Intelligence
ALU Arithmetic Logic Unit
AMPS Advanced Mobile Phone Service
API Application Programming Interface
AR Augmented Reality
ARP Address Resolution Protocol
ATM Asynchronous Transfer Mode
AWS Autonomous Weapon System
BERT Bidirectional Encoder Representations from Transformers
BFS Breadth-First Search
BISDN Broadband Integrated Services Digital Network
CAD Computer-Aided Design
CATV Cable Television
CATV Community Antenna Television
CBDC Central Bank Digital Currency
CDMA Code-Division Multiple Access
CGI Computer Generated Imagery
CNN Convolutional Neural Network
CPS Cyberphysical System
DApps Decentralized Applications
DAW Digital Audio Workstation
DeFi Decentralized Finance
DFS Depth-First Search
DL Deep Learning
DMA Digital Markets Act
DNN Deep Neural Networks
DSA Digital Services Act
EDM Electronic Dance Music


FDMA Frequency-Division Multiple Access


FLOPS Floating-point Operations per Second
FNN Feedforward Neural Network
FOMO Fear of Missing Out
FTP File Transfer Protocol
GAI General AI
GAN Generative Adversarial Network
GELU Gaussian Error Linear Unit
GNN Graph Neural Networks
GPAI General-Purpose AI
GPT Generative Pre-trained Transformer
GSM Global System for Mobile Communications
HDTV High-Definition Television
HSPA High-Speed Packet Access
HTML HyperText Markup Language
HTTP HyperText Transfer Protocol
IaaS Infrastructure as a Service
ICT Information and Communication Technology
IDS Intrusion Detection System
IoT Internet of Things
IP Internet Protocol
IPS Intrusion Prevention System
IPTV Internet Protocol Television
IPv4 Internet Protocol version 4
IPv6 Internet Protocol version 6
IS Interim Standard
ISDN Integrated Services Digital Network
ISP Internet Service Provider
ITS Intelligent Traffic System
ITU International Telecommunication Union
LAN Local Area Network
LAW Lethal Autonomous Weapon
LIDAR Light Imaging, Detection, and Ranging
LSTM Long Short-Term Memory
LTE Long-Term Evolution
MAC Medium Access Control
MAC Multiplier Accumulator
MFN Most-Favored Nation
MIDI Musical Instrument Digital Interface
MIMO Multi-Input Multi-Output
ML Machine Learning
MLM Masked Language Modeling
MR Mixed Reality

MSE Mean Square Error


NAI Narrow AI
NER Named Entity Recognition
NFT Non-Fungible Token
NLP Natural Language Processing
NR New Radio
NSP Next Sentence Prediction
OFDMA Orthogonal Frequency-Division Multiple Access
OS Operating System
OTT Over the Top
PaaS Platform as a Service
PDA Personal Digital Assistant
PoS Proof of Stake
PoW Proof of Work
RADAR Radio Detection and Ranging
RARP Reverse Address Resolution Protocol
ReLU Rectified Linear Unit
RL Reinforcement Learning
RLHF Reinforcement Learning from Human Feedback
RNN Recurrent Neural Network
RPA Robotic Process Automation
SaaS Software as a Service
SDGs Sustainable Development Goals
SMTP Simple Mail Transfer Protocol
STB Set-Top Box
STEM Science, Technology, Engineering, and Mathematics
TCP Transmission Control Protocol
TDMA Time-Division Multiple Access
Telematics Communications + Informatics
UDP User Datagram Protocol
UHDTV Ultra-High-Definition Television
UI User Interface
UX User Experience
V2I Vehicle to Infrastructure
V2V Vehicle to Vehicle
VAE Variational Autoencoder
VDSL Very high-speed Digital Subscriber Line
ViT Vision Transformer
VLOP Very Large Online Platform
VLOSE Very Large Online Search Engine
VR Virtual Reality
WCDMA Wideband Code-Division Multiple Access
WiFi Wireless Fidelity
WiMAX Worldwide Interoperability for Microwave Access

WLAN Wireless Local Area Network


WPA WiFi Protected Access
WWW World Wide Web
XR eXtended Reality
Chapter 1
Introduction to Digital and AI Transformation

We often feel as if we were living in unfamiliar territory in our daily lives. The familiar
landscapes of the past are continually disappearing, replaced by unfamiliar digital devices
and tools. The life pattern of the industrial age has been gradually changing to that
of the digital age. Everyone carries a smartphone and depends on it for everything.
Beyond just making calls and sending texts, we enter chat rooms, search for informa-
tion, read newspapers, listen to music, watch videos, take photos, make notes, check
appointments, and find our way. Smartphones have expanded the range of tasks we
can perform independently, while simultaneously reducing what we can do without
them. In the midst of becoming accustomed to smartphones, the symbol of the digital
age, we have unwittingly stepped into the digital age.
Human society, after maintaining a hunter-gatherer lifestyle for ages, transitioned
to an agrarian society and then into an industrial society with the onset of the
Industrial Revolution in the late eighteenth century. Utilizing buried carbon resources
to create power and applying spinning wheel technology that began in medieval
monasteries, humans produced power beyond the limits of human muscle. The Industrial
Revolution, sparked by the invention of the steam engine, shaped the industrial
society through the nineteenth and twentieth centuries, developing various
technologies and ultimately blooming into today’s prosperous material civilization.
On this foundation, the electronic civilization took the baton, advancing
communication, computers, and semiconductors, laying the groundwork for the Digital
Revolution.
The development and convergence of communications and computers, supported by
semiconductors, intensified the momentum of the Digital Revolution. This conver-
gence peaked with the integration of communication–computer devices (i.e., smart-
phone) and the ignition of the ‘ICT Big Bang’ with the App Store, triggering the
Digital Revolution and initiating the digital transformation of the industrial society.
We are living in an era of digital transformation. Digital kiosks replacing staff in
restaurants, online book orders that allow purchasing books even as physical stores
disappear, internet banking for transferring money at midnight, queue apps intro-
duced in local clinics, search engines for finding any information, and social media
networks for mingling with friends—all these represent the changes of the digital
transformation era. The smartphone, an essential item we carry and use daily, is an
emblematic device of digital transformation technology that integrates voice, video,
data, computer, and communication services. The internet and social media we use
every day are means of social connection and communication brought by digital trans-
formation. The frequently mentioned ‘Fourth Industrial Revolution’ is none other
than the digital transformation of the manufacturing industry. Thus, digital trans-
formation is fundamentally reshaping personal life, social activities, and industrial
production methods, totally transforming human society and industry.
Digital transformation stems from the transformation of the industrial society
caused by the Digital Revolution. Just as the Industrial Revolution brought about a
paradigm shift from agrarian to industrial society, the Digital Revolution brings a
new paradigm shift from industrial to digital society. During the Industrial Revolu-
tion, various production machines, including the steam engine, were the epicenter,
causing a major shift in production methods, transitioning from rural to urban-centric
societies, and changing lifestyles. Similarly, the Digital Revolution is changing indus-
trial activities and spreading to social activities on platforms established through the
vertical convergence of communications and computers, transitioning from an indus-
trial to a digital society. This transformation of the industrial society driven by the
digital paradigm is the essence of digital transformation.
Foundation of Digital Transformation
How did the digital transformation, which exerts such a massive influence, begin?
Was it invented out of necessity by humanity, or did it start as a small change that
spread and led to larger changes? How has digital transformation unfolded and taken
root in human society? What impact does it have on human industry and social
activities, and how does it transform them? These questions need to be carefully
considered and their answers meticulously reviewed. By doing so we will be able to
understand the changes brought about by digital transformation more clearly, identify
and better respond to associated problems, and predict future directions of change.
Amidst this, we can engage in social activities that align with digital transformation
and lead a comfortable life in the digital age without anxiety.
Digital transformation was not deliberately invented to bring convenience to
human life or contribute to productivity enhancement. It was a phenomenon that
naturally appeared during the development of communications and computers. The
starting point of digital transformation was digital conversion. This was the recip-
rocal signal processing action of converting analog signals to digital signals and
vice versa. Digital represented a new world pioneered by communications engi-
neers who dreamed of noise-free long-distance communication, leaving behind the
analog world. They discovered and invented the theories and technologies supporting
digital conversion, implemented it in the form of digital communication, and realized
the dream of long-distance communication. Once analog signals were converted to
digital signals, it became possible to process voice and images along with computer
data, providing the impetus for the convergence of communications and computers.
Communications and computers developed independently, competed, and clashed
before finally converging. This convergence progressed from wired to wireless
mobile communication and evolved at the system level, achieving complete integra-
tion of communications and computers. This integration became the driving force
behind the significant changes leading to digital transformation. The dream of long-
distance communication opened a new world of digital, which became the starting
point for the digital transformation that eventually changed the world.
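The digital conversion described above rests on two operations: sampling (analog time to discrete time) and quantization (analog amplitude to discrete levels). The following sketch is an illustrative toy, not drawn from the book; it digitizes one millisecond of a telephone-band tone at the classic 8 kHz rate with 8-bit samples.

```python
import math

def digitize(signal, fs=8000, duration=0.001, bits=8):
    """Toy A/D conversion: sample an analog signal (a Python function
    of time) at rate fs, then quantize each sample to 2**bits levels."""
    n_samples = int(fs * duration)
    levels = 2 ** bits
    samples = []
    for n in range(n_samples):
        x = signal(n / fs)              # sampling: analog -> discrete time
        x = max(-1.0, min(1.0, x))      # clip to the quantizer's input range
        q = round((x + 1.0) / 2.0 * (levels - 1))  # quantization: discrete amplitude
        samples.append(q)
    return samples

# A 1 kHz telephone-band tone, digitized as 8 samples of 8 bits each.
tone = lambda t: math.sin(2 * math.pi * 1000 * t)
codes = digitize(tone)
print(len(codes), min(codes), max(codes))  # → 8 0 255
```

Eight thousand such 8-bit samples per second, i.e., 64 kbit/s, is the rate of the classic digital telephone channel.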
The convergence of communications and computers progressed as a multidimen-
sional fusion across wired and wireless domains and systems. Initially, commu-
nications and computers converged at the signal transmission level, establishing a
communication platform where all voice, video, and data signals were transmitted
via the internet, regardless of being wired or wireless. Subsequently, communications
and computers converged at the system level, giving birth to smartphones and leading
to the establishment of a content platform where content, services, and applications
were created, distributed, and consumed on the base of operating systems (OS). On
this foundation, application stores engaged in the direct sale of applications, which
enabled them to assume a central position in the communication world, thereby
relegating traditional communications businesses to the periphery. This marked a
historic event that ended the golden age of 130 years of traditional communications
and ushered in the era of content. It was the ‘ICT Big Bang’ that shook the world of
information and communication. With App Stores at the center, all sorts of applica-
tions flocked to the OS content platform, from which various application platforms
for search, social media, online commerce, and content sharing rapidly expanded.
Thus, the wave of Digital Revolution swept across industries and society, acceler-
ating digital transformation. And this digital transformation is transitioning into a
digital and AI transformation with the emergence of AI technologies like ChatGPT.
Digital Platforms
The three digital platforms, namely the communication platform established through
the convergence of communications and computers at the communication level, the
content platform established through their system-level convergence, and the appli-
cation platforms formed on top of them, have functioned as engines driving digital
transformation. The communication platform is based on the internet, the content
platform is based on computer operating systems (OS), and the application platforms
are formed on various applications. These have become digital platforms, carrying
various contents, applications, and services, and have accelerated the digital trans-
formation of industries and societies. In particular, application platforms have penetrated deeply into society by providing services such as search, social media, online
commerce, and content sharing through various applications, leading the digital
transformation.
A digital platform is a space for creating, storing, delivering, processing, and
applying digital resources. Here, digital resources refer to digital content, applica-
tions, and digital services composed of various information. Therefore, digital plat-
forms form a digital ecosystem where all developers, providers, and users participate
in creating, distributing, and consuming various digital resources.
Digital platforms provide the digital technology infrastructure, such as hardware,
software, and networks, which enables the production and distribution of content,
applications, and services. They offer Application Programming Interfaces (APIs)
that allow developers to create applications and services in accordance with the
platform’s functionalities and formats. In addition, digital platforms provide user
interfaces, like web or mobile apps, enabling users to access, interact with, and utilize
the platform’s functionalities, content, and services. Thus, digital platforms are spaces
driven by digital technology for digital transformation and where the stakeholders
related to digital transformation communicate, transact, and collaborate.
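As a rough analogy, the relationship among the platform, its developer-facing APIs, and its user-facing interfaces can be sketched as follows. This is purely illustrative: the class and method names are invented and do not correspond to any real platform's API.

```python
# A toy model of a digital platform (illustrative only): developers
# register applications against the platform's published interface,
# and users discover and invoke them through the platform.
class Platform:
    def __init__(self, name):
        self.name = name
        self.apps = {}  # app name -> handler supplied by a developer

    def register_app(self, app_name, handler):
        """Developer-facing API: publish an app that follows the
        platform's calling convention (a function of one request dict)."""
        self.apps[app_name] = handler

    def invoke(self, app_name, request):
        """User-facing interface: route a user's request to a registered app."""
        if app_name not in self.apps:
            raise KeyError(f"no such app: {app_name}")
        return self.apps[app_name](request)

store = Platform("hypothetical app marketplace")
store.register_app("greet", lambda req: f"Hello, {req['user']}!")
print(store.invoke("greet", {"user": "Alice"}))  # Hello, Alice!
```

The point of the sketch is the separation of roles: the platform supplies the infrastructure and conventions, developers supply the applications, and users consume them, which is exactly the three-sided ecosystem described above.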
There are various types of digital platforms. Well-known examples include
application marketplaces like Apple’s App Store and Google’s Play Store, social
media platforms like Facebook and Twitter, e-commerce platforms like Amazon and
Alibaba’s mobile shopping, cloud computing platforms like Amazon Web Services
(AWS) and Microsoft Azure, and content sharing platforms like YouTube and Netflix.
In addition, there are various online commerce platforms, educational platforms,
collaboration platforms, messaging platforms, travel booking platforms, financial
payment platforms, and health fitness platforms.
Companies leading these digital platforms are called platform companies. While
the number of digital platform companies is as vast as the types of platforms, the
largest and most influential among them include the Big Tech companies like Apple,
Google (Alphabet), Amazon, and Meta (formerly Facebook). As of January 20, 2024,
these four companies are among the top 10 corporations in the world by market
capitalization. This fact indicates that platform companies are leading the era of
digital transformation and driving the economic landscape of the digital age.
Digital Technology
Where does the driving force that turns the great wheel of digital transformation
come from? It is digital technology. Digital technology has been applied to industry,
driving its digital transformation, and then it has spread to the digital transformation
of society. Without digital technology, neither the term nor the phenomenon of digital
transformation would exist.
Digital technology is varied, but all are information and communication technolo-
gies (ICT) that combine communications and computer technology. Technologies
directly related to communications include the fifth generation (5G) mobile commu-
nications, the Internet of Things (IoT), and others, while those related to computers
include artificial intelligence (AI), digital twins, quantum computing, etc. Technolo-
gies common to both communications and computers include cloud computing and
edge computing, augmented reality (AR) and virtual reality (VR), the metaverse,
autonomous driving, among others. In addition, technologies related to computer
data processing include big data analysis, cybersecurity solutions, and others like
robotic process automation, 3D printing, blockchain, etc.
5G mobile communication technology significantly outperforms 4G in various
aspects, including data transmission rate, latency, the number of devices that can
be connected simultaneously, and frequency efficiency. Therefore, it is an essential element for future services such as autonomous driving,
intelligent transportation systems (ITS), the IoT, smart cities, telemedicine, remote
education, and industrial automation.
The Internet of Things (IoT) extends the internet from people to objects,
connecting surrounding objects to the internet to enable them to communicate with
people and objects, share data, and operate intelligently. Thus, it can promote digital transformation by providing connectivity and innovation through automation and efficiency improvements, real-time situational awareness and response, and resource and energy management.
Cloud computing allows users or businesses to store and process data by
connecting to storage devices, computers, and application software mounted on a
cloud provider’s servers instead of owning physical hardware and infrastructure.
It saves the investment needed for purchasing digital devices and reduces ongoing
management costs, offering an economical and scalable solution. Furthermore, it
provides a means for users or businesses to collaborate and share documents remotely
without being limited by location.
Augmented reality (AR) is a technology that overlays digital information on the
real world seen through screens or glasses, enhancing users’ perception of reality.
Virtual reality (VR) creates a digital environment where users can act as if they
physically exist within a virtual world, and the metaverse is a digital platform where
the virtual world and the real world merge. The metaverse, with interconnected virtual
and real environments, allows users to participate in various social activities through
avatars, holding potential for various purposes in the future.
Big data analytics refers to the practice of collecting, storing, processing, and analyzing large volumes of data to extract meaningful patterns, correlations, trends, insights, and knowledge. Utilizing big data analytics can provide an in-depth understanding of phenomena, support data-based decision-making, and enable optimization in task processing.
Blockchain is a decentralized, distributed technology developed to allow peer-
to-peer transactions without the intervention of central authorities. Blockchain
technology offers a transparent and efficient means to record and verify transac-
tions, holding potential for various industries. Cryptocurrencies like Bitcoin and
non-fungible tokens (NFTs) are built on blockchain technology.
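The tamper-evidence of a blockchain comes from linking each block to its predecessor by a cryptographic hash. The following is a minimal sketch of that idea; real blockchains add consensus protocols, digital signatures, and proof mechanisms on top.

```python
import hashlib
import json

def make_block(transactions, prev_hash):
    """Link a block to its predecessor by hashing the previous block's
    hash together with the new transactions (a toy illustration)."""
    payload = json.dumps({"tx": transactions, "prev": prev_hash}, sort_keys=True)
    return {"tx": transactions, "prev": prev_hash,
            "hash": hashlib.sha256(payload.encode()).hexdigest()}

def valid(chain):
    """A chain is valid if every block's stored hash still matches its
    contents and every 'prev' pointer matches the preceding block's hash."""
    for i, b in enumerate(chain):
        if b["hash"] != make_block(b["tx"], b["prev"])["hash"]:
            return False
        if i > 0 and b["prev"] != chain[i - 1]["hash"]:
            return False
    return True

# Build a two-block chain and show that tampering with history is detectable.
genesis = make_block(["Alice pays Bob 5"], prev_hash="0" * 64)
block2 = make_block(["Bob pays Carol 2"], prev_hash=genesis["hash"])
print(valid([genesis, block2]))          # True
genesis["tx"][0] = "Alice pays Bob 500"  # rewrite a past transaction
print(valid([genesis, block2]))          # False
```

Because each hash depends on everything before it, rewriting any past transaction invalidates every later block, which is what makes the ledger transparent and verifiable without a central authority.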
Artificial intelligence (AI) refers to machines that mimic human intelligence
to think, learn, reason, solve problems, and make decisions like humans. Various
technologies get involved in implementing AI, including machine learning, neural
networks, natural language processing, and computer vision.
Artificial Intelligence (AI)
Artificial intelligence (AI), a facet of digital technology, uniquely mimics human
intelligence to learn, reason, solve problems, and make decisions. Through neural
networks and machine learning, AI acquires knowledge, bases reasoning on that
knowledge, solves problems, and makes decisions. AI aims to create systems capable
of performing tasks that require human intelligence, such as voice recognition, visual
recognition, decision-making, and language translation. Furthermore, it seeks to
mimic various aspects of human cognitive abilities to adapt to and improve situations
and operate autonomously. These tasks require capabilities in learning, reasoning,
problem-solving, perception, and understanding human language.
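The "learning" at the base of these capabilities can be illustrated with a deliberately tiny example, not taken from the book: a single artificial neuron trained by gradient descent to compute logical AND.

```python
import math

# Toy machine learning: one neuron (two weights and a bias) learns
# logical AND from examples by gradient descent -- a minimal instance
# of the learning that underlies modern neural networks.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w0, w1, b = 0.0, 0.0, 0.0   # parameters, initialized to zero
lr = 0.5                     # learning rate

for _ in range(2000):
    for (x0, x1), target in data:
        y = 1 / (1 + math.exp(-(w0 * x0 + w1 * x1 + b)))  # sigmoid output
        grad = y - target    # cross-entropy gradient wrt the pre-activation
        w0 -= lr * grad * x0
        w1 -= lr * grad * x1
        b -= lr * grad

def predict(x0, x1):
    return 1 / (1 + math.exp(-(w0 * x0 + w1 * x1 + b))) > 0.5

print([predict(x0, x1) for (x0, x1), _ in data])  # [False, False, False, True]
```

Nothing here is programmed to know what AND means; the rule emerges from repeatedly nudging the parameters to reduce the error on examples, which is the same principle, scaled up enormously, behind the neural networks discussed below.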
The term AI was first used in the 1950s, but the implementation process has
progressed slowly. The core elements of AI, including algorithms, machine learning,
and neural networks, have evolved over about 70 years, overcoming various chal-
lenges. Technological breakthroughs have propelled its development, marked by
several challenging events. Technologically, the emergence of recurrent neural
networks (RNNs) and convolutional neural networks (CNNs) effectively processed
sequential data like voice and grid-type data like images, respectively. In addition,
the advent of variational autoencoders (VAEs) and generative adversarial networks
(GANs) opened the door to generative models, and the emergence of the self-attention
mechanism facilitated parallel processing. From an event perspective, milestones in
AI development were set when IBM’s Deep Blue defeated the world chess champion
in 1997, IBM Watson won against human champions in a game show in 2011, and
Google’s AlphaGo beat the world’s top Go player in 2016.
The self-attention mechanism introduced in 2017 was implemented in the trans-
former architecture, revolutionizing natural language processing. The transformer
architecture was adopted in Google’s BERT and OpenAI’s GPT, marking a new
era in natural language processing. GPT, developed since 2018, is a “generative
pre-trained transformer” as its name suggests, breaking through the limits of neural
network size and capabilities with its self-attention mechanism and transformer struc-
ture. GPT evolved into GPT-3, GPT-4, etc., and similarly, BERT evolved into Bard,
Gemini, etc.
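The self-attention computation itself is compact. The sketch below shows scaled dot-product attention on toy 2-dimensional vectors; the learned projection matrices and multiple heads of a real transformer are omitted for clarity.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention, the core of the self-attention
    mechanism: every query is compared against all keys at once, so the
    whole sequence can be processed in parallel rather than token by
    token as in an RNN. Q, K, V are lists of equal-length vectors."""
    d = len(K[0])
    output = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)                # how much q attends to each position
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])  # weighted mix of values
    return output

# Self-attention: the same three token vectors serve as queries, keys, and values.
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(seq, seq, seq))
```

Each output vector is a weighted mixture of all the value vectors, with weights determined by query–key similarity; because no step depends on the previous step's output, the computation parallelizes across the sequence.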
ChatGPT, released in November 2022 and built on the GPT-3.5 model, is an advanced language model application of GPT. It is a generative neural network model operating on pre-training and
possessing a transformer architecture, specialized in natural language processing.
ChatGPT inherits the strengths of the transformer architecture, demonstrating excep-
tional abilities in understanding and generating natural language. The name ChatGPT
reflects its ability to engage in dialogue, answer questions, and perform various func-
tions in a chat format. However, ChatGPT also inherits the limitations of the biases
and inaccuracies present in the data used for its training.
The release of ChatGPT sparked extraordinary interest in AI. Witnessing AI’s
ability to generate human-like texts and converse with humans, people realized that
AI is not a future concept but a present reality. The launch of ChatGPT spurred
intense competition among ICT and platform providers, rapidly shifting research
and development focus toward AI. Concurrently, the release of ChatGPT served
as a catalyst for raising awareness about the potential risks of AI, sparking public
discussions on AI’s dangers and prompting government policy responses.
Digital and AI Transformation in Industry
The industrial sector was the first to apply digital technology, initiating digital trans-
formation. The reason traditional industries began to focus on digital transformation
was due to competition. If a company does not transition to digital, it risks falling
behind in the competitive race and eventually becoming obsolete. Just as sticking to
plow farming in the industrial age would inevitably lead to being outpaced by tractor
farming, in the digital era, clinging to simple tractor farming means being outcom-
peted by digitally enhanced, autonomous-driving tractors. Therefore, companies
have recognized digital transformation as a timely challenge and have competitively
plunged into it.
As much as digital transformation was a focus for companies, it was narrowly
defined in the past as the act of applying digital technologies to various business
areas to change corporate operations. However, the digital transformation pursued
by companies goes beyond merely adopting digital technologies. It seeks a compre-
hensive change that leverages digital technologies to transform business models,
processes, organizational structures, and decision-making processes, aiming to create
new value. Notably, with the rapid evolution of AI technology, companies are
applying AI significantly to tasks like processing large volumes of data, making
real-time decisions, ensuring precise quality control, and actively responding to
customers, pursuing automation and intelligence. This is known as AI transformation
of industry.
The digital and AI transformation of industries requires substantial investment,
such as purchasing various digital devices, developing software, and hiring digital
experts. It also demands concerted efforts from all management and entails the chal-
lenging task of comprehensively innovating the company. Thus, for a successful
transformation, it is necessary to present clear goals and visions, appoint a trans-
formation leader with full authority, establish a concrete transformation plan, and
form a dedicated organization. Moreover, it is essential to invest in securing digital
devices and technologies, build IT infrastructure, manage and analyze data, collabo-
rate closely with the field, and thoroughly plan and execute performance evaluation
and analysis.
The benefits a company gains from transitioning to digital and AI are diverse. It
can improve efficiency, reduce costs, and shorten market entry time through process
automation and operation optimization. Utilizing digital and AI technology allows
for the provision of personalized services that meet customer needs, offering new
experiences and satisfaction through interaction with customers. Digital and AI trans-
formation can spur innovation, enabling the development of new products, services,
and revenue streams that were unimaginable in the past. By leveraging real-time
data and advanced analytics for data-based decision-making, businesses can optimize
their strategies and quickly respond to market changes. Digital and AI transformation
also enables the optimization of resource use and minimizes environmental impact,
bringing companies closer to sustainability goals. Ultimately, embracing digital
and AI transformation allows for agile response to market changes, maintaining
a competitive edge.
Digital and AI Transformation in Society
Digital technology has permeated into society through the innovative services of
platform companies, significantly changing the way we engage in social activities
and conduct our daily lives. Digital technology has brought about changes in every-
thing from communication methods and access to and distribution of information, to
work and learning methods, and even lifestyle habits. For instance, everyone carries
smartphones that enable communication with others, searching for various informa-
tion, and enjoying news, sports, movies, videos, taking photos, and playing games.
The changes include positive aspects such as improved social connectivity, enhanced
access to information and services, and increased innovation in industries. However,
there are also negative aspects such as digital divides, job losses, and the spread of
fake news. Furthermore, there are risks of misusing digital technology to lead to a
digital surveillance society.
In the digital age, the use of digital devices has increased work efficiency, leading
to a trend of reduced manpower needed for tasks. In particular, with the spread of robotic process automation (RPA), the rapid replacement of manual and routine jobs has resulted in significant job losses, and the development of artificial intelligence (AI) is also reducing office jobs. In addition, job opportunities shrink further as banking operations transition to fintech using software and applications, ticketing or ordering
services are replaced with digital kiosks, and face-to-face services are being replaced
by electronic and cyber operations. The current issue of job shortages experienced
by nearly all countries is partly due to economic recessions, but a more funda-
mental cause is the reduction in the number of jobs due to the advancement of digital
technology. This is a common problem faced by both developed and developing
countries.
Digital and AI transformation is fundamentally changing the way teachers teach
and students learn by integrating digital technology into education. The first important
aspect of education in the digital age is to ensure all students have equal access to
digital devices and internet connections, enhancing the effectiveness of education
through the application of digital and AI technologies. In addition, it is crucial to use
various digital tools in education to improve students’ digital literacy and scientific
literacy. Moreover, education in the digital era needs to be restructured in preparation
for the future where humans coexist with digital and AI technologies. To that end, it
is essential to closely observe the development of AI to understand what it means to
be human in light of AI and explore which human abilities need to be developed for
coexistence with AI.
Entering the digital and AI transformation era, we observe various pathological
phenomena in political and social contexts. It is important to note how social media
platforms, emerging from digital transformation, have radically changed political
activities and social movements. Among the many changes brought about by hyper-
connected social media to the political and social environment, the most shocking is
the collective actions mediated by the internet. Social network services (SNSs) and
internet personal broadcasting through YouTube are platforms that have made this
possible. SNS provides a means for members to express and share their opinions,
while internet personal broadcasting allows individuals to disseminate their views to
an unspecified number of people. These social media platforms enable the formation
of groups that share opinions, and members of these groups can unite in collective
action. Unlike in the past, groups can form and act without the constraints of time or
space.
One phenomenon quietly unfolding in the digital transformation era is the collec-
tion of individual citizens’ and consumers’ information for surveillance purposes.
Collecting information on individuals for national surveillance is an act that occurs in
controlled societies and is not permitted in free democratic societies. However, even
in free democratic countries, individuals’ information is being collected and used,
such as when digital platform companies collect consumer information for targeted
advertising. The former is surveillance by the state, while the latter is surveillance by
businesses. While the former is conducted unilaterally by the state without the consent
of its citizens, the latter is carried out by businesses with the consent of consumers to
a significant extent. If the former violates citizens’ rights through political actions,
the latter is an economic activity conducted with consumers’ understanding. If the
former leads to a digital surveillance society, the latter guides us toward a surveillance
capitalism society.
Challenges of Digital and AI Transformation
As the digital era matures with the progress of digital transformation, the advent
of ChatGPT signals the baton passing to AI, preparing us for the AI era. However,
the transition from a digital society to an AI society differs from the shift from an
industrial to a digital society. While the transition to a digital society corresponds
to a paradigm change, the AI era exists on a continuum with the digital era. AI
emerged as one of the digital technologies and has been in use, gradually increasing
in importance until it eventually becomes the central axis of digital technology.
Therefore, the development of AI signifies that the digital transformation is fully
transitioning into a digital and AI transformation.
Digital transformation has introduced several new challenges to human society.
Digital platform services have raised issues regarding the protection of personal infor-
mation, and search engines and social media have introduced biases and distortions
to information. In particular, digital transformation has brought numerous challenges
to society, such as digital divides, job losses, cyber-attacks and data security, and the
spread of false information and fake news. The digital divide can cause socioeco-
nomic and educational inequalities in the digital age. False information and fake news
can amplify social conflicts, undermine national unity, overturn election results, and
put democracy at risk if misused in politics. The application of digital technology to
remote cameras and facial recognition technology for surveillance purposes raises
the possibility of a surveillance society. In education, there is the task of redefining
educational content in preparation for the future where humanity coexists with AI
robots and other digital technologies.
Solving these issues has become an essential task for the development of human
society and the improvement of human life. While some can be resolved through
individual efforts, most require systematic solutions through social institutions and
infrastructure. It is necessary to legislate or strengthen existing laws to protect
personal information, data security, and consumer rights. Also necessary is the
development of technologies to counter false information and fake news alongside
establishing punitive regulations, and the enhancement of network security measures
against cyber-attacks. In addition, various measures are also needed to control the
unchecked dysfunctions of social media and to prevent damage to liberal democracy in response to political and social pathologies, thereby protecting citizens from
digital surveillance and consumers from surveillance capitalism. In education, it is
necessary to enhance digital literacy, to establish an educational philosophy suited
for the digital and AI era, and to reflect it in educational practices. Also needed
are diverse policy solutions to address job losses and changes in the digital age,
along with various re-education and vocational transition programs to assist in digital
transformation. Through such multifaceted efforts, we can address the social chal-
lenges that accompany the digital society by preventing the problems posed by digital
technology, while ensuring the benefits evenly enjoyed by all citizens.
Digital transformation, a key icon of our era, has placed digital platforms at
the center of digital civilization, providing various means of communication and
connection, and hastening the advent of a ‘hyperconnected’ world. However, now
their development has reached its limits and is presenting various problems. Internet
search services have exposed issues related to privacy due to the collection of personal
information and its use for targeted advertising. Social network services have led to
the distribution of false information and fake news, as well as problems like cyber-
bullying. Meanwhile, as electronic commerce platform providers have strengthened
their market dominance, they can arbitrarily raise fees for sellers and reduce the choices available to users, leading to detrimental effects. Such problems have been
tolerated in the atmosphere of respect for technological innovation and industrial-
ization, but now the negative impacts have escalated into serious social issues, and
the issue of restricted competition due to the monopolies of platform companies
has reached a serious level. In response, various lawsuits have been filed, the U.S.
government has begun to regulate monopolistic practices, and the European Union
has started to respond with the Digital Markets Act and the Digital Services Act.
If we turn to AI technology, AI itself amplifies concerns about privacy violations,
data security, and digital surveillance because it relies on massive amounts of data to
function. The way AI gathers and uses extensive data can lead to issues of intellectual
property infringement, and decisions made based on AI can raise ethical and bias
concerns. On the other hand, using AI technology also makes it possible to develop
robust security technologies to address these issues. Thus, AI technology embodies
a double-edged sword in terms of information protection and digital surveillance.
Consequently, the social impact of the AI transition is extensive and multifaceted,
bringing both opportunities and challenges. The task during the digital and AI tran-
sition period will be to minimize various potential negative effects, such as issues
with information security, digital surveillance, property rights infringements, ethical
issues, and fairness, while maximizing the benefits of digital and AI technologies.
Organization of the Book
This book sequentially describes the topics of digital transformation from Chaps. 2
to 8, as illustrated in Fig. 1.1. It explains the foundation of digital transformation,
digital platforms, digital technology, AI technology, digital transformation of
industry, digital transformation in society, and the challenges of digital transforma-
tion in that order. This chapter, “Introduction to Digital Transformation,” outlines the
[Fig. 1.1 depicts the book’s organization as a layered stack: Challenges of Digital Transformation on top; below it, Digital Transformation of Industry and Digital Transformation in Society (Applications); then Digital Technologies and Artificial Intelligence (Technologies); then Digital Platforms; then Foundation of Digital Transformation; with Introduction to Digital Transformation at the base.]

Fig. 1.1 Organization of the book

entire content of the book. Chapter 2, “The Foundation of Digital Transformation,”
discusses the communications and computer technologies underpinning digital trans-
formation and the formation of communication and content platforms made possible
by their convergence. The next three topics—Chap. 3, “Digital Platforms,” Chap. 4,
“Digital Technology,” and Chap. 5, “Artificial Intelligence (AI)”—each deal with
the core technologies for digital transformation independently. The following two
topics—Chap. 6, “Digital Transformation of Industry,” and Chap. 7, “Digital Trans-
formation in Society”—discuss the application of digital transformation in industry
and society, respectively. Chapter 8 concludes the book by revisiting the challenges of
digital transformation discussed in Chaps. 2 to 7 from a comprehensive perspective.
Reading this book from Chaps. 1 to 8 allows readers to follow a sequence that starts
with the historical background before moving on to understand the core technologies
of digital and AI transformation, such as digital platforms, digital technology, and
AI, and then to examine the industrial and social impacts and issues of digital and AI
transformation. However, since the content of each chapter is independent, under-
standing the content of one chapter is not a prerequisite for reading the next. For
example, understanding the foundation of digital transformation is not a prerequisite
for reading about digital platforms, nor is understanding digital platforms a prereq-
uisite for understanding digital technology. Therefore, it is advisable to start reading
from the topics of most interest and then freely move to other topics as desired.
Chapter 2
Foundations of Digital Transformation

Several stages of technological innovation were needed until digital transformation
became a reality, and the technologies developed into an explosive force as they
evolved through the convergence of communications and computers. Initially estab-
lished were the theoretical support needed to convert analog signals to digital and the
technological means to implement these theories into actual systems. Theoretically,
digital signal processing theories, along with theories related to signal transmission,
switching, and networking, provided the necessary support. Technologically, infor-
mation and communication technology, computer technology needed to implement
digital conversion into systems, and semiconductor technology that provided the
physical means for implementation backed these developments. Digital conversion
was first applied to telephone voice signals and then to television video signals. As
these were integrated and processed with computer data in the digital domain, it
began to pave the way for the convergence of communications and computers. This
convergence at the system level of communications and computers gained explosive
power, resulting in the formation of communication and content platforms. These
platforms exerted tremendous power, impacting all industries and services, and led
to the widespread expansion of digital transformation.

2.1 Development of Communications and Computers

A landmark event in the history of communications was Alexander Graham Bell’s
invention of the telephone in 1876, which soon led to the establishment of AT&T.
In 1896, Guglielmo Marconi invented wireless telegraphy. The century from the
invention of the telephone to the breakup of the Bell System in 1984 was the
golden age of communications development. Various technologies for wired and

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_2

wireless communications, mobile communications, and data communications developed significantly, along with the components and materials supporting them, led by
AT&T Bell Laboratories.
A groundbreaking starting point in the history of computer development was the
creation of ENIAC in 1946. Following the invention of the transistor and the emer-
gence of integrated circuits, computers developed rapidly. Starting with mainframe
computers, development continued through minicomputers, microcomputers, and
personal computers (PCs), and with the advent of smartphones, computers were
integrated with communications terminals. With the rise of the internet and the
development of mobile communications, computers became the mainstay of mobile
computing and, in combination with information and communication, advanced ICT
technology. In this development process, IBM led the initial advancement, followed
by significant contributions from DEC, Hewlett-Packard (HP), Apple, and Microsoft
(MS).

2.1.1 Wired Communications

When telephone communication first started, users were directly connected by communication lines. However, it soon changed to a system where operators manually connected users through a central exchange, and in 1892, the Strowger switch was
invented, marking the beginning of automatic exchange by mechanical devices. The
invention of the vacuum tube electronic amplifier in 1906 made it possible to extend
the transmission distance, enabling transcontinental long-distance telephone calls
between New York and San Francisco in 1915. In 1956, the first transatlantic undersea
cable system, TAT-1, was installed, starting to connect the USA and Europe. A funda-
mental limitation of analog communication was noise and transmission capacity. The
introduction of digital communication technology drastically reduced noise, starting
with the digital modulation method, pulse code modulation (PCM), and the digital carrier T1 developed in 1962.
In addition, transmission capacity greatly increased with the introduction of optical
communication through fiber optics, with the first optical communication system
developed in 1975 based on the invention of the laser in 1960 and the practical level
of fiber optics reached in 1966. The issue of switching was completely resolved with
the development of the fully automatic electronic switch 4ESS in 1976, followed
by various types of electronic switches, including 5ESS. Meanwhile, as the impor-
tance of video and data services gradually increased, solutions were explored for
transmitting these services in an integrated manner using existing telephone networks.
In 1984, the Integrated Services Digital Network (ISDN) was standardized by the
International Telecommunication Union (ITU) and later expanded to the Broadband Integrated Services Digital Network (BISDN), creating the basis for integrated
services.

2.1.2 Wireless and Mobile Communications

Wired communications evolved into wireless and mobile communications. The foun-
dation of wireless communication was established in the late 1800s, with Maxwell
formulating four equations related to electromagnetic waves in the 1860s, and Hertz
successfully generating and detecting electromagnetic waves in 1888. Based on this,
Marconi invented wireless telegraphy in 1896 and succeeded in communicating
across the Atlantic using electromagnetic waves in 1901. Wireless technology was
initially used for wireless telegraphy, then shifted to broadcasting, leading to the
start of radio broadcasting in 1916 and the first demonstration of black and white
television broadcasting in 1927. Radar was invented in 1935. The first communica-
tions satellite, Echo I, was launched in 1960, followed by the broadcasting satellite
Telstar I in 1962, and INTELSAT I in 1965, ushering in the era of satellite commu-
nications. Mobile communications were first offered by Bell Labs in 1946 but did
not become widespread. The first generation of mobile communications (1G) began
much later, with the analog AMPS system in 1983. The second generation (2G) saw
the commercialization of the TDMA-based GSM system in 1991 and the CDMA-
based IS-95 system in 1996. The third generation (3G) converged both into the
CDMA technology, with the asynchronous WCDMA system starting in 2006 and
the synchronous cdma2000 system in 2007. The fourth generation (4G) used the
OFDMA technology, with the m-WiMAX system commercialized in 2006 and the
LTE system in 2009, but the LTE system took the lead in the market. The fifth generation
(5G) was commercialized in 2018, based on the OFDMA technology.

2.1.3 Computers

When the ENIAC computer was first developed in 1946, it was large enough to fill a
big room.1 The development of computers has progressed to the point where they can
now be held in hand. Technically, vacuum tubes were used in the 1950s, transistors
in the 1960s, and integrated circuits (IC) and large-scale integrated circuits (LSIC)
from the 1970s onwards. Large mainframe computers were first developed, followed
by minicomputers, microcomputers, personal computers (PC), and smartphones,
becoming smaller in size and lower in price. The user base also expanded from experts
to the general public. In the 1950s, only computer experts could use computers; in
the 1960s and 1970s, highly skilled individuals could use them; and from the 1980s
onwards, they became accessible to everyone.
In the 1960s, IBM led the development of mainframe computers, releasing the
IBM 1401 in 1960 and the IBM 360 in 1964. These large mainframe computers were
primarily used by large corporations, the military, and space development. From 1965,

1 ENIAC, created by John Mauchly and Presper Eckert at the University of Pennsylvania in 1946,
is considered the precursor of computers. It used 18,000 vacuum tubes and required a cooling area
of 167 square meters due to the heat generated.

small minicomputers like the PDP-8 became popular, led by DEC, Data General,
and HP. In the 1970s, several microcomputers were released, including Computer
Terminal Corporation’s Datapoint 2200 in 1970 and HP’s HP9830A in 1972, keeping computers in the realm
of experts until then.
The widespread distribution of computers for general use began in the mid-1970s
when IBM, HP, Apple, Commodore, and Compaq competitively launched personal
computers (PC). In 1973, IBM’s Los Gatos Research Laboratory created the proto-
type of a portable computer called SCAMP, known as the precursor of PCs. IBM
set the standard for PCs using DOS and Windows as operating systems (OS), and
most other companies released products compatible with IBM PCs. However, Apple
competed with IBM by launching PCs with a new concept using OS X as the oper-
ating system. The widespread adoption of PCs began in earnest in 1977 when Steve
Wozniak and Steve Jobs sold the first Apple II PC, equipped with BASIC language,
color graphics, and 4,100 characters of memory, for $1,297. IBM then shifted its
strategy from focusing on mainframes and minicomputers to actively entering the
PC business.
The transition of PCs to smartphones began with the personal digital assistants
(PDAs) of the 1990s. PDAs featured email, calendars, address books, calculators,
web browsing, and fax send/receive capabilities and could be used as cellular phones
when closed. When defining a smartphone as a small computer with a small OS
and a mobile phone, the first commercially successful smartphones were RIM’s
BlackBerry phone in 1999 and Motorola’s A760 handset in 2003. In 2007, Apple
launched the iPhone with iOS, disrupting the communications landscape with the
introduction of the App Store, marking the start of the smartphone era and leading to
the “ICT Big Bang.” In response to the iPhone, Samsung Electronics launched the
Galaxy with Android OS in 2009.

2.1.4 Computer Communications

Computer communications emerged from the need to connect terminals and computers to allow shared use of computers during their development. In 1960,
Paul Baran introduced the concept of distributed communication through packets,
departing from traditional circuit-switched communications. Packet switching was
implemented in 1964 and incorporated into the US Department of Defense’s
ARPANET project in 1965. In 1969, Leonard Kleinrock successfully conducted
the first data communication between UCLA and the Stanford Research Institute (SRI) as part of the
ARPANET project, considered the inception of the Internet. In 1973, Vint Cerf and
Robert Kahn proposed the TCP protocol, which was redefined as the TCP/IP protocol
in 1981 and adopted for ARPANET in 1983, establishing the Internet protocol.
In the context of Local Area Networks (LAN), Robert Metcalfe proposed the
concept of Ethernet in 1973, which was commercialized from 1980 and established
as the IEEE 802.3 standard in 1983. For Wireless LANs (WLAN), following Norman
Abramson’s development of the first wireless packet system, the ALOHA network in
1971, the IEEE 802.11 WiFi standard was established in 1997 and commercialized in

1999. It has since evolved into various versions, significantly increasing transmission
capacity and establishing itself as a means for wireless Internet access. Meanwhile,
in 1990, Tim Berners-Lee developed the World Wide Web (WWW), and Internet
Service Providers (ISPs) began to emerge in the late 1980s. The commercialization
of the Internet accelerated with the launch of Netscape in 1994 and Internet Explorer in 1995,
leading to rapid development. Particularly, the introduction of standardized ADSL in
1995, which opened the door to high-capacity subscriber network communications
after 1998, made the Internet a formidable competitor to traditional communications.

2.1.5 Broadcasting

Broadcasting began in the 1920s, starting with radio and followed by television,
which became practical in the 1930s.2 TV broadcasting evolved from black and white
to color TV in the 1950s.3 Originally, broadcasting modulated programs on waves
to a non-specific audience via terrestrial transmission, but cable television (CATV)
began in the 1950s to target areas with poor reception, becoming widespread in the
1970s.4 In the 1980s, terrestrial broadcasting expanded to satellite broadcasting, and
in the 1990s, it transitioned to digital, providing high-definition television (HDTV)
broadcasting services.5 Alongside, CATV also evolved into digital CATV. The tran-
sition of terrestrial broadcasting to digital required turning off all analog broadcasts
(“analog switchoff”) and simultaneously turning on digital broadcasting (“digital
switchover”).6
Cable television serves as a broadcast method that receives terrestrial or satellite
broadcasts via a communal antenna and provides services to subscribers through
cables, initially using coaxial cables and later fiber optics. Originally, because
antennas were set up high in areas with difficult reception for communal use, it was
also known as “community antenna television (CATV).” Broadcasting transitioned
from transmitting to a non-specific audience to providing subscription services to
cable subscribers.
Initially, CATV primarily offered one-way broadcast services but gradually
improved the cable network to enable two-way services. With the rise of the Internet
in the 1990s, cable modems were attached to provide Internet services alongside

2 The journey started in 1927 when Baird transmitted a signal over 705 km from London to Glasgow
through telephone lines, followed by the first transatlantic television signal between London and
the US in 1928, and a trial television service in Germany in 1929. However, the first commercial
television service was provided by BBC broadcasting in 1936.
3 In the United States, NBC broadcast the Rose Parade in color for the first time on January 1, 1954, using the NTSC system developed by RCA.


4 The first commercial CATV service started in 1948 in a small town in Pennsylvania, USA.
5 The first HDTV broadcast started in Japan in 1991. After digital HDTV standards were established, the US started broadcasts in 1998, Korea in 2001, and the UK in 2006.


6 Luxembourg was the first to start this transition in 2006, followed by the US in 2009, and Korea between 2010 and 2012 in phases.



broadcasting, leading broadcasters to enter the communications business. As CATV began to penetrate the communications market, telecom operators started offering
broadcasting services using Internet Protocol Television (IPTV). From 1998, telecom
operators began providing IPTV services using high-speed subscriber networks like
ADSL and VDSL.7 This introduced competition to the broadcasting market, with
IPTV’s significance lying in its use of IP packet mode technology, extending beyond
communications to broadcasting.

2.1.6 Key Technological Elements

Several elements have provided crucial milestones in the development of communications and computers, including the invention of new materials and the establishment
of new conceptual frameworks. Notably, the invention of the transistor in 1947 by
Bell Labs physicists Walter Brattain, William Shockley, and John Bardeen marked
a significant shift from vacuum tube-based electronics to semiconductor-based elec-
tronics, opening the path to miniaturization and cost reduction in communications,
broadcasting, and computer devices.
The invention of the laser in 1960 and the practical application of fiber optics with
attenuation to 20 dB/km in 1966 fundamentally solved the problem of transmission
capacity in communications. In addition, digital signal processing theory, emerging
in the 1960s, solved the noise problem in communications, enabling all analog signals
to be processed digitally and facilitating the digital convergence of voice, video, and
data.
New theories and concepts have marked significant milestones in communica-
tions development. The concept of hexagonal cellular communication, proposed
by Douglas Ring and William Rae Young of Bell Labs in 1947, was implemented
in mobile communication networks 30 years later. Claude Shannon’s information
theory, presented in 1948, outlined the limits of information that communication
methods and channels can convey.
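Shannon's limit can be made concrete with a short calculation. The sketch below uses Shannon's well-known capacity formula C = B log2(1 + SNR), which the text alludes to but does not state; the 4 kHz bandwidth and 30 dB signal-to-noise ratio are illustrative values resembling an analog telephone channel.

```python
import math

def shannon_capacity(bandwidth_hz, snr_db):
    """Shannon's channel capacity C = B * log2(1 + SNR), in bits per second."""
    snr_linear = 10 ** (snr_db / 10)   # convert dB to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# A 4 kHz channel at a 30 dB signal-to-noise ratio can carry at most
# about 40,000 bits per second, regardless of the modulation scheme.
print(round(shannon_capacity(4_000, 30)))
```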
In the development of the Internet, several groundbreaking inventions stood out.
Paul Baran’s proposal of packet-switched communications in 1960 opened a new
chapter for data communication. The ARPANET communication demonstrated by
Leonard Kleinrock and others in 1969 is celebrated as the first Internet communica-
tion. Vint Cerf and Robert Kahn’s proposal of the TCP protocol in 1973, formally
adopted as the comprehensive TCP/IP protocol in 1981, solidified IP packet mode
as the standard transmission method across various networks. ARPANET’s official
adoption of TCP/IP in 1983, replacing the Network Control Protocol (NCP),
marked the inception of today’s Internet.8

7 US West was the first to offer IPTV services in the US, starting in 1998.


8 January 1, 1983, the day on which ARPANET officially adopted TCP/IP, is often referred to as the Internet’s “flag day,” signifying the establishment and widespread adoption of the modern Internet’s structure.

2.2 Digital Conversion

Digital transformation was not initiated by knowing in advance that processing tasks digitally would be effective for integrating signals and then developing the related technologies. Instead, digital transformation originated from digital conversion, the process of converting analog signals to digital signals for long-distance communication. Once all analog signals were converted to digital, it became possible to
process all types of signals, including data signals, in digital format. This capability
created a synergistic effect, leading to integration and laying the groundwork for
digital transformation. Initially, the analog signals that were converted to digital
were telephone voice signals and later expanded to include images and television
video signals.

2.2.1 Three Types of Signal Sources

The history of information and communication development is the process of evolving methods to deliver voice, video, and data signals created by telephones,
televisions, and computers, respectively. Telephones, televisions, and computers
were electronic devices created at different times for different service purposes: tele-
phones for voice services in 1876, televisions for broadcasting services in the early
1920s (first broadcast in 1927), and computers for calculation and data processing in
1946. While telephones and televisions were of general interest from the beginning,
computers initially attracted experts’ interest, becoming popular among the general
public with the release of personal computers (PCs) in 1973. From a general users’
perspective, telephones for voice services appeared in 1876, televisions for video
services in 1927, and PCs for data services in 1973, each emerging about 50 years
apart.
Telephones, televisions, and PCs form three types of communication signal
sources for providing voice, video, and data communication services, respectively.
Voice signals, originally analog with a frequency band below 4 kHz, can be
converted to digital with a transmission rate of 64 kbps. Video signals, originally
two-dimensional analog signals, are scanned horizontally.9 In color TVs, colors
are decomposed into RGB (red, green, blue) and scanned for intensity. Converting
these analog TV signals to digital can yield a transmission speed of approximately
250 Mbps, but applying signal compression technologies like MPEG-2 can reduce it
to 2–6 Mbps.10 Digital TVs, with densely packed pixels within the screen, directly
convert the intensity of these pixels into digital signals. The transmission rate varies

9 This scanning process repeats 30 times per second in countries using the NTSC standard, like
the US, Japan, and Korea, while it occurs 25 times per second in countries adopting the PAL or
SECAM standards, based on the electrical supply frequency of 60 Hz and 50 Hz, respectively.
10 For example, standard-definition TVs scanning 720 pixels per line and 480 lines at 30 frames per second with 8 bits per color per pixel result in a data rate of approximately 248.83 Mbps.

with the pixel density or resolution, with standard-definition TV (SDTV) at 720 × 480
pixels, high-definition TV (HDTV) at 1920 × 1080 pixels, and ultra-high-definition
TV (UHDTV) at 3840 × 2160 pixels. Data signals, originally digital, typically have
lower transmission rates than voice or video signals but can be higher for multimedia
data.
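The transmission rates quoted above follow from simple arithmetic, sketched below. The 8 kHz sampling rate and 8-bit samples for voice, and the 8 bits per RGB color component for video, are the conventional figures implied by the text and its footnote.

```python
# Voice: a 4 kHz analog signal sampled at twice its highest frequency
# (8,000 samples/s) with 8 bits per sample gives the 64 kbps rate.
voice_bps = 2 * 4_000 * 8

# Raw standard-definition video: 720 x 480 pixels per frame, 30 frames/s,
# and 8 bits for each of the three RGB color components per pixel.
sdtv_bps = 720 * 480 * 30 * 3 * 8

print(voice_bps)        # 64000
print(sdtv_bps / 1e6)   # 248.832, i.e. the ~250 Mbps before MPEG-2 compression
```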

2.2.2 Analog and Digital Signals

How do analog and digital signals differ? Essentially, all sounds in their natural state
can be considered analog signals. Sounds heard by the ears and scenes seen by the
eyes are all analog. Human voices and voices transmitted through telephones are
analog, as are filmed scenes and broadcast television images. Analog signals change
continuously over time, as illustrated in Fig. 2.1. For instance, an analog signal x(t)
changes continuously over time t, with time t having continuous real values and the
analog signal x(t) also having real values.
In contrast, digital signals, as shown in the figure, are discontinuous. They do
not exist at all times but only at specific intervals, that is, at discrete times. Thus,
the signal is represented by natural numbers n in the figure. The digital signal x[n]
represents the magnitude of the signal at time n, which can be real values or limited
to certain values expressible in binary form. Strictly speaking, the former is called
a discrete-time signal, and the latter a digital signal. Since the time axis is discrete
time in both cases, the terms discrete-time signal and digital signal are sometimes
used interchangeably.
Since the invention of the telephone in the 1870s, traditional telephone commu-
nications have processed and transmitted voice signals in analog form. The system
that processes analog signals is referred to as an analog circuit (see Fig. 2.2). Analog
circuits consist of analog components like resistors (R), inductors (L), capacitors
(C), and operational amplifiers (OpAmp), with actual electronic components and
physical connections.
After the 1960s, as analog signals began to be converted to digital signals and tele-
phone communications shifted from analog to digital, voice signals were converted

Fig. 2.1 Example of analog and digital signals



Fig. 2.2 Example of analog and digital circuits

to digital, processed, and transmitted in digital form. Digital circuits, referred to in this context (see Fig. 2.2), comprise adders, multipliers, and delay elements.
Unlike analog components, these do not have physical forms or connections but
are programmed to perform calculations as depicted in the circuit by connecting
arithmetic logic units (ALU), multiplier-accumulators (MAC), memory, etc.
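As an illustrative sketch (not taken from the book), the short program below shows what such a programmed digital circuit computes: a three-tap filter built from exactly the elements named above, namely delayed samples, multiplications by coefficients, and an accumulating adder, the operation sequence a MAC unit performs.

```python
def fir_filter(x, coeffs):
    """Digital filter: y[n] = sum over k of coeffs[k] * x[n - k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0                       # accumulator (the "adder")
        for k, c in enumerate(coeffs):  # each tap = delay + multiply
            if n - k >= 0:
                acc += c * x[n - k]
        y.append(acc)
    return y

# A 3-tap moving average smooths a step input:
smoothed = fir_filter([0, 0, 3, 3, 3], [1/3, 1/3, 1/3])
print(smoothed)  # [0.0, 0.0, 1.0, 2.0, 3.0]
```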

2.2.3 Digital Conversion

The motivation for converting analog signals to digital signals was simply the desire
to overcome the limitations of transmission noise and enable long-distance commu-
nication. With analog signals, accumulated noise could not be removed, making
long-distance communication impossible. The reason was that there was no way
to distinguish between the signal sent by the sender and the noise that interfered
during transmission, making it impossible to remove the noise and restore the orig-
inal transmitted signal. However, when analog signals are converted to digital signals
and transmitted as binary 0’s and 1’s (in actual circuits, for example, 0 and 5 V), the
receiver can consider parts deviating from 0 and 1 as noise and decode the trans-
mission signal to restore the original signal (refer to Fig. 2.3). When this digital
conversion issue was theoretically and practically resolved, digital communication
became a reality, and the dream of long-distance communication was realized.
How was it possible to convert analog signals to digital signals and then convert
them back to recover the original analog signals? Two processes are necessary to
convert an analog signal to a digital signal. Using Fig. 2.1 as an example, one is
the sampling process related to the x-axis (i.e., the time axis), and the other is the
quantization process related to the y-axis (i.e., the signal magnitude axis). On the

Fig. 2.3 Example of analog and digital transmission noises (two panels, Analog Signal and Digital Signal, each showing Signal Sent, Noise Signal, and Signal Received)

x-axis, sampling involves taking samples at regular intervals from the continuous-
time analog signal to create a discrete-time signal. On the y-axis, quantizing this
discrete-time signal reads it as a digital discrete-time signal.11
The challenge in digital conversion is not just taking samples from the analog
continuous-time signal at regular intervals to obtain a discrete-time signal, but
whether it is possible to restore the original continuous-time signal from that discrete-
time signal. Thus, the crux of the digital conversion problem is how to sample in
the sampling process so that the original analog signal can be restored from the
sampled discrete-time signal. The Nyquist Sampling Theorem solves this problem.
It states that if samples are taken at a frequency more than twice the highest frequency
component contained in the analog signal, the original analog signal can be perfectly
restored from those samples.12 Of course, this theory assumes that no errors occur in
the quantization process and its reverse. This depends on how many bits are allocated
for quantization, and by setting the number of bits sufficiently large, quantization
error can be made negligible. In theory, if the number of bits is infinitely large, the
digital signal becomes identical to the analog signal.
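The two processes can be sketched in code. The example below assumes a 3 Hz tone sampled at 8 Hz, above the Nyquist rate of 6 Hz, and a uniform quantizer over the range [-1, 1]; the specific rates and bit widths are illustrative only. The printed worst-case error shrinks as bits are added, mirroring the claim that quantization error can be made negligible.

```python
import math

def sample(signal, fs, n_samples):
    """Sampling: read the continuous-time signal at discrete times k / fs."""
    return [signal(k / fs) for k in range(n_samples)]

def quantize(x, bits, lo=-1.0, hi=1.0):
    """Quantization: round x to the nearest of 2**bits equally spaced levels."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + round((x - lo) / step) * step

tone = lambda t: math.sin(2 * math.pi * 3 * t)   # 3 Hz analog signal
samples = sample(tone, fs=8, n_samples=16)       # 8 Hz > 2 * 3 Hz

# Worst-case quantization error for increasing bit allocations:
for m in (4, 8, 12):
    err = max(abs(x - quantize(x, m)) for x in samples)
    print(m, err)
```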
As digital conversion (i.e., analog-to-digital and digital-to-analog conversion)
became feasible, active academic research supported the processing of all analog
signal processing digitally. As a result, in the 1960s, digital signal processing theory,

11 ‘Quantization’ here means quantification or digitization, that is, expressing a real value as an m-bit binary digital signal. Quantizing is performed by checking which of the 2^m equally spaced gradations the real value is closest to and then reading that gradation value in binary.
12 Frequency components of a signal can be found by converting it into a Fourier series or applying a Fourier transform. The highest frequency component in the transformed domain is identified, and then sampling is done at more than twice that frequency.

Fig. 2.4 Circuit mode and packet mode (left panel: a circuit connecting nodes A–F through switches; right panel: packets passing between nodes A–F through routers with buffers)

corresponding to analog circuit theory, was established.13 For nearly a century after
the invention of the telephone, circuits were composed, analyzed, filtered, frequency
converted and modulated/demodulated in analog form, but digital signal processing
became independently possible for all these analog signal processing methods.
Components like resistors, inductors, and capacitors that constituted analog circuits
were replaced with digital components performing operations such as delay, addition,
and multiplication, and semiconductor chips performing these operations supported
digital signal processing and communication. As a result, in the 1960s, digital
channel banks, replacing analog channel banks, emerged, thus marking the begin-
ning of digital transmission. With the maturity of digital technology, digital transmis-
sion expanded to wireless, optical, undersea, and satellite communications, and as
switches were replaced with digital switches as well, a fully digital communication
network became possible.

2.2.4 Circuit Mode and Packet Mode

There was a fundamental difference between traditional telephone communication, which focused on voice services, and computer communication for data transmission.
Not only was there a difference between analog voice signals and digital computer
data signals, but a more significant difference was that voice signals are real-time,
whereas data signals are non-real-time. Telephone communication adopted a method
that pre-sets a circuit, a communication path connecting the sender and receiver, using
switches and then delivers voice signals in real-time over this circuit. In contrast,
computer communication collects data as it is generated, forms packets, attaches the
receiver’s address, and inputs them into a router. The former method is called circuit
mode, and the latter packet mode (refer to Fig. 2.4).

13See A. V. Oppenheim and R. W. Schafer, Digital Signal Processing, Pearson, 1975, and A. V.
Oppenheim and R. W. Schafer with J. R. Buck, Discrete-Time Signal Processing, 2nd Edition,
Prentice-Hall, 1999.

In a phone call, when the handset is lifted and the receiver’s phone number is
entered via dial or button, the telephone exchange receives that number, finds a path
to connect to the receiver, and sets the switches on all exchanges along that path. This
establishes the circuit. The process of finding a path to connect to the receiver is called
routing, which is similarly applied today when using a navigation system to find a
route to a destination on a map. Once the call starts, the circuit remains connected
until the end of the call, and no other user can interfere with that connection. The
user exclusively uses the circuit during the call duration, and thus, the usage fee is
charged based on the connection time. Since there are significant periods of silence
during a call, the efficiency of communication resource use is low.
When sending data from a computer, the sender’s computer packs the sender’s data
into packets, attaches the receiver’s address, and sends them to a router. Typically, a
message is divided into multiple packets for processing. The router reads the packet’s
address and sends it out through an exit leading to the destination, choosing exits with
less traffic. The next router receives it and processes it in the same manner. This way,
the packet passes through several routers to reach the destination computer, where the
receiver’s computer receives all the packets comprising the same message, assembles
them, and delivers them to the receiver. Since packets from the same message may
take different routes and arrive in a different order, the assembly process corrects
the sequence. Packet mode, therefore, has a relatively complex packet processing
procedure and has limitations in handling real-time signals. However, because users
do not monopolize communication resources and share them with all users, the
efficiency of resource use is high.
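The packet-mode steps just described can be sketched as a toy example; the packet size, field names, and addresses below are invented for illustration, and real routers of course operate on binary IP datagrams rather than Python dictionaries.

```python
def packetize(message, size, dest):
    """Sender: split a message into packets tagged with address and sequence number."""
    return [{"dest": dest, "seq": i, "data": message[i * size:(i + 1) * size]}
            for i in range((len(message) + size - 1) // size)]

def reassemble(packets):
    """Receiver: sort by sequence number to correct out-of-order arrival."""
    return "".join(p["data"] for p in sorted(packets, key=lambda p: p["seq"]))

pkts = packetize("the packet passes through several routers", size=8, dest="B")
arrived = list(reversed(pkts))   # the network delivered them out of order
print(reassemble(arrived))       # the original message is restored
```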

2.3 Digital Integration

Communication services for voice, video, and data signals began independently
and evolved by forming separate networks. However, as each domain expanded,
conflicts arose among different signal types, necessitating digital conversion and the integrated delivery of these signals through a single transmission medium. In particular, the need for integrated delivery between telephone voice services and computer data services grew, leading to frequent conflicts
between the two domains. Originally, data communication started to transmit data
between computers or between a computer and a terminal. As computers evolved
and PCs became widespread, the scope and frequency of computer communication
increased, leading to frequent conflicts between telephone communications. The
conflicts, which started in wired communications, intensified with the need to inte-
grate voice, video, and data services and later moved to the wireless mobile commu-
nication domain. Thus, the development process of information and communication
technology involved conflicts and confrontations, eventually finding solutions for
integration or convergence.

2.3.1 Digital Integration in Circuit Mode

Telephones, televisions, and computers contrast in terms of services. Telephony
began with a service that connects the sender and receiver via wires to provide voice
services, characterized as real-time bidirectional services with relatively low infor-
mation content. Television started as a real-time unidirectional service delivering
video signals to an unspecified audience via wireless, with relatively high informa-
tion content. This expanded to cable television (CATV) services, delivering real-time
unidirectional video services to subscribers. Computers were developed as indepen-
dent computing systems for processing large computations, necessitating computer
communication for data transmission between computers and user terminals. This
evolved into computer communication where users formed computer-centered
local area networks (LANs) for data communication. Computer data services are
bidirectional non-real-time services with relatively low information content. Thus,
these three services differed in terms of real-time capability, directionality, and infor-
mation content, leading to separate networks for voice, TV video, and data services in
the early days of services. However, as the number of users for each service increased,
expanding and building separate networks became economically burdensome.
The 1960s pursuit of digital conversion in telephony focused solely on noise-
free long-distance communication. With the success of digital conversion, digital
communication emerged, presenting new applications. This opened the possibility
of integrated processing of voice, video, and data signals generated from telephones,
televisions, and computers. This became possible because digital conversion resulted
in all signals, whether voice, video, or data, being in the same digital form, eliminating
the need to build three separate networks. Research was conducted to integrate and
deliver voice, video, and data services through the existing telephony network instead
of building individual networks. Digital conversion thus paved the way for service
integration.
Efforts to provide voice, video, and data signals in an integrated manner through
the same network intensified with the ITU-led ISDN in the early 1980s. This involved
digitizing voice and video signals and multiplexing them with inherently digital data
signals for transmission over a single communication line using the circuit mode
standard. With the advent of high-definition television (HDTV), which had an even
greater information content, ISDN was extended to Broadband ISDN (BISDN), and
Asynchronous Transfer Mode (ATM) was adopted as a new communication mode to
support it. ATM, which blends characteristics of the packet and circuit modes,
integrated voice, video, and data signals continuously. Unlike circuit mode, which
allocated digital time slots proportionally to the information content and sequentially
mapped voice, video, and data, ATM used cells of equal size for each signal type,
generating them in numbers proportional to their information content for continuous
transmission.
This approach using an integrated digital communication network eliminated the
need to build separate networks for voice, video, and data signals. Previously, these
signals required separate networks due to their differing characteristics. However,

after digital conversion, they all took the same form, with the only difference being
the number of bits. Digitally converted, the information content of TV video signals
can be hundreds to thousands of times greater than that of telephone voice signals.
Thus, allocating the length of time slots or the number of ATM cells proportional to
the information content of each signal type allows the simultaneous provision of all
three services through a single digital network.
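This proportional allocation can be illustrated with a quick calculation. An ATM cell is 53 bytes, of which 48 bytes are payload, so each service needs a cell rate proportional to its bit rate. The bit rates below are representative assumptions for the sketch, not figures from the text:

```python
# Cells per second required for each service, assuming illustrative bit rates.
CELL_PAYLOAD_BITS = 48 * 8  # ATM cell: 53 bytes total, 48-byte payload

services = {
    "voice (64 kbps)": 64_000,
    "data (1 Mbps)": 1_000_000,
    "video (100 Mbps)": 100_000_000,
}

for name, bps in services.items():
    cells_per_sec = bps / CELL_PAYLOAD_BITS
    print(f"{name}: {cells_per_sec:,.0f} cells/s")
```

Under these assumed rates, the video stream needs over a thousand times as many cells per second as the voice stream, matching the hundreds-to-thousands ratio of information content noted above.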

2.3.2 Digital Integration in Packet Mode

The ITU’s attempt to provide comprehensive digital services through ATM
and BISDN received widespread support from global communications operators,
successfully standardizing the approach. Subsequently, communications manufac-
turers developed ATM transmission and switching systems, and operators installed
these systems in their backbone networks. However, the deployment of fiber optics
in subscriber networks stumbled due to economic feasibility issues.
Digital communication, applied since the digital conversion of telephony, first took
root in backbone networks connecting exchanges. The replacement of transmission
systems and switches in these networks with ATM systems marked rapid progress
in integrating communication networks through the ATM mode. BISDN, which
assumed the use of optical fibers as transmission lines, had completed the replace-
ment of backbone networks with fiber optic networks, but progress in deploying fiber
optics in subscriber networks, which connect subscribers to telephone exchanges,
was slow. The subscriber lines, deployed since the early days of telephone service
using twisted-pair copper wires, had low transmission quality sufficient only for
short-distance voice signal transmission within 10 km or so. This infrastructure was
inadequate for ATM communication, making the replacement of subscriber lines
with fiber optics a critical requirement for BISDN’s success. However, since the
subscriber network is extensively spread geographically, the cost of deploying fiber
optics was substantial, while the revenue potential from such investment was uncer-
tain. Further, the services to offer over fiber optic subscriber lines were not ready and
the demand for broadband services among subscribers was obscure. Due to these
economic concerns, communications operators hesitated to invest in the subscriber
network, which led to BISDN’s stagnation.
Meanwhile, the internet began to move actively.14 The performance of routers
improved, and a variety of software to facilitate computer communication became
available. The development of the World Wide Web (WWW) started to attract public
interest, and the emergence of web browsers like Netscape and Explorer made internet
searches easier. On this foundation, Asynchronous Digital Subscriber Line (ADSL)
technology, which digitalized subscriber networks to raise transmission rates to
megabits per second (Mbps), began to be widely deployed. This triggered a rapid
increase in interest in the internet, which began to spread quickly, leading to the
construction of so-called high-speed networks. ADSL, evolving and expanding into
Very high-speed Digital Subscriber Line (VDSL), offered even higher transmission
rates, facilitating the rapid spread of the internet. On the other hand, BISDN based
on ATM remained stagnant, and as a result, telecom operators were pushed out of
the competition with the internet, leading to the abandonment of the ATM-BISDN
project. Thus, the internet emerged as the winner.

14 The internet, which started as a packet mode network called ARPANET in 1969, transitioned to
the Internet after adopting the TCP/IP protocol, which combines the Transmission Control Protocol
(TCP) and the Internet Protocol (IP), in 1983. Subsequently, the Domain Name System (DNS) was
introduced, allowing the use of domain names and IP addresses for addressing. Internet Service
Providers (ISPs) emerged in the late 1980s, and the commercialization of the internet began in
earnest in the mid-1990s, leading to its explosive growth.
The fierce ‘battle’ between ATM-BISDN and the internet in wired communi-
cations was intense. It was a confrontation between traditional communications
and computer communications, and a battle between ATM’s circuit mode and the
internet’s packet mode. However, this confrontation resulted in the internet’s victory,
as it bypassed the obstacle of fiber optic subscriber network laying through ADSL.
Thus, the battle between circuit mode and packet mode ultimately ended in a victory
for packet mode. This is the so-called First Digital War.15 Afterward the internet
has paved the way for service integration, accommodating not only data but also
voice and video signals in packet form within the IP packet mode. Although this
was vulnerable in its early stage for real-time communication services, the unlimited
increase in transmission capacity through fiber optic backbone networks and the
rapid improvement in router performance eventually overcame these vulnerabilities.
The significance of the ‘First Digital War’ goes beyond the simple victory of the
internet over ATM-BISDN in the subscriber network. The victory of the internet
over ATM-BISDN marked a historic event in the change of communication mode in
wired communications. The decline of service integration by ATM’s circuit mode and
the victory of the internet’s packet mode signifies that the packet mode technology
represented by the internet has gained the upper hand over the traditional circuit
mode technology represented by ATM. This indicates the beginning of a major
transformation in all wired communications beyond the subscriber network, where
packet mode replaces circuit mode.

2.3.3 Digital Integration in Wireless Communication

Following the conclusion of digital conversion and integration based on packet mode
in the wired communication domain, the battle between circuit and packet modes
shifted to the realm of wireless mobile communication.

15 The battle between ATM-BISDN and internet in the 1990s was a global confrontation in digital
communications, involving all telecom operators and manufacturers worldwide, with the internet
camp fully mobilized. At that time, the key debate topic at various academic conferences in the
fields of communications and computers was ‘ATM or the internet?’.

Table 2.1 Properties of mobile communication systems

1G: bandwidth 30 kHz; data rate 2.4 kbps; name AMPS; multiple access FDMA;
major functions: analog voice; use services: voice
2G: bandwidth 25 MHz; data rate 64 kbps; names GSM, IS-95; multiple access
TDMA (GSM), CDMA (IS-95); major functions: digital voice, text message; use
services: text, basic internet access
3G: bandwidth 25 MHz; data rate 2 Mbps; names WCDMA, HSPA, cdma2000;
multiple access CDMA; major functions: broadband mobile, international roaming,
global radio access; use services: visual phone, mobile TV, GPS
4G: bandwidth 100 MHz; data rate 1 Gbps; name LTE; multiple access OFDMA;
major functions: high-speed internet, multimedia streaming, seamless roaming,
global mobility; use services: electronic commerce, games, online shopping
5G: bandwidth ~30 GHz; data rate 10 Gbps; name NR; multiple access OFDMA;
major functions: high throughput, high reliability, low latency, enhanced mobile
broadband; use services: autonomous vehicles, digital factory, edge computing,
AR/VR

The dream of wireless mobile communication began in the 1900s, but the first
mobile communication system was established in 1946, and the first portable cell-
phone appeared in 1973. The first commercial mobile communication service was
provided in 1983 with the first generation (1G) mobile communication system
AMPS, an analog mobile communication using Frequency-Division Multiple Access
(FDMA). The user interest and competition among operators in mobile communica-
tion heated up, leading to the emergence of the second generation (2G) digital mobile
communication in 1991, which used Global System for Mobile Communications
(GSM) based on Time-Division Multiple Access (TDMA). In opposition, another
2G digital mobile communication system, IS-95, appeared in 1996, utilizing Code-
Division Multiple Access (CDMA).16 The third generation (3G) mobile communi-
cation emerged in 2006 with asynchronous WCDMA and synchronous cdma2000,
both based on CDMA. Later, the fourth generation (4G) appeared in 2009 and the
fifth generation (5G) in 2018, both utilizing Orthogonal Frequency-Division Multiple
Access (OFDMA). Table 2.1 lists a collection of the properties of 1G through 5G
mobile communication systems.
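A quick computation over the data-rate figures of Table 2.1 makes the pace of this progression concrete; each generational step multiplies the peak rate by roughly one to two orders of magnitude (a simple sketch using the peak rates as listed):

```python
# Peak data rates per generation, taken from Table 2.1 (bits per second).
rates = {"1G": 2_400, "2G": 64_000, "3G": 2_000_000,
         "4G": 1_000_000_000, "5G": 10_000_000_000}

gens = list(rates)
for prev, curr in zip(gens, gens[1:]):
    factor = rates[curr] / rates[prev]
    print(f"{prev} -> {curr}: about {factor:,.0f}x faster")
```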
With the advent of 2G digital mobile communication, data services began, and
users were interested in data transmission speeds, making speed enhancement a
focal point of competition from 3G onwards. Voice signals were accommodated

16 The CDMA technology was developed by Qualcomm in the USA, with South Korea pioneering
its commercial system. CDMA mobile communication was mainly used in the USA, South Korea,
Vietnam, India, Brazil, and Chile, covering about 13% of global mobile phone subscribers, while
86% used GSM.

using circuit mode, and data using packet mode. Until 3G, mobile communication
remained voice-centric even as data grew in importance, but with the rise of internet
usage, the demand for data services significantly surpassed that for voice, which was
reflected directly in the development of 4G mobile communication. Expanding
transmission capacity and efficiently handling internet data became central concerns
in developing 4G. Consensus was reached on adopting OFDMA to increase transmis-
sion capacity. However, opinions divided on handling data efficiently: one approach
continued using circuit mode time slots for data packets, and the other adopted the
‘all-IP mode’ for accommodating both voice and data in IP packets, as used on the
internet. The former approach led to LTE adopted by the traditional 2G and 3G
standardization group 3GPP, while the latter led to mobile WiMAX (m-WiMAX)
stemming from IEEE 802 series standards.17
The standardization battle for 4G mobile communication ultimately became
a competition between ITU’s 3GPP LTE and IEEE’s m-WiMAX. From another
perspective, as 3GPP represented traditional circuit mode and IEEE802 repre-
sented IP packet mode, this was essentially the ‘Second Digital War’ in wireless
mobile communication. During the lengthy standardization process, a surprising
turn occurred when the 3GPP camp, recognizing that future communications would
inevitably be data-centric, pivoted the LTE standard proposal to the ‘all-IP mode’.
Thus, both sides adopted the ‘all-IP mode’ for 4G standards, leading to a some-
what anticlimactic victory of packet mode over circuit mode in 4G wireless mobile
communication. Although LTE and m-WiMAX were both adopted as standards,
with most telecom operators and manufacturers participating in LTE,18 packet mode
emerged as the winner in the ‘Second Digital War’. As a result, mobile communica-
tion retained its traditional communication exterior but adopted the genetic makeup
of packet mode internet interior.
The victory of the IP packet mode in the ‘Second Digital War’ in wireless commu-
nication, following the ‘First Digital War’ in wired communication, marked a signif-
icant turning point in the history of communications development. It transformed all
domains, wired and wireless, to handle voice and video signals as IP packets, inte-
grated and processed together. Since the introduction of IPTV services in 1998, even
TV video signals began to be transmitted as IP packets. This development estab-
lished the IP packet mode as the dominant communication protocol, centralizing
the communications platform around the internet and integrating all communication
signals into it. Thus, the 130-year history of circuit mode communications ceded its
throne to IP packet mode, positioning the internet as the primary communications
platform.

17 As the internet penetrated the subscriber network via ADSL and spread to wired local area
networks (LAN/MAN) and wireless local area networks (WLAN/WMAN), it formed WiFi and
subsequently evolved into the broadband WiMAX. The expansion of fixed WiMAX into mobile
form is known as m-WiMAX.
18 Though m-WiMAX was aggressively pushed and standardized by companies like Intel and
Samsung Electronics, it couldn’t overcome the market dominance of LTE led by traditional telecom
groups.

2.3.4 Internet Communication Platform

So, what exactly is the internet, the victor that toppled the traditional communication
“Goliath,” and what is the core DNA within it, the IP packet mode?
The IP packet mode refers to the packet mode technique used by the Internet
Protocol (IP) within the TCP/IP protocol suite. The fundamental difference between
circuit and packet modes, as previously explained, is that circuit mode establishes
a circuit for continuous signal transmission for real-time services like voice, while
packet mode collects data as it becomes available and sends it in packets intermittently
for non-real-time services like data. For example, in telephone voice services, dialing
a receiver’s phone number prompts the exchange to establish a circuit connecting the
sender and receiver, which is exclusively used by them for the duration of the call, and
charges are based on the duration of circuit occupancy. Conversely, in packet mode,
the sender packages data into packets, attaches the receiver’s address, and sends
them to a router, where the packets are sent over a communication path shared with
other packets, with charges potentially based on the number of packets sent. Circuit
mode monopolizes communication paths, leading to low network usage efficiency,
while packet mode shares paths, enhancing efficiency but possibly introducing delays
due to packet processing and waiting times. Thus, circuit mode sacrifices resource
efficiency for real-time reliability, whereas packet mode achieves higher efficiency
by not being constrained by real-time requirements.
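The contrast between duration-based and packet-count-based charging can be made concrete with a toy tariff (all rates and sizes here are invented purely for illustration): a caller who speaks only part of the time pays for the whole circuit in circuit mode, but only for the packets actually sent in packet mode.

```python
def circuit_charge(call_seconds: int, rate_per_min: float = 0.10) -> float:
    """Circuit mode: pay for the whole connection time, silence included."""
    return call_seconds / 60 * rate_per_min

def packet_charge(payload_bytes: int, packet_size: int = 1200,
                  rate_per_packet: float = 0.0001) -> float:
    """Packet mode: pay only for the packets actually sent."""
    packets = -(-payload_bytes // packet_size)  # ceiling division
    return packets * rate_per_packet

# A 10-minute call in which only 40% of the time carries actual speech:
talk_ratio = 0.4
voice_bytes = int(600 * talk_ratio * 8000)  # 64 kbps voice = 8000 bytes/s
print(f"circuit: ${circuit_charge(600):.2f}")    # billed for all 10 minutes
print(f"packet:  ${packet_charge(voice_bytes):.2f}")  # billed for speech only
```

Under these assumed tariffs the packet-mode bill is a small fraction of the circuit-mode bill, reflecting the efficiency argument above: silent periods consume a dedicated circuit but generate no packets.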
The fundamental design concepts of traditional circuit-switched networks and
internet-based IP packet-switched networks are diametrically opposed. Traditional
telephone networks concentrate intelligence in the central switch, leaving user termi-
nals (i.e., telephones) with minimal intelligence. In contrast, the internet disperses
intelligence to the user terminals (i.e., computers), with routers in the network core
performing simple functions. Circuit-switched networks rely on complex systems
like exchanges that synchronize all incoming signals and compute routing paths
considering the entire network, leading to high development and maintenance costs.
Routers, however, set paths locally within each individual router, simplifying their
function and significantly reducing costs.
The IP packet mode refers to the packet system used on the internet, where the
TCP/IP protocol is utilized to control transmission, addressing, and routing. The
TCP/IP protocol provides a simple yet powerful means of data transmission through
its 4-layer structure. Notably, the Internet Protocol (IP) has a resilience akin to that of
a weed, capable of easily riding over any physical network and delivering data. This
allows it to traverse subscriber networks via ADSL. TCP, the Transmission Control
Protocol, supports various protocols used by internet users such as HTTP for WWW
access, FTP for file transfer, Telnet for remote access, SMTP for mail delivery, etc.,
all operating on top of it (refer to Table 2.2).19
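This layering is directly visible in everyday socket programming: an application hands bytes to TCP, and TCP/IP handles segmentation, addressing, and in-order delivery underneath. The sketch below is a generic loopback echo over TCP, not tied to any particular application protocol from Table 2.2:

```python
import socket
import threading

def echo_server(server_sock: socket.socket) -> None:
    """Transport layer view: accept one TCP connection and echo what arrives."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Create a TCP (SOCK_STREAM) socket; IP addressing comes from AF_INET.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))       # loopback address, OS-assigned port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=echo_server, args=(server,), daemon=True).start()

# Application layer: the client just writes bytes; TCP/IP performs the
# segmentation, addressing, routing, and in-order delivery underneath.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello over TCP/IP")
    reply = client.recv(1024)

server.close()
assert reply == b"hello over TCP/IP"
```

Protocols like HTTP, FTP, and SMTP are, in essence, agreed formats for the bytes exchanged over exactly this kind of TCP connection.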

19 The application layer of the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol
includes protocols like HTTP (Hypertext Transfer Protocol), FTP (File Transfer Protocol), Telnet (a
remote access protocol), and SMTP (Simple Mail Transfer Protocol), while the link layer includes
Medium Access Control (MAC) protocols like Ethernet, ADSL, and ISDN. UDP is the User Datagram
Protocol, used for simple message exchanges; ARP is the Address Resolution Protocol, which maps
IP addresses to physical network addresses (MAC addresses); and RARP is the Reverse Address
Resolution Protocol, which maps MAC addresses back to IP addresses.

Table 2.2 TCP/IP protocol suite

Layer  Layer name         Protocols
4      Application layer  FTP, HTTP, Telnet, SMTP
3      Transport layer    TCP, UDP
2      Internet layer     IP, ARP, RARP
1      Link layer         MAC (Ethernet, ADSL, ISDN)

The IP packet mode, developed initially for data services, requires performance
improvements to accommodate real-time services like voice and video. This involves
reducing the delay in segmenting, transmitting, and reassembling service signals
into IP packets to meet real-time service requirements. Though initially considered a
daunting challenge, advances in router technology and the introduction of optical
communications have gradually made solutions feasible. The TCP/IP protocol,
designed for stable wired networks, must consider additional factors when applied to
wireless mobile communication services due to varying wireless channel conditions.

The outcome of the “First and Second Digital Wars,” where the IP packet mode
prevailed over circuit mode to become the dominant communication platform, repre-
sents a significant technological shift. The Internet Protocol (IP) now provides a
universal medium for traffic delivery, offering a common foundation for related infor-
mation processing. This shift led to a change from traditional billing methods based
on call duration to packet-based charging, with modern mobile carriers adopting
tiered pricing plans based on data usage, a variant of packet-proportional billing.
Expensive voice service businesses transformed into cost-effective VoIP operations.
However, the most crucial change was the establishment of the IP-based communi-
cation platform itself, enabling a variety of services like email, web browsing, file
sharing, video conferencing, and online shopping to be provided over the internet.
The IP-based communication platform represents a vertical integration centered on
IP, solidifying this union and exerting significant influence across all digital industries
and services.

2.4 Digital Convergence, ‘ICT Big Bang’

The birth of the internet from computer communications and its victory over long-
standing traditional communications to establish itself as a communication platform
was a revolutionary event, significant enough to be called the ‘Big Bang’ of communi-
cations. However, the wave of digital integration did not end there. Communications
and computers advanced further, to merge at the system level, and eventually formed
a content platform, paving the way for the ‘ICT Big Bang’.

2.4.1 Digital Convergence

Computers, paralleling communications, evolved from mainframe computers to
minicomputers in the mid-1960s, to microcomputers in the early 1970s, to PCs in the
mid-1970s, and further miniaturized into smartphones in the 2000s. The inception of
smartphones began with PDAs in the 1990s, formalized with Symbian phones and
BlackBerry phones in the early 2000s, and was perfected with the launch of the iPhone
by Apple in 2007, equipped with the iOS operating system, followed by Android-
equipped smartphones. Android OS was introduced by Google in 2008 as a coun-
terpart to iOS, with Samsung releasing the Android-equipped Galaxy smartphone in
2009, followed by various other manufacturers.
The implications of the advent of smartphones are very significant. From the
perspective of communications, it corresponds to the addition of computer func-
tions to mobile communication devices, and from the perspective of computers, it
corresponds to the reduction of computers to the size of terminals. Wireless mobile
communication devices in communications began with vehicle-mounted devices,
went through the feature phone stage,20 and have reached smartphones. Computers
evolved through mainframes, minicomputers, microcomputers, and PCs to reach
smartphones. Therefore, a smartphone is both a communication terminal and a
compact computer, signifying the systemic convergence of communications and
computers. That is, communication devices and computers, having undergone their
own evolutionary processes, have converged at the system level with the advent of
smartphones. This digital convergence, resulting from collisions between communi-
cations and computers, constitutes the phenomenon of device-level ‘ICT Big Bang’,
or ‘Smart Big Bang’ (refer to Fig. 2.5).

2.4.2 ‘ICT Big Bang’—OS Aspect

The complete fusion of communications and computers through smartphones,
igniting the ICT Big Bang, was catalyzed by the App Store accompanying the
iPhone’s launch in 2007. At the time of the iPhone’s introduction, many telecom
operators were already operating application marketplaces. However, due to the use
of different Operating Systems (OS) by each operator, there was no interoperability,

20 Feature phones are multifunctional mobile phones used before the advent of smartphones. Before
feature phones, there were simple phones with only calling functionality. Simple phones evolved into
feature phones equipped with additional features like music, video, text messaging, and cameras,
which then further evolved into today’s smartphones with computer functions.

Fig. 2.5 Development of communications and computers, and ICT Big Bang

forcing application developers to create the same application for the OS specified
by each telecom operator. While Symbian and BlackBerry had a relatively large
adoption among manufacturers and a broad user base, their services were not user-
friendly, their interfaces were not intuitive, and their OS functionalities were limited,
hindering application activation. The application marketplace transformed rapidly
with the introduction of the open marketplace, the App Store, with the launch of the
iPhone. Application developers began to freely post their applications on the App
Store, and Google later entered the competition with the Play Store based on the
Android OS, leading to an explosive increase in the number of applications. This
represents the OS-level ‘ICT Big Bang’ phenomenon.
With the emergence of the App Store and Play Store, the competition in smart-
phones shifted from communications to applications, placing the OS at the center
of competition. This intense competition at the OS level marked the ‘Third Digital
War.’ Initially, Apple’s iOS competed against traditional mobile OSs like Symbian,
BlackBerry, and Series 40, but the scene expanded with the rapid rise of Android as a
latecomer. By 2012, iOS and Android began to dominate, with many manufacturers
adopting Android, establishing it as the dominant force in the mobile OS market (see
Fig. 2.6).21 Thus, the so-called Third Digital War in the mobile OS arena ultimately
concluded with a victory for the two major camps: Google’s Android and Apple’s
iOS.
The reasons behind the failure of Symbian, BlackBerry, Windows Mobile, and
others against iOS and Android include various factors. First, iOS introduced inno-
vation in user experience with its intuitive interface and seamless integration with
the existing Apple ecosystem, while Android provided a user-centric experience
with customizable interfaces and integration with Google services, unlike Symbian

21 In January 2010, the mobile market share was distributed with iOS at 33%, Symbian at 34%,
BlackBerry at 10%, and Android at 5%. By December 2012, this had shifted to Android at 33%,
iOS at 23%, Symbian at 11%, and BlackBerry at 4%. As of September 2023, the market is divided
between Android at 70% and iOS at 30%.



[Chart: mobile OS market-share curves for Android, iOS, Symbian, BlackBerry, and
Series 40; Android and iOS rise while Symbian, BlackBerry, and Series 40 fade.]

Fig. 2.6 OS market share from January 2009 to September 2023. Source StatCounter Global Stats

and others which were not user-friendly or intuitive. Second, Apple’s App Store
and Google’s Play Store created a vast app ecosystem, significantly outperforming
Symbian’s Ovi Store and other OS’s, attracting users with the availability of diverse
applications. Third, Apple and Google rapidly evolved their mobile OS’s, intro-
ducing new features and performance improvements that captured users’ interest,
unlike Symbian and others. Fourth, iOS and Android attracted developers by offering
market share expansion, ease of development, and revenue generation opportunities,
while Symbian and others had limited development tools and support. Fifth, Apple
and Google effectively marketed their OS and devices, creating strong brand loyalty,
especially Apple, which created a premium brand perception, unlike Symbian and
others that lacked brand appeal. Sixth, Apple and Google presented a clear vision and
strategy for their mobile platforms and invested heavily in ecosystem development,
unlike Symbian and others. Seventh, Apple optimized its devices by controlling
both hardware and software, creating a technical ecosystem, while Android flexibly
accommodated various hardware manufacturers’ ecosystems, unlike Symbian and
others.

2.4.3 ‘ICT Big Bang’—Business Aspect

The ‘ICT Big Bang’ mentioned above was a device-level Big Bang resulting from
the system-level convergence of communications and computers, and an OS-level
Big Bang resulting from the combination of smartphones and application stores.
However, the Big Bang phenomenon that created explosive changes in reality was

[Diagrams: (a) before the ICT Big Bang, the telecom service provider occupies the
center, linking content providers, device manufacturers, and users via content prices,
device prices, and traffic/service charges; (b) after, the open application store (Apple/
Google) occupies the center, linking content providers, advertisers, device manufac-
turers, service providers, and users.]

Fig. 2.7 ICT Big Bang: a before, b after

at the level of the communications business. The ICT Big Bang pushed traditional
telecom operators from the center stage to the periphery, replaced by application
marketplaces. As the core of telecom business shifted from voice to data services,
and with the emergence and success of open application stores, traditional telecom
operators were forced into a defensive position in the data business, leading to their
marginalization. The fact that telecom operators, who had dominated the commu-
nications stage for 130 years, were pushed to the periphery is indeed an ‘ICT Big
Bang’. Figure 2.7 illustrates this by showing that the central position which used to
be taken by telecom service providers before the ICT Big Bang was taken by the
open application store after. This was a historical Big Bang that upended the foun-
dations of the communications market, marking the end of the telecom-led era and
the dawn of a content-led era. This Big Bang, which reshaped the landscape of the
ICT industry, is thus referred to as the ‘ICT Big Bang’, and since it was triggered by
the advent of smartphones, it is also called the ‘Smart Big Bang’.

2.4.4 Foundation of Digital Transformation—A Summary

The foundation of digital transformation was formed through a process described
above. To summarize:

Since the digital conversion of analog signals, there have been three significant
conflicts in the development of communications and computers, known as the ‘Digital
Wars’. The first was a conflict between the circuit mode and the IP packet mode in the
realm of wired communication, referred to as the ‘First Digital War’. This conflict
then shifted to the mobile communications domain, leading to the ‘Second Digital
War’. As a result, the IP packet mode of the Internet, victorious from the first two
wars, ascended as the communications platform.
The third conflict took place in the mobile OS of smartphones, which integrated
communication terminals and computer systems. With the advent of smartphones,
computers became integrated into communication devices, and computer compa-
nies transformed into communications companies. A prime example of this is the
transformation of Apple, a computer company, into a communications manufacturer
after the launch of the iPhone. Apple entered the mobile communications market
with the iPhone equipped with the iOS operating system, competing against the
existing mobile operating systems like Symbian and BlackBerry OS. This competition intensified with the entry of smartphones equipped with
Google’s Android OS, sparking the ‘Third Digital War’. This war concluded with the
joint victory of iOS and Android, linked to the App Store and Play Store, establishing
them as the foundation of the ‘Content Platform’.
Through these three Digital Wars, communications and computers fully inte-
grated both internally, in terms of communication methods, and externally, at the
device level. As a result, a communication platform based on the IP packet mode of
the internet and a content platform based on iOS and Android were established. The
complete convergence of communications and computers, epitomized by the combi-
nation of smartphones and application marketplaces, caused the ‘ICT Big Bang’ (or
‘Smart Big Bang’). The Big Bang triggered the transition of the communications era
to the content era, completely changing the ICT industry landscape and impacting
all industries and society. This laid the technical foundation for the full-scale digital
transformation. Ultimately, the foundation of digital transformation was established
as a result of three Digital Wars in the development and convergence process of
communications and computers.
Chapter 3
Digital Platforms

Digital transformation, as discussed in previous sections, began with the digital conversion of analog signals. This digital conversion was the starting point of the
long journey to the integration of voice, video, and data services in digital form,
and to the convergence of the traditional and computer communications at the signal
level and eventually at the system level. Upon this foundation, a communication plat-
form was established, on which an OS-based content platform was formed, leading
to the creation of application platforms through various applications. Thus, digital
conversion laid the groundwork for a digital platform composed of communication,
content, and application platforms, driving digital transformation with revolutionary
momentum.
A digital platform provides a space to create, store, transmit, process, and apply
digital resources using digital technology. It is a digital ecosystem where developers,
providers, and users participate in creating, distributing, and consuming various
digital resources like content, applications, and services. Application platforms, the
pinnacle of digital platforms, offer various services such as search, social media,
e-commerce, and content sharing, becoming the axis for digital transformation in
industry and society. Platform companies, leading this significant change, captured
and prepared for the digital transformation trend earlier than others, ultimately
emerging as the dominant forces of the digital transformation era.

3.1 Establishment of Digital Platforms

The digital platforms were established through the first, second, and third Digital
Wars, as previously discussed. In both the wired and wireless communication sectors,
the IP packet mode triumphed over the circuit mode, leading to the convergence of
voice, video, and data signals into IP packets. This integration of communication and

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025 37
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_3

computing established the internet based on the IP packet mode as the communica-
tion platform. This platform represents a solid integration of traditional and computer
communications through IP packets, laying the foundation for offering both informa-
tion and communication services and content services. This convergence has created
a cornerstone for converging all digital content into ICT.
The shift to providing data services began with 2G mobile communication, with
data speeds increasing through 3G, 4G, and 5G. This led to a shift in competition from
communication services to content services. Initially, telecom operators competed
with their content services, but the launch of the iPhone with iOS and Google’s entry
with the Android mobile OS escalated the competition into the Third Digital War—a
battle among operating systems, with iOS and Android emerging as the victors in
the content platform space. This has led to applications being traded and operated
through the app marketplaces running on these two OSs.
Thus, the communication platform born from the integration of communication
and computer in both wired and wireless networks is the internet platform based
on IP packets. The content platform, arising from the system-level integration of
communication and computing, operates on the internet platform through OS-driven
content ecosystems. Today’s ability to make calls, surf the web, engage in social
media, and conduct online transactions on a smartphone is due to the integration of
the communication platform and the content platform on the device. These platforms
work together to run various applications, turning the applications themselves into
application platforms that provide various services.
Therefore, when defining a digital platform as a platform that provides digital
services, it includes the communication platform, content platforms, and application
platforms. What appears externally is the internet, which is the lower layer, and
what is transmitted through the internet is content, the upper layer. All voice, video,
and data signals are transmitted over the communication platform in the form of IP
packets, and all content is distributed through applications on the content platform
according to the OS protocol. For example, iOS and Android themselves form the
core of the content platform, and the various applications built on them become
application platforms that provide their own unique services.

3.2 Types of Digital Platforms

The term platform is familiar to the general public thanks to platforms found at train
stations. A train station platform is a space where passengers board and alight from
trains. It is not a space for specific trains or passengers; instead, it is an open space
that all trains and passengers can use together. Digital platforms operate similarly.
They are digital spaces where developers and providers of various digital resources
meet users to exchange digital resources. The digital resources provided here include
various contents, applications, and services. Generally speaking, a digital platform is
a digital ecosystem space where developers, providers, and users participate together
to create, distribute, and consume various digital resources such as digital content,
applications, and services. Digital platforms encompass the communication platform, content platforms, and application platforms. Figure 3.1 shows the relationship among these three types of digital platforms.

Fig. 3.1 Relations among digital platforms: the application platform (apps) sits atop the content platform (OS), which in turn sits atop the communication platform (internet)

3.2.1 Communication Platform (Internet)

From a physical standpoint, the developers, providers, and users of digital platforms
are interconnected and digital resources are delivered through a communication platform based on the Internet Protocol (IP). In other words, the internet is a communication
platform that connects various devices, enabling information exchange and commu-
nication. The devices like desktop computers, laptops, and mobile devices provide
the means for users to access and interact with relevant websites and apps through
the internet. This means providing the hardware and interfaces necessary for web
browsing, emailing, social media, and other services through internet access. There-
fore, the internet communication platform forms a comprehensive ecosystem that
connects device developers, communication service providers, and users through
various information and communication devices connected to the communication
network, facilitating communication, information exchange, and digital services
worldwide.
Today’s widely used services such as social media, mobile shopping, cloud
computing, and content sharing are provided through applications on content
platforms. However, not all content and services are provided this way. Several
services are directly provided on the communication platform. For example, services
like web browsing, file transfer, remote computer access, and email do not rely on
iOS or Android operating systems but are provided directly through the internet.

These services can be accessed via desktop computers, laptops, mobile devices, or
other specialized devices. While dedicated applications for iOS and Android devices
can be developed and offered, users are not limited to them and can access these
services from various devices and OSs.
The operation of services like web browsing on communication platforms involves
connecting to the internet and accessing websites and web-based content through
web browsers such as Google Chrome, Mozilla Firefox, and Microsoft Edge. These
browsers are available on various operating systems like Windows, macOS, Linux,
etc. File transfers between computers can be done using File Transfer Protocol (FTP),
and web browsing involves transferring web pages and images between computers
and web servers using HyperText Transfer Protocol (HTTP/HTTPS). These proto-
cols are usable across various platforms and devices. Remote computer access is
achieved through services like Remote Desktop, Secure Shell (SSH), Virtual Network
Computing (VNC) over the internet, available on various operating systems. Email
services like Gmail, Outlook, Yahoo Mail operate independently of the underlying
operating system and can be accessed on various devices through web browsers or
specific email client applications.
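The OS independence of these services stems from the fact that the underlying protocols use standardized message formats. As a small illustration (a sketch using Python's standard library; the addresses below are placeholders), the message an email client hands to SMTP is plain structured text that looks the same no matter which device or OS produced it:

```python
from email.message import EmailMessage

# Compose a message in the standard internet message format (RFC 5322).
# The addresses are placeholders. Because SMTP, IMAP, and POP3 are
# standardized protocols, any mail server on any OS can carry this
# same byte stream.
msg = EmailMessage()
msg["From"] = "[email protected]"
msg["To"] = "[email protected]"
msg["Subject"] = "OS-independent mail"
msg.set_content("The transport underneath is just SMTP over TCP/IP.")

raw = msg.as_string()
print(raw)
```

Handing `raw` to any SMTP server, from any client on any OS, produces the same delivered message, which is precisely why email works across Windows, macOS, Linux, iOS, and Android alike.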

3.2.2 Content Platform (OS)

When the focus of communication shifts to content, content encompasses a broad and comprehensive range. It includes all digital information that can be provided
through communication platforms. This means various digitally processed informa-
tion or programs, movies, music, videos, animations, games, performances, software,
and others are collectively referred to as content. Generally speaking, the content of
interest to content platforms includes all digital content that can be distributed through
communication networks. In a broad sense, content platforms represent content,
applications, and services provided through communication networks.
Content platforms are ecosystems where stakeholders involved in developing,
distributing, and using digital content (e.g., OS providers/managers, content devel-
opers, users, etc.) all participate. This includes application marketplaces (e.g., App
Store, Play Store), application development tools (e.g., Apple’s Xcode, Android
Studio), and other hardware/software infrastructure necessary for content produc-
tion and distribution. Thus, a content platform can be defined as an ecosystem where
all stakeholders involved in content participate in using the content production and
distribution infrastructure to create, distribute, and consume various content.
Therefore, content platforms provide technical resources across hardware, soft-
ware, and networks that enable them to deliver content and services to users through
applications. They also create an ecosystem where OS managers, app developers,
app users, and other stakeholders can communicate, transact, share data, and collab-
orate. Furthermore, content platforms provide Application Programming Interfaces
(APIs) that allow developers to create applications and services tailored to the plat-
form’s features and formats. APIs provide a means for various software to interact in

a standardized way. In addition, content platforms offer user interfaces such as web
or mobile apps that enable users to access and interact with content and services.
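To make the idea of a platform API concrete, the sketch below defines a minimal content-platform-style contract in Python. The class and method names (`ContentPlatformAPI`, `upload`, `fetch`) are invented for illustration and do not correspond to any real store API; the point is that applications program against a standardized interface rather than the platform's internals.

```python
# A toy content-platform API: a standardized contract that any
# application can program against. All names here are illustrative.

class ContentPlatformAPI:
    def __init__(self) -> None:
        # Internal storage stands in for the platform's distribution
        # infrastructure; apps never touch it directly.
        self._store: dict[str, bytes] = {}

    def upload(self, content_id: str, data: bytes) -> None:
        """Developer-facing call: publish content to the platform."""
        self._store[content_id] = data

    def fetch(self, content_id: str) -> bytes:
        """User-facing call: retrieve published content."""
        return self._store[content_id]

# An app built on the platform only needs the API, not its internals.
platform = ContentPlatformAPI()
platform.upload("video-001", b"...encoded media...")
print(platform.fetch("video-001"))
```

Because every app speaks to the platform through the same small interface, the platform operator can change the storage and distribution machinery behind `upload` and `fetch` without breaking the apps built on top, which is the essence of an API-mediated ecosystem.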
At the core of the content platform is an operating system. The content platform
operates on the internet communication platform powered by an OS. The content
ecosystem is like a community that uses the same OS as its language. For example,
it is akin to a linguistic community that uses the language of Android OS or iOS.
Content platforms produce, distribute, and consume content on the foundation of
the same OS. The previously discussed ‘Third Digital War’ was essentially a battle
between different OSs, and a war between platform ecosystems using those OSs.
The winner was decided by the performance of the OS itself, its user-friendliness,
and the adaptability of content creators. Therefore, a content platform may well be
referred to as an OS platform.

3.2.3 Application Platform (Apps)

The application platform refers to a platform formed by applications placed on a content platform, essentially corresponding to an app platform. Application platforms
are provided based on operating systems such as iOS or Android. Since the OS
manages hardware devices and executes applications, connecting various software,
application platforms can leverage this foundation to create user-friendly interfaces
and offer specific services. Apple’s App Store and Google’s Play Store are separate
content platforms operating on iOS and Android, respectively, thus applications are
developed to be loaded onto each platform.
There are various types of application platforms. Apple’s App Store and Google’s
Play Store form separate content platforms, and the applications placed on them each
create their own application platforms. Within application platforms, there are social
media platforms like Facebook and Twitter, e-commerce platforms like Amazon and
Alibaba’s mobile shopping, cloud computing platforms like Amazon Web Services
(AWS) and Microsoft Azure, and content sharing platforms like YouTube and
Netflix. In addition, there are online marketplace platforms like Amazon Market-
place and Etsy, educational platforms like Coursera and Khan Academy, collab-
oration platforms like Microsoft Teams and Slack, communication platforms like
Zoom and WhatsApp, travel booking platforms like Airbnb and Expedia, financial
payment platforms like PayPal and Venmo, and health fitness platforms like Fitbit
and MyFitnessPal, among others.
Application platforms form a digital ecosystem space where service providers and
users, along with devices related to specific services and contents provided, partici-
pate to create, distribute, and consume app-specific services. For example, a social
media platform is an ecosystem space where social media service providers (i.e., plat-
form companies) and users (i.e., SNS users), along with devices related to providing
and using social media services, participate in creating (i.e., posting), distributing
(i.e., storing and displaying), and consuming (i.e., downloading) messages. The
participants in the ecosystem vary depending on the platform; for instance, in the
case of travel booking platforms, accommodation providers are also included in the ecosystem.
A comparison of the characteristics of the three types of digital platforms discussed above is illustrated in Table 3.1.

Table 3.1 Comparison of the characteristics of digital platforms

| | Communications platform | Content platform | Applications platform |
| Basis | Internet | Operating system | Application |
| Service | Device connection, information exchange, digital services | App development, app up/download, app service support | App-specific services |
| Infrastructure | Desktops, laptops, mobile devices, hardware, interfaces | Application store, app development tools, content production/distribution infrastructure | Applications |
| Participants | Network operators, device developers, service providers, subscribers | OS operators, app developers, app users | Service providers, service users, advertisers, marketing media |

3.2.4 Examples of Platform Services

In order to realistically understand the interrelationship and operational procedures between communication platforms and content platforms, we examine the web
service procedure using Google’s search service as an example (see Table 3.2). We
will look at cases where services are received via a communication platform using a
PC and via a content platform using a smartphone. For PCs, the operating systems
(OSs) used are Windows, macOS, Linux, etc., and for smartphones, the OSs are iOS and Android.
A user’s search query is transmitted to the search engine through several steps
involving software, hardware, and the internet. First, in the case of PCs, when a user
enters a search query using a browser like Chrome, Firefox, Edge, Safari, etc., the
browser sends out the search query as an application-layer HTTP or HTTPS request
(see Table 2.2). In the case of smartphones, when a user enters a search query using
a search application, the iOS or Android OS captures this input and sends it out
as an HTTP/HTTPS request through the relevant Application Programming Inter-
face (API). The HTTP/HTTPS request includes not only the search query but also
metadata such as the device type. Second, the HTTP/HTTPS request is passed to
the transport layer, where TCP operates to divide the HTTP/HTTPS request into
small packets and processes them for reliable transmission. Third, each TCP packet is delivered to the internet layer, where it is encapsulated into IP packets carrying the IP addresses of the sender (i.e., the user’s device) and the receiver (i.e., Google’s server). The task of converting the domain name (i.e., google.com) to an IP address is carried out by the application layer’s Domain Name System (DNS). Fourth, the IP packets are passed to the link layer and transmitted through the network. In this case, the network can be Ethernet or WiFi when using a PC, and Ethernet, WiFi, LTE, a 5G network, etc., when using a smartphone. The link layer uses protocols and hardware interfaces suited to the type of network to transmit data.

Table 3.2 Comparison of platform services (e.g., web service and mail service)

(a) When receiving by PC
| | Web service | Mail service | Operating system |
| Application program | Browser (Chrome, Edge, Firefox, Safari, etc.) | Email (Gmail, Outlook, Yahoo, etc.) | Windows, macOS, Linux |
| Application layer | HTTP/HTTPS, DNS | SMTP, IMAP, POP3 | |
| Transport layer | TCP, UDP | TCP, UDP | |
| Internet layer | IP | IP | |
| Link layer | Ethernet, WiFi | Ethernet, WiFi | |

(b) When receiving by smartphone
| | Web service | Mail service | Operating system |
| Application program | Web app/API | Email app | iOS, Android |
| Application layer | HTTP/HTTPS, DNS | SMTP, IMAP, POP3 | |
| Transport layer | TCP, UDP | TCP, UDP | |
| Internet layer | IP | IP | |
| Link layer | Ethernet, WiFi, LTE, 5G | Ethernet, WiFi, LTE, 5G | |
The IP packets transmitted in this manner pass through various routers and
networks on their way to Google’s servers. Once these packets arrive at Google’s
servers, they undergo a reverse process of the above steps and are reassembled into the
original HTTP/HTTPS request. The search server, having received the original search
query, runs sophisticated search engine software to process the query. It searches the
vast index of the web to find relevant results, utilizing powerful data centers to do so
in a very short time. The search results are packaged into an HTTP/HTTPS response
and sent back to the user’s device. This process is carried out through the same steps
as when the search query was sent, that is, processed through Google server’s appli-
cation layer, transport layer, internet layer, link layer, then through the network to
the user’s device, where it is reassembled and interpreted by the user’s web browser
or search app. Finally, the search results are displayed on the user’s device. All these
processes happen in a very short time.
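The request-side encapsulation steps described above (application layer → transport layer → internet layer → link layer) can be mimicked with a toy Python sketch. Everything here is purely illustrative: the header strings, the placeholder addresses (drawn from documentation ranges), and the segment size do not reflect real TCP/IP header formats, only the nesting of one layer inside the next.

```python
# Toy model of the four encapsulation steps a search query goes through.
# All "header" formats and addresses are illustrative placeholders.

def application_layer(query: str) -> str:
    # Step 1: the browser or search app forms an HTTP-style request.
    return f"GET /search?q={query} HTTP/1.1\r\nHost: google.com\r\n\r\n"

def transport_layer(request: str, seg_size: int = 32) -> list[str]:
    # Step 2: TCP splits the request into numbered segments.
    chunks = [request[i:i + seg_size] for i in range(0, len(request), seg_size)]
    return [f"TCP[seq={n}] {c!r}" for n, c in enumerate(chunks)]

def internet_layer(segments: list[str], src: str, dst: str) -> list[str]:
    # Step 3: each segment is wrapped in an IP packet that carries the
    # sender's and receiver's addresses (found via DNS for the receiver).
    return [f"IP[src={src}, dst={dst}] {s}" for s in segments]

def link_layer(packets: list[str], medium: str = "WiFi") -> list[str]:
    # Step 4: the link layer frames each packet for the physical network.
    return [f"{medium}-frame | {p}" for p in packets]

frames = link_layer(
    internet_layer(
        transport_layer(application_layer("digital platforms")),
        src="192.0.2.10",    # placeholder client address
        dst="203.0.113.7",   # placeholder server address
    )
)
for f in frames:
    print(f)
```

On the server side the same nesting is peeled off in reverse order, and the response travels back through an identical stack, which is what the paragraph above describes as the "reverse process."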
The description of the web service above also applies to email services. In the
case of using a PC, the application changes from a browser to an email client, and in the case of smartphones, the web app changes to an email app, but everything else remains the same. Table 3.2 illustrates such service procedures through
communication platforms and content platforms, using search and email services
as examples. In the case of content platforms, using different applications and their
respective APIs instead of web applications and their APIs allows for the provision
of application-related services.

3.2.5 Content, Applications, and Services

We mentioned earlier that content, broadly construed, encompasses the content, applications, and services provided via communication networks. If we look closer, however, these three categories differ in nature and function.
Content refers to information, data, media, and other informational resources that
users create, share, and consume. These are the objects provided through commu-
nication networks for users to obtain information, enjoy entertainment, or receive
education. Content comes in various forms such as text, images, videos, audio, and
documents and is provided on digital platforms in the form of blog posts, articles,
social media updates, podcasts, videos, images, media, etc.
Applications (or apps) are software programs designed to perform specific func-
tions or tasks on electronic devices like smartphones, computers, and tablets. Appli-
cations range from productivity tools like word processors and email clients to enter-
tainment tools like games and multimedia players. Users can download applications
from application stores to their devices, and these applications provide user-friendly
interfaces that allow users to access specific functions or perform specific tasks.
Applications can provide both content and services.
Services refer to functions, capabilities, informational resources, etc., provided
over networks such as the internet. Services offer a variety of functionalities like cloud
computing, online banking, streaming media, social networking, email, etc. Unlike
applications, services can be accessed remotely without the user needing to install anything on their devices; they can be accessed and utilized
through web browsers or client software. Services may be offered for free, supported
by advertisements or on a subscription basis, and often require user interaction or
participation. They are designed to meet specific needs of users or assist in performing
specific tasks, such as communication, data storage, and financial transactions.
In summary, there are distinct differences between content, applications, and
services. Content is the actual data or information that users consume through text,
images, videos, etc. Applications are software programs that help users perform
specific tasks or functions. Services are functionalities or capabilities provided
remotely via a network. Content is what users read, view, listen to, and consume
directly; applications are tools users utilize to perform tasks or functions; services
are functionalities available for user utilization. While content and applications are
downloaded and installed on user devices, services are provided directly through web
browsers or client software. Services can be provided directly on a communication platform or through an application platform.

3.3 Digital Platform Companies

Given that digital platforms are ecosystems where developers, providers, and users
all participate and where a variety of digital resources such as digital content, appli-
cations, and services are congregated, digital platform companies today hold a signif-
icant share in industry and society. The companies providing application platforms are as diverse as the types of platforms themselves. Among these, the largest
and most influential companies include Apple, Google, Amazon, and Meta (formerly
Facebook). As of June 2024, all four companies are among the top 10 companies in
the world by market capitalization. This fact underscores how platform companies
are leading the digital transformation era and shaping the economic landscape.1

3.3.1 Apple Inc.

Apple, founded in 1976 by Steve Jobs, Steve Wozniak, and Ronald Wayne, initially
developed hardware and software for personal computers (PCs) like the Apple I,
Apple II, and Macintosh (Fig. 3.2). Subsequently, Apple diversified its business to
produce smartphones such as the iPhone, content consumption and creation tools
like the iPad, laptop computers like the MacBook, as well as portable music players
like the iPod, and music and media management programs like iTunes. Apple also
launched wearable devices such as the Apple Watch and AirPods, and the streaming
music service, Apple Music. The iPhone, in particular, revolutionized the smartphone
market, transforming Apple into a communications manufacturer and a leader in the
communications device sector. The iPhone accounts for about 50% of Apple’s total
revenue, with hardware device sales comprising 78% of the total.2
Apple’s products are known for their excellent brand image and product design.
They have impressed users with intuitive and user-friendly interfaces. Apple’s prod-
ucts and services form a synergistic ecosystem, a powerful tool that encourages the
integration of various Apple products. The quality and reliability of its products have
given customers a positive image. Apple distributes numerous applications through

1 As of June 28, 2024, the ranking of the top 10 companies by market capitalization is as follows:
Microsoft ($3.4 trillion), Apple ($3.3 trillion), Nvidia ($3.0 trillion), Alphabet (Google) ($2.3
trillion), Amazon ($2.1 trillion), Saudi Aramco ($1.8 trillion), Meta Platforms (Facebook) ($1.3
trillion), Berkshire Hathaway ($880 billion), Eli Lilly ($820 billion), and TSMC ($760 billion).
2 In 2023, Apple’s revenue was $383.3 billion, with iPhone sales at $201 billion being the largest, followed by iPad at $28 billion, MacBook at $29 billion, wearables like Apple Watch and AirPods
at $40 billion, and services like the App Store at $85 billion. Hardware device sales accounting for
78% of total sales affirm Apple’s status as a manufacturing company.

Fig. 3.2 Apple logo

the App Store, building a diverse app ecosystem. The Apple App Store, in partic-
ular, has become a genuine application marketplace that satisfies both users and
developers, spearheading the ‘ICT Big Bang’ in conjunction with the iPhone.
Thus, Apple has brought innovation to communications through the iPhone and
App Store and has led the development of the digital industry in terms of product
design, technological innovation, and user experience. Furthermore, tools like the
iPad and Apple Pencil have supported the creativity of designers and artists. Apple
is expected to bring more innovations in technology and products in the future,
expanding its interests into AI, VR, AR, and the metaverse.
However, there are various issues and concerns. Apple’s ecosystem is monopo-
listic, limiting interoperability with other companies’ devices and services, and high
product prices pose a barrier to entry for low-income groups. Apple’s high market
share raises concerns about market dominance and unfair competition due to monop-
olistic corporate behavior, notably in its unilateral control over App Store platform
usage and fees, causing dissatisfaction among developers. As a manufacturer, Apple
faces criticism for environmental responsibilities related to resource consumption
and waste generation. The high dependency on Chinese manufacturing and supply
chain operations poses geopolitical risks, with ethical concerns about labor condi-
tions and human rights at Chinese manufacturers. Meanwhile, Apple’s collection of
user data for service improvement and targeted advertising has drawn criticism for
privacy compliance and refusal to comply with government requests.

3.3.2 Google Inc.

Google was founded in 1998 by Larry Page and Sergey Brin. Initially, it led inno-
vations in the web search domain by developing a search engine (Fig. 3.3). Today,
its business areas have diversified into advertising, operating systems (OS), video
content, and cloud services. Google remains the dominant player in the search engine
market. In advertising, Google generates substantial revenue through its advertising
platforms, AdWords (renamed to Google Ads in 2018) and AdSense. Google owns
Android, an OS that competes with Apple’s iOS, and operates YouTube, a popular
video platform. It also runs Google Cloud Platform, offering cloud computing, data

Fig. 3.3 Google logo

storage, and analytics tools. The majority of Google’s revenue comes from digital
advertising, with ads through search and other platforms accounting for nearly 60% of
total revenue. Including Google Network ads and YouTube ads, advertising revenue
reaches 77% of total revenue.3
Google has contributed to popularizing knowledge with its innovative search
engine, becoming synonymous with online searching. Android OS has played a
significant role in the digital platform ecosystem by providing an open alternative
to Apple’s closed iOS system. Google excels in data analytics and artificial intel-
ligence, offering insightful information and trends as well as personalized services
and functionalities. Recognized for its open culture and capability for technological
innovation, Google is expected to increase its investments in AI, cloud technologies,
healthcare technologies, and autonomous vehicles.
However, there are several concerns and issues with Google Search. While it has
played a role in popularizing knowledge, it has also been instrumental in Google’s
success as an advertising company. The main issue arises from the collection of user
information during the search process and its use in targeted advertising. Google
collects user information under the pretext of improving search services, providing
personalized and faster search services. However, personalized services have led
to the creation of filter bubbles, where search algorithms remember past searches
to offer similar results, inadvertently filtering information within the confines of
previous search histories.
Moreover, Google’s use of user information for targeted advertising has signif-
icantly increased its advertising revenue, exposing users to targeted ads without
their awareness. Concerns extend beyond this, including the risk of massive user
information being exposed through hacking and cyber-attacks. Google’s search algo-
rithms may also create political biases or discriminatory perceptions, leading to social
controversies. Furthermore, Google’s active collection of user information beyond
search services raises concerns about its future impact on users and society. With
77% of its revenue coming from advertising and its leading position in the global
digital advertising market, Google’s dominance in the advertising market also raises
monopolistic concerns.4

3 In 2023, Google’s revenue was $307.4 billion, with Google Search ads at $175 billion, Google
Network ads at $31.3 billion, YouTube ads at $31.5 billion, and cloud services at $33.0 billion.
Advertising revenue totaled $237.9 billion, accounting for 77% of total revenue, making Google
predominantly an advertising company. (Source: FourWeekMBA).
4 In 2022, global digital advertising revenue (market share) was distributed among Google ($168.4 billion, 29%), Meta ($112.7 billion, 19%), Alibaba ($41 billion, 7%), Amazon ($38 billion, 6%),

Fig. 3.4 Amazon logo

3.3.3 Amazon Inc.

Amazon was founded by Jeff Bezos in July 1994, starting as an online bookstore
before expanding into a comprehensive e-commerce company (Fig. 3.4). Amazon
entered the e-book market with the launch of the Kindle e-book reader, further diver-
sifying into Amazon devices like Echo smart speakers and Fire tablets. Amazon Web
Services (AWS) marked its venture into cloud computing services, while Amazon
Prime and Amazon Studios represent its venture into entertainment.
Amazon’s core business today is e-commerce, operating various online market-
places worldwide and offering third-party sellers access to Amazon’s online market.
Amazon Marketplace sells products produced by Amazon alongside those from
other sellers, fostering competition and improving product quality. AWS provides
computing power, storage, and databases to individuals, businesses, and govern-
ments. Amazon Prime offers subscription services like fast shipping and streaming,
while Amazon invests in content creation through Amazon Prime Video and Amazon
Studios. Online shopping sales account for about 40% of Amazon’s total revenue and,
including third-party and offline sales, commerce reaches roughly 68%, solidifying Amazon’s status
as a commerce company.5 However, while growth in online sales slows, cloud services and
advertising revenues are surging, and shopping-related search traffic is moving from Google
to Amazon.6
Unlike Apple’s contribution to ICT innovation with the iPhone and App Store
or Google’s popularization of knowledge via its search engine, Amazon’s founding
story and success formula differ. Amazon, starting as an online bookstore and shifting
to online retail, aimed to become “the most customer-centric company in the world.”

and ByteDance ($29.1 billion, 5%), according to Insider Intelligence (note that statistics may vary
by data source). In 2023, the digital advertising market share changed significantly, with Google’s
share increasing to 39%, Meta at 18%, Amazon at 7%, ByteDance at 3%, and Baidu at 2%.
5 In 2023, Amazon’s revenue was $574.8 billion, with online sales accounting for $231.9 billion, third-party seller services generating $140 billion, AWS revenue at $90.8 billion, advertising revenue at $46.9 billion, subscription services revenue at $40.2 billion, offline store sales at $20 billion, and other revenue at $5 billion. Total commerce revenue, which includes online, third-party, and offline store sales, was $391.9 billion, making up about 68% of the total revenue. (Source: FourWeekMBA).
6 Notably, AWS led the cloud service market in Q4 2022 with a 32% share, signaling a significant shift in Amazon’s future business landscape. Cloud service market share in Q3 2023 remained stable, with Amazon at 32%, Microsoft at 23%, and Google at 11%.

Fig. 3.5 Meta logo

This vision has driven Amazon’s strategic decisions and operations, leading to its
success. Amazon has built a loyal customer base by ensuring customer satisfaction,
robust infrastructure, a wide product range, and reliable delivery, stimulating compe-
tition, service improvements, and price reductions. It has pioneered e-commerce,
smart devices, and cloud computing services, innovating markets and industries.
Through self-publishing and Kindle Direct Publishing, Amazon has offered oppor-
tunities for writers and transformed the publishing industry. Amazon Marketplace
enables small businesses to sell products globally.
Despite its contributions, Amazon faces concerns. Its core business in online
retail has succeeded by encroaching on existing retail markets with a unique
vision of customer centrality, essentially attracting customers from existing busi-
nesses in a zero-sum game. With a 37.6% share in the online retail market, far
surpassing competitors like Walmart, Amazon faces monopoly concerns.7 Domi-
nance in the online retail market restricts consumer choice and raises fears of
arbitrary rent increases. For example, third-party sellers pay Amazon various fees,
including referral, fulfillment, subscription, advertising, and long-term storage fees.
Amazon’s aggressive expansion through mergers and acquisitions, integrating them
to strengthen market dominance, raises concerns about the future “Amazon Empire.”
As a major employer, Amazon faces criticism for warehouse working conditions
and labor treatment, with rapid robot adoption reducing employment. Amazon’s
collection of vast user data for online businesses also raises privacy and security
concerns.

3.3.4 Meta Platforms (Facebook)

Meta, originally founded by Mark Zuckerberg in 2004 as Facebook, began as a social
network for college students but has grown into a social media giant through various
acquisitions (Fig. 3.5). In 2021, the company rebranded itself as Meta Platforms.
Meta started with Facebook, a platform for connecting friends and family, and
expanded into visual content sharing by acquiring Instagram. It broadened its services
to include messaging and voice calls by acquiring WhatsApp. Meta also entered the
AR/VR space by acquiring Oculus, shifting its focus toward building virtual worlds

7 The market share of the US online retail market in 2023 was as follows: Amazon 37.6%, Walmart 6.4%, Apple 3.6%, eBay 3.0%, and Target 1.9% (Source: Statista).

and the metaverse with the launch of Oculus Quest. However, the Oculus business
did not meet revenue expectations, with the majority of Meta’s revenue still coming
from advertising on its social media platforms.8
Meta owns several platforms, including Facebook, Instagram, and WhatsApp,
serving billions of users. These platforms satisfy diverse user needs
and expand connectivity and social networks. Meta’s personalized advertising platform
offers advertisers targeted marketing opportunities, while its services enhance connectivity and
communication among people and form social networks, which is recognized
as Meta’s social contribution. In the future, Meta is expected to actively develop
and innovate around the metaverse and virtual worlds, expanding social interaction
functionalities.
However, Meta faces significant concerns about the potential risks and negative
impacts of dominant social media platforms. The primary concern is related to privacy
issues. Meta collects vast amounts of personal information through various platforms
and uses it for targeted advertising, raising concerns about privacy protection, data
security, and the possibility of data leaks. In addition, Meta’s social media platforms
can be used to spread misinformation and fake news, fueling incitement and opinion
division, and can influence elections and politics through information manipulation
and distortion. Furthermore, excessive use
of social media can lead to addiction, have negative impacts on mental health, and
cause serious social issues like online cyberbullying. Particularly, young users may
be sensitive to the negative effects of social media, prompting calls for appropriate
regulation.9 Meanwhile, Meta faces scrutiny over competition issues and suspicions
of monopolistic practices due to its dominant position in the social media market.

3.4 Nature of Digital Platforms

From a different perspective, just as a train station platform serves as the physical
space where the railway company (i.e., the service provider) connects with passengers
(i.e., the service users) to offer transportation, digital platforms function as virtual
ecosystems where developers, providers, and users converge to create, distribute,
and consume digital resources, such as content, applications, and services. Among
digital platforms, content platforms form ecosystems based on operating systems to
support application platforms, while application platforms create ecosystems based
on applications to provide various services and content. Before the rise of app stores,

8 In 2023, Meta Platforms’ revenue was $134 billion, of which $131.9 billion, or 98%, came from
advertising.
9 Meta faced a class-action lawsuit from 41 US states, accused of intentionally designing addictive systems to keep users engaged for long periods, causing irreversible harm to the mental health
of children and adolescents. This lawsuit stems from 2021 when Frances Haugen, a former Meta
product manager, exposed hundreds of internal documents, revealing that Meta was aware of the
harmful effects of social media on teens’ mental and physical health but failed to act, instead
enhancing the addictive aspects of its platforms.

the application marketplace was fragmented and dominated by communication operators. However, the emergence of app stores brought about the consolidation of two
globally integrated marketplaces after a period of intense competition. These plat-
forms now enable the global expansion of application ecosystems, allowing sellers
and consumers from around the world to meet and engage in transactions. As applica-
tion platforms have grown and established themselves as central hubs for exchanging
information and products globally, new phenomena have emerged, shedding light on
the true nature of digital platforms.

3.4.1 Two-Sided Markets

Digital platforms as markets are characterized by their nature as two-sided markets.10
In such markets, providers (i.e., producers, sellers) are on one side, and users (i.e.,
consumers) on the other, with the digital platform connecting both sides. Digital
platforms provide the marketplace for providers and users to transact. For example,
in the case of content platforms, Apple’s App Store and Google’s Play Store serve
as marketplaces connecting application providers with application users. In application
platforms, a search platform such as Google Search or Amazon’s product search
connects users on one side to websites or products on the other. Moreover, application
platforms like Airbnb or Booking.com connect consumers needing accommodations
with lodging facilities willing to offer them.

3.4.2 Network Externality

Digital platform two-sided markets exploit “network externalities.” Network externality refers to the phenomenon where the utility or value of a product or service
increases as the number of users increases. A classic example of network externality
is traditional telephone service, where the utility of a phone increases as more people
are available to call. In other words, the more users a system has, the better it func-
tions. For instance, search engines can improve search performance through network
externality, enhancing search services. The more people use a search engine, the larger
the dataset of user queries and interactions, allowing the provider to continuously
improve search algorithms and deliver more accurate and relevant results. Similarly,
navigation services benefit from having a large number of users as it leads to a larger
collection of data, enabling the service provider to update in real-time, increase the
accuracy of the navigator, and find more efficient routes. Digital platforms, therefore,
thrive on the increasing utility brought about by their growing user base, improving

10 The concept of two-sided markets was extensively researched by Jean Tirole. Refer to J.-C. Rochet and J. Tirole, ‘Platform competition in two-sided markets’, Journal of the European Economic Association 1, pp. 990–1029, 2003.

the efficiency and quality of services provided to both providers and users within the
ecosystems.

3.4.3 Transaction Costs and Advertisers

In the two-sided markets of digital platforms, the method of covering transaction costs
differs from traditional markets. In offline commerce, buyers pay sellers directly for
products, and sellers pay rent to property owners. In online commerce, buyers pay
the platform provider, who then deducts platform usage fees before transferring the
remaining amount to the seller. In digital platforms handling information, the cost
of using information can be borne by users (i.e., buyers) and service providers (i.e.,
sellers), or by a third party, such as advertisers. In reality, it is often the sellers
or third parties who bear the costs. For example, when booking accommodation
through Airbnb or Booking.com, users only pay the accommodation fee, while the
property owner pays a significant portion of this fee to the platform provider as a cost.
Similarly, with Google’s services like the search engine, YouTube, and Gmail,
Google provides the services free to users, inserting ads throughout the services
and charging advertisers more than the cost of providing them.
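The settlement flow in the paragraph above can be sketched with hypothetical numbers; the 15% fee rate is invented for illustration, as actual platform fees vary widely:

```python
def settle(buyer_payment: float, fee_rate: float = 0.15):
    """Split a buyer's payment: the platform deducts its usage fee
    before transferring the remainder to the seller."""
    platform_fee = round(buyer_payment * fee_rate, 2)
    seller_payout = round(buyer_payment - platform_fee, 2)
    return platform_fee, seller_payout

fee, payout = settle(120.0)   # buyer pays the platform 120.0
print(fee, payout)            # platform keeps 18.0, seller receives 102.0
```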

3.4.4 Attention Economy

In the information-providing two-sided market, the so-called attention economy emerges as a new concept, highlighting the importance of capturing customers’
attention and time. The attention economy involves engaging customers with tailored
products or services to capture their attention, thereby increasing sales by holding
their interest and time. In the information-providing two-sided market, the platform’s
role is fundamentally to solve users’ information problems. For instance, Airbnb and
Booking.com inform users about suitable accommodations, and Google’s search
engine directs users to websites of interest. While providing this information, plat-
form providers do not charge users but instead impose costs on service providers or
third parties, such as advertisers. In particular, charging advertisers to relieve users of
any burden proves to be an effective incentive to attract more users. In this context,
platform providers earn significantly more from advertising fees, making it a primary
revenue source for platforms like Google or Meta. However, users do not receive
services entirely for free; they must endure ads. Even if they do not watch ads
directly, they must tolerate an ad-exposed environment. Thus, platform providers
aim to keep users on the platform as long as possible, capturing their attention with
ads. During their stay, users explore various information, revealing their interests,
which platform operators capture and collect as valuable data assets.

3.4.5 Regulatory Function

A unique mechanism in the basic nature of digital platform two-sided markets is that
platform operators themselves perform a regulatory function. The platform provider
may also impose restrictions on selling prices. In particular, it monitors to ensure
that sellers do not pass costs onto consumers. The quality of products delivered to
the platform may also be inspected, and arbitration mechanisms are set up in case of
disputes. Such regulatory functions help the platform operator maintain market order
and protect consumers. Why is this? It is because the platform operates as a two-
sided market. By maintaining market order and protecting consumer interests, more
consumers are drawn to the platform, which in turn attracts more sellers, benefiting
the platform operator.

3.5 Dysfunctions of Digital Platforms

Digital platforms have enabled various functions that were not possible in the indus-
trial society. These functions have created new services that did not exist before,
made inefficient tasks efficient, and brought various conveniences to human life.
Using internet search, humanity can now instantly access knowledge accumulated
over thousands of years, and with navigators installed in cars, one can travel to every
corner of the globe. Messenger services have made it possible for anyone to commu-
nicate from anywhere at any time, and social networking services have enabled
people to form and interact within groups. E-commerce services allow the purchase
of goods produced in various countries with just a few clicks. They have also made
it possible to enjoy movies, performances, sports, and entertainment from around
the world while sitting at home. All these benefits brought by digital civilization are
provided through digital platforms.
However, digital platforms have not only provided positive functions. The emer-
gence of new platform capabilities has brought about unintended negative effects
as well. Internet search services have raised privacy issues due to the collection of
personal information and its use in targeted advertising. Social network services
have been associated with the distribution of false information, fake news, and issues
like cyberbullying. In addition, as e-commerce platforms have increased the market
dominance of providers, they have arbitrarily raised rents for sellers and narrowed
consumer choices, leading to detrimental effects. These problems did not exist in
the industrial society, or if they did, they were not as severe. Internet search services
and social networks did not exist before, so their associated issues were also non-
existent. Furthermore, before online e-commerce, offline commerce was limited to
specific countries or regions, so even if a company had significant market power, it
was subject to national regulations, preventing serious monopoly issues.

3.5.1 Search Engines

The first dysfunction of search services is that the information collected can constrain
future searches, limiting search results to areas of past interest. For example, if
someone has previously searched for travel in Morocco, subsequent searches about
Morocco may prioritize content related to travel and tourism, while information
about its history or political system might be excluded or placed far down in the
results. This happens because internet search engine algorithms are designed to
remember past searches and provide fast, personalized service by offering search
terms within a similar scope. Consequently, users may unknowingly find themselves
filtered within the categories of their past search histories, a phenomenon known as
a “filter bubble.” The filter bubble limits the range of information accessible via internet
searches, potentially narrowing users’ views and distorting their understanding. If this
phenomenon repeats, it may lead users to view things from a narrow perspective,
relying only on information provided by the search engine, thus falling into prejudice
and overconfidence.
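A toy reranker can make the mechanism concrete. All results, topics, and the boost value below are hypothetical, meant only to show how weighting past interests narrows what rises to the top:

```python
def personalized_rank(results, history, boost=10):
    """Rerank search results, adding a flat boost to topics the user searched before.

    results: list of (title, topic) pairs, ordered by base relevance.
    history: set of topics from the user's past searches.
    """
    def score(item):
        i, (_, topic) = item
        base = len(results) - i          # earlier position = higher base relevance
        return base + (boost if topic in history else 0)

    ranked = sorted(enumerate(results), key=score, reverse=True)
    return [title for _, (title, _) in ranked]

results = [
    ("History of Morocco", "history"),
    ("Moroccan politics today", "politics"),
    ("Top 10 riads in Marrakesh", "travel"),
    ("Morocco travel itinerary", "travel"),
]

# A user who previously searched travel topics sees travel results pushed to
# the top, while history and politics sink -- the start of a filter bubble.
print(personalized_rank(results, history={"travel"}))
```

With an empty history the original relevance order is returned; with a travel-heavy history, both travel results leapfrog everything else.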
The search algorithms themselves are not without issues. Since algorithms are
generally kept secret, it is unclear how search results are ranked, causing users and
content creators to speculate about biases in search outcomes. Search engines tend to
prioritize well-known and frequently visited websites, making it difficult for lesser-
known or smaller businesses and content creators to gain visibility. This has led to
suspicions that Google uses its market dominance to control web traffic, impacting
the sustainability of online businesses and favoring its products and services over
competitors. Google has faced antitrust investigations in several countries for such
competitive practices. In addition, there is potential for search algorithms to be
manipulated to promote false or misleading information, propaganda, disinforma-
tion campaigns, and the spread of fake news. Malicious actors could also use search
engine optimization (SEO) techniques to manipulate search results to prominently
display erroneous information.
Like messenger services, social networks, and online commerce, search engines
collect extensive information about users, allowing for the creation of detailed user
profiles. While the nominal reason is to provide faster search services or personalized
services, the amount of personal information collected by platform providers often
exceeds what is necessary for service improvement. Once information collection
begins, various problems emerge, such as users being subjected to unwanted ads or
temptations to purchase products. If collected personal information is not securely
managed, it could be stolen through hacking or cyber-attacks, leading to significant
unforeseen damages. Thus, platform operators must prioritize security measures to
protect user privacy, but the adequacy of these measures is often unclear, leaving
privacy concerns as a highly explosive dysfunction.

3.5.2 Social Media

Like search engine services, social media platforms collect personal information,
leading to potential data breaches as a basic dysfunction. However, a more significant
and subtly operative dysfunction is the “echo chamber effect.” Social media amplifies
or strengthens homogeneous beliefs among its participants, creating a phenomenon
where like-minded individuals reinforce each other’s views, leading to information
distortion and bias. This effect can drive users away from the truth and even lead to
the formation of factional groups. Initially, uncertain individuals may become more
convinced and act boldly after being exposed to similar opinions within these echo
chambers. If filter bubbles create bias through collected information, echo chambers
do so through the overlapping selection of information, leading to confirmation bias.
The echo chamber effect causes social media users to cluster into groups with similar
tendencies, and members of these groups become blindly supportive of and loyal to the group.
Today, collective actions mediated through the internet have become a social
issue, one of the lethal dysfunctions that social media brings to the political and
social environment. Various social networking services and personal internet broadcasts
through platforms enable collective action. When used effectively, social media
can help overcome communication barriers due to time constraints or geographical
limits, thereby facilitating the formation and maintenance of human relationships.
However, if social media is used to form malicious groups, incite collective actions,
or engage in cyberviolence, it becomes a dysfunction. Specifically, personal internet
broadcasting, unlike traditional media, is not regulated and can be used irresponsibly
to generate sensationalist content or unfounded claims. This can greatly pollute the
media ecosystem and degrade into a tool that incites and mediates collective actions
or violence. Moreover, if social media circulates false information and fake news
and is misused to manipulate information and stir up public sentiment, it can have
severe impacts on elections and politics.
Misuse of social media can deteriorate relationships and harm personal privacy or
dignity. This could stem from the platforms’ inherent dysfunctions or users’ indis-
criminate behaviors. Social media platforms can be addictive, leading users to spend
excessive time on them at the expense of neglecting real-life responsibilities and
relationships. Overuse of social media can also trigger mental health issues such
as depression, anxiety, and loneliness. Constant comparison with others can lead to
feelings of inadequacy or negatively impact self-esteem, and may even result in Fear
of Missing Out (FOMO) syndrome.11 These phenomena can be considered inherent
dysfunctions of social media itself, as they reflect the detrimental psychological and
social effects associated with its pervasive use. On the other hand, cyberbullying is a
common occurrence on social networks, where users may suffer psychological and
emotional harm due to derogatory comments, threats, and cyberabuse. Malicious
individuals can exploit social media to commit phishing attacks, identity theft, and

11 The term FOMO syndrome is primarily used to describe a fear of missing out or being excluded,
or a vague anxiety about situations where others seem to be having worthwhile experiences that
one has not tried.

other cybercrimes, causing social disorder. These issues can be attributed not so much
to the inherent dysfunctions of social media itself, but rather to the misuse of social
media by its users.
The enduring presence of social media footprints is indeed another dysfunction
of the medium. Actions and statements from the past do not simply disappear but
continue to affect lives today. A personal mistake stored on social media can resurface
at any time, and someone who remembers these past actions might spread them
via social media for various reasons. This can lead to a form of “public opinion
trial,” where past actions are judged by today’s standards, resulting in the public
condemnation of past mistakes. While it might be considered positive that individuals
are held accountable for their actions, the judgments made through public opinion
rather than through legal proceedings are inherently unfair. The “right to be forgotten”
faces challenges against the argument for the “right to know.” Even attempts to erase
past digital records through “digital undertakers” cannot guarantee the complete
removal of all stored files across social media and digital devices, and the aggrieved
memories imprinted in victims’ minds remain inerasable. This highlights the struggle
of creating a forgiving society that allows for reflection and recovery in the face of
social media’s barriers.

3.5.3 Personal Information

From an individual perspective, the dysfunction caused by the collection, leakage,
and misuse of personal information is far more significant than the structural
dysfunctions of search engines or social media. Platforms that provide search services, social
network services, messenger services, and online commerce all collect user informa-
tion. This information can be used to improve the services provided to users and help
them select more suitable sellers. However, personal information can be mishandled,
leading to potential leaks and misuse that can be devastating for individuals. So the
collection of personal information is problematic from the start.
Platforms collect personal information not only through user search queries but
also through various social media activities. Users indirectly provide their personal
information while using these platform services and consent to its use for various
reasons. As a result, the amount of information collected about each user by the
platform is much more extensive and detailed than one might guess. Once a platform
collects personal information, this information becomes the property of the platform
company, and individuals cannot control it. Furthermore, it is impossible to know
when, where, and how the collected personal information will be used and how far it
will spread. Therefore, one can only hope that platform companies manage personal
information with the utmost security priority; however, in general, companies do not
invest as much in security as users expect.
Platform operators actively collect user information using all means because they
regard user data as an asset to the company. This includes search queries, personal
messages exchanged via messengers, various travel photos, stories and ‘likes’ posted

on social networks, and various physical information sent via wearable health check
apps. This data is collected and categorized individually by platform companies.
Because user data and user behavior are inseparably related, analyzing this data
reveals user behavior, such as location, movement trajectories, interests, personal
networks, social graphs, consumption habits, and internet search patterns. In other
words, platform companies collect much more user data than is necessary for service
improvement and use it for business advantage.
If platform companies use personal information to attempt price discrimination,
users may experience pricing unfairness. If the platform does not have information
about a user’s preferences, it might offer the standard or a lower price, but if it
knows the user needs a particular product, it could charge a higher price. Such
price differentiation allows platform companies to make more profits. If such price
discrimination is applied to medical insurance, it could potentially lead to a crisis in
the insurance system. Discriminating by charging lower premiums when someone
is healthy and higher premiums when their health deteriorates can exclude some
individuals from medical insurance, ultimately endangering the insurance system.
When personal information is disclosed to others, the individual becomes
constrained in social life, potentially leading to unhappiness. Disclosure means that
one’s private space narrows while the public space expands. Although one can live
comfortably in private spaces, in public spaces, one becomes acutely aware of the
gaze of others. Thus, the disclosure of personal information means that many aspects
previously considered private become public, constraining social behavior. This can
be particularly disadvantageous in conflict situations and lead to unexpected conse-
quences. To escape such constraints and restrictions, people may act differently than
usual, may not freely express their opinions, and may not seek help even when
facing difficulties. If such a state of isolation becomes unbearably difficult, one may
seek refuge in safe spaces, which could be physical places like churches or temples
or virtual spaces like other social networks. However, if one joins a homogeneous
group on a social network, while it may initially feel safe and comforting, differences
in opinions among members over time can lead to divisions and eventually harmful
actions against each other. This phenomenon can occur in both religious and political
groups, albeit with varying degrees of severity.
Thus, the leakage of personal information through digital platforms can be consid-
ered the most critical dysfunction at an individual level. The best strategy is to prevent
the leakage of personal information as much as possible. Primarily, each individual
should not leak their personal information, and secondarily, platforms should take the
best measures to prevent the leakage of personal information. While personal infor-
mation leakage by individuals can be minimized by prudent behavior, leakage by
platform operators can only be minimized by enacting and strictly enforcing personal
information protection laws. Further, more proactive measures are also worth considering,
such as legislation limiting the content and scope of collected information or the
period for which collected information may be stored.

3.5.4 Lowest Price Guarantee

Another dysfunction of digital platforms is that they can deceive users through the
collection of massive amounts of personal data and sophisticated sales tactics, which
users can hardly imagine. For example, products recommended by a platform’s
marketplace are likely to be those that pay substantial commissions to the platform,
or the platform’s own products. This means that products recommended by platforms
may be more beneficial for the platform operators than for the users. Even if platform
operators use extensive user personal data to expose items that might interest users
and present them as the best products for the user, ultimately, this is more about
making money for the business than helping the user. Such dual business practices
raise issues concerning the transparency of information and the neutrality of platform
operators.
In the case of intermediary service platforms like online commerce and travel
agencies, there exists a sales tactic known as the “lowest price guarantee,” which
can disrupt the order of commerce by luring users. For example, if hotel booking
platform A advertises that it guarantees the lowest price, how can platform A ensure
the lowest price while compensating for any shortfall? If platform A is a well-known
platform that is convenient to use and connected to various hotels, and it guarantees
the lowest prices, everyone would want to use this platform. If all users flock to
platform A and stop using other platforms, what happens? Platform A may pass on
the losses incurred from guaranteeing the lowest prices to the hotels by demanding
additional fees, and the hotels, having no other way to connect with users except
through platform A, will have to comply. As a result, hotels paying additional fees
will seek ways to recover their losses, eventually passing them on to other customers
or other platforms. For instance, they might increase the rates for customers not
booked through platform A or for those booking through other travel platforms like
platform B. Thus, the losses from one platform’s lowest price guarantee end up
being transferred to other users as additional charges. This is a problem for all digital
platforms, especially intermediary commerce platforms, leading to the breakdown
of a healthy online commerce order.
Such lowest price guarantees are banned or regulated in European countries like
Germany and France due to their harmful effects on market order. These regula-
tions primarily prevent online commerce and hotel booking platforms from limiting
competition or restricting how third parties (e.g., hotels, merchants) set prices through
lowest price guarantees. Germany, in particular, pays attention to clauses related to
the lowest price guarantee in digital markets, often referred to as price parity or
most-favored nation (MFN) clauses. The Federal Cartel Office investigates compa-
nies that use such clauses and restrict or prohibit them if they are deemed to hinder
fair competition. France investigates these clauses and, if found to limit competition
or negatively affect consumers, requires limitations or corrective measures.

3.5.5 Natural Monopoly

Digital platforms inherently possess the risk of evolving into natural monopolies
due to network externalities. As exemplified earlier, search engines can collect more
data and enhance their algorithms with an increasing user base, thus providing better
search results. Similarly, navigational apps become more accurate and efficient as
more users contribute data about their behavior patterns and road conditions. Conse-
quently, users naturally gravitate toward platforms with a larger user base, and this
choice further strengthens the platform through network externalities, promoting a
winner-takes-all scenario and leading to a natural monopoly.
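The tipping dynamic described above can be illustrated with a deliberately simplified simulation (a hypothetical toy model, not drawn from any real platform’s data): each new user joins one of two otherwise identical platforms with probability proportional to a superlinear function of the platform’s current share, mimicking network externalities.

```python
import random

def simulate_tipping(steps=10_000, alpha=2.0, seed=42):
    """Toy Polya-urn-style model of network externalities.

    Each arriving user joins platform A or B with probability
    proportional to (current share)**alpha. With alpha > 1 the value
    of a platform grows faster than its user count, so a small early
    lead snowballs; with alpha = 1 there is no systematic tipping.
    """
    rng = random.Random(seed)
    users = [50, 50]  # both platforms start with equal user bases
    for _ in range(steps):
        share_a = users[0] / sum(users)
        weight_a = share_a ** alpha
        weight_b = (1 - share_a) ** alpha
        pick = 0 if rng.random() < weight_a / (weight_a + weight_b) else 1
        users[pick] += 1
    total = sum(users)
    return [users[0] / total, users[1] / total]

shares = simulate_tipping()
print(f"final shares: A={shares[0]:.2f}, B={shares[1]:.2f}")
```

Despite the symmetric start, the 50/50 split is unstable under superlinear feedback: random early fluctuations are amplified until one platform holds most of the market, which is the winner-takes-all pattern just described.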
With the content platforms divided between iOS and Android, where Android is
used by several manufacturers but iOS is exclusively controlled by Apple, Apple’s
ecosystem showcases high market dominance and associated detriments.12 This
exclusivity means Apple can tightly control hardware, software, and applications,
limiting freedom for users and developers who must adhere to Apple’s guidelines
and policies. Apple’s control over the iOS app marketplace further restricts compe-
tition for developers, who may face rejection or high fees for store entry. In addi-
tion, the requirement to use Apple’s exclusive development tools and programming
language can constrain open-source and cross-platform development preferences.
Apple’s premium pricing for its devices restricts consumer access, and the lack of
compatibility with other manufacturers’ devices hinders platform switching and data
sharing.
Google’s dominance in the global search market, with a 91% share, and its
leading position in digital advertising, showcases potential monopolistic dysfunc-
tions. The overwhelming market share restricts competition and innovation, making
it challenging for users to find alternatives and suppressing diversity in the digital
ecosystem. Google’s market power could lead to bias in search algorithms away
from the most relevant results toward those most profitable, affecting the quality
and diversity of search information. Similarly, in digital advertising, Google’s
dominance raises concerns about price manipulation, which limits advertisers’
options and potentially increases costs that may be passed on to consumers. In
addition, Google’s control over content publishers’ revenue streams can significantly influence disputes over revenue sharing and threaten the sustainability of
high-quality journalism. Like its search engine, Google’s dominance in advertising
makes it difficult for smaller competitors to challenge its vast resources, thereby
suppressing innovation and limiting the introduction of new services to the detriment
of consumers.
Amazon, with a 37.6% share in the global online retail market, overwhelmingly
surpasses its closest competitor, Walmart, which holds only 6.4%. This dominant
position allows Amazon to impose arbitrary fees on platform sellers and restrict

12As of April 2024, the global OS market share stands at Android 70.9% versus iOS 28.4%, yet in
North America and Oceania, iOS dominates with 55% and 54% market share, respectively. While
Android is utilized by a majority of smartphone manufacturers like Samsung, Huawei, and Xiaomi,
iOS remains confined within Apple’s closed ecosystem, including iPhones, iPads, and iPod touches.

consumer choices. Amazon’s vast delivery network and warehouses exert pressure
on competitors, and during the pandemic, while other competitors were on the defen-
sive, Amazon aggressively expanded employment and investment. Numerous third-
party sellers depend on Amazon’s marketplace to reach customers, and Amazon
controls them through various fees and access to customer data. Fees for third-party
sellers include referral fees, fulfillment fees, subscription fees, warehouse fees, long-
term storage fees, and product removal fees, causing distress and complaints among
sellers.
Meta’s unparalleled position in social networking and messaging services, with
Facebook, Instagram, WhatsApp, and Facebook Messenger dominating user counts,
raises concerns over monopolistic practices.13 The Federal Trade Commission
(FTC)’s lawsuit against Meta for acquiring Instagram and WhatsApp highlights
potential antitrust violations, indicating that such dominant market power can hinder
market entry for other companies and limit user choices. If a monopoly eliminates
competition, firms can act against user interests with less pressure to improve their
products, ultimately disadvantaging consumers.

3.6 Regulation of Digital Platform Companies

The solution to the dysfunctions of digital platforms is regulation. However, regulation of corporate activities can always lead to both positive and negative outcomes
simultaneously. While regulation can reduce the adverse effects that corporate activ-
ities have on society, it can also suppress business activities and dampen creativity
and innovation. Implementing regulations too early can hinder the proper develop-
ment of technology and services, and prevent companies from growing sufficiently,
leading to premature suppression. Therefore, it is essential to carefully consider the
maturity of technology, services, and companies before proceeding with regulations
to address the negative aspects without stifling creativity and innovation.
If we assess the current maturity of platform companies, Apple, Google, Amazon,
and Meta are all among the top 10 companies in the world by market capitalization,
indicating that it is time to correct their negative aspects through regulation. The
primary target for regulation of these digital platform companies is their monopo-
listic dominance due to market power. The natural monopoly phenomenon arises in
digital platforms due to network externality effects, which significantly increases the
likelihood of these companies becoming monopolistic enterprises. Indeed, the fact
that Apple, Google, Amazon, and Meta have become among the top 10 companies by market capitalization in a relatively short period is evidence of the
natural monopoly phenomenon of digital platform companies. The United States

13 As of April 2024, Meta’s platforms have a combined total of 8.1 billion monthly active users, with
Facebook at 3.1 billion, Instagram at 2.0 billion, WhatsApp at 2 billion, and Facebook Messenger
at 1.0 billion, according to Statista data. This overwhelmingly surpasses YouTube at 2.5 billion,
TikTok at 1.6 billion, WeChat at 1.3 billion, and Telegram at 900 million users.

has provided a free environment that has allowed platform companies to innovate
and grow rapidly, contributing to their fast growth. However, the USA has begun
regulating platform companies including Meta,14 Google,15 Amazon,16 and Apple17
for antitrust violations, and Europe has also begun to formalize regulation of digital
platform companies.18

3.6.1 Methods of Regulation

Digital platforms are deeply embedded in our society today, and our daily lives are
significantly dependent on digital platform services. Therefore, the negative aspects
of digital platforms already affect our lives, making regulation of digital platforms
an urgent task of our times. Effective regulation of digital platforms is necessary
to reduce the social harm caused by certain operators’ monopolies. To that end, it
is essential to accurately identify the characteristics of digital platforms and find
suitable methods of regulation. The methods should be carefully designed such
that they can preserve the advantages and innovative development of digital platform services while preventing their negative aspects and side effects. Further, they
should be able to open doors to new competitors in the digital platform’s two-sided

14 In December 2020, the FTC filed a lawsuit against Facebook (Meta), alleging that its acquisitions
of Instagram and WhatsApp were anticompetitive. The first lawsuit was dismissed in June 2021, but
the FTC filed an amended lawsuit, with a trial expected to take place in 2024. The FTC argues that
while the acquisitions did not lead to price increases, they resulted in poorer service and restricted
consumer choice.
15 In January 2023, the U.S. Department of Justice filed a lawsuit against Google for abusing
its monopoly power in the digital advertising market and harming fair competition, in violation
of antitrust laws. The lawsuit claimed that Google demanded default monopoly rights to block
competitors. Specifically, it was alleged that Google entered into revenue-sharing agreements with
electronic device manufacturers and wireless carriers to set its web browser as the default search
engine, thereby excluding competitors. On August 5, the federal court in Washington DC ruled that
“Google is a monopolist, and it has acted as one to maintain its monopoly”.
16 On September 26, 2023, the FTC filed a lawsuit against Amazon for allegedly using its dominant
position in the e-commerce market to harm competitors. The FTC argued that Amazon penalized
sellers on its platform if they offered their products at lower prices on competing platforms. It was
also claimed that Amazon forced sellers to use its expensive logistics network, resulting in harm to
competitors and causing consumers to pay higher prices for their purchases.
17 On March 21, 2024, the U.S. Department of Justice, along with 16 states, filed an antitrust lawsuit
against Apple. The lawsuit alleged Apple was harming consumers by using its monopolistic position
in the market. Key issues include ‘super apps,’ differentiation in messaging app bubble colors,
restrictions on cloud streaming gaming apps, and limiting cross-platform compatibility for digital
wallets and smartwatches, which restrict consumer choices and hinder competition.
18 In July 2022, the EU enacted the Digital Markets Act (DMA), a competition law aimed at
regulating monopolies in digital platforms. Simultaneously, the Digital Services Act (DSA) was
established to regulate the monopolies of digital platform services. For specific details, refer to
Sects. 6.2 and 6.3.

market and ensure that all participants are fairly compensated according to their
contributions.
However, the problem is that digital platforms present a case that has not been
experienced in the industrial society of the past, so it is unclear what the most effective
way to regulate them is. In the past, industries like telecommunications, electricity,
and railways had high entry barriers and a high potential for monopoly, which
regulation sought to prevent. The method used to regulate monopolistic companies at that
time was to split the company and introduce competition. For example, in the case of
telecommunications, local communication businesses requiring essential facilities
were considered core businesses, so the long-distance communication business was
separated out to introduce competition. Similarly, in the railway business, train stations
and railway networks were seen as essential facilities, so the transport sector was
separated for competition.
However, it appears inappropriate to apply the same concept of splitting core businesses and competitive areas to today’s digital platform business. This is because the
reasons for high entry barriers and monopolies are different between the two. In the
past, the high entry barriers were due to the need for substantial facility investments,
naturally forming a monopoly structure. In contrast, the difficulty for new entrants
in the digital platform market is not due to substantial facility investments but due to
the network externality effects, which concentrate users on some particular service
providers. Therefore, it is not valid or effective to apply the regulatory methods
used in traditional industries to digital platform businesses. Moreover, while tradi-
tional industries were confined within a country, allowing the government to regu-
late them, digital platform businesses are global, making such regulatory methods
inapplicable.19
Therefore, it is necessary to develop and apply new methods of regulation that
fit the nature of digital platforms, such as creating new competition laws to regulate
natural monopolies in digital platforms. For example, unlike traditional monopolistic
companies where high entry barriers were due to essential facilities, digital platforms
have barriers due to user concentration. It might be more appropriate to target the
core business directly, considering that the user concentration phenomenon occurs
in the core business segment. If competitors could access the core business without
discrimination, it could resolve the issues caused by user concentration. Therefore,
in two-sided markets like digital platforms, a “multi-homing” system that allows
users and providers to participate in multiple platforms simultaneously could help
to resolve the natural monopoly problem. Multi-homing, originally a term used in
communications, refers to connecting users to multiple networks to improve the
reliability and performance of communications. For instance, in the US, ride-hailing
platforms like Uber and Lyft allow drivers and passengers to freely use both companies,

19 Nobel Prize-winning economist Jean Tirole argued that traditional regulatory and antitrust rules
are ineffective in addressing the issues of increased dominance in digital platforms, and new rules are
needed to ensure competition in the digital platform market, including the potential for competition
and fair compensation for the contributions of platform users. Refer to “The Future of Platform
Regulation” NBER conference lecture, April 22, 2022.

which is an example of multi-homing. Applying this system to digital platform businesses can prevent specific platforms from monopolizing the market. If drivers were
restricted to working for only one of the two taxi companies, they would flock to
the bigger company having more customers, leading the smaller company to decline
and eventually resulting in monopoly by the bigger.
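The role of multi-homing in preventing such tipping can be sketched with another toy model (the parameters are purely illustrative assumptions, not data from Uber, Lyft, or any real marketplace): under single-homing, a fraction of the smaller platform’s drivers defects to the larger one each round, while under multi-homing drivers simply serve both.

```python
def driver_shares(initial, rounds, multi_homing, defect_rate=0.10):
    """Toy two-platform driver market.

    Single-homing: each round, a fraction of the smaller platform's
    drivers defects to the larger platform (more passengers there),
    so the smaller platform withers. Multi-homing: drivers register
    with both platforms, so there is nothing to defect from and both
    keep their supply. Illustrative numbers only.
    """
    a, b = initial
    for _ in range(rounds):
        if multi_homing:
            continue  # drivers serve both apps; supply stays stable
        if a >= b:
            moved = int(b * defect_rate)
            a, b = a + moved, b - moved
        else:
            moved = int(a * defect_rate)
            a, b = a - moved, b + moved
    return a, b

print(driver_shares((600, 400), rounds=30, multi_homing=False))
print(driver_shares((600, 400), rounds=30, multi_homing=True))
```

In the single-homing run the smaller platform’s driver pool collapses toward zero, while the multi-homing run leaves both pools intact, which is why regulation favoring multi-homing can keep a two-sided market contestable.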

3.6.2 EU’s Digital Markets Act (DMA)

The European Union (EU) was the first to legislate regulations for digital platform
companies with the establishment of the Digital Markets Act (DMA) in July 2022,
aimed at regulating monopolies in digital platforms. The DMA was first proposed
in December 2020, adopted by the European Parliament in July 2022, officially
published in the Official Journal in October 2022, came into effect in November
2022, and has been applied since May 2023.
The DMA is a law designed to increase fairness and contestability in the digital
platform market. It sets clear criteria for identifying ‘gatekeepers’ and
stipulates the obligations and prohibitions that gatekeepers must comply with. Gate-
keepers refer to large digital platforms that provide core platform services, such as
online search engines, app stores, and messaging services. Specifically, a gatekeeper
is defined as a digital platform that, first, holds a strong economic position with
significant impact on the internal market and operates across several EU countries;
second, connects a large user base and many businesses from a powerful interme-
diary position; and third, has had a stable and durable position in the market, meeting
the above two criteria for at least two out of the last three fiscal years.
Examples of obligations that the DMA requires gatekeepers to comply with are as
follows: In certain situations, the gatekeeper must allow other companies to interop-
erate with the gatekeeper’s own services. It must provide business users of the gate-
keeper’s platform with access to data generated during their use of the platform. It
is required to provide tools and information necessary for advertisers and publishers
to perform independent verification of advertisements placed on the gatekeeper’s
platform. Business users of the gatekeeper’s platform must be allowed to promote
and make contracts with their customers outside of the gatekeeper’s platform.
Examples of prohibitions that the DMA requires gatekeepers to comply with are
as follows: The gatekeeper must not give preferential ranking to its own services or
products over similar services or products provided by third parties on the platform.
It must not hinder consumers from connecting with businesses outside the platform.
It must not prevent users from uninstalling pre-installed software or apps if the users
choose to do so. Without valid user consent, the gatekeeper must not track users for
targeted advertising purposes outside of its core platform services.
If these obligations and prohibitions are not complied with, the company can be
fined up to 10% of its total annual worldwide turnover, and up to 20% in the case of
repeated violations. If gatekeepers systematically breach DMA obligations, further

measures can be taken after investigation, including, as a last resort, non-monetary measures such as divesting parts of the business.
After the implementation of the DMA in November 2022, the European Commis-
sion began receiving information on user numbers from each digital platform
company and designated the gatekeepers in September 2023. The designated gate-
keepers include six platform companies: Alphabet (Google), Amazon, Apple,
ByteDance, Meta (Facebook), and Microsoft. These six gatekeepers have a total
of 22 designated core platform services, as listed in Table 3.3.
The essence of the DMA revolves around fairness and competitiveness. Fairness
ensures that actual service providers can receive a portion of the added value created
on digital platforms as compensation. This means that service providers, by offering
their services through digital platforms and thereby contributing to the platform’s
business, deserve fair compensation for their contributions. Competitiveness involves
creating conditions that allow new platforms capable of challenging existing ones to
compete with them over core businesses. For instance, if there is a next-generation
search engine with new features, it should be able to compete with and enter the
market against Google’s search engine. Similarly, if there’s a new social network
platform that’s more appealing than Facebook or Instagram, it should be allowed
market entry as well.
In order to prevent monopolies by digital platform operators, it is crucial to create
conditions that enable users to try new platforms. Consider, for example, a new
social network platform with features that compete with Facebook. Users may want
to switch to the new platform but feel unable to do so because they have already
posted much content on Facebook and are uncertain if their friends will follow them
to the new platform. If posting a photo on Facebook would also upload it to the new
platform, this issue would be resolved. This is what multi-homing entails. Multi-
homing creates conditions that allow digital platform users to participate simultane-
ously in new platforms, thereby preventing monopolies by existing operators. The
DMA mandates interoperability and data access as obligations for multi-homing.
Supporting interoperability requires the opening or standardization of interfaces.
Therefore, the benefits brought by the DMA can be summarized as follows:
Business operators dependent on gatekeepers for providing services will enjoy a
fairer business environment. Innovators and entrepreneurs gain new opportunities
to compete and innovate in the online platform environment without unfair condi-
tions and restrictions. Users will have the chance to choose from a wider variety of
better services, change service providers when they want, and select fair prices. Gate-
keepers will also have the opportunity to innovate and offer new services in an open
and competitive environment. However, unfair practices to gain undue advantage
over service providers and users are not permitted.20

20 Following the implementation of the DMA, Apple announced that from March 2024, iPhone
apps could be downloaded from app markets other than its own App Store in Europe, and it allowed
the use of third-party payment systems for in-app purchases. This move by Apple, in response to
Europe’s strong antitrust regulation through the DMA, indicates that it can no longer maintain a
closed ecosystem.
Table 3.3 DMA-designated gatekeepers and platform services

Alphabet (Google): Intermediation: Google Maps, Play Store, Google Shopping; Ads: Google Ads; Video sharing: YouTube; Search: Google Search; Browser: Chrome; OS: Android
Amazon: Intermediation: Marketplace; Ads: Amazon Ads
Apple: Intermediation: App Store; Browser: Safari; OS: iOS
ByteDance: Social network: TikTok
Meta: Social network: Facebook, Instagram; Intermediation: Marketplace; NI-ICS*: WhatsApp, Messenger; Ads: Meta Ads
Microsoft: Social network: LinkedIn; OS: Windows PC OS

* NI-ICS refers to number-independent interpersonal communication services

3.6.3 EU’s Digital Services Act (DSA)

The European Union (EU) legislated the Digital Services Act (DSA) alongside the
Digital Markets Act (DMA) in July 2022 to regulate the monopolies of digital plat-
form services. Like the DMA, the DSA was first proposed in December 2020, adopted
by the European Parliament in July 2022, published in the Official Journal in October
2022, and came into effect in November 2022, with enforcement starting in August
2023.
The DSA works closely with the DMA to protect users’ fundamental rights,
establish a foundation for fair competition among companies, create a safer digital
space, and foster innovation, growth, and fair competition in both the European
single market and the global market. The DSA covers a wide range of digital plat-
form services from simple websites to internet infrastructure services and online
platforms. The rules set out in the DSA apply mainly to online intermediation plat-
form businesses such as e-commerce, social networks, content sharing platforms,
app stores, and online travel and accommodation platforms.
The motivation behind the EU’s establishment of the DSA stems from the benefits
and challenges brought by digital platform services. Digital platform services have
enhanced human life in various ways, such as communication, information search,
shopping, entertainment, food ordering, and movie watching, and have facilitated
business activities across borders and new market access. However, they also intro-
duced significant issues, including the trade of illegal goods, services, and content
online, and the spread and exploitation of false information created by manipulative
algorithms. Moreover, some large platforms have dominated the digital economy
ecosystem, acting as gatekeepers that impose unfair conditions on businesses and
limit users’ choices by establishing their own rules. Therefore, the EU introduced
the DSA as a modern legal framework to ensure online safety for EU users, protect
the fundamental rights of businesses, and maintain a fair and open online platform
environment.
After the DSA’s implementation in November 2022, the European Commission
designated certain digital platforms as Very Large Online Platforms (VLOP) and
Very Large Online Search Engines (VLOSE) in April 2023, based on their active
user numbers exceeding 45 million (10% of the European population) in the EU as
of February that year. The designated VLOPs included 17 entities like AliExpress,
Amazon Store, and the App Store, while VLOSEs included Bing and Google Search,
among others, as listed in Table 3.4.
The DSA sets rules and responsibilities for online platforms, particularly VLOPs
and VLOSEs. Examples of obligations that the DSA requires VLOPs and VLOSEs
to comply with are as follows: They must publish a content moderation activity report
describing the efforts they make to address illegal content and misinformation. They
are required to implement robust content moderation systems and mechanisms to
prevent the spread of illegal content, hate speech, and harmful misinformation. They
must provide relevant users with information regarding the removal or restriction of

Table 3.4 DSA-designated VLOPs and VLOSEs

Alphabet (Google): Intermediation: Google Maps, Play Store, Shopping; Video sharing: YouTube; Search: Google Search
Amazon: Intermediation: Amazon Store
Apple: Intermediation: App Store
ByteDance: Social network/messenger: TikTok
Meta: Social network/messenger: Facebook, Instagram
Microsoft: Social network/messenger: LinkedIn; Search: Bing
Alibaba: Intermediation: AliExpress
Booking Holdings: Intermediation: Booking.com
Pinterest: Social network/messenger: Pinterest
Snapchat: Social network/messenger: Snapchat
X: Social network/messenger: Twitter (X)
Wikipedia: Dictionary: Wikipedia
Zalando: Intermediation: Zalando

such content and offer an opportunity to appeal these actions. Platforms must coop-
erate with regulatory authorities and law enforcement to respond to illegal activities
related to content.
The DSA imposes additional obligations on online platforms such as e-commerce
sites and social media networks. These include providing clear information about
each advertisement displayed and discontinuing personalized ads based on sensitive
data or profiling of children’s data. Platforms must avoid online interface designs
intended to manipulate user behavior and disclose in their Terms & Conditions (T&C)
any use of fully or partially automated systems for recommending certain information
to users. They are required to offer a notification mechanism for users to report
illegal content and notify affected users of any actions taken regarding their content.
Platforms must provide an effective internal complaint-handling system for users
impacted by decisions related to content or account management.
Examples of prohibitions that the DSA requires VLOPs and VLOSEs to comply
with are as follows: They are prohibited from managing or spreading illegal content,
including hate speech, child abuse, and unauthorized copyrighted material. They
must avoid engaging in unfair competition practices that harm consumers, other
businesses, or stifle innovation. Discrimination based on personal characteristics such
as gender, race, or religion is not allowed. They are also prohibited from engaging
in manipulative practices that distort the presentation of content or services.

A notable aspect of the obligations and prohibitions for VLOPs and VLOSEs is
that personalized advertising based on an individual’s religion, race, sexual orien-
tation, or political views is prohibited, and no form of personalized advertising can
be directed at children and adolescents. This is because if advertisements containing
specific religious, ethnic, or political messages are “personalized,” it could lead to
the entrenchment of user biases. For example, it prevents recommending provoca-
tive content with white supremacist messages to users who enjoy content involving
racial hate. In addition, harmful content, such as discriminatory or biased speech,
terrorism, and child sexual abuse, must be swiftly removed, and platforms must
establish internal measures to prevent the spread of misinformation. Furthermore,
users must be provided with the ability to opt out of data collection and disable
recommendation algorithms, meaning they should be able to view posts in simple
chronological order without recommendations.
VLOPs and VLOSEs are directly regulated by the EU. To this end, EU regulatory
authorities have the right to access the data and information held by VLOPs and
VLOSEs, and if access is refused or obstructed, they can take legal action. Authorities
have the power to order the removal or restriction of specific content that violates
DSA regulations, and failure to comply with such orders can result in penalties.
Large platform companies that violate the DSA may be fined up to 6% of their
global revenue, and in cases of serious or repeated violations, their services may
be temporarily suspended or permanently banned from the European market. In
addition, companies that do not comply with DSA regulations may face legal action
from regulatory authorities, users, or other affected parties.
With the DSA, the era of unregulated growth for digital platform companies
has come to an end. It is likely that the EU’s regulations on digital markets and
services will spread to other countries, meaning that digital platform companies will
have to operate under strict regulations in the future. The previously unregulated use
of sensitive personal information for personalized advertising or content exposure is
no longer allowed. Even content recommendation algorithms, which are considered
a core competitive advantage, must be fully revised to comply with the regulations.
For companies like Google, which relies on advertising for approximately 80% of
its revenue, and Meta, which relies on over 90%, the DSA could represent a seismic
shift in their business models.
Chapter 4
Digital Technology

Digital transformation, succinctly defined, refers to transitioning toward the paradigm of the digital age. Specifically, it involves applying digital technologies and the new
methods and concepts that come with the digital age, enabling businesses, society,
and individuals to innovate in all operations, activities, and lifestyles, improve effi-
ciency, and create new value, thus transforming into a new dimension. Therefore,
for digital transformation to be successful, it is necessary to adopt digital technolo-
gies and pursue comprehensive changes that align with the associated methods and
concepts. The starting point for digital transformation is, thus, a deep understanding
and utilization of digital technologies.
The digital technologies driving digital transformation are varied and encompass
information and communication technology (ICT), which combines communication
and computer technology. There are technologies directly related to communica-
tion, such as 5G/6G mobile communications, the Internet of Things (IoT), and those
related to both communication and computers, including cloud computing and edge
computing, augmented reality (AR), virtual reality (VR), the metaverse, autonomous
driving, blockchain, as well as those related to computers, like artificial intelligence
(AI), quantum computing, and those related to computer data processing, such as
big data analysis, bioinformatics, digital twins, and cybersecurity solutions. In addi-
tion, there are other technologies such as robotics, 3D/4D printing, which require
mechanical and material technologies in addition to ICT.

4.1 5G/6G Mobile Communication

5G mobile communication provides communication services wirelessly through a cellular network. That is, the service area is divided into small geographical areas
called “cells,” with a base station positioned at the center of each cell. These base
stations are interconnected through wired networks, including fiber-optic networks.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025 69
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_4

Users communicate wirelessly with the base station in their area using a cellular phone, and if a user moves into a different base station's area, the previous base station hands the connection over to the new one, enabling uninterrupted communication while on the move.
The wireless frequency bands used in 5G mobile communication are divided
into bands below 6 GHz (FR1) and the 24–54 GHz band (FR2). The maximum
channel bandwidth is 100 MHz for FR1 and 400 MHz for FR2. FR1 has rela-
tively longer wavelengths, allowing signals to propagate further, resulting in base
stations being several kilometers apart. However, FR2 belongs to the millimeter
wave (mmWave) band, experiencing significant attenuation and shorter signal prop-
agation distances, with base stations only tens to hundreds of meters apart. Therefore,
while the FR2 band offers higher data transmission rates than FR1, it requires more
frequent handovers.
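The propagation difference between FR1 and FR2 follows from the free-space path loss (Friis) formula; the sketch below uses 3.5 GHz and 28 GHz as illustrative carrier frequencies (these particular values are examples chosen for the sketch, not figures from the text):

```python
import math

def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    c = 3e8  # speed of light, m/s
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / c)

# The same 100 m link at a typical FR1 and FR2 carrier frequency.
loss_fr1 = fspl_db(100, 3.5e9)   # roughly 83 dB
loss_fr2 = fspl_db(100, 28e9)    # roughly 101 dB

# FR2 loses about 18 dB more (a factor of ~64 in received power),
# before counting rain/foliage attenuation, which hits mmWave hardest.
extra_loss = loss_fr2 - loss_fr1
```

This ~18 dB gap at identical distance is why FR2 base stations must be only tens to hundreds of meters apart.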
In mobile communication, multiple access refers to the technique by which many users share the available radio spectrum. Each generation of mobile communication has used a different multiple access method (refer to Table 2.1), and like 4G, 5G mobile communication uses Orthogonal Frequency Division Multiple Access (OFDMA) technology for both uplink and downlink. OFDMA is a multi-user version of Orthogonal Frequency Division Multiplexing (OFDM) modulation that allows multiple users to access the channel simultaneously by assigning different subcarriers to different individual users.
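The subcarrier-assignment idea behind OFDMA can be sketched with a toy scheduler (the user names and counts are invented for illustration; real schedulers allocate resource blocks dynamically based on channel conditions):

```python
def assign_subcarriers(users, num_subcarriers):
    """Toy OFDMA scheduler: deal subcarriers out round-robin so each
    user gets a disjoint set, keeping their transmissions orthogonal."""
    allocation = {u: [] for u in users}
    for sc in range(num_subcarriers):
        allocation[users[sc % len(users)]].append(sc)
    return allocation

# Three users sharing a 12-subcarrier channel.
alloc = assign_subcarriers(["alice", "bob", "carol"], 12)
# Each user gets 4 subcarriers, and no subcarrier is shared.
```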
In addition, 5G mobile communication adopts massive MIMO (Multiple-Input, Multiple-Output) technology, which equips both base stations and terminals with multiple antennas for transmitting and receiving signals. MIMO can serve multiple users simultaneously, increasing throughput via spatial multiplexing, or can improve reception performance through beamforming, which transmits the same user's signal through multiple antennas and combines it. While 4G uses 2 to 4 MIMO antennas, 5G uses up to hundreds of antennas to transmit signals through dozens of spatial channels simultaneously (hence "massive" MIMO) and supports beamforming technology.1 That is, 5G supports beamforming through massive MIMO, enhancing signal strength and reducing interference, thus improving the efficiency of signal transmission and reception.
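The array gain behind beamforming can be illustrated numerically. Under the idealized assumption that each of N antennas radiates a unit-amplitude, perfectly phase-aligned copy of the signal, the fields add coherently and the received power grows as N² relative to a single antenna at the same per-antenna power (real systems are limited by total power and imperfect channel estimates):

```python
import cmath
import math

def array_gain_db(n_antennas: int, phase_errors=None) -> float:
    """Idealized coherent-combining gain in dB relative to one antenna.
    Each antenna contributes a unit-amplitude field; with zero phase
    error the fields add to amplitude N, i.e. power N**2."""
    phase_errors = phase_errors or [0.0] * n_antennas
    field = sum(cmath.exp(1j * e) for e in phase_errors)
    power = abs(field) ** 2
    return 10 * math.log10(power)

# 64 perfectly aligned antennas: 10*log10(64**2) ~= 36 dB of array gain.
gain_ideal = array_gain_db(64)
```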
By adopting various advanced technologies including massive MIMO and beam-
forming, 5G has significantly improved performance. Specifically, 5G’s data trans-
mission rate is 10–100 times faster than 4G, allowing for the download of large files
and streaming of high-definition videos. 5G reduces latency to about a tenth of 4G,
beneficial for applications requiring immediate responses, such as remote surgery,
autonomous driving, augmented reality (AR) and Intelligent Traffic Systems (ITS).2

1 Beamforming technology enables numerous simultaneous connections, increasing both data rate
and network capacity. Beamforming sends signals from multiple antennas simultaneously to focus
them on a specific receiver, allowing efficient wireless signal reception at the receiver without
increasing transmission power.
2 ITS applies ICT to the field of road traffic, enhancing traffic efficiency and safety, increasing road capacity, and reducing travel time by systematizing the infrastructure, vehicles, users, traffic, and mobility management, and interfaces with other modes of transport.

In addition, 5G has a larger network capacity and supports significantly more simul-
taneous device connections compared to 4G, enabling it to accommodate and serve
a large number of devices at the same time in applications such as the Internet of
Things (IoT), smart cities, and industrial automation.
For example, examining how 5G technology becomes an essential element in
ITS services, we find the following points. First, fast data transmission is crucial
for sending and receiving large amounts of data in real-time, which is necessary
for managing and optimizing traffic flow in ITS. Second, low latency is essential
for enabling real-time communication among vehicles, infrastructure, and central
control systems in ITS. Third, the capability for massive simultaneous device connec-
tions allows various devices such as sensors, cameras, traffic lights, and vehicles in
ITS to be connected and communicate with each other at the same time. Fourth,
beamforming technology enhances the accuracy of location services required for
vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication, as well
as collision prevention.
Thus, 5G technology, compared to 4G, has outstanding performance in terms
of data transmission speed, latency, the number of simultaneous connections, and
frequency efficiency, offering great potential for future services like IoT, smart cities,
and remote education, bringing innovation and efficiency improvement. However, it
is assessed as technically limited in fully supporting unmanned remote services like
remote healthcare, industrial automation, or high-risk services such as autonomous
driving and ITS. To overcome the limitations of 5G and meet the advanced industry
requirements like smart factories, smart farms, remote healthcare, autonomous
driving, and ITS, research on the sixth generation (6G) mobile communication is
actively underway globally, integrating cutting-edge technologies like computing,
sensing, communication, and artificial intelligence. 6G research aims to support
fundamental industry innovation by meeting demanding requirements across all
industries.
While 4G focused on increasing data transmission volumes by securing broadband
channels, 5G aimed to expand network capacity, massively increase the number of
simultaneous connections, and significantly reduce transmission delay. Going one
step further, 6G aims to advance the broadband, massive connectivity, and low-
latency goals of 5G to achieve ultra-broadband (up to 1000 Gbps), ultra-massive
connectivity (100 devices per square meter), and ultra-low latency (6 ms delay within
a 1000 km range). It also aims to extend communication to space, save energy, and
achieve ultra-precise positioning. If these technological goals to provide 6G mobile
communication services could be successfully achieved by around 2030, it would
be possible to provide uninterrupted services while traveling at supersonic speeds in
airplanes or hyperloops, as well as high-speed internet services in deserts, seas, or
remote areas. It would also be possible to fully and safely commercialize massive
unmanned agriculture, transportation, and distribution using unmanned autonomous
vehicles, robots, and drones.

4.2 Internet of Things (IoT)

The Internet of Things (IoT) extends the internet used by people to objects (things). It
connects the objects around us to the internet, allowing them to communicate, share
data, and operate intelligently. This connection between the physical and digital
worlds enables data-driven decisions and automation. Specifically, it is a network
equipped with sensors and software on various physical devices, vehicles, buildings,
and objects, connected through a network to collect, exchange, and share data. To
accommodate the vast number of objects, the IoT protocol has expanded from the
32-bit address field of IPv4 to the 128-bit address field of IPv6.3
Technically, IoT comprises sensors, actuators, and communication devices.
Sensors detect and collect data from the environment, like temperature and humidity
sensors. Actuators execute commands affecting the environment, like turning lights
on or off in a smart lighting system. Communication devices transmit data collected
by sensors or commands to actuators using WiFi, Bluetooth,4 and other wireless and
mobile communication devices.
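The sensor → communication → actuator pattern described above can be sketched as a minimal control loop (the class names, temperature values, and setpoint are invented for illustration; the "communication" layer here is a plain function call standing in for WiFi or Bluetooth):

```python
class TemperatureSensor:
    """Stands in for a physical sensor; returns readings in deg C."""
    def __init__(self, readings):
        self._readings = iter(readings)

    def read(self):
        return next(self._readings)

class HeaterActuator:
    """Stands in for a physical actuator that can be switched on/off."""
    def __init__(self):
        self.on = False

    def set(self, on: bool):
        self.on = on

def control_loop(sensor, actuator, setpoint=20.0):
    """One control step: read the environment, then command the
    actuator to affect it (heat when below the setpoint)."""
    reading = sensor.read()
    actuator.set(reading < setpoint)
    return reading

sensor = TemperatureSensor([18.5, 21.0])
heater = HeaterActuator()
control_loop(sensor, heater)   # 18.5 < 20.0, so the heater turns on
```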
The data generated by numerous IoT devices can be processed at the network
edge through edge computing or sent to the network’s core for cloud computing
processing. Edge computing processes large amounts of data directly at the edge,
reducing latency, and compresses the data to send to cloud servers as necessary. Edge
computing is particularly important for applications requiring real-time responsive-
ness. Through such processes, IoT enables intelligent data-based decision-making,
adjusts responses to environmental changes, and prevents foreseeable issues.
IoT applications are diverse, as the following cases exemplify. Smart home
systems can automate and remotely control lighting, heating, and security. Smart
city systems can optimize infrastructure and services considering traffic, energy effi-
ciency, and public safety. Manufacturing can optimize and automate production lines.
Agriculture can monitor crop conditions and adjust irrigation to increase production
efficiency. Healthcare can manage health conditions and respond to emergencies
by connecting various sensors on the body to medical institutions. IoT enhances
automation and efficiency, provides real-time situational awareness and responses,
manages resources, and saves energy, thereby promoting connectivity, automation,
efficiency, and innovation, and thus facilitating digital transformation in businesses,
public, and personal sectors.

3 The internet was originally built on the IPv4 protocol (Internet Protocol version 4) with a 32-bit address field, which was nearly exhausted and anticipated to require a much larger address space for future IoT implementation, leading to the standardization of IPv6 (Internet Protocol version 6) with a 128-bit address field. A 32-bit address field can represent 2^32 (approximately 4.3 billion) addresses, and a 128-bit field can represent 2^128 (approximately 3.4 × 10^38) addresses.
4 Bluetooth is a short-range wireless communication industry standard that supports data communication between electronic devices over a 2.4 GHz frequency band.



Looking more closely at smart cities within IoT applications, smart cities are
comprehensive systems that operate cities efficiently and sustainably to improve citi-
zens’ quality of life and provide convenience. Specifically, smart cities comprehen-
sively control and maintain traffic systems, energy management, waste management,
water resources management, ICT connectivity, medical services, public safety and
security, and environmental monitoring. IoT plays a central role in smart city systems
by connecting various sensors and management systems to efficiently manage assets
and resources. For example, smart traffic lights equipped with weather monitoring
sensors can adjust brightness based on weather conditions. Sensors collecting traffic
volume data can send this information to traffic management departments to prevent
congestion. Smart parking places sensors in each parking space to collect availability
data in advance, displaying available spaces at the entrance as vehicles enter. Sensors
placed throughout roads can immediately report accidents to traffic management
departments for automatic emergency response.
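The smart-parking example reduces to counting per-space occupancy sensors; a minimal sketch (the space IDs and states are invented for illustration):

```python
def available_spaces(occupancy):
    """occupancy: mapping of parking-space id -> True if its sensor
    reports the space as occupied. Returns the count of free spaces
    shown on the display at the lot entrance."""
    return sum(1 for occupied in occupancy.values() if not occupied)

lot = {"A1": True, "A2": False, "A3": False, "B1": True}
display = available_spaces(lot)   # the entrance sign shows 2 free spaces
```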
Wearable technologies that measure various health indicators such as heart rate,
step count, sleep quality, and body temperature by attaching sensors to the user’s
body are also an extension of IoT. These health wearables typically use specialized
sensors to measure physical and health states, such as glucose monitors for diabetes.
The measured data is transmitted to other devices, medical institutions, or cloud
computing via Bluetooth, WiFi, mobile communications, etc. Information security
is crucial as it involves sensitive personal health information. Health wearable tech-
nology is rapidly evolving to improve sensor accuracy, user interface, battery life,
and the aesthetics of sensor devices.

4.3 Cloud Computing, Edge Computing

Cloud computing is a technology that allows users to access data storage and computing power provided by remote data centers via the internet. Cloud services save users money: the provider centrally installs the commonly needed storage and computing resources in the network, and users simply access these shared resources.
The advantages of using cloud computing are numerous. Since users receive
computing services such as memory storage, computational power, databases,
networking, and software from cloud service providers via the communication
network, there is no need for users to individually acquire such computing resources.
Instead of owning and managing physical hardware and infrastructure, individuals or
businesses can store and process data using the storage, computers, and application
software housed in the cloud servers of the service providers. Users can connect to
the provider’s cloud servers via the internet and receive cloud computing services
on-demand, choosing as much as they need. For example, users can rent only hard-
ware resources, or hardware plus operating software, and even application soft-
ware. Renting hardware resources is called IaaS (Infrastructure as a Service), adding
operating software resources is called PaaS (Platform as a Service), and including application software is called SaaS (Software as a Service).

[Fig. 4.1 IoT and edge/cloud computing: IoT devices connect over the internet to edge computing and, beyond it, to cloud computing; both edge and cloud provide memory, computing, networking, database, and software resources.]
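The three service models differ only in how many layers of the stack the provider manages; this can be sketched as data (the layer names are simplified for illustration):

```python
# Stack layers from bottom (hardware) to top (application).
LAYERS = ["hardware", "operating software", "application software"]

SERVICE_MODELS = {
    "IaaS": 1,  # provider manages the hardware only
    "PaaS": 2,  # hardware plus operating software
    "SaaS": 3,  # everything, including the application
}

def provider_managed(model: str):
    """Return the layers the cloud provider manages under a model;
    the user is responsible for everything above them."""
    return LAYERS[: SERVICE_MODELS[model]]

paas_layers = provider_managed("PaaS")
```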
Cloud computing offers various benefits. First, cloud storage services provide an
economical and scalable solution for data storage and backup. Second, individuals
or businesses do not need to install dedicated servers or download applications,
minimizing equipment investment and only paying for the resources used. Third,
cloud resources can be quickly expanded or reduced as needed, and large amounts
of data can be processed and analyzed with available tools and services. Fourth,
cloud-based tools enable remote collaboration and document sharing. Fifth, users
can access cloud servers and receive services from anywhere in the world with an
internet connection.
Edge computing is a technology that processes and analyzes data near the user’s
devices or sensors instead of sending it to a central data center. That is, it processes
data at the edge, close to the data-generating resources such as sensors, IoT devices,
smartphones, and industrial machines. This concept has grown in importance with
the increase in sensors and IoT devices. Edge computing delivers computing services
as close to the edge as possible and reduces the workload for cloud computing by
minimizing the data sent to the cloud (refer to Fig. 4.1).
Thus, edge computing has several advantages, including shorter latency, effi-
cient use of communication bandwidth, offline operation capability, and real-time
decision-making. First, processing data at the edge reduces the time delay in sending
data to the central cloud server and receiving a response, making it suitable for appli-
cations requiring fast or near-real-time responses. Second, as most data is processed
at the edge and only essential data is sent to the cloud, the use of communica-
tion bandwidth decreases. This reduces the load on the network infrastructure and
enhances security against risks during transmission. Third, edge computing devices
can operate independently, allowing data processing to continue even if the connec-
tion to the cloud is lost. Fourth, decisions can be made in real-time at the edge
without waiting for the cloud, making it applicable to real-time traffic management
and control in smart cities, and enabling vehicles to make immediate decisions based
on sensor data in autonomous driving.
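The bandwidth-saving pattern in the second point can be sketched as: process raw readings at the edge and forward only a compact summary, plus any anomalous values, to the cloud (the summary fields and threshold are invented for illustration):

```python
def edge_summarize(samples, alert_threshold):
    """Process raw sensor samples locally at the edge; forward only
    a small summary and any readings exceeding the alert threshold,
    instead of the full raw stream."""
    return {
        "count": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": sum(samples) / len(samples),
        "alerts": [s for s in samples if s > alert_threshold],
    }

raw = [20.1, 20.3, 20.2, 35.7, 20.0]              # 5 raw readings
to_cloud = edge_summarize(raw, alert_threshold=30.0)
# Only the summary (and the one anomalous reading) leaves the edge.
```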
Comparing cloud computing with edge computing, cloud computing offers advan-
tages in scalability, accessibility, and centralized management, while edge computing
offers advantages in lower latency, real-time processing, and reduced communication bandwidth. Edge computing is used in conjunction with cloud computing, so it
is more appropriate to view it not as an alternative technology to cloud computing
but as an extension of cloud computing for optimizing data processing.

4.4 Digital Virtual Spaces

In the digital environment, technologies that provide users with immersive experi-
ences include augmented reality (AR), virtual reality (VR), and the Metaverse.5 These
utilize advanced computing technologies and user interfaces, including 3D graphics,
real-time rendering,6 and artificial intelligence, to provide interactive experiences.
However, they differ significantly: AR overlays digital information onto the real world using smartphones or AR glasses, VR creates a completely immersive virtual world by blocking out the real world, and the Metaverse creates a shared virtual space that fuses the physical reality of AR with the digital reality of VR.

4.4.1 Augmented Reality (AR)

Augmented reality (AR) is a technology that overlays digital information on the real world seen on a screen, helping users better perceive reality. It adds digital
information like images, videos, and 3D models to the user’s real environment in
real-time. Unlike VR, which immerses users in a virtual world, AR combines the
virtual (digital) world with the real world. AR must meet three conditions to function
correctly: it must combine the physical real world with the digital world, allow real-
time interaction between the real and digital worlds, and accurately identify the real
and digital worlds in three dimensions.
Devices used for AR include smartphones, tablets, smart glasses, and headsets.
The five key components needed for AR services are AI, AR software, AR data
processing, lenses, and sensors. First, AI interprets and processes data collected
by sensors and cameras, facilitating pattern recognition, spatial relationships under-
standing, and user interaction. Second, AR software renders digital images, processes
user inputs, and manages interactions between the digital and physical worlds. Third,
AR data processing involves processing signals captured by sensors, running AI algo-
rithms, and rendering AR content in real-time. Fourth, lenses display digital images

5 With the widespread use of virtual reality technology, concepts such as mixed reality and extended
reality have emerged. Mixed Reality (MR) mixes AR and VR, while Extended Reality (XR) encom-
passes AR, VR, and MR. For example, Meta’s Oculus Quest 3 headset and Apple’s VisionPro headset
are mixed reality devices.
6 Rendering, also known as image synthesis, is an essential process in computer graphics that uses additional information from a two-dimensional or three-dimensional scene file to create realistic photos or videos.

or information over the real world. Smartphones and tablets capture the real envi-
ronment with camera lenses, and AR software overlays digital information on that
scene. AR glasses or headsets use special lenses to project digital content into the
user’s field of view, naturally overlaying it with the real world. Fifth, sensors collect
information about the user’s environment and interactions within it. Commonly used
sensors include motion sensors (e.g., accelerometers, gyroscopes), environmental
sensors (to detect depth, brightness, etc.), and input sensors (touchscreens, voice
recognition microphones, eye-tracking sensors, etc.).
Applications of AR range from mobile apps and shopping to navigation, gaming,
and education. Mobile apps can provide additional information or visual enhance-
ments when the device’s camera is pointed at physical objects or locations. AR allows
users to try on clothes or see how furniture looks in their home before purchasing. AR
navigation apps overlay road information on the real world seen through the camera,
making it easier to explore unfamiliar areas. AR games integrate virtual elements
into the user’s physical environment, allowing actions like capturing virtual objects
during gameplay. AR can enhance learning experiences through interactive visual-
izations and simulations, such as showing complex molecular structures or historical
artifacts in three dimensions.

4.4.2 Virtual Reality (VR)

Virtual reality (VR) is an immersive digital environment technology that makes users
feel and act as if they are physically present in a separate virtual world. Users typically
use headsets or goggles that completely cover their field of vision to enter and interact
within a digital world, with the physical environment being blocked out. Users can
interact with the virtual world using hand controllers or motion sensors, not just
immersing in the VR world but actively engaging with it, distinguishing VR from
simulation.
VR requires a VR headset, a high-performance PC or game console, motion
tracking sensors, controllers, and audio output. To create an immersive interactive
experience, VR needs several components: a three-dimensional immersive environ-
ment, interactivity, and audio-visual synchronization. The three-dimensional immer-
sive environment allows users to experience interactions very similar to those in
physical spaces. Interactivity enables real-time interaction with the virtual environ-
ment and its objects. Accurate synchronization of VR’s visual and auditory elements
can create a realistic environment with spatial sound and high-quality graphics. The
VR user interface must be intuitively and flexibly integrated into the virtual envi-
ronment. Real-time rendering is necessary for VR content to immediately respond
to user actions and movements. Adding vibration or physical sensations through
special gloves or controllers can enhance immersion by allowing users to feel the
virtual environment or objects.
VR is applied in various fields, including gaming, training simulations, education,
therapy and rehabilitation, architectural visualization, and others. VR games immerse
players in the game world, allowing natural interactions with game characters. VR
provides realistic training simulations in aviation, medical, and military fields without
actual risks. It offers immersive educational experiences, letting students explore
historical events, scientific concepts, and virtual field trips. VR can enhance therapy
by immersing patients in virtual environments for pain management or physical
rehabilitation. It allows architects to visualize building designs, enabling clients to
experience buildings and spaces before construction.

4.4.3 Metaverse

The Metaverse is a compound word from "meta" (meaning transcendence or fiction) and "universe," referring to digital platforms where virtual and real worlds merge.
It is a digital platform where various virtual worlds and real environments are inter-
connected, allowing users to socialize, work, learn, enjoy entertainment, and engage
in economic activities in virtual spaces. The Metaverse breaks down the boundaries
between the physical world and digital virtual worlds, providing spaces for users to
interact and be active in virtual worlds. It includes elements of AR and VR, offering
interconnected spaces in virtual worlds for social interaction, work, economic activ-
ities, and entertainment. Users can access and interact in the Metaverse through AR
devices like smartphones and smart glasses and immersive VR headsets.
In the Metaverse, users represent their digital identities through virtual avatars,
expressing and acting through them. Users can engage in social activities with
others, conversing, collaborating, and exchanging in ways similar to the real world,
thereby experiencing social interactions. The Metaverse can establish its economic
ecosystem, including virtual real estate, digital goods and services trading, and
virtual currency. The Metaverse is applied in entertainment, education, training,
art, medicine, commerce, etc., where users can experience new things, work, and
collaborate in virtual worlds.
The Metaverse is notable for its potential for significant future development
as an interconnected and persistent virtual space offering economic and social
activity opportunities, integrating AR and VR. However, several challenges must
be addressed for the Metaverse to continue developing in the future, which include
privacy and data security, user safety and content management, hardware pricing and
user costs, legal and ethical issues, economic models and revenue generation, techno-
logical scalability, and social and psychological impacts. With users spending much
time in the Metaverse, concerns about personal information leakage and security are
significant. Collecting and using personal data, such as biometric data, poses privacy
risks. Protecting users from harassment, cyberbullying, and inappropriate content is
a crucial challenge in an environment where many people interact in various ways.
The use of advanced hardware in the Metaverse could create entry barriers due to high
costs. As the Metaverse is a new territory, it faces legal and ethical challenges such
as intellectual property rights, the legality of virtual actions, and the ethics of interac-
tions. For the Metaverse to become a sustainable economic model, it needs to address
various economic challenges including compensating creators, taxing digital goods and services, securing financial transactions, and exchanging cryptocurrencies. As
the number of users increases, it should be able to address further challenges as well,
including scaling the Metaverse’s underlying technology and ensuring performance.
In addition, in-depth research is needed on the social and psychological effects of
long-term exposure to virtual environments on individuals.
The Metaverse is evolving with the help of generative AI, allowing users to create
their Metaverse using various generative AI tools, generate 3D images or convert 2D
images to 3D, and attempt natural interactions between reality and virtual spaces.
Further, generative AI opens up innovative business models in avatars and intellectual
property, anticipating new possibilities.

4.5 Digital Twin

A digital twin is a technology that creates a digital replica of a real-world object using
a computer and simulates situations that may occur in reality to predict outcomes and
find solutions to problems. Data and information that represent the structure, context,
and operation of physical systems in the real world are input into the digital twin
on the computer, and by running simulations, the past and present operational states
can be understood, and the future can be predicted. The digital twin is a powerful
digital entity that can be used to optimize the physical world, and by using digital
twins, the operational performance of real-world objects and business processes can
be significantly improved.7
The core technologies that make up digital twins include data analytics, the IoT,
simulation, AI, and cloud computing. Digital twins predict results and find solutions
by analyzing real-time data collected from their physical counterparts. To collect real-
time data, sensors and IoT devices are attached to the physical counterpart, providing
a continuous stream of data. Advanced simulation and modeling techniques replicate
the characteristics and behaviors of the physical counterpart in the digital domain.
Digital twins process data to identify patterns and anomalies and use AI algorithms
to predict behaviors. When necessary, digital twins can utilize cloud infrastructure
to secure storage, processing capacity, and accessibility for the digital counterpart.
Specifically, digital twins are built through the following steps: First, data related
to the physical entity to be replicated is collected from various sources, including
sensors, IoT devices, historical records, and measurements. Second, the collected data
is integrated to create a comprehensive dataset representing the behavior, characteris-
tics, and state of the physical entity. Third, a computational model that simulates the

7 The term “digital twin” has been used since the 1960s by NASA for remotely operating and
maintaining systems and was used by Michael Grieves in 2003 as a tool to optimize the entire
lifecycle management of products in industrial environments. Although the technology at the time
could not fully implement digital twins due to the need for extensive data storage and processing
devices, advancements in technology over the past 20 years have led to its widespread practical use
in various industries today.

behavior and properties of the physical object is developed. Fourth, a user-friendly interface is created for users to interact with and visualize the digital twin. Fifth, a
connection is established between the digital twin and real-time data sources. Sixth,
analytics and monitoring functions that allow for real-time data analysis and insights
are implemented. Seventh, insights obtained from the digital twin are used to make
decisions, optimize operations, and improve the performance of the physical entity.
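Steps five through seven — the live data connection, monitoring, and feeding insights back to the physical entity — can be sketched minimally as follows (the temperature model and threshold are invented for illustration):

```python
class DigitalTwin:
    """Minimal digital twin: mirrors a machine's state from a live
    sensor feed and flags anomalies against a simple model."""
    def __init__(self, max_temp):
        self.max_temp = max_temp   # simple 'model' of safe behavior
        self.history = []

    def ingest(self, reading):
        """Step 5: the real-time data connection updates twin state."""
        self.history.append(reading)

    def monitor(self):
        """Step 6: analyze the mirrored state for anomalies."""
        return [r for r in self.history if r > self.max_temp]

    def recommend(self):
        """Step 7: turn insight into an action for the real machine."""
        return "reduce load" if self.monitor() else "normal operation"

twin = DigitalTwin(max_temp=90.0)
for reading in [70.2, 85.1, 95.3]:
    twin.ingest(reading)
action = twin.recommend()   # the 95.3 reading triggers "reduce load"
```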
Digital twin technology is applied in industries such as manufacturing, health-
care, smart cities, energy management, and aerospace. In manufacturing, digital twins
optimize manufacturing processes, monitor equipment status, and simulate produc-
tion scenarios. In healthcare, digital twins can be created for patients to monitor
health status and predict potential issues. Digital twins can simulate urban environ-
ments in smart cities, supporting city planning, resource management, and infras-
tructure development. Applying digital twins to energy systems can optimize energy
consumption, predict demand, and manage distribution networks. In the aerospace
industry, creating digital twins for aircraft engines and components allows for real-
time monitoring, maintenance, and performance optimization. In addition, digital
twin technology is applied in agriculture and food industry, construction and real
estate, environmental and resource management, product development and design,
education and training, and more.
Practically, applying digital twin technology in product production can improve
product quality through the following process: A digital twin of the production
process is created, control programs are installed on the process manager’s device,
and sensors are installed throughout the production and consumption process. Signals
from the sensors are connected in real-time to the digital twin in the manager’s device.
If a problem occurs during the production and consumption process, it is reported
in real-time to the digital twin program manager. The manager uses the digital twin
system to find the optimal solution, which is then communicated to the production
site and reflected in the production process. This method of production and manage-
ment reduces cost losses due to production process errors and improves product
quality, meeting consumer demands more effectively.
Digital twin technology faces several challenges, including data collection and
management, integration and compatibility, and real-time analysis and updates.
Collecting comprehensive data for complex systems is challenging, especially
managing large volumes of data from various sources in real-time. Integrating
digital twins with existing IT infrastructure and operational technology systems can
be complex, and compatibility with older legacy systems may be difficult. Digital
twins often contain critical data, making them targets for cyber-attacks, and in fields
like healthcare, concerns may arise about personal information protection and data
security. Reflecting the physical counterpart accurately in a dynamic environment
requires continuous data updates, and making timely decisions requires processing
large amounts of data in real-time. Developing and implementing digital twin tech-
nology requires significant initial investment, and maintaining and operating digital
twins requires substantial resources. Scaling small-scale digital twin prototypes to
organization-wide technology is a separate challenge. Managing and interpreting
data from digital twins requires significant expertise in data analysis, modeling, and
simulation, necessitating digital twin technology experts and dedicated team training
and education.

4.6 Big Data

With the advancement of internet and mobile technologies and the spread of
various digital platforms, humanity has formed a global hyperconnectivity network,
producing an amount of data that was previously unimaginable. Billions
of people worldwide are connected through the internet and mobile communica-
tions, generating vast amounts of data. The emergence of social media platforms
like Facebook, Twitter, and Instagram has attracted users globally, who exchange
messages, post various contents, and upload media, creating massive amounts of
data daily. The widespread adoption of smart home devices, wearable devices, and
various industrial sensors among IoT devices also generates a considerable amount
of data. Online shopping platforms produce extensive data on consumer behavior,
preferences, and purchase history through e-commerce, and digital banking and
online transactions also produce significant amounts of data. The amount of data
distributed through streaming platforms like Netflix and YouTube is enormous, and
the data users generate, share, and distribute themselves is overwhelming. Research
in human genome projects and life sciences generates vast amounts of data, and the
digitization of health records significantly increases the data pool in the medical field.
Such unprecedented increase in data volume poses a unique challenge in the digital
transformation era, leading to the emergence of the term “big data” and the field of
“big data analytics”.

4.6.1 Big Data Analytics

Big data refers to large volumes of data, but the real interest in big data lies in how
to collect, store, process, and analyze this vast amount of data to extract meaningful
patterns, correlations, trends, insights, and knowledge. It also involves utilizing this
analysis to gain a deeper understanding of various phenomena, make data-driven
decisions, and optimize business processes. Therefore, while commonly referred to
as big data, the core content is big data analytics. In other words, big data analytics
involves analyzing the essence hidden within vast amounts of data (big data). Big
data analytics is a multidisciplinary field requiring expertise in data science, statistics,
computer science, information technology, and related areas, developing alongside
the advancements in artificial intelligence, machine learning, and cloud computing.
Big data analytics involves various processing stages, including data prepro-
cessing, data integration, data storage, data analysis, visualization, and interpretation.
Big data analytics systems must be capable of adequately responding to various vari-
ables such as data volume, generation speed, diversity, reliability, variability, and
scalability across these multi-stage processing steps. First, the vast amount of gener-
ated data requires expandable and efficient infrastructure for storage, processing, and
analysis. Second, rapid data generation speeds necessitate corresponding real-time
data processing capabilities. Third, the diversity and inclusion of structured, semi-
structured, and unstructured data from various sources require complex integration
and analysis capabilities. Fourth, data from diverse sources may be incomplete,
inconsistent, or contain errors, necessitating the ability to handle such variability.
Advanced analytical capabilities are needed to extract reliable and valuable insights
even in such situations. Fifth, the infrastructure must be flexible enough to expand
processing capabilities as the load increases or data volume spikes.
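The second requirement, keeping up with the speed of data generation, is often met with one-pass (streaming) algorithms that update results per record instead of storing the full dataset. A minimal sketch, with hypothetical event values:

```python
# Illustrative sketch of real-time (streaming) processing: statistics are
# updated one record at a time in constant memory, so throughput scales
# with data velocity instead of requiring the full dataset to be stored.
# The event values below are hypothetical.

def running_stats(stream):
    """Yield (count, mean) after each record, using an incremental update."""
    count, mean = 0, 0.0
    for x in stream:
        count += 1
        mean += (x - mean) / count   # Welford-style incremental mean
        yield count, mean

events = [120, 80, 100, 140]         # e.g. response times arriving live
for count, mean in running_stats(events):
    print(f"after {count} events, mean = {mean:.1f}")
```

Production systems apply the same idea at scale with distributed stream-processing frameworks, but the constant-memory update is the core principle.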
Big data analytics systems must be equipped to handle the various data character-
istics mentioned above and legally manage personal information protection, security,
and ethical considerations. There is concern over privacy and security breaches during
the mass storage and processing of sensitive data. It is crucial to ensure that no ethical
issues or potential biases arise, especially concerning personal information handling
or general data analysis. Therefore, systems must be designed to strictly adhere to
data protection and ethical regulations.
Big data analytics involves several considerations. Data from various sources
lacks consistency and accuracy, requiring significant time and resources for data
cleansing and management. Storing and processing large volumes of big data requires
large storage systems and powerful computing capabilities, demanding substan-
tial investment. The demand for real-time data processing and analysis is growing,
but implementing this capability technically has its limits. Analyzing massive and
complex datasets to extract meaningful insights is a challenging task, requiring
multidisciplinary knowledge and comprehensive insight.
A successful example of utilizing big data analytics in business is Netflix. Netflix
uses big data analytics to analyze viewers’ behavior, preferences, and viewing
patterns, customizing content recommendations, producing or acquiring new content,
and making content easily discoverable for users. When expanding into global
markets, Netflix utilized big data analytics to understand regional preferences and
content consumption patterns, informing its content library and marketing strate-
gies in each region. Netflix’s data-driven decision-making, content recommendation,
content production approach, and global market expansion showcase how big data
analytics can lead to success in the entertainment industry.

4.6.2 Bioinformatics

Bioinformatics is the application of big data analytics in the fields of biology and
medicine. It is an interdisciplinary field that combines biology, computer science,
mathematics, and statistics to analyze and interpret biological data. As a key area
in modern biological research, bioinformatics uses computational techniques to
process, manipulate, and analyze vast amounts of biological information, such as
gene sequences and protein structures. Bioinformatics has contributed to advance-
ments in genetics, genomics, medicine, and biotechnology by extracting meaningful
insights and knowledge from biological data.
Bioinformatics encompasses various areas, including bio-data analysis and
genome sequencing analysis. First, it allows for the analysis of various bio-data,
such as identifying patterns in DNA sequences, predicting protein structures, and
analyzing gene expression profiles, using computer tools and techniques. Second,
it involves processing and interpreting large-scale genome sequencing data to iden-
tify genes, regulatory elements, and functional regions within the genome. Third,
applied in proteomics, it predicts protein structures, studies protein interactions, and
analyzes changes due to post-translational modifications. Fourth, in drug discovery, it
analyzes molecular structures and predicts potential drug candidates. Fifth, it helps
understand gene functions within the context of organisms and compares genome
sequences across different species to elucidate evolutionary relationships.
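The first area above, identifying patterns in DNA sequences, reduces at its core to string analysis. A toy sketch (real pipelines use dedicated, indexed search tools; the sequence and motif here are made up for illustration):

```python
# Toy sketch of the DNA pattern analysis mentioned above, reduced to its
# core: string analysis. Real pipelines rely on dedicated tools and indexed
# search; the sequence and motif here are made up for illustration.

def gc_content(seq):
    """Fraction of G/C bases, a common first statistic for a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def find_motif(seq, motif):
    """Return every start position of a motif within a sequence."""
    positions, start = [], seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions

dna = "ATGCGCGATATGCGC"
print(gc_content(dna))            # 0.6
print(find_motif(dna, "ATG"))     # [0, 9]
```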
One of the most notable successes made possible by bioinformatics is the Human
Genome Project. This large-scale project aimed to map and analyze the entire human
genome (the complete set of human genetic information encoded in DNA) from
1990 to 2003. Bioinformatics played a crucial role in processing and analyzing the
massive DNA sequence data generated during the project. It also played a significant
role in organizing and managing the vast amount of data generated during genome
sequence analysis. Furthermore, bioinformatics helped compare the human genome
with those of other species to elucidate evolutionary relationships, identify genes
and regions, and predict the functions and potential roles of identified genes. The
successful completion of the Human Genome Project is a significant achievement in
biology and medicine, supported by bioinformatics.
During the recent coronavirus pandemic, Pfizer-BioNTech’s rapid development
of the COVID-19 vaccine was due to collaboration among various scientific fields,
including molecular biology, virology, immunology, vaccine development, and accu-
mulated knowledge from past vaccine research, with bioinformatics playing a crucial
role. Bioinformatics analyzed the coronavirus’s genetic sequence, designed the
antigen part of the virus, analyzed large-scale clinical trial data, and tracked virus
mutations. Since the coronavirus crisis, bioinformatics has been more widely used in
drug discovery, gene expression analysis, and personalized medicine, increasingly
drawing on machine learning and AI techniques.

4.7 Cybersecurity Solutions

As the digital transformation unfolds, cybersecurity has come to the forefront.
With all information being processed and stored digitally after the digital
conversion, it is crucial to protect digital assets and sensitive data and ensure the
integrity of digital services. Various cybersecurity technologies, including firewalls,
intrusion detection and prevention, encryption, authentication, and access control,
can protect data and systems and safeguard businesses in the digital environment.
A well-known example that underscores the importance of cybersecurity is the
data breach at Equifax, a credit reporting agency in the United States, in 2017.
This incident is one of the most notable failures attributable to inadequate cyber-
security measures. Personal and financial information of approximately 150 million
U.S., U.K., and Canadian citizens was exposed, making it one of the largest
identity-theft-related breaches on record. The incident
occurred because Equifax did not promptly address security vulnerabilities. Despite
knowing about a vulnerability in their web application and the existence of a security
patch, Equifax failed to take timely action. This vulnerability allowed hackers to
infiltrate and access customers’ personal information and credit card numbers without
authorization. This incident served as a wake-up call about the importance of cybersecurity,
security vulnerability management, and timely security updates.
Malware refers to all types of malicious software designed to harm or exploit
programmable digital devices, services, or networks or to use them to inflict harm.
Malware includes viruses, worms, spyware, adware, ransomware, Trojans, rootkits,
etc., and they operate as follows. First, a virus attaches to a clean file to infect
other clean files, damaging system core functions, and can delete or corrupt files.
Second, a worm operates independently, replicating itself through networks to other
computers, consuming system memory or network bandwidth, and causing web
servers, networks, or individual computers to become unresponsive. Third, spyware
is designed to monitor users’ activities without their knowledge, collecting personal
information, internet usage data, and other sensitive information. Fourth, adware
often comes with free software, tracking browsing behavior and displaying unwanted
ads. Fifth, ransomware encrypts user data and demands a ransom for the decryption
key, posing a significant threat to both individual users and businesses by causing data
loss and financial damage. Sixth, Trojans disguise themselves as legitimate software,
tricking users into installing and running them on their systems, and upon activation,
perform pre-designed tasks such as theft, damage, or disruption. Seventh, rootkits
are designed to gain unauthorized access to a computer’s highest level (root level)
without detection, often used to hide the presence of other malware.
Cybersecurity technologies for defending against malware include firewalls, intru-
sion detection and prevention systems (IDS/IPS), encryption, authentication, and
access control. First, firewalls act as barriers between trusted internal networks
and untrusted external networks, controlling incoming and outgoing network traffic.
Second, IDS/IPS monitor network traffic for suspicious activities and block or prevent
potential threats. Third, encryption converts sensitive data into code that can only be
decrypted with the corresponding decryption key. Fourth, authentication and access
control ensure that users must provide various forms of identification to access the
system, allowing access only to authorized individuals. Other security technologies
include vulnerability assessment, penetration testing, endpoint security, and security
information management.
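The principle behind the encryption and authentication techniques above can be illustrated with salted password hashing from Python's standard library: systems store only a salt and a slow hash, never the password itself, and verify logins by recomputing the hash. This is an illustrative sketch with reasonable but arbitrary parameters, not a complete authentication system:

```python
# Illustrative sketch of the authentication principle: only a salted, slow
# hash of the password is stored, and logins are verified by recomputing it.
# Standard-library only; parameters (iteration count, salt size) are
# illustrative choices, not a complete authentication system.
import hashlib
import hmac
import secrets

def hash_password(password: str):
    salt = secrets.token_bytes(16)                 # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest                            # store these, never the password

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison

salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))                   # False
```

The per-user salt defeats precomputed lookup tables, the high iteration count slows brute-force guessing, and the constant-time comparison avoids leaking information through timing.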
The targets of cybersecurity solutions are diverse, including networks, data,
clouds, IoT, mobile devices, etc., and their functions are as follows. First, using
firewalls, IDS/IPS, and encryption technologies can protect networks from unau-
thorized access, data breaches, and cyber-attacks. Second, applying encryption and
access control, along with secure storage practices, can ensure the confidentiality,
integrity, and availability of sensitive data. Third, applying encryption and access
control, along with continuous monitoring, can protect data and applications serviced
in cloud environments. Fourth, utilizing firewalls, IDS/IPS, encryption, and access
control can protect IoT devices from hacking and unauthorized access that could
severely impact critical infrastructure and personal information. Fifth, implementing
encryption, secure authentication methods, and app security measures can ensure
the security of mobile devices and transmitted data. Sixth, identifying vulnerabil-
ities in software applications and applying appropriate patches can defend against
hackers’ attacks. In addition, utilizing security information management, machine
learning, and artificial intelligence can enhance cybersecurity capabilities and allow
for real-time detection and response.
For individual users, the first target for protection against cyberthreats is personal
computers (PCs), for which it is necessary to adhere to the following ten precautions.
First, use antivirus and anti-malware software and regularly update them, activating
real-time scanning to detect and block threats. Second, regularly update the oper-
ating system and all software to address security vulnerabilities, setting up automatic
updates for essential software and operating systems. Third, use strong passwords
that mix letters, numbers, and special characters, avoiding the use of the same pass-
word across multiple accounts. Consider using a password manager to securely store
and manage passwords.8 Fourth, use the operating system’s built-in firewall or third-
party firewalls to monitor and control incoming and outgoing network traffic. Fifth,
be cautious of emails from unknown sources, especially those requesting personal
information or prompting to click on links, and familiarize yourself with various
phishing scam techniques. Sixth, use WPA3 or at least WPA2 encryption for WiFi
networks and set strong passwords for WiFi networks.9 Seventh, enable two-factor
authentication (2FA) for online accounts related to sensitive services such as email,
banking, and social media.10 Eighth, regularly back up important data to external

8 A password manager is a software application designed to store and organize passwords. It typi-
cally encrypts the password database with a master password, and later on, the user only needs to
remember the master password. Since a password manager stores passwords in an encrypted form,
it is less susceptible to hacking and theft.
9 Wi-Fi Protected Access 2 (WPA2) is an enhancement of the original WPA standard made in 2004
and has become the de facto security standard for Wi-Fi networks. It is widely used in both personal
and enterprise Wi-Fi networks. Wi-Fi Protected Access 3 (WPA3) is the latest version of the Wi-Fi
security protocol released in 2018, offering stronger security features than WPA2, though it is still
in the process of adoption.
10 Two-factor authentication (2FA) is a security method in which a user provides two different
authentication factors to verify themselves. This adds an additional layer of security (e.g., text
message, biometric factors) to the traditional single-factor authentication method, where the user
only provides one factor (typically a password), making it more difficult for attackers to gain access.
For example, the user enters a username and password as usual, followed by providing a second
authentication factor, such as entering a code sent to user’s phone or scanning user’s fingerprint.
drives or cloud storage, setting up automatic periodic backups. Ninth, avoid down-
loading software or opening attachments from untrusted sources, and if necessary,
pre-scan downloaded files and email attachments with antivirus software. Tenth,
lock your computer when not in use and set passwords for sensitive data, restricting
physical access to the computer to trusted individuals only.
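The third precaution, strong passwords mixing letters, numbers, and special characters, can be automated with Python's secrets module, which is designed for security-sensitive randomness. The special-character set and default length below are illustrative choices:

```python
# Sketch of the third precaution: generating a strong password that mixes
# letters, numbers, and special characters with Python's secrets module,
# which is designed for security-sensitive randomness. The special-character
# set and default length are illustrative choices.
import secrets
import string

SPECIALS = "!@#$%^&*"

def make_password(length: int = 16) -> str:
    alphabet = string.ascii_letters + string.digits + SPECIALS
    while True:
        pw = "".join(secrets.choice(alphabet) for _ in range(length))
        # retry until every required character class is present
        if (any(c.islower() for c in pw) and any(c.isupper() for c in pw)
                and any(c.isdigit() for c in pw) and any(c in SPECIALS for c in pw)):
            return pw

print(make_password())   # random each run, e.g. a 16-character mixed string
```

A password manager can then store the result, so the user never needs to memorize it.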

4.8 Robots, Autonomous Driving

Robots are physical or virtual machines designed to perform tasks autonomously
or non-autonomously. Robots range from industrial robots used in manufacturing to
autonomous vehicles, autonomous drones, and software programs that automate tasks
in computers. The term “robot” encompasses both physical machines and software
programs that perform automated tasks or interact with users online; the software
variety is referred to as
“bots.” Robots can interact with their environment and perform complex actions
based on programming and received data.

4.8.1 Robots

Robots, in their physical form, are machines or devices that mimic human actions
or perform automated tasks. Comprising sensors, actuators, and computing systems,
robots interact with the environment and perform tasks based on pre-programmed
instructions. Capable of independently performing various tasks as programmed,
robots can recognize their surroundings through sensors and move by rolling on
wheels, walking on legs, flying, or swimming, and manipulate objects using arms,
hands, or specialized tools. Robots are utilized in various fields and tasks, such as
assembly and processing in manufacturing and surgical assistance in healthcare.
Robotics, the field that deals with the development, design, manufacture, operation,
and use of robots, covers both hardware and software development of robots, as well
as their interaction with environments and applications in various fields.
Robots include not only stationary industrial robots but also mobile robots, intel-
ligent mobile robots, and humanoid robots. Mobile robots, designed to move around
in their environment using wheels, tracks, legs, or by flying like drones, are used for
exploration, surveillance, delivery, and search and rescue. Intelligent mobile robots,
equipped with high-performance sensing, perception, and decision-making capabili-
ties, can process data from their surroundings, make decisions, and adapt to changing
situations, with autonomous vehicles and drones being prime examples.
Robots are categorized into industrial robots, service robots, exploration robots,
drones, etc., based on their applications. Industrial robots are installed in manu-
facturing settings to perform tasks like welding, painting, assembly, and handling
objects. Service robots perform tasks such as vacuum cleaning, lawn mowing, and
Exploration robots are employed in environments difficult for
humans, like deep-sea exploration, space missions, and hazardous waste handling. In
the future, robots are expected to be increasingly used in fields of medical and health
care such as surgery, rehabilitation, and patient care, in the fields of agriculture for
planting, fertilizing, spraying pesticides, harvesting, etc., in the fields of research in
environments inaccessible to humans, and in other fields.
Humanoid robots, unlike traditional robots, are designed to adapt to various work
environments and flexibly respond to unexpected situations. Humanoids typically
have a torso, head, two arms, and two legs, designed to resemble the human body,
allowing for natural interaction with humans and environments. Most humanoid
robots are bipedal, walking on two legs, and can perform complex motions such as
walking, rotating, climbing stairs, navigating uneven terrain, and even running or
performing complex movements in advanced models. They can manipulate objects
with hands and perform delicate tasks. Some humanoid robots can produce basic
facial expressions, facilitating natural human–robot interaction. Research is ongoing
to enhance humanoid robots’ capabilities with various sensors, vision recognition
cameras, balance-sensing gyroscopes, tactile sensors, and development of voice
recognition and synthesis abilities for human–robot interaction through language.
Efforts are being made to apply artificial intelligence to humanoid robots, enabling
them to learn, adapt, recognize, and make decisions independently.11

4.8.2 Robotic Process Automation (RPA)

Robotic process automation (RPA) uses software robots (i.e., bots) to automate busi-
ness processes, aiming to replace or assist human work with rule-based automation.
RPA plays a crucial role in enhancing process efficiency and reducing errors through
rule-based and repetitive task automation. RPA performs tasks following predefined
rules and automates work based on pre-set logic, allowing bots to handle tasks without
human intervention. By delegating routine and repetitive tasks to bots, organizations
can automate and optimize business processes, freeing humans for more critical tasks
and thus contributing to organizational competitiveness.
RPA offers benefits such as time savings, error reduction, system integration, and
process improvement. It can process tasks quickly and consistently, reducing the like-
lihood of human error and increasing work efficiency. RPA automates data and work-
flows between different systems and applications, facilitating system integration.
Thus, RPA can improve work efficiency and reduce costs through process automa-
tion. RPA is applied in various areas, including financial operations, accounting tasks,
data entry, customer support, human resources management, and maintenance work.
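Rule-based automation of the kind RPA performs can be sketched as an ordered list of condition/action pairs applied to each record without human intervention. The invoice fields, thresholds, and action names below are hypothetical:

```python
# Minimal sketch of rule-based RPA: a software bot applies predefined
# condition/action rules to each record without human intervention. The
# invoice fields, thresholds, and action names are hypothetical.

RULES = [
    (lambda inv: inv["amount"] > 10_000,   "escalate_to_manager"),
    (lambda inv: not inv.get("po_number"), "request_po_number"),
    (lambda inv: True,                     "auto_approve"),   # default rule
]

def process_invoice(invoice):
    """Apply the first matching rule, like a clerk following a checklist."""
    for condition, action in RULES:
        if condition(invoice):
            return action

print(process_invoice({"amount": 50_000, "po_number": "PO-1"}))  # escalate_to_manager
print(process_invoice({"amount": 200}))                          # request_po_number
print(process_invoice({"amount": 200, "po_number": "PO-2"}))     # auto_approve
```

This also makes RPA's limits visible: the bot handles only cases its rules anticipate, which is why tasks requiring judgment or creativity remain with humans.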

11 Recently, humanoid robot technology has made leaps with the incorporation of AI. Tesla unveiled
the 173 cm tall, 73 kg humanoid robot Optimus in 2021 and its second generation, Bumblebee,
in December 2023, equipped with AI. Various AI robots were displayed at CES in January 2024.
Goldman Sachs estimates the humanoid robot market to reach $154 billion by 2035, predicting that
humanoids will fill labor shortages in manufacturing and service industries.
RPA is widely applied due to its advantages in automation, operational efficiency
enhancement, and cost reduction. However, several challenges need to be addressed
for its full and sustainable utilization. Although RPA is effective for structured,
rule-based tasks, it struggles with complex tasks requiring judgment, creativity, and
human interaction. While implementing RPA for small tasks is relatively straightfor-
ward, expanding it to organizational-wide processes is not simple. In addition, it is
challenging to adapt RPA bots to changes in business processes or IT infrastructure
and optimize them for increased loads. Data security, privacy, and compliance with
industry regulations are also important considerations. It is crucial to ensure RPA
bots have appropriate access to sensitive data while preventing unauthorized access,
especially in fields like finance and healthcare, where compliance with industry regu-
lations is mandatory. Errors in RPA setup or process design can propagate through
automated tasks, necessitating continuous monitoring and quality control. Mean-
while, it is important to note that if companies focus solely on short-term cost
savings through RPA, they may overlook other important aspects. For instance, as
RPA increases, the number of jobs in the sector may decrease, leading to resistance
from employees, and the long-term strategic planning for digital transformation may
be neglected.

4.8.3 Autonomous Vehicles

Autonomous driving refers to the technology that allows vehicles to operate
autonomously without a driver, using sensors, cameras, radar, LiDAR (Light
Detection and Ranging), and GPS to perceive the surrounding environ-
ment. Various sensors collect data about the environment, and the vehicle’s onboard
computer creates detailed maps of the surroundings and makes real-time decisions
using complex algorithms. Autonomous vehicles (AVs) and autonomous drones
(ADs) are prominent examples of autonomous driving technology application.
In autonomous vehicles, sensors and cameras recognize lanes, traffic lights, other
vehicles, pedestrians, and various road obstacles. Radar systems use radio waves to
detect objects and measure their speed and distance, while LiDAR uses laser pulses
to create high-resolution 3D maps of the vehicle’s surroundings. In addition, GPS and
navigation systems assist in determining the vehicle’s location and planning routes.
The vehicle’s computer system processes all this data to make driving decisions
such as acceleration, braking, and steering. The decision-making can improve over
time by adopting AI and machine learning technologies. Autonomous vehicles are
equipped with V2X (Vehicle-to-Everything) communication technology, enabling
communication with infrastructure and pedestrians for safe operation.
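The sense-decide-act loop described above can be caricatured with a rule-based controller. Production systems rely on learned models and far richer state; the braking-distance formula and thresholds below are simplifying assumptions for illustration only:

```python
# Caricature of the decision step above: fused perception inputs are turned
# into a driving command. Production systems use learned models and far
# richer state; the braking-distance formula and thresholds are simplifying
# assumptions for illustration only.

def decide(obstacle_distance_m, ego_speed_mps, light):
    """Return a driving command from simplified perception inputs."""
    # distance needed to stop, assuming ~6 m/s^2 of braking deceleration
    stopping_distance = ego_speed_mps ** 2 / (2 * 6.0)
    if light == "red" or obstacle_distance_m < stopping_distance:
        return "brake"
    if obstacle_distance_m < 2 * stopping_distance:
        return "coast"     # close the gap gently
    return "accelerate"

print(decide(obstacle_distance_m=15,  ego_speed_mps=20, light="green"))  # brake
print(decide(obstacle_distance_m=50,  ego_speed_mps=20, light="green"))  # coast
print(decide(obstacle_distance_m=100, ego_speed_mps=20, light="green"))  # accelerate
```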
Autonomous vehicles offer several advantages. They can reduce traffic accidents
caused by human error, making them suitable for long-distance transport trucks and
taxi services. Autonomous vehicles provide mobility for the elderly and disabled,
who may not be able to drive themselves and optimize driving in terms of route, speed,
and fuel consumption. Autonomous driving technology has advanced significantly
in sensors, AI algorithms, and machine learning, with pilot programs underway in
cities worldwide. Governments are establishing regulations to support autonomous
driving.
Despite advancements, autonomous vehicles still face challenges before
widespread adoption. Safety and reliability are critical concerns. Autonomous driving
systems may not guarantee safety in bad weather, unexpected road conditions, sudden
obstacles, or unpredictable behavior from other drivers. Comprehensive testing
systems need to be developed to verify the safety and reliability of autonomous
driving systems under various conditions. Integrating autonomous driving into
existing traffic systems and implementing V2X requires expanding traffic system
infrastructure. Ethical decision-making in unavoidable accident situations and deter-
mining liability in accidents are complex challenges that require collaborative efforts
among developers, regulatory bodies, and the public, focusing on safety, reliability,
ethical considerations, and seamless integration with existing traffic ecosystems.
In practice, vehicle autonomy is commonly graded into levels (the SAE J3016 standard
defines Levels 0 through 5, with Level 0 meaning no automation) as follows: Level 1 autonomy, which includes semi-autonomous features like basic
cruise control and lane departure warning, is widespread in modern vehicles. Level
2 autonomy, where the vehicle can assist with both steering and acceleration/
deceleration simultaneously, is also commercially available in many vehicles today.
Level 3 autonomy allows the vehicle to handle most driving tasks in specific condi-
tions (e.g., highways), and the driver can disengage from actively driving in those
conditions. However, the driver must be available to take control when requested.
Level 3 autonomy is currently available in limited forms, such as Mercedes-Benz’s
Drive Pilot, in certain regions like Germany and Nevada. Level 4 autonomy enables
a vehicle to drive itself without human intervention, but only in specific, controlled
environments (e.g., designated city zones or geofenced areas). This level is being
actively tested worldwide and is available in some cities, such as Beijing and Shen-
zhen, where companies like Baidu and Pony.ai offer limited autonomous taxi services.
Level 5 autonomy, where vehicles are fully autonomous with no need for human
intervention and capable of handling all environments and conditions, is still in the
development and testing stages.

4.8.4 Autonomous Drones

Autonomous drones are unmanned aerial vehicles that use various sensors, cameras,
GPS, and AI-based systems to detect the environment and fly independently along
pre-programmed paths. Sensors and cameras identify and detect surrounding objects
and obstacles, while GPS and navigation systems determine the drone’s location and
enable it to follow a set route. The onboard computer controls the drone’s flight path,
speed, and altitude, and AI and machine learning technologies allow it to adapt to
unexpected situations.
Significant technological advancements have been made in autonomous drone
technology due to competitive research and development by various companies.
Autonomous drones now feature sophisticated navigation systems, including GPS,
computer vision systems, obstacle detection and avoidance sensors, and enhanced
decision-making capabilities through AI technology and machine learning. Advance-
ments in battery technology have extended flight times and ranges, and improve-
ments in communication technology have strengthened remote control and data
transmission capabilities in challenging environments.
Autonomous drones have diverse applications. They can take precision
photographs of hard-to-reach places, film movies, and use images for terrain surveys,
environmental monitoring, wildlife tracking, environmental change tracking, and
security surveillance. They can inspect infrastructure such as pipelines, power lines,
and wind turbines, especially in inaccessible or dangerous locations, and perform
search and rescue operations. Autonomous drones can deliver goods and medical
supplies quickly to densely populated urban areas or remote mountain regions. In
agriculture, they can spray crops with pesticides and insecticides, sow seeds, and
monitor crop conditions. In natural disasters, they play a crucial role in search and
rescue operations. Militarily, they can penetrate enemy lines without risking pilots’
lives for various operations.12
While autonomous drones have made significant technological progress and are
increasingly used in various fields, they still face regulatory, technical, and social
challenges. The challenges include standardized regulatory frameworks for drone
operation, interoperability standards for integration with transportation and logistics
ecosystems, and safe operation in regulated airspace to avoid collisions with other
aircraft and adverse weather conditions. Current levels of sensor technology, battery
life, and AI decision-making algorithms have limitations for long-duration missions
in complex environments. Protecting autonomous drones from cyber-attacks and
ensuring the security of collected data and privacy are important challenges. From
the public’s perspective, safety and noise pollution issues are crucial concerns. The
application areas for autonomous drones are expected to expand in the future, with
advancements in machine learning and AI technology, improved battery technology,
further developed wireless communications, and robust collision avoidance systems.
In addition, the autonomous drone industry is projected to become more vibrant with
institutional developments such as expanded operational spaces for drones, estab-
lished interoperability standards, and well-established ethical and legal frameworks
for sensitive areas.
In practice, there is no universally adopted standard for drone autonomy levels,
but the concept closely mirrors that of autonomous vehicles, with increasing levels of
autonomy from manual to fully autonomous operation. Level 1 autonomy is Assisted
Control, where the drone provides basic assistance, such as automated stabilization
or altitude hold. Level 2 autonomy is Partial Autonomy, where the drone can perform
specific tasks autonomously, such as following a predefined flight path or returning to
its launch point, but still requires human oversight. Level 3 autonomy is Conditional
Autonomy, where the drone can fly autonomously and make decisions in controlled
environments (e.g., predefined airspace), but may still require human intervention
in complex situations. Level 4 autonomy is High Autonomy, where the drone can
operate without human intervention in specific, controlled environments. It is capable
of navigating obstacles, making decisions based on sensor data, and adjusting to envi-
ronmental changes. Level 5 autonomy is Full Autonomy, where the drone is capable
of operating entirely autonomously in all environments and conditions, handling
complex tasks such as navigation, obstacle avoidance, and decision-making without
human oversight. Compared to autonomous vehicles, autonomous drones face unique
challenges, including airspace regulations, dynamic weather conditions, and the need
for advanced sense-and-avoid systems to ensure safe flight.

12 The Russia-Ukraine war that erupted in 2022 has significantly accelerated the militarization of
drones. In this conflict, drones have been utilized for surveillance, target acquisition, and direct
attacks, among other military purposes. Due to their capability for remote operation, drones offer
tactical advantages such as risk-free intelligence gathering and precision strikes. With the ongoing
war serving as a turning point, drones are expected to be extensively used in future military strategies.
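The five-level taxonomy just described can be sketched as a small enumeration. This is an illustration only; the class and function names are our own, since, as noted, there is no universally adopted standard for drone autonomy levels.

```python
from enum import IntEnum

class DroneAutonomy(IntEnum):
    """Illustrative encoding of the five drone autonomy levels."""
    ASSISTED_CONTROL = 1      # stabilization, altitude hold
    PARTIAL_AUTONOMY = 2      # predefined paths, return-to-launch
    CONDITIONAL_AUTONOMY = 3  # autonomous in controlled airspace
    HIGH_AUTONOMY = 4         # no intervention in specific environments
    FULL_AUTONOMY = 5         # fully autonomous everywhere

def requires_human_oversight(level: DroneAutonomy) -> bool:
    # Levels 1 through 3 still keep a human in (or on) the loop.
    return level <= DroneAutonomy.CONDITIONAL_AUTONOMY
```

The helper makes the key dividing line explicit: only at Levels 4 and 5 does the human drop out of routine operation.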

4.9 Decentralized/Distributed Technology

Humans inherently seek freedom and autonomy and are wary of centralized control.
This aspect has manifested technologically through decentralized and distributed
technology. While centralization offers the benefits of efficiency and order, there has
been a movement toward decentralization and distribution to secure freedom and
autonomy, even at the expense of those benefits. This trend has led from centrally
controlled infrastructure networks to decentralized ad-hoc networks,13 from central-
ized financial systems to the invention of decentralized blockchain cryptocurren-
cies, and from the corporate-controlled Web 2.0 to the emerging decentralized
Web 3.0. This evolution toward decentralized and distributed networks essentially
reflects human nature’s orientation toward autonomy, transparency, and a cooperative
community, technologically.

4.9.1 Blockchain

Blockchain is a technology developed to enable peer-to-peer transactions without
the intervention of central authorities. Traditional transaction methods involve banks
as intermediaries for transfers, recording each transaction in a ledger. However, with
blockchain, transactions can be conducted directly between individuals, bypassing
banks. Blockchain operates on a decentralized system where, instead of keeping the
ledger centrally, every participant maintains and manages an identical ledger. Transactions
are recorded in blocks, and these blocks are chained together in chronological
order of transactions, hence the name “blockchain.”

13 A wireless computer network built on the foundation of physical communication infrastructure is
called an infrastructure network, while a wireless network where computers communicate directly
with each other autonomously, without the help of communication infrastructure, is called an ad
hoc network.
The most significant feature of blockchain is its decentralization. Unlike tradi-
tional centralized systems, blockchain operates on a distributed network of computers
(nodes). Since each node holds a copy of the entire blockchain, transparency is guar-
anteed to all users, and the system remains operational even if some nodes encounter
problems. Before adding transaction records to the ledger, all nodes must agree on the
transaction’s validity through a consensus mechanism, a critically important process
in blockchain. The most common consensus mechanism, “Proof of Work” (PoW),
adopts the principle of selecting the longest chain of blocks, which makes it difficult
for malicious hackers to interfere.14 Moreover, since blockchain is composed
of a chain of blocks and transactions are grouped and encrypted into the next block,
once added to the blockchain, it becomes extremely difficult to alter the information
within a block, which ensures the security and integrity of recorded data. The combi-
nation of a distributed network and encryption technology provides high security,
and because the ledger is held by all participants, transactions are transparent.
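The chained-ledger structure just described can be illustrated with a minimal sketch. Real blockchains add consensus, digital signatures, and networking; the block fields here are simplified assumptions for illustration.

```python
import hashlib
import json
import time

def block_hash(block):
    # Hash the block's contents deterministically with SHA-256.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def new_block(prev, transactions):
    # Each block records the hash of its predecessor, forming the chain.
    return {
        "index": 0 if prev is None else prev["index"] + 1,
        "timestamp": time.time(),
        "transactions": transactions,
        "prev_hash": "0" * 64 if prev is None else block_hash(prev),
    }

def chain_is_valid(chain):
    # Tampering with an earlier block changes its hash and breaks every later link.
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))
```

Appending even one transaction to an already-chained block makes `chain_is_valid` return False, which is precisely the immutability property discussed above.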
Blockchain technology has the potential to be utilized across various industries
due to its ability to record and verify transactions in a secure, transparent, and efficient
manner. The most well-known application is cryptocurrency. Blockchain serves as
the technical foundation for various cryptocurrencies, including Bitcoin. It can also
enhance supply chain transparency and authenticity, simplify international payments,
improve financial transaction transparency, and streamline settlements. Blockchain
is used in smart contracts, which are automatically executed and enforced when
predefined conditions are met, and in identity management, providing secure and
decentralized control to reduce the risk of identity theft.

4.9.2 Cryptocurrency

Bitcoin, based on blockchain technology, is a decentralized digital currency. Bitcoin’s
total issuance is predetermined to be 21 million, with divisibility into smaller units.
Bitcoin transactions utilize blockchain technology, and Bitcoins can be acquired
through “mining”. In blockchain, participants solve cryptographic puzzles in the
Proof of Work (PoW) consensus mechanism to validate transactions and are rewarded
with Bitcoins, a process known as mining. The amount of Bitcoin awarded per block
decreases by half approximately every four years, an event known as the Bitcoin halving.

14 The consensus mechanism is critically important in blockchain. When a blockchain operates
publicly, there’s a significant risk that hackers with malicious intent could interfere with the
consensus process. To counteract such hacking attempts, the principle of choosing the longest
blockchain is adopted, together with making blocks difficult to create. This means requiring the
solution of complex cryptographic problems that demand extensive computation to produce a value
that cannot be derived easily. Selecting the longest chain implies trusting the result that has involved
the most computation. Such a consensus method, known as Proof of Work (PoW), makes it difficult
for hacking attempts to succeed due to the enormous amount of computation required. However,
PoW is problematic because its excessive computational demand leads to significant power
consumption. To address these issues, an alternative method called Proof of Stake (PoS) has been proposed.
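The PoW puzzle and the halving schedule can be sketched as follows. This is a simplified sketch: real Bitcoin hashes an 80-byte block header with double SHA-256 against a compact difficulty target, so the string payload and leading-zeros difficulty measure here are assumptions for illustration.

```python
import hashlib

def mine(block_data, difficulty=4):
    # Search for a nonce whose hash has `difficulty` leading zero hex digits;
    # finding it is expensive, verifying it takes one hash (Proof of Work).
    nonce = 0
    while True:
        digest = hashlib.sha256((block_data + str(nonce)).encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

def block_reward(height, initial=50.0, interval=210_000):
    # Bitcoin's reward halves every 210,000 blocks (roughly every four years).
    return initial / (2 ** (height // interval))
```

Raising `difficulty` by one multiplies the expected search work by sixteen, which is how the network keeps block creation hard while verification stays cheap.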
Bitcoin differs from fiat currency in several aspects. While opening a bank account
may have stringent requirements, account freezing, limited transaction hours, fees,
and restricted transaction countries, cryptocurrency transactions are free from these
constraints, allowing for anonymous transactions without personal information expo-
sure. Unlike fiat currencies, which can be affected by financial crises, inflation, panic,
or deflation due to indiscriminate issuance, cryptocurrencies are not subject to these
issues. However, Bitcoin lacks the physical asset basis for price determination, like
gold, which leads to price instability and unpredictability, making it unsuitable for
everyday transactions.
Bitcoin’s open-source nature allows for the creation of other cryptocurrencies,
known as alternative coins or altcoins, numbering in the thousands, with Ethereum
being a prominent example. Ethereum’s cryptocurrency, ether (ETH), uses the Proof
of Stake (PoS) consensus mechanism in its latest version, Ethereum 2.0, in place of
mining. PoS rewards validators who stake more ether, encouraging continuous
computer operation for validation. However, PoS’s drawback is that validators with
more ether have greater mining opportunities, leading to the proposal of alternative
mechanisms to address this issue.
Recently, China has been actively promoting Central Bank Digital Currency
(CBDC), which, although utilizing blockchain technology, differs from public
blockchains used in cryptocurrencies by allowing centralized control. CBDC aims
to ensure compliance with financial transaction regulations, monitoring, and anti-
money laundering by central banks. While offering some level of privacy like tradi-
tional banking, CBDC does not provide the anonymity and decentralization of cryp-
tocurrencies, potentially becoming a tool for monitoring citizens’ economic activities
if misused. Essentially, CBDC is not a typical cryptocurrency but an extension of
national currency into the digital realm, partially utilizing blockchain technology.

4.9.3 Web 3.0

The World Wide Web (WWW), initially known as Web 1.0, evolved into what we
currently use as Web 2.0. The next generation of the web, proposed as a decentralized
alternative to Web 2.0, is Web 3.0.
Web 1.0 was a “static web,” mainly composed of static web pages without dynamic
content or user-generated content. It was a one-way, read-only web where users
mainly consumed information provided by websites. The era of Web 1.0 was domi-
nated by text-based content, based on HTML (HyperText Markup Language) technology,15
limiting the amount and type of information and confining users to the role of
content consumers. Microsoft, a pioneer of the Web 1.0 era, significantly benefited
by bundling its Internet browser with the Windows operating system, revolutionizing
access to information.

15 HTML is a markup language developed for displaying web pages. A markup language is a type
of language that uses tags and other elements to specify the structure of documents or data.
Web 2.0, or the “social web,” transitioned from static Web 1.0 to a platform
featuring dynamic content and interactive user engagement. Characterized by bidi-
rectional interconnectivity and adopting new technologies like XML (eXtensible
Markup Language) and HTTP (HyperText Transfer Protocol),16 it opened the door
for users to actively participate in content production, sharing, and communication.
This transformation allowed for the continuous sharing and reproduction of content,
turning the web into a dynamic space. The advent of Web 2.0 platforms, utilizing
user data to attract advertisers, led to a platform-centric ecosystem. Companies like
Apple, Google, Amazon, and Meta (Facebook) quickly grew into platform giants by
securing a large volume of data ahead of others.
The emergence of Web 2.0 enabled a digital world where users not only consumed
information but also created and provided it. User-generated content became main-
stream, with social media platforms like YouTube, Facebook, and Instagram oper-
ating on content produced by users rather than the companies themselves, generating
substantial revenue by connecting advertisers with this content. However, users who
produced the data ceded their rights to the platform companies, missing out on both
revenue sharing and ownership of their content. Moreover, these
companies’ centralization of data raised issues with cybersecurity, privacy, and ethics.
Web 3.0 is presented as an alternative to address these issues.
Web 3.0 is a “distributed web” that aims for a decentralized, distributed network
and emphasizes user ownership and control of data. To achieve decentralization, it
adopts blockchain technology, uses smart contracts that enable transactions without
the need for trust, and applies AI technology for data processing and analysis. By
choosing a distributed approach, user data is stored across network nodes, or users’
computers, instead of being stored on the servers of platform companies. Since
blockchain technology is adopted as the method for implementing decentralization,
all the advantages of blockchain discussed earlier are carried over to Web 3.0. The use
of blockchain makes the web environment more transparent and secure, solving issues
related to privacy and targeted advertising. In Web 3.0, data ownership shifts from
corporations to individuals, and a reward system for data usage can be established.
In other words, users can be compensated for the efforts involved in generating
information.
Since Web 3.0 uses blockchain, it enables secure, transparent, and tamper-proof
record keeping and transactions, and allows for the use of decentralized finance
(DeFi) and smart contracts. DeFi, unlike traditional financial services, does not
require identity verification processes like certificates, and as long as there is an
internet connection, users can access a variety of financial services such as deposits,
payments, insurance, and investments. Smart contracts are self-executing contracts
with the terms of the agreement directly written into code. Once certain conditions
are met, the contract is automatically executed, enabling transactions without the
need for trust. Web 3.0 supports a digital economy where assets are tokenized (i.e.,
converted into digital tokens) and exchanged via the internet. This includes not only
cryptocurrencies but also the tokenization of real-world assets and the trading of
digital collectibles as non-fungible tokens (NFTs). In addition, due to the decen-
tralized nature of Web 3.0, users gain control over their own data, and through
encryption and decentralized consensus mechanisms, they can enhance privacy and
security. Web 3.0 allows for interaction across different blockchain networks and
decentralized applications (DApps).

16 XML is a markup language designed to facilitate the easy exchange of data between different
types of systems connected to the Internet. HTTP is a request/response protocol for exchanging
messages between clients and servers. The data transmitted via HTTP can be accessed through
internet addresses that start with http, known as URLs (Uniform Resource Locators).
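The idea of a self-executing agreement can be sketched in plain Python rather than a contract language such as Solidity. The escrow scenario and the `goods_delivered` field are hypothetical; a real smart contract would run on-chain, where no party can change the rule after deployment.

```python
def make_escrow(amount, release_condition):
    # A toy "smart contract": the release rule is fixed in code when the
    # contract is created, and settlement follows that rule automatically.
    state = {"funded": amount, "released": False}

    def settle(context):
        if state["released"]:
            return "already settled"
        if release_condition(context):
            state["released"] = True
            return "released {} to seller".format(state["funded"])
        return "condition not met; funds held"

    return settle

# Condition agreed up front: release the funds once delivery is confirmed.
settle = make_escrow(10.0, lambda ctx: ctx.get("goods_delivered", False))
```

Neither party decides whether to pay; the condition written into the contract does, which is the "transactions without the need for trust" property described above.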
However, since Web 3.0 is based on blockchain, it inherits both the advantages
and limitations of blockchain. First, scalability is an issue. As the network using Web
3.0 grows and the number of users increases, there may be problems with capacity
and processing speed. Second, complexity can also be a problem. Technologies like
blockchain and smart contracts are relatively complex for the average user to under-
stand and interact with, making the user experience less intuitive. Third, there is the
issue of interoperability. For Web 3.0 to fully realize its potential, various blockchain
networks and decentralized applications (DApps) must interact smoothly, but stan-
dardization between the various platforms and technologies that support this remains
a challenge. In addition, other pressing issues include cryptocurrency regulation, the
legal status of smart contracts, societal and institutional acceptance, challenges in
information protection and modification due to the permanence of blockchain data,
and the high costs of building and maintaining infrastructure. Nevertheless, as tech-
nologies like AI, IoT, and blockchain mature and integrate, and as societal awareness
of privacy and personal data rights increases, Web 3.0’s position as an alternative to
Web 2.0 is expected to strengthen.

4.10 3D/4D Printing

3D printing, or three-dimensional manufacturing, is a manufacturing technology
that creates 3D objects by layering materials from bottom to top. It begins with
scanning an existing object into a digital model, slicing the 3D model horizontally
into thin versions, and then layering and melting these layers together to complete the
manufacturing process. This method is often referred to as “additive manufacturing”
in contrast to the “subtractive manufacturing” of traditional methods, which carve
out the desired shape from raw materials.
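The slicing step just described (cutting the digital model into thin horizontal layers) can be sketched for the simple case of a solid of known height. The 0.2 mm default layer height is a common but assumed value, and a real slicer would also compute per-layer toolpaths.

```python
def slice_heights(object_height_mm, layer_mm=0.2):
    # Compute the z-plane of each printed layer; a slicer intersects the
    # 3D model with each of these planes, bottom to top.
    n_layers = max(1, round(object_height_mm / layer_mm))
    return [round((i + 1) * layer_mm, 6) for i in range(n_layers)]
```

For example, a 10 mm tall object at 0.2 mm layers yields 50 planes, and halving the layer height doubles the layer count, trading print time for surface resolution.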
3D printing offers several advantages. It allows for personalized designs and
complex structures, making it suitable for rapid prototyping and unique product
creation. It can produce objects with intricate internal and external structures that
would be difficult or impossible to manufacture traditionally, and it minimizes waste
by using only the necessary amount of material. However, there are limitations,
including a limited selection of printing materials, slower speeds for large objects,
and the need for post-processing.
A unique advantage of 3D printing is remote or distributed manufacturing. Designs
can be digitally transmitted, allowing identical objects to be produced remotely. This
eliminates the need for physical transportation of prototypes or products, saving
on shipping costs and time. It also allows for design customization and production
volume adjustment according to local demands.
3D printing has wide applications across various industries. In healthcare, it
can produce patient-specific implants, prosthetics, and hearing aids. The aerospace
industry uses it to create lightweight, fuel-efficient components for aircraft and space-
craft. Automotive manufacturers use it for prototyping, custom parts, and even entire
car models. Architects use large 3D printers for complex models and construction
components, while fashion designers and artists use it for unique accessories, jewelry,
clothing, and sculptures. 3D printing also supports STEM education, manufacturing
process training, production of limited-edition consumer goods, customized cake
decorations, tools, film props, and more.
4D printing adds the dimension of time to 3D printing, creating objects that can
change shape or function in response to environmental stimuli such as temperature,
humidity, light, or other factors. Unlike 3D printing’s static shapes, 4D printed objects
can adapt and transform, offering potential for innovation in engineering, materials
science, medicine, architecture, and more.
Both 3D and 4D printing technologies have transformed manufacturing, design,
and other fields, yet face technical challenges and limitations. The range of printable
materials is restricted, and some lack the strength or durability of traditional materials.
The resolution of 3D printing can affect product quality and precision, while finding
materials that change shape for 4D printing is challenging. The cost of printers and
materials can be high, especially for industrial machines, and 3D printing can be
slow and unsuitable for mass production. Operating and designing with 3D and
4D printers require specialized skills and training, and ethical considerations arise
with applications such as printing human tissues or organs. Future developments
are needed to overcome these limitations, improve printer speed and resolution, and
discover new materials and printing technologies.

4.11 Quantum Computer

Quantum computing is a new computing approach that utilizes the principles of
quantum mechanics to perform certain types of calculations more efficiently than
traditional computers. Just as traditional computers use bits as the basic data unit,
quantum computers use quantum bits or qubits. Qubits can exist in a state of super-
position, representing both 0 and 1 simultaneously, which gives quantum computers
unique advantages in performing specific computational tasks.
In quantum computing, the concepts of superposition, entanglement, and decoher-
ence are crucial. Unlike the bits in traditional computers, which can only be in a state
of 0 or 1, qubits in a quantum computer can be in a superposition of both states. This
superposition allows quantum computers to perform certain calculations in parallel,
exponentially speeding up computation. In addition, the states of two or more qubits
can be entangled, meaning they cannot be described independently. This entangle-
ment allows qubits to interconnect and interact in ways that enhance computational
power. Quantum systems are susceptible to errors due to quantum decoherence,
which occurs when they lose their quantum properties through interactions with the
external environment. These sensitive quantum properties make quantum computers
vulnerable to noise, errors, and quantum interference, indicating that significant time
and development are needed to achieve consistent and error-free computations.17
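Superposition and measurement can be demonstrated with a two-amplitude toy model of a single qubit. This is a sketch, not a physical simulation: decoherence and noise, the very problems just discussed, are ignored here.

```python
import math

def hadamard(state):
    # The Hadamard gate sends |0> to the equal superposition (|0> + |1>)/sqrt(2).
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))

def measure_probs(state):
    # Born rule: outcome probabilities are the squared amplitude magnitudes.
    a, b = state
    return (abs(a) ** 2, abs(b) ** 2)

plus = hadamard((1.0, 0.0))   # a qubit in equal superposition of 0 and 1
p0, p1 = measure_probs(plus)  # both outcomes equally likely
```

Applying `hadamard` twice returns the qubit exactly to the 0 state, showing that superposition is a definite quantum state that evolves deterministically, not mere randomness.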
Quantum computers have the potential to surpass traditional computers in solving
certain types of problems that are inefficient or impossible for classical computers
to handle. For example, quantum computers can perform tasks like factoring large
numbers efficiently and speeding up database searches, although the latter provides
only a quadratic speedup. They are particularly well-suited for simulating quantum
systems, aiding in the understanding of complex chemical reactions, material prop-
erties, and quantum interactions. Quantum computers also hold promise for solving
certain optimization problems and enhancing machine learning, though practical
applications in these fields are still under active research. In addition, they pose
a threat to classical cryptography systems like RSA, while enabling the develop-
ment of quantum-safe cryptographic methods. However, realizing the full potential
of quantum computers requires addressing key challenges, such as the stability of
qubits and overcoming error correction issues.
Even if quantum computers overcome technical challenges and become practical,
traditional computers are expected to perform better in many areas. For everyday
computing tasks like word processing, web browsing, email, and basic office appli-
cations, traditional computers are more suitable. Traditional computers excel at
executing sequential and deterministic algorithms and are more efficient for multi-
media tasks like graphic rendering, video editing, image processing, and conventional
cryptographic communications. They are also highly effective for general database
work, conventional scientific computing, software development, data analysis, statis-
tics, networking, and infrastructure management. While quantum computers offer
unique advantages for specific types of computations, they are not expected to replace
traditional computers but rather serve as complementary technology.
As quantum computers have become more feasible, comparisons with super-
computers have drawn significant interest. In December 2023, at the annual ‘IBM
Quantum Summit’, IBM unveiled the 1,121-qubit quantum computer ‘Condor’.
Quantum computing researchers have predicted that quantum computers with more
than 1,000 qubits could surpass supercomputers for certain types of problems.
Around the same time, Time magazine selected Hewlett Packard’s exa-FLOPS
(10^18 floating-point operations per second) supercomputer ‘Frontier’ as one of
the greatest inventions of 2023.

17 In quantum mechanics, the phenomena of superposition, entanglement, and decoherence can be
explained by the energy states and particle states of the electrons that make up atoms. The energy state
of an electron appears as a superposition of several quantized eigenstates, and an eigenstate can be
obtained through measurement. Entanglement is a phenomenon in which two particles establish a
correlation through specific interactions, such that the state of one particle is closely connected to the
state of the other. However, due to external disturbances such as measurement or interactions with
the environment, this entangled state can disappear, a phenomenon known as decoherence.
Comparing ‘Condor’ and ‘Frontier’ as of 2023 in terms of performance and other
aspects reveals key distinctions. Both excel in high-speed computing, with ‘Condor’
leveraging its qubits for quantum-specific tasks and ‘Frontier’ using traditional binary
computing with massive parallel processing to achieve speed. However, ‘Condor’s
computational capabilities are still highly experimental, with issues such as stability
and error correction yet to be fully resolved. On the other hand, ‘Frontier’ represents
the peak of classical computing power, delivering stabilized, reliable performance
across various scientific, technological, and industrial applications.
In terms of cost, ‘Condor’ demands cutting-edge technology and materials for
quantum processing, error correction, and sophisticated cooling systems to main-
tain qubit stability at temperatures near absolute zero. These requirements make its
development and operation highly expensive and complex. While ‘Frontier’ is also
costly due to its exascale performance, large size, and cooling needs, it operates with
well-established technology and is more easily maintained.
Regarding physical space, while quantum computers like ‘Condor’ may require
less room for the core computing unit, the infrastructure for cooling, error correction,
and maintaining the controlled environment needed for quantum systems signifi-
cantly increases the space needed. In contrast, ‘Frontier’ occupies an area equivalent
to two basketball courts to house the supercomputer, storage systems, and auxiliary
devices.
As for applications, the general comparison between quantum computers and
classical computers applies here. ‘Condor’ has the potential to excel in solving
complex optimization problems, simulating quantum systems, tackling specific types
of cryptography, and other tasks that are difficult for classical computers. Meanwhile,
‘Frontier’ excels in large-scale simulations, weather forecasting, and physics and
astronomy research that demand massive classical computing power.
In summary, while ‘Condor’ represents the future potential of quantum computing,
it is still in the experimental phase, particularly in terms of stability and scala-
bility. ‘Frontier’, however, is a pinnacle of classical supercomputing and is already
being applied in a wide range of fields. Quantum computers like ‘Condor’ are
unlikely to replace classical supercomputers like ‘Frontier’; rather, they will serve as
complementary technologies, each excelling in different domains (see Table 4.1).

4.12 Artificial Intelligence (AI)

Artificial intelligence (AI) refers to machines designed and trained to think, learn,
reason, solve problems, and make decisions like humans. Machine learning is the
process by which AI learns. AI is generally categorized into two types: ‘Narrow
AI’ (NAI or ANI), designed and trained for specific tasks like personal assistants or
image recognition technologies, and ‘General AI’ (GAI or AGI), which possesses
intelligence similar to human intelligence and is the goal AI research aims to achieve.

Table 4.1 Comparison of quantum computers and supercomputers (as of 2023)

Systems compared: the quantum computer ‘Condor’ (IBM) and the supercomputer
‘Frontier’ (Hewlett Packard).

Size (performance). Condor: 1,121 qubits. Frontier: exa (10^18) FLOPS with
massive parallel processing.
Status. Condor: experimental phase. Frontier: practical use stage.
Expense (elements). Condor: high cost due to cutting-edge technology, advanced
materials, error-correcting systems, near absolute-zero temperature requirement,
and qubit stability. Frontier: less costly than quantum computing, but significant
operational costs due to energy consumption, cooling, and infrastructure for
large-scale systems.
Space. Condor: the quantum computer core is relatively small, but large space is
needed for supporting infrastructure, including cooling systems, error correction
systems, and maintaining stable quantum environments. Frontier: requires large
space, approximately the size of two basketball courts, to accommodate the
supercomputer, cooling systems, power systems, and storage units.
Use. Condor: optimization, pattern recognition, material science simulations,
quantum simulations, encryption, and solving problems difficult for classical
computers. Frontier: large-scale simulations, data analytics, AI applications,
complex scientific computations, weather forecasting, and physics and astronomy
research.
Machine learning improves algorithm performance using data. It enables
computers to enhance their intelligence through self-learning, involving training
and evaluation. Training allows computers to discover features in similar datasets
independently, while evaluation uses different datasets for assessment. Feedback
from evaluation results, combined with repeated learning, enhances intelligence.
This learning method is known as supervised learning. Unsupervised learning, in
contrast, involves giving datasets to the computer to discover patterns, structures,
and relationships on its own. Besides supervised and unsupervised learning, there is
also reinforcement learning, where learning occurs through trial and error by interacting
with the environment, taking actions based on the current state, and adjusting actions
based on feedback from the environment.
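Supervised learning can be shown in miniature: the toy example below fits a line to labeled (x, y) pairs by gradient descent, repeatedly adjusting parameters based on the error feedback, as described above. The data, learning rate, and epoch count are arbitrary choices for illustration.

```python
def train_linear(xs, ys, lr=0.05, epochs=500):
    # Fit y = w*x + b by gradient descent on the mean squared error.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w   # feedback step: adjust parameters to reduce error
        b -= lr * grad_b
    return w, b

# Labeled training data drawn from y = 2x + 1; training should recover it.
w, b = train_linear([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

The same loop structure (predict, compare against labels, adjust) underlies far larger models; deep neural networks mainly differ in the function being fitted and in how the gradients are computed.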
Implementing AI involves various technologies, including machine learning,
neural networks, natural language processing, computer vision, and robotics. Neural
networks, inspired by the human brain’s neural structure, can perform deep learning
when layered multiple times, used for language and image recognition. Natural
language processing enables machines to understand, interpret, and converse in
human language, used in chatbots, language translation, and sentiment analysis.
Computer vision allows machines to interpret and understand visual information,
recognizing objects, scenes, and faces in images and videos. Robotics combines
machine learning and mechanical engineering to create intelligent machines that can
interact with their environment.
The development of AI and machine learning has seen several groundbreaking
inventions. The concept of deep neural networks emerged in the 1980s, followed by
the backpropagation algorithm and recurrent neural networks (RNN), kickstarting
the development of deep neural networks. Convolutional neural networks (CNNs)
advanced computer vision. The 2010s saw the proposal of generative models, opening
new possibilities for computer vision, natural language processing, and data anal-
ysis. The introduction of the transformer architecture with self-attention mechanisms
brought significant advancements to deep learning, revolutionizing natural language
processing, speech processing, and image recognition. (For explanations and details
on CNN, RNN, DL, etc., see Sect. 5.5.)
A significant milestone in AI’s development occurred in November 2022, with
the release of ChatGPT-3.5 for general testing. ChatGPT is an application model of
the Generative Pre-trained Transformer (GPT) designed by OpenAI for conversa-
tional purposes. GPT is a generative natural language processing model with a trans-
former architecture, trained on vast amounts of text data from the internet, capable
of conversing in human-like text. OpenAI later released GPT-4 through a subscription
service, and subsequently GPT-4 Turbo and the o1 model.
ChatGPT has had several positive impacts, such as demonstrating that large
language models can generate human-like text and engage in conversations. It
helps users by answering questions, providing explanations, and supporting creative
writing and content generation, thus enhancing productivity. In addition, it has
promoted research and development in the fields of natural language processing
and artificial intelligence and provided educational tools that enable users to learn
new topics, concepts, and languages.
On the negative side, ethical concerns have been raised about the potential errors,
biases, rudeness, and discriminatory behavior of AI-generated content. There are
also concerns about the spread of misinformation and disinformation, as well as the
potential for information manipulation and deception. Furthermore, privacy and data
security issues arise due to the possibility of sharing personal or sensitive information.
In addition, it has been suggested that jobs requiring communication and interaction
with customers could be reduced, and an over-reliance on AI could lead to a decline in
critical thinking and independent decision-making skills. (For detailed information
on artificial intelligence, see Chap. 5.)
Chapter 5
Artificial Intelligence

Artificial intelligence (AI) refers to the imitation of human intelligence using
machines programmed to think and learn; the term sometimes refers specifically to
such machines themselves. The
main interest of AI is to create systems capable of performing tasks that require
human intelligence, such as voice recognition, visual recognition, decision-making,
and language translation. Furthermore, AI aims to mimic various aspects of human
cognitive abilities to adapt to and improve situations and to create machines that
can operate autonomously. Such tasks require abilities like learning (i.e., acquiring
information and the rules for using that information), reasoning (i.e., using rules to
reach approximate or definite conclusions), problem-solving, perception, and under-
standing human language. AI systems range from simple rule-based algorithms to
complex neural networks that mimic the human brain, and they are used to find
solutions to complex problems in various fields, from automating complex tasks to
healthcare, finance, transportation, and entertainment.
The fundamental core components that make up AI systems include algorithms,
machine learning, and neural networks. Algorithms provide the logical system and
rules that guide the AI system. Machine learning uses these algorithms to learn
from data and make predictions or decisions. Neural networks, a type of machine
learning algorithm, are designed to handle complex tasks such as image recogni-
tion or voice recognition. Essentially, algorithms are the basic building blocks of AI,
machine learning is the field that allows machines to learn from data using these algo-
rithms, and neural networks are sophisticated implementations of machine learning
algorithms designed to handle complex tasks. Algorithms, machine learning, and
neural networks each contribute to the overall functionality and effectiveness of AI
systems, building upon and enhancing each other’s capabilities. These three elements
are crucial to consider when reviewing, applying, and seeking improvements in AI
systems. Understanding these three components well provides a foundation for a
comprehensive understanding of the overall shape of complex AI.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_5

5.1 Human Intelligence and Artificial Intelligence

AI is an effort to recreate human intelligence in machines, with the ultimate goal of
replicating and potentially surpassing it. Before delving into the technical aspects
of AI, it is useful to compare the current status of AI with human intelligence.
One practical way to explore this is by comparing how both humans and AI under-
stand and respond to questions, which provides insights into their respective cogni-
tive processes. Using human cognitive abilities as a benchmark can help us better
understand the capabilities and limitations of AI technology.

5.1.1 Human Cognitive Process

The process by which humans understand questions and generate answers involves
multiple brain regions and complex cognitive functions. As revealed by cognitive
science, this process unfolds as follows:
First, the cognitive process begins when a question is either read or heard. Visual
or auditory information is received by the sensory organs, such as the eyes or ears.
At this stage, an attention mechanism activates to focus on the relevant question,
filtering out irrelevant sensory information.
Second, the perceived question is processed in the language centers of the brain,
primarily located in the left hemisphere. These centers include Broca’s area, which
is involved in speech production, and Wernicke’s area, which is responsible for
language comprehension. During this phase, the grammatical structure of the ques-
tion is analyzed, and the meaning of words is interpreted in the context of the
conversation or text.
Third, once the question is understood, the brain begins retrieving relevant infor-
mation from memory. This process involves the hippocampus, which is key for
long-term memory, and the prefrontal cortex, which aids in memory retrieval. The
brain searches through stored knowledge and experiences to find information that
can be used to construct an appropriate answer.
Fourth, the prefrontal cortex integrates the retrieved information with the context
of the question. It applies logic and reasoning to determine the most suitable response.
General knowledge, personal experience, and contextual clues are used to infer any
implied meaning or depth of explanation required.
Fifth, after deciding on the content of the answer, the brain plans how to express it.
The prefrontal cortex is involved in selecting appropriate words and structuring them
into coherent sentences. Broca’s area manages the production of speech or written
language.
Sixth, the motor cortex initiates the physical process of delivering the response. It
activates the muscles needed for speaking or writing. As the response is expressed,
auditory or visual feedback systems monitor its accuracy. If any discrepancies are
detected (such as a misspoken word), the brain adjusts the response in real-time.
Fig. 5.1 Cognitive process of question and answer: a Human (Sensing/Attention →
Decipher/Analysis → Information Retrieval → Integration/Inference → Answer
Decision → Expression/Adjustment); b AI (Vector Conversion → Question Analysis →
Pre-trained Knowledge → Inference/Synthesis → Answer Generation)

This question-and-answer process involves intricate neural networks and cognitive
processes that work in harmony to ensure the response is relevant, coherent,
and appropriate. The process is not merely about retrieving facts but also involves
interpreting language nuances, understanding context, and adapting communication
in a fluid and dynamic manner (see Fig. 5.1a).

5.1.2 Cognitive Process of AI

AI, like humans, can understand questions and generate answers, but its operation
is fundamentally different. AI processes information through complex calculations,
rather than through biological cognition. We can explore this by examining a widely
used transformer-based AI model, GPT.
First, when GPT receives a question, it tokenizes the input text into smaller units
(tokens) and converts these tokens into numerical vectors through an embedding
process. These tokens, representing parts or whole words, start with arbitrary vector
values that adapt over time as GPT learns to associate them with semantic meanings.
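The tokenize-and-embed step just described can be sketched in a few lines. Note the simplifying assumptions: real GPT models use a learned subword (BPE) tokenizer and far larger embedding tables, whereas the word-level vocabulary, the 8-dimensional vectors, and the random initial values below are purely illustrative.

```python
import numpy as np

# Toy word-level "tokenizer"; actual GPT models use subword (BPE) tokens.
vocab = {"what": 0, "is": 1, "ai": 2, "?": 3}

# Embedding table: one vector per token (4 tokens, 8 dimensions here).
# In GPT these values start random and are adjusted during training.
rng = np.random.default_rng(42)
embedding_table = rng.normal(size=(len(vocab), 8))

def tokenize(text):
    """Split text into tokens and map each token to its integer id."""
    return [vocab[w] for w in text.lower().split()]

def embed(token_ids):
    """Look up the embedding vector for each token id."""
    return embedding_table[token_ids]

token_ids = tokenize("what is ai ?")
vectors = embed(token_ids)
print(token_ids)       # [0, 1, 2, 3]
print(vectors.shape)   # (4, 8)
```

The point of the sketch is only the two-stage mapping: text → integer ids → dense vectors that later layers can compute with.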
Second, GPT uses its transformer architecture and self-attention mechanism to
analyze the relationships and relative importance of these tokens. This allows it to
deeply understand the structure, grammar, and context of the question, determining
its theme and intent. The embedded data is processed through several layers of
self-attention, refining its understanding based on patterns learned during training.
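The self-attention computation at the heart of this step can be sketched as single-head scaled dot-product attention. The dimensions and the random weight matrices below stand in for learned parameters and are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one head.
    X: (seq_len, d_model) matrix of token embeddings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise token relevance
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V, weights              # weighted mix of value vectors

rng = np.random.default_rng(0)
d_model, d_head, seq_len = 8, 4, 5
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = rng.normal(size=(3, d_model, d_head))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)                              # (5, 4)
print(np.allclose(weights.sum(axis=1), 1.0))  # True
```

Each output row is a mixture of all value vectors, weighted by how relevant every other token is to that position; stacking many such layers (with multiple heads) yields the refinement described above.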
Third, unlike humans, GPT does not search an internal knowledge base or memory
for answers. Instead, it relies on patterns learned during pre-training on a vast corpus
of data. This extensive training allows GPT to generate responses based on the context
provided in the input, rather than retrieving information as humans do from memory.
Fourth, GPT synthesizes an answer by analyzing the input tokens in context.
Although its reasoning abilities are still limited, it can perform some level of inference
through its interaction with the trained weight parameters and input tokens. When
examples are provided in the prompt or when guided by a “chain of thought,” GPT’s
ability to infer and reason can improve.1
Fifth, GPT generates an answer in natural language. During this process, it predicts
the next word (token) based on the context of the question and the tokens it has already
generated. This is done through complex calculations that rely on its extensive pre-
trained data. The sequence of generated tokens is then converted into a coherent text
and presented as the final answer.
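The generation loop itself can be sketched with a toy hand-written probability table standing in for the transformer's predicted next-token distribution; the vocabulary, probabilities, and greedy decoding below are illustrative assumptions (real models compute the distribution with a deep network and typically sample from it rather than always taking the maximum).

```python
# Toy next-token model: a hand-written table of P(next | current token).
probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}

def generate(start, max_tokens=10):
    """Greedy autoregressive decoding: repeatedly append the most
    probable next token until an end marker is produced."""
    tokens = [start]
    for _ in range(max_tokens):
        dist = probs[tokens[-1]]
        nxt = max(dist, key=dist.get)  # pick the highest-probability token
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("the"))   # the cat sat
```

The essential structure matches the description: each new token is predicted from the tokens generated so far, then fed back in for the next prediction.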
Comparing human and AI cognitive processes (as illustrated in Fig. 5.1a and
b), we see that while their functions may appear similar, they are fundamentally
different. AI lacks the ability to retrieve specific information from memory; it can
only simulate knowledge retrieval based on its pre-trained data. Furthermore, its
inference and reasoning capabilities are still underdeveloped, which is indicated
by the dashed lines around the ‘Pre-trained Knowledge’ and ‘Inference, Synthesis’
blocks in Fig. 5.1b.
It is important to note that GPT’s ability to understand and generate consistent
text comes from being trained as a language model. By pre-training on vast amounts
of digital text, it learns patterns of language, grammar, facts, and reasoning, which
enables it to generate responses that are relevant, grammatically correct, and logically
coherent.

5.1.3 Understanding

How does AI “understand” questions, and what does “understanding” mean for AI?
Unlike humans, who use conscious thought, existing knowledge, and reasoning
to reflect on the meaning, context, and implications of a question, GPT—a model of
AI—relies on pattern recognition, statistical correlations, and predictive modeling
to simulate understanding.
For GPT, “understanding” begins with interpreting the input text. It uses a self-
attention mechanism within a pre-trained transformer architecture to identify patterns
and correlations in the input. As it passes through multiple layers of the self-attention
network, GPT refines its interpretation by detecting patterns and relationships in the
data. This process allows it to grasp the context, nuances, and intended meaning of
the question.
Once GPT has processed the content and context of the question, it generates a
response based on its training. It selects words and constructs sentences by drawing
from learned patterns of how similar questions have been answered in the past.
However, GPT does not retain any memory of the interaction. Once a response is

1 Two representative methods for enhancing AI’s inference capability using prompts are in-context
learning (ICL) and chain-of-thought (CoT). ICL is a method where the AI model analyzes and
learns from the context within the input data itself, without requiring additional training. It enables
the model to make predictions by understanding the given context. CoT, on the other hand, is a
technique that guides the AI model to solve problems step-by-step in a sequential manner, allowing
it to perform multi-step inference and logical progression more effectively.
generated, the task is complete, and if the same question is repeated, GPT will go
through the entire process again, potentially producing a different response each time
due to the probabilistic nature of its predictions.
Thus, for AI, “understanding” refers to the ability to interpret input information
and generate a relevant response. Unlike humans, who store information and can
consciously reflect on their responses, GPT does not retain or remember the content
it processes or the responses it generates. Its understanding is momentary, driven by
patterns in data rather than conscious reasoning or long-term memory.

5.1.4 Memory

How does AI “remember,” and what exactly does it retain?
Humans pay attention to questions, perceive them, and interpret these inputs in
the brain. Important information is encoded into memory, which can be reinforced
by associating new information with existing knowledge or through repetition. Over
time, this information is stored in both short-term and long-term memory.
In contrast, AI “remembers” only what it has learned during its training phase and
retains nothing from individual interactions. AI does not learn new information or
reinforce its knowledge during use; its responses are based solely on patterns learned
from the training data.
Human brains store essential features, meanings, and emotions related to
perceived information across a network of neurons. Memory is thought to be stored at
synapses, where synaptic plasticity allows the strength of memories to change over
time. Repeated activation of specific neural circuits strengthens these memories,
making it easier to recall related information later.
AI, however, stores information in the form of weight parameters during training.
It converts input data into numerical vectors via embeddings and adjusts the neural
network’s weights based on feedback during training. These weights, primarily
within the attention and feedforward layers of the transformer architecture, act as
AI’s “memory.” After the training phase is complete, GPT no longer learns or stores
new information.

5.2 Development of Artificial Intelligence

The three elements of algorithms, machine learning, and neural networks have
evolved over a period of approximately 70 years, overcoming various challenges
through technical breakthroughs and marking milestones in the development of AI
along the way. Technologically, the emergence of recurrent
neural networks (RNNs) and convolutional neural networks (CNNs) enabled effec-
tive processing of sequential data and images, respectively, while variational autoen-
coders (VAEs) and generative adversarial networks (GANs) opened the doors to
generative models, and the advent of the transformer architecture with self-attention
mechanism paved the way for parallel processing. Events such as the introduction
of IBM Deep Blue, IBM Watson, and Google AlphaGo have marked milestones in
the history of AI development.

5.2.1 Early Stages of Artificial Intelligence

Although the concept of artificial intelligence (AI) existed as early as the 1940s,
the term AI was first used at a conference hosted by John McCarthy and others at
Dartmouth College in 1956, where AI was defined as the science and engineering
of making intelligent machines. The concept and fundamental theories of neural
networks existed in the 1940s, referring to computing systems inspired by the struc-
ture and function of the human brain, particularly neurons and their interconnections.
In 1943, Warren McCulloch and Walter Pitts proposed a model of artificial neurons
with binary outputs. In 1949, Donald Hebb introduced the concept of learning through
strengthened neural connections. Building on these ideas, in 1957, Frank Rosen-
blatt introduced the perceptron, a model designed for pattern recognition inspired
by biological neurons.
In 1956, the first AI program, Logic Theorist, was developed, capable of proving
mathematical theorems using symbolic logic. A decade later, in 1966, the early natural
language processing program ELIZA was developed, demonstrating the possibility,
albeit primitive, of machines understanding and responding to human language. In the 1970s,
Expert Systems capable of mimicking the decision-making of human experts in
specific fields were developed and widely disseminated.
In 1997, IBM’s computer Deep Blue defeated world chess champion Garry
Kasparov, proving that machines could perform complex calculations and strate-
gies. Between the 1990s and the 2000s, neural network research saw a revival, and
machine learning algorithms capable of learning and making decisions based on
data improved AI capabilities. In 2011, IBM Watson demonstrated its ability to
understand natural language and answer questions on the game show “Jeopardy!”
by defeating human champions. In 2016, Google DeepMind’s AlphaGo defeated the
world’s leading Go player, Lee Sedol, proving AI’s ability to tackle complex board
games through deep reinforcement learning.

5.2.2 Neural Networks and Machine Learning

From a technical perspective, the development of artificial intelligence and machine
learning has seen several groundbreaking inventions over time.
Recurrent neural networks (RNNs) are neural networks that give feedback to
themselves, designed to process sequences or time series data. The foundational
concept was introduced in the 1980s, but practical application required further devel-
opments, culminating in the creation of long short-term memory (LSTM) networks in
1997. LSTMs addressed the problem of long-term dependencies in data sequences,
significantly advancing natural language processing and speech recognition with
RNNs.
Convolutional neural networks (CNNs), introduced in the 1980s, employ a method
of filtering image pixels with a weight matrix to extract features. Practical application
became feasible only after improvements in computer performance and data capacity
allowed for larger and more complex neural networks. The emergence of AlexNet
in 2012 marked a significant development in CNNs, leading to breakthroughs in the
field of computer vision and revolutionary advancements in deep learning.
Deep neural networks (DNNs) are neural networks with depth, meaning a large
number of layers. Neural network research re-emerged in the 1980s, but networks were
not widely used or practical at that time. Initially, the neural networks studied were shallow,
with only a few layers of depth. Neural networks became practically viable with the
application of the backpropagation algorithm in 1986, which subsequently became
a core technology for DNNs. However, it was not until 2006, when fast and efficient
training methods were introduced, that DNNs, including RNNs and CNNs, became
widely applicable.
The concept of graph neural networks (GNNs) was first introduced in 2009,
but practical application occurred after 2017, following sufficient advancements in
computational capabilities, the availability of large datasets, and several improve-
ments to GNN structures, leading to sophisticated graph neural network models.
Unlike traditional neural networks that dealt with 1D sequences or 2D images, GNNs
process data represented in graph form, emerging as a new tool for analyzing inher-
ently graph-structured data such as social networks, communication networks, and
molecular structures.
Machine learning has evolved into supervised learning, unsupervised learning,
deep learning, and reinforcement learning. Supervised learning involves learning
from labeled data, while unsupervised learning occurs without labeled data. Deep
learning uses neural networks with many layers, and reinforcement learning involves
learning through interaction with an environment by taking actions and receiving
feedback. In 2016, deep reinforcement learning emerged as Google DeepMind devel-
oped AlphaGo by intricately combining deep learning with reinforcement learning.
Deep reinforcement learning has enabled machines to learn and adapt in complex
and uncertain environments like games or autonomous driving, becoming a key
technology in AI development.

5.2.3 Generative Models

In 2013, the variational auto-encoder (VAE), a generative model, was proposed,
significantly impacting unsupervised learning and data generation. It opened
new possibilities in various fields such as computer vision and natural language
processing, endowing computers with creative capabilities like image synthesis.
In 2014, the generative adversarial network (GAN), composed of two opposing
networks: a generator and a discriminator, was introduced. GANs have become a tool
for generating realistic data, enhancing and altering images and videos, contributing
to practical applications of AI and creative exploration.
Generative models have rapidly developed with the emergence of the self-attention
mechanism and its application in the transformer architecture. Between 2018 and
the early 2020s, large-scale generative models like GPT and BERT, which adopt the
transformer architecture, have revolutionized natural language processing, language
understanding, and generation.

5.2.4 Transformer Architecture

The attention mechanism was introduced in 2014, and the self-attention mechanism
followed in 2017. Unlike RNNs or LSTMs, the self-attention mechanism can process
each part of the input data in parallel and adaptively work with the length of the input
data, significantly contributing to natural language processing.
The transformer architecture, presented in 2017, has achieved groundbreaking
progress in natural language processing. Its hallmark is the use of the self-attention
mechanism, which allows for parallel processing of data, moving away from the
sequential data processing used by RNNs and LSTMs. The transformer architecture
has been adopted in most large-scale AI models that followed.
In 2018, BERT, which adopts the transformer architecture, brought innovation
to natural language processing. Applied to various natural language processing
applications, from language translation to chatbots, BERT improved performance,
multifunctionality, and depth of language understanding.
Developed between 2018 and 2023, GPT is also a large-scale language model
that uses the transformer architecture, bringing about revolutionary advancements
in natural language processing by breaking the limits on the size and capabilities of
neural networks. In November 2022, ChatGPT emerged, capable of generating text,
engaging in conversations, and answering questions in a manner akin to humans.

5.3 Algorithms

An algorithm, in general, is a procedure or method for solving a problem, specifically
a collection of instructions for actions to be taken to solve a problem. A computer
program typically consists of a collection of sophisticated algorithms. When defining
AI as using machines programmed to think and learn in order to mimic human
intelligence, the ‘programs that think and learn’ are essentially algorithms. Therefore,
AI operates through numerous algorithms across various layers, making AI itself a
collection of algorithms.
Algorithms used in AI include search algorithms, sorting algorithms, optimization
algorithms, and machine learning algorithms. Machine learning algorithms are
further divided into several types such as supervised learning algorithms, unsu-
pervised learning algorithms, reinforcement learning algorithms, and deep learning
algorithms. This section briefly explains search algorithms, sorting algorithms, and
optimization algorithms, with machine learning discussed in a separate section.

5.3.1 Search Algorithms

Search algorithms are used for searching data or finding paths. Traditional examples
of search algorithms in computer science include depth-first search (DFS), breadth-
first search (BFS), and the A-Star algorithm. DFS is used for exploring graph or tree
structures, BFS for finding the shortest paths, and A-Star for shortest path finding
and graph traversal.
Among various search algorithms, we single out the DFS and discuss how it
operates. Historically, DFS was used for symbolic reasoning, logic, and problem-
solving, and it continues to be used for solving certain types of problems related
to search and optimization in AI. In complex AI systems, especially those dealing
with structured data or requiring exhaustive search and exploration, DFS operates as
follows: Starting from a node, it moves to an adjacent node, marks it as visited, and
repeats this process until it reaches the end of a branch. It does not revisit previously
visited nodes. If it reaches the end of a branch and there are no unvisited adjacent
nodes left, it backtracks to the last visited node that has unvisited adjacent nodes
and begins exploring a different branch. This process continues until all nodes have
been visited. DFS is used for complete exploration of a graph, visiting all nodes.
However, it does not necessarily find the shortest path from the starting node to the
destination node; it aims to explore as deeply as possible. Thus, DFS is useful for
finding whether a path exists between two nodes and what that path is.
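The traversal just described can be sketched directly. The adjacency-list graph below is an invented example; the function returns one path between two nodes (not necessarily the shortest), or None if no path exists, matching the behavior described above.

```python
def dfs_path(graph, start, goal):
    """Depth-first search on an adjacency-list graph. Explores each
    branch as deeply as possible, backtracking when a branch ends."""
    visited = set()

    def visit(node, path):
        if node == goal:
            return path
        visited.add(node)                 # never revisit a node
        for neighbor in graph[node]:
            if neighbor not in visited:
                found = visit(neighbor, path + [neighbor])
                if found:                 # otherwise backtrack and
                    return found          # try the next branch
        return None

    return visit(start, [start])

graph = {
    "A": ["B", "C"], "B": ["D"], "C": ["E"],
    "D": [], "E": ["F"], "F": [],
}
print(dfs_path(graph, "A", "F"))   # ['A', 'C', 'E', 'F']
print(dfs_path(graph, "D", "A"))   # None
```

Here DFS first exhausts the branch A→B→D, backtracks, and then finds the goal along A→C→E→F.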
Today’s search engines, widely used on search platforms, also fall into the cate-
gory of search algorithms but are more complex than traditional algorithms and
incorporate elements of AI’s machine learning and neural networks. Search engines
operate on complex algorithms to search data in search indexes and provide the
most relevant results for a search query. For Google search, algorithms such as
PageRank, Hummingbird, RankBrain, and BERT are used. PageRank is a link anal-
ysis algorithm used to rank web pages in search engine results, Hummingbird is
an algorithm to understand the intent and contextual meaning of a search query,
RankBrain interprets search queries to find related pages even if the words are
not exact, and BERT is a neural network-based technology for pre-training natural
language processing, helping to understand the nuances and context of words and
identify search query-related results more accurately (For more details on BERT, see
Sect. 5.7).
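The link-analysis idea behind PageRank can be sketched as power iteration on a toy link graph. This is a simplified textbook formulation, not Google's actual implementation; the three-page graph and the iteration count are illustrative assumptions, and 0.85 is the commonly cited damping factor.

```python
import numpy as np

def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.
    links[i] lists the pages that page i links to."""
    n = len(links)
    rank = np.full(n, 1.0 / n)
    for _ in range(iterations):
        new_rank = np.full(n, (1.0 - damping) / n)
        for page, outlinks in enumerate(links):
            for target in outlinks:
                # Each page shares its rank equally among its outlinks.
                new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

# Toy web: pages 0 and 1 both link to page 2, which links back to both,
# so page 2 accumulates the highest rank.
links = [[2], [2], [0, 1]]
rank = pagerank(links)
print(rank.round(3), rank.argmax())
```

The iteration converges to a stationary distribution in which a page's score reflects how much rank flows into it from the pages linking to it.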
5.3.2 Sorting Algorithms

Sorting algorithms are traditional algorithms used to arrange a set or collection of
items or elements in a specific order. Sorting organizes data into an easily accessible
format and is used in various computer programs, including databases, search algo-
rithms, and data analysis. It also plays a crucial role in the context of AI and machine
learning, especially in data preprocessing and optimization processes.
There are various types of sorting algorithms, such as bubble sort, selection sort,
insertion sort, and merge sort. Bubble sort is one of the simplest sorting methods,
repeatedly traversing the list and swapping adjacent elements if they are in the wrong
order. Selection sort divides the list into sorted and unsorted sections and moves the
smallest (or largest) element from the unsorted section to the sorted section. Insertion
sort takes elements one by one and inserts them into their correct position within the
sorted portion of the list. Merge sort, a divide-and-conquer sorting method primarily
applied to large lists, divides the input list into smaller ‘sublists’ until these sublists
are trivially sorted, then merges the sublists step by step to form the final sorted list.
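The divide-and-conquer steps of merge sort can be sketched as follows (the input list is an invented example):

```python
def merge_sort(items):
    """Divide-and-conquer sort: split the list, sort each half
    recursively, then merge the two sorted halves."""
    if len(items) <= 1:          # 0 or 1 items: trivially sorted
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge: repeatedly take the smaller front element of the halves.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```

Because each merge is linear and the list is halved at every level, the overall cost is O(n log n), which is why merge sort suits large lists.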
In AI and machine learning, sorting algorithms are used for data preprocessing,
search algorithms, recommendation systems, and optimization. Data preprocessing
prepares data for AI and machine learning by cleaning and transforming it, where
sorting algorithms help organize data in a specific order, making it easier to remove
duplicates, identify anomalies, and detect data patterns. Sorting data also enables
efficient searching for maximum or minimum values or specific elements within
a dataset. By arranging items or content in a specific order according to user
preferences, sorting algorithms allow recommendation systems to function effec-
tively. Furthermore, in optimization processes, sorting algorithms prioritize the most
promising optimization candidates, thereby enhancing efficiency.

5.3.3 Optimization Algorithms

Optimization algorithms are used in various fields, including engineering and
computer science, to find the optimal solution among possible solutions. Optimization
involves maximizing or minimizing an objective function, which quantitatively
represents the performance of a solution. The goal is to find input values that maxi-
mize or minimize this function. Optimization problems often come with constraints
that the solution must satisfy. The set of all possible solutions for an optimization
problem is called the search space, with the best solution within a specific search
space referred to as the ‘local optimum’ and the best solution across the entire space
called the ‘global optimum’.
There are various optimization algorithms, including brute-force search, gradient
descent, evolutionary algorithms, linear programming, integer programming,
dynamic programming, simulated annealing, and swarm intelligence algorithms.
Brute-force search algorithm systematically explores all possible solutions within the
search space. Gradient descent algorithm, applied to continuous functions, updates
solutions based on the objective function’s gradient. Evolutionary algorithms maintain
a set of potential solutions and find the optimal solution through evolution.
Linear programming applies to linear optimization problems with linear constraints,
maximizing or minimizing a linear objective function. Integer programming is an
optimization algorithm where decision variables must take integer values. Dynamic
programming is a divide-and-conquer optimization method that solves problems by
breaking them down into smaller sub-problems without repetitive redundant calls.2
Simulated annealing explores a wide search space initially and then focuses on the
most promising areas to find an approximate optimum. Swarm intelligence algo-
rithms, inspired by the collective behavior of insects or animals, involve multiple
agents communicating and cooperating to find the optimal solution.
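Of the methods above, dynamic programming is the easiest to illustrate in a few lines: a memoized function solves each subproblem once, stores the result, and reuses it, in contrast to plain recursion that repeats the same calls. Fibonacci numbers are an illustrative example chosen here, not one drawn from the text.

```python
from functools import lru_cache

def fib_naive(n):
    """Plain divide-and-conquer recursion: the same subproblems
    are recomputed exponentially many times."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_dp(n):
    """Dynamic programming via memoization: each subproblem is
    solved once, stored, and reused on every later call."""
    return n if n < 2 else fib_dp(n - 1) + fib_dp(n - 2)

# fib_dp(40) evaluates each of the 41 subproblems exactly once;
# fib_naive(40) would make over 300 million recursive calls.
print(fib_dp(40))   # 102334155
```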
Among these optimization algorithms, gradient descent is widely used in machine
learning and deep learning. It is an optimization algorithm that iteratively finds param-
eters minimizing the objective function. It calculates the gradient of the objective
function with respect to the parameters, where the gradient is a vector pointing in the
direction of the steepest increase of the function, obtained by partial differentiation.
To reduce the objective function value, parameters are adjusted in the opposite direction
of the gradient. A small adjustment (learning rate) leads to slow convergence, while
a larger one converges faster, but excessive adjustment can cause oscillation and
divergence. The process repeats until the objective function’s change is minimal or
the gradient is close to zero, at which point the parameters are considered to have
converged (Refer to Sect. 5.5.4).
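The update rule just described can be sketched for a simple convex objective. The quadratic function, learning rate, and stopping tolerance below are illustrative choices, not values prescribed by any particular system.

```python
import numpy as np

def gradient_descent(grad, x0, learning_rate=0.1, steps=200, tol=1e-8):
    """Minimize an objective by stepping opposite to its gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        if np.linalg.norm(g) < tol:    # gradient near zero: converged
            break
        x = x - learning_rate * g      # move against the steepest ascent
    return x

# Objective f(x, y) = (x - 3)^2 + (y + 1)^2, minimized at (3, -1).
grad_f = lambda v: np.array([2 * (v[0] - 3), 2 * (v[1] + 1)])
x_min = gradient_descent(grad_f, x0=[0.0, 0.0])
print(x_min.round(4))   # [ 3. -1.]
```

With a learning rate of 0.1 the error shrinks by a constant factor each step; a rate above 1.0 for this objective would overshoot and oscillate, illustrating the divergence risk noted above.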

5.4 Machine Learning

Machine learning (ML) is a core technology of AI that helps computers learn from
data and make predictions or decisions. The goal of machine learning is to improve
performance on specific tasks by learning from a given dataset, with the quality and
quantity of the learning data significantly impacting the results. In the early stages of
machine learning, using decision trees has the advantage of visually displaying the
characteristics by which the data is classified.3 The goal of machine learning is to
develop a learning model that can generalize from training data to new, unseen data.

2 When a problem is solved using the divide-and-conquer method with a recursive algorithm, ineffi-
ciencies can occur due to redundant recursive calls that solve the same subproblems multiple times.
Dynamic programming is a technique that improves efficiency by storing the results of subproblems
and reusing them, thereby avoiding redundant calculations.
3 A decision tree is a simple yet powerful machine learning algorithm that can be used to analyze and
predict complex data structures. It is useful for visually representing and interpreting the inherent
patterns in data, especially by simplifying complex decision paths. This helps in understanding the
structure and characteristics of the data in the early stages of machine learning. The structure of
a decision tree intuitively conveys the knowledge gained during the learning process, playing a
crucial role in analyzing and verifying the decision-making process of the trained model.
112 5 Artificial Intelligence

Effective generalization means the machine learning algorithm can apply what it has
learned to new data effectively. However, overly complex models may capture noise
in the data, reducing performance, while overly simple models may fail to capture
basic patterns.
Machine learning includes several types: supervised learning, unsupervised
learning, semi-supervised learning, reinforcement learning, and deep learning (deep
neural networks). In supervised learning, the algorithm learns a function that maps
inputs to outputs based on input–output pairs in the dataset. Unsupervised learning
identifies patterns, structures, or features in data without predefined outputs. Semi-
supervised learning builds better models using both labeled and unlabeled data.
Reinforcement learning learns the optimal actions through interaction with an envi-
ronment, based on rewards or penalties for the actions taken. Deep learning uses
multiple layers of neural networks to extract features from low to high levels of
abstraction. The performance of these machine learning algorithms varies depending
on the problem, available data, and desired outcomes, necessitating the selection of
the most appropriate algorithm for each situation. This section will explain super-
vised, unsupervised, semi-supervised, and reinforcement learning, with deep learning
discussed in a subsequent section following the discussion on neural networks (Refer
to Sect. 5.5.6).
While there are various types of machine learning, the basic learning proce-
dure shares common steps: data collection, organizing and transforming data into a
format suitable for machine learning, identifying or generating features to improve
model performance, selecting a specific machine learning algorithm for training,
and evaluating the results of the training. If the evaluation results are satisfactory, the
learning process ends; otherwise, the model’s internal parameters are adjusted, and
the learning and evaluation process is repeated. Once learning concludes, the model
is deployed in the target environment for use, continuously monitored and updated,
and if performance degrades, it may be retrained.
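The train–evaluate–adjust loop described above can be illustrated with a deliberately tiny toy "model" whose only internal parameter is a decision threshold. The dataset and task here are hypothetical: points drawn from [0, 10] are labeled 1 when the feature exceeds 5.

```python
import random

random.seed(0)

# Toy dataset: the true label is 1 exactly when the feature exceeds 5.
data = [(x, int(x > 5)) for x in [random.uniform(0, 10) for _ in range(200)]]
train, test = data[:150], data[150:]        # split into training and evaluation sets

def evaluate(threshold, dataset):
    """Fraction of examples the threshold model classifies correctly."""
    correct = sum(1 for x, y in dataset if int(x > threshold) == y)
    return correct / len(dataset)

# "Training": adjust the model's single internal parameter (the threshold)
# and repeat the learn-and-evaluate cycle until the result is satisfactory.
best_t, best_acc = None, 0.0
for t in [i * 0.5 for i in range(21)]:      # candidate thresholds 0.0 .. 10.0
    acc = evaluate(t, train)
    if acc > best_acc:
        best_t, best_acc = t, acc

print(best_t, evaluate(best_t, test))       # deployed model, checked on unseen data
```

Real models have far more parameters and are adjusted by optimization rather than exhaustive search, but the collect–train–evaluate–deploy cycle is the same.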

5.4.1 Supervised Learning

In supervised learning, training algorithms use a dataset with specified labels. Labels
are the expected outputs or target values for each input item used in training. They
are distinct from the input data and serve as essential guides for learning the mapping
between inputs and their corresponding outputs. For instance, if training involves
distinguishing between pictures of dogs and cats, the pictures serve as input data,
and “dog” and “cat” are the labels. Supervised learning algorithms attempt to predict
the labels based on the input data, adjusting parameters through comparisons between
the predictions and given labels during training.
The supervised learning process begins with data collection and preprocessing,
where input data is cleaned, missing values are handled, and variables are transformed
into a format suitable for the learning model. The data is then divided into training and
evaluation datasets, with the training set used for model training and the evaluation
set for assessing model performance. Depending on the problem type, an appropriate
machine learning model is selected, such as linear regression models for regression
problems (predicting continuous values) and decision trees or neural network models
for classification problems (predicting categories).4 After training the model with
the training dataset, its performance is evaluated using the evaluation dataset. The
learning process concludes once the model achieves satisfactory accuracy.
As an example of supervised learning, we consider training a system to classify
emails as spam or non-spam. The process begins with collecting a dataset of emails
labeled as spam or non-spam, extracting features such as word frequency, sender
address, and email length during preprocessing. Part of the dataset is designated
for training and the rest for evaluation. A suitable model is selected based on the
classification task and trained using the training dataset. After sufficient training, the
model’s accuracy is evaluated using the evaluation dataset. If the model performs
satisfactorily, it can be deployed in email applications to filter spam. In this example,
the model learns from labeled email data and uses this knowledge to correctly classify
new emails.
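A heavily simplified sketch of this spam example follows, using a handful of hand-picked word-presence features and a perceptron-style update rule. The word list and messages are invented for illustration; real filters use far richer features and models.

```python
# Miniature supervised spam classifier: binary word features + perceptron rule.
# SPAM_WORDS and the training messages are hypothetical examples.

SPAM_WORDS = ["free", "winner", "prize", "offer"]

def features(email):
    """One bias feature plus one binary feature per spam-indicative word."""
    words = email.lower().split()
    return [1.0] + [1.0 if w in words else 0.0 for w in SPAM_WORDS]

# Labeled training data: 1 = spam, 0 = non-spam.
train = [
    ("free prize inside claim your offer", 1),
    ("you are a winner of a free offer", 1),
    ("winner winner free prize", 1),
    ("meeting rescheduled to monday", 0),
    ("please review the attached report", 0),
    ("lunch at noon tomorrow", 0),
]

w = [0.0] * (1 + len(SPAM_WORDS))
for _ in range(10):                          # training epochs
    for email, label in train:
        x = features(email)
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
        err = label - pred                   # compare prediction with given label
        w = [wi + 0.1 * err * xi for wi, xi in zip(w, x)]   # adjust parameters

def classify(email):
    return 1 if sum(wi * xi for wi, xi in zip(w, features(email))) > 0 else 0

print(classify("claim your free prize now"), classify("see report attached"))  # 1 0
```

The parameter adjustment on each misclassified example is exactly the comparison between prediction and label that the text describes.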

5.4.2 Unsupervised Learning

Unsupervised learning involves learning from data that is not labeled or classified.
Unlike supervised learning, which trains on labeled data, unsupervised learning
is designed to identify patterns, relationships, or structures in datasets without
predefined labels or categories. It is used to discover unknown groups, underlying
structures, or distributions and patterns in the given data.
Like supervised learning, unsupervised learning starts with data collection and
preprocessing. However, there are no output categories or labels for the input data.
Depending on the task, such as clustering, dimension reduction, or association anal-
ysis, an appropriate unsupervised learning algorithm is selected. The model discovers
patterns or structures in the dataset during training without comparisons to known
outputs, aiming to explore the data itself. The process involves adjusting model
parameters and applying the algorithm repeatedly to better understand the data struc-
ture. Once unsupervised learning outcomes are generated, data experts must interpret
the identified clusters, patterns, or relationships.
Customer segmentation in marketing is an example of unsupervised learning,
where customers are segmented into various groups based on their purchasing
behavior without predefined categories. Data on customers’ purchase history, demo-
graphics, and browsing behavior is collected, and a clustering algorithm like k-
means is selected for the segmentation task. K-means clustering is an unsupervised

4 Regression problems involve predicting through variables that are real numbers, where the predic-
tions are continuous real numbers. In contrast, classification problems target cases where the subject
is not a continuous real number but a unique value or categorical variable. For example, predicting
tomorrow’s temperature is a regression problem, while distinguishing between dogs and cats is a
classification problem.
learning method that minimizes the variance of distances between the points of a
cluster and its centroid. By applying k-means, clusters of customers with similar
purchasing behaviors are identified. Analyzing the classified clusters helps under-
stand different customer groups (e.g., frequent buyers, occasional shoppers, high-
value item purchasers). Insights from this learning can be used for tailored marketing
strategies, personalized recommendations, and individualized customer service. In
this example, the unsupervised learning algorithm uncovers hidden patterns in
customer behavior, providing valuable information for business strategy. This exem-
plifies unsupervised learning’s ability to find structure in data that is not explicitly
labeled or classified.
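The k-means procedure described above can be sketched on one-dimensional toy data; the "monthly spend" figures below are hypothetical, representing occasional shoppers (around 20) and frequent buyers (around 100).

```python
import random

random.seed(1)

def kmeans(points, k, iters=20):
    """Plain k-means on 1-D points: assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    centroids = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
            clusters[i].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Hypothetical monthly spend for ten customers, with no labels attached.
spend = [18, 22, 19, 25, 21, 95, 102, 98, 110, 99]
print(kmeans(spend, k=2))   # two centroids, near 21 and near 101
```

No labels are ever given: the algorithm discovers the two customer groups purely from the structure of the data, which is the defining trait of unsupervised learning.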

5.4.3 Semi-supervised Learning

Semi-supervised learning is a machine learning method that sits between supervised
and unsupervised learning. It utilizes both labeled and unlabeled data for training,
typically employing a small amount of labeled data and a large amount of unlabeled
data. This approach is particularly useful when labeling data is expensive or time-
consuming. The goal of semi-supervised learning is to build a model that can learn
from both the labeled and unlabeled portions of the data. Labeled data is used as an
initial guide for classifying or predicting outcomes, while unlabeled data is used to
identify hidden structures or patterns to enhance the learning process.
Semi-supervised learning starts with collecting a dataset that includes both labeled
and unlabeled data. The model training begins with the labeled data, which could be
a standard supervised learning algorithm. Then, this initial model is applied to the
unlabeled data to make predictions or generate pseudo-labels. In the final training
phase, the model is retrained on a combination of labeled data and the newly generated
pseudo-labeled (originally unlabeled) data. Through this expanded training set, the
model learns more as it repeats the process. Finally, the model’s performance is
evaluated, and parameters are fine-tuned as necessary. Evaluation can be done using
a separate validation set or through cross-validation.
We consider the example of classifying web pages into categories such as sports,
news, technology, etc. Initially, a small amount of labeled web pages and a large
amount of unlabeled web pages are collected. A classification model, such as a deci-
sion tree or neural network, is selected and trained using the labeled dataset. The
initial model is then used to assign categories to the unlabeled web pages, gener-
ating pseudo-labels and adding them to the training dataset. The model is iteratively
retrained on this expanded dataset, adjusting parameters as needed. Once the model
is sufficiently trained, it can be used to automatically classify new web pages. This
example illustrates the use of a large amount of unlabeled data to supplement the
learning process when labeled data is scarce or difficult to obtain, a typical case for
semi-supervised learning.
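The pseudo-labeling loop described above can be sketched with a toy one-dimensional classifier whose "model" is simply a threshold placed midway between the two class means. All data values here are illustrative.

```python
# Self-training sketch: fit on a few labeled points, pseudo-label the
# unlabeled points with that initial model, then refit on the combined set.

def fit_threshold(labeled):
    """Place the decision threshold midway between the two class means."""
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2

labeled = [(1.0, 0), (2.0, 0), (8.0, 1)]          # small labeled set
unlabeled = [0.5, 1.5, 2.5, 6.5, 7.5, 9.0, 9.5]   # larger unlabeled set

t0 = fit_threshold(labeled)                        # initial supervised model
pseudo = [(x, int(x > t0)) for x in unlabeled]     # generate pseudo-labels
t1 = fit_threshold(labeled + pseudo)               # retrain on the expanded set

print(t0, t1)
```

The retrained threshold `t1` is now informed by ten points instead of three, which is the whole point of exploiting cheap unlabeled data.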
5.4.4 Reinforcement Learning

Reinforcement learning (RL) is a type of machine learning where an agent learns to
make decisions by taking actions in a given environment and adjusting those actions
based on the received rewards. The environment can be a physical space, a computer
game, financial markets, simulations, and others. Unlike supervised learning, which
learns from labeled data, and unsupervised learning, which finds patterns in data,
reinforcement learning learns from the evaluation of actions’ outcomes. That is, it
observes the environment while taking actions, and learns what actions are most
desirable under certain circumstances based on feedback in the form of rewards or
penalties.
An RL agent interacts with its environment by observing the state of the environ-
ment and deciding what action to take based on those observations. The strategy used
to decide on an action is called a policy, which can be deterministic or stochastic.
When the agent takes an action, the environment responds by presenting a new state
and a reward. This reward serves as a feedback signal to the agent, who then adjusts its
policy. Through this process, the agent learns and optimizes the policy to maximize
cumulative rewards. Reinforcement learning involves experimenting with various
actions and learning from successes and failures through trial and error.
As an example of reinforcement learning, we consider the training of a robot
to navigate a maze. The goal is to train the robot to find a specific location in the
maze, with the maze serving as the environment. The robot checks its current posi-
tion in the maze and decides to move in a certain direction. If the movement brings
the robot closer to the goal, it receives a positive reward; if it moves further away
or hits a wall, it receives a negative reward. Through repeated trial and error, the
robot learns a policy that maximizes cumulative rewards, eventually learning the
optimal path through the maze. This example of training a robot exemplifies rein-
forcement learning, where actions are taken in a given environment and modified
based on feedback rewards. Reinforcement learning is effective not only in robotics
but also in gaming, autonomous vehicles, and investment portfolio management,
where sequential decisions are crucial. Balancing exploration of new actions and
exploitation of known strategies is key in reinforcement learning.
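A minimal sketch of this idea follows, using tabular Q-learning (one standard RL algorithm) on a corridor-shaped "maze" of five states. The reward values and hyperparameters are illustrative; the epsilon parameter implements the exploration–exploitation balance mentioned above.

```python
import random

random.seed(0)

# Tabular Q-learning on a tiny corridor: states 0..4, goal at state 4.
# Reaching the goal earns +1; every other step costs a small penalty.
N, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N)]      # Q[state][action]; 0 = left, 1 = right
alpha, gamma, eps = 0.5, 0.9, 0.2       # learning rate, discount, exploration

for _ in range(300):                    # episodes of trial and error
    s = 0
    while s != GOAL:
        # epsilon-greedy: explore a random action or exploit the best known one
        a = random.randrange(2) if random.random() < eps else \
            (0 if Q[s][0] > Q[s][1] else 1)
        s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
        r = 1.0 if s2 == GOAL else -0.01          # reward or penalty feedback
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])  # adjust policy
        s = s2

policy = ["left" if q[0] > q[1] else "right" for q in Q[:GOAL]]
print(policy)   # the learned policy should point right, toward the goal
```

After enough episodes the cumulative-reward estimates in `Q` make "right" the preferred action in every state, i.e. the agent has learned the optimal path through this one-dimensional maze.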

5.5 Neural Networks

Machine learning operates through computational models known as neural networks,
or artificial neural networks, inspired by the biological neural networks of the human
brain. Neural networks are designed to recognize patterns and solve complex prob-
lems in a manner similar to human cognition. Just as the human brain is composed
of interconnected neurons, a neural network consists of interconnected nodes. Each
neuron processes input and produces output; similarly, a node takes input data, applies
weights, sums them up, and produces an output through a mathematical function.
Typically, the structure of a neural network includes an input layer, several hidden
layers, and an output layer. Each layer consists of nodes, and the links connecting
layers carry parameters known as weights. The process of machine learning is essen-
tially about finding the optimal set of weights. In deep learning, or deep neural
networks, there are many hidden layers, resulting in a large number of weight param-
eters to learn. There are various types of neural networks, including convolutional
neural networks (CNNs), useful for image and video recognition, and recurrent neural
networks (RNNs), useful for language modeling and text generation, which have
evolved into transformer structures.
The learning process of a neural network involves forward propagation, back-
ward propagation, and repeated weight adjustments. Input data fed into the network
passes sequentially through each layer’s neurons to calculate output values. The
error between the output values and target values is propagated backward through
the layers, calculating each weight’s contribution to the error. By applying methods
such as gradient descent, weights are updated. This process repeats until the neural
network learns and ultimately finds the optimal weights, concluding the learning
phase.

5.5.1 Structure of Neural Networks

Designed to mimic the way the human brain processes information, the structure of
neural networks also emulates the biological neural network structure of the brain.
The architecture of a neural network is composed of layers of interconnected nodes,
corresponding to neurons, each taking input data, performing simple calculations,
and outputting the result. Typically, a neural network has a single input layer, multiple
hidden layers, and a single output layer connected in a sequence (refer to Fig. 5.2).
The input layer receives the input data. Each node in the input layer represents
one feature of the input data. For instance, in image recognition, each input node
represents the intensity of a pixel.

Fig. 5.2 Neural network architecture (input layer, hidden layers, output layer)
The hidden layers exist between the input and output layers, often in multiple
layers. The term “hidden” is used because, from the outside, only the input and
output layers are visible, while the intermediate layers are not. In the hidden layers,
actual computations are performed by applying weights to the input data. Each node
in these layers acts as a neuron responsible for these computations, producing a result
that is sent as output. The number of hidden layers and the number of neurons in
each layer are determined based on the complexity of the task the neural network
is designed to perform. For deep learning tasks, a neural network architecture with
many hidden layers is chosen.
The output layer produces the final output data of the network. The form of the
output varies depending on the task. For classification tasks, it outputs the probabil-
ities of various classes, and for regression tasks, which predict continuous values, it
outputs continuous values.
Each node, corresponding to a neuron, performs the following computation
process as the basic operational unit of the neural network (refer to Fig. 5.3). First, it
multiplies each input data by its corresponding weight and then sums them. This sum
is then added to a bias term, and the result is passed through an activation function
to produce the output. Mathematically, if the inputs are x1, x2, …, xn, the weights are
w1, w2, …, wn, the output is y, the bias is b, and the activation function is f(·), then
the relationship is y = f(w1x1 + w2x2 + ··· + wnxn + b). Adding the bias shifts the
activation point of the activation function, and the activation function shapes the output
into the desired form. This neuron model is also referred to as a perceptron, a name
originally given by Frank Rosenblatt.
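This per-neuron computation can be written directly in Python; the inputs, weights, and bias below are arbitrary illustrative values, and the sigmoid is used as the activation function.

```python
import math

# The single-neuron computation y = f(w1*x1 + ... + wn*xn + b),
# here with a sigmoid activation function.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b, f=sigmoid):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted sum plus bias
    return f(z)                                     # shaped by the activation

y = neuron(x=[1.0, 2.0, 3.0], w=[0.5, -0.25, 0.1], b=0.3)
print(round(y, 4))   # sigmoid of the weighted sum 0.6
```

Learning in a neural network amounts to adjusting the `w` and `b` values of every such neuron.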
The calculation above is for the output of a single neuron, and since each layer is
composed of multiple neurons (for example, m neurons), it results in multiple outputs.
Therefore, the output of each layer can be represented as a vector Y composed of m
components. Since the input signal can be represented as a vector X composed of n

Fig. 5.3 Operation in a node (neuron): inputs, weights, bias, activation function, output


components, the input–output relationship for a layer ultimately becomes Y = f(WX
+ B), where W is the m × n weight matrix and B is the bias vector. Thus, apart from
the activation function, each layer performs a linear transformation by the weight
matrix W.
In neural networks, the activation function is a crucial element that influences the
performance of the nodes. Commonly used activation functions include the sigmoid,
hyperbolic tangent (tanh), rectified linear unit (ReLU), and Gaussian error linear
unit (GELU). The sigmoid function outputs values between 0 and 1, forming an S-
shaped curve (with a midpoint at 0.5), making it useful for predicting probabilities.
The tanh function, similar to the sigmoid but with outputs between −1 and +1
(and a midpoint at 0), facilitates weight learning and models complex relationships
better than the sigmoid function.5 The ReLU function outputs the input directly
if it is positive and outputs zero for negative inputs, helping the model converge
faster. The GELU function is defined as x scaled by the cumulative distribution
function (CDF) of a standard normal distribution and is smoother and often performs
better than the ReLU function. In practice, the choice of activation function often
relies on experience. Various functions are experimented with to observe the model’s
performance, and different activation functions can be used within the same neural
network depending on specific requirements of the layers.
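The four activation functions named above can be written out as follows. GELU is given here via the exact standard-normal CDF using `math.erf`; many frameworks substitute a tanh-based approximation for speed.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))          # S-curve, output in (0, 1)

def tanh(x):
    return math.tanh(x)                        # S-curve, output in (-1, 1)

def relu(x):
    return x if x > 0 else 0.0                 # passes positives, zeroes negatives

def gelu(x):
    # x scaled by the standard-normal CDF: x * Phi(x)
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for f in (sigmoid, tanh, relu, gelu):
    print(f.__name__, round(f(-1.0), 3), round(f(1.0), 3))
```

Comparing the printed pairs shows the practical differences: sigmoid never outputs a negative value, ReLU discards negative inputs entirely, and GELU passes a small, smoothly attenuated negative value.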

5.5.2 Types of Neural Networks

Neural networks come in various types designed to suit the characteristics of the
tasks they perform. Common neural networks include feedforward neural networks,
recurrent neural networks, convolutional neural networks, long short-term memory
networks, variational autoencoder networks, and generative adversarial networks.
Each type has its strengths, so the choice depends on the nature of the task, the
characteristics of the input data, and the required output type.
Feedforward neural networks (FNNs) have a simple structure where connections
between nodes do not form feedback, allowing data to move in one direction from
the input layer, through hidden layers, to the output layer. They are typically used for
classification and regression tasks. The neural network structure depicted in Fig. 5.2
represents a typical feedforward neural network.
Recurrent neural networks (RNNs) are designed for sequential or time series
data, such as sequences or text, and employ a self-feedback mechanism.6 The RNN
structure allows each hidden layer to feed its output back as input, processing it
alongside new inputs in a recursive manner. Deep RNNs have this recursive struc-
ture across multiple layers. While RNNs can handle variable-length inputs, they

5 The formulas are f(x) = 1/(1 + e^(−x)) for the sigmoid function and f(x) = (e^x − e^(−x))/
(e^x + e^(−x)) for the tanh function, where ‘^’ denotes exponentiation.
6 J. J. Hopfield, “Neural Networks and Physical Systems with Emergent Collective Computational
Abilities,” Proceedings of the National Academy of Sciences, vol. 79(8), pp. 2554–2558, 1982.
struggle with long sequences. LSTM networks, designed to learn long-term depen-
dencies within sequences, overcome this limitation and are effective in applica-
tions requiring long-term context, such as machine translation and speech recogni-
tion. RNNs are commonly used for sequence analysis, speech recognition, language
modeling, natural language processing, translation, and text generation (For more
details on RNNs, refer to the next subsection).
Convolutional neural networks (CNNs) are designed to process grid-like data,
such as images.7 CNNs consist of convolutional layers, pooling layers, and fully
connected layers. Feedforward neural networks are modified by replacing hidden
layers with convolutional layers, followed by pooling layers, and ending with fully
connected layers. Convolutional layers apply convolutional filters to the input data
to extract features. Pooling layers reduce the spatial dimension of the input for the
next convolutional layer. Fully connected layers, located at the end of CNNs, use the
features extracted by the convolutional layers to determine the final output. CNNs are
widely used for image and video recognition, image classification, medical image
analysis, and natural language processing (For more details on CNNs, refer to the
next subsection).
Variational autoencoders (VAEs) are generative neural network models that use
unsupervised learning to generate new data similar to the training data.8 VAEs consist
of three main components: an encoder, a decoder, and a latent space between them.
The encoder compresses input data into a lower-dimensional representation (specifi-
cally, the means and variances of probability distributions) and passes it to the latent
space. The decoder then randomly samples from this distribution to regenerate the
input data. The VAE’s loss function comprises two parts: reconstruction loss and
regularization loss. Reconstruction loss measures how accurately the decoder can
reconstruct the input data from the latent representation, while regularization loss
ensures that the learned distribution remains close to the prior distribution, typically a
standard Gaussian. VAEs excel at learning complex data distributions and generating
new data similar to the original, making them useful for image and text generation
tasks.
Generative adversarial networks (GANs) consist of two competing networks: a
generator and a discriminator.9 The generator’s objective is to create data that is indis-
tinguishable from real data, while the discriminator’s role is to distinguish between
real and generated data. During training, the generator continuously improves its

7 Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel,
“Backpropagation Applied to Handwritten Zip Code Recognition,” Neural Computation, vol. 1,
issue 4, pp. 541–551, 1989.
8 Diederik P. Kingma and Max Welling, “Auto-Encoding Variational Bayes,” Proceedings of the
International Conference on Learning Representations (ICLR), 2014; and Danilo Jimenez Rezende,
Shakir Mohamed, and Daan Wierstra, “Stochastic Backpropagation and Approximate Inference in
Deep Generative Models,” Proceedings of Machine Learning Research, vol. 32, pp. 1278–1286, 2014.
9 Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil
Ozair, Aaron Courville, and Yoshua Bengio, “Generative Adversarial Nets,” Proceedings of the
Conference on Neural Information Processing Systems (NIPS 2014), December 2014.
ability to produce realistic data, while the discriminator becomes better at identi-
fying generated data. Training concludes when the discriminator can no longer reli-
ably distinguish generated data from real data. GANs are widely used for generating
and enhancing realistic, high-quality data and have also been applied in creative and
artistic fields.
Among various neural network types, VAEs and GANs are widely used as gener-
ative models.10 Generative models focus on creating new data that resembles the
training data. GANs involve a generator producing data and a discriminator eval-
uating it against real data until the generated data becomes indistinguishable from
real data. VAEs, on the other hand, learn by compressing input data into a lower-
dimensional latent representation using an encoder and then regenerating the data
from this representation through a decoder. More recently, specialized neural network
models for generative tasks, including transformer architectures, have emerged.
While VAEs are effective for generating complex high-dimensional data and GANs
excel in image generation and editing, transformers are particularly powerful for
processing sequential data and for tasks such as natural language understanding
and generation. They have also demonstrated impressive performance in image
processing and generation (For more details on transformers, see Sect. 5.6).

5.5.3 Recurrence and Convolutional Neural Networks

Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) play
pivotal roles in showcasing the diversity and capabilities of neural network models.
RNNs are designed to process sequential data, while CNNs are optimized for grid-like
data, such as images. Their unique characteristics make them suitable for different
types of applications: RNNs are well-suited for tasks involving long-term depen-
dencies in sequential data, whereas CNNs are highly efficient in real-time computer
vision tasks. RNNs and CNNs serve as foundational models when developing new
neural network architectures and optimization techniques, enhancing performance
and efficiency for specific tasks.
RNNs and CNNs also form the foundation for constructing advanced transformer-
based AI systems like BERT and GPT. Research aimed at improving neural network
models often emphasizes the importance of RNN, CNN, and transformer architec-
tures. A growing trend involves combining RNNs and CNNs with transformers in
hybrid models, where CNNs efficiently extract features from images and videos,
followed by transformers, which process and interpret large-scale data. This hybrid
approach leverages the strengths of both CNNs and transformers for more powerful
AI systems.

10 Restricted Boltzmann Machines (RBMs) also operate as generative models, consisting of a visible
layer and a hidden layer, with a simpler structure than VAEs or GANs. They are used for dimen-
sionality reduction, classification, regression, collaborative filtering, feature learning, and topic
modeling.
• Recurrent Neural Networks (RNN):


RNNs are artificial neural networks designed to recognize patterns in data sequences,
such as text, genomes, handwriting, as well as numeric time series data like sensor
readings or stock prices. A key feature of RNNs is their ability to store information
about previously processed data or internal states and use them along with new inputs.
This functionality is enabled by feedback loops within the network, allowing data
to be looped back, making RNNs particularly useful for processing sequence data
where the order and context are important.
The structure of an RNN consists of an input layer, hidden layers, and an output
layer. The input layer receives sequence input data and outputs results, while the
output layer takes inputs from the hidden layers to generate the final output. However,
hidden layers use not only the output from the input layer but also outputs and the
states from previous steps as inputs. This means the outputs of hidden layers are
fed back into the same layers as inputs, contrasting with FNNs or CNNs where data
only moves forward. The feedback loop in RNNs allows for the storage and reuse of
the outputs and states from the previous sequence data during the processing of new
data.11
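A single recurrent step can be sketched with scalar weights; real RNNs use weight matrices, and the weights and inputs here are illustrative values. The same weights are reused at every step, and the hidden state carries context from earlier inputs forward.

```python
import math

# One recurrent step: the hidden state h is updated from the new input x and
# the previous hidden state, h_t = tanh(w_x * x_t + w_h * h_{t-1} + b).

def rnn_step(x, h_prev, w_x=0.8, w_h=0.5, b=0.0):
    return math.tanh(w_x * x + w_h * h_prev + b)

sequence = [1.0, 0.0, -1.0, 0.5]
h = 0.0                                 # initial hidden state
for x in sequence:                      # the same weights are applied each step
    h = rnn_step(x, h)
print(round(h, 4))                      # final state summarizes the sequence
```

Because each `h` depends on the previous one, reordering the sequence changes the final state, which is exactly why RNNs suit order-sensitive data.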
RNNs are applied in a variety of fields including natural language processing,
sequence prediction, sequence generation, and video analysis. They are used for
language modeling, machine translation, speech recognition, and text generation by
capturing the sequential nature of language to understand and generate text. RNNs are
also employed in predicting future values based on past data, such as in finance and
weather forecasting, capturing trends and patterns over time. In addition, RNNs have
been utilized in music composition, learning note patterns to generate data sequences
for creating new compositions. For video analysis, RNNs analyze temporal dynamics
to identify time-based patterns, useful for tasks like video classification.
RNNs are capable of handling tasks where the lengths of inputs and outputs
vary, such as machine translation. By applying the same weights to input data across
different time steps, RNNs can efficiently learn sequence patterns. However, they face
limitations when processing very long sequences due to the vanishing or exploding
gradient problem during training. These issues, common in deep neural networks,
occur when gradients either become too small (vanishing) or too large (exploding),
which slows down learning or causes divergence. To address these limitations, vari-
ants like long short-term memory (LSTM) networks and gated recurrent unit (GRU)
models were developed. LSTMs introduce memory cells with gates that allow the
network to easily learn and maintain long-term dependencies. GRUs offer a simpler
design with similar functionality to LSTMs, making them a more computationally
efficient alternative. Ultimately, the challenge of learning long-term dependencies
can be more effectively solved by using self-attention mechanisms, which enable the

11 The cyclical nature of RNNs implies that the process of output being fed back and stored to be
used with incoming inputs in the next time-step repeats indefinitely within the same layer. If this
process is unfolded over time, it resembles a feedforward structure with an infinitely large number
of layers. This is analogous to the infinite impulse response (IIR) filter in signal processing, in
contrast to the finite impulse response (FIR) filter corresponding to FNNs.
model to focus on different parts of the input sequence when generating each output.
(For more details on self-attention mechanisms, see Sect. 5.6.2).
• Convolutional Neural Networks (CNNs)

CNNs are designed for processing grid-structured data, particularly suitable for visual
image analysis. They have achieved significant success in tasks such as image recog-
nition, image classification, and object detection. The distinguishing feature of CNNs
is their ability to capture the spatial hierarchical structure of features within images,
inspired by the organization of the animal visual cortex. This design enables CNNs to
automatically and adaptively learn the spatial hierarchical features of input images.12
Unlike traditional neural networks, which fully connect each input to all neurons,
CNNs apply convolutional filters to small, localized regions of the input, reducing
the number of parameters. This approach allows the network to focus on local
spatial consistency and effectively recognize visual patterns in images with minimal
preprocessing.
CNNs typically consist of multiple layers that transform the input image to output
features present in the image. These layers include convolutional layers, activation
layers, pooling (or down-sampling) layers, and fully connected layers. Convolutional
layers apply multiple convolutional filters to the input data to extract features, sliding
each filter across the input data and computing the dot product.13 Activation layers,
typically featuring the ReLU function, add nonlinearity to the network, enabling it
to learn complex patterns. Pooling layers sift through the extracted feature data to
reduce the amount of data for the next convolutional layer, thereby reducing the spatial
dimensions. Fully connected layers use the features extracted by the convolutional
layers to determine the final output.
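The convolution and pooling operations just described can be sketched in a few lines of NumPy (a simplified illustration with a single filter, stride 1, and no padding; the image and filter values are made up for demonstration):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the filter over the image, taking the dot product at each
    position ('valid' convolution, stride 1, no padding)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Down-sample by keeping the maximum of each size-by-size block."""
    h, w = fmap.shape
    fmap = fmap[:h - h % size, :w - w % size]
    return fmap.reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half (a vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0
edge_filter = np.array([[1.0, -1.0], [1.0, -1.0]])
features = conv2d(image, edge_filter)   # strong (negative) response at the edge
pooled = max_pool(features)             # reduced spatial dimensions
```

Note how the same small filter is reused at every position, which is what keeps the parameter count low compared to a fully connected layer.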
The architecture of CNNs leverages the 2D structure of input images, processing
the images in a hierarchical manner across layers. For instance, the first layer might
extract edges, the next layer patterns, and subsequent layers higher-level features
such as objects or faces, enabling multi-level processing. This hierarchical approach
allows CNNs to extract complex features from simple data.
Applications of CNNs are diverse, including image and video recognition,
image classification, object detection, face recognition, medical image analysis, and
autonomous vehicles. CNNs can accurately identify objects, places, and people in
images or videos, categorize images based on visual content, detect specific classes of

12 Hongping Fu, Zhendong Niu, Chunxia Zhang, Jing Ma and Jie Chen, “Visual cortex inspired CNN model for feature construction in text analysis,” Frontiers in Computational Neuroscience, vol. 10, pp. 1–10, July 2016.
13 The term “convolution” is widely used in circuit theory and signal processing, representing the output signal y obtained by passing the input signal x through a filter h, expressed as y = x ∗ h and read as “x convolution h”. Here, h represents the filter’s impulse response function, and the convolution is calculated by flipping h across the time axis, overlaying it on the input signal x, and calculating the overlapping area. This process is repeated by moving h along the time axis to obtain the output function y, mathematically expressed as y(t) = ∫x(τ)h(t − τ)dτ. In case the input and output signals are digital, the dot product y(n) = Σ_m x(m)h(n − m) is applied instead of integration.
5.5 Neural Networks 123

objects in digital images and videos, analyze medical images to assist in disease diag-
nosis, and help autonomous vehicles recognize traffic signs, pedestrians, and other
vehicles for safe navigation. The ability to learn and recognize patterns in visual data
makes CNNs a core element of contemporary AI systems requiring visual under-
standing. Beyond image and video processing, CNNs can also be applied to tasks in
natural language processing and time series analysis.
CNNs offer high computational and training efficiency by using small receptive
fields, which reduce the number of parameters and allow for scalability to large
images and complex datasets. Their suitability for parallel processing makes them
well-optimized for hardware like GPUs, and they can automatically learn hierarchical
features from data. However, CNNs also have several limitations, including the need
for large amounts of labeled data, high computational demands, long training times,
and susceptibility to overfitting, especially on small datasets. In addition, CNNs can
inherit biases and fairness issues from the training data, struggle to generalize to
cases not covered in the training set, and are sensitive to input variations. They are
also vulnerable to adversarial attacks, where small changes in input data can lead
to incorrect predictions. Addressing these challenges requires advances in model
architecture, training methods, and more efficient computational strategies.
When comparing RNNs and CNNs to other neural network architectures, fully
connected networks are versatile but inefficient for tasks that require recognizing
temporal or spatial patterns, as they lack the specialized recurrent or convolutional
layers. VAEs are designed for unsupervised tasks like dimensionality reduction and
feature learning, capturing complex data distributions but struggling with temporal
or spatial data. GANs are effective for generating new data but are not as well-
suited as RNNs and CNNs for processing sequential and spatial data, respectively.
Transformer architectures surpass RNNs in sequential data processing, offering better
performance and parallelism, but they demand significant computational resources.
Despite the rise of newer structures like transformers, RNNs and CNNs remain
relevant for specialized applications due to their architecture, which is specifically
optimized for handling sequential and spatial data.

5.5.4 Learning Process of Neural Networks

The learning process of a neural network begins with the input data passing through
the network, from the input layer, through hidden layers, and finally to the output
layer, performing node (neuron) operations in a forward direction. The operational
process of each node in every layer involves multiplying each input data by its
corresponding weight, summing all these values, adding a bias, and then passing the
result through an activation function to produce an output. This output becomes the
input for the next layer, continuing the node operation process in a chain reaction
until the final output data is produced.
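The node operation described above (multiply each input by its weight, sum, add a bias, apply an activation) can be traced with a toy network; the layer sizes and values below are arbitrary illustrations:

```python
import numpy as np

def relu(x):
    """A common activation function: passes positives, zeroes out negatives."""
    return np.maximum(0.0, x)

def layer_forward(inputs, weights, biases):
    """One layer of node operations: weighted sum, plus bias, then activation."""
    return relu(weights @ inputs + biases)

# Forward pass through a tiny 3-2-1 network (arbitrary example values)
x = np.array([1.0, 2.0, 3.0])                      # input layer
W1 = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])  # hidden layer weights
b1 = np.array([0.1, -0.2])                         # hidden layer biases
h = layer_forward(x, W1, b1)    # hidden output, fed to the next layer

W2 = np.array([[1.0, -1.0]])
b2 = np.array([0.5])
y = layer_forward(h, W2, b2)    # final output
```

Each layer's output becomes the next layer's input, exactly the chain reaction described in the text.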
Once the final output data is obtained through the forward operation process, it is
compared with the target value to calculate and quantify the error. This quantification
process is performed by a function known as the cost function.14 The most widely
used cost functions are the mean square error (MSE) and the cross-entropy func-
tion. MSE, as the name suggests, is obtained by taking the square of the difference
between the calculated outputs and target values and then averaging it. The closer
the calculated values are to the target values, the closer the MSE converges to 0.
Cross-entropy is calculated by taking the logarithm of each predicted probability,
multiplying it by the corresponding true target value (usually a one-hot encoded
vector, with the correct class represented by 1 and the other classes by 0), summing
the results across all outputs, and then applying a negative sign. Cross-entropy is
primarily used in models that output probabilities, where the entropy decreases as
the calculated probability distribution gets closer to the target distribution. MSE is
commonly used for regression tasks, while cross-entropy is used for classification
tasks.
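The two cost functions can be written out directly; the predictions and targets below are hypothetical examples:

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean square error: average of the squared differences."""
    return np.mean((y_pred - y_true) ** 2)

def cross_entropy(p_pred, y_onehot):
    """Negative sum of the true targets times the log of the predicted
    probabilities."""
    return -np.sum(y_onehot * np.log(p_pred))

# Regression example: predictions near the targets give a small MSE
loss_mse = mse(np.array([1.1, 1.9, 3.0]), np.array([1.0, 2.0, 3.0]))

# Classification example: the correct class (index 1) is predicted with
# probability 0.7, so the loss is -log(0.7)
p = np.array([0.2, 0.7, 0.1])
onehot = np.array([0.0, 1.0, 0.0])
loss_ce = cross_entropy(p, onehot)
```

As the text notes, the MSE shrinks toward 0 as predictions approach the targets, and the cross-entropy shrinks as the predicted probability of the correct class approaches 1.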
After the cost function is computed, the next step is to determine how to adjust
each weight to minimize the cost function. This process can be divided into two
stages: first, finding the direction to change, i.e., the gradient; and second, making the
adjustment in that direction. The first process applies the backpropagation method,
and the second applies the gradient descent technique.
According to optimization theory, changes should be made in the opposite direc-
tion of the steepest gradient, and this direction’s gradient can be found by taking the
partial derivative of the cost function with respect to each weight. However, calcu-
lating the partial derivative for all weights in all layers is computationally excessive.
Therefore, applying the chain rule of differentiation layer by layer is more efficient.
The process starts by taking the partial derivative of the cost function with respect to
the weights of the last layer, the output layer, and then for the weights of the second-
last layer, and so on, moving backward (i.e., backpropagating) to the front input layer.
This method of calculating partial derivatives in a backward direction is known as
the backpropagation algorithm.15 It is necessary to store intermediate results layer
by layer during the forward operation to aid backpropagation calculations.
Once the gradient for each weight is calculated using the backpropagation method,
the gradient descent technique is applied to adjust the weights in the direction that
reduces the cost function the most, effectively descending along the calculated
gradient. If the cost function is J(w) and the gradient is ∇J(w), then the weight
w is adjusted in the opposite direction of the gradient by w(n + 1) = w(n) − α∇J(w(n)), where α represents the learning rate, a hyperparameter that controls the
size of the change. If this parameter is too large, there is a risk of overshooting the
optimum value, while too small a value can result in a slow convergence. To improve
the rate of convergence of the weights, methods that adjust the learning rate for each
weight based on the history of gradient changes are also used.
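The weight update w(n + 1) = w(n) − α∇J(w(n)) can be demonstrated on a one-dimensional cost function whose gradient is known in closed form (a toy sketch, not backpropagation through a real network):

```python
# Minimize the toy cost J(w) = (w - 3)^2, whose gradient is dJ/dw = 2(w - 3)
def grad_J(w):
    return 2.0 * (w - 3.0)

w = 0.0        # initial weight
alpha = 0.1    # learning rate (hyperparameter)
for _ in range(100):
    w = w - alpha * grad_J(w)   # step in the direction opposite the gradient
# w has converged very close to the optimum at w = 3
```

With a much larger α the iteration would overshoot or diverge, and with a much smaller one it would converge slowly, illustrating the trade-off described above.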

14 The objective function in an optimization process corresponds to the cost (or loss) function in
the neural network learning process.
15 David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, “Learning Representations by Back-propagating Errors”, Nature, vol. 323, pp. 533–536, October 1986.

5.5.5 Practical Considerations

The learning process of a neural network, involving the repeated application of
the backpropagation method and gradient descent, concludes when the gradient
converges to zero, indicating that the weights no longer change and the local optimal
value has been reached. However, because gradient descent requires the repetition of
partial derivative calculations for all data, it involves a significant amount of compu-
tation and time. Such computational burden can be alleviated by adopting stochastic
gradient descent (SGD), which, inspired by how the human brain learns gradually from small amounts of data, selects a single data point at random to calculate the gradient. This method does not follow the steepest descent precisely, resulting in a zigzagging path, but it can reach the optimal point with much less computation.
Mini-batch gradient descent, which calculates the gradient with a subset of the data
called a mini-batch, strikes a balance between the stability of gradient descent and
the efficiency of stochastic gradient descent. A more advanced method is adaptive
moment estimation (Adam), which improves convergence speed and maintains robust
performance by adaptively adjusting the learning rate using the mean of the gradients
and the mean of the squared gradients.
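A minimal sketch of mini-batch gradient descent, fitting a one-parameter model to synthetic data (the dataset, learning rate, and batch size are illustrative choices):

```python
import numpy as np

# Fit y = w*x by mini-batch gradient descent on synthetic data
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 2.0 * x                        # data generated with true weight w = 2

w, alpha, batch_size = 0.0, 0.1, 16
for epoch in range(50):
    idx = rng.permutation(len(x))              # reshuffle each epoch
    for start in range(0, len(x), batch_size):
        b = idx[start:start + batch_size]      # one mini-batch
        grad = np.mean(2.0 * (w * x[b] - y[b]) * x[b])  # dJ/dw on the batch
        w -= alpha * grad                      # gradient descent step
# w is now very close to the true value 2.0
```

Each update uses only a small subset of the data, which is the balance between full-batch stability and single-sample SGD efficiency described above.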
In the learning process of neural networks, there are other parameters to be deter-
mined alongside the gradient. The learning rate is a prime example, along with
the duration of training, the size of the neural network (i.e., the number of layers
and nodes), batch size, etc. These parameters, decided by the human designer, are
called hyperparameters, in contrast to the weight parameters determined by the neural
network model. The preprocessing of original data for use by the neural network and
the final assessment of success after learning also involve human intervention.
Improper arrangement of the number of layers and nodes in the learning process
can lead to underfitting and overfitting issues. Underfitting occurs when there are
insufficient layers or nodes to capture the basic patterns in the data. Conversely,
overfitting happens when there are too many layers or nodes, capturing even the noise
in the data, which degrades performance on new data. If underfitting or overfitting
is observed, it is necessary to adjust the size of the neural network and the quantity
and quality of the dataset appropriately.
Among the techniques used to prevent overfitting in neural network training is
regularization. Overfitting occurs when a neural network model learns not only the
underlying patterns in the training data but also the noise within it, leading to degraded
performance on new, unseen data. To prevent this, regularization techniques can be
applied to encourage simpler and more robust learning by adding a penalty to the loss
function during training, which limits the model’s complexity.16 In addition, other
methods such as dropout, which randomly deactivates some of the nodes, reducing
the effective size of the neural network, or early stopping, which halts training at an appropriate point if the trend of convergence is no longer smooth, can also be employed.

16 Representative regularization techniques include L1 regularization and L2 regularization. L1 regularization adds the sum of the absolute values of the weights to the loss function, leading some weights to be exactly zero, while L2 regularization adds the sum of the squares of the weights to the loss function, suppressing the occurrence of large weights.
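The idea of adding a penalty to the loss function can be sketched for L2 regularization (hypothetical values; lam is the regularization strength):

```python
import numpy as np

def loss_with_l2(y_pred, y_true, weights, lam=0.01):
    """Data loss (MSE) plus an L2 penalty on the weights; lam controls
    how strongly large weights are suppressed."""
    data_loss = np.mean((y_pred - y_true) ** 2)
    penalty = lam * np.sum(weights ** 2)
    return data_loss + penalty

# Two hypothetical models that fit the data equally well
y_pred = np.array([1.0, 2.0])
y_true = np.array([1.0, 2.0])
loss_small = loss_with_l2(y_pred, y_true, np.array([0.1, -0.2]))
loss_large = loss_with_l2(y_pred, y_true, np.array([3.0, -4.0]))
# The regularized loss prefers the model with the smaller weights
```

Because both models fit the data identically, the penalty term alone decides between them, steering training toward the simpler model.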
After training concludes, the model’s performance is evaluated using separate
validation and test datasets that were not used in training. Various evaluation metrics
related to training, such as accuracy, precision, recall, and mean squared error,
are measured to assess the learning effectiveness. The performance evaluation of
a neural network model involves comparing it with a baseline model or existing
models to assess quality and generalization ability. Depending on the evaluation
results, adjustments to hyperparameters or retraining may be necessary.
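The basic evaluation metrics mentioned above can be computed as follows for a hypothetical binary classification result:

```python
import numpy as np

def accuracy(pred, true):
    """Fraction of all predictions that match the true labels."""
    return np.mean(pred == true)

def precision(pred, true):
    """Of the items predicted positive, the fraction that really are."""
    tp = np.sum((pred == 1) & (true == 1))
    return tp / np.sum(pred == 1)

def recall(pred, true):
    """Of the items that really are positive, the fraction found."""
    tp = np.sum((pred == 1) & (true == 1))
    return tp / np.sum(true == 1)

# A hypothetical result on a held-out test set (1 = positive class)
true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
pred = np.array([1, 1, 0, 1, 0, 0, 0, 0])
acc = accuracy(pred, true)
prec = precision(pred, true)
rec = recall(pred, true)
```

In practice these are computed on the validation and test sets, never on the training data, so that they reflect generalization rather than memorization.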

5.5.6 Deep Learning, Deep Neural Networks

Deep learning (DL) is a machine learning technique particularly effective for
processing large datasets and solving complex problems.17 Deep learning models
automatically and hierarchically learn data representations through neural networks
with multiple layers. As the number of layers increases, the model can transform
the input data into progressively more abstract and complex representations. The
term “deep” refers to the depth of the network, meaning the number of layers. While
traditional neural networks may consist of only a few layers, deep neural networks
can have many layers—sometimes even hundreds—allowing them to learn more
intricate patterns and representations. This makes deep learning widely applicable in
areas such as computer vision, natural language processing, and speech recognition.
The deep learning process begins with preprocessing the input data, such as
resizing images or tokenizing text. After preprocessing, an appropriate deep neural
network architecture is selected based on the task at hand. For example, RNNs
are chosen for sequence analysis, while CNNs are preferred for image processing.
During this step, decisions are made regarding the number and type of layers (e.g.,
convolutional, pooling, fully connected), along with hyperparameters such as the
learning rate. The network is then trained using the input dataset, with parameters
adjusted through backpropagation and gradient descent to minimize error. As training
progresses, data flows through the network layer by layer, with each layer refining
the data before passing it to the next. Throughout the learning process, hyperpa-
rameters like learning rate, number of layers, and layer size are fine-tuned, and the
model’s performance is evaluated using a validation dataset. Once the model is fully
trained and validated, it can be deployed for real-world tasks such as data prediction
or analysis.

17 Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, “A Fast Learning Algorithm for Deep Belief Nets,” Neural Computation, vol. 18, pp. 1527–1554, 2006.

5.6 Transformer Architecture

The transformer architecture is particularly well-suited for processing sequential
data, such as text, and has gained significant attention for its ability to handle long
sequences while capturing contextual information effectively. Before transformers,
models like RNNs and CNNs were widely used, but each had limitations. RNNs,
and their improved version LSTM networks, process data sequentially, relying on the
outputs of previous steps. This dependence restricts parallel processing and makes
it difficult to capture long-range dependencies. CNNs, while excellent at extracting
local spatial features, are not inherently designed for sequential data with long-
distance dependencies, as their fixed-size filters limit their ability to capture broader
context across a sequence. These limitations prompted the development of the trans-
former architecture, which is more effective at processing sequential data by lever-
aging self-attention mechanisms to capture relationships across the entire sequence
in parallel.
The transformer architecture, first proposed by Vaswani et al. in 2017, claimed,
“The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through an attention mech-
anism. We propose a new simple network architecture, the Transformer, based solely
on attention mechanisms, dispensing with recurrence and convolutions entirely.”18
The attention mechanism can be applied between different sequences or different
parts of the same sequence, with self-attention being a special form that focuses
within the same sequence, significantly enhancing the ability to understand and
represent relationships within the data.
The transformer architecture relies primarily on the self-attention mechanism,
which allows it to process all words in a sentence simultaneously while considering
the full context, regardless of the distance between words. This enables parallel
processing, unlike RNNs and LSTMs, which process data sequentially, or CNNs,
which rely on convolutional operations. The transformer’s approach is particularly
effective for handling long sequences, as it avoids the limitations of sequential or
localized data processing.
The transformer architecture has brought revolutionary progress to natural
language processing and has shown excellent performance in image processing.
Transformers have achieved great success in various natural language processing
tasks, such as language translation, text summarization, and question-answering
systems, and have also performed well in other tasks, including image recognition
and generation. Its ability to handle long-distance dependencies and efficiency in
parallel processing has been pivotal in AI advancements, laying the foundation for
several subsequent models like BERT and GPT.

18 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, “Attention Is All You Need”, Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), 2017.

5.6.1 Transformers Architecture

The transformer architecture processes input sequences not continuously but by
cutting them into smaller tokens of words or characters. These tokens, as inputs,
make the transformer operate as shown in Fig. 5.4, used by Vaswani et al. when they
first proposed the transformer architecture. The “input embedding” block transforms
incoming tokens into numerical vector representations, which are then processed in
vector form by the rest of the blocks. Since the transformer cuts the sequence into
tokens, it is essential to carry positional information along with the tokens to later
reconstruct the sequence from the tokens. The “positional encoder” encodes this
necessary positional information. The transformer architecture consists of several
layers of encoders–decoders. “N×” next to the encoder-decoder blocks indicates
that those blocks repeat N times sequentially.19 Each layer’s encoder is composed of
multi-head attention and feedforward blocks, and the decoder is composed of masked
multi-head attention, multi-head attention, and feedforward blocks. Encoders map
the input sequence into a continuous vector representation for processing, which
is input into the decoder’s multi-head attention. Decoders represent the output
sequence as continuous vectors, processing them along with the input sequence’s
vector representations to generate output probabilities.
Figure 5.4 shows one layer of the many layers of the transformer architecture, with
the left half being the encoder and the right half being the decoder. In the encoder, the
input sequence tokens are converted into vector representations by the input embed-
ding block, and the encoded positional information of each token is also inputted.
The multi-head attention block allows each token to attend to the relationships with
other tokens in the input sequence (detailed information about attention is referenced
in the following section). The feedforward block is a unidirectional neural network
applied to the vector representation of tokens. After multi-head attention and feedfor-
ward processing, each output goes through a process of addition with the input and
normalization (Add and Norm).20 The output vector representation, after completing
the encoder process, is inputted into the multi-head attention of the decoder.
In the decoder, the tokens of the output sequence (‘Outputs’) are first transformed
into vector representations by the ‘output embedding’ block, and positional encod-
ings are added to these embeddings to inject information about the position of each
token in the sequence.21 The masked multi-head attention block prevents the decoder

19 In the case of the GPT-3 model, the number of layers, N, is known to be 32–96.
20 “Add” is the process that facilitates the efficient flow of information across deep neural networks
by adding the input to the output of the layer’s transformation, mitigating the vanishing gradient
problem. This ensures that the model can remain stable and learn effectively as its depth increases.
On the other hand, “norm” involves normalizing the layer’s output to stabilize the learning process
and promote faster convergence. This is achieved by computing and adjusting the mean to 0 and
variance to 1 for each output, playing a crucial role in improving the model’s performance and
generalization capabilities.
21 In the transformer architecture, the label “outputs” in the decoder’s input refers to the target sequence (such as a partially translated sentence in a translation task). The encoder processes the input sequence (e.g., the original sentence in the source language), and the decoder generates the corresponding output (e.g., the translated sentence) based on both the encoded input and the previously generated tokens in the target sequence.
[Figure: the transformer architecture. Inputs pass through input embedding and positional encoding into a stack of N encoder layers, each containing a multi-head attention block and a feed forward block, each followed by add & norm. Outputs (shifted right) pass through output embedding and positional encoding into a stack of N decoder layers, each containing a masked multi-head attention block, a multi-head attention block over the encoder output, and a feed forward block, each followed by add & norm. The decoder output passes through a linear block and a softmax block to produce the output probabilities.]

Fig. 5.4 Transformer structure. Source Vaswani (2017)

from attending to future tokens, ensuring that the model can only use information
from previous tokens when predicting the next token. The subsequent attention block,
known as the cross-attention block, allows the decoder to focus on the relevant parts
of the input sequence by attending to the encoder’s output representations. The feed-
forward block and the addition and normalization block function similarly to those in
the encoder. The annotation ‘shifted right’ on the decoder’s input ‘outputs’ indicates
that the sequence is shifted one position to the right, ensuring that the prediction
for each token depends only on the previous tokens, not on future tokens. This is
essential for training the transformer to predict the next word in a sequence without
access to the subsequent (unknown) words.
The vector representation coming out of the decoder finally passes through the
‘linear’ and ‘softmax’ blocks. The ‘linear’ block serves as the output stage’s fully
connected feedforward neural network (FNN) layer, mapping the decoder’s output
to the dimension matching the token size. The ‘softmax’ block converts the output of
the ‘linear’ block into probabilities as the final operation process of the transformer.
The softmax function ensures that all output values are between 0 and 1 and that their
sum equals 1, forming a probability distribution. The values generated by the softmax
represent the transformer model’s estimation of the probability that a particular token
will be the next token in the sequence.
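The softmax computation can be sketched as follows (the scores are hypothetical; subtracting the maximum before exponentiating is a common numerical-stability step, not part of the mathematical definition):

```python
import numpy as np

def softmax(scores):
    """Map raw scores to a probability distribution. Subtracting the
    maximum first avoids overflow and does not change the result."""
    e = np.exp(scores - np.max(scores))
    return e / np.sum(e)

# Hypothetical linear-block outputs over a 4-token vocabulary
scores = np.array([2.0, 1.0, 0.1, -1.0])
probs = softmax(scores)
# All values lie in (0, 1), they sum to 1, and the largest score gets
# the largest probability of being the next token
```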

5.6.2 Self-attention Mechanism

The self-attention mechanism stands as a pivotal innovation within the transformer
model, enabling it to dynamically assess the relevance and relationship of each
segment within a data sequence to every other segment. This mechanism ensures
that every position in the sequence has the capacity to ‘attend’, or pay focused
attention, to all positions simultaneously, thereby evaluating the interconnections
within the sequence itself. The term “self” underscores the mechanism’s introspective
approach, focusing on internal relationships within the same sequence. Beyond basic
self-attention, the model incorporates multi-head attention, a sophisticated extension
that allows the model to explore and capture relationships across different represen-
tational subspaces in parallel. This means that instead of deriving a single set of
feature vectors through self-attention, multi-head attention enables the extraction of
multiple sets of feature vectors, each representing distinct aspects or types of rela-
tionships within the data. This multiplicity enriches the model’s understanding and
processing of the sequence by capturing a broader spectrum of relational nuances,
significantly enhancing its analytical depth and flexibility.
In the transformer architecture shown in Fig. 5.4, the term “multi” in multi-head
attention signifies that the transformer model conducts several independent self-
attention processes—or “heads”—in parallel. For instance, consider the sentence
“He was interested in the bats in the cave.” Here, one head might recognize “bats”
within the context of animals living in a cave, another might consider “bats” as sports
equipment, potentially misinterpreting the context, and another head might focus on
the emotional or exploratory tone conveyed by “was interested in.” In fact, this is an
oversimplified example intended to visualize the multi-head function. In reality, each head captures a diverse set of features, and these features take a far more abstract form than we can recognize. Each head executes the self-attention process
utilizing distinct sets of weights, and these multiple heads operate in parallel to
analyze various facets of the input sequence from different representational perspec-
tives.22 Multi-head attention synthesizes insights from several heads, facilitating a
comprehensive and nuanced comprehension of the input sequence. This approach
significantly enhances the model’s ability to discern and represent complex interre-
lations within the data. In contrast, single-head attention recognizes diverse types of relationships or features using one head, which may be much larger in size than a head in the multi-head attention.

22 In the case of GPT-3, the number of heads in the masked multi-head attention block is known to be 12–96.
The self-attention mechanism operates as follows: First, the input sequence is
tokenized into individual tokens. These tokens are transformed into vectors through
an embedding process. In the multi-head attention block, the embedding vectors are
used to calculate three sets of vectors: query (Q), key (K), and value (V ). This is done
by applying three separate linear transformations to the input embedding vectors,
each using distinct weight matrices learned during training: W Q for the query, W K
for the key, and W V for the value vectors. Initially, these three weight matrices are
randomly initialized and have no specific distinction, but during training, they learn
to focus on different aspects of the input data. As a result of these transformations, we
obtain the three sets of vectors—Q, K, and V —which are then used in the following
steps of the attention mechanism.
The self-attention mechanism computes attention scores (S) by taking the dot
product of each query vector (Q) with the corresponding key vector (K) from other
tokens. This measures the similarity or “compatibility” between tokens. These scores
are then scaled (usually by dividing by the square root of the dimensionality of the
key vectors) and normalized using the softmax function to produce attention weights
(A). These weights represent the relative importance of each token in relation to
others. Finally, the attention weights are used to compute the output at each token
position as the weighted sum of the value vectors (V ). This allows the model to focus
dynamically on the most relevant tokens when processing the sequence, capturing
both context and relationships between tokens. Figure 5.5 illustrates the multi-head
attention mechanism as described above.

[Figure: the multi-head attention mechanism. Input tokens (e.g., “I go to school”) pass through embedding and positional encoding; linear transforms with W_Q, W_K, and W_V produce the query (Q), key (K), and value (V) vectors; inner products give attention scores S = QK^T, scaled as S_scaled = S/√d_K; a softmax yields attention weights A = softmax(S_scaled); each head’s output is O = AV, and the head outputs are concatenated and combined as O_concat = [O_1 O_2 … O_n]W^O.]

Fig. 5.5 Illustration of the multi-head attention mechanism

The roles of the query (Q), key (K), and value (V ) vectors in the self-attention
mechanism can be summarized as follows: The query vector (Q) is generated for
each input token and is used to determine how much attention the model should pay
to other tokens in the sequence. Each token also has a corresponding Key vector
(K), which represents the features of the token that other tokens will attend to. The
attention scores are computed by taking the dot product between the query vector
of the current token and the Key vectors of all tokens, including itself. These scores
represent the relevance or compatibility between the current token and all others.
The attention scores are then normalized using the softmax function, converting
them into attention weights (A). These weights represent the importance of each
token relative to the others in the sequence. The value vector (V ) contains the actual
content or information of each token. The weighted sum of the value vectors, based
on the attention weights, is then computed to produce the output O for each token. In
essence, the query and key vectors are used to calculate how tokens in the sequence
relate to each other, while the value vectors contain the information that is used to
generate the final output. The self-attention mechanism leverages these components
to dynamically assign different weights to various tokens, allowing the model to
capture complex relationships and dependencies across the input sequence.
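The computation described above (Q, K, V projections, scaled dot-product scores, softmax weights, weighted sum of values) can be sketched for a single attention head; the matrices here are random stand-ins for learned weights, and the dimensions are illustrative:

```python
import numpy as np

def self_attention(X, W_Q, W_K, W_V):
    """One head of scaled dot-product self-attention over a sequence X
    (one row per token embedding)."""
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V          # query, key, value vectors
    S = Q @ K.T                                  # attention scores
    S_scaled = S / np.sqrt(K.shape[1])           # scale by sqrt of key dim
    e = np.exp(S_scaled - S_scaled.max(axis=1, keepdims=True))
    A = e / e.sum(axis=1, keepdims=True)         # softmax -> attention weights
    return A @ V, A                              # weighted sum of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                      # 5 tokens, embedding dim 8
W_Q, W_K, W_V = (rng.normal(size=(8, 4)) for _ in range(3))
O, A = self_attention(X, W_Q, W_K, W_V)
# Each row of A is a probability distribution over the 5 tokens
```

Note that every token attends to every other token in one matrix operation, which is what enables the parallel processing discussed earlier.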
In the case of a multi-head attention setup, the transformer model expands this
mechanism by having multiple sets of W_Q, W_K, and W_V matrices, each constituting
an “attention head.” Each head independently computes its own set of Q, K, and V
vectors, allowing the model to simultaneously focus on different aspects of the input
sequence from various representational subspaces. The outputs from each attention
head are then concatenated and linearly transformed once more through an additional
weight matrix, often denoted as W_O, to combine the diverse insights gathered from
each head into a single, unified output O_concat (see Fig. 5.5). This aggregation
process enables the model to integrate a richer set of contextual cues and relationships,
enhancing its ability to understand and process the sequence comprehensively. The
output O_concat thus obtained therefore becomes a new representation of the input
sequence reconstructed through self-attention.
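A minimal sketch of this multi-head arrangement, under the same toy assumptions (illustrative names, random weights, two heads), looks like this:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, heads, W_O):
    """heads: list of (W_Q, W_K, W_V) triples, one per attention head."""
    outputs = []
    for W_Q, W_K, W_V in heads:
        Q, K, V = X @ W_Q, X @ W_K, X @ W_V
        A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)
        outputs.append(A @ V)                    # each head: (seq_len, d_head)
    O_concat = np.concatenate(outputs, axis=-1)  # (seq_len, H * d_head)
    return O_concat @ W_O                        # combine heads into one output

rng = np.random.default_rng(1)
seq_len, d_model, H = 4, 8, 2
d_head = d_model // H                            # each head works in a smaller subspace
heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3)) for _ in range(H)]
W_O = rng.normal(size=(H * d_head, d_model))
O = multi_head_attention(rng.normal(size=(seq_len, d_model)), heads, W_O)
```

The concatenation restores the full model dimension, and W_O mixes the per-head results into the single output representation.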
Through this elaborate orchestration of multiple attention heads and the subse-
quent aggregation of their outputs, the self-attention component dynamically allo-
cates weights across the input sequence, elucidating intricate relationships within
it. The multi-head attention mechanism thus contributes significantly to the
transformer’s proficiency in capturing the subtle interactions among elements of the
sequence, supporting a deeper and more nuanced understanding of the data.

5.6.3 Comparison with RNN and CNN

The transformer architecture offers several significant advantages over traditional
recurrent neural networks (RNNs) and convolutional neural networks (CNNs). Transformers
can handle long-distance dependencies effectively, offer scalability and efficiency,
enable parallel processing, and capture context across entire input sequences.

These capabilities make transformers highly suitable for processing sequential data,
such as natural language, while maintaining superior performance in comparison to
RNNs and CNNs.
RNNs are designed for sequential data and, in theory, can model long-distance
dependencies. However, in practice, they struggle with very long sequences due to
issues like vanishing or exploding gradients, which hinder effective learning over long
time steps. CNNs, while excellent at capturing local dependencies through convo-
lutional filters, are inherently limited when it comes to capturing long-range depen-
dencies in sequential data. In contrast, the transformer’s self-attention mechanism
calculates relationships between all tokens in a sequence simultaneously, enabling it
to capture dependencies regardless of their distance.
A major limitation of RNNs is their sequential data processing. Each step depends
on the output of the previous step, making parallelization difficult and leading to high
computational costs when scaling to large datasets or long sequences. CNNs, while
more parallelizable due to their use of convolutions, are constrained by the size of their
receptive fields, which limits their ability to process sequences that extend beyond
this fixed size. Transformers, on the other hand, can process entire input sequences
in parallel, splitting them into tokens and handling them simultaneously. This paral-
lelization significantly reduces training and inference times, especially when lever-
aging hardware like GPUs, making transformers highly scalable and efficient for
large datasets.
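The contrast can be illustrated schematically: the recurrent update must run its T steps one after another, while the attention scores for all token pairs come out of a single batched matrix product (a toy sketch with made-up values):

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 6, 4
X = rng.normal(size=(T, d))          # a sequence of T token vectors
W_h = rng.normal(size=(d, d)) * 0.1  # recurrent weights (toy values)

# RNN-style processing: T dependent steps; step t cannot begin
# until the hidden state from step t-1 is available.
h = np.zeros(d)
for x_t in X:
    h = np.tanh(h @ W_h + x_t)

# Transformer-style processing: all pairwise token relations come from
# one batched matrix product, with no step-to-step dependency.
scores = X @ X.T / np.sqrt(d)        # (T, T) attention scores, computed at once
```

The loop is inherently serial, whereas the matrix product parallelizes trivially on GPUs.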
While RNNs excel in sequence-based tasks and CNNs are well-suited for image
data, transformers exhibit versatility across a broad range of tasks, including natural
language processing (NLP), image recognition, and audio processing. This flexi-
bility makes transformers more generalizable compared to RNNs and CNNs. RNNs
gradually capture context through sequential steps, and CNNs capture local context
through convolutional features. In contrast, transformers capture the entire sequence
context at every layer. By leveraging self-attention, transformers can assess rela-
tionships between every token in the sequence, making them highly effective for
complex sequence tasks such as NLP, where understanding relationships across the
entire sequence is essential.

5.6.4 Natural Language Processing

The transformer model has made groundbreaking contributions to natural language
processing (NLP), with notable successes in models like GPT and BERT. The ‘T’ in
both names stands for ‘transformer,’ reflecting the architecture that underpins their
impressive capabilities. Transformers have overcome the limitations of earlier models
like RNNs and LSTMs by offering solutions to long-distance dependency challenges
and by enabling parallel processing, which speeds up computations significantly.
One of the key advantages of transformers in NLP is their ability to understand the
entire context of a sequence simultaneously through the self-attention mechanism,
capturing intricate relationships within text. These features have made transformers
the dominant architecture in modern NLP.
Transformers have significantly improved machine translation, allowing models
to maintain context and nuance, even in complex sentence structures. Transformer-
based models like GPT have demonstrated the ability to generate coherent and contex-
tually relevant text, making them invaluable for tasks such as creative writing, chat-
bots, and automatic content creation. In addition to text generation, transformers excel
at text classification, helping categorize texts by genre or emotion and recognizing
named entities like names, locations, and dates. This makes them ideal for tasks like
sentiment analysis and information extraction. Furthermore, transformers are effec-
tive in summarizing lengthy documents, condensing large texts into concise, mean-
ingful summaries. Despite these strengths, transformer models require significant
amounts of training data and computational resources, and they may face difficulties
in generalizing to completely unseen data or new tasks without fine-tuning.
Beyond NLP, transformers have expanded into other domains, including image
and audio processing. The vision transformer (ViT) architecture treats an image as
a sequence of patches, similar to how text is treated, and has demonstrated excep-
tional performance in image recognition tasks.23 In speech recognition, transformers
capture long-distance dependencies and context within audio sequences, making
them highly effective for tasks like speech-to-text conversion. Similarly, transformers
have shown great potential in music generation by treating musical elements, such
as notes and rhythms, as sequential data. In addition, when combined with reinforce-
ment learning, transformers help agents remember long sequences of actions and
their outcomes, which is useful in game-playing scenarios. In these contexts, trans-
formers model complex relationships between events, predict outcomes, and assist
in decision-making processes.

5.7 GPT and BERT

Generative pre-trained transformer (GPT) and bidirectional encoder representations
from transformers (BERT) are iconic language models that significantly advanced
natural language processing capabilities based on the transformer architecture.
Following the proposal of the transformer architecture in 2017, OpenAI and Google,
respectively, developed GPT and BERT in 2018, achieving unprecedented success
in the field of language modeling.24 GPT showcased exceptional text generation

23 Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai,
Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob
Uszkoreit, Neil Houlsby, “An image is worth 16 × 16 words: Transformers for image recognition at
scale,” Proceedings of International Conference on Learning Representations (ICLR), 2021.
24 Since then, the transformer architecture has been used in various AI models for decision
management, robotic process automation, natural language processing, computer vision, optimization and
others. In particular, AlphaFold 2, an optimization AI model developed by Google DeepMind, has
dramatically improved the performance of protein structure prediction.

capabilities, while BERT set new standards in the area of language understanding.
GPT and BERT have marked milestones in the decades-long journey of AI evolu-
tion, demonstrating sophisticated neural network architectures’ ability to understand
and generate human language with remarkable accuracy. GPT evolved into subse-
quent models like ChatGPT-3.5, GPT-4, GPT-4o, GPT-o1, etc., and BERT led to the
development of improved models like LaMDA, Bard, and Gemini.25 Both GPT and
BERT, as well as their derivative models, utilize the transformer architecture as their
foundation.

5.7.1 Architectures of GPT and BERT

While GPT and BERT both revolutionize natural language processing through their
use of the transformer structure, they differ in their core concepts. GPT is a generative
model capable of producing text, whereas BERT focuses on deeply understanding
the context of language. GPT excels in text completion, content creation, and creative
writing due to its unidirectional (left to right) context understanding, meaning each
word prediction depends only on the preceding words. On the other hand, BERT
analyzes text bidirectionally (both from left to right and right to left), allowing for a
comprehensive understanding of each word’s context within the text.
GPT and BERT also differ in their applications and training methods. GPT is
primarily used for text generation, suitable for applications requiring consistent and
contextually relevant text paragraphs. In contrast, BERT is mainly used for text
understanding and interpretation, making it ideal for tasks like sentiment analysis,
question answering, and language inference. GPT is trained as a language model
to predict the next word in a sequence based on previous words, while BERT is
trained to understand the context of words in sentences by masking random words
and predicting them based on their surrounding context. Therefore, GPT is more
suited for tasks requiring text generation, such as writing assistance, chatbots, and
creative writing tools. Meanwhile, BERT is more effective in tasks involving text
understanding, like information extraction, search engines, and text classification
systems.
If we compare the architectures of GPT and BERT, both base their structures
on the transformer architecture but adopt different components for their use. GPT
utilizes the decoder of the transformer (the right half of Fig. 5.4), whereas BERT is
built upon the encoder (the left half of Fig. 5.4). In the case of GPT architecture,
the multi-head attention block in the middle of the decoder, which used to take in

25 Language Model for Dialogue Applications (LaMDA), Bard, and Gemini are transformer-based
AI models developed by Google and released in 2021, March 2023, and December 2023, respec-
tively. LaMDA was built on advancements from models like BERT, which excel at understanding
context within sentences. Bard extends LaMDA’s conversational capabilities by integrating Google’s
powerful search technology to provide real-time, up-to-date conversational responses. While
Gemini, introduced in December 2023, is the successor to Bard, publicly available information
about its technical features is limited.

Fig. 5.6 a BERT architecture, b GPT architecture

the output of the encoder, is no longer necessary. Figure 5.6 shows the resulting
architectures.26
GPT employs only the decoder part of the transformer architecture, which makes it
highly effective for text generation tasks. The decoder in GPT includes a masked self-
attention mechanism along with feedforward neural networks. The masking ensures
that the model can only attend to the previously generated words in a sequence,
making it a unidirectional model. This setup allows GPT to generate text sequentially
by predicting the next word based on the words that precede it. As a result, GPT excels
at generating coherent, contextually relevant text, making it ideal for tasks such as
creative writing, summarization, and dialogue generation.
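The masking step can be sketched by setting the scores of future positions to negative infinity before the softmax, so each row of attention weights covers only the current and earlier tokens (a schematic with uniform scores; real models apply the mask inside every attention head):

```python
import numpy as np

def causal_softmax(scores):
    """Mask future positions, then normalize each row into attention weights."""
    T = scores.shape[0]
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)  # True strictly above the diagonal
    masked = np.where(mask, -np.inf, scores)          # future tokens contribute exp(-inf) = 0
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.zeros((4, 4))        # uniform scores, purely for illustration
A = causal_softmax(scores)
# Row i spreads its weight uniformly over positions 0..i only:
# A[0] = [1, 0, 0, 0], A[1] = [0.5, 0.5, 0, 0], ...
```

The strictly upper-triangular mask is what makes the model unidirectional: position i never sees positions after i.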
On the other hand, BERT utilizes the encoder part of the transformer architecture,
which is optimized for understanding context rather than generating text. BERT

26 The linear and softmax blocks in Fig. 5.6a correspond to task-specific layers that are added
when BERT is fine-tuned for downstream tasks such as classification. However, different output
heads, such as ones specialized for the masked language modeling (MLM) and next sentence
prediction (NSP) tasks, would be used during pre-training rather than the task-specific heads shown
in fine-tuning.
consists of multiple layers of transformer encoders, each using self-attention to
process input data. Since the encoder is bidirectional, BERT can capture the context
from both directions (before and after each word) simultaneously, giving it a deeper
understanding of the text. BERT is trained using masked language modeling (MLM),
where random words in a sentence are masked, and the model learns to predict these
words based on the surrounding context. Additionally, BERT uses next sentence
prediction (NSP) to understand relationships between consecutive sentences, making
it highly effective for tasks such as question-answering, sentiment analysis, and text
classification.
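The data preparation for masked language modeling can be sketched as follows (a simplified illustration; BERT's actual recipe also leaves some selected tokens unchanged or replaces them with random words, which is omitted here):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Return (masked sequence, {position: original token}) for MLM training."""
    rng = random.Random(seed)
    masked, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = mask_token   # the model must predict the original token here
            targets[i] = tok
    return masked, targets

sentence = "the cat sat on the mat".split()
masked, targets = mask_tokens(sentence, mask_prob=0.3)
```

During training, the loss is computed only at the masked positions, forcing the model to use bidirectional context to recover the hidden words.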
When comparing GPT and BERT, their different architectural focuses lead to
distinct strengths. GPT’s decoder-based structure makes it exceptional for tasks
requiring sequential text generation, as it processes information unidirectionally.
In contrast, BERT’s encoder-based structure is designed for tasks that require a
deep understanding of text, as it processes information bidirectionally. The training
objectives also differ: GPT is optimized for text generation, learning to predict the
next word in a sequence, while BERT is focused on comprehension, leveraging
masked language modeling and next sentence prediction to understand the underlying
meaning of a text.

5.7.2 Applications of GPT and BERT

The applications of GPT are wide-ranging, with its primary strength being text
generation. GPT excels at creating creative content such as stories, poems, and
dialogues, making it ideal for use in interactive storytelling, game narratives, and
creative writing. It can also be applied to chatbots and conversational agents, where
generating coherent and contextually appropriate responses is crucial. In addition,
GPT is used in news article writing, content creation, and even in code generation for
software development. It can assist with language translation by generating context-
aware translations. In the field of education, GPT can be used to create educational
content, such as practice questions and explanations, providing interactive learning
experiences.
BERT, on the other hand, excels at text classification, information extraction,
and search engine optimization. BERT is highly effective for sentiment analysis,
making it useful for analyzing customer feedback, social media posts, and reviews.
Its ability to understand text context allows it to organize and categorize large
amounts of content, which makes it invaluable in applications like content filtering
and text categorization. BERT has also been adopted by search engines like Google to
improve the understanding of query intent and provide more relevant search results.
In question-answering systems, BERT’s deep contextual understanding helps find
specific answers within large texts, making it useful in applications ranging from
customer service to information retrieval. In addition, BERT can identify and clas-
sify named entities (such as names, locations, and organizations) and can be used

for machine translation and text summarization by extracting key information from
input texts.
When comparing the applications of GPT and BERT, GPT’s strength lies in gener-
ating coherent and creative content, making it ideal for applications that require
language generation and interaction. In contrast, BERT excels at tasks that require a
deep understanding of text, such as classification, sentiment analysis, and question-
answering. Both models are versatile and can be adapted for a wide range of NLP
tasks. However, their application areas are defined by their core strengths—GPT is
more effective for text generation tasks, while BERT is better suited for text compre-
hension and analysis. In natural language processing, both models have found success
across various industries and services, with GPT dominating in content creation and
BERT in information extraction and contextual understanding.
GPT and BERT are revolutionary models in natural language processing (NLP),
but they encounter challenges when working with long texts due to their fixed
context windows. GPT, known for its powerful text generation capabilities, can
struggle to maintain coherence over extended narratives because it only processes
a limited number of tokens at a time (usually around 2048 tokens for GPT-3 and
earlier versions). As a result, GPT may lose track of long-term context in lengthy
texts. Newer versions of GPT, such as GPT-3.5 and GPT-4, aim to mitigate this by
increasing the model’s attention span and token capacity, enabling better handling
of long sequences.
Similarly, BERT, which excels at context understanding and text analysis, is
constrained by a maximum input length (typically 512 tokens), limiting its ability
to process long documents in one pass. Techniques like text segmentation or sliding
windows can be used to process longer documents by dividing them into chunks, but
this often leads to a loss of global context across segments. Despite these limitations,
ongoing research and advancements (such as the development of Longformer and
Big Bird models) seek to address these issues by extending the attention mechanism
to capture broader contexts in longer texts, thereby improving the models’ ability to
handle large-scale documents.
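The sliding-window technique mentioned above can be sketched as follows, with 512 standing in for BERT's usual input limit and an assumed overlap of 128 tokens between neighboring chunks:

```python
def sliding_windows(tokens, window=512, stride=384):
    """Split tokens into overlapping chunks of at most `window` tokens.
    Consecutive windows share `window - stride` tokens of context."""
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break                  # the last window already reaches the end
    return chunks

tokens = list(range(1000))         # stand-in for 1,000 token ids
chunks = sliding_windows(tokens)   # three overlapping windows
```

The overlap preserves some local context across the cut points, but, as noted above, global context spanning distant chunks is still lost.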

5.7.3 ChatGPT

See Fig. 5.7.


ChatGPT, developed by OpenAI, is an advanced language model application of
GPT. It operates based on pre-training and incorporates the transformer architecture,
specializing in natural language processing. Therefore, it inherits the strengths of
the transformer architecture and exhibits exceptional abilities in understanding and
generating natural language.27 The name ChatGPT reflects its capability to converse
with users, answer questions, and compose texts in a chat-like manner. OpenAI

27 In December 2023, Google unveiled ‘Gemini’, a generative AI based on large language models,
which is known to surpass human experts in large-scale multi-task language understanding tests.

Fig. 5.7 OpenAI logo

released GPT-3 in June 2020, ChatGPT-3.5 for public testing in November 2022,
ChatGPT-4 in March 2023, and GPT-4o in May 2024. Furthermore, in October
2024, it released GPT-o1 with enhanced inference capability.28
ChatGPT can understand text input in various languages, grasp the context of
conversations, and generate consistent and contextually appropriate responses. It
was trained on a wide range of internet texts, enabling it to respond knowledgeably
to topics covered during its training. However, it cannot provide information on real-
time events or topics not included in its training data up until its last training session.
Also, ChatGPT may have limitations in responding to languages with insufficient
training data.
ChatGPT can engage in interactive conversations with users, answering ques-
tions, providing explanations, assisting with creative writing, solving coding prob-
lems, and more. It can maintain the context of a conversation, remember previous
questions and answers, and provide comprehensive responses based on this infor-
mation. Some versions can connect to external tools like a web browser for informa-
tion retrieval or DALL-E for image generation, offering additional functionalities.
Furthermore, ChatGPT can significantly aid scientific and technological research,
such as designing new molecules or simulating cellular behavior. It is expected to
become a competent assistant in literature review, data summarization, hypothesis
setting, concept development, experiment design support, coding and data analysis,
and drafting research proposals and papers.29
Despite its powerful natural language processing capabilities, ChatGPT has limi-
tations. It inherits the limitations of the GPT model that relies on training. Insuf-
ficient training can lead to a lack of knowledge, and biases in training data can be
perpetuated. Sometimes, it may provide plausible but inaccurate or absurd responses,
a phenomenon known as “hallucination”. It can also produce ethically question-
able or value-misaligned responses and has limitations in accessing real-time data.
Such limitations are considered to be solvable or mitigable by enhancing AI’s infer-
ence capabilities and search capabilities, and thus, AI developers are focusing on
improving AI’s inference and planning abilities. (GPT-o1, for example, is known to

28 OpenAI expanded the capabilities of GPT-4, allowing users to create and use customized GPT
applications. In January 2024, they launched the ‘GPT Store’, where these applications can be
shared.
29 In December 2023, the scientific journal “Nature” included ChatGPT in its “Nature’10” list,
alongside ten scientists and technologists.



have significantly improved inference abilities compared to previous GPT models.)
Therefore, it is advisable to verify ChatGPT’s responses with other sources through
web searches before use.
The release of ChatGPT-3.5 generated unprecedented attention in the AI world,
showcasing its remarkable ability to produce human-like text and engage in natural,
coherent conversations with users. This breakthrough demonstrated that AI is no
longer a distant concept but an integral part of our present reality. In response, ICT
companies and platform operators recognized the immediate impact of such advance-
ments and ignited a fierce competition to accelerate AI development.30 Leading tech
firms prioritized investments in AI research, aiming to integrate conversational AI
into their services and stay at the forefront of this transformative technology. This
shift underscores a new era where AI is not only enhancing business operations but
also reshaping the way individuals interact with technology on a daily basis.
Simultaneously, the launch of ChatGPT also raised alarms about the potential risks
of AI. It spurred public discussions on the dangers of AI and prompted governmental
policy responses. Professional developers advocated for responsible, ethical, and safe
AI development, while governments called for AI that does not harm employment,
education, or national security. The US administration and Congress, along with the
G7 countries, have urged measures to mitigate risks associated with AI misuse,
enhance AI management and security, and mandate identifiers for AI-generated
content. These arguments eventually materialized for the first time with the enactment
of the EU’s Artificial Intelligence Act (AIA) in 2024. (Refer to Sect. 5.9.5).

5.8 Implementation of AI Cognitive Functions

In Sect. 5.1, we examined the question-and-answer process of humans and AI as
a means to compare the cognitive functions of both. When receiving a question,
humans go through six stages—cognitive attention, decoding interpretation, informa-
tion retrieval, integration and reasoning, answer composition, and expression modifi-
cation—to understand and respond to the question. In contrast, AI goes through five
stages—vector transformation, question interpretation, knowledge learning, infer-
ence synthesis, and answer generation (see Fig. 5.1). While, superficially, the cogni-
tive process of AI may seem similar to that of humans, in reality, AI’s process of
understanding and answering questions is carried out through a series of complex
calculations. Moreover, AI ‘understanding’ a question means preparing to generate
a response by numerically interpreting the input question, and once the question is
understood, AI immediately begins generating the answer. Now, let’s examine how
these cognitive functions are implemented in AI systems. We will explore how the
AI system performs such question-and-answer processes, by taking the GPT model
as a representative AI system model.

30 For instance, Google’s Gemini, Meta’s Llama2, Amazon’s Bedrock service, and IBM’s
Watsonx.ai are notable examples.

5.8.1 Functions of Blocks in AI Systems

The ability of AI to understand questions and provide answers is primarily achieved
through the transformer blocks and subsequent blocks within the AI system. Let’s
examine how the GPT model performs these question-and-answer processes.
Figure 5.8 is a horizontal depiction of the GPT structure shown in Fig. 5.6b. When
a question is input through the leftmost arrow in the figure, the internal transformer
structure interprets the question, understands the context, and generates an answer,
which is then output through the rightmost arrow in the figure.
The GPT model is composed of blocks such as input embedding, masked
multi-head attention, feedforward, linear transformation, and softmax, each serving
different functions. The transformer block, represented by a large rectangle
(including the masked multi-head attention block and the feedforward block), is
repeated N times in succession (‘N×’). Roughly speaking, the function of under-
standing the question is performed within the transformer blocks, while the function
of generating the answer (i.e., creating the output text) is performed within the linear
and softmax blocks.

1. Input Embedding
The input embedding block performs the ‘vector conversion’ function of AI as
shown in Fig. 5.1. This block splits the input text into tokens, which are small units
like words, and converts each token into numerical vectors. Through this embed-
ding process, the input text is transformed into numerical vectors, and all subse-
quent processes within GPT are carried out through numerical calculations. Each
token’s embedded vector includes positional information indicating where the token
appeared in the text.
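The lookup-plus-position step can be sketched with toy sizes (a hypothetical vocabulary of 10 tokens; a learned positional table is assumed, as in GPT, though other models use fixed sinusoidal encodings):

```python
import numpy as np

rng = np.random.default_rng(3)
vocab_size, d_model, max_len = 10, 8, 16      # toy sizes; GPT-3 175B: 50,257 and 12,288
W_E = rng.normal(size=(vocab_size, d_model))  # embedding matrix (learned during training)
W_P = rng.normal(size=(max_len, d_model))     # positional embeddings (learned)

def embed(token_ids):
    """Row lookup in W_E plus the position vector for each slot in the sequence."""
    ids = np.asarray(token_ids)
    return W_E[ids] + W_P[:len(ids)]

X = embed([3, 1, 4, 1])                       # (4, d_model) input to the first block
```

Note that the token with id 1 appears twice but receives two different vectors, because the positional term distinguishes where each occurrence sits in the sequence.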
2. Masked Multi-Head Attention

This block, together with the subsequent feedforward block, is responsible for inter-
preting the question and understanding its context. These functions are achieved
through the self-attention mechanism. The blocks performing the self-attention func-
tion are called heads. GPT processes self-attention through multiple heads in parallel,
with each head focusing on different features of the input data. After passing through

Fig. 5.8 GPT architecture



the multi-head attention, the model can effectively identify patterns within the input
data and understand relationships and context between words.
3. Feedforward
Data that has passed through the multi-head attention block is further transformed
in the feedforward block, capturing complex patterns. This block consists of two
feedforward neural networks (FNNs) with a nonlinear activation function in between.
The first FNN expands the data’s dimensions, the nonlinear activation function,
GELU, allows the model to learn more complex patterns through nonlinearity, and
the second FNN reduces the data back to its original dimensions. The reason for
expanding dimensions and processing in a higher-dimensional space is that nonlinear
processing in an expanded space can capture and represent more complex features and
patterns. Projecting the input into a high-dimensional space enables more complex
transformations and interactions within that space.
Comparing with the AI side of Fig. 5.1, the transformer block’s multi-head attention and
feedforward blocks together perform the functions of ‘question analysis’, ‘pre-trained
knowledge’, and ‘inference, synthesis’.
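The expand–activate–contract pattern of the feedforward block, with GPT-3's 4× expansion ratio and the common tanh approximation of GELU, can be sketched as follows (toy dimensions, random weights):

```python
import numpy as np

def gelu(x):
    """Tanh approximation of the GELU activation function."""
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def feed_forward(X, W1, b1, W2, b2):
    """Expand to 4x d_model, apply the nonlinearity, project back to d_model."""
    return gelu(X @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(4)
d_model = 8                                   # GPT-3 175B uses 12,288
W1, b1 = rng.normal(size=(d_model, 4 * d_model)), np.zeros(4 * d_model)
W2, b2 = rng.normal(size=(4 * d_model, d_model)), np.zeros(d_model)
Y = feed_forward(rng.normal(size=(3, d_model)), W1, b1, W2, b2)
```

The output has the same shape as the input, so the block can be stacked N times, with the nonlinear detour through the wider space doing the pattern capture.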
4. Linear Transformation and Softmax
The linear transformation block and the softmax block are responsible for generating
the response. The linear transformation block is a fully connected FNN that trans-
forms the input data into a space that matches the model’s vocabulary size, expanding
the input into a set of logits (unnormalized prediction values). The softmax block then
applies the softmax operation to these logits. The softmax function converts all output
values into non-negative values and normalizes them to sum up to 1, allowing the
model to interpret the logits as a probability distribution. This probability indicates
the likelihood of each word being the next word in the output sequence. Therefore,
the word with the highest probability is selected as the next output word.
Comparing with the AI side of Fig. 5.1, the linear transformation block and softmax block
together perform the function of ‘answer generation.’
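This final step can be sketched as follows (a toy vocabulary and random weights; greedy selection of the most probable word is shown, whereas deployed systems often sample from the distribution instead):

```python
import numpy as np

def next_token(h, W_out, vocab):
    """h: final hidden vector; W_out: (d_model, vocab_size) output projection."""
    logits = h @ W_out                 # one unnormalized score per vocabulary word
    e = np.exp(logits - logits.max())
    probs = e / e.sum()                # softmax: non-negative values summing to 1
    return vocab[int(np.argmax(probs))], probs

vocab = ["the", "cat", "sat", "mat"]   # hypothetical four-word vocabulary
rng = np.random.default_rng(5)
W_out = rng.normal(size=(8, len(vocab)))
word, probs = next_token(rng.normal(size=8), W_out, vocab)
```

Generation then repeats: the chosen word is appended to the input and the model is run again to produce the following word.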

5.8.2 Complexity of AI Systems

Let’s examine the size and complexity of AI systems capable of understanding
questions and generating answers, specifically using the GPT-3 175B model as an
example.31

31 Although newer models such as GPT-4 and GPT-4o have been released, with improvements over
GPT-3, system details for these newer models are not fully publicly available. As a result, for the sake
of convenience, we will review the complexity of AI models using the GPT-3 175B model. While
GPT-4 and other recent models introduce enhancements in areas such as accuracy and contextual
understanding, their underlying architectures can be viewed as extensions or refinements of the
transformer-based approach used in GPT-3. Therefore, many aspects of the structure, functionality,
and complexity of these newer models can be understood by analyzing GPT-3.

1. Input Embedding
The embedding block converts input text into tokens and then transforms each token
into numerical vectors. While it is possible to convert tokens into simple numbers,
transforming them into numerical vectors captures semantic relationships, syntactic
roles, and contextual information, making calculations within the transformer struc-
ture more efficient. The size T of each embedding vector is 12,288, meaning each
token is represented by a 12,288-dimensional vector. Each vector component is stored as a
32-bit or 64-bit floating-point number.32
When the transformed embedding vectors for each token are combined, they form
a matrix called the embedding matrix. The total set of tokens is called the vocabulary,
and if the vocabulary size is D, the embedding matrix W_E has dimensions of D ×
12,288. For the GPT-3 175B model, with a vocabulary size of 50,257, the embedding
matrix size is 50,257 × 12,288.33
The embedding matrix is learned during the training process, much like other
weight matrices in a neural network. Once training is complete, the embedding
matrix contains a representation for each input token, where each token is mapped
to a corresponding vector in the matrix. This matrix can be visualized as a table,
where input tokens are associated with specific rows, and the vectors representing
the tokens are stored in the corresponding rows. During the usage stage, when an
input token is processed, the model simply retrieves its pre-learned embedding vector
from this table. This retrieved vector represents the token’s position in the semantic
space and serves as the input to subsequent layers of the model. This lookup process
is efficient and constitutes the input embedding phase.
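The figures quoted above can be checked with simple arithmetic: the embedding table alone accounts for roughly 618 million of the model's 175 billion parameters:

```python
vocab_size, d_model = 50_257, 12_288          # GPT-3 175B figures from the text
embedding_params = vocab_size * d_model       # one d_model-dimensional row per token
print(embedding_params)                       # 617,558,016 weights, under 0.4% of 175B
```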
2. Masked Multi-Head Attention
The masked multi-head attention block, a core element of the transformer structure,
performs the self-attention mechanism, with detailed functions as shown in Fig. 5.5.
Multi-head attention processes the embedding vector in parallel by dividing it by
the number of heads. Specifically, the embedding vector dimension T is divided
by the number of heads H, with each head processing a T /H-dimensional vector.
This reduction of the embedding vector’s dimension is achieved by passing each
embedding vector through weight matrices W Q , W K , and W V , with each weight
matrix sized T × (T /H). The resulting Q, K, and V vectors are reduced to T /H
dimensions. For the GPT-3 175B model, with T = 12,288 and H = 96, each weight
matrix W Q , W K , W V is sized 12,288 × 128. Thus, the Q, K, and V vectors are reduced

32 This applies to the GPT-3 175B model, which has 175 billion parameters, but the size of the
numerical vector varies with different models. For example, the GPT-3 Small model with 125 million
parameters has a vector size of 768 dimensions, the GPT-3 XL model with 1.3 billion parameters
has a vector size of 1600 dimensions, and the GPT-3 13B model with 13 billion parameters has a
vector size of 5120 dimensions.
33 The reason why the GPT-3 175B model uses a vocabulary size of about 50,000 tokens, in
contrast to the Oxford English Dictionary’s approximately 600,000 words, is to balance model
complexity and the ability to capture diverse linguistic nuances, allowing it to process various
language structures and vocabularies without excessive computational load and memory increase.
144 5 Artificial Intelligence

[Figure: the GPT-3 pipeline with its weight matrices: input embedding (W E ) plus positional
encoding; masked multi-head attention (W Q , W K , W V , W O ) with Add & Norm; feedforward
(W F1 , W F2 ) with Add & Norm; linear (W L ); softmax.]

Fig. 5.9 Weight matrices of GPT-3

to T /H = 128 dimensions, and the weight matrix W O , which connects the outputs
of the H multi-head attention blocks, is sized 12,288 × 12,288.34
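The dimension bookkeeping above can be made concrete with a small pure-Python sketch. The widths here are toy values (T = 12 and H = 3 instead of GPT-3's 12,288 and 96), and the projection matrices are random rather than learned:

```python
import random

T, H = 12, 3                       # toy model width and head count (GPT-3 175B: T = 12,288, H = 96)
d_head = T // H                    # per-head dimension T/H (128 for GPT-3 175B)
random.seed(1)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    # multiply a vector x (length rows) by a matrix W (rows x cols) -> vector of length cols
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

# One head's projection matrices, each sized T x (T/H), as described in the text.
W_Q, W_K, W_V = (rand_matrix(T, d_head) for _ in range(3))
x = [random.uniform(-1.0, 1.0) for _ in range(T)]   # one token's embedding vector
q, k, v = matvec(W_Q, x), matvec(W_K, x), matvec(W_V, x)
assert len(q) == len(k) == len(v) == d_head          # reduced to T/H dimensions

# The "masked" part: a causal mask lets position i attend only to positions j <= i.
n = 3
mask = [[0.0 if j <= i else float("-inf") for j in range(n)] for i in range(n)]
assert mask[0][2] == float("-inf")   # the first token cannot attend to the third
```

Adding this mask to the attention scores drives the forbidden entries to zero probability after the subsequent softmax, which is what keeps future tokens out of each prediction.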

3. Feedforward

The first FNN is a linear transformation that expands the input vector’s dimension,
and the second FNN following the nonlinear activation function in the middle is
a linear transformation that reduces the vector back to its original dimension. For
the GPT-3 175B model, the expansion ratio is 4×. Thus, the weight matrix W F1 for
the first linear transformation is 12,288 × 49,152, and the weight matrix W F2 for
the second linear transformation is 49,152 × 12,288. The calculation process of the
intermediate nonlinear activation function, GELU, is straightforward.
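The expand-activate-contract pattern of the feedforward sublayer can be sketched as follows. The widths and weights are toy values, not GPT-3's, and GELU is shown via its common tanh approximation:

```python
import math

def gelu(x):
    # tanh approximation of the GELU activation used between the two linear maps
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def feedforward(x, W1, b1, W2, b2):
    # expand to the hidden width, apply GELU, then project back to the original width
    h = [gelu(sum(x[i] * W1[i][j] for i in range(len(x))) + b1[j]) for j in range(len(b1))]
    return [sum(h[j] * W2[j][k] for j in range(len(h))) + b2[k] for k in range(len(b2))]

T, hidden = 2, 8                   # toy widths; GPT-3 175B uses 12,288 and 4 x 12,288 = 49,152
W1 = [[0.1] * hidden for _ in range(T)]
b1 = [0.0] * hidden
W2 = [[0.05] * T for _ in range(hidden)]
b2 = [0.0] * T
y = feedforward([1.0, -1.0], W1, b1, W2, b2)
assert len(y) == T                 # the output returns to the original dimension
```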
The transformer block, composed of masked multi-head attention and feedfor-
ward blocks, is repeated N times in sequence, where N represents the depth of the
transformer network. For the GPT-3 175B model, N = 96.

4. Linear and Softmax

The linear block linearly transforms the input data into a space matching the vocab-
ulary size, expanding it into a set of logits. The size of the weight matrix performing
this transformation is determined by the vocabulary size. For the GPT-3 175B model,
the linear block weight matrix W L is 12,288 × 50,257.
The softmax block applies the softmax operation to the output logits of the linear
block. The softmax function is a mathematical function that converts logits into a
probability distribution and is simple to compute.
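The softmax computation itself fits in a few lines; the logits below are arbitrary toy values:

```python
import math

def softmax(logits):
    """Convert a vector of logits into a probability distribution."""
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, 1.0, 0.1])
assert abs(sum(probs) - 1.0) < 1e-9          # probabilities sum to one
assert probs[0] == max(probs)                # the largest logit gets the largest probability
```

In the model, the entry with the highest probability (or a token sampled from this distribution) becomes the next output token.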
The cumulative weight matrices discussed above are summarized in Fig. 5.9.

34 The number of heads H varies by model, with GPT-3 Small having 12 heads, GPT-3 XL having
32 heads, and GPT-3 13B having 40 heads. Consequently, the sizes of the weight matrices W Q ,
W K , W V are 768 × 64 for GPT-3 Small, 1600 × 50 for GPT-3 XL, and 5120 × 128 for GPT-3
13B.
5.8 Implementation of AI Cognitive Functions 145

5.8.3 Number of Weight Parameters

For the GPT-3 175B model mentioned in the previous sections, the number of weight
matrices that need to be learned during the training process is extremely large. The
weight matrices to be learned include W E from the input embedding block, W Q ,
W K , W V , W O from the masked multi-head attention block, W F1 , W F2 from the
feedforward block, and W L from the linear block (refer to Fig. 5.9). The size of
each matrix for the GPT-3 175B model is as follows: W E : 50,257 × 12,288, W Q :
12,288 × 128, W K : 12,288 × 128, W V : 12,288 × 128, W O : 12,288 × 12,288,
W F1 : 12,288 × 49,152, W F2 : 49,152 × 12,288, W L : 12,288 × 50,257. Notably, the
masked multi-head attention block and feedforward block are repeated N = 96 times
in succession.
GPT-3 simultaneously learns all the weight matrices listed above. Comparing with
the neural network structure in Fig. 5.2, each weight matrix corresponds to a layer in
a multilayer neural network. However, unlike a standard multilayer neural network
where the output of each layer directly feeds into the next, additional processing steps
are involved. If we consider the three matrices W Q , W K , and W V in the multi-head
attention block as being computed simultaneously, the input embedding block forms
one layer, the masked multi-head attention block forms two layers, the feedforward
block forms two layers, and the linear transformation block forms one layer. Given
that the masked multi-head attention and feedforward blocks are repeated 96 times,
the GPT-3 175B model can be considered a deep neural network with a total of 1 +
(2 + 2) × 96 + 1 = 386 layers.
Now, let’s calculate the number of weight parameters for the GPT-3 175B model
based on Fig. 5.9. The model name 175B indicates that there are ‘175 billion’ weight
parameters. Let’s verify how this number is derived.
1. Input Embedding
The input embedding block contains 50,257 × 12,288 = 617,558,016 weight
parameters in the weight matrix W E .
2. Masked Multi-Head Attention
A single masked multi-head attention block contains 12,288 × 128 = 1,572,864
weight parameters in each of the W Q , W K , and W V matrices. With 96 heads (H),
the total number of parameters is 1,572,864 × 3 × 96 = 452,984,832. In addition,
the W O matrix contains 12,288 × 12,288 = 150,994,944 parameters. Therefore, the
total number of parameters in the masked multi-head attention block is 603,979,776.
3. Feedforward
A single feedforward block contains 12,288 × 49,152 = 603,979,776 weight param-
eters in W F1 and 49,152 × 12,288 = 603,979,776 weight parameters in W F2 , totaling
1,207,959,552 parameters.
A single transformer block contains one masked multi-head attention block and
one feedforward block. The weight parameters are only in the masked multi-head
attention block and the feedforward block, totaling 603,979,776 + 1,207,959,552 =
1,811,939,328 parameters. With 96 transformer blocks connected in sequence, the
total number of weight parameters is 173,946,175,488.
4. Linear and Softmax
The linear transformation block contains 12,288 × 50,257 = 617,558,016 weight
parameters in the weight matrix W L .
Total Parameters: Summing up all the discussed parameters, the GPT-3 175B
model has a total of approximately 175,181,291,520 weight parameters, or roughly
175 billion parameters (refer to Table 5.1).
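The count can be checked with a few lines of arithmetic. The per-block figures reproduce those derived above:

```python
T, V, H, N = 12_288, 50_257, 96, 96            # width, vocabulary, heads, transformer blocks

embedding = V * T                               # W E
attention = (T * (T // H)) * 3 * H + T * T      # W Q, W K, W V over all heads, plus W O
feedforward = T * (4 * T) + (4 * T) * T         # W F1 and W F2
linear = T * V                                  # W L

total = embedding + N * (attention + feedforward) + linear
assert attention == 603_979_776                 # per masked multi-head attention block
assert feedforward == 1_207_959_552             # per feedforward block
assert total // 10 ** 9 == 175                  # roughly 175 billion weight parameters
```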
In addition to the weight parameters, there are other parameters to be learned
during the training process, which are the normalization parameters and the bias
parameters.
Each normalization block contains 2 normalization parameters,35 and each trans-
former block contains 2 normalization blocks (see Fig. 5.9). Since each normalization
parameter is a 12,288-dimensional vector, the number of normalization parameters
per transformer block is 2 × 2 × 12,288 = 49,152. Given that the GPT struc-
ture contains 96 transformer blocks in sequence, the total number of normalization
parameters is 4,718,592.
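The role of the two learnable vectors in each normalization can be sketched as follows; the input is a toy 4-dimensional vector rather than a 12,288-dimensional one:

```python
import math

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize x to zero mean and unit variance, then scale by gamma and shift by beta."""
    mu = sum(x) / len(x)
    var = sum((xi - mu) ** 2 for xi in x) / len(x)
    sigma = math.sqrt(var + eps)
    return [g * (xi - mu) / sigma + b for xi, g, b in zip(x, gamma, beta)]

x = [1.0, 2.0, 3.0, 4.0]
y = layer_norm(x, gamma=[1.0] * 4, beta=[0.0] * 4)
assert abs(sum(y)) < 1e-6            # zero mean after normalization
```

Since gamma and beta are each vectors of the model width, one normalization block contributes 2 × 12,288 learnable values, which is where the count above comes from.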
Bias parameters are present for each node (neuron) (see Fig. 5.3). Examining
each block, firstly, the input embedding block contains the weight matrix W E , but
since it only performs the function of converting tokens to vectors, there are no bias
parameters. A single masked multi-head attention block contains 128 bias parameters
each in W Q , W K , and W V , and since there are 96 heads, there are a total of 128 × 3
× 96 = 36,864 bias parameters. In addition, W O contains 12,288 bias parameters, so
the total for a single attention block is 49,152. A single feedforward block contains
49,152 bias parameters in W F1 and 12,288 in W F2 , totaling 61,440. Thus, a single
transformer block, which contains one masked multi-head attention block and one
feedforward block, has a total of 49,152 + 61,440 = 110,592 bias parameters. With
96 transformer blocks connected in sequence, the total number of bias parameters to
be learned is 110,592 × 96 = 10,616,832. Finally, the number of bias parameters in
the weight matrix W L of the linear transformation block is 50,257. Therefore, the
total number of bias parameters in the GPT-3 175B model is 10,616,832 + 50,257 =
10,667,089.
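The bias bookkeeping follows the same pattern as the weight count; the per-block figures match those derived above:

```python
T, V, H, N = 12_288, 50_257, 96, 96

attention_bias = (T // H) * 3 * H + T          # Q, K, V biases across all heads, plus W O
ffn_bias = 4 * T + T                           # biases for W F1 and W F2
per_block = attention_bias + ffn_bias
assert attention_bias == 49_152                # per masked multi-head attention block
assert ffn_bias == 61_440                      # per feedforward block
total_bias = per_block * N + V                 # all transformer blocks plus the linear block
```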

5.8.4 Training of AI Systems

The purpose of training an AI system is to enable it to predict the next token based
on the preceding tokens. During the training process, the target of learning is the

35 In each layer’s normalization, the input x_i is normalized with respect to the mean μ and standard
deviation σ as y_i = (x_i − μ)/σ. The result is then scaled and shifted using the two learnable
parameters γ and β in the form z_i = γy_i + β. The parameters γ and β are determined during the
training process, and they help prevent information loss and improve convergence.
Table 5.1 Calculation of weight parameters (GPT-3 175B model)

Input embedding: W E (50,257 × 12,288), 617,558,016 parameters; repeated 1 time;
block total 617,558,016.
Masked multi-head attention: W Q , W K , W V (12,288 × 128 each), 1,572,864 × 3 ×
96 = 452,984,832 parameters; W O (12,288 × 12,288), 150,994,944 parameters;
repeated 96 times; block total (452,984,832 + 150,994,944) × 96 = 57,982,058,496.
Feedforward: W F1 (12,288 × 49,152), 603,979,776 parameters; W F2 (49,152 ×
12,288), 603,979,776 parameters; repeated 96 times; block total (603,979,776 +
603,979,776) × 96 = 115,964,116,992.
Linear: W L (12,288 × 50,257), 617,558,016 parameters; repeated 1 time; block total
617,558,016.
Total weight parameters of the model: 175,181,291,520.
weight parameters, and the learning results are stored as weight values.36 These
weight values play the role of the brain in understanding questions and generating
answers. As calculated above, the GPT-3 175B model has around 175 billion weight
parameters, most of which are concentrated in the multi-head attention blocks and
feedforward blocks within the transformer blocks.
The training process of an AI system is similar to that of training a deep neural
network (e.g., GPT-3 175B model is a deep neural network consisting of 386 layers).
When training an AI system, all the weight parameters of each layer are initially set
to random values and are adjusted as the learning progresses. By receiving training
data and repeating the process of forward propagation, cost function calculation,
backpropagation, and weight updates, as explained in Sect. 5.5, learning progresses.
When the cost function converges to a minimum, the training ends, and the weights at that
point constitute the final weight matrices. These weight matrices are used for actual
question-answering and are not modified until the next training process.
Specifically, the GPT-3 175B model uses self-supervised learning during the
training process. This is a learning method where the model generates its own labels
from the input data to train itself. While self-supervised learning can be classified as
unsupervised learning because labels are not explicitly provided, it effectively oper-
ates like supervised learning since the training input data inherently contains labels
(i.e., the next tokens) that the model utilizes. The cross-entropy function is used as the
loss function, calculated by comparing the predicted token probability distribution to
the actual distribution of the next token. The backpropagation technique is employed
to compute the gradients of the weight parameters with respect to the loss function,
and the Adam optimization algorithm is applied to update the weights based on those
gradients.37 The input data is processed by dividing it into multiple batches of token
sequences (each sequence containing up to 2,048 tokens, the model’s context length),
and all computations within the transformer block
are also handled on a batch basis. During this process, masking is applied to ensure
that future tokens are not involved in predicting the next token.38 The cost function
is calculated on a per-token basis and averaged over the batch to update the weight
parameters. After performing the forward pass, loss calculation, and backpropaga-
tion for the first batch, the weight parameters are updated once, and then the process
is immediately repeated for the second batch. Once the weight parameter updates
for all batches are complete, an epoch is said to be finished. Training on the given
input data concludes when the cost function stops decreasing or when there is no
further change in the weights after repeated epochs. The final weights are then used as the starting

36 Although relatively small in number, normalization parameters and bias parameters are also
targets of learning.
37 Adam (adaptive moment estimation) optimization technique improves convergence speed and
maintains robust performance by adaptively adjusting the learning rate using the mean of the
gradients and the mean of the squared gradients.
38 Masking can be applied by adding a mask matrix M to the attention score matrix S (see Fig. 5.5).
The elements m_ij of matrix M are set to 0 for j ≤ i and to − ∞ for j > i. This way, the terms
to which − ∞ is added become 0 during the following softmax process, effectively achieving the
masking effect.

point for training on the next input data, and this training process continues until all
training data is exhausted.39
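The self-supervised label construction and the per-token cross-entropy loss described above can be illustrated schematically. The token IDs and the predicted distribution are toy values, not drawn from any real model:

```python
import math

# Self-supervised labels: the target at each position is simply the next token.
tokens = [5, 9, 2, 7]                   # a toy training sequence of token IDs
inputs, targets = tokens[:-1], tokens[1:]
assert targets == [9, 2, 7]             # labels come from the input data itself

def cross_entropy(pred_dist, target_id):
    """Per-token loss: negative log probability assigned to the actual next token."""
    return -math.log(pred_dist[target_id])

# A toy predicted distribution over a 10-token vocabulary for the first position.
pred = [0.01, 0.01, 0.02, 0.05, 0.05, 0.06, 0.1, 0.1, 0.1, 0.5]
assert abs(sum(pred) - 1.0) < 1e-9
loss = cross_entropy(pred, targets[0])  # target token 9 was assigned probability 0.5
assert abs(loss - math.log(2)) < 1e-9   # -log(0.5) = log 2
```

During training, this loss is averaged over a batch, gradients are computed by backpropagation, and the Adam optimizer updates the weights, repeated batch by batch and epoch by epoch as described above.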
The data used for training AI systems is extensive. Large amounts of data are
collected from various sources, preprocessed, and then used for training. Prepro-
cessing is necessary because data collected from various sources can vary greatly in
form and content and may include incorrect or irrelevant information. Subsequently,
fine-tuning is performed using specific datasets and human feedback, adjusting
performance in detail during this process. In particular, reinforced learning by human
feedback (RLHF) is crucial. Humans evaluate the model’s output, verify if the
model’s performance meets the intended accuracy and reliability, and identify and
correct the model’s biases or defects.
The performance of an AI system depends on the quantity and quality of the
training data used. If the information in the training data is accurate, the AI learns
correctly, but if the training data contains biased information, the AI model’s
responses will exhibit bias. Training large-scale AI models requires a large and
diverse set of datasets, including Common Crawl,40 BookCorpus,41 Wikipedia,
books, articles, journals, and other texts. For models like GPT-4o, multimodal data
such as images, audio, and video are also necessary.
AI systems only learn during the training process and do not learn further once
the training period is over.42 Thus, the weight parameters remain unchanged after
the training is completed. Even if the AI learns new facts during question-answering
sessions (which it can separately store), it does not immediately reflect these in the
weight parameters. This distinguishes AI from human intelligence. While humans
think about various aspects and store information in memory during the process
of understanding questions and generating answers, AI ends its role at generating
responses based on what it understands.
So far, we have examined how AI’s cognitive functions are implemented in
systems through a question-and-answer process. In conclusion, like humans, AI
understands questions and provides answers; however, every step is conducted
through numerical calculations. System implementation involves designing the

39 GPT-3 is known to have been trained on 570 gigabytes of text data from various sources. This
dataset includes hundreds of billions of tokens, which were processed over a large number of
training iterations. It is estimated that this training was completed in about 34 days using 1024
Nvidia V100 GPUs.
40 Common Crawl is a non-profit organization that regularly (e.g., monthly) crawls the web to
systematically collect data from websites and provides the archives and datasets to the public for
free. The Common Crawl web archive consists of petabytes (i.e., thousands of terabytes) of data
collected since 2008.
41 BookCorpus is a dataset composed of texts from about 7,000 self-published books collected
from the indie e-book distribution website Smashwords. This dataset consists of approximately
985 million words and includes books from various genres such as romance, science fiction, and
fantasy. It was used to train the initial GPT and BERT models.
42 For example, the final training of GPT-4 was completed on March 14, 2023. GPT-4o, released on
May 13, 2024, is an optimized version of GPT-4 offering new features and improvements, without
retraining GPT-4. In contrast, GPT-o1, released on October 4, 2024, is a completely new model,
designed and trained from scratch to achieve advanced capabilities in reasoning and problem-
solving.

computational architecture that executes these processes and determining the weight
parameters embedded within it. The understanding of the question happens within
the transformer blocks, while the answers are generated in subsequent blocks. The
values of the weight parameters are determined through training, which requires vast
amounts of data, computing power, and significant energy consumption. However,
once training is complete, the amount of computation and energy consumption
required during the usage phase is small.

5.9 Challenges and Limitations of AI

AI, with its immense potential, stands as a technological marvel that has rapidly
evolved and is poised to further accelerate due to intensified competition among
corporations. Its proliferation across industries and society is bound to have profound
impacts on human life. However, the path forward for AI is filled with numerous
challenges and limitations. Technical constraints are the first hurdles it will encounter,
and overcoming these will be a formidable task. Moreover, AI will likely give rise to
ethical and societal concerns, leading to significant debates and possibly resulting in
various regulatory measures. Such regulations could redefine the trajectory of AI’s
development within a more constrained framework.
The mixed feelings of anticipation and apprehension toward AI’s future among
humanity highlight the complexity of its integration into our lives. There’s a tangible
fear that, if not managed wisely, AI could bring about adverse outcomes for
humanity. It will be crucial to successfully navigate these challenges and limitations
in determining the future of AI.
Addressing technical challenges requires continuous innovation and research to
improve AI’s efficiency, reduce its environmental impact, and enhance its ability to
generalize across different tasks without compromising on performance. Ethically,
it necessitates a balanced approach that considers the societal impact of AI, ensuring
that its development and deployment are aligned with human values and benefit
society as a whole.
The potential regulatory landscape could both safeguard against the misuse of
AI and ensure that its development is aligned with ethical standards and societal
well-being. However, too stringent regulations could stifle innovation and hinder the
potential benefits AI could offer.
Ultimately, the future of AI will hinge on our ability to advance the tech-
nology responsibly, addressing ethical concerns and societal impacts while navi-
gating through the regulatory frameworks that may emerge. This balanced approach
will enable us to harness the full potential of AI, mitigating risks and ensuring that
its development and application contribute positively to humanity’s progress.

5.9.1 Technological Challenges and Limitations

AI faces significant technological challenges and limitations, many of which originate
from the transformer architecture itself and extend to large-scale AI models like
GPT and BERT. One of the biggest challenges is the vast amount of computing
power and training data required to train and utilize these models, along with the
environmental impact this entails. The quality of the training data can significantly
affect service quality, and inherent biases in the data can be transferred to the service.
Generalizing and expanding services without being constrained by training data is
another challenging task. Standardization for integration between different systems
or products from different companies is also crucial for the future development of
AI. Addressing these technological challenges is a prerequisite for the advancement
of AI.
High-performance AI models using deep learning and transformer architectures
are large-scale and computationally intensive. For example, GPT-3 has around
175 billion weight parameters and 96 transformer layers, while BERT has 110 million
(BERT-Base) and approximately 340 million (BERT-Large) weight parameters, with
12 (BERT-Base) and 24 (BERT-Large) transformer layers, respectively. Such models
require significant computing resources and memory, necessitating powerful devices
like GPUs and TPUs,43 as well as substantial computation time for training; their
successor models require even more resources. The training and operation of these
large-scale AI models consume high levels of computing power and energy, raising
concerns about carbon emissions and environmental impact.44
To achieve optimal performance, these models require extensive and diverse
datasets. For GPT-3, the training dataset includes hundreds of gigabytes of data from
sources like Common Crawl, Wikipedia, books, and other texts. BERT is trained
on the BookCorpus and English Wikipedia, which together comprise billions of
words. Acquiring such vast datasets is costly, and training AI models on this scale is
an expensive endeavor. Moreover, large-scale AI models may not perform well for
languages with insufficient training data.
AI models struggle with scalability and generalization. Expanding the model
size is challenging as it exponentially increases the required computational power
and memory space and the amount of data needed for effective training. Increasing
model size does not guarantee performance improvements and may make the model

43 GPUs were originally designed for graphic rendering and are capable of high levels of parallel
processing. They support various machine learning frameworks such as TensorFlow and PyTorch.
TPUs, developed by Google to accelerate machine learning tasks like deep learning, can handle
large-scale computations and are optimized for TensorFlow.
44 Efforts are actively underway to make AI models lighter while maintaining performance in order
to solve these problems. On one hand, research is being conducted to reduce the weight of current
AI models or increase their processing speed. On the other hand, research is aimed at emulating
the human brain, which performs high-level thinking without matrix operations like deep neural
networks. The Kolmogorov-Arnold Networks (KAN) may be regarded as an example of such efforts.
Such research is being carried out simultaneously at both the semiconductor level and the software
code level.

more prone to overfitting. Thus, the cost of enlarging the model may not justify
the performance gains. Training larger models often requires distributed systems,
complicating the coordination and parallelization of the training process. Therefore,
scaling AI models presents new challenges in computation, data, and efficiency.
The generalization problem in AI models, which stems from a lack of diversity in
training data, highlights several key limitations in AI models. These limitations arise
due to constrained context comprehension, biases in training data, and inherent limi-
tations in machine learning algorithms. AI models, relying heavily on their training
data, excel in domains similar to their training environment but often falter in unfa-
miliar territories or contexts. This is partly because machine learning models based
on AI, such as GPT and BERT, struggle with scenarios not well-represented in
their training data, lacking deep contextual understanding. The models’ overfitting
to training data features makes adapting to new, feature-different data challenging.
Without the capacity for common-sense reasoning to interpret new situations beyond
their training, current AI models face a significant barrier to generalization.
Moreover, the complexity and inaccessibility of systems like GPT and BERT lend
them a “black-box” nature, making it difficult to comprehend the reasoning behind
specific decisions or to explain generated or analyzed content. This poses a challenge
in applications requiring explanations, such as in the medical or financial sectors.
Transparency and accountability become crucial when errors or biases emerge, espe-
cially in applications where these qualities are demanded. The lack of transparency
complicates diagnosing and rectifying issues related to generalization.
Interoperability, integration, and standardization also represent significant chal-
lenges for AI. The diversity of data formats and protocols across different systems
and industries leads to compatibility issues, hindering seamless interaction among
various AI algorithms. The absence of universal standards for AI models and data
complicates achieving interoperability between different AI systems. The complexity
and opacity of deep learning-based AI models limit the ability of different AI systems
to understand and utilize each other’s outputs. While the rapid evolution of AI tech-
nology makes establishing unified standards premature, the diverse requirements and
constraints across domains render the creation of universal standards impractical.
These challenges underscore the necessity for ongoing research and development
in AI to enhance generalization capabilities, improve transparency and explainability,
and foster interoperability through adaptive and flexible standards. Addressing these
issues will be crucial for realizing the full potential of AI across a broad spectrum
of applications, ensuring it can be deployed responsibly and effectively in various
domains.

5.9.2 Social and Ethical Challenges

AI’s development raises various social and ethical concerns, fundamentally origi-
nating from biases in training data. These biases, when ingrained in AI systems, can

be reflected in the outputs, potentially causing significant social and ethical reper-
cussions. Addressing these social and ethical challenges is crucial for AI to progress
smoothly with societal support.
Large-scale AI models heavily rely on vast amounts of high-quality data, and
their performance is directly linked to the quality of training data. Biases in training
data can lead to unfair or discriminatory outcomes, especially in sensitive areas like
employment, lending, law enforcement, and credit scoring. Biases may manifest in
various forms, such as racial, gender, or socioeconomic biases, leading to ethical
dilemmas. The inclusion of personal information in training data also raises privacy
concerns.
One of the critical issues with AI systems is the lack of transparency and account-
ability. The black-box nature of deep learning models makes it difficult to under-
stand and explain AI decisions. This opacity can lead to unintended biases and
ethical issues, fostering suspicion and mistrust toward AI system operators. There’s
a growing demand for AI systems to be not only accurate but also interpretable and
explainable.
The issue of accountability in AI systems is also contentious. As AI autonomy
increases, it becomes unclear who should be held accountable for AI’s decisions—
developers, users, or the AI itself. In addition, ensuring AI systems comply with laws
and regulations and determining liability in case of malfunctions are critical aspects
of AI’s responsibility.
The most significant societal concern with the advancement of AI is the issue
of employment. There is a possibility that AI, especially through automation, could
replace jobs involving routine and repetitive tasks. This includes occupations in
manufacturing and warehouse management, data entry and processing, customer
service and support, retail and sales, transportation and delivery, accounting and
bookkeeping, basic analysis, and medical support. As AI systems become more
extensive and intelligent, even professions requiring a high degree of specialized
knowledge, such as legal services, medical services, management support services,
and research and development tasks, could potentially be replaced by AI. The replace-
ment of these tasks by AI means not so much that the jobs will disappear entirely,
but rather that the focus of the work shifts to AI, changing the role of people in
those jobs. On the other hand, AI also creates new job opportunities in fields such as
AI development, data analysis, machine learning, cybersecurity, and AI ethics and
governance.
The advancement of AI holds the potential to create new industries and markets,
enhance the efficiency of production and business, and spur economic growth.
However, if the benefits of AI technology are not widely shared but become concen-
trated, or if AI infrastructure is not evenly distributed, social inequality could increase.
It could widen the gap between those proficient in and able to utilize AI technology
and those who are not, as well as between those with access to education and training
in fields like computer science, data analysis, and engineering and those without, and
between workers who can adapt to and learn AI-related technologies and those who
cannot. This can be considered the ‘AI divide’, analogous to the ‘digital divide’

of the digital transformation era. Like the digital divide, the AI divide could ulti-
mately expand socioeconomic disparities or shift social dynamics in unforeseen
ways. Addressing this gap issue requires the construction of AI infrastructure to
effectively utilize AI technology, along with education in ‘digital and AI literacy’.
AI raises various concerns and issues related to personal information and privacy.
AI systems, needing a vast amount of data for effective training and operation, often
collect individuals’ data without explicit consent or a complete understanding of
data usage, even with consent. This collected personal data can become a target
for cyber-attacks, and AI’s unauthorized use of sensitive information like an indi-
vidual’s health status or personal preferences can cause privacy issues. Moreover, as
AI unconsciously learns and applies biases present in the training data, it could violate
privacy and personal dignity. Strong legislation and strict enforcement regarding
personal data collection, use, and privacy invasion are necessary to prevent these
issues. Particularly, it is crucial that AI technologies are developed to be aware of
and comply with privacy regulations.
AI’s impact on society is diverse, but among these, ethical issues are the most
complex and serious. The bias and fairness issues, transparency and accountability
problems, and personal information and privacy issues mentioned above are all inter-
connected with ethical concerns. These internal problems of AI systems are matched
by external problems, or ethical issues arising from the use of AI technology.
Malicious use of AI technology can cause significant harm to society and
human lives, with cyber-attacks, the spread of false information, and autonomous
weapon systems (AWS)45 being prime examples. Especially important is preventing
the destructive use of AI in critical infrastructure and defense sectors and main-
taining security from cyberthreats. In addition, using AI technology for surveillance
techniques like facial recognition or behavior prediction can compromise human
autonomy and dignity, affecting individual freedom and privacy adversely. There-
fore, it is essential to ensure that AI is developed and used safely, not posing a threat
to global security or producing false information such as deepfakes.46

45 An autonomous weapon system (AWS), also referred to as a lethal autonomous weapon (LAW)
or a killer robot, is a weapon system that uses AI to identify, select, and engage targets without
human intervention. Unlike unmanned drones that are remotely controlled by humans, autonomous
weapons make decisions using AI algorithms on their own.
46 Deepfake is a digitally forged artifact created using AI algorithms, manipulating audio and video
content to make it appear as if an individual has done something they have not actually done. With
the advancement of deepfake technology, the content produced is becoming increasingly difficult
to distinguish from reality, raising various ethical and legal issues at social, political, and personal
levels.

5.9.3 Risks and Threats

The creation of superintelligent AI that far surpasses human intelligence is perceived
as a risk to humanity itself. The moment we question whether such superintelligence
5.9 Challenges and Limitations of AI 155

can be controlled by human intelligence or not, AI becomes recognized as a threat-
ening entity. Besides these vague risks and threats, there are various dangerous and
threatening elements associated with AI. How these elements are resolved and safely
managed will determine the success or failure of AI technology. If these issues are
well resolved or safely managed, AI will benefit and advance human life; otherwise,
AI could cause suffering to humanity and potentially lead to its destruction.
When examining the elements that make AI a risk, they can be broadly cate-
gorized into four types, namely technical, ethical, social, and security risks. First,
technical risks include algorithmic bias, failure or unexpected behavior in unforeseen
situations, actions contrary to human intent, decision-making errors, and limits in
understanding complex human contexts. Second, ethical risks encompass concerns
about bias and discrimination, personal information usage and privacy invasion, the
issue of false information production and spread, the potential violation of human
autonomy in decision-making processes, problems related to the use of surveillance
technology, and other potential misuses. Third, social risks include job displacement
and changes, unequal access to AI opportunities, the inducement of socioeconomic
inequalities, the potential for market concentration and changes in social dynamics,
and issues related to perceptions and trust in AI. Fourth, security risks involve cyber-
attacks, infrastructure destruction, data breaches, and vulnerabilities to hacking and
manipulation. These risk factors were already discussed in various sections above.
Looking at the elements that make AI a threat, there are several, including the
weaponization of AI, its use as a tool for information manipulation, cyber-attacks,
and surveillance. Firstly, when AI technology is deployed in military equipment such
as autonomous weapon systems (AWS), it poses a threatening force. Autonomous
weapons make decisions autonomously based on algorithms, making it difficult to
determine who is responsible for the outcomes. That is, in cases where autonomous
weapons cause unintended casualties, it is unclear who should be held respon-
sible—manufacturers, programmers, military commanders, or governments. Unlike
humans, who can make decisions considering moral and ethical implications based
on the situation, autonomous weapons simply make decisions based on algorithms. In
addition, it is challenging to train AI in complex and dynamic situations like actual
combat, so autonomous weapons might act unpredictably in real-world scenarios.
Furthermore, they are vulnerable to hacking and cyber-attacks, raising the possibility
of malfunction and the threat that they could fall into the hands of terrorist groups
or anti-state organizations.
Using AI technology to covertly manipulate social media feeds, search results,
and news recommendations can subtly yet powerfully influence public opinion and
societal behavior. Particularly, the creation of fake audio and videos, or deepfakes,
using AI technology, makes it difficult to distinguish authenticity, posing a significant
threat. Deepfakes not only threaten the integrity of information but can manipulate
public opinion, impact elections, and lead to political instability. Using deepfakes
to combine audio and video into fabricated scenarios can lead to severe privacy
violations and defamation, and depending on the content, can incite discord, violence,
cyber-attacks, and riots. Furthermore, deepfakes blur the line between reality and
fiction, causing people to doubt the truthfulness of all reports, eroding public trust in
the media, and leading to a disconnection from reality. It also makes the fact-checking
work of the press more challenging. Moreover, deepfakes are hard to detect because
as soon as AI algorithms are developed to detect them, new methods of creating
undetectable deepfakes could emerge. Moreover, punishing the misuse of deepfakes
without suppressing legitimate expression and respecting intellectual property rights
is highly challenging.
If AI is used for cyber-attacks, it can be extremely threatening in terms of the
scale and sophistication of attacks. Integrating AI into hacking devices and tech-
niques or into cyberwarfare and cybercrime can lead to severe cybersecurity threats.
AI technology can analyze vast amounts of data more efficiently than human hackers
to identify system vulnerabilities, learn and adapt in real-time to changing situa-
tions, and adjust strategies accordingly. AI cyber-attacks can automate and opti-
mize the execution of attacks, making them more sophisticated and harder to detect.
AI cyber-attacks can target AI systems such as autonomous vehicles or industrial
control systems, potentially disrupting operations or damaging the learning process
or outcomes. AI can also be used for cyberbullying, using deepfakes as a weapon for
personal attacks and cyberharassment.
Utilizing AI technology for surveillance and privacy invasion poses a serious
threat. AI technology can process vast amounts of data from various sources such as
CCTV cameras, online activities, and mobile devices, and if this capability is used
for mass surveillance, it can greatly infringe on privacy and freedom. In addition,
using AI for high-precision facial recognition technology can identify individuals
in crowds, track movements and behaviors, and create detailed profiles of personal
activities. If government agencies use this technology for monitoring citizens, it can
lead to significant privacy invasion and restrict personal freedom. While the government
may install CCTV under the pretext of preventing crime, it could ultimately be used
for surveillance and profile creation of specific communities or local residents. If
personal information gathered through CCTV and facial recognition technology is
combined with data from online activities and mobile devices to create comprehen-
sive profiles, it could compile sensitive information about an individual’s habits,
health status, financial situation, political inclinations, and religious preferences,
potentially leading to detrimental consequences if misused. This poses a grave
threat to privacy and personal freedom.
How can we mitigate or manage these risks and threats? For technical risks, we
can do little more than encourage developers to do their best to solve the problems.
However, other risks related to usage, i.e., ethical, social, and security risks, need to
be addressed through strict problem identification, planning countermeasures, and
taking actions. First, we need to create a model for assessing risks, apply the model
to real-world problems to analyze risks, and then develop strategies for risk manage-
ment. Following that, we should implement strategies, regulate through legal systems,
encourage stakeholder understanding, and educate the public. Especially for deep-
fakes and cyber-attacks using AI technology, technical responses for detection and
defense are crucial, as is the establishment and strict enforcement of stringent legal
systems. The issue of government surveillance or privacy invasion is something that
might happen in a controlled totalitarian state, but it cannot be ignored in democratic
countries, necessitating proper legal mechanisms and constant monitoring through
civic activities.
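The risk-assessment step described above (build a model, apply it to real-world problems, then prioritize countermeasures) can be sketched as a simple likelihood-impact scoring exercise. The 1–5 scales and the example risk entries below are illustrative assumptions, not figures from the text:

```python
# A likelihood-impact scoring sketch for ranking AI risks before planning
# countermeasures. Scales (1-5) and entries are illustrative assumptions.
risks = {
    "deepfake disinformation": {"likelihood": 4, "impact": 5},
    "data breach":             {"likelihood": 3, "impact": 4},
    "algorithmic bias":        {"likelihood": 4, "impact": 3},
}

def score(risk: dict) -> int:
    """Score a risk as the product of its likelihood and impact ratings."""
    return risk["likelihood"] * risk["impact"]

# Rank risks so mitigation strategies can be planned for the highest first.
ranked = sorted(risks, key=lambda name: score(risks[name]), reverse=True)
print(ranked)
```

In practice such a matrix only structures the discussion; the ratings themselves must come from the stakeholder analysis and monitoring activities the text describes.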
The key to wisely resolving AI’s risks and threats on the way to a future society lies
in establishing a safe relationship between humans and AI. As AI systems become
more intelligent and autonomous in the future, they should be designed and trained
to respect and coexist harmoniously and cooperatively with humans. In addition,
the goals and actions of AI systems should align with human expectations, and AI
ethics should match human values. Thus, in coexistence with humans, AI’s role
should complement rather than replace human capabilities. In decision-making, AI
should not decide in place of humans but assist or complement human decisions.
Trust in AI is a prerequisite for cooperation and reliance in crucial decision-making.
Developing AI systems that can be trusted is a fundamental duty of developers,
and avoiding excessive dependency on AI is a fundamental attitude of users. Trust
in AI means it has reliably addressed issues of bias and fairness, transparency and
accountability, and privacy and personal information, alleviating social and ethical
concerns.

5.9.4 Regulation and Governance

In order to address the social and ethical issues posed by AI technology, appropriate
regulations need to be established. For internal problems such as bias and fairness
issues, transparency and accountability issues, and personal information and privacy
issues, it is necessary to set ethical guidelines for professionals to keep in mind during
AI research and development. For external problems that threaten societal and interna-
tional safety, such as cyber-attacks, the spread of false information, and autonomous
weapon systems, proactive measures need to be taken by enacting domestic and inter-
national laws and systems. However, due to the rapid changes in AI technology and
conflicts of interest among nations, regulatory enactment faces several challenges.
Regulation is meant to prevent problems from recurring or spreading by under-
standing the essence of the issue, but it is difficult to establish a framework for AI
technology. The pace of technological development outstrips the regulatory process,
making it hard to keep up. Laws and regulations that cannot keep pace with tech-
nological advancement quickly become outdated, failing to adequately address new
developments or risks, and can sometimes become obstacles. In addition, while AI is
developed and used worldwide, regulatory approaches vary by country and region,
complicating regulatory enactment. Establishing consistent international standards
and regulations requires extensive international cooperation and consensus building,
which is complex and slow due to conflicting national interests and concerns.
When enacting regulations for AI, it is essential to strike a balance between
fostering innovation and ensuring responsible oversight, while also considering the
timeliness of intervention. While regulations are critical for managing risks, rushing
into overly strict or premature regulations can hinder or stifle technological progress.
The challenge lies in creating policies that encourage technological advancements
while mitigating potential risks—a delicate balance that defines the art of regula-
tion. Historical examples illustrate how this balance has been successfully achieved,
benefiting both society and national development. For instance, the US government
introduced regulations when AT&T sought to expand into the computer business after
its communications technology had sufficiently matured. Similarly, current efforts
to regulate digital platforms have emerged only after the technology developed to
a level where issues of fair competition became evident. However, AI technology
differs fundamentally from these previous cases due to its vast potential power and
associated risks. Unlike other technologies, if AI is not regulated before its full devel-
opment and widespread commercialization, there may be no effective way to mitigate
its risks later. Thus, proactive research and public consultation are crucial for devel-
oping comprehensive ethical guidelines. In addition, international cooperation must
be accelerated to establish a unified regulatory framework that addresses the global
nature of AI’s influence and its potential impact on society.
When creating regulations and ethical guidelines for AI, several key areas need to
be addressed. Firstly, internal issues related to AI technology development should be
regulated through comprehensive ethical guidelines for developers. These include:
1. Transparency and Explainability: AI systems should be designed to be trans-
parent, allowing users to understand how decisions are made.
2. Bias and Fairness: Ethical guidelines should prevent biases learned during
training from leading to biased or unfair outcomes in the AI system’s operations.
3. Accountability: Guidelines are needed to clarify responsibility for decisions made
by AI systems and legal accountability for any problems that arise, including
moral and legal responsibilities related to autonomous systems like self-driving
vehicles.
4. Safety and Security: AI systems must be designed and built to operate safely and
be protected from cyberthreats. Developers should follow guidelines that ensure
the development of robust security protocols and regular vulnerability checks.
5. Sustainability and Environmental Impact: Developers should consider the
substantial energy consumption of large-scale AI models and their environmental
impact, and guidelines should encourage the development of energy-efficient AI
technologies.
Secondly, for external issues related to the use of AI technology, various laws and
systems need to be established. These include:
1. Laws on Data Use and Personal Information Handling: Strict regulations are
necessary for data collection, use, and sharing, including consent for data
collection, secure data storage, and prevention of personal information leaks.
2. Regulations to Prevent the Spread of False Information: Strict laws and enforce-
ment are needed to prevent the production and spread of false information using
AI technologies like deepfakes.
3. Compliance with Ethical Guidelines and the Principle of Harmlessness: Laws
should prohibit the use of AI for harmful purposes, such as autonomous weapon
systems development or unauthorized surveillance.
4. Human Oversight of AI Decisions: Institutionalizing human supervision over AI
decision-making processes to ensure fairness, ethics, and responsibility.
5. Public Participation and Education: Incorporating public opinion and education
into AI governance, promoting public discourse on the social use of AI, and
educating the public about AI capabilities and risks.
6. International Consensus and Cooperation: International cooperation is neces-
sary to create and enforce AI regulations, setting global standards to address
transnational challenges.
In order to support the development and implementation of ethical guidelines and
legal frameworks for AI, a robust governance structure is essential. This structure
should include institutions at both national and international levels capable of over-
seeing, monitoring, and regulating AI development and usage. In addition, to ensure
the democratic legitimacy of AI policies and foster public trust, public participa-
tion in AI governance is crucial. These regulatory frameworks, alongside supporting
management organizations, are key to ensuring the responsible development and
deployment of AI. The challenge of the AI era is to establish an effective, ethical,
and sustainable governance framework through collaboration among governments,
industry, academia, and civil society. This cooperation maximizes AI’s benefits while
minimizing its risks.
Addressing AI’s ethical challenges requires a multi-stakeholder approach. A
diverse group of stakeholders, including AI experts, entrepreneurs, ethicists, poli-
cymakers, and the general public, must collaborate to legislate and systematically
regulate AI. This ensures that AI development remains responsible, ethical, and
sustainable. Governance bodies for AI regulation and ethical guidelines can operate
at multiple levels: international, regional, and national. At the national level, AI
governance must be inclusive, involving governments, corporations, universities,
research institutions, and civil society organizations.
At the international level, organizations such as the UN, EU, and other regional
bodies can play a crucial role in setting global standards and norms for AI ethics and
regulation through existing international cooperation platforms. National govern-
ments are central to creating and enforcing AI regulations, either by forming dedi-
cated AI management organizations or assigning these tasks to existing regulatory
agencies. The private sector can take a proactive role by forming industry consortia
and standards bodies to self-regulate AI, utilizing existing standards organizations
like the IEEE and ISO to establish technical and ethical benchmarks. Corporations
can also establish their own internal regulatory and ethical standards for AI. In addi-
tion, citizens’ organizations can contribute by shaping public opinion on AI policy
and providing input to governments, ensuring that societal values are reflected in AI
governance and ethical discourse.
5.9.5 EU’s Artificial Intelligence Act (AIA)

The Artificial Intelligence Act (AIA), enacted by the European Union (EU) in 2024,47
is a comprehensive legislative measure that synthesizes and addresses the challenges
and limitations of AI discussed so far: technical, ethical, and social issues, as well as
the risks and threats posed by AI and its regulation and governance. The AIA is the
world’s first comprehensive legal framework
designed to address the challenges posed by AI while promoting innovation. The
AIA aims to ensure the ethical, safe, and transparent use of AI by establishing clear
obligations for AI developers, distributors, and users. The act focuses on the
reliability, accountability, and transparency
of AI systems, ensuring they respect fundamental human rights and align with societal
values.
At the heart of the AIA is the classification of AI systems into four categories
based on their risk levels:
1. Unacceptable Risk: AI systems that pose a significant threat to human safety or
fundamental rights are prohibited. This includes AI systems used for real-time
remote biometric identification in public spaces for law enforcement purposes
(with exceptions), social scoring based on personal behavior or characteristics,
and systems that manipulate human behavior or thought.
2. High Risk: AI systems that could affect human safety, health, or rights must
comply with strict regulations. These systems include AI in healthcare, trans-
portation, and law enforcement. Obligations include conformity assessments,
risk management, and ongoing monitoring to ensure safety and compliance with
performance standards.
3. Limited Risk: These systems must meet transparency requirements. Users must
be informed when they are interacting with an AI system, and AI-generated
content must be clearly labeled (e.g., text, audio, and video, including deep-
fakes).48 Transparency and disclosure are critical to ensuring users are aware of
AI involvement.
4. Minimal Risk: This category includes AI systems that pose minimal or no
risk and can be freely used. Examples include spam filters and AI-enabled video
games, which do not require significant oversight.

47 The Artificial Intelligence Act (AIA) was first proposed by the European Commission on April
21, 2021. It was passed by the European Parliament on March 13, 2024, and unanimously approved
by the Council of the EU on May 21, 2024. The AIA was published in the Official Journal of the
EU on July 12, 2024, and has come into effect on August 1, 2024.
48 The AIA regulates deepfakes. It mandates that any image, audio, or video content generated or
manipulated by AI must be clearly labeled as such. However, exceptions can be made for legal
purposes.
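As a compact illustration of this four-tier scheme, the categories and their headline obligations can be sketched as a small lookup table. The example systems are drawn from the summary above, and the helper function is an assumption for illustration, not an exhaustive legal mapping:

```python
# Illustrative sketch (not legal advice): the AIA's four risk tiers mapped to
# example systems and headline obligations, as summarized in the text.
AIA_RISK_TIERS = {
    "unacceptable": {
        "examples": ["social scoring", "behavior manipulation",
                     "real-time remote biometric ID in public"],
        "obligation": "prohibited (with narrow exceptions)",
    },
    "high": {
        "examples": ["AI in healthcare", "AI in transportation",
                     "AI in law enforcement"],
        "obligation": "conformity assessment, risk management, ongoing monitoring",
    },
    "limited": {
        "examples": ["chatbots", "AI-generated media, including deepfakes"],
        "obligation": "transparency: disclose AI involvement, label AI content",
    },
    "minimal": {
        "examples": ["spam filters", "AI-enabled video games"],
        "obligation": "no significant oversight",
    },
}

def obligation_for(system: str) -> str:
    """Look up the headline obligation for a hypothetical system label."""
    for tier, info in AIA_RISK_TIERS.items():
        if system in info["examples"]:
            return f"{tier}: {info['obligation']}"
    return "unclassified: assess against the four tiers"

print(obligation_for("spam filters"))
```

A real compliance assessment would of course turn on a system’s intended purpose and deployment context under the act’s definitions, not on a keyword match.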
The AIA imposes additional obligations on providers of General-Purpose AI
(GPAI) models, which perform a wide range of tasks. Providers must maintain tech-
nical documentation, publish data summaries, and ensure compliance with EU copy-
right laws. While open-source GPAI models face fewer restrictions, they must meet
the basic requirements related to copyright and data summaries unless they pose
systemic risks, in which case additional evaluations and risk mitigation measures are
required.
The act mandates that AI developers and operators manage data quality, ensuring
transparency in the system’s operations and decision-making processes. Clear expla-
nations of the AI system’s functionality, limitations, and risks must be provided to
users. The AIA also requires human oversight over AI operations, ensuring humans
can intervene or stop AI decision-making if necessary. In addition, post-market
monitoring is required, with AI providers expected to report serious incidents or
malfunctions promptly.
One of the significant features of the AIA is the establishment of AI testing
sandboxes, which are controlled environments designed to allow for the development,
testing, and validation of AI systems before their deployment in the market. They
will provide guidance and support for participants, especially SMEs, to experiment
with high-risk AI systems in a controlled environment with regulatory oversight.
This promotes innovation while ensuring compliance with legal standards.
The AIA introduces strict penalties for non-compliance, including fines of up to
7% of a company’s global annual revenue. Repeated violations could lead to even
harsher penalties, such as divestiture of parts of the business as a last resort.
Recognizing the need for global leadership in AI regulation, the AIA sets a
precedent for other nations to follow, potentially influencing international standards.
Alongside the AIA, the EU enacted the Digital Markets Act (DMA) and the Digital
Services Act (DSA) in 2022, marking 2022 and 2024 as pivotal years for both digital
and AI governance. The AI Act
is seen as a cornerstone in regulating AI development, promoting responsible AI,
and addressing emerging challenges such as deepfakes, data privacy, and bias in AI
systems.
In conclusion, the AIA represents a balanced approach to fostering innovation
while ensuring that AI systems remain ethical, safe, and respectful of human rights.
By prioritizing risk management, accountability, and transparency, the AIA lays the
foundation for responsible AI use in the EU and beyond.
Chapter 6
Digital and AI Transformation of Industry

Industry was the first to embrace digital technology, initiating the digital transfor-
mation. The reason industries focused on digital transformation early on was due
to competition. Failing to transform digitally means falling behind in the compe-
tition and, eventually, becoming obsolete. Just as sticking to plow farming would
inevitably lead to being outcompeted by tractor farming in the industrial age, insisting
on simple tractor farming in the digital and AI era will result in obsolescence by digi-
tally enhanced autonomous tractors. Therefore, companies have recognized digital
transformation as a timely challenge and have competitively jumped into it. Because
companies led the way in digital transformation, the act of integrating digital tech-
nologies into various business areas to change operations was initially defined as
digital transformation.
Lately, as digital technologies matured, AI technology has emerged and started to
advance rapidly. Catching this trend, companies are once again shifting their direction
toward AI technology. They are applying AI technology significantly to tasks such as
processing large volumes of data, making real-time decisions, precise quality control,
and proactive customer service, aiming for automation and intelligence. This is what
is known as the AI transformation. However, since AI technology is included in
digital technology and the AI transformation is seen as an extension of the digital
transformation, they are collectively referred to as the digital-AI transformation.

6.1 Benefits of Digital Transformation

The digital transformation in the industrial sector goes beyond just adopting new
digital technologies; it involves using digital tools and technologies to optimize
existing operational methods, enhance customer experience, and create new busi-
ness models. In essence, digital transformation reflects the incorporation of digital
tools and concepts into business models, processes, and organizational structures,

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_6
promoting data-driven decision-making, automated procedures, digital collabora-
tion, and innovation, thus improving industrial management efficiency, fostering
corporate innovation, and creating new value.
The reason businesses were the first to pay attention to digital transformation is
because business activity itself is competitive. Business activities involve competing
with other companies for the consumer’s attention in the market with products and
services. This competition applies not only to the domestic market but also to the
global market. While politics, society, culture, education, and other sectors may focus
more on domestic issues, the economy and industry are always facing international
competition. Therefore, businesses must be keenly aware of technological and soci-
etal changes, adopting new technologies immediately as competitive weapons. In
particular, significant undertakings like digital transformation cannot be overlooked
or delayed by businesses.
Digital transformation requires substantial investment and data. Significant invest-
ments are needed to introduce digital devices and technologies and to hire digital
technology experts. So the companies with the capability to mobilize capital are at
an advantage in digital transformation. In addition, digital transformation is more
effective when there is a large accumulation of data related to production processes,
supply chains, and customers. So the companies that have accumulated substantial
data from past business activities and those capable of rapidly collecting data are at
an advantage, similar to how Google or Amazon have gained a significant competi-
tive edge by collecting vast amounts of user information. Capital and data can create
synergistic effects. Considering these points, generally, large corporations, perceived
to have investment capabilities and accumulated data, are in a more advantageous
position for digital transformation than small and medium-sized companies.
The benefits obtained from digital transformation for businesses are diverse, as
listed below:
First, it can improve efficiency through process automation and operational
optimization, reduce costs, and shorten market entry times.
Second, by utilizing digital technologies, businesses can offer personalized
services tailored to customer needs, enhancing customer interaction and satis-
faction.
Third, digital transformation can accelerate innovation, making it possible to
develop new products, services, and revenue streams that were previously
unimaginable.
Fourth, utilizing real-time data and advanced analytics for data-driven decision-
making can optimize business strategies and quickly respond to market changes.
Fifth, embracing digital transformation allows businesses to react agilely to market
changes, maintaining a competitive edge.
Sixth, using digital technologies enables businesses to explore global markets and
access customers worldwide.
Seventh, digital transformation can optimize resource use and minimize environ-
mental impact, moving closer to sustainability goals.
As such, digital transformation in the industrial sector has become a strategic
element that boosts business efficiency, innovation, and competitiveness. In any case,
with the advent of the digital age, everything becomes digitalized and the world is
interconnected digitally, making digital transformation inevitable for businesses. The
key is how quickly businesses can respond during the digital transformation era to
maintain a comparative advantage and competitiveness against their rivals, and how
they can revamp their operations and pursue technological innovation to provide new
experiences and value to customers.

6.2 Drivers of Digital Transformation

What motivates the digital transformation in industry? As mentioned earlier, business
activities inherently involve competition, and industries were the first to apply digital
technologies competitively, leading to digital transformation. While the competitive
business environment is a direct motivator, several other factors also drive the digital
transformation in industry.
The most fundamental factor driving digital transformation in industry can be
attributed to digital technology itself. Businesses were able to adopt these technolo-
gies to gain a competitive edge, which in turn changed the industrial landscape and
altered market and consumer expectations, making it imperative for companies to
adapt to these changes. The issue extends beyond the emergence or evolution of a few
new technologies; entirely unforeseen digital technologies have also appeared, and
existing technologies have transformed, shifting the industrial paradigm and putting
pressure on companies to change. Digital transformation offers new opportunities
and benefits in productivity improvement, cost reduction, and others, making it indis-
pensable. In addition, the fear that falling behind in digital transformation could widen
the gap with successful competitors and lead to downfall cannot be ignored.
In general, the factors driving digital transformation in industry can be listed as
follows:
First, the emergence of various digital technologies, such as 5G, IoT, cloud
computing, big data, AI, etc., provides powerful tools that improve operational
efficiency, reduce costs, and create innovative solutions.
Second, changes in consumer consumption patterns due to societal changes lead
industries to adopt digital strategies to meet evolving customer expectations for
personalized experiences.
Third, the entry of new market competitors armed with digital technologies,
including digital ventures and digital-native competitors, forces traditional indus-
tries to find countermeasures in digital technology, leading to self-innovation and
adaptation of manufacturing methods and operational models to digital standards.
Fourth, the process of self-innovation reveals the true value of data and digital
technology, prompting companies to pursue digital transformation and use data
analytics for valuable insights, driving efficient management and competitive advantage.
166 6 Digital and AI Transformation of Industry

Fifth, changes in the industrial environment, such as new industry standards
and regulations, necessitate the adoption of digital solutions for data
security, privacy protection, and compliance.
Sixth, the globalization trend in the digital age, as information and business cross
borders, requires digital technologies for efficient communication, supply chain
management, and global market expansion.
However, the process of pushing for digital transformation in industry comes with
its challenges and new tasks. Traditional industries may face difficulties adopting new
digital technologies due to legacy systems and processes. Employees and manage-
ment might resist adopting unfamiliar digital technologies, necessitating a strategic
approach and a cultural shift within the company. Increased adoption of digital tech-
nologies exposes businesses to cybersecurity threats and data breaches, requiring
strong security measures. Digital transformation can raise privacy concerns with
the collection and use of customer data, necessitating data protection regulations.
Initial costs and ongoing investments can pressure finances, especially challenging
for small and medium-sized enterprises. After digital transformation, there is a need
for experts in data analysis, AI, cybersecurity, etc., making talent acquisition costly
and challenging.
As such, industries adopt digital transformation for various reasons, such as
technological advancements, changing customer expectations, competitive pressure,
the value of big data, changes in standards and regulations, and business
globalization. However, they face difficulties such as integrating digital systems
with legacy systems, internal staff’s reluctance toward digital change, cybersecurity
and privacy issues, the costs of digital transformation, and a shortage of digital
talent.1

6.3 Application of Digital Technologies

The digital tools supporting digital transformation in the industry are the digital
technologies that have driven the digital transformation itself. These technologies
are detailed in Chap. 4, and those closely related to industrial digital transformation
are as follows:
1. 5G Mobile Communications Technology: It is superior to 4G in terms of
data transmission speed, latency, the number of simultaneous connections, and
frequency efficiency. It is essential for future-oriented services like autonomous

driving, ITS, Internet of Things, smart cities, telemedicine, remote education,
and industrial automation.

1 A Boston Consulting Group study found that the actual success rate of digital transformation is
only about 30%. The report suggests strategies to increase the success rate to 80%, including setting
a cohesive strategy reflecting clear change goals, rallying commitment from top executives to middle
managers, deploying top digital talent, adopting an agile organizational management mindset for
widespread digital transformation, effectively monitoring progress toward targeted outcomes, and
well-equipping business-centric modular technology and data platforms. Refer to Boston Consulting
Group’s Digital Transformation Report “Flipping the Odds of Digital Transformation Success” by
Patrick Forth et al., October 29, 2020.
2. Internet of Things (IoT): It expands the internet from people to objects (or
things), connecting surrounding objects to the internet for mutual communi-
cation, data sharing, and intelligent operation. It promotes digital transforma-
tion by providing connectivity, automation, efficiency, and innovation through
automation and improved efficiency, real-time situation awareness and action,
resource management, and energy savings.
3. Cloud Computing: It allows users or businesses to access storage devices,
computers, and application software installed on cloud provider servers for
data storage and processing, instead of owning physical hardware and infras-
tructure. It offers an economical and scalable solution by reducing the invest-
ment needed for digital devices and ongoing management costs and enables
remote collaboration and document sharing regardless of location.
4. Augmented Reality (AR), Virtual Reality (VR), Metaverse: AR enhances users’
perception of reality by overlaying digital information on the real-world view.
VR creates a digital environment where users can act as if physically present in
a virtual world. The metaverse is a digital platform where the virtual and real
worlds merge, allowing users to participate in various social activities in virtual
spaces through avatars, showing potential for diverse applications.
5. Digital Twin: It is a technology that creates a digital counterpart of a real-world
object in a computer, allowing for the simulation of real-world situations to
predict outcomes and find solutions to problems. Digital twins are powerful
digital objects that can significantly improve the operation performance of real-
world objects and business processes.
6. Big Data: It refers to the analysis of large volumes of data to extract mean-
ingful patterns, correlations, trends, insights, and knowledge. Big data analytics
enables a deeper understanding of phenomena, data-based decision-making, and
optimization of business processes.
7. Robots, Robotic Process Automation (RPA): Robots are mechanical devices or
machines equipped with sensors, actuators, and computing systems that interact
with the environment and perform tasks based on pre-programmed commands.
Robots are used in a variety of fields from product assembly and processing to
surgical assistance. RPA uses software robots to automate business processes,
automating rule-based tasks to replace or assist human labor.
8. Blockchain: A decentralized, distributed technology developed to allow peer-to-
peer transactions without central authority intervention. Blockchain technology
provides a secure, transparent, and efficient means of recording and verifying
transactions, offering potential applications across various industries, including
cryptocurrencies like bitcoin and non-fungible tokens (NFTs).
9. 3D Printing: It is a manufacturing technology that builds three-dimensional
objects by layering material from the bottom up. 3D printing enables remote
manufacturing, where objects can be produced at remote locations by trans-
mitting digital or design data of the scanned object. It is widely applicable in
prototype production, tool manufacturing, architectural modeling, implant and
prosthetic production, etc.
10. Artificial Intelligence (AI): It refers to machines that mimic human intelligence,
capable of thinking, learning, reasoning, problem-solving, and decision-making
like humans. AI implementation involves various technologies such as machine
learning, deep learning, neural networks, natural language processing, computer
vision, and robotics. These AI technologies can improve data analysis, pattern
recognition, automation, judgment and decision-making, product development,
and customer support.
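Several of these technologies can be illustrated compactly. For instance, the core idea of a digital twin, a software model kept in sync with sensor readings so that outcomes can be simulated in advance, can be sketched as follows (the tank, its names, and its dynamics are illustrative assumptions, not any real product):

```python
# A toy 'digital twin' sketch: a software model of a tank mirrors sensor
# updates from its physical counterpart and simulates ahead to predict when
# it will overflow, instead of waiting for the real event to happen.

class TankTwin:
    def __init__(self, capacity, level=0.0, inflow_rate=0.0):
        self.capacity = capacity        # litres the physical tank can hold
        self.level = level              # mirrored fill level
        self.inflow_rate = inflow_rate  # mirrored inflow, litres/minute

    def sync(self, sensed_level, sensed_inflow):
        """Mirror the latest readings from the physical tank."""
        self.level, self.inflow_rate = sensed_level, sensed_inflow

    def minutes_until_full(self):
        """Run the prediction on the twin, not on the physical asset."""
        if self.inflow_rate <= 0:
            return None
        return (self.capacity - self.level) / self.inflow_rate

twin = TankTwin(capacity=1000.0)
twin.sync(sensed_level=400.0, sensed_inflow=12.0)
print(twin.minutes_until_full())  # → 50.0
```

Real digital twins add physics models, continuous telemetry, and what-if simulation, but the mirror-then-simulate pattern is the same.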
The methods of applying these digital technologies in industrial digital transfor-
mation vary by industry, including IoT-based predictive maintenance, data-driven
retailing, cloud-based collaboration, blockchain in supply chains, robots in logistics,
AI-based customer support, and more. Some specific examples are listed below:
Manufacturing companies can install IoT sensors on machines for real-time
monitoring and predictive analytics to detect failures and reduce maintenance costs.
Retail businesses can use big data analytics to analyze customer purchase histories
for personalized product recommendations, increasing sales and customer retention.
Global tech companies can move their operations to the cloud, enabling globally
dispersed employees to access documents and collaborate remotely.
Food companies can introduce blockchain for product traceability, enhancing
transparency and quickly identifying the causes of food spoilage during recalls.
E-commerce platforms can employ AI chatbots to efficiently handle customer
inquiries, improving customer satisfaction and reducing support costs.
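The first of these examples, IoT-based predictive maintenance, often reduces to a statistical check on streaming sensor data: flag readings that deviate sharply from the recent trend as failure precursors. A minimal, hypothetical sketch (the window size and threshold are illustrative choices, not a vendor recipe):

```python
# Rolling z-score check on machine-sensor readings, a simple form of the
# predictive analytics used for condition monitoring.
from collections import deque
from statistics import mean, stdev

def detect_anomalies(readings, window=20, z_threshold=3.0):
    """Return indices of readings that deviate strongly from the recent trend."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > z_threshold * sigma:
                anomalies.append(i)  # candidate failure precursor
        recent.append(value)
    return anomalies

# Example: a stable vibration signal with a sharp spike at the end.
signal = [1.0, 1.1, 0.9, 1.0, 1.05] * 5 + [5.0]
print(detect_anomalies(signal, window=10))  # → [25]
```

Production systems replace the rolling statistics with learned models, but the principle of comparing live telemetry against an expected baseline is the same.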
These examples illustrate how digital technology applications can improve oper-
ations, enhance customer experiences, and drive innovation, further highlighting the
potential for increased automation and intelligence in most tasks through proactive
use of AI technology.

6.4 Digital Transformation in Traditional Industries

Introducing digital transformation into traditional industries can lead to opera-
tional improvements and efficiencies. It enables the optimization of manufacturing
processes, enhancing productivity, and the immediate detection of errors in manufac-
turing processes. It also makes predictive maintenance possible and optimizes inven-
tory. Furthermore, it improves understanding and communication with customers
and enhances customer service effectiveness. Such operational improvements and
efficiencies result in cost reductions and create a foundation for new innovations.

6.4.1 Operational Improvement in Traditional Industries

In traditional industries, digital transformation can enhance operational efficiency by
restructuring manufacturing processes. Digital transformation optimizes workflows,
introduces automation to reduce manual labor, increases process speed, and reduces
errors, leading to efficiency improvements. It also allows for the real-time collection
and monitoring of data, enabling swift problem resolution and preventive decision-
making. Moreover, integrating data from various sources provides a comprehensive
perspective and deep insights for decision-making.
In addition, digital transformation facilitates cost reduction through operational
efficiency. Adopting automation, predictive maintenance, and resource optimization
directly leads to cost savings. Utilizing digital technologies to effectively allocate
labor, machinery, and resources also reduces costs. Furthermore, optimizing energy
consumption through data-based control can reduce both environmental impact and
costs.
Digital transformation also enables innovation in product development, busi-
ness models, and customer-centric operations. Using digital technologies allows for
rapid prototyping and testing, shortening the development time for new products
and services. Digital transformation facilitates exploring new revenue streams and
experimenting with new business models, such as subscription services. Moreover, it
enables a customer-centric approach, allowing for the tailored provision of products
and services to meet customer needs.
Thus, digital transformation in traditional industries enables improvements and
efficiencies in various areas, including workflow automation, real-time supply chain
management, energy management, innovative product launches, subscription-based
services, and personalized retail. Some specific examples are listed below:
A manufacturing plant undergoing digital transformation can use data analytics
to optimize energy consumption, effectively manage inventory, reduce costs, and cut
carbon emissions.
A technology company can use digital prototyping and simulation to develop and
launch new products faster than competitors, increasing market share.
A logistics company that adopts digital transformation and introduces IoT sensors
across the supply chain can improve inventory management and enhance customer
service.
A retail business can use customer data analytics to provide personalized
recommendations, increasing sales and customer retention.
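The energy-optimization example above can be made concrete with a toy scheduler that places a flexible batch job into the cheapest contiguous window of hourly electricity prices (the prices and job length here are invented for illustration):

```python
# Data-driven energy cost reduction in miniature: given day-ahead hourly
# prices, find the cheapest window in which to run a shiftable load.

def cheapest_window(prices, hours_needed):
    """Return (start_hour, total_cost) of the cheapest contiguous run window."""
    best_start, best_cost = 0, float("inf")
    for start in range(len(prices) - hours_needed + 1):
        cost = sum(prices[start:start + hours_needed])
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost

hourly_prices = [30, 28, 25, 22, 21, 24, 35, 40, 38, 33, 29, 27]  # EUR/MWh
print(cheapest_window(hourly_prices, hours_needed=3))  # → (3, 67)
```

Real plants optimize many loads under process constraints, but even this one-line idea, shifting consumption out of peak-price hours, is where data-based energy control starts.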

6.4.2 The Case of Energy Company ENGIE

See Fig. 6.1.


Fig. 6.1 ENGIE logo

Real-world examples of traditional industries implementing digital transformation
to improve industry operations can be found everywhere. Among them, the case of
the French energy company ENGIE is a notable example of successful digital trans-
formation.2 ENGIE was established in 2008 through the merger of Gaz de France
(founded in 1946) and Suez S. A. (founded in 1858), and operates in over 70 coun-
tries. In 2016, ENGIE’s CEO, Isabelle Kocher, recognized two inseparable forces
shaking the core of ENGIE’s industry, namely, digital transformation and energy
transition. She identified that decarbonization, decentralization, and digitalization
were revolutionizing the energy industry, and a fundamental digital transformation
was necessary for survival and prosperity in the new energy world. Kocher estab-
lished a vision for ENGIE’s digital transformation and announced a 1.5 billion Euro
investment over the next three years. She launched ENGIE Digital as the central orga-
nization to spread digital transformation efforts across the company. ENGIE Digital
organized the ‘Digital Factory,’ internally for developing software and innovative IT
tools for company-wide distribution. Kocher then appointed a Chief Digital Officer
to lead the digital transformation and hired digital experts.
The first step in ENGIE’s digital transformation was identifying high-value
applications across the company’s operations and establishing a comprehensive
digital transformation roadmap. The Digital Factory created a comprehensive project
roadmap and prioritized tasks. First, for gas assets, it applied predictive analytics and
AI algorithms to identify the main causes of efficiency decline, reduce asset loss,
improve uptime, perform predictive maintenance, and optimize electricity genera-
tion. Second, for customer management, it widely applied various online services,
including service applications that allow customers to manage their energy usage
directly. For individual residents and building managers, it developed applications
that analyze data from smart sensors to precisely identify opportunities for energy
savings. Third, regarding renewable energy, it developed a digital platform for appli-
cations to optimize electricity production from renewable sources, using predic-
tive analytics and AI to predict maintenance conditions, identify underperforming
assets, and provide real-time analysis and maintenance information to field operators.
Fourth, for smart cities, foreseeing the global increase in urban population from 50%
currently to 70% by 2050, it aimed to build sustainable, energy-efficient, connected
cities, planning to develop and deploy numerous applications for efficient district
heating and cooling, traffic control, eco-friendly mobility, waste management, and
security.

2 Reference: Thomas M. Siebel, Digital Transformation: Survive, Thrive in an Era of Mass
Extinction, Rosetta Books, New York, 2019.

Fig. 6.2 John Deere & Company logo

ENGIE strongly pursued digital transformation by investing significant funds,
establishing a top institution for driving digital transformation, collaborating
with business unit leaders, defining requirements, creating roadmaps, measuring
outcomes, and systematically developing and deploying applications. As a result,
starting from 150,000 employees and a revenue of 60.6 billion Euros in 2018, the
workforce increased to 170,000 by 2021, and revenue rose to 93.9 billion Euros by
2022.
ENGIE’s digital transformation is considered exemplary because it is well-
equipped with the necessary elements for digital transformation. First, the CEO
Isabelle Kocher deeply understood the need for digital transformation. Second,
ENGIE established the ‘Digital Factory’ as the top decision-making and implemen-
tation body for digital transformation. Third, ENGIE invested a substantial amount
of 1.5 billion Euros in purchasing various digital devices and developing software/
applications. Fourth, ENGIE secured experts in fields such as big data analysis,
software development, AI utilization, and cybersecurity maintenance for the Digital
Factory. Fifth, ENGIE ensured the Digital Factory closely collaborates with busi-
ness unit leaders to develop digital transformation software/applications and IT tools
tailored to the field requirements, distributing them across the company.

6.4.3 The Case of John Deere & Company

See Fig. 6.2.


Another example of successful digital transformation is John Deere & Company,
a leading American agricultural machinery manufacturing company.3 Founded in
1837, John Deere is known for producing tractors, combines, excavators, balers, and
more, holding a prominent position in the world market for agricultural, construction,
and industrial engines.

3 Reference: ibid.

Initially focused on traditional agricultural machinery such as tractors, plows,
and combines, John Deere has kept pace with technological advancements over
nearly 200 years, swiftly moving toward digital transformation in the era of digital
change. By incorporating precision agriculture technologies into its machinery, John
Deere produced GPS-guided tractors, automatic steering systems, and data collection
tools. It developed agricultural management platforms like “John Deere Operations
Center” and “MyJohnDeere,” enabling the collection and management of data from
agricultural machinery and making data-based decisions possible. IoT and telematics
(telecommunication + informatics) systems were installed to collect real-time data on
field conditions and machinery performance, and agricultural management software
was developed and integrated with platforms to help farmers plan, track, and optimize
their farming activities efficiently. John Deere researched and developed autonomous
and electrically powered machinery, introducing an autonomous tractor at CES 2022
and announcing its commercial sale later that year.
By utilizing data analysis and AI, John Deere enabled decision-making based
on real-time and historical data, and with the help of 5G mobile communications,
it improved connectivity in rural areas and enhanced the effectiveness of digital
solutions. Thus, John Deere pursued digital transformation by building data plat-
forms, integrating IoT, utilizing AI, and applying autonomous and 5G technolo-
gies, contributing to increased agricultural productivity, cost reduction, and the
development of sustainable and environmentally friendly agriculture.
A particularly noteworthy aspect of John Deere’s digital transformation is inven-
tory management. Operating numerous factories worldwide and producing a variety
of agricultural machinery with many components, and facing thousands of possible
combinations of customer-selected options for custom orders, managing optimal
inventory levels in the manufacturing process is complex. This complexity is
compounded by uncertainties such as demand fluctuations, supplier delivery times,
and production line disruptions. Historically, to accommodate these uncertainties,
sufficient inventory levels were maintained to immediately fulfill orders, but this
approach entailed high costs and complex management. As part of its digital trans-
formation, John Deere developed software solutions to support production planning
and inventory management, considering all these factors. Starting with production
lines using over 40,000 parts, it developed AI applications to optimize inventory
levels and algorithms to manage stock history daily based on various parameters. As
a result, John Deere was able to optimize order parameters, quantify material usage
based on production orders, minimize safety stock levels, and consequently reduce
parts inventory by 25–35%.
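John Deere's actual algorithms are proprietary, but inventory software of this kind typically builds on textbook quantities such as safety stock and reorder point, which balance demand and supplier lead-time uncertainty against holding cost. A hedged sketch using the standard formulas and invented numbers:

```python
# Classic safety-stock and reorder-point calculation for one part. This is
# NOT John Deere's algorithm, only the standard formula such systems extend.
import math

def safety_stock(z, avg_demand, sd_demand, avg_lead_time, sd_lead_time):
    """Buffer covering both demand variability and lead-time variability."""
    return z * math.sqrt(avg_lead_time * sd_demand**2
                         + (avg_demand * sd_lead_time)**2)

def reorder_point(avg_demand, avg_lead_time, ss):
    """Reorder when stock falls to expected lead-time demand plus the buffer."""
    return avg_demand * avg_lead_time + ss

# Illustrative part: 40 units/day demand (sd 8), 5-day lead time (sd 1),
# z = 1.65 for roughly a 95% service level.
ss = safety_stock(z=1.65, avg_demand=40, sd_demand=8,
                  avg_lead_time=5, sd_lead_time=1)
print(round(ss), round(reorder_point(40, 5, ss)))  # → 72 272
```

Lowering the measured uncertainties (for example, through better supplier data) directly shrinks the buffer, which is how data-driven systems achieve inventory reductions like those cited above.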

6.5 Digitalization in the Manufacturing Industry

When we talk about digital transformation in various sectors, it goes beyond merely
adopting new digital technologies. It involves a comprehensive change in business
models, operations, organizational structures, and decision-making processes, revi-
talizing companies. In the modern manufacturing industry, this sequence of digital
transformation begins with the adoption of digital technologies in the manufacturing
process. Narrowing down to the manufacturing sector, adopting digital technologies
in manufacturing processes, or digitalization of these processes, is central to digital
transformation. This is what is termed “Industry 4.0”. Initiated by the German govern-
ment in 2011, Industry 4.0 is an industrial policy aimed at transitioning traditional
manufacturing into smart factories equipped with intelligent production systems by
integrating ICT technologies.
The evolution of industrial society can be segmented based on the adoption of
specific technologies in the manufacturing industry, marking different industrial
revolutions. The use of steam engines powered by carbon resources signifies the
“First Industrial Revolution”; the transition to electricity as a power source marks
the “Second Industrial Revolution”; the adoption of electronics for automation indi-
cates the “Third Industrial Revolution”; and the adoption of digital technologies is
characterized as the “Fourth Industrial Revolution”. Thus, Industry 4.0 corresponds
to the Fourth Industrial Revolution. Comparing Industry 4.0 with digital transfor-
mation, while digital transformation seeks innovation across all aspects of busi-
ness, including manufacturing, operations, organization, and customer engagement,
Industry 4.0 specifically focuses on the digitalization of manufacturing processes,
representing a narrower scope of digital transformation.

6.5.1 ‘Industry 4.0’

The core components of Industry 4.0 include cyber-physical systems (CPS), the
Internet of Things (IoT), cloud computing, and cognitive computing.
First, the CPS integrates computer computation and networking with physical
processes, where embedded computers and networks monitor and control physical
processes, with feedback loops allowing physical processes and computer computa-
tions to affect each other and enabling real-time data collection and analysis. The
term ‘cyber’ refers
to computers, software, and networks, while ‘physical’ refers to the actual physical
systems or processes. CPS combines computers and physical components closely,
with sensors collecting data from physical systems for digital processing and anal-
ysis. CPS systems are often used in manufacturing, energy distribution, transportation
systems, etc., for real-time monitoring and control of physical processes, continu-
ally adjusting physical work based on computer’s computational analysis. Applica-
tions include automation, smart manufacturing, smart grids, intelligent transportation
systems, and health monitoring.
Second, the IoT connects machines, devices, sensors, and people to collect and
communicate data, enhancing operational efficiency.
Third, cloud computing stores and processes data on remote servers, increasing
data scalability, flexibility, and accessibility.
Fourth, cognitive computing, designed to mimic human cognitive processes,
solves complex problems without human intervention, interpreting unstructured
data and understanding context. Cognitive computing mimics human thought
processes in complex situations, employing self-learning systems using data mining,
pattern recognition, and natural language processing, mimicking how the human
brain operates. Cognitive systems learn through interaction, improving over time,
adjusting algorithms based on new data and experiences, and naturally interacting
with users through conversations, understanding questions, and providing answers.
Applications include customer service through chatbots, healthcare, finance, and
more.
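The feedback loop at the heart of a CPS, sensing a physical variable, computing a correction, and actuating it back onto the process, can be sketched with a toy proportional controller (the temperature process and gain are purely illustrative assumptions):

```python
# Minimal monitor -> compute -> actuate loop. The function is the 'cyber'
# side; the variable `temp` stands in for the sensed 'physical' side.

def control_step(measured_temp, setpoint, gain=0.5):
    """Compute a heater adjustment from the sensed deviation."""
    error = setpoint - measured_temp
    return gain * error  # actuator command fed back to the process

temp = 15.0          # physical state sensed each cycle
for _ in range(20):  # the feedback loop runs continuously in a real CPS
    command = control_step(temp, setpoint=20.0)
    temp += command  # the physical process responds to the actuation
print(round(temp, 2))  # → 20.0, converged to the setpoint
```

A real CPS wraps this loop in networking, timing guarantees, and safety logic, but the mutual influence of computation and physical process shown here is what distinguishes it from pure simulation.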
Among these four elements, IoT and cloud computing are well-known digital tech-
nologies, but CPS and cognitive computing might be relatively unfamiliar. Among
the digital technologies introduced in Chap. 4, Digital Twin and AI Machine Learning
could be similar to CPS and cognitive computing, respectively.
CPS and digital twins both integrate physical processes with digital models, collect
real-time data on physical systems through sensors to feed back into digital systems,
and are used to improve decision-making, process optimization, and predictive main-
tenance across various industries. However, CPS focuses on the integration and inter-
action between physical processes and computer systems, concentrating on control,
automation, and real-time data processing of physical systems, while digital twins
create a digital replica of a physical system for analysis, monitoring, prediction,
and simulation. CPS is interested in real-time control and interaction with phys-
ical systems or processes, while digital twins focus on simulation, analysis, and
optimization of physical systems.
Cognitive computing and machine learning process information and make
decisions using advanced algorithms. Both are subsets of artificial intelligence,
mimicking human-like intelligence and learning from data to improve performance.
They process decisions based on data, analyze large datasets to identify patterns,
and predict or act accordingly. However, they differ in human interaction. Cognitive
computing is designed to aid human decision-making, mimicking human thought
processes and problem-solving to interact with humans. In contrast, machine learning
focuses on learning from data to make predictions or decisions without being explic-
itly programmed for specific tasks, generally operating automatically in the back-
ground without human interaction. Cognitive computing is interested in comple-
menting human decision-making, while machine learning aims to create algorithms
capable of learning from data and making autonomous decisions.
Industry 4.0 has four design principles that determine whether a manufacturing
industry falls under Industry 4.0. First is interconnectivity, where machines, devices,
sensors, and people must be connected and communicate through IoT. Second is
information transparency, providing raw data about the physical system’s conditions
to create digital replicas like CPS. Third is technical assistance, offering various
information supports to aid informed decision-making and solve urgent problems in
the short term, along with physical support to reduce human physical and mental
fatigue and risks. Fourth is decentralized decisions, where CPS can make decisions
independently and perform tasks as autonomously as possible.
By equipping with the four key components of Industry 4.0 described above and
adhering to the four design principles mentioned, manufacturing processes can be
made more efficient and flexible through smart devices and systems. In addition, using
data analysis and IoT, equipment can be diagnosed and maintained predictively. It also
enables the manufacturing of personalized, customized products as efficiently as mass
production, optimizes supply chains and logistics, revolutionizes the manufacturing
industry, and transforms manufacturing plants into smart factories. This represents
the digital transformation of the modern manufacturing industry through Industry
4.0, following the German model.

6.5.2 The Case of Steel Company POSCO

See Fig. 6.3.


POSCO’s “Smart Factory” represents a manufacturing plant that has successfully
implemented the principles and technologies of the Fourth Industrial Revolution and
Industry 4.0, earning recognition as a “Lighthouse Factory” by the World Economic
Forum in 2019. This designation acknowledges POSCO’s leadership in adopting
digital transformation and setting an example for other manufacturers to follow.
Digitalization of factories goes beyond automation to transform all processes with
digital technologies, shifting from automated processes based on preset values to
digital processes that adapt in real-time based on self-measured data.
POSCO’s Smart Factory, embodying Industry 4.0 in manufacturing facilities, has
applied digital technologies to optimize efficiency, flexibility, and productivity in
manufacturing processes. The designation as a Lighthouse Factory highlights several
key features of the Smart Factory:
1. Connectivity and Monitoring: Utilizing IoT, sensors, and data analytics, the Smart
Factory connects and monitors various equipment and processes. This connec-
tivity enables real-time data collection and analysis for decision-making and
predictive maintenance.
2. Automation and Robotics: The Smart Factory automates hazardous, repetitive,
and labor-intensive tasks, using robots to reduce human intervention in dangerous
or tedious tasks, thereby increasing accuracy and consistency in production.

Fig. 6.3 POSCO logo

3. Data Analytics and AI: Employing data analysis techniques and AI, the Smart
Factory optimizes processes, reduces waste, and improves product quality. AI is
used for predictive maintenance, quality control, and process optimization.
4. Smart Energy Management: Implementing smart energy management systems,
the Smart Factory enhances energy efficiency and ensures sustainable operation,
reducing energy consumption, minimizing environmental impact, and cutting
operational costs.
5. Customization and Flexibility: By adopting 3D printing and digital twin tech-
nologies, the Smart Factory enables customized manufacturing and flexible
production in response to customer orders.
POSCO’s platform for the Smart Factory is “PosFrame,” developed by POSCO
ICT (renamed to POSCO DX in 2023), a subsidiary responsible for smart factories,
smart logistics, and industrial robots. PosFrame is a platform designed for digitalizing
steel manufacturing processes, featuring a simple structure that encompasses both
lower and upper layers’ functionalities, real-time control capabilities, and the ability
to directly manage equipment operation. It analyzes and controls data collected from
digital twins of various equipment through IoT, utilizing big data and AI. All compo-
nents are connected through a central data network, and all data can be accessed from
a virtual database regardless of the physical location. PosFrame provides a common
software layer offering APIs, UI/UX, and AR/VR functionalities, on top of which
basic applications, application sets, and individual applications are deployed. Basic
apps are those provided by the platform itself, app sets bundle large functions related
to factory operations, and individual apps are separate, smaller stand-alone
applications. Among the app sets related to factory operations, the most important is the
manufacturing execution system app, which is responsible for production, execu-
tion, and management. This app controls equipment according to the production
plan to ensure that the desired product is produced on time, optimizing production
and execution. A distinctive characteristic of the PosFrame platform is that it was developed
targeting the steel process, which means it is suitable for continuous processes such
as steel and chemical manufacturing but not appropriate for assembly processes like
electronic product assembly lines.
In general, in order to effectively introduce factory digitalization, as in the case
of POSCO’s Smart Factory, systematic preliminary preparation and procedural
execution are required.
First, preparation begins with designing the overall structure of the digital factory
and the related IT architecture. This includes deciding which digital factory platform
to use, and considering necessary sensors, IoT, data backbones, virtual databases, the
level of digital twins, platform architecture, connection to the cloud, UI/UX, security
methods, etc.
Second, prioritize processes where the digital factory can have a visible effect,
then gradually expand the successful experience to other processes as a step-by-step
strategy.
Third, install sensors and IoT to collect data and establish a communication system
to transmit the data to a virtual database through the data backbone.
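The sensor-to-database flow in this step can be sketched in miniature. The following toy pipeline (all names, values, and structures are invented for illustration) stands in for sensors publishing over a data backbone into a virtual database; a real plant would use an industrial IoT protocol such as MQTT or OPC UA and a distributed data store:

```python
import json
import queue
import time

# Toy pipeline standing in for the sensor -> data backbone -> virtual
# database flow. All names and values are invented; a real plant would use
# an industrial IoT protocol (e.g., MQTT or OPC UA) and a distributed store.

backbone = queue.Queue()   # stands in for the plant data backbone
virtual_db = {}            # stands in for the virtual database

def read_sensor(sensor_id, value):
    """Package one sensor sample as a timestamped message."""
    return {"sensor": sensor_id, "value": value, "ts": time.time()}

def publish(message):
    """Serialize a message and push it onto the data backbone."""
    backbone.put(json.dumps(message))

def ingest():
    """Drain the backbone into the virtual database, keyed by sensor id."""
    while not backbone.empty():
        record = json.loads(backbone.get())
        virtual_db.setdefault(record["sensor"], []).append(record)

publish(read_sensor("furnace-temp-01", 1532.4))
publish(read_sensor("furnace-temp-01", 1534.1))
ingest()
print(len(virtual_db["furnace-temp-01"]))  # → 2
```

Swapping the in-process queue for a message broker and the dictionary for a time-series database preserves the same shape while scaling toward the architecture described above.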
Fourth, establish a control system to analyze collected data for new insights and
accordingly improve processes, planning how to distribute big data analysis and AI
functions between edge computing and cloud computing.
Fifth, establish criteria for comparing and analyzing the costs invested in the
digital factory and the benefits gained from it, and evaluate the actual application
results based on these criteria.
In order to successfully carry out a digital factory project, such preliminary prepa-
rations and execution strategies are necessary. In addition, it is advisable to start with
a big picture but begin implementation with small, definite steps. Moreover, it is
more practical to progress step by step, building one component at a time, rather
than tackling the entire digital factory at once.

6.5.3 The Case of Hyundai Motor Company

The automotive industry has been at the forefront of embracing digital transforma-
tion, driven by innovations in electric vehicles (EVs), a software-centric approach,
and autonomous driving technologies. Companies like Tesla led the way in electric
vehicles, while global automakers including BMW, General Motors, Volkswagen,
Ford, and Hyundai Motor Company have joined the digital shift, propelling rapid
development within the automotive sector.
The backdrop to digital transformation in the automotive industry includes envi-
ronmental concerns. With climate change posing a global crisis, the alarm was raised
over fossil fuel usage, prompting a shift from internal combustion engine vehicles
to electric vehicles.4 The rise of Tesla, the Diesel-gate scandal in Europe,5 and
policy support in China boosted the ascent of electric vehicles from 2017, marking
a rapid uptrend. European and Chinese environmental regulations have continued to
strengthen the EV market, though the limits of battery technology and raw materials
pose questions on the growth’s extent. Hybrid cars, combining electric and internal
combustion engines, have become a widespread alternative, and hydrogen fuel cell
vehicles have emerged as a new option.6

4 Historically, electric vehicles predated internal combustion engine cars, with Detroit Electric’s Edison electric car in 1913 reportedly capable of traveling up to 100 km at a top speed of 40 km/h on a single charge. However, due to long charging times and heavy batteries, mass practical use was not achieved, and internal combustion engine cars gained momentum with the introduction of Ford’s assembly line system, the discovery of Texas oil, and the drop in gasoline prices.
5 Diesel-gate is a scandal that emerged when it was revealed that European car companies, including Volkswagen, had manipulated emissions data. This incident brought diesel engines, which use diesel fuel, and further, internal combustion engines themselves into focus as a factor in environmental issues. Ultimately, it became the starting point for the movement to phase out internal combustion engines in the 2020s.
6 Hydrogen fuel cell vehicles, operating on the principle of generating electricity through the chemical reaction of hydrogen and oxygen in fuel cells, represent a promising alternative; they drive electric motors and are refueled at hydrogen stations.

Fig. 6.4 Hyundai Motor Company logo

Digital transformation in the automotive industry encompasses electric vehicle innovation, connected cars, autonomous vehicles (AV), software-defined vehicles (SDV), and digitalization of manufacturing. Connected cars utilize vehicle
sensors, GPS, and telecommunications for telematics, infotainment (information +
entertainment), vehicle condition management, and safety. Autonomous vehicles
employ sensors, cameras, radar, LiDAR, and GPS for environmental perception and
autonomous operation. SDVs are characterized by their core functions and features
being controlled and updated through software. Digitalization of manufacturing
represents the implementation of Industry 4.0.
Hyundai Motor Company, similar to BMW and other automakers, has embraced
all the aforementioned elements in its digital transformation. It has expanded its
lineup of electric, hybrid, and eco-friendly hydrogen vehicles, developed advanced
connected vehicle technologies including remote vehicle management, in-car info-
tainment, and telematics. Hyundai has actively invested in autonomous vehicle tech-
nology and introduced smart manufacturing practices in its production processes,
applying robotics, automation, and data analytics. In addition, Hyundai has been
developing comprehensive hydrogen solutions for transitioning to a hydrogen society
and pioneering SDV for mobility technology innovation. In the following, we
focus on Hyundai Motor Company’s digitalization in automotive manufacturing,
showcasing how it has integrated various digital transformation elements into its
manufacturing processes.
See Fig. 6.4.
Hyundai Motor Company has been a leader in adopting smart manufacturing,
which signifies the digitalization of manufacturing processes. Smart manufacturing
employs digital technologies such as big data analytics, IoT, AI, and robotics to
enhance efficiency, reduce waste, and improve product quality. Specifically, advanced
robotic systems are introduced in welding, painting, and assembly operations to
increase precision and efficiency. Sensors and data analytics are utilized to predict
equipment failures, thereby reducing downtime and maintenance costs. Digital tech-
nologies enable the maintenance of efficient production processes while offering
more customized options to customers.
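The predictive maintenance mentioned above, using sensor data to flag equipment before it fails, can be illustrated with a minimal statistical sketch (not Hyundai's actual system; the readings and threshold are invented):

```python
from statistics import mean, stdev

# Minimal predictive-maintenance sketch: flag equipment whose latest
# vibration reading drifts more than k standard deviations away from its
# recent history. Readings are invented for illustration.

def needs_maintenance(history, latest, k=3.0):
    """Return True if `latest` deviates more than k sigma from history."""
    mu, sigma = mean(history), stdev(history)
    return abs(latest - mu) > k * sigma

history = [0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.51, 0.49]  # vibration, mm/s
print(needs_maintenance(history, 0.50))  # normal reading → False
print(needs_maintenance(history, 0.95))  # abnormal spike → True
```

Production systems replace this threshold rule with learned models over many signals, but the principle, comparing live telemetry against expected behavior, is the same.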
Hyundai Motor Company has embraced Industry 4.0, pursuing the digital-
ization of manufacturing processes and utilizing Industry 4.0’s core components
such as CPS, IoT, cloud computing, and cognitive computing. A digital thread
running through the connected supply chain integrates all aspects of the produc-
tion process, enabling a comprehensive overview and coordination. The establish-
ment of smart factories equipped with smart devices for data collection and analysis
improves decision-making and operational efficiency. Moreover, digital twins simu-
late, predict, and optimize the performance of actual manufacturing equipment and
processes.
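As a rough illustration of the digital twin concept, the toy model below keeps a software replica in sync with telemetry from a physical machine and answers what-if questions; the class, parameters, and wear figures are all hypothetical:

```python
# A toy "digital twin": a software model synchronized with readings from a
# physical press and queried for what-if predictions. Purely illustrative;
# the class, rates, and figures are hypothetical.

class PressTwin:
    def __init__(self):
        self.cycles = 0
        self.wear = 0.0          # modeled tool wear, arbitrary units

    def sync(self, cycles_done, measured_wear_rate):
        """Update the twin from the physical machine's latest telemetry."""
        self.cycles += cycles_done
        self.wear += cycles_done * measured_wear_rate

    def predict_wear(self, future_cycles, assumed_rate=0.002):
        """What-if: projected wear after `future_cycles` more cycles."""
        return self.wear + future_cycles * assumed_rate

twin = PressTwin()
twin.sync(cycles_done=500, measured_wear_rate=0.002)   # wear is now 1.0
print(twin.predict_wear(1500))                          # → 4.0
```

Real twins model physics and geometry in far more detail, but the loop is the same: ingest telemetry, keep the model current, and simulate ahead of the physical process.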
By adopting smart manufacturing and Industry 4.0 principles, Hyundai Motor
Company focuses on improving efficiency, minimizing environmental impact, and
concentrating on vehicle customization and quality enhancement. The company
invests in the necessary development and employee training for manufacturing digi-
talization. The ultimate goal is to create an agile and sustainable manufacturing
process that can quickly respond to market changes and meet market demands.
Hyundai is developing a future-oriented and intelligent factory called “E-
FOREST,” a manufacturing platform that incorporates digital technologies such as
AI, robotics, ICT, IoT, and big data into innovative automated methods and human-
friendly smart technologies. E-FOREST is based on four core values, namely Auto-
Flex, Intelligence, Humanity, and Green, aiming for flexible and advanced automa-
tion in assembly, logistics, and inspection, intelligent control systems based on AI,
a human-centered work environment, and an eco-friendly factory.
E-FOREST aims to implement a smart production system capable of real-time
prediction and autonomous production by connecting and analyzing all data gener-
ated in the production plant. The smart factory integrates all plant data through an IoT
platform, provides real-time data monitoring and analysis through the “Factory-BI”
system, and manages previously unmanageable areas to maximize overall produc-
tion efficiency. By deploying robots in hard-to-reach work environments, it improves
worker conditions and enhances safety and efficiency in production plants. Based on
cloud technology infrastructure, it aims for a software-driven factory (SDF) where
production equipment control, all data and IT services, and the entire plant system
are organically connected and integrated.
E-FOREST embodies a flexible production system, high-level automation,
human–robot collaboration, custom manufacturing, intelligent factory, and quality-
completed factory. Such innovation in manufacturing processes can reduce the time
and cost of new car development, allowing for a focus on creating better vehicles,
ultimately providing consumers with high-quality products at reasonable prices.
By using big data and AI technologies to predict production scales and manufac-
ture accordingly, it allows for flexible responses to unforeseen situations, offering
products that match customer preferences and enabling customer-centric custom
production.
Hyundai Motor Company has realized the blueprint of the E-FOREST smart
factory with the completion of the “Hyundai Motor Group Innovation Center Singa-
pore (HMGICS)” in 2023. HMGICS features a cell-based flexible production system
with digital technologies, efficient production operation based on digital twin tech-
nology that synchronizes reality and virtuality, data-driven intelligent operation
systems, and a human-centered manufacturing process harmonizing humans and robots. HMGICS will serve as a testbed for developing and validating intelligent
manufacturing platforms that implement Industry 4.0.

6.6 Digital Transformation Strategy

The successful cases of companies like ENGIE, John Deere, POSCO, and Hyundai
provide valuable insights into effective approaches for industries navigating digital
transformation. Drawing lessons from their experiences and success stories can
significantly enhance the chances of success in similar initiatives. The following
presents a digital transformation strategy inspired by these insights.
First, set clear objectives and present a vision. Understand the specific challenges
and opportunities the company faces. Examine how digital transformation can bring
about changes in various aspects of business operations, such as improving customer
experience, simplifying operations, and innovating products. Predict the duration
and investment required for digital transformation and when tangible results can be
expected. Anticipate potential obstacles in the process of digital transformation and
review solutions in advance. Based on this, set goals for digital transformation and
present a vision of what the company aims to achieve. Convincing the CEO with
these goals and vision becomes a priority, considering the critical importance of the
CEO’s firm recognition and support for successful digital transformation.
Second, appoint a dedicated leader for digital transformation. The CEO should
appoint a responsible leader to oversee digital transformation, grant authority to
form and operate a dedicated team, secure the budget for digital transformation,
and ensure access and cooperation from all departments within the company. In
addition, the CEO should encourage the management team to support digital trans-
formation, accept requests from the digital transformation leader, and adapt company
management accordingly. The CEO should also lead the change in organizational
culture to embrace change, innovation, and continuous learning that aligns with
digital transformation.
Third, form a dedicated organization for digital transformation. Once the digital
transformation plan is approved and budgeted by the CEO, the digital transforma-
tion leader should secure specialists needed for digital transformation and form a
dedicated team. Securing talented specialists in IT, big data analysis, software devel-
opment, AI application, and cybersecurity maintenance is critically important for
successful digital transformation. This can be time-consuming and costly. Simulta-
neously, necessary digital devices should be purchased and installed. The team should
plan which technologies to adopt and which to develop, considering the complexity
and development duration of each technology.
Fourth, develop a specific plan for digital transformation. The digital transfor-
mation leader should organize a planning team to devise a comprehensive plan for
digital transformation, including all relevant tasks, and create a detailed implemen-
tation plan from start to finish. This involves designing the supporting IT structure,
deciding on platforms, and planning for the acquisition or development of necessary digital devices and securing
specialists should also be included. Include milestones for visible success in the plan
to maintain momentum in digital transformation, and to gain external recognition
and boost internal motivation.
Fifth, ensure close collaboration between the digital transformation team and the
production site. Often, the core of digital transformation involves digitalizing the
factory, necessitating close and constant cooperation with the site. Installing sensors
and IoT, collecting data, and applying data analysis results to improve processes
are tasks performed on-site. Thus, adapting digital devices like sensors and IoT to
the site conditions, developing software/applications according to site requirements,
and applying data analysis results on-site are essential steps. Establishing a close and
friendly collaboration with the site management team becomes a key virtue for the
digital transformation leader.
Sixth, build agile and scalable IT infrastructure. The success of digital transfor-
mation is significantly influenced by the IT infrastructure. It is necessary to establish
a flexible IT infrastructure that can adapt quickly to changes in the environment and
requirements and can be scaled as needed. In addition, fostering a DevOps envi-
ronment and culture that enhances communication, collaboration, and integration
between software development and IT operations teams is essential.7 IT infrastruc-
ture should be made agile, which enables flexible IT systems that can quickly adapt
to new technologies, business models, and user demands. To build such an infras-
tructure, it is crucial to modularize the infrastructure to allow for part replacements
or updates without affecting the whole system, automate repetitive tasks to reduce
manual work, integrate with the cloud for flexible operations, and harmonize devel-
opment and operations with a DevOps approach. For scalable IT infrastructure, it is
important to design with scalability in mind from the start, integrate with the cloud to
flexibly use storage and computing capabilities, distribute workloads across multiple
servers, and build scalable data management and networks.
Seventh, manage and analyze data for data-based decision-making. Data is an
indispensable raw material for realistic decision-making in business operations. It is
essential to use relevant, high-quality data, manage data to collect, store, organize, and
maintain it usefully for business operations, and analyze data to extract actionable
insights, thereby supporting data-centric decisions. In terms of data management,
collect data from various sources like customer interactions, transactions, and social
media, store it securely in the cloud, and ensure data’s accuracy, consistency, and
reliability. Also, combine data from various sources to provide an integrated view,
comply with laws and regulations, and maintain privacy and security legally. From an
analysis perspective, understand business trends, customer behavior, and operational
efficiency, analyze historical data to understand past events, and equip necessary data
analysis tools. Use data mining to understand the causes of events or trends, apply

7 DevOps is a combination of development and operations, referring to a development environment or culture that emphasizes smooth communication, collaboration, integration, as well as visibility and transparency between software developers and IT professionals.
statistical models and machine learning techniques to predict future outcomes based
on past data, and facilitate real-time analysis of large data volumes for timely insights.
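The predictive step, applying statistical models to past data, can be sketched with a simple least-squares trend fit; in practice one would use dedicated tooling such as statsmodels or scikit-learn, and the sales figures below are invented:

```python
# Toy illustration of predicting future outcomes from past data: fit a
# least-squares line to monthly sales and extrapolate one month ahead.
# The figures are invented for illustration only.

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    slope = num / den
    return slope, my - slope * mx

months = [1, 2, 3, 4, 5, 6]
sales = [100, 104, 109, 113, 118, 122]    # hypothetical units sold
slope, intercept = fit_line(months, sales)
forecast = slope * 7 + intercept          # predict month 7
print(round(forecast, 1))  # → 126.6
```

The same pattern, historical data in, fitted model out, forecast ahead, underlies the demand prediction and trend analysis described in this step, just with richer models and far larger datasets.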
Eighth, repeat performance evaluation and feedback with a measure and repeat
approach. Build systems, processes, products, and strategies targeted for digital
transformation to fundamentally learn from data and experience, capturing changes
quickly and reflecting them effectively. This involves capturing changes by measuring
data and immediately providing feedback to manage risks and continuously improve,
thereby effectively responding and competing in the rapidly changing business envi-
ronment. Specifically, to evaluate the performance of targeted systems, processes,
products, and strategies for digital transformation, first set specific, measurable key
performance indicators (KPIs) aligned with business objectives. Second, systemati-
cally collect quantitative and qualitative data from various sources such as sales data,
customer feedback, and website analytics. Third, analyze and interpret data using
various statistical tools and methods, and compare performance with competitors
to gain insights for decision-making and feedback for incremental changes. Fourth,
repeat this process continuously with an agile methodology to swiftly respond to
market changes.8
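The measure-and-repeat loop above can be reduced to a toy sketch: compute the gap between each KPI and its target every cycle, then feed the result back into the next plan. The KPI values here are invented for illustration:

```python
# Toy sketch of the measure-and-repeat loop: evaluate a KPI against its
# target each cycle and feed the gap back into the next plan. The KPI
# values are invented for illustration.

def kpi_gap(actual, target):
    """Relative shortfall against target (negative means target exceeded)."""
    return (target - actual) / target

cycles = [(92, 100), (97, 100), (103, 100)]   # (actual, target) per cycle
for i, (actual, target) in enumerate(cycles, 1):
    gap = kpi_gap(actual, target)
    action = "adjust plan" if gap > 0 else "hold course"
    print(f"cycle {i}: gap={gap:+.2f} -> {action}")
```

Real KPI dashboards track many indicators from many sources, but each follows this shape: measure, compare against target, decide, and repeat.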
Ninth, maintain a customer-centric approach. The success or failure of a business is
determined by sales to customers; thus, the success of digital transformation depends
on how customers accept it. A customer-centric approach is crucial throughout the
business process, requiring a shift in mindset and changes in organizational processes
and strategies. This approach goes beyond providing good customer service to prior-
itizing customer value and satisfaction in all aspects of the business. To practice
this, thoroughly research to understand customers’ needs, issues, and expectations
by collecting customer feedback, communicating through social media, customer
service channels, and community forums. Gain insights into customer behavior and
preferences through data analysis, improve customer satisfaction by interacting with
customers, and design products and processes with customers in mind, customizing
products and services to individual customer needs. In customer management, use
customer data to gain insights, reflecting this in marketing strategies, product devel-
opment, and customer service improvements. Focus on building long-term relation-
ships with customers, continuously improving products and services, rather than just
focusing on transactions.

8 ‘Agile’ signifies being quick and adaptable. In ‘agile methodology,’ it refers to swiftly adjusting
to changes and quickly applying these adjustments to business processes. Initially used in software
development, agile methodology involves developing software in iterative cycles, continuously
incorporating feedback and evolving requirements. This approach allows for dynamic development
and improvement. Applying agile methodology beyond software to business operations involves
shifting from traditional hierarchical structures to collaborative, horizontal frameworks. This empha-
sizes rapid response to customer needs and integrating insights into business strategies, fostering
an environment of innovation, adaptability, and continuous improvement.
6.7 AI Transformation of Industry

The implementation of AI transformation across industries promises to bring profound changes, extending well beyond productivity enhancements and cost reductions. AI enables companies to develop innovative products and services, offer
personalized customer experiences, and adopt sustainable operational practices that
can lead to significant advancements in the global marketplace. The benefits of AI
in industry are wide-ranging and include the following:
1. Process Automation and Operational Optimization: AI automates not only repet-
itive tasks but also complex decision-making processes, reducing the need for
human intervention. This leads to time savings, cost reductions, and fewer errors
in both simple and highly specialized tasks.
2. Supply Chain and Inventory Management Optimization: By analyzing complex
datasets, AI can accurately forecast demand and optimize inventory levels,
preventing overstock and shortages, ensuring smoother supply chain operations,
and reducing operational costs.
3. Customized Production and Product Innovation: AI allows for the devel-
opment of customized products tailored to individual customer preferences.
By analyzing consumer behavior patterns, companies can anticipate demand
and deliver bespoke solutions that enhance customer satisfaction and market
competitiveness.
4. Real-time Decision-making and Predictive Maintenance: AI provides real-time
data analysis, allowing companies to make informed decisions swiftly. In addi-
tion, AI predicts equipment failures through predictive maintenance, reducing
downtime and improving operational efficiency by enabling proactive repairs.
5. Quality Control and Energy Usage Optimization: AI systems can identify defects
in manufacturing processes early and optimize energy consumption, contributing
to more efficient and sustainable production methods.
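As one concrete example behind point 2, the inventory side of demand forecasting is often anchored by a classic reorder-point formula that combines expected demand over the supplier lead time with a safety stock term; the figures below are invented:

```python
import math

# Illustrative reorder-point calculation, one classic building block behind
# demand forecasting and inventory optimization. All figures are invented.

def reorder_point(daily_demand, lead_time_days, daily_std, z=1.65):
    """Expected demand over the lead time plus z-scaled safety stock
    (z = 1.65 corresponds to roughly a 95% service level)."""
    safety_stock = z * daily_std * math.sqrt(lead_time_days)
    return daily_demand * lead_time_days + safety_stock

# Reorder when stock falls to about 390 units: 360 expected demand over the
# 9-day lead time plus roughly 30 units of safety stock.
print(round(reorder_point(40, 9, 6), 1))  # → 389.7
```

AI-based systems refine the inputs to formulas like this, replacing fixed averages with learned demand forecasts, which is what enables the overstock and shortage prevention described above.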
While many of these advantages are also part of the digital transformation, AI
brings these processes to new heights by increasing intelligence and autonomy at
every step, reducing human involvement as necessary. As a consequence, AI trans-
formation represents the next phase in digital transformation, emphasizing the role
of AI technology to a greater extent.
AI’s ability to learn from large datasets allows industries to optimize resource
usage, reduce waste, and complete tasks more quickly and accurately. As a result,
work speeds up, outcomes become more precise, and quality improves. AI’s capacity
for complex data analysis uncovers patterns and insights that traditional digital tech-
nologies or human analysts might miss, leading to better strategic decisions and an
overall increase in efficiency and effectiveness.
The transformative potential of AI extends far beyond the realms of manu-
facturing, offering significant enhancements in efficiency and effectiveness across
multiple industry sectors. Below are some industry-specific examples:
1. Healthcare: AI is revolutionizing healthcare by offering precise diagnostics, personalized treatment plans, and accelerating drug discovery. AI algorithms can
analyze medical data to detect disease patterns that may elude human experts,
enabling earlier detection and better outcomes for patients. Customized treat-
ments based on genetic data are also becoming possible, improving the efficacy
of treatments.
2. Finance: In the financial sector, AI enhances security by monitoring transactions
for signs of fraud. It also improves risk assessment models, providing more
accurate loan and investment predictions. AI algorithms can analyze real-time
market data, enabling automated trading decisions that capitalize on immediate
opportunities.
3. Retail: AI transforms retail by delivering personalized shopping experiences and
optimizing inventory management. Machine learning algorithms track shopping
patterns to recommend products tailored to individual preferences, enhancing
customer engagement and loyalty. In addition, AI can improve supply chain
management, ensuring better demand forecasting.
4. Agriculture: AI optimizes agricultural operations through precision farming,
which uses data from drones and satellites to monitor crop health and soil condi-
tions. AI-guided farming practices lead to higher yields and more sustainable
land use, minimizing environmental impacts.
5. Energy: AI enhances the energy sector by optimizing production, distribution,
and integration of renewable energy sources. AI systems can forecast energy
demand in real-time, supporting the efficient use of energy and contributing to a
more sustainable energy grid.
6. Logistics: In logistics, AI optimizes delivery routes, saves fuel, and reduces costs.
AI-driven systems analyze traffic patterns and propose the most efficient paths,
improving customer satisfaction while reducing environmental impacts through
more efficient deliveries.
7. Autonomous Vehicles: AI is essential for autonomous vehicles, enabling them to
navigate complex environments and make real-time decisions. Autonomous vehi-
cles use AI to interpret sensor data, navigate through traffic, and learn from new
experiences, paving the way for safer and more efficient transportation systems.
As AI continues to evolve, its potential to reshape industries and enhance human
life grows. However, its rapid integration into key sectors also raises significant
concerns about ethics, privacy, and accountability. Balancing the benefits of AI with
its potential risks will be critical in ensuring that the AI transformation leads to a
prosperous and sustainable future for all industries.
Chapter 7
Digital and AI Transformation in Society

In the past, the Industrial Revolution shifted societies from agrarian to industrial
structures, dramatically altering the fabric of daily life. The focal point of life moved
from rural communities centered at farmlands to cities built around factories, with
lifestyles evolving to incorporate products mass-produced in these new industrial
centers. Today, as industrial society transitions into a digital society, cities remain
the primary living spaces, but an increasing number of people are now working
or participating in activities within cyberspaces like the metaverse. Additionally,
digital technologies are permeating various industries, leading to profound changes
in lifestyles. Whereas the shift from agrarian to industrial societies increased the
need for travel to work and social engagements, the transition to a digital society has
reduced physical travel due to innovations such as remote work, online shopping,
and social media.
As this transition from an industrial to a digital society unfolds, the impact of
digital technologies on human life is becoming increasingly apparent. Digital tech-
nology is transforming everything from communication methods, access to infor-
mation, and work and learning practices, to lifestyles. This rapid pace of change,
which accelerated during the COVID-19 pandemic, has reshaped not only individual
lives but also business operations and government administration. Digital transfor-
mation is now redefining social behavior, with both positive and negative effects.
On the positive side, it improves access to information and services, enhances social
connectivity, and drives industrial innovation. However, it also presents challenges
such as the digital divide, digital illiteracy, job displacement, misinformation, privacy
concerns, and ethical dilemmas. As such, it has become crucial to observe the impact
of digital transformation on education, politics, society, and culture, while addressing
the complex problems that arise alongside these changes.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_7

7.1 Society in the Digital Age

Looking around society today, it is clear that lifestyles have changed significantly
compared to one or two decades ago. Digital devices have become central to both
work and daily life, with smartphones being indispensable. People now attend meet-
ings via video conferencing and work remotely when commuting is challenging.
Instead of using traditional cookbooks, they turn to online videos like YouTube for
recipes. Education has also adapted, offering remote access to lectures and seminars
from abroad. When faced with questions, people turn to internet searches or AI like
ChatGPT instead of asking someone directly. For navigation, they rely on digital maps
and GPS to drive to unfamiliar destinations. Movie watching has shifted from sched-
uled theaters or TV programming to on-demand platforms like Netflix. Commu-
nication with friends is done through messenger services and social media, while
video calls with international contacts are made using free apps. When addressing
social issues, people express opinions on social media and form online communi-
ties for collective action. Booking transportation and shopping are done online, often
through apps on smartphones. People monitor their health through wearable devices,
with data sent directly to hospitals for remote consultations. These lifestyle changes
show how deeply our society is immersed in digital transformation.
Digital technology has revolutionized both work and lifestyle, offering significant
conveniences. The backbone of this transformation is the advancement of information
and communication technologies (ICT), which enables high-speed global connec-
tions. The shift from voice-centric telephone networks to internet-based systems
that handle video, data, and voice, along with the expansion of optical and wireless
networks, has built an infrastructure that allows for unconstrained communication
and information access across borders, time zones, and formats. This has opened
up new avenues for remote social and business activities, creating cyberspaces that
transcend physical limitations. Beyond simply acquiring information, these spaces
enable the formation of human networks for sharing ideas and communication. The
rise of social network services (SNS) has ushered in an era of hyperconnectivity,
allowing people worldwide to connect and collaborate as though they were in the
same space. Social media has fundamentally changed how people build relation-
ships, share information, and even organize for political and social causes, offering
a powerful tool for raising grievances and shaping public opinion.
The rise of digital technology has also transformed commerce. Companies no
longer need to set up physical stores or rely solely on TV advertising to reach
customers. They can now list products on mobile platforms like the App Store or
Play Store, instantly reaching a global audience. Consumers can browse, compare,
and purchase products through mobile apps, breaking the constraints of time and
space. This shift has enabled small and medium-sized enterprises (SMEs) to launch
products in the global market without incurring high costs or delays. From the
consumer’s perspective, it is easier to gather comprehensive information, compare
prices, and make informed purchasing decisions. This transformation has changed
consumer behavior and created new consumer trends, where even geographically
distant customers can collaborate in online communities, sharing information and making joint purchases. As a result, a small venture can quickly become a global
company if its product meets consumer demands, while companies that fail to satisfy
customers can fall just as quickly.
However, digital transformation has also brought several challenges. The
increased reliance on digital technology raises concerns about the collection, storage,
and use of personal data. While digital service platforms collect necessary informa-
tion to provide various services, excessive data collection or misuse can cause harm to
users. In addition, the spread of misinformation and fake news through social media
can lead to social unrest and division. The interconnected nature of the digital world
increases the risk of cyberattacks and data breaches, potentially affecting critical
infrastructure and financial systems or exposing sensitive information. Moreover,
individuals who are unable to adapt to new technologies may experience a digital
divide, exacerbating existing socioeconomic disparities. Job displacement is also
a concern, as automation and artificial intelligence may replace traditional jobs in
factories and other sectors.
Addressing these challenges requires systemic solutions rather than individual
efforts. Governments need to enact or strengthen laws to protect personal data, ensure
data security, and safeguard consumer rights. Education should focus on fostering
digital ethics and responsible behavior in the digital age. In addition, technology
to counter misinformation and fake news must be developed, along with penalties
for offenders. Strengthening network security to prevent hacking and cyberattacks
is essential. To ensure equitable access to the benefits of digital technology, skill
development programs should be created to help individuals adapt to the changing
job market and become familiar with digital devices. Ensuring that the benefits of
digital technology are enjoyed evenly across the entire population, while minimizing
or preventing adverse effects, has thus become a challenge that digital society must
address.

7.2 Culture in the Era of Digital Transformation

Digital transformation has reshaped the way people live and interact, giving rise
to new cultural norms. This shift has altered communication methods, access to
and sharing of information, artistic and cultural expression, economic activities,
work processes, education, and political and social engagement. At the same time,
it has introduced new concerns around privacy and ethics. Digital technology has
revolutionized how people communicate, with social media platforms, messaging
services, and video calls enabling connections that transcend geographical bound-
aries. Expressive mediums have expanded to include emojis, emoticons, internet
slang, and memes, creating new forms of communication, especially among younger
generations.1 The internet has made it easy to search for and share information,
fostering access to vast knowledge. Digital platforms have also facilitated the creation
of diverse online communities, where people with shared interests can connect. More-
over, the rise of virtual worlds like the metaverse allows individuals to live dual lives
as themselves and as avatars, bridging the gap between real and virtual environments.
These developments demonstrate how digital technology has enabled the creation of
new cultural frameworks that were unimaginable in the past.
Digital technology has transformed daily life into a highly individualized expe-
rience, with smartphones at the center. Whether on public transportation or else-
where, it is common to see individuals immersed in their smartphones, using them
for tasks such as calling, texting, reading the news, booking tickets, web surfing,
shopping, streaming videos, gaming, and more. A smartphone now integrates the
functions of a phone, television, computer, camera, and more into a single device.
It has become a personal vault, storing photos, calendars, contacts, call logs, chat
histories, emails, and payment details. In essence, the smartphone represents the
convergence of communication and computing, and it has become an indispensable
all-purpose tool for modern humans. Thus, the individualized, smartphone-centered
lifestyle has become the cultural norm of the digital transformation era.
As previously discussed, digital transformation has brought significant changes to
industries, altering corporate ecosystems, work structures, and methods, and creating
new corporate cultures. Education and learning have also been transformed, with
digital literacy now an essential skill. New cultural dimensions have emerged in
how education is delivered, with remote learning and online resources becoming
common. In addition, political and social participation have changed, as digital plat-
forms have enabled the consolidation of opinions and the formation of groups for
activism and civic engagement, giving rise to new form of cultures of political and
social participation.
In the arts, digital technology has introduced new tools and techniques that have
revolutionized artistic creation. Musicians now compose with tools like Musical
Instrument Digital Interface (MIDI), experiment with Virtual Studio Technology
Instrument (VSTi), and use Digital Audio Workstations (DAWs) to record, edit,

1 Emojis are small digital icons used to express emotions or concepts, commonly employed in text
messages and on social media platforms via mobile devices and computers. Unlike emojis, emoti-
cons are composed of keyboard characters and symbols arranged to represent facial expressions or
convey emotions, primarily used in text-based communication. Internet slang refers to abbreviations
and acronyms that originate from online culture, making digital communication more efficient. The
term meme, first introduced by Richard Dawkins in his seminal work The Selfish Gene, originally
referred to an idea, behavior, or style that spreads within a culture, functioning similarly to how
genes transmit biological information. In today’s context, memes (or internet memes) have evolved
into a key element of online culture, particularly among Generation Z. Memes often start as viral
internet content—such as humorous images, videos, or parodies—that capture widespread attention
and are shared extensively across social media. This phenomenon reflects a unique form of digital
culture shaped by the interaction between advanced technology and the communication habits of
Generation Z.
and arrange music.2 These tools allow for streamlined workflows, experimentation,
and collaboration. In visual arts, digital tools enable both “fully digital” art, where
artists create directly on a digital canvas using software, and “semi-digital” art, where
traditional artworks are scanned and digitally enhanced. New genres such as elec-
tronic dance music (EDM), chiptune, generative art, pixel art, VR art, AR art, and
AI-generated art have emerged from the intersection of art and digital technology,
providing new avenues for artistic expression.3
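As an illustration of the kind of standardization MIDI provides, a note event is just a few bytes exchanged between instruments and software. The following Python sketch (an illustrative example, not taken from the text) builds a standard MIDI "note on" message:

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a MIDI 'note on' channel voice message.

    The status byte 0x90 marks 'note on'; its low 4 bits carry the
    channel number (0-15). The two data bytes, note number and
    velocity, are 7-bit values (0-127).
    """
    assert 0 <= channel <= 15, "MIDI channels are 0-15"
    assert 0 <= note <= 127 and 0 <= velocity <= 127, "data bytes are 7-bit"
    return bytes([0x90 | channel, note, velocity])

# Middle C (note number 60) on channel 0 at moderate velocity:
msg = note_on(0, 60, 64)
print(msg.hex())  # -> "903c40"
```

Because every compliant instrument and DAW interprets these same three bytes identically, a keyboard from one vendor can drive a synthesizer or recording software from another, which is precisely the interoperability that enabled the digital music workflows described above.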
Film production has also evolved, thanks to advancements in computer graphics
and 3D technology. Movies like Avatar showcase the potential of digital technology,
using CGI to create characters like the Na’vi and combining real and digital envi-
ronments to craft exotic settings. Motion capture technology adds lifelike movement
to digital characters, while 3D camera technology enhances visual depth, creating
immersive cinematic experiences. As such, digital technology has provided film-
makers with tools to realize rich imaginations and creative challenges in cinematic
form, offering audiences enchanting and compelling movie experiences. Recently,
with smartphones equipped with powerful cameras and video editing apps, film-
making has become popularized—anyone can now shoot, edit, and produce films.
Features like wide-angle, telephoto, and macro lenses, along with adjustable reso-
lution and frame rate, allow users to create personalized films without professional
equipment.
Digital technology has reshaped the landscape of direct-to-consumer platforms,
drastically impacting cultural evolution. Much like how the App Store and Play
Store revolutionized the app industry, these platforms enable content creators to
bypass traditional distribution networks and connect directly with their audience.
OTT platforms exemplify this shift, allowing creators to offer their content without
intermediaries.4 This transformation has had a profound effect, as seen with the

2 MIDI: Musical Instrument Digital Interface, a protocol that standardizes the exchange of digital
signals between electronic musical instruments. VSTi: Virtual Studio Technology Instrument, a
plugin format adopting the standard specification (VST) used for connecting electronic music editing
software, recording systems, synthesizers, etc. DAW: Digital Audio Workstation, a workstation
supporting the playback, recording, and editing of digital audio.
3 Electronic dance music (EDM) is a music genre that centers on synthesized sounds and is charac-
terized by strong beats and electronic production techniques. Chiptune is a style of music made using
vintage video game hardware or emulators, often from the 1970s and 1980s, to produce unique,
nostalgic sounds reminiscent of early video games. Generative music employs algorithms and
coding to create self-generating or evolving compositions, offering dynamic and often unpredictable
musical experiences. In the realm of digital art, various forms have emerged alongside technological
advances. Pixel art uses small, square pixels to create images, evoking a sense of nostalgia for early
video game and computer graphics aesthetics. VR art allows artists to create immersive 3D envi-
ronments through virtual reality headsets, offering interactive and multi-dimensional experiences
that transcend traditional artistic boundaries. AR art layers digital creations onto the physical world,
visible through augmented reality headsets or smartphones, to create interactive and location-based
experiences. AI art leverages artificial intelligence and machine learning algorithms to generate or
refine artworks, fostering a new frontier of collaboration between human artists and AI systems.
4 OTT (short for “Over the Top”) refers to services that deliver content directly to consumers over
the internet, bypassing traditional broadcast, cable, or satellite television platforms that historically
controlled content distribution. The term “set-top box” originates from the early days of television,
global rise of Korean culture, including K-pop, K-dramas, K-cinema, K-games, K-webtoons,
and K-animation, which has been propelled onto the global stage thanks
to these platforms. In this new era, cultural productions, much like industrial goods,
navigate through digital production and distribution channels directly to consumers.
These platforms also streamline processes like marketing, advertising, and trans-
actions, empowering creators to focus on content development. In addition, digital
platforms enable investments from OTT services, fostering an ecosystem ripe for
innovation and high-quality content creation. This model holds great promise for
the performing arts sector, traditionally constrained by physical venue capacities.
By broadcasting live performances or relaying them to cinemas via OTT platforms,
artists can reach a global audience in real-time, dramatically broadening their reach
beyond traditional confines. As such, digital integration expands the horizons for
cultural dissemination, bridging creators and consumers in unprecedented ways.

7.3 Changes of Jobs in the Digital Age

In the digital age, the increasing use of digital devices has greatly improved work
efficiency, resulting in fewer people being needed to handle the same workload.
The widespread adoption of robotic process automation (RPA) has led to significant reductions in production
jobs. Moreover, the development of AI has not only diminished clerical roles but is
also beginning to transform jobs that require higher levels of cognitive and analyt-
ical skills. For example, banking transactions have shifted to fintech platforms, and
services like ticketing and ordering have been replaced by self-service touch screens.
As face-to-face services transition to electronic transactions and operations move to
cyberspace, the demand for traditional jobs continues to decrease. While the current
global job shortage is partly due to economic downturns, a more fundamental cause
is the reduction of jobs caused by advancements in digital technology, which is an
issue commonly affecting both developed and developing countries.
The digital era is expected to bring considerable changes to the job market.
Automation and digitalization are reducing the need for some jobs while simultane-
ously creating new opportunities. Roles in repetitive administrative tasks, data entry,
basic data analysis, manufacturing, retail, and customer support are declining. Mean-
while, demand is growing for jobs such as data scientists, cybersecurity experts, AI

when an external device was placed on top of the TV to receive and decode broadcast signals from
satellite, cable, or other direct broadcast methods. These devices were necessary for converting
signals into viewable content on a television. In contrast, OTT services utilize the internet to stream
content directly to consumers on various devices, such as televisions, smartphones, tablets, and
computers, without the need for traditional broadcasting methods or intermediary hardware like
a set-top box. This direct-to-consumer model allows for a more flexible and extensive content
delivery system, including movies, TV shows, live events, and more, offering greater convenience
for users. Leading examples of OTT platforms include Netflix, YouTube, Hulu, Disney+, AppleTV+,
and Amazon Prime Video, each offering a vast library of on-demand content tailored to a global
audience.
specialists, digital marketing professionals, software developers, robotics experts,
e-commerce specialists, digital health professionals, and online content creators.
Essentially, jobs requiring simple, repetitive tasks are being replaced by automation,
while positions related to developing, operating, or applying digital technologies are
gaining importance.
With the rise of digital transformation, the ability to work with digital technology
has become a key factor in employability. Proficiency in digital tools, platforms, and
concepts is now an essential skill in the evolving job market. As digital technology
continues to advance, it is necessary for individuals to continuously update their
skills, making lifelong learning crucial for staying competitive. Companies must
provide retraining opportunities to help employees adapt to the new demands of
digital transformation. Similarly, the public sector needs to develop and offer re-
education programs for individuals seeking career transitions into new digital roles.
The nature of work at companies has also changed significantly due to digital
transformation. The development of communication technologies has made remote
work possible, a trend that gained rapid momentum during the COVID-19 pandemic.
As remote work continues in many sectors, new digital tools for collaboration and
productivity have emerged. Employees have become accustomed to this new working
style, leading to shifts in work values and behaviors. Many workers are now diver-
sifying their income sources by pursuing freelance or personal projects alongside
remote work. Even post-pandemic, employees often prefer to continue working
remotely, despite efforts by companies to bring them back to the office.
In the digital transformation era, companies need to make concerted efforts
to adopt digital technologies for operational efficiency. Transitioning meetings to
video conferencing and utilizing project management software are crucial steps in
improving collaboration and work continuity. Developing a flexible hybrid model
that balances office work with remote work can help enhance both work-life balance
and productivity. Companies also need to explore futuristic workplace innovations,
such as AR for remote work, AI for decision-making, and human–machine collabora-
tion. For instance, many companies are already experimenting with the metaverse for
meetings, training, and promotional activities, which is a trend that can be expanded
into other areas of work.
As workplaces become increasingly digital, data security becomes more crit-
ical. Protecting sensitive information and ensuring the integrity of data processing
algorithms is vital for safeguarding company data and personal information from
cyberthreats. Companies must implement robust cybersecurity measures and build
a secure digital infrastructure to support the future of work. In addition, estab-
lishing ethical guidelines for digital work practices is essential to ensure fairness
and to prevent bias in a digital work environment. Employees should adhere to these
standards to help create a positive and ethical digital workplace culture.
7.4 Digital Divide, Digital Inclusion

As the digital transformation progresses, people across society are struggling to
adapt to changes in digital technology and services. The elderly, who did not grow
up in a digital environment, face various difficulties in receiving services through
digital devices. In addition, during the COVID-19 pandemic, students from families
without PCs or with limited internet access faced challenges in attending online
classes and completing homework. Thus, lacking the opportunity to learn and become
familiar with digital technology, or not having access to digital devices and internet
connections, can lead to missing out on the benefits of digital technology and facing
disadvantages in using internet searches and various online services.
This digital divide leads to social inequality in the digital age. It refers to the
gap in access to and use of digital technology and the internet, often caused by
differences in socioeconomic status, geographical location, education level, and age.
Considering digital technology forms the foundation of social life, communication,
and interconnection in the digital age, the inequality resulting from the digital divide
can be devastating. The primary causes of the digital divide include lack of access to
digital devices and services and lack of knowledge about digital technology. Resi-
dents of rural areas or those economically unable to afford computers or smart-
phones and high-speed internet connections are unable to utilize online services
for civic engagement, education, remote healthcare, and e-commerce. Geographic
and economic factors, as well as not being part of the digital generation, make it
difficult to participate equally in social activities of the digital age, leading to educa-
tional inequalities due to restricted access to digital knowledge resources and various
information via internet connection.
Addressing the digital divide is a primary task in the era of digital transformation,
as resolving it is crucial for achieving social equality, equal opportunities for infor-
mation access, and equal educational opportunities in the digital age. To mitigate the
digital divide, it is necessary to build high-speed internet infrastructure in all resi-
dential areas, including rural and mountainous regions, and to provide policy support
for low-income groups to access the internet affordably. It is also necessary to establish
support programs that help individuals without digital devices obtain them at reason-
able prices, and to install public computer and internet centers offering free use of
digital devices. It is desirable to develop programs to teach digital technology effec-
tively to the elderly and those with weak digital literacy, and develop applications
and services for education, healthcare, and civic engagement accessible via mobile
internet in remote areas. In addition, collaborative efforts are required from govern-
ments, businesses, and civic organizations to create various programs that develop
comprehensive and sustainable solutions to the digital divide. Schools should also
build infrastructure to ensure all students have equal access to all digital educational
resources, including digital devices and internet connectivity.
Ensuring that all members of society, regardless of socioeconomic status,
geographical location, physical ability, education level, or age, can access digital
technology, devices, and the internet, is termed ‘digital inclusion.’ Digital inclusion
lays the foundation for social equality, equal opportunities for information access, and
equal educational opportunities. Furthermore, it serves as a basis for equitable partic-
ipation in economic activities, social integration, and economic development. Digital
inclusion enables all members of society to access knowledge and technology, with a
particularly close relationship to education. Establishing a digital inclusion environ-
ment ensures that educational opportunities in the digital age are equally available to
students, allowing each student to develop digital literacy.5 Digital literacy, alongside
science literacy, is an essential trait for living in the twenty-first century, an era of
digital transformation and advanced science and technology.6

7.5 Education in the Digital Age

Digital transformation is fundamentally changing how teachers teach and students
learn by integrating digital technology into education. Instant access to a vast amount
of information through digital libraries, online databases, and educational platforms
has broken the conventional educational mold. Education software now allows for
personalized learning tailored to each student’s pace and capability, and multimedia
content like videos, simulations, and gamified learning enables active student partic-
ipation. Remote learning has become accessible for students unable to attend school
in person, and ‘flipped learning,’ where students watch prerecorded lectures at home
and come to class for questions, has become feasible. Digital literacy has emerged as
a critical educational component, necessitating the inclusion of coding, data analysis,
and digital communication skills in the curriculum.
In the digital age, ensuring that all students have equal access to digital devices
and the internet is crucial, as failure to do so can lead to educational inequality and
inadequate development of digital literacy. This, in turn, can hinder effective learning
aligned with the digital age and disadvantage students in their post-graduation social
lives. Digital technology provides opportunities for interactive learning, personalized
learning, and remote learning. It is important to ensure that all schools are equipped

5 Digital literacy encompasses the technical and cognitive abilities required to effectively search,
interpret, create, and communicate information in a digital environment, going beyond merely
knowing how to use digital devices. It implies the capacity to navigate and understand information in
digital platforms, assess the credibility of information, use digital tools and resources critically, and
comprehend issues related to online safety and privacy. This competence is essential for consuming
and producing information in modern society, enabling individuals to actively participate in the
digital world.
6 Science literacy refers to the knowledge and understanding necessary to grasp scientific concepts,
methods, and reasoning. It involves the ability to think critically about scientific information, inter-
pret scientific data and arguments, understand the nature of scientific inquiry, and apply scientific
principles in everyday life. This literacy extends beyond mere familiarity with scientific facts,
encompassing the ability to engage with scientific content, evaluate the reliability of scientific
information, and make informed decisions based on scientific evidence. It includes foundational
knowledge in key disciplines such as physics, chemistry, biology, and earth sciences, as well as the
capacity to utilize numerical and digital tools to interact with scientific data.
with the necessary educational content and tools and that all students have equal
access to them to enhance educational outcomes. Teachers must first be familiar with
digital technology and develop ways to effectively use it in their teaching. However,
protecting student privacy and online safety is essential when incorporating various
digital technologies into education, necessitating strong data security measures.
Utilizing digital technology in education signifies a major transformation that can
face various obstacles and resistance. Socioeconomic and geographical disparities
that prevent equal access to digital devices and internet connectivity for all schools
and students are significant barriers. Even with digital infrastructure, failure to update
it according to technological changes can become another obstacle. Costs associated
with installing digital infrastructure, purchasing various applications and services,
and updating them can also be barriers to adopting digital education methods. The
potential for issues with handling student personal information in the process of
using diverse digital technologies and tools in education necessitates robust privacy
protection measures. In addition, there might be hesitation or resistance from teachers
or parents toward adopting digital technology, requiring strategies to support teachers
in understanding the importance of digital technology and effectively using it in
education.
Ensuring that all students can access digital devices and the internet without
discrimination and enhancing the effectiveness of education through the application
of digital technology are merely the initial steps in the digital age of education.
Utilizing various digital tools in education to enhance students’ digital literacy and
scientific literacy is just the basics. The critical point is that the content of education
needs to change, as education in the digital age must be restructured to prepare for
a future where humans coexist with digital technology.
In the digital era, humans will live alongside various digital technologies and
devices, including AI and intelligent robots. It is necessary to research how humans
can coexist harmoniously with these digital machines and what the human role will
be in such situations. Moreover, it is crucial to closely examine what capabilities
humans need to effectively fulfill these roles and how education should change to
nurture these abilities. Observing the development of AI, it is essential to understand
anew what it means to be human in light of AI and what capabilities humans need to
coexist with it. Furthermore, in preparation for the future when humanoid AI robots
achieve or surpass human abilities, research is needed on maintaining a mutually
beneficial symbiotic relationship and reflecting this in education.
In principle, if machines can outperform humans in certain tasks, it is better to
let machines do those tasks and let humans focus on what they do best. For example,
there is no need for humans to memorize information that can be easily accessed via
internet searches or by asking ChatGPT. While there are movements to exclude
AI from education due to the confusion it may cause, defensive measures alone would
not wisely address the future. Instead, it is better to explore what capabilities humans
need to effectively utilize AI and educate accordingly. However, this general principle
cannot be applied uniformly in all cases. Decisions should be made after examining
each case and identifying its essence. For example, extreme views, such as excluding
ChatGPT over concerns it might write student reports or asserting that students no
longer need to learn writing because ChatGPT can do it, miss the point. The essence
of writing and its educational benefits, such as expressing thoughts, fostering critical
thinking, and creativity, must be considered first. Writing education is necessary even
to evaluate whether a composition by ChatGPT is proper. Considering these points,
writing education remains essential, regardless of ChatGPT’s presence.7

7.6 Politics in the Digital Age

As we navigate through the digital transformation era, society faces numerous chal-
lenges in its political and social spheres. Calls for fairness and justice are overshad-
owed by increasing instances of injustice and unfair practices. Critical voices face
both physical and psychological harassment from zealous supporters of certain indi-
viduals or political parties, leading to widespread discomfort. Despite clear exposure
of deceit and misconduct, there is a troubling persistence of defiance without any
signs of shame or guilt. The convenience of the internet and social media comes at
the cost of enduring harmful comments and the rapid spread of harmful ideologies,
further aggravating social conflict and division. The circulation of misinformation
and fake news not only intensifies these conflicts but also skews public perception
and influences election outcomes, creating a landscape where digital advancements
contribute to complex societal dilemmas.
Such pathological phenomena are not directly caused by digital transformation.
They arise from various factors, among which post-truth and tribalism stand out.
Post-truth refers to the phenomenon in which emotional appeals sway people more
than objective facts, distorting the truth, while tribalism involves acting
according to the identity of one’s group. Although these factors have always
existed, their prominence today is fueled by growing sociopolitical and economic
issues like income disparity, social dissatisfaction, inequality, and instability, and the
powerful dissemination tools provided by digital technologies.
Digital transformation, signified by hyperconnectivity, has enabled activities
beyond the constraints of time and space, such as acquiring information, distributing
it, expressing opinions, and collective action, bringing revolutionary changes to polit-
ical and social activities. It has maximized the openness of information, reducing the
possibility of power concentration through information monopoly and advancing
democratization. Converting government administrative tasks to e-governance has
increased national transparency and efficiency and improved public services, while

7 Writing can indeed be broadly categorized into creative and critical domains. Creative writing,
encompassing essays, poetry, and fiction, fosters imaginative thinking and the expression of personal
or imaginative narratives. On the other hand, critical writing, which includes columns, research
papers, and critiques, is geared toward analytical and evaluative thinking, aiming to deepen under-
standing and articulate well-reasoned arguments. Both domains play crucial roles in enhancing
writing skills: while creative writing allows for the exploration of ideas and emotions in novel
ways, critical writing develops the ability to assess, argue, and articulate complex ideas clearly and
effectively.
digital technology has facilitated electronic voting and national opinion collection.
Social media allows for expressing opinions and participating in group activities,
eliminating geographical barriers in political and social activities. However, misuse
of social media and internet broadcasts in political activities can lead to the spread of
false information and distorted public opinion. The manipulation and distortion of
information for illegal power gain can regress democratization, and personal infor-
mation leakage and malicious comments can violate human rights, while the produc-
tion and dissemination of misinformation and fake news can confuse public opinion.
In addition, various media turning to indirect advertising for profit poses risks by
covertly distorting facts and biasing public opinion.
In the digital transformation era, the integration of digital technologies into poli-
tics and society has both positive and negative aspects. One of the most positive
aspects is the emergence of digital or e-government. Digitizing all documents and
sharing them among government departments for electronic processing and making
information accessible to the public can enhance the efficiency and transparency of
government operations. This digital shift enables online processing of administrative
tasks, license applications, and tax payments, saving time and costs. Digital elections
could potentially increase participation rates through remote voting and improve
accuracy and efficiency by electronically processing votes.8 Moreover, digital plat-
forms can collect real-time public opinions on specific political issues or national
governance, allowing digital petitions and open policy proposals to be reflected in
policymaking. This approach makes political and policy decisions more grounded
in reality, enhances citizen participation and ownership, and advances democracy.
On the negative side, the digital era provides means for the rapid spread of
misinformation and fake news. Previously, information and news were disseminated
through formal newspapers and public broadcasts, which had mechanisms for fact-
checking, making it difficult for unverified information to be published. However,
social media and internet personal broadcasts in the digital age can disseminate infor-
mation indiscriminately and instantly without fact-checking, creating a significant
impact due to the echo chamber effect.9 Misinformation and fake news pose a critical
risk when used in politics, packaging distorted beliefs or extreme opinions in misin-
formation to conduct malign campaigns, manipulate public opinion, incite the public,
and contaminate elections with populist promises. This not only risks reversing elec-
tion outcomes but also places democracy in jeopardy. Persistent misinformation,
manipulation of public opinion, and incitement can undermine social trust in the

8 If the voting and counting process is handled electronically, accuracy and efficiency improve,
assuming a normal situation without attempts to manipulate election results through hacking. In
reality, suspicions of such hacking attempts cannot be completely dispelled, and there is no techno-
logical guarantee that hacking can be entirely prevented. Therefore, there are arguments to return
to paper ballots and manual counting as in the past, and some countries are actually implementing
this.
9 Social media inherently encourages group conformity among its members, which can lead to collective action based on group identity. If misused politically, combined with misinformation and extreme ideology, it can form hostile factions and provoke destructive behaviors.

government, media, and electoral systems, leading to extreme political strife, faction-
alism, and public opinion division, jeopardizing national unity. Adversarial nations
might exploit this to intervene in elections and politics through misinformation and
manipulation.10
The indiscriminate spread of misinformation and fake news has devastating conse-
quences, making it urgent to devise appropriate strategies to combat it. A key aspect of
the response strategy is establishing methods to determine the authenticity of misin-
formation and fake news and implementing legal actions based on those findings.
From a preventative standpoint, it is also necessary to run digital literacy campaigns
to promote the correct use of digital technologies. Accurately discerning the truth-
fulness of information disseminated through media requires interdisciplinary collab-
oration among experts in various fields such as journalism, social psychology, soci-
ology, political science, science and technology, and communications. To implement
this, regular meetings among these experts for organic networking are essential. A
viable solution for promptly addressing the issue of false information could involve
deploying AI technology for real-time filtering, an approach termed ‘AI filtering.’
This ‘technological treatment’ leverages advanced algorithms to automatically distin-
guish between factual and misleading content, offering a proactive measure against
the spread of misinformation. This could serve as a real-time solution, supple-
mented by expert group verification as a post-treatment method. For social media,
both real-time and post-event actions are necessary to prevent the amplification of
misinformation and fake news, ensuring appropriate legal measures are taken.
It is crucial to recognize the transformative impact of social media platforms,
which emerged through digital transformation, on political activities and social move-
ments. The most shocking change that hyperconnected social media has brought to
the political-social environment is collective action mediated by the internet. SNS
and internet personal broadcasts via platforms like YouTube are prime examples.
SNS, as a network of social relationships formed for communication, information
sharing, and expanding contacts, allows its members to express and share personal
statements. Internet personal broadcasting provides a means to propagate personal
views to an unspecified majority. Using these social media platforms, it is possible
to form groups that share opinions and engage in collective actions, ranging from
aggressive cyberactivities like negative comments to physical collective actions like
protests. The distinctive feature of such collective actions today is their formation and
execution beyond the constraints of time and space. Group members can participate
in collective actions even if they are dispersed across different regions or nations,
and they can even disrupt and aggravate situations from adversarial nations. There-
fore, it has become a critical task to sensitively address these environmental changes
and develop multilayered countermeasures for personal information protection and

10 Reports have highlighted China’s involvement in cyberespionage, influence operations, cyberattacks, internet comment manipulation, and political maneuvers in countries like Taiwan, Thailand,
Australia, South Korea, and Canada. Professor Kerry Gershaneck of National Chengchi University
has noted that China deeply intervenes in Taiwan’s politics and elections through digital political
warfare and media wars, employing similar strategies and tactics against South Korea. Refer to
Political Warfare, by Kerry Gershaneck, published in 2022.

national information security. Nevertheless, careful consideration and restraint are required to ensure that regulatory measures do not infringe on freedom of the press or stifle creative and innovative applications of new digital technologies.

7.7 Digital Surveillance

In the era of digital transformation, there’s a phenomenon quietly unfolding. It involves the collection of information on individuals for surveillance purposes and the gathering of consumer data for business use. Collecting individual information
for national surveillance is an act more likely to happen in authoritarian countries but
is not permitted in free democratic countries. However, even in free democratic coun-
tries, individual information is still being collected and utilized, mainly by digital
platform companies gathering consumer data for targeted advertising. The former is
surveillance by the state, while the latter is surveillance by corporations. The former
is conducted unilaterally by the state without citizens’ consent, whereas the latter is
largely carried out by businesses with consumers’ consent. If the former infringes
on citizens’ human rights through political acts, the latter is an economic activity
conducted with considerable consumer understanding. While the former creates a
digital surveillance society, the latter leads to a surveillance capitalism society.

7.7.1 Digital Surveillance Society

Even in the past, authoritarian states employed various surveillance methods to monitor specific individuals. They used numerous secret agents for trailing, eavesdropping, and surveillance, which was costly. Today, surveillance is still prevalent in authoritarian states, but the difference lies in the ease, precision, and cost-effectiveness of conducting surveillance with the help of digital technology. It has
become possible to surveil not just specific individuals but a vast number of people,
even the entire population. High-performance cameras installed everywhere can
capture people in motion, and facial recognition technology allows for the identifica-
tion of each person, enabling the collection of individual movement data. Lip-reading
using AI technology can reveal conversations. In the future, the introduction of
Central Bank Digital Currency (CBDC) and its mandatory use by citizens could also
enable the collection of individual financial transaction data. Such tracking of indi-
vidual physical movements and financial flows enables comprehensive surveillance
of the entire population.
China operates a surveillance system called “Skynet”, which utilizes CCTV
cameras, internet monitoring, facial recognition, and other data collection and anal-
ysis technologies. It is estimated that there are 600 million CCTV cameras installed
in China, and the country is known to have some of the most advanced facial recogni-
tion technology in the world. According to the state-run Workers’ Daily, Skynet can

scan the entire population of China in just one second, identify moving individuals,
and offer up to 99.8% accuracy by considering facial expressions, movements, and
variations in light and shadow. Following the Skynet project, the Chinese govern-
ment made it mandatory from December 2021 to register facial information when
activating mobile phones. Eventually, Skynet, armed with more advanced digital
and AI technologies, will closely monitor every move of its citizens, protecting the
communist regime from anti-establishment unrest.
In addition, China has implemented a nationwide Social Credit System that
assigns credit scores to individuals and companies. This system evaluates a range of
factors, including financial behaviors like income tax payments, utility bills, and loan
repayments; social behaviors like traffic law compliance and public transport fare
payments; and online behaviors such as online conversations, comment reliability,
and shopping habits. Citizens begin with a base score that is adjusted based on their
actions, with positive behaviors like timely tax payments, public welfare contribu-
tions, and blood donations increasing one’s score, while actions like environmental
violations, jaywalking, or parking infractions result in deductions. The accumulated
scores influence various aspects of life, such as insurance premiums, school admis-
sions, scholarships, internet access, high-speed train and flight eligibility, foreign
travel, public sector job applications, and loan interest rates. For instance, a high score
may grant benefits such as priority hospital reservations, discounts on utility bills,
lower loan interest rates, and free health check-ups, whereas a low score may result
in difficulties in securing public sector employment, limited children’s admission to
private schools, or restrictions on travel and accommodation options.
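The scoring mechanics described above—a base score adjusted by individual behaviors, with tiers that unlock benefits or impose restrictions—can be sketched abstractly as follows. Every number, behavior label, and threshold here is an assumption for illustration only; the actual system’s criteria are not publicly specified.

```python
# Purely illustrative model of a score-based social credit mechanism;
# the real system's criteria, weights, and thresholds are not publicly
# specified, so every value below is an assumption.

BASE_SCORE = 1000

ADJUSTMENTS = {
    "timely_tax_payment":          +30,
    "public_welfare_contribution": +25,
    "blood_donation":              +20,
    "jaywalking":                  -10,
    "parking_infraction":          -15,
    "environmental_violation":     -50,
}

def social_credit(events):
    """Sum the base score with per-behavior adjustments."""
    return BASE_SCORE + sum(ADJUSTMENTS.get(e, 0) for e in events)

def tier(score):
    """Map a score to hypothetical benefit/restriction tiers."""
    if score >= 1050:
        return "priority services, lower loan rates"
    if score >= 950:
        return "standard treatment"
    return "travel and employment restrictions"

score = social_credit(["timely_tax_payment", "blood_donation", "jaywalking"])
print(score, tier(score))  # → 1040 standard treatment
```

Even this toy model makes the core concern visible: whoever controls the `ADJUSTMENTS` table and the tier thresholds controls which behaviors are rewarded or punished, and opaque criteria leave citizens unable to predict the consequences of their actions.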
While China’s Social Credit System may initially appear as a structured way to
incentivize positive behavior, it raises several concerns. One issue is the fundamental
concept of evaluating individuals through a scoring system, which could be seen as
conflicting with the idea of personal freedoms. In addition, the criteria for evalua-
tion may be subject to interpretation. For example, a rule about reducing one’s score
for spreading false information could be applied broadly, potentially affecting those
who express dissenting views. In some cases, evaluation criteria extend beyond indi-
vidual social behavior to include aspects like social circles, personal relationships,
or political and religious views. These aspects, which may not be directly related
to an individual’s actions, could disproportionately impact their score. Such criteria
raise questions about fairness and transparency and may be viewed as mechanisms
for exerting social control rather than promoting societal well-being.
China’s Social Credit System integrates with existing technical surveillance
methods, such as cameras and facial recognition, creating a dual system of over-
sight. The combination of retrospective surveillance with the proactive elements of
the credit system allows for a more comprehensive form of monitoring. While this
system is designed to encourage socially harmonious behavior, critics argue that it
also serves to regulate public expression and dissent. The system’s opaqueness, where
individual score criteria are not fully disclosed, can lead to uncertainty about which
actions will affect one’s score. This lack of transparency may foster a climate of
self-censorship and social caution, as individuals may avoid associating with those
whose scores have been lowered to avoid negative repercussions. As a result, the

Social Credit System may not only function as a governance tool but also as a means
of influencing public behavior and maintaining control over societal discourse.11
While China was the first to implement the Social Credit System, it is a mecha-
nism that could potentially be adopted by other authoritarian states. However, even
democratic nations are not immune. As seen in the USA, considered a benchmark of
democracy, the selection of a president can significantly sway national governance.
This prompts a crucial reflection on how to prevent the creation of new surveillance
tools and stop nations from descending into surveillance societies.
As a measure against the Social Credit System, the first consideration could be
to enshrine the protection of human rights, privacy, and the prohibition of guilt by
association in the constitution, safeguarded by the Supreme Court as guardians of
constitutionalism and the rule of law. This assumes that Supreme Court justices
are appointed based on their dedication to upholding the constitution and the rule
of law. However, the appointment of justices can be influenced by the president, and in situations where political tribalism and populist politics may challenge the rule of law, constitutional provisions alone are not reassuring.12
Another potential strategy to address surveillance society issues is legalizing the
restrictions on the long-term storage of personal information by governments and
corporations. Specifically, laws could be established to limit the retention period
of sensitive information, such as political or religious preferences, or to preferably
prevent the collection of such information altogether. In addition, forming interna-
tional agreements, though non-binding, could solidify national commitments as an
international promise. The ultimate recourse is for citizens to stand against authori-
tarian or controlling governments to protect human rights and freedom. While resis-
tance movements may face limitations in countries already under surveillance by
authoritarian regimes, in liberal democracies, early opposition against the installation
of digital or social surveillance systems can have a chance of success.

7.7.2 Surveillance Capitalism Society

Digital platform companies use digital technologies to massively collect and analyze
user data, employing it for various purposes such as customized advertising, product
development, and market forecasting to pursue economic gains. This data includes

11 The Artificial Intelligence Act (AIA), enacted by the EU in 2024, prohibits AI systems that
evaluate individuals’ social behavior and impose benefits or punishments based on such evaluations,
as well as systems that manipulate people’s behavior or thoughts in an unfair manner. See Chap. 5,
Sect. 5.9.5 for reference.
12 “Political tribalism,” a term coined by Amy Chua, refers to the inherent human instinct to affiliate with groups, fostering a sense of belonging and attachment. Chua elaborates on tribalism’s dynamics,
noting that once individuals align with a group, their identities become remarkably intertwined with
that group. This allegiance compels individuals to aggressively support their group members, often
leading to unwarranted hostility toward outsiders. This insight is detailed in Amy Chua’s Political
Tribalism, published in 2020.

online searches, website visits, social media interactions, location information, behavioral patterns, and consumption habits. These companies collect and analyze
such data to gain insights into individual behaviors, preferences, and interests, metic-
ulously documenting them in user profiles. This practice of data collection and utiliza-
tion can infringe on individual freedom and privacy, especially since large-scale data
analysis and algorithmic predictions can manipulate personal behavior. Thus, users
in the digital age may find themselves surveilled and manipulated by digital platform
companies without noticing it. This has led to the emergence of a new concept called
“Surveillance Capitalism.” Surveillance capitalism refers to an economic system and
social phenomenon where digital platform companies collect, analyze, and monitor
user data, converting it into commercial value for profit.13
The case of Google may help to understand the real-world background of surveil-
lance capitalism. Google, founded in September 1998, began offering its search
engine service without a revenue model initially. However, after launching the adver-
tising platform “AdWords” (now Google Ads) in October 2000, Google started
generating revenue, eventually evolving into customized and targeted advertising.
AdWords allowed advertisers to bid on keywords and create text ads next to Google’s
search results, initially focusing mainly on keywords. Recognizing the value of user
reactions and search queries as useful information, Google developed advertising
based on user data. Initially, the company collected user information to improve
the quality of search results but soon recognized user information as a valuable
asset. Google began actively collecting information on users’ interests, preferences,
and online behaviors, developing various services and products to encourage users
to expose more of this information. Subsequently, utilizing this surplus user data
for targeted advertising became a core element of Google’s advertising strategy,
significantly contributing to its advertising revenue.
Google uses a variety of techniques to collect and analyze user data, which is then
applied for personalized advertising and service improvements. By analyzing search
queries on its search engine, Google can identify users’ interests and serve relevant
ads. In Gmail, automated systems may scan emails to identify certain keywords, such
as travel plans, to show targeted advertisements. Google Maps and mobile location
data are used to provide information about nearby places of interest. On YouTube,
viewing histories and preferences help recommend videos and display ads that are
more likely to align with users’ tastes. For users logged into their Google accounts, the
company combines data across its services, such as Google Search, Gmail, Google

13 Shoshana Zuboff, author of The Age of Surveillance Capitalism, views the surveillance capitalism society as one where online-based product sales and marketing in the digital platform era primarily
rely on individual digital traces. Platform companies extract data left online by individuals for
free, gaining commercial profits and power. Similar to how industrial capitalism utilized labor, the
power-holders of surveillance capitalism consume every digital trace of individuals, increasingly
amplifying their power and reducing individuals to a state of slavery. Individual data is not only
collected, analyzed, and categorized for commercial use but also employed to guide, control, and
manipulate individuals. People become custom consumers who only consume what algorithms
present based on their data, transitioning from beings with free will to analyzed data and puppets
utilized for others’ gains.

Maps, and YouTube, to build detailed user profiles. These profiles are further enriched
with information collected through cookies and tracking technologies that monitor
users’ online behaviors, enabling more precise ad targeting. Leveraging big data
analysis and machine learning, Google analyzes these profiles to detect patterns and
predict user behaviors and preferences. This approach allows Google to deliver highly
targeted advertisements, enhancing the effectiveness of its advertising platform. As a
result, advertisers benefit from more tailored ad placements, and Google strengthens
its revenue streams from advertising.14
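The cross-service profiling described above can be sketched, in a deliberately simplified form, as merging events from different services into one interest profile and then ranking candidate ads by topic overlap. The service names, the keyword-to-topic map, and the scoring are hypothetical; real platforms apply machine-learned models to far richer signals.

```python
# Simplified sketch of cross-service profile building and ad ranking.
# Service names, the keyword-to-topic map, and scoring are hypothetical;
# real systems use machine-learned models over far richer data.

from collections import Counter

KEYWORD_TOPICS = {"hiking": "outdoors", "trail": "outdoors",
                  "park": "outdoors", "camping": "outdoors",
                  "stove": "cooking"}

def build_profile(events):
    """Merge (service, text) events from different services into one
    interest profile, counting topic mentions."""
    profile = Counter()
    for _service, text in events:
        for word in text.lower().split():
            if word in KEYWORD_TOPICS:
                profile[KEYWORD_TOPICS[word]] += 1
    return profile

def rank_ads(profile, ads):
    """Rank candidate ads by the profile weight of their target topic."""
    return sorted(ads, key=lambda ad: profile.get(ad["topic"], 0), reverse=True)

events = [("search", "hiking boots"),
          ("youtube", "trail running gear review"),
          ("maps", "national park"),
          ("search", "camping stove")]

ads = [{"name": "Blender deal", "topic": "kitchen"},
       {"name": "Tent sale", "topic": "outdoors"}]

profile = build_profile(events)
print(rank_ads(profile, ads)[0]["name"])  # → Tent sale
```

The key point the sketch illustrates is the merge step: data that is innocuous within any single service becomes a detailed behavioral profile once events are joined across services under one account.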
While Google is often used as an example, other digital platform operators engage
in similar practices. Companies like Meta (Facebook) and Amazon, which rely
heavily on digital advertising revenue, actively collect user information through
their social networks and online commerce platforms for advertising and marketing
purposes. Data points such as conversations, messages, ‘likes’ on Facebook, photos
uploaded to Instagram, and comments on those photos are captured, analyzed, and
stored in individual user profiles. This enables platforms to gain deep insights into
personal information, including food preferences, travel habits, religious views, polit-
ical leanings, and social interests. The data is then used to deliver targeted adver-
tisements, predict future actions, and, in some cases, influence behavior. As a result,
users are under continuous data collection by these platforms. These practices have
contributed to the emergence of the concept known as ‘surveillance capitalism,’
where user data is monetized as part of the economic model.
When digital platform companies like Google, Meta, and Amazon collect more
user data than necessary for improving service quality and use that surplus data for
targeted advertising, it enables the phenomenon of surveillance capitalism. These
platforms analyze data generated by users’ online activities, gathering comprehen-
sive information such as location, movement patterns, interests, social networks,
consumption habits, search behaviors, and even political views and religious pref-
erences, which are stored in individual profiles. By analyzing these profiles, plat-
form operators gain insights into user behaviors, creating a parallel between data and
behavior—with “surplus data” representing “behavioral surplus.” Although platform
companies claim that user data is collected for service improvement, only a portion
of it is used for that purpose. The remainder—behavioral surplus—can be repur-
posed for objectives such as targeted advertising, consumption prediction, or even
political influence. Shoshana Zuboff argues that just as surplus labor fueled industrial
capitalism, behavioral surplus drives surveillance capitalism in the digital platform
era. In contrast to industrial capitalism, where human labor created value, Zuboff
suggests that in surveillance capitalism, human behavior becomes the raw material,
captured by digital systems and transformed into valuable data. Machines, which

14 In response to growing concerns over user privacy, Google has announced changes to its data
management approach. Starting in 2024, information related to users’ movements, which was
previously stored on both the user’s device and Google’s servers, will only be retained on the
user’s device.

once served as fixed capital in the industrial age, now act as variable capital, contin-
uously upgrading through machine learning and improving their ability to predict
and influence behavior.15
Surveillance capitalism differs from traditional capitalism in several ways. Capi-
talism has historically been defined by the privatization of production, profit maxi-
mization, and market competition, primarily involving the production and exchange
of goods and services. Surveillance capitalism extends this framework by utilizing
data (behavior) as a key resource for profit generation. Zuboff points out that in
this model, machines have assumed the role of value creation, reducing humans to
sources of behavioral surplus. While applying the term “capitalism” to personal data
collection may be considered controversial, the secretive nature of data collection by
platforms bears resemblance to “surveillance,” and the pursuit of profit from this data
aligns with the principles of capitalism. Hence, the term “surveillance capitalism”
encapsulates these dynamics, raising awareness of the practices within the digital
platform era and encouraging individuals to be mindful of these developments.
The concept of surveillance capitalism presents a crucial opportunity for users
of digital platforms to reflect on how their online behavior, often shared without
much thought, can be used and what consequences it may have. It prompts individ-
uals to consider how to navigate digital spaces responsibly and avoid the potential
negative impacts of exposing too much personal information. Furthermore, it raises
questions about the societal mechanisms needed to prevent harm, ensuring that indi-
viduals retain control over their data. As society moves further into the digital and
AI transformation era, it becomes increasingly important for individuals to reclaim
their right to information protection, demand transparency and accountability from
corporations and governments, and work toward maintaining a more democratic and
ethical society. Addressing large-scale data collection and surveillance requires rein-
forcing privacy rights, improving personal data protections, and strengthening legal
frameworks to secure transparency and consent in data collection.16 For individuals

15 To be more specific, Zuboff’s argument is as follows: Under the regime of surveillance capitalism,
humans are no longer the agents of value realization. Far from being entities that create value through
labor, humans have been relegated to being part of the means of production, or more precisely, raw
material. While industrial capitalism transformed raw materials obtained from nature into products,
surveillance capitalism seeks to utilize human nature. In return, the capacity to create value in
the capitalist production process, which in the era of industrial capitalism was just ‘fixed capital’
represented by machines, has now shifted. During the industrial age, although machines participated
in production, they couldn’t enhance their own value, thus remaining ‘fixed capital.’ However, the
scenario has completely changed with the advent of digital transformation and the development of
machine intelligence. Google’s machine intelligence technologies grow by consuming ‘behavioral
surplus,’ and the more behavioral surplus is fed into it, the more accurate the predictive products
created by machine intelligence become. Through machine learning mechanisms, machines now
upgrade themselves at every moment of operation, transforming into ‘variable capital’.
16 The Digital Services Act (DSA) enacted by the European Union in 2022 includes measures to limit the collection of consumer information and its use for personalized advertising. It mandates that consumers have the option to halt personal information collection and deactivate recommendation algorithms, and prohibits personalized advertising based on religion, race, sexual orientation, political leanings, etc., especially targeting children and adolescents. In addition, the Artificial Intelligence Act (AIA), enacted in 2024, prohibits AI systems that unfairly manipulate people’s behavior or thoughts. Misuse of surplus data for purposes such as targeted advertising, consumer predictions, or political manipulation could violate this provision. Refer to Chap. 5, Sect. 5.9.5 for further details.

to navigate digital transformation effectively, they must be empowered to control how their data is collected and used, supported by the establishment of ethical guidelines to prevent misuse. Achieving this requires active civic engagement, public discourse, and solidarity among citizens to restore and protect individual rights in the digital age, while ensuring social welfare and ethical governance.

7.8 Digital Self-restraint

Digital surveillance is common to both surveillance societies and surveillance capitalism societies, utilizing digital technology to collect personal information and directly or indirectly restrain individuals. In a digital surveillance society, citizens are mentally and physically constrained by a controlling government, while in a digital surveillance capitalism society, users are unknowingly exploited by corporations. Both scenarios involve the collection of personal or private information to surveil and restrain citizens or users. While such digital surveillance uses digital technology to constrain citizens or users, there also exists a phenomenon where users self-restrain while using digital services. This self-restraint manifests in various forms, such as distraction from work, immersion in videos leading to a misperception of reality, anxiety when disconnected from the outside world, thoughts trapped in a mold unknowingly, and unconscious assimilation into a group.

7.8.1 Fear of Disconnection

Humans have an instinctive desire to feel a sense of belonging and connection to others. In today’s hyperconnected society, personal digital devices such as smartphones, tablets, laptops, and PCs, combined with a well-developed internet infrastructure, enable seamless connections with anyone, anywhere. In addition, the variety of connection types through search engines, social networks, and messaging platforms is vast. Maintaining these connections is often low-cost or free, allowing constant interaction and satisfying the innate need for social belonging. Social networks, in particular, help sustain relationships and create a sense of community, reinforcing the feeling of being connected. However, this can lead to dependency on social media, where disconnection may cause feelings of isolation, loneliness, and anxiety. For some, this sense of disconnection manifests as fear of missing out (FOMO), which can intensify from mild anxiety to negative emotions, depression, and a loss of self-identity. When self-esteem becomes tied to social media validation, such as the number of “likes” or comments, disconnection from these platforms may result in emotional distress and a diminished sense of self-worth. Furthermore, continuous exposure to information overload followed by sudden disconnection can trigger withdrawal-like symptoms, similar to those seen in cases of substance addiction.
FOMO and a loss of self-identity are examples of digital societal challenges
resulting from the excessive use of social media. These issues often arise when
individuals do not form an independent self-identity, instead deriving their sense
of self from their position within a group. Addressing these concerns requires a
conscious effort to reduce dependency on digital platforms, while also fostering
real-world connections and activities that contribute to a balanced lifestyle.

7.8.2 Degraded Concentration

Living in a hyperconnected society with smartphones, the internet, and various plat-
form services often leads to a continuous flow of information, which can disrupt
one’s ability to focus on tasks. The frequent shifts between communication partners,
topics, and methods force individuals to switch their attention rapidly, making it
difficult to maintain focus on a single task for extended periods.17 Features of social
media, such as “likes,” comments, and shares, are designed to engage users and
can contribute to addictive behaviors, further dispersing focus. While digital tech-
nology enables multitasking, using multiple devices and applications simultaneously
increases cognitive load, reducing the depth of concentration that can be applied to
any single task. Moreover, the overwhelming volume of information and the pres-
sure to stay connected can heighten anxiety and stress, further hindering focus.
Frequent interruptions from emails, messages, and notifications fragment attention,
preventing deep, focused work and leading to more superficial task completion. In
addition, digital technology and social media can alter the brain’s reward system and
expectations, making slow-paced tasks seem less appealing as individuals become
accustomed to fast information and entertainment.
In order to counteract the decline in concentration, a digital detox may be neces-
sary. This involves temporarily reducing or ceasing the use of digital devices like
smartphones, computers, and tablets. The term “digital detox” combines “digital”
with “detoxification,” referring to the process of stepping away from electronic
devices, the internet, and social media to recover from over-reliance on these tech-
nologies. Engaging in a digital detox can help individuals refocus on real-life activ-
ities and social interactions without the constant interference of digital distractions,
ultimately reducing stress and improving concentration.

17 Johann Hari, in his book Stolen Focus, cites technological distractions like social media and
smartphones, information overload, and a multitasking culture as factors that degrade concentration.

7.8.3 Digital Escapism

Some individuals immerse themselves in TV programs, OTT services, and videos
circulating on social media, indulging in fictional worlds and at times mistaking them
for reality. While these fictional worlds may offer temporary happiness, individuals
often feel disappointment and pessimism when they confront the gap between fiction
and reality. This form of escapism provides a refuge from harsh or unpleasant realities,
but habitual indulgence in these fantasies can become problematic. This tendency is
particularly noticeable among youth who may experience anxiety about their future
or difficulties securing employment, leading them to fictional worlds as a form of
relief. However, the deeper one becomes absorbed in these fictional realms, the
greater the shock when returning to reality. A strong sense of identification with
characters or narratives can result in feelings of loss or sadness when a show or
movie ends, and the contrast between the excitement of fiction and the dullness of
reality becomes more pronounced. This contrast effect occurs when two contrasting
experiences, such as dramatic fictional events and routine real-life moments, make
the latter feel even more mundane or disappointing. Some individuals may struggle
to distinguish between the intense emotions experienced through fictional media and
the more somber experiences of real life.
In order to address the issue of escaping into fictional worlds and becoming overly
absorbed in them, it is essential to develop the willpower to confront reality without
relying on entertainment as a form of escapism. By cultivating a healthier balance
between recreational media consumption and real-world engagement, individuals
can enjoy fictional content without becoming detached from their actual lives.

7.8.4 Digital Technostress

As digital technology continues to evolve and accelerate, technostress is becoming an
increasingly common aspect of modern life. While the introduction of new technolo-
gies, especially those designed to make everyday tasks or work more efficient, can
initially bring feelings of joy and liberation, technology that is imposed on consumers
without an apparent need can instead lead to stress. New technology needs to be both
understandable and intuitively usable; if it is overly complex or difficult to compre-
hend, adapting to it can become challenging and stressful. Change is manageable
when it occurs at a pace within one’s mental and physical limits, but when it exceeds
those limits, stress accumulates. As new technologies rapidly accumulate, they form
a new kind of civilization, and encountering this without the ability to fully adapt
can lead to cultural alienation and mental strain. The rapid changes in technology
can overwhelm individuals, making them feel disconnected from their own era or
environment, almost as if they are living in a different time or place. For example,
the invention of telephones and washing machines during the industrial age was
generally welcomed due to their convenience and ease of use. However, technolo-
gies introduced in the digital age, such as digital kiosks and online banking services,
can cause stress for many people, especially when their functionality is difficult to
understand. Reducing staff in favor of kiosks or closing bank branches to promote
digital banking can make some people view digital technology as a source of incon-
venience and a threat to their ability to manage everyday tasks. This technostress
is contributing to a harsher societal environment, leading some to feel nostalgic for
past technologies, even forming subculture groups that resist modern innovations.
Technostress caused by digital technology can partly be attributed to the immatu-
rity of current technologies. As society progresses toward a digital/AI-driven future
and technology becomes more sophisticated, some issues related to usability and
adaptation may be alleviated. For instance, AI-powered kiosks could eventually
mimic the natural interactions of a human clerk, restoring some of the ease and
comfort lost in the transition to digital systems. However, the continuous introduc-
tion of new technologies that replace familiar products and services at a rapid pace,
without addressing whether there is a perceived need for these changes, can still lead
to cultural alienation and temporal dissonance. This challenge, stemming from the
speed of change rather than the maturity of the technology itself, may require broader
solutions beyond just technological advancement.

7.8.5 Confinement of Thoughts

Search engines and social media, in slightly different ways, create filter bubbles
and echo chamber effects that unknowingly restrict or distort information for users,
confining their thinking within certain frameworks. Internet search
engine algorithms are designed to provide users with personalized, quick services by
remembering past search histories and presenting search results within similar ranges
when new queries are entered. This results in users being unknowingly confined to
the scope of their past search histories. This filter bubble phenomenon, by limiting
the information accessible through internet searches, narrows users’ perspectives
and distorts their thinking. As this phenomenon recurs, it prevents users from seeing
things from various viewpoints, leading them to prejudices and self-confirmation,
fostering distorted thinking.
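
The self-reinforcing loop behind a filter bubble can be sketched in a few lines of code. The following toy simulation (the topic names and click behavior are invented for illustration; real search engines use far richer signals) shows how a single early interest comes to dominate a personalized ranking:

```python
# Toy filter-bubble loop: a "search engine" that ranks topics by how often
# the user has clicked them before. All data are invented for illustration.
from collections import Counter

TOPICS = ["politics-left", "politics-right", "science", "sports", "arts"]

def personalized_ranking(history: Counter) -> list[str]:
    """Rank topics so that frequently clicked topics come first."""
    return sorted(TOPICS, key=lambda t: -history[t])

history = Counter({"politics-left": 1})  # one initial click
for _ in range(5):
    results = personalized_ranking(history)
    clicked = results[0]   # the user tends to click the top result...
    history[clicked] += 1  # ...which feeds straight back into the ranking

print(history)  # the initial interest dominates; other topics never surface
```

Because every click feeds back into the ranking, the top slot never changes after the first interaction, which is precisely the narrowing effect described above.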
Social media platforms allow users to create chat rooms for communication,
where the information’s echo effect can homogenize beliefs among participants,
distorting and biasing thought. This echo chamber effect solidifies beliefs even in
those initially without strong convictions, through conversations with like-minded
groups, leading to the formation of factional groups and trapping thought within the
collective mindset. Furthermore, the echo chamber effect can homogenize beliefs
among members of the same chat room, form groups, and encourage group action.
If utilized politically, this can lead to the formation of hostile factions and destruc-
tive actions. While filter bubbles cause bias by restricting collected information,
echo chambers distort information egocentrically, leading to confirmation bias. Both
phenomena represent how digital technology unknowingly confines or limits thought.
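
The homogenizing tendency of an echo chamber can be illustrated with a minimal opinion-dynamics sketch (a DeGroot-style averaging model; the initial opinion values are invented):

```python
# Toy echo-chamber model: members of a closed chat room repeatedly average
# their opinions with the group, converging to a single shared belief.

def echo_chamber_round(opinions: list[float]) -> list[float]:
    """Each member moves halfway toward the current group mean."""
    mean = sum(opinions) / len(opinions)
    return [0.5 * o + 0.5 * mean for o in opinions]

opinions = [0.1, 0.4, 0.5, 0.9]  # initial spread of views on some issue
for _ in range(20):
    opinions = echo_chamber_round(opinions)

spread = max(opinions) - min(opinions)
print(f"mean={sum(opinions) / len(opinions):.3f}, spread={spread:.2e}")
```

After a few rounds the spread of opinions collapses toward the group mean, mirroring how conversation within a like-minded group solidifies one collective belief, even for members who began without strong convictions.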

7.9 AI Transformation in Society

The transition to AI in society will be a transformative force that shapes the future
in ways both visible and subtle. AI technology is already improving everyday life,
with smartphones now equipped with AI assistants that handle routine tasks, manage
schedules, and even anticipate our needs through predictive algorithms. Smart homes
use AI to control lighting, security, and appliances, creating a seamless, personal-
ized living experience. In the realm of entertainment, AI can curate content tailored
to individual preferences, offering personalized recommendations that reflect user
habits. Similarly, AI-driven healthcare is revolutionizing medical consultations, with
remote systems providing precise diagnostics and customized treatments, ensuring
that medical care is more efficient and accessible than ever before.
Beyond these personal benefits, AI is having a profound impact on workplace envi-
ronments and social activities. AI-enabled tools are enhancing the speed and precision
of tasks in various areas like finance, marketing, and manufacturing, reducing the need
for human intervention. In fact, many have already experienced the convenience of AI
in day-to-day interactions with chatbots, such as ChatGPT, which provide on-demand
information, customer service, and even emotional support. This shift suggests that
AI transformation will redefine not just the efficiency of society but also how people
interact, communicate, and build relationships.
The AI transformation will have a more significant and direct influence on society
than the digital transformation did. While digital transformation centered on smart-
phones that gave users control over communication and services, AI could change
the very nature of interaction, placing AI in a more authoritative role. People will
rely on AI-powered smartphones, which will take on many decision-making tasks
autonomously.18 For instance, a smartphone may no longer just be a tool that responds
to commands; it may anticipate needs and take independent actions. This raises ques-
tions about over-reliance on AI and the gradual shift of decision-making authority
from humans to machines.
Platforms that were central to the digital transformation, such as search engines,
social media, and e-commerce, will need to evolve to survive in this AI-driven land-
scape. Initially, these platforms may face challenges, as AI offers new ways to interact
and consume content. However, just as digital platforms adapted to mobile OS
systems, AI platforms will emerge, hosting various applications tailored to specific AI
services, reshaping the marketplace in the process.19 Users will begin to interact with
AI systems for an increasing number of activities, including shopping, socializing,
and even creative tasks like content generation.

18 Samsung released the Galaxy S24 series on January 31, 2024, as its first AI-powered smartphone,
featuring innovations like Live Translate and Chat Assist. In September 2024, Apple launched the
iPhone 16 with AI features, introducing Apple Intelligence, a suite of AI tools for tasks like image
analysis and text rewriting.

One key consequence of this AI revolution is that real-world interactions may
increasingly be AI-mediated. Human-to-human interactions could be replaced by
human-to-AI or even AI-to-AI transactions. For example, in online shopping, a
user might only need to mention a desired product to their AI assistant, which will
autonomously handle the entire purchase, communicating with the AI systems of e-
commerce platforms. This level of automation presents unprecedented convenience
but also shifts the role of humans from active participants to passive overseers, with
AI taking over tasks that were once human-dominated.
With the AI transformation, the job landscape will change dramatically. While
AI will lead to unprecedented efficiency, especially in administrative roles, it will
also result in significant job displacement. Traditional jobs, even in professional
sectors such as medicine, law, and management, could be substantially replaced
by AI, as algorithms become more capable of handling tasks that require complex
decision-making. However, this shift will create new opportunities. Jobs such as AI
specialists, AI trainers, and AI integrators will rise in demand, requiring advanced
skills to manage and optimize AI systems across industries. As a consequence, it
becomes crucial to evolve vocational training to equip workers with the necessary
AI-related skills, and AI itself can be a powerful tool for such education. For example,
AI-driven simulators can offer hands-on training in fields as varied as aviation and
medicine, providing workers with immersive, interactive learning environments.
As AI becomes more pervasive, concerns over the digital divide will evolve into
worries about the AI divide. Individuals who lack proficiency in AI technologies
may be left behind, exacerbating existing inequalities. Ensuring AI literacy for all
members of society is a critical task. While AI has the potential to create interactive
and personalized learning environments that can bridge these divides, it also poses
the risk of leaving those without access or understanding further marginalized. In
addressing this, AI itself may offer solutions, such as voice-activated AI
systems with conversational capability at kiosks or in public spaces that allow users
to access services as if they were engaging with a human. It can provide inclusive
solutions for the pre-digital generation and those with limited technical skills.
However, the AI transformation is not without its risks. The spread of misinforma-
tion, already a significant issue in the digital age, could be exacerbated by AI, partic-
ularly through the use of deepfake technology. Deepfakes use AI to create manip-
ulated audio and video content, often indistinguishable from reality. These tools,
while technologically impressive, pose threats to political stability, personal repu-
tations, and social trust. Moreover, the ability to distinguish authentic content from
fabricated media becomes increasingly difficult, presenting a fundamental challenge
in maintaining truth in an AI-driven society.

19 For example, OpenAI launched the “GPT Store” in January 2024, allowing users to create and
use customized GPT applications.

As AI operates on vast amounts of data, concerns will grow around privacy, data
security, and surveillance. AI systems rely on extensive datasets to make informed
decisions, leading to fears about privacy violations and the misuse of personal
data. However, AI also offers the potential to strengthen security by creating more
advanced protective measures. The key challenge lies in balancing the benefits of
AI-driven security with the need to safeguard individual rights and prevent unwar-
ranted surveillance. In addition, ethical concerns surrounding intellectual property
and bias in decision-making must be addressed, as AI continues to assume roles in
creative fields and judgment-based industries.
In summary, the AI transformation offers immense promise but also significant
challenges. Society must work to ensure that the benefits of AI, such as increased
efficiency, better healthcare, and improved education, are maximized while mini-
mizing risks related to privacy, bias, ethical dilemmas, and job displacement. The
future of AI is one of potential, but also one that demands thoughtful management
to navigate the complex social, legal, and philosophical issues that will arise.
Chapter 8
Challenges of Digital and AI
Transformation

We have explored various aspects of digital and AI transformation throughout this
work. Initially, we examined the foundational elements of digital transformation,
including the processes involved in establishing these foundations. Following that,
we analyzed the functions and dysfunctions of newly emerged digital platforms,
focusing on four representative platforms and the challenges they present. We also
delved into the technologies driving digital transformation, discussing their roles
and applications. Moving into the realm of AI, we covered fundamental concepts
like algorithms, machine learning, and neural networks, leading into the transformer
architecture, particularly its application in models like GPT. Building on this, we
discussed the digital transformation of industries and its impact on society as a
whole.
In this final chapter, we will revisit the key points of digital and AI transformation,
while addressing the challenges that emerged throughout our discussions. We will
begin with a brief overview of the Digital Revolution that set the stage for digital
transformation, followed by an exploration of topics related to digital platforms,
the transformation of industries, and its societal impacts in areas such as educa-
tion, media, and personal skills. We will also reflect on the role of governments in
managing digital transformation and provide a brief forecast of the AI era that will
follow the digital age.

8.1 Digital Revolution

We regard the Digital Revolution as the starting point that transformed industrial
society into a digital society. Just as the Industrial Revolution transitioned agrarian
society into an industrial one, it is reasonable to assume a corresponding Digital
Revolution marked the transition into a digital society. While the Industrial
Revolution is symbolically dated to James Watt's steam engine, patented in 1769,
this

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5_8

is emblematic. By that time, the technologies supporting such an invention already
existed, and the atmosphere of the Industrial Revolution was ripe when Watt’s steam
engine invention pulled the trigger. Similarly, the Digital Revolution was not trig-
gered by a single invention or event. Instead, the trend and potential for digitalization
grew across various industries and technologies, culminating in certain events that
sparked the Digital Revolution.
Compared to the Industrial Revolution, the Digital Revolution unfolded quietly.
The Industrial Revolution brought about rapid changes in industry, economy, society,
transportation, and the environment, providing both conveniences and pains to
human life, marking a significant transformation in human history. The long-standing
agrarian and handicraft-based industries shifted to machine-based manufacturing,
leading to the relocation of rural populations to factories and the reorganization of
society around urban centers. It also led to the formation of new social classes such
as the industrial middle class and the working class and saw significant develop-
ments in transportation through railways and steamships. However, it also caused
environmental destruction through coal mining and deforestation, leading to severe
air and water pollution from factory waste. These actions sacrificed numerous lives
and had a significant impact on Earth’s climate and ecosystems, setting the stage
for today’s climate change issues. Compared to this, the Digital Revolution has
unfolded without such abrupt changes, quietly spreading digital technologies across
industries and society without causing social upheaval or environmental destruction.
The only notable change was the rapid rise of digital platform companies to the top
10 in market capitalization globally. Other than that, digital transformation occurred
quietly, largely unnoticed.
The primitive starting point of digital transformation was the digital conversion of
analog signals such as voice and video into digital formats. Digital technology repre-
sented a new breakthrough in the quest for noise-free long-distance communication.
This breakthrough led to the realization of digital long-distance communication, the
integration of communication and computers at the signal transmission level, and
eventually their fusion at the system level, creating a new digital world. The digital
concept, discovered in the pursuit of realizing long-distance communication dreams,
ultimately became the cornerstone of creating the digital world.
It took nearly 50 years for the concept of digital technology to mature and become
the main agent of digital transformation, undergoing a lengthy development process
integrating communications and computing. This integration went through what can
be termed as the “First Digital War” and the “Second Digital War,” each marking
battles between wired and wireless domains, and between circuit switching and
packet switching modes, resulting in the integration at communication level. The
emergence of smartphones led to the “Third Digital War,” shifting integration to the
system level and merging at the operating system (OS) level. The outcomes of the
first two digital wars established communication platforms, and the third digital war
resulted in the establishment of content platforms. The form of the communication
platform was the internet, and the core of the content platforms was iOS and Android.
This marked the end of the lengthy process of integration, leading to the “ICT Big
Bang” with the opening of open application marketplaces on content platforms,
ultimately leading to the Digital Revolution.
What is the essence of the ICT Big Bang, and how is it connected to the Digital
Revolution? The ICT Big Bang unfolded at three levels—device, OS, and business.
The convergence of communications and computers culminated in the smartphone,
where a computer chip merged with a communication device. This merger ignited the
ICT Big Bang at the device level. Smartphones, with their mobile operating systems
(OS), introduced application stores that established open content marketplaces. This
shift saw a massive influx of content providers, causing an explosive increase in
the number of available apps and creating the ICT Big Bang at the OS level. The
true essence of the ICT Big Bang lies in the fusion of smartphones and application
marketplaces, which dismantled the previously walled garden of content distribution,
revolutionizing the market by enabling direct transactions between app providers and
users. As a result, content emerged as the central player in the communications land-
scape, pushing traditional communication businesses to the periphery. This marks
the ICT Big Bang at the business level. The flood of applications onto content plat-
forms and the subsequent rapid expansion of app services triggered a powerful wave
of digital change, thus constituting the Digital Revolution. The explosive force of
the ICT Big Bang propelled the development of digital technologies and drove the
paradigm shift toward the digital era, marking the core of the Digital Revolution.
The commonly discussed “Fourth Industrial Revolution” refers to the application of
these digital technologies in traditional manufacturing industries, transitioning them
to a digital paradigm. In short, the Fourth Industrial Revolution represents digital
transformation at the manufacturing level.

8.2 Digital Platforms

There are three main types of digital platforms. These include the communication
platform represented by the internet, which was established through the integration
of communications and computing; the OS-centered content platforms like iOS and
Android, established through the system-level integration of communications and
computing; and the various application platforms created by applications built on
top of these content platforms. Collectively, these are referred to as digital platforms.
Various types of content and services are provided to users through these three plat-
forms. Specifically, web browsing, file transfer, remote computer access, and email
are directly offered on the internet communication platform; various applications
are provided on the OS content platforms; and other services like social media,
e-commerce, cloud computing, and content sharing services are offered through
their respective specialized application platforms. Leading platform companies like
Google, Amazon, and Meta (Facebook) operate various kinds of application plat-
forms, and Apple and Google each operate the OS-centered content platforms App
Store and Play Store, respectively.

These digital platforms are the symbols of the digital age and the pioneers leading
it. They are at the heart of digital age civilization, providing various forms of commu-
nication and connection, and have accelerated the world toward “hyperconnectivity.”
Digital platforms represent a new digital industry that did not exist in the industrial
age, emerging alongside the Digital Revolution. Founders of application platform
companies quickly recognized the signs of digital change, secured their territories,
and grew their businesses. The launch of Steve Jobs’ Apple iPhone-iOS-App Store
is a prime example. Platform companies have provided mankind with various types
of services, receiving active interest and love from users, and generating immense
wealth. They pioneered and utilized digital technologies ahead of others, growing
their businesses and rapidly expanding into natural monopolies through network
externality effects, thereby establishing a firm position in the global top 10 by market
capitalization.
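
The network externality effect mentioned above is often approximated by Metcalfe's law, under which the potential value of a network grows roughly with the square of its user count. A toy calculation (the user counts are invented) shows why a 10x lead in users can translate into a roughly 100x lead in connectivity:

```python
# Toy Metcalfe's-law comparison: the number of distinct user-to-user
# connections grows quadratically, so a larger platform's lead compounds.

def potential_links(n_users: int) -> int:
    """Distinct user-to-user connections possible among n users: n*(n-1)/2."""
    return n_users * (n_users - 1) // 2

big, small = 1_000_000, 100_000  # hypothetical platforms, 10x apart in users
ratio = potential_links(big) / potential_links(small)
print(f"{ratio:.1f}x more potential connections")
```

This quadratic gap is one reason platform markets tip toward natural monopoly: each additional user makes the larger network disproportionately more attractive to join.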
A prime example of a platform hosting application platforms is Apple’s content
platform centered around iOS. With the release of the iPhone equipped with iOS
and the simultaneous launch of the App Store operating on iOS, Apple opened the
door for numerous applications to be hosted and various application platforms to be
launched for the first time. This catalyzed the ICT Big Bang and paved the way for
the Digital Revolution. Did Steve Jobs predict such explosive changes and release the
iPhone-iOS-App Store combination? Exploring the background provides an affirma-
tive answer to this question. Firstly, Jobs recognized that the user (i.e., the customer)
is the ultimate point of business outcomes. He understood early on that the essence
of all business lies in satisfying the needs of users, which enabled the iPhone to
feature exceptional UX/UI. Secondly, he understood market dynamics well, recog-
nizing the need for an open market where service providers and users could transact
directly. Thirdly, his experience with iTunes informed him of the importance of
content and foresaw that future communications networks would serve as distribu-
tion channels for such content. Fourthly, his experience developing the Macintosh
PC acquainted him with the importance of the OS, leading him to insist from the start
that the team developing the iPhone use iOS. Fifthly, he understood the importance
of allies in business, knowing that forming a mutually beneficial eco-cluster would
be essential, and he strived to create a business model that shared benefits. Jobs
developed ambitious products and services based on his sharp understanding of the
situation, boldly challenging the information and communication market. This led to
the ICT Big Bang, overwhelming traditional communications operators entrenched
in business-centered thinking.
The four major platform companies known as ‘Big Tech’—Apple, Google,
Amazon, and Meta—were all founded in the USA. What is the implication of this?
It may be attributed to several factors characteristic of the US environment. The
USA has a culture of challenge, adventure, and pioneering spirit, consistent with its
origins as a nation of immigrants. It guarantees human rights, freedom of thought,
and economic activity, fostering an environment conducive to innovation, technolog-
ical development, and entrepreneurship. The USA also has well-established infras-
tructure for venture investment, technological guidance, and talent education that
effectively supports new innovations, technology development, and startups. Moreover,
a societal atmosphere of respect and tolerance for innovation, invention, and
commercialization encourages patience until efforts bear fruit, provided they do not
cause societal issues or limit market competition.1
However, the era of unrestricted expansion for platform companies is gradually
coming to an end. The US government has been tolerant, allowing Big Tech compa-
nies to grow substantially. Now, the negative impacts of platforms are becoming
societal issues, with concerns about monopolistic practices and competition restric-
tions emerging, leading to various lawsuits. While US legal actions have focused on
competition restrictions, European sanctions are comprehensive. Europe has enacted
the Digital Markets Act (DMA) and the Digital Services Act (DSA) to fundamentally
regulate platform companies. The DMA addresses market monopolies of platform
companies to open market entry for competitors, while the DSA goes further to regu-
late service monopolies of platform companies, securing users’ fundamental rights
and demanding measures against misinformation.
Platform companies, which have operated freely, must now adhere to regulatory
laws, similar to obeying traffic laws while driving in physical spaces. They need to
exercise discretion in collecting and using user information and devise means to filter
false information, aligning their operations with regulations. In this new regulatory
environment, platform companies will strive to find new breakthroughs to continue
generating profits and growing, potentially at the users’ expense.

8.3 Education, Digital Literacy

As discussed earlier, education is one of the areas facing significant challenges due
to digital transformation. The introduction of digital educational tools necessitates
changing teaching methods to enhance the effectiveness of education through new
forms of learning such as interactive learning, personalized learning, and remote
learning. In support of this, students need to be ensured to have equal access to
digital devices and the internet. Therefore, a basic condition for education in the
digital age is to equip all schools with digital educational devices, high-speed internet
connections, and the necessary educational content and tools. Teachers also need to
be familiar with digital technologies and consider developing educational materials
and methods using these technologies as a basic part of their duties.

1 This situation differs from contexts where state control imposes significant limitations on creative
and corporate activities. For example, while countries like China have made considerable invest-
ments in scientific research and technological development, with the goal of becoming a global
leader, the system’s centralized direction and government oversight shape the landscape of research
funding and corporate growth. Despite these advancements, the historical and cultural context can
sometimes limit the potential for innovative and disruptive entrepreneurship. This also contrasts
with regions where legacy regulations or political factors present obstacles to the development of
new enterprises.

A critical element of education in the digital age is digital literacy. Digital transfor-
mation brings about digital divide issues related to socioeconomic status, geographic
location, educational level, and age, and education can provide a starting point to
address these issues. Proper learning with digital educational tools from a young age
can equip students with digital literacy, freeing them from the digital divide as they
enter society. Thus, educating students to achieve digital literacy should be consid-
ered a fundamental element of education in the digital age. Further, if schools can
provide digital education programs to community members, teaching them how to
use digital tools and develop digital literacy, it would contribute to closing the social
digital divide.
In addition to digital literacy, another important quality to cultivate from a young
age is understanding and exercising restraint in the use of social media. Excessive
use of and addiction to social media can lead to various adverse effects, such as
wasted time, social isolation, anxiety, depression, stress, loss of self-esteem due to
comparison with others, information overload, lack of sleep, and even potential harm
to physical health. Moreover, excessively revealing personal information can lead to
privacy violations. Such habits, left unchecked, can also adversely affect social and
professional life after graduation. Therefore, it is essential to teach students to use
social media discerningly, so that mistakes made while growing up do not become
lifelong regrets.
The fundamental challenge digital transformation poses to education is how to
adapt its content in preparation for a future coexisting with digital technology. In
the era of digital and AI technology, ignoring digital capabilities is not an option,
and tasks that can be performed by digital means need not be duplicated by
humans. However, reliance on digital technology for everything can lead to human
incapacity, with the risk of living under the dominion of technology rather than
utilizing it. For instance, while using ChatGPT for writing when necessary can be
beneficial, if students neglect learning how to write themselves, they miss developing
sophisticated expression skills, critical thinking, and creativity. They may also lose
the ability to judge the adequacy of ChatGPT’s writing. This scenario could become
more acute as digital technologies become increasingly intelligent. Therefore, it is
crucial to research what it means to be human and what basic abilities humans should
possess to coexist with future AI and robots. The findings should then be reflected
in education to nurture the fundamental capabilities necessary for living in an era of
coexistence.
In the digital age, computational thinking and coding are recognized as vital
skills,2 and many countries have included them in their educational programs.3

2 Computational thinking refers to a mindset related to defining problems in a way that a computer
can understand and solve. Computational thinking can be divided into abstraction and automation.
Abstraction is the process of structuring and breaking down complex problems into a simplified
state, while automation is the process of translating the abstracted problem into the language of
computers. Coding refers to the process of inputting instructions to a computer in programming
languages that the computer can understand, such as C, Java, and Python.
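The abstraction and automation steps distinguished above can be traced in a small Python sketch. The example problem (finding the most frequent word in a text) and the code are illustrative inventions, not material from the text: the problem is first abstracted into simple sub-steps, which are then automated in code.

```python
from collections import Counter

def most_frequent_word(text: str) -> str:
    """Automation of an abstracted problem, in four sub-steps:
    1. normalize: lowercase the text,
    2. decompose: split it into words,
    3. tally: count how often each word occurs,
    4. select: return the most common word."""
    words = text.lower().split()        # steps 1-2: abstraction made concrete
    counts = Counter(words)             # step 3
    return counts.most_common(1)[0][0]  # step 4

print(most_frequent_word("To be or not to be"))  # -> to
```

Breaking the problem into machine-expressible sub-steps is the abstraction; writing those steps in a language like Python is the automation.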
3 Many countries around the world are incorporating coding and computational thinking into their
elementary school curricula to equip students with basic skills essential in the digital age, such as

Computational thinking involves problem-solving in ways that a computer can understand, which is foundational for effective communication with computers and AI
devices. While logical and analytical thinking, areas where AI can excel, form
the basis of computational thinking, human thinking encompasses a wide range,
including creative, critical, analytical, synthetic, reflective, strategic, and problem-
solving thinking. Among these, humans excel in areas like critical, creative, and
synthetic thinking, especially when tied to emotional, ethical, cultural, and social
factors, for which AI cannot match human capability. Reflective thinking, involving
self-awareness and introspection, represents a distinctly human domain that AI
cannot replicate or express. Furthermore, the capacity to navigate profound emotions,
ethical considerations, self-awareness, and complex social contexts, along with imag-
ination, affection, empathy, creativity, intuition, and consciousness, are uniquely
human traits that AI cannot mimic.
To maintain a leading role in the coexistence with digital and AI devices, it is essen-
tial to research and educate on developing these uniquely human abilities and char-
acteristics. This approach ensures that humans can leverage their unique strengths in
an increasingly digital world, balancing the benefits of digital technology with the
preservation and enhancement of human capacities.

8.4 Misinformation, Media

As previously discussed, one significant societal harm accompanying digital transformation is the rapid spread of false information and fake news. The advent of
social media and individual internet broadcasting in the digital era has led to the
mass production and indiscriminate dissemination of information, among which
false information and fake news can be mixed. In particular, individual internet
broadcasts, unlike public broadcasting media, are not regulated, posing the risk that
they produce information or make baseless claims driven by self-interest, without a
sense of responsibility to discern the truth or the ability to self-regulate against fake news.
When used in political activities, the social harm can be substantial, contaminating
elections with opinion manipulation, mass mobilization, and populist promises, thus
endangering democracy. Therefore, in the digital age, the role of public media in
responsibly handling false information and fake news becomes even more crucial.
The digital era is characterized by an information overload, making the media
more critical as people seek reliable sources to discern accurate information from
the flood of data. In an era overflowing with information and where distinguishing
truth becomes challenging, trust becomes more important than information itself.
People tend to believe the words of those they trust, even when faced with differing

problem-solving abilities, logic, analytical skills, and creativity. The UK was the first to include
coding in its elementary education in 2014, and it has since been included in Australia, Finland,
Estonia, Singapore, France, Canada (in some regions), and China (in some regions); Korea plans to
make it a mandatory part of the curriculum in elementary and middle schools starting from 2025.

opinions from many others. Similarly, when faced with varying reports from many
different media outlets, people will choose to trust the one they have always believed
in. The treasure of the digital age is not information but trust, making the media that
provides reliable, well-considered journalism even more valuable. Just as a treasure
shines even when buried in the earth, true journalism shines all the more amid rampant
false information.
Public media in the digital age must equip themselves with filters to block increasingly
abundant and cunning false information and fake news. With the surge of new
information and news, the media is in a race to report the truth swiftly, necessitating
that media outlets develop their own methods for quick verification of truth.
Although challenging, AI technology could offer solutions, such as developing real-
time truth-verification software utilizing AI filtering. Until such solutions are found,
public media should resist the temptation for sensational or interest-driven reporting,
preferring to delay publication until the truth can be verified. Future legal and regu-
latory measures by governments to block false information and fake news could
reduce the amount of information needing verification. Similarly, if countries enact
laws similar to the EU’s Digital Services Act (DSA), platform operators will lead in
blocking false information, significantly reducing its distribution. By pursuing their
own solutions and maintaining journalistic integrity until then, public media will earn
societal trust, further solidifying their stature.
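Genuine truth verification requires trained AI models, source databases, and human fact-checkers, but the idea of an automated first-pass filter that flags doubtful items for review can be sketched in a few lines. Everything below, including the cues, weights, and scoring, is invented purely for illustration:

```python
# Toy first-pass filter: score items so doubtful ones are routed to
# human fact-checkers before publication. The cues and weights are
# illustrative inventions, not a real newsroom's criteria.
SENSATIONAL_CUES = ("shocking", "you won't believe", "secret", "exposed")

def review_priority(headline: str, has_named_source: bool) -> int:
    """Return 0-3: higher means 'verify before publishing'."""
    text = headline.lower()
    score = sum(cue in text for cue in SENSATIONAL_CUES)  # sensational wording
    if not has_named_source:                              # no attributable source
        score += 1
    return min(score, 3)

print(review_priority("Shocking secret exposed", has_named_source=False))  # -> 3
print(review_priority("Council approves budget", has_named_source=True))   # -> 0
```

A real filter would replace these handwritten cues with a trained classifier and claim-matching against verified sources; the structure of scoring first and routing doubtful items to humans would remain.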
The rapid proliferation of AI technology following the release of ChatGPT poses
another dimension of challenge to society at large, including the media. The opening
of AI source codes will break down the barriers of high investment and long-term
research and development, enabling anyone to develop AI. In addition, the activation
of the GPT Store will likely lead to a wide spread of various customized AI applica-
tions. As a result, false information and fake news could become heavily armed with
such new technologies. AI is automating the creation of fake news, dramatically
increasing web content that mimics realistic articles, spreading false information
about elections, wars, and natural disasters.
However, as seen in the case of deepfakes, the false information generated by AI
is so sophisticated and cunning that it is extremely difficult to discern its falsehood.
If used for manipulating public opinion, inciting the public, or election strategies,
society could plunge into severe chaos. In this situation, the role of the media in
reporting the truth and maintaining journalistic integrity becomes even more critical.
Yet, even the most advanced media equipped with the latest AI technology may
struggle to handle this challenge alone. In order to tackle this situation, strong legal
and regulatory support is necessary to punish the manipulation of false information,
in line with the executive order issued by the US government to prevent misuse of AI
technology and the code of conduct for AI companies established by the G7 nations.

8.5 Personal Information, Personal Competence

The era of digital transformation brings various new technological benefits along
with issues of information protection and management, especially personal data
protection. It is akin to paying a fee to enjoy the numerous benefits brought by
digital technology. Since everything generated and processed by digital technology
is data, and information is derived from processing this data, the use of information
invariably entails information management issues.4 When using search platforms or
online marketplaces, problems of personal data leakage arise, and the same occurs
with social media platforms. This personal data becomes valuable business assets
for platform operators and can be used for targeted advertising and other purposes.
If a smartphone is lost and falls into the hands of someone with malicious intent,
not only the personal information of the smartphone owner but also the contents of
messages, chats, and emails can be exposed, causing harm not only to the owner but
also to their friends who have communicated with them. When digital technology is
used in education and learning, educational content platforms collect students’ infor-
mation, and if this information is leaked, it can cause psychological stress to growing
students. If politicians or political groups with bad intentions use facial recognition
technology and digital currency to monitor citizens and collect information on their
movements and financial transactions, it can result in severe oppression and stress
for individuals.
The leakage of personal information can lead to not only mental stress but also
physical and financial harm. Therefore, information management and personal data
protection become critical social issues during the digital transition period. At the
individual level, it is essential to handle platform services connected through the
internet with caution and restraint, taking into account the potential for information
leakage in advance. This careful and moderate approach is a fundamental attitude
needed in the digital transition era.
In the digital age, digital literacy is a fundamental personal competency. It involves
understanding digital technologies and the ability to find, utilize, and communicate
information using digital tools. To develop and maintain digital literacy, it is advisable
to make a habit of learning the purpose and operation of new digital technologies
and tools as they emerge. In addition, an essential skill is the ability to discern
information. This includes the capability to select the information you need from the
vast amount of data circulated through various digital media, including social media,
and to distinguish between accurate and inaccurate information. However, this may
require a high level of expertise and extensive research. Therefore, it is important

4 Data is unprocessed facts or numbers, while information is content that has been processed,
organized, and structured to give it meaning. For example, temperatures represented by numbers
such as 28°, 26°, 30°, 27°, 29° are temperature data. By processing and assigning meaning to this
data, information such as ‘the average temperature is 28°, which is suitable for outdoor activities’
can be extracted.
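The footnote's distinction can be traced in a few lines of Python. The readings and the interpretation of 28° come directly from the footnote's example; the code itself is an illustrative sketch:

```python
# Data: unprocessed temperature readings (raw numbers).
readings = [28, 26, 30, 27, 29]

# Processing: aggregate the raw numbers to give them meaning.
average = sum(readings) / len(readings)   # 140 / 5 = 28.0

# Information: the processed result, interpreted for a decision
# (the footnote's example deems 28 degrees suitable for outdoor activities).
message = (f"The average temperature is {average:.0f}°, "
           f"which is suitable for outdoor activities.")
print(message)
```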

to learn in advance how to obtain the necessary information and choose the correct
information.5
In the digital age, with many socioeconomic activities being conducted via the
internet, it is necessary to understand and use various digital platform services such
as social media and search engines discerningly. It is important to understand the
potential social impact of one's posts on social media, how platform companies
collect and use the information those posts reveal, and how information one considers
confidential can be leaked and create difficult situations if it circulates on the
internet. Moreover, when using search engines, it is necessary to be
aware that the content one retrieves can be limited by the filter bubble phenomenon.
Similarly, when using social networks, it is important to understand that the echo
chamber effect can lead to confirmation bias. By understanding these facts, one can
act discerningly when posting messages on social media and develop the habit of
critically accepting the messages received from search engines.
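The narrowing effect of the filter bubble can be made tangible with a toy simulation. The topics, the 80/20 recommendation split, and the assumption that the user clicks whatever is shown are all invented for illustration:

```python
import random

# Toy model of a filter bubble: a recommender that favors whatever the
# user clicked before ends up showing mostly one topic.
TOPICS = ["politics", "sports", "science", "culture"]

def simulate_feed(first_click: str, rounds: int = 1000, seed: int = 0) -> dict:
    rng = random.Random(seed)
    clicks = {t: 0 for t in TOPICS}
    clicks[first_click] = 1
    shown = {t: 0 for t in TOPICS}
    for _ in range(rounds):
        favorite = max(clicks, key=clicks.get)
        # 80% of the time, recommend the user's current favorite topic.
        topic = favorite if rng.random() < 0.8 else rng.choice(TOPICS)
        shown[topic] += 1
        clicks[topic] += 1   # assume the user clicks whatever is shown
    return shown

feed = simulate_feed("politics")
print(max(feed, key=feed.get))  # the single first click dominates the feed
```

After one initial click, the feed is dominated by that topic, which is the filter bubble in miniature; the echo chamber arises analogously when the "recommender" is one's like-minded network.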
Especially when using digital devices connected to the internet or other networks,
it is necessary to make careful behavior habitual, with “behavior” here referring to
the data sent out through connected digital devices. Platforms can collect all such
data to create profiles and analyze them, thereby learning all information about
one’s behaviors, such as interests, personal networks, consumption habits, movement
patterns, search patterns, political leanings, and religious preferences. When using
social media, conducting online transactions, or searching for information, it is
necessary to be mindful that the data one inputs can be collected and analyzed by
the platform, resulting in targeted advertising or unforeseen outcomes. Moreover, in
the digital space, it is important to be aware that all the data one inputs could be
stored somewhere indefinitely and never completely deleted. This includes posts
made in one's youth or in error, which, if found and spread by someone, could lead to
embarrassing situations and sometimes cause very serious problems. Vigilance and
cautious online engagement are indispensable in the digital transformation era.
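How a platform can turn such outgoing data into a behavioral profile can be sketched with a toy example. The events, categories, and matching rules below are invented for illustration; real platforms combine far richer signals (location, contacts, dwell time) with machine learning:

```python
from collections import Counter

# Toy illustration of profile building from behavioral data.
# The events and keyword rules are invented; real systems use ML models.
events = [
    ("search", "running shoes"), ("purchase", "running shoes"),
    ("search", "marathon training"), ("like", "fitness video"),
    ("search", "hotel in Busan"),
]

def build_profile(events):
    """Aggregate raw events into an inferred-interest profile."""
    interests = Counter()
    for action, item in events:
        if any(word in item for word in ("running", "marathon", "fitness")):
            interests["sports/fitness"] += 1
        if "hotel" in item:
            interests["travel"] += 1
    return interests.most_common()

print(build_profile(events))  # -> [('sports/fitness', 4), ('travel', 1)]
```

Even this crude aggregation already ranks the user's interests, which is why seemingly trivial clicks and searches are valuable to platform operators for targeted advertising.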

8.6 Role of Government

In the era of digital transformation, the role of the government is substantial and
critical. Although businesses and various sectors of society will pursue digital
transformation on their own, they need legal and institutional support in many areas,
along with financial and systemic backing. At the national level, the
success of digital transformation is determined by how faithfully the government
plays its role. Therefore, it is a prerequisite for government officials to understand

5 Heather Kelly introduced eight precautions in her Washington Post column “How to avoid falling
for misinformation, AI images on social media” on October 9, 2023: (1) Know why something
might be misinformation. (2) Slow down while reading and watching. (3) Check the source; don’t
always trust “verified” accounts. (4) Make a collection of trusted sources. (5) Seek out additional
context about news events. (6) Use these tricks to spot AI images. (7) Vet videos and real images,
too. (8) Use fact-checking sites and tools.

digital transformation ahead of others and to be knowledgeable about what roles the
government should play for a successful digital transformation.
First of all, the sector demanding attention in digital transformation is education,
which requires significant budgets to build digital infrastructure such as digital
educational tools and high-speed internet, and to purchase and update various
applications and services. In addition, budgets are required for the development of various
educational programs to offer new forms of learning, such as interactive learning,
personalized learning, and remote learning. The government needs to support schools
to secure the necessary resources for digital transformation. If the digital infrastructure
built for students' education could be extended to the digital education of community
residents, it would contribute to reducing the social digital divide. Beyond such
financial support, improving educational programs is just as important as building
digital education infrastructure. The content of education needs to change in
preparation for a future where humans coexist with digital technology. It would be
beneficial if the government invested in in-depth research to support this, appointing
expert groups and helping schools reflect the results in the curriculum.
In the digital age, understanding digital technology and using digital devices is
fundamental, and being excluded or left behind leads to a digital divide, which in turn
becomes a constraint on socioeconomic activities. In particular, generations that
predate digitalization struggle to properly use the proliferating digital tools in banks,
public institutions, ticket offices, restaurants, and elsewhere. Without reducing the
digital divide between generations, social equality cannot be achieved in the digital
age. The government must actively work to eliminate the digital divide. Initially,
institutional backing is necessary to ensure that all public institutions and stores
provide at least one counter for face-to-face services, so that the rapidly increasing
elderly population can be freed from the stress of digital devices.6 Furthermore, high-
speed internet infrastructure must be built in all residential areas, including rural and
mountainous regions. Subsidies are needed so that low-income groups can access the
internet affordably, and support is also needed so that individuals without digital
devices can purchase or use them at an affordable price.
Installing public digital centers where internet and digital devices can be used for free
could be a viable option. Moreover, digital education programs should be developed
to improve the digital literacy of the entire population.
As we enter the era of digital transformation, the ability to handle digital tech-
nology has become essential in various professions, and with the evolution of digital
technology and changes in the job environment, there has arisen a need to enhance
individual digital skills. Furthermore, to transition to newly emerging digital jobs, one
must possess advanced digital skills. Therefore, re-education and lifelong learning are
essential to maintain competitiveness in jobs of the digital age. In addition, as digital

6 Institutionalizing face-to-face service counters in private businesses may be considered excessive
government intervention, but solving social problems during the digital transition is also the
government's responsibility, and the measure can be justified as a temporary system. Even during the digital
transition, if AI services can be developed well enough to replace face-to-face services, installing
them could be a viable alternative.

technologies such as factory automation and AI can replace human jobs, education
for digital job transition has become necessary to address job loss caused by these
technologies. In response to these contemporary trends, the government may take
on a role, which will differ from country to country, in exploring various measures
to provide re-education, lifelong learning, and job-transition education programs for
those wishing to enhance or change their careers.
The government also needs to actively address the issue of misinformation and
fake news, which have emerged as problems in the digital age. While ensuring
freedom of expression on social media, which is often the source of the problem, there
is a need to establish a culture where individuals are responsible for the consequences
of their expressions. In particular, personal internet broadcasting often spreads infor-
mation indiscriminately without adhering to the basic ethics followed by all public
broadcasting, causing social controversy and potentially influencing elections. There-
fore, personal internet broadcasts should also be held accountable for errors and false-
hoods to the same extent as public broadcasting.7 However, since personal internet
broadcasts that do not require a permit cannot be sanctioned through permit revoca-
tion, separate laws and systems need to be established to deal with misinformation
and fake news.
Another concern in the digital age is the protection of personal information. The
spread of digital technology has increased concerns about the collection, storage,
and use of personal information. Digital platforms, while providing various services,
collect user information, which can lead to significant harm if the collection is exces-
sive or the information is leaked or misused. The leakage of sensitive personal infor-
mation can lead to privacy invasion, financial loss, and unauthorized use of credit
cards. Furthermore, it could be used for fraudulent transactions, legal issues arising
from identity theft, and damage to an individual’s credit and reputation. In addition,
as the digital society is interconnected through networks, the risk of cyber-attacks
and data breaches has increased, leading to potential damage to critical infrastructure,
financial loss, and exposure of sensitive information. Therefore, it is necessary
for governments and legislatures to legislate for the protection of personal information,
data security, and consumer rights, and to establish or strengthen penalty regulations.
Similar to the EU’s Digital Services Act (DSA), it may also be worth considering
the legalization of restrictions on the collection, use, and management of personal
information by platform operators.
A particular area of concern during the digital transformation era is the small and
medium-sized enterprises (SMEs). Digital transformation requires heavy investment
for installing digital technology and hiring digital experts, which may be burdensome
for SMEs in traditional industries as they lack the financial resources to proceed. Since
digital transformation ultimately relates to the sustainability of businesses, it is advis-
able for the government to find some ways to help SMEs manage digital transfor-
mation successfully, thereby maintaining the jobs they offer. In general, government

7 In line with this, considering the role opinion polls can play in distorting public opinion and critically
affecting elections, unregistered polling organizations should be banned from operating or subjected
to regulations comparable to those for registered polling organizations.

intervention in businesses is not desirable, but SMEs during the digital transforma-
tion era are an exception, as the challenges they face are due to changes in the era,
not due to their own inefficiencies or incompetence.

8.7 Era of AI, Age of AI Robots

The AI era is widely regarded as the natural successor to the digital age. However,
the shift from a digital society to an AI society differs fundamentally from the earlier
transition from industrial to digital societies. While the digital age marked a paradigm
shift, the AI era represents an evolution—a continuation and maturation of digital
technologies. AI, initially one among many digital tools, is now poised to become the
central axis of digital transformation. The launch of ChatGPT-3.5 in 2022 marked
the beginning of this shift, signaling the onset of the AI era as a defining force of the
digital-AI age.
The emergence of ChatGPT elevated AI to new levels of prominence. AI, having
developed through incremental advancements, reached a watershed moment with
ChatGPT’s ability to engage in natural language conversations with humans. The
initial input method, text, is rapidly expanding to voice and video inputs, making AI
interaction more intuitive and seamless. OpenAI, the company behind ChatGPT, has
been hailed as a “game-changer”, with its valuation skyrocketing. The emergence of
ChatGPT spurred AI research and development across industries, forcing platform
operators to accelerate their AI initiatives.
As AI progresses, concerns about AI accountability have become central to
public discourse. Initiatives like the Montreal Declaration for Responsible AI (2018)
emphasized the importance of ethical, transparent, and accountable AI develop-
ment. Core principles include safety, fairness, transparency, and privacy protection,
essential to ensuring trust in AI systems as they become integral to production and
services. Following ChatGPT’s launch, these concerns intensified. Governments,
industry leaders, and academics have expressed apprehensions about AI’s impact on
employment, education, and national security.
In response, legislative bodies have taken action. The US government held public
hearings to explore measures for responsible AI development, while OpenAI’s CEO,
Sam Altman, advocated for regulation at US Senate hearings. In 2023, President
Biden signed an executive order aimed at mitigating AI-related risks. Internationally,
the G7 established an AI code of conduct, promoting responsible AI use, personal
data protection, and the labeling of AI-generated content. The EU’s AI Act, enacted in
2024, represents the first comprehensive legislative effort to regulate AI development
and ensure its responsible use.
Meanwhile, OpenAI continued its ambitious developments. After the release of
ChatGPT-3.5 in November 2022, the company launched GPT-4 in March 2023,
followed by the unveiling of the GPT Store in January 2024 and GPT-4o in May
2024. Further, it released the o1 model in late 2024, a completely new model with
enhanced reasoning capabilities. The introduction of the GPT Store parallels the

impact of Apple’s iPhone and App Store, signaling the dawn of a new AI platform
economy. Just as the App Store revolutionized the digital platform era, the GPT Store
may usher in the AI era, offering a diverse range of customized GPT applications.
What might be the third revolution following the Industrial Revolution and the
Digital Revolution? If the Industrial Revolution ushered in the “First Machine Age”
of industrial society, and the Digital Revolution opened the “Second Machine Age”
of digital society, what could the “Third Machine Age” introduced by the third
revolution be?8 It could well be conceptualized not merely as the “AI Age” but more
specifically as the “Age of AI Robots.” As discussed earlier, while AI technology is
part of the digital technology suite and the AI era is an extension of the digital age,
the Age of AI Robots represents a shift to a different dimension. In the future, as
AI develops and surpasses the singularity point, it will evolve into a super-AI that
exceeds human intelligence, and humanoid robots will develop correspondingly to
surpass human physical capabilities. When these two merge into one entity as ‘AI
robots,’ it will signify the birth of a superhuman, and the newly opened ‘Age of AI
Robots’ will truly be the ‘Third Machine Age.’ This will bring about a paradigm
shift that surpasses the transition from industrial society to digital society.
The “Age of AI Robots” symbolizes the advent of superhuman entities and signi-
fies the opening of a truly “Third Machine Age.” The transition from industrial to
digital societies will be overshadowed by a paradigm shift that promises to fundamen-
tally transform human life. This transformation is not a distant future but could occur
within the next two or three decades. It is humanity itself driving this change, and with
accelerating AI research and development competition, this timeline may be further
shortened. The post-ChatGPT era has seen platform companies openly competing by
releasing AI models (Meta with Llama 2, Google with Gemini, and others), and
the 2024 CES offered testimony to AI's central role across numerous innovative
products and services. The open-source movement accelerates AI's advancement,
lowering barriers so that generative AI development is not confined to entities with
substantial investment capabilities.9 However, this accessibility also harbors the risk
of misuse, such as the production of deepfakes and the spread of misinformation by
criminal organizations. The actions taken by AI experts in declaring principles for
AI research and development, proposing regulations for AI technology, the admin-
istrative orders and codes of conduct implemented by the USA and G7, and the AI
Act enacted by the EU underscore the critical need for oversight and responsible
development in such fast-evolving AI landscape.
The year 2022 may be remembered as a significant turning point in the history of
digital development. In July 2022, the EU enacted the Digital Markets Act (DMA)
and the Digital Services Act (DSA). In November 2022, ChatGPT-3.5 was released.
The enactment of the DMA and DSA signals that the digital age has peaked, while

8 The terms “First Machine Age” and “Second Machine Age” are used to refer to the Industrial Age
and the Digital Age, respectively, by Erik Brynjolfsson and Andrew McAfee. See their book “The
Second Machine Age”.
9 One year after OpenAI launched ChatGPT, Meta, together with IBM, formed the ‘AI Alliance,’
bringing together over 50 AI-related companies and institutions. This is an open-source alliance for
sharing AI technology for free.

the release of ChatGPT-3.5 signals the beginning of the AI era. Consequently, digital
platform companies began to shift their focus to AI research and development. There-
fore, 2022 will be recorded as the year when the digital age began to pass the baton
to the AI era. However, the advent of the AI era does not signify the end of the digital
age but rather its maturation, marking the progress into the combined digital-AI era.
Bibliography

1. A.V. Oppenheim, R.W. Schafer, Digital Signal Processing (Pearson, 1975)


2. A.V. Oppenheim, R.W. Schafer, J.R. Buck, Discrete-Time Signal Processing, 2nd edn.
(Prentice-Hall, 1999)
3. Bell Laboratories, A Brief History of Engineering Science in the Bell System, Switching
Technology (Bell Laboratories, 1982)
4. Bell Laboratories, A Brief History of Engineering Science in the Bell System, Transmission
Technology (Bell Laboratories, 1985)
5. Bell Laboratories, Transmission Systems for Communications, 5th edn. (Bell Laboratories,
1982)
6. Bell Laboratories, Engineering and Operations in the Bell Systems, 2nd edn. (Bell Laboratories,
1983)
7. Bell Laboratories, A Brief History of Engineering Science in the Bell System, Communication
Sciences (Bell Laboratories, 1984)
8. B.G. Lee, M. Kang, J. Lee, Broadband Telecommunications Technology, 2nd edn. (Artech
House, 1996)
9. B.G. Lee, W.J. Kim, Integrated Broadband Networks, IP, ATM and Optics (Artech House,
2002)
10. B.G. Lee, S. Choi, Mobile WiMAX & WiFi: Broadband Wireless Access and Local Networks
(Artech House, 2008)
11. D. Choi, Manage with Digital Factory (Huckleberry Books, 2019) (in Korean)
12. IEEE Communications Society, A Brief History of Communications (IEEE, 2012)
13. T.S. Rappaport, Wireless Communications: Principles and Practice (Prentice Hall, 2002)
14. J. Schiller, Mobile Communications, 2nd edn. (Pearson, 2003)
15. S.C. Yang, OFDMA System Analysis and Design (Artech House, 2010)
16. A. Osseiran, J.F. Monserrat, P. Marsch, 5G Mobile and Wireless Communications Technology
(Cambridge University Press, 2016)
17. G.L. Stüber, Principles of Mobile Communication (Springer, 2017)
18. D. Ince, The Computer: A Very Short Introduction (Oxford University Press, 2011)
19. M. Campbell-Kelly, W. Aspray, N. Ensmenger, J.R. Yost, Computer: A History of the
Information Machine (Westview Press, 2013)
20. N. Dale, J. Lewis, Computer Science Illuminated (Jones & Bartlett Learning, 2020)
21. J. Meyers, A Brief History of the Computer (BC–1993 AD) (2017) (online), http://www.jeremymeyers.com/comp
22. Computer History Museum, Timeline of Computer History (2017). http://www.computerhistory.org/timeline/computers/
23. A. Blum, Tubes: A Journey to the Center of the Internet (Ecco, 2012)

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Singapore Pte Ltd. 2025
B. G. Lee, Understanding the Digital and AI Transformation,
https://doi.org/10.1007/978-981-96-0033-5

24. Internet Society, Brief History of the Internet (2017). http://www.internetsociety.org/internet/what-internet/history-internet/brief-history-internet
25. J. Gleick, The Information: A History, A Theory, A Flood (Pantheon Books, 2011)
26. I. Goodfellow, Y. Bengio, A. Courville, Deep Learning (MIT Press, 2016)
27. H.A. Kissinger, E. Schmidt, D. Huttenlocher, The Age of AI: And Our Human Future (Little,
Brown and Company, 2021)
28. C. Metz, Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World
(Dutton, 2021)
29. K. Schwab, The Fourth Industrial Revolution (World Economic Forum, 2016)
30. T. Saldanha, Why Digital Transformations Fail: The Surprising Disciplines of How to Take
Off and Stay Ahead (Berrett-Koehler Publishers, 2019)
31. J.R. Highsmith, L. Luu, D. Robinson, EDGE: Value-Driven Digital Transformation (Addison-
Wesley Professional, 2019)
32. T.M. Siebel, Digital Transformation: Survive and Thrive in an Era of Mass Extinction (Rosetta
Books, 2019)
33. B. Marr, Tech Trends in Practice: The 25 Technologies that are Driving the 4th Industrial
Revolution (Wiley, 2020)
34. E. Brynjolfsson, A. McAfee, The Second Machine Age: Work, Progress, and Prosperity in a
Time of Brilliant Technologies (Norton & Company, 2014)
35. A. McAfee, E. Brynjolfsson, Machine, Platform, Crowd (W.W. Norton, 2017)
36. S. Galloway, The Four: The Hidden DNA of Amazon, Apple, Facebook, and Google (Random
House, 2017)
37. A. McAfee, More from Less (Scribner, 2020)
38. N. Shadbolt, R. Hampson, The Digital Ape: How to Live (in Peace) with Smart Machines
(Oxford University Press, 2019)
39. W. Isaacson, The Innovators: How a Group of Hackers, Geniuses, and Geeks Created the
Digital Revolution (Simon & Schuster, 2014)
40. S. Galloway, Post Corona: From Crisis to Opportunity (Penguin Random House, 2020)
41. S. Zuboff, The Age of Surveillance Capitalism: The Fight for a Human Future at the New
Frontier of Power (Profile, 2019)
42. S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 4th US edn. (Pearson, 2022)
