Understanding the Digital and AI Transformation
Byeong Gi Lee
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2025
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Human society is now transitioning into a digital society. Just as the Industrial Revo-
lution led to a shift from agrarian society to industrial society, the Digital Revolution
is driving the transformation from industrial society to digital society. The digital
society has begun to sprint towards a digital and artificial intelligence (AI) society.
As human society transitions into a digital society, everything is changing to a
new paradigm. The digital transformation began in industry, fundamentally changing
corporations. It went beyond merely adopting digital technologies in the manufac-
turing sector, transforming business models, operations, organizational structures,
and decision-making processes, thereby innovating businesses. The digital transfor-
mation is spilling over into society, broadly changing human social activities. From
politics and economics to social activities, and even the daily lives of individuals, a
comprehensive change is occurring in digital conception and digital methodology.
Transitioning to a digital society does not mean the disappearance of existing
industries, just as agriculture did not vanish with the transition to an industrial society.
It is simply a shift to a new paradigm. Just as farming transitioned from plows to
tractors in the industrial society, in the digital society, this will evolve into farming
with autonomous driving tractors utilizing digital technology.
The start of human society’s transformation to a digital society occurred approx-
imately 50 years after the first emergence of digital concepts. The kickoff for digital
transformation was the digital conversion started in the 1960s to realize the dream
of long-distance communication. Converting analog signals to digital not only made
long-distance communication possible but also enabled the comprehensive processing of
voice and video signals together with computer data. This provided the impetus for the conver-
gence of communications and computing. First, communications and computers
converged at the level of communication, establishing an internet-based communi-
cation platform. Subsequently, convergence occurred at the system level, along with
the emergence of smartphones, establishing a content platform based on operating
systems (OS). On this platform, open application marketplaces opened, attracting a
flood of applications, which in turn spurred the rapid growth of various application
platforms for search, social media, online commerce, and content sharing. This led
to the “ICT Big Bang,” marking the start of the Digital Revolution sweeping across
industries and society, propelling the digital transformation. This digital transforma-
tion is accelerating towards a digital and AI transformation with the emergence of
AI technologies like ChatGPT.
Digital technologies at the heart of digital transformation are greatly impacting
human life today. Digital technology is bringing about changes in communication
methods, access and distribution of information, work and learning methods, and
lifestyles. Furthermore, the spearhead of digital technology, AI, is emerging and
starting to add new momentum to these changes. These changes are extending beyond
individual lifestyles to corporate activities and government operations, significantly
altering the way human society operates. Moreover, while there are positive changes
such as improved accessibility to information and services, enhanced social connec-
tivity, and industrial innovation, various issues are also emerging, including digital
divides, digital illiteracy, job loss, misinformation, fake news, and privacy concerns.
What is the wise way to live through this era of digital and AI transformation? It
is essential to understand the direction of changes of the times and act socially and
personally in alignment with the trends. In order to do so, one must first understand
what digital technologies are, what their characteristics are, and how they function. In
addition, it is necessary to comprehend digital platforms, the services they provide,
and the reciprocal obligations implicitly attached to these services. Also crucial is to
understand AI technology, how digital and AI technologies transform industries
and society, and what benefits and problems these changes entail.
This book is intended to help find solutions to such problems. By reading and
contemplating on this book, the reader will gain an understanding of the essence of
digital and AI transformation and find ways to live and act wisely in the digital age.
I am grateful to everyone who helped in the process of writing this book. My
thanks go to Profs. Bahk Saewoong, Shim Byunghyo, and Moon Byung-Ro of Seoul
National University, Profs. Kang Chung Gu and Lee Inkyu of Korea University, Prof.
Steven Whang of KAIST, Professor Emeritus Noh Seok Kyun of Yeungnam Univer-
sity, Executive Vice President Choi Sunghyun of Samsung Electronics, former Pres-
ident Choi DooWhan of POSCO DX, and Vice President Doh Youngjin of Hyundai
Motor Company. I also thank the Academy of Science of the Republic of Korea for their
support in writing this book.
We often feel as though we are living in unfamiliar territory in our daily lives. The familiar land-
scapes of the past are continually disappearing, replaced by unfamiliar digital devices
and tools. The life pattern of the industrial age has been gradually changing to that
of the digital age. Everyone carries a smartphone and depends on it for everything.
Beyond just making calls and sending texts, we enter chat rooms, search for informa-
tion, read newspapers, listen to music, watch videos, take photos, make notes, check
appointments, and find our way. Smartphones have expanded the range of tasks we
can perform independently, while simultaneously reducing what we can do without
them. In the midst of becoming accustomed to smartphones, the symbol of the digital
age, we have unwittingly stepped into the digital age.
Human society, after maintaining a hunter-gatherer lifestyle for ages, transitioned
to an agrarian society and then into an industrial society with the onset of the Indus-
trial Revolution in the nineteenth century. Utilizing buried carbon resources to create
power and applying spinning wheel technology that began in medieval monasteries,
humans produced power beyond the limits of human muscle. The Industrial Revo-
lution, sparked by the invention of the steam engine in 1814, shaped the industrial
society through the nineteenth and twentieth centuries, developing various technolo-
gies and ultimately blooming into today’s prosperous material civilization. On this
foundation, the electronic civilization took the baton, advancing communication,
computers, and semiconductors, laying the groundwork for the Digital Revolution.
The development and convergence of communications and computers, supported by
semiconductors, intensified the momentum of the Digital Revolution. This conver-
gence peaked with the integration of communication–computer devices (i.e., smart-
phone) and the ignition of the ‘ICT Big Bang’ with the App Store, triggering the
Digital Revolution and initiating the digital transformation of the industrial society.
We are living in an era of digital transformation. Digital kiosks replacing staff in
restaurants, online book orders that allow purchasing books even as physical stores
disappear, internet banking for transferring money at midnight, queue apps intro-
duced in local clinics, search engines for finding any information, and social media
networks for mingling with friends—all these represent the changes of the digital
transformation era. The smartphone, an essential item we carry and use daily, is an
emblematic device of digital transformation technology that integrates voice, video,
data, computer, and communication services. The internet and social media we use
every day are means of social connection and communication brought by digital trans-
formation. The frequently mentioned ‘Fourth Industrial Revolution’ is none other
than the digital transformation of the manufacturing industry. Thus, digital trans-
formation is fundamentally reshaping personal life, social activities, and industrial
production methods, totally transforming human society and industry.
Digital transformation stems from the transformation of the industrial society
caused by the Digital Revolution. Just as the Industrial Revolution brought about a
paradigm shift from agrarian to industrial society, the Digital Revolution brings a
new paradigm shift from industrial to digital society. During the Industrial Revolu-
tion, various production machines, including the steam engine, were the epicenter,
causing a major shift in production methods, transitioning from rural to urban-centric
societies, and changing lifestyles. Similarly, the Digital Revolution is changing indus-
trial activities and spreading to social activities on platforms established through the
vertical convergence of communications and computers, transitioning from an indus-
trial to a digital society. This transformation of the industrial society driven by the
digital paradigm is the essence of digital transformation.
Foundation of Digital Transformation
How did the digital transformation, which exerts such a massive influence, begin?
Was it invented out of necessity by humanity, or did it start as a small change that
spread and led to larger changes? How has digital transformation unfolded and taken
root in human society? What impact does it have on human industry and social
activities, and how does it transform them? These questions need to be carefully
considered and their answers meticulously reviewed. By doing so we will be able to
understand the changes brought about by digital transformation more clearly, identify
and better respond to associated problems, and predict future directions of change.
Amidst this, we can engage in social activities that align with digital transformation
and lead a comfortable life in the digital age without anxiety.
Digital transformation was not deliberately invented to bring convenience to
human life or contribute to productivity enhancement. It was a phenomenon that
naturally appeared during the development of communications and computers. The
starting point of digital transformation was digital conversion. This was the recip-
rocal signal processing action of converting analog signals to digital signals and
vice versa. Digital represented a new world pioneered by communications engi-
neers who dreamed of noise-free long-distance communication, leaving behind the
analog world. They discovered and invented the theories and technologies supporting
digital conversion, implemented it in the form of digital communication, and realized
the dream of long-distance communication. Once analog signals were converted to
digital signals, it became possible to process voice and images along with computer
data, providing the impetus for the convergence of communications and computers.
Communications and computers developed independently, competed, and clashed
mimic various aspects of human cognitive abilities to adapt to and improve situations
and operate autonomously. These tasks require capabilities in learning, reasoning,
problem-solving, perception, and understanding human language.
The term AI was first used in the 1950s, but the implementation process has
progressed slowly. The core elements of AI, including algorithms, machine learning,
and neural networks, have evolved over about 70 years, overcoming various chal-
lenges. Technological breakthroughs have propelled its development, marked by
several challenging events. Technologically, the emergence of recurrent neural
networks (RNNs) and convolutional neural networks (CNNs) effectively processed
sequential data like voice and grid-type data like images, respectively. In addition,
the advent of variational autoencoders (VAEs) and generative adversarial networks
(GANs) opened the door to generative models, and the emergence of the self-attention
mechanism facilitated parallel processing. From an event perspective, milestones in
AI development were set when IBM’s Deep Blue defeated the world chess champion
in 1997, IBM Watson won against human champions in a game show in 2011, and
Google’s AlphaGo beat the world’s top Go player in 2016.
The self-attention mechanism introduced in 2017 was implemented in the trans-
former architecture, revolutionizing natural language processing. The transformer
architecture was adopted in Google’s BERT and OpenAI’s GPT, marking a new
era in natural language processing. GPT, developed since 2018, is a “generative
pre-trained transformer” as its name suggests, breaking through the limits of neural
network size and capabilities with its self-attention mechanism and transformer struc-
ture. GPT evolved into GPT-3, GPT-4, etc., and similarly, BERT evolved into Bard,
Gemini, etc.
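For readers curious about what the self-attention mechanism actually computes, the sketch below shows a single attention step in its simplest form, with toy dimensions and random vectors standing in for token embeddings; real transformers such as GPT add learned projections, multiple attention heads, and many stacked layers on top of this core operation.

```python
import numpy as np

def self_attention(Q, K, V):
    """Minimal single-head scaled dot-product attention: every position looks at every
    other position at once, which is what makes parallel processing of a sequence possible."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # pairwise similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over the attended positions
    return weights @ V                                         # each output is a weighted mix of values

# A toy "sequence" of 4 tokens, each represented by an 8-dimensional random vector.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = self_attention(tokens, tokens, tokens)   # self-attention: Q, K, and V all come from the tokens
print(out.shape)                               # (4, 8): one updated representation per token
```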
ChatGPT-3.5, released in November 2022, is an advanced language model appli-
cation of GPT. It is a generative neural network model built on pre-training and
a transformer architecture, specialized in natural language processing.
ChatGPT inherits the strengths of the transformer architecture, demonstrating excep-
tional abilities in understanding and generating natural language. The name ChatGPT
reflects its ability to engage in dialogue, answer questions, and perform various func-
tions in a chat format. However, ChatGPT also inherits the limitations of the biases
and inaccuracies present in the data used for its training.
The release of ChatGPT-3.5 sparked extraordinary interest in AI. Witnessing AI’s
ability to generate human-like texts and converse with humans, people realized that
AI is not a future concept but a present reality. The launch of ChatGPT spurred
intense competition among ICT and platform providers, rapidly shifting research
and development focus toward AI. Concurrently, the release of ChatGPT served
as a catalyst for raising awareness about the potential risks of AI, sparking public
discussions on AI’s dangers and prompting government policy responses.
Digital and AI Transformation in Industry
The industrial sector was the first to apply digital technology, initiating digital trans-
formation. The reason traditional industries began to focus on digital transformation
was due to competition. If a company does not transition to digital, it risks falling
behind in the competitive race and eventually becoming obsolete. Just as sticking to
plow farming in the industrial age would inevitably lead to being outpaced by tractor
farming, in the digital era, clinging to simple tractor farming means being outcom-
peted by digitally enhanced, autonomous-driving tractors. Therefore, companies
have recognized digital transformation as a timely challenge and have competitively
plunged into it.
As much as digital transformation was a focus for companies, it was narrowly
defined in the past as the act of applying digital technologies to various business
areas to change corporate operations. However, the digital transformation pursued
by companies goes beyond merely adopting digital technologies. It seeks a compre-
hensive change that leverages digital technologies to transform business models,
processes, organizational structures, and decision-making processes, aiming to create
new value. Notably, with the rapid evolution of AI technology, companies are
applying AI significantly to tasks like processing large volumes of data, making
real-time decisions, ensuring precise quality control, and actively responding to
customers, pursuing automation and intelligence. This is known as AI transformation
of industry.
The digital and AI transformation of industries requires substantial investment,
such as purchasing various digital devices, developing software, and hiring digital
experts. It also demands concerted efforts from all management and entails the chal-
lenging task of comprehensively innovating the company. Thus, for a successful
transformation, it is necessary to present clear goals and visions, appoint a trans-
formation leader with full authority, establish a concrete transformation plan, and
form a dedicated organization. Moreover, it is essential to invest in securing digital
devices and technologies, build IT infrastructure, manage and analyze data, collabo-
rate closely with the field, and thoroughly plan and execute performance evaluation
and analysis.
The benefits a company gains from transitioning to digital and AI are diverse. It
can improve efficiency, reduce costs, and shorten market entry time through process
automation and operation optimization. Utilizing digital and AI technology allows
for the provision of personalized services that meet customer needs, offering new
experiences and satisfaction through interaction with customers. Digital and AI trans-
formation can spur innovation, enabling the development of new products, services,
and revenue streams that were unimaginable in the past. By leveraging real-time
data and advanced analytics for data-based decision-making, businesses can optimize
their strategies and quickly respond to market changes. Digital and AI transformation
also enables the optimization of resource use and minimizes environmental impact,
bringing companies closer to sustainability goals. Ultimately, embracing digital
and AI transformation allows for agile response to market changes, maintaining
a competitive edge.
Digital and AI Transformation in Society
Digital technology has permeated into society through the innovative services of
platform companies, significantly changing the way we engage in social activities
and conduct our daily lives. Digital technology has brought about changes in every-
thing from communication methods and access to and distribution of information, to
work and learning methods, and even lifestyle habits. For instance, everyone carries
smartphones that enable communication with others, searching for various informa-
tion, and enjoying news, sports, movies, videos, taking photos, and playing games.
The changes include positive aspects such as improved social connectivity, enhanced
access to information and services, and increased innovation in industries. However,
there are also negative aspects such as digital divides, job losses, and the spread of
fake news. Furthermore, there are risks of misusing digital technology to lead to a
digital surveillance society.
In the digital age, the use of digital devices has increased work efficiency, leading
to a trend of reduced manpower needed for tasks. Especially with the spread of robotic
process automation (RPA), rapid replacement of manual labor jobs has resulted
in significant job losses, and the development of artificial intelligence (AI) is also
reducing office jobs. In addition, job opportunities shrink further as banking oper-
ations transition to fintech built on software and applications, ticketing and ordering
services are replaced with digital kiosks, and face-to-face services give way
to electronic and online operations. The current issue of job shortages experienced
by nearly all countries is partly due to economic recessions, but a more funda-
mental cause is the reduction in the number of jobs due to the advancement of digital
technology. This is a common problem faced by both developed and developing
countries.
Digital and AI transformation is fundamentally changing the way teachers teach
and students learn by integrating digital technology into education. The first important
aspect of education in the digital age is to ensure all students have equal access to
digital devices and internet connections, enhancing the effectiveness of education
through the application of digital and AI technologies. In addition, it is crucial to use
various digital tools in education to improve students’ digital literacy and scientific
literacy. Moreover, education in the digital era needs to be restructured in preparation
for the future where humans coexist with digital and AI technologies. To that end, it
is essential to closely observe the development of AI to understand what it means to
be human in light of AI and explore which human abilities need to be developed for
coexistence with AI.
Entering the digital and AI transformation era, we observe various pathological
phenomena in political and social contexts. It is important to note how social media
platforms, emerging from digital transformation, have radically changed political
activities and social movements. Among the many changes brought about by hyper-
connected social media to the political and social environment, the most shocking are
the collective actions mediated by the internet. Social network services (SNSs) and
internet personal broadcasting through YouTube are platforms that have made this
possible. SNS provides a means for members to express and share their opinions,
while internet personal broadcasting allows individuals to disseminate their views to
an unspecified number of people. These social media platforms enable the formation
of groups that share opinions, and members of these groups can unite in collective
action. Unlike in the past, groups can form and act without the constraints of time or
space.
One phenomenon quietly unfolding in the digital transformation era is the collec-
tion of individual citizens’ and consumers’ information for surveillance purposes.
Collecting information on individuals for national surveillance is an act that occurs in
controlled societies and is not permitted in free democratic societies. However, even
in free democratic countries, individuals’ information is being collected and used,
such as when digital platform companies collect consumer information for targeted
advertising. The former is surveillance by the state, while the latter is surveillance by
businesses. While the former is conducted unilaterally by the state without the consent
of its citizens, the latter is carried out by businesses with the consent of consumers to
a significant extent. If the former violates citizens’ rights through political actions,
the latter is an economic activity conducted with consumers’ understanding. If the
former leads to a digital surveillance society, the latter guides us toward a surveillance
capitalism society.
Challenges of Digital and AI Transformation
As the digital era matures with the progress of digital transformation, the advent
of ChatGPT signals the baton passing to AI, preparing us for the AI era. However,
the transition from a digital society to an AI society differs from the shift from an
industrial to a digital society. While the transition to a digital society corresponds
to a paradigm change, the AI era exists on a continuum with the digital era. AI
emerged as one of the digital technologies and has been in use, gradually increasing
in importance until it eventually becomes the central axis of digital technology.
Therefore, the development of AI signifies that the digital transformation is fully
transitioning into a digital and AI transformation.
Digital transformation has introduced several new challenges to human society.
Digital platform services have raised issues regarding the protection of personal infor-
mation, and search engines and social media have introduced biases and distortions
to information. Especially, digital transformation has brought numerous challenges
to society, such as digital divides, job losses, cyber-attacks and data security, and the
spread of false information and fake news. The digital divide can cause socioeco-
nomic and educational inequalities in the digital age. False information and fake news
can amplify social conflicts, undermine national unity, overturn election results, and
put democracy at risk if misused in politics. The application of digital technology to
remote cameras and facial recognition technology for surveillance purposes raises
the possibility of a surveillance society. In education, there’s the task of redefining
educational content in preparation for the future where humanity coexists with AI
robots and other digital technologies.
Solving these issues has become an essential task for the development of human
society and the improvement of human life. While some can be resolved through
individual efforts, most require systematic solutions through social institutions and
infrastructure. It is necessary to legislate or strengthen existing laws to protect
personal information, data security, and consumer rights. Also necessary is the
development of technologies to counter false information and fake news alongside
establishing punitive regulations, and the enhancement of network security measures
against cyber-attacks. In addition, various measures are also needed to control the
[Figure: structure of the book: Introduction to Digital Transformation; Foundation of Digital Transformation; Digital Platforms; Digital Technologies and Artificial Intelligence Technologies; Applications (Digital Transformation of Industry, Digital Transformation in Society); Challenges of Digital Transformation]
Wired communications evolved into wireless and mobile communications. The foun-
dation of wireless communication was established in the late 1800s, with Maxwell
formulating four equations related to electromagnetic waves in the 1860s, and Hertz
successfully generating and detecting electromagnetic waves in 1888. Based on this,
Marconi invented wireless telegraphy in 1896 and succeeded in communicating
across the Atlantic using electromagnetic waves in 1901. Wireless technology was
initially used for wireless telegraphy, then shifted to broadcasting, leading to the
start of radio broadcasting in 1916 and the first demonstration of black and white
television broadcasting in 1927. Radar was invented in 1935. The first communica-
tions satellite, Echo I, was launched in 1960, followed by the broadcasting satellite
Telstar I in 1962, and INTELSAT I in 1965, ushering in the era of satellite commu-
nications. Mobile communications were first offered by Bell Labs in 1946 but did
not become widespread. The first generation of mobile communications (1G) began
much later, with the analog AMPS system in 1983. The second generation (2G) saw
the commercialization of the TDMA-based GSM system in 1991 and the CDMA-
based IS-95 system in 1996. The third generation (3G) converged both into the
CDMA technology, with the asynchronous WCDMA system starting in 2006 and
the synchronous cdma2000 system in 2007. The fourth generation (4G) used the
OFDMA technology, with the m-WiMAX system commercialized in 2006 and the
LTE system in 2009, but the LTE system took the lead in the market. The fifth generation
(5G) was commercialized in 2018, based on the OFDMA technology.
2.1.3 Computers
When the ENIAC computer was first developed in 1946, it was large enough to fill a
big room.1 The development of computers has progressed to the point where they can
now be held in hand. Technically, vacuum tubes were used in the 1950s, transistors
in the 1960s, and integrated circuits (IC) and large-scale integrated circuits (LSIC)
from the 1970s onwards. Large mainframe computers were first developed, followed
by minicomputers, microcomputers, personal computers (PC), and smartphones,
becoming smaller in size and lower in price. The user base also expanded from experts
to the general public. In the 1950s, only computer experts could use computers; in
the 1960s and 1970s, highly skilled individuals could use them; and from the 1980s
onwards, they became accessible to everyone.
In the 1960s, IBM led the development of mainframe computers, releasing the
IBM 1401 in 1960 and the IBM 360 in 1964. These large mainframe computers were
primarily used for large corporations, military, and space development. From 1965,
1 ENIAC, created by John Mauchly and Presper Eckert at the University of Pennsylvania in 1946,
is considered the precursor of computers. It used 18,000 vacuum tubes and occupied an area
of about 167 square meters, requiring extensive cooling due to the heat it generated.
small minicomputers like the PDP-8 became popular, led by DEC, Data General,
and HP. In the 1970s, several microcomputers were released, including CTC's
Datapoint 2200 in 1970 and HP's HP 9830A in 1972, keeping computers in the realm
of experts until then.
The widespread distribution of computers for general use began in the mid-1970s
when IBM, HP, Apple, Commodore, and Compaq competitively launched personal
computers (PC). In 1973, IBM’s Los Gatos Research Laboratory created the proto-
type of a portable computer called SCAMP, known as the precursor of PCs. IBM
set the standard for PCs using DOS and Windows as operating systems (OS), and
most other companies released products compatible with IBM PCs. However, Apple
competed with IBM by launching PCs with a new concept using OS X as the oper-
ating system. The widespread adoption of PCs began in earnest in 1977 when Steve
Wozniak and Steve Jobs sold the first Apple II PC, equipped with BASIC language,
color graphics, and 4,100 characters of memory, for $1,297. IBM then shifted its
strategy from focusing on mainframes and minicomputers to actively entering the
PC business.
The transition of PCs to smartphones began with the personal digital assistants
(PDAs) of the 1990s. PDAs featured email, calendars, address books, calculators,
web browsing, and fax send/receive capabilities and could be used as cellular phones
when closed. When defining a smartphone as a small computer with a small OS
and a mobile phone, the first commercially successful smartphones were RIM’s
BlackBerry phone in 1999 and Motorola’s A760 handset in 2003. In 2007, Apple
launched the iPhone with iOS, disrupting the communications landscape with the
introduction of the App Store, marking the start of the smartphone era and leading to
the “ICT Big Bang.” In response to the iPhone, Samsung Electronics launched the
Galaxy with Android OS in 2009.
1999. It has since evolved into various versions, significantly increasing transmission
capacity and establishing itself as a means for wireless Internet access. Meanwhile,
in 1990, Tim Berners-Lee developed the World Wide Web (WWW), and Internet
Service Providers (ISPs) began to emerge in the late 1980s. The commercialization
of the Internet accelerated with the launch of Netscape in 1994 and Explorer in 1995,
leading to rapid development. Particularly, the introduction of standardized ADSL in
1995, which opened the door to high-capacity subscriber network communications
after 1998, made the Internet a formidable competitor to traditional communications.
2.1.5 Broadcasting
Broadcasting began in the 1920s, starting with radio and followed by television,
which became practical in the 1930s.2 TV broadcasting evolved from black and white
to color TV in the 1950s.3 Originally, broadcasting modulated programs on waves
to a non-specific audience via terrestrial transmission, but cable television (CATV)
began in the 1950s to target areas with poor reception, becoming widespread in the
1970s.4 In the 1980s, terrestrial broadcasting expanded to satellite broadcasting, and
in the 1990s, it transitioned to digital, providing high-definition television (HDTV)
broadcasting services.5 Alongside, CATV also evolved into digital CATV. The tran-
sition of terrestrial broadcasting to digital required turning off all analog broadcasts
(“analog switchoff”) and simultaneously turning on digital broadcasting (“digital
switchover”).6
Cable television serves as a broadcast method that receives terrestrial or satellite
broadcasts via a communal antenna and provides services to subscribers through
cables, initially using coaxial cables and later fiber optics. Originally, because
antennas were set up high in areas with difficult reception for communal use, it was
also known as “community antenna television (CATV).” Broadcasting transitioned
from transmitting to a non-specific audience to providing subscription services to
cable subscribers.
Initially, CATV primarily offered one-way broadcast services but gradually
improved the cable network to enable two-way services. With the rise of the Internet
in the 1990s, cable modems were attached to provide Internet services alongside
2 The journey started in 1927 when Baird transmitted a signal over 705 km from London to Glasgow
through telephone lines, followed by the first transatlantic television signal between London and
the US in 1928, and a trial television service in Germany in 1929. However, the first commercial
television service was provided by BBC broadcasting in 1936.
3 In the United States, NBC broadcast the Rose Parade in color for the first time on January 1, 1954.
Digital transformation was not initiated by knowing in advance that processing tasks
digitally would be effective for integration of signals and then developing related
technologies. Instead, digital transformation originated from the digital conversion,
which involved converting analog signals to digital signals for long-distance commu-
nication. Once all analog signals were converted to digital, it became possible to
process all types of signals, including data signals, in digital format. This capability
created a synergistic effect, leading to integration and laying the groundwork for
digital transformation. Initially, the analog signals that were converted to digital
were telephone voice signals and later expanded to include images and television
video signals.
9 This scanning process repeats 30 times per second in countries using the NTSC standard, like
the US, Japan, and Korea, while it occurs 25 times per second in countries adopting the PAL or
SECAM standards, based on the electrical supply frequency of 60 Hz and 50 Hz, respectively.
10 For example, standard-definition TVs scanning 720 pixels per line and 480 lines at 30 frames per
second with 8 bits per color per pixel result in a data rate of approximately 248.83 Mbps.
with the pixel density or resolution, with standard-definition TV (SDTV) at 720 × 480
pixels, high-definition TV (HDTV) at 1920 × 1080 pixels, and ultra-high-definition
TV (UHDTV) at 3840 × 2160 pixels. Data signals, originally digital, typically have
lower transmission rates than voice or video signals but can be higher for multimedia
data.
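As a quick check of the SDTV figure quoted in the footnote, the uncompressed bit rate follows directly from resolution, frame rate, and bits per pixel. The short sketch below assumes three 8-bit color components per pixel and, purely for comparison, the same 30 frames per second for all three formats.

```python
# Raw (uncompressed) video bit rate = pixels per frame x bits per pixel x frames per second.
# Assumption: 8 bits for each of three color components, i.e., 24 bits per pixel.
def raw_bit_rate(width, height, frames_per_second, bits_per_pixel=24):
    return width * height * bits_per_pixel * frames_per_second

print(f"SDTV : {raw_bit_rate(720, 480, 30) / 1e6:9.2f} Mbps")    # ~248.83 Mbps, as in the footnote
print(f"HDTV : {raw_bit_rate(1920, 1080, 30) / 1e6:9.2f} Mbps")  # ~1,492.99 Mbps
print(f"UHDTV: {raw_bit_rate(3840, 2160, 30) / 1e6:9.2f} Mbps")  # ~5,971.97 Mbps
```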
How do analog and digital signals differ? Essentially, all sounds in their natural state
can be considered analog signals. Sounds heard by the ears and scenes seen by the
eyes are all analog. Human voices and voices transmitted through telephones are
analog, as are filmed scenes and broadcast television images. Analog signals change
continuously over time, as illustrated in Fig. 2.1. For instance, an analog signal x(t)
changes continuously over time t, with time t having continuous real values and the
analog signal x(t) also having real values.
In contrast, digital signals, as shown in the figure, are discontinuous. They do
not exist at all times but only at specific intervals, that is, at discrete times. Thus,
the time index n takes natural-number values in the figure. The digital signal x[n]
represents the magnitude of the signal at time n, which can be real values or limited
to certain values expressible in binary form. Strictly speaking, the former is called
a discrete-time signal, and the latter a digital signal. Since the time axis is discrete
time in both cases, the terms discrete-time signal and digital signal are sometimes
used interchangeably.
Since the invention of the telephone in the 1870s, traditional telephone commu-
nications have processed and transmitted voice signals in analog form. The system
that processes analog signals is referred to as an analog circuit (see Fig. 2.2). Analog
circuits consist of analog components like resistors (R), inductors (L), capacitors
(C), and operational amplifiers (OpAmp), with actual electronic components and
physical connections.
After the 1960s, as analog signals began to be converted to digital signals and tele-
phone communications shifted from analog to digital, voice signals were converted
The motivation for converting analog signals to digital signals was simply the desire
to overcome the limitations of transmission noise and enable long-distance commu-
nication. With analog signals, accumulated noise could not be removed, making
long-distance communication impossible. The reason was that there was no way
to distinguish between the signal sent by the sender and the noise that interfered
during transmission, making it impossible to remove the noise and restore the orig-
inal transmitted signal. However, when analog signals are converted to digital signals
and transmitted as binary 0’s and 1’s (in actual circuits, for example, 0 and 5 V), the
receiver can consider parts deviating from 0 and 1 as noise and decode the trans-
mission signal to restore the original signal (refer to Fig. 2.3). When this digital
conversion issue was theoretically and practically resolved, digital communication
became a reality, and the dream of long-distance communication was realized.
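A minimal sketch of why this works, using the 0 and 5 V levels mentioned above and an assumed amount of Gaussian channel noise: the receiver only has to decide which nominal level each received sample is closer to, so noise that stays below the decision margin is simply discarded rather than accumulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Transmit 20 bits as 0 V / 5 V levels (the example levels mentioned above).
bits = rng.integers(0, 2, size=20)
tx_voltages = bits * 5.0

# The channel adds noise; a Gaussian with 0.5 V standard deviation is assumed here.
rx_voltages = tx_voltages + rng.normal(0.0, 0.5, size=bits.size)

# The receiver decides which nominal level each received sample is closer to (threshold at 2.5 V).
recovered = (rx_voltages > 2.5).astype(int)

print("sent      :", bits)
print("recovered :", recovered)
print("error-free:", np.array_equal(bits, recovered))
```

Because every repeater along a route can regenerate clean 0's and 1's in this way, noise does not build up over distance as it does in analog transmission.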
How was it possible to convert analog signals to digital signals and then convert
them back to recover the original analog signals? Two processes are necessary to
convert an analog signal to a digital signal. Using Fig. 2.1 as an example, one is
the sampling process related to the x-axis (i.e., the time axis), and the other is the
quantization process related to the y-axis (i.e., the signal magnitude axis). On the
x-axis, sampling involves taking samples at regular intervals from the continuous-
time analog signal to create a discrete-time signal. On the y-axis, quantizing this
discrete-time signal reads it as a digital discrete-time signal.11
The challenge in digital conversion is not just taking samples from the analog
continuous-time signal at regular intervals to obtain a discrete-time signal, but
whether it is possible to restore the original continuous-time signal from that discrete-
time signal. Thus, the crux of the digital conversion problem is how to sample in
the sampling process so that the original analog signal can be restored from the
sampled discrete-time signal. The Nyquist Sampling Theorem solves this problem.
It states that if samples are taken at a frequency more than twice the highest frequency
component contained in the analog signal, the original analog signal can be perfectly
restored from those samples.12 Of course, this theory assumes that no errors occur in
the quantization process and its reverse. This depends on how many bits are allocated
for quantization, and by setting the number of bits sufficiently large, quantization
error can be made negligible. In theory, if the number of bits is infinitely large, the
digital signal becomes identical to the analog signal.
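The claim that quantization error can be made negligible by allocating enough bits is easy to verify numerically. The sketch below applies a simplified uniform quantizer to a 1 kHz sine sampled at 8 kHz, comfortably above its Nyquist rate of 2 kHz, and shows the worst-case error shrinking by half with every additional bit.

```python
import numpy as np

# Discrete-time samples of a bandlimited signal: a 1 kHz sine taken at 8 kHz,
# well above the Nyquist rate of 2 x 1 kHz (amplitude 0.9, full scale 1.0 assumed).
fs, f0 = 8000.0, 1000.0
n = np.arange(64)                           # discrete time index
x = 0.9 * np.sin(2 * np.pi * f0 * n / fs)   # x[n] = x(nT) with T = 1/fs

def quantize(samples, m_bits, full_scale=1.0):
    """Simplified uniform quantizer: snap each sample to the nearest level, spacing 2*full_scale / 2**m_bits."""
    step = 2 * full_scale / 2 ** m_bits
    return np.round(samples / step) * step

for m in (4, 8, 12, 16):
    max_err = np.max(np.abs(x - quantize(x, m)))
    # The worst-case error is bounded by half the level spacing, so it halves with every extra bit.
    print(f"{m:2d} bits -> worst-case quantization error {max_err:.8f}")
```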
As digital conversion (i.e., analog-to-digital and digital-to-analog conversion)
became feasible, active academic research supported carrying out all analog
signal processing digitally. As a result, in the 1960s, digital signal processing theory,
11 ‘Quantization’ here means quantification or digitization, which means expressing a real value as
an m-bit binary digital signal. Quantizing is performed by checking which of the 2m equally spaced
gradations the real value is closest to and then reading that gradation value in binary.
12 Frequency components of a signal can be found by converting it into a Fourier series or applying
a Fourier transform. The highest frequency component in the transformed domain is identified, and
then sampling is done at more than twice that frequency.
[Figure: circuit mode versus packet mode, contrasting a dedicated circuit between endpoints with buffered nodes forwarding packets]
corresponding to analog circuit theory, was established.13 For nearly a century after
the invention of the telephone, circuits were composed, analyzed, filtered, frequency
converted and modulated/demodulated in analog form, but digital signal processing
became independently possible for all these analog signal processing methods.
Components like resistors, inductors, and capacitors that constituted analog circuits
were replaced with digital components performing operations such as delay, addition,
and multiplication, and semiconductor chips performing these operations supported
digital signal processing and communication. As a result, in the 1960s, digital
channel banks, replacing analog channel banks, emerged, thus marking the begin-
ning of digital transmission. With the maturity of digital technology, digital transmis-
sion expanded to wireless, optical, undersea, and satellite communications, and as
switches were replaced with digital switches as well, a fully digital communication
network became possible.
13 See A. V. Oppenheim and R. W. Schafer, Digital Signal Processing, Pearson, 1975, and A. V.
Oppenheim and R. W. Schafer with J. R. Buck, Discrete-Time Signal Processing, 2nd Edition,
Prentice-Hall, 1999.
In a phone call, when the handset is lifted and the receiver’s phone number is
entered via dial or button, the telephone exchange receives that number, finds a path
to connect to the receiver, and sets the switches on all exchanges along that path. This
establishes the circuit. The process of finding a path to connect to the receiver is called
routing, which is similarly applied today when using a navigation system to find a
route to a destination on a map. Once the call starts, the circuit remains connected
until the end of the call, and no other user can interfere with that connection. The
user exclusively uses the circuit during the call duration, and thus, the usage fee is
charged based on the connection time. Since there are significant periods of silence
during a call, the efficiency of communication resource use is low.
When sending data from a computer, the sender's computer packs the data
into packets, attaches the receiver’s address, and sends them to a router. Typically, a
message is divided into multiple packets for processing. The router reads the packet’s
address and sends it out through an exit leading to the destination, choosing exits with
less traffic. The next router receives it and processes it in the same manner. This way,
the packet passes through several routers to reach the destination computer, where the
receiver’s computer receives all the packets comprising the same message, assembles
them, and delivers them to the receiver. Since packets from the same message may
take different routes and arrive in a different order, the assembly process corrects
the sequence. Packet mode, therefore, has a relatively complex packet processing
procedure and has limitations in handling real-time signals. However, because users
do not monopolize communication resources and share them with all users, the
efficiency of resource use is high.
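A conceptual sketch of this segmentation and reassembly is given below; the destination address, packet size, and message are arbitrary placeholders, and real networks carry this information in IP and transport-layer headers rather than in this toy format.

```python
from dataclasses import dataclass
import random

@dataclass
class Packet:
    dest: str       # receiver's address (an arbitrary placeholder value below)
    seq: int        # sequence number within the message
    payload: bytes

def packetize(message: bytes, dest: str, size: int = 8) -> list:
    """Split a message into fixed-size packets, each carrying the destination and a sequence number."""
    return [Packet(dest, i, message[off:off + size])
            for i, off in enumerate(range(0, len(message), size))]

def reassemble(packets) -> bytes:
    """Sort packets back into sequence order and concatenate their payloads."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))

message = b"Packets from the same message may take different routes."
packets = packetize(message, dest="10.0.0.7")

random.shuffle(packets)                        # simulate out-of-order arrival over different routes
assert reassemble(packets) == message          # sequence numbers let the receiver restore the order
print(reassemble(packets).decode())
```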
Communication services for voice, video, and data signals began independently
and evolved by forming separate networks. However, as each domain expanded,
conflicts arose among different signal types, necessitating a digital conversion
and the need to deliver these signals in an integrated manner through a single
transmission medium. Particularly, the need for integrated delivery between tele-
phone voice services and computer data services grew, leading to frequent conflicts
between the two domains. Originally, data communication started to transmit data
between computers or between a computer and a terminal. As computers evolved
and PCs became widespread, the scope and frequency of computer communication
increased, leading to frequent conflicts with telephone communications. The
conflicts, which started in wired communications, intensified with the need to inte-
grate voice, video, and data services and later moved to the wireless mobile commu-
nication domain. Thus, the development process of information and communication
technology involved conflicts and confrontations, eventually finding solutions for
integration or convergence.
after digital conversion, they all took the same form, with the only difference being
the number of bits. Digitally converted, the information content of TV video signals
can be hundreds to thousands of times greater than that of telephone voice signals.
Thus, allocating the length of time slots or the number of ATM cells proportional to
the information content of each signal type allows the simultaneous provision of all
three services through a single digital network.
14 The internet, which started as a packet mode network called ARPANET in 1969, transitioned to
the Internet after adopting the TCP/IP protocol, which combines the Transmission Control Protocol
(TCP) and the Internet Protocol (IP), in 1983. Subsequently, the Domain Name System (DNS) was
introduced, allowing the use of domain names and IP addresses for addressing. Internet Service
Providers (ISPs) emerged in the late 1980s, and the commercialization of the internet began in
earnest in the mid-1990s, leading to its explosive growth.
Following the conclusion of digital conversion and integration based on packet mode
in the wired communication domain, the battle between circuit and packet modes
shifted to the realm of wireless mobile communication.
15 The battle between ATM-BISDN and internet in the 1990s was a global confrontation in digital
communications, involving all telecom operators and manufacturers worldwide, with the internet
camp fully mobilized. At that time, the key debate topic at various academic conferences in the
fields of communications and computers was ‘ATM or the internet?’.
The dream of wireless mobile communication began in the 1900s, but the first
mobile communication system was established in 1946, and the first portable cell-
phone appeared in 1973. The first commercial mobile communication service was
provided in 1983 with the first generation (1G) mobile communication system
AMPS, an analog mobile communication using Frequency-Division Multiple Access
(FDMA). The user interest and competition among operators in mobile communica-
tion heated up, leading to the emergence of the second generation (2G) digital mobile
communication in 1991, which used Global System for Mobile Communications
(GSM) based on Time-Division Multiple Access (TDMA). In opposition, another
2G digital mobile communication system, IS-95, appeared in 1996, utilizing Code-
Division Multiple Access (CDMA).16 The third generation (3G) mobile communi-
cation emerged in 2006 with asynchronous WCDMA and synchronous cdma2000,
both based on CDMA. Later, the fourth generation (4G) appeared in 2009 and the
fifth generation (5G) in 2018, both utilizing Orthogonal Frequency-Division Multiple
Access (OFDMA). Table 2.1 summarizes the properties of 1G through 5G
mobile communication systems.
With the advent of 2G digital mobile communication, data services began, and
users were interested in data transmission speeds, making speed enhancement a
focal point of competition from 3G onwards. Voice signals were accommodated
16 The CDMA technology was developed by Qualcomm in the USA, with South Korea pioneering
its commercial system. CDMA mobile communication was mainly used in the USA, South Korea,
Vietnam, India, Brazil, and Chile, covering about 13% of global mobile phone subscribers, while
86% used GSM.
using circuit mode, and data using packet mode. Until 3G, mobile communica-
tion focused primarily on voice while gradually placing more emphasis on data, but with the rise of internet
usage, the demand for data services significantly surpassed that for voice services, which was
reflected directly in the development of 4G mobile communication. Expanding
transmission capacity and efficiently handling internet data became central concerns
in developing 4G. Consensus was reached on adopting OFDMA to increase transmis-
sion capacity. However, opinions were divided on how to handle data efficiently: one approach
continued using circuit mode time slots for data packets, and the other adopted the
‘all-IP mode’ for accommodating both voice and data in IP packets, as used on the
internet. The former approach led to LTE adopted by the traditional 2G and 3G
standardization group 3GPP, while the latter led to mobile WiMAX (m-WiMAX)
stemming from IEEE 802 series standards.17
The standardization battle for 4G mobile communication ultimately became
a competition between ITU’s 3GPP LTE and IEEE’s m-WiMAX. From another
perspective, as 3GPP represented traditional circuit mode and IEEE802 repre-
sented IP packet mode, this was essentially the ‘Second Digital War’ in wireless
mobile communication. During the lengthy standardization process, a surprising
turn occurred when the 3GPP camp, recognizing that future communications would
inevitably be data-centric, pivoted the LTE standard proposal to the ‘all-IP mode’.
Thus, both sides adopted the ‘all-IP mode’ for 4G standards, leading to a some-
what anticlimactic victory of packet mode over circuit mode in 4G wireless mobile
communication. Although LTE and m-WiMAX were both adopted as standards,
with most telecom operators and manufacturers participating in LTE,18 packet mode
emerged as the winner in the ‘Second Digital War’. As a result, mobile communica-
tion retained its traditional communication exterior but adopted the genetic makeup
of packet mode internet interior.
The victory of the IP packet mode in the ‘Second Digital War’ in wireless commu-
nication, following the ‘First Digital War’ in wired communication, marked a signif-
icant turning point in the history of communications development. It transformed all
domains, wired and wireless, to handle voice and video signals as IP packets, inte-
grated and processed together. Since the introduction of IPTV services in 1998, even
TV video signals began to be transmitted as IP packets. This development estab-
lished the IP packet mode as the dominant communication protocol, centralizing
the communications platform around the internet and integrating all communication
signals into it. Thus, the 130-year history of circuit mode communications ceded its
throne to IP packet mode, positioning the internet as the primary communications
platform.
17 As the internet penetrated the subscriber network via ADSL and spread to wired local area
networks (LAN/MAN) and wireless local area networks (WLAN/WMAN), it formed WiFi and
subsequently evolved into the broadband WiMAX. The expansion of fixed WiMAX into mobile
form is known as m-WiMAX.
18 Though m-WiMAX was aggressively pushed and standardized by companies like Intel and
Samsung Electronics, it couldn’t overcome the market dominance of LTE led by traditional telecom
groups.
So, what exactly is the internet, the victor that toppled the traditional communication
“Goliath,” and what is the core DNA within it, the IP packet mode?
The IP packet mode refers to the packet mode technique used by the Internet
Protocol (IP) within the TCP/IP protocol suite. The fundamental difference between
circuit and packet modes, as previously explained, is that circuit mode establishes
a circuit for continuous signal transmission for real-time services like voice, while
packet mode collects data as it becomes available and sends it in packets intermittently
for non-real-time services like data. For example, in telephone voice services, dialing
a receiver’s phone number prompts the exchange to establish a circuit connecting the
sender and receiver, which is exclusively used by them for the duration of the call, and
charges are based on the duration of circuit occupancy. Conversely, in packet mode,
the sender packages data into packets, attaches the receiver’s address, and sends
them to a router, where the packets are sent over a communication path shared with
other packets, with charges potentially based on the number of packets sent. Circuit
mode monopolizes communication paths, leading to low network usage efficiency,
while packet mode shares paths, enhancing efficiency but possibly introducing delays
due to packet processing and waiting times. Thus, circuit mode sacrifices resource
efficiency for real-time reliability, whereas packet mode achieves higher efficiency
by not being constrained by real-time requirements.
The fundamental design concepts of traditional circuit-switched networks and
internet-based IP packet-switched networks are diametrically opposed. Traditional
telephone networks concentrate intelligence in the central switch, leaving user termi-
nals (i.e., telephones) with minimal intelligence. In contrast, the internet disperses
intelligence to the user terminals (i.e., computers), with routers in the network core
performing simple functions. Circuit-switched networks rely on complex systems
like exchanges that synchronize all incoming signals and compute routing paths
considering the entire network, leading to high development and maintenance costs.
Routers, however, only set paths within individual routers, simplifying their function
and significantly reducing costs.
The IP packet mode refers to the packet system used on the internet, where the
TCP/IP protocol is utilized to control transmission, addressing, and routing. The
TCP/IP protocol provides a simple yet powerful means of data transmission through
its 4-layer structure. Notably, the Internet Protocol (IP) has a resilience akin to that of
a weed, capable of easily riding over any physical network and delivering data. This
allows it to traverse subscriber networks via ADSL. TCP, the Transmission Control
Protocol, supports various protocols used by internet users such as HTTP for WWW
access, FTP for file transfer, Telnet for remote access, SMTP for mail delivery, etc.,
all operating on top of it (refer to Table 2.2).19
19 The application layer of the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol includes protocols like HTTP (Hypertext Transfer Protocol), FTP (File Transfer Protocol), Telnet (a remote access protocol), and SMTP (Simple Mail Transfer Protocol), while the link layer includes Medium Access Control (MAC) protocols like Ethernet, ADSL, and ISDN. UDP is the User Datagram Protocol, used for simple message exchanges; ARP is the Address Resolution Protocol, which maps IP addresses to physical network addresses (MAC addresses); and RARP is the Reverse Address Resolution Protocol, which maps MAC addresses back to IP addresses.
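A minimal sketch of this layering in practice, assuming network access and using example.com as a stand-in host: the application speaks HTTP, the socket provides a TCP byte stream, IP carries the packets underneath, and whichever link layer happens to be in use (Ethernet, WiFi, ADSL) never appears in the code at all.

```python
import socket

# Application layer: a minimal HTTP/1.1 request to the stand-in host "example.com".
request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"
)

# Transport layer: TCP provides a reliable byte stream; the network layer (IP) and the
# link layer beneath it are handled entirely below this socket API.
with socket.create_connection(("example.com", 80)) as tcp_stream:
    tcp_stream.sendall(request.encode("ascii"))
    response = b""
    while True:
        chunk = tcp_stream.recv(4096)
        if not chunk:
            break
        response += chunk

print(response.split(b"\r\n", 1)[0].decode())   # status line, e.g. "HTTP/1.1 200 OK"
```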
The IP packet mode, developed initially for data services, requires performance
improvements to accommodate real-time services like voice and video. This involves
reducing the delay in segmenting, transmitting, and reassembling service signals
into IP packets to meet real-time service requirements. Though initially considered a
daunting challenge, advances in router technology and the introduction of optical
communications have gradually made solutions feasible. The TCP/IP protocol,
designed for stable wired networks, must consider additional factors when applied to
wireless mobile communication services due to varying wireless channel conditions.
The outcome of the “First and Second Digital Wars,” where the IP packet mode
prevailed over circuit mode to become the dominant communication platform, repre-
sents a significant technological shift. The Internet Protocol (IP) now provides a
universal medium for traffic delivery, offering a common foundation for related infor-
mation processing. This shift led to a change from traditional billing methods based
on call duration to packet-based charging, with modern mobile carriers adopting
tiered pricing plans based on data usage, a variant of packet-proportional billing.
Expensive voice service businesses transformed into cost-effective VoIP operations.
However, the most crucial change was the establishment of the IP-based communi-
cation platform itself, enabling a variety of services like email, web browsing, file
sharing, video conferencing, and online shopping to be provided over the internet.
The IP-based communication platform represents a vertical integration centered on IP, solidifying this union and exerting significant influence across all digital industries and services.
The birth of the internet from computer communications and its victory over long-
standing traditional communications to establish itself as a communication platform
was a revolutionary event, significant enough to be called the ‘Big Bang’ of communi-
cations. However, the wave of digital integration did not end there. Communications
and computers advanced further, to merge at the system level, and eventually formed
a content platform, paving the way for the ‘ICT Big Bang’.
20 Feature phones are multifunctional mobile phones used before the advent of smartphones. Before
feature phones, there were simple phones with only calling functionality. Simple phones evolved into
feature phones equipped with additional features like music, video, text messaging, and cameras,
which then further evolved into today’s smartphones with computer functions.
Fig. 2.5 Development of communications and computers, and ICT Big Bang
forcing application developers to create the same application for the OS specified
by each telecom operator. While Symbian and BlackBerry enjoyed relatively wide adoption among manufacturers and a broad user base, their services were not user-friendly, their interfaces were not intuitive, and their OS functionalities were limited, hindering the take-up of applications. The application marketplace transformed rapidly when the open marketplace, the App Store, was introduced with the launch of the iPhone. Application developers began to freely post their applications on the App
Store, and Google later entered the competition with the Play Store based on the
Android OS, leading to an explosive increase in the number of applications. This
represents the OS-level ‘ICT Big Bang’ phenomenon.
With the emergence of the App Store and Play Store, the competition in smart-
phones shifted from communications to applications, placing the OS at the center
of competition. This intense competition at the OS level marked the ‘Third Digital
War.’ Initially, Apple’s iOS competed against traditional mobile OSs like Symbian,
BlackBerry, and Series 40, but the scene expanded with the rapid rise of Android as a
latecomer. By 2012, iOS and Android began to dominate, with many manufacturers
adopting Android, establishing it as the dominant force in the mobile OS market (see
Fig. 2.6).21 Thus, the so-called Third Digital War in the mobile OS arena ultimately
concluded with a victory for the two major camps: Google’s Android and Apple’s
iOS.
The reasons behind the failure of Symbian, BlackBerry, Windows Mobile, and
others against iOS and Android include various factors. First, iOS introduced inno-
vation in user experience with its intuitive interface and seamless integration with
the existing Apple ecosystem, while Android provided a user-centric experience
with customizable interfaces and integration with Google services, unlike Symbian
21In January 2010, the mobile market share was distributed with iOS at 33%, Symbian at 34%,
BlackBerry at 10%, and Android at 5%. By December 2012, this had shifted to Android at 33%,
iOS at 23%, Symbian at 11%, and BlackBerry at 4%. As of September 2023, the market is divided
between Android at 70% and iOS at 30%.
Fig. 2.6 OS market share (Android, Symbian, iOS, BlackBerry, Series 40) from January 2009 to September 2023. Source StatCounter Global Stats
and others which were not user-friendly or intuitive. Second, Apple’s App Store
and Google’s Play Store created a vast app ecosystem, significantly outperforming
Symbian’s Ovi Store and other OS’s, attracting users with the availability of diverse
applications. Third, Apple and Google rapidly evolved their mobile OS’s, intro-
ducing new features and performance improvements that captured users’ interest,
unlike Symbian and others. Fourth, iOS and Android attracted developers by offering
market share expansion, ease of development, and revenue generation opportunities,
while Symbian and others had limited development tools and support. Fifth, Apple
and Google effectively marketed their OS and devices, creating strong brand loyalty,
especially Apple, which created a premium brand perception, unlike Symbian and
others that lacked brand appeal. Sixth, Apple and Google presented a clear vision and
strategy for their mobile platforms and invested heavily in ecosystem development,
unlike Symbian and others. Seventh, Apple optimized its devices by controlling
both hardware and software, creating a technical ecosystem, while Android flexibly
accommodated various hardware manufacturers’ ecosystems, unlike Symbian and
others.
The ‘ICT Big Bang’ mentioned above was a device-level Big Bang resulting from
the system-level convergence of communications and computers, and an OS-level
Big Bang resulting from the combination of smartphones and application stores.
(Fig. 2.7 labels: content provider, service provider, device manufacturer, user; flows of content, traffic, price, and charge)
However, the Big Bang phenomenon that created explosive changes in reality was at the level of the communications business. The ICT Big Bang pushed traditional
telecom operators from the center stage to the periphery, replaced by application
marketplaces. As the core of telecom business shifted from voice to data services,
and with the emergence and success of open application stores, traditional telecom
operators were forced into a defensive position in the data business, leading to their
marginalization. The fact that telecom operators, who had dominated the commu-
nications stage for 130 years, were pushed to the periphery is indeed an ‘ICT Big
Bang’. Figure 2.7 illustrates this by showing that the central position which used to
be taken by telecom service providers before the ICT Big Bang was taken by the
open application store after. This was a historical Big Bang that upended the foun-
dations of the communications market, marking the end of the telecom-led era and
the dawn of a content-led era. This Big Bang, which reshaped the landscape of the
ICT industry, is thus referred to as the ‘ICT Big Bang’, and since it was triggered by the advent of smartphones, it is also called the ‘Smart Big Bang’.
Since the digital conversion of analog signals, there have been three significant
conflicts in the development of communications and computers, known as the ‘Digital
Wars’. The first was a conflict between the circuit mode and the IP packet mode in the
realm of wired communication, referred to as the ‘First Digital War’. This conflict
then shifted to the mobile communications domain, leading to the ‘Second Digital
War’. As a result, the IP packet mode of the Internet, victorious from the first two
wars, ascended as the communications platform.
The third conflict took place in the mobile OS of smartphones, which integrated
communication terminals and computer systems. With the advent of smartphones,
computers became integrated into communication devices, and computer compa-
nies transformed into communications companies. A prime example of this is the
transformation of Apple, a computer company, into a communications manufacturer
after the launch of the iPhone. Apple entered the mobile communications market
with the iPhone equipped with the iOS operating system, competing against the
existing operating systems of telecom service providers like Symbian and Black-
Berry. This competition intensified with the entry of smartphones equipped with
Google’s Android OS, sparking the ‘Third Digital War’. This war concluded with the
joint victory of iOS and Android, linked to the App Store and Play Store, establishing
them as the foundation of the ‘Content Platform’.
Through these three Digital Wars, communications and computers fully inte-
grated both internally, in terms of communication methods, and externally, at the
device level. As a result, a communication platform based on the IP packet mode of
the internet and a content platform based on iOS and Android were established. The
complete convergence of communications and computers, epitomized by the combi-
nation of smartphones and application marketplaces, caused the ‘ICT Big Bang’ (or ‘Smart Big Bang’). The Big Bang triggered the transition from the communications era to the content era, completely changing the ICT industry landscape and impacting
all industries and society. This laid the technical foundation for the full-scale digital
transformation. Ultimately, the foundation of digital transformation was established
as a result of three Digital Wars in the development and convergence process of
communications and computers.
Chapter 3
Digital Platforms
The digital platforms were established through the first, second, and third Digital
Wars, as previously discussed. In both the wired and wireless communication sectors,
the IP packet mode triumphed over the circuit mode, leading to the convergence of
voice, video, and data signals into IP packets. This integration of communication and
computing established the internet based on the IP packet mode as the communica-
tion platform. This platform represents a solid integration of traditional and computer
communications through IP packets, laying the foundation for offering both informa-
tion and communication services and content services. This convergence has created
a cornerstone for bringing all digital content together within ICT.
The shift to providing data services began with 2G mobile communication, with
data speeds increasing through 3G, 4G, and 5G. This led to a shift in competition from
communication services to content services. Initially, telecom operators competed
with their content services, but the launch of the iPhone with iOS and Google’s entry
with the Android mobile OS escalated the competition into the Third Digital War—a
battle among operating systems, with iOS and Android emerging as the victors in
the content platform space. This has led to applications being traded and operated
through the app marketplaces running on these two OSs.
Thus, the communication platform born from the integration of communication
and computer in both wired and wireless networks is the internet platform based
on IP packets. The content platform, arising from the system-level integration of
communication and computing, operates on the internet platform through OS-driven
content ecosystems. Today’s ability to make calls, surf the web, engage in social
media, and conduct online transactions on a smartphone is due to the integration of
the communication platform and the content platform on the device. These platforms
work together to run various applications, turning the applications themselves into
application platforms that provide various services.
Therefore, when defining a digital platform as a platform that provides digital
services, it includes communication platforms, content platforms, and application
platforms. What appears externally is the internet, which is the lower layer, and
what is transmitted through the internet is content, the upper layer. All voice, video,
and data signals are transmitted over the communication platform in the form of IP
packets, and all content is distributed through applications on the content platform
according to the OS protocol. For example, iOS and Android themselves form the
core of the content platform, and the various applications built on them become
application platforms that provide the unique services of the application.
The term platform is familiar to the general public thanks to platforms found at train
stations. A train station platform is a space where passengers board and alight from
trains. It is not a space for specific trains or passengers; instead, it is an open space
that all trains and passengers can use together. Digital platforms operate similarly.
They are digital spaces where developers and providers of various digital resources
meet users to exchange digital resources. The digital resources provided here include
various contents, applications, and services. Generally speaking, a digital platform is
a digital ecosystem space where developers, providers, and users participate together
to create, distribute, and consume various digital resources such as digital content, applications, and services.
(Figure: the App layer on top of the Content Platform (OS), which runs on the Communication Platform (Internet))
From a physical standpoint, the developers, providers, and users of digital platforms
are interconnected and digital resources are delivered through a communication platform based on the Internet Protocol. In other words, the internet is a communication
platform that connects various devices, enabling information exchange and commu-
nication. The devices like desktop computers, laptops, and mobile devices provide
the means for users to access and interact with relevant websites and apps through
the internet. This means providing the hardware and interfaces necessary for web
browsing, emailing, social media, and other services through internet access. There-
fore, the internet communication platform forms a comprehensive ecosystem that
connects device developers, communication service providers, and users through
various information and communication devices connected to the communication
network, facilitating communication, information exchange, and digital services
worldwide.
Today’s widely used services such as social media, mobile shopping, cloud
computing, and content sharing are provided dependent on applications on content
platforms. However, not all content and services are provided this way. Several
services are directly provided on the communication platform. For example, services
like web browsing, file transfer, remote computer access, and email do not rely on
iOS or Android operating systems but are provided directly through the internet.
These services can be accessed via desktop computers, laptops, mobile devices, or
other specialized devices. While dedicated applications for iOS and Android devices
can be developed and offered, users are not limited to them and can access these
services from various devices and OS’s.
The operation of services like web browsing on communication platforms involves
connecting to the internet and accessing websites and web-based content through
web browsers such as Google Chrome, Mozilla Firefox, and Microsoft Edge. These
browsers are available on various operating systems like Windows, macOS, Linux,
etc. File transfers between computers can be done using File Transfer Protocol (FTP),
and web browsing involves transferring web pages and images between computers
and web servers using HyperText Transfer Protocol (HTTP/HTTPS). These proto-
cols are usable across various platforms and devices. Remote computer access is
achieved through services like Remote Desktop, Secure Shell (SSH), Virtual Network
Computing (VNC) over the internet, available on various operating systems. Email
services like Gmail, Outlook, Yahoo Mail operate independently of the underlying
operating system and can be accessed on various devices through web browsers or
specific email client applications.
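As a simple illustration of this OS independence, the sketch below fetches a web page over HTTPS and counts inbox messages over IMAP using only Python's standard library; the mail server name and credentials are placeholders, not a real account, and the same code runs unchanged on Windows, macOS, or Linux.

```python
# Sketch: web and mail services are defined by protocols (HTTP/HTTPS, IMAP),
# not by a particular operating system, so the same client code works everywhere.
# The mail server, user name, and password below are placeholders.

import imaplib
import urllib.request

# Web browsing: an HTTPS request to a documentation-reserved host.
with urllib.request.urlopen("https://example.com/") as resp:
    print("web:", resp.status, resp.headers.get("Content-Type"))


def count_inbox_messages(server: str, user: str, password: str) -> int:
    """Log in over IMAP and return the number of messages in the inbox."""
    with imaplib.IMAP4_SSL(server) as imap:   # e.g. "imap.example.org"
        imap.login(user, password)
        imap.select("INBOX", readonly=True)
        _, data = imap.search(None, "ALL")
        return len(data[0].split())

# count_inbox_messages("imap.example.org", "someone", "app-password")
```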
a standardized way. In addition, content platforms offer user interfaces such as web
or mobile apps that enable users to access and interact with content and services.
At the core of the content platform is an operating system. The content platform
operates on the internet communication platform powered by an OS. The content
ecosystem is like a community that uses the same OS as its language. For example,
it is akin to a linguistic community that uses the language of Android OS or iOS.
Content platforms produce, distribute, and consume content on the foundation of
the same OS. The previously discussed ‘Third Digital War’ was essentially a battle
between different OS’s, and a war between platform ecosystems using those OS’s.
The winner was decided by the performance of the OS itself, its user-friendliness,
and the adaptability of content creators. Therefore, a content platform may well be
referred to as an OS platform.
case of travel booking platforms, accommodation providers are also included in the
ecosystem.
A comparison of the characteristics of the three types of digital platforms discussed
above is illustrated in Table 3.1.
Table 3.2 Comparison of platform services (e.g., web service and mail service)

(a) When receiving by PC
Web service | Mail service | Operating system
Application program: Browser (Chrome, Edge, Firefox, Safari, etc.) | Email (Gmail, Outlook, Yahoo, etc.) | Windows, macOS, Linux
Application layer: HTTP/HTTPS, DNS | SMTP, IMAP, POP3
Transport layer: TCP, UDP
Internet layer: IP
Link layer: Ethernet, WiFi

(b) When receiving by smartphone
Web service | Mail service | Operating system
Application program: Web app/API | Email app | iOS, Android
Application layer: HTTP/HTTPS, DNS | SMTP, IMAP, POP3
Transport layer: TCP, UDP
Internet layer: IP
Link layer: Ethernet, WiFi, LTE, 5G
is delivered to the internet layer, where it is further divided into IP packets, encap-
sulated, and attached with the IP addresses of the sender (i.e., the user’s device) and
the receiver (i.e., Google’s server). The task of converting the domain name (i.e.,
google.com) to an IP address is carried out by the application layer’s Domain Name
System (DNS). Fourth, the IP packets are passed to the link layer and transmitted
through the network. In this case, the network can be Ethernet or WiFi when using a
PC, and Ethernet, WiFi, LTE, 5G network, etc., when using a smartphone. The link
layer uses protocols and hardware interfaces suited to the type of network to transmit
data.
The IP packets transmitted in this manner pass through various routers and
networks on their way to Google’s servers. Once these packets arrive at Google’s
servers, they undergo a reverse process of the above steps and are reassembled into the
original HTTP/HTTPS request. The search server, having received the original search
query, runs sophisticated search engine software to process the query. It searches the
vast index of the web to find relevant results, utilizing powerful data centers to do so
in a very short time. The search results are packaged into an HTTP/HTTPS response
and sent back to the user’s device. This process is carried out through the same steps
as when the search query was sent, that is, processed through Google server’s appli-
cation layer, transport layer, internet layer, link layer, then through the network to
the user’s device, where it is reassembled and interpreted by the user’s web browser
or search app. Finally, the search results are displayed on the user’s device. All these
processes happen in a very short time.
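The following minimal Python sketch retraces the same round trip from the application layer's point of view: the domain name is resolved through DNS and an HTTPS request is issued, while TCP segmentation, IP packetization, and the link layer are handled transparently by the operating system's network stack (the query string is illustrative only).

```python
# Sketch of the request/response round trip described above, using the standard library.
# Only the application layer is visible here; TCP, IP, and the link layer do their
# work underneath, whether the device is on Ethernet, WiFi, LTE, or 5G.

import socket
import urllib.parse
import urllib.request

host = "www.google.com"

# 1. DNS resolves the domain name to an IP address before any packet can be addressed.
ip_address = socket.gethostbyname(host)
print("resolved", host, "->", ip_address)

# 2. The HTTPS request is composed at the application layer; the layers below are
#    handled by the OS network stack.
query = urllib.parse.urlencode({"q": "digital transformation"})
request = urllib.request.Request(
    f"https://{host}/search?{query}",
    headers={"User-Agent": "Mozilla/5.0"},   # many servers reject requests without one
)

# 3. The response travels back through the same layers and is reassembled here.
with urllib.request.urlopen(request, timeout=10) as resp:
    body = resp.read()
    print("status:", resp.status, "/ received", len(body), "bytes")
```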
The description of the web service above also applies to email services. In the
case of using a PC, the application changes from a browser to an email client,
and in the case of smartphones, the web app changes to an email app, but every-
thing else remains the same. Table 3.2 illustrates such service procedures through
communication platforms and content platforms, using search and email services
as examples. In the case of content platforms, using different applications and their
respective APIs instead of web applications and their APIs allows for the provision
of application-related services.
Given that digital platforms are ecosystems where developers, providers, and users
all participate and where a variety of digital resources such as digital content, appli-
cations, and services are congregated, digital platform companies today hold a signif-
icant share in industry and society. Companies providing application platforms are as diverse as the types of platforms themselves. Among these, the largest
and most influential companies include Apple, Google, Amazon, and Meta (formerly
Facebook). As of June 2024, all four companies are among the top 10 companies in
the world by market capitalization. This fact underscores how platform companies
are leading the digital transformation era and shaping the economic landscape.1
Apple, founded in 1976 by Steve Jobs, Steve Wozniak, and Ronald Wayne, initially
developed hardware and software for personal computers (PCs) like the Apple I,
Apple II, and Macintosh (Fig. 3.2). Subsequently, Apple diversified its business to
produce smartphones such as the iPhone, content consumption and creation tools
like the iPad, laptop computers like the MacBook, as well as portable music players
like the iPod, and music and media management programs like iTunes. Apple also
launched wearable devices such as the Apple Watch and AirPods, and the streaming
music service, Apple Music. The iPhone, in particular, revolutionized the smartphone
market, transforming Apple into a communications manufacturer and a leader in the
communications device sector. The iPhone accounts for about 50% of Apple’s total
revenue, with hardware device sales comprising 78% of the total.2
Apple’s products are known for their excellent brand image and product design.
They have impressed users with intuitive and user-friendly interfaces. Apple’s prod-
ucts and services form a synergistic ecosystem, a powerful tool that encourages the
integration of various Apple products. The quality and reliability of its products have
given customers a positive image. Apple distributes numerous applications through
1 As of June 28, 2024, the ranking of the top 10 companies by market capitalization is as follows:
Microsoft ($3.4 trillion), Apple ($3.3 trillion), Nvidia ($3.0 trillion), Alphabet (Google) ($2.3
trillion), Amazon ($2.1 trillion), Saudi Aramco ($1.8 trillion), Meta Platforms (Facebook) ($1.3
trillion), Berkshire Hathaway ($880 billion), Eli Lilly ($820 billion), and TSMC ($760 billion).
2 In 2023, Apple’s revenue was $383.3 billion, with iPhone sales at $201 billion being the largest,
followed by iPad at $28 billion, MacBook at $29 billion, wearables like Apple Watch and AirPods
at $40 billion, and services like the App Store at $85 billion. Hardware device sales accounting for
78% of total sales affirm Apple’s status as a manufacturing company.
the App Store, building a diverse app ecosystem. The Apple App Store, in partic-
ular, has become a genuine application marketplace that satisfies both users and
developers, spearheading the ‘ICT Big Bang’ in conjunction with the iPhone.
Thus, Apple has brought innovation to communications through the iPhone and
App Store and has led the development of the digital industry in terms of product
design, technological innovation, and user experience. Furthermore, tools like the
iPad and Apple Pencil have supported the creativity of designers and artists. Apple
is expected to bring more innovations in technology and products in the future,
expanding its interests into AI, VR, AR, and the metaverse.
However, there are various issues and concerns. Apple’s ecosystem is monopo-
listic, limiting interoperability with other companies’ devices and services, and high
product prices pose a barrier to entry for low-income groups. Apple’s high market
share raises concerns about market dominance and unfair competition due to monop-
olistic corporate behavior, notably in its unilateral control over App Store platform
usage and fees, causing dissatisfaction among developers. As a manufacturer, Apple
faces criticism for environmental responsibilities related to resource consumption
and waste generation. The high dependency on Chinese manufacturing and supply
chain operations poses geopolitical risks, with ethical concerns about labor condi-
tions and human rights at Chinese manufacturers. Meanwhile, Apple’s collection of
user data for service improvement and targeted advertising has drawn criticism over privacy compliance, as has its refusal to comply with government requests.
Google was founded in 1998 by Larry Page and Sergey Brin. Initially, it led inno-
vations in the web search domain by developing a search engine (Fig. 3.3). Today,
its business areas have diversified into advertising, operating systems (OS), video
content, and cloud services. Google remains the dominant player in the search engine
market. In advertising, Google generates substantial revenue through its advertising
platforms, AdWords (renamed to Google Ads in 2018) and AdSense. Google owns
Android, an OS that competes with Apple’s iOS, and operates YouTube, a popular
video platform. It also runs Google Cloud Platform, offering cloud computing, data
storage, and analytics tools. The majority of Google’s revenue comes from digital
advertising, with ads through search and other platforms accounting for nearly 60% of
total revenue. Including Google Network ads and YouTube ads, advertising revenue
reaches 77% of total revenue.3
Google has contributed to popularizing knowledge with its innovative search
engine, becoming synonymous with online searching. Android OS has played a
significant role in the digital platform ecosystem by providing an open alternative
to Apple’s closed iOS system. Google excels in data analytics and artificial intel-
ligence, offering insightful information and trends as well as personalized services
and functionalities. Recognized for its open culture and capability for technological
innovation, Google is expected to increase its investments in AI, cloud technologies,
healthcare technologies, and autonomous vehicles.
However, there are several concerns and issues with Google Search. While it has
played a role in popularizing knowledge, it has also been instrumental in Google’s
success as an advertising company. The main issue arises from the collection of user
information during the search process and its use in targeted advertising. Google
collects user information under the pretext of improving search services, providing
personalized and faster search services. However, personalized services have led
to the creation of filter bubbles, where search algorithms remember past searches
to offer similar results, inadvertently filtering information within the confines of
previous search histories.
Moreover, Google’s use of user information for targeted advertising has signif-
icantly increased its advertising revenue, exposing users to targeted ads without
their awareness. Concerns extend beyond this, including the risk of massive user
information being exposed through hacking and cyber-attacks. Google’s search algo-
rithms may also create political biases or discriminatory perceptions, leading to social
controversies. Furthermore, Google’s active collection of user information beyond
search services raises concerns about its future impact on users and society. With
77% of its revenue coming from advertising and its leading position in the global
digital advertising market, Google’s dominance in the advertising market also raises
monopolistic concerns.4
3 In 2023, Google’s revenue was $307.4 billion, with Google Search ads at $175 billion, Google
Network ads at $31.3 billion, YouTube ads at $31.5 billion, and cloud services at $33.0 billion.
Advertising revenue totaled $237.9 billion, accounting for 77% of total revenue, making Google
predominantly an advertising company. (Source: FourWeekMBA).
4 In 2022, global digital advertising revenue (market share) was distributed among Google ($168.4 billion, 29%), Meta ($112.7 billion, 19%), Alibaba ($41 billion, 7%), Amazon ($38 billion, 6%), and ByteDance ($29.1 billion, 5%), according to Insider Intelligence (note that statistics may vary by data source). In 2023, the digital advertising market share changed significantly, with Google’s share increasing to 39%, Meta at 18%, Amazon at 7%, ByteDance at 3%, and Baidu at 2%.
Amazon was founded by Jeff Bezos in July 1994, starting as an online bookstore
before expanding into a comprehensive e-commerce company (Fig. 3.4). Amazon
entered the e-book market with the launch of the Kindle e-book reader, further diver-
sifying into Amazon devices like Echo smart speakers and Fire tablets. Amazon Web
Services (AWS) marked its venture into cloud computing services, while Amazon
Prime and Amazon Studios represent its venture into entertainment.
Amazon’s core business today is e-commerce, operating various online market-
places worldwide and offering third-party sellers access to Amazon’s online market.
Amazon Marketplace sells products produced by Amazon alongside those from
other sellers, fostering competition and improving product quality. AWS provides
computing power, storage, and databases to individuals, businesses, and govern-
ments. Amazon Prime offers subscription services like fast shipping and streaming,
while Amazon invests in content creation through Amazon Prime Video and Amazon
Studios. Online shopping sales account for 43% of Amazon’s total revenue, and
including third-party and offline sales, it reaches 71%, solidifying Amazon’s status
as a commerce company.5 However, while online sales slow, cloud services and
advertising revenues are surging, with traffic moving from Google to Amazon for
shopping-related searches.6
Unlike Apple’s contribution to ICT innovation with the iPhone and App Store
or Google’s popularization of knowledge via its search engine, Amazon’s founding
story and success formula differ. Amazon, starting as an online bookstore and shifting
to online retail, aimed to become “the most customer-centric company in the world.”
5 In 2023, Amazon’s revenue was $554 billion, with online sales accounting for $231.9 billion, third-
party seller services generating $140 billion, AWS revenue at $90.8 billion, advertising revenue at
$46.9 billion, subscription services revenue at $40.2 billion, offline store sales at $20 billion, and
other revenue at $5 billion. Total commerce revenue, which includes online, third-party, and offline
store sales, was $391.9 billion, making up 71% of the total revenue. (Source: FourWeekMBA).
6 Notably, AWS led the cloud service market in Q4 2022 with a 32% share, a significant future
business landscape shift for Amazon. Cloud service market share in Q3 2023 remained stable, with
Amazon at 32%, Microsoft at 23%, and Google at 11%.
This vision has driven Amazon’s strategic decisions and operations, leading to its
success. Amazon has built a loyal customer base by ensuring customer satisfaction,
robust infrastructure, a wide product range, and reliable delivery, stimulating compe-
tition, service improvements, and price reductions. It has pioneered e-commerce,
smart devices, and cloud computing services, innovating markets and industries.
Through self-publishing and Kindle Direct Publishing, Amazon has offered oppor-
tunities for writers and transformed the publishing industry. Amazon Marketplace
enables small businesses to sell products globally.
Despite its contributions, Amazon faces concerns. Its core business in online
retail has succeeded by encroaching on existing retail markets with a unique
vision of customer centrality, essentially attracting customers from existing busi-
nesses in a zero-sum game. With a 37.6% share in the online retail market, far
surpassing competitors like Walmart, Amazon faces monopoly concerns.7 Domi-
nance in the online retail market restricts consumer choice and raises fears of
arbitrary rent increases. For example, third-party sellers pay Amazon various fees,
including referral, fulfillment, subscription, advertising, and long-term storage fees.
Amazon’s aggressive expansion through mergers and acquisitions, integrating them
to strengthen market dominance, raises concerns about the future “Amazon Empire.”
As a major employer, Amazon faces criticism for warehouse working conditions
and labor treatment, with rapid robot adoption reducing employment. Amazon’s
collection of vast user data for online businesses also raises privacy and security
concerns.
7The market share of the US online retail market in 2023 was as follows: Amazon 37.6%, Walmart
6.4%, Apple 3.6%, eBay 3.0%, and Target 1.9% (Source: Statista).
and the metaverse with the launch of Oculus Quest. However, the Oculus business
did not meet revenue expectations, with the majority of Meta’s revenue still coming
from advertising on its social media platforms.8
Meta owns several platforms, including Facebook, Instagram, and WhatsApp,
serving hundreds of millions of users. These platforms satisfy diverse user needs
and expand connectivity and social networks. Meta’s personalized advertising plat-
form offers advertisers targeted marketing opportunities, enhancing connectivity and
communication among people and forming social networks, which is recognized
as Meta’s social contribution. In the future, Meta is expected to actively develop
and innovate around the metaverse and virtual worlds, expanding social interaction
functionalities.
However, Meta faces significant concerns about the potential risks and negative
impacts of dominant social media platforms. The primary concern is related to privacy
issues. Meta collects vast amounts of personal information through various platforms
and uses it for targeted advertising, raising concerns about privacy protection, data
security, and the possibility of data leaks. In addition, Meta’s social media platforms
can be used to spread misinformation and fake news and can cause problems such as
misinformation, incitement, and opinion division, which can influence elections and
politics through information manipulation and distortion. Furthermore, excessive use
of social media can lead to addiction, have negative impacts on mental health, and
cause serious social issues like online cyberbullying. Particularly, young users may
be sensitive to the negative effects of social media, prompting calls for appropriate
regulation.9 Meanwhile, Meta faces scrutiny over competition issues and suspicions
of monopolistic practices due to its dominant position in the social media market.
From a different perspective, just as a train station platform serves as the physical
space where the railway company (i.e., the service provider) connects with passengers
(i.e., the service users) to offer transportation, digital platforms function as virtual
ecosystems where developers, providers, and users converge to create, distribute,
and consume digital resources, such as content, applications, and services. Among
digital platforms, content platforms form ecosystems based on operating systems to
support application platforms, while application platforms create ecosystems based
on applications to provide various services and content. Before the rise of app stores,
8 In 2023, Meta Platforms’ revenue was $134 billion, of which $131.9 billion, or 98%, came from
advertising.
9 Meta faced a class-action lawsuit from 41 US states, accused of intentionally designing addictive
systems to keep users engaged for long periods, causing irreversible harm to the mental health
of children and adolescents. This lawsuit stems from 2021 when Frances Haugen, a former Meta
product manager, exposed hundreds of internal documents, revealing that Meta was aware of the
harmful effects of social media on teens’ mental and physical health but failed to act, instead
enhancing the addictive aspects of its platforms.
10The concept of two-sided markets was extensively researched by Jean Tirole. Refer to J-C. Rochet
and J. Tirole, ‘Platform competition in two-sided markets’, Journal of the European Economic
Association 1, pp. 990–1029, 2003.
the efficiency and quality of services provided to both providers and users within the
ecosystems.
In the two-sided markets of digital platforms, the method of covering transaction costs
differs from traditional markets. In offline commerce, buyers pay sellers directly for
products, and sellers pay rent to property owners. In online commerce, buyers pay
the platform provider, who then deducts platform usage fees before transferring the
remaining amount to the seller. In digital platforms handling information, the cost
of using information can be borne by users (i.e., buyers) and service providers (i.e.,
sellers), or by a third party, such as advertisers. In reality, it is often the sellers
or third parties who bear the costs. For example, when booking accommodation
through Airbnb or Booking.com, users only pay the accommodation fee, while the
property owner pays a significant portion of this fee to the platform provider as a cost.
Similarly, when using Google’s services like the search engine, YouTube, or Gmail,
Google provides these services for free to users but charges advertisers beyond the
cost, inserting ads throughout the service.
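A small worked example with made-up numbers shows how such a booking fee is split: the guest pays only the room rate, and the host surrenders a commission to the platform out of that payment (the 15% rate below is purely hypothetical).

```python
# Worked example (made-up numbers) of who bears the cost on a two-sided booking platform.

ROOM_RATE = 100.00        # what the guest pays for the stay
HOST_COMMISSION = 0.15    # hypothetical commission the platform charges the host

platform_fee = ROOM_RATE * HOST_COMMISSION
host_payout = ROOM_RATE - platform_fee

print(f"guest pays:     {ROOM_RATE:7.2f}")
print(f"platform keeps: {platform_fee:7.2f}  (charged to the host, not the guest)")
print(f"host receives:  {host_payout:7.2f}")
```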
A unique mechanism in the basic nature of digital platform two-sided markets is that
platform operators themselves perform a regulatory function. The platform provider
may also impose restrictions on selling prices. Particularly, it monitors to ensure
that sellers do not pass costs onto consumers. The quality of products delivered to
the platform may also be inspected, and arbitration mechanisms are set up in case of
disputes. Such regulatory functions help the platform operator maintain market order
and protect consumers. Why is this? It is because the platform operates as a two-
sided market. By maintaining market order and protecting consumer interests, more
consumers are drawn to the platform, which in turn attracts more sellers, benefiting
the platform operator.
Digital platforms have enabled various functions that were not possible in the indus-
trial society. These functions have created new services that did not exist before,
made inefficient tasks efficient, and brought various conveniences to human life.
Using internet search, humanity can now instantly access knowledge accumulated
over thousands of years, and with navigators installed in cars, one can travel to every
corner of the globe. Messenger services have made it possible for anyone to commu-
nicate from anywhere at any time, and social networking services have enabled
people to form and interact within groups. E-commerce services allow the purchase
of goods produced in various countries with just a few clicks. They have also made
it possible to enjoy movies, performances, sports, and entertainment from around
the world while sitting at home. All these benefits brought by digital civilization are
provided through digital platforms.
However, digital platforms have not only provided positive functions. The emer-
gence of new platform capabilities has brought about unintended negative effects
as well. Internet search services have raised privacy issues due to the collection of
personal information and its use in targeted advertising. Social network services
have been associated with the distribution of false information, fake news, and issues
like cyberbullying. In addition, as e-commerce platforms have increased the market
dominance of providers, they have arbitrarily raised rents for sellers and narrowed
consumer choices, leading to detrimental effects. These problems did not exist in
the industrial society, or if they did, they were not as severe. Internet search services
and social networks did not exist before, so their associated issues were also non-
existent. Furthermore, before online e-commerce, offline commerce was limited to
specific countries or regions, so even if a company had significant market power, it
was subject to national regulations, preventing serious monopoly issues.
The first dysfunction of search services is that the information collected can constrain
future searches, limiting search results to areas of past interest. For example, if
someone has previously searched for travel in Morocco, subsequent searches about
Morocco may prioritize content related to travel and tourism, while information
about its history or political system might be excluded or placed far down in the
results. This happens because internet search engine algorithms are designed to
remember past searches and provide fast, personalized service by offering search
terms within a similar scope. Consequently, users may unknowingly find themselves
filtered within the categories of their past search histories, a phenomenon known as a “filter bubble.” The filter bubble limits the range of information accessible via internet searches, potentially obscuring users’ views and distorting their understanding. If this
phenomenon repeats, it may lead users to view things from a narrow perspective,
relying only on information provided by the search engine, thus falling into prejudice
and overconfidence.
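The schematic sketch below, which is not any real search engine's algorithm, shows how such personalization can narrow results: candidate results that overlap with past queries are boosted, so travel-related pages about Morocco keep outranking its history or politics.

```python
# Schematic illustration of a filter bubble (not an actual search engine algorithm):
# results overlapping with past search history are boosted, so earlier interests keep
# dominating what the user sees.

past_queries = ["morocco travel", "marrakech hotels", "morocco tours"]

candidate_results = [
    "Top 10 things to do in Morocco (travel guide)",
    "Morocco travel tips and tours",
    "History of Morocco from the Berber dynasties onward",
    "Morocco political system explained",
]

def history_bias(result: str, history: list[str]) -> int:
    """Count how many words from past queries appear in a candidate result."""
    history_words = {w for q in history for w in q.lower().split()}
    return sum(1 for w in result.lower().split() if w.strip("():,") in history_words)

# Personalized ranking: travel-related results float to the top, while history and
# politics sink, even though every candidate matches the query "morocco".
for result in sorted(candidate_results,
                     key=lambda r: history_bias(r, past_queries),
                     reverse=True):
    print(history_bias(result, past_queries), result)
```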
The search algorithms themselves are not without issues. Since algorithms are
generally kept secret, it is unclear how search results are ranked, causing users and
content creators to speculate about biases in search outcomes. Search engines tend to
prioritize well-known and frequently visited websites, making it difficult for lesser-
known or smaller businesses and content creators to gain visibility. This has led to
suspicions that Google uses its market dominance to control web traffic, impacting
the sustainability of online businesses and favoring its products and services over
competitors. Google has faced antitrust investigations in several countries for such
competitive practices. In addition, there is potential for search algorithms to be
manipulated to promote false or misleading information, propaganda, disinforma-
tion campaigns, and the spread of fake news. Malicious actors could also use search
engine optimization (SEO) techniques to manipulate search results to prominently
display erroneous information.
Like messenger services, social networks, and online commerce, search engines
collect extensive information about users, allowing for the creation of detailed user
profiles. While the nominal reason is to provide faster search services or personalized
services, the amount of personal information collected by platform providers often
exceeds what is necessary for service improvement. Once information collection
begins, various problems emerge, such as users being subjected to unwanted ads or
temptations to purchase products. If collected personal information is not securely
managed, it could be stolen through hacking or cyber-attacks, leading to significant
unforeseen damages. Thus, platform operators must prioritize security measures to
protect user privacy, but the adequacy of these measures is often unclear, leaving
privacy concerns as a highly explosive dysfunction.
Like search engine services, social media platforms collect personal information,
leading to potential data breaches as a basic dysfunction. However, a more significant
and subtly operative dysfunction is the “echo chamber effect.” Social media amplifies
or strengthens homogeneous beliefs among its participants, creating a phenomenon
where like-minded individuals reinforce each other’s views, leading to information
distortion and bias. This effect can drive users away from the truth and even lead to
the formation of factional groups. Initially, uncertain individuals may become more
convinced and act boldly after being exposed to similar opinions within these echo
chambers. If filter bubbles create bias through collected information, echo chambers
do so through the overlapping selection of information, leading to confirmation bias.
The echo chamber effect causes social media users to cluster into groups with similar
tendencies, and the members of these groups come to blindly support and follow the group.
Today, collective actions mediated through the internet have become a social
issue, and this is among the most lethal dysfunctions that social media brings to the political and
social environment. Various social networking services and personal internet broad-
casts through platforms enable collective action. When used effectively, social media
can help overcome communication barriers due to time constraints or geographical
limits, thereby facilitating the formation and maintenance of human relationships.
However, if social media is used to form malicious groups, incite collective actions,
or engage in cyberviolence, it becomes a dysfunction. Specifically, personal internet
broadcasting, unlike traditional media, is not regulated and can be used irresponsibly
to generate sensationalist content or unfounded claims. This can greatly pollute the
media ecosystem and degrade into a tool that incites and mediates collective actions
or violence. Moreover, if social media circulates false information and fake news
and is misused to manipulate information and stir up public sentiment, it can have
severe impacts on elections and politics.
Misuse of social media can deteriorate relationships and harm personal privacy or
dignity. This could stem from the platforms’ inherent dysfunctions or users’ indis-
criminate behaviors. Social media platforms can be addictive, leading users to spend
excessive time on them at the expense of neglecting real-life responsibilities and
relationships. Overuse of social media can also trigger mental health issues such
as depression, anxiety, and loneliness. Constant comparison with others can lead to
feelings of inadequacy or negatively impact self-esteem, and may even result in Fear
of Missing Out (FOMO) syndrome.11 These phenomena can be considered inherent
dysfunctions of social media itself, as they reflect the detrimental psychological and
social effects associated with its pervasive use. On the other hand, cyberbullying is a
common occurrence on social networks, where users may suffer psychological and
emotional harm due to derogatory comments, threats, and cyberabuse. Malicious
individuals can exploit social media to commit phishing attacks, identity theft, and
11 The term FOMO syndrome is primarily used to describe a fear of missing out or being excluded,
or a vague anxiety about situations where others seem to be having worthwhile experiences that
one has not tried.
other cybercrimes, causing social disorder. These issues can be attributed not so much
to the inherent dysfunctions of social media itself, but rather to the misuse of social
media by its users.
The enduring presence of social media footprints is indeed another dysfunction
of the medium. Actions and statements from the past do not simply disappear but
continue to affect lives today. A personal mistake stored on social media can resurface
at any time, and someone who remembers these past actions might spread them
via social media for various reasons. This can lead to a form of “public opinion
trial,” where past actions are judged by today’s standards, resulting in the public
condemnation of past mistakes. While it might be considered positive that individuals
are held accountable for their actions, the judgments made through public opinion
rather than through legal proceedings are inherently unfair. The “right to be forgotten”
faces challenges against the argument for the “right to know.” Even attempts to erase
past digital records through “digital undertakers” cannot guarantee the complete
removal of all stored files across social media and digital devices, and the aggrieved
memories imprinted in victims’ minds remain inerasable. This highlights the struggle
of creating a forgiving society that allows for reflection and recovery in the face of
social media’s barriers.
on social networks, and various physical information sent via wearable health check
apps. This data is collected and categorized individually by platform companies.
Analyzing this data allows for the understanding of user behavior (such as loca-
tion, movement trajectories, interests, personal networks, social graphs, consumption
habits, and internet search patterns). There is an inseparable relationship between
user data and behavior; thus, analyzing user data enables the understanding of user
behavior. In other words, platform companies collect much more user data than
necessary for service improvement and use this data for business advantages.
If platform companies use personal information to attempt price discrimination,
users may experience pricing unfairness. If the platform does not have information
about a user’s preferences, it might offer the standard or a lower price, but if it
knows the user needs a particular product, it could charge a higher price. Such
price differentiation allows platform companies to make more profits. If such price
discrimination is applied to medical insurance, it could potentially lead to a crisis in
the insurance system. Discriminating by charging lower premiums when someone
is healthy and higher premiums when their health deteriorates can exclude some
individuals from medical insurance, ultimately endangering the insurance system.
When personal information is disclosed to others, the individual becomes
constrained in social life, potentially leading to unhappiness. Disclosure means that
one’s private space narrows while the public space expands. Although one can live
comfortably in private spaces, in public spaces, one becomes acutely aware of the
gaze of others. Thus, the disclosure of personal information means that many aspects
previously considered private become public, constraining social behavior. This can
be particularly disadvantageous in conflict situations and lead to unexpected conse-
quences. To escape such constraints and restrictions, people may act differently than
usual, may not freely express their opinions, and may not seek help even when
facing difficulties. If such a state of isolation becomes unbearably difficult, one may
seek refuge in safe spaces, which could be physical places like churches or temples
or virtual spaces like other social networks. However, if one joins a homogeneous
group on a social network, while it may initially feel safe and comforting, differences
in opinions among members over time can lead to divisions and eventually harmful
actions against each other. This phenomenon can occur in both religious and political
groups, albeit with varying degrees of severity.
Thus, the leakage of personal information through digital platforms can be consid-
ered the most critical dysfunction at an individual level. The best strategy is to prevent
the leakage of personal information as much as possible. Primarily, each individual
should not leak their personal information, and secondarily, platforms should take the
best measures to prevent the leakage of personal information. While personal infor-
mation leakage by individuals can be minimized by prudent behavior, leakage by
platform operators can only be minimized by enacting and strictly enforcing personal
information protection laws. Further, more proactive measures are also worth considering, such as legislating to limit the content and scope of collected information or limiting the period for which collected information may be stored.
Another dysfunction of digital platforms is that they can deceive users through the
collection of massive amounts of personal data and sophisticated sales tactics, which
users can hardly imagine. For example, products recommended by a platform’s marketplace are likely to be either products that pay substantial commissions to the platform or the platform’s own products. This means that products recommended by platforms
may be more beneficial for the platform operators than for the users. Even if platform
operators use extensive user personal data to expose items that might interest users
and present them as the best products for the user, ultimately, this is more about
making money for the business than helping the user. Such dual business practices
raise issues concerning the transparency of information and the neutrality of platform
operators.
In the case of intermediary service platforms like online commerce and travel
agencies, there exists a sales tactic known as the “lowest price guarantee,” which
can disrupt the order of commerce by luring users. For example, if hotel booking
platform A advertises that it guarantees the lowest price, how can platform A ensure
the lowest price while compensating for any shortfall? If platform A is a well-known
platform that is convenient to use and connected to various hotels, and it guarantees
the lowest prices, everyone would want to use this platform. If all users flock to
platform A and stop using other platforms, what happens? Platform A may pass on
the losses incurred from guaranteeing the lowest prices to the hotels by demanding
additional fees, and the hotels, having no other way to connect with users except
through platform A, will have to comply. As a result, hotels paying additional fees
will seek ways to recover their losses, eventually passing them on to other customers
or other platforms. For instance, they might increase the rates for customers not
booked through platform A or for those booking through other travel platforms like
platform B. Thus, the losses from one platform’s lowest price guarantee end up
being transferred to other users as additional charges. This is a problem for all digital
platforms, especially intermediary commerce platforms, leading to the breakdown
of a healthy online commerce order.
Such lowest price guarantees are banned or regulated in European countries like
Germany and France due to their harmful effects on market order. These regula-
tions primarily prevent online commerce and hotel booking platforms from limiting
competition or restricting how third parties (e.g., hotels, merchants) set prices through
lowest price guarantees. Germany, in particular, pays attention to clauses related to
the lowest price guarantee in digital markets, often referred to as price parity or
most-favored nation (MFN) clauses. The Federal Cartel Office investigates compa-
nies that use such clauses and restrict or prohibit them if they are deemed to hinder
fair competition. France investigates these clauses and, if found to limit competition
or negatively affect consumers, requires limitations or corrective measures.
Digital platforms inherently possess the risk of evolving into natural monopolies
due to network externalities. As exemplified earlier, search engines can collect more
data and enhance their algorithms with an increasing user base, thus providing better
search results. Similarly, navigational apps become more accurate and efficient as
more users contribute data about their behavior patterns and road conditions. Conse-
quently, users naturally gravitate toward platforms with a larger user base, and this
choice further strengthens the platform through network externalities, promoting a
winner-takes-all scenario and leading to a natural monopoly.
With the content platforms divided between iOS and Android, where Android is
used by several manufacturers but iOS is exclusively controlled by Apple, Apple’s
ecosystem showcases high market dominance and associated detriments.12 This
exclusivity means Apple can tightly control hardware, software, and applications,
limiting freedom for users and developers who must adhere to Apple’s guidelines
and policies. Apple’s control over the iOS app marketplace further restricts compe-
tition for developers, who may face rejection or high fees for store entry. In addi-
tion, the requirement to use Apple’s exclusive development tools and programming
language can constrain open-source and cross-platform development preferences.
Apple’s premium pricing for its devices restricts consumer access, and the lack of
compatibility with other manufacturers’ devices hinders platform switching and data
sharing.
Google’s dominance in the global search market, with a 91% share, and its
leading position in digital advertising, showcases potential monopolistic dysfunc-
tions. The overwhelming market share restricts competition and innovation, making
it challenging for users to find alternatives and suppressing diversity in the digital
ecosystem. Google’s market power could lead to bias in search algorithms away
from the most relevant results toward those most profitable, affecting the quality
and diversity of search information. Similarly, in digital advertising, Google’s
dominance raises concerns about price manipulation, which limits advertisers’
options and potentially increases costs that may be passed on to consumers. In
addition, Google’s control over content publishers’ revenue streams can signif-
icantly influence disputes over revenue sharing and threaten the sustainability of
high-quality journalism. Like its search engine, Google’s dominance in advertising
makes it difficult for smaller competitors to challenge its vast resources, thereby
suppressing innovation and limiting the introduction of new services to the detriment
of consumers.
Amazon, with a 37.6% share in the global online retail market, overwhelmingly
surpasses its closest competitor, Walmart, which holds only 6.4%. This dominant
position allows Amazon to impose arbitrary fees on platform sellers and restrict
12As of April 2024, the global OS market share stands at Android 70.9% versus iOS 28.4%, yet in
North America and Oceania, iOS dominates with 55% and 54% market share, respectively. While
Android is utilized by a majority of smartphone manufacturers like Samsung, Huawei, and Xiaomi,
iOS remains confined within Apple’s closed ecosystem, including iPhones, iPads, and iPod touches.
consumer choices. Amazon’s vast delivery network and warehouses exert pressure
on competitors, and during the pandemic, while other competitors were on the defen-
sive, Amazon aggressively expanded employment and investment. Numerous third-
party sellers depend on Amazon’s marketplace to reach customers, and Amazon
controls them through various fees and access to customer data. Fees for third-party
sellers include referral fees, fulfillment fees, subscription fees, warehouse fees, long-
term storage fees, and product removal fees, causing distress and complaints among
sellers.
Meta’s unparalleled position in social networking and messaging services, with
Facebook, Instagram, WhatsApp, and Facebook Messenger dominating user counts,
raises concerns over monopolistic practices.13 The Federal Trade Commission
(FTC)’s lawsuit against Meta for acquiring Instagram and WhatsApp highlights
potential antitrust violations, indicating that such dominant market power can hinder
market entry for other companies and limit user choices. If a monopoly eliminates
competition, it enables firms to act against user interests, lessen product improvement
pressures, and ultimately disadvantage consumers.
13 As of April 2024, Meta’s platforms have a combined total of 8.1 billion monthly active users, with
Facebook at 3.1 billion, Instagram at 2.0 billion, WhatsApp at 2 billion, and Facebook Messenger
at 1.0 billion, according to Statista data. This overwhelmingly surpasses YouTube at 2.5 billion,
TikTok at 1.6 billion, WeChat at 1.3 billion, and Telegram at 900 million users.
3.6 Regulation of Digital Platform Companies
The USA has long provided a free environment that has allowed platform companies to
innovate and grow rapidly. However, the USA has begun regulating platform companies
including Meta,14 Google,15 Amazon,16 and Apple17 for antitrust violations, and Europe
has also begun to formalize regulation of digital
platform companies.18
Digital platforms are deeply embedded in our society today, and our daily lives are
significantly dependent on digital platform services. Therefore, the negative aspects
of digital platforms already affect our lives, making regulation of digital platforms
an urgent task of our times. Effective regulation of digital platforms is necessary
to reduce the social harm caused by certain operators’ monopolies. To that end, it
is essential to accurately describe the characteristics of digital platforms and find
suitable methods of regulation. The methods should be carefully designed such
that they can preserve the advantages and innovative development of digital plat-
form services while preventing their negative aspects and side effects. Further, they
should be able to open doors to new competitors in the digital platform’s two-sided
14 In December 2020, the FTC filed a lawsuit against Facebook (Meta), alleging that its acquisitions
of Instagram and WhatsApp were anticompetitive. The first lawsuit was dismissed in June 2021, but
the FTC filed an amended lawsuit, with a trial expected to take place in 2024. The FTC argues that
while the acquisitions did not lead to price increases, they resulted in poorer service and restricted
consumer choice.
15 In January 2023, the U.S. Department of Justice filed a lawsuit against Google for abusing
its monopoly power in the digital advertising market and harming fair competition, in violation
of antitrust laws. The lawsuit claimed that Google demanded default monopoly rights to block
competitors. Specifically, it was alleged that Google entered into revenue-sharing agreements with
electronic device manufacturers and wireless carriers to set Google as the default search
engine, thereby excluding competitors. On August 5, 2024, the federal court in Washington, DC ruled that
“Google is a monopolist, and it has acted as one to maintain its monopoly”.
16 On September 26, 2023, the FTC filed a lawsuit against Amazon for allegedly using its dominant
position in the e-commerce market to harm competitors. The FTC argued that Amazon penalized
sellers on its platform if they offered their products at lower prices on competing platforms. It was
also claimed that Amazon forced sellers to use its expensive logistics network, resulting in harm to
competitors and causing consumers to pay higher prices for their purchases.
17 On March 21, 2024, the U.S. Department of Justice, along with 16 states, filed an antitrust lawsuit
against Apple. The lawsuit alleged Apple was harming consumers by using its monopolistic position
in the market. Key issues include ‘super apps,’ differentiation in messaging app bubble colors,
restrictions on cloud streaming gaming apps, and limiting cross-platform compatibility for digital
wallets and smartwatches, which restrict consumer choices and hinder competition.
18 In July 2022, the EU enacted the Digital Markets Act (DMA), a competition law aimed at
regulating monopolies in digital platforms. Simultaneously, the Digital Services Act (DSA) was
established to regulate the monopolies of digital platform services. For specific details, refer to
Sects. 6.2 and 6.3.
market and ensure that all participants are fairly compensated according to their
contributions.
However, the problem is that digital platforms present a case that has not been
experienced in the industrial society of the past, so it is unclear what the most effective
way to regulate them is. In the past, industries like telecommunications, electricity,
and railways had high entry barriers and a high potential for monopoly, which regulators
sought to prevent. The method used to regulate monopolistic companies at that
time was to split the company and introduce competition. For example, in
telecommunications, local communication businesses requiring essential facilities
were considered the core business, so the long-distance communication business was
separated out and opened to competition. Similarly, in the railway business, train stations
and railway networks were seen as essential facilities, so the transport sector was
separated out and opened to competition.
However, it appears inappropriate to apply the same concept of splitting core busi-
nesses and competitive areas to today’s digital platform business. This is because the
reasons for high entry barriers and monopolies differ between the two. In the
past, the high entry barriers were due to the need for substantial facility investments,
naturally forming a monopoly structure. In contrast, the difficulty for new entrants
in the digital platform market is not due to substantial facility investments but due to
the network externality effects, which concentrate users on some particular service
providers. Therefore, it is not valid or effective to apply the regulatory methods
used in traditional industries to digital platform businesses. Moreover, while tradi-
tional industries were confined within a country, allowing the government to regu-
late them, digital platform businesses are global, making such regulatory methods
inapplicable.19
Therefore, it is necessary to develop and apply new methods of regulation that
fit the nature of digital platforms, such as creating new competition laws to regulate
natural monopolies in digital platforms. For example, unlike traditional monopolistic
companies where high entry barriers were due to essential facilities, digital platforms
have barriers due to user concentration. It might be more appropriate to target the
core business directly, considering that the user concentration phenomenon occurs
in the core business segment. If competitors could access the core business without
discrimination, it could resolve the issues caused by user concentration. Therefore,
in two-sided markets like digital platforms, a “multi-homing” system that allows
users and providers to participate in multiple platforms simultaneously could help
to resolve the natural monopoly problem. Multi-homing, originally a term used in
communications, refers to connecting users to multiple networks to improve the
reliability and performance of communications. For instance, in the US, ride-hailing
operators like Uber and Lyft allow drivers and passengers to freely use both companies,
which is a working example of multi-homing in a two-sided market.
19 Nobel Prize-winning economist Jean Tirole argued that traditional regulatory and antitrust rules
are ineffective in addressing the issues of increased dominance in digital platforms, and new rules are
needed to ensure competition in the digital platform market, including the potential for competition
and fair compensation for the contributions of the platform users. Refer to “The Future of Platform
Regulation” NBER conference lecture, April 22, 2022.
The European Union (EU) was the first to legislate regulations for digital platform
companies with the establishment of the Digital Markets Act (DMA) in July 2022,
aimed at regulating monopolies in digital platforms. The DMA was first proposed
in December 2020, adopted by the European Parliament in July 2022, officially
published in the Official Journal in October 2022, came into effect in November
2022, and has been applied since May 2023.
The DMA is a law designed to increase fairness and the possibility of competition
in the digital platform market. It sets clear criteria for identifying ‘gatekeepers’ and
stipulates the obligations and prohibitions that gatekeepers must comply with. Gate-
keepers refer to large digital platforms that provide core platform services, such as
online search engines, app stores, and messaging services. Specifically, a gatekeeper
is defined as a digital platform that, first, holds a strong economic position with a
significant impact on the EU internal market and operates across several EU countries;
second, connects a large user base and many businesses from a powerful interme-
diary position; and third, has had a stable and durable position in the market, meeting
the above two criteria for at least two out of the last three fiscal years.
Examples of obligations that the DMA requires gatekeepers to comply with are as
follows: In certain situations, the gatekeeper must allow other companies to interop-
erate with the gatekeeper’s own services. It must provide business users of the gate-
keeper’s platform with access to data generated during their use of the platform. It
is required to provide tools and information necessary for advertisers and publishers
to perform independent verification of advertisements placed on the gatekeeper’s
platform. Business users of the gatekeeper’s platform must be allowed to promote
and make contracts with their customers outside of the gatekeeper’s platform.
Examples of prohibitions that the DMA requires gatekeepers to comply with are
as follows: The gatekeeper must not give preferential ranking to its own services or
products over similar services or products provided by third parties on the platform.
It must not hinder consumers from connecting with businesses outside the platform.
It must not prevent users from uninstalling pre-installed software or apps if the users
choose to do so. Without valid user consent, the gatekeeper must not track users for
targeted advertising purposes outside of its core platform services.
If these obligations and prohibitions are not complied with, the company can be
fined up to 10% of its total annual worldwide turnover, and up to 20% in the case of
repeated violations. If gatekeepers systematically breach DMA obligations, further
remedies, which may include behavioral or structural measures, can be imposed.
20 Following the implementation of the DMA, Apple announced that from March 2024, iPhone
apps could be downloaded from app markets other than its own App Store in Europe, and it allowed
the use of third-party payment systems for in-app purchases. This move by Apple, in response to
Europe’s strong antitrust regulation through the DMA, indicates that it can no longer maintain a
closed ecosystem.
Table 3.3 DMA-designated gatekeepers and platform services
Gatekeeper        | Social network | Intermediation                           | NI-ICS* | Ads        | Video sharing | Search        | Browser | OS
Alphabet (Google) | –              | Google Maps, Play Store, Google Shopping | –       | Google Ads | YouTube       | Google Search | Chrome  | Android
Amazon            | –              | Marketplace                              | –       | Amazon Ads | –             | –             | –       | –
The European Union (EU) legislated the Digital Services Act (DSA) alongside the
Digital Markets Act (DMA) in July 2022 to regulate the monopolies of digital plat-
form services. Like the DMA, the DSA was first proposed in December 2020, adopted
by the European Parliament in July 2022, published in the Official Journal in October
2022, and came into effect in November 2022, with enforcement starting in August
2023.
The DSA works closely with the DMA to protect users’ fundamental rights,
establish a foundation for fair competition among companies, create a safer digital
space, and foster innovation, growth, and fair competition in both the European
single market and the global market. The DSA covers a wide range of digital plat-
form services from simple websites to internet infrastructure services and online
platforms. The rules set out in the DSA apply mainly to online intermediation plat-
form businesses such as e-commerce, social networks, content sharing platforms,
app stores, and online travel and accommodation platforms.
The motivation behind the EU’s establishment of the DSA stems from the benefits
and challenges brought by digital platform services. Digital platform services have
enhanced human life in various ways, such as communication, information search,
shopping, entertainment, food ordering, and movie watching, and have facilitated
business activities across borders and new market access. However, they also intro-
duced significant issues, including the trade of illegal goods, services, and content
online, and the spread and exploitation of false information created by manipulative
algorithms. Moreover, some large platforms have dominated the digital economy
ecosystem, acting as gatekeepers that impose unfair conditions on businesses and
limit users’ choices by establishing their own rules. Therefore, the EU introduced
the DSA as a modern legal framework to ensure online safety for EU users, protect
the fundamental rights of businesses, and maintain a fair and open online platform
environment.
After the DSA’s implementation in November 2022, the European Commission
designated certain digital platforms as Very Large Online Platforms (VLOP) and
Very Large Online Search Engines (VLOSE) in April 2023, based on their active
user numbers exceeding 45 million (10% of the European population) in the EU as
of February that year. The designated VLOPs included 17 entities like AliExpress,
Amazon Store, and the App Store, while VLOSEs included Bing and Google Search,
among others, as listed in Table 3.4.
The DSA sets rules and responsibilities for online platforms, particularly VLOPs
and VLOSEs. Examples of obligations that the DSA requires VLOPs and VLOSEs
to comply with are as follows: They must publish a content moderation activity report
describing the efforts they make to address illegal content and misinformation. They
are required to implement robust content moderation systems and mechanisms to
prevent the spread of illegal content, hate speech, and harmful misinformation. They
must provide relevant users with information regarding the removal or restriction of
such content and offer an opportunity to appeal these actions. Platforms must coop-
erate with regulatory authorities and law enforcement to respond to illegal activities
related to content.
The DSA imposes additional obligations on online platforms such as e-commerce
sites and social media networks. These include providing clear information about
each advertisement displayed and discontinuing personalized ads based on sensitive
data or profiling of children’s data. Platforms must avoid online interface designs
intended to manipulate user behavior and disclose in their Terms & Conditions (T&C)
any use of fully or partially automated systems for recommending certain information
to users. They are required to offer a notification mechanism for users to report
illegal content and notify affected users of any actions taken regarding their content.
Platforms must provide an effective internal complaint-handling system for users
impacted by decisions related to content or account management.
Examples of prohibitions that the DSA requires VLOPs and VLOSEs to comply
with are as follows: They are prohibited from hosting or spreading illegal content,
including hate speech, child abuse material, and unauthorized copyrighted material. They
must avoid unfair competition practices that harm consumers or other
businesses, or that stifle innovation. Discrimination based on personal characteristics such
as gender, race, or religion is not allowed. They are also prohibited from engaging
in manipulative practices that distort the presentation of content or services.
A notable aspect of the obligations and prohibitions for VLOPs and VLOSEs is
that personalized advertising based on an individual’s religion, race, sexual orien-
tation, or political views is prohibited, and no form of personalized advertising can
be directed at children and adolescents. This is because if advertisements containing
specific religious, ethnic, or political messages are “personalized,” it could lead to
the entrenchment of user biases. For example, it prevents recommending provoca-
tive content with white supremacist messages to users who enjoy content involving
racial hate. In addition, harmful content, such as discriminatory or biased speech,
terrorism, and child sexual abuse, must be swiftly removed, and platforms must
establish internal measures to prevent the spread of misinformation. Furthermore,
users must be provided with the ability to opt out of data collection and disable
recommendation algorithms, meaning they should be able to view posts in simple
chronological order without recommendations.
VLOPs and VLOSEs are directly regulated by the EU. To this end, EU regulatory
authorities have the right to access the data and information held by VLOPs and
VLOSEs, and if access is refused or obstructed, they can take legal action. Authorities
have the power to order the removal or restriction of specific content that violates
DSA regulations, and failure to comply with such orders can result in penalties.
Large platform companies that violate the DSA may be fined up to 6% of their
global revenue, and in cases of serious or repeated violations, their services may
be temporarily suspended or permanently banned from the European market. In
addition, companies that do not comply with DSA regulations may face legal action
from regulatory authorities, users, or other affected parties.
Along with the DSA, the era of unregulated growth for digital platform companies
has come to an end. It is likely that the EU’s regulations on digital markets and
services will spread to other countries, meaning that digital platform companies will
have to operate under strict regulations in the future. The previously unregulated use
of sensitive personal information for personalized advertising or content exposure is
no longer allowed. Even content recommendation algorithms, which are considered
a core competitive advantage, must be fully revised to comply with the regulations.
For companies like Google, which relies on advertising for approximately 80% of
its revenue, and Meta, which relies on over 90%, the DSA could represent a seismic
shift in their business models.
Chapter 4
Digital Technology
Users communicate wirelessly with the base station in their area using a cellular
phone, and if a user moves into a different base station’s area, the previous base
station hands the communication over to the new base station, enabling
uninterrupted communication while on the move.
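To make the handover idea concrete, the following is a minimal, illustrative Python sketch of a handover decision rule; the hysteresis value and the measurements are hypothetical, and this is not an actual 3GPP procedure.

    # Minimal, illustrative handover decision: the phone keeps measuring signal
    # strength from nearby base stations and hands over only when a neighbour is
    # stronger than the serving cell by a hysteresis margin (to avoid ping-pong).
    HYSTERESIS_DB = 3.0  # hypothetical margin

    def choose_serving_cell(serving_id, measurements):
        """measurements: dict mapping base-station id -> received power in dBm."""
        best_id = max(measurements, key=measurements.get)
        if best_id != serving_id and measurements[best_id] >= measurements[serving_id] + HYSTERESIS_DB:
            return best_id    # hand over to the stronger neighbour
        return serving_id     # stay on the current base station

    # Example: the user moves away from cell "A" toward cell "B".
    print(choose_serving_cell("A", {"A": -95.0, "B": -88.0}))   # -> "B"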
The wireless frequency bands used in 5G mobile communication are divided
into bands below 6 GHz (FR1) and the 24–54 GHz band (FR2). The maximum
channel bandwidth is 100 MHz for FR1 and 400 MHz for FR2. FR1 has rela-
tively longer wavelengths, allowing signals to propagate further, resulting in base
stations being several kilometers apart. However, FR2 belongs to the millimeter
wave (mmWave) band, experiencing significant attenuation and shorter signal prop-
agation distances, with base stations only tens to hundreds of meters apart. Therefore,
while the FR2 band offers higher data transmission rates than FR1, it requires more
frequent handovers.
In mobile communication, multiple access refers to the technique by which multiple
users share the available radio resources to carry their signals. Each generation of mobile communication has
used different multiple access methods (refer to Table 2.1), and like 4G, 5G mobile
communication uses Orthogonal Frequency Division Multiple Access (OFDMA)
technology for both uplink and downlink. OFDMA is a version of Orthogonal
Frequency Division Multiplexing (OFDM) modulation that allows multiple users
to access simultaneously by assigning different subcarriers to different individual
users.
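As a rough illustration of the OFDMA idea, the following Python sketch assigns disjoint subcarriers of one OFDM symbol to different users and forms the time-domain signal with an inverse FFT; the sizes are toy values, not 5G numerology.

    import numpy as np

    # Illustrative OFDMA downlink symbol: different users occupy disjoint
    # subcarriers of one OFDM symbol; an inverse FFT turns the frequency-domain
    # grid into the time-domain waveform.
    N_SUBCARRIERS = 64
    allocation = {"user1": range(0, 16), "user2": range(16, 40), "user3": range(40, 64)}

    def qpsk(bits):
        b = np.asarray(bits).reshape(-1, 2)
        return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

    grid = np.zeros(N_SUBCARRIERS, dtype=complex)
    rng = np.random.default_rng(0)
    for user, subcarriers in allocation.items():
        n = len(subcarriers)
        grid[list(subcarriers)] = qpsk(rng.integers(0, 2, 2 * n))  # each user's QPSK symbols

    time_signal = np.fft.ifft(grid) * np.sqrt(N_SUBCARRIERS)  # OFDM modulation
    print(time_signal.shape)  # one OFDM symbol carrying all three users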
In addition, 5G mobile communication adopts Massive MIMO (Multi-Input
Multi-Output) technology which involves equipping both base stations and termi-
nals with multiple antennas to transmit and receive signals. MIMO simultaneously
transmits and receives signals from multiple users, increasing throughput via spatial
multiplexing or improving reception performance through beamforming by transmit-
ting and combining the same user’s signal through multiple antennas. While 4G uses
2 to 4 MIMO antennas, 5G uses up to hundreds of antennas to transmit signals through
dozens of channels simultaneously, known as Massive MIMO, and supports beam-
forming technology.1 That is, 5G supports beamforming through Massive MIMO,
enhancing signal strength and reducing interference, thus improving the efficiency
of signal transmission and reception.
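The array gain behind Massive MIMO beamforming can be illustrated with a toy Python sketch using conjugate (matched-filter) weights; the channel model and antenna counts below are illustrative only.

    import numpy as np

    # Toy illustration of beamforming gain with many base-station antennas: with
    # conjugate (matched-filter) weights the per-antenna signals add coherently at
    # the intended user, so received power grows with the antenna count.
    rng = np.random.default_rng(1)

    def array_gain(n_antennas):
        h = (rng.standard_normal(n_antennas) + 1j * rng.standard_normal(n_antennas)) / np.sqrt(2)
        w = h.conj() / np.linalg.norm(h)   # conjugate beamforming weights
        return abs(w @ h) ** 2             # received power for unit transmit power

    for n in (4, 64, 256):                 # 4G-like array vs. massive-MIMO-like arrays
        print(n, round(array_gain(n), 1))  # gain scales roughly linearly with n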
By adopting various advanced technologies including massive MIMO and beam-
forming, 5G has significantly improved performance. Specifically, 5G’s data trans-
mission rate is 10–100 times faster than 4G, allowing for the download of large files
and streaming of high-definition videos. 5G reduces latency to about a tenth of 4G,
beneficial for applications requiring immediate responses, such as remote surgery,
autonomous driving, augmented reality (AR) and Intelligent Traffic Systems (ITS).2
1 Beamforming technology enables numerous simultaneous connections, increasing both data rate
and network capacity. Beamforming sends signals from multiple antennas simultaneously to focus
them on a specific receiver, allowing efficient wireless signal reception at the receiver without
increasing transmission power.
2 ITS applies ICT to the field of road traffic, enhancing traffic efficiency and safety, increasing road
capacity, and reducing travel time by systematizing the infrastructure, vehicles, users, traffic, and
mobility management, and interfaces with other modes of transport.
In addition, 5G has a larger network capacity and supports significantly more simul-
taneous device connections compared to 4G, enabling it to accommodate and serve
a large number of devices at the same time in applications such as the Internet of
Things (IoT), smart cities, and industrial automation.
For example, examining how 5G technology becomes an essential element in
ITS services, we find the following points. First, fast data transmission is crucial
for sending and receiving large amounts of data in real-time, which is necessary
for managing and optimizing traffic flow in ITS. Second, low latency is essential
for enabling real-time communication among vehicles, infrastructure, and central
control systems in ITS. Third, the capability for massive simultaneous device connec-
tions allows various devices such as sensors, cameras, traffic lights, and vehicles in
ITS to be connected and communicate with each other at the same time. Fourth,
beamforming technology enhances the accuracy of location services required for
vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication, as well
as collision prevention.
Thus, 5G technology, compared to 4G, has outstanding performance in terms
of data transmission speed, latency, the number of simultaneous connections, and
frequency efficiency, offering great potential for future services like IoT, smart cities,
and remote education, bringing innovation and efficiency improvement. However, it
is assessed as technically limited in fully supporting unmanned remote services like
remote healthcare, industrial automation, or high-risk services such as autonomous
driving and ITS. To overcome the limitations of 5G and meet the advanced industry
requirements like smart factories, smart farms, remote healthcare, autonomous
driving, and ITS, research on the sixth generation (6G) mobile communication is
actively underway globally, integrating cutting-edge technologies like computing,
sensing, communication, and artificial intelligence. 6G research aims to support
fundamental industry innovation by meeting demanding requirements across all
industries.
While 4G focused on increasing data transmission volumes by securing broadband
channels, 5G aimed to expand network capacity, massively increase the number of
simultaneous connections, and significantly reduce transmission delay. Going one
step further, 6G aims to advance the broadband, massive connectivity, and low-
latency goals of 5G to achieve ultra-broadband (up to 1000 Gbps), ultra-massive
connectivity (100 devices per square meter), and ultra-low latency (6 ms delay within
a 1000 km range). It also aims to extend communication to space, save energy, and
achieve ultra-precise positioning. If these technological goals to provide 6G mobile
communication services could be successfully achieved by around 2030, it would
be possible to provide uninterrupted services while traveling at supersonic speeds in
airplanes or hyperloops, as well as high-speed internet services in deserts, seas, or
remote areas. It would also be possible to fully and safely commercialize massive
unmanned agriculture, transportation, and distribution using unmanned autonomous
vehicles, robots, and drones.
The Internet of Things (IoT) extends the internet used by people to objects (things). It
connects the objects around us to the internet, allowing them to communicate, share
data, and operate intelligently. This connection between the physical and digital
worlds enables data-driven decisions and automation. Specifically, it is a network
equipped with sensors and software on various physical devices, vehicles, buildings,
and objects, connected through a network to collect, exchange, and share data. To
accommodate the vast number of objects, the IoT protocol has expanded from the
32-bit address field of IPv4 to the 128-bit address field of IPv6.3
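The scale of this address-space expansion is easy to verify (see also footnote 3).

    # Address-space sizes behind the IPv4-to-IPv6 expansion.
    ipv4 = 2 ** 32
    ipv6 = 2 ** 128
    print(f"IPv4: {ipv4:,} addresses")    # 4,294,967,296 (about 4.3 billion)
    print(f"IPv6: {ipv6:.2e} addresses")  # about 3.4e+38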
Technically, IoT comprises sensors, actuators, and communication devices.
Sensors detect and collect data from the environment, like temperature and humidity
sensors. Actuators execute commands affecting the environment, like turning lights
on or off in a smart lighting system. Communication devices transmit data collected
by sensors or commands to actuators using WiFi, Bluetooth,4 and other wireless and
mobile communication devices.
The data generated by numerous IoT devices can be processed at the network
edge through edge computing or sent to the network’s core for cloud computing
processing. Edge computing processes large amounts of data directly at the edge,
reducing latency, and compresses the data to send to cloud servers as necessary. Edge
computing is particularly important for applications requiring real-time responsive-
ness. Through such processes, IoT enables intelligent data-based decision-making,
adjusts responses to environmental changes, and prevents foreseeable issues.
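As a minimal sketch of edge processing (the names and thresholds below are hypothetical), raw sensor samples can be summarized locally so that only a compact record, or an alert, is forwarded to the cloud.

    import statistics

    # Minimal sketch of edge computing for IoT: raw sensor samples are handled at
    # the network edge, and only a compact summary or an alert is forwarded to the
    # cloud, reducing latency and upstream traffic.
    def process_at_edge(sensor_id, temperature_samples, alert_threshold=60.0):
        summary = {
            "sensor": sensor_id,
            "mean": round(statistics.mean(temperature_samples), 2),
            "max": max(temperature_samples),
            "alert": max(temperature_samples) >= alert_threshold,  # act immediately at the edge
        }
        return summary  # only this small record would be sent to the cloud

    print(process_at_edge("factory-line-3", [41.2, 42.0, 61.5, 43.1]))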
IoT applications are diverse, as the following cases exemplify. Smart home
systems can automate and remotely control lighting, heating, and security. Smart
city systems can optimize infrastructure and services considering traffic, energy effi-
ciency, and public safety. Manufacturing can optimize and automate production lines.
Agriculture can monitor crop conditions and adjust irrigation to increase production
efficiency. Healthcare can manage health conditions and respond to emergencies
by connecting various sensors on the body to medical institutions. IoT enhances
automation and efficiency, provides real-time situational awareness and responses,
manages resources, and saves energy, thereby promoting connectivity, automation,
efficiency, and innovation, and thus facilitating digital transformation in businesses,
public, and personal sectors.
3 The internet was originally built on the IPv4 protocol (Internet Protocol version 4) with a 32-bit
address field, which was nearly exhausted and anticipated to require a much larger address space
for future IoT implementation, leading to the standardization of IPv6 (Internet Protocol version 6)
with a 128-bit address field. A 32-bit address field can represent 2^32 (approximately 4.3 billion)
addresses, and a 128-bit field can represent 2^128 (approximately 3.4 × 10^38) addresses.
4 Bluetooth is a short-range wireless communication industry standard that supports data communication between nearby devices, typically over distances of up to about 10 m.
Looking more closely at smart cities within IoT applications, smart cities are
comprehensive systems that operate cities efficiently and sustainably to improve citi-
zens’ quality of life and provide convenience. Specifically, smart cities comprehen-
sively control and maintain traffic systems, energy management, waste management,
water resources management, ICT connectivity, medical services, public safety and
security, and environmental monitoring. IoT plays a central role in smart city systems
by connecting various sensors and management systems to efficiently manage assets
and resources. For example, smart traffic lights equipped with weather monitoring
sensors can adjust brightness based on weather conditions. Sensors collecting traffic
volume data can send this information to traffic management departments to prevent
congestion. Smart parking systems place sensors in each parking space to collect availability
data in advance, displaying available spaces at the entrance as vehicles enter. Sensors
placed throughout roads can immediately report accidents to traffic management
departments for automatic emergency response.
Wearable technologies that measure various health indicators such as heart rate,
step count, sleep quality, and body temperature by attaching sensors to the user’s
body are also an extension of IoT. These health wearables typically use specialized
sensors to measure physical and health states, such as glucose monitors for diabetes.
The measured data is transmitted to other devices, medical institutions, or cloud
computing via Bluetooth, WiFi, mobile communications, etc. Information security
is crucial as it involves sensitive personal health information. Health wearable tech-
nology is rapidly evolving to improve sensor accuracy, user interface, battery life,
and the aesthetics of sensor devices.
Cloud computing is a technology that allows users to access data storage and
computing power provided by remote data centers via the internet. Cloud services
enable users to save on costs by having the provider centrally install the commonly
needed storage and computing resources in the network, allowing users to access
these resources.
The advantages of using cloud computing are numerous. Since users receive
computing services such as memory storage, computational power, databases,
networking, and software from cloud service providers via the communication
network, there is no need for users to individually acquire such computing resources.
Instead of owning and managing physical hardware and infrastructure, individuals or
businesses can store and process data using the storage, computers, and application
software housed in the cloud servers of the service providers. Users can connect to
the provider’s cloud servers via the internet and receive cloud computing services
on-demand, choosing as much as they need. For example, users can rent only hard-
ware resources, or hardware plus operating software, and even application soft-
ware. Renting hardware resources is called IaaS (Infrastructure as a Service); adding
operating software on top of this is called PaaS (Platform as a Service); and renting
application software as well is called SaaS (Software as a Service).
(Figure: cloud computing resources such as memory, computing, networking, database, and software, provided by the cloud service provider and accessed by users via the internet)
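The conventional division of responsibility among these three service models can be summarized in a small sketch; the layer names reflect the common textbook view rather than any specific provider's terms.

    # Conventional (simplified) split of who manages each layer under the three
    # cloud service models; this encodes the common textbook view only.
    LAYERS = ["hardware/network", "operating system & runtime", "application software"]

    PROVIDER_MANAGES = {
        "IaaS": {"hardware/network"},
        "PaaS": {"hardware/network", "operating system & runtime"},
        "SaaS": {"hardware/network", "operating system & runtime", "application software"},
    }

    for model, managed in PROVIDER_MANAGES.items():
        user_side = [layer for layer in LAYERS if layer not in managed]
        print(f"{model}: provider runs {sorted(managed)}; user manages {user_side or 'almost nothing'}")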
In the digital environment, technologies that provide users with immersive experi-
ences include augmented reality (AR), virtual reality (VR), and the Metaverse.5 These
utilize advanced computing technologies and user interfaces, including 3D graphics,
real-time rendering,6 and artificial intelligence, to provide interactive experiences.
However, they differ significantly; for example, AR overlays digital information onto
the real world using smartphones or AR glasses, VR creates a completely immersive
virtual world by blocking out the real world, and the Metaverse creates a virtual
shared space that fuses the physically augmented reality of AR with the digital virtual reality of VR.
5 With the widespread use of virtual reality technology, concepts such as mixed reality and extended
reality have emerged. Mixed Reality (MR) mixes AR and VR, while Extended Reality (XR) encom-
passes AR, VR, and MR. For example, Meta’s Oculus Quest 3 headset and Apple’s VisionPro headset
are mixed reality devices.
6 Rendering, also known as image synthesis, is an essential process in computer graphics that uses computational models of a scene to generate the images displayed to the user.
Augmented reality (AR) overlays digital content or information over the real world. Smartphones and tablets capture the real envi-
ronment with camera lenses, and AR software overlays digital information on that
scene. AR glasses or headsets use special lenses to project digital content into the
user’s field of view, naturally overlaying it with the real world. Fifth, sensors collect
information about the user’s environment and interactions within it. Commonly used
sensors include motion sensors (e.g., accelerometers, gyroscopes), environmental
sensors (to detect depth, brightness, etc.), and input sensors (touchscreens, voice
recognition microphones, eye-tracking sensors, etc.).
Applications of AR range from mobile apps and shopping to navigation, gaming,
and education. Mobile apps can provide additional information or visual enhance-
ments when the device’s camera is pointed at physical objects or locations. AR allows
users to try on clothes or see how furniture looks in their home before purchasing. AR
navigation apps overlay road information on the real world seen through the camera,
making it easier to explore unfamiliar areas. AR games integrate virtual elements
into the user’s physical environment, allowing actions like capturing virtual objects
during gameplay. AR can enhance learning experiences through interactive visual-
izations and simulations, such as showing complex molecular structures or historical
artifacts in three dimensions.
Virtual reality (VR) is an immersive digital environment technology that makes users
feel and act as if they are physically present in a separate virtual world. Users typically
use headsets or goggles that completely cover their field of vision to enter and interact
within a digital world, with the physical environment being blocked out. Users can
interact with the virtual world using hand controllers or motion sensors, not just
immersing in the VR world but actively engaging with it, distinguishing VR from
simulation.
VR requires a VR headset, a high-performance PC or game console, motion
tracking sensors, controllers, and audio output. To create an immersive interactive
experience, VR needs several components: a three-dimensional immersive environ-
ment, interactivity, and audio-visual synchronization. The three-dimensional immer-
sive environment allows users to experience interactions very similar to those in
physical spaces. Interactivity enables real-time interaction with the virtual environ-
ment and its objects. Accurate synchronization of VR’s visual and auditory elements
can create a realistic environment with spatial sound and high-quality graphics. The
VR user interface must be intuitively and flexibly integrated into the virtual envi-
ronment. Real-time rendering is necessary for VR content to immediately respond
to user actions and movements. Adding vibration or physical sensations through
special gloves or controllers can enhance immersion by allowing users to feel the
virtual environment or objects.
VR is applied in various fields, including gaming, training simulations, education,
therapy and rehabilitation, architectural visualization, and others. VR games immerse
players in the game world, allowing natural interactions with game characters. VR
provides realistic training simulations in aviation, medical, and military fields without
actual risks. It offers immersive educational experiences, letting students explore
historical events, scientific concepts, and virtual field trips. VR can enhance therapy
by immersing patients in virtual environments for pain management or physical
rehabilitation. It allows architects to visualize building designs, enabling clients to
experience buildings and spaces before construction.
4.4.3 Metaverse
4.5 Digital Twin
A digital twin is a technology that creates a digital replica of a real-world object using
a computer and simulates situations that may occur in reality to predict outcomes and
find solutions to problems. Data and information that represent the structure, context,
and operation of physical systems in the real world are input into the digital twin
on the computer, and by running simulations, the past and present operational states
can be understood, and the future can be predicted. The digital twin is a powerful
digital entity that can be used to optimize the physical world, and by using digital
twins, the operational performance of real-world objects and business processes can
be significantly improved.7
The core technologies that make up digital twins include data analytics, the IoT,
simulation, AI, and cloud computing. Digital twins predict results and find solutions
by analyzing real-time data collected from their physical counterparts. To collect real-
time data, sensors and IoT devices are attached to the physical counterpart, providing
a continuous stream of data. Advanced simulation and modeling techniques replicate
the characteristics and behaviors of the physical counterpart in the digital domain.
Digital twins process data to identify patterns and anomalies and use AI algorithms
to predict behaviors. When necessary, digital twins can utilize cloud infrastructure
to secure storage, processing capacity, and accessibility for the digital counterpart.
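As a minimal sketch of these ideas (the thermal model and all parameter values are hypothetical), a digital twin can be as simple as a computational model that is synchronized with sensor readings and then run forward to predict future states.

    # Minimal digital-twin sketch: a simple first-order thermal model is kept in
    # sync with streamed sensor readings and is then simulated forward to predict
    # whether the machine will overheat.
    class ThermalTwin:
        def __init__(self, ambient=20.0, cooling_rate=0.1):
            self.temperature = ambient
            self.ambient = ambient
            self.cooling_rate = cooling_rate
            self.heat_input = 0.0

        def assimilate(self, measured_temp, heat_input):
            """Synchronize the twin with its physical counterpart."""
            self.temperature = measured_temp
            self.heat_input = heat_input

        def predict(self, minutes):
            """Simulate forward in time under the current operating conditions."""
            temp = self.temperature
            for _ in range(minutes):
                temp += self.heat_input - self.cooling_rate * (temp - self.ambient)
            return temp

    twin = ThermalTwin()
    twin.assimilate(measured_temp=72.0, heat_input=1.5)   # latest IoT sensor data
    print(round(twin.predict(minutes=30), 1))             # predicted temperature in 30 min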
Specifically, digital twins are built through the following steps: First, data related
to the physical entity to be replicated is collected from various sources, including
sensors, IoT devices, historical records, and measurements. Second, the collected data
is integrated to create a comprehensive dataset representing the behavior, characteris-
tics, and state of the physical entity. Third, a computational model that simulates the
behavior of the physical entity is developed.
7 The term “digital twin” has been used since the 1960s by NASA for remotely operating and
maintaining systems and was used by Michael Grieves in 2003 as a tool to optimize the entire
lifecycle management of products in industrial environments. Although the technology at the time
could not fully implement digital twins due to the need for extensive data storage and processing
devices, advancements in technology over the past 20 years have led to its widespread practical use
in various industries today.
Making use of the data from digital twins requires significant expertise in data analysis, modeling, and
simulation, necessitating digital twin technology experts and dedicated team training
and education.
With the advancement of internet and mobile technologies and the spread of
various digital platforms, humanity has formed a global hyperconnectivity network,
producing an unimaginable amount of data that was previously unthinkable. Billions
of people worldwide are connected through the internet and mobile communica-
tions, generating vast amounts of data. The emergence of social media platforms
like Facebook, Twitter, and Instagram has attracted users globally, who exchange
messages, post various contents, and upload media, creating massive amounts of
data daily. The widespread adoption of smart home devices, wearable devices, and
various industrial sensors among IoT devices also generates a considerable amount
of data. Online shopping platforms produce extensive data on consumer behavior,
preferences, and purchase history through e-commerce, and digital banking and
online transactions also produce significant amounts of data. The amount of data
distributed through streaming platforms like Netflix and YouTube is enormous, and
the data users generate, share, and distribute themselves is overwhelming. Research
in human genome projects and life sciences generates vast amounts of data, and the
digitization of health records significantly increases the data pool in the medical field.
Such unprecedented increase in data volume poses a unique challenge in the digital
transformation era, leading to the emergence of the term “big data” and the field of
“big data analytics”.
Big data refers to large volumes of data, but the real interest in big data lies in how
to collect, store, process, and analyze this vast amount of data to extract meaningful
patterns, correlations, trends, insights, and knowledge. It also involves utilizing this
analysis to gain a deeper understanding of various phenomena, make data-driven
decisions, and optimize business processes. Therefore, while commonly referred to
as big data, the core content is big data analytics. In other words, big data analytics
involves analyzing the essence hidden within vast amounts of data (big data). Big
data analytics is a multidisciplinary field requiring expertise in data science, statistics,
computer science, information technology, and related areas, developing alongside
the advancements in artificial intelligence, machine learning, and cloud computing.
Big data analytics involves various processing stages, including data prepro-
cessing, data integration, data storage, data analysis, visualization, and interpretation.
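A tiny Python sketch can illustrate the flow from preprocessing through analysis; the records and field names are hypothetical.

    from collections import defaultdict

    # Tiny sketch of a big-data analysis step: messy event records are cleaned,
    # integrated into one dataset, and aggregated to extract a simple pattern,
    # here viewing hours per content genre.
    raw_events = [
        {"user": "u1", "genre": "Drama",  "minutes": "120"},
        {"user": "u2", "genre": "drama",  "minutes": "95"},
        {"user": "u1", "genre": "Sci-Fi", "minutes": None},   # incomplete record
        {"user": "u3", "genre": "Sci-Fi", "minutes": "200"},
    ]

    cleaned = [
        {"user": e["user"], "genre": e["genre"].title(), "minutes": int(e["minutes"])}
        for e in raw_events
        if e["minutes"] is not None                            # preprocessing: drop bad rows
    ]

    hours_by_genre = defaultdict(float)
    for e in cleaned:                                          # analysis: aggregate by genre
        hours_by_genre[e["genre"]] += e["minutes"] / 60

    print(dict(hours_by_genre))                                # {'Drama': 3.58..., 'Sci-Fi': 3.33...}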
Big data analytics systems must be capable of adequately responding to various vari-
ables such as data volume, generation speed, diversity, reliability, variability, and
scalability across these multi-stage processing steps. First, the vast amount of gener-
ated data requires expandable and efficient infrastructure for storage, processing, and
analysis. Second, rapid data generation speeds necessitate corresponding real-time
data processing capabilities. Third, the diversity and inclusion of structured, semi-
structured, and unstructured data from various sources require complex integration
and analysis capabilities. Fourth, data from diverse sources may be incomplete,
inconsistent, or contain errors, necessitating the ability to handle such variability.
Advanced analytical capabilities are needed to extract reliable and valuable insights
even in such situations. Fifth, the infrastructure must be flexible enough to expand
processing capabilities as the load increases or data volume spikes.
Big data analytics systems must be equipped to handle the various data character-
istics mentioned above and legally manage personal information protection, security,
and ethical considerations. There is concern over privacy and security breaches during
the mass storage and processing of sensitive data. It is crucial to ensure that no ethical
issues or potential biases arise, especially concerning personal information handling
or general data analysis. Therefore, systems must be designed to strictly adhere to
data protection and ethical regulations.
Big data analytics involves several considerations. Data from various sources
lacks consistency and accuracy, requiring significant time and resources for data
cleansing and management. Storing and processing large volumes of big data requires
large storage systems and powerful computing capabilities, demanding substan-
tial investment. The demand for real-time data processing and analysis is growing,
but implementing this capability technically has its limits. Analyzing massive and
complex datasets to extract meaningful essence is a challenging task, requiring
multidisciplinary knowledge and comprehensive insight.
A successful example of utilizing big data analytics in business is Netflix. Netflix
uses big data analytics to analyze viewers’ behavior, preferences, and viewing
patterns, customizing content recommendations, producing or acquiring new content,
and making content easily discoverable for users. When expanding into global
markets, Netflix utilized big data analytics to understand regional preferences and
content consumption patterns, informing its content library and marketing strate-
gies in each region. Netflix’s data-driven decision-making, content recommendation,
content production approach, and global market expansion showcase how big data
analytics can lead to success in the entertainment industry.
4.6.2 Bioinformatics
Bioinformatics is the application of big data analytics in the fields of biology and
medicine. It is an interdisciplinary field that combines biology, computer science,
mathematics, and statistics to analyze and interpret biological data. As a key area
in modern biological research, bioinformatics uses computational techniques to
process and interpret large-scale biological data such as genome sequences.
First, deploying firewalls, IDS/IPS, and encryption technologies can protect networks from unau-
thorized access, data breaches, and cyber-attacks. Second, applying encryption and
access control, along with secure storage practices, can ensure the confidentiality,
integrity, and availability of sensitive data. Third, applying encryption and access
control, along with continuous monitoring, can protect data and applications serviced
in cloud environments. Fourth, utilizing firewalls, IDS/IPS, encryption, and access
control can protect IoT devices from hacking and unauthorized access that could
severely impact critical infrastructure and personal information. Fifth, implementing
encryption, secure authentication methods, and app security measures can ensure
the security of mobile devices and transmitted data. Sixth, identifying vulnerabil-
ities in software applications and applying appropriate patches can defend against
hackers’ attacks. In addition, utilizing security information management, machine
learning, and artificial intelligence can enhance cybersecurity capabilities and allow
for real-time detection and response.
For individual users, the first target for protection against cyberthreats is personal
computers (PCs), for which it is necessary to adhere to the following ten precautions.
First, use antivirus and anti-malware software and regularly update them, activating
real-time scanning to detect and block threats. Second, regularly update the oper-
ating system and all software to address security vulnerabilities, setting up automatic
updates for essential software and operating systems. Third, use strong passwords
that mix letters, numbers, and special characters, avoiding the use of the same pass-
word across multiple accounts. Consider using a password manager to securely store
and manage passwords.8 Fourth, use the operating system’s built-in firewall or third-
party firewalls to monitor and control incoming and outgoing network traffic. Fifth,
be cautious of emails from unknown sources, especially those requesting personal
information or prompting to click on links, and familiarize yourself with various
phishing scam techniques. Sixth, use WPA3 or at least WPA2 encryption for WiFi
networks and set strong passwords for WiFi networks.9 Seventh, enable two-factor
authentication (2FA) for online accounts related to sensitive services such as email,
banking, and social media.10 Eighth, regularly back up important data to external
8 A password manager is a software application designed to store and organize passwords. It typi-
cally encrypts the password database with a master password, and later on, the user only needs to
remember the master password. Since a password manager stores passwords in an encrypted form,
it is less susceptible to hacking and theft.
9 Wi-Fi Protected Access 2 (WPA2) is an enhancement of the original WPA standard made in 2004
and has become the de facto security standard for Wi-Fi networks. It is widely used in both personal
and enterprise Wi-Fi networks. Wi-Fi Protected Access 3 (WPA3) is the latest version of the Wi-Fi
security protocol released in 2018, offering stronger security features than WPA2, though it is still
in the process of adoption.
10 Two-factor authentication (2FA) is a security method in which a user provides two different
authentication factors to verify themselves. This adds an additional layer of security (e.g., text
message, biometric factors) to the traditional single-factor authentication method, where the user
only provides one factor (typically a password), making it more difficult for attackers to gain access.
For example, the user enters a username and password as usual, followed by providing a second
authentication factor, such as entering a code sent to user’s phone or scanning user’s fingerprint.
drives or cloud storage, setting up automatic periodic backups. Ninth, avoid down-
loading software or opening attachments from untrusted sources, and if necessary,
pre-scan downloaded files and email attachments with antivirus software. Tenth,
lock your computer when not in use and set passwords for sensitive data, restricting
physical access to the computer to trusted individuals only.
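To make the second authentication factor described in footnote 10 concrete, the following minimal sketch generates a time-based one-time code in the style of RFC 6238, the kind of six-digit code produced by an authenticator app; the shared secret here is a dummy value.

    import base64, hashlib, hmac, struct, time

    # Minimal sketch of a time-based one-time password (TOTP, in the style of
    # RFC 6238): the same shared secret on the phone and the server yields the
    # same short-lived 6-digit code, used as the second factor.
    def totp(base32_secret, at_time=None, step=30, digits=6):
        key = base64.b32decode(base32_secret)
        counter = int((at_time if at_time is not None else time.time()) // step)
        msg = struct.pack(">Q", counter)                  # 8-byte big-endian time counter
        digest = hmac.new(key, msg, hashlib.sha1).digest()
        offset = digest[-1] & 0x0F                        # dynamic truncation
        code = (struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF) % 10 ** digits
        return str(code).zfill(digits)

    print(totp("JBSWY3DPEHPK3PXP"))   # dummy secret; both sides compute the same code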
4.8.1 Robots
Robots, in their physical form, are machines or devices that mimic human actions
or perform automated tasks. Comprising sensors, actuators, and computing systems,
robots interact with the environment and perform tasks based on pre-programmed
instructions. Capable of independently performing various tasks as programmed,
robots can recognize their surroundings through sensors and move by rolling on
wheels, walking on legs, flying, or swimming, and manipulate objects using arms,
hands, or specialized tools. Robots are utilized in various fields and tasks, such as
assembly and processing in manufacturing and surgical assistance in healthcare.
Robotics, the field that deals with the development, design, manufacture, operation,
and use of robots, covers both hardware and software development of robots, as well
as their interaction with environments and applications in various fields.
Robots include not only stationary industrial robots but also mobile robots, intel-
ligent mobile robots, and humanoid robots. Mobile robots, designed to move around
in their environment using wheels, tracks, legs, or by flying like drones, are used for
exploration, surveillance, delivery, and search and rescue. Intelligent mobile robots,
equipped with high-performance sensing, perception, and decision-making capabili-
ties, can process data from their surroundings, make decisions, and adapt to changing
situations, with autonomous vehicles and drones being prime examples.
Robots are categorized into industrial robots, service robots, exploration robots,
drones, etc., based on their applications. Industrial robots are installed in manu-
facturing settings to perform tasks like welding, painting, assembly, and handling
objects. Service robots perform tasks such as vacuum cleaning, lawn mowing, and other
assistance tasks in homes and service settings.
Robotic process automation (RPA) uses software robots (i.e., bots) to automate busi-
ness processes, aiming to replace or assist human work with rule-based automation.
RPA plays a crucial role in enhancing process efficiency and reducing errors through
rule-based and repetitive task automation. RPA performs tasks following predefined
rules and automates work based on pre-set logic, allowing bots to handle tasks without
human intervention. By delegating routine and repetitive tasks to bots, organizations
can automate and optimize business processes, freeing humans for more critical tasks
and thus contributing to organizational competitiveness.
RPA offers benefits such as time savings, error reduction, system integration, and
process improvement. It can process tasks quickly and consistently, reducing the like-
lihood of human error and increasing work efficiency. RPA automates data and work-
flows between different systems and applications, facilitating system integration.
Thus, RPA can improve work efficiency and reduce costs through process automa-
tion. RPA is applied in various areas, including financial operations, accounting tasks,
data entry, customer support, human resources management, and maintenance work.
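As a minimal sketch of such rule-based automation (the fields, rules, and threshold are hypothetical), an RPA bot can be thought of as a small program that routes each record according to predefined rules and escalates only the exceptions.

    # Minimal RPA-style sketch: a software bot applies predefined rules to each
    # incoming invoice record and routes it without human intervention,
    # escalating only the exceptions to a person.
    MAX_AUTO_APPROVE = 1_000.00

    def process_invoice(invoice):
        if not invoice.get("vendor") or invoice.get("amount") is None:
            return "escalate: missing data"              # exception -> human review
        if invoice["amount"] <= MAX_AUTO_APPROVE:
            return "auto-approved and posted"            # routine case handled by the bot
        return "routed to manager for approval"

    invoices = [
        {"vendor": "Acme", "amount": 420.00},
        {"vendor": "Acme", "amount": 12_500.00},
        {"vendor": "", "amount": 99.00},
    ]
    for inv in invoices:
        print(process_invoice(inv))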
11 Recently, humanoid robot technology has made leaps with the incorporation of AI. Tesla unveiled
the 173 cm tall, 73 kg humanoid robot Optimus in 2021 and its second generation, Bumblebee,
in December 2023, equipped with AI. Various AI robots were displayed at CES in January 2024.
Goldman Sachs estimates the humanoid robot market to reach $154 billion by 2035, predicting that
humanoids will fill labor shortages in manufacturing and service industries.
Autonomous drones are unmanned aerial vehicles that use various sensors, cameras,
GPS, and AI-based systems to detect the environment and fly independently along
pre-programmed paths. Sensors and cameras identify and detect surrounding objects
and obstacles, while GPS and navigation systems determine the drone’s location and
enable it to follow a set route. The onboard computer controls the drone’s flight path,
speed, and altitude, and AI and machine learning technologies allow it to adapt to
unexpected situations.
Significant technological advancements have been made in autonomous drone
technology due to competitive research and development by various companies.
12 The Russia-Ukraine war that erupted in 2022 has significantly accelerated the militarization of
drones. In this conflict, drones have been utilized for surveillance, target acquisition, and direct
attacks, among other military purposes. Due to their capability for remote operation, drones offer
tactical advantages such as risk-free intelligence gathering and precision strikes. With the ongoing
war serving as a turning point, drones are expected to be extensively used in future military strategies.
its launch point, but still requires human oversight. Level 3 autonomy is Conditional
Autonomy, where the drone can fly autonomously and make decisions in controlled
environments (e.g., predefined airspace), but may still require human intervention
in complex situations. Level 4 autonomy is High Autonomy, where the drone can
operate without human intervention in specific, controlled environments. It is capable
of navigating obstacles, making decisions based on sensor data, and adjusting to envi-
ronmental changes. Level 5 autonomy is Full Autonomy, where the drone is capable
of operating entirely autonomously in all environments and conditions, handling
complex tasks such as navigation, obstacle avoidance, and decision-making without
human oversight. Compared to autonomous vehicles, autonomous drones face unique
challenges, including airspace regulations, dynamic weather conditions, and the need
for advanced sense-and-avoid systems to ensure safe flight.
Humans inherently seek freedom and autonomy and are wary of centralized control.
This aspect has manifested technologically through decentralized and distributed
technology. While centralization offers the benefits of efficiency and order, there has
been a movement toward decentralization and distribution to secure freedom and
autonomy, even at the expense of those benefits. This trend has led from centrally
controlled infrastructure networks to decentralized ad-hoc networks,13 from central-
ized financial systems to the invention of decentralized blockchain cryptocurren-
cies, and from the corporate-controlled Web 2.0 to the emerging decentralized
Web 3.0. This evolution toward decentralized and distributed networks is, in essence, a technological expression of human nature’s orientation toward autonomy, transparency, and cooperative community.
4.9.1 Blockchain
13A wireless computer network built on the foundation of physical communication infrastructure is
called an infrastructure network, while a wireless network where computers communicate directly
with each other autonomously, without the help of communication infrastructure, is called an ad
hoc network.
ledger centrally, every participant maintains and manages an identical ledger. Trans-
actions are recorded in blocks, and these blocks are chained together in chronological
order of transactions, hence the name “blockchain.”
The most significant feature of blockchain is its decentralization. Unlike tradi-
tional centralized systems, blockchain operates on a distributed network of computers
(nodes). Since each node holds a copy of the entire blockchain, transparency is guar-
anteed to all users, and the system remains operational even if some nodes encounter
problems. Before adding transaction records to the ledger, all nodes must agree on the
transaction’s validity through a consensus mechanism, which is a critically impor-
tant process in blockchain. The most typical consensus mechanism, “Proof of Work” (PoW), requires nodes to expend computational effort to add a block and adopts the longest chain of blocks as the valid one, making it difficult for malicious hackers to interfere.14 Moreover, since a blockchain is composed of a chain of blocks, with each new block grouping transactions and carrying a cryptographic hash of the previous block, it becomes extremely difficult to alter the information within a block once it has been added to the blockchain, which ensures the security and integrity of recorded data. The combi-
nation of a distributed network and encryption technology provides high security,
and because the ledger is held by all participants, transactions are transparent.
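The hash-chaining and proof-of-work ideas can be sketched in a few lines of Python; the toy difficulty of four leading zeros and the simplified block contents below are assumptions for illustration only, not how production blockchains are parameterized.

    import hashlib, json

    def mine(block, difficulty=4):
        # Proof of Work: search for a nonce whose block hash starts with '0' * difficulty.
        prefix = "0" * difficulty
        nonce = 0
        while True:
            payload = json.dumps({**block, "nonce": nonce}, sort_keys=True).encode()
            digest = hashlib.sha256(payload).hexdigest()
            if digest.startswith(prefix):
                return nonce, digest
            nonce += 1

    genesis = {"index": 0, "prev_hash": "0" * 64, "transactions": ["alice->bob:1"]}
    nonce, genesis_hash = mine(genesis)
    # Each new block stores the previous block's hash, chaining the blocks together,
    # so altering an earlier block would invalidate every block that follows it.
    block1 = {"index": 1, "prev_hash": genesis_hash, "transactions": ["bob->carol:0.5"]}
    print(mine(block1))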
Blockchain technology has the potential to be utilized across various industries
due to its ability to record and verify transactions in a secure, transparent, and efficient
manner. The most well-known application is cryptocurrency. Blockchain serves as
the technical foundation for various cryptocurrencies, including Bitcoin. It can also
enhance supply chain transparency and authenticity, simplify international payments,
improve financial transaction transparency, and streamline settlements. Blockchain
is used in smart contracts, which are automatically executed and enforced when
predefined conditions are met, and in identity management, providing secure and
decentralized control to reduce the risk of identity theft.
4.9.2 Cryptocurrency
Proof of Work (PoW) consensus mechanism to validate transactions and are rewarded with Bitcoins, a process known as mining. The amount of Bitcoin awarded for mining decreases by half approximately every four years, an event known as Bitcoin halving.
Bitcoin differs from fiat currency in several aspects. While opening a bank account
may have stringent requirements, account freezing, limited transaction hours, fees,
and restricted transaction countries, cryptocurrency transactions are free from these
constraints, allowing for anonymous transactions without personal information expo-
sure. Unlike fiat currencies, which can be affected by financial crises, inflation, panic,
or deflation due to indiscriminate issuance, cryptocurrencies are not subject to these
issues. However, Bitcoin lacks the physical asset basis for price determination, like
gold, which leads to price instability and unpredictability, making it unsuitable for
everyday transactions.
Bitcoin’s open-source nature allows for the creation of other cryptocurrencies,
known as alternative coins or altcoins, numbering in the thousands, with Ethereum
being a prominent example. Ethereum’s cryptocurrency, ether (ETH), uses the Proof of Stake (PoS) consensus mechanism in the network’s latest version, Ethereum 2.0, to validate transactions. PoS rewards validators in proportion to the ether they stake, encouraging continuous computer operation for validation. However, PoS’s drawback is that validators with more ether have greater validation opportunities, leading to the proposal of alternative mechanisms to address this issue.
Recently, China has been actively promoting Central Bank Digital Currency
(CBDC), which, although utilizing blockchain technology, differs from public
blockchains used in cryptocurrencies by allowing centralized control. CBDC aims
to ensure compliance with financial transaction regulations, monitoring, and anti-
money laundering by central banks. While offering some level of privacy like tradi-
tional banking, CBDC does not provide the anonymity and decentralization of cryp-
tocurrencies, potentially becoming a tool for monitoring citizens’ economic activities
if misused. Essentially, CBDC is not a typical cryptocurrency but an extension of
national currency into the digital realm, partially utilizing blockchain technology.
The World Wide Web (WWW), initially known as Web 1.0, evolved into what we
currently use as Web 2.0. The next generation of the web, proposed as a decentralized
alternative to Web 2.0, is Web 3.0.
Web 1.0 was a “static web,” mainly composed of static web pages without dynamic
content or user-generated content. It was a one-way, read-only web where users
mainly consumed information provided by websites. The era of Web 1.0 was domi-
nated by text-based content, based on HTML (HyperText Markup Language) tech-
nology,15 limiting the amount and type of information and restricting the role of users
15 HTML is a markup language developed for displaying web pages. A markup language is a type
of language that uses tags and other elements to specify the structure of documents or data.
to content consumers. Microsoft, a pioneer of the Web 1.0 era, significantly benefited
by bundling Internet browsers with its Windows operating system, revolutionizing
access to information.
Web 2.0, or the “social web,” transitioned from static Web 1.0 to a platform
featuring dynamic content and interactive user engagement. Characterized by bidi-
rectional interconnectivity and adopting new technologies like XML (eXtensible
Markup Language) and HTTP (HyperText Transfer Protocol),16 it opened the door
for users to actively participate in content production, sharing, and communication.
This transformation allowed for the continuous sharing and reproduction of content,
turning the web into a dynamic space. The advent of Web 2.0 platforms, utilizing
user data to attract advertisers, led to a platform-centric ecosystem. Companies like
Apple, Google, Amazon, and Meta (Facebook) quickly grew into platform giants by
securing a large volume of data ahead of others.
The emergence of Web 2.0 enabled a digital world where users not only consumed
information but also created and provided it. User-generated content became main-
stream, with social media platforms like YouTube, Facebook, and Instagram oper-
ating on content produced by users rather than the companies themselves, generating
substantial revenue by connecting advertisers with this content. However, users who
produced the data ceded its ownership to the platform companies, not only missing out on revenue sharing but also losing control over their own data. Moreover, these
companies’ centralization of data raised issues with cybersecurity, privacy, and ethics.
Web 3.0 is presented as an alternative to address these issues.
Web 3.0 is a “distributed web” that aims for a decentralized, distributed network
and emphasizes user ownership and control of data. To achieve decentralization, it
adopts blockchain technology, uses smart contracts that enable transactions without
the need for trust, and applies AI technology for data processing and analysis. By
choosing a distributed approach, user data is stored across network nodes, or users’
computers, instead of being stored on the servers of platform companies. Since
blockchain technology is adopted as the method for implementing decentralization,
all the advantages of blockchain discussed earlier are carried over to Web 3.0. The use
of blockchain makes the web environment more transparent and secure, solving issues
related to privacy and targeted advertising. In Web 3.0, data ownership shifts from
corporations to individuals, and a reward system for data usage can be established.
In other words, users can be compensated for the efforts involved in generating
information.
Since Web 3.0 uses blockchain, it enables secure, transparent, and tamper-proof
record keeping and transactions, and allows for the use of decentralized finance
(DeFi) and smart contracts. DeFi, unlike traditional financial services, does not
require identity verification processes like certificates, and as long as there is an
internet connection, users can access a variety of financial services such as deposits,
16 XML is a markup language designed to facilitate the easy exchange of data between different
types of systems connected to the Internet. HTTP is a request/response protocol for exchanging
messages between clients and servers. The data transmitted via HTTP can be accessed through
internet addresses that start with http; known as URLs (Uniform Resource Locators).
including a limited selection of printing materials, slower speeds for large objects,
and the need for post-processing.
A unique advantage of 3D printing is remote or distributed manufacturing. Designs
can be digitally transmitted, allowing identical objects to be produced remotely. This
eliminates the need for physical transportation of prototypes or products, saving
on shipping costs and time. It also allows for design customization and production
volume adjustment according to local demands.
3D printing has wide applications across various industries. In healthcare, it
can produce patient-specific implants, prosthetics, and hearing aids. The aerospace
industry uses it to create lightweight, fuel-efficient components for aircraft and space-
craft. Automotive manufacturers use it for prototyping, custom parts, and even entire
car models. Architects use large 3D printers for complex models and construction
components, while fashion designers and artists use it for unique accessories, jewelry,
clothing, and sculptures. 3D printing also supports STEM education, manufacturing
process training, production of limited-edition consumer goods, customized cake
decorations, tools, film props, and more.
4D printing adds the dimension of time to 3D printing, creating objects that can
change shape or function in response to environmental stimuli such as temperature,
humidity, light, or other factors. Unlike 3D printing’s static shapes, 4D printed objects
can adapt and transform, offering potential for innovation in engineering, materials
science, medicine, architecture, and more.
Both 3D and 4D printing technologies have transformed manufacturing, design,
and other fields, yet face technical challenges and limitations. The range of printable
materials is restricted, and some lack the strength or durability of traditional materials.
The resolution of 3D printing can affect product quality and precision, while finding
materials that change shape for 4D printing is challenging. The cost of printers and
materials can be high, especially for industrial machines, and 3D printing can be
slow and unsuitable for mass production. Operating and designing with 3D and
4D printers require specialized skills and training, and ethical considerations arise
with applications such as printing human tissues or organs. Future developments
are needed to overcome these limitations, improve printer speed and resolution, and
discover new materials and printing technologies.
than 1,000 qubits could surpass supercomputers for certain types of problems.
Around the same time, Time magazine selected the exa-FLOPS (10^18 floating-point operations per second) supercomputer ‘Frontier’, built by Hewlett Packard Enterprise, as one of the greatest inventions of 2023.
Comparing ‘Condor’ and ‘Frontier’ as of 2023 in terms of performance and other
aspects reveals key distinctions. Both excel in high-speed computing, with ‘Condor’
leveraging its qubits for quantum-specific tasks and ‘Frontier’ using traditional binary
computing with massive parallel processing to achieve speed. However, ‘Condor’s’
computational capabilities are still highly experimental, with issues such as stability
and error correction yet to be fully resolved. On the other hand, ‘Frontier’ represents
the peak of classical computing power, delivering stabilized, reliable performance
across various scientific, technological, and industrial applications.
In terms of cost, ‘Condor’ demands cutting-edge technology and materials for
quantum processing, error correction, and sophisticated cooling systems to main-
tain qubit stability at temperatures near absolute zero. These requirements make its
development and operation highly expensive and complex. While ‘Frontier’ is also
costly due to its exascale performance, large size, and cooling needs, it operates with
well-established technology and is more easily maintained.
Regarding physical space, while quantum computers like ‘Condor’ may require
less room for the core computing unit, the infrastructure for cooling, error correction,
and maintaining the controlled environment needed for quantum systems signifi-
cantly increases the space needed. In contrast, ‘Frontier’ occupies an area equivalent
to two basketball courts to house the supercomputer, storage systems, and auxiliary
devices.
As for applications, the general comparison between quantum computers and
classical computers applies here. ‘Condor’ has the potential to excel in solving
complex optimization problems, simulating quantum systems, tackling specific types
of cryptography, and other tasks that are difficult for classical computers. Meanwhile,
‘Frontier’ excels in large-scale simulations, weather forecasting, and physics and
astronomy research that demand massive classical computing power.
In summary, while ‘Condor’ represents the future potential of quantum computing,
it is still in the experimental phase, particularly in terms of stability and scala-
bility. ‘Frontier’, however, is a pinnacle of classical supercomputing and is already
being applied in a wide range of fields. Quantum computers like ‘Condor’ are
unlikely to replace classical supercomputers like ‘Frontier’; rather, they will serve as
complementary technologies, each excelling in different domains (see Table 4.1).
Artificial intelligence (AI) refers to machines designed and trained to think, learn,
reason, solve problems, and make decisions like humans. Machine learning is the
process by which AI learns. AI is generally categorized into two types: ‘Narrow
AI’ (NAI or ANI), designed and trained for specific tasks like personal assistants or
image recognition technologies, and ‘General AI’ (GAI or AGI), which possesses
intelligence similar to human intelligence and is the goal AI research aims to achieve.
Machine learning improves algorithm performance using data. It enables
computers to enhance their intelligence through self-learning, involving training
and evaluation. Training allows computers to discover features in similar datasets
independently, while evaluation uses different datasets for assessment. Feedback
from evaluation results, combined with repeated learning, enhances intelligence.
This learning method is known as supervised learning. Unsupervised learning, in
contrast, involves giving datasets to the computer to discover patterns, structures,
and relationships on its own. Besides supervised and unsupervised learning, there is
also reinforcement learning, where learning occurs through trial and error by interacting
with the environment, taking actions based on the current state, and adjusting actions
based on feedback from the environment.
Implementing AI involves various technologies, including machine learning,
neural networks, natural language processing, computer vision, and robotics. Neural
networks, inspired by the human brain’s neural structure, can perform deep learning
when layered multiple times, used for language and image recognition. Natural
language processing enables machines to understand, interpret, and converse in
human language, used in chatbots, language translation, and sentiment analysis.
Computer vision allows machines to interpret and understand visual information,
recognizing objects, scenes, and faces in images and videos. Robotics combines
machine learning and mechanical engineering to create intelligent machines that can
interact with their environment.
The development of AI and machine learning has seen several groundbreaking
inventions. The concept of deep neural networks emerged in the 1980s, followed by
the backpropagation algorithm and recurrent neural networks (RNN), kickstarting
the development of deep neural networks. Convolutional neural networks (CNNs)
advanced computer vision. The 2010s saw the proposal of generative models, opening
new possibilities for computer vision, natural language processing, and data anal-
ysis. The introduction of the transformer architecture with self-attention mechanisms
brought significant advancements to deep learning, revolutionizing natural language
processing, speech processing, and image recognition. (For explanations and details
on CNN, RNN, DL, etc., see Sect. 5.5.)
A significant milestone in AI’s development occurred in November 2022, with
the release of ChatGPT-3.5 for general testing. ChatGPT is an application model of
the Generative Pre-trained Transformer (GPT) designed by OpenAI for conversa-
tional purposes. GPT is a generative natural language processing model with a trans-
former architecture, trained on vast amounts of text data from the internet, capable
of conversing in human-like text. OpenAI later released GPT-4 to subscribers and subsequently GPT-4 Turbo and the o1 reasoning model.
ChatGPT has had several positive impacts, such as demonstrating that large
language models can generate human-like text and engage in conversations. It
helps users by answering questions, providing explanations, and supporting creative
writing and content generation, thus enhancing productivity. In addition, it has
promoted research and development in the fields of natural language processing
and artificial intelligence and provided educational tools that enable users to learn
new topics, concepts, and languages.
On the negative side, ethical concerns have been raised about the potential errors,
biases, rudeness, and discriminatory behavior of AI-generated content. There are
also concerns about the spread of misinformation and disinformation, as well as the
potential for information manipulation and deception. Furthermore, privacy and data
security issues arise due to the possibility of sharing personal or sensitive information.
In addition, it has been suggested that jobs requiring communication and interaction
with customers could be reduced, and an over-reliance on AI could lead to a decline in
critical thinking and independent decision-making skills. (For detailed information
on artificial intelligence, see Chap. 5.)
Chapter 5
Artificial Intelligence
The process by which humans understand questions and generate answers involves
multiple brain regions and complex cognitive functions. As revealed by cognitive
science, this process unfolds as follows:
First, the cognitive process begins when a question is either read or heard. Visual
or auditory information is received by the sensory organs, such as the eyes or ears.
At this stage, an attention mechanism activates to focus on the relevant question,
filtering out irrelevant sensory information.
Second, the perceived question is processed in the language centers of the brain,
primarily located in the left hemisphere. These centers include Broca’s area, which
is involved in speech production, and Wernicke’s area, which is responsible for
language comprehension. During this phase, the grammatical structure of the ques-
tion is analyzed, and the meaning of words is interpreted in the context of the
conversation or text.
Third, once the question is understood, the brain begins retrieving relevant infor-
mation from memory. This process involves the hippocampus, which is key for
long-term memory, and the prefrontal cortex, which aids in memory retrieval. The
brain searches through stored knowledge and experiences to find information that
can be used to construct an appropriate answer.
Fourth, the prefrontal cortex integrates the retrieved information with the context
of the question. It applies logic and reasoning to determine the most suitable response.
General knowledge, personal experience, and contextual clues are used to infer any
implied meaning or depth of explanation required.
Fifth, after deciding on the content of the answer, the brain plans how to express it.
The prefrontal cortex is involved in selecting appropriate words and structuring them
into coherent sentences. Broca’s area manages the production of speech or written
language.
Sixth, the motor cortex initiates the physical process of delivering the response. It
activates the muscles needed for speaking or writing. As the response is expressed,
auditory or visual feedback systems monitor its accuracy. If any discrepancies are
detected (such as a misspoken word), the brain adjusts the response in real-time.
AI, like humans, can understand questions and generate answers, but its operation
is fundamentally different. AI processes information through complex calculations,
rather than through biological cognition. We can explore this by examining a widely
used transformer-based AI model, GPT.
First, when GPT receives a question, it tokenizes the input text into smaller units
(tokens) and converts these tokens into numerical vectors through an embedding
process. These tokens, representing parts or whole words, start with arbitrary vector
values that adapt over time as GPT learns to associate them with semantic meanings.
Second, GPT uses its transformer architecture and self-attention mechanism to
analyze the relationships and relative importance of these tokens. This allows it to
deeply understand the structure, grammar, and context of the question, determining
its theme and intent. The embedded data is processed through several layers of
self-attention, refining its understanding based on patterns learned during training.
Third, unlike humans, GPT does not search an internal knowledge base or memory
for answers. Instead, it relies on patterns learned during pre-training on a vast corpus
of data. This extensive training allows GPT to generate responses based on the context
provided in the input, rather than retrieving information as humans do from memory.
Fourth, GPT synthesizes an answer by analyzing the input tokens in context.
Although its reasoning abilities are still limited, it can perform some level of inference
through its interaction with the trained weight parameters and input tokens. When
examples are provided in the prompt or when guided by a “chain of thought,” GPT’s
ability to infer and reason can improve.1
Fifth, GPT generates an answer in natural language. During this process, it predicts
the next word (token) based on the context of the question and the tokens it has already
generated. This is done through complex calculations that rely on its extensive pre-
trained data. The sequence of generated tokens is then converted into a coherent text
and presented as the final answer.
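A minimal sketch of this token-by-token generation loop, assuming the publicly available GPT-2 model and the Hugging Face transformers library rather than the larger GPT versions discussed above, might look as follows.

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Tokenize the question, then let the model predict the answer one token at a time.
    inputs = tokenizer("What is a blockchain?", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_k=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Because sampling is probabilistic, running the same prompt twice generally produces different answers, which mirrors the behavior described above for repeated questions.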
Comparing human and AI cognitive processes (as illustrated in Fig. 5.1a and
b), we see that while their functions may appear similar, they are fundamentally
different. AI lacks the ability to retrieve specific information from memory; it can
only simulate knowledge retrieval based on its pre-trained data. Furthermore, its
inference and reasoning capabilities are still underdeveloped, which is indicated
by the dashed lines around the ‘Pre-trained Knowledge’ and ‘Inference, Synthesis’
blocks in Fig. 5.1b.
It is important to note that GPT’s ability to understand and generate consistent
text comes from being trained as a language model. By pre-training on vast amounts
of digital text, it learns patterns of language, grammar, facts, and reasoning, which
enables it to generate responses that are relevant, grammatically correct, and logically
coherent.
5.1.3 Understanding
How does AI “understand” questions, and what does “understanding” mean for AI?
Unlike humans, who use conscious thought, existing knowledge, and reasoning
to reflect on the meaning, context, and implications of a question, GPT—a model of
AI—relies on pattern recognition, statistical correlations, and predictive modeling
to simulate understanding.
For GPT, “understanding” begins with interpreting the input text. It uses a self-
attention mechanism within a pre-trained transformer architecture to identify patterns
and correlations in the input. As it passes through multiple layers of the self-attention
network, GPT refines its interpretation by detecting patterns and relationships in the
data. This process allows it to grasp the context, nuances, and intended meaning of
the question.
Once GPT has processed the content and context of the question, it generates a
response based on its training. It selects words and constructs sentences by drawing
from learned patterns of how similar questions have been answered in the past.
However, GPT does not retain any memory of the interaction. Once a response is
1 Two representative methods for enhancing AI’s inference capability using prompts are in-context
learning (ICL) and chain-of-thought (CoT). ICL is a method where the AI model analyzes and
learns from the context within the input data itself, without requiring additional training. It enables
the model to make predictions by understanding the given context. CoT, on the other hand, is a
technique that guides the AI model to solve problems step-by-step in a sequential manner, allowing
it to perform multi-step inference and logical progression more effectively.
generated, the task is complete, and if the same question is repeated, GPT will go
through the entire process again, potentially producing a different response each time
due to the probabilistic nature of its predictions.
Thus, for AI, “understanding” refers to the ability to interpret input information
and generate a relevant response. Unlike humans, who store information and can
consciously reflect on their responses, GPT does not retain or remember the content
it processes or the responses it generates. Its understanding is momentary, driven by
patterns in data rather than conscious reasoning or long-term memory.
5.1.4 Memory
The three elements of algorithms, machine learning, and neural networks have
evolved over a long period of approximately 70 years, overcoming various chal-
lenges through technical breakthroughs and setting milestones in the development
of AI with several challenging events. Technologically, the emergence of recurrent
neural networks (RNNs) and convolutional neural networks (CNNs) enabled effec-
tive processing of sequential data and images, respectively, while variational autoen-
coders (VAEs) and generative adversarial networks (GANs) opened the doors to
generative models, and the advent of the transformer architecture with self-attention
mechanism paved the way for parallel processing. Events such as the introduction
of IBM Deep Blue, IBM Watson, and Google AlphaGo have marked milestones in
the history of AI development.
Although the concept of artificial intelligence (AI) existed as early as the 1940s,
the term AI was first used at a conference hosted by John McCarthy and others at
Dartmouth College in 1956, where AI was defined as the science and engineering
of making intelligent machines. The concept and fundamental theories of neural
networks existed in the 1940s, referring to computing systems inspired by the struc-
ture and function of the human brain, particularly neurons and their interconnections.
In 1943, Warren McCulloch and Walter Pitts proposed a model of artificial neurons
with binary outputs. In 1949, Donald Hebb introduced the concept of learning through
strengthened neural connections. Building on these ideas, in 1957, Frank Rosen-
blatt introduced the perceptron, a model designed for pattern recognition inspired
by biological neurons.
In 1956, the first AI program, Logic Theorist, was developed, capable of solving
puzzles using symbolic logic. A decade later, in 1966, the early natural language processing computer program ELIZA was developed, demonstrating the possibility, albeit primitive, of machines understanding and responding to human language. In the 1970s,
Expert Systems capable of mimicking the decision-making of human experts in
specific fields were developed and widely disseminated.
In 1997, IBM’s computer Deep Blue defeated world chess champion Garry
Kasparov, proving that machines could perform complex calculations and strate-
gies. Between the 1990s and the 2000s, neural network research saw a revival, and
machine learning algorithms capable of learning and making decisions based on
data improved AI capabilities. In 2011, IBM Watson demonstrated its ability to
understand natural language and answer questions on the game show “Jeopardy!”
by defeating human champions. In 2016, Google DeepMind’s AlphaGo defeated the
world’s leading Go player, Lee Sedol, proving AI’s ability to tackle complex board
games through deep reinforcement learning.
The recurrent neural network (RNN) concept was introduced in the 1980s, but practical application required further devel-
opments, culminating in the creation of long short-term memory (LSTM) networks in
1997. LSTMs addressed the problem of long-term dependencies in data sequences,
significantly advancing natural language processing and speech recognition with
RNNs.
Convolutional neural networks (CNNs), introduced in the 1980s, employ a method
of filtering image pixels with a weight matrix to extract features. Practical application
became feasible only after improvements in computer performance and data capacity
allowed for larger and more complex neural networks. The emergence of AlexNet
in 2012 marked a significant development in CNNs, leading to breakthroughs in the
field of computer vision and revolutionary advancements in deep learning.
Deep neural networks (DNNs) are neural networks with depth, meaning a large
number of layers. Neural networks emerged in the 1980s but were not widely used or practical at that time. The neural networks initially researched were shallow,
with only a few layers of depth. Neural networks became practically viable with the
application of the backpropagation algorithm in 1986, which subsequently became
a core technology for DNNs. However, it was not until 2006, when fast and efficient
training methods were introduced, that DNNs, including RNNs and CNNs, became
widely applicable.
The concept of graph neural networks (GNNs) was first introduced in 2009,
but practical application occurred after 2017, following sufficient advancements in
computational capabilities, the availability of large datasets, and several improve-
ments to GNN structures, leading to sophisticated graph neural network models.
Unlike traditional neural networks that dealt with 1D sequences or 2D images, GNNs
process data represented in graph form, emerging as a new tool for analyzing inher-
ently graph-structured data such as social networks, communication networks, and
molecular structures.
Machine learning has evolved into supervised learning, unsupervised learning,
deep learning, and reinforcement learning. Supervised learning involves learning
from labeled data, while unsupervised learning occurs without labeled data. Deep
learning uses neural networks with many layers, and reinforcement learning involves
learning through interaction with an environment by taking actions and receiving
feedback. In 2016, deep reinforcement learning emerged as Google DeepMind devel-
oped AlphaGo by intricately combining deep learning with reinforcement learning.
Deep reinforcement learning has enabled machines to learn and adapt in complex
and uncertain environments like games or autonomous driving, becoming a key
technology in AI development.
The attention mechanism was introduced in 2014, and the self-attention mechanism
followed in 2017. Unlike RNNs or LSTMs, the self-attention mechanism can process
each part of the input data in parallel and adaptively work with the length of the input
data, significantly contributing to natural language processing.
The transformer architecture, presented in 2017, has achieved groundbreaking
progress in natural language processing. Its hallmark is the use of the self-attention
mechanism, which allows for parallel processing of data, moving away from the
sequential data processing used by RNNs and LSTMs. The transformer architecture
has been adopted in most large-scale AI models that followed.
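A minimal sketch of scaled dot-product self-attention, assuming toy dimensions, random weights, and a single head with no masking, can illustrate how every token is related to every other token in parallel.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # Project the inputs into queries, keys, and values.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise relevance of tokens
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
        return weights @ V                              # each token attends to all tokens at once

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))                  # 5 tokens, embedding size 8 (arbitrary)
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8)

Unlike the step-by-step recurrence of an RNN, nothing in this computation depends on processing the tokens one after another, which is what makes parallel processing possible.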
In 2018, BERT, which adopts the transformer architecture, brought innovation
to natural language processing. Applied to various natural language processing
applications, from language translation to chatbots, BERT improved performance,
multifunctionality, and depth of language understanding.
Developed between 2018 and 2023, GPT is also a large-scale language model
that uses the transformer architecture, bringing about revolutionary advancements
in natural language processing by breaking the limits on the size and capabilities of
neural networks. In November 2022, ChatGPT emerged, capable of generating text,
engaging in conversations, and answering questions in a manner akin to humans.
5.3 Algorithms
Search algorithms are used for searching data or finding paths. Traditional examples
of search algorithms in computer science include depth-first search (DFS), breadth-
first search (BFS), and the A-Star algorithm. DFS is used for exploring graph or tree
structures, BFS for finding the shortest paths, and A-Star for shortest path finding
and graph traversal.
Among various search algorithms, we single out DFS and discuss how it operates. Historically, DFS was used for symbolic reasoning, logic, and problem-
solving, and it continues to be used for solving certain types of problems related
to search and optimization in AI. In complex AI systems, especially those dealing
with structured data or requiring exhaustive search and exploration, DFS operates as
follows: Starting from a node, it moves to an adjacent node, marks it as visited, and
repeats this process until it reaches the end of a branch. It does not revisit previously
visited nodes. If it reaches the end of a branch and there are no unvisited adjacent
nodes left, it backtracks to the last visited node that has unvisited adjacent nodes
and begins exploring a different branch. This process continues until all nodes have
been visited. DFS is used for complete exploration of a graph, visiting all nodes.
However, it does not necessarily find the shortest path from the starting node to the
destination node; it aims to explore as deeply as possible. Thus, DFS is useful for
finding whether a path exists between two nodes and what that path is.
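A minimal sketch of DFS over a small hypothetical graph, stored as an adjacency list, might look as follows in Python.

    def dfs(graph, node, visited=None):
        # Visit a node, then explore each unvisited neighbor as deeply as possible.
        if visited is None:
            visited = []
        visited.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                dfs(graph, neighbor, visited)   # backtracking happens when the call returns
        return visited

    graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": ["A"]}
    print(dfs(graph, "A"))   # e.g., ['A', 'B', 'D', 'C', 'E']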
Today’s search engines, widely used on search platforms, also fall into the cate-
gory of search algorithms but are more complex than traditional algorithms and
incorporate elements of AI’s machine learning and neural networks. Search engines
operate on complex algorithms to search data in search indexes and provide the
most relevant results for a search query. For Google search, algorithms such as
PageRank, Hummingbird, RankBrain, and BERT are used. PageRank is a link anal-
ysis algorithm used to rank web pages in search engine results, Hummingbird is
an algorithm to understand the intent and contextual meaning of a search query,
RankBrain interprets search queries to find related pages even if the words are
not exact, and BERT is a neural network-based technology for pre-training natural
language processing, helping to understand the nuances and context of words and
identify search query-related results more accurately (For more details on BERT, see
Sect. 5.7).
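To make the link-analysis idea behind PageRank concrete, the following is a minimal power-iteration sketch over a tiny hypothetical link graph; the damping factor of 0.85 is the commonly cited value, and the graph itself is an assumption for illustration only, not Google’s actual computation.

    import numpy as np

    # Column-stochastic link matrix: entry [i, j] is the probability of moving from page j to page i.
    links = np.array([
        [0.0, 0.5, 1.0],
        [0.5, 0.0, 0.0],
        [0.5, 0.5, 0.0],
    ])
    n, d = links.shape[0], 0.85
    rank = np.full(n, 1.0 / n)
    for _ in range(50):                       # power iteration until the ranks stabilize
        rank = (1 - d) / n + d * links @ rank
    print(rank)                               # a higher value indicates a more "important" page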
Machine learning (ML) is a core technology of AI that helps computers learn from
data and make predictions or decisions. The goal of machine learning is to improve
performance on specific tasks by learning from a given dataset, with the quality and
quantity of the learning data significantly impacting the results. In the early stages of
machine learning, using decision trees has the advantage of visually displaying the
characteristics by which the data is classified.3 A further goal of machine learning is to develop a learning model that can generalize from training data to new, unseen data.
2 When a problem is solved using the divide-and-conquer method with a recursive algorithm, ineffi-
ciencies can occur due to redundant recursive calls that solve the same subproblems multiple times.
Dynamic programming is a technique that improves efficiency by storing the results of subproblems
and reusing them, thereby avoiding redundant calculations.
3 A decision tree is a simple yet powerful machine learning algorithm that can be used to analyze and
predict complex data structures. It is useful for visually representing and interpreting the inherent
patterns in data, especially by simplifying complex decision paths. This helps in understanding the
structure and characteristics of the data in the early stages of machine learning. The structure of
a decision tree intuitively conveys the knowledge gained during the learning process, playing a
crucial role in analyzing and verifying the decision-making process of the trained model.
Effective generalization means the machine learning algorithm can apply what it has
learned to new data effectively. However, overly complex models may capture noise
in the data, reducing performance, while overly simple models may fail to capture
basic patterns.
Machine learning includes several types: supervised learning, unsupervised
learning, semi-supervised learning, reinforcement learning, and deep learning (deep
neural networks). In supervised learning, the algorithm learns a function that maps
inputs to outputs based on input–output pairs in the dataset. Unsupervised learning
identifies patterns, structures, or features in data without predefined outputs. Semi-
supervised learning builds better models using both labeled and unlabeled data.
Reinforcement learning learns the optimal actions through interaction with an envi-
ronment, based on rewards or penalties for the actions taken. Deep learning uses
multiple layers of neural networks to extract features from low to high levels of
abstraction. The performance of these machine learning algorithms varies depending
on the problem, available data, and desired outcomes, necessitating the selection of
the most appropriate algorithm for each situation. This section will explain super-
vised, unsupervised, semi-supervised, and reinforcement learning, with deep learning
discussed in a subsequent section following the discussion on neural networks (Refer
to Sect. 5.5.6).
While there are various types of machine learning, the basic learning proce-
dure shares common steps: data collection, organizing and transforming data into a
format suitable for machine learning, identifying or generating features to improve
model performance, selecting a specific machine learning algorithm for training,
and evaluating the results of the training. If the evaluation results are satisfactory, the
learning process ends; otherwise, the model’s internal parameters are adjusted, and
the learning and evaluation process is repeated. Once learning concludes, the model
is deployed in the target environment for use, continuously monitored and updated,
and if performance degrades, it may be retrained.
In supervised learning, training algorithms use a dataset with specified labels. Labels
are the expected outputs or target values for each input data used for training. They
are distinct from the input data and serve as essential guidelines for learning the mapping
between inputs and their corresponding outputs. For instance, if training involves
distinguishing between pictures of dogs and cats, the pictures serve as input data,
and “dog” and “cat” are the labels. Supervised learning algorithms attempt to predict
the labels based on the input data, adjusting parameters through comparisons between
the predictions and given labels during training.
The supervised learning process begins with data collection and preprocessing,
where input data is cleaned, missing values are handled, and variables are transformed
into a format suitable for the learning model. The data is then divided into training and
evaluation datasets, with the training set used for model training and the evaluation
set for assessing model performance. Depending on the problem type, an appropriate
machine learning model is selected, such as linear regression models for regression
problems (predicting continuous values) and decision trees or neural network models
for classification problems (predicting categories).4 After training the model with
the training dataset, its performance is evaluated using the evaluation dataset. The
learning process concludes once the model achieves satisfactory accuracy.
As an example of supervised learning, we consider training a system to classify
emails as spam or non-spam. The process begins with collecting a dataset of emails
labeled as spam or non-spam, extracting features such as word frequency, sender
address, and email length during preprocessing. Part of the dataset is designated
for training and the rest for evaluation. A suitable model is selected based on the
classification task and trained using the training dataset. After sufficient training, the
model’s accuracy is evaluated using the evaluation dataset. If the model performs
satisfactorily, it can be deployed in email applications to filter spam. In this example,
the model learns from labeled email data and uses this knowledge to correctly classify
new emails.
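A minimal sketch of this workflow, assuming the scikit-learn library and a tiny hand-made dataset in place of real emails, might look as follows.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    emails = ["win a free prize now", "meeting at 10am tomorrow",
              "cheap loans click here", "project report attached",
              "free prize claim now", "lunch with the team today"]
    labels = ["spam", "ham", "spam", "ham", "spam", "ham"]   # the labels supervise the learning

    X_train, X_test, y_train, y_test = train_test_split(
        emails, labels, test_size=0.33, random_state=0, stratify=labels)

    model = make_pipeline(CountVectorizer(), LogisticRegression())
    model.fit(X_train, y_train)                 # training on labeled examples
    print(model.score(X_test, y_test))          # evaluation on held-out examples
    print(model.predict(["claim your free prize"]))

Word frequencies here stand in for the richer features (sender address, email length, and so on) mentioned above; the principle of learning a mapping from labeled inputs to outputs is the same.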
Unsupervised learning involves learning from data that is not labeled or classified.
Unlike supervised learning, which trains on labeled data, unsupervised learning
is designed to identify patterns, relationships, or structures in datasets without
predefined labels or categories. It is used to discover unknown groups, underlying
structures, or distributions and patterns in the given data.
Like supervised learning, unsupervised learning starts with data collection and
preprocessing. However, there are no output categories or labels for the input data.
Depending on the task, such as clustering, dimension reduction, or association anal-
ysis, an appropriate unsupervised learning algorithm is selected. The model discovers
patterns or structures in the dataset during training without comparisons to known
outputs, aiming to explore the data itself. The process involves adjusting model
parameters and applying the algorithm repeatedly to better understand the data struc-
ture. Once unsupervised learning outcomes are generated, data experts must interpret
the identified clusters, patterns, or relationships.
Customer segmentation in marketing is an example of unsupervised learning,
where customers are segmented into various groups based on their purchasing
behavior without predefined categories. Data on customers’ purchase history, demo-
graphics, and browsing behavior is collected, and a clustering algorithm like k-
means is selected for the segmentation task. K-means clustering is an unsupervised
4 Regression problems involve predicting through variables that are real numbers, where the predic-
tions are continuous real numbers. In contrast, classification problems target cases where the subject
is not a continuous real number but a unique value or categorical variable. For example, predicting
tomorrow’s temperature is a regression problem, while distinguishing between dogs and cats is a
classification problem.
learning method that minimizes the sum of squared distances between the points of a cluster and its centroid. By applying k-means, clusters of customers with similar
purchasing behaviors are identified. Analyzing the classified clusters helps under-
stand different customer groups (e.g., frequent buyers, occasional shoppers, high-
value item purchasers). Insights from this learning can be used for tailored marketing
strategies, personalized recommendations, and individualized customer service. In
this example, the unsupervised learning algorithm uncovers hidden patterns in
customer behavior, providing valuable information for business strategy. This exem-
plifies unsupervised learning’s ability to find structure in data that is not explicitly
labeled or classified.
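A minimal sketch of such segmentation, assuming the scikit-learn library and a small synthetic dataset with two numeric features per customer, might look as follows.

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical customer features: [purchases per month, average basket value].
    customers = np.array([
        [1, 20], [2, 25], [1, 30],      # occasional, low-value shoppers
        [8, 40], [9, 35], [10, 45],     # frequent buyers
        [2, 400], [3, 450],             # high-value item purchasers
    ], dtype=float)

    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
    print(kmeans.labels_)               # cluster assignment for each customer
    print(kmeans.cluster_centers_)      # centroids summarizing each segment

No labels are supplied; the algorithm discovers the three groups purely from the structure of the data, which a marketing analyst would then interpret.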
Typically, the structure of a neural network includes an input layer, several hidden
layers, and an output layer. Each layer consists of nodes, and the links connecting
layers carry parameters known as weights. The process of machine learning is essen-
tially about finding the optimal set of weights. In deep learning, or deep neural
networks, there are many hidden layers, resulting in a large number of weight param-
eters to learn. There are various types of neural networks, including convolutional
neural networks (CNNs), useful for image and video recognition, and recurrent neural
networks (RNNs), useful for language modeling and text generation, which have
evolved into transformer structures.
The learning process of a neural network involves forward propagation, back-
ward propagation, and repeated weight adjustments. Input data fed into the network
passes sequentially through each layer’s neurons to calculate output values. The
error between the output values and target values is propagated backward through
the layers, calculating each weight’s contribution to the error. By applying methods
such as gradient descent, weights are updated. This process repeats until the neural
network learns and ultimately finds the optimal weights, concluding the learning
phase.
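A minimal sketch of this forward-backward loop, assuming the PyTorch library, a tiny synthetic regression task, and arbitrary layer sizes and learning rate, might look as follows.

    import torch

    torch.manual_seed(0)
    X = torch.randn(64, 3)                       # synthetic inputs
    y = X.sum(dim=1, keepdim=True)               # target the network should learn

    model = torch.nn.Sequential(
        torch.nn.Linear(3, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
    loss_fn = torch.nn.MSELoss()

    for step in range(200):
        prediction = model(X)                    # forward propagation through the layers
        loss = loss_fn(prediction, y)            # error between outputs and target values
        optimizer.zero_grad()
        loss.backward()                          # backward propagation of the error
        optimizer.step()                         # gradient-descent weight update
    print(loss.item())                           # loss shrinks as the weights approach their optimum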
Designed to mimic the way the human brain processes information, the structure of
neural networks also emulates the biological neural network structure of the brain.
The architecture of a neural network is composed of layers of interconnected nodes,
corresponding to neurons, each taking input data, performing simple calculations,
and outputting the result. Typically, a neural network has a single input layer, multiple
hidden layers, and a single output layer connected in a sequence (refer to Fig. 5.2).
The input layer receives the input data. Each node in the input layer represents
one feature of the input data. For instance, in image recognition, each input node
represents the intensity of a pixel.
The hidden layers exist between the input and output layers, often in multiple
layers. The term “hidden” is used because, from the outside, only the input and
output layers are visible, while the intermediate layers are not. In the hidden layers,
actual computations are performed by applying weights to the input data. Each node
in these layers acts as a neuron responsible for these computations, producing a result
that is sent as output. The number of hidden layers and the number of neurons in
each layer are determined based on the complexity of the task the neural network
is designed to perform. For deep learning tasks, a neural network architecture with
many hidden layers is chosen.
The output layer produces the final output data of the network. The form of the
output varies depending on the task. For classification tasks, it outputs the probabil-
ities of various classes, and for regression tasks, which predict continuous values, it
outputs continuous values.
Each node, corresponding to a neuron, performs the following computation
process as the basic operational unit of the neural network (refer to Fig. 5.3). First, it
multiplies each input data by its corresponding weight and then sums them. This sum
is then added to a bias term, and the result is passed through an activation function
to produce the output. Mathematically, if the inputs are x 1 , x 2 , …, x n , the weights are
w1 , w2 , …, wn , the output is y, the bias is b, and the activation function is f (·), then
the relationship is y = f (w1 x 1 + w2 x 2 + · · · + wn x n + b). Adding the bias adjusts the
activation point of the activation function. The activation function shapes the output
into the desired form. This neuron model is also referred to as a perceptron, as was
initially named by Frank Rosenblatt.
The calculation above is for the output of a single neuron, and since each layer is
composed of multiple neurons (for example, m neurons), it results in multiple outputs.
Therefore, the output of each layer can be represented as a vector Y composed of m
components. Since the input signal can be represented as a vector X composed of n
components, the computation of the entire layer can be written compactly in matrix form as Y = f(WX + B), where W is the m × n matrix of weights and B is the vector of biases.
[Fig. 5.3 Computation performed at a node (neuron)]
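A minimal numerical sketch of this node computation, with arbitrary example inputs, weights, and bias, and using the sigmoid as the activation function, might look as follows.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))      # activation function f

    x = np.array([0.5, -1.0, 2.0])           # inputs x_1..x_n
    w = np.array([0.8, 0.1, -0.4])           # weights w_1..w_n
    b = 0.2                                  # bias
    y = sigmoid(w @ x + b)                   # output of a single neuron
    print(y)

    # A whole layer of m neurons: stack the weight vectors into an m x n matrix W.
    W = np.array([[0.8, 0.1, -0.4],
                  [-0.3, 0.7, 0.2]])
    B = np.array([0.2, -0.1])
    print(sigmoid(W @ x + B))                # output vector Y with m components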
Neural networks come in various types designed to suit the characteristics of the
tasks they perform. Common neural networks include feedforward neural networks,
recurrent neural networks, convolutional neural networks, long short-term memory
networks, variational autoencoder networks, and generative adversarial networks.
Each type has its strengths, so the choice depends on the nature of the task, the
characteristics of the input data, and the required output type.
Feedforward neural networks (FNNs) have a simple structure where connections
between nodes do not form feedback, allowing data to move in one direction from
the input layer, through hidden layers, to the output layer. They are typically used for
classification and regression tasks. The neural network structure depicted in Fig. 5.2
represents a typical feedforward neural network.
Recurrent neural networks (RNNs) are designed for sequential or time series
data, such as sequences or text, and employ a self-feedback mechanism.6 The RNN
structure allows each hidden layer to feed its output back as input, processing it
alongside new inputs in a recursive manner. Deep RNNs have this recursive struc-
ture across multiple layers. While RNNs can handle variable-length inputs, they
5 The formulas are f (x) = 1/(1 + e^(− x)) for the sigmoid function and f (x) = (e^(x) − e^(− x))/
(e^(x) + e^(− x)) for the tanh function, where ‘^’ denotes exponentiation.
6 J. J. Hopfield, “Neural Networks and Physical Systems with Emergent Collective Computational
Abilities.” Proceedings of the National Academy of Sciences, vol. 79(8), pp. 2554–2558, 1982.
struggle with long sequences. LSTM networks, designed to learn long-term depen-
dencies within sequences, overcome this limitation and are effective in applica-
tions requiring long-term context, such as machine translation and speech recogni-
tion. RNNs are commonly used for sequence analysis, speech recognition, language
modeling, natural language processing, translation, and text generation (For more
details on RNNs, refer to the next subsection).
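A minimal sketch of this recursive hidden-state update, with arbitrary small dimensions and random weights, might look as follows.

    import numpy as np

    rng = np.random.default_rng(1)
    W_x = rng.normal(size=(4, 3))      # input-to-hidden weights
    W_h = rng.normal(size=(4, 4))      # hidden-to-hidden (feedback) weights
    b = np.zeros(4)

    h = np.zeros(4)                    # initial hidden state
    sequence = rng.normal(size=(5, 3)) # 5 time steps, 3 features each
    for x_t in sequence:
        # The previous hidden state is fed back in alongside the new input at each step.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    print(h)                           # final state summarizing the whole sequence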
Convolutional neural networks (CNNs) are designed to process grid-like data,
such as images.7 CNNs consist of convolutional layers, pooling layers, and fully
connected layers. Feedforward neural networks are modified by replacing hidden
layers with convolutional layers, followed by pooling layers, and ending with fully
connected layers. Convolutional layers apply convolutional filters to the input data
to extract features. Pooling layers reduce the spatial dimension of the input for the
next convolutional layer. Fully connected layers, located at the end of CNNs, use the
features extracted by the convolutional layers to determine the final output. CNNs are
widely used for image and video recognition, image classification, medical image
analysis, and natural language processing (For more details on CNNs, refer to the
next subsection).
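A minimal sketch of the convolution, pooling, and fully connected pattern, assuming the PyTorch library, 28 × 28 grayscale inputs, and ten output classes, might look as follows.

    import torch
    from torch import nn

    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),   # convolutional layer extracts local features
        nn.ReLU(),
        nn.MaxPool2d(2),                             # pooling layer halves the spatial dimensions
        nn.Flatten(),
        nn.Linear(8 * 14 * 14, 10),                  # fully connected layer produces class scores
    )
    dummy_images = torch.randn(4, 1, 28, 28)         # batch of 4 hypothetical 28x28 images
    print(model(dummy_images).shape)                 # torch.Size([4, 10])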
Variational autoencoders (VAEs) are generative neural network models that use
unsupervised learning to generate new data similar to the training data.8 VAEs consist
of three main components: an encoder, a decoder, and a latent space between them.
The encoder compresses input data into a lower-dimensional representation (specifi-
cally, the means and variances of probability distributions) and passes it to the latent
space. The decoder then randomly samples from this distribution to regenerate the
input data. The VAE’s loss function comprises two parts: reconstruction loss and
regularization loss. Reconstruction loss measures how accurately the decoder can
reconstruct the input data from the latent representation, while regularization loss
ensures that the learned distribution remains close to the prior distribution, typically a
standard Gaussian. VAEs excel at learning complex data distributions and generating
new data similar to the original, making them useful for image and text generation
tasks.
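A minimal sketch of the two-part loss only (the encoder and decoder networks themselves are omitted), assuming the PyTorch library and that the encoder outputs a mean and log-variance per latent dimension, might look as follows.

    import torch

    def vae_loss(x, x_reconstructed, mu, logvar):
        # Reconstruction loss: how well the decoder rebuilds the input.
        recon = torch.nn.functional.mse_loss(x_reconstructed, x, reduction="sum")
        # Regularization loss: KL divergence between N(mu, sigma^2) and the standard Gaussian prior.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl

    x = torch.randn(8, 784)                          # hypothetical batch of flattened images
    mu, logvar = torch.zeros(8, 16), torch.zeros(8, 16)
    print(vae_loss(x, torch.zeros_like(x), mu, logvar))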
Generative adversarial networks (GANs) consist of two competing networks: a
generator and a discriminator.9 The generator’s objective is to create data that is indis-
tinguishable from real data, while the discriminator’s role is to distinguish between
real and generated data. During training, the generator continuously improves its
8 Diederik P. Kingma and Max Welling, "Auto-Encoding Variational Bayes," Proceedings of the Inter-
national Conference on Learning Representations (ICLR), 2014, and Danilo Jimenez Rezende,
Shakir Mohamed, and Daan Wierstra, "Stochastic Backpropagation and Approximate Inference in
Deep Generative Models," Proceedings of Machine Learning Research, vol. 32, pp. 1278–1286, 2014.
9 Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil
Ozair, Aaron Courville, and Yoshua Bengio, "Generative Adversarial Nets," Proceedings of the
Conference on Neural Information Processing Systems (NIPS 2014), December 2014.
ability to produce realistic data, while the discriminator becomes better at identi-
fying generated data. Training concludes when the discriminator can no longer reli-
ably distinguish generated data from real data. GANs are widely used for generating
and enhancing realistic, high-quality data and have also been applied in creative and
artistic fields.
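The competing objectives can be sketched numerically as follows. The discriminator outputs below are placeholder probabilities rather than outputs of trained networks; the sketch only shows how the two losses are formed (the generator loss in its common non-saturating form), not a full training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder discriminator outputs (probabilities of "real"); in a real GAN these
# come from running the discriminator on real samples and on generated samples.
d_real = rng.uniform(0.6, 0.9, size=8)   # D(x) on real data
d_fake = rng.uniform(0.1, 0.4, size=8)   # D(G(z)) on generated data

eps = 1e-12  # numerical safety for the logarithms

# Discriminator objective: score real data as real and generated data as fake.
d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

# Generator objective (non-saturating form): fool the discriminator.
g_loss = -np.mean(np.log(d_fake + eps))

print(f"discriminator loss: {d_loss:.3f}, generator loss: {g_loss:.3f}")
# Training alternates between updating the discriminator on d_loss and the
# generator on g_loss until generated data is indistinguishable from real data.
```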
Among various neural network types, VAEs and GANs are widely used as gener-
ative models.10 Generative models focus on creating new data that resembles the
training data. GANs involve a generator producing data and a discriminator eval-
uating it against real data until the generated data becomes indistinguishable from
real data. VAEs, on the other hand, learn by compressing input data into a lower-
dimensional latent representation using an encoder and then regenerating the data
from this representation through a decoder. More recently, specialized neural network
models for generative tasks, including transformer architectures, have emerged.
While VAEs are effective for generating complex high-dimensional data and GANs
excel in image generation and editing, transformers are particularly powerful for
processing sequential data and for tasks such as natural language understanding
and generation. They have also demonstrated impressive performance in image
processing and generation (For more details on transformers, see Sect. 5.6).
Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) play
pivotal roles in showcasing the diversity and capabilities of neural network models.
RNNs are designed to process sequential data, while CNNs are optimized for grid-like
data, such as images. Their unique characteristics make them suitable for different
types of applications: RNNs are well-suited for tasks involving long-term depen-
dencies in sequential data, whereas CNNs are highly efficient in real-time computer
vision tasks. RNNs and CNNs serve as foundational models when developing new
neural network architectures and optimization techniques, enhancing performance
and efficiency for specific tasks.
RNNs and CNNs also form the foundation for constructing advanced transformer-
based AI systems like BERT and GPT. Research aimed at improving neural network
models often emphasizes the importance of RNN, CNN, and transformer architec-
tures. A growing trend involves combining RNNs and CNNs with transformers in
hybrid models, where CNNs efficiently extract features from images and videos,
followed by transformers, which process and interpret large-scale data. This hybrid
approach leverages the strengths of both CNNs and transformers for more powerful
AI systems.
10 Restricted Boltzmann Machines (RBMs) also operate as generative models, consisting of a visible
layer and a hidden layer, with a simpler structure than VAEs or GANs. They are used for dimen-
sionality reduction, classification, regression, collaborative filtering, feature learning, and topic
modeling.
11 The cyclical nature of RNNs implies that the process of output being fed back and stored to be
used with incoming inputs in the next time-step repeats indefinitely within the same layer. If this
process is unfolded over time, it resembles a feedforward structure with an infinitely large number
of layers. This is analogous to the infinite impulse response (IIR) filter in signal processing, in
contrast to the finite impulse response (FIR) filter corresponding to FNNs.
model to focus on different parts of the input sequence when generating each output.
(For more details on self-attention mechanisms, see Sect. 5.6.2).
• Convolutional Neural Networks (CNNs)
CNNs are designed for processing grid-structured data, particularly suitable for visual
image analysis. They have achieved significant success in tasks such as image recog-
nition, image classification, and object detection. The distinguishing feature of CNNs
is their ability to capture the spatial hierarchical structure of features within images,
inspired by the organization of the animal visual cortex. This design enables CNNs to
automatically and adaptively learn the spatial hierarchical features of input images.12
Unlike traditional neural networks, which fully connect each input to all neurons,
CNNs apply convolutional filters to small, localized regions of the input, reducing
the number of parameters. This approach allows the network to focus on local
spatial consistency and effectively recognize visual patterns in images with minimal
preprocessing.
CNNs typically consist of multiple layers that transform the input image to output
features present in the image. These layers include convolutional layers, activation
layers, pooling (or down-sampling) layers, and fully connected layers. Convolutional
layers apply multiple convolutional filters to the input data to extract features, sliding
each filter across the input data and computing the dot product.13 Activation layers,
typically featuring the ReLU function, add nonlinearity to the network, enabling it
to learn complex patterns. Pooling layers sift through the extracted feature data to
reduce the amount of data for the next convolutional layer, thereby reducing the spatial
dimensions. Fully connected layers use the features extracted by the convolutional
layers to determine the final output.
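A minimal Python sketch of these operations on a toy 6 × 6 "image" is shown below. A single hand-written filter stands in for learned filters, and, as in deep learning libraries, the "convolution" is computed as an unflipped sliding dot product (cross-correlation).

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(size=(6, 6))            # a toy 6x6 grayscale "image"
kernel = np.array([[1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0],
                   [1.0, 0.0, -1.0]])      # a simple vertical-edge filter

# Convolutional layer: slide the 3x3 filter over the image and take the dot
# product at each position (valid padding, so a 4x4 feature map results).
fh, fw = kernel.shape
out_h, out_w = image.shape[0] - fh + 1, image.shape[1] - fw + 1
feature_map = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        feature_map[i, j] = np.sum(image[i:i + fh, j:j + fw] * kernel)

# Activation layer: ReLU adds nonlinearity.
feature_map = np.maximum(feature_map, 0.0)

# Pooling layer: 2x2 max pooling reduces the spatial dimension (4x4 -> 2x2).
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))

print(feature_map.shape, pooled.shape)   # (4, 4) (2, 2)
```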
The architecture of CNNs leverages the 2D structure of input images, processing
the images in a hierarchical manner across layers. For instance, the first layer might
extract edges, the next layer patterns, and subsequent layers higher-level features
such as objects or faces, enabling multi-level processing. This hierarchical approach
allows CNNs to extract complex features from simple data.
Applications of CNNs are diverse, including image and video recognition,
image classification, object detection, face recognition, medical image analysis, and
autonomous vehicles. CNNs can accurately identify objects, places, and people in
images or videos, categorize images based on visual content, detect specific classes of
12 Hongping Fu, Zhendong Niu, Chunxia Zhang, Jing Ma, and Jie Chen, "Visual cortex inspired
CNN model for feature construction in text analysis," Frontiers in Computational Neuroscience, vol.
10, pp. 1–10, July 2016.
13 The term "convolution" is widely used in circuit theory and signal processing, representing the
output signal y obtained by passing the input signal x through a filter h, expressed as y = x ∗ h
and read as "x convolution h". Here, h represents the filter's impulse response function, and the
convolution is calculated by flipping h across the time axis, overlaying it on the input signal x, and
calculating the overlapping area. This process is repeated by moving h along the time axis to obtain
the output function y, mathematically expressed as y(t) = ∫ x(τ)h(t − τ)dτ. In case the input and
output signals are digital, the sum y(n) = Σ_m x(m)h(n − m) is applied instead of the integration.
objects in digital images and videos, analyze medical images to assist in disease diag-
nosis, and help autonomous vehicles recognize traffic signs, pedestrians, and other
vehicles for safe navigation. The ability to learn and recognize patterns in visual data
makes CNNs a core element of contemporary AI systems requiring visual under-
standing. Beyond image and video processing, CNNs can also be applied to tasks in
natural language processing and time series analysis.
CNNs offer high computational and training efficiency by using small receptive
fields, which reduce the number of parameters and allow for scalability to large
images and complex datasets. Their suitability for parallel processing makes them
well-optimized for hardware like GPUs, and they can automatically learn hierarchical
features from data. However, CNNs also have several limitations, including the need
for large amounts of labeled data, high computational demands, long training times,
and susceptibility to overfitting, especially on small datasets. In addition, CNNs can
inherit biases and fairness issues from the training data, struggle to generalize to
cases not covered in the training set, and are sensitive to input variations. They are
also vulnerable to adversarial attacks, where small changes in input data can lead
to incorrect predictions. Addressing these challenges requires advances in model
architecture, training methods, and more efficient computational strategies.
When comparing RNNs and CNNs to other neural network architectures, fully
connected networks are versatile but inefficient for tasks that require recognizing
temporal or spatial patterns, as they lack the specialized recurrent or convolutional
layers. VAEs are designed for unsupervised tasks like dimensionality reduction and
feature learning, capturing complex data distributions but struggling with temporal
or spatial data. GANs are effective for generating new data but are not as well-
suited as RNNs and CNNs for processing sequential and spatial data, respectively.
Transformer architectures surpass RNNs in sequential data processing, offering better
performance and parallelism, but they demand significant computational resources.
Despite the rise of newer structures like transformers, RNNs and CNNs remain
relevant for specialized applications due to their architecture, which is specifically
optimized for handling sequential and spatial data.
The learning process of a neural network begins with the input data passing through
the network, from the input layer, through hidden layers, and finally to the output
layer, performing node (neuron) operations in a forward direction. The operational
process of each node in every layer involves multiplying each input data by its
corresponding weight, summing all these values, adding a bias, and then passing the
result through an activation function to produce an output. This output becomes the
input for the next layer, continuing the node operation process in a chain reaction
until the final output data is produced.
Once the final output data is obtained through the forward operation process, it is
compared with the target value to calculate and quantify the error. This quantification
process is performed by a function known as the cost function.14 The most widely
used cost functions are the mean square error (MSE) and the cross-entropy func-
tion. MSE, as the name suggests, is obtained by taking the square of the difference
between the calculated outputs and target values and then averaging it. The closer
the calculated values are to the target values, the closer the MSE converges to 0.
Cross-entropy is calculated by taking the logarithm of each predicted probability,
multiplying it by the corresponding true target value (usually a one-hot encoded
vector, with the correct class represented by 1 and the other classes by 0), summing
the results across all outputs, and then applying a negative sign. Cross-entropy is
primarily used in models that output probabilities, where the entropy decreases as
the calculated probability distribution gets closer to the target distribution. MSE is
commonly used for regression tasks, while cross-entropy is used for classification
tasks.
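The two cost functions can be written out directly; the predictions and targets below are arbitrary toy numbers.

```python
import numpy as np

# Mean square error: the average squared difference between the calculated
# outputs and the target values, typically used for regression.
y_pred = np.array([2.5, 0.0, 2.1])
y_target = np.array([3.0, -0.5, 2.0])
mse = np.mean((y_pred - y_target) ** 2)

# Cross-entropy: with a one-hot target, this reduces to the negative log of the
# probability predicted for the correct class, and is used for classification.
p_pred = np.array([0.7, 0.2, 0.1])     # predicted class probabilities
t_onehot = np.array([1.0, 0.0, 0.0])   # the correct class is the first one
cross_entropy = -np.sum(t_onehot * np.log(p_pred))

print(f"MSE = {mse:.4f}, cross-entropy = {cross_entropy:.4f}")
```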
After the cost function is computed, the next step is to determine how to adjust
each weight to minimize the cost function. This process can be divided into two
stages: first, finding the direction to change, i.e., the gradient; and second, making the
adjustment in that direction. The first process applies the backpropagation method,
and the second applies the gradient descent technique.
According to optimization theory, changes should be made in the opposite direc-
tion of the steepest gradient, and this direction’s gradient can be found by taking the
partial derivative of the cost function with respect to each weight. However, calcu-
lating the partial derivative for all weights in all layers is computationally excessive.
Therefore, applying the chain rule of differentiation layer by layer is more efficient.
The process starts by taking the partial derivative of the cost function with respect to
the weights of the last layer, the output layer, and then for the weights of the second-
last layer, and so on, moving backward (i.e., backpropagating) to the front input layer.
This method of calculating partial derivatives in a backward direction is known as
the backpropagation algorithm.15 It is necessary to store intermediate results layer
by layer during the forward operation to aid backpropagation calculations.
Once the gradient for each weight is calculated using the backpropagation method,
the gradient descent technique is applied to adjust the weights in the direction that
reduces the cost function the most, effectively descending along the calculated
gradient. If the cost function is J(w) and the gradient is ∇J(w), then the weight
w is adjusted in the opposite direction of the gradient by w(n + 1) = w(n) − α ∗
∇J(w(n)), where α represents the learning rate, a hyperparameter that controls the
size of the change. If this parameter is too large, there is a risk of overshooting the
optimum value, while too small a value can result in a slow convergence. To improve
the rate of convergence of the weights, methods that adjust the learning rate for each
weight based on the history of gradient changes are also used.
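The following Python sketch puts the two stages together on a deliberately tiny two-layer network and a toy regression target. The gradients are derived by hand with the chain rule, layer by layer from the output backward, and the weights are then adjusted against them with a fixed learning rate; this is only an illustrative toy, not a production training routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn to predict t = x1 + x2 with a squared-error cost.
X = rng.normal(size=(64, 2))
T = X[:, 0] + X[:, 1]

# A two-layer network: a tanh hidden layer and a scalar linear output.
W1 = rng.normal(scale=0.5, size=(3, 2)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=3);      b2 = 0.0
alpha = 0.05  # learning rate

for epoch in range(200):
    loss = 0.0
    gW1 = np.zeros_like(W1); gb1 = np.zeros_like(b1)
    gW2 = np.zeros_like(W2); gb2 = 0.0
    for x, t in zip(X, T):
        # Forward pass (intermediate results are kept for backpropagation).
        z1 = W1 @ x + b1
        h = np.tanh(z1)
        y = W2 @ h + b2
        loss += 0.5 * (y - t) ** 2

        # Backward pass: the chain rule applied from the output layer backward.
        dy = y - t                      # dJ/dy
        gW2 += dy * h;  gb2 += dy       # gradients of the output layer
        dh = dy * W2                    # dJ/dh
        dz1 = dh * (1 - h ** 2)         # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
        gW1 += np.outer(dz1, x);  gb1 += dz1

    # Gradient descent: move each weight against its averaged gradient.
    n = len(X)
    W1 -= alpha * gW1 / n;  b1 -= alpha * gb1 / n
    W2 -= alpha * gW2 / n;  b2 -= alpha * gb2 / n
    if epoch % 50 == 0:
        print(f"epoch {epoch}: average loss = {loss / n:.4f}")
```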
14 The objective function in an optimization process corresponds to the cost (or loss) function in
the neural network learning process.
15 David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, "Learning Representations by
Back-Propagating Errors," Nature, vol. 323, pp. 533–536, 1986.
17 Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh, "A Fast Learning Algorithm for Deep
Belief Nets," Neural Computation, vol. 18, pp. 1527–1554, 2006.
18 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez,
Lukasz Kaiser, Illia Polosukhin, “Attention Is All You Need”, Proceedings of The 31st Conference
on Neural Information Processing Systems (NIPS 2017), 2017.
sequence (such as a partially translated sentence in a translation task). The encoder processes
the input sequence (e.g., the original sentence in the source language), and the decoder generates
the corresponding output (e.g., the translated sentence) based on both the encoded input and the
previously generated tokens in the target sequence.
[Fig. 5.4 The transformer architecture. The encoder (left) and decoder (right) each consist of Nx stacked blocks of multi-head attention (masked multi-head attention in the decoder), add & norm, and feed-forward layers; positional encoding is added to the input and output embeddings, the decoder input is the outputs shifted right, and the decoder output passes through linear and softmax blocks to produce the output probabilities.]
from attending to future tokens, ensuring that the model can only use information
from previous tokens when predicting the next token. The subsequent attention block,
known as the cross-attention block, allows the decoder to focus on the relevant parts
of the input sequence by attending to the encoder’s output representations. The feed-
forward block and the addition and normalization block function similarly to those in
the encoder. The annotation ‘shifted right’ on the decoder’s input ‘outputs’ indicates
that the sequence is shifted one position to the right, ensuring that the prediction
for each token depends only on the previous tokens, not on future tokens. This is
essential for training the transformer to predict the next word in a sequence without
access to the subsequent (unknown) words.
The vector representation coming out of the decoder finally passes through the
‘linear’ and ‘softmax’ blocks. The ‘linear’ block serves as the output stage’s fully
connected feedforward neural network (FNN) layer, mapping the decoder’s output
to the dimension matching the vocabulary size. The 'softmax' block converts the output of
the ‘linear’ block into probabilities as the final operation process of the transformer.
The softmax function ensures that all output values are between 0 and 1 and that their
sum equals 1, forming a probability distribution. The values generated by the softmax
represent the transformer model’s estimation of the probability that a particular token
will be the next token in the sequence.
22 In the case of GPT-3, the number of heads in the masked multi-head attention block is known to
be 12–96.
relationships or features using one head which may be much larger in size than a
head in the multi-head attention.
The self-attention mechanism operates as follows: First, the input sequence is
tokenized into individual tokens. These tokens are transformed into vectors through
an embedding process. In the multi-head attention block, the embedding vectors are
used to calculate three sets of vectors: query (Q), key (K), and value (V ). This is done
by applying three separate linear transformations to the input embedding vectors,
each using distinct weight matrices learned during training: W Q for the query, W K
for the key, and W V for the value vectors. Initially, these three weight matrices are
randomly initialized and have no specific distinction, but during training, they learn
to focus on different aspects of the input data. As a result of these transformations, we
obtain the three sets of vectors—Q, K, and V —which are then used in the following
steps of the attention mechanism.
The self-attention mechanism computes attention scores (S) by taking the dot
product of each query vector (Q) with the corresponding key vector (K) from other
tokens. This measures the similarity or “compatibility” between tokens. These scores
are then scaled (usually by dividing by the square root of the dimensionality of the
key vectors) and normalized using the softmax function to produce attention weights
(A). These weights represent the relative importance of each token in relation to
others. Finally, the attention weights are used to compute the output at each token
position as the weighted sum of the value vectors (V ). This allows the model to focus
dynamically on the most relevant tokens when processing the sequence, capturing
both context and relationships between tokens. Figure 5.5 illustrates the multi-head
attention mechanism as described above.
[Fig. 5.5 The multi-head attention mechanism. Input tokens (e.g., "I go to school") are embedded and given positional encoding; linear transforms with W_Q, W_K, and W_V produce the query (Q), key (K), and value (V) vectors; inner products yield the attention scores S, the softmax yields the attention weights A, and the weighted value vectors form each head's attention output; the head outputs are concatenated and passed through W_O to give the output O_concat.]
The roles of the query (Q), key (K), and value (V ) vectors in the self-attention
mechanism can be summarized as follows: The query vector (Q) is generated for
each input token and is used to determine how much attention the model should pay
to other tokens in the sequence. Each token also has a corresponding Key vector
(K), which represents the features of the token that other tokens will attend to. The
attention scores are computed by taking the dot product between the query vector
of the current token and the Key vectors of all tokens, including itself. These scores
represent the relevance or compatibility between the current token and all others.
The attention scores are then normalized using the softmax function, converting
them into attention weights (A). These weights represent the importance of each
token relative to the others in the sequence. The value vector (V ) contains the actual
content or information of each token. The weighted sum of the value vectors, based
on the attention weights, is then computed to produce the output O for each token. In
essence, the query and key vectors are used to calculate how tokens in the sequence
relate to each other, while the value vectors contain the information that is used to
generate the final output. The self-attention mechanism leverages these components
to dynamically assign different weights to various tokens, allowing the model to
capture complex relationships and dependencies across the input sequence.
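A compact Python sketch of this computation is given below, with toy dimensions and random matrices standing in for learned weight matrices and real token embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_head = 5, 16, 8   # toy sizes, not GPT-scale

# Token embeddings for a 5-token input sequence (placeholders).
E = rng.normal(size=(seq_len, d_model))

# Learned projection matrices (random here; learned during training in practice).
W_Q = rng.normal(scale=0.1, size=(d_model, d_head))
W_K = rng.normal(scale=0.1, size=(d_model, d_head))
W_V = rng.normal(scale=0.1, size=(d_model, d_head))

Q, K, V = E @ W_Q, E @ W_K, E @ W_V   # query, key, and value vectors per token

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Attention scores S: dot products of each query with every key,
# scaled by the square root of the key dimension.
S = Q @ K.T / np.sqrt(d_head)

# Attention weights A: a softmax over each row turns scores into weights.
A = softmax(S, axis=-1)

# Output: each token's output is the attention-weighted sum of the value vectors.
O = A @ V
print(S.shape, A.shape, O.shape)   # (5, 5) (5, 5) (5, 8)
```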
In the case of a multi-head attention setup, the transformer model expands this
mechanism by having multiple sets of W Q , W K , and W V matrices, each constituting
an “attention head.” Each head independently computes its own set of Q, K, and V
vectors, allowing the model to simultaneously focus on different aspects of the input
sequence from various representational subspaces. The outputs from each attention
head are then concatenated and linearly transformed once more through an additional
weight matrix, often denoted as W O , to combine the diverse insights gathered from
each head into a single, unified output O_concat (see Fig. 5.5). This aggregation
process enables the model to integrate a richer set of contextual cues and relationships,
enhancing its ability to understand and process the sequence comprehensively. The
output O_concat thus obtained becomes a new representation of the input
sequence reconstructed through self-attention.
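The multi-head version can be sketched as follows; again, the dimensions are toy values and the weight matrices are random stand-ins for parameters that would be learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, n_heads = 5, 16, 4
d_head = d_model // n_heads          # each head works in a smaller subspace

E = rng.normal(size=(seq_len, d_model))   # token embeddings (placeholders)

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

head_outputs = []
for _ in range(n_heads):
    # Each head has its own W_Q, W_K, W_V (random stand-ins for learned weights).
    W_Q = rng.normal(scale=0.1, size=(d_model, d_head))
    W_K = rng.normal(scale=0.1, size=(d_model, d_head))
    W_V = rng.normal(scale=0.1, size=(d_model, d_head))
    Q, K, V = E @ W_Q, E @ W_K, E @ W_V
    A = softmax(Q @ K.T / np.sqrt(d_head))   # this head's attention weights
    head_outputs.append(A @ V)               # this head's output

# Concatenate the heads and combine them with the output weight matrix W_O.
O_concat = np.concatenate(head_outputs, axis=-1)        # shape (5, 16)
W_O = rng.normal(scale=0.1, size=(d_model, d_model))
O = O_concat @ W_O                                      # single unified output
print(O_concat.shape, O.shape)
```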
Through this elaborate orchestration of multiple attention heads and the subse-
quent aggregation of their outputs, the self-attention component dynamically allo-
cates weights across the input sequence, elucidating intricate relationships within
it. The multi-head attention mechanism thus significantly contributes to the trans-
former’s skillfulness in capturing the subtle interaction of elements within the
sequence, supporting a deeper and more nuanced understanding of the data.
These capabilities make transformers highly suitable for processing sequential data,
such as natural language, while maintaining superior performance in comparison to
RNNs and CNNs.
RNNs are designed for sequential data and, in theory, can model long-distance
dependencies. However, in practice, they struggle with very long sequences due to
issues like vanishing or exploding gradients, which hinder effective learning over long
time steps. CNNs, while excellent at capturing local dependencies through convo-
lutional filters, are inherently limited when it comes to capturing long-range depen-
dencies in sequential data. In contrast, the transformer’s self-attention mechanism
calculates relationships between all tokens in a sequence simultaneously, enabling it
to capture dependencies regardless of their distance.
A major limitation of RNNs is their sequential data processing. Each step depends
on the output of the previous step, making parallelization difficult and leading to high
computational costs when scaling to large datasets or long sequences. CNNs, while
more parallelizable due to their use of convolutions, are constrained by the size of their
receptive fields, which limits their ability to process sequences that extend beyond
this fixed size. Transformers, on the other hand, can process entire input sequences
in parallel, splitting them into tokens and handling them simultaneously. This paral-
lelization significantly reduces training and inference times, especially when lever-
aging hardware like GPUs, making transformers highly scalable and efficient for
large datasets.
While RNNs excel in sequence-based tasks and CNNs are well-suited for image
data, transformers exhibit versatility across a broad range of tasks, including natural
language processing (NLP), image recognition, and audio processing. This flexi-
bility makes transformers more generalizable compared to RNNs and CNNs. RNNs
gradually capture context through sequential steps, and CNNs capture local context
through convolutional features. In contrast, transformers capture the entire sequence
context at every layer. By leveraging self-attention, transformers can assess rela-
tionships between every token in the sequence, making them highly effective for
complex sequence tasks such as NLP, where understanding relationships across the
entire sequence is essential.
capturing intricate relationships within text. These features have made transformers
the dominant architecture in modern NLP.
Transformers have significantly improved machine translation, allowing models
to maintain context and nuance, even in complex sentence structures. Transformer-
based models like GPT have demonstrated the ability to generate coherent and contex-
tually relevant text, making them invaluable for tasks such as creative writing, chat-
bots, and automatic content creation. In addition to text generation, transformers excel
at text classification, helping categorize texts by genre or emotion and recognizing
named entities like names, locations, and dates. This makes them ideal for tasks like
sentiment analysis and information extraction. Furthermore, transformers are effec-
tive in summarizing lengthy documents, condensing large texts into concise, mean-
ingful summaries. Despite these strengths, transformer models require significant
amounts of training data and computational resources, and they may face difficulties
in generalizing to completely unseen data or new tasks without fine-tuning.
Beyond NLP, transformers have expanded into other domains, including image
and audio processing. The vision transformer (ViT) architecture treats an image as
a sequence of patches, similar to how text is treated, and has demonstrated excep-
tional performance in image recognition tasks.23 In speech recognition, transformers
capture long-distance dependencies and context within audio sequences, making
them highly effective for tasks like speech-to-text conversion. Similarly, transformers
have shown great potential in music generation by treating musical elements, such
as notes and rhythms, as sequential data. In addition, when combined with reinforce-
ment learning, transformers help agents remember long sequences of actions and
their outcomes, which is useful in game-playing scenarios. In these contexts, trans-
formers model complex relationships between events, predict outcomes, and assist
in decision-making processes.
23 Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai,
Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob
Uszkoreit, Neil Houlsby, "An Image is Worth 16 × 16 Words: Transformers for Image Recognition at
Scale," Proceedings of the International Conference on Learning Representations (ICLR), 2021.
24 Since then, the transformer architecture has been used in various AI models for decision manage-
ment, robotic process automation, natural language processing, computer vision, optimization and
others. In particular, AlphaFold 2, an optimization AI model developed by Google DeepMind, has
dramatically improved the performance of protein structure prediction.
capabilities, while BERT set new standards in the area of language understanding.
GPT and BERT have marked milestones in the decades-long journey of AI evolu-
tion, demonstrating sophisticated neural network architectures’ ability to understand
and generate human language with remarkable accuracy. GPT evolved into subse-
quent models like ChatGPT-3.5, GPT-4, GPT-4o, GPT-o1, etc., and BERT led to the
development of improved models like LaMDA, Bard, and Gemini.25 Both GPT and
BERT, as well as their derivative models, utilize the transformer architecture as their
foundation.
While GPT and BERT both revolutionize natural language processing through their
use of the transformer structure, they differ in their core concepts. GPT is a generative
model capable of producing text, whereas BERT focuses on deeply understanding
the context of language. GPT excels in text completion, content creation, and creative
writing due to its unidirectional (left to right) context understanding, meaning each
word prediction depends only on the preceding words. On the other hand, BERT
analyzes text bidirectionally (both from left to right and right to left), allowing for a
comprehensive understanding of each word’s context within the text.
GPT and BERT also differ in their applications and training methods. GPT is
primarily used for text generation, suitable for applications requiring consistent and
contextually relevant text paragraphs. In contrast, BERT is mainly used for text
understanding and interpretation, making it ideal for tasks like sentiment analysis,
question answering, and language inference. GPT is trained as a language model
to predict the next word in a sequence based on previous words, while BERT is
trained to understand the context of words in sentences by masking random words
and predicting them based on their surrounding context. Therefore, GPT is more
suited for tasks requiring text generation, such as writing assistance, chatbots, and
creative writing tools. Meanwhile, BERT is more effective in tasks involving text
understanding, like information extraction, search engines, and text classification
systems.
If we compare the architectures of GPT and BERT, both base their structures
on the transformer architecture but adopt different components for their use. GPT
utilizes the decoder of the transformer (the right half of Fig. 5.4), whereas BERT is
built upon the encoder (the left half of Fig. 5.4). In the case of GPT architecture,
the multi-head attention block in the middle of the decoder, which used to take in
25 Language Model for Dialogue Applications (LaMDA), Bard, and Gemini are transformer-based
AI models developed by Google and released in 2021, March 2023, and December 2023, respec-
tively. LaMDA was built on advancements from models like BERT, which excel at understanding
context within sentences. Bard extends LaMDA’s conversational capabilities by integrating Google’s
powerful search technology to provide real-time, up-to-date conversational responses. While
Gemini, introduced in December 2023, is the successor to Bard, publicly available information
about its technical features is limited.
[Fig. 5.6 The architectures of (a) BERT, built on the transformer encoder with multi-head attention, and (b) GPT, built on the transformer decoder with masked multi-head attention. Each consists of input embedding with positional encoding, Nx stacked attention and feed-forward blocks with add & norm, and linear and softmax blocks producing the output probabilities.]
the output of the encoder, is no longer necessary. Figure 5.6 shows the resulting
architectures.26
GPT employs only the decoder part of the transformer architecture, which makes it
highly effective for text generation tasks. The decoder in GPT includes a masked self-
attention mechanism along with feedforward neural networks. The masking ensures
that the model can only attend to the previously generated words in a sequence,
making it a unidirectional model. This setup allows GPT to generate text sequentially
by predicting the next word based on the words that precede it. As a result, GPT excels
at generating coherent, contextually relevant text, making it ideal for tasks such as
creative writing, summarization, and dialogue generation.
On the other hand, BERT utilizes the encoder part of the transformer architecture,
which is optimized for understanding context rather than generating text. BERT
26 The linear and softmax blocks in Fig. 5.6a correspond to task-specific layers that are added
when BERT is fine-tuned for downstream tasks such as classification. However, different output
heads, such as ones specialized for the masked language modeling (MLM) and next sentence
prediction (NSP) tasks, would be used during pre-training rather than the task-specific heads shown
in fine-tuning.
The applications of GPT are wide-ranging, with its primary strength being text
generation. GPT excels at creating creative content such as stories, poems, and
dialogues, making it ideal for use in interactive storytelling, game narratives, and
creative writing. It can also be applied to chatbots and conversational agents, where
generating coherent and contextually appropriate responses is crucial. In addition,
GPT is used in news article writing, content creation, and even in code generation for
software development. It can assist with language translation by generating context-
aware translations. In the field of education, GPT can be used to create educational
content, such as practice questions and explanations, providing interactive learning
experiences.
BERT, on the other hand, excels at text classification, information extraction,
and search engine optimization. BERT is highly effective for sentiment analysis,
making it useful for analyzing customer feedback, social media posts, and reviews.
Its ability to understand text context allows it to organize and categorize large
amounts of content, which makes it invaluable in applications like content filtering
and text categorization. BERT has also been adopted by search engines like Google to
improve the understanding of query intent and provide more relevant search results.
In question-answering systems, BERT’s deep contextual understanding helps find
specific answers within large texts, making it useful in applications ranging from
customer service to information retrieval. In addition, BERT can identify and clas-
sify named entities (such as names, locations, and organizations) and can be used
for machine translation and text summarization by extracting key information from
input texts.
When comparing the applications of GPT and BERT, GPT’s strength lies in gener-
ating coherent and creative content, making it ideal for applications that require
language generation and interaction. In contrast, BERT excels at tasks that require a
deep understanding of text, such as classification, sentiment analysis, and question-
answering. Both models are versatile and can be adapted for a wide range of NLP
tasks. However, their application areas are defined by their core strengths—GPT is
more effective for text generation tasks, while BERT is better suited for text compre-
hension and analysis. In natural language processing, both models have found success
across various industries and services, with GPT dominating in content creation and
BERT in information extraction and contextual understanding.
GPT and BERT are revolutionary models in natural language processing (NLP),
but they encounter challenges when working with long texts due to their fixed
context windows. GPT, known for its powerful text generation capabilities, can
struggle to maintain coherence over extended narratives because it only processes
a limited number of tokens at a time (usually around 2048 tokens for GPT-3 and
earlier versions). As a result, GPT may lose track of long-term context in lengthy
texts. Newer versions of GPT, such as GPT-3.5 and GPT-4, aim to mitigate this by
increasing the model’s attention span and token capacity, enabling better handling
of long sequences.
Similarly, BERT, which excels at context understanding and text analysis, is
constrained by a maximum input length (typically 512 tokens), limiting its ability
to process long documents in one pass. Techniques like text segmentation or sliding
windows can be used to process longer documents by dividing them into chunks, but
this often leads to a loss of global context across segments. Despite these limitations,
ongoing research and advancements (such as the development of Longformer and
Big Bird models) seek to address these issues by extending the attention mechanism
to capture broader contexts in longer texts, thereby improving the models’ ability to
handle large-scale documents.
5.7.3 ChatGPT
27 In December 2023, Google unveiled 'Gemini', a generative AI based on large language models,
which is known to surpass human experts in large-scale multi-task language understanding tests.
released GPT-3 in June 2020, ChatGPT-3.5 for public testing in November 2022,
ChatGPT-4 in March 2023, and GPT-4o in May 2024. Furthermore, in October
2024, it released GPT-o1 with enhanced inference capability.28
ChatGPT can understand text input in various languages, grasp the context of
conversations, and generate consistent and contextually appropriate responses. It
was trained on a wide range of internet texts, enabling it to respond knowledgeably
to topics covered during its training. However, it cannot provide information on real-
time events or topics not included in its training data up until its last training session.
Also, ChatGPT may have limitations in responding to languages with insufficient
training data.
ChatGPT can engage in interactive conversations with users, answering ques-
tions, providing explanations, assisting with creative writing, solving coding prob-
lems, and more. It can maintain the context of a conversation, remember previous
questions and answers, and provide comprehensive responses based on this infor-
mation. Some versions can connect to external tools like a web browser for informa-
tion retrieval or DALL-E for image generation, offering additional functionalities.
Furthermore, ChatGPT can significantly aid scientific and technological research,
such as designing new molecules or simulating cellular behavior. It is expected to
become a competent assistant in literature review, data summarization, hypothesis
setting, concept development, experiment design support, coding and data analysis,
and drafting research proposals and papers.29
Despite its powerful natural language processing capabilities, ChatGPT has limi-
tations. It inherits the limitations of the GPT model that relies on training. Insuf-
ficient training can lead to a lack of knowledge, and biases in training data can be
perpetuated. Sometimes, it may provide plausible but inaccurate or absurd responses,
a phenomenon known as “hallucination”. It can also produce ethically question-
able or value-misaligned responses and has limitations in accessing real-time data.
Such limitations are considered to be solvable or mitigable by enhancing AI’s infer-
ence capabilities and search capabilities, and thus, AI developers are focusing on
improving AI’s inference and planning abilities. (GPT-o1, for example, is known to
28 OpenAI expanded the capabilities of GPT-4, allowing users to create and use customized GPT
applications. In January 2024, they launched the ‘GPT Store’, where these applications can be
shared.
29 In December 2023, the scientific journal "Nature" included ChatGPT in its "Nature's 10" list,
30 For instance, Google's Gemini, Meta's Llama 2, Amazon's Bedrock service, and IBM's
watsonx.ai are notable examples.
1. Input Embedding
The input embedding block performs the ‘vector conversion’ function of AI as
shown in Fig. 5.1. This block splits the input text into tokens, which are small units
like words, and converts each token into numerical vectors. Through this embed-
ding process, the input text is transformed into numerical vectors, and all subse-
quent processes within GPT are carried out through numerical calculations. Each
token’s embedded vector includes positional information indicating where the token
appeared in the text.
2. Masked Multi-Head Attention
This block, together with the subsequent feedforward block, is responsible for inter-
preting the question and understanding its context. These functions are achieved
through the self-attention mechanism. The blocks performing the self-attention func-
tion are called heads. GPT processes self-attention through multiple heads in parallel,
with each head focusing on different features of the input data. After passing through
[Figure: The GPT processing pipeline: input embedding with positional encoding, masked multi-head attention, add & norm, feed forward, add & norm (the attention and feedforward blocks repeated Nx times), followed by linear and softmax blocks.]
the multi-head attention, the model can effectively identify patterns within the input
data and understand relationships and context between words.
3. Feedforward
Data that has passed through the multi-head attention block is further transformed
in the feedforward block, capturing complex patterns. This block consists of two
feedforward neural networks (FNNs) with a nonlinear activation function in between.
The first FNN expands the data’s dimensions, the nonlinear activation function,
GELU, allows the model to learn more complex patterns through nonlinearity, and
the second FNN reduces the data back to its original dimensions. The reason for
expanding dimensions and processing in a higher-dimensional space is that nonlinear
processing in an expanded space can capture and represent more complex features and
patterns. Projecting the input into a high-dimensional space enables more complex
transformations and interactions within that space.
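A minimal sketch of this expand, activate, and reduce pattern is shown below, using toy dimensions (the 4x expansion ratio mirrors the GPT-3 value given later in this section) and the common tanh approximation of the GELU activation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 16, 64   # toy sizes; GPT-3 expands 12,288 to 49,152 (a 4x ratio)

x = rng.normal(size=d_model)                         # one token's representation

W_F1 = rng.normal(scale=0.1, size=(d_model, d_ff))   # expands the dimension
W_F2 = rng.normal(scale=0.1, size=(d_ff, d_model))   # reduces it back

def gelu(z):
    # A common tanh approximation of the GELU activation function.
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))

h = x @ W_F1   # first FNN: project into the higher-dimensional space
h = gelu(h)    # nonlinear activation applied in the expanded space
y = h @ W_F2   # second FNN: project back to the original dimension
print(x.shape, h.shape, y.shape)   # (16,) (64,) (16,)
```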
Comparing with the AI functions shown in Fig. 5.1, the transformer block's multi-head attention and
feedforward blocks together perform the functions of ‘question analysis’, ‘pre-trained
knowledge’, and ‘inference, synthesis’.
4. Linear Transformation and Softmax
The linear transformation block and the softmax block are responsible for generating
the response. The linear transformation block is a fully connected FNN that trans-
forms the input data into a space that matches the model’s vocabulary size, expanding
the input into a set of logits (unnormalized prediction values). The softmax block then
applies the softmax operation to these logits. The softmax function converts all output
values into non-negative values and normalizes them to sum up to 1, allowing the
model to interpret the logits as a probability distribution. This probability indicates
the likelihood of each word being the next word in the output sequence. Therefore,
the word with the highest probability is selected as the next output word.
Comparing with the AI functions shown in Fig. 5.1, the linear transformation block and softmax block
together perform the function of ‘answer generation.’
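A small numerical sketch of this final stage is given below. The dimensions are toy values (the actual GPT-3 values of 12,288 and 50,257 appear later in this section), and a real system may sample from the probability distribution rather than always taking the single most probable token.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size = 16, 10   # toy sizes standing in for 12,288 and 50,257

h = rng.normal(size=d_model)                           # decoder output for one position
W_L = rng.normal(scale=0.1, size=(d_model, vocab_size))

logits = h @ W_L   # linear block: unnormalized prediction values (logits)

# Softmax block: convert the logits into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()

next_token = int(np.argmax(probs))   # the most probable next token
print(probs.round(3), probs.sum(), next_token)
```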
31 Although newer models such as GPT-4 and GPT-4o have been released, with improvements over
GPT-3, system details for these newer models are not fully publicly available. As a result, for the sake
of convenience, we will review the complexity of AI models using the GPT-3 175B model. While
GPT-4 and other recent models introduce enhancements in areas such as accuracy and contextual
understanding, their underlying architectures can be viewed as extensions or refinements of the
transformer-based approach used in GPT-3. Therefore, many aspects of the structure, functionality,
and complexity of these newer models can be understood by analyzing GPT-3.
1. Input Embedding
The embedding block converts input text into tokens and then transforms each token
into numerical vectors. While it is possible to convert tokens into simple numbers,
transforming them into numerical vectors captures semantic relationships, syntactic
roles, and contextual information, making calculations within the transformer struc-
ture more efficient. The size T of each embedding vector is 12,288, meaning each
token is represented by a 12,288-dimensional vector. Each element of the vector is expressed as a
32-bit or 64-bit floating-point number.32
When the transformed embedding vectors for each token are combined, they form
a matrix called the embedding matrix. The total set of tokens is called the vocabulary,
and if the vocabulary size is D, the embedding matrix W E has dimensions of D ×
12,288. For the GPT-3 175B model, with a vocabulary size of 50,257, the embedding
matrix size is 50,257 × 12,288.33
The embedding matrix is learned during the training process, much like other
weight matrices in a neural network. Once training is complete, the embedding
matrix contains a representation for each input token, where each token is mapped
to a corresponding vector in the matrix. This matrix can be visualized as a table,
where input tokens are associated with specific rows, and the vectors representing
the tokens are stored in the corresponding rows. During the usage stage, when an
input token is processed, the model simply retrieves its pre-learned embedding vector
from this table. This retrieved vector represents the token’s position in the semantic
space and serves as the input to subsequent layers of the model. This lookup process
is efficient and constitutes the input embedding phase.
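The lookup itself can be sketched in a few lines of Python. The sizes below are deliberately tiny stand-ins for the 50,257 × 12,288 matrix described above, and the positional encoding table is a random placeholder rather than the scheme actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# GPT-3 175B uses vocabulary = 50,257 and dimension = 12,288; tiny stand-in
# sizes are used here so the sketch runs instantly.
vocab_size, embed_dim, max_len = 50, 8, 16
W_E = rng.normal(scale=0.02, size=(vocab_size, embed_dim))      # learned embedding matrix
pos_table = rng.normal(scale=0.02, size=(max_len, embed_dim))   # stand-in positional encodings

token_ids = [3, 17, 42]                 # ids produced by the tokenizer (placeholders)
embeddings = W_E[token_ids]             # the lookup: simply retrieve the corresponding rows
positions = np.arange(len(token_ids))
x = embeddings + pos_table[positions]   # add positional information to each token vector

print(x.shape)   # (3, 8): one vector per input token
```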
2. Masked Multi-Head Attention
The masked multi-head attention block, a core element of the transformer structure,
performs the self-attention mechanism, with detailed functions as shown in Fig. 5.5.
Multi-head attention processes the embedding vector in parallel by dividing it by
the number of heads. Specifically, the embedding vector dimension T is divided
by the number of heads H, with each head processing a T /H-dimensional vector.
This reduction of the embedding vector’s dimension is achieved by passing each
embedding vector through weight matrices W Q , W K , and W V , with each weight
matrix sized T × (T /H). The resulting Q, K, and V vectors are reduced to T /H
dimensions. For the GPT-3 175B model, with T = 12,288 and H = 96, each weight
matrix W Q , W K , W V is sized 12,288 × 128. Thus, the Q, K, and V vectors are reduced
32 This applies to the GPT-3 175B model, which has 175 billion parameters, but the size of the
numerical vector varies with different models. For example, the GPT-3 Small model with 125 million
parameters has a vector size of 768 dimensions, the GPT-3 XL model with 1.3 billion parameters
has a vector size of 1600 dimensions, and the GPT-3 13B model with 13 billion parameters has a
vector size of 5120 dimensions.
33 The reason why the GPT-3 175B model uses a vocabulary size of about 50,000 tokens, in
contrast to the Oxford English Dictionary’s approximately 600,000 words, is to balance model
complexity and the ability to capture diverse linguistic nuances, allowing it to process various
language structures and vocabularies without excessive computational load and memory increase.
[Figure: The GPT processing pipeline annotated with the weight matrices learned in each block: W_E in the input embedding; W_Q, W_K, W_V, and W_O in the masked multi-head attention; W_F1 and W_F2 in the feedforward; and W_L in the linear block, with the attention and feedforward blocks repeated Nx times.]
to T/H = 128 dimensions, and the weight matrix W O, which combines the outputs
of the H attention heads, is sized 12,288 × 12,288.34
3. Feedforward
The first FNN is a linear transformation that expands the input vector’s dimension,
and the second FNN following the nonlinear activation function in the middle is
a linear transformation that reduces the vector back to its original dimension. For
the GPT-3 175B model, the expansion ratio is 4×. Thus, the weight matrix W F1 for
the first linear transformation is 12,288 × 49,152, and the weight matrix W F2 for
the second linear transformation is 49,152 × 12,288. The calculation process of the
intermediate nonlinear activation function, GELU, is straightforward.
The transformer block, composed of masked multi-head attention and feedfor-
ward blocks, is repeated N times in sequence, where N represents the depth of the
transformer network. For the GPT-3 175B model, N = 96.
4. Linear Transformation and Softmax
The linear block linearly transforms the input data into a space matching the vocab-
ulary size, expanding it into a set of logits. The size of the weight matrix performing
this transformation is determined by the vocabulary size. For the GPT-3 175B model,
the linear block weight matrix W L is 12,288 × 50,257.
The softmax block applies the softmax operation to the output logits of the linear
block. The softmax function is a mathematical function that converts logits into a
probability distribution and is simple to compute.
The cumulative weight matrices discussed above are summarized in Fig. 5.9.
34 The number of heads H varies by model, with GPT-3 Small having 12 heads, GPT-3 XL having
32 heads, and GPT-3 13B having 40 heads. Consequently, the sizes of the weight matrices W_Q,
W_K, W_V are 768 × 64 for GPT-3 Small, 1600 × 50 for GPT-3 XL, and 5120 × 128 for GPT-3
13B.
For the GPT-3 175B model mentioned in the previous sections, the number of weight
matrices that need to be learned during the training process is extremely large. The
weight matrices to be learned include W E from the input embedding block, W Q ,
W K , W V , W O from the masked multi-head attention block, W F1 , W F2 from the
feedforward block, and W L from the linear block (refer to Fig. 5.9). The size of
each matrix for the GPT-3 175B model is as follows: W E : 50,257 × 12,288, W Q :
12,288 × 128, W K : 12,288 × 128, W V : 12,288 × 128, W O : 12,288 × 12,288,
W F1 : 12,288 × 49,152, W F2 : 49,152 × 12,288, W L : 12,288 × 50,257. Notably, the
masked multi-head attention block and feedforward block are repeated N = 96 times
in succession.
GPT-3 simultaneously learns all the weight matrices listed above. Comparing with
the neural network structure in Fig. 5.2, each weight matrix corresponds to a layer in
a multilayer neural network. However, unlike a standard multilayer neural network
where the output of each layer directly feeds into the next, additional processing steps
are involved. If we consider the three matrices W Q , W K , and W V in the multi-head
attention block as being computed simultaneously, the input embedding block forms
one layer, the masked multi-head attention block forms two layers, the feedforward
block forms two layers, and the linear transformation block forms one layer. Given
that the masked multi-head attention and feedforward blocks are repeated 96 times,
the GPT-3 175B model can be considered a deep neural network with a total of 1 +
(2 + 2) × 96 + 1 = 386 layers.
Now, let’s calculate the number of weight parameters for the GPT-3 175B model
based on Fig. 5.9. The model name 175B indicates that there are ‘175 billion’ weight
parameters. Let’s verify how this number is derived.
1. Input Embedding
The input embedding block contains 50,257 × 12,288 = 617,558,016 weight
parameters in the weight matrix W E.
2. Masked Multi-Head Attention
A single masked multi-head attention block contains 12,288 × 128 = 1,572,864
weight parameters in each of the W Q , W K , and W V matrices. With 96 heads (H),
the total number of parameters is 1,572,864 × 3 × 96 = 452,984,832. In addition,
the W O matrix contains 12,288 × 12,288 = 150,994,944 parameters. Therefore, the
total number of parameters in the masked multi-head attention block is 603,979,776.
3. Feedforward
A single feedforward block contains 12,288 × 49,152 = 603,979,776 weight param-
eters in W F1 and 49,152 × 12,288 = 603,979,776 weight parameters in W F2 , totaling
1,207,959,552 parameters.
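These block-level counts can be cross-checked with a few lines of Python using the dimensions given above. As in the text's own calculation, only weight matrices are counted; bias and normalization parameters are excluded.

```python
# A sanity check of the block-level counts, using the GPT-3 175B dimensions
# from the text: T = 12,288, H = 96 heads, N = 96 blocks, vocabulary V = 50,257.
T, H, N, V = 12_288, 96, 96, 50_257
d_head, d_ff = T // H, 4 * T

embedding = V * T                            # W_E
attention = (T * d_head * 3) * H + T * T     # W_Q, W_K, W_V per head, plus W_O
feedforward = T * d_ff + d_ff * T            # W_F1 and W_F2
linear = T * V                               # W_L

total = embedding + N * (attention + feedforward) + linear
print(f"embedding {embedding:,}, attention {attention:,}, feedforward {feedforward:,}")
print(f"total = {total:,}")                  # roughly 175 billion weight parameters
```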
A single transformer block contains one masked multi-head attention block and
one feedforward block. The weight parameters are only in the masked multi-head
The purpose of training an AI system is to enable it to predict the next token based
on the preceding tokens. During the training process, the target of learning is the
35 In each layer’s normalization, the input x_i is normalized with respect to the mean μ and standard
deviation σ as y_i = (x_i − μ)/σ. The result is then scaled and shifted using the two learnable
parameters γ and β in the form z_i = γ·y_i + β. The parameters γ and β are determined during the
training process, and they help prevent information loss and improve convergence.
Table 5.1 Calculation of weight parameters (GPT-3 175B model)

Input embedding
  Weight matrix size: W_E: 50,257 × 12,288
  Number of weight parameters: 617,558,016
  Repetition: 1
  Total weight parameters in the block: 617,558,016

Masked multi-head attention
  Weight matrix size: W_Q, W_K, W_V: 12,288 × 128 (per head); W_O: 12,288 × 12,288
  Number of weight parameters: W_Q, W_K, W_V: 1,572,864 × 3 × 96 = 452,984,832; W_O: 150,994,944
  Repetition: 96
  Total weight parameters in the blocks: (452,984,832 + 150,994,944) × 96 = 57,982,058,496

Feedforward
  Weight matrix size: W_F1: 12,288 × 49,152; W_F2: 49,152 × 12,288
  Number of weight parameters: W_F1: 603,979,776; W_F2: 603,979,776
  Repetition: 96
  Total weight parameters in the blocks: (603,979,776 + 603,979,776) × 96 = 115,964,116,992

Linear
  Weight matrix size: W_L: 12,288 × 50,257
  Number of weight parameters: 617,558,016
  Repetition: 1
  Total weight parameters in the block: 617,558,016

Total weight parameters of the model: 175,181,291,520
weight parameters, and the learning results are stored as weight values.36 These
weight values play the role of the brain in understanding questions and generating
answers. As calculated above, the GPT-3 175B model has around 175 billion weight
parameters, most of which are concentrated in the multi-head attention blocks and
feedforward blocks within the transformer blocks.
The training process of an AI system is similar to that of training a deep neural
network (e.g., GPT-3 175B model is a deep neural network consisting of 386 layers).
When training an AI system, all the weight parameters of each layer are initially set
to random values and are adjusted as the learning progresses. By receiving training
data and repeating the process of forward propagation, cost function calculation,
backpropagation, and weight updates, as explained in Sect. 5.5, learning progresses.
When the cost function converges to zero, the training ends, and the weights at that
point constitute the final weight matrices. These weight matrices are used for actual
question-answering and are not modified until the next training process.
Specifically, the GPT-3 175B model uses self-supervised learning during the
training process. This is a learning method where the model generates its own labels
from the input data to train itself. While self-supervised learning can be classified as
unsupervised learning because labels are not explicitly provided, it effectively oper-
ates like supervised learning since the training input data inherently contains labels
(i.e., the next tokens) that the model utilizes. The cross-entropy function is used as the
loss function, calculated by comparing the predicted token probability distribution to
the actual distribution of the next token. The backpropagation technique is employed
to compute the gradients of the weight parameters with respect to the loss function,
and the Adam optimization algorithm is applied to update the weights based on those
gradients.37 The input data is processed by dividing it into multiple batches of token
sequences (each sequence up to 2,048 tokens long, the model's context length), and all computations within the transformer block
are also handled on a batch basis. During this process, masking is applied to ensure
that future tokens are not involved in predicting the next token.38 The cost function
is calculated on a per-token basis and averaged over the batch to update the weight
parameters. After performing the forward pass, loss calculation, and backpropaga-
tion for the first batch, the weight parameters are updated once, and then the process
is immediately repeated for the second batch. Once the weight parameter updates
for all batches are complete, an epoch is said to be finished. Training on the given
input data concludes when the cost function reaches zero or when there is no change
in the weights after repeated epochs. The final weights are then used as the starting
36 Although relatively small in number, normalization parameters and bias parameters are also
targets of learning.
37 Adam (adaptive moment estimation) optimization technique improves convergence speed and
maintains robust performance by adaptively adjusting the learning rate using the mean of the
gradients and the mean of the squared gradients.
38 Masking can be applied by adding a mask matrix M to the attention score matrix S (see Fig. 5.5).
The elements m_ij of matrix M are set to 0 for j ≤ i and to −∞ for j > i (the future positions). This way, the terms
to which − ∞ is added become 0 during the following softmax process, effectively achieving the
masking effect.
point for training on the next input data, and this training process continues until all
training data is exhausted.39
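The overall loop of forward pass, cross-entropy loss on the next token, backpropagation, and a weight update per batch, repeated over epochs, can be illustrated with a deliberately tiny self-supervised example. The "model" below is only a bigram table trained with plain gradient descent (GPT-3 itself uses the full transformer and the Adam optimizer), but the labels are generated from the data itself exactly as described: each character's target is simply the next character.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus; the "labels" are the next characters of the text itself.
text = "to be or not to be that is the question " * 20
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = np.array([stoi[ch] for ch in text])

V = len(vocab)
W = np.zeros((V, V))          # bigram "model": logits of the next token given the current one
lr, batch_size = 0.5, 32

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

pairs = np.stack([ids[:-1], ids[1:]], axis=1)   # (current token, next token) training pairs

for epoch in range(20):
    rng.shuffle(pairs)
    total_loss = 0.0
    for start in range(0, len(pairs), batch_size):
        batch = pairs[start:start + batch_size]
        grad = np.zeros_like(W)
        for cur, nxt in batch:
            p = softmax(W[cur])                    # forward pass: predicted distribution
            total_loss += -np.log(p[nxt] + 1e-12)  # cross-entropy against the true next token
            p[nxt] -= 1.0                          # gradient of cross-entropy w.r.t. the logits
            grad[cur] += p
        W -= lr * grad / len(batch)                # weight update (plain SGD; GPT-3 uses Adam)
    if epoch % 5 == 0:
        print(f"epoch {epoch}: avg loss = {total_loss / len(pairs):.3f}")
```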
The data used for training AI systems is extensive. Large amounts of data are
collected from various sources, preprocessed, and then used for training. Prepro-
cessing is necessary because data collected from various sources can vary greatly in
form and content and may include incorrect or irrelevant information. Subsequently,
fine-tuning is performed using specific datasets and human feedback, adjusting
performance in detail during this process. In particular, reinforcement learning from human
feedback (RLHF) is crucial. Humans evaluate the model's output, verify if the
model’s performance meets the intended accuracy and reliability, and identify and
correct the model’s biases or defects.
The performance of an AI system depends on the quantity and quality of the
training data used. If the information in the training data is accurate, the AI learns
correctly, but if the training data contains biased information, the AI model’s
responses will exhibit bias. Training large-scale AI models requires a large and
diverse collection of datasets, including Common Crawl,40 BookCorpus,41 Wikipedia,
books, articles, journals, and other texts. For models like GPT-4o, multimodal data
such as images, audio, and video are also necessary.
AI systems only learn during the training process and do not learn further once
the training period is over.42 Thus, the weight parameters remain unchanged after
the training is completed. Even if the AI learns new facts during question-answering
sessions (which it can separately store), it does not immediately reflect these in the
weight parameters. This distinguishes AI from human intelligence. While humans
think about various aspects and store information in memory during the process
of understanding questions and generating answers, AI’s role ends once it has
generated a response based on what it has understood.
So far, we have examined how AI’s cognitive functions are implemented in
systems through a question-and-answer process. In conclusion, like humans, AI
understands questions and provides answers; however, every step is conducted
through numerical calculations. System implementation involves designing the
39 GPT-3 is known to have been trained on about 570 gigabytes of filtered text data from various sources. This
dataset includes hundreds of billions of tokens, and the training involved a very large number of parameter-update
iterations. It is estimated that this training was completed in about 34 days using 1,024 Nvidia V100 GPUs.
40 Common Crawl is a non-profit organization that regularly (e.g., monthly) crawls the web to
systematically collect data from websites and provides the archives and datasets to the public for
free. The Common Crawl web archive consists of petabytes (i.e., thousands of terabytes) of data
collected since 2008.
41 BookCorpus is a dataset composed of texts from about 7,000 self-published books collected
from the indie e-book distribution website Smashwords. This dataset consists of approximately
985 million words and includes books from various genres such as romance, science fiction, and
fantasy. It was used to train the initial GPT and BERT models.
42 For example, the final training of GPT-4 was completed on March 14, 2023. GPT-4o, released on
May 13, 2024, is an optimized version of GPT-4 offering new features and improvements, without
retraining GPT-4. In contrast, GPT-o1, released on October 4, 2024, is a completely new model,
designed and trained from scratch to achieve advanced capabilities in reasoning and problem-
solving.
computational architecture that executes these processes and determining the weight
parameters embedded within it. The understanding of the question happens within
the transformer blocks, while the answers are generated in subsequent blocks. The
values of the weight parameters are determined through training, which requires vast
amounts of data, computing power, and significant energy consumption. However,
once training is complete, the amount of computation and energy consumption
required during the usage phase is small.
AI, with its immense potential, stands as a technological marvel that has rapidly
evolved and is poised to further accelerate due to intensified competition among
corporations. Its proliferation across industries and society is bound to have profound
impacts on human life. However, the path forward for AI is filled with numerous
challenges and limitations. Technical constraints are the first hurdles it will encounter,
and overcoming these will be a formidable task. Moreover, AI will likely give rise to
ethical and societal concerns, leading to significant debates and possibly resulting in
various regulatory measures. Such regulations could redefine the trajectory of AI’s
development within a more constrained framework.
The mixed feelings of anticipation and apprehension toward AI’s future among
humanity highlight the complexity of its integration into our lives. There’s a tangible
fear that, if not managed wisely, AI could bring about adverse outcomes for
humanity. Successfully navigating these challenges and limitations will be crucial
in determining the future of AI.
Addressing technical challenges requires continuous innovation and research to
improve AI’s efficiency, reduce its environmental impact, and enhance its ability to
generalize across different tasks without compromising on performance. Ethically,
it necessitates a balanced approach that considers the societal impact of AI, ensuring
that its development and deployment are aligned with human values and benefit
society as a whole.
The potential regulatory landscape could both safeguard against the misuse of
AI and ensure that its development is aligned with ethical standards and societal
well-being. However, overly stringent regulations could stifle innovation and hinder the
potential benefits AI could offer.
Ultimately, the future of AI will hinge on our ability to advance the tech-
nology responsibly, addressing ethical concerns and societal impacts while navi-
gating through the regulatory frameworks that may emerge. This balanced approach
will enable us to harness the full potential of AI, mitigating risks and ensuring that
its development and application contribute positively to humanity’s progress.
43 GPUs were originally designed for graphic rendering and are capable of high levels of parallel
processing. They support various machine learning frameworks such as TensorFlow and PyTorch.
TPUs, developed by Google to accelerate machine learning tasks like deep learning, can handle
large-scale computations and are optimized for TensorFlow.
44 Efforts are actively underway to make AI models lighter while maintaining performance in order
to solve these problems. On one hand, research is being conducted to reduce the weight of current
AI models or increase their processing speed. On the other hand, research is aimed at emulating
the human brain, which performs high-level thinking without matrix operations like deep neural
networks. The Kolmogorov-Arnold Networks (KAN) may be regarded as an example of such efforts.
Such research is being carried out simultaneously at both the semiconductor level and the software
code level.
Larger models are also more prone to overfitting. Thus, the cost of enlarging the model may not justify
the performance gains. Training larger models often requires distributed systems,
complicating the coordination and parallelization of the training process. Therefore,
scaling AI models presents new challenges in computation, data, and efficiency.
The generalization problem in AI models, which stems from a lack of diversity in
training data, highlights several key limitations in AI models. These limitations arise
due to constrained context comprehension, biases in training data, and inherent limi-
tations in machine learning algorithms. AI models, relying heavily on their training
data, excel in domains similar to their training environment but often falter in unfa-
miliar territories or contexts. This is partly because AI models based on machine
learning, such as GPT and BERT, struggle with scenarios not well-represented in
their training data, lacking deep contextual understanding. Because the models overfit
to the features of their training data, adapting to new data with different features is challenging.
Without the capacity for common-sense reasoning to interpret new situations beyond
their training, current AI models face a significant barrier to generalization.
Moreover, the complexity and inaccessibility of systems like GPT and BERT lend
them a “black-box” nature, making it difficult to comprehend the reasoning behind
specific decisions or to explain generated or analyzed content. This poses a challenge
in applications requiring explanations, such as in the medical or financial sectors.
Transparency and accountability become crucial when errors or biases emerge, espe-
cially in applications where these qualities are demanded. The lack of transparency
complicates diagnosing and rectifying issues related to generalization.
Interoperability, integration, and standardization also represent significant chal-
lenges for AI. The diversity of data formats and protocols across different systems
and industries leads to compatibility issues, hindering seamless interaction among
various AI algorithms. The absence of universal standards for AI models and data
complicates achieving interoperability between different AI systems. The complexity
and opacity of deep learning-based AI models limit the ability of different AI systems
to understand and utilize each other’s outputs. While the rapid evolution of AI tech-
nology makes establishing unified standards premature, the diverse requirements and
constraints across domains render the creation of universal standards impractical.
These challenges underscore the necessity for ongoing research and development
in AI to enhance generalization capabilities, improve transparency and explainability,
and foster interoperability through adaptive and flexible standards. Addressing these
issues will be crucial for realizing the full potential of AI across a broad spectrum
of applications, ensuring it can be deployed responsibly and effectively in various
domains.
AI’s development raises various social and ethical concerns, fundamentally origi-
nating from biases in training data. These biases, when ingrained in AI systems, can
be reflected in the outputs, potentially causing significant social and ethical reper-
cussions. Addressing these social and ethical challenges is crucial for AI to progress
smoothly with societal support.
Large-scale AI models heavily rely on vast amounts of high-quality data, and
their performance is directly linked to the quality of training data. Biases in training
data can lead to unfair or discriminatory outcomes, especially in sensitive areas like
employment, lending, law enforcement, and credit scoring. Biases may manifest in
various forms, such as racial, gender, or socioeconomic biases, leading to ethical
dilemmas. The inclusion of personal information in training data also raises privacy
concerns.
One of the critical issues with AI systems is the lack of transparency and account-
ability. The black-box nature of deep learning models makes it difficult to under-
stand and explain AI decisions. This opacity can lead to unintended biases and
ethical issues, fostering suspicion and mistrust toward AI system operators. There’s
a growing demand for AI systems to be not only accurate but also interpretable and
explainable.
The issue of accountability in AI systems is also contentious. As AI autonomy
increases, it becomes unclear who should be held accountable for AI’s decisions—
developers, users, or the AI itself. In addition, ensuring AI systems comply with laws
and regulations and determining liability in case of malfunctions are critical aspects
of AI’s responsibility.
The most significant societal concern with the advancement of AI is the issue
of employment. There is a possibility that AI, especially through automation, could
replace jobs involving routine and repetitive tasks. This includes occupations in
manufacturing and warehouse management, data entry and processing, customer
service and support, retail and sales, transportation and delivery, accounting and
bookkeeping, basic analysis, and medical support. As AI systems become more
extensive and intelligent, even professions requiring a high degree of specialized
knowledge, such as legal services, medical services, management support services,
and research and development tasks, could potentially be replaced by AI. The replace-
ment of these tasks by AI means not so much that the jobs will disappear entirely,
but rather that the focus of the work shifts to AI, changing the role of people in
those jobs. On the other hand, AI also creates new job opportunities in fields such as
AI development, data analysis, machine learning, cybersecurity, and AI ethics and
governance.
The advancement of AI holds the potential to create new industries and markets,
enhance the efficiency of production and business, and spur economic growth.
However, if the benefits of AI technology are not widely shared but become concen-
trated, or if AI infrastructure is not evenly distributed, social inequality could increase.
It could widen the gap between those proficient in and able to utilize AI technology
and those who are not, as well as between those with access to education and training
in fields like computer science, data analysis, and engineering and those without, and
between workers who can adapt to and learn AI-related technologies and those who
cannot. This can be considered the ‘AI divide’, analogous to the ‘digital divide’
of the digital transformation era. Like the digital divide, the AI divide could ulti-
mately expand socioeconomic disparities or shift social dynamics in unforeseen
ways. Addressing this gap issue requires the construction of AI infrastructure to
effectively utilize AI technology, along with education in ‘digital and AI literacy’.
AI raises various concerns and issues related to personal information and privacy.
AI systems, needing a vast amount of data for effective training and operation, often
collect individuals’ data without explicit consent or a complete understanding of
data usage, even with consent. This collected personal data can become a target
for cyber-attacks, and AI’s unauthorized use of sensitive information like an indi-
vidual’s health status or personal preferences can cause privacy issues. Moreover, as
AI inadvertently learns and applies biases present in the training data, it could violate
privacy and personal dignity. Strong legislation and strict enforcement regarding
personal data collection, use, and privacy invasion are necessary to prevent these
issues. Particularly, it is crucial that AI technologies are developed to be aware of
and comply with privacy regulations.
AI’s impact on society is diverse, but among these, ethical issues are the most
complex and serious. The bias and fairness issues, transparency and accountability
problems, and personal information and privacy issues mentioned above are all inter-
connected with ethical concerns. These internal problems of AI systems are matched
by external problems, or ethical issues arising from the use of AI technology.
Malicious use of AI technology can cause significant harm to society and
human lives, with cyber-attacks, the spread of false information, and autonomous
weapon systems (AWS)45 being prime examples. Especially important is preventing
the destructive use of AI in critical infrastructure and defense sectors and main-
taining security from cyberthreats. In addition, using AI technology for surveillance
techniques like facial recognition or behavior prediction can compromise human
autonomy and dignity, affecting individual freedom and privacy adversely. There-
fore, it is essential to ensure that AI is developed and used safely, not posing a threat
to global security or producing false information such as deepfakes.46
45 An autonomous weapon system (AWS), also referred to as a lethal autonomous weapon (LAW)
or a killer robot, is a weapon system that uses AI to identify, select, and engage targets without
human intervention. Unlike unmanned drones that are remotely controlled by humans, autonomous
weapons make decisions using AI algorithms on their own.
46 Deepfake is a digitally forged artifact created using AI algorithms, manipulating audio and video
content to make it appear as if an individual has done something they have not actually done. With
the advancement of deepfake technology, the content produced is becoming increasingly difficult
to distinguish from reality, raising various ethical and legal issues at social, political, and personal
levels.
Deepfake content can erode trust in the media and lead to a disconnection from reality. It also makes the fact-checking
work of the press more challenging. Deepfakes are also hard to detect, because
as soon as AI algorithms are developed to detect them, new methods of creating
undetectable deepfakes can emerge. Moreover, punishing the misuse of deepfakes
without suppressing legitimate expression and respecting intellectual property rights
is highly challenging.
If AI is used for cyber-attacks, it can be extremely threatening in terms of the
scale and sophistication of attacks. Integrating AI into hacking devices and tech-
niques or into cyberwarfare and cybercrime can lead to severe cybersecurity threats.
AI technology can analyze vast amounts of data more efficiently than human hackers
to identify system vulnerabilities, learn and adapt in real-time to changing situa-
tions, and adjust strategies accordingly. AI cyber-attacks can automate and opti-
mize the execution of attacks, making them more sophisticated and harder to detect.
AI cyber-attacks can target AI systems such as autonomous vehicles or industrial
control systems, potentially disrupting operations or damaging the learning process
or outcomes. AI can also be used for cyberbullying, using deepfakes as a weapon for
personal attacks and cyberharassment.
Utilizing AI technology for surveillance and privacy invasion poses a serious
threat. AI technology can process vast amounts of data from various sources such as
CCTV cameras, online activities, and mobile devices, and if this capability is used
for mass surveillance, it can greatly infringe on privacy and freedom. In addition,
using AI for high-precision facial recognition technology can identify individuals
in crowds, track movements and behaviors, and create detailed profiles of personal
activities. If government agencies use this technology for monitoring citizens, it can
lead to significant privacy invasion and restrict everyday freedom. While the government
may install CCTV under the pretext of preventing crime, it could ultimately be used
for surveillance and profile creation of specific communities or local residents. If
personal information gathered through CCTV and facial recognition technology is
combined with data from online activities and mobile devices to create comprehen-
sive profiles, it could compile sensitive information about an individual’s habits,
health status, financial situation, political inclinations, and religious preferences,
potentially leading to detrimental consequences if misused. This poses a grave
threat to privacy and personal freedom.
How can we mitigate or manage these risks and threats? For technical risks, we
can only encourage and wait for developers to do their best to solve the problems.
However, other risks related to usage, i.e., ethical, social, and security risks, need to
be addressed through strict problem identification, planning countermeasures, and
taking actions. First, we need to create a model for assessing risks, apply the model
to real-world problems to analyze risks, and then develop strategies for risk manage-
ment. Following that, we should implement strategies, regulate through legal systems,
encourage stakeholder understanding, and educate the public. Especially for deep-
fakes and cyber-attacks using AI technology, technical responses for detection and
defense are crucial, as is the establishment and strict enforcement of stringent legal
systems. The issue of government surveillance or privacy invasion is something that
might happen in a controlled totalitarian state, but it cannot be ignored in democratic societies either.
In order to address the social and ethical issues posed by AI technology, appropriate
regulations need to be established. For internal problems such as bias and fairness
issues, transparency and accountability issues, and personal information and privacy
issues, it is necessary to set ethical guidelines for professionals to keep in mind during
AI research and development. For external problems that threaten societal and interna-
tional safety, such as cyber-attacks, the spread of false information, and autonomous
weapon systems, proactive measures need to be taken by enacting domestic and inter-
national laws and systems. However, due to the rapid changes in AI technology and
conflicts of interest among nations, regulatory enactment faces several challenges.
Regulation is meant to prevent problems from recurring or spreading by under-
standing the essence of the issue, but it is difficult to establish a framework for AI
technology. The pace of technological development outstrips the regulatory process,
making it hard to keep up. Laws and regulations that cannot keep pace with tech-
nological advancement quickly become outdated, failing to adequately address new
developments or risks, and can sometimes become obstacles. In addition, while AI is
developed and used worldwide, regulatory approaches vary by country and region,
complicating regulatory enactment. Establishing consistent international standards
and regulations requires extensive international cooperation and consensus building,
which is complex and slow due to conflicting national interests and concerns.
When enacting regulations for AI, it is essential to strike a balance between
fostering innovation and ensuring responsible oversight, while also considering the
timeliness of intervention. While regulations are critical for managing risks, rushing
into overly strict or premature regulations can hinder or stifle technological progress.
The challenge lies in creating policies that encourage technological advancements
while mitigating potential risks—a delicate balance that defines the art of regula-
tion. Historical examples illustrate how this balance has been successfully achieved,
benefiting both society and national development. For instance, the US government
introduced regulations when AT&T sought to expand into the computer business after
its communications technology had sufficiently matured. Similarly, current efforts
to regulate digital platforms have emerged only after the technology developed to
a level where issues of fair competition became evident. However, AI technology
differs fundamentally from these previous cases due to its vast potential power and
associated risks. Unlike other technologies, if AI is not regulated before its full devel-
opment and widespread commercialization, there may be no effective way to mitigate
its risks later. Thus, proactive research and public consultation are crucial for devel-
oping comprehensive ethical guidelines. In addition, international cooperation must
be accelerated to establish a unified regulatory framework that addresses the global
nature of AI’s influence and its potential impact on society.
When creating regulations and ethical guidelines for AI, several key areas need to
be addressed. Firstly, internal issues related to AI technology development should be
regulated through comprehensive ethical guidelines for developers. These include:
1. Transparency and Explainability: AI systems should be designed to be trans-
parent, allowing users to understand how decisions are made.
2. Bias and Fairness: Ethical guidelines should prevent biases learned during
training from leading to biased or unfair outcomes in the AI system’s operations.
3. Accountability: Guidelines are needed to clarify responsibility for decisions made
by AI systems and legal accountability for any problems that arise, including
moral and legal responsibilities related to autonomous systems like self-driving
vehicles.
4. Safety and Security: AI systems must be designed and built to operate safely and
be protected from cyberthreats. Developers should follow guidelines that ensure
the development of robust security protocols and regular vulnerability checks.
5. Sustainability and Environmental Impact: Developers should consider the
substantial energy consumption of large-scale AI models and their environmental
impact, and guidelines should encourage the development of energy-efficient AI
technologies.
Secondly, for external issues related to the use of AI technology, various laws and
systems need to be established. These include:
1. Laws on Data Use and Personal Information Handling: Strict regulations are
necessary for data collection, use, and sharing, including consent for data
collection, secure data storage, and prevention of personal information leaks.
2. Regulations to Prevent the Spread of False Information: Strict laws and enforce-
ment are needed to prevent the production and spread of false information using
AI technologies like deepfakes.
3. Compliance with Ethical Guidelines and the Principle of Harmlessness: Laws
should prohibit the use of AI for harmful purposes, such as autonomous weapon
systems development or unauthorized surveillance.
The Artificial Intelligence Act (AIA), enacted by the European Union (EU) in 2024,47
is a comprehensive legislative measure that synthesizes and addresses the challenges
and limitations of AI that we have discussed so far, including technical, ethical, and
social issues, the risks and threats posed by AI, and questions of regulation and
governance. The AIA is the world’s first comprehensive legal framework
designed to address the challenges posed by AI while promoting innovation. The
AIA aims to ensure the ethical, safe, and transparent use of AI by establishing clear
obligations for AI developers, distributors, and users. The regulation covers technical,
ethical, and social issues related to AI, while also addressing risks and threats posed
by the technology. The act focuses on the reliability, accountability, and transparency
of AI systems, ensuring they respect fundamental human rights and align with societal
values.
At the heart of the AIA is the classification of AI systems into four categories
based on their risk levels:
1. Unacceptable Risk: AI systems that pose a significant threat to human safety or
fundamental rights are prohibited. This includes AI systems used for real-time
remote biometric identification in public spaces for law enforcement purposes
(with exceptions), social scoring based on personal behavior or characteristics,
and systems that manipulate human behavior or thought.
2. High Risk: AI systems that could affect human safety, health, or rights must
comply with strict regulations. These systems include AI in healthcare, trans-
portation, and law enforcement. Obligations include conformity assessments,
risk management, and ongoing monitoring to ensure safety and compliance with
performance standards.
3. Limited Risk: These systems must meet transparency requirements. Users must
be informed when they are interacting with an AI system, and AI-generated
content must be clearly labeled (e.g., text, audio, and video, including deep-
fakes).48 Transparency and disclosure are critical to ensuring users are aware of
AI involvement.
4. Minimal Risk: This category includes AI systems that pose minimal or no
risk and can be freely used. Examples include spam filters and AI-enabled video
games, which do not require significant oversight.
47 The Artificial Intelligence Act (AIA) was first proposed by the European Commission on April
21, 2021. It was passed by the European Parliament on March 13, 2024, and unanimously approved
by the Council of the EU on May 21, 2024. The AIA was published in the Official Journal of the
EU on July 12, 2024, and came into effect on August 1, 2024.
48 The AIA regulates deepfakes. It mandates that any image, audio, or video content generated or
manipulated by AI must be clearly labeled as such. However, exceptions can be made for legal
purposes.
Industry was the first to embrace digital technology, initiating the digital transfor-
mation. The reason industries focused on digital transformation early on was due
to competition. Failing to transform digitally means falling behind in the compe-
tition and, eventually, becoming obsolete. Just as sticking to plow farming would
inevitably lead to being outcompeted by tractor farming in the industrial age, insisting
on simple tractor farming in the digital and AI era will result in obsolescence by digi-
tally enhanced autonomous tractors. Therefore, companies have recognized digital
transformation as a timely challenge and have competitively jumped into it. Because
companies led the way in digital transformation, the act of integrating digital tech-
nologies into various business areas to change operations was initially defined as
digital transformation.
Lately, as digital technologies matured, AI technology has emerged and started to
advance rapidly. Catching this trend, companies are once again shifting their direction
toward AI technology. They are applying AI technology significantly to tasks such as
processing large volumes of data, making real-time decisions, precise quality control,
and proactive customer service, aiming for automation and intelligence. This is what
is known as the AI transformation. However, since AI technology is included in
digital technology and the AI transformation is seen as an extension of the digital
transformation, they are collectively referred to as the digital-AI transformation.
The digital transformation in the industrial sector goes beyond just adopting new
digital technologies; it involves using digital tools and technologies to optimize
existing operational methods, enhance customer experience, and create new busi-
ness models. In essence, digital transformation reflects the incorporation of digital
tools and concepts into business models, processes, and organizational structures.
The digital tools supporting digital transformation in the industry are the digital
technologies that have driven the digital transformation itself. These technologies
are detailed in Chap. 4, and those closely related to industrial digital transformation
are as follows:
1. 5G Mobile Communications Technology: It is superior to 4G in terms of
data transmission speed, latency, the number of simultaneous connections, and
frequency efficiency. It is essential for future-oriented services like autonomous driving.
1 A Boston Consulting Group study found that the actual success rate of digital transformation is
only about 30%. The report suggests strategies to increase the success rate to 80%, including setting
a cohesive strategy reflecting clear change goals, rallying commitment from top executives to middle
managers, deploying top digital talent, adopting an agile organizational management mindset for
widespread digital transformation, effectively monitoring progress toward targeted outcomes, and
well-equipping business-centric modular technology and data platforms. Refer to Boston Consulting
Group’s Digital Transformation Report “Flipping the odds of digital transformation success” by
Patrick Forth et al., October 29, 2020.
The French energy company ENGIE is a notable example of successful digital trans-
formation.2 ENGIE was established in 2008 through the merger of Gaz de France
(founded in 1946) and Suez S. A. (founded in 1858), and operates in over 70 coun-
tries. In 2016, ENGIE’s CEO, Isabelle Kocher, recognized two inseparable forces
shaking the core of ENGIE’s industry, namely, digital transformation and energy
transition. She identified that decarbonization, decentralization, and digitalization
were revolutionizing the energy industry, and a fundamental digital transformation
was necessary for survival and prosperity in the new energy world. Kocher estab-
lished a vision for ENGIE’s digital transformation and announced a 1.5 billion Euro
investment over the next three years. She launched ENGIE Digital as the central orga-
nization to spread digital transformation efforts across the company. ENGIE Digital
organized the ‘Digital Factory’ internally to develop software and innovative IT
tools for company-wide distribution. Kocher then appointed a Chief Digital Officer
to lead the digital transformation and hired digital experts.
The first step in ENGIE’s digital transformation was identifying high-value
applications across the company’s operations and establishing a comprehensive
digital transformation roadmap. The Digital Factory created a comprehensive project
roadmap and prioritized tasks. First, for gas assets, it applied predictive analytics and
AI algorithms to identify the main causes of efficiency decline, reduce asset loss,
improve uptime, perform predictive maintenance, and optimize electricity genera-
tion. Second, for customer management, it widely applied various online services,
including service applications that allow customers to manage their energy usage
directly. For individual residents and building managers, it developed applications
that analyze data from smart sensors to precisely identify opportunities for energy
savings. Third, regarding renewable energy, it developed a digital platform for appli-
cations to optimize electricity production from renewable sources, using predic-
tive analytics and AI to predict maintenance conditions, identify underperforming
assets, and provide real-time analysis and maintenance information to field operators.
Fourth, for smart cities, foreseeing the global increase in urban population from 50%
currently to 70% by 2050, it aimed to build sustainable, energy-efficient, connected
cities, planning to develop and deploy numerous applications for efficient district
heating and cooling, traffic control, eco-friendly mobility, waste management, and
security.
3 Reference: ibid.
John Deere produced GPS-guided tractors, automatic steering systems, and data collection
tools. It developed agricultural management platforms like “John Deere Operations
Center” and “MyJohnDeere,” enabling the collection and management of data from
agricultural machinery and making data-based decisions possible. IoT and telematics
(telecommunication + informatics) systems were installed to collect real-time data on
field conditions and machinery performance, and agricultural management software
was developed and integrated with platforms to help farmers plan, track, and optimize
their farming activities efficiently. John Deere researched and developed autonomous
and electrically powered machinery, introducing an autonomous tractor at CES 2022
and announcing its commercial sale later that year.
By utilizing data analysis and AI, John Deere enabled decision-making based
on real-time and historical data, and with the help of 5G mobile communications,
it improved connectivity in rural areas and enhanced the effectiveness of digital
solutions. Thus, John Deere pursued digital transformation by building data plat-
forms, integrating IoT, utilizing AI, and applying autonomous and 5G technolo-
gies, contributing to increased agricultural productivity, cost reduction, and the
development of sustainable and environmentally friendly agriculture.
A particularly noteworthy aspect of John Deere’s digital transformation is inven-
tory management. Operating numerous factories worldwide and producing a variety
of agricultural machinery with many components, and facing thousands of possible
combinations of customer-selected options for custom orders, managing optimal
inventory levels in the manufacturing process is complex. This complexity is
compounded by uncertainties such as demand fluctuations, supplier delivery times,
and production line disruptions. Historically, to accommodate these uncertainties,
sufficient inventory levels were maintained to immediately fulfill orders, but this
approach entailed high costs and complex management. As part of its digital trans-
formation, John Deere developed software solutions to support production planning
and inventory management, considering all these factors. Starting with production
lines using over 40,000 parts, it developed AI applications to optimize inventory
levels and algorithms to manage stock history daily based on various parameters. As
a result, John Deere was able to optimize order parameters, quantify material usage
based on production orders, minimize safety stock levels, and consequently reduce
parts inventory by 25–35%.
When we talk about digital transformation in various sectors, it goes beyond merely
adopting new digital technologies. It involves a comprehensive change in business
models, operations, organizational structures, and decision-making processes, revi-
talizing companies. In the modern manufacturing industry, this sequence of digital
transformation begins with the adoption of digital technologies in the manufacturing
process. Narrowing down to the manufacturing sector, adopting digital technologies
in manufacturing processes, or digitalization of these processes, is central to digital
transformation. This is what is termed “Industry 4.0”. Initiated by the German govern-
ment in 2011, Industry 4.0 is an industrial policy aimed at transitioning traditional
manufacturing into smart factories equipped with intelligent production systems by
integrating ICT technologies.
The evolution of industrial society can be segmented based on the adoption of
specific technologies in the manufacturing industry, marking different industrial
revolutions. The use of steam engines powered by carbon resources signifies the
“First Industrial Revolution”; the transition to electricity as a power source marks
the “Second Industrial Revolution”; the adoption of electronics for automation indi-
cates the “Third Industrial Revolution”; and the adoption of digital technologies is
characterized as the “Fourth Industrial Revolution”. Thus, Industry 4.0 corresponds
to the Fourth Industrial Revolution. Comparing Industry 4.0 with digital transfor-
mation, while digital transformation seeks innovation across all aspects of busi-
ness, including manufacturing, operations, organization, and customer engagement,
Industry 4.0 specifically focuses on the digitalization of manufacturing processes,
representing a narrower scope of digital transformation.
The core components of Industry 4.0 include cyber-physical systems (CPS), the
Internet of Things (IoT), cloud computing, and cognitive computing.
First, a CPS integrates computation and networking with physical processes:
embedded computers and networks monitor and control physical processes, with
feedback loops allowing the physical processes and the computations to affect each
other, enabling real-time data collection and analysis. The term ‘cyber’ refers
to computers, software, and networks, while ‘physical’ refers to the actual physical
systems or processes. CPS combines computers and physical components closely,
with sensors collecting data from physical systems for digital processing and anal-
ysis. CPS systems are often used in manufacturing, energy distribution, transportation
systems, etc., for real-time monitoring and control of physical processes, continu-
ally adjusting physical work based on computational analysis. Applica-
tions include automation, smart manufacturing, smart grids, intelligent transportation
systems, and health monitoring.
Second, the IoT connects machines, devices, sensors, and people to collect and
communicate data, enhancing operational efficiency.
Third, cloud computing stores and processes data on remote servers, increasing
data scalability, flexibility, and accessibility.
Fourth, cognitive computing, designed to mimic human cognitive processes,
solves complex problems without human intervention, interpreting unstructured
data and understanding context. Cognitive computing emulates human thought
processes in complex situations, employing self-learning systems that use data mining,
pattern recognition, and natural language processing to mimic how the human
brain operates. Cognitive systems learn through interaction, improving over time,
adjusting algorithms based on new data and experiences, and naturally interacting
with users through conversations, understanding questions, and providing answers.
Applications include customer service through chatbots, healthcare, finance, and
more.
Among these four elements, IoT and cloud computing are well-known digital tech-
nologies, but CPS and cognitive computing might be relatively unfamiliar. Among
the digital technologies introduced in Chap. 4, Digital Twin and AI Machine Learning
could be similar to CPS and cognitive computing, respectively.
CPS and digital twins both integrate physical processes with digital models, collect
real-time data on physical systems through sensors to feed back into digital systems,
and are used to improve decision-making, process optimization, and predictive main-
tenance across various industries. However, CPS focuses on the integration and inter-
action between physical processes and computer systems, concentrating on control,
automation, and real-time data processing of physical systems, while digital twins
create a digital replica of a physical system for analysis, monitoring, prediction,
and simulation. CPS is interested in real-time control and interaction with phys-
ical systems or processes, while digital twins focus on simulation, analysis, and
optimization of physical systems.
Cognitive computing and machine learning process information and make
decisions using advanced algorithms. Both are subsets of artificial intelligence,
mimicking human-like intelligence and learning from data to improve performance.
They process decisions based on data, analyze large datasets to identify patterns,
and predict or act accordingly. However, they differ in human interaction. Cognitive
computing is designed to aid human decision-making, mimicking human thought
processes and problem-solving to interact with humans. In contrast, machine learning
focuses on learning from data to make predictions or decisions without being explic-
itly programmed for specific tasks, generally operating automatically in the back-
ground without human interaction. Cognitive computing is interested in comple-
menting human decision-making, while machine learning aims to create algorithms
capable of learning from data and making autonomous decisions.
Industry 4.0 has four design principles that determine whether a manufacturing
industry falls under Industry 4.0. First is interconnectivity, where machines, devices,
sensors, and people must be connected and communicate through IoT. Second is
information transparency, providing raw data about the physical system’s conditions
to create digital replicas like CPS. Third is technical assistance, offering various
information supports to aid informed decision-making and solve urgent problems in
the short term, along with physical support to reduce human physical and mental
fatigue and risks. Fourth is decentralized decisions, where CPS can make decisions
independently and perform tasks as autonomously as possible.
By adopting the four key components of Industry 4.0 described above and
adhering to the four design principles mentioned, manufacturing processes can be
made more efficient and flexible through smart devices and systems. In addition, using
data analysis and IoT, equipment can be diagnosed and maintained predictively. It also
enables the manufacturing of personalized, customized products as efficiently as mass production.
3. Data Analytics and AI: Employing data analysis techniques and AI, the Smart
Factory optimizes processes, reduces waste, and improves product quality. AI is
used for predictive maintenance, quality control, and process optimization.
4. Smart Energy Management: Implementing smart energy management systems,
the Smart Factory enhances energy efficiency and ensures sustainable operation,
reducing energy consumption, minimizing environmental impact, and cutting
operational costs.
5. Customization and Flexibility: By adopting 3D printing and digital twin tech-
nologies, the Smart Factory enables customized manufacturing and flexible
production in response to customer orders.
POSCO’s platform for the Smart Factory is “PosFrame,” developed by POSCO
ICT (renamed to POSCO DX in 2023), a subsidiary responsible for smart factories,
smart logistics, and industrial robots. PosFrame is a platform designed for digitalizing
steel manufacturing processes, featuring a simple structure that encompasses both
lower and upper layers’ functionalities, real-time control capabilities, and the ability
to directly manage equipment operation. It analyzes and controls data collected from
digital twins of various equipment through IoT, utilizing big data and AI. All compo-
nents are connected through a central data network, and all data can be accessed from
a virtual database regardless of the physical location. PosFrame provides a common
software layer offering APIs, UI/UX, and AR/VR functionalities, on top of which
basic applications, application sets, and individual applications are deployed. Basic
apps are those provided by the platform itself, app sets bundle large functions related
to factory operations, and individual apps consist of separate, smaller application
apps. Among the app sets related to factory operations, the most important is the
manufacturing execution system app, which is responsible for production, execu-
tion, and management. This app controls equipment according to the production
plan to ensure that the desired product is produced on time, optimizing production
and execution. A characteristic of the PosFrame platform is that it was developed
targeting the steel process, which means it is suitable for continuous processes such
as steel and chemical manufacturing but not appropriate for assembly processes like
electronic product assembly lines.
In general, in order to effectively introduce factory digitalization, as in the case
of POSCO’s Smart Factory, systematic preliminary preparation and procedural
execution are required.
First, preparation begins with designing the overall structure of the digital factory
and the related IT architecture. This includes deciding which digital factory platform
to use, and considering necessary sensors, IoT, data backbones, virtual databases, the
level of digital twins, platform architecture, connection to the cloud, UI/UX, security
methods, etc.
Second, prioritize processes where the digital factory can have a visible effect,
then gradually expand the successful experience to other processes as a step-by-step
strategy.
Third, install sensors and IoT to collect data and establish a communication system
to transmit the data to a virtual database through the data backbone.
Fourth, establish a control system to analyze collected data for new insights and
accordingly improve processes, planning how to distribute big data analysis and AI
functions between edge computing and cloud computing.
Fifth, establish criteria for comparing and analyzing the costs invested in the
digital factory and the benefits gained from it, and evaluate the actual application
results based on these criteria.
In order to successfully carry out a digital factory project, such preliminary prepa-
rations and execution strategies are necessary. In addition, it is advisable to start with
a big picture but begin implementation with small, definite steps. Moreover, it is
more practical to progress step by step, building one component at a time, rather
than tackling the entire digital factory at once.
The automotive industry has been at the forefront of embracing digital transforma-
tion, driven by innovations in electric vehicles (EVs), a software-centric approach,
and autonomous driving technologies. Companies like Tesla led the way in electric
vehicles, while global automakers including BMW, General Motors, Volkswagen,
Ford, and Hyundai Motor Company have joined the digital shift, propelling rapid
development within the automotive sector.
The backdrop to digital transformation in the automotive industry includes envi-
ronmental concerns. With climate change posing a global crisis, the alarm was raised
over fossil fuel usage, prompting a shift from internal combustion engine vehicles
to electric vehicles.4 The rise of Tesla, the Diesel-gate scandal in Europe,5 and
policy support in China boosted the ascent of electric vehicles from 2017, marking
a rapid uptrend. European and Chinese environmental regulations have continued to
strengthen the EV market, though the limits of battery technology and raw materials
pose questions on the growth’s extent. Hybrid cars, combining electric and internal
combustion engines, have become a widespread alternative, and hydrogen fuel cell
vehicles have emerged as a new option.6
4 Historically, electric vehicles predated internal combustion engine cars, with Detroit Electric’s
Edison electric car in 1913 reportedly capable of traveling up to 100 km at a top speed of 40 km/h
on a single charge. However, due to long charging times and heavy batteries, mass practical use
was not achieved, and internal combustion engine cars gained momentum with the introduction of
Ford’s assembly line system, the discovery of Texas oil, and the drop in gasoline prices.
5 Diesel-gate is a scandal that emerged when it was revealed that European car companies, including
Volkswagen, had manipulated emissions data. This incident brought diesel engines, which use diesel
fuel, and further, internal combustion engines themselves into focus as a factor in environmental
issues. Ultimately, it became the starting point for the movement to phase out internal combustion
engines in the 2020s.
6 Hydrogen fuel cell vehicles, operating on the principle of generating electricity through the chem-
ical reaction of hydrogen and oxygen in fuel cells, represent a promising alternative, driving electric
motors and refueled at hydrogen stations.
Data running through the connected supply chain integrates all aspects of the produc-
tion process, enabling a comprehensive overview and coordination. The establish-
ment of smart factories equipped with smart devices for data collection and analysis
improves decision-making and operational efficiency. Moreover, digital twins simu-
late, predict, and optimize the performance of actual manufacturing equipment and
processes.
By adopting smart manufacturing and Industry 4.0 principles, Hyundai Motor
Company focuses on improving efficiency, minimizing environmental impact, and
concentrating on vehicle customization and quality enhancement. The company
invests in the necessary development and employee training for manufacturing digi-
talization. The ultimate goal is to create an agile and sustainable manufacturing
process that can quickly respond to market changes and meet market demands.
Hyundai is developing a future-oriented and intelligent factory called “E-
FOREST,” a manufacturing platform that incorporates digital technologies such as
AI, robotics, ICT, IoT, and big data into innovative automated methods and human-
friendly smart technologies. E-FOREST is based on four core values, namely Auto-
Flex, Intelligence, Humanity, and Green, aiming for flexible and advanced automa-
tion in assembly, logistics, and inspection, intelligent control systems based on AI,
a human-centered work environment, and an eco-friendly factory.
E-FOREST aims to implement a smart production system capable of real-time
prediction and autonomous production by connecting and analyzing all data gener-
ated in the production plant. The smart factory integrates all plant data through an IoT
platform, provides real-time data monitoring and analysis through the “Factory-BI”
system, and manages previously unmanageable areas to maximize overall produc-
tion efficiency. By deploying robots in hard-to-reach work environments, it improves
worker conditions and enhances safety and efficiency in production plants. Based on
cloud technology infrastructure, it aims for a software-driven factory (SDF) where
production equipment control, all data and IT services, and the entire plant system
are organically connected and integrated.
E-FOREST embodies a flexible production system, high-level automation,
human–robot collaboration, custom manufacturing, intelligent factory, and quality-
completed factory. Such innovation in manufacturing processes can reduce the time
and cost of new car development, allowing for a focus on creating better vehicles,
ultimately providing consumers with high-quality products at reasonable prices.
By using big data and AI technologies to predict production scales and manufac-
ture accordingly, it allows for flexible responses to unforeseen situations, offering
products that match customer preferences and enabling customer-centric custom
production.
Hyundai Motor Company has realized the blueprint of the E-FOREST smart
factory with the completion of the “Hyundai Motor Group Innovation Center Singa-
pore (HMGICS)” in 2023. HMGICS features a cell-based flexible production system
with digital technologies, efficient production operation based on digital twin tech-
nology that synchronizes reality and virtuality, and data-driven intelligent operation.
The successful cases of companies like ENGIE, John Deere, POSCO, and Hyundai
provide valuable insights into effective approaches for industries navigating digital
transformation. Drawing lessons from their experiences and success stories can
significantly enhance the chances of success in similar initiatives. The following
presents a digital transformation strategy inspired by these insights.
First, set clear objectives and present a vision. Understand the specific challenges
and opportunities the company faces. Examine how digital transformation can bring
about changes in various aspects of business operations, such as improving customer
experience, simplifying operations, and innovating products. Predict the duration
and investment required for digital transformation and when tangible results can be
expected. Anticipate potential obstacles in the process of digital transformation and
review solutions in advance. Based on this, set goals for digital transformation and
present a vision of what the company aims to achieve. Convincing the CEO with
these goals and vision becomes a priority, considering the critical importance of the
CEO’s firm recognition and support for successful digital transformation.
Second, appoint a dedicated leader for digital transformation. The CEO should
appoint a responsible leader to oversee digital transformation, grant authority to
form and operate a dedicated team, secure the budget for digital transformation,
and ensure access and cooperation from all departments within the company. In
addition, the CEO should encourage the management team to support digital trans-
formation, accept requests from the digital transformation leader, and adapt company
management accordingly. The CEO should also lead the change in organizational
culture to embrace change, innovation, and continuous learning that aligns with
digital transformation.
Third, form a dedicated organization for digital transformation. Once the digital
transformation plan is approved and budgeted by the CEO, the digital transforma-
tion leader should secure specialists needed for digital transformation and form a
dedicated team. Securing talented specialists in IT, big data analysis, software devel-
opment, AI application, and cybersecurity maintenance is critically important for
successful digital transformation. This can be time-consuming and costly. Simulta-
neously, necessary digital devices should be purchased and installed. The team should
plan which technologies to adopt and which to develop, considering the complexity
and development duration of each technology.
Fourth, develop a specific plan for digital transformation. The digital transfor-
mation leader should organize a planning team to devise a comprehensive plan for
digital transformation, including all relevant tasks, and create a detailed implemen-
tation plan from start to finish. This involves designing the supporting IT structure,
statistical models and machine learning techniques to predict future outcomes based
on past data, and facilitate real-time analysis of large data volumes for timely insights.
Eighth, repeat performance evaluation and feedback using a measure-and-repeat approach. Build the systems, processes, products, and strategies targeted for digital transformation so that they learn from data and experience, capturing changes quickly and reflecting them effectively. This means measuring data to detect changes and providing immediate feedback to manage risks and continuously improve,
thereby effectively responding and competing in the rapidly changing business envi-
ronment. Specifically, to evaluate the performance of targeted systems, processes,
products, and strategies for digital transformation, first set specific, measurable key
performance indicators (KPIs) aligned with business objectives. Second, systemati-
cally collect quantitative and qualitative data from various sources such as sales data,
customer feedback, and website analytics. Third, analyze and interpret data using
various statistical tools and methods, and compare performance with competitors
to gain insights for decision-making and feedback for incremental changes. Fourth,
repeat this process continuously with an agile methodology to swiftly respond to
market changes.8
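As a rough illustration of this measure-and-repeat loop, the following sketch compares measured values against KPI targets each cycle and reports the gaps that feed the next round of improvement; the KPI names, target values, and the "lower is better" handling are hypothetical and are not drawn from the cases discussed in this chapter.

# Illustrative measure-and-repeat loop: compare hypothetical KPI measurements
# with targets and report the gaps that feed the next improvement cycle.

# Hypothetical KPI targets aligned with business objectives
targets = {
    "on_time_delivery_rate": 0.95,   # share of orders delivered on time
    "customer_satisfaction": 4.2,    # average survey score out of 5
    "defect_rate": 0.02,             # share of defective units (lower is better)
}

# KPIs where a lower measured value is better are inverted when judging success
lower_is_better = {"defect_rate"}

def evaluate_cycle(measurements: dict) -> list:
    """Return the KPIs that missed their targets in this cycle."""
    gaps = []
    for kpi, target in targets.items():
        value = measurements.get(kpi)
        if value is None:
            continue  # not measured this cycle
        met = value <= target if kpi in lower_is_better else value >= target
        if not met:
            gaps.append((kpi, value, target))
    return gaps

# One cycle of hypothetical measurements collected from operations data
cycle_1 = {"on_time_delivery_rate": 0.91, "customer_satisfaction": 4.4, "defect_rate": 0.03}

for kpi, value, target in evaluate_cycle(cycle_1):
    print(f"{kpi}: measured {value}, target {target} -> prioritize in next cycle")

In practice such a loop would draw its measurements from the sources mentioned above, such as sales data, customer feedback, and website analytics, and each iteration would feed the agile adjustment of the plan.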
Ninth, maintain a customer-centric approach. The success or failure of a business is
determined by sales to customers; thus, the success of digital transformation depends
on how customers accept it. A customer-centric approach is crucial throughout the
business process, requiring a shift in mindset and changes in organizational processes
and strategies. This approach goes beyond providing good customer service to prior-
itizing customer value and satisfaction in all aspects of the business. To practice
this, research customers' needs, issues, and expectations thoroughly by collecting customer feedback and communicating through social media, customer service channels, and community forums. Gain insights into customer behavior and
preferences through data analysis, improve customer satisfaction by interacting with
customers, and design products and processes with customers in mind, customizing
products and services to individual customer needs. In customer management, use
customer data to gain insights, reflecting this in marketing strategies, product devel-
opment, and customer service improvements. Focus on building long-term relation-
ships with customers, continuously improving products and services, rather than just
focusing on transactions.
8 ‘Agile’ signifies being quick and adaptable. In ‘agile methodology,’ it refers to swiftly adjusting
to changes and quickly applying these adjustments to business processes. Initially used in software
development, agile methodology involves developing software in iterative cycles, continuously
incorporating feedback and evolving requirements. This approach allows for dynamic development
and improvement. Applying agile methodology beyond software to business operations involves
shifting from traditional hierarchical structures to collaborative, horizontal frameworks. This empha-
sizes rapid response to customer needs and integrating insights into business strategies, fostering
an environment of innovation, adaptability, and continuous improvement.
In the past, the Industrial Revolution shifted societies from agrarian to industrial
structures, dramatically altering the fabric of daily life. The focal point of life moved
from rural communities centered on farmland to cities built around factories, with
lifestyles evolving to incorporate products mass-produced in these new industrial
centers. Today, as industrial society transitions into a digital society, cities remain
the primary living spaces, but an increasing number of people are now working
or participating in activities within cyberspaces like the metaverse. Additionally,
digital technologies are permeating various industries, leading to profound changes
in lifestyles. Whereas the shift from agrarian to industrial societies increased the
need for travel to work and social engagements, the transition to a digital society has
reduced physical travel due to innovations such as remote work, online shopping,
and social media.
As this transition from an industrial to a digital society unfolds, the impact of
digital technologies on human life is becoming increasingly apparent. Digital tech-
nology is transforming everything from communication methods, access to infor-
mation, and work and learning practices, to lifestyles. This rapid pace of change,
which accelerated during the COVID-19 pandemic, has reshaped not only individual
lives but also business operations and government administration. Digital transfor-
mation is now redefining social behavior, with both positive and negative effects.
On the positive side, it improves access to information and services, enhances social
connectivity, and drives industrial innovation. However, it also presents challenges
such as the digital divide, digital illiteracy, job displacement, misinformation, privacy
concerns, and ethical dilemmas. As such, it has become crucial to observe the impact
of digital transformation on education, politics, society, and culture, while addressing
the complex problems that arise alongside these changes.
Looking around society today, it is clear that lifestyles have changed significantly
compared to one or two decades ago. Digital devices have become central to both
work and daily life, with smartphones being indispensable. People now attend meet-
ings via video conferencing and work remotely when commuting is challenging.
Instead of using traditional cookbooks, they turn to online videos like YouTube for
recipes. Education has also adapted, offering remote access to lectures and seminars
from abroad. When faced with questions, people turn to internet searches or AI like
ChatGPT instead of asking someone directly. For navigation, they rely on digital maps
and GPS to drive to unfamiliar destinations. Movie watching has shifted from sched-
uled theaters or TV programming to on-demand platforms like Netflix. Commu-
nication with friends is done through messenger services and social media, while
video calls with international contacts are made using free apps. When addressing
social issues, people express opinions on social media and form online communi-
ties for collective action. Booking transportation and shopping are done online, often
through apps on smartphones. People monitor their health through wearable devices,
with data sent directly to hospitals for remote consultations. These lifestyle changes
show how deeply our society is immersed in digital transformation.
Digital technology has revolutionized both work and lifestyle, offering significant
conveniences. The backbone of this transformation is the advancement of information
and communication technologies (ICT), which enables high-speed global connec-
tions. The shift from voice-centric telephone networks to internet-based systems
that handle video, data, and voice, along with the expansion of optical and wireless
networks, has built an infrastructure that allows for unconstrained communication
and information access across borders, time zones, and formats. This has opened
up new avenues for remote social and business activities, creating cyberspaces that
transcend physical limitations. Beyond simply acquiring information, these spaces
enable the formation of human networks for sharing ideas and communication. The
rise of social network services (SNS) has ushered in an era of hyperconnectivity,
allowing people worldwide to connect and collaborate as though they were in the
same space. Social media has fundamentally changed how people build relation-
ships, share information, and even organize for political and social causes, offering
a powerful tool for raising grievances and shaping public opinion.
The rise of digital technology has also transformed commerce. Companies no
longer need to set up physical stores or rely solely on TV advertising to reach
customers. They can now list products on mobile platforms like the App Store or
Play Store, instantly reaching a global audience. Consumers can browse, compare,
and purchase products through mobile apps, breaking the constraints of time and
space. This shift has enabled small and medium-sized enterprises (SMEs) to launch
products in the global market without incurring high costs or delays. From the
consumer’s perspective, it is easier to gather comprehensive information, compare
prices, and make informed purchasing decisions. This transformation has changed
consumer behavior and created new consumer trends, where even geographically
Digital transformation has reshaped the way people live and interact, giving rise
to new cultural norms. This shift has altered communication methods, access to
and sharing of information, artistic and cultural expression, economic activities,
work processes, education, and political and social engagement. At the same time,
it has introduced new concerns around privacy and ethics. Digital technology has
revolutionized how people communicate, with social media platforms, messaging
services, and video calls enabling connections that transcend geographical bound-
aries. Expressive mediums have expanded to include emojis, emoticons, internet
slang, and memes, creating new forms of communication, especially among younger
generations.1 The internet has made it easy to search for and share information,
fostering access to vast knowledge. Digital platforms have also facilitated the creation
of diverse online communities, where people with shared interests can connect. More-
over, the rise of virtual worlds like the metaverse allows individuals to live dual lives
as themselves and as avatars, bridging the gap between real and virtual environments.
These developments demonstrate how digital technology has enabled the creation of
new cultural frameworks that were unimaginable in the past.
Digital technology has transformed daily life into a highly individualized expe-
rience, with smartphones at the center. Whether on public transportation or else-
where, it is common to see individuals immersed in their smartphones, using them
for tasks such as calling, texting, reading the news, booking tickets, web surfing,
shopping, streaming videos, gaming, and more. A smartphone now integrates the
functions of a phone, television, computer, camera, and more into a single device.
It has become a personal vault, storing photos, calendars, contacts, call logs, chat
histories, emails, and payment details. In essence, the smartphone represents the
convergence of communication and computing, and it has become an indispensable
all-purpose tool for modern humans. Thus the individualized, smartphone-centered
lifestyle has become the cultural norm of the digital transformation era.
As previously discussed, digital transformation has brought significant changes to
industries, altering corporate ecosystems, work structures, and methods, and creating
new corporate cultures. Education and learning have also been transformed, with
digital literacy now an essential skill. New cultural dimensions have emerged in
how education is delivered, with remote learning and online resources becoming
common. In addition, political and social participation have changed, as digital plat-
forms have enabled the consolidation of opinions and the formation of groups for
activism and civic engagement, giving rise to new cultures of political and social participation.
In the arts, digital technology has introduced new tools and techniques that have
revolutionized artistic creation. Musicians now compose with tools like Musical
Instrument Digital Interface (MIDI), experiment with Virtual Studio Technology
Instrument (VSTi), and use Digital Audio Workstations (DAWs) to record, edit,
and arrange music.2
1 Emojis are small digital icons used to express emotions or concepts, commonly employed in text messages and on social media platforms via mobile devices and computers. Unlike emojis, emoticons are composed of keyboard characters and symbols arranged to represent facial expressions or convey emotions, primarily used in text-based communication. Internet slang refers to abbreviations and acronyms that originate from online culture, making digital communication more efficient. The term meme, first introduced by Richard Dawkins in his seminal work The Selfish Gene, originally referred to an idea, behavior, or style that spreads within a culture, functioning similarly to how genes transmit biological information. In today's context, memes (or internet memes) have evolved into a key element of online culture, particularly among Generation Z. Memes often start as viral internet content—such as humorous images, videos, or parodies—that capture widespread attention and are shared extensively across social media. This phenomenon reflects a unique form of digital culture shaped by the interaction between advanced technology and the communication habits of Generation Z.
These tools allow for streamlined workflows, experimentation,
and collaboration. In visual arts, digital tools enable both “fully digital” art, where
artists create directly on a digital canvas using software, and “semi-digital” art, where
traditional artworks are scanned and digitally enhanced. New genres such as elec-
tronic dance music (EDM), chiptune, generative art, pixel art, VR art, AR art, and
AI-generated art have emerged from the intersection of art and digital technology,
providing new avenues for artistic expression.3
Film production has also evolved, thanks to advancements in computer graphics
and 3D technology. Movies like Avatar showcase the potential of digital technology,
using CGI to create characters like the Na’vi and combining real and digital envi-
ronments to craft exotic settings. Motion capture technology adds lifelike movement
to digital characters, while 3D camera technology enhances visual depth, creating
immersive cinematic experiences. As such, digital technology has provided film-
makers with tools to realize rich imaginations and creative challenges in cinematic
form, offering audiences enchanting and compelling movie experiences. Recently,
with smartphones equipped with powerful cameras and video editing apps, film-
making has become popularized—anyone can now shoot, edit, and produce films.
Features like wide-angle, telephoto, and macro lenses, along with adjustable reso-
lution and frame rate, allow users to create personalized films without professional
equipment.
Digital technology has reshaped the landscape of direct-to-consumer platforms,
drastically impacting cultural evolution. Much like how the App Store and Play
Store revolutionized the app industry, these platforms enable content creators to
bypass traditional distribution networks and connect directly with their audience.
OTT platforms exemplify this shift, allowing creators to offer their content without
intermediaries.4 This transformation has had a profound effect, as seen with the
2 MIDI: Musical Instrument Digital Interface, a protocol that standardizes the exchange of digital
signals between electronic musical instruments. VSTi: Virtual Studio Technology Instrument, a
plugin format adopting the standard specification (VST) used for connecting electronic music editing
software, recording systems, synthesizers, etc. DAW: Digital Audio Workstation, a workstation
supporting the playback, recording, and editing of digital audio.
3 Electronic dance music (EDM) is a music genre that centers on synthesized sounds and is charac-
terized by strong beats and electronic production techniques. Chiptune is a style of music made using
vintage video game hardware or emulators, often from the 1970s and 1980s, to produce unique,
nostalgic sounds reminiscent of early video games. Generative music employs algorithms and
coding to create self-generating or evolving compositions, offering dynamic and often unpredictable
musical experiences. In the realm of digital art, various forms have emerged alongside technological
advances. Pixel art uses small, square pixels to create images, evoking a sense of nostalgia for early
video game and computer graphics aesthetics. VR art allows artists to create immersive 3D envi-
ronments through virtual reality headsets, offering interactive and multi-dimensional experiences
that transcend traditional artistic boundaries. AR art layers digital creations onto the physical world,
visible through augmented reality headsets or smartphones, to create interactive and location-based
experiences. AI art leverages artificial intelligence and machine learning algorithms to generate or
refine artworks, fostering a new frontier of collaboration between human artists and AI systems.
4 OTT (short for "Over the Top") refers to services that deliver content directly to consumers over the internet, bypassing traditional broadcast, cable, or satellite television platforms that historically controlled content distribution. The term "set-top box" originates from the early days of television, when an external device was placed on top of the TV to receive and decode broadcast signals from satellite, cable, or other direct broadcast methods. These devices were necessary for converting signals into viewable content on a television. In contrast, OTT services utilize the internet to stream content directly to consumers on various devices, such as televisions, smartphones, tablets, and computers, without the need for traditional broadcasting methods or intermediary hardware like a set-top box. This direct-to-consumer model allows for a more flexible and extensive content delivery system, including movies, TV shows, live events, and more, offering greater convenience for users. Leading examples of OTT platforms include Netflix, YouTube, Hulu, Disney+, AppleTV+, and Amazon Prime Video, each offering a vast library of on-demand content tailored to a global audience.

In the digital age, the increasing use of digital devices has greatly improved work efficiency, resulting in fewer people being needed to handle the same workload. The widespread adoption of RPA has led to significant reductions in production jobs. Moreover, the development of AI has not only diminished clerical roles but is also beginning to transform jobs that require higher levels of cognitive and analytical skills. For example, banking transactions have shifted to fintech platforms, and services like ticketing and ordering have been replaced by self-service touch screens. As face-to-face services transition to electronic transactions and operations move to cyberspace, the demand for traditional jobs continues to decrease. While the current global job shortage is partly due to economic downturns, a more fundamental cause is the reduction of jobs caused by advancements in digital technology, which is an issue commonly affecting both developed and developing countries.

The digital era is expected to bring considerable changes to the job market. Automation and digitalization are reducing the need for some jobs while simultaneously creating new opportunities. Roles in repetitive administrative tasks, data entry, basic data analysis, manufacturing, retail, and customer support are declining. Meanwhile, demand is growing for jobs such as data scientists, cybersecurity experts, AI
lays the foundation for social equality, equal opportunities for information access, and
equal educational opportunities. Furthermore, it serves as a basis for equitable partic-
ipation in economic activities, social integration, and economic development. Digital
inclusion enables all members of society to access knowledge and technology, with a
particularly close relationship to education. Establishing a digital inclusion environ-
ment ensures that educational opportunities in the digital age are equally available to
students, allowing each student to develop digital literacy.5 Digital literacy, alongside
science literacy, is an essential trait for living in the twenty-first century, an era of
digital transformation and advanced science and technology.6
5 Digital literacy encompasses the technical and cognitive abilities required to effectively search,
interpret, create, and communicate information in a digital environment, going beyond merely
knowing how to use digital devices. It implies the capacity to navigate and understand information in
digital platforms, assess the credibility of information, use digital tools and resources critically, and
comprehend issues related to online safety and privacy. This competence is essential for consuming
and producing information in modern society, enabling individuals to actively participate in the
digital world.
6 Science literacy refers to the knowledge and understanding necessary to grasp scientific concepts,
methods, and reasoning. It involves the ability to think critically about scientific information, inter-
pret scientific data and arguments, understand the nature of scientific inquiry, and apply scientific
principles in everyday life. This literacy extends beyond mere familiarity with scientific facts,
encompassing the ability to engage with scientific content, evaluate the reliability of scientific
information, and make informed decisions based on scientific evidence. It includes foundational
knowledge in key disciplines such as physics, chemistry, biology, and earth sciences, as well as the
capacity to utilize numerical and digital tools to interact with scientific data.
with the necessary educational content and tools and that all students have equal
access to them to enhance educational outcomes. Teachers must first be familiar with
digital technology and develop ways to effectively use it in their teaching. However,
protecting student privacy and online safety is essential when incorporating various
digital technologies into education, necessitating strong data security measures.
Utilizing digital technology in education signifies a major transformation that can
face various obstacles and resistance. Socioeconomic and geographical disparities
that prevent equal access to digital devices and internet connectivity for all schools
and students are significant barriers. Even with digital infrastructure, failure to update
it according to technological changes can become another obstacle. Costs associated
with installing digital infrastructure, purchasing various applications and services,
and updating them can also be barriers to adopting digital education methods. The
potential for issues with handling student personal information in the process of
using diverse digital technologies and tools in education necessitates robust privacy
protection measures. In addition, there might be hesitation or resistance from teachers
or parents toward adopting digital technology, requiring strategies to support teachers
in understanding the importance of digital technology and effectively using it in
education.
Ensuring that all students can access digital devices and the internet without
discrimination and enhancing the effectiveness of education through the application
of digital technology are merely the initial steps in the digital age of education.
Utilizing various digital tools in education to enhance students’ digital literacy and
scientific literacy is just the basics. The critical point is that the content of education
needs to change, as education in the digital age must be restructured to prepare for
a future where humans coexist with digital technology.
In the digital era, humans will live alongside various digital technologies and
devices, including AI and intelligent robots. It is necessary to research how humans
can coexist harmoniously with these digital machines and what the human role will
be in such situations. Moreover, it is crucial to closely examine what capabilities
humans need to effectively fulfill these roles and how education should change to
nurture these abilities. Observing the development of AI, it is essential to understand
anew what it means to be human in light of AI and what capabilities humans need to
coexist with it. Furthermore, in preparation for the future when humanoid AI robots
achieve or surpass human abilities, research is needed on maintaining a mutually
beneficial symbiotic relationship and reflecting this in education.
In principle, if machines can outperform humans in certain tasks, it is better to
let machines do those tasks, and humans focus on what they do best. For example,
there is no need for humans to memorize information that can be easily accessed via
internet searches or by asking ChatGPT. While there are movements to exclude
AI from education due to the confusion it may cause, defensive measures alone would
not wisely address the future. Instead, it is better to explore what capabilities humans
need to effectively utilize AI and educate accordingly. However, this general principle
cannot be applied uniformly in all cases. Decisions should be made after examining
each case and identifying its essence. For example, extreme views, such as excluding
ChatGPT over concerns it might write student reports or asserting that students no
longer need to learn writing because ChatGPT can do it, miss the point. The essence
of writing and its educational benefits, such as expressing thoughts, fostering critical
thinking, and creativity, must be considered first. Writing education is necessary even
to evaluate whether a composition by ChatGPT is proper. Considering these points,
writing education remains essential, regardless of ChatGPT’s presence.7
As we navigate through the digital transformation era, society faces numerous chal-
lenges in its political and social spheres. Calls for fairness and justice are overshad-
owed by increasing instances of injustice and unfair practices. Critical voices face
both physical and psychological harassment from zealous supporters of certain indi-
viduals or political parties, leading to widespread discomfort. Despite clear exposure
of deceit and misconduct, there is a troubling persistence of defiance without any
signs of shame or guilt. The convenience of the internet and social media comes at
the cost of enduring harmful comments and the rapid spread of harmful ideologies,
further aggravating social conflict and division. The circulation of misinformation
and fake news not only intensifies these conflicts but also skews public perception
and influences election outcomes, creating a landscape where digital advancements
contribute to complex societal dilemmas.
Such pathological phenomena are not directly caused by digital transformation.
They arise from various factors, among which post-truth and tribalism phenomena
stand out. Post-truth refers to the phenomenon in which emotional appeals sway people more than objective facts, distorting the truth, while tribalism involves acting according to the identity of one's group. Although these factors have always
existed, their prominence today is fueled by growing sociopolitical and economic
issues like income disparity, social dissatisfaction, inequality, and instability, and the
powerful dissemination tools provided by digital technologies.
Digital transformation, signified by hyperconnectivity, has enabled activities
beyond the constraints of time and space, such as acquiring information, distributing
it, expressing opinions, and collective action, bringing revolutionary changes to polit-
ical and social activities. It has maximized the openness of information, reducing the
possibility of power concentration through information monopoly and advancing
democratization. Converting government administrative tasks to e-governance has
increased national transparency and efficiency and improved public services, while
7 Writing can indeed be broadly categorized into creative and critical domains. Creative writing,
encompassing essays, poetry, and fiction, fosters imaginative thinking and the expression of personal
or imaginative narratives. On the other hand, critical writing, which includes columns, research
papers, and critiques, is geared toward analytical and evaluative thinking, aiming to deepen under-
standing and articulate well-reasoned arguments. Both domains play crucial roles in enhancing
writing skills: while creative writing allows for the exploration of ideas and emotions in novel
ways, critical writing develops the ability to assess, argue, and articulate complex ideas clearly and
effectively.
digital technology has facilitated electronic voting and national opinion collection.
Social media allows for expressing opinions and participating in group activities,
eliminating geographical barriers in political and social activities. However, misuse
of social media and internet broadcasts in political activities can lead to the spread of
false information and distorted public opinion. The manipulation and distortion of
information for illegal power gain can regress democratization, and personal infor-
mation leakage and malicious comments can violate human rights, while the produc-
tion and dissemination of misinformation and fake news can confuse public opinion.
In addition, various media turning to indirect advertising for profit poses risks by
covertly distorting facts and biasing public opinion.
In the digital transformation era, the integration of digital technologies into poli-
tics and society has both positive and negative aspects. One of the most positive
aspects is the emergence of digital or e-government. Digitizing all documents and
sharing them among government departments for electronic processing and making
information accessible to the public can enhance the efficiency and transparency of
government operations. This digital shift enables online processing of administrative
tasks, license applications, and tax payments, saving time and costs. Digital elections
could potentially increase participation rates through remote voting and improve
accuracy and efficiency by electronically processing votes.8 Moreover, digital plat-
forms can collect real-time public opinions on specific political issues or national
governance, allowing digital petitions and open policy proposals to be reflected in
policymaking. This approach makes political and policy decisions more grounded
in reality, enhances citizen participation and ownership, and advances democracy.
On the negative side, the digital era provides means for the rapid spread of
misinformation and fake news. Previously, information and news were disseminated
through formal newspapers and public broadcasts, which had mechanisms for fact-
checking, making it difficult for unverified information to be published. However,
social media and internet personal broadcasts in the digital age can disseminate infor-
mation indiscriminately and instantly without fact-checking, creating a significant
impact due to the echo chamber effect.9 Misinformation and fake news pose a critical
risk when used in politics, packaging distorted beliefs or extreme opinions in misin-
formation to conduct malign campaigns, manipulate public opinion, incite the public,
and contaminate elections with populist promises. This not only risks reversing elec-
tion outcomes but also places democracy in jeopardy. Persistent misinformation,
manipulation of public opinion, and incitement can undermine social trust in the
government, media, and electoral systems, leading to extreme political strife, factionalism, and public opinion division, jeopardizing national unity. Adversarial nations might exploit this to intervene in elections and politics through misinformation and manipulation.10

8 If the voting and counting process is handled electronically, accuracy and efficiency improve, assuming a normal situation without attempts to manipulate election results through hacking. In reality, suspicions of such hacking attempts cannot be completely dispelled, and there is no technological guarantee that hacking can be entirely prevented. Therefore, there are arguments to return to paper ballots and manual counting as in the past, and some countries are actually implementing this.

9 Social media inherently encourages group conformity among its members, which can lead to collective action based on group identity. If misused politically, combined with misinformation and extreme ideology, it can form hostile factions and provoke destructive behaviors.
The indiscriminate spread of misinformation and fake news has devastating conse-
quences, making it urgent to devise appropriate strategies to combat it. A key aspect of the response strategy is establishing methods to determine whether circulating information is false and taking legal action based on those findings.
From a preventative standpoint, it is also necessary to run digital literacy campaigns
to promote the correct use of digital technologies. Accurately discerning the truth-
fulness of information disseminated through media requires interdisciplinary collab-
oration among experts in various fields such as journalism, social psychology, soci-
ology, political science, science and technology, and communications. To implement
this, regular meetings among these experts for organic networking are essential. A
viable solution for promptly addressing the issue of false information could involve
deploying AI technology for real-time filtering, an approach termed ‘AI filtering.’
This ‘technological treatment’ leverages advanced algorithms to automatically distin-
guish between factual and misleading content, offering a proactive measure against
the spread of misinformation. This could serve as a real-time solution, supple-
mented by expert group verification as a post-treatment method. For social media,
both real-time and post-event actions are necessary to prevent the amplification of
misinformation and fake news, ensuring appropriate legal measures are taken.
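The 'AI filtering' idea can be sketched, purely for illustration, as a text classifier trained on examples labeled as reliable or misleading, which then flags suspicious posts for human review; the tiny training set, labels, and threshold below are invented, and a real system would need large, carefully curated data, continual retraining, and the expert verification described above.

# Toy sketch of AI-assisted filtering: a text classifier flags posts that
# resemble previously labeled misinformation, for human fact-checkers to review.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented labeled examples (1 = misleading, 0 = reliable); real systems
# require far larger, carefully curated datasets.
texts = [
    "Miracle cure banned by doctors, share before it is deleted",
    "Secret memo proves the election was decided in advance",
    "City council approves budget for new public library",
    "Health ministry publishes updated vaccination schedule",
]
labels = [1, 1, 0, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

def flag_for_review(post: str, threshold: float = 0.7) -> bool:
    """Flag a post when the predicted probability of being misleading is high."""
    prob_misleading = model.predict_proba([post])[0][1]
    return prob_misleading >= threshold

print(flag_for_review("Leaked document shows the shocking truth they are hiding"))

Automated flagging of this kind is only a first pass; as noted above, expert group verification remains necessary as a post-treatment step, and legal measures follow from that review.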
It is crucial to recognize the transformative impact of social media platforms,
which emerged through digital transformation, on political activities and social move-
ments. The most shocking change that hyperconnected social media has brought to
the political-social environment is collective action mediated by the internet. SNS
and internet personal broadcasts via platforms like YouTube are prime examples.
SNS, as a network of social relationships formed for communication, information
sharing, and expanding contacts, allows its members to express and share personal
statements. Internet personal broadcasting provides a means to propagate personal
views to an unspecified majority. Using these social media platforms, it is possible
to form groups that share opinions and engage in collective actions, ranging from
aggressive cyberactivities like negative comments to physical collective actions like
protests. The distinctive feature of such collective actions today is their formation and
execution beyond the constraints of time and space. Group members can participate
in collective actions even if they are dispersed across different regions or nations,
and even actors located in adversarial nations can join in to disrupt and aggravate situations. There-
fore, it has become a critical task to sensitively address these environmental changes
and develop multilayered countermeasures for personal information protection and
scan the entire population of China in just one second, identify moving individuals,
and offer up to 99.8% accuracy by considering facial expressions, movements, and
variations in light and shadow. Following the Skynet project, the Chinese govern-
ment made it mandatory from December 2021 to register facial information when
activating mobile phones. Eventually, Skynet, armed with more advanced digital
and AI technologies, will closely monitor every move of its citizens, protecting the
communist regime from anti-establishment unrest.
In addition, China has implemented a nationwide Social Credit System that
assigns credit scores to individuals and companies. This system evaluates a range of
factors, including financial behaviors like income tax payments, utility bills, and loan
repayments; social behaviors like traffic law compliance and public transport fare
payments; and online behaviors such as online conversations, comment reliability,
and shopping habits. Citizens begin with a base score that is adjusted based on their
actions, with positive behaviors like timely tax payments, public welfare contribu-
tions, and blood donations increasing one’s score, while actions like environmental
violations, jaywalking, or parking infractions result in deductions. The accumulated
scores influence various aspects of life, such as insurance premiums, school admis-
sions, scholarships, internet access, high-speed train and flight eligibility, foreign
travel, public sector job applications, and loan interest rates. For instance, a high score
may grant benefits such as priority hospital reservations, discounts on utility bills,
lower loan interest rates, and free health check-ups, whereas a low score may result
in difficulties in securing public sector employment, limited children’s admission to
private schools, or restrictions on travel and accommodation options.
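To make the general mechanism described above concrete, and only for illustration, the following toy sketch applies invented point adjustments to a base score and maps score bands to invented benefits or restrictions; the actual criteria, weights, and thresholds of China's system are not publicly specified and are not reproduced here.

# Hypothetical illustration of a behavior-based scoring mechanism; every point
# value and threshold below is invented purely for explanation.
BASE_SCORE = 1000

# Invented adjustments for recorded behaviors (positive raises, negative lowers)
ADJUSTMENTS = {
    "timely_tax_payment": +15,
    "blood_donation": +20,
    "public_welfare_contribution": +10,
    "jaywalking": -10,
    "parking_infraction": -5,
    "environmental_violation": -50,
}

def score(events: list) -> int:
    """Apply recorded behavior events to the base score."""
    return BASE_SCORE + sum(ADJUSTMENTS.get(e, 0) for e in events)

def consequences(s: int) -> str:
    """Map a score band to invented benefits or restrictions."""
    if s >= 1030:
        return "priority reservations, utility discounts, lower loan interest"
    if s >= 970:
        return "no special benefits or restrictions"
    return "restricted travel bookings, limits on public sector applications"

citizen_events = ["timely_tax_payment", "jaywalking", "jaywalking"]
s = score(citizen_events)
print(s, "->", consequences(s))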
While China’s Social Credit System may initially appear as a structured way to
incentivize positive behavior, it raises several concerns. One issue is the fundamental
concept of evaluating individuals through a scoring system, which could be seen as
conflicting with the idea of personal freedoms. In addition, the criteria for evalua-
tion may be subject to interpretation. For example, a rule about reducing one’s score
for spreading false information could be applied broadly, potentially affecting those
who express dissenting views. In some cases, evaluation criteria extend beyond indi-
vidual social behavior to include aspects like social circles, personal relationships,
or political and religious views. These aspects, which may not be directly related
to an individual’s actions, could disproportionately impact their score. Such criteria
raise questions about fairness and transparency and may be viewed as mechanisms
for exerting social control rather than promoting societal well-being.
China’s Social Credit System integrates with existing technical surveillance
methods, such as cameras and facial recognition, creating a dual system of over-
sight. The combination of retrospective surveillance with the proactive elements of
the credit system allows for a more comprehensive form of monitoring. While this
system is designed to encourage socially harmonious behavior, critics argue that it
also serves to regulate public expression and dissent. The system’s opaqueness, where
individual score criteria are not fully disclosed, can lead to uncertainty about which
actions will affect one’s score. This lack of transparency may foster a climate of
self-censorship and social caution, as individuals may avoid associating with those
whose scores have been lowered to avoid negative repercussions. As a result, the
Social Credit System may not only function as a governance tool but also as a means
of influencing public behavior and maintaining control over societal discourse.11
While China was the first to implement the Social Credit System, it is a mecha-
nism that could potentially be adopted by other authoritarian states. However, even
democratic nations are not immune. As seen in the USA, considered a benchmark of
democracy, the selection of a president can significantly sway national governance.
This prompts a crucial reflection on how to prevent the creation of new surveillance
tools and stop nations from descending into surveillance societies.
As a measure against the Social Credit System, the first consideration could be
to enshrine the protection of human rights, privacy, and the prohibition of guilt by
association in the constitution, safeguarded by the Supreme Court as guardians of
constitutionalism and the rule of law. This assumes that Supreme Court justices
are appointed based on their dedication to upholding the constitution and the rule
of law. However, appointments of justices can be influenced by the president, and
in situations where political tribalism and populist politics may challenge the rule of
law, constitutional provisions alone are not reassuring.12
Another potential strategy to address surveillance society issues is to legislate restrictions on the long-term storage of personal information by governments and corporations. Specifically, laws could be established to limit the retention period of sensitive information, such as political or religious preferences, or, preferably, to
prevent the collection of such information altogether. In addition, forming interna-
tional agreements, though non-binding, could solidify national commitments as an
international promise. The ultimate recourse is for citizens to stand against authori-
tarian or controlling governments to protect human rights and freedom. While resis-
tance movements may face limitations in countries already under surveillance by
authoritarian regimes, in liberal democracies, early opposition against the installation
of digital or social surveillance systems can have a chance of success.
Digital platform companies use digital technologies to massively collect and analyze
user data, employing it for various purposes such as customized advertising, product
development, and market forecasting to pursue economic gains. This data includes
11 The Artificial Intelligence Act (AIA), enacted by the EU in 2024, prohibits AI systems that
evaluate individuals’ social behavior and impose benefits or punishments based on such evaluations,
as well as systems that manipulate people’s behavior or thoughts in an unfair manner. See Chap. 5,
Sect. 5.9.5 for reference.
12 “Political tribalism,” a term coined by Amy Chua, refers to the inherent human instinct to affiliate
with groups, fostering a sense of belonging and attachment. Chua elaborates on tribalism’s dynamics,
noting that once individuals align with a group, their identities become remarkably intertwined with
that group. This allegiance compels individuals to aggressively support their group members, often
leading to unwarranted hostility toward outsiders. This insight is detailed in Amy Chua’s Political
Tribalism, published in 2020.
13 Shoshana Zuboff, author of The Age of Surveillance Capitalism, views the surveillance capitalism
society as one where online-based product sales and marketing in the digital platform era primarily
rely on individual digital traces. Platform companies extract data left online by individuals for
free, gaining commercial profits and power. Similar to how industrial capitalism utilized labor, the
power-holders of surveillance capitalism consume every digital trace of individuals, increasingly
amplifying their power and reducing individuals to a state of slavery. Individual data is not only
collected, analyzed, and categorized for commercial use but also employed to guide, control, and
manipulate individuals. People become custom consumers who only consume what algorithms
present based on their data, transitioning from beings with free will to analyzed data and puppets
utilized for others’ gains.
Maps, and YouTube, to build detailed user profiles. These profiles are further enriched
with information collected through cookies and tracking technologies that monitor
users’ online behaviors, enabling more precise ad targeting. Leveraging big data
analysis and machine learning, Google analyzes these profiles to detect patterns and
predict user behaviors and preferences. This approach allows Google to deliver highly
targeted advertisements, enhancing the effectiveness of its advertising platform. As a
result, advertisers benefit from more tailored ad placements, and Google strengthens
its revenue streams from advertising.14
While Google is often used as an example, other digital platform operators engage
in similar practices. Companies like Meta (Facebook) and Amazon, which rely
heavily on digital advertising revenue, actively collect user information through
their social networks and online commerce platforms for advertising and marketing
purposes. Data points such as conversations, messages, ‘likes’ on Facebook, photos
uploaded to Instagram, and comments on those photos are captured, analyzed, and
stored in individual user profiles. This enables platforms to gain deep insights into
personal information, including food preferences, travel habits, religious views, polit-
ical leanings, and social interests. The data is then used to deliver targeted adver-
tisements, predict future actions, and, in some cases, influence behavior. As a result,
users are under continuous data collection by these platforms. These practices have
contributed to the emergence of the concept known as ‘surveillance capitalism,’
where user data is monetized as part of the economic model.
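A minimal, hypothetical sketch can make this profiling mechanism concrete: scattered interaction events are aggregated into an interest profile, which is then used to rank candidate advertisements. The event types, category mapping, and advertisements below are invented and do not represent any platform's actual pipeline.

# Hypothetical sketch: aggregate user interaction events into an interest
# profile, then rank candidate ads by how well they match that profile.
from collections import Counter

# Invented interaction log: (event_type, topic)
events = [
    ("like", "hiking"), ("search", "hiking boots"), ("purchase", "tent"),
    ("like", "vegan recipes"), ("watch", "camping gear review"),
]

# Invented mapping from raw topics to interest categories
TOPIC_TO_CATEGORY = {
    "hiking": "outdoors", "hiking boots": "outdoors", "tent": "outdoors",
    "camping gear review": "outdoors", "vegan recipes": "food",
}

def build_profile(event_log: list) -> Counter:
    """Count how often each interest category appears in the event log."""
    return Counter(TOPIC_TO_CATEGORY.get(topic, "other") for _, topic in event_log)

def rank_ads(profile: Counter, ads: dict) -> list:
    """Order candidate ads by how strongly their category matches the profile."""
    return sorted(ads, key=lambda ad: profile.get(ads[ad], 0), reverse=True)

profile = build_profile(events)
candidate_ads = {"trail_shoes_ad": "outdoors", "plant_milk_ad": "food", "sedan_ad": "auto"}
print(profile)                           # e.g. Counter({'outdoors': 4, 'food': 1})
print(rank_ads(profile, candidate_ads))  # the outdoors-related ad ranks first

The same aggregation, applied to far richer data and far more powerful models, is what the following paragraphs describe as turning behavioral surplus into predictions and targeted advertising.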
When digital platform companies like Google, Meta, and Amazon collect more
user data than necessary for improving service quality and use that surplus data for
targeted advertising, it enables the phenomenon of surveillance capitalism. These
platforms analyze data generated by users’ online activities, gathering comprehen-
sive information such as location, movement patterns, interests, social networks,
consumption habits, search behaviors, and even political views and religious pref-
erences, which are stored in individual profiles. By analyzing these profiles, plat-
form operators gain insights into user behaviors, creating a parallel between data and
behavior—with “surplus data” representing “behavioral surplus.” Although platform
companies claim that user data is collected for service improvement, only a portion
of it is used for that purpose. The remainder—behavioral surplus—can be repur-
posed for objectives such as targeted advertising, consumption prediction, or even
political influence. Shoshana Zuboff argues that just as surplus labor fueled industrial
capitalism, behavioral surplus drives surveillance capitalism in the digital platform
era. In contrast to industrial capitalism, where human labor created value, Zuboff
suggests that in surveillance capitalism, human behavior becomes the raw material,
captured by digital systems and transformed into valuable data. Machines, which
14 In response to growing concerns over user privacy, Google has announced changes to its data
management approach. Starting in 2024, information related to users’ movements, which was
previously stored on both the user’s device and Google’s servers, will only be retained on the
user’s device.
once served as fixed capital in the industrial age, now act as variable capital, contin-
uously upgrading through machine learning and improving their ability to predict
and influence behavior.15
Surveillance capitalism differs from traditional capitalism in several ways. Capi-
talism has historically been defined by the privatization of production, profit maxi-
mization, and market competition, primarily involving the production and exchange
of goods and services. Surveillance capitalism extends this framework by utilizing
data (behavior) as a key resource for profit generation. Zuboff points out that in
this model, machines have assumed the role of value creation, reducing humans to
sources of behavioral surplus. While applying the term “capitalism” to personal data
collection may be considered controversial, the secretive nature of data collection by
platforms bears resemblance to “surveillance,” and the pursuit of profit from this data
aligns with the principles of capitalism. Hence, the term “surveillance capitalism”
encapsulates these dynamics, raising awareness of the practices within the digital
platform era and encouraging individuals to be mindful of these developments.
The concept of surveillance capitalism presents a crucial opportunity for users
of digital platforms to reflect on how their online behavior, often shared without
much thought, can be used and what consequences it may have. It prompts individ-
uals to consider how to navigate digital spaces responsibly and avoid the potential
negative impacts of exposing too much personal information. Furthermore, it raises
questions about the societal mechanisms needed to prevent harm, ensuring that indi-
viduals retain control over their data. As society moves further into the digital and
AI transformation era, it becomes increasingly important for individuals to reclaim
their right to information protection, demand transparency and accountability from
corporations and governments, and work toward maintaining a more democratic and
ethical society. Addressing large-scale data collection and surveillance requires rein-
forcing privacy rights, improving personal data protections, and strengthening legal
frameworks to secure transparency and consent in data collection.16 For individuals
15 To be more specific, Zuboff’s argument is as follows: Under the regime of surveillance capitalism,
humans are no longer the agents of value realization. Far from being entities that create value through
labor, humans have been relegated to being part of the means of production, or more precisely, raw
material. While industrial capitalism transformed raw materials obtained from nature into products,
surveillance capitalism seeks to utilize human nature. In return, the capacity to create value in
the capitalist production process, which in the era of industrial capitalism was just ‘fixed capital’
represented by machines, has now shifted. During the industrial age, although machines participated
in production, they couldn’t enhance their own value, thus remaining ‘fixed capital.’ However, the
scenario has completely changed with the advent of digital transformation and the development of
machine intelligence. Google’s machine intelligence technologies grow by consuming ‘behavioral
surplus,’ and the more behavioral surplus is fed into it, the more accurate the predictive products
created by machine intelligence become. Through machine learning mechanisms, machines now
upgrade themselves at every moment of operation, transforming into ‘variable capital’.
16 The Digital Services Act (DSA) enacted by the European Union in 2022 includes measures to
limit the collection of consumer information and its use for personalized advertising. It mandates
that consumers have the option to halt personal information collection and deactivate recommenda-
tion algorithms, and prohibits personalized advertising based on religion, race, sexual orientation,
political leanings, etc., especially targeting children and adolescents. In addition, the Artificial
Intelligence Act (AIA), enacted in 2024, prohibits AI systems that unfairly manipulate people’s
behavior or thoughts. Misuse of surplus data for purposes such as targeted advertising, consumer
predictions, or political manipulation could violate this provision. Refer to Chap. 5, Sect. 5.9.5 for
further details.
Living in a hyperconnected society with smartphones, the internet, and various plat-
form services often leads to a continuous flow of information, which can disrupt
one’s ability to focus on tasks. The frequent shifts between communication partners,
topics, and methods force individuals to switch their attention rapidly, making it
difficult to maintain focus on a single task for extended periods.17 Features of social
media, such as “likes,” comments, and shares, are designed to engage users and
can contribute to addictive behaviors, further dispersing focus. While digital tech-
nology enables multitasking, using multiple devices and applications simultaneously
increases cognitive load, reducing the depth of concentration that can be applied to
any single task. Moreover, the overwhelming volume of information and the pres-
sure to stay connected can heighten anxiety and stress, further hindering focus.
Frequent interruptions from emails, messages, and notifications fragment attention,
preventing deep, focused work and leading to more superficial task completion. In
addition, digital technology and social media can alter the brain’s reward system and
expectations, making slow-paced tasks seem less appealing as individuals become
accustomed to fast information and entertainment.
In order to counteract the decline in concentration, a digital detox may be neces-
sary. This involves temporarily reducing or ceasing the use of digital devices like
smartphones, computers, and tablets. The term “digital detox” combines “digital”
with “detoxification,” referring to the process of stepping away from electronic
devices, the internet, and social media to recover from over-reliance on these tech-
nologies. Engaging in a digital detox can help individuals refocus on real-life activ-
ities and social interactions without the constant interference of digital distractions,
ultimately reducing stress and improving concentration.
17 Johann Hari, in his book Stolen Focus, cites technological distractions like social media and
smartphones, information overload, and a multitasking culture as factors that degrade concentration.
generally welcomed due to their convenience and ease of use. However, technolo-
gies introduced in the digital age, such as digital kiosks and online banking services,
can cause stress for many people, especially when their functionality is difficult to
understand. Reducing staff in favor of kiosks or closing bank branches to promote
digital banking can make some people view digital technology as a source of incon-
venience and a threat to their ability to manage everyday tasks. This technostress
is contributing to a harsher societal environment, leading some to feel nostalgic for
past technologies, even forming subculture groups that resist modern innovations.
Technostress caused by digital technology can partly be attributed to the immatu-
rity of current technologies. As society progresses toward a digital/AI-driven future
and technology becomes more sophisticated, some issues related to usability and
adaptation may be alleviated. For instance, AI-powered kiosks could eventually
mimic the natural interactions of a human clerk, restoring some of the ease and
comfort lost in the transition to digital systems. However, the continuous introduc-
tion of new technologies that replace familiar products and services at a rapid pace,
without addressing whether there is a perceived need for these changes, can still lead
to cultural alienation and temporal dissonance. This challenge, stemming from the
speed of change rather than the maturity of the technology itself, may require broader
solutions beyond just technological advancement.
Although slightly different in nature, search engines and social media create filter
bubbles and echo chamber effects, unknowingly restricting or distorting informa-
tion for users, confining their thinking within certain frameworks. Internet search
engine algorithms are designed to provide users with personalized, quick services by
remembering past search histories and presenting search results within similar ranges
when new queries are entered. This results in users being unknowingly confined to
the scope of their past search histories. This filter bubble phenomenon, by limiting
the information accessible through internet searches, narrows users’ perspectives
and distorts their thinking. As this phenomenon recurs, it prevents users from seeing
things from various viewpoints, leading them to prejudices and self-confirmation,
fostering distorted thinking.
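The narrowing effect can be illustrated with a small, hypothetical re-ranking sketch, which is not any search engine's actual algorithm: results whose topics match the user's past searches receive a boost, so repeated use keeps pushing familiar topics to the top regardless of their base relevance. The topics, scores, and boost value are invented.

# Hypothetical sketch of history-based re-ranking that produces a filter bubble:
# results matching past search topics are boosted above unfamiliar ones.

past_search_topics = {"football", "sports cars"}   # invented user history
HISTORY_BOOST = 5.0                                # invented personalization weight

# Candidate results for a new query, each with a base relevance score and a topic
results = [
    {"title": "Championship final recap", "topic": "football", "base": 2.0},
    {"title": "Electric car subsidies explained", "topic": "policy", "base": 2.6},
    {"title": "New sports car model review", "topic": "sports cars", "base": 1.8},
    {"title": "Local election candidate profiles", "topic": "politics", "base": 2.4},
]

def personalized_rank(candidates: list, history: set) -> list:
    """Boost results whose topic appears in the user's search history."""
    def score(r):
        return r["base"] + (HISTORY_BOOST if r["topic"] in history else 0.0)
    return sorted(candidates, key=score, reverse=True)

for r in personalized_rank(results, past_search_topics):
    print(r["title"])
# Familiar topics now occupy the top positions even though their base relevance
# is lower, which is the filter bubble effect described above.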
Social media platforms allow users to create chat rooms for communication,
where the information’s echo effect can homogenize beliefs among participants,
distorting and biasing thought. This echo chamber effect solidifies beliefs even in
those initially without strong convictions, through conversations with like-minded
groups, leading to the formation of factional groups and trapping thought within the
collective mindset. Furthermore, the echo chamber effect can homogenize beliefs
among members of the same chat room, form groups, and encourage group action.
If utilized politically, this can lead to the formation of hostile factions and destruc-
tive actions. While filter bubbles cause bias by restricting collected information,
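The homogenizing tendency can likewise be shown with a deliberately simplified simulation (an illustrative model only, not a description of any specific platform): if every member of a closed group repeatedly adjusts toward the group average, initially diverse opinions collapse into a single shared view.

def echo_chamber_step(opinions, weight=0.5):
    """Move each member's opinion partway toward the group average."""
    avg = sum(opinions) / len(opinions)
    return [o + weight * (avg - o) for o in opinions]

opinions = [-0.2, 0.1, 0.4, 0.6]      # opinions on a -1..+1 scale, mildly diverse
for _ in range(5):                    # five rounds of conversation in the chat room
    opinions = echo_chamber_step(opinions)
print([round(o, 2) for o in opinions])  # values now cluster tightly around the group mean of roughly 0.23

If the starting group is already skewed, the shared view it converges on is skewed as well, which is how like-minded chat rooms can harden into factional groups.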
The transition to AI in society will be a transformative force that shapes the future
in ways both visible and subtle. AI technology is already improving everyday life,
with smartphones now equipped with AI assistants that handle routine tasks, manage
schedules, and even anticipate our needs through predictive algorithms. Smart homes
use AI to control lighting, security, and appliances, creating a seamless, personal-
ized living experience. In the realm of entertainment, AI can curate content tailored
to individual preferences, offering personalized recommendations that reflect user
habits. Similarly, AI-driven healthcare is revolutionizing medical consultations, with
remote systems providing precise diagnostics and customized treatments, ensuring
that medical care is more efficient and accessible than ever before.
Beyond these personal benefits, AI is having a profound impact on workplace envi-
ronments and social activities. AI-enabled tools are enhancing the speed and precision
of tasks in areas such as finance, marketing, and manufacturing, reducing the need
for human intervention. In fact, many have already experienced the convenience of AI
in day-to-day interactions with chatbots, such as ChatGPT, which provide on-demand
information, customer service, and even emotional support. This shift suggests that
AI transformation will redefine not just the efficiency of society but also how people
interact, communicate, and build relationships.
The AI transformation will have a more significant and direct influence on society
than the digital transformation did. While digital transformation centered on smart-
phones that gave users control over communication and services, AI could change
the very nature of interaction, placing AI in a more authoritative role. People will
rely on AI-powered smartphones, which will take on many decision-making tasks
autonomously.18 For instance, a smartphone may no longer just be a tool that responds
to commands; it may anticipate needs and take independent actions. This raises ques-
tions about over-reliance on AI and the gradual shift of decision-making authority
from humans to machines.
Platforms that were central to the digital transformation, such as search engines,
social media, and e-commerce, will need to evolve to survive in this AI-driven land-
scape. Initially, these platforms may face challenges, as AI offers new ways to interact
and consume content. However, just as digital platforms adapted to mobile operating
systems, AI platforms will emerge, hosting various applications tailored to specific AI
18 Samsung released the Galaxy S24 series on January 31, 2024, as its first AI-powered smartphone,
featuring innovations like Live Translate and Chat Assist. In September 2024, Apple launched the
iPhone 16 with AI features, introducing Apple Intelligence, a suite of AI tools for tasks like image
analysis and text rewriting.
services, reshaping the marketplace in the process.19 Users will begin to interact with
AI systems for an increasing number of activities, including shopping, socializing,
and even creative tasks like content generation.
One key consequence of this AI revolution is that real-world interactions may
increasingly be AI-mediated. Human-to-human interactions could be replaced by
human-to-AI or even AI-to-AI transactions. For example, in online shopping, a
user might only need to mention a desired product to their AI assistant, which will
autonomously handle the entire purchase, communicating with the AI systems of e-
commerce platforms. This level of automation presents unprecedented convenience
but also shifts the role of humans from active participants to passive overseers, with
AI taking over tasks that were once human-dominated.
With the AI transformation, the job landscape will change dramatically. While
AI will lead to unprecedented efficiency, especially in administrative roles, it will
also result in significant job displacement. Traditional jobs, even in professional
sectors such as medicine, law, and management, could be substantially replaced
by AI, as algorithms become more capable of handling tasks that require complex
decision-making. However, this shift will create new opportunities. Jobs such as AI
specialists, AI trainers, and AI integrators will rise in demand, requiring advanced
skills to manage and optimize AI systems across industries. As a consequence, it
becomes crucial to evolve vocational training to equip workers with the necessary
AI-related skills, and AI itself can be a powerful tool for such education. For example,
AI-driven simulators can offer hands-on training in fields as varied as aviation and
medicine, providing workers with immersive, interactive learning environments.
As AI becomes more pervasive, concerns over the digital divide will evolve into
worries about the AI divide. Individuals who lack proficiency in AI technologies
may be left behind, exacerbating existing inequalities. Ensuring AI literacy for all
members of society is a critical task. While AI has the potential to create interactive
and personalized learning environments that can bridge these divides, it also poses
the risk of leaving those without access or understanding further marginalized. In
addressing this, AI itself may offer solutions, such as voice-activated AI systems
with conversational capability at kiosks or in public spaces that allow users to access
services as if they were engaging with a human. Such systems can provide inclusive
solutions for the pre-digital generation and those with limited technical skills.
However, the AI transformation is not without its risks. The spread of misinforma-
tion, already a significant issue in the digital age, could be exacerbated by AI, partic-
ularly through the use of deepfake technology. Deepfakes use AI to create manip-
ulated audio and video content, often indistinguishable from reality. These tools,
while technologically impressive, pose threats to political stability, personal repu-
tations, and social trust. Moreover, the ability to distinguish authentic content from
fabricated media becomes increasingly difficult, presenting a fundamental challenge
in maintaining truth in an AI-driven society.
19 For example, OpenAI launched the “GPT Store” in January 2024, allowing users to create and
use customized GPT applications.
As AI operates on vast amounts of data, concerns will grow around privacy, data
security, and surveillance. AI systems rely on extensive datasets to make informed
decisions, leading to fears about privacy violations and the misuse of personal
data. However, AI also offers the potential to strengthen security by creating more
advanced protective measures. The key challenge lies in balancing the benefits of
AI-driven security with the need to safeguard individual rights and prevent unwar-
ranted surveillance. In addition, ethical concerns surrounding intellectual property
and bias in decision-making must be addressed, as AI continues to assume roles in
creative fields and judgment-based industries.
In summary, the AI transformation offers immense promise but also significant
challenges. Society must work to ensure that the benefits of AI, such as increased
efficiency, better healthcare, and improved education, are maximized while mini-
mizing risks related to privacy, bias, ethical dilemmas, and job displacement. The
future of AI is one of potential, but also one that demands thoughtful management
to navigate the complex social, legal, and philosophical issues that will arise.
Chapter 8
Challenges of Digital and AI Transformation
We regard the Digital Revolution as the starting point that transformed industrial
society into a digital society. Just as the Industrial Revolution transitioned agrarian
society into an industrial one, it is reasonable to assume a corresponding Digital
Revolution marked the transition into a digital society. While the Industrial Revolu-
tion is symbolically dated to James Watt’s invention of the steam engine in the late eighteenth century, this
There are three main types of digital platforms. These include the communication
platform represented by the internet, which was established through the integration
of communications and computing; the OS-centered content platforms like iOS and
Android, established through the system-level integration of communications and
computing; and the various application platforms created by applications built on
top of these content platforms. Collectively, these are referred to as digital platforms.
Various types of content and services are provided to users through these three plat-
forms. Specifically, web browsing, file transfer, remote computer access, and email
are directly offered on the internet communication platform; various applications
are provided on the OS content platforms; and other services like social media,
e-commerce, cloud computing, and content sharing services are offered through
their respective specialized application platforms. Leading platform companies like
Google, Amazon, and Meta (Facebook) operate various kinds of application plat-
forms, and Apple and Google operate the OS-centered content platforms, the App
Store and the Play Store, respectively.
These digital platforms are the symbols of the digital age and the pioneers leading
it. They are at the heart of digital age civilization, providing various forms of commu-
nication and connection, and have accelerated the world toward “hyperconnectivity.”
Digital platforms represent a new digital industry that did not exist in the industrial
age, emerging alongside the Digital Revolution. Founders of application platform
companies quickly recognized the signs of digital change, secured their territories,
and grew their businesses. The launch of Steve Jobs’ Apple iPhone-iOS-App Store
is a prime example. Platform companies have provided mankind with various types
of services, receiving active interest and love from users, and generating immense
wealth. They pioneered and utilized digital technologies ahead of others, growing
their businesses and rapidly expanding into natural monopolies through network
externality effects, thereby establishing a firm position in the global top 10 by market
capitalization.
A prime example of a platform that hosts application platforms is Apple’s content
platform centered on iOS. With the release of the iPhone equipped with iOS
and the simultaneous launch of the App Store operating on iOS, Apple opened the
door for numerous applications to be hosted and various application platforms to be
launched for the first time. This catalyzed the ICT Big Bang and paved the way for
the Digital Revolution. Did Steve Jobs predict such explosive changes and release the
iPhone-iOS-App Store combination? Exploring the background provides an affirma-
tive answer to this question. Firstly, Jobs recognized that the user (i.e., the customer)
is the ultimate point of business outcomes. He understood early on that the essence
of all business lies in satisfying the needs of users, which enabled the iPhone to
feature exceptional UX/UI. Secondly, he understood market dynamics well, recog-
nizing the need for an open market where service providers and users could transact
directly. Thirdly, his experience with iTunes taught him the importance of content,
and he foresaw that future communications networks would serve as distribution
channels for such content. Fourthly, his experience developing the Macintosh
PC acquainted him with the importance of the OS, leading him to insist from the start
that the team developing the iPhone use iOS. Fifthly, he understood the importance
of allies in business, knowing that forming a mutually beneficial eco-cluster would
be essential, and he strived to create a business model that shared benefits. Jobs
developed ambitious products and services based on his sharp understanding of the
situation, boldly challenging the information and communication market. This led to
the ICT Big Bang, overwhelming traditional communications operators entrenched
in business-centered thinking.
The four major platform companies known as ‘Big Tech’—Apple, Google,
Amazon, and Meta—were all founded in the USA. What is the implication of this?
It may be attributed to several factors characteristic of the US environment. The
USA has a culture of challenge, adventure, and pioneering spirit, consistent with its
origins as a nation of immigrants. It guarantees human rights, freedom of thought,
and economic activity, fostering an environment conducive to innovation, technolog-
ical development, and entrepreneurship. The USA also has well-established infras-
tructure for venture investment, technological guidance, and talent education that supports the founding and growth of innovative enterprises.1
8.3 Education, Digital Literacy
As discussed earlier, education is one of the areas facing significant challenges due
to digital transformation. The introduction of digital educational tools necessitates
changing teaching methods to enhance the effectiveness of education through new
forms of learning such as interactive learning, personalized learning, and remote
learning. In support of this, students must be guaranteed equal access to digital
devices and the internet. Therefore, a basic condition for education in the
digital age is to equip all schools with digital educational devices, high-speed internet
connections, and the necessary educational content and tools. Teachers also need to
be familiar with digital technologies and consider developing educational materials
and methods using these technologies as a basic part of their duties.
1 This situation differs from contexts where state control imposes significant limitations on creative
and corporate activities. For example, while countries like China have made considerable invest-
ments in scientific research and technological development, with the goal of becoming a global
leader, the system’s centralized direction and government oversight shape the landscape of research
funding and corporate growth. Despite these advancements, the historical and cultural context can
sometimes limit the potential for innovative and disruptive entrepreneurship. This also contrasts
with regions where legacy regulations or political factors present obstacles to the development of
new enterprises.
A critical element of education in the digital age is digital literacy. Digital transfor-
mation brings about digital divide issues related to socioeconomic status, geographic
location, educational level, and age, and education can provide a starting point to
address these issues. Proper learning with digital educational tools from a young age
can equip students with digital literacy, freeing them from the digital divide as they
enter society. Thus, educating students to achieve digital literacy should be consid-
ered a fundamental element of education in the digital age. Further, if schools can
provide digital education programs to community members, teaching them how to
use digital tools and develop digital literacy, it would contribute to closing the social
digital divide.
In addition to digital literacy, another important quality to prepare from a young
age involves understanding and exercising restraint in the use of social media. Exces-
sive use and addiction to social media can lead to various adverse effects, such as
wasted time, social isolation, anxiety, depression, stress, loss of self-esteem due to
comparison with others, information overload, lack of sleep, and even potential harm
to physical health. Moreover, excessively revealing personal information can lead to
privacy violations. Addiction to excessive use of social media can cause various prob-
lems and adversely affect social activities after graduation. Therefore, it is essential
to teach students to use social media discerningly to prevent mistakes during their
growth from becoming lifelong regrets.
The fundamental challenge digital transformation poses to education is how to
adapt its content in preparation for a future coexisting with digital technology. In
the era of digital and AI technology, ignoring digital capabilities is not an option,
and tasks that can be performed by digital means need not be duplicated by
humans. However, reliance on digital technology for everything can lead to human
incapacity, with the risk of living under the dominion of technology rather than
utilizing it. For instance, while using ChatGPT for writing when necessary can be
beneficial, if students neglect learning how to write themselves, they miss developing
sophisticated expression skills, critical thinking, and creativity. They may also lose
the ability to judge the adequacy of ChatGPT’s writing. This scenario could become
more acute as digital technologies become increasingly intelligent. Therefore, it is
crucial to research what it means to be human and what basic abilities humans should
possess to coexist with future AI and robots. The findings should then be reflected
in education to nurture the fundamental capabilities necessary for living in an era of
coexistence.
In the digital age, computational thinking and coding are recognized as vital
skills,2 and many countries have included them in their educational programs.3
2 Computational thinking refers to a mindset related to defining problems in a way that a computer
can understand and solve. Computational thinking can be divided into abstraction and automation.
Abstraction is the process of structuring and breaking down complex problems into a simplified
state, while automation is the process of translating the abstracted problem into the language of
computers. Coding refers to the process of inputting instructions to a computer in programming
languages that the computer can understand, such as C, Java, and Python.
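As a concrete, hypothetical illustration of the abstraction/automation pair described above: the everyday question “what does this text talk about most?” can be abstracted into a word-frequency count and then automated in a few lines of Python.

# Abstraction: restate the question as a structured problem --
# count how often each word occurs and take the maximum.
text = "digital society digital technology ai society digital"

# Automation: translate the abstracted problem into instructions
# the computer can execute.
counts = {}
for word in text.split():
    counts[word] = counts.get(word, 0) + 1

most_common = max(counts, key=counts.get)
print(most_common, counts[most_common])   # prints: digital 3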
3 Many countries around the world are incorporating coding and computational thinking into their
elementary school curricula to equip students with basic skills essential in the digital age, such as
problem-solving abilities, logic, analytical skills, and creativity. The UK was the first to include
coding in its elementary education in 2014, and it has since been included in Australia, Finland,
Estonia, Singapore, France, Canada (in some regions), and China (in some regions). Korea plans to
make it a mandatory part of the curriculum in elementary and middle schools starting from 2025.
opinions from many others. Similarly, when faced with varying reports from many
different media outlets, people will choose to trust the one they have always believed
in. The treasure of the digital age is not information but trust, making the media that
provides reliable, well-considered journalism even more valuable. Just as a treasure
shines even when buried in the earth, true journalism shines all the more amid rampant
false information.
Public media in the digital age must equip themselves with filters to block the
increasing and cunning false information and fake news. With the surge of new
information and news, the media is in a race to report the truth swiftly, requiring
media outlets to develop their own methods for quick verification of truth.
Although challenging, AI technology could offer solutions, such as developing real-
time truth-verification software utilizing AI filtering. Until such solutions are found,
public media should resist the temptation for sensational or interest-driven reporting,
preferring to delay publication until the truth can be verified. Future legal and regu-
latory measures by governments to block false information and fake news could
reduce the amount of information needing verification. Similarly, if countries enact
laws similar to the EU’s Digital Services Act (DSA), platform operators will lead in
blocking false information, significantly reducing its distribution. By seeking its own
solutions and maintaining journalistic integrity until then, public media will earn
societal trust that reinforces its stature and further solidifies its position.
The rapid proliferation of AI technology following the release of ChatGPT poses
another dimension of challenge to society at large, including the media. The opening
of AI source codes will break down the barriers of high investment and long-term
research and development, enabling anyone to develop AI. In addition, the activation
of the GPT Store will likely lead to a wide spread of various customized AI applica-
tions. As a result, false information and fake news could become heavily armed with
such new technologies. AI is automating the creation of fake news, dramatically
increasing web content that mimics realistic articles, spreading false information
about elections, wars, and natural disasters.
However, as seen in the case of deepfakes, the false information generated by AI
is so sophisticated and cunning that it is extremely difficult to discern its falsehood.
If used for manipulating public opinion, inciting the public, or election strategies,
society could plunge into severe chaos. In this situation, the role of the media in
reporting the truth and maintaining journalistic integrity becomes even more critical.
Yet, even the most advanced media equipped with the latest AI technology may
struggle to handle this challenge alone. In order to tackle this situation, strong legal
and regulatory support is necessary to punish the manipulation of false information,
in line with the executive order issued by the US government to prevent misuse of AI
technology and the code of conduct for AI companies established by the G7 nations.
8.5 Personal Information, Personal Competence
The era of digital transformation brings various new technological benefits along
with issues of information protection and management, especially personal data
protection. It is akin to paying a fee to enjoy the numerous benefits brought by
digital technology. Since everything generated and processed by digital technology
is data, and information is derived from processing this data, the use of information
invariably entails information management issues.4 When using search platforms or
online marketplaces, problems of personal data leakage arise, and the same occurs
with social media platforms. This personal data becomes valuable business assets
for platform operators and can be used for targeted advertising and other purposes.
If a smartphone is lost and falls into the hands of someone with malicious intent,
not only the personal information of the smartphone owner but also the contents of
messages, chats, and emails can be exposed, causing harm not only to the owner but
also to their friends who have communicated with them. When digital technology is
used in education and learning, educational content platforms collect students’ infor-
mation, and if this information is leaked, it can cause psychological stress to growing
students. If politicians or political groups with bad intentions use facial recognition
technology and digital currency to monitor citizens and collect information on their
movements and financial transactions, it can result in severe oppression and stress
for individuals.
The leakage of personal information can lead to not only mental stress but also
physical and financial harm. Therefore, information management and personal data
protection become critical social issues during the digital transition period. At the
individual level, it is essential to handle platform services connected through the
internet with caution and restraint, taking into account the potential for information
leakage in advance. This careful and moderate approach is a fundamental attitude
needed in the digital transition era.
In the digital age, digital literacy is a fundamental personal competency. It involves
understanding digital technologies and the ability to find, utilize, and communicate
information using digital tools. To develop and maintain digital literacy, it is advisable
to make a habit of learning the purpose and operation of new digital technologies
and tools as they emerge. In addition, an essential skill is the ability to discern
information. This includes the capability to select the information you need from the
vast amount of data circulated through various digital media, including social media,
and to distinguish between accurate and inaccurate information. However, this may
require a high level of expertise and extensive research. Therefore, it is important
4 Data is unprocessed facts or numbers, while information is content that has been processed,
organized, and structured to give it meaning. For example, temperatures represented by numbers
such as 28°, 26°, 30°, 27°, 29° are temperature data. By processing and assigning meaning to this
data, information such as ‘the average temperature is 28°, which is suitable for outdoor activities’
can be extracted.
to learn in advance how to obtain the necessary information and choose the correct
information.5
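The distinction drawn in footnote 4 between data and information can be restated in code. The sketch below is an illustrative example only; the suitability range is an assumed value. It processes raw temperature data into a meaningful statement.

temperatures = [28, 26, 30, 27, 29]              # data: unprocessed numbers
average = sum(temperatures) / len(temperatures)  # processing step
# Information: the data given context and meaning.
verdict = "suitable" if 20 <= average <= 30 else "not suitable"   # assumed comfort range
print(f"The average temperature is {average:.0f}°, which is {verdict} for outdoor activities.")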
In the digital age, with many socioeconomic activities being conducted via the
internet, it is necessary to understand and use various digital platform services such
as social media and search engines discerningly. It is important to understand the
potential social impact of the posts one makes on social media, how platform companies
collect and use the information contained in those posts, and how information one
considers confidential can be leaked and create difficult situations if it circulates
on the internet. Moreover, when using search engines, it is necessary to be
aware that the content one searches for can be limited by the filter bubble phenomenon.
Similarly, when using social networks, it is important to understand that the echo
chamber effect can lead to confirmation bias. By understanding these facts, one can
act discerningly when posting messages on social media and develop the habit of
critically accepting the messages received from search engines.
Especially when using digital devices connected to the internet or other networks,
it is necessary to make careful behavior a habit, with the term “behavior” here referring to
the data sent out through connected digital devices. Platforms can collect all such
data to create profiles and analyze them, thereby learning all information about
one’s behaviors, such as interests, personal networks, consumption habits, movement
patterns, search patterns, political leanings, and religious preferences. When using
social media, conducting online transactions, or searching for information, it is neces-
sary to be mindful and cautious of the fact that the data one inputs can be collected and
analyzed by the platform, resulting in targeted advertising or unforeseen outcomes.
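The kind of profiling described here can be pictured as a simple aggregation over behavioral events. The toy Python sketch below uses made-up event categories and weights; it is not any platform's actual pipeline, only an illustration of how scattered actions add up to a targeted-advertising profile.

from collections import Counter

events = [                                   # toy behavioral log
    {"type": "search",   "topic": "running shoes"},
    {"type": "purchase", "topic": "running shoes"},
    {"type": "like",     "topic": "marathon training"},
    {"type": "search",   "topic": "travel"},
]
weights = {"purchase": 3, "like": 2, "search": 1}   # assumed weighting of signals

profile = Counter()
for event in events:
    profile[event["topic"]] += weights[event["type"]]

# The resulting interest profile is what drives targeted advertising.
print(profile.most_common())
# [('running shoes', 4), ('marathon training', 2), ('travel', 1)]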
Moreover, in the digital space, it is important to be aware that all data one inputs could
be stored somewhere indefinitely and never be completely deleted. This includes
youthful posts and mistakes which, if found and spread by someone, could lead to
embarrassing situations and sometimes cause very serious problems. Vigilance and
cautious online engagement are indispensable in the digital transformation era.
8.6 Role of Government
In the era of digital transformation, the role of the government is substantial and
critical: Although businesses and various sectors of society will make efforts toward
digital transformation on their own, legal and institutional support for various aspects
is needed, along with financial and systemic backing. At the national level, the
success of digital transformation is determined by how faithfully the government
plays its role. Therefore, it is a prerequisite for government officials to understand
5 Heather Kelly introduced eight precautions in her Washington Post column “How to avoid falling
for misinformation, AI images on social media” on October 9, 2023: (1) Know why something
might be misinformation. (2) Slow down while reading and watching. (3) Check the source; don’t
always trust “verified” accounts. (4) Make a collection of trusted sources. (5) Seek out additional
context about news events. (6) Use these tricks to spot AI images. (7) Vet videos and real images,
too. (8) Use fact-checking sites and tools.
digital transformation ahead of others and to be knowledgeable about what roles the
government should play for a successful digital transformation.
First of all, the sector demanding attention in digital transformation is education,
which requires significant budgets to build digital infrastructure such as digital
educational tools and high-speed internet, and to purchase and update various appli-
cations and services. In addition, budgets are required for the development of various
educational programs to offer new forms of learning, such as interactive learning,
personalized learning, and remote learning. The government needs to support schools
to secure the necessary resources for digital transformation. If the digital infrastructure
built for students’ education could also be extended to the digital education
of community residents, it would contribute to reducing the social digital divide. In
addition to such financial support, improving educational programs for the digital age
is just as important as building digital education infrastructure. The content of
education needs to be changed in preparation for a future where humans coexist
with digital technology. It would be beneficial if the government invested in in-depth
research to support this, appointing expert groups and helping schools reflect the
results in their curricula.
In the digital age, understanding digital technology and using digital devices is
fundamental, and being excluded or left behind leads to a digital divide, which in turn
becomes a constraint on socioeconomic activities. In particular, pre-digital generations
struggle to use the growing number of digital tools in banks, public institutions,
ticket offices, restaurants, and elsewhere. Without reducing the
digital divide between generations, social equality cannot be achieved in the digital
age. The government must actively work to eliminate the digital divide. Initially,
institutional backing is necessary to ensure that all public institutions and stores
provide at least one counter for face-to-face services, so that the rapidly increasing
elderly population can be freed from the stress of digital devices.6 Furthermore, high-
speed internet infrastructure must be built in all residential areas, including rural and
mountainous regions. Subsidies are needed so that low-income groups can access the
internet affordably, and further support is needed so that individuals without digital
devices can purchase or use them at an affordable price.
Installing public digital centers where internet and digital devices can be used for free
could be a viable option. Moreover, digital education programs should be developed
to improve the digital literacy of the entire population.
As we enter the era of digital transformation, the ability to handle digital tech-
nology has become essential in various professions, and with the evolution of digital
technology and changes in the job environment, there has arisen a need to enhance
individual digital skills. Furthermore, to transition to newly emerging digital jobs, one
must possess advanced digital skills. Therefore, re-education and lifelong learning are
essential to maintain competitiveness in jobs of the digital age. In addition, as digital
technologies such as factory automation and AI can replace human jobs, education
for digital job transition has become necessary to address job loss caused by these
technologies. In response to these trends, the government may take on a role, which
may differ from country to country, in exploring various measures to provide
re-education, lifelong learning, and job transition programs for those wishing to
enhance or change their careers.
The government also needs to actively address the issue of misinformation and
fake news, which have emerged as problems in the digital age. While ensuring
freedom of expression on social media, which is often the source of the problem, there
is a need to establish a culture where individuals are responsible for the consequences
of their expressions. In particular, personal internet broadcasting often spreads infor-
mation indiscriminately without adhering to the basic ethics followed by all public
broadcasting, causing social controversy and potentially influencing elections. There-
fore, personal internet broadcasts should also be held accountable for errors and false-
hoods to the same extent as public broadcasting.7 However, since personal internet
broadcasts that do not require a permit cannot be sanctioned through permit revoca-
tion, separate laws and systems need to be established to deal with misinformation
and fake news.
Another concern in the digital age is the protection of personal information. The
spread of digital technology has increased concerns about the collection, storage,
and use of personal information. Digital platforms, while providing various services,
collect user information, which can lead to significant harm if the collection is exces-
sive or the information is leaked or misused. The leakage of sensitive personal infor-
mation can lead to privacy invasion, financial loss, and unauthorized use of credit
cards. Furthermore, it could be used for fraudulent transactions, legal issues arising
from identity theft, and damage to an individual’s credit and reputation. In addition,
as the digital society is interconnected through networks, the risk of cyber-attacks
and data breaches has increased, leading to potential damage to critical infrastruc-
ture, financial loss, and exposure of sensitive information. Therefore, it is necessary
for the government and legislature to legislate for the protection of personal information,
data security, and consumer rights, and to establish or strengthen penalty regulations.
Similar to the EU’s Digital Services Act (DSA), it may also be worth considering
the legalization of restrictions on the collection, use, and management of personal
information by platform operators.
A particular area of concern during the digital transformation era is the small and
medium-sized enterprises (SMEs). Digital transformation requires heavy investment
for installing digital technology and hiring digital experts, which may be burdensome
for SMEs in traditional industries as they lack the financial resources to proceed. Since
digital transformation ultimately relates to the sustainability of businesses, it is advis-
able for the government to find some ways to help SMEs manage digital transfor-
mation successfully, thereby maintaining the jobs they offer. In general, government
7 In line with this, considering the role opinion polls can play in distorting public opinion and critically
affecting elections, unregistered polling organizations should be banned from operating or subject
to regulations comparable to those for registered polling organizations.
intervention in businesses is not desirable, but SMEs during the digital transforma-
tion era are an exception, as the challenges they face are due to changes in the era,
not due to their own inefficiencies or incompetence.
8.7 Era of AI, Age of AI Robots
The AI era is widely regarded as the natural successor to the digital age. However,
the shift from a digital society to an AI society differs fundamentally from the earlier
transition from industrial to digital societies. While the digital age marked a paradigm
shift, the AI era represents an evolution—a continuation and maturation of digital
technologies. AI, initially one among many digital tools, is now poised to become the
central axis of digital transformation. The launch of ChatGPT-3.5 in 2022 marked
the beginning of this shift, signaling the onset of the AI era as a defining force of the
digital-AI age.
The emergence of ChatGPT elevated AI to new levels of prominence. AI, having
developed through incremental advancements, reached a watershed moment with
ChatGPT’s ability to engage in natural language conversations with humans. The
initial input method, text, is rapidly expanding to voice and video inputs, making AI
interaction more intuitive and seamless. OpenAI, the company behind ChatGPT, has
been hailed as a “game-changer”, with its valuation skyrocketing. The emergence of
ChatGPT spurred AI research and development across industries, forcing platform
operators to accelerate their AI initiatives.
As AI progresses, concerns about AI accountability have become central to
public discourse. Initiatives like the Montreal Declaration for Responsible AI (2018)
emphasized the importance of ethical, transparent, and accountable AI develop-
ment. Core principles include safety, fairness, transparency, and privacy protection,
essential to ensuring trust in AI systems as they become integral to production and
services. Following ChatGPT’s launch, these concerns intensified. Governments,
industry leaders, and academics have expressed apprehensions about AI’s impact on
employment, education, and national security.
In response, legislative bodies have taken action. The US government held public
hearings to explore measures for responsible AI development, while OpenAI’s CEO,
Sam Altman, advocated for regulation at US Senate hearings. In 2023, President
Biden signed an executive order aimed at mitigating AI-related risks. Internationally,
the G7 established an AI code of conduct, promoting responsible AI use, personal
data protection, and the labeling of AI-generated content. The EU’s AI Act, enacted in
2024, represents the first comprehensive legislative effort to regulate AI development
and ensure its responsible use.
Meanwhile, OpenAI continued its ambitious developments. After the release of
ChatGPT-3.5 in November 2022, the company launched GPT-4 in March 2023,
followed by the unveiling of the GPT Store in January 2024 and GPT-4o in May
2024. Further, it released the o1 model in late 2024, a completely new model with
enhanced reasoning capabilities. The introduction of the GPT Store parallels the
impact of Apple’s iPhone and App Store, signaling the dawn of a new AI platform
economy. Just as the App Store revolutionized the digital platform era, the GPT Store
may usher in the AI era, offering a diverse range of customized GPT applications.
What might be the third revolution following the Industrial Revolution and the
Digital Revolution? If the Industrial Revolution ushered in the “First Machine Age”
of industrial society, and the Digital Revolution opened the “Second Machine Age”
of digital society, what could the “Third Machine Age” introduced by the third
revolution be?8 It could well be conceptualized not merely as the “AI Age” but more
specifically as the “Age of AI Robots.” As discussed earlier, while AI technology is
part of the digital technology suite and the AI era is an extension of the digital age,
the Age of AI Robots represents a shift to a different dimension. In the future, as
AI develops and surpasses the singularity point, it will evolve into a super-AI that
exceeds human intelligence, and humanoid robots will develop correspondingly to
surpass human physical capabilities. When these two merge into one entity as ‘AI
robots,’ it will signify the birth of a superhuman, and the newly opened ‘Age of AI
Robots’ will truly be the ‘Third Machine Age.’ This will bring about a paradigm
shift that surpasses the transition from industrial society to digital society.
The “Age of AI Robots” symbolizes the advent of superhuman entities and signi-
fies the opening of a truly “Third Machine Age.” The transition from industrial to
digital societies will be overshadowed by a paradigm shift that promises to fundamen-
tally transform human life. This transformation is not a distant future but could occur
within the next two or three decades. It is humanity itself driving this change, and with
accelerating AI research and development competition, this timeline may be further
shortened. The post-ChatGPT era has seen platform companies openly competing by
releasing AI models (like Meta with Llama 2, Google with Gemini, and others), and
the 2024 CES offering testimony to AI’s central role across numerous innovative
products and services. The open-source movement accelerates AI’s advancement,
lowering barriers so that generative AI development is not confined to entities with
substantial investment capabilities.9 However, this accessibility also harbors the risk
of misuse, such as the production of deepfakes and the spread of misinformation by
criminal organizations. The actions taken by AI experts in declaring principles for
AI research and development, proposing regulations for AI technology, the admin-
istrative orders and codes of conduct implemented by the USA and G7, and the AI
Act enacted by the EU underscore the critical need for oversight and responsible
development in such fast-evolving AI landscape.
The year 2022 may be remembered as a significant turning point in the history of
digital development. In July 2022, the EU enacted the Digital Markets Act (DMA)
and the Digital Services Act (DSA). In November 2022, ChatGPT-3.5 was released.
The enactment of the DMA and DSA signals that the digital age has peaked, while
8 The terms “First Machine Age” and “Second Machine Age” are used to refer to the Industrial Age
and the Digital Age, respectively, by Erik Brynjolfsson and Andrew McAfee. See their book “The
Second Machine Age”.
9 One year after OpenAI launched ChatGPT, Meta, together with IBM, formed the ‘AI Alliance,’
bringing together over 50 AI-related companies and institutions. This is an open-source alliance for
sharing AI technology for free.
the release of ChatGPT-3.5 signals the beginning of the AI era. Consequently, digital
platform companies began to shift their focus to AI research and development. There-
fore, 2022 will be recorded as the year when the digital age began to pass the baton
to the AI era. However, the advent of the AI era does not signify the end of the digital
age but rather its maturation, marking the progress into the combined digital-AI era.