STJ Post
Christopher Ayling
August 2018
1 Topic Outline
As seen in the 2016 White Paper released by the Japan Ministry of Education, Culture, Sports, Science and Technology (MEXT), artificial intelligence plays a central role in the functioning of the future Japanese society (dubbed a "Super Smart Society"). [4] describes a Super Smart Society as
2 Introduction
This report presents three articles of research related to computer science and AI technologies relevant to the advent and functioning of Japan's Super Smart Society. For each article, the context, technical details, results, relevance and role are summarized and discussed.
The first article presents a novel approach to detecting information about a room using a mobile device's microphone [3]. The second proposes a novel reward function for use in reinforcement learning [1], while the third introduces an interpretability-enhancing variation of the modern generative adversarial network (GAN) architecture [9].
3 Research Articles
3.1 Inferring Room Semantics Using Acoustic Monitoring [3]
3.1.1 Context
In 2017, the IEEE International Workshop on Machine Learning for Signal Processing was held in Roppongi, Tokyo. At the event, recent advances in machine learning for signal processing were presented in talks and tutorials. These advances included using convolutional neural networks (CNNs) for interpretable EEG analysis, sketch-to-photo inversion using GANs, and techniques for inferring room semantics from
audio. This section focuses on the paper about room semantic inference and
provides a summary of the paper’s technical details and results along with a
discussion on the technology’s impact on and relevance to the Super Smart
Society.
Figure 1: Test set confusion matrix from [3]
3.1.3 Results
Results can be seen in Figure 1. The support vector machine (SVM) using ambient sounds performed better than the Gaussian mixture model (GMM) using room impulse responses (RIR). It was noted that because of distinctive structural features, RIR is effective in rooms such as bathrooms and lecture theatres.
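As an illustration of the ambient-sound approach, the sketch below classifies rooms from audio clips using MFCC features and an SVM. The feature choice and interfaces are assumptions made for illustration; the actual pipeline and features in [3] differ.

```python
# Minimal sketch: classify room type from ambient audio with MFCC features + SVM.
# Assumes a list of (wav_path, room_label) pairs; features are illustrative only.
import librosa
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def mfcc_features(wav_path):
    # Load audio and summarize each MFCC coefficient over time (mean + std).
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_room_classifier(clips):
    # clips: list of (wav_path, room_label), e.g. ("office_01.wav", "office").
    X = np.stack([mfcc_features(path) for path, _ in clips])
    y = np.array([label for _, label in clips])
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
    clf = SVC(kernel="rbf").fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))
    return clf
```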
3.2 Curiosity-driven reinforcement learning with homeostatic
regulation [1]
3.2.1 Context
Araya is a Japanese research laboratory based in Tokyo. Araya's mission is "to transform information into value for society" and as such they are constantly exploring new technologies. The skills and fields represented among the members of Araya include data science, neuroscience, physics, maths and psychiatry. Research produced by Araya has been published in journals such as Nature and PNAS and presented at conferences such as NIPS and ICIIBMS. This section will focus on the paper presented at the International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS) held in Okinawa, Japan in late 2017. The paper's title is "Curiosity-driven reinforcement learning with homeostatic regulation".
Reinforcement learning (RL) is a discipline of machine learning in which problems are formulated as an agent acting in an environment with the aim of maximizing a reward function. An industrial robot in a warehouse attempting to correctly store items, or a humanoid robot learning to dance for the entertainment of humans, are examples of problems suited to RL.
In RL, the reward function captures the goal of the agent. The agent then learns which actions to take in each possible state to maximize the reward. Rewards can be either extrinsic or intrinsic. Goals such as the item-storing task mentioned earlier are extrinsic, while an aim to learn something new is intrinsically motivated. Intrinsic rewards are important because they favour the development of broad competencies over the honing of narrowly applicable skills [7].
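To make the formulation concrete, the sketch below shows a generic agent-environment loop in which the reward the agent learns from is the sum of an extrinsic task reward and a weighted intrinsic bonus. The `env` and `agent` interfaces are hypothetical placeholders, not taken from [1].

```python
# Minimal sketch of an RL interaction loop combining extrinsic and intrinsic reward.
# `env` and `agent` are hypothetical objects following a Gym-style interface.

def run_episode(env, agent, intrinsic_bonus, beta=0.1):
    state = env.reset()
    done = False
    total = 0.0
    while not done:
        action = agent.act(state)
        next_state, extrinsic_reward, done = env.step(action)
        # Total reward = task (extrinsic) reward + weighted intrinsic bonus,
        # e.g. a curiosity term computed from the agent's prediction error.
        reward = extrinsic_reward + beta * intrinsic_bonus(state, action, next_state)
        agent.update(state, action, reward, next_state)
        total += reward
        state = next_state
    return total
```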
3.2.2 Summary
The paper "Curiosity-driven reinforcement learning with homeostatic regulation" presents a novel reward function. The proposed reward function encourages both actions that lead to new situations and actions that lead to situations from which the next action will yield more information about the future state.
Encouraging actions which lead to new situations and knowledge is known as curiosity. The amount of new information learned is quantified by calculating the difference between the agent's prediction and the observed state [6]. This difference is known as the forward model error.
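Concretely, the forward model error can be computed as the squared distance between the state predicted by a learned forward model and the state actually observed. The numpy sketch below assumes states are real-valued vectors and that `forward_model` is any learned predictor; it is illustrative, not the implementation from [1] or [6].

```python
import numpy as np

def forward_model_error(forward_model, state, action, observed_next_state):
    # The forward model predicts the next state from (state, action);
    # the curiosity signal is the squared error of that prediction.
    predicted_next_state = forward_model(state, action)
    return np.sum((predicted_next_state - observed_next_state) ** 2)
```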
3.2.3 Technical Description
The proposed reward function is an extension of the curiosity reward function proposed by [6].
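One plausible form of the two rewards, reconstructed from the description that follows (the exact expressions and scaling in [1] and [6] may differ), is:

$$ r_t^{i} = \frac{\eta}{2} \left\lVert f(s_t, a_t) - s_{t+1} \right\rVert_2^2 $$

$$ r_t^{i} = \frac{\eta}{2} \left( \left\lVert f(s_t, a_t) - s_{t+1} \right\rVert_2^2 + \alpha \left\lVert k(s_t, a_t, a_{t+1}) - s_{t+2} \right\rVert_2^2 \right) $$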
where f(s_t, a_t) is the forward model and k(s_t, a_t, a_{t+1}) is the extended model. These models were implemented using deep neural networks but should work with any function suitable for predicting the future state. The first equation (from [6]) captures heterostatic motivation and encourages actions which lead to large forward model errors. The second equation combines the heterostatic motivation with homeostatic motivation by also encouraging actions which lead to forward model errors in areas of the state-action space which are already familiar to the agent. The advantage of the new approach is that the motivation to explore completely new state spaces is regulated by balancing it against a motivation to fill out the agent's knowledge of existing state spaces. The parameter α determines the strength of this regulation.
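A minimal sketch of how such a regulated reward could be computed, assuming both models are callables that predict a future state and mirroring the reconstructed equations above (the actual networks and scaling in [1] differ):

```python
import numpy as np

def regulated_curiosity_reward(f, k, s_t, a_t, a_next, s_next, s_next2,
                               eta=1.0, alpha=0.5):
    # Heterostatic term: error of the forward model f(s_t, a_t) vs. s_{t+1}.
    heterostatic = np.sum((f(s_t, a_t) - s_next) ** 2)
    # Homeostatic term: error of the extended model k(s_t, a_t, a_{t+1})
    # vs. the state two steps ahead; alpha sets the regulation strength.
    homeostatic = np.sum((k(s_t, a_t, a_next) - s_next2) ** 2)
    return 0.5 * eta * (heterostatic + alpha * homeostatic)
```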
3.2.4 Results
The hypothesis tested through experimental validation was that "exploring an environment with several non-linearities could be optimized by regulating the agent curiosity with homeostatic drive". The experimental setup was a three-room environment in which an agent learns a control policy under varying levels of homeostatic regulation.
Figure 2: Accuracy of the forward model learned by the agent as a function
of homeostatic regulation (α). [1]
Super Smart Society. It is important for society that smart agents align ethically at all stages of decision making and acting.
SelfExGAN has three components: an encoder (E), a generator (G) and a discriminator (D). An existing architecture known as adversarial generator-encoder (AGE) networks uses the same components [8]. SelfExGAN makes use of a Nash equilibrium between the components in order to relate latent inputs to training data, while AGE networks aim to minimize reconstruction error.
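For orientation, the sketch below outlines the three components for image data in PyTorch. The layer sizes and latent dimension are placeholders; the actual architectures, losses and training procedures of SelfExGAN and AGE are not reproduced here.

```python
import torch.nn as nn

LATENT_DIM = 64  # hypothetical latent size, for illustration only

# Encoder E: maps an image to a latent code.
E = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.ReLU(),
    nn.Linear(256, LATENT_DIM),
)

# Generator G: maps a latent code back to an image.
G = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, 28 * 28), nn.Tanh(),
)

# Discriminator D: scores whether an image looks real.
D = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)
```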
3.3.3 Results
generation of new labelled training data and evaluation of similarities. Figure 5 shows examples of fake data and Figure 4 shows a visualization of the latent space. The clustered structure seen in the visualization indicates that the latent space of the SelfExGAN is indeed understandable.
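A common way to produce such a latent-space visualization is to encode a batch of data and project the codes to two dimensions, for example with t-SNE. The sketch below assumes an encoder like E above; it is a generic recipe, not the exact procedure from [9].

```python
# Project encoded samples to 2D with t-SNE and plot them, coloured by label.
import matplotlib.pyplot as plt
import torch
from sklearn.manifold import TSNE

def plot_latent_space(encoder, images, labels):
    with torch.no_grad():
        codes = encoder(images).numpy()                  # (N, LATENT_DIM) codes
    points = TSNE(n_components=2).fit_transform(codes)   # (N, 2) embedding
    plt.scatter(points[:, 0], points[:, 1], c=labels, s=5)
    plt.title("Latent space (t-SNE projection)")
    plt.show()
```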
4 Conclusion
Three papers which varied in domain, scope, methods and technology were
summarized and their relevance to the Super Smart Society discussed.
References
[1] Ildefons Magrans de Abril and Ryota Kanai. Curiosity-driven reinforcement learning with homeostatic regulation. https://arxiv.org/pdf/1801.07440.pdf, 2018. Accessed: 12/08/2018.